public class MagicDetector extends java.lang.Object implements Detector
Modifier and Type | Field and Description |
---|---|
private boolean | isRegex - True if pattern is a regular expression, false otherwise. |
private boolean | isStringIgnoreCase - True if we're doing a case-insensitive string match, false otherwise. |
private int | length - Length of the comparison window. |
private byte[] | mask - Bit mask that is applied to the source bytes before pattern matching. |
private int | offsetRangeBegin - First offset (inclusive) of the comparison window within the document input stream. |
private int | offsetRangeEnd - Last offset (inclusive) of the comparison window within the document input stream. |
private byte[] | pattern - The magic match pattern. |
private int | patternLength - Length of the pattern, which in the case of regular expressions will not be the same as the comparison window length. |
private MediaType | type - The matching media type. |
Constructor and Description |
---|
MagicDetector(MediaType type, byte[] pattern) - Creates a detector for input documents that have the exact given byte pattern at the beginning of the document stream. |
MagicDetector(MediaType type, byte[] pattern, byte[] mask, boolean isRegex, boolean isStringIgnoreCase, int offsetRangeBegin, int offsetRangeEnd) - Creates a detector for input documents that meet the specified magic match. |
MagicDetector(MediaType type, byte[] pattern, byte[] mask, boolean isRegex, int offsetRangeBegin, int offsetRangeEnd) - Creates a detector for input documents that meet the specified magic match. |
MagicDetector(MediaType type, byte[] pattern, byte[] mask, int offsetRangeBegin, int offsetRangeEnd) - Creates a detector for input documents that meet the specified magic match. |
MagicDetector(MediaType type, byte[] pattern, int offset) - Creates a detector for input documents that have the exact given byte pattern at the given offset of the document stream. |
Modifier and Type | Method and Description |
---|---|
private static byte[] | decodeString(java.lang.String value, java.lang.String type) |
private static byte[] | decodeValue(java.lang.String value, java.lang.String type) |
MediaType | detect(java.io.InputStream input, Metadata metadata) - Detects the content type of the given input document. |
int | getLength() |
static MagicDetector | parse(MediaType mediaType, java.lang.String type, java.lang.String offset, java.lang.String value, java.lang.String mask) |
java.lang.String | toString() - Returns a string representation of the Detection Rule. |
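private final MediaType type
The matching media type. Returned by the detect(InputStream, Metadata) method if a match is found.
private final int length
Length of the comparison window.
private final byte[] pattern
The magic match pattern. If it matches the (possibly bit-masked) bytes from the input stream, type is returned.
private final int patternLength
Length of the pattern, which in the case of regular expressions will not be the same as the comparison window length.
private final boolean isRegex
True if pattern is a regular expression, false otherwise.
private final boolean isStringIgnoreCase
True if we're doing a case-insensitive string match, false otherwise.
private final byte[] mask
Bit mask that is applied to the source bytes before pattern matching.
private final int offsetRangeBegin
First offset (inclusive) of the comparison window within the document input stream.
private final int offsetRangeEnd
Last offset (inclusive) of the comparison window within the document input stream. Greater than or equal to the first offset.
Note that this is not the offset of the last byte read from the document stream. Instead, the last window of bytes to be compared starts at this offset.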
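The note on offsetRangeEnd can be illustrated with a minimal sketch. This is not the MagicDetector implementation: WindowScanSketch and firstMatch are hypothetical names, and the sketch scans an in-memory byte array instead of a stream. It only shows that the last comparison window starts at offsetRangeEnd, so the last byte examined sits at offsetRangeEnd + pattern.length - 1.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration only; not part of the documented class. */
final class WindowScanSketch {

    /**
     * Returns the first offset in [offsetRangeBegin, offsetRangeEnd] at which
     * pattern occurs in data, or -1 if no window in that range matches.
     */
    static int firstMatch(byte[] data, byte[] pattern,
                          int offsetRangeBegin, int offsetRangeEnd) {
        for (int start = offsetRangeBegin; start <= offsetRangeEnd; start++) {
            // The window starting here covers start .. start + pattern.length - 1.
            if (start + pattern.length > data.length) {
                return -1; // not enough bytes left for a full window
            }
            boolean matches = true;
            for (int i = 0; i < pattern.length; i++) {
                if (data[start + i] != pattern[i]) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] data = "abcMAGICxyz".getBytes(StandardCharsets.US_ASCII);
        byte[] pattern = "MAGIC".getBytes(StandardCharsets.US_ASCII);
        // Range 0..3 includes the window that starts at offset 3.
        System.out.println(firstMatch(data, pattern, 0, 3));
        // Range 0..2 ends one offset too early, so nothing matches.
        System.out.println(firstMatch(data, pattern, 0, 2));
    }
}
```

With a range of 0..3 and a five-byte pattern, the sketch reads bytes up to index 7, which is why the end offset is not the same as the last byte read.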
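public MagicDetector(MediaType type, byte[] pattern)
Creates a detector for input documents that have the exact given byte pattern at the beginning of the document stream.
Parameters:
type - matching media type
pattern - magic match pattern
public MagicDetector(MediaType type, byte[] pattern, int offset)
Creates a detector for input documents that have the exact given byte pattern at the given offset of the document stream.
Parameters:
type - matching media type
pattern - magic match pattern
offset - offset of the pattern match
public MagicDetector(MediaType type, byte[] pattern, byte[] mask, int offsetRangeBegin, int offsetRangeEnd)
Creates a detector for input documents that meet the specified magic match. pattern must NOT be a regular expression. Constructor maintained for legacy reasons.
public MagicDetector(MediaType type, byte[] pattern, byte[] mask, boolean isRegex, int offsetRangeBegin, int offsetRangeEnd)
Creates a detector for input documents that meet the specified magic match.
public MagicDetector(MediaType type, byte[] pattern, byte[] mask, boolean isRegex, boolean isStringIgnoreCase, int offsetRangeBegin, int offsetRangeEnd)
Creates a detector for input documents that meet the specified magic match.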
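The mask argument is described as a bit mask applied to the source bytes before pattern matching. A minimal sketch of such a masked comparison follows. It is an illustration under stated assumptions, not the class's actual code: MaskedMatchSketch and matches are hypothetical names, a null or short mask is treated as "all bits significant", and the sketch also masks the pattern byte so that masked-out bits never affect the result.

```java
/** Hypothetical helper for illustration only; not part of the documented class. */
final class MaskedMatchSketch {

    /**
     * Returns true if pattern equals the bytes of source starting at offset,
     * comparing only the bits selected by mask.
     */
    static boolean matches(byte[] source, int offset, byte[] pattern, byte[] mask) {
        if (offset < 0 || offset + pattern.length > source.length) {
            return false; // window does not fit inside the source
        }
        for (int i = 0; i < pattern.length; i++) {
            // Assumption of this sketch: a missing mask byte means 0xFF.
            int m = (mask != null && i < mask.length) ? mask[i] : 0xFF;
            if ((source[offset + i] & m) != (pattern[i] & m)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] source = {0x12, 0x3F, 0x56};
        byte[] pattern = {0x30};
        byte[] highNibble = {(byte) 0xF0};
        // Only the high nibble of 0x3F is compared, so it matches 0x30.
        System.out.println(matches(source, 1, pattern, highNibble));
        // Without a mask, 0x3F and 0x30 differ.
        System.out.println(matches(source, 1, pattern, null));
    }
}
```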
public static MagicDetector parse(MediaType mediaType, java.lang.String type, java.lang.String offset, java.lang.String value, java.lang.String mask)
private static byte[] decodeValue(java.lang.String value, java.lang.String type)
private static byte[] decodeString(java.lang.String value, java.lang.String type)
public MediaType detect(java.io.InputStream input, Metadata metadata) throws java.io.IOException
Specified by: detect in interface Detector
Detects the content type of the given input document. Returns application/octet-stream if the type of the document cannot be detected.
If the document input stream is not available, then the first argument may be null. Otherwise the detector may read bytes from the start of the stream to help in type detection. The given stream is guaranteed to support the mark feature and the detector is expected to mark the stream before reading any bytes from it, and to reset the stream before returning. The stream must not be closed by the detector.
The given input metadata is only read, not modified, by the detector.
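The stream contract described for detect (accept null, mark before reading, reset before returning, never close) can be sketched with plain java.io classes. This is a hedged illustration rather than the detector's code: MarkResetSketch and startsWith are hypothetical names, and the check is a simple prefix comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the documented class. */
final class MarkResetSketch {

    /**
     * Returns true if the stream begins with prefix. The stream is marked
     * before reading and reset before returning, and it is never closed.
     */
    static boolean startsWith(InputStream input, byte[] prefix) throws IOException {
        if (input == null) {
            return false; // no stream available, nothing to inspect
        }
        input.mark(prefix.length);
        try {
            byte[] buffer = new byte[prefix.length];
            int total = 0;
            // A single read may return fewer bytes than requested, so loop.
            while (total < buffer.length) {
                int n = input.read(buffer, total, buffer.length - total);
                if (n == -1) {
                    break; // stream ended before a full prefix was read
                }
                total += n;
            }
            return total == prefix.length && Arrays.equals(buffer, prefix);
        } finally {
            input.reset(); // the caller sees the stream at its original position
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new ByteArrayInputStream(doc);
        System.out.println(startsWith(in, "%PDF-".getBytes(StandardCharsets.US_ASCII)));
        // The next read still returns the first byte of the document.
        System.out.println((char) in.read());
    }
}
```

Resetting in a finally block keeps the stream position intact even when reading throws, which is one way to honor the "reset before returning" expectation.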
public int getLength()
public java.lang.String toString()
Returns a string representation of the Detection Rule.
Overrides: toString in class java.lang.Object