public class StringsParser extends AbstractParser
Modifier and Type | Field and Description |
---|---|
private static FileConfig | DEFAULT_FILE_CONFIG |
private static StringsConfig | DEFAULT_STRINGS_CONFIG |
private static long | serialVersionUID: Serial version UID |
private static java.util.Map<java.lang.String,java.lang.Boolean[]> | STRINGS_PRESENT |
private static java.util.Set<MediaType> | SUPPORTED_TYPES |
Constructor and Description |
---|
StringsParser() |
Modifier and Type | Method and Description |
---|---|
private java.lang.String | doFile(java.io.File input, FileConfig config): Runs the "file" command on the given file as an alternative way to determine the file type. |
private int | doStrings(java.io.File input, StringsConfig config, XHTMLContentHandler xhtml): Runs the "strings" command on the given file. |
private int | extractOutput(java.io.InputStream stream, XHTMLContentHandler xhtml): Extracts ASCII strings using the "strings" command. |
static java.lang.String | getFileProg() |
static java.lang.String | getStringsProg() |
java.util.Set<MediaType> | getSupportedTypes(ParseContext context): Returns the set of media types supported by this parser when used with the given parse context. |
private boolean | hasFile(FileConfig config): Checks if the "file" command is supported. |
private boolean | hasStrings(StringsConfig config): Checks if the "strings" command is supported. |
void | parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context): Parses a document stream into a sequence of XHTML SAX events. |
Methods inherited from class AbstractParser: parse
private static final long serialVersionUID
private static final java.util.Set<MediaType> SUPPORTED_TYPES
private static final StringsConfig DEFAULT_STRINGS_CONFIG
private static final FileConfig DEFAULT_FILE_CONFIG
private static java.util.Map<java.lang.String,java.lang.Boolean[]> STRINGS_PRESENT
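public java.util.Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.
Parameters:
context - parse context

public void parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context) throws java.io.IOException, org.xml.sax.SAXException, TikaException
Description copied from interface: Parser
Parses a document stream into a sequence of XHTML SAX events.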
The given document stream is consumed but not closed by this method. The responsibility to close the stream remains on the caller.
Information about the parsing context can be passed in the context parameter. See the parser implementations for the kinds of context information they expect.
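
The stream ownership rule above (consumed, but not closed) can be illustrated without any Tika classes. The following is only a minimal sketch: `StreamContractSketch` and `consume` are hypothetical names standing in for a parser and its parse method, not part of this API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamContractSketch {

    /**
     * Hypothetical stand-in for a parse method: reads the stream to the end
     * but deliberately never calls close() on it.
     */
    static long consume(InputStream stream) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = stream.read(buffer)) != -1) {
            total += read;
        }
        return total; // the stream is exhausted here, but still open
    }

    public static void main(String[] args) throws IOException {
        // The caller owns the stream, so the caller closes it;
        // try-with-resources does that even if consume() throws.
        try (InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3})) {
            System.out.println(consume(in)); // prints 3
        }
    }
}
```

A caller of the real parse method would follow the same pattern: open the stream in a try-with-resources block and pass it in.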
Parameters:
stream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
Throws:
java.io.IOException - if the document stream could not be read
org.xml.sax.SAXException - if the SAX events could not be processed
TikaException - if the document could not be parsed

private boolean hasStrings(StringsConfig config)
Checks if the "strings" command is supported.
Parameters:
config - StringsConfig object used for testing the strings command.
Returns:
true if the strings command is supported.

private boolean hasFile(FileConfig config)
Checks if the "file" command is supported.
Parameters:
config -

private int doStrings(java.io.File input, StringsConfig config, XHTMLContentHandler xhtml) throws java.io.IOException, TikaException, org.xml.sax.SAXException
Runs the "strings" command on the given file.
Parameters:
input - File object that represents the file to parse.
config - StringsConfig object including the strings configuration.
xhtml - XHTMLContentHandler object.
Throws:
java.io.IOException - if any I/O error occurs.
TikaException - if the parsing process has been interrupted.
org.xml.sax.SAXException
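
For readers unfamiliar with the external "strings" command that this parser runs, the sketch below approximates its usual default behavior in plain Java: it collects runs of printable ASCII characters that are at least four characters long. This is not how StringsParser works (it invokes the external program). `AsciiStringsSketch`, `extract`, and `isPrintable` are hypothetical names, and the exact set of characters treated as printable is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class AsciiStringsSketch {

    /** Assumed minimum run length, matching the usual default of the strings command. */
    static final int MIN_LENGTH = 4;

    /** Assumption: tab and the ASCII range 0x20..0x7E count as printable. */
    static boolean isPrintable(int b) {
        return b == '\t' || (b >= 0x20 && b <= 0x7E);
    }

    /** Returns every run of printable bytes that is at least minLength long. */
    static List<String> extract(InputStream stream, int minLength) throws IOException {
        List<String> found = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        int b;
        while ((b = stream.read()) != -1) {
            if (isPrintable(b)) {
                run.append((char) b);
            } else {
                // A non-printable byte ends the current run.
                if (run.length() >= minLength) {
                    found.add(run.toString());
                }
                run.setLength(0);
            }
        }
        // Flush a run that reaches the end of the stream.
        if (run.length() >= minLength) {
            found.add(run.toString());
        }
        return found;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0x00, 'T', 'i', 'k', 'a', 0x01, 'a', 'b', 0x02, 'E', 'L', 'F', '!', '!'};
        // "ab" is dropped because it is shorter than MIN_LENGTH.
        System.out.println(extract(new ByteArrayInputStream(data), MIN_LENGTH)); // prints [Tika, ELF!!]
    }
}
```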
private int extractOutput(java.io.InputStream stream, XHTMLContentHandler xhtml) throws org.xml.sax.SAXException, java.io.IOException
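Extracts ASCII strings using the "strings" command.
Parameters:
stream - InputStream object used for reading the binary file.
xhtml - XHTMLContentHandler object.
Throws:
org.xml.sax.SAXException - if the content element could not be written.
java.io.IOException - if any I/O error occurs.

private java.lang.String doFile(java.io.File input, FileConfig config) throws java.io.IOException
Runs the "file" command on the given file as an alternative way to determine the file type.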
Parameters:
input - File object that represents the file to detect.
Throws:
java.io.IOException - if any I/O error occurs.

public static java.lang.String getStringsProg()
public static java.lang.String getFileProg()
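
The hasStrings and hasFile checks described earlier decide whether an external command is usable. How StringsParser performs that check is not documented here; the following is only a sketch of one common approach, using the standard ProcessBuilder API (Java 9 or later for `Redirect.DISCARD`). `CommandCheckSketch` and `isAvailable` are hypothetical names, and probing with a `--version` argument and requiring exit code 0 are assumptions that not every platform's "strings" or "file" satisfies.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

final class CommandCheckSketch {

    /**
     * Returns true if the program starts, finishes within the timeout,
     * and exits with status 0 (assumed success criterion).
     */
    static boolean isAvailable(String program, String arg, long timeoutSeconds) {
        ProcessBuilder pb = new ProcessBuilder(program, arg);
        pb.redirectErrorStream(true);                       // merge stderr into stdout
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD); // the output is not needed
        try {
            Process process = pb.start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly(); // timed out: treat as unavailable
                return false;
            }
            return process.exitValue() == 0;
        } catch (IOException e) {
            return false; // program not found or not executable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        // The results depend on what is installed on the machine running this.
        System.out.println("strings available: " + isAvailable("strings", "--version", 5));
        System.out.println("file available: " + isAvailable("file", "--version", 5));
    }
}
```

Discarding the output keeps the child process from blocking on a full pipe, and the timeout guards against a program that waits for input.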