public class SecureContentHandler extends ContentHandlerDecorator
Currently this class simply compares the number of output characters to the number of input bytes and keeps track of the XML nesting levels. An exception is thrown if the output seems excessive compared to the input document, which is a strong indication of a zip bomb.
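As a rough illustration of the comparison described above, the following minimal sketch flags output as excessive once it passes an absolute threshold and also exceeds a fixed number of characters per input byte. `OutputRatioCheck`, its method names, and the rule that both conditions must hold are assumptions made for illustration, and the limits used are arbitrary; this is not Tika's actual implementation.

```java
/** Hypothetical helper for illustration only; not part of Tika. */
final class OutputRatioCheck {
    private final long threshold; // output characters allowed before the ratio is considered
    private final long ratio;     // maximum allowed output characters per input byte
    private long characterCount;  // output characters recorded so far

    OutputRatioCheck(long threshold, long ratio) {
        this.threshold = threshold;
        this.ratio = ratio;
    }

    /**
     * Records newly produced output and reports whether the total now
     * looks excessive relative to the number of input bytes read so far.
     */
    boolean advance(int length, long bytesReadSoFar) {
        characterCount += length;
        return characterCount > threshold
                && characterCount > bytesReadSoFar * ratio;
    }

    long characterCount() {
        return characterCount;
    }
}
```

The threshold keeps small documents from being flagged: a short file that legitimately expands a lot never reaches the point where the ratio is evaluated.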
Modifier and Type | Class and Description |
---|---|
private class | SecureContentHandler.SecureSAXException: Private exception class used to indicate a suspected zip bomb. |
Modifier and Type | Field and Description |
---|---|
private long | characterCount: Number of output characters that Tika has produced so far. |
private int | currentDepth: The current XML element depth. |
private int | maxDepth: Maximum XML element nesting level. |
private int | maxPackageEntryDepth: Maximum package entry nesting level. |
private java.util.LinkedList<java.lang.Integer> | packageEntryDepths: Current number of nested <div class="package-entry"> elements. |
private long | ratio: Maximum compression ratio. |
private TikaInputStream | stream: The input stream that Tika is parsing. |
private long | threshold: Output threshold. |
Constructor and Description |
---|
SecureContentHandler(org.xml.sax.ContentHandler handler, TikaInputStream stream): Decorates the given content handler with zip bomb prevention based on the count of bytes read from the given counting input stream. |
Modifier and Type | Method and Description |
---|---|
private void | advance(int length): Records the given number of output characters (or more accurately UTF-16 code units). |
void | characters(char[] ch, int start, int length) |
void | endElement(java.lang.String uri, java.lang.String localName, java.lang.String name) |
private long | getByteCount() |
long | getMaximumCompressionRatio(): Returns the maximum compression ratio. |
int | getMaximumDepth(): Returns the maximum XML element nesting level. |
int | getMaximumPackageEntryDepth(): Returns the maximum package entry nesting level. |
long | getOutputThreshold(): Returns the configured output threshold. |
void | ignorableWhitespace(char[] ch, int start, int length) |
void | setMaximumCompressionRatio(long ratio): Sets the ratio between output characters and input bytes. |
void | setMaximumDepth(int depth): Sets the maximum XML element nesting level. |
void | setMaximumPackageEntryDepth(int depth): Sets the maximum package entry nesting level. |
void | setOutputThreshold(long threshold): Sets the threshold for output characters before the zip bomb prevention is activated. |
void | startElement(java.lang.String uri, java.lang.String localName, java.lang.String name, org.xml.sax.Attributes atts) |
void | throwIfCauseOf(org.xml.sax.SAXException e): Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb. |
Methods inherited from class ContentHandlerDecorator: endDocument, endPrefixMapping, handleException, processingInstruction, setContentHandler, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, toString
private final TikaInputStream stream
private long characterCount
private int currentDepth
private java.util.LinkedList<java.lang.Integer> packageEntryDepths
private long threshold
private long ratio
private int maxDepth
private int maxPackageEntryDepth
public SecureContentHandler(org.xml.sax.ContentHandler handler, TikaInputStream stream)
Parameters:
handler - the content handler to be decorated
stream - the input stream to be parsed

public long getOutputThreshold()

public void setOutputThreshold(long threshold)
Parameters:
threshold - new output threshold

public long getMaximumCompressionRatio()

public void setMaximumCompressionRatio(long ratio)
Parameters:
ratio - new maximum compression ratio

public int getMaximumDepth()

public void setMaximumPackageEntryDepth(int depth)
Parameters:
depth - maximum package entry nesting level

public int getMaximumPackageEntryDepth()

public void setMaximumDepth(int depth)
Parameters:
depth - maximum XML element nesting level

public void throwIfCauseOf(org.xml.sax.SAXException e) throws TikaException
Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb.
Parameters:
e - SAX exception
Throws:
TikaException - zip bomb exception

private long getByteCount() throws org.xml.sax.SAXException
Throws:
org.xml.sax.SAXException
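The conversion that throwIfCauseOf describes can be pictured with the minimal sketch below, which uses only JDK classes. `GuardSketch`, `GuardSAXException`, `detected`, and `SuspectedBombException` (a stand-in for TikaException) are hypothetical names, and walking the cause chain is an assumption about how such a check might be written rather than a description of Tika's code.

```java
import org.xml.sax.SAXException;

/** Hypothetical stand-in for TikaException. */
class SuspectedBombException extends Exception {
    SuspectedBombException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical guard; names are illustrative only. */
class GuardSketch {

    /** Marker exception that remembers which guard instance raised it. */
    private final class GuardSAXException extends SAXException {
        GuardSAXException(String message) {
            super(message);
        }

        boolean isCausedBy(GuardSketch guard) {
            return GuardSketch.this == guard;
        }
    }

    /** Creates the exception this guard would throw on detection. */
    SAXException detected(String message) {
        return new GuardSAXException(message);
    }

    /**
     * Rethrows as the domain exception only if this guard raised the
     * given exception, either directly or somewhere in its cause chain.
     */
    void throwIfCauseOf(SAXException e) throws SuspectedBombException {
        Throwable t = e;
        while (t != null) {
            if (t instanceof GuardSAXException
                    && ((GuardSAXException) t).isCausedBy(this)) {
                throw new SuspectedBombException("Zip bomb detected", e);
            }
            t = t.getCause();
        }
    }
}
```

Tying the marker exception to a specific instance lets a caller with several nested handlers tell which one raised the alarm, and leaves unrelated SAX errors untouched.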
private void advance(int length) throws org.xml.sax.SAXException
Parameters:
length - number of new output characters produced
Throws:
org.xml.sax.SAXException - if a zip bomb is detected

public void startElement(java.lang.String uri, java.lang.String localName, java.lang.String name, org.xml.sax.Attributes atts) throws org.xml.sax.SAXException
Specified by:
startElement in interface org.xml.sax.ContentHandler
Overrides:
startElement in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException

public void endElement(java.lang.String uri, java.lang.String localName, java.lang.String name) throws org.xml.sax.SAXException
Specified by:
endElement in interface org.xml.sax.ContentHandler
Overrides:
endElement in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException
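To show how startElement and endElement could maintain the two nesting counters, here is a minimal sketch built on the JDK's DefaultHandler. `DepthGuardSketch` and its accessors are hypothetical, the class value `package-entry` is assumed from the field descriptions, and the exact comparison against the limits is an assumption; treat it as one possible shape, not as Tika's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical depth guard; limits and names are illustrative. */
class DepthGuardSketch extends DefaultHandler {
    private final int maxDepth;
    private final int maxPackageEntryDepth;
    private int currentDepth;
    // Element depth at which each currently open package-entry div started.
    private final Deque<Integer> packageEntryDepths = new ArrayDeque<>();

    DepthGuardSketch(int maxDepth, int maxPackageEntryDepth) {
        this.maxDepth = maxDepth;
        this.maxPackageEntryDepth = maxPackageEntryDepth;
    }

    @Override
    public void startElement(String uri, String localName, String name,
                             Attributes atts) throws SAXException {
        currentDepth++;
        if (currentDepth > maxDepth) {
            throw new SAXException("Too many nested elements: " + currentDepth);
        }
        if ("div".equals(name) && "package-entry".equals(atts.getValue("class"))) {
            packageEntryDepths.addLast(currentDepth);
            if (packageEntryDepths.size() > maxPackageEntryDepth) {
                throw new SAXException("Too many nested package entries: "
                        + packageEntryDepths.size());
            }
        }
    }

    @Override
    public void endElement(String uri, String localName, String name) {
        // Pop a package entry only when the element that opened it closes.
        if (!packageEntryDepths.isEmpty()
                && packageEntryDepths.peekLast() == currentDepth) {
            packageEntryDepths.removeLast();
        }
        currentDepth--;
    }

    int currentDepth() {
        return currentDepth;
    }

    int packageEntryDepth() {
        return packageEntryDepths.size();
    }
}
```

Storing the depth at which each package entry opened, rather than a bare counter, lets endElement recognize the matching close without inspecting the element name or attributes again.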
public void characters(char[] ch, int start, int length) throws org.xml.sax.SAXException
Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException

public void ignorableWhitespace(char[] ch, int start, int length) throws org.xml.sax.SAXException
Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides:
ignorableWhitespace in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException
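Because the SAX callbacks report a length in char values, a count built from them measures UTF-16 code units rather than user-visible characters. The minimal sketch below makes that concrete; `CountingHandlerSketch` is a hypothetical class, and it assumes that both characters and ignorableWhitespace contribute to the total.

```java
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical counter; shows that SAX lengths are UTF-16 code units. */
class CountingHandlerSketch extends DefaultHandler {
    private long characterCount;

    @Override
    public void characters(char[] ch, int start, int length) {
        characterCount += length;
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) {
        // Assumption: whitespace output counts toward the total as well.
        characterCount += length;
    }

    long characterCount() {
        return characterCount;
    }
}
```

A code point outside the Basic Multilingual Plane, such as U+1F600, occupies two char values and therefore adds two to the count.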