public class SafeContentHandler extends ContentHandlerDecorator

Content handler decorator that makes sure that the character events
(characters(char[], int, int) or ignorableWhitespace(char[], int, int))
passed to the decorated content handler contain only valid XML characters.
All invalid characters are replaced with spaces.
The XML standard defines the following Unicode character ranges as valid XML characters:
#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Note that currently this class only detects those invalid characters whose UTF-16 representation fits a single char. Also, this class does not ensure that the UTF-16 encoding of incoming characters is correct.
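
The character ranges above translate directly into a range check on a Unicode code point. The following is a minimal sketch of such a check, not the actual source of this class; the class name XmlCharSketch is hypothetical and exists only for this illustration.

```java
// Hypothetical helper for illustration; not part of SafeContentHandler's API.
final class XmlCharSketch {
    private XmlCharSketch() {
    }

    /**
     * Returns true if the given code point falls outside the ranges
     * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
     */
    static boolean isInvalid(int ch) {
        if (ch < 0x20) {
            // Only tab, line feed and carriage return are allowed below #x20.
            return ch != 0x9 && ch != 0xA && ch != 0xD;
        }
        if (ch <= 0xD7FF) {
            return false;
        }
        if (ch < 0xE000) {
            // #xD800-#xDFFF: lone surrogate code units.
            return true;
        }
        if (ch <= 0xFFFD) {
            return false;
        }
        if (ch < 0x10000) {
            // #xFFFE and #xFFFF.
            return true;
        }
        return ch > 0x10FFFF;
    }
}
```

Under this check, XmlCharSketch.isInvalid(0x0) and XmlCharSketch.isInvalid(0xFFFE) are true, while tab (0x9) and supplementary code points such as 0x1F600 are accepted.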
Modifier and Type | Class and Description |
---|---|
protected static interface | SafeContentHandler.Output - Internal interface that allows both character and ignorable whitespace content to be filtered the same way. |
private static class | SafeContentHandler.StringOutput |

Modifier and Type | Field and Description |
---|---|
private SafeContentHandler.Output | charactersOutput - Output through the ContentHandler.characters(char[], int, int) method of the decorated content handler. |
private SafeContentHandler.Output | ignorableWhitespaceOutput - Output through the ContentHandler.ignorableWhitespace(char[], int, int) method of the decorated content handler. |
private static char[] | REPLACEMENT - Replacement for invalid characters. |

Constructor and Description |
---|
SafeContentHandler(org.xml.sax.ContentHandler handler) |
Modifier and Type | Method and Description |
---|---|
void | characters(char[] ch, int start, int length) |
void | endDocument() |
void | endElement(java.lang.String uri, java.lang.String localName, java.lang.String name) |
private void | filter(char[] ch, int start, int length, SafeContentHandler.Output output) - Filters and outputs the contents of the given input buffer. |
void | ignorableWhitespace(char[] ch, int start, int length) |
protected boolean | isInvalid(int ch) - Checks whether the given Unicode character is an invalid XML character and should be replaced for output. |
private boolean | isInvalid(java.lang.String value) - Checks if the given string contains any invalid XML characters. |
void | startElement(java.lang.String uri, java.lang.String localName, java.lang.String name, org.xml.sax.Attributes atts) |
protected void | writeReplacement(SafeContentHandler.Output output) - Outputs the replacement for an invalid character. |

Methods inherited from class ContentHandlerDecorator: endPrefixMapping, handleException, processingInstruction, setContentHandler, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, toString
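
To illustrate how the members summarized above could fit together, here is a rough sketch of one way a decorator might route both kinds of character events through a single filter. It is an illustration under stated assumptions, not the actual implementation of SafeContentHandler: the class SpaceReplacingHandler, its nested Output interface, and the run-copying strategy in filter are all hypothetical, and the sketch extends the JDK's org.xml.sax.helpers.DefaultHandler instead of ContentHandlerDecorator so that it compiles without extra libraries.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: all names in this class are illustrative assumptions.
class SpaceReplacingHandler extends DefaultHandler {

    /** Lets character and ignorable whitespace events share one filter path. */
    interface Output {
        void write(char[] ch, int start, int length) throws SAXException;
    }

    /** Replacement written in place of each invalid character. */
    private static final char[] REPLACEMENT = {' '};

    private final ContentHandler delegate;

    SpaceReplacingHandler(ContentHandler delegate) {
        this.delegate = delegate;
    }

    /** Range check against the XML Char production. */
    static boolean isInvalid(int ch) {
        if (ch < 0x20) {
            return ch != 0x9 && ch != 0xA && ch != 0xD;
        } else if (ch < 0xE000) {
            return ch > 0xD7FF;
        } else if (ch < 0x10000) {
            return ch > 0xFFFD;
        } else {
            return ch > 0x10FFFF;
        }
    }

    /** Copies runs of valid characters through and swaps invalid ones for a space. */
    void filter(char[] ch, int start, int length, Output output) throws SAXException {
        int end = start + length;
        int runStart = start;
        int i = start;
        while (i < end) {
            // Decode one code point; a well-formed surrogate pair counts as two chars.
            int c = Character.codePointAt(ch, i, end);
            int next = i + Character.charCount(c);
            if (isInvalid(c)) {
                if (i > runStart) {
                    output.write(ch, runStart, i - runStart);
                }
                output.write(REPLACEMENT, 0, REPLACEMENT.length);
                runStart = next;
            }
            i = next;
        }
        if (end > runStart) {
            output.write(ch, runStart, end - runStart);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        filter(ch, start, length, delegate::characters);
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        filter(ch, start, length, delegate::ignorableWhitespace);
    }
}
```

In this sketch the delegate may receive several smaller events for one incoming event, because valid runs and replacements are written separately. That is permitted by SAX, where character data may arrive in arbitrary chunks.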
private static final char[] REPLACEMENT

private final SafeContentHandler.Output charactersOutput
Output through the ContentHandler.characters(char[], int, int) method of the decorated content handler.

private final SafeContentHandler.Output ignorableWhitespaceOutput
Output through the ContentHandler.ignorableWhitespace(char[], int, int) method of the decorated content handler.

private void filter(char[] ch, int start, int length, SafeContentHandler.Output output) throws org.xml.sax.SAXException
Parameters:
ch - input buffer
start - start offset within the buffer
length - number of characters to read from the buffer
output - output channel
Throws:
org.xml.sax.SAXException - if the filtered characters could not be written out

private boolean isInvalid(java.lang.String value)
Parameters:
value - string to be checked
Returns:
true if the string contains invalid XML characters, false otherwise

protected boolean isInvalid(int ch)
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
Parameters:
ch - character
Returns:
true if the character should be replaced, false otherwise

protected void writeReplacement(SafeContentHandler.Output output) throws org.xml.sax.SAXException
Parameters:
output - where the replacement is written to
Throws:
org.xml.sax.SAXException - if the replacement could not be written

public void startElement(java.lang.String uri, java.lang.String localName, java.lang.String name, org.xml.sax.Attributes atts) throws org.xml.sax.SAXException
Specified by:
startElement in interface org.xml.sax.ContentHandler
Overrides:
startElement in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException

public void endElement(java.lang.String uri, java.lang.String localName, java.lang.String name) throws org.xml.sax.SAXException
Specified by:
endElement in interface org.xml.sax.ContentHandler
Overrides:
endElement in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException

public void endDocument() throws org.xml.sax.SAXException
Specified by:
endDocument in interface org.xml.sax.ContentHandler
Overrides:
endDocument in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException

public void characters(char[] ch, int start, int length) throws org.xml.sax.SAXException
Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException

public void ignorableWhitespace(char[] ch, int start, int length) throws org.xml.sax.SAXException
Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides:
ignorableWhitespace in class ContentHandlerDecorator
Throws:
org.xml.sax.SAXException