Package com.fasterxml.aalto.async
Class AsyncByteScanner
java.lang.Object
com.fasterxml.aalto.in.XmlScanner
com.fasterxml.aalto.in.ByteBasedScanner
com.fasterxml.aalto.async.AsyncByteScanner
- All Implemented Interfaces:
AsyncInputFeeder, XmlConsts, NamespaceContext, XMLStreamConstants
- Direct Known Subclasses:
AsyncByteArrayScanner, AsyncByteBufferScanner
-
Field Summary
Fields
Modifier and Type - Field - Description
protected XmlCharTypes _charTypes - This is a simple container object that is used to access the decoding tables for characters.
protected int _currQuad - Bytes parsed for the current, incomplete quad.
protected int _currQuadBytes - Number of bytes pending/buffered, stored in _currQuad.
protected boolean _elemAllNsBound
protected boolean _elemAttrCount
protected PName _elemAttrName
protected int _elemAttrPtr - Pointer for the next character of the attribute value currently being parsed, within the attribute value buffer.
protected byte _elemAttrQuote
protected int _elemNsPtr - Pointer for the next character of the namespace URI currently being parsed for the current namespace declaration.
protected boolean _endOfInput - Flag that is set when the calling application indicates that there will be no more input to parse.
protected int _entityValue - Entity value accumulated so far.
protected boolean _inDtdDeclaration - Flag that indicates whether we are inside a declaration during parsing of the internal DTD subset.
protected int _nextEvent - Due to the asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete.
protected int _pendingInput - There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs.
protected int[] _quadBuffer - This buffer is used for name parsing.
protected int _quadCount - Number of complete quads parsed for the current name (the quads themselves are stored in _quadBuffer).
protected int _state - In addition to the event type, there is a need for additional state information.
protected int _surroundingEvent - For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
protected ByteBasedPNameTable _symbols - For now, the symbol table contains prefixed names.
protected static final int EVENT_INCOMPLETE
protected static final int PENDING_STATE_ATTR_VALUE_AMP, PENDING_STATE_ATTR_VALUE_AMP_HASH, PENDING_STATE_ATTR_VALUE_AMP_HASH_X, PENDING_STATE_ATTR_VALUE_DEC_DIGIT, PENDING_STATE_ATTR_VALUE_ENTITY_NAME, PENDING_STATE_ATTR_VALUE_HEX_DIGIT, PENDING_STATE_CDATA_BRACKET1, PENDING_STATE_CDATA_BRACKET2, PENDING_STATE_COMMENT_HYPHEN1, PENDING_STATE_COMMENT_HYPHEN2, PENDING_STATE_CR, PENDING_STATE_ENT_IN_DEC_DIGIT, PENDING_STATE_ENT_IN_HEX_DIGIT, PENDING_STATE_ENT_SEEN_HASH, PENDING_STATE_ENT_SEEN_HASH_X, PENDING_STATE_PI_QMARK, PENDING_STATE_TEXT_AMP, PENDING_STATE_TEXT_AMP_HASH, PENDING_STATE_TEXT_BRACKET1, PENDING_STATE_TEXT_BRACKET2, PENDING_STATE_TEXT_DEC_ENTITY, PENDING_STATE_TEXT_HEX_ENTITY, PENDING_STATE_TEXT_IN_ENTITY, PENDING_STATE_XMLDECL_LT, PENDING_STATE_XMLDECL_LTQ, PENDING_STATE_XMLDECL_TARGET
protected static final int STATE_CDATA_C, STATE_CDATA_CD, STATE_CDATA_CDA, STATE_CDATA_CDAT, STATE_CDATA_CDATA, STATE_CDATA_CONTENT
protected static final int STATE_COMMENT_CONTENT, STATE_COMMENT_HYPHEN, STATE_COMMENT_HYPHEN2
protected static final int STATE_DEFAULT - Default starting state for many events/contexts: nothing has been seen so far, no event incomplete.
protected static final int STATE_DTD_AFTER_DOCTYPE, STATE_DTD_AFTER_PUBLIC, STATE_DTD_AFTER_PUBLIC_ID, STATE_DTD_AFTER_ROOT_NAME, STATE_DTD_AFTER_SYSTEM, STATE_DTD_AFTER_SYSTEM_ID, STATE_DTD_BEFORE_IDS, STATE_DTD_BEFORE_PUBLIC_ID, STATE_DTD_BEFORE_ROOT_NAME, STATE_DTD_BEFORE_SYSTEM_ID, STATE_DTD_DOCTYPE, STATE_DTD_EXPECT_CLOSING_GT, STATE_DTD_INT_SUBSET, STATE_DTD_PUBLIC_ID, STATE_DTD_PUBLIC_OR_SYSTEM, STATE_DTD_ROOT_NAME, STATE_DTD_SYSTEM_ID
protected static final int STATE_EE_NEED_GT
protected static final int STATE_PI_AFTER_TARGET, STATE_PI_AFTER_TARGET_QMARK, STATE_PI_AFTER_TARGET_WS, STATE_PI_IN_DATA, STATE_PI_IN_TARGET
protected static final int STATE_PROLOG_DECL
protected static final int STATE_PROLOG_INITIAL - State in which a less-than sign has been seen.
protected static final int STATE_PROLOG_SEEN_LT
protected static final int STATE_SE_ATTR_NAME, STATE_SE_ATTR_VALUE_NORMAL, STATE_SE_ATTR_VALUE_NSDECL, STATE_SE_ELEM_NAME, STATE_SE_SEEN_SLASH, STATE_SE_SPACE_OR_ATTRNAME, STATE_SE_SPACE_OR_ATTRVALUE, STATE_SE_SPACE_OR_END, STATE_SE_SPACE_OR_EQ
protected static final int STATE_TEXT_AMP, STATE_TEXT_AMP_NAME
protected static final int STATE_TREE_NAMED_ENTITY_START, STATE_TREE_NUMERIC_ENTITY_START, STATE_TREE_SEEN_AMP, STATE_TREE_SEEN_EXCL, STATE_TREE_SEEN_LT, STATE_TREE_SEEN_SLASH
protected static final int STATE_XMLDECL_AFTER_ENCODING, STATE_XMLDECL_AFTER_ENCODING_VALUE, STATE_XMLDECL_AFTER_STANDALONE, STATE_XMLDECL_AFTER_STANDALONE_VALUE, STATE_XMLDECL_AFTER_VERSION, STATE_XMLDECL_AFTER_VERSION_VALUE, STATE_XMLDECL_AFTER_XML, STATE_XMLDECL_BEFORE_ENCODING, STATE_XMLDECL_BEFORE_STANDALONE, STATE_XMLDECL_BEFORE_VERSION, STATE_XMLDECL_ENCODING, STATE_XMLDECL_ENCODING_EQ, STATE_XMLDECL_ENCODING_VALUE, STATE_XMLDECL_ENDQ, STATE_XMLDECL_STANDALONE, STATE_XMLDECL_STANDALONE_EQ, STATE_XMLDECL_STANDALONE_VALUE, STATE_XMLDECL_VERSION, STATE_XMLDECL_VERSION_EQ, STATE_XMLDECL_VERSION_VALUE
Fields inherited from class com.fasterxml.aalto.in.ByteBasedScanner
_inputEnd, _inputPtr, _tmpChar, BYTE_a, BYTE_A, BYTE_AMP, BYTE_APOS, BYTE_C, BYTE_CR, BYTE_D, BYTE_EQ, BYTE_EXCL, BYTE_g, BYTE_GT, BYTE_HASH, BYTE_HYPHEN, BYTE_l, BYTE_LBRACKET, BYTE_LF, BYTE_LT, BYTE_m, BYTE_NULL, BYTE_o, BYTE_p, BYTE_P, BYTE_q, BYTE_QMARK, BYTE_QUOT, BYTE_RBRACKET, BYTE_s, BYTE_S, BYTE_SEMICOLON, BYTE_SLASH, BYTE_SPACE, BYTE_t, BYTE_T, BYTE_TAB, BYTE_u, BYTE_x
Fields inherited from class com.fasterxml.aalto.in.XmlScanner
_attrCollector, _attrCount, _cfgCoalescing, _cfgLazyParsing, _config, _currElem, _currNsCount, _currRow, _currToken, _defaultNs, _depth, _entityPending, _isEmptyTag, _lastNsContext, _lastNsDecl, _nameBuffer, _nsBindingCache, _nsBindingCount, _nsBindings, _nsBindMisses, _pastBytesOrChars, _publicId, _rowStartOffset, _startColumn, _startRawOffset, _startRow, _systemId, _textBuilder, _tokenIncomplete, _tokenName, _xml11, CDATA_STR, INT_0, INT_9, INT_a, INT_A, INT_AMP, INT_APOS, INT_COLON, INT_CR, INT_EQ, INT_EXCL, INT_f, INT_F, INT_GT, INT_HYPHEN, INT_LBRACKET, INT_LF, INT_LT, INT_NULL, INT_QMARK, INT_QUOTE, INT_RBRACKET, INT_SLASH, INT_SPACE, INT_TAB, INT_z, MAX_UNICODE_CHAR, TOKEN_EOI
Fields inherited from interface com.fasterxml.aalto.util.XmlConsts
CHAR_CR, CHAR_LF, CHAR_NULL, CHAR_SPACE, STAX_DEFAULT_OUTPUT_ENCODING, STAX_DEFAULT_OUTPUT_VERSION, XML_DECL_KW_ENCODING, XML_DECL_KW_STANDALONE, XML_DECL_KW_VERSION, XML_SA_NO, XML_SA_YES, XML_V_10, XML_V_10_STR, XML_V_11, XML_V_11_STR, XML_V_UNKNOWN
Fields inherited from interface javax.xml.stream.XMLStreamConstants
ATTRIBUTE, CDATA, CHARACTERS, COMMENT, DTD, END_DOCUMENT, END_ELEMENT, ENTITY_DECLARATION, ENTITY_REFERENCE, NAMESPACE, NOTATION_DECLARATION, PROCESSING_INSTRUCTION, SPACE, START_DOCUMENT, START_ELEMENT
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type - Method - Description
protected void _activateEncoding() - Initialization method to call when encoding has been definitely figured out, from XML declarations, or from lack of one (using defaults).
protected void _closeSource - Since the async scanner has no access to whatever passes content, there is no input source in the same sense as with a blocking scanner, and there is nothing to close.
protected abstract byte _currentByte
protected final PName _findXmlDeclName(int lastQuad, int lastByteCount)
protected abstract byte _nextByte
private final PName _parseNewXmlDeclName(byte b)
private final PName _parseXmlDeclName
protected abstract byte _prevByte
protected void _releaseBuffers()
protected int _startDocumentNoXmlDecl - Helper method called when it is determined that the document does NOT start with an XML declaration.
protected final PName addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes)
protected abstract boolean asyncSkipSpace
protected void checkPITargetName(PName targetName)
protected int decodeCharForError(byte b) - Method called by methods when encountering a byte that cannot be part of a valid character in the current context.
void endOfInput() - Method that should be called after the last chunk of data to parse has been fed.
protected final PName findPName(int lastQuad, int lastByteCount) - Method called to process a sequence of bytes that is likely to be a PName.
protected void finishCData
protected abstract void finishCharacters
protected void finishComment
protected void finishDTD(boolean copyContents)
protected void finishPI()
protected void finishSpace
protected final void finishToken - This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text etc).
protected abstract boolean handleAttrValue
protected abstract int handleComment
private int handleDTD
protected abstract boolean handleDTDInternalSubset(boolean init)
protected abstract boolean handleNsDecl
protected abstract boolean handlePartialCR
protected abstract int handlePI()
private final int handlePrologDeclStart(boolean isProlog)
protected abstract int handleStartElement
protected abstract int handleStartElementStart(byte b)
private int handleXmlDeclaration - Method called to complete parsing of the XML declaration, once it has been reliably detected.
protected boolean loadMore()
final int nextFromProlog(boolean isProlog)
private final boolean parseDtdId(char[] outputBuffer, int outputPtr, boolean system)
protected abstract PName parseNewName(byte b)
protected abstract PName parsePName
protected boolean parseXmlDeclAttr(char[] outputBuffer, int outputPtr) - Method called to try to parse an XML pseudo-attribute value.
protected void reportInvalidOther(int mask, int ptr)
protected void skipCData
protected abstract boolean skipCharacters
protected void skipComment
protected void skipPI()
protected void skipSpace
protected abstract int startCharacters(byte b) - Method called to initialize state for a CHARACTERS event, after just a single byte has been seen.
private final Boolean startXmlDeclaration - Method that deals with recognizing the XML declaration, but not with parsing its contents.
protected int throwInternal()
protected boolean validPublicIdChar(int c) - Checks whether a character is valid in a PublicId.
protected void verifyAndAppendEntityCharacter(int charFromEntity) - Method called to verify validity of the given character (from entity) and append it to the text buffer.
protected void verifyAndSetPublicId
protected void verifyAndSetSystemId
protected void verifyAndSetXmlEncoding
protected void verifyAndSetXmlStandalone
protected void verifyAndSetXmlVersion
Methods inherited from class com.fasterxml.aalto.in.ByteBasedScanner
addUTFPName, getCurrentColumnNr, getCurrentLocation, getEndingByteOffset, getEndingCharOffset, getStartingByteOffset, getStartingCharOffset, markLF, markLF, reportInvalidInitial, reportInvalidOther, setStartLocation
Methods inherited from class com.fasterxml.aalto.in.XmlScanner
bindName, bindNs, checkImmutableBinding, close, decodeAttrBinaryValue, decodeAttrValue, decodeAttrValues, decodeElements, findAttrIndex, findOrCreateBinding, fireSaxCharacterEvents, fireSaxCommentEvent, fireSaxEndElement, fireSaxPIEvent, fireSaxSpaceEvents, fireSaxStartElement, getAttrCollector, getAttrCount, getAttrLocalName, getAttrNsURI, getAttrPrefix, getAttrPrefixedName, getAttrQName, getAttrType, getAttrValue, getAttrValue, getConfig, getCurrentLineNr, getDepth, getDTDPublicId, getDTDSystemId, getEndLocation, getInputPublicId, getInputSystemId, getName, getNamespacePrefix, getNamespaceURI, getNamespaceURI, getNamespaceURI, getNonTransientNamespaceContext, getNsCount, getPrefix, getPrefixes, getQName, getStartLocation, getText, getText, getTextCharacters, getTextCharacters, getTextLength, handleInvalidXmlChar, hasEmptyStack, isAttrSpecified, isEmptyTag, isTextWhitespace, loadMoreGuaranteed, loadMoreGuaranteed, nextFromTree, reportDoubleHyphenInComments, reportDuplicateNsDecl, reportEntityOverflow, reportEofInName, reportIllegalCDataEnd, reportIllegalNsDecl, reportIllegalNsDecl, reportInputProblem, reportInvalidNameChar, reportInvalidNsIndex, reportInvalidXmlChar, reportMissingPISpace, reportMultipleColonsInName, reportPrologProblem, reportPrologUnexpChar, reportPrologUnexpElement, reportTreeUnexpChar, reportUnboundPrefix, reportUnexpandedEntityInAttr, reportUnexpectedEndTag, resetForDecoding, skipCoalescedText, skipToken, throwInvalidSpace, throwNullChar, throwUnexpectedChar, verifyXmlChar
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface com.fasterxml.aalto.AsyncInputFeeder
needMoreInput
-
Field Details
-
EVENT_INCOMPLETE
protected static final int EVENT_INCOMPLETE
-
STATE_DEFAULT
protected static final int STATE_DEFAULT
Default starting state for many events/contexts: nothing has been seen so far, no event incomplete. Not used for all event types.
-
STATE_PROLOG_INITIAL
protected static final int STATE_PROLOG_INITIAL
State in which a less-than sign has been seen.
-
STATE_PROLOG_SEEN_LT
protected static final int STATE_PROLOG_SEEN_LT
-
STATE_PROLOG_DECL
protected static final int STATE_PROLOG_DECL
-
STATE_TREE_SEEN_LT
protected static final int STATE_TREE_SEEN_LT
-
STATE_TREE_SEEN_AMP
protected static final int STATE_TREE_SEEN_AMP
-
STATE_TREE_SEEN_EXCL
protected static final int STATE_TREE_SEEN_EXCL
-
STATE_TREE_SEEN_SLASH
protected static final int STATE_TREE_SEEN_SLASH
-
STATE_TREE_NUMERIC_ENTITY_START
protected static final int STATE_TREE_NUMERIC_ENTITY_START
-
STATE_TREE_NAMED_ENTITY_START
protected static final int STATE_TREE_NAMED_ENTITY_START
-
STATE_XMLDECL_AFTER_XML
protected static final int STATE_XMLDECL_AFTER_XML
-
STATE_XMLDECL_BEFORE_VERSION
protected static final int STATE_XMLDECL_BEFORE_VERSION
-
STATE_XMLDECL_VERSION
protected static final int STATE_XMLDECL_VERSION
-
STATE_XMLDECL_AFTER_VERSION
protected static final int STATE_XMLDECL_AFTER_VERSION
-
STATE_XMLDECL_VERSION_EQ
protected static final int STATE_XMLDECL_VERSION_EQ
-
STATE_XMLDECL_VERSION_VALUE
protected static final int STATE_XMLDECL_VERSION_VALUE
-
STATE_XMLDECL_AFTER_VERSION_VALUE
protected static final int STATE_XMLDECL_AFTER_VERSION_VALUE
-
STATE_XMLDECL_BEFORE_ENCODING
protected static final int STATE_XMLDECL_BEFORE_ENCODING
-
STATE_XMLDECL_ENCODING
protected static final int STATE_XMLDECL_ENCODING
-
STATE_XMLDECL_AFTER_ENCODING
protected static final int STATE_XMLDECL_AFTER_ENCODING
-
STATE_XMLDECL_ENCODING_EQ
protected static final int STATE_XMLDECL_ENCODING_EQ
-
STATE_XMLDECL_ENCODING_VALUE
protected static final int STATE_XMLDECL_ENCODING_VALUE
-
STATE_XMLDECL_AFTER_ENCODING_VALUE
protected static final int STATE_XMLDECL_AFTER_ENCODING_VALUE
-
STATE_XMLDECL_BEFORE_STANDALONE
protected static final int STATE_XMLDECL_BEFORE_STANDALONE
-
STATE_XMLDECL_STANDALONE
protected static final int STATE_XMLDECL_STANDALONE
-
STATE_XMLDECL_AFTER_STANDALONE
protected static final int STATE_XMLDECL_AFTER_STANDALONE
-
STATE_XMLDECL_STANDALONE_EQ
protected static final int STATE_XMLDECL_STANDALONE_EQ
-
STATE_XMLDECL_STANDALONE_VALUE
protected static final int STATE_XMLDECL_STANDALONE_VALUE
-
STATE_XMLDECL_AFTER_STANDALONE_VALUE
protected static final int STATE_XMLDECL_AFTER_STANDALONE_VALUE
-
STATE_XMLDECL_ENDQ
protected static final int STATE_XMLDECL_ENDQ
-
STATE_DTD_DOCTYPE
protected static final int STATE_DTD_DOCTYPE
-
STATE_DTD_AFTER_DOCTYPE
protected static final int STATE_DTD_AFTER_DOCTYPE
-
STATE_DTD_BEFORE_ROOT_NAME
protected static final int STATE_DTD_BEFORE_ROOT_NAME
-
STATE_DTD_ROOT_NAME
protected static final int STATE_DTD_ROOT_NAME
-
STATE_DTD_AFTER_ROOT_NAME
protected static final int STATE_DTD_AFTER_ROOT_NAME
-
STATE_DTD_BEFORE_IDS
protected static final int STATE_DTD_BEFORE_IDS
-
STATE_DTD_PUBLIC_OR_SYSTEM
protected static final int STATE_DTD_PUBLIC_OR_SYSTEM
-
STATE_DTD_AFTER_PUBLIC
protected static final int STATE_DTD_AFTER_PUBLIC
-
STATE_DTD_AFTER_SYSTEM
protected static final int STATE_DTD_AFTER_SYSTEM
-
STATE_DTD_BEFORE_PUBLIC_ID
protected static final int STATE_DTD_BEFORE_PUBLIC_ID
-
STATE_DTD_PUBLIC_ID
protected static final int STATE_DTD_PUBLIC_ID
-
STATE_DTD_AFTER_PUBLIC_ID
protected static final int STATE_DTD_AFTER_PUBLIC_ID
-
STATE_DTD_BEFORE_SYSTEM_ID
protected static final int STATE_DTD_BEFORE_SYSTEM_ID
-
STATE_DTD_SYSTEM_ID
protected static final int STATE_DTD_SYSTEM_ID
-
STATE_DTD_AFTER_SYSTEM_ID
protected static final int STATE_DTD_AFTER_SYSTEM_ID
-
STATE_DTD_INT_SUBSET
protected static final int STATE_DTD_INT_SUBSET
-
STATE_DTD_EXPECT_CLOSING_GT
protected static final int STATE_DTD_EXPECT_CLOSING_GT
-
STATE_TEXT_AMP
protected static final int STATE_TEXT_AMP
-
STATE_TEXT_AMP_NAME
protected static final int STATE_TEXT_AMP_NAME
-
STATE_COMMENT_CONTENT
protected static final int STATE_COMMENT_CONTENT
-
STATE_COMMENT_HYPHEN
protected static final int STATE_COMMENT_HYPHEN
-
STATE_COMMENT_HYPHEN2
protected static final int STATE_COMMENT_HYPHEN2
-
STATE_CDATA_CONTENT
protected static final int STATE_CDATA_CONTENT
-
STATE_CDATA_C
protected static final int STATE_CDATA_C
-
STATE_CDATA_CD
protected static final int STATE_CDATA_CD
-
STATE_CDATA_CDA
protected static final int STATE_CDATA_CDA
-
STATE_CDATA_CDAT
protected static final int STATE_CDATA_CDAT
-
STATE_CDATA_CDATA
protected static final int STATE_CDATA_CDATA
-
STATE_PI_AFTER_TARGET
protected static final int STATE_PI_AFTER_TARGET
-
STATE_PI_AFTER_TARGET_WS
protected static final int STATE_PI_AFTER_TARGET_WS
-
STATE_PI_AFTER_TARGET_QMARK
protected static final int STATE_PI_AFTER_TARGET_QMARK
-
STATE_PI_IN_TARGET
protected static final int STATE_PI_IN_TARGET
-
STATE_PI_IN_DATA
protected static final int STATE_PI_IN_DATA
-
STATE_SE_ELEM_NAME
protected static final int STATE_SE_ELEM_NAME
-
STATE_SE_SPACE_OR_END
protected static final int STATE_SE_SPACE_OR_END
-
STATE_SE_SPACE_OR_ATTRNAME
protected static final int STATE_SE_SPACE_OR_ATTRNAME
-
STATE_SE_ATTR_NAME
protected static final int STATE_SE_ATTR_NAME
-
STATE_SE_SPACE_OR_EQ
protected static final int STATE_SE_SPACE_OR_EQ
-
STATE_SE_SPACE_OR_ATTRVALUE
protected static final int STATE_SE_SPACE_OR_ATTRVALUE
-
STATE_SE_ATTR_VALUE_NORMAL
protected static final int STATE_SE_ATTR_VALUE_NORMAL
-
STATE_SE_ATTR_VALUE_NSDECL
protected static final int STATE_SE_ATTR_VALUE_NSDECL
-
STATE_SE_SEEN_SLASH
protected static final int STATE_SE_SEEN_SLASH
-
STATE_EE_NEED_GT
protected static final int STATE_EE_NEED_GT
-
PENDING_STATE_CR
protected static final int PENDING_STATE_CR
-
PENDING_STATE_XMLDECL_LT
protected static final int PENDING_STATE_XMLDECL_LT
-
PENDING_STATE_XMLDECL_LTQ
protected static final int PENDING_STATE_XMLDECL_LTQ
-
PENDING_STATE_XMLDECL_TARGET
protected static final int PENDING_STATE_XMLDECL_TARGET
-
PENDING_STATE_PI_QMARK
protected static final int PENDING_STATE_PI_QMARK
-
PENDING_STATE_COMMENT_HYPHEN1
protected static final int PENDING_STATE_COMMENT_HYPHEN1
-
PENDING_STATE_COMMENT_HYPHEN2
protected static final int PENDING_STATE_COMMENT_HYPHEN2
-
PENDING_STATE_CDATA_BRACKET1
protected static final int PENDING_STATE_CDATA_BRACKET1
-
PENDING_STATE_CDATA_BRACKET2
protected static final int PENDING_STATE_CDATA_BRACKET2
-
PENDING_STATE_ENT_SEEN_HASH
protected static final int PENDING_STATE_ENT_SEEN_HASH
-
PENDING_STATE_ENT_SEEN_HASH_X
protected static final int PENDING_STATE_ENT_SEEN_HASH_X
-
PENDING_STATE_ENT_IN_DEC_DIGIT
protected static final int PENDING_STATE_ENT_IN_DEC_DIGIT
-
PENDING_STATE_ENT_IN_HEX_DIGIT
protected static final int PENDING_STATE_ENT_IN_HEX_DIGIT
-
PENDING_STATE_ATTR_VALUE_AMP
protected static final int PENDING_STATE_ATTR_VALUE_AMP
-
PENDING_STATE_ATTR_VALUE_AMP_HASH
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH
-
PENDING_STATE_ATTR_VALUE_AMP_HASH_X
protected static final int PENDING_STATE_ATTR_VALUE_AMP_HASH_X
-
PENDING_STATE_ATTR_VALUE_ENTITY_NAME
protected static final int PENDING_STATE_ATTR_VALUE_ENTITY_NAME
-
PENDING_STATE_ATTR_VALUE_DEC_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_DEC_DIGIT
-
PENDING_STATE_ATTR_VALUE_HEX_DIGIT
protected static final int PENDING_STATE_ATTR_VALUE_HEX_DIGIT
-
PENDING_STATE_TEXT_AMP
protected static final int PENDING_STATE_TEXT_AMP
-
PENDING_STATE_TEXT_AMP_HASH
protected static final int PENDING_STATE_TEXT_AMP_HASH
-
PENDING_STATE_TEXT_DEC_ENTITY
protected static final int PENDING_STATE_TEXT_DEC_ENTITY
-
PENDING_STATE_TEXT_HEX_ENTITY
protected static final int PENDING_STATE_TEXT_HEX_ENTITY
-
PENDING_STATE_TEXT_IN_ENTITY
protected static final int PENDING_STATE_TEXT_IN_ENTITY
-
PENDING_STATE_TEXT_BRACKET1
protected static final int PENDING_STATE_TEXT_BRACKET1
-
PENDING_STATE_TEXT_BRACKET2
protected static final int PENDING_STATE_TEXT_BRACKET2
-
_charTypes
protected XmlCharTypes _charTypes
This is a simple container object that is used to access the decoding tables for characters. Indirection is needed since we actually support multiple UTF-8 compatible encodings, not just UTF-8 itself.
NOTE: non-final due to XML declaration handling occurring later.
-
_symbols
protected ByteBasedPNameTable _symbols
For now, the symbol table contains prefixed names. In the future they may be split into prefixes and local names.
NOTE: non-final for async scanners.
-
_quadBuffer
protected int[] _quadBuffer
This buffer is used for name parsing. Will be expanded if/as needed; 32 ints can hold names 128 ASCII chars long.
-
_nextEvent
protected int _nextEvent
Due to the asynchronous nature of parsing, we may know what event we are trying to parse, even if it's not yet complete. The type of that event is stored here.
-
_state
protected int _state
In addition to the event type, there is a need for additional state information.
-
_surroundingEvent
protected int _surroundingEvent
For token/state combinations that are 'shared' between events (or embedded in them), this is where the surrounding event state is retained.
-
_pendingInput
protected int _pendingInput
There are some multi-byte combinations that must be handled as a unit: CR+LF linefeeds, multi-byte UTF-8 characters, and multi-character end markers for comments and PIs. Since they can be split across input buffer boundaries, the first byte(s) may need to be temporarily stored.
If so, this int will store the byte(s) in little-endian format (that is, the first pending byte is at 0x000000FF, the second [if any] at 0x0000FF00, and the third at 0x00FF0000). This can be (and is) used to figure out the actual number of bytes pending, for multi-byte (UTF-8) character decoding.
Note: it is assumed that if the value is 0, there is no data. Thus, if a 0 byte needed to be added as pending, it has to be masked.
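The little-endian packing described for _pendingInput can be sketched as follows. This is a minimal illustration, not Aalto's actual code: the class PendingInputSketch and its methods are hypothetical, and the sketch assumes every stored byte is non-zero (which holds for CR and for UTF-8 lead and continuation bytes).

```java
// Hypothetical illustration of the _pendingInput layout; not Aalto's implementation.
final class PendingInputSketch {
    /** Number of pending bytes, assuming no stored byte is zero. */
    static int count(int pending) {
        if (pending == 0) return 0;
        if ((pending >>> 8) == 0) return 1;
        if ((pending >>> 16) == 0) return 2;
        return 3;
    }

    /** Stores b in the next free slot: first byte at 0xFF, second at 0xFF00, third at 0xFF0000. */
    static int append(int pending, int b) {
        return pending | ((b & 0xFF) << (8 * count(pending)));
    }

    /** Decodes a 3-byte UTF-8 character whose first two bytes are pending. */
    static int decode3(int pending, int third) {
        int b1 = pending & 0xFF;
        int b2 = (pending >> 8) & 0xFF;
        return ((b1 & 0x0F) << 12) | ((b2 & 0x3F) << 6) | (third & 0x3F);
    }

    public static void main(String[] args) {
        // The euro sign is E2 82 AC in UTF-8; assume the buffer ended after the second byte.
        int pending = append(append(0, 0xE2), 0x82);
        System.out.println(count(pending));
        System.out.println(Integer.toHexString(decode3(pending, 0xAC)));
    }
}
```

Because a zero value means "nothing pending", the byte count can be recovered from the int itself without a separate counter.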
-
_endOfInput
protected boolean _endOfInput
Flag that is set when the calling application indicates that there will be no more input to parse.
-
_quadCount
protected int _quadCount
Number of complete quads parsed for the current name (the quads themselves are stored in _quadBuffer).
-
_currQuad
protected int _currQuad
Bytes parsed for the current, incomplete quad.
-
_currQuadBytes
protected int _currQuadBytes
Number of bytes pending/buffered, stored in _currQuad.
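How the quad-related fields could cooperate while a name arrives byte by byte is sketched below. The class QuadAccumulatorSketch is hypothetical, and the byte order inside a quad (earlier bytes in higher-order positions) is an assumption of this sketch rather than something stated on this page.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of quad-based name buffering; not Aalto's implementation.
final class QuadAccumulatorSketch {
    int[] quadBuffer = new int[32]; // 32 ints hold 128 single-byte characters
    int quadCount;                  // complete quads stored in quadBuffer
    int currQuad;                   // bytes of the current, incomplete quad
    int currQuadBytes;              // number of bytes held in currQuad (0 to 3)

    void addByte(int b) {
        currQuad = (currQuad << 8) | (b & 0xFF);
        if (++currQuadBytes == 4) {
            if (quadCount == quadBuffer.length) {
                // Expand as needed for unusually long names
                quadBuffer = Arrays.copyOf(quadBuffer, quadBuffer.length * 2);
            }
            quadBuffer[quadCount++] = currQuad;
            currQuad = 0;
            currQuadBytes = 0;
        }
    }

    public static void main(String[] args) {
        QuadAccumulatorSketch acc = new QuadAccumulatorSketch();
        for (byte b : "xml:lang".getBytes(StandardCharsets.US_ASCII)) {
            acc.addByte(b);
        }
        System.out.println(acc.quadCount + " quads, " + acc.currQuadBytes + " bytes left over");
    }
}
```

Keeping the partial quad outside the array means parsing can stop at any byte boundary and resume when the next chunk is fed.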
-
_entityValue
protected int _entityValue
Entity value accumulated so far.
-
_elemAllNsBound
protected boolean _elemAllNsBound
-
_elemAttrCount
protected boolean _elemAttrCount
-
_elemAttrQuote
protected byte _elemAttrQuote
-
_elemAttrName
protected PName _elemAttrName
-
_elemAttrPtr
protected int _elemAttrPtr
Pointer for the next character of the attribute value currently being parsed, within the attribute value buffer.
-
_elemNsPtr
protected int _elemNsPtr
Pointer for the next character of the namespace URI currently being parsed for the current namespace declaration.
-
_inDtdDeclaration
protected boolean _inDtdDeclaration
Flag that indicates whether we are inside a declaration during parsing of the internal DTD subset.
-
-
Constructor Details
-
AsyncByteScanner
-
-
Method Details
-
_activateEncoding
protected void _activateEncoding()
Initialization method to call when the encoding has been definitively determined, either from the XML declaration or from the lack of one (using defaults).
- Since:
- 1.1.1
-
endOfInput
public void endOfInput()
Description copied from interface: AsyncInputFeeder
Method that should be called after the last chunk of data to parse has been fed. May be called regardless of what AsyncInputFeeder.needMoreInput() returns. After calling this method, no more data can be fed, and the parser assumes no more data will be available.
- Specified by:
endOfInput in interface AsyncInputFeeder
-
_releaseBuffers
protected void _releaseBuffers()
- Overrides:
_releaseBuffers in class XmlScanner
-
_closeSource
Since the async scanner has no access to whatever passes content to it, there is no input source in the same sense as with a blocking scanner, and there is nothing to close. But we can at least mark input as having ended.
- Specified by:
_closeSource in class ByteBasedScanner
- Throws:
IOException
-
verifyAndSetXmlVersion
- Throws:
XMLStreamException
-
verifyAndSetXmlEncoding
- Throws:
XMLStreamException
-
verifyAndSetXmlStandalone
- Throws:
XMLStreamException
-
verifyAndSetPublicId
- Throws:
XMLStreamException
-
verifyAndSetSystemId
- Throws:
XMLStreamException
-
_currentByte
- Throws:
XMLStreamException
-
_nextByte
- Throws:
XMLStreamException
-
_prevByte
- Throws:
XMLStreamException
-
handlePI
- Throws:
XMLStreamException
-
handleDTDInternalSubset
- Throws:
XMLStreamException
-
handleComment
- Throws:
XMLStreamException
-
handleStartElementStart
- Throws:
XMLStreamException
-
handleStartElement
- Throws:
XMLStreamException
-
parsePName
- Throws:
XMLStreamException
-
parseNewName
- Throws:
XMLStreamException
-
asyncSkipSpace
- Throws:
XMLStreamException
-
handlePartialCR
- Throws:
XMLStreamException
-
finishToken
Description copied from class: XmlScanner
This method is called to ensure that the current token/event has been completely parsed, such that we have all the data needed to return it (textual content, PI data, comment text etc).
- Specified by:
finishToken in class XmlScanner
- Throws:
XMLStreamException
-
startCharacters
Method called to initialize state for a CHARACTERS event, after just a single byte has been seen. What needs to be done next depends on whether coalescing mode is set: if it is not set, just a single character needs to be decoded, after which the current event will be incomplete, but defined as CHARACTERS. In coalescing mode, the whole content must be read before the current event can be defined. The reason for the difference is that when XMLStreamReader.next() returns, no blocking can occur when calling other methods.
- Returns:
- Event type detected: CHARACTERS if at least one full character was decoded (and can be returned), EVENT_INCOMPLETE if not (part of a multi-byte character split across an input buffer boundary)
- Throws:
XMLStreamException
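The return contract of startCharacters in non-coalescing mode can be illustrated with a simplified check of whether one full UTF-8 character is available. This is only a sketch: the class StartCharactersSketch is hypothetical, the EVENT_INCOMPLETE value used here is a placeholder for this example, and real decoding would also validate continuation bytes.

```java
import javax.xml.stream.XMLStreamConstants;

// Hypothetical illustration; not Aalto's implementation.
final class StartCharactersSketch {
    static final int EVENT_INCOMPLETE = 257; // placeholder value chosen for this sketch

    /** Length in bytes of the UTF-8 sequence introduced by the given lead byte. */
    static int utf8Length(int lead) {
        if ((lead & 0x80) == 0) return 1;
        if ((lead & 0xE0) == 0xC0) return 2;
        if ((lead & 0xF0) == 0xE0) return 3;
        if ((lead & 0xF8) == 0xF0) return 4;
        throw new IllegalArgumentException("Not a UTF-8 lead byte: 0x" + Integer.toHexString(lead));
    }

    /** CHARACTERS if buf[ptr..end) holds one full character, EVENT_INCOMPLETE otherwise. */
    static int start(byte[] buf, int ptr, int end) {
        int needed = utf8Length(buf[ptr] & 0xFF);
        return (end - ptr >= needed) ? XMLStreamConstants.CHARACTERS : EVENT_INCOMPLETE;
    }

    public static void main(String[] args) {
        byte[] euro = { (byte) 0xE2, (byte) 0x82, (byte) 0xAC };
        System.out.println(start(euro, 0, 2) == EVENT_INCOMPLETE);
        System.out.println(start(euro, 0, 3) == XMLStreamConstants.CHARACTERS);
    }
}
```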
-
handleAttrValue
- Throws:
XMLStreamException
-
handleNsDecl
- Throws:
XMLStreamException
-
finishCData
- Specified by:
finishCData in class XmlScanner
- Throws:
XMLStreamException
-
finishComment
- Specified by:
finishComment in class XmlScanner
- Throws:
XMLStreamException
-
finishDTD
- Specified by:
finishDTD in class XmlScanner
- Throws:
XMLStreamException
-
finishPI
- Specified by:
finishPI in class XmlScanner
- Throws:
XMLStreamException
-
finishSpace
- Specified by:
finishSpace in class XmlScanner
- Throws:
XMLStreamException
-
skipCharacters
- Specified by:
skipCharacters in class XmlScanner
- Returns:
- True if the whole characters segment was successfully skipped; false if not
- Throws:
XMLStreamException
-
skipCData
- Specified by:
skipCData in class XmlScanner
- Throws:
XMLStreamException
-
skipComment
- Specified by:
skipComment in class XmlScanner
- Throws:
XMLStreamException
-
skipPI
- Specified by:
skipPI in class XmlScanner
- Throws:
XMLStreamException
-
skipSpace
- Specified by:
skipSpace in class XmlScanner
- Throws:
XMLStreamException
-
loadMore
- Specified by:
loadMore in class XmlScanner
- Throws:
XMLStreamException
-
finishCharacters
- Specified by:
finishCharacters in class XmlScanner
- Throws:
XMLStreamException
-
findPName
Method called to process a sequence of bytes that is likely to be a PName. At this point we have encountered an end marker, and may have hit either a previously seen well-formed PName, an as-yet unseen well-formed PName, or a non-well-formed sequence (containing one or more non-name chars without any valid end markers).
- Parameters:
lastQuad - Word with last 0 to 3 bytes of the PName; not included in the quad array
lastByteCount - Number of bytes contained in lastQuad; 0 to 3.
- Throws:
XMLStreamException
-
addPName
protected final PName addPName(ByteBasedPNameTable symbols, int hash, int[] quads, int qlen, int lastQuadBytes) throws XMLStreamException
- Throws:
XMLStreamException
-
verifyAndAppendEntityCharacter
Method called to verify validity of the given character (from entity) and append it to the text buffer.
- Throws:
XMLStreamException
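One way such a verify-then-append step could look is sketched below, using the Char production of XML 1.0 as the validity rule. The class EntityCharSketch and the use of a StringBuilder as the text buffer are assumptions of this sketch; the actual method may apply different rules (for example for XML 1.1) and writes to the scanner's own buffer.

```java
import javax.xml.stream.XMLStreamException;

// Hypothetical illustration; not Aalto's implementation.
final class EntityCharSketch {
    /** True if the code point matches the XML 1.0 Char production. */
    static boolean isXml10Char(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Appends the character (as a surrogate pair if needed) or rejects it. */
    static void verifyAndAppend(StringBuilder text, int charFromEntity) throws XMLStreamException {
        if (!isXml10Char(charFromEntity)) {
            throw new XMLStreamException(
                    "Invalid character from entity: 0x" + Integer.toHexString(charFromEntity));
        }
        text.appendCodePoint(charFromEntity);
    }

    public static void main(String[] args) throws XMLStreamException {
        StringBuilder text = new StringBuilder();
        verifyAndAppend(text, 0x20AC);  // one char
        verifyAndAppend(text, 0x1F600); // two chars (surrogate pair)
        System.out.println(text.length());
    }
}
```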
-
validPublicIdChar
protected boolean validPublicIdChar(int c)
Checks whether a character is valid in a PublicId.
- Parameters:
c - A character
- Returns:
- true if the character is valid for use in the Public ID of an XML doctype declaration
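For reference, the PubidChar production of the XML 1.0 specification allows space, CR, LF, ASCII letters and digits, and the punctuation set -'()+,./:=?;!*#@$_%. A check based on that production could be sketched as follows; the class PublicIdCharSketch is hypothetical, and whether the actual method follows the production exactly is not confirmed by this page.

```java
// Hypothetical illustration based on the XML 1.0 PubidChar production; not Aalto's implementation.
final class PublicIdCharSketch {
    private static final String PUNCTUATION = "-'()+,./:=?;!*#@$_%";

    static boolean validPublicIdChar(int c) {
        if (c == 0x20 || c == 0x0D || c == 0x0A) return true;
        if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) return true;
        return PUNCTUATION.indexOf(c) >= 0;
    }

    public static void main(String[] args) {
        String publicId = "-//W3C//DTD XHTML 1.0 Strict//EN";
        boolean allValid = publicId.chars().allMatch(PublicIdCharSketch::validPublicIdChar);
        System.out.println(allValid);
    }
}
```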
-
decodeCharForError
Description copied from class: ByteBasedScanner
Method called by methods when encountering a byte that cannot be part of a valid character in the current context. Should return the actual decoded character for error reporting purposes.
- Specified by:
decodeCharForError in class ByteBasedScanner
- Throws:
XMLStreamException
-
checkPITargetName
- Throws:
XMLStreamException
-
throwInternal
protected int throwInternal() -
reportInvalidOther
- Throws:
XMLStreamException
-
nextFromProlog
- Specified by:
nextFromProlog in class XmlScanner
- Throws:
XMLStreamException
-
_startDocumentNoXmlDecl
Helper method called when it is determined that the document does NOT start with an XML declaration. Needs to return START_DOCUMENT, and initialize other state appropriately.
- Throws:
XMLStreamException
-
handlePrologDeclStart
- Throws:
XMLStreamException
-
startXmlDeclaration
Method that deals with recognizing the XML declaration, but not with parsing its contents.
- Returns:
- null if parsing is inconclusive (may or may not be an XML declaration); Boolean.TRUE if it is an XML declaration, and Boolean.FALSE if it is something else
- Throws:
XMLStreamException
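The three-valued return (null, Boolean.TRUE, Boolean.FALSE) suits input that may stop mid-marker. A simplified recognizer over the bytes seen so far might look like the sketch below. The class XmlDeclDetectSketch is hypothetical, and it re-scans from the start of the buffer instead of resuming from saved pending state as an async scanner would.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a tri-state check; not Aalto's implementation.
final class XmlDeclDetectSketch {
    private static final byte[] MARKER = "<?xml".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns null if more input is needed to decide, TRUE if the input starts
     * with "<?xml" followed by whitespace, FALSE if it is something else.
     */
    static Boolean startsWithXmlDecl(byte[] buf, int len) {
        for (int i = 0; i < MARKER.length; i++) {
            if (i >= len) return null;              // ran out while still matching
            if (buf[i] != MARKER[i]) return Boolean.FALSE;
        }
        if (len <= MARKER.length) return null;      // need the byte after "<?xml"
        int next = buf[MARKER.length] & 0xFF;
        // Something like "<?xml-stylesheet" is a processing instruction, not the declaration
        return next == ' ' || next == '\t' || next == '\r' || next == '\n';
    }

    public static void main(String[] args) {
        byte[] partial = "<?x".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithXmlDecl(partial, partial.length));
    }
}
```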
-
handleXmlDeclaration
Method called to complete parsing of the XML declaration, once it has been reliably detected.
- Returns:
- Completed token (START_DOCUMENT), if fully parsed; incomplete (EVENT_INCOMPLETE) otherwise
- Throws:
XMLStreamException
-
handleDTD
- Throws:
XMLStreamException
-
parseDtdId
private final boolean parseDtdId(char[] outputBuffer, int outputPtr, boolean system) throws XMLStreamException
- Throws:
XMLStreamException
-
_parseNewXmlDeclName
- Throws:
XMLStreamException
-
_parseXmlDeclName
- Throws:
XMLStreamException
-
_findXmlDeclName
- Throws:
XMLStreamException
-
parseXmlDeclAttr
Method called to try to parse an XML pseudo-attribute value. This is relatively simple, since we can't have linefeeds or entities; and although there are exact rules for what is allowed, we can do coarse parsing and only later verify validity (for encoding, stricter parsing could be done in the future).
NOTE: pseudo-attribute values are required to be 7-bit ASCII, so a crude cast can be used.
- Returns:
- True if we managed to parse the whole pseudo-attribute
- Throws:
XMLStreamException
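The "crude cast" mentioned in the note can be illustrated with a simplified copy loop. The class XmlDeclAttrSketch and its copyValue signature are hypothetical; unlike the real method, this sketch keeps no state between calls and signals an incomplete value with -1 instead of a boolean.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not Aalto's implementation.
final class XmlDeclAttrSketch {
    /**
     * Copies bytes from in[ptr..end) into out until the closing quote.
     * Returns the number of chars copied, or -1 if the input ran out first
     * (or out is too small, in this simplified sketch).
     */
    static int copyValue(byte[] in, int ptr, int end, byte quote, char[] out) {
        int outPtr = 0;
        while (ptr < end) {
            byte b = in[ptr++];
            if (b == quote) return outPtr;
            if (outPtr == out.length) return -1;
            out[outPtr++] = (char) b; // crude cast: value is assumed to be 7-bit ASCII
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] in = "1.0\" encoding".getBytes(StandardCharsets.US_ASCII);
        char[] out = new char[16];
        int len = copyValue(in, 0, in.length, (byte) '"', out);
        System.out.println(new String(out, 0, len));
    }
}
```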
-