Class StreamScanner

All Implemented Interfaces:
XmlConsts, NamespaceContext, XMLStreamConstants
Direct Known Subclasses:
Utf8Scanner

public abstract class StreamScanner extends ByteBasedScanner
Base class for various byte stream based scanners (generally one for each type of encoding supported).
  • Field Details

    • _in

      protected InputStream _in
      Underlying InputStream to use for reading content.
    • _inputBuffer

      protected byte[] _inputBuffer
    • _charTypes

      protected final XmlCharTypes _charTypes
      Simple container object used to access the character decoding tables. The indirection is needed because multiple UTF-8 compatible encodings are supported, not just UTF-8 itself.
    • _symbols

      protected final ByteBasedPNameTable _symbols
      For now, the symbol table contains prefixed names. In the future these may be split into prefixes and local names.
    • _quadBuffer

      protected int[] _quadBuffer
      This buffer is used for name parsing. Will be expanded if/as needed; 32 ints can hold names 128 ascii chars long.
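
      The arithmetic above (32 ints for 128 ASCII characters) follows from packing four bytes into each int. The following is a minimal sketch of that packing, not the scanner's actual code: the class name QuadPackingSketch, the method packQuads, and the big-endian byte order within each int are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not part of StreamScanner.
class QuadPackingSketch {
    /**
     * Packs name bytes into ints ("quads"), four bytes per int.
     * Bytes are shifted in from the low end, so a full quad holds its
     * first byte in the top 8 bits. The last quad may hold 1 to 3 bytes.
     */
    static int[] packQuads(byte[] name) {
        int[] quads = new int[(name.length + 3) / 4];
        for (int i = 0; i < name.length; i++) {
            int q = i >> 2; // index of the quad this byte belongs to
            quads[q] = (quads[q] << 8) | (name[i] & 0xFF);
        }
        return quads;
    }

    public static void main(String[] args) {
        byte[] name = "xsl:template".getBytes(StandardCharsets.US_ASCII);
        int[] quads = packQuads(name);
        System.out.println(name.length + " bytes -> " + quads.length + " quads");
        System.out.println(Arrays.toString(quads));
    }
}
```

      With this layout a 128-byte ASCII name fills exactly 32 quads, which matches the initial capacity described above; longer names would require a larger array.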
  • Constructor Details

  • Method Details

    • _releaseBuffers

      protected void _releaseBuffers()
      Overrides:
      _releaseBuffers in class XmlScanner
    • _closeSource

      protected void _closeSource() throws IOException
      Specified by:
      _closeSource in class ByteBasedScanner
      Throws:
      IOException
    • handleEntityInText

      protected abstract int handleEntityInText(boolean inAttr) throws XMLStreamException
      Throws:
      XMLStreamException
    • parsePublicId

      protected abstract String parsePublicId(byte quoteChar) throws XMLStreamException
      Throws:
      XMLStreamException
    • parseSystemId

      protected abstract String parseSystemId(byte quoteChar) throws XMLStreamException
      Throws:
      XMLStreamException
    • nextFromProlog

      public final int nextFromProlog(boolean isProlog) throws XMLStreamException
      Specified by:
      nextFromProlog in class XmlScanner
      Throws:
      XMLStreamException
    • nextFromTree

      public final int nextFromTree() throws XMLStreamException
      Specified by:
      nextFromTree in class XmlScanner
      Throws:
      XMLStreamException
    • _nextEntity

      protected int _nextEntity()
      Helper method used to isolate things that need to be (re)set in cases where
    • handlePrologDeclStart

      private final int handlePrologDeclStart(boolean isProlog) throws XMLStreamException
      Throws:
      XMLStreamException
    • handleDtdStart

      private final int handleDtdStart() throws XMLStreamException
      Throws:
      XMLStreamException
    • handleCommentOrCdataStart

      private final int handleCommentOrCdataStart() throws XMLStreamException
      Throws:
      XMLStreamException
    • handlePIStart

      private final int handlePIStart() throws XMLStreamException
      Method called after the leading '<?' has been parsed; needs to parse the target.
      Throws:
      XMLStreamException
    • handleCharEntity

      protected final int handleCharEntity() throws XMLStreamException
      Returns:
      Code point for the entity that expands to a valid XML content character.
      Throws:
      XMLStreamException
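
      As an illustration of what resolving a character entity to a code point involves, the sketch below decodes the body of a numeric character reference and checks the result against the XML 1.0 Char production. It is a standalone example, not the scanner's implementation: the names CharEntitySketch, decodeCharRef and isXmlChar are hypothetical, and the input is taken as a String rather than from the byte buffer.

```java
// Hypothetical illustration; not part of StreamScanner.
class CharEntitySketch {
    /** XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]. */
    static boolean isXmlChar(int c) {
        return c == 0x9 || c == 0xA || c == 0xD
                || (c >= 0x20 && c <= 0xD7FF)
                || (c >= 0xE000 && c <= 0xFFFD)
                || (c >= 0x10000 && c <= 0x10FFFF);
    }

    /** Decodes the text that follows "&#", for example "x41;" or "65;". */
    static int decodeCharRef(String body) {
        int i = 0;
        int radix = 10;
        if (body.startsWith("x")) { // hexadecimal form
            radix = 16;
            i = 1;
        }
        int value = 0;
        int digits = 0;
        for (; i < body.length(); i++) {
            char ch = body.charAt(i);
            if (ch == ';') {
                if (digits == 0 || !isXmlChar(value)) {
                    throw new IllegalArgumentException("Invalid character reference");
                }
                return value;
            }
            int d;
            if (ch >= '0' && ch <= '9') {
                d = ch - '0';
            } else if (radix == 16 && ch >= 'a' && ch <= 'f') {
                d = ch - 'a' + 10;
            } else if (radix == 16 && ch >= 'A' && ch <= 'F') {
                d = ch - 'A' + 10;
            } else {
                throw new IllegalArgumentException("Unexpected character in reference");
            }
            value = value * radix + d;
            digits++;
            if (value > 0x10FFFF) { // also guards against int overflow
                throw new IllegalArgumentException("Code point out of range");
            }
        }
        throw new IllegalArgumentException("Missing ';'");
    }

    public static void main(String[] args) {
        System.out.println(decodeCharRef("x41;"));
        System.out.println(decodeCharRef("65;"));
    }
}
```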
    • handleStartElement

      protected abstract int handleStartElement(byte b) throws XMLStreamException
      Parsing of start element requires parsing of the element name (and attribute names), and is thus encoding-specific.
      Throws:
      XMLStreamException
    • handleEndElement

      protected final int handleEndElement() throws XMLStreamException
      Note that this method can currently be shared by all ASCII-based encodings, and at least by UTF-8 and ISO-Latin1. Because the exact bytes to match are already known, there is no danger of encountering invalid encodings. For now, the method therefore stays in the base class.
      Throws:
      XMLStreamException
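
      The reasoning for handleEndElement can be sketched as a plain byte comparison: the name bytes recorded for the open start element are matched one by one against the input, so no decoding is needed. This is a simplified illustration under stated assumptions, not the actual method: EndElementMatchSketch and matchEndTag are hypothetical names, and the whole tag is assumed to be present in a single buffer.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of StreamScanner.
class EndElementMatchSketch {
    /**
     * Matches the expected name bytes at 'ptr' (the position just after "</"),
     * skips optional trailing white space, and requires a closing '>'.
     * Returns the index just past '>', or -1 on any mismatch.
     */
    static int matchEndTag(byte[] in, int ptr, byte[] expectedName) {
        if (ptr + expectedName.length > in.length) {
            return -1; // not enough input in this buffer
        }
        for (byte expected : expectedName) {
            if (in[ptr++] != expected) {
                return -1; // byte-for-byte comparison, no decoding involved
            }
        }
        while (ptr < in.length
                && (in[ptr] == ' ' || in[ptr] == '\t' || in[ptr] == '\r' || in[ptr] == '\n')) {
            ptr++;
        }
        return (ptr < in.length && in[ptr] == '>') ? ptr + 1 : -1;
    }

    public static void main(String[] args) {
        byte[] input = "</root>".getBytes(StandardCharsets.UTF_8);
        byte[] name = "root".getBytes(StandardCharsets.UTF_8);
        System.out.println(matchEndTag(input, 2, name));
    }
}
```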
    • handleEndElementSlow

      private final int handleEndElementSlow(int size) throws XMLStreamException
      Throws:
      XMLStreamException
    • parsePName

      protected final PName parsePName(byte b) throws XMLStreamException
      This method can (for now?) be shared between all ASCII-based encodings, since it only does coarse validity checking; the real checks are done in a different method.

      Some notes about the assumptions the implementation makes:

      • Well-formed XML content cannot end with a name: as such, end-of-input is an error and we can throw an exception
      Throws:
      XMLStreamException
    • parsePNameMedium

      protected PName parsePNameMedium(int i2, int q1) throws XMLStreamException
      Throws:
      XMLStreamException
    • parsePNameLong

      protected final PName parsePNameLong(int q, int[] quads) throws XMLStreamException
      Throws:
      XMLStreamException
    • parsePNameSlow

      protected final PName parsePNameSlow(byte b) throws XMLStreamException
      Throws:
      XMLStreamException
    • findPName

      private final PName findPName(int onlyQuad, int lastByteCount) throws XMLStreamException
      Method called to process a sequence of bytes that is likely to be a PName. At this point we encountered an end marker, and may either hit a formerly seen well-formed PName; an as-of-yet unseen well-formed PName; or a non-well-formed sequence (containing one or more non-name chars without any valid end markers).
      Parameters:
      onlyQuad - Word with 1 to 4 bytes that make up PName
      lastByteCount - Number of actual bytes contained in onlyQuad; 0 to 3.
      Throws:
      XMLStreamException
    • findPName

      private final PName findPName(int firstQuad, int secondQuad, int lastByteCount) throws XMLStreamException
      Method called to process a sequence of bytes that is likely to be a PName. At this point we encountered an end marker, and may either hit a formerly seen well-formed PName; an as-of-yet unseen well-formed PName; or a non-well-formed sequence (containing one or more non-name chars without any valid end markers).
      Parameters:
      firstQuad - First 1 to 4 bytes of the PName
      secondQuad - Word with last 1 to 4 bytes of the PName
      lastByteCount - Number of bytes contained in secondQuad; 0 to 3.
      Throws:
      XMLStreamException
    • findPName

      private final PName findPName(int lastQuad, int[] quads, int qlen, int lastByteCount) throws XMLStreamException
      Method called to process a sequence of bytes that is likely to be a PName. At this point we encountered an end marker, and may either hit a formerly seen well-formed PName; an as-of-yet unseen well-formed PName; or a non-well-formed sequence (containing one or more non-name chars without any valid end markers).
      Parameters:
      lastQuad - Word with last 0 to 3 bytes of the PName; not included in the quad array
      quads - Array that contains all the quads, except for the last one, for names with more than 8 bytes (i.e. more than 2 quads)
      qlen - Number of quads in the array, except if less than 2 (in which case only firstQuad and lastQuad are used)
      lastByteCount - Number of bytes contained in lastQuad; 0 to 3.
      Throws:
      XMLStreamException
    • findPName

      private final PName findPName(int lastQuad, int lastByteCount, int firstQuad, int qlen, int[] quads) throws XMLStreamException
      Method called to process a sequence of bytes that is likely to be a PName. At this point we encountered an end marker, and may either hit a formerly seen well-formed PName; an as-of-yet unseen well-formed PName; or a non-well-formed sequence (containing one or more non-name chars without any valid end markers).
      Parameters:
      lastQuad - Word with last 0 to 3 bytes of the PName; not included in the quad array
      lastByteCount - Number of bytes contained in lastQuad; 0 to 3.
      firstQuad - First 1 to 4 bytes of the PName (4 if length at least 4 bytes; less only if not).
      qlen - Number of quads in the array, except if less than 2 (in which case only firstQuad and lastQuad are used)
      quads - Array that contains all the quads, except for the last one, for names with more than 8 bytes (i.e. more than 2 quads)
      Throws:
      XMLStreamException
    • addPName

      protected final PName addPName(int hash, int[] quads, int qlen, int lastQuadBytes) throws XMLStreamException
      Throws:
      XMLStreamException
    • skipInternalWs

      protected byte skipInternalWs(boolean reqd, String msg) throws XMLStreamException
      Returns:
      First byte following skipped white space
      Throws:
      XMLStreamException
    • matchAsciiKeyword

      private final void matchAsciiKeyword(String keyw) throws XMLStreamException
      Throws:
      XMLStreamException
    • checkInTreeIndentation

      protected final int checkInTreeIndentation(int c) throws XMLStreamException

      Note: consecutive white space is only considered indentation if the following token looks like a tag (start/end). This is so that if a CDATA section follows, it can be coalesced in coalescing mode. Although we could check whether coalescing mode is enabled, this should seldom have a significant effect either way, so it removes one possible source of problems in coalescing mode.

      Returns:
      -1, if indentation was handled; offset in the output buffer, if not
      Throws:
      XMLStreamException
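
      The rule in the note above can be illustrated with a small lookahead check. This is a rough sketch rather than the method's real logic: IndentationSketch and looksLikeIndentation are hypothetical names, the run is assumed to be a linefeed followed by repeated spaces or repeated tabs, and "looks like a tag" is approximated as a '<' that is not followed by '!'.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of StreamScanner.
class IndentationSketch {
    /**
     * Returns true if the bytes at 'ptr' are a linefeed, then zero or more
     * repetitions of a single indent character (space or tab), then something
     * that looks like a start or end tag. A following "<!" (comment or CDATA)
     * does not count, so such white space stays ordinary text.
     */
    static boolean looksLikeIndentation(byte[] in, int ptr) {
        if (ptr >= in.length || in[ptr] != '\n') {
            return false;
        }
        ptr++;
        if (ptr < in.length && (in[ptr] == ' ' || in[ptr] == '\t')) {
            byte indentChar = in[ptr];
            while (ptr < in.length && in[ptr] == indentChar) {
                ptr++;
            }
        }
        // Need '<' plus one more byte to rule out "<!"
        return ptr + 1 < in.length && in[ptr] == '<' && in[ptr + 1] != '!';
    }

    public static void main(String[] args) {
        byte[] tag = "\n  <a>".getBytes(StandardCharsets.UTF_8);
        byte[] cdata = "\n  <![CDATA[x]]>".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeIndentation(tag, 0));
        System.out.println(looksLikeIndentation(cdata, 0));
    }
}
```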
    • checkPrologIndentation

      protected final int checkPrologIndentation(int c) throws XMLStreamException
      Returns:
      -1, if indentation was handled; offset in the output buffer, if not
      Throws:
      XMLStreamException
    • loadMore

      protected final boolean loadMore() throws XMLStreamException
      Specified by:
      loadMore in class XmlScanner
      Throws:
      XMLStreamException
    • nextByte

      protected final byte nextByte(int tt) throws XMLStreamException
      Throws:
      XMLStreamException
    • nextByte

      protected final byte nextByte() throws XMLStreamException
      Throws:
      XMLStreamException
    • loadOne

      protected final byte loadOne() throws XMLStreamException
      Throws:
      XMLStreamException
    • loadOne

      protected final byte loadOne(int type) throws XMLStreamException
      Throws:
      XMLStreamException
    • loadAndRetain

      protected final boolean loadAndRetain(int nrOfChars) throws XMLStreamException
      Throws:
      XMLStreamException