Class BaseParser

    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static int A  
      protected static byte ASCII_CR
      ASCII code for carriage return.
      protected static byte ASCII_LF
      ASCII code for line feed.
      protected static int B  
      protected static int D  
      static String DEF
      This is a string constant that will be used for comparisons.
      protected COSDocument document
      This is the document that will be parsed.
      protected static int E  
      protected static String ENDOBJ_STRING
      This is a string constant that will be used for comparisons.
      protected static String ENDSTREAM_STRING
      This is a string constant that will be used for comparisons.
      protected static int J  
      protected static int M  
      protected static int N  
      protected static int O  
      protected static int R  
      protected static int S  
      protected com.tom_roush.pdfbox.pdfparser.SequentialSource seqSource
      This is the stream that will be read from.
      protected static String STREAM_STRING
      This is a string constant that will be used for comparisons.
      protected static int T  
    • Constructor Summary

      Constructors 
      Constructor Description
      BaseParser​(com.tom_roush.pdfbox.pdfparser.SequentialSource pdfSource)
      Creates a parser that reads from the given source.
    • Method Summary

      Modifier and Type Method Description
      protected boolean isClosing()
      This will tell if the next character is a closing bracket (close of PDF array).
      protected boolean isClosing​(int c)
      This will tell if the given character is a closing bracket (close of PDF array).
      protected boolean isDigit()
      This will tell if the next byte is a digit or not.
      protected static boolean isDigit​(int c)
      This will tell if the given value is a digit or not.
      protected boolean isEndOfName​(int ch)
      Determine if a character terminates a PDF name.
      protected boolean isEOL()
      This will tell if the next byte to be read is an end of line byte.
      protected boolean isEOL​(int c)
      This will tell if the given byte is an end of line byte.
      protected boolean isSpace()
      This will tell if the next byte is a space or not.
      protected boolean isSpace​(int c)
      This will tell if the given value is a space or not.
      protected boolean isWhitespace()
      This will tell if the next byte is whitespace or not.
      protected boolean isWhitespace​(int c)
      This will tell if a character is whitespace or not.
      protected COSBoolean parseBoolean()
      This will parse a boolean object from the stream.
      protected COSArray parseCOSArray()
      This will parse a PDF array object.
      protected COSDictionary parseCOSDictionary()
      This will parse a PDF dictionary.
      protected COSName parseCOSName()
      This will parse a PDF name from the stream.
      protected COSString parseCOSString()
      This will parse a PDF string.
      protected COSBase parseDirObject()
      This will parse a direct object from the stream.
      protected void readExpectedChar​(char ec)
      Read one char and throw an exception if it is not the expected value.
      protected void readExpectedString​(char[] expectedString, boolean skipSpaces)
      Reads given pattern from seqSource.
      protected void readExpectedString​(String expectedString)
      Read one String and throw an exception if it is not the expected value.
      protected int readGenerationNumber()
      This will read an integer from the stream and throw an IllegalArgumentException if the integer value is greater than the maximum object revision.
      protected int readInt()
      This will read an integer from the stream.
      protected String readLine()
      This will read bytes until the first end of line marker occurs.
      protected long readLong()
      This will read a long from the stream.
      protected int readObjectNumber()
      This will read a long from the stream and throw an IOException if the value is negative or has more than 10 digits (i.e. is bigger than OBJECT_NUMBER_THRESHOLD).
      protected String readString()
      This will read the next string from the stream.
      protected String readString​(int length)
      This will read the next string from the stream up to a certain length.
      protected StringBuilder readStringNumber()
      This method is used to read a token by the readInt() method and the readLong() method.
      protected void skipSpaces()
      This will skip all spaces and comments that are present.
      protected void skipWhiteSpace()  
    • Constructor Detail

      • BaseParser

        public BaseParser​(com.tom_roush.pdfbox.pdfparser.SequentialSource pdfSource)
        Creates a parser that reads from the given source.
    • Method Detail

      • parseCOSDictionary

        protected COSDictionary parseCOSDictionary()
                                            throws IOException
        This will parse a PDF dictionary.
        Returns:
        The parsed dictionary.
        Throws:
        IOException - If there is an error reading the stream.
      • parseCOSString

        protected COSString parseCOSString()
                                    throws IOException
        This will parse a PDF string.
        Returns:
        The parsed PDF string.
        Throws:
        IOException - If there is an error reading from the stream.
      • parseCOSArray

        protected COSArray parseCOSArray()
                                  throws IOException
        This will parse a PDF array object.
        Returns:
        The parsed PDF array.
        Throws:
        IOException - If there is an error parsing the stream.
      • isEndOfName

        protected boolean isEndOfName​(int ch)
        Determine if a character terminates a PDF name.
        Parameters:
        ch - The character
        Returns:
        true if the character terminates a PDF name, otherwise false.
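As a rough illustration of what "terminates a PDF name" can mean, the sketch below treats end of input plus the white-space and delimiter characters of ISO 32000-1 as terminators. The class NameEndSketch is hypothetical and is not taken from BaseParser, whose exact character set may differ.

```java
// Hypothetical stand-alone sketch; not the BaseParser implementation.
final class NameEndSketch {
    // Delimiter characters listed in ISO 32000-1.
    private static final String DELIMITERS = "()<>[]{}/%";

    /** Returns true if ch is end of input, PDF white space, or a delimiter. */
    static boolean isEndOfName(int ch) {
        if (ch == -1) {
            return true; // end of input also ends the name
        }
        boolean whitespace =
                ch == 0 || ch == 9 || ch == 10 || ch == 12 || ch == 13 || ch == 32;
        return whitespace || DELIMITERS.indexOf(ch) >= 0;
    }

    public static void main(String[] args) {
        System.out.println("'/' ends a name: " + isEndOfName('/'));
        System.out.println("'A' ends a name: " + isEndOfName('A'));
    }
}
```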
      • parseCOSName

        protected COSName parseCOSName()
                                throws IOException
        This will parse a PDF name from the stream.
        Returns:
        The parsed PDF name.
        Throws:
        IOException - If there is an error reading from the stream.
      • parseBoolean

        protected COSBoolean parseBoolean()
                                   throws IOException
        This will parse a boolean object from the stream.
        Returns:
        The parsed boolean object.
        Throws:
        IOException - If an IO error occurs during parsing.
      • parseDirObject

        protected COSBase parseDirObject()
                                  throws IOException
        This will parse a direct object from the stream.
        Returns:
        The parsed object.
        Throws:
        IOException - If there is an error during parsing.
      • readString

        protected String readString()
                             throws IOException
        This will read the next string from the stream.
        Returns:
        The string that was read from the stream.
        Throws:
        IOException - If there is an error reading from the stream.
      • readExpectedString

        protected void readExpectedString​(String expectedString)
                                   throws IOException
        Read one String and throw an exception if it is not the expected value.
        Parameters:
        expectedString - the String value that is expected.
        Throws:
        IOException - if the string that was read is not the expected value or if an I/O error occurs.
      • readExpectedString

        protected final void readExpectedString​(char[] expectedString,
                                                boolean skipSpaces)
                                         throws IOException
        Reads the given pattern from seqSource, optionally skipping whitespace before and after it.
        Parameters:
        expectedString - pattern to be skipped
        skipSpaces - if set to true spaces before and after the string will be skipped
        Throws:
        IOException - if pattern could not be read
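The pattern check described above might look roughly like the following. This is a minimal sketch that assumes a plain java.io.InputStream in place of seqSource and leaves out the optional whitespace skipping; the class ExpectedStringSketch and its method readExpected are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; a plain InputStream stands in for seqSource.
final class ExpectedStringSketch {
    /** Consumes expected from in, or throws IOException on the first mismatch. */
    static void readExpected(InputStream in, char[] expected) throws IOException {
        for (char c : expected) {
            int read = in.read();
            if (read != c) {
                throw new IOException("Expected '" + c + "' but read "
                        + (read == -1 ? "end of stream" : "'" + (char) read + "'"));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in =
                new ByteArrayInputStream("endobj".getBytes(StandardCharsets.US_ASCII));
        readExpected(in, "endobj".toCharArray());
        System.out.println("pattern matched");
    }
}
```

Failing on the first mismatching byte keeps the error message specific to the position where the input diverged from the pattern.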
      • readExpectedChar

        protected void readExpectedChar​(char ec)
                                 throws IOException
        Read one char and throw an exception if it is not the expected value.
        Parameters:
        ec - the char value that is expected.
        Throws:
        IOException - if the read char is not the expected value or if an I/O error occurs.
      • readString

        protected String readString​(int length)
                             throws IOException
        This will read the next string from the stream up to a certain length.
        Parameters:
        length - The length to stop reading at.
        Returns:
        The string that was read from the stream, between 0 and length characters long.
        Throws:
        IOException - If there is an error reading from the stream.
      • isClosing

        protected boolean isClosing()
                             throws IOException
        This will tell if the next character is a closing bracket (close of PDF array).
        Returns:
        true if the next byte is ']', false otherwise.
        Throws:
        IOException - If an IO error occurs.
      • isClosing

        protected boolean isClosing​(int c)
        This will tell if the given character is a closing bracket (close of PDF array).
        Parameters:
        c - The character to check against the closing bracket
        Returns:
        true if the character is ']', false otherwise.
      • readLine

        protected String readLine()
                           throws IOException
        This will read bytes until the first end of line marker occurs. NOTE: The EOL marker may consist of 1 (CR or LF) or 2 (CR and LF) bytes, which is an important detail if one wants to unread the line.
        Returns:
        The characters between the current position and the end of the line.
        Throws:
        IOException - If there is an error reading from the stream.
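The one-or-two-byte EOL handling in the note above can be sketched as follows. The example assumes a java.io.PushbackInputStream in place of seqSource and does not reproduce any end-of-file checks the real method may perform; ReadLineSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; PushbackInputStream stands in for seqSource.
final class ReadLineSketch {
    /** Reads up to the next EOL marker and consumes a CR, an LF, or a CR LF pair. */
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\r' && c != '\n') {
            line.append((char) c);
        }
        if (c == '\r') {
            int next = in.read();
            if (next != '\n' && next != -1) {
                in.unread(next); // lone CR: the following byte belongs to the next line
            }
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "one\r\ntwo\rthree\nfour".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        for (int i = 0; i < 4; i++) {
            System.out.println(readLine(in));
        }
    }
}
```

Because the marker may be one or two bytes long, code that wants to unread a line has to remember how many EOL bytes were consumed, not only the length of the returned string.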
      • isEOL

        protected boolean isEOL()
                         throws IOException
        This will tell if the next byte to be read is an end of line byte.
        Returns:
        true if the next byte is 0x0A or 0x0D.
        Throws:
        IOException - If there is an error reading from the stream.
      • isEOL

        protected boolean isEOL​(int c)
        This will tell if the given byte is an end of line byte.
        Parameters:
        c - The character to check against end of line
        Returns:
        true if the byte is 0x0A or 0x0D.
      • isWhitespace

        protected boolean isWhitespace()
                                throws IOException
        This will tell if the next byte is whitespace or not.
        Returns:
        true if the next byte in the stream is a whitespace character.
        Throws:
        IOException - If there is an error reading from the stream.
      • isWhitespace

        protected boolean isWhitespace​(int c)
        This will tell if a character is whitespace or not. These values are specified in table 1 (page 12) of ISO 32000-1:2008.
        Parameters:
        c - The character to check against whitespace
        Returns:
        true if the character is a whitespace character.
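For reference, the six white-space characters in table 1 of ISO 32000-1:2008 are NUL, horizontal tab, line feed, form feed, carriage return and space. The sketch below encodes that table together with the two end-of-line bytes documented for isEOL; WhitespaceSketch is a hypothetical stand-alone class and may not match the BaseParser source line for line.

```java
// Hypothetical stand-alone sketch; not the BaseParser implementation.
final class WhitespaceSketch {
    /** White-space characters of ISO 32000-1 table 1: NUL, HT, LF, FF, CR, SP. */
    static boolean isWhitespace(int c) {
        switch (c) {
            case 0x00: // NUL
            case 0x09: // horizontal tab
            case 0x0A: // line feed
            case 0x0C: // form feed
            case 0x0D: // carriage return
            case 0x20: // space
                return true;
            default:
                return false;
        }
    }

    /** End-of-line bytes as documented for isEOL: 0x0A or 0x0D. */
    static boolean isEol(int c) {
        return c == 0x0A || c == 0x0D;
    }

    public static void main(String[] args) {
        System.out.println("form feed is whitespace: " + isWhitespace(0x0C));
        System.out.println("form feed is EOL: " + isEol(0x0C));
    }
}
```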
      • isSpace

        protected boolean isSpace()
                           throws IOException
        This will tell if the next byte is a space or not.
        Returns:
        true if the next byte in the stream is a space character.
        Throws:
        IOException - If there is an error reading from the stream.
      • isSpace

        protected boolean isSpace​(int c)
        This will tell if the given value is a space or not.
        Parameters:
        c - The character to check against space
        Returns:
        true if the given value is a space character.
      • isDigit

        protected boolean isDigit()
                           throws IOException
        This will tell if the next byte is a digit or not.
        Returns:
        true if the next byte in the stream is a digit.
        Throws:
        IOException - If there is an error reading from the stream.
      • isDigit

        protected static boolean isDigit​(int c)
        This will tell if the given value is a digit or not.
        Parameters:
        c - The character to be checked
        Returns:
        true if the given value is a digit.
      • skipSpaces

        protected void skipSpaces()
                           throws IOException
        This will skip all spaces and comments that are present.
        Throws:
        IOException - If there is an error reading from the stream.
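In PDF syntax a comment starts with '%' and runs to the end of the line, so skipping "spaces and comments" could be sketched as below. The example assumes a java.io.PushbackInputStream in place of seqSource, and SkipSpacesSketch is a hypothetical name; the real method may differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; PushbackInputStream stands in for seqSource.
final class SkipSpacesSketch {
    static boolean isWhitespace(int c) {
        return c == 0 || c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
    }

    /** Skips white space and '%' comments; leaves the stream at the next token byte. */
    static void skipSpaces(PushbackInputStream in) throws IOException {
        int c = in.read();
        while (isWhitespace(c) || c == '%') {
            if (c == '%') {
                // A comment runs to the end of the line (or the end of input).
                do {
                    c = in.read();
                } while (c != -1 && c != '\r' && c != '\n');
            } else {
                c = in.read();
            }
        }
        if (c != -1) {
            in.unread(c); // put back the first byte that is neither space nor comment
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = " % a comment\n  /Type".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        skipSpaces(in);
        System.out.println("next byte: " + (char) in.read());
    }
}
```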
      • readObjectNumber

        protected int readObjectNumber()
                                throws IOException
        This will read a long from the stream and throw an IOException if the value is negative or has more than 10 digits (i.e. is bigger than OBJECT_NUMBER_THRESHOLD).
        Returns:
        the object number being read.
        Throws:
        IOException - if an I/O error occurs
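The range check described above can be illustrated with a small stand-alone sketch. The value given to OBJECT_NUMBER_THRESHOLD here (the smallest number with more than 10 digits) is an assumption inferred from the description, and ObjectNumberSketch and checkObjectNumber are hypothetical names.

```java
import java.io.IOException;

// Hypothetical stand-alone sketch; not the BaseParser implementation.
final class ObjectNumberSketch {
    // Assumed value: the smallest number that has more than 10 digits.
    static final long OBJECT_NUMBER_THRESHOLD = 10_000_000_000L;

    /** Validates an already parsed object number against the documented limits. */
    static long checkObjectNumber(long value) throws IOException {
        if (value < 0 || value >= OBJECT_NUMBER_THRESHOLD) {
            throw new IOException(
                    "Object number '" + value + "' is negative or has more than 10 digits");
        }
        return value;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(checkObjectNumber(42L));
        try {
            checkObjectNumber(-1L);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```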
      • readInt

        protected int readInt()
                       throws IOException
        This will read an integer from the stream.
        Returns:
        The integer that was read from the stream.
        Throws:
        IOException - If there is an error reading from the stream.
      • readLong

        protected long readLong()
                         throws IOException
        This will read a long from the stream.
        Returns:
        The long that was read from the stream.
        Throws:
        IOException - If there is an error reading from the stream.
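The method summary notes that readInt() and readLong() both obtain their token from readStringNumber(). A minimal sketch of that two-step pattern, assuming a java.io.PushbackInputStream in place of seqSource, is shown here; ReadLongSketch and readNumberToken are hypothetical names, and the real token rules may be broader.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; PushbackInputStream stands in for seqSource.
final class ReadLongSketch {
    /** Collects an optional sign followed by digits; stops at the first other byte. */
    static StringBuilder readNumberToken(PushbackInputStream in) throws IOException {
        StringBuilder token = new StringBuilder();
        int c = in.read();
        if (c == '+' || c == '-') {
            token.append((char) c);
            c = in.read();
        }
        while (c >= '0' && c <= '9') {
            token.append((char) c);
            c = in.read();
        }
        if (c != -1) {
            in.unread(c); // leave the terminating byte for the caller
        }
        return token;
    }

    /** Parses the token as a long and reports malformed input as an IOException. */
    static long readLong(PushbackInputStream in) throws IOException {
        String token = readNumberToken(in).toString();
        try {
            return Long.parseLong(token);
        } catch (NumberFormatException e) {
            throw new IOException("Expected a long but found '" + token + "'", e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "12345 0 obj".getBytes(StandardCharsets.US_ASCII);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        System.out.println(readLong(in));
    }
}
```

Wrapping NumberFormatException in an IOException keeps the method's contract to a single checked exception type, matching the Throws clause documented above.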