Made fixes for really broken but readable files. #32
Can now open and parse files with an incorrect startxref and incorrect stream lengths. Note that Adobe Acrobat X 10.0.0 will open these files and prompt to save when closed.
When the startxref is incorrect, the parser looks for the 'trailer' symbol and uses that trailer. Otherwise the xref table is not rebuilt. Once the trailer is found, the parser scans the entire file, records the location of each object, and places a new PdfReference for it in the PdfCrossReferenceTable.
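The rebuild step can be illustrated with a minimal sketch. This is Python rather than the project's own code, and the function name `rebuild_xref` and the returned dictionary are hypothetical stand-ins for the PdfReference entries described above. Treating a missing 'trailer' as "do not rebuild" is an assumption about the intended behavior.

```python
import re

# Matches an indirect object header such as "12 0 obj".
# The lookbehind keeps the match from starting in the middle of a number.
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")


def rebuild_xref(data: bytes) -> dict[tuple[int, int], int]:
    """Map (object number, generation) to the byte offset of its header.

    Hypothetical helper: returns an empty table when no 'trailer' keyword
    is present, mirroring the "no trailer, no rebuild" assumption.
    """
    if data.rfind(b"trailer") == -1:
        return {}
    table: dict[tuple[int, int], int] = {}
    for match in OBJ_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        # A later definition of the same object overwrites an earlier one.
        table[key] = match.start()
    return table
```

A naive scan like this can also match byte sequences that merely look like object headers inside stream data, so a real implementation would likely need to skip over stream bodies.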
Will also attempt to correct invalid stream lengths. After the stream length is pulled from the object, we first check whether the 'endstream' symbol is where that length says it should be. If it is not, we look for the next valid 'endstream' symbol after the 'stream' symbol, and use its index to set the length of the stream.
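As a rough illustration of the length check, here is a minimal Python sketch. The function name and parameters are hypothetical and not part of the library; `stream_start` is assumed to be the offset of the first byte of stream data.

```python
def corrected_stream_length(data: bytes, stream_start: int, declared_length: int) -> int:
    """Return a stream length that agrees with the position of 'endstream'.

    Hypothetical helper for illustration only.
    """
    # Position where 'endstream' should appear if the declared length is right,
    # allowing for an end-of-line marker before the keyword.
    pos = stream_start + declared_length
    while pos < len(data) and data[pos:pos + 1] in (b"\r", b"\n"):
        pos += 1
    if data.startswith(b"endstream", pos):
        return declared_length

    # Declared length is wrong: fall back to the next 'endstream' after the
    # start of the stream data and derive the length from its index.
    found = data.find(b"endstream", stream_start)
    if found == -1:
        raise ValueError("no 'endstream' keyword after the stream start")
    # Note: this length includes any end-of-line bytes before 'endstream'.
    return found - stream_start
```

The fallback length deliberately keeps any trailing end-of-line bytes, because stripping them could also remove a legitimate final byte of binary data.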
Note 1: PDF files with a compressed trailer object are not handled yet.
Note 2: Not tested with versioned files; these will probably still fail.
Note 3: On an invalid stream length, this should probably check 1k chunks of data for 'endstream'. It currently checks within the invalid length and, if 'endstream' is not found, loads the rest of the file and checks again.
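The chunked search suggested in Note 3 might look like the following Python sketch. It is not what the change currently does. The name `find_endstream` and the 1024-byte default are assumptions for illustration. A small overlap is carried between chunks so a keyword that straddles a chunk boundary is still found.

```python
import io

KEYWORD = b"endstream"


def find_endstream(f, start: int, chunk_size: int = 1024) -> int:
    """Return the absolute offset of the first 'endstream' at or after start.

    Reads the file-like object f in fixed-size chunks; returns -1 if the
    keyword is not found. Hypothetical helper for illustration only.
    """
    f.seek(start)
    tail = b""    # last few bytes of the previous buffer
    base = start  # absolute offset of the first byte of the next buffer
    while True:
        chunk = f.read(chunk_size)
        if not chunk:
            return -1
        buf = tail + chunk
        idx = buf.find(KEYWORD)
        if idx != -1:
            return base + idx
        # Keep len(KEYWORD) - 1 bytes so a split keyword can still match.
        tail = buf[-(len(KEYWORD) - 1):]
        base += len(buf) - len(tail)
```

For example, `find_endstream(io.BytesIO(data), stream_start)` searches an in-memory buffer, and an open binary file works the same way.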