PDFSharp Fixes #39

mlaukala · 2017-12-20T23:35:41Z

This is my Release branch, which I will be working from going forward. I tried to make the fixes modular so that the changes from previous pull requests would be easy to identify.

As I work with PDFs from many PDF producers, I have been able to make numerous fixes for out-of-spec PDFs. If Adobe can read the file, my app must be able to read it as well. As a result, dealing with merge conflicts on my Release branch is becoming a major issue.

TH-Soft · 2018-03-23T08:37:05Z

Thanks for all your code changes. I plan to incorporate them after a stable version of PDFsharp 1.50 has been published. It would be nice if you could provide non-confidential PDFs that allow us to evaluate the changes.

mlaukala · 2018-04-03T20:24:54Z

I'll need to confirm that I can provide the PDFs. Could be a couple of days.

mlaukala · 2018-05-30T18:41:39Z

I've been extremely busy over the past few months, and things have finally slowed down a bit. I cannot provide PDFs at this time that will reproduce the errors I was getting. I do hope to create example PDFs that reproduce them, which will make it easier for me to do my testing when updates and new fixes are implemented. Before that happens, I'll try to find the time to revisit the fixes I have made and write better comments for them, describing the exact issue and the PDF producer that caused it, along with a link to the relevant section of the Adobe spec.

mlaukala · 2018-06-06T17:03:26Z

This latest fix is for an invalid startxref byte offset. If the xref table cannot be found at the specified byte offset, it is assumed that all byte offsets are incorrect, and the xref table and trailer are rebuilt.
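
For readers unfamiliar with this kind of recovery, here is a minimal Python sketch of the general idea. It is not the PDFsharp code from this PR: the function names are hypothetical, only classic `xref` tables are recognised (cross-reference streams are ignored), and the scan is naive enough that it could be fooled by `N G obj` text inside stream data.

```python
import re

# "N G obj" header of an indirect object; the lookbehind avoids matching
# in the middle of a longer number.
OBJ_HEADER = re.compile(rb"(?<![0-9])(\d+)\s+(\d+)\s+obj\b")


def read_startxref(data: bytes):
    """Return the byte offset written after the last 'startxref', or None."""
    pos = data.rfind(b"startxref")
    if pos < 0:
        return None
    m = re.match(rb"startxref\s+(\d+)", data[pos:])
    return int(m.group(1)) if m else None


def needs_rebuild(data: bytes) -> bool:
    """True if no classic xref table starts at the startxref offset."""
    offset = read_startxref(data)
    return offset is None or data[offset:offset + 4] != b"xref"


def rebuild_offsets(data: bytes) -> dict:
    """Map (object number, generation) to byte offset by scanning the file.

    Later definitions overwrite earlier ones, so the last copy of an
    object in the file wins.
    """
    offsets = {}
    for m in OBJ_HEADER.finditer(data):
        offsets[(int(m.group(1)), int(m.group(2)))] = m.start(1)
    return offsets
```

A caller would run `needs_rebuild` first and fall back to `rebuild_offsets` only when it returns true, so well-formed files keep using their own table.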

mlaukala · 2018-06-06T17:15:23Z

Made an amendment that makes sure the latest generation root/catalog is used.

mlaukala · 2019-04-02T19:30:05Z

The endstream checks were not looking for an EOL char before the endstream keyword, which caused massive slowdowns when reading huge PDF files with a lot of streams.
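
As an illustration only (a Python sketch, not the code in this PR), a check of this kind can test for an optional end-of-line marker followed by `endstream` directly at the position implied by the stream's declared length, instead of searching forward through the data. The function name is hypothetical.

```python
def endstream_follows(data: bytes, pos: int) -> bool:
    """True if data[pos:] starts with an optional EOL (CRLF, LF or CR)
    followed by the 'endstream' keyword."""
    for eol in (b"\r\n", b"\n", b"\r", b""):
        if data.startswith(eol + b"endstream", pos):
            return True
    return False
```

Here `pos` would be the stream's start offset plus its declared `/Length`; a false result suggests the length is wrong and a slower recovery path is needed.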

aggsol · 2019-07-10T08:10:48Z

Will this ever be merged?

mlaukala · 2019-07-10T15:13:12Z

> Will this ever be merged?

Sadly, probably not. I am not able to supply them with PDFs that reproduce the errors, nor do I have the time to create sample PDFs that duplicate them. It's on my very long list of things to do, but it's not high priority, so it keeps getting pushed back.

leonardobaggio · 2019-07-25T14:19:43Z

Hi @mlaukala, amazing work on this PR, thank you!
This issue with corrupted PDFs has been a long-time headache for me. I'm planning to use your forked version of PDFsharp, but I don't know if there is another way to use it besides referencing it directly in my solution and building it locally. Do you have any suggestions for achieving an integration similar to the one NuGet provides, but using this fork?

mlaukala · 2019-07-25T15:40:56Z

I do not, sorry.

ken-sands · 2019-09-25T18:06:22Z

Applying these fixes actually caused PDFs to corrupt on saving/reopening for me.
With these in place, opening and saving a PDF, then opening and saving it again, would end up with elements missing, colours inverted, all sorts of stuff. If the PDF is opened and saved with pdftk or similar after each save, it can be brought back from death. While it looks like a great effort towards handling PDFs with issues, it currently causes more issues than it solves for us.

mlaukala · 2019-09-25T20:01:56Z

Care to provide a sample PDF and code? I would love to attempt to work out what is causing the problem. We work with thousands of PDFs a day. 99% of the time, none of our PDFs have any issues. Of the ones that do fail, it's usually a result of the PDF producer not following the PDF specification.

ken-sands · 2019-09-26T08:49:42Z

I'll have to edit one to remove details, and I will need you to agree to use it for testing only, delete it after testing, and not pass it on at all, but yes. Is there a way to directly message you with a PDF?

ken-sands · 2019-09-26T09:29:20Z

I've chopped the personalised pages from my PDF so I can share it with you (though it's still a customer document, so I can't make it public, unfortunately). My Discord tag is captain_ken#8332. I'm UK based (GMT timezone). I've just run a test on the chopped PDF with a fresh build of your release (just in case it was other tweaks I have that were causing it). Same issues persist.

mlaukala · 2019-09-27T19:52:07Z

I just sent you a request on Discord; I am MJ#2945. I'll be out of town until Sunday evening. I should be able to take a look at some point next week.

ken-sands · 2019-09-28T14:19:59Z

Yep, cool. I've sent you PDFs and rambling details on what happens, which should be enough for you to get the same results.

alshezawi · 2020-06-16T13:16:40Z

Thank you.
Your fork helped me solve the Unexpected character '0xffff' in PDF stream and PDF corrupted errors.

Fixes an issue related to importing an iref object in a specific PDF document. The issue appeared with changes in empira/PDFsharp@4d1b3f0. See empira/PDFsharp#39 for discussion.

mlaukala force-pushed the Release branch 2 times, most recently from e6fe29b to f595bd0 on February 8, 2018 17:55

mlaukala force-pushed the Release branch from f595bd0 to 56e5640 on May 30, 2018 17:21

mlaukala force-pushed the Release branch from 30b3ad6 to 236c74c on June 6, 2018 17:13

mlaukala added 4 commits April 2, 2019 12:15

cumulative fixes

4d1b3f0

Added fix for 0 position xref entries not being marked as free.

fcdfd3d

Proper fix for invalid startxref.

a3c77c2

Fixed a bug where ScanName() would not correctly identify the BeginDictionary delimiter if the next char was a '.'

dc61080

mlaukala force-pushed the Release branch from c161c97 to b2a4beb on April 2, 2019 19:25

Resolved bad end stream checks that slowed stream object reading.

3f65706

mlaukala force-pushed the Release branch from b2a4beb to 3f65706 on April 3, 2019 21:13

mlaukala added 2 commits April 30, 2019 11:13

Added DEBUG conditional around a try catch block. Helps with debugging to not catch and rethrow.

2f5ed1e

GetKids no longer assumes page if missing type when dictionary contains the /Kids key.

6ddd84a

When checking for a valid stream length, now also checks for endstream with CRLF.

8c45b47

mlaukala added 5 commits November 12, 2019 16:04

Set up CI with Azure Pipelines

3b91dd0

Auto build and nuget [skip ci]

Merge branch 'master' of https://github.com/MLaukala/PDFsharp

f3622fa

Update azure-pipelines.yml for Azure Pipelines

47089a1

Update azure-pipelines.yml for Azure Pipelines

1bb0cff

Fixed another out of spec pdf issue.

3f5e669

Fixed a casting where the types were incompatible. Out of spec pdf fix.

8145e7b

mlaukala force-pushed the Release branch 3 times, most recently from 1d9fe73 to 8145e7b on September 9, 2021 03:06
