PDFSharp Fixes #39

Open
wants to merge 14 commits into
base: master

mlaukala

This is my Release branch, which I will be working from going forward. I tried to make the fixes modular, in the hope that the changes from my previous pull requests would be easy to identify.

As I work with PDFs from many PDF producers, I have been able to make numerous fixes for out-of-spec PDFs. If Adobe can read the file, my app must be able to read it as well. As a result, dealing with merge conflicts on my Release branch is becoming a major issue.

@mlaukala mlaukala force-pushed the Release branch 2 times, most recently from e6fe29b to f595bd0 Compare February 8, 2018 17:55
@TH-Soft
Contributor

TH-Soft commented Mar 23, 2018

Thanks for all your code changes. I plan to incorporate them after a stable version of PDFsharp 1.50 has been published. It would be nice if you could provide non-confidential PDFs that allow us to evaluate the changes.

@mlaukala
Author

mlaukala commented Apr 3, 2018

I'll need to confirm that I can provide the PDFs. Could be a couple of days.

@mlaukala
Author

I've been extremely busy over the past few months and things have finally slowed down a bit. I cannot provide PDFs at this time that will reproduce the errors I was getting. I do hope to create example PDFs that will reproduce the errors. This will make it easier for me to do my testing when updates and new fixes are implemented. Before that happens, I'll try to find the time to revisit the fixes that I have made and write better comments for them, describing the exact issue and the PDF producer that caused it, along with a link to the relevant section of the Adobe spec.

@mlaukala
Author

mlaukala commented Jun 6, 2018

This latest fix is for an invalid startxref byte offset. If the xref table cannot be found at the specified byte offset, it is assumed that all byte offsets are incorrect, and the xref table and trailer are rebuilt.
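
For readers unfamiliar with this kind of recovery, here is a minimal Python sketch of the general idea described above. It is not the code from this PR: the function names are hypothetical, it only recognises classic `xref` tables (not cross-reference streams), and the naive scan can be fooled by text inside stream data that happens to look like an object header.

```python
import re

# "<object number> <generation> obj" at the start of the file or of a line.
# Hypothetical pattern for illustration; real parsers tokenize more carefully.
OBJ_HEADER = re.compile(rb"(?:^|[\r\n])[ \t]*(\d+)[ \t]+(\d+)[ \t]+obj\b")
STARTXREF = re.compile(rb"startxref\s+(\d+)")


def declared_xref_offset(data: bytes):
    """Return the byte offset written after the last 'startxref', or None."""
    matches = STARTXREF.findall(data)
    return int(matches[-1]) if matches else None


def xref_offset_is_valid(data: bytes) -> bool:
    """True if a classic 'xref' table really starts at the declared offset."""
    offset = declared_xref_offset(data)
    return offset is not None and data[offset:offset + 4] == b"xref"


def rebuild_xref(data: bytes) -> dict:
    """Scan the whole file and map (object number, generation) -> byte offset.

    A later definition of the same key overwrites an earlier one, so the
    last occurrence in the file wins.
    """
    table = {}
    for match in OBJ_HEADER.finditer(data):
        key = (int(match.group(1)), int(match.group(2)))
        table[key] = match.start(1)
    return table
```

A caller would check `xref_offset_is_valid(data)` first and fall back to `rebuild_xref(data)` only when it returns `False`, so well-formed files are not rescanned.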

@mlaukala
Author

mlaukala commented Jun 6, 2018

Made an amendment that makes sure the latest generation root/catalog is used.

@mlaukala
Author

mlaukala commented Apr 2, 2019

The endstream checks were not looking for an EOL character before the endstream keyword, which caused massive slowdowns when reading huge PDF files with a lot of streams.
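
As a rough illustration of what such a check looks like, here is a small Python sketch that only accepts an `endstream` keyword when it is directly preceded by a CR or LF. The function names are hypothetical and this is not the PDFsharp implementation; it shows the check itself and does not attempt to reproduce the performance problem.

```python
def find_endstream(data: bytes, start: int) -> int:
    """Return the index of the first 'endstream' after `start` that is
    directly preceded by an EOL byte (CR or LF), or -1 if there is none."""
    pos = data.find(b"endstream", start)
    while pos != -1:
        if pos > start and data[pos - 1:pos] in (b"\r", b"\n"):
            return pos
        # Not preceded by an EOL: treat it as stream data and keep looking.
        pos = data.find(b"endstream", pos + 1)
    return -1


def stream_payload(data: bytes, start: int) -> bytes:
    """Return the stream bytes beginning at `start`, without the single EOL
    marker that separates the data from the 'endstream' keyword."""
    end = find_endstream(data, start)
    if end == -1:
        raise ValueError("no EOL-terminated 'endstream' keyword found")
    payload = data[start:end]
    if payload.endswith(b"\r\n"):
        return payload[:-2]
    return payload[:-1]  # a lone CR or LF
```

With this check, a literal `endstream` that appears in the middle of the stream data is skipped rather than being mistaken for the end of the stream.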

@aggsol

aggsol commented Jul 10, 2019

Will this ever be merged?

@mlaukala
Author

Will this ever be merged?

Sadly, probably not. I am not able to supply them with PDFs that reproduce the errors, nor do I have the time to create sample PDFs that duplicate them. It's on my very long list of things to do, but it's not high priority, so it keeps getting pushed back.

@leonardobaggio

Hi @mlaukala, amazing work on this PR, thank you!
Corrupted PDFs have been a long-standing headache for me. I'm planning to use your fork of PDFsharp, but I don't know whether there is another way to consume it besides referencing the project directly in my solution and building it locally. Do you have any suggestions for getting an integration similar to what NuGet provides, but using this fork?

@mlaukala
Author

I do not, sorry.

@ken-sands

Applying these fixes actually caused PDFs to become corrupted on saving and reopening for me.
With these in place, opening and saving a PDF, then opening and saving it again, ends up with elements missing, colours inverted, all sorts of stuff. If the PDF is opened and saved with pdftk or similar after each save, it can be brought back from death. While it looks like a great effort towards handling PDFs with issues, it currently causes more issues than it solves for us.

@mlaukala
Author

Care to provide a sample PDF and code? I would love to attempt to work out what is causing the problem. We work with thousands of PDFs a day. 99% of the time, none of our PDFs have any issues. Of the ones that do fail, it's usually a result of the PDF producer not following the PDF specification.

@ken-sands

I'll have to edit one to remove details, and I'll need you to agree to use it for testing only, delete it after testing, and not pass it on at all, but yes. Is there a way to message you directly with a PDF?

@ken-sands

I've chopped the personalised pages from my PDF so I can share it with you (though it's still a customer document, so I can't make it public, unfortunately). My Discord tag is captain_ken#8332. I'm UK based (GMT timezone). I've just run a test on the chopped PDF with a fresh build of your release (just in case it was other tweaks I have that were causing it). Same issues persist.

@mlaukala
Author

I just sent you a request on Discord; I am MJ#2945. I'll be out of town until Sunday evening. I should be able to take a look at some point next week.

@ken-sands

Yep cool, I've sent you PDFs and rambling details on what happens; should be enough for you to get the same results.

@alshezawi

Thank you.
Your fork helped me solve the "Unexpected character '0xffff' in PDF stream" and "PDF corrupted" errors.

@mlaukala mlaukala force-pushed the Release branch 3 times, most recently from 1d9fe73 to 8145e7b Compare September 9, 2021 03:06
ghost2238 referenced this pull request in Vagfas/PDFsharp Apr 4, 2022
Fixes an issue related to importing an iref object in a specific PDF document.

The issue appeared with changes in empira/PDFsharp@4d1b3f0

See empira/PDFsharp#39 for discussion.
6 participants