Optimize resources while importing pages #105

Open
wants to merge 1 commit into
base: master
from

Conversation

@pchinery pchinery commented Oct 8, 2019

We came across a PDF file in which every page references a single shared resource dictionary containing all fonts and images. As a result, extracting a single page produces a very large file, because all of those fonts and images are embedded as well. We can provide this file for tests, if desired.

The code changes now treat cloning the resource dictionary differently from cloning other objects: the cloned resources are reduced to those actually used in the page content.
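To make the idea concrete for reviewers, here is a minimal sketch in Python of the reduction step. It is not the code in this PR and does not use this library's API: `used_resources`, `reduce_resources`, and the plain-dict representation of the resource dictionary are hypothetical names chosen for illustration. It assumes an already decoded content stream and only looks at the operators that take a resource name as an operand (`Tf`, `Do`, `gs`, `cs`/`CS`, `scn`/`SCN`, `sh`, `BDC`, `DP`). It does not descend into form XObjects (which can carry their own resource dictionaries), does not handle the binary data of inline images, and ignores `#xx` escapes in names.

```python
# Operator -> resource sub-dictionary its name operand is looked up in.
OPERATOR_CATEGORY = {
    "Tf": "Font", "Do": "XObject", "gs": "ExtGState",
    "cs": "ColorSpace", "CS": "ColorSpace",
    "scn": "Pattern", "SCN": "Pattern",
    "sh": "Shading", "BDC": "Properties", "DP": "Properties",
}
TRACKED = set(OPERATOR_CATEGORY.values())
WHITESPACE = b"\x00\t\n\x0c\r "
DELIMITERS = b"()<>[]{}/%"


def _is_number(token: str) -> bool:
    try:
        float(token)
        return True
    except ValueError:
        return False


def _end_of_literal_string(data: bytes, i: int) -> int:
    """Return the index just past the (possibly nested) string opening at data[i]."""
    depth = 0
    while i < len(data):
        ch = data[i]
        if ch == ord("\\"):  # a backslash escapes the next byte
            i += 2
            continue
        if ch == ord("("):
            depth += 1
        elif ch == ord(")"):
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    return i


def used_resources(content: bytes) -> dict:
    """Collect the resource names referenced by a decoded content stream."""
    used, operands = {}, []
    i, n = 0, len(content)
    while i < n:
        ch = content[i]
        if ch in WHITESPACE:
            i += 1
        elif ch == ord("%"):  # comment runs to end of line
            while i < n and content[i] not in b"\r\n":
                i += 1
        elif ch == ord("("):  # literal string: skipped, counts as one operand
            i = _end_of_literal_string(content, i)
            operands.append("")
        elif content[i:i + 2] in (b"<<", b">>"):  # inline dictionary delimiters
            i += 2
            operands.append("")
        elif ch == ord("<"):  # hex string: skipped, counts as one operand
            end = content.find(b">", i)
            i = n if end < 0 else end + 1
            operands.append("")
        elif ch in b">[]{}":
            i += 1
            operands.append("")
        else:  # a name, a number, or an operator keyword
            start = i
            i += 1
            while i < n and content[i] not in WHITESPACE and content[i] not in DELIMITERS:
                i += 1
            token = content[start:i].decode("latin-1")
            if token.startswith("/") or _is_number(token):
                operands.append(token)
                continue
            category = OPERATOR_CATEGORY.get(token)
            if category and operands:
                # Tf takes "name size"; the other operators take the name last.
                operand = operands[0] if token == "Tf" else operands[-1]
                if operand.startswith("/"):
                    used.setdefault(category, set()).add(operand[1:])
            operands = []
    return used


def reduce_resources(resources: dict, used: dict) -> dict:
    """Keep only used entries; categories not tracked above are copied as-is."""
    reduced = {}
    for category, entries in resources.items():
        if category in TRACKED and isinstance(entries, dict):
            wanted = used.get(category, set())
            kept = {name: obj for name, obj in entries.items() if name in wanted}
            if kept:
                reduced[category] = kept
        else:
            reduced[category] = entries
    return reduced
```

Names such as `/DeviceRGB` that appear after `cs` are collected too, but they drop out in `reduce_resources` because they are not keys of the page's `ColorSpace` dictionary. Skipping string contents matters: text that merely looks like `/F9 9 Tf` inside a shown string must not keep a font alive.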

There are a few questions open:

  • Are there (maybe indirect) ways to reference a resource from the content that are not considered here?
  • Is there a way to re-use the lexer/parser to identify used resources? (Currently, this is a rather hacky implementation.)
  • Are there any points that we have not considered properly here?

Any feedback is greatly appreciated and we'd love to see this ability in the main branch at some point.

2 participants