Create actor and job(s) to split a multi-page upload, create child work per-page #31

Open
seanupton opened this issue May 8, 2018 · 0 comments


Use case: a user uploads a 5-page PDF to a master work and wants each page to be findable and searchable, so each page needs its own child work (a member of the master/parent work). The goal is an actor that intervenes after the master work is created and uses the multi-page PDF to create those child works.

Assumptions

  1. Derivative creation and full-text extraction can be expensive, so as much of this processing as possible should be queued and run as job(s).
  2. There are no special work types, just a single work type that can handle either single-page or multi-page uploads.
  3. Works may be scanned or digitally produced; the only assumption is that some uploads are multi-page and some are single-page. If an upload is a single-page file (e.g. a TIFF, JP2, or single-page PDF), child works should not be created.
  4. The actor stack is the appropriate place to hook into the work creation process for the multi-page work, but actors may need to queue jobs to do most of the child work creation. This may be complicated by the fact that Hyrax also queues File Set creation asynchronously, so working from the upload directly, rather than from a file set in which it may not yet be stored, seems reasonably safe and possibly necessary.
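
A rough sketch of how assumptions 1, 3 and 4 could fit together, in plain Ruby with no Hyrax dependency. Every name here (`MultiPageSplitActor`, `CreateChildWorksJob`, `TerminatorActor`, `JOB_QUEUE`, the `env` hash keys, the `page_counter` callable) is hypothetical and stands in for whatever the real implementation would use; in Hyrax the actor would sit in the real actor stack and the job would presumably be an ActiveJob. The point is only the shape: let the rest of the stack create the parent work, count pages from the uploaded file itself rather than from a file set, and queue child work creation only for multi-page uploads.

```ruby
# Stand-in for a real job queue; an Array keeps the sketch runnable.
JOB_QUEUE = []

# Hypothetical job that would create one child work per page.
# Here it only records what would have been enqueued.
class CreateChildWorksJob
  def self.perform_later(parent_id, path, page_count)
    JOB_QUEUE << { job: name, parent_id: parent_id, path: path, page_count: page_count }
  end
end

# Hypothetical end of the actor chain: pretends the parent work was saved.
class TerminatorActor
  def create(_env)
    true
  end
end

# Hypothetical actor that runs after the parent work has been created.
class MultiPageSplitActor
  # page_counter: any callable that maps a file path to a page count.
  def initialize(next_actor, page_counter:)
    @next_actor = next_actor
    @page_counter = page_counter
  end

  def create(env)
    # Let the rest of the stack create the parent work first.
    return false unless @next_actor.create(env)

    # Work from the uploaded paths directly, not from file sets,
    # which may not exist yet if they are created asynchronously.
    env[:uploaded_paths].each do |path|
      pages = @page_counter.call(path)
      next unless pages > 1 # single-page uploads get no child works

      CreateChildWorksJob.perform_later(env[:work_id], path, pages)
    end
    true
  end
end

# Usage with made-up paths and page counts.
page_counts = { "/tmp/report.pdf" => 5, "/tmp/cover.tiff" => 1 }
actor = MultiPageSplitActor.new(
  TerminatorActor.new,
  page_counter: ->(path) { page_counts.fetch(path) }
)
result = actor.create(work_id: "parent-1", uploaded_paths: page_counts.keys)
```

With these inputs only the 5-page PDF leads to a queued job; the single-page TIFF is skipped. How the page count is actually obtained (and for which file formats) is left open here.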