Workflow server #652

Open
wants to merge 30 commits into base: master

Commits on Dec 1, 2020

  1. Processor.__init__: fix OCR-D#274

    bertsky committed Dec 1, 2020
    239cf3f
  2. add workflow server and API integration…

    - add workflow CLI group:
      - add alias `ocrd workflow process` to `ocrd process`
      - add new `ocrd workflow server`, running a web server
        for the given workflow that tries to instantiate
        all Pythonic processors once (to re-use their API
        instead of starting CLI each time)
    - add `run_api` analogue to existing `run_cli` and let
      `run_processor` delegate to it in `ocrd.processor.helpers`:
      - `run_processor` only has workspace de/serialization and
        processor instantiation
      - `run_api` has core `process()`, but now also enters and
        leaves the workspace directory, and passes any exceptions
    - ocrd.task_sequence: differentiate between `parse_tasks`
      (independent of workspace or fileGrps) and `run_tasks`,
      generalize `run_tasks` to use either `run_cli` or new
      `run_api` (where instances are available, avoiding
      unnecessary METS de/serialisation)
    - amend `TaskSequence` by `instance` attribute
      and `instantiate` method:
      - peek into a CLI to check for Pythonic processors
      - try to compile and exec, using monkey-patching
        to disable normal argument passing, execution, and
        exiting; merely importing and fetching the class
        of the processor
      - instantiate processor without workspace or fileGrps
      - avoid unnecessary CLI call to get ocrd-tool.json
    bertsky committed Dec 1, 2020
    0c3d970
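
The "compile and exec with monkey-patching" step in the commit message above can be illustrated with a minimal, self-contained sketch. Everything named here is hypothetical and not taken from the PR: `fake_framework` and `wrap_processor` stand in for whatever CLI wrapper a real processor imports, and `CLI_SOURCE` stands in for a processor's console script. The idea is only to show how a class can be fetched from a script without letting it parse arguments, run, or exit.

```python
import sys
import types
from unittest import mock

# Hypothetical framework module standing in for the real CLI wrapper.
framework = types.ModuleType("fake_framework")

def wrap_processor(processor_class, **kwargs):
    # Normal behaviour would parse sys.argv and process a workspace.
    raise RuntimeError("normal CLI execution should have been disabled")

framework.wrap_processor = wrap_processor
sys.modules["fake_framework"] = framework

# Hypothetical console script of a Pythonic processor.
CLI_SOURCE = """
import sys
from fake_framework import wrap_processor

class DummyProcessor:
    def __init__(self, workspace=None, parameter=None):
        self.workspace = workspace
        self.parameter = parameter or {}

def main():
    wrap_processor(DummyProcessor)
    sys.exit(0)

if __name__ == '__main__':
    main()
"""

def fetch_processor_class(source, filename="<cli>"):
    """Execute a CLI script with its wrapper patched to capture the class."""
    captured = []

    def capture(processor_class, **kwargs):
        # Record the class instead of running the processor.
        captured.append(processor_class)

    code = compile(source, filename, "exec")
    with mock.patch.object(framework, "wrap_processor", capture), \
         mock.patch.object(sys, "exit", lambda *args: None), \
         mock.patch.object(sys, "argv", [filename]):
        exec(code, {"__name__": "__main__"})
    return captured[0] if captured else None

processor_class = fetch_processor_class(CLI_SOURCE)
# Instantiate once, without workspace or fileGrps, for later re-use.
instance = processor_class()
```

Once such an instance exists, a task runner could call it directly for every request and fall back to a CLI subprocess only where no class could be captured.
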
  3. adapt test_task_sequence

    bertsky committed Dec 1, 2020
    1cb161c
  4. 63be07d
  5. 990857f

Commits on Dec 4, 2020

  1. run_processor: set fileGrps already during instantiation (as some implementations currently expect them in the constructor)

    bertsky committed Dec 4, 2020
    f4e71a8

Commits on Jan 18, 2021

  1. fddb236

Commits on Jan 25, 2021

  1. b4a8bcb

Commits on Jan 26, 2021

  1. 6d15084

Commits on Feb 9, 2021

  1. 6e2e7ff
  2. e34b70a

Commits on Mar 9, 2021

  1. 1dd2d54

Commits on May 13, 2021

  1. e637a57

Commits on Jun 9, 2021

  1. e3c992e
  2. 2949925
  3. ccb369a

Commits on Jun 10, 2021

  1. workflow server: run multi-processed / queued…

    - replace Flask dev server with external uwsgi call
    - factor out Flask app code into separate Python module
      which uWSGI can pick up
    - make uWSGI run given number of workers via multi-processing
      but not multi-threading, and prefork before loading app
      (to protect GPU and non-thread-safe processors, and because of GIL)
    - pass tasks and other settings via CLI options (wrapped in JSON)
    - set worker Harakiri (reload after timeout) based on number of
      pages multiplied by given page timeout
    - add option for number of processes and page timeout
    bertsky committed Jun 10, 2021
    db14b50
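
The Harakiri setting described in the commit message above (number of pages multiplied by the page timeout) comes down to simple arithmetic plus a uWSGI invocation. The following is a rough sketch under stated assumptions, not the PR's actual code: the function names, the module path `workflow_app:app`, and the host and port values are hypothetical. The flags used (`--http`, `--module`, `--master`, `--processes`, `--lazy-apps`, `--harakiri`) are standard uWSGI options, but which ones the PR really passes is not shown in this listing.

```python
def harakiri_timeout(num_pages, page_timeout):
    """Seconds a worker may spend on one request before it is reloaded."""
    return max(1, num_pages * page_timeout)

def uwsgi_command(module, host, port, processes, num_pages, page_timeout):
    """Build a uWSGI command line for a multi-process, non-threaded server."""
    return [
        "uwsgi",
        "--http", f"{host}:{port}",
        "--module", module,            # hypothetical app module, e.g. "workflow_app:app"
        "--master",
        "--processes", str(processes),
        "--lazy-apps",                 # fork first, then load the app in each worker
        "--harakiri", str(harakiri_timeout(num_pages, page_timeout)),
    ]

cmd = uwsgi_command("workflow_app:app", "127.0.0.1", 5000,
                    processes=4, num_pages=10, page_timeout=30)
```

Loading the app only after forking means each worker gets its own processor instances, which matches the stated goal of protecting GPU contexts and non-thread-safe processors.
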

Commits on Jun 11, 2021

  1. e6d61a3

Commits on Jun 13, 2021

  1. cac80d6

Commits on Jun 15, 2021

  1. add processing server…

    - add `--server` option to CLI decorator
    - implement via new `ocrd.server.ProcessingServer`:
      - based on gunicorn (for preforking directly from
        configured CLI in Python, but instantiating the
        processor after forking to avoid any shared GPU
        context)
      - using multiprocessing.Lock and Manager to lock
        (synchronize) workspaces among workers
      - using signal.alarm for worker timeout mechanics
      - using pre- and post-fork hooks for GPU- vs CPU-
        worker mechanics
      - doing Workspace validation within the request
    bertsky committed Jun 15, 2021
    6263bb1
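
The `signal.alarm` worker timeout mentioned in the commit message above can be sketched as a small context manager. This is an illustrative assumption about how such a mechanism might look, not the code in `ocrd.server.ProcessingServer`; the name `time_limit` is hypothetical. It relies on `SIGALRM`, so it only works on Unix and only in the main thread of a worker process.

```python
import signal
from contextlib import contextmanager

@contextmanager
def time_limit(seconds):
    """Raise TimeoutError if the enclosed block runs longer than `seconds`."""
    def handler(signum, frame):
        raise TimeoutError(f"request exceeded {seconds} seconds")

    # Install our handler, remembering the previous one for restoration.
    previous = signal.signal(signal.SIGALRM, handler)
    signal.alarm(seconds)  # whole seconds only
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

A request handler could wrap its processing call in `with time_limit(num_pages * page_timeout):` and turn the resulting `TimeoutError` into an error response, leaving the worker process itself alive.
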

Commits on Jun 20, 2021

  1. 4b59396

Commits on Jun 30, 2021

  1. fa1bc37

Commits on Sep 29, 2021

  1. fcbcc82

Commits on Oct 13, 2021

  1. 8193559
  2. 5d48239
  3. Add process_images endpoint.

    jnphilipp authored and bertsky committed Oct 13, 2021
    6ff1d40

Commits on Nov 10, 2021

  1. 08658f9

Commits on Jan 17, 2022

  1. 417faf0

Commits on Feb 8, 2022

  1. 83b10f5

Commits on May 4, 2022

  1. d98daa8