Releases: OCR-D/ocrd_all
v2024-10-15
Changes:
- added new module ocrd_page2alto (also in ocrd_fileformat, now with standalone processor)
- new fixup recipes for shared venv without dependency conflicts
- protect venv creation by semaphore as well
- docker: update OCRD_MODULES (default selection for custom make docker)
- docker: fix minimum and medium module lists
- docker: do not rm venv created by previous stage
- CI/CD: rewrite CircleCI config to split up mini/medi/maxi into interdependent incremental jobs
- CI/CD: fix storing test results
core 92b217e..85bde15
Release: v2.70.0
- PyPI: do not upload deprecated distribution aliases anymore
- deps-cuda: retry micro.mamba.pm even more
- 📦 v2.70.0
- 📝 changelog
- create PyPI CD
- 📝 changelog
- Merge remote-tracking branch 'github/cli-decorator-import-network'
- deps-cuda: retry if micromamba is unresponsive
- Merge branch 'master' of https://github.com/OCR-D/core
- 📝 changelog
- Merge remote-tracking branch 'github/fix_mets_server_zombies'
- 📝 changelog
- Merge remote-tracking branch 'github/deps-torch-torchvision'
- 📝 changelog
- Merge branch 'network_client_block_prints'
- Merge pull request #1280 from OCR-D/fix-docker-cuda-torch
- 📦 v2.69.0
- 📝 changelog
- Merge branch 'mexthecat-master'
- 📝 update changelog again
- 📝 changelog: remove spurious entries
- 📝 changelog
- disableLogging: clearer comment
- ocrd.cli.workspace: use physical_pages if possible, fix default output_field
- OcrdMets.get_physical_pages: cover return_divs w/o for_fileIds for_pageIds
- update OcrdPage from generateds
- OcrdPage: add PageType.get_ReadingOrderGroups()
- ocrd.cli.workspace: assert non-server in cmds mutating METS
- tests: make sure ocrd_utils.config gets reset whenever changing it globally
- lib.bash: fix errexit
- run_processor: be robust if ocrd_tool is missing steps
- ocrd.cli.validate tasks: pass on --mets-server-url, too
- ocrd.cli.workspace server: add 'reload' and 'save'
- ocrd.cli.workspace: consistently pass on --mets-server-url and --backup (also, simplify)
- METS Server: also export+delegate physical_pages
- PcGts.Page.id / make_xml_id: replace '/' with '_'
- test_mets_server: add test for force (overwrite)
- OcrdMetsServer.add_file: pass on 'force' kwarg, too
- Workspace.reload_mets: fix for METS server case
- add test for OcrdEnvConfig.reset_defaults()
- ocrd_utils.config: add reset_defaults()
- bashlib: re-add --log-filename, implement as stderr redirect
- test-logging: also remove ocrd.log from tempdir
- disableLogging: re-instate root logger, too
- test_mets_server: use tmpdir to avoid side effects between suites
- ClientSideOcrdMets: use same logger name prefix as server
- pylint: try ignoring generateds (again)
- update pylintrc
- OcrdMets.add_agent: does not have positional args
- cli.workspace: pass fileGrp as well, improve description
- adapt to PIL.Image moved constants
- fix exception
- fix --log-filename (6fc606027a): apply in ocrd_cli_wrap_processor
- tests report.is_valid: improve output on failure
- Processor.zip_input_files: more verbose log msg
- Processor.zip_input_files: warning instead of exception for missing input files
- fix imports
- ocrd_utils: forgot to export scale_coordinates at toplvl
- allow "from ocrd_models import OcrdPage"
- improve output in case of assertion failures
- hide/test expected deprecation warnings
- use up-to-date kwargs (avoiding old deprecations)
- mets_server: ClientSideOcrdMets needs OcrdMets-like kwargs (without deprecation)
- test_mets_server: fix arg vs kwarg
- processor CLI: delegate --resolve-resource, too
- 📦 v2.68.0
- 📝 changelog
- refactor client cli: process -> run
- Merge branch 'master' into extend-network-client
- 📝 changelog
- Merge pull request #1270 from OCR-D/fix-parsing
- fix: exception handling
- add: check processing job log file
- add: discovery cli, processors and processor
- add sort to network agents
- add: parameter_override
- fix: the annoying string dict
- fix: check report validation outside try block
- fix: set ps address if None in constructor
- Fix: server_utils.py > 404 to 400
- Fix: rename to block
- add docstring to cli commands
- fix: required job id
- add: help section to the cli
- add cli job status check
- add help for new env
- refine status check methods
- Update src/ocrd_network/client_utils.py
- add timeout and wait to configs
- add: client workflow run
- fix: client processing request
- fix test
- refactor status checks
- remove the client server
- try docker host ip
- Fix flag typo
- integration test for client
- update network client
- fix the test dir path in docker
- add integration test for client
- Merge branch 'resolve-1257'
- 📝 changelog
- revert, and just use < v43.0.0
- set paramiko logging to INFO
- fix: suppress paramiko warnings
- set: propagate 0, logging config
- set: paramiko logging to ERROR
- remove downloading tool json
- add: default ocrd-all-tool.json
- download tool json if missing
- Merge branch 'master' into resolve-1257
- load tool json locally
dinglehopper 129e6eb..071e6a8
Release: v0.9.7
docstruct a7ffdda..004e6ec
- add GHA CD via Dockerhub
eynollah 032a99e...51f6ef6
- Merge pull request #137 from qurator-spk/dockerfile
- Merge pull request #132 from qurator-spk/extracting_images_only
- Merge pull request #133 from qurator-spk/src-layout
- 📦 v0.3.1
- 📝 changelog
- Merge pull request #129 from qurator-spk/resolving_issue_106
- update Makefile model location
- update pyproject.toml for v0.3.1
- update pyproject.toml
- Update README.md
- rename GH action
- create draft pyproject.toml
- format options table
- Update README.md
- improve huggingface url
- remove CircleCI
- Update model download url
- Merge pull request #127 from bertsky/new-namespace-pkg
- update GitHub actions
- Update README.md
- update supported Python+Tensorflow version combinations
- pin tf2 version to 2.12.1
- use tf1 compatibility for keras backend
< adapt to OcrdFile.local_filename now :Path
< adapt to ocrd>=2.54 url vs local_filename
- comment unnecessary print commands
- add supported OS to readme
- filtering separators in a correct way without missing them
- Merge pull request #117 from qurator-spk/tf-2.12-or-greater
- apply missed commit #a56988a back
- Merge pull request #116 from qurator-spk/fix-typos
- Merge pull request #113 from qurator-spk/tf_<2.12.0
- Update citation
- Update bibtex entry
- format citation info as bibtex
- add HIP'23 paper reference
- Merge pull request #109 from bertsky/patch-3
- Merge pull request #105 from bertsky/fix-model-archive-path
< Revert "Merge pull request #97 from qurator-spk/420-namespace-package"
- Merge pull request #104 from bertsky/reinstate-namespace-pkg
- Merge pull request #102 from qurator-spk/right2left_reading_order
- delete printing resized image shape
- issue #67 solved
- improve links to GT guidelines
- Update README.md
- Update CHANGELOG.md
- Update ocrd-tool.json
- Merge pull request #86 from qurator-spk/eynollah_light
nmalign 7832c90..1426dbc
Release: v0.0.3
- fix dockerfile
- add GHA CD via Dockerhub
ocrd_calamari caac953..d9cde1f
Release: v1.0.6
ocrd_cis 38ce45b..db65d7f
Release: [v0.1.5](https://github.com/...
v2024-07-01
core c5b5580..79c61e3
Release: v2.66.1
- 📦 v2.66.1
- 📝 changelog
- GHA Docker: build docker.io first, then tag ghcr.io
- Dockerfile.cuda*: adapt to #1225 cc6ea57
- 📦 v2.66.0
- 📝 changelog
- Merge remote-tracking branch 'origin/download-file-no-absolute-urls'
- 📝 changelog
- Merge remote-tracking branch 'bertsky/param-preset-resolve-anew'
- 📝 changelog
- Merge remote-tracking branch 'bertsky/workspace-clean'
- 📝 changelog
- Merge remote-tracking branch 'origin/utilize-ps-proxy-to-ms'
- 📝 changelog
- Merge remote-tracking branch 'bertsky/add-docker-tf1'
- 📝 changelog
- Merge remote-tracking branch 'origin/fix-ci-no37'
- 📝 changelog
- Merge remote-tracking branch 'origin/ocrd-logging-debug-true'
- 📝 changelog
- cli.workspace.find --undo-download: add --keep-files
- cli.workspace.find --undo-download: only if .url exists
- 📝 changelog
- Merge branch 'master' of https://github.com/OCR-D/core
- Merge branch 'rm-coverage'
- 📝 changelog
- Merge pull request #1225 from bertsky/docker-editable
- 📝 changelog;
- Merge pull request #1227 from OCR-D/fix-ocrd-file-remove-url
- fix CD by fetching tags, too
- 📦 v2.65.0
- 📝 changelog
- ci: fix integration test
- Merge branch 'master' into test-workflow
- make network-integration-test: disable ocrd_all test
- disable ocrd all test in core
- ci: disable scrutinizer build
- Remove ocrd_all-tests from core makefile
- make ocrd all tests callable from Makefile
- merge master
- Merge branch 'test-workflow' of github.com:OCR-D/core into test-workflow merge master
- Make make assets in Dockerfile skippable
- remove duplicates
- Add a test for workflow run in ocrd_all
dinglehopper f8e3108..bc5818d
Release: v0.9.6
- ✔ GitHub Actions: Update used actions
- ⚙ pre-commit: Update hooks
- 🐛 Fix reading plain text files
- 📦 v0.9.6
- Revert "✔ Test on Python 3.13"
- 🐛 GHA: Install possible shapely build requirements (if building from source)
- Merge pull request #111 from stweil/typos
- 🐛 GHA: Install possible lxml build requirements (if building from source)
- ✔ Test on Python 3.13
- 🐛 Fix Python 3.12 support by requiring ocrd >= 2.65.0
- ⚙ pre-commit: Update hooks
- ✔ Test using empty files
- ⚙ pre-commit: Update hooks
- 🧹 tests: Move comment out of the code (bad style + weird formatting)
- ⚙ cli: Annotate types in process_dir()
- ⚙ pre-commit: Update hooks
- 🧹 Make from_text_segment()'s textequiv_level keyword-only
- 🧹 Make process_dir() keyword arguments keyword-only
- ✒ README-DEV: Releasing a new version
- 📦 v0.9.5
- ⚙ pre-commit: Add mypy dependencies
- 🐛 Check that we always get a valid ALTO namespace (satisfies mypy)
- 🎨 Reformat (Black)
- ⚙ pre-commit: Update hooks
- Merge branch 'master' of https://github.com/qurator-spk/dinglehopper
- 🐛 Fix word segmentation with uniseg 0.8.0
- 🚧 GitLab CI Test: Depend on child pipeline
- 🚧 GitLab CI Test: Push after pulling
ocrd_detectron2 a8402d8..1f56273
Release: v0.1.8
- patch -f instead of -t
- suppress question if patch detectron2#5010 already applied
ocrd_fileformat ba79de9..fb769ff
Release: v0.10.0
- CI: add Dockerhub CD
- update textract2page again
- forgot to add AWS-Textract→PAGE
- update ocr-fileformat (with newer textract2page)
- Merge pull request #53 from OCR-D/osx-cp
ocrd_froc 45d5dcd..42f1ce0
Release: v0.6.1
- 📦 v0.6.1
- Merge pull request #16 from stweil/typos
- 📦 v0.6.0
- 📝 changelog
- Merge branch 'main' into pr-8-feedback
- add parameter min_score_style to filter low-confidence font classification results
- don't replace textstyle, only fontFamily if overwrite_style
- add parameter "overwrite_text" (default: false): whether to add or replace TextEquiv
- rename parameter network -> model
- rename parameter replace_textstyle -> overwrite_style
ocrd_keraslm 472197f..ea79b2a
Release: v0.4.3
- test: with initLogging
- test: allow running with published model_dta_full instead of training model_dta_test
- improve/update readme
- wrapper.rate: improve/update docstring
- wrapper.rate: use resolve_resource for model file path, split off setup()
- save training history (metrics), add cmd to print them
- train: allow passing single file for val_data, too
- fix continuing from chkpt
- update readme
- add model_dta_full.h5 ref
- chlog
- 📦 0.4.3
- generate: add option --variants (nr of nbest seqs)
- plot_context_embeddings_projection: add years
- test: allow passing directory for data, too
ocrd_kraken bdbe6fc..a6160ce
Release: v0.4.1
ocrd_neat 2f7d01c..06c8b38
Release: v0.0.1
- character normalization based on aletheia mapping
- Merge pull request #13 from qurator-spk/fix-ppn-xpath
ocrd_pagetopdf 24f77d3..7c5ab70
Release: v1.1.0
ocrd_segment 3993139..de824e9
Release: v0.1.24
- 📦 v0.1.24
- repair: add params spread / spread_level, update/improve docs
- repair: validate/repair polygons with 0 px tolerance
- project/repair join_polygons: fix rare case of adjacent rings
- from_masks: skip if no seg file
- 📦 v0.1.23
- Merge pull request #67 from OCR-D/project-parent
ocrd_tesserocr ed73d96..d23992b
Release: v0.19.1
- 📦 v0.19.1
- 📦 v0.19.0
- Update submodule tesseract
- Merge pull request #213 from OCR-D/osx-cp
- Merge pull request #212 from stweil/tesseract
- Merge pull request #211 from stweil/libcurl-dev
- model resources: fix frak2021 description
- recognize models: use _best instead of _fast
- sanitise loggers: no points, but warnings
- update changelog
- recognize: more robust polygon handling
- Merge pull request #208 from bertsky/subclass-docstrings
- Merge pull request #205 from stweil/update
- Merge pull request #203 from stweil/build
sbb_binarization b89ec49..978f425
Release: v0.1.0
- document tested Python+TF2 versions
- remove Python 3.11 from test
- Update README.md
- Update tensorflow version requirements
workflow-configuration bd149f8..eeea260
Release: 0.1.3
- ocrd-make -X: no --cleanup to prevent race for --bf
- ocrd-make:...
v2024-03-08
Removed:
- tesserocr and tesseract (which are now submodules of ocrd_tesserocr)
core f54b002..c5b5580
Release: v2.63.3
- 📦 v2.63.3
- 📝 changelog
- OcrdMets.add_file: fix finding existing el_pagediv
- 📝 changelog
- expose uninstall-workaround
- 📦 v2.63.2
- 📦 v2.63.1
- Merge branch 'fix-get-physical-pages'
- make coverage: omit generateDS code
cor-asv-ann 0a4f684..f7ebb74
- compare/evaluate: ensure lines are shown verbatim on single file level
- compare/evaluate: do not record worst lines verbatim
- compare/evaluate: also record 1% worst lines
ocrd_keraslm 759805b..472197f
Release: v0.4.2
- update assets
- 📦 v0.4.2
- generators: do not require input length > window size
- train: allow passing directory for training data, too
- add checkpointing, allow continuing from ckpt
- suppress TF gibberish
ocrd_tesserocr 08a020f..ed73d96
Release: v0.18.1
- 📦 v0.18.1
- 📝 changelog
- update tesseract/tesserocr to most recent
- update/improve readme
- simplify dockerfile
- CI: add make test
- add repo/assets as proper submodule, rename -clean → clean-
- make Tesseract build configurable
- also install minimal needed models
- explify dependencies install ← install-tesserocr ← install-tesseract
- test: download files of data assets
- test: also set up logging system during tests
workflow-configuration 8418b3f..bd149f8
Release: 0.1.3
- unprefix regex paths by directory argument, if any
- forgot to convert exit to continue in 6fe3c6b3
- no need to close logger FDs on exit
- ocrd-import: use coproc instead of handcrafted FIFOs for loggers
- ocrd-import: rewrite (no parallel jobs, but parallel logging)…
- ocrd-import: add option --basename, default to using directory as well
- ocrd-import: simplify+speedup…
v2024-02-01
Added:
Removed:
- https://github.com/qurator-spk/page2tsv/ (temporarily)
cor-asv-fst 076e04e..4211371
- Merge pull request #4 from stweil/master
core ac1f15b..b94b185
Release: v2.62.0
- reenable circle because docker failed to build on ghcr.io
- 📦 v2.62.0
- {pypi,build}-workaround: missed a commit, s/get_distribution(...).version/dist_version(...)
- Merge branch 'circle-to-gha'
- 📝 changelog
- Merge branch 'master' into ocrd-tool-json-root
- expose ocrd-tool.json for ocrd-dummy in root like processors
- 📦 v2.60.3
- 📝 changelog
- Fix --editable install for setuptools>=64, setuptools#3548
- 📦 v2.60.2
- 📝 changelog
- Merge pull request #1161 from OCR-D/logging-downgrade-level
- Merge pull request #1160 from OCR-D/is-oai-content-loglevel-debug
dinglehopper f077ce2..f8e3108
Release: v0.9.4
- 🚧 GitLab CI Test: Push after pulling
- 🚧 GitLab CI Test: Trigger only on default branch (and do not hardcode it)
- 🚧 GitLab CI Test
- 🔍 ruff: Remove ignore configuration, we use multimethods in a compatible way now
- ⚙ pre-commit: Update hooks
- 🚧 GitLab CI Test
- 🔍 mypy: Use an almost strict mypy configuration, and fix any issues
- 🔍 mypy: Use a compatible syntax for multimethod
- 🔍 mypy: Remove ExtractedText.segments converter
- 🔍 mypy: Avoid using check() for all attr validators
- 🔍 mypy: Make cli.process() typed so mypy checks it (and issues no warning)
- Merge branch 'pr103'
- ⚙ Update ruff+mypy dependencies
- ⚙ pre-commit: Update hooks
- ⬆ Move on to supporting Python >= 3.8 only
- 🐛 Use typing.List instead of list, for Python <3.9
- 🐛 Use Optional instead of | none, for Python <3.10
- ⚙ pre-commit: Update hooks
- 🐛 Fix generating word differences
- ⚙ pre-commit: Update hooks
- Merge branch 'master' of https://github.com/qurator-spk/dinglehopper
- Merge branch 'master' into performance
- ⬆ Update uniseg dependency
- ❎ Make joining grapheme clusters more robust by checking joiner and handling an empty joiner
- 🐛 Fix score_hint call in cli_line_dirs
- 🐛 Fix docstring of distance() for grapheme clusters
- 🐛 Fix calculation of score_hint for edge cases, e.g. when CER is infinite
- 🕸 Do not use deprecated ID, pageId options
- ✔ Add mets:FLocat's @LOCTYPE/OTHERLOCTYPE to test data
- ⬆ Update multimethod dependency
- 🐛 Update tests for ExtractedText
- use uniseg again
- update rapidfuzz version
- replace uniseg with uniseg2
- apply black
- move grapheme clusters to ExtractedText
- remove python2.7 futures
- remove unused includes
- only call words_normalized once
eynollah 706433c..032a99e
Release: v0.2.0
- adapt to OcrdFile.local_filename now :Path
- adapt to ocrd>=2.54 url vs local_filename
ocrd_fileformat c5f0c52..ba79de9
Release: v0.10.0
- 📦 v0.10.0
- 📝 changelog
- Update ocr-fileformat to include UB-Mannheim/ocr-fileformat#172
- Merge branch 'fix-textract2page'
- update ocr-fileformat to latest
- Update ocr-fileformat to v0.6.0
ocrd_repair_inconsistencies cf879c1..94c482f
- 🕸 README: Mention archival of the project
opencv-python 7cfd1ee..8ad8ec1
Release: 80
- Merge branch 'as/4.9.0-readme-update' into 4.x
- Merge pull request #941 from asmorkalov/as/mac_m1_venv_for_test
- Merge pull request #940 from asmorkalov:as/donation
- Merge pull request #938 from asmorkalov:as/4.9.0-pre
- Merge pull request #934 from asmorkalov/as/native_mac_m1_runner
- Merge pull request #936 from asmorkalov:as/mac_intel_update
- Merge pull request #932 from asmorkalov/as/pre-4.9.0_linux_upgrade
- Merge pull request #931 from asmorkalov:as/ipp_icv_license
- Merge pull request #904 from asmorkalov:as/python_3.12
- Merge pull request #927 from dkurt:try_enable_dependents
sbb_binarization f3c6ac8..b89ec49
Release: v0.1.0
- Merge pull request #65 from rettinghaus/update-tests
tesseract ea0b245..8ee020e
Release: 5.3.4
- Create new release 5.3.4
- Set User-Agent: header field in HTTP request for curl downloads
- Merge pull request #4178 from sadra-barikbin/patch-1
- Merge pull request #4174 from stweil/warnings
workflow-configuration cbc3234..f54c91a
Release: 0.1.3
- ocrd-page-transform: local_filename instead of url
- ocrd-page-transform: fix unbound variable
v2023-12-15
core 742906e..ac1f15b
Release: v2.60.1
- 📦 v2.60.1
- docker: we need .git during build for setuptools_scm
- 📝 changelog
- Merge branch 'git-versioning'
- 📝 changelog
- defaults for mets_basename and mets_server_url
- Merge branch 'master' of https://github.com/OCR-D/core
- 📝 changelog
- ocrd workspace list-page: ignore files without pageId, fix #1148
v2023-12-07
core d8c5813..742906e
Release: v2.59.1
- 📦 v2.59.1
- 📝 changelog
- Merge pull request #1142 from OCR-D/fix-ws-locks
- 📝 changelog
- fix ranges, use numpy
- 📦 v2.59.0
- 📝 changelog
- Merge branch 'network-workflow-api'
- 📝 changelog
- Merge pull request #1139 from OCR-D/bagger-filegrp-filter
- 📝 changelog
- Merge branch 'list-page-extended'
- 📝 changelog
- Merge pull request #1136 from OCR-D/network-processor-api
- 📝 changelog
- bagger: fallback to ID-derived filename to avoid filename conflicts
- test bagger: sample workflow with conflicting basenames
- bagger test: rewrite as pytest
- 📝 changelog
- Merge pull request #1134 from OCR-D/page-update
- 📝 changelog
- Merge branch 'update-apidocs'
- OcrdMets: remove dead code exit method, #1130
dinglehopper dbaccdd..f077ce2
Release: v0.9.4
- 🐛 dinglehopper-summarize: Handle reports without difference stats
- Merge pull request #97 from qurator-spk/clean-remove-six-dep-again
- Merge pull request #96 from qurator-spk/test-on-pr-but-really
- Merge pull request #94 from qurator-spk/test-on-pr
- Merge pull request #93 from qurator-spk/update-dep-multimethod
- Merge pull request #92 from qurator-spk/update-pre-commit
- Merge pull request #91 from qurator-spk/test-remove-circleci
- Merge pull request #90 from qurator-spk/test-on-python-3.12
- ✔ Add mets:FLocat's @LOCTYPE/OTHERLOCTYPE to test data
format-converters 9615db1..fa8b4b5
- Merge pull request #24 from stweil/orientation
- Merge pull request #18 from stweil/rotate
- Merge pull request #23 from stweil/TextEquivNone
- Merge pull request #22 from stweil/image
- Merge pull request #19 from stweil/page_version
- Merge pull request #21 from stweil/image_dimensions
- Merge pull request #20 from stweil/close_statement
- Merge pull request #17 from stweil/update
ocrd_calamari c0a4dfd..ad8febd
Release: v1.0.6
- ⚙ pre-commit: update config (using pre-commit-update hook)
- 🎨 Remove extra whitespace in CircleCI config
- ✔ CircleCI: Update codecov orb
- Merge pull request #102 from OCR-D/review-test-base
- Merge pull request #106 from OCR-D/setup-update-description
- ✔ CircleCI: Try to fix caching of pip cache
- Merge pull request #107 from OCR-D/circleci-pip-cache-1
- Merge pull request #101 from OCR-D/use-ruff-config-like-dinglehoppers
- Merge pull request #100 from OCR-D/fix/set-default-MODEL-for-bare-pytest
- 🐛 Fix importing + exporting test.base.assets
- 🐛 pre-commit/mypy: Install missing types-setuptools dependency
- 🧹 Remove unused exports (done by ruff --fix .)
- 🛠 Add pre-commit configuration
- 🐛 Require ocrd >= 2.54.0 for tf_disable_interactive_logs
- ✒ README: Minor changes
- ✒ README: Fix and simplify example instructions
- 🎨 Reformat using Black + ruff
- 🧹 Remove protobuf restriction (→ calamari-ocr 1.0.6 has it now)
- Merge pull request #78 from mikegerber/test-python-3.11
- 💩 Add a script fix-calamari1-model to fix regexen in 1.0 models
- ✒ README: Wrap long line
- 🧹 Remove workaround for fixed np.str problem in Calamari
ocrd_neat 0f64f07..2f7d01c
Release: v0.0.1
ocrd_pagetopdf 7368f51..24f77d3
Release: v1.1.0
- Use first bash from PATH (allows running on macOS)
- Merge pull request #23 from kba/mets-server-url-support
tesseract dc228ed..ea0b245
Release: 5.3.3
- Update status badge for GitHub workflow sw (add missing line break)
- Update status badge for GitHub workflow sw
- Correct indefinite articles before vowels
- Update issue-bug.yml
- Merge pull request #4162 from tfmorris/3000-tcp-scrollview
- Avoid conversions from std::string to char* to std::string
- Remove unnecessary conversions from std::string to C string
- Remove whitespace at line endings
- Update sw.yml
- Update README.md
- Update sw.yml
- Fail on curl download errors
- Add new parameter curl_cookiefile for curl_easy_setopt
- ci: Fix clang build for Ubuntu 22.04
- ci: Allow manual trigger of unittest
- Move bail_out function before libtoolize check
- Merge pull request #4150 from tfmorris/4149-directory-to-stdout
workflow-configuration d0d208c..cbc3234
- add page-ensure-textequiv-conf.xsl
- add page-rename-id-clashes.xsl
- PAGE CLIs: allow combining --pretty and --diff
- new pair of XSLTs: un/flatten table cells
- require ocrd >= 2.58.1
- require ocrd >= 2.58
- ocrd-page-transform: support --mets-server-url
v2023-10-20
core caa7ac3..d8c5813
Release: v2.58.1
- 📦 v2.58.1
- 📝 changelog
- bashlib: set empty string as default value for ocrd__argv[mets_server_url], OCR-D/olena#95
- 📦 v2.58.0
- 📝 changelog
- Merge pull request #1127 from OCR-D/fix-bashlib-log3
- 📝 changelog
- ocrd workspace bulk-add now supports --mets-server-url
- bashlib: short option -U for --mets-server-url
- 📝 changelog
- Merge remote-tracking branch 'origin/fix-bashlib-log2'
- 📝 changelog
- Merge branch 'mets-server-reload'
- 📝 changelog
- Merge pull request #1121 from OCR-D/fix-bashlib-log
ocrd_fileformat cfb24ec..7364265
Release: v0.9.1
- 📦 v0.9.1
- 📝 changelog
- require ocrd >= 2.58.1
- 📦 v0.9.0
- 📝 changelog
- require ocrd >= 2.58
- Support --mets-server-url; get local_filename not url
ocrd_im6convert 105697f..db18917
Release: v0.1.1
- 📦 v0.1.1
- 📝 changelog
- require ocrd >= 2.58.1
- 📦 v0.1.0
- 📝 changelog
- minversion must be MAJOR.MINOR.PATCH
- require ocrd >= 2.58
- add --mets-server-url support
ocrd_olena a2e2520..c1f7cab
Release: v1.5.0
- 📦 v1.5.0
- 📝 changelog
- require ocrd >= 2.58.1
- debug
- minversion must be MAJOR.MINOR.PATCH
- require ocrd >= 2.58
- support --mets-server-url
ocrd_pagetopdf 4f4a330..7368f51
Release: v1.0.0
- require ocrd >= 2.58.1
- require ocrd >= 2.58
- Use local_filename, not url
- restore $zeros assignment
- Support --mets-server-url
workflow-configuration f574f82..d0d208c
Release: 0.1.3
- require ocrd >= 2.58
- ocrd-page-transform: support --mets-server-url
v2023-10-17
cor-asv-ann 006a70e..4216c16
Release: v0.1.14
- adapt to Numpy deprecations
- CI: use ocrd/core-cuda as base image
- CI: dummy venv
- CI: use proper tab character
- CI: clone first
- CI: mkdir first
- CI: chdir to tmp location
- CI: use /tmp for aux clone of ocrd_all
- try getting tensorflow-gpu from Nvidia
- use proper URLs for submodules
- Merge pull request #6 from kba/init-report-dict
- evaluate: skip pages with no results
core de08453..3b0307a
Release: v2.56.0
- 📦 v2.56.0
- 📝 changelog
- Merge branch 'network-log-refactoring'
- 📦 v2.55.2
- 📝 changelog
- 🐛 pydantic fields must not start with underscore
- 📦 v2.55.1
- 📝 changelog
- Merge pull request #1113 from OCR-D/bulk-add-url-local-filename
- 📦 v2.55.0
- 📝 changelog
- Merge branch 'workflow-endpoint'
- 📝 changelog
- generate_page_range: verify single-page range based on start
- generate_page_range: warn, not raise, if start==end, fix #1106
- 📝 changelog
- logging: remove custom logging in ocrd_network, use explicit logger name
- logging remove hard-coded setLevel in decorators/ocrd_network
- helpers.run_processor: setOverrideLogLevel if log_level is provided
- ocrd log: default to ocrd.log_cli logger name
- METS Server: add basic logging of operations
- logging: use ocrd.{utils,models,exif} not ocrd_{utils,models,exif}
- ocrd_utils.logging.getLogger: no more initLogging
- 📝 changelog
- Merge branch 'master' into mets-server-fixes-2023-09-15
- mets server: remove socket file on shutdown
- mets server: do the chmod before server start, not before connection
- 📝 changelog
- METS server: make socket world-readable/-writable
- 📦 v2.54.0
- Merge pull request #1095 from OCR-D/run-cli-mets-server-url
- 📝 changelog
- Merge branch 'revise-logging'
- Merge pull request #1093 from OCR-D/create-default-queue
- Merge pull request #1080 from OCR-D/revise-logging
- bashlib: fix --help output
- 📝 changelog
- Merge branch 'keep-remote-links'
- 📝 changelog
- downgrade ValueError to log.warning about inconsistent pageId for processor calls
- Merge branch 'master' into warn-empty-page
- raise ValueError if --page-id is provided but leads to empty result
- 📝 changelog
- Merge pull request #1069 from OCR-D/processing_server_ext_1046
- Merge branch 'mets-server'
- 📝 changelog
- ci: localhost -> 127.0.0.1
- pin requests < 2.30, OCR-D/core#1082
- mets server: forbid local/remote workspace with different directories
- mets server: allow both local_filename and url to be None
- Merge branch 'master' into mets-server
- mets server: test both UDS and TCP variant
- ClientSideOcrdFile et al need url too
- pass mets_server_url from run_processor
- typo: -{,-}mets-server-url
- move ClientSideOcrd{Agent,File} to ocrd_models
- METS server: support -U for processor options
- workspace server start: pass workspace context
- mets server: single option --mets-server-url/-U
- mets server will never pass content to workspace.add_file
- mets server: no content will pass through it
- mets server: clean up is_remote muddle
- mets server: support unique_identifier
- mets server: str handlers
- mets server: provide fallback for non-wrapped OcrdFile methods
- mets server: remove XXX HACK comments, they are not;
- mets server: improve docs
- mets server: add stop
- Update ocrd/ocrd/cli/workspace.py
- ocrd workspace CLI: reference METS server option
- METS server: consistently use local_filename
- Update ocrd/ocrd/cli/workspace.py
- METS Server: equivalent functionality to files for agents
- finish implementation / test mets server
- Merge remote-tracking branch 'origin/master' into mets-server
- workspace: save content to file only if not remote
- Merge branch 'mets-server' of https://github.com/kba/ocrd-core into mets-server
- mets_server: file search/adding on /file not /
- mets_server: missed mimetype kwarg
- mets_server: different loggers for socket/host-port
- Merge branch 'mets-server' of https://github.com/kba/ocrd-core into mets-server
- mets_server: replace Model constructor with static create calls
- --port must be int
- resolver: shorten mets_server_{host,port} check
- mets_server: only save_mets on PUT and DELETE
- OcrdWorkspace.is_remote should be a bool
- ClientSideOcrdMets: fix signature of self.file_groups
- mets-server: bashlib should take same args
- remove noise from makefile
- slowly but determinedly
- getting there
- .
- wip
dinglehopper 0fd4ea1..dbaccdd
Release: v0.9.4
- ✒ README: Minor whitespace cleanup
- ✒ README: Recommend installing via pip and from PyPI
- 📦 v0.9.4
- 🎨 editorconfig: *.json should have a final newline
- 🧹 pyproject: Remove extra *.json
- 🧹 Remove empty setup.cfg
- 📦 v0.9.3
- 🐛 Remove MANIFEST.in workaround, now that setuptools_ocrd is fixed
- 📦 v0.9.2
- 🧹 .gitignore dist/
- 🐛 Workaround sdist not containing top-level ocrd-tool.json
- ⚙ GitHub Actions: Call test workflow when (before) deploying
- 🎨 Release: Make installing setuptools-ocrd conditional on ocrd-tool.json
- 🐛 Release: Try fixing getting the version (install setuptools-ocrd)
- 📦 v0.9.1
- ✒ README: Update badges
- Revert "🚧 GitHub Actions: Try testing on Python 3.12"
- 🚧 GitHub Actions: Try testing on Python 3.12
- 🚧 GitHub Actions/CircleCI: Remove testing from CircleCI config
- 🚧 GitHub Actions: Do not try installing ruff on Python 3.6
- 🚧 GitHub Actions: Do not try installing pytest-ruff on Python 3.6
- 🚧 GitHub Actions: Avoid compiling OpenCV and NumPy on Python 3.6
- 🚧 GitHub Actions: Fix testing for Python 3.6
- 🚧 GitHub Actions: Disable matrix fail-fast
- 🚧 GitHub Actions: Test on multiple Python versions
- 🚧 GitHub Actions: Test report
- 🚧 GitHub Actions: Try shell for loop to install from all requirements*.txt
- 🚧 GitHub Actions: Rework test, run in src/
- 🚧 GitHub Actions: Allow running test manually
- 🚧 GitHub Actions: Rename test workflow, also run on schedule
- 🚧 GitHub Actions: Add test workflow
- 🚧 GitHub Actions: Add release workflow
- 🧹 Make dinglehopper.* exports explicit
- ⚙ ruff: Ignore F811 (no redefinitions) for now, as ruff considers the multimethods redefinitions
- 🎨 Reformat comments + strings manually (not auto-fixed by Black)
- ⬆ Use f-strings
- 🎨 Reformat using Black
- 🎨 Sort imports (auto-fixed by ruff)
- ⚙ Add pre-commit
- 🛠 Replace flake8 + pylint with ruff
- ⚙ Move mypy settings to pyproject.toml
- ⚙ pytest.ini → pyproject.toml
- 🐛 Detect encoding (incl BOM) when reading files
- 🐛 Move source into src/ to fix install
- ⚙ Migrate to pyproject.toml
- 🚧 CircleCI: Run black
- Merge pull request #83 from INL/feat/batch-processing
- Merge pull request #82 from CircleCI-config-suggestions-bot/StoreTestResults
- 🧹 .gitignore .python-version (for pyenv)
- 🧹 Remove qurator. namespace prefix
- 🐛 Fix installing by calling find_namespace_packages in setup.py
- 🕸 Do not use deprecated ID, pageId options
- 🔧 Remove explicit namespace_packages
- ✔ CircleCI: Explicitly install binary opencv-python-headless (dep of OCR-D?) to avoid compilation
- 🐛 Remove deprecated declare_namespace call
nmalign cf7c60f..7832c90
Release: v0.0.3
- adapt to Numpy deprecations
ocrd_calamari 3a029ca..c0a4dfd
Release: v1.0.6
- Merge pull request #90 from OCR-D/tf_disable_interactive_logs
- ✒ README: Use backtick syntax for code block
- ✔ CircleCI: Do not test on Python 3.6 anymore (EOL since 2021-12-23)
- v1.0.6
- 🐛 Fix installation by keeping protobuf < 4.0
ocrd_cis 43a356a..fcc02fd
Release: v0.1.5
- adapt to Numpy and Pillow deprecations
- segment: fix baseline extraction
- segment: adapt to OpenCV changes
- resegment (baseline/ccomps): improve handling of fg conflicts
- resegment: add param baseline_only
- check_page/region/line: skip assumptions on number of components
- adapt to Shapely 2.0 deprecations
...
v2023-06-28
Unreleased
v2023-06-28
Changed:
- Bash prompt excludes the user name, which is undefined in the docker container, #376, #366
- Docker: `ocrd-all-tool.json` is built during container build, #379
- Docker: `XDG_CONFIG_HOME` is set to `XDG_DATA_HOME/ocrd-resources`, so `resources.yml` is in usually-mounted location, #377, #252
- Docker: `/data` is world-writeable now, so log files can be written there, #377, #252
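The effect of the `XDG_CONFIG_HOME` change can be sketched as a path computation. The following is an illustration under stated assumptions, not code from ocrd_all or core: the function names are hypothetical, the `ocrd/` subdirectory for `resources.yml` is assumed, and the `XDG_DATA_HOME` and `HOME` values are made up for the example.

```python
from pathlib import PurePosixPath


def xdg_config_home(env: dict) -> PurePosixPath:
    """Resolve XDG_CONFIG_HOME, falling back to $HOME/.config per the XDG spec."""
    value = env.get("XDG_CONFIG_HOME")
    if value:
        return PurePosixPath(value)
    return PurePosixPath(env["HOME"]) / ".config"


def resources_yml(env: dict) -> PurePosixPath:
    """Location of the user resource list under the resolved config home."""
    # Assumption: the file is kept in an "ocrd" subdirectory of the config home.
    return xdg_config_home(env) / "ocrd" / "resources.yml"


# Hypothetical container environment; the XDG_DATA_HOME value is illustrative only.
data_home = "/usr/local/share"
container_env = {
    "HOME": "/root",
    "XDG_DATA_HOME": data_home,
    "XDG_CONFIG_HOME": f"{data_home}/ocrd-resources",
}
default_env = {"HOME": "/root"}

print(resources_yml(container_env))  # lands below the ocrd-resources directory
print(resources_yml(default_env))    # lands below $HOME/.config
```

With the variable set this way, a volume mounted over the resources directory also carries the resource list, so it survives container restarts.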
core 6708624..552cfcd
Release: v2.52.0
- Dockerfile: install wheel before make install
- 📦 v2.52.0
- 📝 changelog
- ci: debug macos
- Makefile: PIP_INSTALL from environment via ?=
- ci: try fixing macos
- Merge branch 'master' into improve-packaging
- Makefile: trailing whitespace
- update gitignore
- add ocrd.processor.builtin.dummy pkg (needed for resource discovery)
- add requirements.txt to manifest (so it's available at build time)
- make pypi: use build module instead of setuptools CLI
- make uninstall: run in reverse BUILD_ORDER
- make install: run conjunctively in BUILD_ORDER
ocrd_calamari 3a029ca..ed7a926
Release: v1.0.6
- v1.0.6
- 🐛 Fix installation by keeping protobuf < 4.0
ocrd_cis a0ea0a2..43a356a
Release: v0.1.5
- postcorrect: improve/update OCR-D wrapper…
- ocropy-train: improve/update OCR-D wrapper…
- ocrd-tool: rm old ocrd-cis-ocropy-rec (gone in 9e20991)
- Merge branch 'kba:typo' #91 into fix-alpha-shape
- Merge branch 'kba:double-page-max-size' #96 into fix-alpha-shape
- Merge branch 'kba:resolve-resources' #83 into fix-alpha-shape
ocrd_detectron2 04bf4c6..5f8bdcb
Release: v0.1.7
- update model URLs to GH release archive
- requirements.txt: avoid pkg names in comments
- requirements.txt: add pycocotools as explicit dependency
- require wheel (since pip does not pull it anymore)
- deps: no build isolation so Detectron2 compilation can use Torch
- Docker: add badges and basic description
- Docker username: vars, not env
- Docker: use underscore in tagname, alright
- add Dockerfile and GH Action to publish at Dockerhub and GHCR
ocrd_kraken b13dd8a..1e71324
Release: v0.3.0
- recognize: ignore 'one_channel_mode' unless model has 1 input channel
- segment/recognize: warn if no GPU available
ocrd_repair_inconsistencies c898d6c..cf879c1
- Merge pull request #13 from stweil/master
opencv-python 474a1cc..b534ea2
Release: 72
- Merge pull request #853 from asmorkalov/as/add_pyi_to_package
sbb_binarization 010ec99..f3c6ac8
Release: v0.1.0
- Update test.yml
tesseract 1569e50..bb8803a
Release: 5.3.1
- Update .mailmap
- Create config.yml
- Remove old broken GitHub action vcpkg-4.1.1 (fixes issue #4078)
- cmake: check if leptonica was built with tiff support
- cmake: provide info about disabled LibArchive and CURL
- cmake: allow to disable tiff (-DDISABLE_TIFF=ON)
- Merge pull request #4073 from stweil/osd
- Merge pull request #4071 from stweil/clean
- Merge pull request #4066 from stweil/lstmtraining
- Merge pull request #4068 from stweil/sprintf
- Remove unused code in function fix_rep_char
- Merge pull request #4067 from stweil/misc
- Support for Sgaw and W Pwo Karen languages in the Myanmar validator. (#4065)
- issue-bug.yml: Windows versions 7, 8, 8.1 are not supported anymore
- snap: Update from leptonica 1.74.2 to latest 1.83.1
- fix: Fix snap package building
- Create new release 5.3.1
- Remove whitespace at line endings
- Fix issue #4010 (#4041)
- cmake: add missing HAVE_NEON to config_auto.h
- Merge branch 'main' of https://github.com/tesseract-ocr/tesseract
- cmake: adjust build to autotool settings
- Merge branch 'main' of https://github.com/tesseract-ocr/tesseract
- cmake: improve NEON build
tesserocr e184c62..3c9519b
Release: v2.6.0
- add github-workflow building wheels
v2023-06-14
Changed:
- All docker images now contain git checkouts and retain `/build`, i.e. behave like the `-git` variants
- No more git updates within docker build, but fix git module dependency outside
- Reduce docker image size (by reinstating all-in-one layer, removing cache, avoiding duplicate CUDA libraries...)
- Use `git submodule update --single-branch` on CI to reduce docker image size
Added:
- `make deps-cuda`: non-intrusively support CUDA system dependencies (in docker or native)
- `make ocrd-all-tool.json`: Generate and upload a combination of all processors' `ocrd-tool.json`, #362
- `make test-workflow`: Run a workflow with most processors as a general smoke test
- `make test-cuda`: Test whether CUDA is properly set up and a GPU is available
- `make test-core`: Run OCR-D/core unit tests
Fixed:
- dependencies between modules, esp. with custom `OCRD_MODULES` selection
- editable mode (`pip install -e`)
- OpenCV build
- get `tesserocr` from PyPI if disabled as a module
- get `ocrd` from PyPI if core disabled as a module
- consistent interoperable module versions (esp. Numpy/OpenCV/Shapely/Protobuf/Torch/TF Python dependencies)
cor-asv-ann 006a70e..2c4b1ff
Release: v0.1.14
- CI: use ocrd/core-cuda as base image
- CI: dummy venv
- CI: use proper tab character
- CI: clone first
- CI: mkdir first
- CI: chdir to tmp location
- CI: use /tmp for aux clone of ocrd_all
- try getting tensorflow-gpu from Nvidia
- use proper URLs for submodules
- Merge pull request #6 from kba/init-report-dict
- evaluate: skip pages with no results
core de08453..6708624
Release: v2.51.0
- Merge pull request #1055 from bertsky/deps-cuda
- ci: disable upterm for gh actions
- readme: remove dockerhub/travis badge, add GH actions badge
- debug gh actions
- test bashlib: /usr/bin/env bash instead of /bin/bash
- test_workspace_bagger: use ocr-d.de instead of google.com for testing
- disable logging tests until properly fixed
- docker-image: reuse local ghcr.io image instead of docker.io
- 📦 v2.51.0
- 📝 changelog
- make help: improve description
- Revert "Merge remote-tracking branch 'hnesk/no-more-pkg_resources' into release-2.36.0"
- remove out-dated processor resources
- docker-cuda: improve (reduce size) again…
- docker-cuda: rewrite…
- core-cuda: use same CUDA libs as needed for Torch anyway
- Merge branch 'pr-1008' into reduce-cuda
- Merge branch 'master' of https://github.com/OCR-D/core into reduce-cuda
- make install on py36: revert to prefer-binary via install
- make install on py36: fix prefer-binary syntax
- make install on py36: prefer binary OpenCV/Numpy via pip config instead of preinstall
- core-cuda: install more CUDA libs via pip and ld.so.conf, simplify Dockerfile for that
- core-cuda: use CUDA 11.8, install cuDNN via pip and make available system-wide via ld.so.conf
- reinstate workaround for shapely, but more robust
- docker-cuda: change base image, no multi-CUDA runtimes
- keep gcc, no autoremove
- rehash after pip upgrade
- give up workaround for shapely-CUDA issue
dinglehopper 0fd4ea1..35be58c
- Merge pull request #83 from INL/feat/batch-processing
- Merge pull request #82 from CircleCI-config-suggestions-bot/StoreTestResults
- 🧹 .gitignore .python-version (for pyenv)
- 🧹 Remove qurator. namespace prefix
- 🐛 Fix installing by calling find_namespace_packages in setup.py
- 🕸 Do not use deprecated ID, pageId options
- 🔧 Remove explicit namespace_packages
- ✔ CircleCI: Explicitly install binary opencv-python-headless (dep of OCR-D?) to avoid compilation
- 🐛 Remove deprecated declare_namespace call
eynollah ea792d1..706433c
Release: v0.2.0
ocrd_cis c90b29f..a0ea0a2
Release: v0.1.5
- Merge branch 'kba:typo' #91 into fix-alpha-shape
- Merge branch 'kba:double-page-max-size' #96 into fix-alpha-shape
- Merge branch 'kba:resolve-resources' #83 into fix-alpha-shape
- segment: adapt to OpenCV changes
- resegment (baseline/ccomps): improve handling of fg conflicts
- resegment: add param baseline_only
- check_page/region/line: skip assumptions on number of components
- adapt to Shapely 2.0 deprecations
- adapt to Numpy 1.24 dtypes
- resegment: list instead of generator
- re/segment: improve polygon simplification
- re/segment: join_baselines: skip lines outside of polygon
- re/segment: join_baselines: for complex subtypes, apply recursively
- re/segment: join_polygons: connect touching neighbours, too
ocrd_fileformat dacfa50..4e7e0de
Release: v0.7.0
- 📦 v0.7.0
- update ocr-fileformat
ocrd_kraken 802c6b0..b13dd8a
Release: v0.3.0
- segment/recognize: default to device=cuda:0 (now backed by safe fall-back)
- segment/recognize: fall back to CPU if no CUDA device
- fix typo
- update changelog
- recognize: project text upwards in order by concatenation
- recognize: ensure baseline/boundary are consistent
- recognize: ignore invalid baselines
- setup metadata: update/improve
- deps-ubuntu: update
- improve/update readme
- Dockerfile: use CUDA base image, improve labels
- update changelog
- recognize: pass lines in baseline format if any baselines are annotated
- update blla.model URL (master→main)
- recognize: workaround for empty/failed line records
- recognize: workaround for better quality box cuts
- recognize: avoid invalid polygons on single-glyph words
- Revert "recognize: avoid invalid polygons on single-glyph words"
- segment: also show tags/type prediction
- recognize: avoid invalid polygons on single-glyph words
- recognize: use proper data structures of rpred
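The two ocrd_kraken entries on defaulting to `device=cuda:0` with a "safe fall-back" to CPU describe a common pattern, sketched below. This is not ocrd_kraken's code: `select_device` is a hypothetical helper, and the availability flag is passed in explicitly so the sketch runs without a GPU library. In a real processor it would typically come from something like `torch.cuda.is_available()`.

```python
import logging

log = logging.getLogger("device-fallback-sketch")


def select_device(requested: str, cuda_available: bool) -> str:
    """Return the device to use, degrading to CPU when CUDA was requested but is absent."""
    if requested.startswith("cuda") and not cuda_available:
        # Warn instead of failing, so a CUDA default stays safe on CPU-only hosts.
        log.warning("no CUDA device available, falling back to CPU")
        return "cpu"
    return requested


# The default of "cuda:0" only takes effect when a GPU is actually present.
print(select_device("cuda:0", cuda_available=False))
print(select_device("cuda:0", cuda_available=True))
```

Making the GPU the default is only reasonable because of the fallback: without it, every CPU-only installation would have to override the parameter.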
ocrd_pagetopdf 6155605..4f4a330
Release: v1.0.0
- Merge pull request #22 from bertsky/fix-input-files
ocrd_wrap 63c04d5..2cd800d
Release: v0.1.8
- 📦 0.1.8
- Merge pull request #10 from bertsky/update-numpy
opencv-python 6b73d90..474a1cc
Release: 72
- Merge pull request #849 from asmorkalov/as/python3_for_build
- Fix: numpy version for python 3.11 (#839)
- Merge pull request #852 from asmorkalov:as/ci_check
- Merge pull request #837 from bertsky/fix-py38-build
- Merge pull request #838 from henryiii/patch-2
sbb_binarization 39ef3fd..010ec99
Release: v0.1.0
workflow-configuration cb923f7..5aff777
- ocrd-import: add option --regex (positive path selector)
- ocrd-import: fix skipping in subshell
- add METS transforms to TOC
- generalise standalone CLI for both PAGE and METS XSL, update documentation
- mets-copy-agents.xsl: make path for other-mets relative to input m...