This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

Move to ocrd-segment-repair? #2

Open
mikegerber opened this issue Nov 26, 2019 · 8 comments

@mikegerber
Member

@bertsky wrote in #1:

> I still think this would make a very good addition to ocrd-segment-repair...

@mikegerber
Member Author

Yes, I think so too. It was unclear to me what exactly ocrd-segment-repair would do to my files beyond the re-ordering operation I would hypothetically add. If ocrd-segment-repair is going down the "let the user choose a single operation" road, I'm happy to add this as one of those single operations.

To explain: I needed this to fix problems in some hundred ground truth files. Because I wanted to be careful with my ground truth files, I wanted to fix exactly this problem and nothing more. That is why I wrote a separate script instead of adding the operation to ocrd-segment-repair.

@bertsky
Contributor

bertsky commented Nov 26, 2019

Yes, there's definitely going to be fine-grained control over which checks and repair heuristics ocrd-segment-repair uses. Let's delay this until we have baked ocrd-segment-evaluate (a re-implementation of the PRImA tools) and found ourselves some useful module and data structures.

@mikegerber
Member Author

Agreed.

mikegerber self-assigned this on Dec 5, 2019
mikegerber changed the title from "Move to ocrd-segment-repair" to "Move to ocrd-segment-repair?" on Dec 10, 2019
@kba
Contributor

kba commented Dec 18, 2019

Shall we include this in ocrd_all or wait until you've decided whether/how to integrate with ocrd_segment?

@bertsky
Contributor

bertsky commented Dec 18, 2019

> Shall we include this in ocrd_all or wait until you've decided whether/how to integrate with ocrd_segment?

I'd say now is as good a time as ever for ocrd_all. (We want to give users the best possible processing options.)

@cneud
Member

cneud commented Sep 26, 2020

Since this is very OCR-D-specific stuff, I would actually prefer that this be moved to ocrd-segment-repair at some point.

@bertsky
Contributor

bertsky commented Oct 9, 2020

> Since this is very OCR-D specific stuff, I would actually prefer this moved to ocrd-segment-repair at some point.

Sure, but see above: nothing has changed on ocrd_segment's side so far. As soon as we have a good library structure there, with self-explanatory and orthogonal repair processors/parameters, I'll address having ocrd-repair-inconsistencies flow into it. Segment re-ordering is also connected to layout evaluation (planned for ocrd-segment-evaluate) and to validation auto-repair hooks (as currently planned for coordinates) or auto-repair instrumentation (also planned for coordinates), so we first have to shake everything else together.

@mikegerber
Member Author

Since I've closed #8 (Find a better name) in favor of merging this into some other tool, I suggest the very specific operation name reorder-segments-to-match-parent-text for the future.
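
For readers unfamiliar with the idea, here is a minimal sketch of what an operation with that name might do. It is not the code of the existing script or of ocrd-segment-repair: the `Segment` class, the function signature, and the choice of candidate orders (top-to-bottom, then left-to-right) are all assumptions made for illustration. The point it tries to capture is the careful, single-purpose behaviour discussed above: a new order is accepted only if it makes the children's joined text equal to the parent's text, and otherwise nothing is changed.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in for a line/word/glyph: its text and top-left corner."""
    text: str
    x: int
    y: int


def reorder_segments_to_match_parent_text(parent_text, children, joiner="\n"):
    """Return the children in an order whose joined text equals parent_text.

    Returns None when no candidate order matches, so the caller can
    leave the file untouched instead of guessing.
    """
    def joined(segments):
        return joiner.join(s.text for s in segments)

    if joined(children) == parent_text:
        return list(children)  # already consistent, nothing to repair

    # Candidate orders are an assumption of this sketch.
    candidates = (
        sorted(children, key=lambda s: (s.y, s.x)),  # top-to-bottom
        sorted(children, key=lambda s: (s.x, s.y)),  # left-to-right
    )
    for candidate in candidates:
        if joined(candidate) == parent_text:
            return candidate
    return None


if __name__ == "__main__":
    lines = [Segment("second line", 0, 50), Segment("first line", 0, 10)]
    fixed = reorder_segments_to_match_parent_text("first line\nsecond line", lines)
    print([s.text for s in fixed])
```

Returning `None` rather than a best guess keeps the operation conservative, which matters when the inputs are ground truth files.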
