The ability to save the scanned OCR book as a PDF or Word document, not just a text file #111

DraganRatkovich · 2022-02-27T07:56:20Z

Bookworm currently allows the user to save a scanned book only as a plain text file, which is inconvenient in some cases, since Word documents and PDF files are widely used.

Describe alternatives you've considered

Allow the user to save the scanned book as either a PDF or a Microsoft Word document, which in turn would give more options for editing the resulting file in word processing programs.
This can be done in either of the following ways:

Add the .pdf and .docx file formats alongside .txt in the "Save As" dialog box, so the user can choose from the available file formats;
Add a submenu called "Export As" to the File menu and put the three formats there (.txt, .docx and .pdf), so the user can quickly select a format, enter a file name, and save.
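
To make the .docx option concrete, here is a minimal sketch using only the Python standard library. It assumes the OCR result is available as a list of pages, each a list of line strings; that shape, and the names `build_document_xml` and `save_pages_as_docx`, are hypothetical and not taken from Bookworm's code. It preserves only pages and lines, with no headings or formatting.

```python
import zipfile
from xml.sax.saxutils import escape

# A .docx file is a ZIP archive with a few required XML parts.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="application/'
    'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    '</Types>'
)
ROOT_RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships '
    'xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/'
    'officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_document_xml(pages):
    """Return the main document part: one paragraph per OCR line,
    with a page break between consecutive pages."""
    body = []
    for index, page in enumerate(pages):
        if index > 0:
            # Start every page after the first on a new page.
            body.append('<w:p><w:r><w:br w:type="page"/></w:r></w:p>')
        for line in page:
            # escape() handles &, < and >; control characters that are
            # illegal in XML are not filtered in this sketch.
            body.append(
                '<w:p><w:r><w:t xml:space="preserve">%s</w:t></w:r></w:p>'
                % escape(line)
            )
    return (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        '<w:document xmlns:w="%s"><w:body>%s</w:body></w:document>'
        % (W_NS, "".join(body))
    )


def save_pages_as_docx(pages, path):
    """Write the pages to `path` as a minimal Word document."""
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", CONTENT_TYPES)
        archive.writestr("_rels/.rels", ROOT_RELS)
        archive.writestr("word/document.xml", build_document_xml(pages))
```

For example, `save_pages_as_docx([["First line", "Second line"], ["Page two"]], "book.docx")` would write a two-page document. A real implementation would more likely rely on an existing library for .docx and PDF output rather than hand-written XML.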

@mush42 Let me know whether you think this is possible.

mush42 · 2022-02-27T08:15:46Z

@DraganRatkovich
It is possible, of course.
But I can't see any benefit of those two formats over plain text.
No structure information is extracted from the document except pages and lines: no headings, no paragraphs, and no formatting information.
You can copy the text from the text file and paste it into Word, and Word will restore the page and line breaks.
Best
Musharraf

DraganRatkovich · 2022-02-27T08:27:58Z

@mush42 Of course, but the main advantage of saving directly as PDF or docx is time. It may take a long time to process the contents of the extracted text file in Microsoft Word, especially if the book being scanned contains more than 300 pages.

pauliyobo · 2023-11-29T22:27:10Z

Hello.
One year later: is this feature still desired? If yes, @DraganRatkovich, would you mind explaining why?
I did read the previous comment; however, note that even if we saved the text as a PDF, you would not retain any structure from the original image.
IIRC, what you get now in the scanned file's output is at most the page number. Is that correct?

DraganRatkovich · 2024-12-30T07:08:17Z

@pauliyobo This is the second time I've noted this: don't close an issue that was raised due to community interest when there is still no progress on it.

pauliyobo · 2024-12-30T07:19:57Z

@DraganRatkovich
Hello,
I had closed this for reasons similar to #151.
I will leave this open, just in case people are still interested in it, though it would be interesting to know the motivation behind this proposal.
Also, I think it's best if we all avoid closing and reopening issues without a compelling reason, me included. I hadn't actually commented with a closing reason on this one, so this one is on me.
Thanks.

DraganRatkovich added the enhancement label Jun 14, 2022

pauliyobo closed this as not planned Dec 28, 2024

DraganRatkovich reopened this Dec 30, 2024
