Initial commit, WinOCR v1.00
Mitsos101 committed Oct 14, 2015
0 parents commit c258d01
Showing 195 changed files with 54,269 additions and 0 deletions.
6 changes: 6 additions & 0 deletions AUTHORS
@@ -0,0 +1,6 @@
Authors:

Antonio Diaz Diaz [OCRAD] - Coder of OCRAD
Bruno Barberi Gnecco [GOCR] - Programmer
Joerg Schulenburg [GOCR] - Original idea and creation, programmer leader
Sven Bansemer [DLLs] - modifications to use the libs as DLL in Windows world
338 changes: 338 additions & 0 deletions COPYING

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions CREDITS
@@ -0,0 +1,17 @@
Thanks:
...to everyone who contributed to gocr. If you feel that your
name should be in this list, write mail to the author. These
are in no particular order:

G.Kugler for sending me first example files and testing. (MayMM)
Klaas Freitag for the libPgm2asc-patch <freitag@suse.de>
Ryan Dibble for the otsu.c file <dibbler@umich.edu>
Tim Waugh for the man page <twaugh@redhat.com>
David Pinson for the tkispell-patch <dpinson@materials.unsw.EDU.AU>
Martin Goldhahn for some patches <Martin.Goldhahn@Webcenter.no>
Eberhard Burkard for the gocr.tcl patch <E.Burkard@web.de>
James R. Van Zandt for lots of tips <jrv@vanzandt.mv.com>
...

... and everyone else who submitted bug-reports,
feature-requests, patches and lots of example files.
9 changes: 9 additions & 0 deletions ChangeLog
@@ -0,0 +1,9 @@
History: (Changes,ChangeLog)

1.00 2015-10-08
first release with the sample code in object pascal to call the
two dll's (gocr.dll and ocrad.dll)

* OCRAD has been modified to compile as dll
* GOCR has been modified to compile as dll + some minor changes to avoid
division by zero exceptions
78 changes: 78 additions & 0 deletions GSA-Win-OCR.dpr
@@ -0,0 +1,78 @@
(* GSA-Win-OCR.dpr
Copyright (C) 2015 Sven Bansemer/GSA
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 2 of the License, or
(at your option) any later version.
This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this library. If not, see <http://www.gnu.org/licenses/>.
*)

{$APPTYPE CONSOLE}
uses windows, sysutils;

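// Entry points of the two DLLs, resolved at run time with GetProcAddress.
// Both use the cdecl calling convention.
var dojob_ext:function(input:pchar; certainty:integer; Output:pchar; Charset:pchar):integer; cdecl;
ocrad_dojob:function(input:pchar; Output:pchar):integer; cdecl;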

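function get_ocrad(const filename:string):string;
const bufsize=102; // output buffer size chosen by this sample
var p:pchar;
begin
p:=AllocMem(bufsize); // zero-filled, so p is a valid empty string even if the DLL writes nothing
try
ocrad_dojob(pchar(filename),p);
result:=p; result:=trim(result);
finally
FreeMem(p);
end;
end;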

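function get_gocr(const filename:string;const charset:string):string;
const bufsize=102; // output buffer size chosen by this sample
var p:pchar;
f1:integer;
begin
p:=AllocMem(bufsize); // zero-filled, so p is always a valid string
try
f1:=dojob_ext(pchar(filename),0,p,pchar(charset));
// this sample treats a return value below 1 as "nothing recognized"
if f1>=1 then begin
result:=p; result:=trim(result);
end else result:='';
finally
FreeMem(p);
end;
end;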

var dll_gsa:thandle;

begin
writeln('GSA-Win-OCR v1.00 (C) 2015 GSA ');
writeln('');
writeln('This tool and the DLL''s are licensed under GNU General Public License.');
writeln('It uses GOCR and OCRAD (both GPL licensed) to extract text from images');
writeln('in ppm format.');
writeln('');

if (paramcount=0) or (not fileexists(paramstr(1))) then begin
writeln('usage: '+paramstr(0)+' [image.ppm]');
exit;
end;

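dll_gsa:=Windows.LoadLibrary('gocr.dll');
if dll_gsa<>0 then begin
dojob_ext:=GetProcAddress(dll_gsa, 'DoJob_ext');
// GetProcAddress returns nil if the export is missing; do not call through nil
if Assigned(dojob_ext) then
writeln('GOCR: '+get_gocr(paramstr(1),'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+-*?='))
else
writeln('Error: DoJob_ext not found in gocr.dll');
windows.FreeLibrary(dll_gsa);
end else begin
writeln('Error loading DLL: gocr.dll');
end;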

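dll_gsa:=Windows.LoadLibrary('ocrad.dll');
if dll_gsa<>0 then begin
// '_Z5DoJobPcS_' is the mangled C++ name of DoJob(char*, char*)
ocrad_dojob:=GetProcAddress(dll_gsa, '_Z5DoJobPcS_');
// GetProcAddress returns nil if the export is missing; do not call through nil
if Assigned(ocrad_dojob) then
writeln('OCRAD: '+get_ocrad(paramstr(1)))
else
writeln('Error: DoJob not found in ocrad.dll');
windows.FreeLibrary(dll_gsa);
end else begin
writeln('Error loading DLL: ocrad.dll');
end;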

end.
36 changes: 36 additions & 0 deletions INSTALL
@@ -0,0 +1,36 @@

Requirements
------------
1. You will need a C/C++ compiler for the DLLs. I built the DLLs with Dev-C++ 5 from
http://www.bloodshed.net/devcpp.html

2. You will need an Object Pascal compiler for the sample program that uses the DLLs.
I used Delphi 7 because I own a license for it, but any Delphi compiler will do, as will
Lazarus (http://www.lazarus-ide.org/) or Free Pascal (http://www.freepascal.org/).
The source is short and should be easy to port to any other language.

3. Images in PPM format. See http://netpbm.sourceforge.net/doc/ppm.html
Any decent image viewer, such as IrfanView, can read this format or convert to it.

4. You will need a Windows tool to apply the patches to the original sources. I used
WinMerge - http://winmerge.org/
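
For quick experiments without a converter, a binary PPM ("P6") file can also
be written by hand. The following is only a minimal sketch in C and not part
of WinOCR; the function name write_ppm and its parameters are made up for
illustration. It writes the text header (magic number, width, height, maximum
colour value) followed by the raw RGB bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of WinOCR: writes a binary PPM ("P6") file.
   rgb holds width*height pixels, 3 bytes each (red, green, blue), row by row.
   Returns 0 on success, -1 on any I/O error. */
int write_ppm(const char *path, const unsigned char *rgb, int width, int height)
{
    size_t count;
    FILE *f;
    int ok;

    assert(path != NULL && rgb != NULL && width > 0 && height > 0);

    f = fopen(path, "wb");   /* binary mode matters on Windows */
    if (f == NULL)
        return -1;

    /* Header: magic number, width, height, maximum colour value (255). */
    ok = fprintf(f, "P6\n%d %d\n255\n", width, height) > 0;

    /* The pixel data follows the header as raw bytes. */
    count = (size_t)width * (size_t)height * 3u;
    if (ok)
        ok = fwrite(rgb, 1, count, f) == count;

    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling write_ppm("sample.ppm", pixels, 2, 1) with six bytes of pixel data
would produce a 2x1 image that can be passed to the tool like any other
PPM file.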

Procedure
---------
1. Unpack the archive if you have not done so already:

unzip winocr[version].zip

2. Apply the patches using WinMerge or any other tool of your choice.
The *.patch files are located in the folder of each OCR library,
together with the newly added files.

3. Compile the DLLs using Dev-C++ or any other compatible compiler.

4. Compile the sample program using any of the mentioned compilers.

5. Put the DLLs (gocr.dll and ocrad.dll) and GSA-Win-OCR.exe into one folder.

6. Use the tool with: "GSA-Win-OCR sample.ppm" or any other sample image as
argument.
26 changes: 26 additions & 0 deletions README
@@ -0,0 +1,26 @@

GSA-Win-OCR
=============
Description
-----------

This tool shows how to use the DLLs built from the modified GOCR and OCRAD
sources on Windows. GOCR has a Windows executable, but launching it and
exchanging data through stdin/stdout was a bit uncomfortable, so I modified
the sources to build DLLs that can be used from any Windows program.
OCRAD, however, had no Windows build at all when I checked.

The two OCR engines work a bit differently, so combining their results can
improve your final result.
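
How the two results are combined is left to the caller. One simple
possibility is sketched below in C; it is not part of WinOCR, and the function
name combine_ocr is made up for illustration. It assumes that the first
engine marks unrecognized characters with a placeholder (passed in as
`unknown`; '_' is assumed here for GOCR) and that two results of equal
length line up character by character, which will not always hold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of WinOCR: merges two OCR results for the
   same image into out (capacity out_size bytes, including the terminator).
   - If primary is empty, secondary is used.
   - If both have the same non-zero length, primary is used, except that
     every position holding the placeholder `unknown` is taken from secondary.
   - Otherwise the results cannot be aligned character by character,
     so primary is used unchanged.
   The output is truncated to fit out_size.
   Returns the number of characters written. */
size_t combine_ocr(const char *primary, const char *secondary,
                   char unknown, char *out, size_t out_size)
{
    size_t len_p, len_s, n, i;
    const char *base;
    int fill;

    assert(primary != NULL && secondary != NULL && out != NULL && out_size > 0);

    len_p = strlen(primary);
    len_s = strlen(secondary);

    base = (len_p == 0) ? secondary : primary;
    fill = (len_p != 0 && len_p == len_s);

    n = strlen(base);
    if (n > out_size - 1)
        n = out_size - 1;

    for (i = 0; i < n; i++) {
        char c = base[i];
        if (fill && c == unknown)
            c = secondary[i];   /* fill the gap from the second engine */
        out[i] = c;
    }
    out[n] = '\0';
    return n;
}
```

With the GOCR result as `primary` and the OCRAD result as `secondary`,
combine_ocr("AB_D", "ABCD", '_', out, sizeof out) stores "ABCD" in out.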

I hope this is useful for someone experimenting with OCR.

Installation
------------

Please read the file "INSTALL" for details.

License
-------

GNU GENERAL PUBLIC LICENSE - Please read the file "COPYING" for details.
6 changes: 6 additions & 0 deletions gocr-0.50/.cvsignore
@@ -0,0 +1,6 @@
.cvsignore
.version
Makefile
autom4te.cache
config.status
config.log
7 changes: 7 additions & 0 deletions gocr-0.50/AUTHORS
@@ -0,0 +1,7 @@
Authors (in chronological order):

Joerg Schulenburg <jNOschulen{at}gmx.SPAM.de> (remove NO+SPAM for valid EMAIL address)
* Original idea and creation, programmer leader

Bruno Barberi Gnecco <brunobg{at}users.sourceforge.net>
* Programmer
55 changes: 55 additions & 0 deletions gocr-0.50/BUGS
@@ -0,0 +1,55 @@
BUGS

Reporting
---------
Please do not hesitate to report bugs, and if possible their fixes! If you
send an example file, please make sure it's small. To report bugs, do one
of the following:

* go to http://sourceforge.net/bugs/?func=addbug&group_id=7147.
This is the preferred way to report bugs.

* send it to one of the authors. Note that we may be busy at times and may
not reply for days. If you post using the previous method, one of the
authors will surely read it.

* use
diff -ru gocr_origin/ gocr_changed/ >patch
to create patches

* if you have compiling problems, do not forget to send your configure-output
and the config.log file


Known bugs (see jocr.SF.net page too)
----------

v0.48 cutting of double melted chars will fail (example: serif MN)
v0.43 on dithered images gocr runs extremely long (seems to hang)
v0.41 linker error using g++ and netpbm under SuSE-9.3
v0.3.5
- segfault on some systems which do not support ifalpha(256+x)
- hexcode not read from database
v0.2.5 german umlauts and i-dots are not handled correctly
problems with high resolution fonts
v0.2.4 I guess, there are still bugs.
Some systems do not handle stack in good manner (AmigaOS?).
gocr does extensively consume stack for recursive functions.
Therefore you can get memory protection failures or strange results.
The worst case is a huge black area. If that is a problem for you,
ask for it to be changed.
--- --- --- --- only for linux freaks --- --- ---
By mistake I programmed an endless recursive function and ...
SuSE6.4+linux2.2.12/13 got several "out of mem" and system CRASHED!!!
ulimit: stack=unlimited
- if text is framed, frame should be ignored, but it is not
v0.2.3 still problems with segmentation
- gcc 2.95.2 (SuSE6.4) error in load_db(), => fixed (thx to jasper)
v0.2.1
- some people have problems running gocr on DOS/Win95
I guess: stack overflow. Is someone able to analyze or fix this?
- large black areas on pbm-files cause a segfault on
Ultra/Sparc (64bit) machines running Linux (2.1.126).
There is a recursive function in the program which causes a
stack overflow, which is not detected by the linux-kernel (BUG?).
I look for a better solution.
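
The stack overflows described above come from recursive functions that
visit large black areas. A common way around this is to keep the pending
pixels in a heap-allocated stack instead of the call stack. The following is
only a sketch of that technique in C, not GOCR's actual code; the function
name flood_fill and the one-byte-per-pixel image layout are assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sketch, not GOCR's code: recolours the 4-connected region of
   pixels equal to `from` that contains (x0, y0) with the value `to`.
   img is a width*height array, one byte per pixel, row by row.
   Pending pixels are kept in a heap-allocated stack instead of the call
   stack, so a huge black area cannot overflow the call stack.
   Returns the number of pixels changed, or -1 if memory runs out. */
long flood_fill(unsigned char *img, int width, int height,
                int x0, int y0, unsigned char from, unsigned char to)
{
    static const int dx[4] = { 1, -1, 0, 0 };
    static const int dy[4] = { 0, 0, 1, -1 };
    size_t *stack;
    size_t top = 0;
    size_t start;
    long changed;
    int k;

    assert(img != NULL && width > 0 && height > 0);
    if (x0 < 0 || x0 >= width || y0 < 0 || y0 >= height)
        return 0;
    start = (size_t)y0 * (size_t)width + (size_t)x0;
    if (from == to || img[start] != from)
        return 0;

    /* Each pixel is recoloured when it is pushed, so it is pushed at most
       once and width*height entries are always enough. */
    stack = malloc((size_t)width * (size_t)height * sizeof *stack);
    if (stack == NULL)
        return -1;

    img[start] = to;
    stack[top++] = start;
    changed = 1;

    while (top > 0) {
        size_t idx = stack[--top];
        int x = (int)(idx % (size_t)width);
        int y = (int)(idx / (size_t)width);

        for (k = 0; k < 4; k++) {
            int nx = x + dx[k];
            int ny = y + dy[k];
            size_t nidx;

            if (nx < 0 || nx >= width || ny < 0 || ny >= height)
                continue;
            nidx = (size_t)ny * (size_t)width + (size_t)nx;
            if (img[nidx] != from)
                continue;
            img[nidx] = to;
            stack[top++] = nidx;
            changed++;
        }
    }

    free(stack);
    return changed;
}
```

Memory use is bounded by the image size, and running out of memory is
reported through the return value instead of crashing the process.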
17 changes: 17 additions & 0 deletions gocr-0.50/CREDITS
@@ -0,0 +1,17 @@
Thanks:
...to everyone who contributed to gocr. If you feel that your
name should be in this list, write mail to the author. These
are in no particular order:

G.Kugler for sending me first example files and testing. (MayMM)
Klaas Freitag for the libPgm2asc-patch <freitag@suse.de>
Ryan Dibble for the otsu.c file <dibbler@umich.edu>
Tim Waugh for the man page <twaugh@redhat.com>
David Pinson for the tkispell-patch <dpinson@materials.unsw.EDU.AU>
Martin Goldhahn for some patches <Martin.Goldhahn@Webcenter.no>
Eberhard Burkard for the gocr.tcl patch <E.Burkard@web.de>
James R. Van Zandt for lots of tips <jrv@vanzandt.mv.com>
...

... and everyone else who submitted bug-reports,
feature-requests, patches and lots of example files.