Releases: xavctn/img2table

img2table 1.3.0

01 Sep 18:24
8eb9ca9

Features

  • Complete overhaul of the line detection algorithm to improve detection of lines defined by background color changes
  • Improvement in detection of semi-bordered cells
  • Update detection of rows in borderless tables
  • Add support for Surya OCR
  • Add detection of implicit columns via the implicit_columns parameter
  • Optimization of code performance via numba refactoring
  • Update of example notebooks
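
As a rough illustration of the implicit_columns parameter mentioned above, the sketch below wraps a call to a document's extract_tables method. The helper name is hypothetical, and the extract_tables method and its ocr keyword are assumed from the project's README rather than stated in these notes, so check them against the installed version.

```python
def extract_with_implicit_columns(doc, ocr=None):
    """Run table extraction with implicit column detection switched on.

    `doc` is assumed to be an img2table document object, for example
    img2table.document.Image or img2table.document.PDF.
    """
    return doc.extract_tables(
        ocr=ocr,                # optional OCR instance (assumed keyword)
        implicit_columns=True,  # parameter named in the 1.3.0 notes
    )


# Hypothetical usage (the file name is a placeholder):
# from img2table.document import Image
# tables = extract_with_implicit_columns(Image(src="scan.png"))
```

The helper only forwards keyword arguments, so any other extraction options would still need to be passed to extract_tables directly.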

Bug fixes

  • Fix bug with text position when extracting text from rotated PDFs

img2table 1.2.11

26 Feb 23:03
cd944cf
  • Simpler and more consistent line detection
  • Detection of discontinuous columns in borderless tables

img2table 1.2.10

11 Feb 23:13
  • Fix miscellaneous code left over from legacy processing
  • Add margin to top/bottom of borderless tables

img2table 1.2.9

11 Feb 21:26
618a7f8

What's Changed

  • Update metrics computation and borderless table detection
  • Add compatibility with Python 3.12
  • Add support for documents with black backgrounds

img2table 1.2.8

02 Jan 11:48
  • Fix division by zero bug introduced in previous release

img2table 1.2.7

31 Dec 16:31
a688b8d
  • Fix bugs
  • Improve computation of image metrics on noisy documents
  • Modify row detection for borderless tables to account for merged cells
  • Implement the Adaptive Run Length Smoothing Algorithm to isolate text areas in images
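
For readers unfamiliar with run length smoothing, here is a minimal sketch of the classic, non-adaptive horizontal variant on a binary image (1 = ink, 0 = background). It is not the library's implementation: the function names and the fixed max_gap threshold are assumptions for illustration, whereas the adaptive version chooses its thresholds from local properties of the image.

```python
def smooth_row(row, max_gap):
    """Fill background runs of at most `max_gap` pixels that lie between ink pixels."""
    out = list(row)
    last_ink = None  # index of the most recent ink pixel seen so far
    for i, px in enumerate(row):
        if px:
            gap = 0 if last_ink is None else i - last_ink - 1
            if 0 < gap <= max_gap:
                # Short gap between two ink pixels: merge them into one run.
                for j in range(last_ink + 1, i):
                    out[j] = 1
            last_ink = i
    return out


def rlsa_horizontal(image, max_gap):
    """Apply horizontal run length smoothing to every row of a binary image."""
    return [smooth_row(row, max_gap) for row in image]
```

Gaps at the start or end of a row are left alone, since they are not enclosed by ink on both sides. Smoothing along rows merges characters into word or line blobs; applying the same idea along columns and combining the results is a common way to obtain text blocks.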

img2table 1.2.6

16 Dec 16:55
  • Fix bugs related to OCR / table content extraction

img2table 1.2.5

03 Dec 08:59
  • Fix bug in line detection
  • Fix bug in cell creation
  • Optimization of algorithm performance

img2table 1.2.4

22 Nov 19:58
  • Improved processing of tables with dotted lines
  • Add detection of semi-bordered cells in tables
  • Update borderless table algorithm
  • Speed improvements and code optimization (2 to 4x faster depending on inputs)

img2table 1.2.3

18 Oct 20:36
c56fed0
  • Add HTML representation to extracted tables
  • Call OCR only on pages/images containing tables
  • Bump Pillow requirement to address vulnerabilities
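
To give a sense of what an HTML representation of an extracted table involves, the sketch below renders a list of rows as an HTML table. It is a generic illustration, not the library's own code: rows_to_html and its header flag are hypothetical names, and the first row is assumed to be the header.

```python
from html import escape


def rows_to_html(rows, header=True):
    """Render a list of rows (lists of cell values) as an HTML table string."""
    parts = ["<table>"]
    for i, row in enumerate(rows):
        # Treat the first row as the header row when `header` is set.
        tag = "th" if header and i == 0 else "td"
        cells = "".join(
            # Empty cells (None) become empty strings; text is HTML-escaped.
            f"<{tag}>{escape('' if cell is None else str(cell))}</{tag}>"
            for cell in row
        )
        parts.append(f"  <tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)
```

Escaping cell text matters here because OCR output can contain characters such as < or & that would otherwise break the markup.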