This repository contains code and resources for detecting tables in various types of documents using machine learning and computer vision techniques.
Article link: https://www.analyticsvidhya.com/blog/2023/08/detecting-table-rows-and-columns-in-images-using-transformers/
Detecting tables in documents is a common problem in information extraction and document analysis. This project aims to provide tools and solutions to automate the process of identifying and extracting tables from different types of documents, such as PDFs, images, and scanned documents.
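PubTables-1M is a dataset of nearly one million tables drawn from scientific articles, built to support table extraction research. It supports multiple input modalities, includes detailed header and location annotations for table structures, and applies a canonicalization procedure to correct over-segmentation, which would otherwise leave the annotations inconsistent.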
DETR (DEtection TRansformer) combines a ResNet-based convolutional backbone with an encoder-decoder Transformer, enabling object detection without hand-designed components such as region proposals. It is trained end to end with a bipartite matching loss, which assigns each ground-truth object to exactly one prediction. Experimental results on PubTables-1M underscore the role of canonicalized annotations in boosting performance.
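The bipartite matching step can be illustrated with a minimal sketch. This is not the code used in this repository or in DETR itself: DETR solves the assignment with the Hungarian algorithm and also includes a generalized IoU term with per-term weights, whereas the version below uses a brute-force search and an unweighted cost of negative class probability plus L1 box distance. The names `box_l1`, `match_cost`, and `bipartite_match`, and the sample predictions, are hypothetical and chosen only for illustration.

```python
from itertools import permutations


def box_l1(a, b):
    """L1 distance between two boxes given as (cx, cy, w, h)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_cost(pred, target):
    """Simplified matching cost (assumption: no GIoU term, no weights)."""
    class_cost = -pred["probs"].get(target["label"], 0.0)
    return class_cost + box_l1(pred["box"], target["box"])


def bipartite_match(preds, targets):
    """Return (pred_index, target_index) pairs with minimum total cost.

    Brute force over all one-to-one assignments; only practical for a
    handful of boxes. Assumes len(preds) >= len(targets).
    """
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return sorted((p, t) for t, p in enumerate(best_perm))


# Hypothetical model output: three predictions, two ground-truth tables.
preds = [
    {"probs": {"table": 0.1, "no_object": 0.9}, "box": (0.5, 0.5, 0.2, 0.2)},
    {"probs": {"table": 0.8, "no_object": 0.2}, "box": (0.3, 0.3, 0.4, 0.4)},
    {"probs": {"table": 0.7, "no_object": 0.3}, "box": (0.7, 0.7, 0.2, 0.3)},
]
targets = [
    {"label": "table", "box": (0.7, 0.7, 0.2, 0.3)},
    {"label": "table", "box": (0.3, 0.3, 0.4, 0.4)},
]

matches = bipartite_match(preds, targets)
matched_preds = {p for p, _ in matches}
# Predictions left unmatched are trained toward the "no object" class.
unmatched = [i for i in range(len(preds)) if i not in matched_preds]
print(matches, unmatched)
```

Because every ground-truth table is paired with exactly one prediction, duplicate detections of the same table are penalized during training, which is why this style of loss does not need a separate duplicate-removal step.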
This project is licensed under the MIT License.