Commit

chore: Add new pypi token for submit table_cls
SWHL committed Sep 12, 2024
1 parent 9ebae19 commit 523763d
Showing 6 changed files with 23 additions and 17 deletions.
7 changes: 3 additions & 4 deletions .github/workflows/table_cls.yml
@@ -42,10 +42,10 @@ jobs:
     steps:
       - uses: actions/checkout@v3

-      - name: Set up Python 3.7
+      - name: Set up Python 3.10
         uses: actions/setup-python@v4
         with:
-          python-version: '3.7'
+          python-version: '3.10'
           architecture: 'x64'

       - name: Run setup.py
@@ -60,9 +60,8 @@ jobs:
           python setup_table_cls.py bdist_wheel "${{ github.event.head_commit.message }}"
       - name: Publish distribution 📦 to PyPI
         uses: pypa/gh-action-pypi-publish@v1.5.0
         with:
-          password: ${{ secrets.PYPI_API_TOKEN }}
+          password: ${{ secrets.TABLE_CLS }}
           packages_dir: dist/
10 changes: 8 additions & 2 deletions README.md
@@ -15,6 +15,7 @@
</div>

### Introduction

This repo is an inference library for structured recognition of tables in documents. It includes table structure recognition models from PaddleOCR and wired and wireless table recognition models from Alibaba Duguang, among others.

The repo improves the pre- and post-processing of table recognition and combines it with OCR, so the table recognition component can be used directly.
@@ -24,6 +25,7 @@ The repo will continue to focus on the field of table recognition, integrate the
Everyone is welcome to follow the project's progress.

### What is Table Structure Recognition?

Table Structure Recognition (TSR) aims to extract the logical or physical structure of table images, thereby converting unstructured table images into machine-readable formats.

Logical structure: represents the row/column relationship of cells (such as the same row, the same column) and the span information of cells.
@@ -37,28 +39,32 @@ Physical structure: includes not only the logical structure, but also the cell's
Figure from: [Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling](https://openaccess.thecvf.com/content/CVPR2023/html/Huang_Improving_Table_Structure_Recognition_With_Visual-Alignment_Sequential_Coordinate_Modeling_CVPR_2023_paper.html)

### Documentation

Full documentation can be found on [docs](https://rapidai.github.io/TableStructureRec/docs/), in Chinese.

### Acknowledgements

[PaddleOCR Table](https://github.com/PaddlePaddle/PaddleOCR/blob/4b17511491adcfd0f3e2970895d06814d1ce56cc/ppstructure/table/README_ch.md)

[Cycle CenterNet](https://www.modelscope.cn/models/damo/cv_dla34_table-structure-recognition_cycle-centernet/summary)

[LORE](https://www.modelscope.cn/models/damo/cv_resnet-transformer_table-structure-recognition_lore/summary)


### Contributing

Pull requests are welcome. For major changes, please open an issue first
to discuss what you would like to change.

Please make sure to update tests as appropriate.

### [Sponsor](https://rapidai.github.io/Knowledge-QA-LLM/docs/sponsor/)

If you want to sponsor the project, click the **Buy me a coffee** image and include a note (e.g. your GitHub account name) so that you can be added to the sponsorship list below.

<div align="left">
<a href="https://www.buymeacoffee.com/SWHL"><img src="https://raw.githubusercontent.com/RapidAI/.github/main/assets/buymeacoffe.png" width="30%" height="30%"></a>
</div>

### License
This project is released under the [Apache 2.0 license](https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE).

This project is released under the [Apache 2.0 license](https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE).
1 change: 0 additions & 1 deletion demo_table_cls.py
@@ -2,7 +2,6 @@
 from table_cls import TableCls

 table_cls = TableCls()
-output_dir = "outputs"
 img_path = "tests/test_files/table_cls/lineless_table.png"
 cls_str, elapse = table_cls(img_path)
 print(cls_str)
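The demo above calls a `TableCls` instance directly and unpacks a `(cls_str, elapse)` pair. The sketch below illustrates that calling convention only. `StubTableCls`, its filename-based rule, and the label strings `"wired"` and `"wireless"` are hypothetical stand-ins, not the library's model or its actual labels.

```python
import time
from typing import Tuple


class StubTableCls:
    """Hypothetical stand-in that mimics the (label, elapsed) return shape."""

    def __call__(self, img_path: str) -> Tuple[str, float]:
        start = time.perf_counter()
        # Assumption: a filename rule replaces real model inference here.
        label = "wireless" if "lineless" in img_path else "wired"
        elapse = time.perf_counter() - start
        return label, elapse


table_cls = StubTableCls()
cls_str, elapse = table_cls("tests/test_files/table_cls/lineless_table.png")
print(cls_str, f"{elapse:.6f}s")
```

Returning the elapsed time alongside the label lets a caller log latency without wrapping every call in its own timer.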
11 changes: 8 additions & 3 deletions docs/README_zh.md
@@ -15,6 +15,7 @@
</div>

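### Introduction

This repository is an inference library for structured recognition of tables in documents, including table structure recognition models from PaddleOCR and wired and wireless table recognition models from Alibaba Duguang.

The repository refines the pre- and post-processing of table recognition and combines it with OCR, so that the table recognition component can be used directly.
@@ -24,6 +25,7 @@
Everyone is welcome to keep following the project.

### Table Structure Recognition

Table Structure Recognition (TSR) aims to extract the logical or physical structure of table images, thereby converting unstructured table images into a machine-readable format.

Logical structure: represents the row/column relationships of cells (for example, same row or same column) and the span information of cells.
@@ -37,24 +39,27 @@
Figure from: [Improving Table Structure Recognition with Visual-Alignment Sequential Coordinate Modeling](https://openaccess.thecvf.com/content/CVPR2023/html/Huang_Improving_Table_Structure_Recognition_With_Visual-Alignment_Sequential_Coordinate_Modeling_CVPR_2023_paper.html)

### Documentation

For the full documentation, see [docs](https://rapidai.github.io/TableStructureRec/docs/)

### Acknowledgements

[PaddleOCR Table Recognition](https://github.com/PaddlePaddle/PaddleOCR/blob/4b17511491adcfd0f3e2970895d06814d1ce56cc/ppstructure/table/README_ch.md)

[Duguang Table Structure Recognition: Wired Tables](https://www.modelscope.cn/models/damo/cv_dla34_table-structure-recognition_cycle-centernet/summary)

[Duguang Table Structure Recognition: Wireless Tables](https://www.modelscope.cn/models/damo/cv_resnet-transformer_table-structure-recognition_lore/summary)


### Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

### [Sponsor](https://rapidai.github.io/Knowledge-QA-LLM/docs/sponsor/)
If you would like to sponsor the project, click the Sponsor button at the top of this page and include a note (**your GitHub account name**) so that you can be added to the sponsor list.

If you would like to sponsor the project, click the Sponsor button at the top of this page and include a note (**your GitHub account name**) so that you can be added to the sponsor list.

### License
This project is released under the [Apache 2.0](https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE) open-source license.

This project is released under the [Apache 2.0](https://github.com/RapidAI/TableStructureRec/blob/c41bbd23898cb27a957ed962b0ffee3c74dfeff1/LICENSE) open-source license.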
2 changes: 1 addition & 1 deletion table_cls/main.py
@@ -1,6 +1,6 @@
import time

from pathlib import Path

import numpy as np
import onnxruntime
from PIL import Image
9 changes: 3 additions & 6 deletions table_cls/utils.py
@@ -2,10 +2,9 @@
 from pathlib import Path
 from typing import Union

-from PIL import UnidentifiedImageError
-from PIL import Image
-import numpy as np
 import cv2
+import numpy as np
+from PIL import Image, UnidentifiedImageError

 InputType = Union[str, np.ndarray, bytes, Path, Image.Image]

@@ -15,9 +14,7 @@ class LoadImageError(Exception):


 class LoadImage:
-    def __init__(self,
-    ):
+    def __init__(self):
         pass

     def __call__(self, img: InputType) -> np.ndarray:
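The diff shows only the signature of `LoadImage.__call__`, which accepts any `InputType` and returns an `np.ndarray`; its body is collapsed on this page. As a rough illustration of how such a loader might dispatch on input type, here is a minimal sketch. The function name `load_image`, the RGB conversion, and the choice to use Pillow alone (the real module also imports `cv2`) are all assumptions, not the repository's implementation.

```python
import io
from pathlib import Path
from typing import Union

import numpy as np
from PIL import Image, UnidentifiedImageError

InputType = Union[str, np.ndarray, bytes, Path, Image.Image]


class LoadImageError(Exception):
    """Raised when an input cannot be turned into an image array."""


def load_image(img: InputType) -> np.ndarray:
    """Hypothetical helper: normalise any InputType to an RGB array."""
    if isinstance(img, np.ndarray):
        return img  # assumed to be an image array already
    if isinstance(img, Image.Image):
        return np.asarray(img.convert("RGB"))
    try:
        if isinstance(img, (str, Path)):
            with Image.open(img) as opened:
                return np.asarray(opened.convert("RGB"))
        if isinstance(img, bytes):
            with Image.open(io.BytesIO(img)) as opened:
                return np.asarray(opened.convert("RGB"))
    except (FileNotFoundError, UnidentifiedImageError) as exc:
        raise LoadImageError(f"cannot load image: {exc}") from exc
    raise LoadImageError(f"unsupported input type: {type(img)}")


# Usage: encode a tiny red image to PNG bytes, then load it back.
buffer = io.BytesIO()
Image.new("RGB", (4, 3), (255, 0, 0)).save(buffer, format="PNG")
array = load_image(buffer.getvalue())
print(array.shape)
```

Wrapping Pillow's exceptions in a single `LoadImageError` gives callers one exception type to catch, whatever form the input took.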
