[Implementation] Implement OCR output preprocessing #6

Open

1 of 2 tasks

KyubumShin opened this issue May 23, 2022 · 1 comment

Contributor

KyubumShin commented May 23, 2022 (edited)


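What to implement

Implement a preprocessing step that merges the OCR output, which is split at whitespace (word) boundaries, into line-level units.

Details

As shown in the image below, the OCR API returns the extracted text split at whitespace, one entry per word.
Code is needed to merge these entries into line-level units.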
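
One possible way to merge word-level boxes into lines is to group boxes whose vertical extents overlap and then sort each group from left to right. The sketch below is not the code from this issue: `WordBox`, `group_into_lines`, and the `min_overlap` threshold are hypothetical names, and it assumes axis-aligned boxes on an unrotated image.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WordBox:
    # Hypothetical container; the real OCR API fields may differ.
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def group_into_lines(words: List[WordBox], min_overlap: float = 0.5) -> List[List[WordBox]]:
    """Group word boxes into lines by vertical overlap with a line's latest box."""
    lines: List[List[WordBox]] = []
    for word in sorted(words, key=lambda b: (b.y, b.x)):
        for line in lines:
            ref = line[-1]
            # Height of the vertical intersection (negative if the boxes are disjoint).
            overlap = min(ref.y + ref.h, word.y + word.h) - max(ref.y, word.y)
            if overlap >= min_overlap * min(ref.h, word.h):
                line.append(word)
                break
        else:
            # No existing line overlaps enough, so start a new one.
            lines.append([word])
    # Order the words of each line from left to right.
    return [sorted(line, key=lambda b: b.x) for line in lines]


words = [
    WordBox("Gildong", 60, 12, 70, 20),
    WordBox("Seoul", 10, 50, 50, 20),
    WordBox("Hong", 10, 10, 40, 20),
]
lines = group_into_lines(words)
texts = [" ".join(b.text for b in line) for line in lines]
```

The overlap test is relative to the smaller of the two box heights, so a small box next to a tall one can still join the same line. Whether that threshold suits real business-card images would need to be checked against actual OCR output.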

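Task

Write code that converts the whitespace-split output into line-level units
Consider edge cases

Expected edge cases (images to be added)

The image is rotated
Vertical (portrait) business cards
Box sizes differ because of the font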
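
For the rotated-image case, one option is to estimate the skew angle from the word boxes and rotate the coordinates back before grouping them into lines. The following is only a sketch under strong assumptions: it presumes the OCR output provides each box as four corner points in the order top-left, top-right, bottom-right, bottom-left, which this issue does not confirm, and `estimate_skew` and `deskew_point` are hypothetical helper names.

```python
import math
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]
# Assumed corner order: top-left, top-right, bottom-right, bottom-left.
Quad = Tuple[Point, Point, Point, Point]


def estimate_skew(quads: List[Quad]) -> float:
    """Return the median angle (in radians) of the top edge of each box."""
    return median(
        math.atan2(tr[1] - tl[1], tr[0] - tl[0]) for tl, tr, _, _ in quads
    )


def deskew_point(p: Point, angle: float) -> Point:
    """Rotate a point by -angle about the origin to undo the estimated skew."""
    c, s = math.cos(-angle), math.sin(-angle)
    return (p[0] * c - p[1] * s, p[0] * s + p[1] * c)
```

Using the median rather than the mean keeps a few badly detected boxes from dominating the estimate. Portrait cards and font-dependent box sizes would need separate handling that this sketch does not cover.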

KyubumShin added the enhancement label

KyubumShin self-assigned this

Contributor Author

KyubumShin commented May 25, 2022

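Prototype implementation results

Implemented code that converts the whitespace-split output into line-level units
The input is the JSON returned by the OCR API
The output uses the same format
When merging, the text pieces are joined with a space between them (may change after discussion with the team)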
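
To illustrate the "same format in, same format out" idea, here is a minimal sketch that merges the word entries of one line into a single entry with the same shape. The JSON schema is an assumption made for illustration (a `text` string and a `box` given as `[x_min, y_min, x_max, y_max]`); the actual OCR API response is not shown in this issue, and `merge_words` is a hypothetical name.

```python
import json
from typing import Any, Dict, List


def merge_words(words: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge the word entries of one line into a single entry of the same shape."""
    ordered = sorted(words, key=lambda w: w["box"][0])  # left to right
    return {
        # Join the pieces with a single space, as the prototype does.
        "text": " ".join(w["text"] for w in ordered),
        # The merged box is the union of the word boxes.
        "box": [
            min(w["box"][0] for w in ordered),
            min(w["box"][1] for w in ordered),
            max(w["box"][2] for w in ordered),
            max(w["box"][3] for w in ordered),
        ],
    }


# Hypothetical input: two word entries that belong to the same line.
raw = '[{"text": "Gildong", "box": [60, 12, 130, 32]}, {"text": "Hong", "box": [10, 10, 50, 30]}]'
line = merge_words(json.loads(raw))
print(json.dumps(line))
```

Because the merged entry has the same keys as a word entry, downstream code that reads the OCR JSON should not need to change. The separator is isolated in one place in case the team decides on a different joining rule.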

KyubumShin added a commit that referenced this issue

[Fix] Fix word2fix.py #6

c5fa617

Modify line 19
Skip the ratio check for the characters , . :
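
The commit message does not define what "ratio" measures. Assuming it is a height ratio between two neighboring boxes, a check that exempts the three punctuation characters could look like the sketch below. `passes_ratio_check`, `PUNCTUATION`, and the `max_ratio` value are hypothetical and not taken from word2fix.py.

```python
PUNCTUATION = {",", ".", ":"}


def passes_ratio_check(
    text_a: str, height_a: float, text_b: str, height_b: float, max_ratio: float = 1.5
) -> bool:
    """Return True if two boxes are similar enough in height to share a line."""
    # Punctuation boxes are much smaller than letter boxes, so skip the check for them.
    if text_a in PUNCTUATION or text_b in PUNCTUATION:
        return True
    smaller, larger = sorted((height_a, height_b))
    if smaller <= 0:
        # Degenerate box: treat as a failed check instead of dividing by zero.
        return False
    return larger / smaller <= max_ratio
```

Without the exemption, a box containing only `:` next to a tall word such as `Tel` would likely fail the ratio test and be split onto its own line.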

dudskrla mentioned this issue

[Implementation] OCR API output preprocessing #9

Closed

2 tasks
