lu4p / cat

Extract text from plaintext, .docx, .odt, and .rtf files. Pure Go.

cat go golang cross-platform text-extraction extract-text pdftotext docx2txt textextracting rtf-to-text pdf2txt odt2txt

Updated Nov 25, 2023
Go

hrushikeshrv / docxlatex

A Python library for extracting equations, text, and images from .docx files.

converts latex extract docx docx2txt

Updated Jul 27, 2024
Python

MoinDalvs / Resume_Screening_and_Parser

Business objective: the document classification solution should significantly reduce manual effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention. Sample data set: resumes and financial documents.

data-science text-mining text-classification text text-analysis text-processing doc2vec docx2txt unstructured-data pdf-document-processor pdf2txt doc2txt docx-converter docx-to-pdf streamlit

Updated Dec 3, 2022
Jupyter Notebook

bat9r / KW

Find words in PDF, DOCX, and TXT files.

pyqt5 python3 pdfminer3k docx2txt

Updated Mar 28, 2017
Python

ehsanghaffar / python-snippets

This repository is a collection of various Python code snippets and small applications that demonstrate Python's versatility and ease of use.

python font discord python-script discord-bot docx docx2txt

Updated Nov 20, 2024
Python

malakhovks / doc-docx-extract-api

Atomic Web Service (AWS, REST API) for converting DOC/DOCX files to plain/text, powered by catdoc, docx2txt and Node.js

nodejs node microservice service text rest-api docx doc restful-api extract-data docx2txt catdoc docx-files doc-files

Updated Nov 8, 2017
JavaScript

HamzaZaidiX / Chrome-Browser-Clone

A Chrome browser clone written in Python.

python pyqt5 python-library chrome-browser python3 pip docx2txt python-project browser-clone chrome-browser-clone

Updated Oct 18, 2022
Python

Karthik-02 / plagiarism-detection

Provides a comprehensive solution for detecting plagiarism and finding similarities between text documents.

pandas cosine-similarity plagiarism pypdf2 tokenization plagiarism-checker plagiarism-detection datavisualization docx2txt plagiarism-detector nltk-python plotly-express webscrapping-python

Updated May 13, 2024
Python

MoinDalvs / Resume_Classification

Business objective: the document classification solution should significantly reduce manual effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.

data-science text-mining text-classification text-analysis classification docx text-processing resume-parser textract classification-algorithm resume-app docx2txt ensemble-machine-learning pdfplumber

Updated Dec 3, 2022
Jupyter Notebook

zhxxch / docx2txt

docx2txt for git-diff in C#

windows docx2txt text-extract

Updated Feb 5, 2021
PowerShell

VaibhavDongre1311 / End_to_end_Resume_Classification__project

Business objective: the document classification solution should significantly reduce manual effort in HRM. It should achieve a higher level of accuracy and automation with minimal human intervention.