Flask Image Text Extractor

Flask Image Text Extractor is a RESTful API that extracts text from images. Given a list of image URLs, it processes each image, applies optical character recognition (OCR) using Tesseract, and returns the extracted text for each image. A second endpoint accepts a single uploaded image.

Features

Extracts text from online images using Tesseract OCR.
Supports processing multiple image URLs in a single request.
Accepts at most 8 URLs per request.
Provides error handling for invalid image URLs or processing failures.

Requirements

Python 3.6 or higher
Flask
requests
Pillow
pytesseract
Tesseract OCR

Installation

Clone the repository:

git clone https://github.com/tittoh/image-text-extractor.git

Install the dependencies using pip:
```
pip install -r requirements.txt
```
Install Tesseract OCR. You can follow the installation instructions specific to your operating system:
- Tesseract OCR Installation

Usage

Start the Flask development server:
```
python main.py
```
Send a POST request to the `/process_images` endpoint with a JSON list of image URLs, or to the `/process_uploads` endpoint with an uploaded image file. For example:

Process images from URLs:

POST /process_images HTTP/1.1
Content-Type: application/json

{
  "image_urls": [
    "http://example.com/image1.jpg",
    "http://example.com/image2.jpg",
    "http://example.com/image3.jpg"
  ]
}

Process uploaded image:

POST /process_uploads HTTP/1.1
Content-Type: multipart/form-data

# Include the uploaded image file

The API will process the images, extract the text using OCR, and return a response with the extracted text for each image.
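
As a rough illustration of the request above, the following sketch builds the same POST from Python using only the standard library. The base URL `http://127.0.0.1:5000` is an assumption (Flask's usual development default; `main.py` may configure something else), and the helper names `build_process_images_request` and `send` are hypothetical, not part of this project.

```python
import json
import urllib.request

# Assumption: the development server listens on Flask's usual default address.
BASE_URL = "http://127.0.0.1:5000"


def build_process_images_request(image_urls, base_url=BASE_URL):
    """Build (but do not send) a POST request for /process_images."""
    body = json.dumps({"image_urls": list(image_urls)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/process_images",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send(build_process_images_request(["http://example.com/image1.jpg"]))` should return the parsed JSON response. Building and sending are kept separate so the request can be inspected without a live server.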

API Endpoints

`POST /process_images`

Extracts text from online images.

Request Body:

{
  "image_urls": [
    "http://example.com/image1.jpg",
    "http://example.com/image2.jpg",
    "http://example.com/image3.jpg"
  ]
}

image_urls (required): A list of image URLs to process. Maximum of 8 URLs allowed.
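
The constraints on `image_urls` could be checked along these lines. This is a minimal sketch, not the code in `main.py`: the function name `validate_payload` and the error messages are hypothetical, and only the 8-URL limit and the required field come from this README.

```python
MAX_URLS = 8  # limit stated in this README


def validate_payload(payload):
    """Return the list of URLs, or raise ValueError describing the problem."""
    if not isinstance(payload, dict) or "image_urls" not in payload:
        raise ValueError("image_urls is required")
    urls = payload["image_urls"]
    if not isinstance(urls, list) or not all(isinstance(u, str) for u in urls):
        raise ValueError("image_urls must be a list of strings")
    if len(urls) > MAX_URLS:
        raise ValueError(f"at most {MAX_URLS} URLs are allowed per request")
    return urls
```

A client can run the same check before sending a request to avoid a round trip that the server would reject.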

Response:

[
  {
    "id": "image1",
    "text": "Text extracted from image1.jpg"
  },
  {
    "id": "image2",
    "text": "Text extracted from image2.jpg"
  },
  {
    "id": "image3",
    "error": "Error message for image3.jpg"
  }
]

id: The ID of the image (derived from the image URL's filename without the extension).
text: The extracted text from the image.
error: An error message, returned instead of the text field when processing that image fails.
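
The response shape described above can be sketched as follows. The `fetch` and `ocr` parameters are hypothetical stand-ins for whatever downloads and reads an image (in a service like this one, plausibly `requests` and `pytesseract`); they are injected so the sketch stays self-contained, and the actual logic in `main.py` may differ.

```python
import posixpath
from urllib.parse import urlparse


def image_id(url):
    """Filename from the URL path, without its extension."""
    filename = posixpath.basename(urlparse(url).path)
    return posixpath.splitext(filename)[0]


def process_urls(urls, fetch, ocr):
    """Build one result dict per URL; a failure on one image does not stop the rest."""
    results = []
    for url in urls:
        entry = {"id": image_id(url)}
        try:
            entry["text"] = ocr(fetch(url))
        except Exception as exc:  # report the failure for this image only
            entry["error"] = str(exc)
        results.append(entry)
    return results
```

Each entry carries either `text` or `error`, never both, which matches the example response: a client can test for the `error` key to separate failed images from successful ones.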

`POST /process_uploads`

Extracts text from an uploaded image.

Request Body:

POST /process_uploads HTTP/1.1
Content-Type: multipart/form-data

# Include the uploaded image file

image (required): The uploaded image file.

Response:

{
  "text": "Text extracted from the uploaded image"
}

text: The extracted text from the uploaded image.
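
For illustration, an upload request can be assembled with the standard library alone. This sketch assumes the form field is named `image` (as listed above) and that the server runs at `http://127.0.0.1:5000`; the helper name `build_upload_request` and the default content type are hypothetical.

```python
import urllib.request
import uuid

# Assumption: the development server listens on Flask's usual default address.
BASE_URL = "http://127.0.0.1:5000"


def build_upload_request(filename, data, content_type="image/jpeg", base_url=BASE_URL):
    """Build (but do not send) a multipart/form-data POST for /process_uploads."""
    boundary = uuid.uuid4().hex
    # One form part named "image" carrying the raw file bytes.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/process_uploads",
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the file. An HTTP client library that builds multipart bodies for you would shorten this considerably; the manual version is shown only to make the request format explicit.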

Testing

Run the tests:

python -m unittest test_main.py

Contributing

Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

