Flask Image Text Extractor is a RESTful API that extracts text from online images. Given a list of image URLs, it processes each image, applies optical character recognition (OCR) using Tesseract, and returns the extracted text for each image.
- Extracts text from online images using Tesseract OCR.
- Supports processing multiple image URLs in a single request.
- Limits the maximum number of URLs to 8 per request.
- Provides error handling for invalid image URLs or processing failures.
- Python 3.6 or higher
- Flask
- requests
- Pillow
- pytesseract
- Tesseract OCR
- Clone the repository:
git clone https://github.com/tittoh/image-text-extractor.git
- Install the dependencies using pip:
pip install -r requirements.txt
- Install Tesseract OCR, following the installation instructions for your operating system.
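Before starting the server, it can help to confirm that the Tesseract binary is reachable. The following is a minimal sketch, assuming the executable is named `tesseract` and is expected on the PATH; `find_tesseract` is a hypothetical helper and not part of this project.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    # Assumption: the executable is called "tesseract"; adjust if your install differs.
    return shutil.which(binary_name)


path = find_tesseract()
if path is None:
    print("Tesseract was not found on PATH; install it before running the API.")
else:
    print(f"Tesseract found at {path}")
```

If the binary lives somewhere that is not on the PATH, the check above will report it as missing even though it is installed.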
- Start the Flask development server:
python main.py
- Make POST requests to the /process_images endpoint with the list of image URLs, or to the /process_uploads endpoint with an uploaded image. For example:
- Process images from URLs:
POST /process_images HTTP/1.1
Content-Type: application/json

{
  "image_urls": [
    "http://example.com/image1.jpg",
    "http://example.com/image2.jpg",
    "http://example.com/image3.jpg"
  ]
}
- Process uploaded image:
POST /process_uploads HTTP/1.1
Content-Type: multipart/form-data
# Include the uploaded image file
- The API will process the images, extract the text using OCR, and return a response with the extracted text for each image.
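The URL-based request described above could be built from Python roughly as follows. This is a sketch under stated assumptions: the server address `http://127.0.0.1:5000` is Flask's usual development default and is not confirmed by this README, and `build_process_images_request` and `send` are hypothetical helper names. It uses only the standard library so it runs without extra packages.

```python
import json
import urllib.request

# Assumption: the development server listens on Flask's default address.
BASE_URL = "http://127.0.0.1:5000"


def build_process_images_request(image_urls, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for /process_images."""
    body = json.dumps({"image_urls": list(image_urls)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/process_images",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_process_images_request(
    ["http://example.com/image1.jpg", "http://example.com/image2.jpg"]
)
print(request.get_method(), request.full_url)
```

With the server running, `send(request)` would return the decoded response. Building and sending are kept separate so the request can be inspected without any network access.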
The /process_images endpoint extracts text from online images.
Request Body:
{
  "image_urls": [
    "http://example.com/image1.jpg",
    "http://example.com/image2.jpg",
    "http://example.com/image3.jpg"
  ]
}
- image_urls (required): A list of image URLs to process. A maximum of 8 URLs is allowed.
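The constraints on the request body (a required `image_urls` list with at most 8 entries) could be checked along these lines. This is an illustrative sketch only: `validate_payload` and its error messages are hypothetical, and the project's own validation may behave differently, for example in how it treats an empty list.

```python
MAX_URLS = 8  # limit stated in this README


def validate_payload(payload):
    """Return (urls, error); exactly one of the two is None."""
    if not isinstance(payload, dict) or "image_urls" not in payload:
        return None, "image_urls is required"
    urls = payload["image_urls"]
    if not isinstance(urls, list) or not all(isinstance(u, str) for u in urls):
        return None, "image_urls must be a list of strings"
    if len(urls) > MAX_URLS:
        return None, f"a maximum of {MAX_URLS} URLs is allowed"
    return urls, None


urls, error = validate_payload({"image_urls": ["http://example.com/image1.jpg"]})
print(urls, error)
```

Returning an error message rather than raising keeps the sketch easy to map onto an HTTP error response.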
Response:
[
  {
    "id": "image1",
    "text": "Text extracted from image1.jpg"
  },
  {
    "id": "image2",
    "text": "Text extracted from image2.jpg"
  },
  {
    "id": "image3",
    "error": "Error message for image3.jpg"
  }
]
- id: The ID of the image (derived from the image URL's filename without the extension).
- text: The extracted text from the image.
- error: If an error occurs during image processing, an error message will be present instead of the text field.
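The response fields above can be illustrated with a short sketch. It shows one way an ID could be derived from a URL's filename, and how a client might separate successful results from failures. Both `image_id_from_url` and `split_results` are hypothetical helpers; how the service itself derives IDs (for instance, how it treats query strings) is not shown in this README and is an assumption here.

```python
import posixpath
from urllib.parse import urlparse


def image_id_from_url(url):
    """Return the URL's filename without its extension."""
    # Assumption: only the path component matters; query strings are ignored.
    filename = posixpath.basename(urlparse(url).path)
    stem, _extension = posixpath.splitext(filename)
    return stem


def split_results(results):
    """Separate successful extractions from failures in a response list."""
    texts = {item["id"]: item["text"] for item in results if "text" in item}
    errors = {item["id"]: item["error"] for item in results if "error" in item}
    return texts, errors


sample = [
    {"id": "image1", "text": "Text extracted from image1.jpg"},
    {"id": "image3", "error": "Error message for image3.jpg"},
]
texts, errors = split_results(sample)
print(image_id_from_url("http://example.com/image1.jpg"), texts, errors)
```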
The /process_uploads endpoint extracts text from an uploaded image.
Request Body:
POST /process_uploads HTTP/1.1
Content-Type: multipart/form-data
# Include the uploaded image file
- image (required): The uploaded image file.
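An upload request with the `image` form field could be assembled by hand as sketched below, using only the standard library. The server address is assumed to be Flask's development default, `build_upload_request` is a hypothetical helper, and the filename is assumed to contain no double quotes. In practice an HTTP client library would normally build the multipart body for you.

```python
import mimetypes
import urllib.request
import uuid


def build_upload_request(filename, file_bytes, base_url="http://127.0.0.1:5000"):
    """Build (but do not send) a multipart/form-data POST for /process_uploads."""
    boundary = uuid.uuid4().hex
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    # A single form part named "image", matching the field described above.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="image"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/process_uploads",
        data=head + file_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


request = build_upload_request("scan.png", b"\x89PNG placeholder bytes")
print(request.get_method(), request.full_url)
```

The request is only constructed here; passing it to `urllib.request.urlopen` while the server is running would perform the upload.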
Response:
{
  "text": "Text extracted from the uploaded image"
}
- text: The extracted text from the uploaded image.
Run the tests:
python -m unittest test_main.py
Contributions are welcome! If you have any suggestions, bug reports, or feature requests, please open an issue or submit a pull request.
This project is licensed under the MIT License.