A custom framework for easy use of LLMs, VLMs, etc. supporting various modes and settings via web-ui
Updated Sep 3, 2024 - Jupyter Notebook
Printext is a lightweight application that extracts text from images.
This project showcases a comprehensive solution for converting images to text using OCR (Optical Character Recognition) and fine-tuning the extracted content. Leveraging powerful tools like AWS, Docker, OpenAI, and CI/CD pipelines, this system ensures high accuracy and efficiency.
This is a program that programmatically reads text in Minecraft. It can be used to "scrape" sign text, chat messages, tooltips and more.
Using AI to write a caption for an image (with HuggingFace, Transformers.js, or Azure Cognitive Vision API)
Turn an image into an audio story (upload an image and let AI tell a story about it).
SmartMeat is a smart BBQ controllable from your phone
This is a repo providing some Stable Diffusion experiments on the textual inversion and captioning tasks
Caption generator using lavis and argostranslate
OCR with Google's AI technology (Cloud Vision API)
My solution to the Image Captioning Final Project of the Coursera "Introduction to Deep Learning" course, with the trained model deployed as a Telegram bot.
Welcome to 📜 Optical_character_recognition using Python 👋
For transforming normal videos to ASCII style
Projects Collections