Final project - SDAIA Academy - Bootcamp Data Science
The project, "Eye for Blind," aims to create a deep learning model that describes the content of an image as speech. Captions are generated with an attention mechanism, using the Flickr8K dataset.
Text-to-speech conversion then presents the generated caption in audio format, so that the objects in an image are recognized and described audibly.
Create an application that helps blind people by describing pictures in an audible manner.
- 8091 Images
- 40455 Captions
From the Kaggle website [Kaggle]
- Inception-v3 model
- CNN Model.
- Attention Model.
- RNN Model.
- Greedy Search
- Beam Search
- gTTS
- VS Code
- mp3
- Trello
- Jupyter
- GitHub
- PowerPoint
- Zoom
- Python
- Pandas
- numpy
- seaborn
- plotly
- sklearn
- PIL
- tqdm
- Adam
- InceptionV3
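
The two decoding strategies listed above, greedy search and beam search, differ in how many partial captions they keep at each step. The following is a minimal sketch, not the project's actual code: `TOY_MODEL`, `next_probs`, and the token probabilities are hypothetical stand-ins for the trained attention decoder, which would condition on image features and the full caption prefix.

```python
import math

# Hypothetical toy decoder: maps the last token to next-token probabilities.
# It stands in for a trained captioning model and is an assumption of this sketch.
TOY_MODEL = {
    "<start>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.3, "<end>": 0.2},
    "the": {"dog": 0.9, "<end>": 0.1},
    "dog": {"<end>": 1.0},
    "cat": {"<end>": 1.0},
}


def next_probs(prefix):
    """Return next-token probabilities given the tokens generated so far."""
    return TOY_MODEL[prefix[-1]]


def greedy_search(max_len=10):
    """Pick the single most probable token at every step."""
    seq = ["<start>"]
    while seq[-1] != "<end>" and len(seq) < max_len:
        probs = next_probs(seq)
        seq.append(max(probs, key=probs.get))
    return seq


def beam_search(beam_width=2, max_len=10):
    """Keep the beam_width best partial captions, scored by summed log-probability."""
    beams = [(0.0, ["<start>"])]  # (log-probability, tokens)
    for _ in range(max_len - 1):
        candidates = []
        for score, seq in beams:
            if seq[-1] == "<end>":
                # Finished captions are carried over unchanged.
                candidates.append((score, seq))
                continue
            for token, p in next_probs(seq).items():
                candidates.append((score + math.log(p), seq + [token]))
        # Keep only the highest-scoring candidates.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
        if all(seq[-1] == "<end>" for _, seq in beams):
            break
    return beams[0][1]


if __name__ == "__main__":
    print("greedy:", " ".join(greedy_search()))
    print("beam:  ", " ".join(beam_search(beam_width=2)))
```

In this toy setup greedy search commits to "a" because it is the best first token, while a beam of width 2 also keeps "the" alive and ends with a caption of higher overall probability (0.36 versus 0.30). With `beam_width=1` the two strategies coincide.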