EngrRaghad/T5-Bootcamp-DeepLearning-project

Proposal: T5-Bootcamp-DeepLearning-project

Final project - SDAIA Academy - Bootcamp Data Science

Introduction:

The project, "Eye for Blind," aims to build a deep learning model that describes the content of an image as speech by generating captions with an attention mechanism on the Flickr8K dataset.
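
To make the attention step concrete, here is a minimal NumPy sketch of additive (Bahdanau-style) attention over image region features. It is an illustration under stated assumptions, not the project's actual code: the weight names `W1`, `W2`, `v` are hypothetical, the weights are random rather than trained, and the 64 x 2048 feature shape assumes the final convolutional feature map of Inception-v3 flattened into 64 regions.

```python
import numpy as np


def additive_attention(features, hidden, W1, W2, v):
    """Return (context, weights) for one decoding step.

    features: (L, D) array, one D-dimensional vector per image region.
    hidden:   (H,) array, the current decoder (RNN) state.
    W1, W2, v: hypothetical projection weights of shape (D, U), (H, U), (U,).
    """
    # score_i = v^T tanh(W1^T f_i + W2^T h), one scalar per region
    scores = np.tanh(features @ W1 + hidden @ W2) @ v
    # Softmax over regions (shifted by the max for numerical stability)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted average of the region features
    context = weights @ features
    return context, weights


# Assumed sizes: 64 regions, 2048 channels, 512 decoder units, 256 attention units
rng = np.random.default_rng(0)
L, D, H, U = 64, 2048, 512, 256
features = rng.normal(size=(L, D))
hidden = rng.normal(size=H)
W1 = rng.normal(size=(D, U)) * 0.01
W2 = rng.normal(size=(H, U)) * 0.01
v = rng.normal(size=U)

context, weights = additive_attention(features, hidden, W1, W2, v)
```

At each decoding step the decoder would receive `context` together with the previous word, so the model can focus on different image regions for different words.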

Goal:

The generated captions are converted with text-to-speech so that the result can be presented as audio. This lets the system recognize the objects in an image and describe them audibly.
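
The text-to-speech step could look roughly like the sketch below, which uses the gTTS library named in the algorithm list. The helper names `clean_caption` and `speak_caption`, the `<start>`/`<end>` marker tokens, and the output file name are assumptions made for illustration. Note that gTTS is a third-party package and needs network access, so `speak_caption` is defined here but not called.

```python
def clean_caption(tokens, start="<start>", end="<end>"):
    """Join predicted tokens into a sentence, dropping the assumed marker tokens."""
    words = []
    for tok in tokens:
        if tok == start:
            continue
        if tok == end:
            break  # ignore anything generated after the end marker
        words.append(tok)
    return " ".join(words)


def speak_caption(text, out_path="caption.mp3", lang="en"):
    """Save `text` as spoken audio in an MP3 file and return the file path."""
    from gtts import gTTS  # imported lazily: third-party, requires internet access

    gTTS(text=text, lang=lang).save(out_path)
    return out_path


sentence = clean_caption(["<start>", "a", "dog", "runs", "<end>"])
# speak_caption(sentence)  # uncomment to write caption.mp3
```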

Future work:

Create an application that helps blind people by describing pictures in an audible manner.

Dataset:

  • 8091 Images
  • 40455 Captions
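
The two counts are consistent with five captions per image (8091 x 5 = 40455). As a rough sketch of how the captions might be grouped by image, the snippet below assumes a CSV layout with `image` and `caption` columns; the column names, the sample file names, and the sample rows are illustrative and should be checked against the downloaded files.

```python
import csv
import io
from collections import defaultdict


def load_captions(lines):
    """Group captions by image name from CSV lines with 'image' and 'caption' columns."""
    captions = defaultdict(list)
    for row in csv.DictReader(lines):
        captions[row["image"]].append(row["caption"].strip())
    return dict(captions)


# Tiny in-memory stand-in for the real captions file (made-up rows)
sample = io.StringIO(
    "image,caption\n"
    "img_001.jpg,A child climbs a set of stairs .\n"
    "img_001.jpg,A little girl goes into a wooden building .\n"
    "img_002.jpg,A black dog and a spotted dog are playing .\n"
)
captions = load_captions(sample)

# Sanity check on the dataset sizes quoted above
assert 8091 * 5 == 40455
```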

Dataset source:

Kaggle website [Kaggle]

Algorithms:

  • Inception-v3 model
  • CNN Model
  • Attention Model
  • RNN Model
  • Greedy Search
  • Beam Search
  • gTTS (text-to-speech)
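
The difference between the two decoding strategies in the list can be sketched on a toy next-word table. Everything here is hypothetical: `TOY_MODEL` stands in for the trained decoder, and `<end>` is an assumed end-of-caption token. Greedy search commits to the most likely word at each step, while beam search keeps the `beam_width` best partial captions and can recover a caption that is more probable overall.

```python
import math

END = "<end>"

# Hypothetical next-word probabilities, keyed by the words generated so far
TOY_MODEL = {
    (): {"a": 0.6, "the": 0.4},
    ("a",): {"dog": 0.55, "cat": 0.45},
    ("the",): {"dog": 0.9, "cat": 0.1},
}


def next_log_probs(prefix):
    """Log-probability of each possible next word; unknown prefixes end the caption."""
    probs = TOY_MODEL.get(tuple(prefix), {END: 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def greedy_search(max_len=10):
    """Pick the single most likely word at every step."""
    seq, score = [], 0.0
    for _ in range(max_len):
        log_probs = next_log_probs(seq)
        tok = max(log_probs, key=log_probs.get)
        score += log_probs[tok]
        if tok == END:
            break
        seq.append(tok)
    return seq, score


def beam_search(beam_width=2, max_len=10):
    """Keep the beam_width best partial captions at every step."""
    beams, finished = [([], 0.0)], []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in next_log_probs(seq).items():
                if tok == END:
                    finished.append((seq, score + lp))
                else:
                    candidates.append((seq + [tok], score + lp))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    # Fall back to unfinished beams if nothing reached the end token
    return max(finished or beams, key=lambda c: c[1])


greedy_caption, greedy_score = greedy_search()
beam_caption, beam_score = beam_search(beam_width=2)
```

In this toy table greedy search commits to "a" first because it has the highest single-step probability, while beam search also keeps "the" alive and ends with a higher total score. The sketch applies no length normalization, which a real captioning decoder may want.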

Tools:

Software:

  1. VScode
  2. mp3
  3. Trello
  4. Jupyter
  5. Github
  6. PowerPoint
  7. Zoom

Languages & Libraries

  • Python
  • Pandas
  • numpy
  • seaborn
  • plotly
  • sklearn
  • PIL
  • tqdm
  • Adam (optimizer)
  • InceptionV3 (model)

Team Members
