WiDS-Image-Caption-Generation

The aim of this project is to generate captions that describe images using Deep Learning. The final model combines the VGG19 architecture with an LSTM and NLP techniques for better results.

Week 1

Started by learning basic Python libraries such as pandas and NumPy for data analysis
Studied Exploratory Data Analysis, using various types of graphs and figures to depict data
Learnt basic Machine Learning prediction techniques such as Decision Tree, Random Forest, etc.
Covered important data pre-processing steps: handling missing values (imputation), encoding categorical variables (ordinal / one-hot encoding), and using pipelines
Assignment task was to predict individual product failures of new codes using various ML algorithms, evaluated by the area under the ROC curve (AUC)
Implemented algorithms such as XGBoost, KNN Classifier, Decision Tree, Logistic Regression and Random Forest for predictions, along with some data pre-processing techniques
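
As an illustration of the evaluation metric mentioned above, here is a minimal sketch of how the area under the ROC curve can be computed from predicted scores. It uses the rank-sum (Mann-Whitney U) identity in plain Python; the function name `roc_auc` and the toy labels and scores are assumptions for this example, not code or data from the assignment.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: iterable of 0/1 ground-truth values (1 = failure, by assumption).
    scores: iterable of predicted scores, higher meaning "more likely 1".
    """
    pairs = sorted(zip(scores, labels))
    ranks = [0.0] * len(pairs)

    # Assign 1-based ranks; tied scores share the average of their ranks.
    i = 0
    while i < len(pairs):
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # U statistic of the positives, normalised by the number of (pos, neg) pairs.
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# Toy data, made up for illustration only.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

An AUC of 1.0 means every positive example is scored above every negative one, and 0.5 corresponds to random ranking. In practice a library routine such as scikit-learn's `roc_auc_score` would normally be used instead of a hand-written version.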

Week 2

Started learning Neural Networks and the concepts of backpropagation, activation functions, and hyperparameter tuning
Assignment was on building a classifier for MNIST dataset using PyTorch
Trained a Neural Network for recognizing hand-written digits from MNIST
Used Linear layers, Batch Normalization and Rectified Linear Unit in the Network
Final accuracy achieved: 96.85%
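
To make the building blocks listed above concrete, the following is a rough plain-Python sketch of what one Linear, Batch Normalization, ReLU block computes on a small batch. It is not the PyTorch model from the assignment: the helper names, weights and inputs are made up, and the batch normalization here omits the learnable scale and shift and the running statistics that a real layer keeps.

```python
import math


def linear(batch, weights, bias):
    """Apply y = W x + b to every sample in the batch."""
    return [[sum(w * v for w, v in zip(row_w, sample)) + b
             for row_w, b in zip(weights, bias)]
            for sample in batch]


def batch_norm(batch, eps=1e-5):
    """Normalise each feature to zero mean and unit variance across the batch."""
    n = len(batch)
    columns = list(zip(*batch))
    means = [sum(col) / n for col in columns]
    variances = [sum((v - m) ** 2 for v in col) / n
                 for col, m in zip(columns, means)]
    return [[(v - m) / math.sqrt(var + eps)
             for v, m, var in zip(sample, means, variances)]
            for sample in batch]


def relu(batch):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in sample] for sample in batch]


# Hypothetical 2-sample batch with 2 features and an identity weight matrix.
batch = [[1.0, 2.0], [3.0, 4.0]]
weights = [[1.0, 0.0], [0.0, 1.0]]
bias = [0.0, 0.0]

hidden = relu(batch_norm(linear(batch, weights, bias)))
print(hidden)
```

After normalization the first sample sits below the batch mean in both features, so ReLU zeroes it out, while the second sample keeps values close to 1.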

Week 3

Started learning theory of Convolutional Neural Networks and their architectures
Used CNNs in PyTorch for classifying images from CIFAR-10 dataset
Used several layers such as Conv2d, BatchNorm2d, ReLU, Pooling and fully connected (Linear) layers in the architecture of the ConvNet
Final accuracy achieved: 57.91%
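
When stacking Conv2d and pooling layers in front of a Linear layer, the size of the flattened feature vector has to be worked out by hand. The sketch below traces the spatial size of a 32x32 CIFAR-10 image through a hypothetical stack of two stages, each a 3x3 convolution with padding 1 followed by a 2x2 max-pool. The kernel sizes and channel counts are assumptions for illustration; the source does not state the actual architecture.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size=32, channels=(32, 64)):
    """Trace a hypothetical stack: one 3x3 conv (padding 1) and one 2x2 pool per stage."""
    size = input_size
    for _ in channels:
        size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding 1 keeps the size
        size = conv_out(size, kernel=2, stride=2)   # 2x2 pool with stride 2 halves it
    return channels[-1] * size * size


# 32 -> 32 -> 16 -> 16 -> 8, so the first Linear layer would see 64 * 8 * 8 inputs.
print(flattened_features())  # 4096
```

The result is the `in_features` value the first Linear layer would need under these assumed settings.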

Week 4

Learnt Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) and Natural Language Processing this week
Worked on 2 assignments:
1. Predicting future oil prices for the next 30 days
2. Sentiment analysis on stock market statements
Trained an LSTM model for predicting future oil prices based on historical price data
Implemented the NLP techniques of stopword removal, tokenization and lemmatization for sentiment prediction of sentences
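
The text pre-processing steps named above can be sketched in a few lines of plain Python. This is only an illustration: the stopword list is a tiny made-up sample, the regular expression is one of many possible tokenizers, and lemmatization is left out because it needs a dictionary-backed tool (for example NLTK's `WordNetLemmatizer`).

```python
import re

# Tiny illustrative stopword list; real lists are much longer.
STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "for", "on"}


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Keep only tokens that carry content for the sentiment model."""
    return [t for t in tokens if t not in stopwords]


def preprocess(text):
    """Tokenize, then remove stopwords."""
    return remove_stopwords(tokenize(text))


# Made-up example sentence.
print(preprocess("The shares of the company are rising!"))
# ['shares', 'company', 'rising']
```

With a lemmatizer added as a final step, a token such as "shares" would typically be reduced to "share" before being fed to the model.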

Week 5

Final week consisted of using all the concepts learnt to build a model that generates captions for images
Used Flickr8k dataset for training and testing the model, which consists of 8000 images with 5 captions for each image
Utilized pre-trained VGG-19 neural network for extracting features from each image
Passed these features into the LSTM network for generating captions
Evaluated the model using BLEU by calculating BLEU-1, BLEU-2, BLEU-3, BLEU-4 scores
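
For reference, the BLEU-1 to BLEU-4 scores mentioned above can be understood through the following stand-alone sketch of sentence-level cumulative BLEU-n with uniform weights, clipped n-gram counts and a brevity penalty. It is an assumption-laden illustration, not the project's evaluation code: there is no smoothing (any zero n-gram precision gives a score of 0), and the example captions are made up. A library implementation such as NLTK's `corpus_bleu` is the more usual choice.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of the candidate against several references."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    # Each candidate n-gram is credited at most as often as it occurs in any one reference.
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def bleu(candidate, references, max_n=4):
    """Cumulative BLEU-max_n with uniform weights and no smoothing."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    c = len(candidate)
    # Brevity penalty uses the reference length closest to the candidate length.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Made-up captions for illustration.
references = ["a dog runs on the grass".split(),
              "the dog is running on grass".split()]
candidate = "a dog runs on grass".split()
for n in range(1, 5):
    print(f"BLEU-{n}: {bleu(candidate, references, max_n=n):.3f}")
```

Each Flickr8k image comes with 5 reference captions, which is why the functions accept a list of references per candidate.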
