Use hierarchical attention over different feature layers to generate image captions
This is the image captioning code for the final project of 10-807 at CMU.
You need to have TensorFlow installed.
(1) Download the MS COCO data and the pretrained VGG16 model [https://www.cs.toronto.edu/~frossard/post/vgg16/]
(2) Resize images to 224 x 224 [resize.py] (see the resize sketch after this list)
(3) Preprocess annotations and extract features [preprocess.py] (a feature-extraction sketch also follows the list)
(4) Train your model [train.py]
(5) Test your model [test.py]
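A minimal sketch of the resize step (2); the repo's resize.py may differ in details, and the ./data directories used here are assumptions:

```python
import os
from PIL import Image

SRC_DIR = './data/train2014'           # hypothetical folder with raw MS COCO images
DST_DIR = './data/resized/train2014'   # hypothetical output folder
os.makedirs(DST_DIR, exist_ok=True)

for name in os.listdir(SRC_DIR):
    if not name.lower().endswith('.jpg'):
        continue
    with Image.open(os.path.join(SRC_DIR, name)) as img:
        # VGG16 expects 224 x 224 RGB inputs.
        img.convert('RGB').resize((224, 224), Image.LANCZOS).save(os.path.join(DST_DIR, name))
```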
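And a minimal sketch of extracting feature maps from several VGG16 layers for the hierarchical attention (step 3). It uses tf.keras.applications.VGG16 as a stand-in; the actual preprocess.py presumably loads the downloaded VGG16 weights instead, and the choice of layers here is an assumption:

```python
import tensorflow as tf

# Standard Keras VGG16 layer names at three different depths (assumed choice).
LAYERS = ['block3_conv3', 'block4_conv3', 'block5_conv3']

vgg = tf.keras.applications.VGG16(weights='imagenet', include_top=False)
extractor = tf.keras.Model(inputs=vgg.input,
                           outputs=[vgg.get_layer(n).output for n in LAYERS])

def extract_features(images):
    """images: float32 array of shape (N, 224, 224, 3), RGB in [0, 255]."""
    x = tf.keras.applications.vgg16.preprocess_input(images)
    # Returns one feature map per chosen layer, e.g. (N, 14, 14, 512) for block5_conv3.
    return extractor(x, training=False)
```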
For evaluation, use pycocoevalcap [git clone https://github.com/tylin/coco-caption.git]
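A minimal evaluation sketch, assuming pycocoevalcap (from the coco-caption repo) and pycocotools are importable, with the standard captions_val2014.json annotations and a results.json of generated captions in the COCO results format (both paths are assumptions):

```python
from pycocotools.coco import COCO
from pycocoevalcap.eval import COCOEvalCap

coco = COCO('annotations/captions_val2014.json')      # ground-truth captions (assumed path)
coco_res = coco.loadRes('results.json')               # [{"image_id": ..., "caption": ...}, ...]

coco_eval = COCOEvalCap(coco, coco_res)
coco_eval.params['image_id'] = coco_res.getImgIds()   # score only the images we captioned
coco_eval.evaluate()

for metric, score in coco_eval.eval.items():
    print(f'{metric}: {score:.3f}')                   # BLEU-1..4, METEOR, ROUGE_L, CIDEr
```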