Caption-Generation

Automatically generates a reasonable caption for an input video, using an attention-based seq2seq model with LSTM cells.

Model Prediction : "a small dog is playing with a ball"

Environment

python3
tensorflow 1.0

Data

Usage

Download the hw2 data from Kaggle and the 300-dimensional GloVe vectors, and place them at the following paths:

./Caption-Generation/MLDS_hw2_data/*
./Caption-Generation/MLDS_hw2_data/glove/glove.6B.300d.txt
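
As a rough illustration of what the GloVe file contains, the sketch below reads it into a dictionary. It assumes the standard GloVe text layout (one token per line followed by space-separated floats). The helper name load_glove is hypothetical and is not taken from data_utils.py.

```python
def load_glove(path, dim=300):
    """Read a GloVe text file into a {word: [float, ...]} dict.

    Hypothetical helper for illustration; the repository's own
    preprocessing may load the vectors differently.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip lines that do not have exactly `dim` values
            embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings
```

Calling load_glove("./Caption-Generation/MLDS_hw2_data/glove/glove.6B.300d.txt") would return one 300-element list per vocabulary word.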

Train

On first use, you need to run the preprocessing:

$ python3 caption_gen.py --prepro 1

If you have already done the preprocessing:

$ python3 caption_gen.py --prepro 0

Model

There are three different models available.

CaptionGeneratorBasic

greedy inference

CaptionGeneratorMyBasic

beam search
greedy inference

CaptionGeneratorSS

scheduled sampling
beam search
greedy search

You can set model_type to choose a different model, e.g.

$ python3 caption_gen.py --prepro [1/0] --model_type=CaptionGeneratorSS
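
For readers unfamiliar with scheduled sampling, the idea can be sketched as follows: during training, each decoder step is fed the ground-truth previous word with some probability and the model's own previous prediction otherwise, and that probability is lowered as training progresses. The function names and the linear decay schedule below are assumptions for illustration, not the code in model.py.

```python
import random


def choose_next_input(ground_truth_token, predicted_token, teacher_prob, rng=random):
    """Pick the token fed to the decoder at the next training step.

    With probability `teacher_prob` the ground-truth token is used
    (teacher forcing); otherwise the model's own prediction is used.
    """
    if rng.random() < teacher_prob:
        return ground_truth_token
    return predicted_token


def linear_decay(step, total_steps, floor=0.0):
    """Teacher-forcing probability: starts at 1.0, falls linearly,
    and is clipped so it never drops below `floor`.
    (Assumed schedule; other decay shapes are equally common.)
    """
    frac = min(step / total_steps, 1.0)
    return max(1.0 - frac, floor)
```

Early in training the decoder sees mostly ground-truth words; later it sees more of its own outputs, which is closer to the conditions it faces at inference time.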

Inference

This code provides two inference methods: greedy search and beam search.
Beam search inference is not available in the CaptionGeneratorBasic model.
(The default beam size k is 5.)
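
The difference between the two methods can be sketched with a toy example. This is a minimal illustration rather than the implementation in model.py: step_fn and toy_step are hypothetical stand-ins for the decoder, and no length normalization is applied. Greedy search is treated here as beam search with k=1.

```python
import math


def beam_search(step_fn, start, end, k=5, max_len=20):
    """Return the highest-scoring token sequence found with beam size k.

    step_fn(prefix) must return {token: log_prob} for the next token.
    """
    beams = [([start], 0.0)]  # (sequence, summed log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_fn(seq).items():
                candidates.append((seq + [tok], score + lp))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:k]:
            # Sequences that emitted the end token leave the beam.
            (finished if seq[-1] == end else beams).append((seq, score))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was hit
    return max(finished, key=lambda c: c[1])[0]


def greedy_search(step_fn, start, end, max_len=20):
    """Greedy decoding: keep only the single best token at each step."""
    return beam_search(step_fn, start, end, k=1, max_len=max_len)


def toy_step(prefix):
    """Hypothetical next-token distribution keyed on the last token."""
    table = {
        "<s>": {"a": 0.6, "b": 0.4},
        "a": {"x": 0.55, "y": 0.45},
        "b": {"z": 0.9, "w": 0.1},
    }
    probs = table.get(prefix[-1], {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


greedy_result = greedy_search(toy_step, "<s>", "</s>")
beam_result = beam_search(toy_step, "<s>", "</s>", k=5)
```

In this toy distribution, greedy search commits to "a" at the first step (probability 0.6) and ends with a sequence of probability 0.33, while beam search also keeps "b" alive and finds the sequence through "z" with probability 0.36.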
