
## Installation

Clone the git repo:

$ git clone git@github.com:ttseriotou/image-captioning.git

Create a conda environment:

$ conda env create --file=image_captioning.yml

Activate the conda environment:

$ conda activate image_captioning

## Model

The notebook defaults to the git-large-coco image captioning model for inference. However, a wide range of image captioning models is available on Hugging Face; examples of such models are listed here.
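For reference, a minimal inference sketch with the git-large-coco checkpoint is shown below, following the Hugging Face GIT documentation rather than the notebook itself; the image path is a placeholder and the notebook's own loading code may differ.

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

# Load the GIT processor and model from the Hugging Face Hub
processor = AutoProcessor.from_pretrained("microsoft/git-large-coco")
model = AutoModelForCausalLM.from_pretrained("microsoft/git-large-coco")

# Open a local image (placeholder path) and convert it to pixel values
image = Image.open("example.jpg")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Generate and decode a caption
generated_ids = model.generate(pixel_values=pixel_values, max_length=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(caption)
```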

## Data

The Flickr sample data were obtained from: https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset

## Sources

Credits to the following sources for processing and GIT captioning:

- https://www.kaggle.com/code/hsankesara/image-captioning
- https://github.com/NielsRogge/Transformers-Tutorials/blob/master/GIT/Fine_tune_GIT_on_an_image_captioning_dataset.ipynb
- https://huggingface.co/docs/transformers/model_doc/git