Multimodal Image Retrieval

🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️ Repo Under Construction 🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️🚦⚠️👷‍♂️🏗️

Note: Our model hasn't been trained sufficiently and the results are nowhere close to our expectations. We'll be improving the model as we find time and more GPU resources. Until then, play around with this (not so great) model.
Things we're looking to try:

Improve preprocessing

Replace special characters with space

Play around with embedding dimensions

Use the entire InstaNY100K Dataset

Train Word2Vec again

Use different CNNs for regressing Word2Vec embeddings from images.

Try different post-processing strategies for embeddings.

Train with MSELoss

Experiment with other distance functions
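The "replace special characters with space" item above could look roughly like the sketch below. The helper name `clean_caption` and the choice to keep only lowercase ASCII letters and digits are assumptions made for illustration; this is not the repository's actual preprocessing code.

```python
import re
from typing import List

# Hypothetical pattern: anything that is not a digit, an ASCII letter,
# or whitespace counts as a "special character" here. Note that this
# also drops accented letters and emoji.
_SPECIAL_CHARS = re.compile(r"[^0-9a-z\s]")


def clean_caption(caption: str) -> List[str]:
    """Lowercase a caption, replace special characters with spaces, tokenize."""
    lowered = caption.lower()
    spaced = _SPECIAL_CHARS.sub(" ", lowered)
    # split() with no arguments collapses the runs of spaces left behind.
    return spaced.split()


print(clean_caption("Sunset @ Brooklyn-Bridge!! #nyc"))
# ['sunset', 'brooklyn', 'bridge', 'nyc']
```

Replacing with a space rather than deleting keeps hyphenated or hashtag-joined words from fusing into a single token, which matters when the tokens are later fed to Word2Vec.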

A deep learning application to retrieve images by searching with text.

Try out the application here: https://share.streamlit.io/koushikvikram/multimodal-image-retrieval/main/app.py

Project Workflow

Dataset

Download the InstaNY100K dataset from this Google Drive link

Extract the dataset into ./datasets/raw/. Your folder structure should look like the one below:

./datasets/raw/
|
|-- InstaNY100K
    |
    |-- captions
    |   |
    |   |-- newyork
    |      | 1487768220566960691.txt
    |      | 1490727714071958379.txt
    |      | ...
    |   
    |-- img_resized
        |
        |-- newyork
            | 1480879485913200243.jpg
            | 1480879539524935620.jpg
            | ...
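To check that the extraction matches the layout above, a small script along these lines may help. It is a sketch that assumes each caption file and its image share the same numeric file stem; the function name `pair_captions_with_images` is hypothetical and not part of the repository.

```python
from pathlib import Path
from typing import Dict, Tuple


def pair_captions_with_images(
    root: Path, city: str = "newyork"
) -> Dict[str, Tuple[Path, Path]]:
    """Map each post ID to its (caption file, image file) pair.

    Assumes <id>.txt under captions/ belongs to <id>.jpg under img_resized/.
    """
    captions_dir = root / "InstaNY100K" / "captions" / city
    images_dir = root / "InstaNY100K" / "img_resized" / city

    # Fail early with a readable message if the archive landed elsewhere.
    for directory in (captions_dir, images_dir):
        if not directory.is_dir():
            raise FileNotFoundError(f"Expected directory is missing: {directory}")

    captions = {path.stem: path for path in captions_dir.glob("*.txt")}
    images = {path.stem: path for path in images_dir.glob("*.jpg")}

    # Keep only the IDs that have both a caption and an image.
    shared_ids = sorted(captions.keys() & images.keys())
    return {post_id: (captions[post_id], images[post_id]) for post_id in shared_ids}
```

Calling `pair_captions_with_images(Path("./datasets/raw"))` returns a dictionary keyed by post ID. A `FileNotFoundError` means the folders are not where the tree above expects them.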

GitHub Actions for this Repository

Pylint - Code Quality Check

Pytest - Functionality and Behavioral Tests for Classes and Models

Exploring the Word2Vec Model

We recommend using the TensorFlow Embedding Projector to visualize our Word2Vec model.

Load the tensor and metadata tsv files provided in the model directory and visualize words that interest you!

Samples from TensorFlow Embedding Projector:

You can also use models/explore_word2vec.ipynb to explore words of interest.
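The tensor and metadata TSV files can also be queried with plain Python, outside the projector and the notebook. The sketch below assumes the usual Embedding Projector layout: one tab-separated vector per row in the tensor file, and one word per row, with no header, in the metadata file. The function names and file paths are placeholders, not names from this repository.

```python
import csv
import math
from typing import Dict, List, Tuple


def load_embeddings(tensor_tsv: str, metadata_tsv: str) -> Dict[str, List[float]]:
    """Read projector-style TSV files into a word -> vector dictionary."""
    with open(tensor_tsv, newline="", encoding="utf-8") as handle:
        vectors = [
            [float(value) for value in row]
            for row in csv.reader(handle, delimiter="\t")
            if row
        ]
    with open(metadata_tsv, encoding="utf-8") as handle:
        words = [line.rstrip("\n") for line in handle if line.strip()]
    if len(vectors) != len(words):
        raise ValueError(f"{len(vectors)} vectors but {len(words)} words")
    return dict(zip(words, vectors))


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    word: str, embeddings: Dict[str, List[float]], top_n: int = 5
) -> List[Tuple[str, float]]:
    """Return the top_n words closest to `word` by cosine similarity."""
    query = embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```

With the real files, `most_similar("brooklyn", load_embeddings(tensor_path, metadata_path))` would list the nearest neighbours of a word, where `tensor_path` and `metadata_path` point at the two TSV files in the model directory.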

Samples from the Jupyter Notebook:

Acknowledgment

Articles used as reference during development are documented in the references directory.

If you run into issues while using the repo, please open an issue at the following link and I'll be glad to fix it: https://github.com/koushikvikram/multimodal-image-retrieval/issues

If you'd like to collaborate with me or hire me, please feel free to send an email to koushikvikram91@gmail.com

Make sure to check out other repositories on my homepage.
