This repo contains an end-to-end implementation of an image search engine. It is a rough simulation of a real-world system and consists of the modules described below.
Images are extracted from Flickr using an API key and the Flickr search API. These images, along with some additional metadata (also extracted from Flickr), are converted to producer records and pushed to Kafka.
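A minimal sketch of the producer side, assuming the `kafka-python` client; the topic name `flickr-images` and the record fields shown here are illustrative, not the repo's actual schema.

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# One record per image: the URL plus the metadata returned by the Flickr search API.
record = {
    "image_url": "https://live.staticflickr.com/.../example.jpg",  # placeholder URL
    "title": "Dogs playing in the park",
    "owner": "some_flickr_user",
    "tags": ["dog", "park", "play"],
}
producer.send("flickr-images", value=record)
producer.flush()
```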
These records are then processed by a Spark Streaming application (on a 1-minute micro-batch interval) to extract embeddings of the images. The embeddings, together with the metadata, are then written to Elasticsearch, which provides vector kNN search capabilities alongside pagination and regular text-based search.
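A rough sketch of such a streaming job, assuming PySpark with the Kafka source and the elasticsearch-hadoop (elasticsearch-spark) connector on the classpath; the topic, index, column names, and the `clip_image_embedding` helper are all illustrative assumptions rather than the repo's actual code.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf, col, from_json
from pyspark.sql.types import StructType, StructField, StringType, ArrayType, FloatType

spark = SparkSession.builder.appName("image-embedding-stream").getOrCreate()

schema = StructType([
    StructField("image_url", StringType()),
    StructField("title", StringType()),
])

@udf(returnType=ArrayType(FloatType()))
def embed_image(image_url):
    # Hypothetical helper: download the image and run it through CLIP's image
    # encoder; in a real job the model would be loaded once per executor.
    return clip_image_embedding(image_url)

records = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "flickr-images")
    .load()
    .select(from_json(col("value").cast("string"), schema).alias("r"))
    .select("r.*")
    .withColumn("embedding", embed_image(col("image_url")))
)

query = (
    records.writeStream
    .format("es")                               # elasticsearch-hadoop streaming sink
    .option("checkpointLocation", "/tmp/ckpt")
    .trigger(processingTime="1 minute")         # matches the 1-minute micro-batch interval
    .start("images")                            # target Elasticsearch index
)
query.awaitTermination()
```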
Kafka decouples the image extraction from Flickr from the Spark processing. This makes it easier to handle the Flickr API's rate limiting and Spark's resource constraints independently, which in turn lets the Spark cluster be kept at a cost-efficient size.
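One concrete way this plays out, sketched here as an assumption rather than the repo's exact configuration: Spark's Kafka source supports `maxOffsetsPerTrigger`, which caps how many records each micro-batch pulls, so a small cluster can keep up even when the producer bursts.

```python
# Reusing the SparkSession from the sketch above; the limit value is arbitrary.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "flickr-images")
    .option("maxOffsetsPerTrigger", 500)   # at most 500 records per micro-batch
    .load()
)
```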
The CLIP model's artifact files are managed with MLflow, which provides model versioning and registry capabilities. This allows the model to be used, versioned, and managed consistently across the consumer and server applications.
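A sketch of how a registered model could be shared between the Spark consumer and the FastAPI server; the registry name `clip-image-search`, the `Production` stage, and the `ClipWrapper` pyfunc wrapper are illustrative assumptions.

```python
import mlflow
import mlflow.pyfunc

# Logging side (run once when the model artifact is prepared).
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="clip",
        python_model=ClipWrapper(),          # hypothetical mlflow.pyfunc.PythonModel wrapper around CLIP
        registered_model_name="clip-image-search",
    )

# Loading side (both the Spark consumer and the FastAPI server can do this).
clip_model = mlflow.pyfunc.load_model("models:/clip-image-search/Production")
```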
The backend server is a lightweight FastAPI-based server that exposes two endpoints to its clients:
- /text_search - takes a text phrase
- /image_search - takes an image URL or a base64-encoded image string
Both endpoints return the top-K matching documents stored in Elasticsearch, which the frontend then displays in order of kNN score.
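A minimal sketch of the text endpoint, assuming the Elasticsearch Python client and an index named `images` with a `dense_vector` field called `embedding`; `encode_text` stands in for the CLIP text encoder loaded via MLflow and is a hypothetical helper.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from elasticsearch import Elasticsearch

app = FastAPI()
es = Elasticsearch("http://localhost:9200")

class TextQuery(BaseModel):
    phrase: str
    top_k: int = 10

@app.post("/text_search")
def text_search(q: TextQuery):
    vector = encode_text(q.phrase)          # hypothetical CLIP text encoder
    resp = es.search(
        index="images",
        knn={
            "field": "embedding",
            "query_vector": vector,
            "k": q.top_k,
            "num_candidates": 100,
        },
    )
    # Return the stored metadata plus the kNN score for each hit.
    return [hit["_source"] | {"score": hit["_score"]} for hit in resp["hits"]["hits"]]

# /image_search is analogous: decode or download the image, run CLIP's image
# encoder, and issue the same kNN query against the "embedding" field.
```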
Below is an example of the results when searching with the phrase "playing dogs".
OpenAI's CLIP model has been trained on multiple languages, so it can infer text in various languages. Below is an example using the German translation of "playing dogs", which is "spielende Hunde".
As you can see, the results are very similar to those for the English phrase, which demonstrates the multilingual capability of the CLIP model.
Taken together, the results show the model's ability to carry out cross-domain tasks spanning text and images.
Please consider starring the repo if you find it useful.
Feel free to connect with me on LinkedIn.