Python search engine tool for detecting similar trademarks. The pipeline below detects trademark infringement for an institution such as the US Patent Office. The project was completed in a fast-paced environment in roughly 3 weeks, alongside attending demos and meetings.
- backend : Contains the directories for pretrained models, data, and results, as well as the backend source modules (.py files) for the project.
  - backend/models : Pretrained models. The model needs to be downloaded from the URL provided in the "Prerequisites" section.
  - backend/data : Subdirectories for data storage. The Raw_data subdirectory holds the input raw logo images downloaded from the URL provided in the "Download Raw data on local machine" section.
  - backend/data/tests : Test results used to evaluate the performance of the search engine.
  - The rest of the data subdirectories are filled automatically during execution of the program, as per the instructions in the "Download Raw data on local machine", "Preprocess data", and "Training and feature extraction" sections.
Attribution:
- https://live.ece.utexas.edu/publications/2011/am_asilomar_2011.pdf
- https://github.com/ilmonteux/logohunter (Internet-based labeled logo images)
- https://alexisbcook.github.io/2017/global-average-pooling-layers-for-object-localization/
- static : Contains the readme_images, semisuper, and uploaded subdirectories.
  - static/readme_images : Images used in this README.
  - static/semisuper : Images used for inference; this subdirectory is filled automatically during execution of the program.
- templates : Flask web page templates.
- online_search.py : Maps web page requests to the backend modules (a minimal sketch is shown below).
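For orientation, here is a minimal sketch of how a Flask route in online_search.py might hand an uploaded query image to a backend search function. The function name `search_similar_logos`, the form field `query_image`, and the template names `index.html` / `results.html` are illustrative assumptions, not the project's actual API.

```python
# Hypothetical sketch of the request-to-backend mapping in online_search.py.
# The backend function `search_similar_logos` and template names are assumptions.
import os
from flask import Flask, request, render_template

app = Flask(__name__)
UPLOAD_DIR = os.path.join("static", "uploaded")   # upload folder mentioned in the structure above

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        query = request.files["query_image"]              # image uploaded from the web form
        path = os.path.join(UPLOAD_DIR, query.filename)
        query.save(path)                                  # persist it so the backend can read it
        # results = search_similar_logos(path)            # call into the backend module (assumed name)
        results = []                                      # placeholder for the ranked matches
        return render_template("results.html", results=results)
    return render_template("index.html")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```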
Clone the repository and change to the project directory:
repo_name=Gita_Insight_Project2019 # name of the repository
username=snygt2007 # username for your personal GitHub account
git clone https://github.com/snygt2007/Gita_Insight_Project2019.git
cd $repo_name
Create a virtual environment in conda:
conda -V
conda search "^python$"
conda create -n yourvirtualenvname python=versionnumber anaconda
Proceed ([y]/n)? y
conda activate yourvirtualenvname
- The packages used to build the code are listed in requirements.txt.
- Please obtain permission from "Tüzkö A., Herrmann C., Manger D., Beyerer J.: “Open Set Logo Detection and Retrieval“, Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications: VISAPP, 2018." before using the input logo images in this program. Alternatively, you can execute only the inference portions.
To install the packages above, please run:
pip install -r requirements.txt
- A pretrained model is downloaded into the backend/models folder by running the following commands from the project directory:
cd backend
python download_models.py
Alternatively, it can be downloaded from https://tinyurl.com/y2ke7stl and placed in the backend/models folder.
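If download_models.py is unavailable, the sketch below shows the same idea of fetching the model file from the URL above; the output file name `pretrained_model.h5` is an assumption, and the real script may handle redirects or file naming differently.

```python
# Minimal sketch of downloading the pretrained model into the models folder
# (run from the backend directory, as in the instructions above).
# The output file name "pretrained_model.h5" is an assumption.
import os
import requests

MODEL_URL = "https://tinyurl.com/y2ke7stl"    # URL from the instructions above
OUT_DIR = "models"
os.makedirs(OUT_DIR, exist_ok=True)

response = requests.get(MODEL_URL, allow_redirects=True, timeout=60)
response.raise_for_status()
with open(os.path.join(OUT_DIR, "pretrained_model.h5"), "wb") as f:
    f.write(response.content)
```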
Execute the following commands from the project directory:
cd backend
python download_raw_data.py
python Create_resized_data.py
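As a rough illustration of this preprocessing step, the sketch below resizes every raw logo image to a fixed square size. The 224x224 target size and the `resized_data` output folder are assumptions; only the Raw_data input folder comes from the directory description above.

```python
# Rough sketch of the resizing step performed by Create_resized_data.py
# (run from the backend directory). Target size and output folder are assumptions.
import os
from PIL import Image

RAW_DIR = os.path.join("data", "Raw_data")
OUT_DIR = os.path.join("data", "resized_data")
os.makedirs(OUT_DIR, exist_ok=True)

for name in os.listdir(RAW_DIR):
    if not name.lower().endswith((".png", ".jpg", ".jpeg")):
        continue
    img = Image.open(os.path.join(RAW_DIR, name)).convert("RGB")
    img = img.resize((224, 224))                  # fixed input size expected by the CNN
    img.save(os.path.join(OUT_DIR, name))
```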
The step below is optional for inference but mandatory for training.
python Create_supervised_Data.py
- A CNN-based model is trained using a transfer learning technique; the details of the configuration are in the training script below.
The step below is optional for inference but mandatory for training.
python Image_supervised_model.py
This step is mandatory for testing inference on a large set of data.
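To make the transfer-learning setup concrete, here is a hedged sketch of a pretrained backbone with a global-average-pooling head, in the spirit of the GAP article linked in the Attribution section. The VGG16 backbone, the number of classes, the data folder, and the output model name are assumptions, not the project's exact configuration, and it presumes TensorFlow/Keras is available.

```python
# Illustrative transfer-learning classifier with a global-average-pooling head.
# VGG16 backbone, 10 classes, and folder/file names are assumptions.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

NUM_CLASSES = 10  # assumption: one class per logo brand in the supervised set

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                        # freeze the pretrained convolutional layers

x = GlobalAveragePooling2D()(base.output)     # GAP head, as in the linked article
outputs = Dense(NUM_CLASSES, activation="softmax")(x)
model = Model(base.input, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

train_gen = ImageDataGenerator(rescale=1.0 / 255).flow_from_directory(
    "data/supervised", target_size=(224, 224), batch_size=32)   # assumed folder layout
model.fit(train_gen, epochs=5)
model.save("models/logo_classifier.h5")       # assumed output location
```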
- Features from the existing logo catalogs are extracted and stored using:
python Extract_semi_supervised_data.py
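The sketch below illustrates the general idea of this step: run the catalog images through the trained network and store one feature vector per image. The choice of the penultimate layer, the file paths, and the .npy output format are assumptions about how the real script works.

```python
# Rough sketch of extracting and storing catalog features (run from the backend directory).
# Layer choice, file paths, and .npy output format are assumptions.
import os
import numpy as np
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.preprocessing import image

model = load_model("models/logo_classifier.h5")               # assumed model file name
feature_model = Model(model.input, model.layers[-2].output)   # features just before the softmax

CATALOG_DIR = os.path.join("..", "static", "semisuper")       # catalog images used for inference
features, names = [], []
for name in sorted(os.listdir(CATALOG_DIR)):
    img = image.load_img(os.path.join(CATALOG_DIR, name), target_size=(224, 224))
    arr = image.img_to_array(img)[None] / 255.0               # add batch dimension and rescale
    features.append(feature_model.predict(arr)[0])
    names.append(name)

np.save("data/catalog_features.npy", np.array(features))     # assumed output files
np.save("data/catalog_names.npy", np.array(names))
```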
- Instructions to run all tests:
(This requires the model to be downloaded into the backend/models folder.)
python online_search.py
Open localhost:5000 in your local browser (e.g., Google Chrome).
Select a PNG image as the search query. Alternatively, you can select an existing image from the static/semisuper folder to execute the search.
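Behind the web page, the search itself can be pictured as a nearest-neighbour lookup over the stored feature vectors. The cosine-similarity ranking, file names, and top-5 cutoff below are a simplified assumption of how online_search.py ranks results, not its literal code.

```python
# Simplified sketch of ranking catalog logos by cosine similarity to a query feature vector.
# File names and the top-5 cutoff are assumptions.
import numpy as np

def rank_catalog(query_vec, feature_file="data/catalog_features.npy",
                 name_file="data/catalog_names.npy", top_k=5):
    feats = np.load(feature_file)
    names = np.load(name_file)
    # cosine similarity between the query and every stored catalog feature
    sims = feats @ query_vec / (np.linalg.norm(feats, axis=1) * np.linalg.norm(query_vec) + 1e-8)
    order = np.argsort(-sims)[:top_k]
    return [(names[i], float(sims[i])) for i in order]
```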
Troubleshooting: you may need to install the following before executing online_search.py:
pip install werkzeug==0.15.4
python online_search.py
The project can be extended in the future by creating fake logo images using Generative Adversarial Networks (GANs).