This repository contains Python code to retrieve Steam games with similar store banners, using Meta AI's DINOv2.
Image similarity is assessed by the cosine similarity between image features encoded by DINOv2.
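As an illustration of the similarity measure, here is a minimal pure-Python sketch of cosine similarity and a top-k ranking. The function names, the toy appIDs and the 3-dimensional vectors are hypothetical and chosen for readability. Real DINOv2 features are much longer, and this is not the code used in the repository, which relies on a dedicated matching library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def top_k_similar(query_id, features, k=10):
    """Rank every other item by decreasing cosine similarity to the query."""
    query = features[query_id]
    scores = [
        (app_id, cosine_similarity(query, vec))
        for app_id, vec in features.items()
        if app_id != query_id
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy features keyed by made-up appIDs (assumption: illustration only).
toy_features = {
    "app_a": [1.0, 0.0, 0.0],
    "app_b": [0.9, 0.1, 0.0],
    "app_c": [0.0, 1.0, 0.0],
    "app_d": [-1.0, 0.0, 0.0],
}

ranking = top_k_similar("app_a", toy_features, k=2)
print(ranking)
```

Because cosine similarity only depends on the direction of the vectors, features are often L2-normalized once so that a plain dot product gives the same ranking.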
Data consists of vertical Steam banners (library_600x900.jpg at 300x450 resolution).
DINOv2 follows torchvision's pre-processing pipeline for classification:
- resize to 256 resolution, i.e. the smallest edge of the image will match this number,
- center-crop at 224 resolution, i.e. a square crop is made,
- normalize intensity.
    transforms_list = [
        transforms.Resize(resize_size, interpolation=interpolation),
        transforms.CenterCrop(crop_size),
        MaybeToTensor(),
        make_normalize_transform(mean=mean, std=std),
    ]
Therefore, downloaded images can be resized to 256x384 resolution before being stored to disk.
As discussed in this GitHub issue, the crop of DINOv2 can be made less aggressive by resizing to 224 instead of 256 resolution. In this case, downloaded images can be resized to 224x336 resolution before being stored to disk.
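The arithmetic behind these two sizes can be checked with a short sketch. The helper names below are hypothetical. The code only assumes that the shorter edge is scaled to the target while the aspect ratio is preserved, and that a 224x224 center crop follows.

```python
def resize_shorter_edge(width, height, target):
    """Size after scaling so the shorter edge equals `target`, keeping the aspect ratio."""
    if width <= height:
        return target, target * height // width
    return target * width // height, target


def center_crop_kept_fraction(width, height, crop):
    """Fraction of the resized image area that survives a square center crop."""
    return (min(crop, width) * min(crop, height)) / (width * height)


banner = (300, 450)  # width x height of the downloaded banners
for target in (256, 224):
    w, h = resize_shorter_edge(*banner, target)
    kept = center_crop_kept_fraction(w, h, 224)
    print(f"resize to {target}: {w}x{h}, center crop keeps {kept:.0%} of the area")
```

With a resize to 256, the 224x224 crop trims both the width and the height and keeps about half of the image. With a resize to 224, the full width is kept and only the top and bottom are trimmed, so about two thirds of the image remains.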
A snapshot of image data was downloaded on July 20, 2023 and stored as a GitHub release.
If you wish to re-create this data snapshot, run download_steam_images.ipynb.
If you wish to filter the image dataset, identify issues with find_dataset_issues.ipynb.
Note
For the same appID, matches computed on the fly can differ slightly from pre-computed matches, because the features are extracted from images resized with different interpolation algorithms:
- for on-the-fly matching, images are resized with transforms.InterpolationMode.BICUBIC,
- for pre-computed matches, images were resized by img2dataset with cv2.INTER_AREA, as suggested in the OpenCV documentation for downscaling.
For each game in the top 100 most played games in the past 2 weeks, the 10 most similar games are retrieved with:
AppIDs for all these apps can be found in JSON files in data/similar_to_top_100/.
Note
The linked pages contain a lot of images and might be slow to load depending on your Internet bandwidth.
Basketball:
Cartoony monsters:
Colorful:
Far West:
Paladins in anime style:
Armed persons with helmets:
Dinosaurs:
Knights:
Robots:
Bows:
Cars:
Castles:
Farms:
Tanks:
Tools:
Borderlands:
Cities (with skyscrapers):
Hollow Knight:
Left 4 Dead (with hands):
Monster Hunter (with dragons):
Mount & Blade (with horses):
Naraka:
Payday:
The Witcher (with Geralt of Rivia):
Truck simulators:
War planes:
War vehicles:
Each model performs reasonably well in terms of image retrieval.
The following results are obtained with ViT-S.
Cars in Rocket League:
Cars in Wallpaper Engine:
Dinosaurs:
Farms:
Persons standing in a forest:
Robots:
Sailing:
Tanks:
Interestingly, the results for The Binding of Isaac may be better than with ViT-L.
The first match for Sekiro reveals a similarity between Wolf in Sekiro and Mitsurugi in Soul Calibur.
Most matches for Naraka include a sword-like weapon with a blue glow:
The following results are obtained with ViT-B.
Bows:
Cartoon characters with a white mask:
Colorful:
Dinosaurs:
Interestingly, the first match for Monster Hunter is correctly in the same franchise.
- A feature extractor for GitHub repositories which include a hubconf.py file.
- A feature matcher based on the faiss library.
- match-steam-banners: retrieve games with similar store banners.
- Facebook's DINO:
- Facebook's DINOv2: