The game recommendation engine recommends the games whose keywords are most similar to those of a selected game. The keywords are built from the title, platform, genre, developer and description.
The entire database (title, genre, score, description, etc.) was scraped from Metacritic (from the subpage https://www.metacritic.com/browse/games/score/metascore/all/all/filtered, to be exact) using beautifulsoup4.
The images were also scraped, by searching for each game title on Google Images and downloading the first 3 results. They are stored on a separate server and are fetched via a link adjusted for the game title. Some images may be of poor resolution. Scraping was done in two rounds, so one dataset was modified and appended to the other to produce the final dataset. The keywords are merged on this final dataset and then subjected to sentiment analysis.
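As a rough illustration of the keyword-merging step, the sketch below joins the text fields of one game record into a single keyword string. The field names, the `build_keywords` helper and the sample record are assumptions made for this example, not the names or data used in the project.

```python
def build_keywords(game: dict) -> str:
    """Join the text fields of one game record into a lowercase keyword string."""
    # Hypothetical column names; the real dataset may label them differently.
    fields = ("title", "platform", "genre", "developer", "description")
    parts = [str(game.get(name, "")).strip().lower() for name in fields]
    # Skip empty fields so missing data does not leave double spaces.
    return " ".join(part for part in parts if part)


# Made-up record, used only to show the shape of the output.
game = {
    "title": "Example Quest",
    "platform": "PC",
    "genre": "Role-Playing",
    "developer": "Sample Studio",
    "description": "An open-world adventure.",
}
print(build_keywords(game))
```

Keeping all fields in one string means a single text comparison can cover every attribute at once.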
Link to website: https://game-recommender-engine.herokuapp.com/
How does it decide which item is most similar to the item the user likes? It uses similarity scores.
A similarity score is a numerical value between zero and one that indicates how similar two items are. It is obtained by measuring the similarity between the text details of the two items, which can be done with cosine similarity.
Cosine similarity measures the similarity between two vectors of an inner product space. It is measured by the cosine of the angle between two vectors and determines whether two vectors are pointing in roughly the same direction. It is often used to measure document similarity in text analysis.
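A minimal plain-Python sketch of this idea follows. It turns each keyword string into a word-count vector, scores pairs with cosine similarity and ranks the other games by score. The helper names (`to_vector`, `cosine_similarity`, `recommend`) and the three sample entries are assumptions for illustration; the project itself may rely on a library implementation instead.

```python
import math
import re
from collections import Counter


def to_vector(text: str) -> Counter:
    """Turn a keyword string into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two count vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(liked: str, keywords: dict, top_n: int = 3) -> list:
    """Rank every other game by similarity to the liked one, best first."""
    target = to_vector(keywords[liked])
    scores = [
        (title, cosine_similarity(target, to_vector(text)))
        for title, text in keywords.items()
        if title != liked
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Made-up keyword strings, used only to demonstrate the ranking.
keywords = {
    "Game A": "pc rpg fantasy open world",
    "Game B": "pc rpg fantasy dungeon",
    "Game C": "switch racing arcade",
}
print(recommend("Game A", keywords, top_n=2))
```

Because word counts are never negative, the scores produced this way stay between zero and one. "Game B" shares three words with "Game A" and ranks first, while "Game C" shares none and scores zero.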
Formula for calculating Cosine Similarity:
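In its standard form, for two vectors $A$ and $B$ with $n$ components (here $A_i$ and $B_i$ would be the counts of the $i$-th keyword in each item, a notation assumed for this write-up):

$$
\text{similarity}(A, B) = \cos(\theta) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \, \sqrt{\sum_{i=1}^{n} B_i^2}}
$$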
Source: Cosine Similarity
This project was inspired by:
1. https://github.com/subhamrex/Movie_Recommendation_System_with_Web_App
2. https://www.datacamp.com/community/tutorials/recommender-systems-python
Some pieces of code have been copied from other sources and are mentioned in the .py files.
Screenshot from my web app: