Movies Recommendation System

Recommend movies based on inputs provided by the user, such as:

  1. Number of movies for recommendations
  2. From_year
  3. To_year
  4. Genre (for content-based)/Movie title with release year (for collaborative filtering)

Background

Introduction

With the rapid growth of video streaming platforms, movie catalogs keep expanding, leaving viewers overwhelmed by the number of titles to choose from. Movie recommendation systems address this by taking users' preferences into account and suggesting movies to them, which saves the time and effort otherwise spent searching for a movie manually. This motivated me to research the topic of movie recommendation.

Data Source

Two MovieLens datasets, collected by the GroupLens Research team for research on recommender systems, were used:

  1. MovieLens 25M Dataset: Approximately 62 thousand movies and 25 million user-ratings
  2. MovieLens Latest Small Dataset: Approximately 10 thousand movies and 100 thousand user-ratings

Code Notebooks

Recommends Top-N Popular Movies when the user inputs the number of movies to recommend, from_year, to_year and genre. The recommender first filters the movie list by year range and genre. It then keeps only the movies whose rating is higher than the average rating and whose number of ratings is higher than the average count. Finally, it sorts the remaining movies by number of ratings in descending order and returns the top N as recommendations.

Recommends Top-N Rated Movies when the user inputs the number of movies to recommend, from_year, to_year and genre. The recommender first filters the movie list by year range and genre. It then keeps only the movies whose rating is higher than the average rating and whose number of ratings is higher than the average count. Finally, it sorts the remaining movies by average rating in descending order and returns the top N as recommendations.
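The two recommenders above share the same filter steps and differ only in the final sort key. A minimal sketch of that flow with pandas follows. The column names (`year`, `genres`, `avg_rating`, `rating_count`), the helper `top_n_movies` and the toy data are all hypothetical, and the sketch assumes the averages are taken over the already filtered list, which the description does not state. MovieLens stores genres as a pipe-separated string, which is why the genre check splits on `|`.

```python
import pandas as pd

# Toy stand-in for the MovieLens data; the column names are assumptions.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "year": [1995, 1999, 2001, 2005, 2010],
    "genres": ["Comedy|Drama", "Comedy", "Action", "Comedy", "Comedy|Romance"],
    "avg_rating": [4.5, 4.0, 3.9, 2.0, 4.2],
    "rating_count": [400, 500, 50, 100, 20],
})


def top_n_movies(movies, n, from_year, to_year, genre, sort_by):
    """Hypothetical helper: filter by year and genre, keep above-average
    movies, then sort by `sort_by` and return the first n rows."""
    # Step 1: year range (inclusive on both ends) and genre membership.
    in_range = movies["year"].between(from_year, to_year)
    has_genre = movies["genres"].str.split("|").apply(lambda gs: genre in gs)
    candidates = movies[in_range & has_genre]

    # Step 2: keep movies above the mean rating and the mean rating count.
    # Assumption: both means are computed over the filtered candidates.
    above_average = candidates[
        (candidates["avg_rating"] > candidates["avg_rating"].mean())
        & (candidates["rating_count"] > candidates["rating_count"].mean())
    ]

    # Step 3: sort descending by the chosen key and take the top n.
    return above_average.sort_values(sort_by, ascending=False).head(n)


# Same filters, different sort key.
popular = top_n_movies(movies, 2, 1990, 2006, "Comedy", sort_by="rating_count")
rated = top_n_movies(movies, 2, 1990, 2006, "Comedy", sort_by="avg_rating")
```

With this toy data, both calls keep the same two movies and only the ordering changes between `popular` and `rated`.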

Recommends Top-N Similar Movies (by Cosine Similarity) when the user inputs the number of movies to recommend, from_year, to_year and a movie title with release year. The recommender first filters the movie list by year range. It then builds a user-ratings matrix that holds each user's ratings for the different movies. The user-ratings of the input movie are selected and compared with the user-ratings of every other movie using cosine similarity, which yields a list of movies rated similarly to the input movie. This list is sorted by cosine similarity score in descending order and only the top 0.01% of user-rated movies are kept. Finally, the top N movies are returned as recommendations.
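To make the cosine similarity step concrete, here is a small sketch written with NumPy alone so the arithmetic stays visible; the notebook itself may use a library routine instead. The toy matrix, the movie titles and the helpers `cosine_to_all` and `top_n_similar` are hypothetical. The sketch assumes unrated movies are stored as 0, and it leaves out the year filter and the 0.01% cut described above.

```python
import numpy as np

# Hypothetical toy data: rows are users, columns are movies.
# Assumption: 0 means "this user did not rate this movie".
titles = ["Movie A (1995)", "Movie B (1996)", "Movie C (1997)", "Movie D (1998)"]
ratings = np.array([
    [5, 5, 1, 0],
    [4, 4, 0, 0],
    [1, 1, 5, 0],
    [0, 0, 4, 0],
], dtype=float)


def cosine_to_all(ratings, target):
    """Cosine similarity between movie column `target` and every column."""
    norms = np.linalg.norm(ratings, axis=0)   # length of each movie vector
    dots = ratings.T @ ratings[:, target]     # dot product with the target
    denom = norms * norms[target]
    scores = np.zeros_like(dots)
    # Movies nobody rated have a zero norm; their score stays 0.
    np.divide(dots, denom, out=scores, where=denom > 0)
    return scores


def top_n_similar(ratings, titles, title, n):
    """Return the n movies most similar to `title`, best first."""
    target = titles.index(title)
    scores = cosine_to_all(ratings, target)
    # Sort descending by score and skip the input movie itself.
    order = [i for i in np.argsort(-scores, kind="stable") if i != target]
    return [(titles[i], float(scores[i])) for i in order[:n]]


similar = top_n_similar(ratings, titles, "Movie A (1995)", 2)
```

In this toy matrix, "Movie B (1996)" receives exactly the same ratings as the input movie and therefore ranks first, while "Movie C (1997)" is liked by a different group of users and scores much lower.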

Recommends Top-N Similar Movies (by Pearson Correlation) when the user inputs the number of movies to recommend, from_year, to_year and a movie title with release year. The recommender first filters the movie list by year range. It then builds a user-ratings matrix that holds each user's ratings for the different movies. The user-ratings of the input movie are selected and compared with the user-ratings of every other movie using Pearson correlation, which yields a list of movies rated similarly to the input movie. This list is sorted by Pearson correlation score in descending order and only the top 0.01% of user-rated movies are kept. Finally, the top N movies are returned as recommendations.
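The Pearson variant can be sketched with pandas, where `Series.corr` computes the correlation over the users who rated both movies. Everything named here is hypothetical: the toy matrix, the helper `pearson_similar` and in particular the `min_overlap` threshold, which is an added assumption rather than something the description mentions. The year filter and the 0.01% cut are again left out.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: rows are users, columns are movies, NaN = not rated.
ratings = pd.DataFrame({
    "Movie A (1995)": [5, 4, 1, 2],
    "Movie B (1996)": [4, 5, 2, 1],
    "Movie C (1997)": [1, 2, 5, 4],
    "Movie D (1998)": [5, np.nan, 1, np.nan],
})


def pearson_similar(ratings, title, n, min_overlap=3):
    """Return the n movies whose ratings correlate best with `title`.

    `min_overlap` (an assumption, not taken from the notebooks) is the
    minimum number of users who rated both movies; below it the score
    is NaN and the movie is dropped.
    """
    target = ratings[title]
    others = ratings.drop(columns=title)
    # Series.corr ignores users who rated only one of the two movies.
    scores = others.apply(lambda col: col.corr(target, min_periods=min_overlap))
    return scores.dropna().sort_values(ascending=False).head(n)


correlated = pearson_similar(ratings, "Movie A (1995)", 2)
```

Unlike cosine similarity on raw ratings, Pearson correlation centers each movie's ratings on their mean, so a movie rated by the opposite audience ("Movie C (1997)" here) gets a negative score. "Movie D (1998)" shares only two raters with the input movie and is dropped by the overlap threshold.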

Libraries Used

  1. NumPy: NumPy is a fundamental Python package for scientific computing. It provides a multidimensional array object, several derived objects (such as masked arrays and matrices), and a collection of functions for fast array operations.
  2. Pandas: Pandas is an open source data analysis and manipulation library built on top of the Python programming language. It provides fast, flexible and user-friendly data structures and data analysis tools.
  3. Scikit-learn: Scikit-learn is an open-source machine learning library that supports both supervised and unsupervised learning. It also includes tools for model fitting, data preprocessing, model selection and evaluation, and a variety of other utilities.

Deployment

  1. Download the zipped code file from the repository main page, or click here: DOWNLOAD CODE
  2. Unzip the file and pick the .ipynb notebook of the movie recommender system you want to deploy.
  3. Open Google Colab -> Go to 'File' tab -> Select 'Upload notebook'.
    • Note: You must be signed in to a Google account before using Colab.
  4. Once the .ipynb notebook is uploaded -> Go to 'Runtime' tab -> Select 'Run all'.
  5. To get your own movie recommendations, change the inputs in the last cell, 'Function Call'.

Demo

  1. Example screenshot of recommendations result provided by Top-N Popular Movies Recommender System
  2. Example screenshot of recommendations result provided by Top-N Rated Movies Recommender System
  3. Example screenshot of recommendations result provided by Top-N Similar Movies Recommender System (by Cosine Similarity)
  4. Example screenshot of recommendations result provided by Top-N Similar Movies Recommender System (by Pearson Correlation)

Author

Mounik Patel

Acknowledgment

Thanks to Dr. T