Restaurant - Exploratory Data Analysis

During my Data Science internship at Cognifyz Technologies in Jan-Feb 2024, I was given three levels of tasks to analyze restaurants and the factors affecting their ratings. The main objective was to gather meaningful insights through exploratory data analysis on a large restaurant dataset, and to build an ML model to predict ratings. This GitHub repo contains all the files I worked on, including the data exploration, preprocessing, and visualization work used to find interesting insights. More details are listed below:

Tasks

All tasks are described in detail here.

Dataset

It can be downloaded here.

Platform used

Google Colab

Libraries and Tools used

pandas, numpy, matplotlib, seaborn, scikit-learn, folium, geopandas

Data Preprocessing & Feature Engineering

The Cuisines column had 9 null values, so those rows were dropped.
Removed features that would inhibit model performance.
Split the data into training and test sets in an 8:2 ratio.
Label-encoded the categorical features/columns that needed it.
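
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on a tiny made-up frame, not the notebook code from the repo: every column name other than `Cuisines` is an assumption, as is the choice of `Restaurant Name` as the feature to drop and the use of scikit-learn's `LabelEncoder` for the encoding step.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# Tiny stand-in frame (made-up values). Column names other than
# "Cuisines" are assumptions for illustration only.
df = pd.DataFrame({
    "Restaurant Name": list("ABCDEFGHIJK"),
    "Cuisines": ["North Indian", "Chinese", None, "Fast Food", "Chinese",
                 "North Indian", "Fast Food", "Chinese", "North Indian",
                 "Fast Food", "Chinese"],
    "Has Table booking": ["Yes", "No", "No", "Yes", "No", "No",
                          "Yes", "No", "Yes", "No", "No"],
    "Price range": [3, 1, 2, 4, 1, 2, 3, 1, 4, 2, 1],
    "Aggregate rating": [4.1, 3.2, 0.0, 4.5, 3.0, 3.4,
                         3.9, 2.8, 4.3, 3.1, 3.3],
})

# 1. Drop the rows where Cuisines is null.
df = df.dropna(subset=["Cuisines"]).reset_index(drop=True)

# 2. Remove a feature that is not expected to help the model
#    (an identifier-like column, chosen here as an example).
df = df.drop(columns=["Restaurant Name"])

# 3. Label-encode the categorical columns (one encoder per column).
for col in ["Cuisines", "Has Table booking"]:
    df[col] = LabelEncoder().fit_transform(df[col])

# 4. Split into training and test sets in an 8:2 ratio.
X = df.drop(columns=["Aggregate rating"])
y = df["Aggregate rating"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
```

Fixing `random_state` keeps the split reproducible between runs; the value itself is arbitrary.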

Model Training and Performance

Used the Random Forest, Decision Tree, and Logistic Regression algorithms to build the models.
My restaurant rating prediction models (Random Forest and Decision Tree) obtained an aggregate R2 score of 0.93.
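
For reference, fitting the two tree-based models and scoring them with R2 could look something like the sketch below. It assumes the regressor variants from scikit-learn (`RandomForestRegressor`, `DecisionTreeRegressor`) and runs on synthetic data generated in the snippet, so the scores it prints say nothing about the 0.93 reported above. The feature names and hyperparameters are assumptions, and Logistic Regression is left out of the sketch.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: the three features and the formula for the
# target are invented purely so the example runs end to end.
rng = np.random.default_rng(0)
n = 500
price = rng.integers(1, 5, size=n)      # hypothetical price range, 1 to 4
votes = rng.integers(0, 1000, size=n)   # hypothetical vote count
booking = rng.integers(0, 2, size=n)    # hypothetical table-booking flag
X = np.column_stack([price, votes, booking])
y = 2.0 + 0.4 * price + 0.001 * votes + 0.3 * booking + rng.normal(0, 0.1, size=n)

# Same 8:2 split as in the preprocessing step.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "Decision Tree": DecisionTreeRegressor(random_state=42),
}

# Fit each model on the training set and score it on the held-out test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R2 = {scores[name]:.3f}")
```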

EDA : Insights

(Level-wise and task-wise conclusions are given in detail in the repo folders.)

Many restaurants have a rating of 0, probably because they are less popular.
The geospatial distribution of restaurants was visualized on map coordinates using folium and geopandas.
Most popular restaurants fall in the rating range of 3 to 3.5.
Expensive restaurants (higher price range) tend to have higher ratings.
New Delhi has the highest number of restaurants.
By country, country code “1” (probably North America) has the highest number of restaurants.
"North Indian" is the most popular cuisine overall, followed by "Chinese" and "Fast Food".
Restaurants with a table booking facility have a noticeably higher average rating.
“Sunda” is the highest rated cuisine and also has the most votes.
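
Several of the insights above come down to simple counts and group-wise averages. The snippet below is a small sketch of how such aggregations could be computed with pandas. The frame is made up, and the column names (`City`, `Cuisines`, `Has Table booking`, `Price range`, `Aggregate rating`) and the comma-separated cuisine format are assumptions about the dataset rather than details confirmed here.

```python
import pandas as pd

# Made-up rows for illustration; column names are assumptions.
df = pd.DataFrame({
    "City": ["New Delhi", "New Delhi", "Gurgaon", "New Delhi", "Noida", "Gurgaon"],
    "Cuisines": ["North Indian, Chinese", "North Indian", "Fast Food",
                 "Chinese", "North Indian", "Fast Food"],
    "Has Table booking": ["Yes", "No", "No", "Yes", "No", "No"],
    "Price range": [3, 1, 1, 4, 2, 2],
    "Aggregate rating": [4.0, 3.0, 3.2, 4.4, 3.4, 3.0],
})

# City with the highest number of restaurants.
top_city = df["City"].value_counts().idxmax()

# Average rating with and without table booking.
rating_by_booking = df.groupby("Has Table booking")["Aggregate rating"].mean()

# Average rating per price range.
rating_by_price = df.groupby("Price range")["Aggregate rating"].mean()

# Cuisine popularity: split multi-cuisine entries, then count each cuisine.
cuisine_counts = df["Cuisines"].str.split(", ").explode().value_counts()

print(top_city)
print(rating_by_booking)
print(rating_by_price)
print(cuisine_counts)
```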

