Ensemble-Modelling-and-Visualization-of-Data

A Major and Minor Project built during Verzeo Internship

I joined the Verzeo team for an internship from June to August 2020.

MINOR PROJECT

I built a minor project in which I analysed and cleaned the given dataset, detected outliers in it and performed Exploratory Data Analysis (EDA). I also wrote code to answer the following questions. The answers themselves can be found in "Mini Project Answers.pdf".

1) Which movies have the third lowest and third highest budgets?
2) What is the average number of words in movie titles between the years 2000 and 2005?
3) What is the most common genre for Vin Diesel and Emma Watson movies?
4) Which movies earned the most and the least revenue?
5) What is the average runtime of movies in the year 2006?
6) Name any 3 production companies that have invested in low-revenue movies.
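
As an illustration of how questions like 1) and 2) above can be answered with pandas, here is a minimal sketch on toy data. The column names (title, budget, year) and the helper functions are assumptions made for this example; they are not taken from the actual dataset or from "Mini Project.ipynb".

```python
import pandas as pd

# Toy data; the column names are hypothetical, not the real dataset's schema.
movies = pd.DataFrame({
    "title": ["Alpha", "Beta Two", "Gamma Ray Burst",
              "Delta", "Epsilon Rising", "Zeta"],
    "budget": [10, 50, 30, 20, 40, 60],
    "year": [1999, 2001, 2003, 2005, 2006, 2010],
})


def nth_by_budget(df, n, highest):
    """Return the title of the movie with the n-th highest (or lowest) budget."""
    ranked = df.sort_values("budget", ascending=not highest)
    return ranked.iloc[n - 1]["title"]


def mean_title_words(df, start, end):
    """Average number of words in titles released from start to end, inclusive."""
    in_range = df[df["year"].between(start, end)]
    return in_range["title"].str.split().str.len().mean()


third_highest = nth_by_budget(movies, 3, highest=True)
third_lowest = nth_by_budget(movies, 3, highest=False)
avg_words = mean_title_words(movies, 2000, 2005)
print(third_highest, third_lowest, avg_words)
```

On real data, ties in budget and missing or zero values would need a decision before ranking; this sketch ignores both.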

For more details about the project, please refer to "Mini Project.ipynb".
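
The outlier detection step of the minor project is not described in detail here. One common approach is the interquartile range (IQR) rule, sketched below with only the standard library. The choice of the IQR rule, the multiplier of 1.5 and the sample runtimes are all assumptions for illustration, and the notebook may use a different method.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # three cut points: Q1, median, Q3
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical movie runtimes in minutes, with one implausible entry.
runtimes = [90, 95, 100, 105, 110, 115, 120, 125, 600]
outliers = iqr_outliers(runtimes)
print(outliers)
```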

MAJOR PROJECT

The major project assigned in this internship required me to apply Multinomial Naive Bayes, K-Nearest Neighbors (KNN) and Random Forest models to a given dataset (problem) and to decide which classification algorithm performs best, as measured by accuracy.
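
A comparison of this kind can be sketched with scikit-learn as follows. This is a minimal example under stated assumptions: the toy texts, the "sports"/"food" labels, the bag-of-words features and the hyperparameters are all placeholders chosen for illustration, not the project's actual data or settings.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier

# Toy corpus; texts and labels are hypothetical stand-ins.
texts = [
    "great goal in the match", "the team won the game", "what a match today",
    "coach praised the team", "late goal wins the game", "fans cheer the team",
    "fresh pasta for dinner", "this soup is delicious", "baking bread at home",
    "dinner was delicious", "spicy soup and bread", "pasta with fresh basil",
]
labels = ["sports"] * 6 + ["food"] * 6

train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Fit the vocabulary on the training split only, then reuse it for the test split.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

models = {
    "Multinomial Naive Bayes": MultinomialNB(),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best = max(scores, key=scores.get)
print(scores, "best:", best)
```

With so few samples the accuracies are not meaningful; the point is only the shape of the comparison loop.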

I worked on the Information.csv dataset for this project. I also performed Exploratory Data Analysis on the dataset provided by the Verzeo team. In addition, I applied Ensemble Learning, building a model that combines the 3 classification algorithms, which resulted in an accuracy of 61%. Based on the observations, the project was successfully completed.
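
One way to combine the three classifiers is scikit-learn's VotingClassifier, sketched below. Whether the project used this class, and whether it used hard (majority) or soft (probability-averaged) voting, is an assumption here; the toy texts and labels are placeholders, and the example makes no attempt to reproduce the 61% figure.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Toy corpus; texts and labels are hypothetical stand-ins.
train_texts = [
    "great goal in the match", "the team won the game",
    "coach praised the team", "fans cheer the team",
    "fresh pasta for dinner", "this soup is delicious",
    "baking bread at home", "spicy soup and bread",
]
train_labels = ["sports"] * 4 + ["food"] * 4

# Hard voting: each model casts one vote and the majority label wins.
ensemble = VotingClassifier(
    estimators=[
        ("mnb", MultinomialNB()),
        ("knn", KNeighborsClassifier(n_neighbors=3)),
        ("rfc", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="hard",
)

# The pipeline turns raw text into word counts before voting.
model = make_pipeline(CountVectorizer(), ensemble)
model.fit(train_texts, train_labels)

predictions = model.predict(["the game was a great match", "delicious bread and soup"])
print(list(predictions))
```

Hard voting only needs class predictions from each model; soft voting would instead average predicted probabilities, which all three of these estimators can provide.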

Tasks Performed:

Exploratory Data Analysis
Cleaning the Data
Data Visualization
Normalizing the texts
Feature Engineering
Classification algorithms:
- Multinomial Naive Bayes
- K-Nearest Neighbors (KNN)
- Random Forest (RFC)
Ensemble Learning method - Vote Classifier
Answered the following questions:

Q1) What are the most common emotions/words used by Males and Females?
Q2) What is the time when most of the tweets are created by Males and Females?

The answers to these questions can be found in "Major Project Summary.pdf".
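
The text normalization step and the two questions above can be illustrated with the standard library alone. This is a minimal sketch: the field names (gender, text, created), the timestamp format, the tiny stopword list and the sample tweets are all assumptions for the example and do not come from Information.csv.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical tweets; the fields and the timestamp format are assumptions.
tweets = [
    {"gender": "male", "text": "Love this game! Love it", "created": "2015-10-26 12:40"},
    {"gender": "male", "text": "Great game tonight @bob", "created": "2015-10-26 12:05"},
    {"gender": "male", "text": "Game on http://t.co/x", "created": "2015-10-27 21:10"},
    {"gender": "female", "text": "Happy to see this happy crowd", "created": "2015-10-26 09:15"},
    {"gender": "female", "text": "So happy today", "created": "2015-10-27 09:50"},
    {"gender": "female", "text": "The crowd is loud", "created": "2015-10-27 18:30"},
]

# A deliberately small stopword list, for illustration only.
STOPWORDS = {"the", "a", "this", "it", "is", "to", "and"}


def normalize(text):
    """Lowercase, drop URLs and @mentions, keep word tokens that are not stopwords."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return [w for w in re.findall(r"[a-z']+", text) if w not in STOPWORDS]


words_by_gender = defaultdict(Counter)
hours_by_gender = defaultdict(Counter)
for tweet in tweets:
    words_by_gender[tweet["gender"]].update(normalize(tweet["text"]))
    hour = datetime.strptime(tweet["created"], "%Y-%m-%d %H:%M").hour
    hours_by_gender[tweet["gender"]][hour] += 1

# Most frequent word and busiest hour of day for each gender.
top_words = {g: c.most_common(1)[0][0] for g, c in words_by_gender.items()}
peak_hours = {g: c.most_common(1)[0][0] for g, c in hours_by_gender.items()}
print(top_words, peak_hours)
```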

Conclusion:

The project now identifies the most common emotions/words used by each gender (Males and Females) and the time when most of their tweets are created. This is my first project based on Machine Learning.

For more details about the project, please refer to "Major Project.ipynb".

I am glad to share this on GitHub as my contribution to open source.

