Internship ReadMe: Prodigy Infotech Data Science Tasks

Overview:-

This repository contains the tasks completed during my internship at Prodigy Infotech. Each task demonstrates a different aspect of data science: data visualization, data cleaning, exploratory data analysis, machine learning, and sentiment analysis. The tasks draw on public datasets from the World Bank, Kaggle, and the UCI Machine Learning Repository.

Task-01: Data Visualization

Objective:-

Create a bar chart or histogram to visualize the distribution of a categorical or continuous variable, such as the ages or genders in a population.

Dataset:-

World Bank Population Data: https://data.worldbank.org/indicator/SP.POP.TOTL

Description:-

Loaded the population data from the World Bank.
Processed the data to extract the relevant categorical or continuous variable.
Created a bar chart or histogram to visualize the distribution.
Used Python libraries such as pandas for data manipulation and matplotlib/seaborn for visualization.
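
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the `sample_population` frame, its made-up values, and the helper name `plot_population_histogram` are all assumptions. With the real World Bank download you would load the CSV with `pd.read_csv` instead; that file usually has a few metadata rows above the header, so a `skiprows` argument may be needed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the World Bank table: one row per country,
# one column per year. The numbers are made up for illustration only.
sample_population = pd.DataFrame({
    "Country Name": ["Country A", "Country B", "Country C",
                     "Country D", "Country E", "Country F"],
    "2022": [1.2e6, 3.4e6, 5.0e6, 8.9e6, 2.1e7, None],
})


def plot_population_histogram(df, year, bins=5):
    """Plot how total population is distributed across countries for one year."""
    values = df[year].dropna()  # countries with no figure for that year are skipped
    fig, ax = plt.subplots(figsize=(6, 4))
    counts, edges, _ = ax.hist(values, bins=bins, edgecolor="black")
    ax.set_xlabel(f"Total population ({year})")
    ax.set_ylabel("Number of countries")
    ax.set_title(f"Distribution of country populations, {year}")
    fig.tight_layout()
    return fig, counts


fig, counts = plot_population_histogram(sample_population, "2022")
```

Calling `fig.savefig("population_histogram.png")` afterwards writes the chart to disk. For a categorical variable, a bar chart of `value_counts()` would play the same role as the histogram.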

Task-02: Data Cleaning and Exploratory Data Analysis (EDA)

Objective:-

Perform data cleaning and exploratory data analysis on a dataset to explore relationships between variables and identify patterns and trends.

Dataset:-

Titanic Dataset from Kaggle: https://www.kaggle.com/c/titanic/data

Description:-

Loaded the Titanic dataset.
Cleaned the data by handling missing values, encoding categorical variables, and normalizing numerical variables.
Conducted EDA to explore the relationships between variables and identify patterns and trends.
Visualized the data using various plots (e.g., scatter plots, box plots, heatmaps).
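
A minimal sketch of the cleaning and EDA steps, under stated assumptions: the column names follow the Kaggle Titanic `train.csv` layout, the five rows below are invented stand-ins, and the specific choices (median imputation, one-hot encoding, min-max scaling) are one reasonable option rather than the method actually used in this repository. The helper name `clean_titanic` is hypothetical.

```python
import pandas as pd

# Invented sample rows that mimic a few Titanic columns (illustration only).
sample = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1],
    "Pclass": [3, 1, 3, 2, 1],
    "Sex": ["male", "female", "female", "male", "female"],
    "Age": [22.0, 38.0, None, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, 13.0, 51.86],
    "Embarked": ["S", "C", "S", None, "S"],
})


def clean_titanic(df):
    """Handle missing values, encode categoricals, and scale numeric columns."""
    out = df.copy()
    # Missing values: median for the numeric column, most frequent value for the categorical one.
    out["Age"] = out["Age"].fillna(out["Age"].median())
    out["Embarked"] = out["Embarked"].fillna(out["Embarked"].mode()[0])
    # Encoding: binary map for Sex, one-hot columns for Embarked.
    out["Sex"] = out["Sex"].map({"male": 0, "female": 1})
    out = pd.get_dummies(out, columns=["Embarked"], dtype=int)
    # Min-max normalization to the [0, 1] range (assumes each column is not constant).
    for col in ["Age", "Fare"]:
        lo, hi = out[col].min(), out[col].max()
        out[col] = (out[col] - lo) / (hi - lo)
    return out


cleaned = clean_titanic(sample)

# Two simple EDA summaries: survival rate by sex and pairwise correlations.
survival_by_sex = cleaned.groupby("Sex")["Survived"].mean()
correlations = cleaned.corr()
```

The `correlations` frame can be passed to `seaborn.heatmap` to produce the kind of heatmap mentioned above.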

Task-03: Decision Tree Classifier

Objective:-

Build a decision tree classifier to predict whether a customer will purchase a product or service based on their demographic and behavioral data.

Dataset:-

Bank Marketing Dataset from UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing

Description:-

Loaded the Bank Marketing dataset.
Preprocessed the data by encoding categorical variables and splitting the data into training and test sets.
Built a decision tree classifier using scikit-learn.
Evaluated the classifier's performance using metrics such as accuracy, precision, recall, and F1-score.
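
The workflow above might look roughly like this. It is a sketch on synthetic data, not the repository's code: the `toy` frame borrows a few column names from the Bank Marketing dataset (`age`, `balance`, `housing`, `marital`, `y`), but its values and the rule that generates the label are made up, and the hyperparameters (`max_depth=4`, a 25% test split) are arbitrary assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the Bank Marketing data (values are random, not real).
rng = np.random.default_rng(42)
n = 200
toy = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "balance": rng.integers(-500, 5000, size=n),
    "housing": rng.choice(["yes", "no"], size=n),
    "marital": rng.choice(["single", "married", "divorced"], size=n),
})
# Made-up labelling rule so the tree has a pattern to learn.
toy["y"] = np.where((toy["balance"] > 2000) & (toy["housing"] == "no"), "yes", "no")

# Encode categorical features as one-hot columns and the target as 0/1.
X = pd.get_dummies(toy.drop(columns="y"), columns=["housing", "marital"], dtype=int)
y = (toy["y"] == "yes").astype(int)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

# A shallow tree keeps the model readable and limits overfitting.
clf = DecisionTreeClassifier(max_depth=4, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
}
print(metrics)
```

On the real dataset, which is class-imbalanced, precision, recall, and F1 are usually more informative than accuracy alone.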

Task-04: Sentiment Analysis

Objective:-

Analyze and visualize sentiment patterns in social media data to understand public opinion and attitudes towards specific topics or brands.

Dataset:-

Twitter Entity Sentiment Analysis Dataset from Kaggle: https://www.kaggle.com/datasets/jp797498e/twitter-entity-sentiment-analysis

Description:-

Loaded the Twitter sentiment analysis dataset.
Preprocessed the data by cleaning text, tokenizing, and vectorizing.
Analyzed sentiment patterns using natural language processing techniques.
Visualized the sentiment distribution and identified key trends.
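
One way the preprocessing and analysis steps could be sketched is shown below. The four tweets, the brand names, and the column names `entity`, `sentiment`, and `text` are invented for illustration, and TF-IDF from scikit-learn stands in for whatever tokenizing and vectorizing approach the actual scripts use. The helper name `clean_text` is hypothetical.

```python
import re

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented example tweets (not taken from the Kaggle dataset).
tweets = pd.DataFrame({
    "entity": ["BrandA", "BrandA", "BrandB", "BrandB"],
    "sentiment": ["Positive", "Negative", "Positive", "Neutral"],
    "text": [
        "Loving the new update!!! https://example.com @dev",
        "Worst patch ever, game keeps crashing #fail",
        "Great support from the team today",
        "The store opens at 9 tomorrow",
    ],
})


def clean_text(text):
    """Lowercase the text and strip URLs, mentions, hashtags, and non-letters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)       # mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # punctuation and digits
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace


tweets["clean_text"] = tweets["text"].apply(clean_text)

# Tokenize and vectorize in one step: each row becomes a TF-IDF weighted term vector.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf_matrix = vectorizer.fit_transform(tweets["clean_text"])

# Sentiment distribution overall and per entity.
sentiment_counts = tweets["sentiment"].value_counts()
sentiment_by_entity = pd.crosstab(tweets["entity"], tweets["sentiment"])
```

`sentiment_by_entity.plot(kind="bar")` would give a quick grouped bar chart of the per-entity distribution, and `tfidf_matrix` could serve as input to a classifier if sentiment prediction is the goal.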

Requirements

To run the scripts and reproduce the results, the following Python libraries are required:

Pandas
NumPy
Matplotlib
Seaborn
Scikit-learn
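
As a convenience, a short check like the one below can confirm that these libraries are importable before running the task scripts. It is only a sketch: the mapping from package name to import name is standard, but the helper name `check_requirements` is an assumption and not part of this repository.

```python
from importlib.util import find_spec

# Package name as installed with pip -> module name used in `import` statements.
REQUIRED = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def check_requirements(required=REQUIRED):
    """Return a dict mapping each package name to True if its module can be found."""
    return {pkg: find_spec(module) is not None for pkg, module in required.items()}


status = check_requirements()
for pkg, available in status.items():
    print(f"{pkg}: {'ok' if available else 'missing'}")
```

Any package reported as missing can be installed with `pip install` followed by the package name shown in the output.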

Conclusion

This repository showcases my data science skills through various tasks involving data visualization, cleaning, exploratory analysis, machine learning, and sentiment analysis. Each task demonstrates my ability to work with different datasets and apply appropriate techniques to extract meaningful insights.

KeerthanaPalanikumar/Prodigy-Infotech
