Analysis of roller coaster data (raw .csv file) using Python and pandas: a basic, step-by-step process covering data cleaning, data analysis, and visualization
DataSource : Kaggle - https://www.kaggle.com/datasets/robikscube/rollercoaster-database
RawData File : RollerCoaster_Raw_Data.csv
Python-Pandas Source Code : RollerCoaster-DataAnalysis.ipynb
Final Output : RoalerCoaster-FinalAnalysis.pdf
Step 1 : Data Understanding using following topics
- Dataframe Shape, head and tail, dtypes, info, describe
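
A minimal sketch of the Step 1 checks. The small DataFrame and its column names (`coaster_name`, `location`, `year_introduced`, `speed_mph`) are hypothetical stand-ins for the raw file, not the actual dataset schema; in the notebook the frame would presumably come from `pd.read_csv("RollerCoaster_Raw_Data.csv")`.

```python
import pandas as pd

# Hypothetical stand-in for the raw CSV; column names and values are assumptions.
df = pd.DataFrame(
    {
        "coaster_name": ["Alpha", "Beta", "Gamma", "Delta"],
        "location": ["Park A", "Park B", "Park A", None],
        "year_introduced": [1999, 2005, 2012, 2020],
        "speed_mph": [45.0, 60.5, None, 72.0],
    }
)

print(df.shape)    # (number of rows, number of columns)
print(df.head(2))  # first rows
print(df.tail(2))  # last rows
print(df.dtypes)   # datatype of each column
df.info()          # non-null counts and memory usage (prints directly)

# Summary statistics for the numeric columns only.
summary = df.describe()
print(summary)
```

The `count` row of `describe()` is a quick way to spot columns with missing values before moving on to data preparation.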
Step 2 : Data Preparation
- Dropping irrelevant columns and rows
- Identifying duplicated columns
- Renaming columns, changing datatypes
- Cleaning null values, removing duplicated values
- Feature creation
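
The preparation steps above can be sketched roughly as follows. All column names (`notes`, `speed_copy`, `Opening date`, and so on), the sample rows, and the choice of which columns define a duplicate are assumptions for illustration only; the notebook may handle them differently.

```python
import pandas as pd

# Hypothetical raw data; every column name and value here is an assumption.
raw = pd.DataFrame(
    {
        "coaster_name": ["Alpha", "Alpha", "Beta", "Gamma"],
        "Location": ["Park A", "Park A", "Park B", "Park C"],
        "Opening date": ["1999-05-01", "1999-05-01", "2005-07-15", None],
        "speed_mph": ["45.0", "45.0", "60.5", "72.0"],
        "speed_copy": ["45.0", "45.0", "60.5", "72.0"],
        "notes": ["x", "x", "y", "z"],
    }
)

# Identify a duplicated column by comparing its values with another column.
is_copy = raw["speed_mph"].equals(raw["speed_copy"])

# Drop irrelevant/duplicated columns and rename the rest consistently.
df = raw.drop(columns=["notes", "speed_copy"]).rename(
    columns={
        "coaster_name": "Coaster_Name",
        "Opening date": "Opening_Date",
        "speed_mph": "Speed_mph",
    }
)

# Change datatypes: text to datetime and to numeric.
df["Opening_Date"] = pd.to_datetime(df["Opening_Date"])
df["Speed_mph"] = pd.to_numeric(df["Speed_mph"])

# Clean null values: inspect the counts, then drop rows without an opening date.
print(df.isna().sum())
df = df.dropna(subset=["Opening_Date"])

# Remove duplicated rows, judged on a chosen subset of columns.
key = ["Coaster_Name", "Location", "Opening_Date"]
n_duplicates = df.duplicated(subset=key).sum()
df = df.drop_duplicates(subset=key).reset_index(drop=True)

# Feature creation: derive the opening year from the date.
df["Opening_Year"] = df["Opening_Date"].dt.year
print(df)
```

Dropping rows with nulls is only one option; filling or keeping them may suit other columns better.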
Step 3 : Feature Understanding (Univariate Analysis)
- Plotting Feature Distributions
- Histograms
- KDE
- Boxplot
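
A small sketch of univariate plots for a single feature, using pandas plotting on top of matplotlib. The `Speed_mph` column and its values are hypothetical. A KDE curve can be drawn the same way with `plot(kind="kde")`, but it is left out here because that call additionally requires SciPy.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical feature values; the column name is an assumption.
df = pd.DataFrame({"Speed_mph": [45.0, 50.0, 55.0, 60.5, 62.0, 72.0, 80.0, 95.0]})

fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Histogram: how the values are distributed across 5 bins.
df["Speed_mph"].plot(kind="hist", bins=5, ax=ax_hist, title="Coaster speed (mph)")
ax_hist.set_xlabel("Speed_mph")

# Boxplot: median, quartiles, and potential outliers.
df["Speed_mph"].plot(kind="box", ax=ax_box, title="Speed spread")

fig.tight_layout()
```

In a notebook the figure renders inline; in a script, `fig.savefig(...)` or `plt.show()` would be needed to see it.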
Step 4 : Feature Relationships
- ScatterPlot
- Heatmap Correlation
- Pairplot
- Groupby Comparisons
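
The numeric side of the relationship analysis can be sketched with pandas alone. The columns (`Location`, `Speed_mph`, `Height_ft`) and values are hypothetical; the plotting library used for the heatmap and pairplot in the notebook is not stated here, so only the underlying computations are shown.

```python
import pandas as pd

# Hypothetical data; column names and values are assumptions.
df = pd.DataFrame(
    {
        "Location": ["Park A", "Park A", "Park B", "Park B"],
        "Speed_mph": [40.0, 60.0, 70.0, 90.0],
        "Height_ft": [100.0, 160.0, 170.0, 230.0],
    }
)

# Pairwise Pearson correlation between numeric features;
# this matrix is what a correlation heatmap visualizes.
corr = df[["Speed_mph", "Height_ft"]].corr()
print(corr)

# Groupby comparison: average speed per location, fastest first.
mean_speed = (
    df.groupby("Location")["Speed_mph"].mean().sort_values(ascending=False)
)
print(mean_speed)
```

A scatter plot of the same two features can be drawn with `df.plot(kind="scatter", x="Speed_mph", y="Height_ft")`.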
Various graphs of the roller coasters' different parameters