This project aims to provide a comprehensive analysis of Pakistan's startup ecosystem through data cleaning, preparation, pivot table creation, and visualization using Microsoft Excel. The analysis will help identify trends, growth patterns, and opportunities within the startup industry in Pakistan.
This repository contains resources for performing a data-driven exploration of the startup ecosystem in Pakistan. The analysis includes:
- Cleaning raw startup data.
- Preparing the data for analysis.
- Creating pivot tables to summarize key metrics.
- Visualizing the results using Excel.
Data cleaning involves removing duplicates, handling missing values, and correcting inconsistencies in the dataset. This ensures that the data is accurate and ready for analysis.
- Remove Duplicates: Use
Data > Remove Duplicates
. - Handle Missing Values: Use
Find & Select > Go To Special > Blanks
to locate and fill missing values. - Correct Inconsistencies: Standardize data entries using
Find & Replace
and data validation techniques.
Data preparation includes transforming variables, creating new features, and structuring the data to facilitate meaningful analysis. This step ensures that the data is in a format suitable for creating pivot tables and visualizations.
- Date Formatting: Ensure date fields are in the correct format using
Data > Text to Columns
or date functions. - Categorical Encoding: Use
Data Validation
and lookup tables to standardize categorical variables. - Creating New Features: Generate new columns using formulas like
YEAR()
,MONTH()
, andVLOOKUP()
.
Pivot tables are used to summarize and analyze data by aggregating information across various dimensions, such as funding by year or the number of startups by category. This allows for a comprehensive view of the startup ecosystem.
- Insert Pivot Table: Select your data range and go to
Insert > PivotTable
. - Rows and Columns: Drag fields such as 'Year' to rows and 'Category' to columns.
- Values: Drag numerical fields like 'Funding Amount' to values and select appropriate aggregation (e.g., sum, average).
Excel is used to create dynamic and interactive visualizations, such as bar charts, line charts, and pie charts. These visualizations help in interpreting and communicating the results of the analysis effectively.
- Insert Chart: Select the data you want to visualize and go to
Insert > Chart
. - Customize Chart: Add titles, labels, and adjust the color scheme to improve readability.
- Dynamic Charts: Use slicers and interactive elements to create dynamic charts that allow for filtering and drill-down analysis.
To get a local copy up and running, follow these steps:
- Microsoft Excel
-
Clone the repository:
git clone https://github.com/yourusername/pakistan-startup-analysis.git cd pakistan-startup-analysis
- Open
pakistan_startups.xlsx
. - Clean the data using the steps outlined in the Data Cleaning section.
- Prepare the data using the steps outlined in the Data Preparation section.
- Create pivot tables using the steps outlined in the Pivot Table section.
- Visualize the data using the steps outlined in the Excel Visualization section.
Feel free to reach out with any questions or feedback. Happy analyzing!