Optimizing the Launch and Operation of a New E-commerce Business in Pakistan: A Data-Driven Strategy.
A group of entrepreneurs wants to start a new e-commerce business in Pakistan. They have gathered data on half a million e-commerce orders placed in Pakistan between March 2016 and August 2018. As the data analyst, your task is to explore and analyze this data to extract useful insights, answer analytical and research questions, and test hypotheses with a data-driven approach, so that the entrepreneurs can make informed decisions based on what the data says.
Loaded the dataset and inspected its rows and columns, sampled a few rows, and checked each column's data type.
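The loading and inspection step can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the column names (`item_id`, `status`, `created_at`, `category`, `price`, `qty_ordered`, `payment_method`), the helper `load_orders`, and the four sample rows are all assumptions made up for the example, and an in-memory CSV stands in for the real file.

```python
import io

import pandas as pd

# Tiny made-up CSV standing in for the real orders file.
# Column names and values are assumptions for illustration only.
SAMPLE_CSV = """item_id,status,created_at,category,price,qty_ordered,payment_method
1,complete,2016-07-01,Mobiles & Tablets,15000,1,cod
2,canceled,2016-07-01,Men's Fashion,1200,2,Easypay
3,complete,2016-07-02,Women's Fashion,900,1,cod
4,order_refunded,2016-07-03,Appliances,8000,1,cod
"""


def load_orders(source):
    """Read an orders CSV (path or file-like object) and parse the order date."""
    return pd.read_csv(source, parse_dates=["created_at"])


orders = load_orders(io.StringIO(SAMPLE_CSV))

print(orders.shape)                       # number of rows and columns
print(orders.head())                      # first few rows
print(orders.sample(2, random_state=0))   # a couple of random rows
print(orders.dtypes)                      # data type of each column
orders.info()                             # non-null counts and memory usage
```

With the real dataset, `load_orders` would be given the path to the CSV file instead of the in-memory sample.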
Handled missing and anomalous data and checked for outliers.
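One possible way to carry out this cleaning step is sketched below. The write-up does not say which rules were applied, so everything here is an assumption: the column names, the choice to fill a missing category with "Unknown", the choice to drop rows with a missing or non-positive price, and the 1.5 x IQR rule for flagging outliers. The rows are made up for the example.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
orders = pd.DataFrame({
    "price": [500.0, 650.0, 700.0, np.nan, 800.0, 900.0, -50.0, 250000.0],
    "qty_ordered": [1, 2, 1, 1, 1, 3, 1, 1],
    "category": ["A", None, "B", "A", "B", "A", "B", "A"],
})

# 1. Count missing values per column.
missing_per_column = orders.isna().sum()

# 2. Handle missing values: label unknown categories, drop rows without a price.
cleaned = orders.copy()
cleaned["category"] = cleaned["category"].fillna("Unknown")
cleaned = cleaned.dropna(subset=["price"])

# 3. Remove anomalous rows: a price must be positive.
cleaned = cleaned[cleaned["price"] > 0].copy()


def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile([0.25, 0.75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# 4. Flag (rather than delete) price outliers so they can be reviewed.
cleaned["price_outlier"] = iqr_outlier_mask(cleaned["price"])

print(missing_per_column)
print(cleaned)
```

Flagging outliers in a separate column keeps the decision reversible; very expensive orders may be genuine and worth keeping.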
Explored the data and extracted insights using charts and plots such as histograms, bar charts, and line plots.
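The three chart types mentioned above can be produced along these lines. This is only a sketch using plain Matplotlib (the project also used Seaborn); the column names `created_at`, `category`, and `grand_total`, the helper `plot_overview`, and the sample rows are assumptions made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration only; column names are assumptions.
orders = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2017-09-05", "2017-09-18", "2017-10-02",
        "2017-10-21", "2017-11-11", "2017-11-11", "2017-11-12",
    ]),
    "category": [
        "Mobiles & Tablets", "Men's Fashion", "Women's Fashion",
        "Mobiles & Tablets", "Mobiles & Tablets", "Appliances", "Men's Fashion",
    ],
    "grand_total": [18000, 1500, 900, 21000, 16000, 7500, 1200],
})


def plot_overview(df):
    """Draw a histogram, a bar chart, and a line plot side by side."""
    fig, axes = plt.subplots(1, 3, figsize=(15, 4))

    # Histogram: distribution of order values.
    axes[0].hist(df["grand_total"], bins=5)
    axes[0].set_title("Order value distribution")
    axes[0].set_xlabel("Order value")

    # Bar chart: number of orders per category.
    counts = df["category"].value_counts()
    axes[1].bar(counts.index.tolist(), counts.to_numpy())
    axes[1].set_title("Orders per category")
    axes[1].tick_params(axis="x", rotation=45)

    # Line plot: total order value per calendar month.
    monthly = df.groupby(df["created_at"].dt.to_period("M"))["grand_total"].sum()
    axes[2].plot(monthly.index.astype(str).tolist(), monthly.to_numpy(), marker="o")
    axes[2].set_title("Monthly order value")

    fig.tight_layout()
    return fig


fig = plot_overview(orders)
# fig.savefig("eda_overview.png")  # uncomment to write the figure to disk
```

In a notebook, the `matplotlib.use("Agg")` line can be dropped and the figure displayed inline.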
Programming Language: Python
Libraries:
- NumPy
- pandas
- Matplotlib
- Seaborn
Key insights:
- Mobiles & Tablets is the best-selling category, along with Men's Fashion, Women's Fashion, and Appliances.
- November has the highest sales, driven by sales events and campaigns such as 11.11 (Giyara-Giyara).
- Sales are also high in May and June due to Ramadan and Eid.
- Orders paid online have a high cancellation rate.
- Easypaisa and Banks/Cards/Wallets have the highest cancellation rates.
- Cash on Delivery orders are the most successful, with most of them marked as completed.
- Men's and Women's Fashion are both successful categories, as they have more completed orders than cancellations and refunds.
- Mobiles & Tablets is also a good category, but it carries a high risk.
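Findings like the ones listed above typically come from a few group-by aggregations. The sketch below shows one way to compute them. It is an illustration under stated assumptions: the column names, the status labels (`complete`, `canceled`), the payment method labels (`cod`, `Easypay`, `card`), and the eight rows are all made up, so the printed numbers do not reproduce the project's results.

```python
import pandas as pd

# Made-up rows for illustration only; names and labels are assumptions.
orders = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2017-05-20", "2017-06-15", "2017-08-01", "2017-11-11",
        "2017-11-11", "2017-11-12", "2017-11-24", "2017-03-03",
    ]),
    "category": [
        "Mobiles & Tablets", "Men's Fashion", "Women's Fashion", "Appliances",
        "Mobiles & Tablets", "Men's Fashion", "Mobiles & Tablets", "Women's Fashion",
    ],
    "payment_method": ["cod", "cod", "cod", "cod", "Easypay", "Easypay", "card", "card"],
    "status": [
        "complete", "complete", "complete", "canceled",
        "canceled", "canceled", "canceled", "complete",
    ],
    "grand_total": [20000, 1500, 1000, 9000, 18000, 1200, 22000, 800],
})

# Total order value per category, largest first (all statuses included).
top_categories = (
    orders.groupby("category")["grand_total"].sum().sort_values(ascending=False)
)

# Total order value per calendar month (1 = January, ..., 12 = December).
monthly_sales = orders.groupby(orders["created_at"].dt.month)["grand_total"].sum()

# Share of each order status within every payment method (rows sum to 1).
status_share = pd.crosstab(
    orders["payment_method"], orders["status"], normalize="index"
)

# Cancellation rate per payment method, highest first.
cancellation_rate = status_share["canceled"].sort_values(ascending=False)

print(top_categories)
print(monthly_sales)
print(cancellation_rate)
```

The same `status_share` table, computed with `category` in place of `payment_method`, gives the completion versus cancellation comparison per category.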