This project demonstrates a comprehensive analysis of supply chain data using SQL. It covers a range of insights such as order trends, customer behavior, item popularity, and advanced classifications. The dataset includes sales data (sales_test.csv
) and cancellations data (canceled_test.csv
). The queries are categorized into Easy, Intermediate, and Advanced levels based on complexity.
- Basic Analytics: Total orders, unique customers, and average items ordered.
- Customer Insights: Top customers, canceled orders, and successful order rates.
- Item Trends: Top-ordered items, ABC classification, and canceled vs. shipped items.
- Advanced Techniques: Cumulative sales, order rankings, and customer contributions to sales.
- Database: MySQL
- Programming Language: SQL
- Dataset:
sales_test.csv
andcanceled_test.csv
- Deployment: Available on GitHub for exploration and learning.
- Orders in January 2017: Count of orders placed in January 2017.
- Units Ordered in February 2017: Total number of units ordered.
- Canceled Orders per Customer: Number of canceled orders for each customer.
- Unique Customers: Count of unique customers.
- Average Items Ordered: Average number of items ordered per order.
- Top 5 Ordered Items: List of the most ordered items.
- Successful Orders for Specific Customers: Total successful orders by selected customers.
- Units Ordered vs. Canceled: Comparison of orders and cancellations for items appearing in both datasets.
- Canceled vs. Successful Orders: A detailed comparison for the same items.
- Order Classification: Categorization of orders as 'High', 'Medium', or 'Low' based on volume.
- Shipped Items Percentage: Calculation of shipped items as a percentage of ordered items for each customer.
- Top 3 Customers with Canceled Orders: Identification of customers with the highest cancellations.
- Items Canceled More than Shipped: Items with more cancellations than shipments.
- Top Ordering Customer in January 2017: Identification of the customer with the most orders.
- Cumulative Ordered Units: Cumulative total of ordered units ranked by date for each customer.
- Top 3 Customers by Canceled Orders: Analysis of customers with the most cancellations and their total sales.
- Top Customers' Contribution to Sales: Percentage contribution of top 5 customers to total sales.
- ABC Classification: Classification of items into A, B, and C categories based on sales contribution.
Contains details about orders placed, including:
ORDER_NO
: Order IDCUSTOMER_NO
: Customer IDITEM
: Item orderedNS_ORDER
: Number of items orderedNS_SHIP
: Number of items shippedDATE
: Date of the order
Contains information on cancellations, including:
CUSTOMER_NO
: Customer IDITEM
: Item canceledNC_ORDER
: Number of items canceledNC_SHIP
: Number of canceled items shipped
- Understanding how frequently orders are canceled and which items are more likely to be canceled.
- Determining service level performance for different customers and identifying top performers.
- Performing an ABC classification to prioritize items based on sales volume.
- Identifying peak sales and cancellation days, helping the business make informed operational decisions.
- Clone the repository:
git clone https://github.com/heyamay/SQL-Supply-Chain-Analysis.git
- Load the datasets (sales_test.csv and canceled_test.csv) into your SQL database.
- Execute the provided SQL queries in your SQL environment.
- Review the results and insights generated.
I am a Data Analytics Enthusiast and Data science practitioner