The primary objective of this project is to implement customer segmentation, leveraging RFM modeling and K-Means clustering. The ultimate goal is to enhance targeted marketing efforts and establish personalized customer relationship management strategies.
-
Description:
-
The dataset encompasses all transactions from a UK-based and registered, non-store online retail entity, spanning the period from 01/12/2009 to 09/12/2011. The company specializes in unique all-occasion gift-ware, with a considerable portion of its clientele consisting of wholesalers.
-
The dataset has been meticulously curated to facilitate customer segmentation. The segmentation is achieved through RFM modeling, allowing businesses to categorize customers based on their transaction behavior. Further refinement is achieved using K-Means clustering, enabling more precise targeting and personalized marketing strategies.
-
The dataset contains over 1.05 million records with 8 features.
-
-
Attributes:
- InvoiceNo: Nominal. A 6-digit integral number uniquely assigned to each transaction. A code starting with 'c' indicates a cancellation.
- StockCode: Nominal. A 5-digit integral number uniquely assigned to each distinct product.
- Description: Nominal. Product name.
- Quantity: Numeric. The quantity of each product per transaction.
- InvoiceDate: Numeric. Date and time when a transaction was generated.
- UnitPrice: Numeric. Product price per unit in sterling (£).
- CustomerID: Nominal. A 5-digit integral number uniquely assigned to each customer.
- Country: Nominal. The name of the country where a customer resides.
-
Data Analysis, Preprocessing, Engineering:
- Addressed duplicates and missing (null) values, taking into account their impact on the business analysis.
- Conducted in-depth analysis, including understanding country-wise import/export patterns, customer distribution by continent, and comparisons between cancelled and non-cancelled orders.
- Analyzed the top 10 products in each country/continent for strategic business insights.
- Performed logical data cleaning and transformation.
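The cleaning steps above might look roughly like the following pandas sketch. The sample rows are made up, and the specific choices (dropping rows without a CustomerID, the `IsCancelled` column name, keeping only positive quantities for sales) are assumptions for illustration rather than the project's confirmed pipeline.

```python
import pandas as pd

# Made-up sample rows using the column names from the attribute list.
raw = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "C536379", "536380", "536381"],
    "StockCode": ["85123", "85123", "22632", "22633", "21730"],
    "Quantity": [6, 6, -1, 3, 2],
    "UnitPrice": [2.55, 2.55, 1.85, 1.85, 4.25],
    "CustomerID": [17850.0, 17850.0, 14527.0, None, 13047.0],
})

# Remove exact duplicate rows.
clean = raw.drop_duplicates()

# Assumption: rows without a CustomerID cannot be segmented, so drop them.
clean = clean.dropna(subset=["CustomerID"])

# Flag cancellations by the invoice prefix (case-insensitive).
clean = clean.assign(
    CustomerID=clean["CustomerID"].astype(int),
    IsCancelled=clean["InvoiceNo"].astype(str).str.upper().str.startswith("C"),
)

# Keep completed sales separately from cancelled orders for later comparison.
sales = clean[~clean["IsCancelled"] & (clean["Quantity"] > 0)]
print(sales)
```

Keeping the cancellation flag as a column, instead of discarding those rows outright, leaves the cancelled versus non-cancelled comparison possible later on.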
-
RFM Modeling:
- RFM stands for Recency, Frequency, and Monetary Value. It's a marketing model that segments customers based on their purchasing patterns.
- Recency: How recently a customer has made a purchase
- Frequency: How often a customer makes a purchase
- Monetary value: How much money a customer spends on purchases
- Calculated RFM values for each customer.
- Assigned RFM segments to customers.
- Assigned loyalty levels to customers based on RFM analysis for targeted marketing.
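As a minimal sketch of the RFM steps listed above, the pandas example below computes Recency, Frequency, and Monetary value per customer, scores each on a 1 to 4 scale, and maps the combined score to a loyalty level. The sample data, the `TotalPrice` and `RFMScore` column names, the quartile scoring, and the loyalty thresholds are all assumptions made for illustration; the project's actual cut-offs are not stated here.

```python
import pandas as pd

# Made-up transactions using the column names from the attribute list.
df = pd.DataFrame({
    "InvoiceNo": ["1001", "1002", "1003", "1004", "1005", "1006"],
    "CustomerID": [1, 1, 2, 3, 3, 4],
    "InvoiceDate": pd.to_datetime([
        "2011-12-01", "2011-12-08", "2011-06-15",
        "2011-11-20", "2011-12-05", "2010-01-10",
    ]),
    "Quantity": [10, 5, 2, 20, 10, 1],
    "UnitPrice": [2.0, 4.0, 5.0, 1.5, 3.0, 8.0],
})
df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]

# Assumption: measure recency from one day after the last transaction.
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

rfm = df.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
    Frequency=("InvoiceNo", "nunique"),
    Monetary=("TotalPrice", "sum"),
)

# Score each metric 1-4 by quartile; ranking first avoids duplicate bin edges.
# Lower recency is better, so its labels run in reverse.
rfm["R"] = pd.qcut(rfm["Recency"].rank(method="first"), 4, labels=[4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["Frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["Monetary"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["RFMScore"] = rfm[["R", "F", "M"]].sum(axis=1)

# Hypothetical thresholds mapping the combined score (3-12) to loyalty levels.
rfm["Loyalty"] = pd.cut(
    rfm["RFMScore"],
    bins=[2, 5, 8, 10, 12],
    labels=["Bronze", "Silver", "Gold", "Platinum"],
)
print(rfm)
```

With this scoring, a higher combined score means a more recent, more frequent, higher-spending customer, so Bronze corresponds to the lowest scores.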
-
K-Means Clustering:
- Applied the K-Means clustering algorithm to further refine customer segments.
- Determined the optimal number of clusters based on the RFM data.
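One way the clustering step could be sketched with scikit-learn is shown below. The data is synthetic (three made-up customer groups), and choosing the number of clusters by silhouette score is an assumption; the elbow method on inertia is another common option, and the method actually used in this project is not specified.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic (Recency, Frequency, Monetary) values for three made-up groups.
centers = np.array([[10, 20, 5000], [100, 5, 800], [300, 1, 100]], dtype=float)
spread = np.array([5.0, 0.5, 30.0])
X = np.vstack([rng.normal(loc=c, scale=spread, size=(50, 3)) for c in centers])

# K-Means is distance-based, so put the three metrics on a common scale.
scaled = StandardScaler().fit_transform(X)

# Try several cluster counts and keep the one with the best silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(scaled)
    scores[k] = silhouette_score(scaled, labels)
best_k = max(scores, key=scores.get)

# Fit the final model with the selected number of clusters.
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(scaled)
print(best_k, np.bincount(model.labels_))
```

Real RFM values are usually heavily skewed, so a log transform before scaling is often worth considering; it is left out here to keep the sketch short.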
-
Based on the data analysis, gift-ware exports to regions such as Asia, Africa, and the Middle East are relatively low, potentially influenced by cultural differences and preferences.
-
Europe emerged as the leader in customer distribution by continent, with countries like the United Kingdom, Germany, France, and others prominently involved in gift-ware imports.
-
The best-selling product was identified as the "WHITE HANGING HEART T-LIGHT HOLDER."
-
The dataset revealed a higher number of Bronze-level customers (lower RFM scores), followed by Gold, Silver, and Platinum.
- Customer segmentation based on behavioral patterns and subsequent categorization into loyalty levels have paved the way for more focused and personalized marketing efforts.