Customer Segmentation and Profiling by Modified LRFM Analysis using CLARA Algorithm to Optimize Marketing Strategy
- Introduction
- Problem Analysis
- Goals
- Solution
- Data Source
- Workflow
- Data Preprocessing
- Feature Engineering
- Modelling
- Model Evaluation
- Profiling
- Business Recommendations
- Dashboard
- References
- Important Note
This project aims to enhance the marketing strategy of a company by understanding customer behavior more deeply through customer segmentation and profiling. We utilize a modified LRFM (Length, Recency, Frequency, Monetary) analysis and the CLARA (Clustering Large Applications) algorithm to achieve this.
- Users whose first transaction used a discount show similar or worse retention than users whose first transaction did not.
- The existing discount promotions are not optimal in attracting new customers and retaining them in the long term.
- Understand customer purchasing behavior to optimize promotional strategies for better targeting.
- Increase customer retention and maintain existing customer loyalty.
- Segment customers by purchasing behavior patterns, using the modified LRFM metrics, to support a more personalized promotional strategy.
- Business & Data Understanding
- Data Preprocessing
- Feature Engineering
- Modelling (CLARA)
- Model Evaluation
- Profiling
- Business Recommendation & Solution
- Data Cleaning: Handling missing values, invalid data, and outliers.
- Data Merging: Standardizing column names and formatting data.
- Feature Engineering: Creating and modifying features for clustering.
- Removed transactions with zero gross amount (0.57%).
- Replaced missing discount values with 0 (69.83%).
- Deleted rows with invalid data (4,991 rows).
- Kept outliers to preserve valuable transaction information.
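The first two cleaning steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the field names (`user_id`, `gross_amount`, `discount`) and the sample records are assumptions. The rule used to detect "invalid data" is not stated in this README, so it is left out, and outliers are deliberately not filtered.

```python
# Hypothetical transaction records; the field names are assumptions.
transactions = [
    {"user_id": "u1", "gross_amount": 120.0, "discount": 10.0},
    {"user_id": "u2", "gross_amount": 0.0, "discount": None},
    {"user_id": "u3", "gross_amount": 75.5, "discount": None},
]


def clean_transactions(rows):
    """Drop zero-gross rows and treat a missing discount as 0."""
    cleaned = []
    for row in rows:
        if row["gross_amount"] == 0:
            continue  # zero gross amount: the transaction is removed
        fixed = dict(row)  # copy so the input records stay untouched
        if fixed["discount"] is None:
            fixed["discount"] = 0.0  # missing discount is read as "no discount"
        cleaned.append(fixed)
    return cleaned


cleaned = clean_transactions(transactions)
```

Copying each record before filling the discount keeps the raw data available for later checks, such as recomputing the share of missing values.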
- Length: Time between each user's first and last transaction.
- Recency Score (1/R): Inverse of the time elapsed between the user's last transaction date and the reference date (2025-01-01), so customers who purchased more recently score higher.
- Monetary per Frequency (M/F): Average amount spent per transaction.
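The three features above could be computed per user roughly as follows. This is a sketch under stated assumptions: the record layout, the sample data, and the function name `lrfm_features` are hypothetical, and both Length and R are measured in days, which the README does not specify.

```python
from collections import defaultdict
from datetime import date

REFERENCE_DATE = date(2025, 1, 1)  # reference date given in the feature list

# Hypothetical (user_id, transaction_date, amount) records.
transactions = [
    ("u1", date(2024, 1, 10), 100.0),
    ("u1", date(2024, 6, 10), 50.0),
    ("u1", date(2024, 12, 22), 60.0),
    ("u2", date(2024, 3, 5), 20.0),
]


def lrfm_features(rows, reference_date=REFERENCE_DATE):
    """Return {user_id: {length, recency_score, monetary_per_frequency}}."""
    by_user = defaultdict(list)
    for user_id, tx_date, amount in rows:
        by_user[user_id].append((tx_date, amount))

    features = {}
    for user_id, txs in by_user.items():
        dates = [d for d, _ in txs]
        first, last = min(dates), max(dates)
        recency_days = (reference_date - last).days
        features[user_id] = {
            # Length: days between first and last transaction (unit assumed).
            "length": (last - first).days,
            # 1/R: assumes every transaction predates the reference date,
            # so recency_days is strictly positive.
            "recency_score": 1.0 / recency_days,
            # M/F: total spend divided by the number of transactions.
            "monetary_per_frequency": sum(a for _, a in txs) / len(txs),
        }
    return features


features = lrfm_features(transactions)
```

A user with a single transaction gets a Length of 0, which is worth keeping in mind when scaling the features before clustering.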
- Target promotional strategies accurately.
- Identify customers at risk of churn and the most valuable customers.
Algorithm | Robust to Outliers | Handles Large Datasets | Time Complexity | Low Dimension Performance |
---|---|---|---|---|
CLARA | Yes | Yes | Medium | Good |
K-Means | No | Yes | Low | Excellent |
DBSCAN | Yes | Moderate | High | Good |
K-Medoids | Yes | No | High | Good |
- Handles large datasets efficiently.
- Provides robust clustering results due to sampling.
- Performance can be sensitive to the choice of parameters.
- May not be as accurate on small datasets due to its reliance on sampling.
- A modification of the PAM (Partitioning Around Medoids) algorithm.
- Uses medoids as cluster centers and sampling methods for efficiency with large datasets.
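To make the idea concrete, here is a simplified, pure-Python sketch of CLARA: run a PAM-style medoid search on several random samples, then keep the medoid set with the lowest cost over the full dataset. It is illustrative only and is not the implementation used in this project. The function names are hypothetical, the parameter names `n_sampling` and `n_sampling_iter` are only modeled on common library naming, and the medoids are initialized at random rather than with `build` or `k-medoids++`.

```python
import math
import random


def total_cost(points, medoids):
    """Sum of distances from each point to its nearest medoid."""
    return sum(min(math.dist(p, m) for m in medoids) for p in points)


def pam(sample, k, rng):
    """Simplified PAM: accept swaps until none lowers the sample cost."""
    medoids = rng.sample(sample, k)  # random start (an assumption)
    best = total_cost(sample, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for candidate in sample:
                if candidate in medoids:
                    continue
                trial = medoids[:i] + [candidate] + medoids[i + 1:]
                cost = total_cost(sample, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return medoids


def clara(points, k, n_sampling, n_sampling_iter=5, seed=42):
    """Run PAM on random samples; keep the medoids that best fit all points."""
    rng = random.Random(seed)
    best_medoids, best_cost = None, math.inf
    for _ in range(n_sampling_iter):
        sample = rng.sample(points, min(n_sampling, len(points)))
        medoids = pam(sample, k, rng)
        cost = total_cost(points, medoids)  # scored on the full dataset
        if cost < best_cost:
            best_medoids, best_cost = medoids, cost
    # Assign every point to the index of its nearest medoid.
    labels = [
        min(range(k), key=lambda j: math.dist(p, best_medoids[j]))
        for p in points
    ]
    return best_medoids, labels


# Toy data: two well-separated blobs of 60 points each.
data_rng = random.Random(0)
points = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(60)]
points += [(data_rng.gauss(10, 0.5), data_rng.gauss(10, 0.5)) for _ in range(60)]

medoids, labels = clara(points, k=2, n_sampling=20)
```

Because the expensive swap search only ever sees `n_sampling` points, the cost of each PAM run is independent of the dataset size. The full data is only touched when scoring a candidate medoid set and when assigning the final labels.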
Default parameters:
n_clusters = 8
init = 'build'
n_sampling = None
Tuned parameters:
n_clusters = 4
n_sampling = 300
init = 'k-medoids++'
random_state = 42
Silhouette Score: 0.652
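For context on how a silhouette score is computed, the sketch below implements the standard definition with Euclidean distance: for each point, `a` is the mean distance to the other members of its own cluster, `b` is the smallest mean distance to any other cluster, and the point's score is `(b - a) / max(a, b)`. The function and the toy data are illustrative assumptions and do not reproduce the 0.652 reported above.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it does not affect the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: two tight pairs that are far apart.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(toy_points, [0, 0, 1, 1])  # natural grouping
bad = silhouette_score(toy_points, [0, 1, 0, 1])   # groups split across pairs
```

Scores close to 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to a neighboring cluster than to their own.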
- At Risk (413 customers): Long-lasting, medium spenders, last transaction a long time ago.
- Potential Loyalist (452 customers): Long-lasting, small spenders, most recent transactions.
- Lost (111 customers): Short-term, very small spenders, last transaction a long time ago.
- Loyal (453 customers): Long-lasting, medium spenders, most recent transactions.
Cluster Name | Number of Customers | Description |
---|---|---|
At Risk | 413 | Long-lasting, medium spenders, last transaction a long time ago |
Potential Loyalist | 452 | Long-lasting, small spenders, most recent transactions |
Lost | 111 | Short-term, very small spenders, last transaction a long time ago |
Loyal | 453 | Long-lasting, medium spenders, most recent transactions |
- At Risk: Re-engage with personalized offers.
- Potential Loyalist: Encourage repeat purchases with targeted campaigns.
- Lost: Reactivate with exclusive discounts and limited-time offers.
- Loyal: Reward loyalty with community-building initiatives and early access programs.
- Special treatment to make customers feel valued, such as early access to new products or exclusive member offers.
- Personalized offers, nostalgic campaigns, and customer education.
- Exclusive large discounts and time-limited offers to entice customers to return.
- Community-building initiatives, customer appreciation programs, and early access to new collections.
- Zhang, C., Huang, W., Niu, T. et al. (2023). Review of Clustering Technology and Its Application in Coordinating Vehicle Subsystems. Automot. Innov., 6, 89–115. https://doi.org/10.1007/s42154-022-00205-0
All datasets and project results are used solely for educational purposes and do not reflect actual values. Please do not use this project as a reference or recommendation.
Thank you for checking out our project! We hope this work contributes to the optimization of marketing strategies through effective customer segmentation and profiling.