Unsupervised clustering of customer records from a grocery firm's database using K-Means.

arienugroho050396/Customer-Personality-Analysis

here you can download the cheatsheet

Introduction

For the given Kaggle dataset, I perform exploratory data analysis together with customer segmentation. The segmentation is carried out with the K-Means algorithm. At the end of the analysis, I answer several questions using the insights gained from the data.

Technical

  • Language : Python (filetype: .ipynb)

Content

The goal is to cluster the customers and summarize the resulting customer segments.

Data Field

People :

| Variable Name | Description |
| --- | --- |
| ID | Customer's unique identifier |
| Year_Birth | Customer's birth year |
| Education | Customer's education level |
| Marital_Status | Customer's marital status |
| Income | Customer's yearly household income |
| Kidhome | Number of children in customer's household |
| Teenhome | Number of teenagers in customer's household |
| Dt_Customer | Date of customer's enrollment with the company |
| Recency | Number of days since customer's last purchase |
| Complain | 1 if customer complained in the last 2 years, 0 otherwise |
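
The raw fields above are often easier to analyse after deriving a few features. A minimal sketch, assuming pandas; the rows are made up, and the `Age` and `Children` columns and the `REFERENCE_YEAR` constant are hypothetical names, not taken from the notebook:

```python
import pandas as pd

# Toy rows standing in for the real dataset (values are made up).
people = pd.DataFrame({
    "ID": [1, 2, 3],
    "Year_Birth": [1957, 1984, 1970],
    "Kidhome": [0, 1, 0],
    "Teenhome": [0, 1, 2],
})

# Assumption: ages are measured against a fixed reference year.
REFERENCE_YEAR = 2015

# Age in years at the reference date.
people["Age"] = REFERENCE_YEAR - people["Year_Birth"]

# Total number of children and teenagers in the household.
people["Children"] = people["Kidhome"] + people["Teenhome"]

print(people[["ID", "Age", "Children"]])
```

The reference year matters: choosing a different one shifts every age by the same amount, which can move customers across age-group boundaries.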

Products :

| Variable Name | Description |
| --- | --- |
| MntWines | Amount spent on wine in last 2 years |
| MntFruits | Amount spent on fruits in last 2 years |
| MntMeatProducts | Amount spent on meat in last 2 years |
| MntFishProducts | Amount spent on fish in last 2 years |
| MntSweetProducts | Amount spent on sweets in last 2 years |
| MntGoldProds | Amount spent on gold in last 2 years |
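
Since the conclusions refer to total spending, one plausible way to obtain it is to sum the six `Mnt*` columns per customer. A small sketch with made-up values; the `Spending` column name is an assumption:

```python
import pandas as pd

# Toy rows using the product columns listed above (values are made up).
products = pd.DataFrame({
    "MntWines": [635, 11, 426],
    "MntFruits": [88, 1, 49],
    "MntMeatProducts": [546, 6, 127],
    "MntFishProducts": [172, 2, 111],
    "MntSweetProducts": [88, 1, 21],
    "MntGoldProds": [88, 6, 42],
})

# Select every amount column by its shared "Mnt" prefix.
mnt_cols = [col for col in products.columns if col.startswith("Mnt")]

# Row-wise sum gives each customer's total spending over the period.
products["Spending"] = products[mnt_cols].sum(axis=1)

print(products["Spending"])
```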

Promotion :

| Variable Name | Description |
| --- | --- |
| NumDealsPurchases | Number of purchases made with a discount |
| AcceptedCmp1 | 1 if customer accepted the offer in the 1st campaign, 0 otherwise |
| AcceptedCmp2 | 1 if customer accepted the offer in the 2nd campaign, 0 otherwise |
| AcceptedCmp3 | 1 if customer accepted the offer in the 3rd campaign, 0 otherwise |
| AcceptedCmp4 | 1 if customer accepted the offer in the 4th campaign, 0 otherwise |
| AcceptedCmp5 | 1 if customer accepted the offer in the 5th campaign, 0 otherwise |
| Response | 1 if customer accepted the offer in the last campaign, 0 otherwise |

Place :

| Variable Name | Description |
| --- | --- |
| NumWebPurchases | Number of purchases made through the company's website |
| NumCatalogPurchases | Number of purchases made using a catalogue |
| NumStorePurchases | Number of purchases made directly in stores |
| NumWebVisitsMonth | Number of visits to the company's website in the last month |

Questions This Project Answers

  • What are the statistical characteristics of the customers?
  • What are the spending habits of the customers?
  • Are there products that need more marketing?
  • How can the marketing be made more effective?

Step Inside The Project

  • Exploratory Data Analysis (EDA)
  • Machine Learning Model (K-Means)
  • Cluster Interpretation
  • Customer Distribution
  • Relationship: Income vs. Spending
  • Spending Habits By Clusters
  • Purchasing Habits By Clusters
  • Promotions Acceptance By Clusters
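
The modelling steps listed above could look roughly like the sketch below. It assumes scikit-learn and pandas, and that the model is fit on `Income` and a derived total `Spending` column after standardising both. The synthetic data, the `random_state`, and the rule of naming tiers by mean income are illustrative assumptions, not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic customers: four loose groups of (Income, Spending). Values are made up.
rng = np.random.default_rng(0)
centers = [(20000, 100), (40000, 500), (60000, 1100), (85000, 1900)]
rows = [
    (rng.normal(income, 2000), rng.normal(spending, 40))
    for income, spending in centers
    for _ in range(25)
]
customers = pd.DataFrame(rows, columns=["Income", "Spending"])

# Standardise both features so Income does not dominate the distance metric.
features = StandardScaler().fit_transform(customers[["Income", "Spending"]])

# Fit K-Means with four clusters, matching the four tiers in this README.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
customers["Cluster"] = kmeans.fit_predict(features)

# K-Means labels are arbitrary integers, so rank clusters by mean income
# and map that ranking onto readable tier names.
order = customers.groupby("Cluster")["Income"].mean().sort_values().index
tier_names = dict(zip(order, ["Bronze", "Silver", "Gold", "Platinum"]))
customers["Tier"] = customers["Cluster"].map(tier_names)

print(customers.groupby("Tier")[["Income", "Spending"]].mean().round(0))
```

On real data the number of clusters would normally be checked first, for example with an elbow plot of `kmeans.inertia_` over several values of `n_clusters`.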

Conclusion

  • Most of the customers are university graduates.
  • Most of the customers are living with partners.
  • Those living alone have spent more than those living with partners.
  • Most of the customers have only one child.
  • Customers without children have spent more than those with children.
  • Middle Age Adults, aged between 40 and 60, are the largest age group.
  • Middle Age Adults spend more on average than the other age groups.
  • Most of the customers earn between 25,000 and 85,000.
  • Wine and meat products are the most popular among the customers.
  • On the basis of income and total spending, customers are divided into 4 clusters: Platinum, Gold, Silver and Bronze.
  • Most of the customers fall into the Silver and Gold categories.
  • Customers who earn more also spend more.
  • Most of the customers prefer to buy in store, followed by online purchases through the website.
  • Platinum customers showed the highest acceptance of promotion campaigns, while Bronze customers showed the least interest.
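
A comparison of promotion acceptance across tiers, like the one in the last point, can be computed with a group-by. The sketch below uses made-up rows chosen only to show the mechanics, so it does not reproduce the project's results; the `Tier` and `Total_Accepted` column names are assumptions.

```python
import pandas as pd

# Made-up rows: a tier label plus the campaign flags for eight customers.
df = pd.DataFrame({
    "Tier": ["Platinum", "Platinum", "Gold", "Gold",
             "Silver", "Silver", "Bronze", "Bronze"],
    "AcceptedCmp1": [1, 0, 0, 0, 0, 0, 0, 0],
    "AcceptedCmp2": [0, 0, 0, 0, 0, 0, 0, 0],
    "AcceptedCmp3": [1, 1, 0, 1, 0, 0, 0, 0],
    "AcceptedCmp4": [0, 1, 1, 0, 0, 0, 0, 0],
    "AcceptedCmp5": [1, 1, 0, 0, 0, 0, 0, 0],
    "Response":     [1, 0, 1, 0, 1, 0, 0, 0],
})

# Count how many of the six campaigns each customer accepted.
campaign_cols = [f"AcceptedCmp{i}" for i in range(1, 6)] + ["Response"]
df["Total_Accepted"] = df[campaign_cols].sum(axis=1)

# Average number of accepted campaigns per tier, highest first.
acceptance = df.groupby("Tier")["Total_Accepted"].mean().sort_values(ascending=False)
print(acceptance)
```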

Answering the Questions

What are the statistical characteristics of the customers?
The company's customers are mostly married. Middle Age Adults, aged between 40 and 60, are the largest age group, and most customers have one child. Most of the customers hold a bachelor's degree, and their earnings are mostly between 25,000 and 85,000.
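
Age groups such as the 40 to 60 band can be assigned with `pandas.cut`. In this sketch only the "Middle Age Adult" range comes from the text above; the other group names, the outer bin edges, and treating 40 as inclusive and 60 as exclusive are assumptions.

```python
import pandas as pd

# Example ages (made up).
ages = pd.Series([23, 35, 40, 52, 59, 67], name="Age")

# Left-closed bins: [0, 40), [40, 60), [60, 120).
bins = [0, 40, 60, 120]
labels = ["Young Adult", "Middle Age Adult", "Senior Adult"]
age_group = pd.cut(ages, bins=bins, labels=labels, right=False)

# Number of customers in each age group.
print(age_group.value_counts())
```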

What are the spending habits of the customers?
Customers have spent the most on wine and meat products. Those without children have spent more than those with children. Singles spend more than customers with partners. Middle Age Adults have spent more than the other age groups. Store shopping is the preferred purchasing channel among the customers. Web and catalog purchasing also have potential.
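
Channel preference can be checked by totalling the purchase counts per channel. A small sketch with made-up numbers, shown only to illustrate the calculation:

```python
import pandas as pd

# Purchase counts per customer and channel (values are made up).
place = pd.DataFrame({
    "NumWebPurchases": [8, 1, 8, 2],
    "NumCatalogPurchases": [10, 1, 2, 0],
    "NumStorePurchases": [4, 2, 10, 4],
})

# Total purchases per channel, most used channel first.
channel_totals = place.sum().sort_values(ascending=False)

# Each channel's share of all purchases.
channel_share = channel_totals / channel_totals.sum()

print(channel_share.round(2))
```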

Are there some products which need more marketing?
Sweets and fruits need more effective marketing. The company should run promotions for these products to increase the revenue they bring in. Baskets that combine the least-selling products with the best-selling ones could be effective.
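
One way to spot under-performing categories is to compare each product's share of total revenue. The sketch below uses made-up amounts; picking the two lowest-share categories as promotion candidates is an illustrative rule, not the notebook's method.

```python
import pandas as pd

# Amount spent per customer and product category (values are made up).
spend = pd.DataFrame({
    "MntWines": [600, 300, 100],
    "MntFruits": [20, 20, 10],
    "MntMeatProducts": [300, 150, 50],
    "MntFishProducts": [60, 30, 10],
    "MntSweetProducts": [30, 20, 10],
    "MntGoldProds": [100, 50, 40],
})

# Revenue per category and its share of total revenue, lowest first.
totals = spend.sum()
share = (totals / totals.sum()).sort_values()

# The two categories with the smallest share are candidates for promotion.
lowest_two = share.index[:2].tolist()
print(share.round(3))
print(lowest_two)
```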

How can the marketing be made more effective?
As a marketing recommendation, give coupons to old and high-spending customers. Market the cheap and on-offer products to low-income and low-spending customers. Web purchasing has some potential; to unlock it, give special discounts to customers who sign up on the company's website.
