Are your customers at high risk of canceling their services or switching products? Whom should you target in your next ad campaign? And which is your most valuable growing client base? These are just a few of the questions that can be answered using your own readily available data.
Segmentation divides markets and customers into groups that share common characteristics. Once customers are placed into these “clusters”, we can target each group by treating it as representative of a certain type of customer. This allows for optimized profitability through customized marketing campaigns, service offerings, and product targeting.
Consider the following dataset. Suppose we have a product which we sell to various retail chains. For every retail chain that we sell to, we know its location inside the US, the quantity of our products that it purchased, the chain’s total sales volume over the past year, the number of customers it has, and the length of time it has been in business. Notice that some of the data may be missing from our files, as is the case below.
| Customer | Location | Products purchased | Sales volume | Number of customers | Years in business |
There are numerous methods available to cluster our data, such as k-means clustering, hierarchical clustering, or matrix factorization. However, because the data contains a mixture of numerical and categorical features, hierarchical clustering is a natural choice: it works from a matrix of pairwise distances, and such a matrix can be computed for mixed data types.
Hierarchical clustering looks at a distance matrix and groups customers together into a “hierarchy” of clusters. This can be done in a “bottom-up” or a “top-down” manner: either each observation starts in its own cluster and is gradually merged with its nearest neighbors, or all observations start in one large group that is split recursively.
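To make the two directions concrete, here is a minimal sketch using the `agnes` (bottom-up) and `diana` (top-down) functions from R's `cluster` package. The six customers and their values are invented for illustration, and only numeric features are used to keep the example short.

```r
library(cluster)  # provides agnes() (bottom-up) and diana() (top-down)

# Hypothetical numeric features for six customers, for illustration only
x <- data.frame(
  sales_volume      = c(9.5, 1.2, 1.5, 8.7, 9.1, 1.1),
  years_in_business = c(25, 3, 4, 30, 22, 2)
)
d <- dist(scale(x))  # Euclidean distance on standardized features

bottom_up <- agnes(d, method = "ward")  # agglomerative: merge clusters step by step
top_down  <- diana(d)                   # divisive: split one large cluster recursively

# Both results convert to hclust objects, so they can be cut the same way
cut_bottom_up <- cutree(as.hclust(bottom_up), k = 2)
cut_top_down  <- cutree(as.hclust(top_down),  k = 2)
cut_bottom_up
cut_top_down
```

On data with well-separated groups the two approaches tend to agree; on messier data they can produce different hierarchies, which is worth checking before committing to one.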
Our first step is to compute the distance matrix for our customers, where each distance is some measure of the difference between a pair of customers. To do this we can use Gower's method, which handles mixed data types and is available through the daisy function in the cluster package in R. From here, we can perform Ward's hierarchical clustering:
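library(cluster)  # provides daisy()
dist <- daisy(customer_data, metric = "gower")  # Gower dissimilarity for mixed data types
fit <- hclust(dist, method = "ward.D2")  # Ward's criterion; the older "ward" option is now called "ward.D"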
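To see what the distance step produces, the sketch below builds a small, entirely hypothetical `customer_data` frame (the column names and values are assumptions, not the dataset described above) and prints its Gower dissimilarity matrix.

```r
library(cluster)  # ships with standard R distributions; provides daisy()

# Hypothetical data, invented purely for illustration
customer_data <- data.frame(
  location           = factor(c("NY", "CA", "CA", "TX", "NY", "TX")),
  products_purchased = c(120, 15, 20, NA, 110, 18),
  sales_volume       = c(9.5, 1.2, 1.5, 8.7, NA, 1.1),
  num_customers      = c(5000, 300, 450, 4200, 4800, 250),
  years_in_business  = c(25, 3, 4, 30, 22, 2),
  row.names          = paste0("Chain", 1:6)
)

# Gower dissimilarity: each feature contributes a value in [0, 1], and
# features that are missing for a pair are left out of that pair's average
gower_dist <- daisy(customer_data, metric = "gower")

# Inspect the pairwise dissimilarities as an ordinary matrix
round(as.matrix(gower_dist), 2)
```

Values near 0 mean two chains look alike across the available features, and values near 1 mean they differ on almost everything.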
In both top-down and bottom-up approaches, once the data is fit, a dendrogram of the results can be plotted to show the overall structure of our data.
Longer lines in our dendrogram indicate a greater distance between clusters. Notice that the algorithm builds the complete hierarchy, so we are free to choose where to cut the tree and therefore how many clusters to keep. This choice must be tuned manually, using validation tests to confirm your hypotheses. For this case, we have decided on three clusters.
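The plotting and cutting steps might look like the sketch below. The eight customers are hypothetical, and the average silhouette width is shown as just one possible validation check, not necessarily the test used for the example above.

```r
library(cluster)  # provides daisy() and silhouette()

# Hypothetical customers, invented for illustration only
customer_data <- data.frame(
  location          = factor(c("NY", "CA", "CA", "TX", "NY", "TX", "CA", "NY")),
  products          = c(120, 15, 20, 95, 110, 18, 60, 55),
  sales_volume      = c(9.5, 1.2, 1.5, 8.7, 9.1, 1.1, 4.0, 4.4),
  years_in_business = c(25, 3, 4, 30, 22, 2, 12, 14)
)

d   <- daisy(customer_data, metric = "gower")
fit <- hclust(d, method = "ward.D2")

plot(fit, main = "Customer dendrogram")  # tree of merges; height = merge distance
rect.hclust(fit, k = 3, border = "red")  # outline the three-cluster cut

clusters <- cutree(fit, k = 3)           # cluster label for each customer
table(clusters)

# One possible validation check: average silhouette width for k = 2..5
avg_sil <- sapply(2:5, function(k) {
  mean(silhouette(cutree(fit, k = k), d)[, "sil_width"])
})
names(avg_sil) <- 2:5
avg_sil
```

A higher average silhouette width suggests tighter, better-separated clusters, so comparing the values across k gives a rough sanity check on the chosen cut.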
Understanding Your Clusters
Once we have settled on the number of clusters to use, we can investigate the characteristics that each cluster represents. For example, we notice a few things: Cluster 3 has the three accounts with the highest sales volume; Cluster 1 has accounts with similar sales volumes and numbers of customers; and Cluster 2 represents newer businesses with fewer purchased products.
By profiling our groups we gain insight into our client base so that we can target each group separately while still maintaining economies of scale. In this case, we may wish to use our sales team to target Cluster 2, the group of clients who have bought the fewest products and who have the greatest potential for growth.
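The profiling described above can be as simple as averaging each feature within each cluster. The sketch below assumes cluster labels have already been assigned; the data frame and its values are hypothetical, chosen only to mirror the pattern described in the text.

```r
# Hypothetical customers with cluster labels already assigned (illustrative values)
profile_data <- data.frame(
  cluster           = c(1, 1, 2, 2, 2, 3, 3, 3),
  products          = c(60, 55, 15, 20, 18, 120, 95, 110),
  sales_volume      = c(4.0, 4.4, 1.2, 1.5, 1.1, 9.5, 8.7, 9.1),
  years_in_business = c(12, 14, 3, 4, 2, 25, 30, 22)
)

# Mean of every numeric feature within each cluster
cluster_profile <- aggregate(. ~ cluster, data = profile_data, FUN = mean)
cluster_profile

# Cluster sizes, useful for judging how much weight each profile deserves
table(profile_data$cluster)
```

Reading across a row of `cluster_profile` gives a quick portrait of a segment, which is usually enough to decide which groups deserve a closer look.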
As an extension to our example above, a common scenario arises whenever a new customer signs up for our product: we wish to know which of our existing customers they are most similar to. A powerful technique is to fit a multinomial logistic regression model that predicts the cluster to which the new customer belongs. This immediately allows you to provide a more tailored service to your customers.
Customer segmentation is a powerful, data-driven approach to better understand your market. Armed with these simple tools, you can make quick and meaningful improvements to your business.
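One way to sketch this in R is with `multinom` from the `nnet` package. The training data below is simulated and the feature names are assumptions; in practice the cluster labels would come from the segmentation itself.

```r
library(nnet)  # ships with standard R distributions; provides multinom()

set.seed(42)
# Hypothetical training data: existing customers and their assigned clusters
n <- 30
train <- data.frame(
  cluster           = factor(rep(1:3, each = n)),
  products          = c(rnorm(n, 55, 8), rnorm(n, 18, 4), rnorm(n, 110, 10)),
  years_in_business = c(rnorm(n, 13, 3), rnorm(n, 3, 1),  rnorm(n, 25, 4))
)

# Multinomial logistic regression: cluster membership as a function of the features
model <- multinom(cluster ~ products + years_in_business,
                  data = train, trace = FALSE)

# A hypothetical new customer
new_customer <- data.frame(products = 20, years_in_business = 2)

predicted_cluster <- predict(model, newdata = new_customer)                  # most likely cluster
cluster_probs     <- predict(model, newdata = new_customer, type = "probs")  # probability per cluster
predicted_cluster
cluster_probs
```

The predicted probabilities are often more useful than the single label, since a new customer who sits between two clusters may warrant a blended approach.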