Recommender systems are reshaping the business world, especially e-commerce, in an unprecedented manner. They learn patterns of behavior to predict a person's preferences for items they have not yet experienced. Over the past few years, thousands of companies have used recommender systems to help their customers find products to purchase, new movies to watch, songs to listen to, and even people they should interact with.
The application of recommender systems that we will focus on is product up-sell/cross-sell. The goal is to suggest upgraded and/or additional products for customers to purchase. If our recommendations are successful, the average order size per customer should increase, thus increasing revenue. A simple example of this is when a customer purchases a new iPod. A common cross-sell would be to suggest also buying new headphones.
Consider the following dataset: suppose we sell various products to a set of clients. For each product we have pricing and feature information, and for each customer we know their demographics and purchase history. Upon checkout from our site, we wish to recommend the additional products that the customer is most likely to purchase.
| Customer | Gender | Age | Product purchased | Product price | Product type |
There are numerous ways to build a recommender system; we will focus on two common techniques: user-based and item-based collaborative filtering (CF). In user-based CF, the prediction for the active user (the user for whom the prediction is made) can be summarized as:
- Look for users who share similar purchase patterns with the active user.
- Using the purchases of those similar users, calculate the prediction for the active user.
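The two steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the customers, products, and the function names `user_similarity` and `recommend_user_based` are made up for the example, and cosine similarity on purchase sets is just one of several reasonable similarity measures.

```python
import math

# Hypothetical toy data: each user maps to the set of products they bought.
purchases = {
    "alice": {"ipod", "headphones", "case"},
    "bob": {"ipod", "headphones"},
    "carol": {"ipad", "stylus"},
    "dave": {"ipod", "case", "charger"},
}

def user_similarity(a, b):
    """Cosine similarity between two purchase sets (treated as binary vectors)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def recommend_user_based(active, purchases, k=2, n=3):
    """Recommend up to n unseen items using the k most similar users."""
    own = purchases[active]
    # Step 1: find the k users whose purchases are most similar.
    neighbors = sorted(
        ((user_similarity(own, items), user)
         for user, items in purchases.items() if user != active),
        reverse=True,
    )[:k]
    # Step 2: add up similarity-weighted votes for items not yet purchased.
    scores = {}
    for sim, user in neighbors:
        for item in purchases[user] - own:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

print(recommend_user_based("bob", purchases))
```

With this toy data, bob's nearest neighbors are alice and dave, so the case (bought by both) ranks ahead of the charger (bought only by dave).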
Item-based collaborative filtering makes predictions for the active user by determining the relationship between sets of items:
- Build an item-item matrix determining relationships between pairs of items.
- Predict the preferences of the active user by examining the matrix and matching that user's data.
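A rough sketch of these two steps follows. To keep it short, it assumes the "relationship" between two items is simply how often they were bought together; cosine similarity is another common choice. The baskets and the names `build_item_matrix` and `recommend_item_based` are hypothetical.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical toy baskets: one set of purchased products per customer.
baskets = [
    {"ipod", "headphones"},
    {"ipod", "headphones", "case"},
    {"ipod", "case"},
    {"ipad", "stylus"},
]

def build_item_matrix(baskets):
    """Step 1: count how often each ordered pair of items is bought together."""
    matrix = defaultdict(lambda: defaultdict(int))
    for basket in baskets:
        for a, b in permutations(basket, 2):
            matrix[a][b] += 1
    return matrix

def recommend_item_based(owned, matrix, n=2):
    """Step 2: score unseen items by their co-occurrence with the user's items."""
    scores = defaultdict(int)
    for item in owned:
        for other, count in matrix[item].items():
            if other not in owned:
                scores[other] += count
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))[:n]

item_matrix = build_item_matrix(baskets)
print(recommend_item_based({"ipod"}, item_matrix))
```

Note that the item-item matrix is built once from all customers and can then be reused for every active user, which is part of what makes the item-based approach attractive when there are many more users than items.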
Item-based CF example
To illustrate how item-based CF makes recommendations we will use the publicly available data for LastFM. The data set contains information about users and which artists they have listened to. Each row represents a user, and each column an artist. For example, the first few rows and columns of the data set look like:
We can now calculate the similarity of each artist with all the others by computing the "cosine similarity" of each column in our data set with every other column.
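The cosine similarity of two vectors is their dot product divided by the product of their lengths. It ranges from 1 (pointing in exactly the same direction) to -1 (exactly opposite), with 0 indicating no similarity. Because listening data is never negative, the scores here fall between 0 and 1.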
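As a small illustration of the measure, here is one way to compute it in plain Python. The function name `cosine_similarity` and the choice to return 0.0 for an all-zero vector are assumptions made for this sketch.

```python
import math

def cosine_similarity(u, v):
    """Return dot(u, v) / (|u| * |v|), or 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

print(cosine_similarity([1, 0, 1], [1, 0, 1]))  # same direction: close to 1
print(cosine_similarity([1, 0], [0, 1]))        # orthogonal: 0
print(cosine_similarity([1, 2], [-1, -2]))      # opposite direction: close to -1
```

Floating-point rounding means the first and last results may differ from 1 and -1 in the final decimal places.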
Now, all we have to do is create a matrix with one entry for each item-item pair and fill it with the cosine similarity score between the two items. The output for the first few rows and columns looks like:
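As a rough sketch of this step, the snippet below builds the item-item matrix from a tiny user-by-artist table. The listening data is invented for illustration and is not taken from the LastFM data set; `similarity_matrix` is a hypothetical helper name.

```python
import math

# Hypothetical listening matrix: rows are users, columns are artists (1 = listened).
artists = ["ac/dc", "metallica", "abba"]
listens = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]

def cosine_similarity(u, v):
    """Return dot(u, v) / (|u| * |v|), or 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_matrix(rows):
    """Compare every column (item) of the user-by-item table with every other."""
    columns = list(zip(*rows))  # transpose: one tuple per artist
    return [[cosine_similarity(a, b) for b in columns] for a in columns]

sim = similarity_matrix(listens)
for name, row in zip(artists, sim):
    print(name, [round(score, 2) for score in row])
```

The resulting matrix is symmetric, and each artist has a similarity of 1 (up to rounding) with itself, so in practice only the off-diagonal entries are of interest.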
Armed with this similarity matrix we are now ready to make recommendations! For each artist, we look at the top 5 neighbors (highest cosine similarity scores) - those would be the recommendations we make for users listening to that artist.
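Picking the nearest neighbors from a similarity matrix could look like the sketch below. The similarity scores are made-up placeholder values, not results from the LastFM data, and `top_neighbors` is a hypothetical helper name.

```python
# Hypothetical similarity scores, standing in for the item-item matrix.
similarity = {
    "ac/dc": {"ac/dc": 1.0, "metallica": 0.8, "iron maiden": 0.7, "abba": 0.1},
    "metallica": {"ac/dc": 0.8, "metallica": 1.0, "iron maiden": 0.75, "abba": 0.05},
    "iron maiden": {"ac/dc": 0.7, "metallica": 0.75, "iron maiden": 1.0, "abba": 0.02},
    "abba": {"ac/dc": 0.1, "metallica": 0.05, "iron maiden": 0.02, "abba": 1.0},
}

def top_neighbors(item, similarity, n=5):
    """Return the n items most similar to `item`, excluding the item itself."""
    scores = similarity[item]
    candidates = [other for other in scores if other != item]
    return sorted(candidates, key=lambda other: scores[other], reverse=True)[:n]

print(top_neighbors("ac/dc", similarity, n=2))
```

Excluding the item itself matters because every item has the highest possible similarity with itself and would otherwise always be its own top recommendation.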
We can now check our recommendations for a few artists:
This means that if you were listening to ac/dc we would recommend red hot chili peppers, metallica, iron maiden, etc. Seems correct to me!
Let us get back to the original problem of recommending new products to our clients upon checkout. We are now able to use item-based CF to recommend products, but other algorithms exist and may even yield better performance. Instead of writing the code ourselves, we can use the recommenderlab package in R to test various CF algorithms and choose the one that yields the best predictions. For example, the plot below shows the performance of both user-based and item-based collaborative filtering on our data set by looking at the true-positive rate (TPR) vs. false-positive rate (FPR) when recommending 1, 3, 5, 10, 15, and 20 new products to customers. The user-based CF curve is consistently higher at each list size, so it yields better performance on our data.
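To make the evaluation metric concrete, here is a sketch of how TPR and FPR can be computed for one customer's top-N list. It follows the standard definitions and is not recommenderlab's implementation; the product identifiers, the held-out purchases, and the function name `tpr_fpr` are hypothetical.

```python
def tpr_fpr(recommended, relevant, catalog):
    """Return (TPR, FPR) for one recommendation list.

    recommended: items shown to the customer
    relevant:    held-out items the customer actually purchased
    catalog:     all items that could have been recommended
    """
    recommended, relevant, catalog = set(recommended), set(relevant), set(catalog)
    tp = len(recommended & relevant)            # recommended and purchased
    fp = len(recommended - relevant)            # recommended but not purchased
    fn = len(relevant - recommended)            # purchased but not recommended
    tn = len(catalog - recommended - relevant)  # neither recommended nor purchased
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr

# Hypothetical catalog of 10 products and a ranked recommendation list.
catalog = [f"p{i}" for i in range(10)]
relevant = {"p0", "p1", "p2", "p3"}
ranked = ["p0", "p5", "p1", "p6", "p2", "p7", "p3", "p8", "p9", "p4"]

for n in (1, 3, 5):
    print(n, tpr_fpr(ranked[:n], relevant, catalog))
```

Recommending more products can only raise both rates, which is why plotting TPR against FPR for increasing list sizes traces out a curve; an algorithm whose curve lies higher finds more of the relevant products for the same share of irrelevant ones.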
Therefore, using user-based CF, we can now recommend the most relevant products for each customer. For example, for customer 1, the top two recommended products are an iPod case and an iPad. Although this was a simple example, these same methods and algorithms can be applied across millions of users and products.
Recommender systems are a great way to perform product up-sell and cross-sell to your existing customers. Done well, they can generate more revenue while improving customer retention and overall customer satisfaction. Armed with these simple tools, you can make quick and meaningful improvements to your business.