Using a Product Recommendation Engine
What you are trying to achieve with your product recommendation engine will determine which algorithm or recommendation system you use. Whether you are after quick wins or a system that can learn from a huge data set, product recommendation engines can be applied in many different scenarios by many different businesses.
“A lot of times, people don’t know what they want until you show it to them” – Steve Jobs, former CEO of Apple Inc.
With the troves of data made available by the ubiquity of the internet and the rise of the Internet of Things (IoT), product recommendation engines (PREs) have become a very popular tool for increasing business sales. Product recommendation engines such as those developed by Netflix and Amazon have had huge levels of success, with Netflix estimating that up to 75% of viewer activity is driven by recommendations and Amazon stating that 35% of its revenue is generated by its recommendation engine.
Product recommendation engines provide content personalisation at a time when consumers are demanding more of it than ever. Typically drawing on millions of data points, PREs normally use one of two types of product filtering: collaborative or content-based.
Collaborative System Filtering
Collaborative systems use a filter that identifies the browsing habits of similar users and recommends products it believes the user may be interested in. The motivation behind this type of filtering is the belief that the best recommendations come from people with similar interests. An example of this is Amazon's 'customers who viewed this item also viewed' recommendation.
A common method used for a collaborative filter is the K-Nearest Neighbours (KNN) algorithm. The KNN algorithm classifies items according to their proximity to their nearest neighbours within a multi-dimensional space. Typically, the contributions of neighbours are weighted by their distance from the object. Once the algorithm has run, you can identify the nearest neighbours and use this output to drive your product recommendation engine.
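A minimal sketch of this idea, assuming a small hypothetical table of user ratings: find the k users most similar to the target (cosine similarity here stands in for the distance measure), then score unseen items by the similarity-weighted ratings of those neighbours.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity between two users' rating dicts,
    computed over the items they have both rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def recommend(target, ratings, k=2):
    """Recommend items the target user hasn't rated, scored by the
    similarity-weighted ratings of the k nearest neighbours."""
    others = [u for u in ratings if u != target]
    neighbours = sorted(others,
                        key=lambda u: cosine_sim(ratings[target], ratings[u]),
                        reverse=True)[:k]
    scores = {}
    for u in neighbours:
        w = cosine_sim(ratings[target], ratings[u])
        for item, rating in ratings[u].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + w * rating
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical user-item ratings on a 1-5 scale
ratings = {
    "alice": {"book": 5, "film": 3},
    "bob":   {"book": 4, "film": 3, "game": 5},
    "carol": {"film": 1, "game": 2},
}
print(recommend("alice", ratings))
```

In a production system the ratings matrix would be far larger and sparser, which is exactly where the distance-calculation cost mentioned below starts to bite.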
There are, however, drawbacks to collaborative filtering methods, most commonly the 'cold start' problem. As collaborative filtering methods recommend based on users' past behaviour, the system may initially struggle to produce accurate predictions because new users have little to no previous data. The same applies when new products are added to a system: they need a substantial number of ratings before they can be recommended to users with similar tastes. In addition, as the size of the data grows, KNN does not scale well because of the distance calculations involved, although other algorithms are available that avoid this problem.
Content-Based Filtering
Content-based filtering methods, on the other hand, do not suffer from this 'cold start' problem. These systems use a filter that identifies keywords in the user's current search to find the products they are most likely to be interested in. Because the current search keywords themselves can be fed to the algorithm, no prior history is needed to make a recommendation.
A common algorithm used within content-based filtering is TF-IDF (term frequency-inverse document frequency). It scores how important a keyword is to a document by first measuring how frequently the term appears in that document, then discounting terms that are common across all documents. Take the search term 'the rise of analytics': 'the' is likely to be the most frequent term, but 'analytics' carries the greatest discriminative power because it appears far less often overall.
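The calculation can be sketched in a few lines, using the 'the rise of analytics' example above against a small made-up corpus (the documents are hypothetical; a real system would also handle terms absent from the corpus):

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF score: the term's frequency within one document,
    discounted by how common the term is across the whole corpus.
    Assumes the term appears in at least one corpus document."""
    tf = doc.count(term) / len(doc)
    docs_with_term = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / docs_with_term)
    return tf * idf

# Hypothetical corpus of tokenised documents
corpus = [
    "the rise of analytics".split(),
    "the fall of empires".split(),
    "the rise and rise of data".split(),
]
query_doc = corpus[0]
for term in ("the", "analytics"):
    print(term, tf_idf(term, query_doc, corpus))
```

Because 'the' appears in every document its inverse document frequency is zero, so it contributes nothing, while 'analytics' scores highly despite appearing only once.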
An example of content-based filtering in action is Pandora internet radio, a station that plays music with characteristics similar to a song initially provided by the user.
Like collaborative filtering, content-based filtering has its limitations. One drawback is that the system has a limited ability to learn: because recommendations are drawn entirely from the properties of the items themselves, there is no true personalisation, as the user's history is never taken into account.
The more complex, larger product recommendation systems therefore typically take a hybrid approach, utilising both collaborative and content-based filtering. Hybrid systems have been shown to be more accurate than single-filter systems, giving better recommendations that lead to an increase in sales and engagement.
Recently, bandit algorithms have seen a rise in use for product recommendation engines, especially within the news sector; these contextual bandit algorithms are able to learn from every single interaction that takes place.
Contextual bandits work by collecting information about how context vectors and rewards relate to each other (in this case, the product recommended and whether or not a sale or engagement was achieved), so that the algorithm can predict the next best product to recommend.
Bandits need a metric to measure how well a given strategy is performing. This is 'regret': the difference between a chosen strategy's success and that of the optimal strategy. As choices are made, the bandit learns from its decisions and minimises regret until an optimal strategy has been discovered; in this case, that means the best products are being recommended for engagement.
Bandit algorithms run on a trade-off between exploiting the current best choice and exploring sub-optimal alternatives, and depending on how this trade-off is programmed a bandit may become inefficient. If the exploration rate is set too high, the bandit will take too long to settle on the optimal strategy; if it is set too low, the bandit may incorrectly lock onto a sub-optimal one.
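The exploration/exploitation trade-off and the regret metric can be illustrated with an epsilon-greedy bandit. This is a deliberately simplified, non-contextual sketch (a true contextual bandit would also condition each choice on a context vector); the three products and their engagement rates are hypothetical, and `epsilon` is the exploration rate discussed above:

```python
import random

def epsilon_greedy(true_rates, epsilon=0.1, rounds=5000, seed=42):
    """Epsilon-greedy bandit: each arm is a product, reward 1 means
    the recommendation led to an engagement, 0 means it did not.
    Returns the estimated reward per arm and the accumulated regret."""
    rng = random.Random(seed)
    counts = [0] * len(true_rates)
    values = [0.0] * len(true_rates)   # running mean reward per arm
    regret = 0.0
    best = max(true_rates)
    for _ in range(rounds):
        if rng.random() < epsilon:                 # explore: random arm
            arm = rng.randrange(len(true_rates))
        else:                                      # exploit: best so far
            arm = values.index(max(values))
        reward = 1 if rng.random() < true_rates[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        regret += best - true_rates[arm]           # expected regret
    return values, regret

# Hypothetical engagement rates for three recommended products
values, regret = epsilon_greedy([0.05, 0.10, 0.20])
print(values, regret)
```

Raising `epsilon` makes the bandit spend more rounds on sub-optimal arms (more regret while it learns); lowering it risks committing early to whichever arm happened to pay out first.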