Recommendation systems and machine learning: driving personalization

March 30, 2022

Andrea Di Stefano

Technology Research Analyst

Imagine having a childhood friend universally recognized as an eminent aesthete. The purest stereotype of a fashionable person clearly hiding some sort of portal to Milan and Paris instead of Narnia in their wardrobe. This lady, we'll call her Nancy, literally watched you grow up and has an idea of not only your tastes but also of their evolution over the years and how your passions, friendships and a thousand other factors have influenced this trend.

For example, Nancy may have noticed your “spiritual conversion” from black to brown shoes and belts once you stopped hanging out in the heavy metal community. In short, she probably knows you better than you know yourself, and since she also works in a large clothing store, you decide to pay her a visit during her shift and rely on her wise advice and in-depth knowledge of the shop’s assortment of products to make over your wardrobe.

But after imagining all this, you remember it's 2022. The pandemic has forced many physical stores to close, your friend has become a software developer in the ecommerce industry, and after a busy workday you are just too tired to venture out to go shopping. So, you opt for perusing the options online… like many other people nowadays, according to Statista.

Global ecommerce sales trend and forecast, 2014-2025

However, having a virtual companion to help you out wouldn't be too bad. Well, machine learning-based recommendation systems may not be as chatty as your fashionable friend, but they certainly share some strengths with her. Let's find out how these innovative tools, fostered by growing investments in AI software development and machine learning consulting, are redefining the shopping experience and driving sales in the age of digitalization.

A window into customers’ minds

Following the relentless transition from brick-and-mortar sales to ecommerce, coupled with the growing spread of online platforms ensuring remote access to digital content, retail corporations and service providers have faced two important issues:

  • How to replicate the in-store customer care, offered by an expert salesperson who provides an undecided purchaser with expert guidance, in a virtual environment, devoid of physical interaction?
  • How to avoid the so-called information overload and direct customers towards the product (be it physical or digital) they really want, hidden amid an overwhelming offer of merchandise and content? In practical terms, is there a way to prevent streaming service subscribers from spending more time looking for a movie that interests them than actually watching it?

The answer to these questions can be found in the implementation of recommendation systems, powerful engines capable of performing in a digitalized scenario the most essential task of a sales assistant: connecting buyers and goods while offering a pleasant, tailored shopping experience.

They are among us, but how do they work?

To be fair, basically everyone, including non-tech-savvy users, has heard of recommendation systems and has a rough idea of how they work, if only from personal experience. What's not so well known, however, are the mechanics behind all those extremely accurate, fully personalized suggestions we constantly receive as we browse some online store or platform, making us believe that the laptop in front of us is literally reading our minds.

Obviously, mind reading and other magic tricks have nothing to do with the in-depth analytical capabilities achieved by recommendation systems in recent years. Rather, the real protagonist of this story is machine learning, a sub-branch of artificial intelligence focusing on the creation of computer algorithms that can process huge datasets, identify recurring patterns and correlations among multiple variables, and build mathematical models portraying them.

That "learning" in its name is not there by chance, since machine learning systems as well as reinforcement learning applications leveraging these algorithms can actually enhance their capabilities over time through experience. The more data they process, the more relationships among data points they'll spot and the better they will fine-tune their models.

Such models, typically used for predictive analytics in marketing as well, represent a window into what customers think, as they allow us to frame the logic behind specific purchase patterns and shed light on current and future sales trends. For example, a machine learning-powered solution might notice a recurring connection between the age of customers and their preference for one brand over another.

Machine learning modeling

ML-powered recommendation systems on the ground

To clarify the practical implications of powering recommendation systems with machine learning, let's go back to the fashion connoisseur example. Your friend Nancy can boast solid expertise in both the latest fashion trends and the shop's product range. Furthermore, she's already framed your tastes as she perfectly knows what you usually buy and wear. Not to mention a long list of personal aspects that may have impacted your style (cultural interests, social environment, profession, and more).

But not all shopkeepers have this privilege. Many sellers see you for the first time as you enter their store. Therefore, they'll need to interact with you, gather information, understand what type of customer you are, and guide you towards the merchandise matching your preferences. In marketing terms, they have to segment you, namely categorize you into a certain customer archetype or buyer persona according to your characteristics (purchase patterns, interests, gender, etc.) and target you with a suitable product suggestion reflecting your archetype.

Market segmentation variables

In virtual marketplaces, recommendation systems play a similar role, replacing sales assistants in segmenting and targeting customers. The major difference is that human sellers are driven by their intuition and experience to investigate a small fraction of the aforementioned variables during a short chat with purchasers.

Recommendation engines, instead, rely on machine learning to process huge customer datasets and consider a broader range of parameters, including browsing behavior, purchase history, content usage, personal information from user profiles, product reviews, and access device.

Multiply this by the number of users on a given platform and you'll understand how a recommendation system can get a pretty clear idea of both individual purchasers and the audience as a whole, and figure out underlying sales dynamics that a human observer would struggle to grasp.

Furthermore, machine learning algorithms can take into account a massive range of purely contextual parameters not strictly related to customers. For example, as December approaches, the ML-based recommendation system of a major web store would start suggesting typical Christmas products. On the other hand, a streaming platform may adapt its recommendations to the day of the week, offering family-friendly films and documentaries over the weekend.

A closer look at the markets

Nowadays, all major digital service providers and ecommerce enterprises rely on ML-powered recommendation systems. These include Netflix, YouTube, Amazon, Google, Spotify, and many other digital colossi tapping into recommendation systems' remarkable capabilities to provide a customized user experience and enhance sales performance. After all, McKinsey's 2019 The Future of Personalization article highlighted that product recommendation solutions may help improve marketing-spend efficiency by 10-30% and increase revenues by 5-15%.

The positive impact of such tools on businesses certainly represents one of the key catalysts driving their adoption across industries. According to Mordor Intelligence's 2021 Recommendation Engine Market report, these benefits and opportunities will also drive sharp growth in the global recommendation system market in the years to come (from $2.12 billion in 2020 to $15.13 billion by 2026).

Global recommendation engine market forecast

In their 2021 Recommendation Engine Market Size, Share & Trends Analysis Report, Grand View Research's analysts confirmed these optimistic predictions, pointing out that the retail industry accounted for the largest revenue share in 2020 and that the large enterprise segment was leading the way in terms of recommendation system adoption.

The study also provided further insights regarding recommendation systems' market share by type, reporting that collaborative filtering-based engines still ranked first while the hybrid system segment seems set to expand at the highest CAGR.

At this point, you might be wondering what collaborative filtering and hybrid systems actually are. This brings us to the next step, namely the different approaches adopted by recommendation systems to fulfill their duties. Let's take a look.

Approaches to recommendation system design

Most recommendation systems fall into three major sub-categories, depending on the approach embraced to select and suggest the products or services meeting each customer's needs:

  • Recommendation systems adopting collaborative filtering
  • Recommendation systems leveraging content-based filtering
  • Hybrid recommendation systems

Recommendation system approaches in a nutshell

We’ll start with the most popular type on the market, which is collaborative filtering.

Collaborative filtering: a focus on people

"Show me what people similar to you buy, and I will tell you what you may like". That's a good approximation of collaborative filtering, a recommendation approach grouping users with shared characteristics and purchase patterns into clusters and providing them with similar product suggestions.

The focus, in this case, is on customers, their opinions on products, and their interactions with the online platform, rather than on the items' features. This implies that recommendation systems in this category will rely on machine learning algorithms (such as clustering models, K-nearest neighbors, matrix factorization, and Bayesian networks) to survey customers’ perception of products, understand who likes what, and offer items already bought by other users with comparable tastes.

Collaborative filtering
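
To make the idea more concrete, here is a minimal sketch of user-based collaborative filtering with a nearest-neighbors flavor. It is only an illustration under stated assumptions: the users, items, and ratings are invented, the `cosine` and `recommend` helpers are hypothetical names, and real platforms work on far larger datasets with more sophisticated models.

```python
from math import sqrt

# Hypothetical ratings (1-5 scale): user -> {item: rating}
ratings = {
    "ana":   {"boots": 5, "belt": 4, "scarf": 1},
    "ben":   {"boots": 4, "belt": 5, "hat": 5},
    "chloe": {"scarf": 5, "hat": 2, "gloves": 4},
}

def cosine(a, b):
    """Cosine similarity between two users; unrated items count as 0."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = sqrt(sum(r * r for r in a.values()))
    norm_b = sqrt(sum(r * r for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def recommend(user, k=2):
    """Rank items the user has not rated by the similarity-weighted
    average rating given by the k most similar users."""
    neighbors = sorted(
        ((cosine(ratings[user], ratings[other]), other)
         for other in ratings if other != user),
        reverse=True,
    )[:k]
    totals, weights = {}, {}
    for sim, other in neighbors:
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in ratings[other].items():
            if item in ratings[user]:
                continue  # skip items the user already knows
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scores = {item: totals[item] / weights[item] for item in totals}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(recommend("ana"))
```

In this toy dataset, the hat ranks above the gloves for "ana" because it was rated highly by "ben", whose tastes overlap with hers the most. Note that the code never looks at what a hat or a glove actually is.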

As we've previously said, such analyses require huge datasets to properly train machine learning algorithms. There are two possible ways of gathering this data:

  • Explicit data collection: asking users to compile a list of favorite items and rate previously purchased products on a scale or from the most favorite to the least favorite.
  • Implicit data collection: scanning user activity on AI-driven social media, such as likes and dislikes, or monitoring customer purchases, views, and viewing times on ecommerce websites.
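
As a rough illustration of the implicit route, the sketch below turns a log of user events into per-item preference scores. The event types, their weights, and the `implicit_scores` helper are assumptions made for the example, not values taken from any real platform.

```python
from collections import defaultdict

# Hypothetical weights: how strongly each event type signals interest
EVENT_WEIGHTS = {"view": 1.0, "like": 2.0, "purchase": 5.0}

# Hypothetical activity log: (user, item, event type)
events = [
    ("ana", "boots", "view"),
    ("ana", "boots", "purchase"),
    ("ana", "scarf", "view"),
    ("ben", "boots", "like"),
]

def implicit_scores(log):
    """Sum the event weights per user and item; unknown events count as 0."""
    scores = defaultdict(lambda: defaultdict(float))
    for user, item, kind in log:
        scores[user][item] += EVENT_WEIGHTS.get(kind, 0.0)
    return {user: dict(items) for user, items in scores.items()}

print(implicit_scores(events))
```

The resulting scores can play the same role as explicit ratings when feeding a collaborative filtering model.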

To further expand the range of parameters considered, many collaborative filtering systems are now designed to monitor other context-related variables, as we've previously mentioned in our overview of machine learning applications. This approach, known as context-aware collaborative filtering, complements user data with contextual information (region, time, device, etc.) to better define the scenario in which a customer operates and provide more accurate suggestions.
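
One simple way to add context, sometimes called contextual post-filtering, is to compute the usual scores first and then adjust them for the current situation. The sketch below assumes a hypothetical streaming catalog, invented base scores, and an arbitrary boost factor for family-tagged titles at weekends; `rerank_for_context` is an illustrative name only.

```python
from datetime import date

def rerank_for_context(scores, tags, day, weekend_boost=1.5):
    """Boost titles tagged 'family' on Saturdays and Sundays, then rank."""
    is_weekend = day.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend
    adjusted = {}
    for title, score in scores.items():
        boosted = is_weekend and "family" in tags.get(title, set())
        adjusted[title] = score * (weekend_boost if boosted else 1.0)
    return sorted(adjusted, key=adjusted.get, reverse=True)

# Hypothetical base scores produced by a collaborative filtering model
scores = {"crime thriller": 0.8, "animated adventure": 0.6}
tags = {"animated adventure": {"family"}}

print(rerank_for_context(scores, tags, date(2022, 3, 30)))  # a Wednesday
print(rerank_for_context(scores, tags, date(2022, 4, 2)))   # a Saturday
```

With these numbers, the thriller leads on the Wednesday, while the family title moves to the top on the Saturday.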

Excellent accuracy, however, is not the only strength of collaborative filtering. Another relevant benefit is that these systems can literally predict users' interest in a product they didn't know existed by observing that the same item has caught the attention of other customers with similar interests.

Furthermore, by taking into account the relationship between products and the audience more than the product itself, recommendation systems based on this approach can work pretty well even without understanding the nature of each item. This also implies that platforms implementing such systems won't need to accurately describe each product in order to make the system operate properly and boost sales.

For platforms offering a massive range of products such as Amazon, which deployed its famous collaborative filtering-based recommendation system between 2011 and 2012, this represents a significant plus.

On the other hand, collaborative filtering isn't necessarily perfect. Among its potential drawbacks, we might mention:

  • Cold start: providing valuable suggestions to new customers with no purchase history can be challenging even considering several other parameters.
  • Scalability: using machine learning algorithms to search for purchase patterns among a constantly growing number of customers and products requires huge computational power.
  • Rich-get-richer effect: algorithms generally recommend products with many excellent reviews, further increasing their popularity at the expense of new items on the market.
  • Data sparsity: the tremendous offer of content and merchandise on all major platforms risks reducing recommendations' accuracy, since each product may receive an insignificant number of user reviews.
  • Shilling attacks: because of the previous point, new products are particularly vulnerable to rating manipulations (such as negative reviews from competitors).
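
The data sparsity issue in particular is easy to quantify. As a small sketch, assuming a hypothetical catalog of 1,000 items and the invented ratings below, the share of user-item pairs with no rating at all can be computed like this:

```python
# Hypothetical ratings: user -> {item: rating}
ratings = {
    "ana":   {"boots": 5, "belt": 4, "scarf": 1},
    "ben":   {"boots": 4, "belt": 5, "hat": 5},
    "chloe": {"scarf": 5, "hat": 2, "gloves": 4},
}

def sparsity(user_ratings, n_items):
    """Fraction of all possible user-item pairs that have no rating."""
    observed = sum(len(items) for items in user_ratings.values())
    possible = len(user_ratings) * n_items
    return 1.0 - observed / possible if possible else 0.0

print(f"{sparsity(ratings, n_items=1000):.1%} of user-item pairs are empty")
```

Here only 9 of 3,000 possible pairs carry a rating, which leaves the algorithm very little evidence to compare users with one another.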

Content-based filtering: a focus on goods

While being less popular than collaborative filtering, content-based filtering still has a couple of tricks up its sleeve. Here, we have a quite different approach to recommendation system design, since the focus partially shifts from customers to products. In fact, this model mainly considers the item's characteristics, such as price or category, along with user preferences interpreted from their purchases and related feedback.

Based on these metrics, a machine learning algorithm (be it Bayesian classifiers, decision trees, clustering, etc.) will investigate customers' purchase patterns and suggest other products sharing similar features with those previously bought and positively reviewed.

The most advanced content-based recommendation systems can also enrich this set of variables by probing text reviews through AI-based natural language processing. This type of textual feedback represents a valuable source of implicit data both in terms of item features (which may be mentioned by reviewers) and user rating (extracted via sentiment analysis from the same reviews).

Content-based filtering
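
A minimal sketch of the content-based idea follows. It assumes a tiny hypothetical catalog in which every item is described by hand-picked features, builds a user profile by averaging the features of the items the user liked, and ranks the remaining items by cosine similarity to that profile. The feature set and the `profile` and `recommend` helpers are illustrative assumptions.

```python
from math import sqrt

# Hypothetical catalog: item -> [is_footwear, is_accessory, is_brown, price_level]
catalog = {
    "brown loafers": [1, 0, 1, 0.6],
    "brown brogues": [1, 0, 1, 0.7],
    "black boots":   [1, 0, 0, 0.8],
    "brown belt":    [0, 1, 1, 0.3],
    "wool scarf":    [0, 1, 0, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def profile(liked):
    """Average the feature vectors of the items the user liked."""
    vectors = [catalog[item] for item in liked]
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def recommend(liked, top_n=2):
    """Rank items not yet bought by similarity to the user's profile."""
    user_profile = profile(liked)
    candidates = [item for item in catalog if item not in liked]
    candidates.sort(key=lambda item: cosine(user_profile, catalog[item]),
                    reverse=True)
    return candidates[:top_n]

print(recommend(["brown loafers"]))
```

For a customer who bought brown loafers, the sketch suggests the brown brogues first and the black boots second, purely because their features resemble the earlier purchase.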

The plus of this approach over collaborative filtering is that it mitigates the aforementioned problems with new products on sale, as the system can already count on a good deal of information regarding each item's features thanks to the assigned keywords.

At the same time, unfortunately, this tagging procedure implies a massive workload, especially on the largest platforms. Moreover, the cold start issue is still there, as the historical data associated with new customers is very limited. Furthermore, algorithms may show a rather conservative, risk-free behavior, suggesting categories of products and content already purchased by a certain user while avoiding new, potentially interesting items.

Regarding the picture above, for example, our Leopold may also be interested in hair gel or mustache wax like many other bearded hipsters, even if they do not strictly fall into the category of products ordered before. To spot such nuanced, more subtle relationships, the collaborative approach has proven far superior.

Hybrid systems: the best of both worlds

To find a reasonable compromise between collaborative and content-based filtering aimed at maximizing the respective pros and minimizing their cons, many recommendation systems have embraced a hybrid approach. According to research, in fact, hybrid models significantly enhance recommendation systems' performance. This may explain why hybrid systems are the fastest growing segment in the market, as indicated by Grand View Research's aforementioned study.

Platforms like Netflix or Spotify, for example, have already adopted a combination of both models to spot similarities among several users through collaborative filtering while identifying movies and songs with the same characteristics via content-based filtering.

There are several ways to hybridize these two recommendation system types. The mixed hybridization technique involves providing users with both collaborative and content-based suggestions at the same time. The weighted technique, on the other hand, merges the score calculated via two different approaches. Another combination trick, namely meta-level, implies using the output of the first approach (basically the machine learning model built by algorithms) as an input source for the second one.

Meta-level technique
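
Of the techniques above, the weighted one is the easiest to sketch in code. The example below assumes both engines return scores already normalized between 0 and 1. The scores, the `alpha` weight, and the `weighted_hybrid` helper are hypothetical and chosen only for illustration.

```python
def weighted_hybrid(cf_scores, cb_scores, alpha=0.6):
    """Blend collaborative (cf) and content-based (cb) scores.

    alpha is the weight given to the collaborative score; an item
    missing from one of the two engines contributes 0 for that engine.
    """
    items = set(cf_scores) | set(cb_scores)
    combined = {
        item: alpha * cf_scores.get(item, 0.0)
              + (1 - alpha) * cb_scores.get(item, 0.0)
        for item in items
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical normalized scores from the two engines
cf = {"hat": 0.9, "gloves": 0.4, "belt": 0.2}
cb = {"belt": 0.8, "hat": 0.3, "scarf": 0.7}

for item, score in weighted_hybrid(cf, cb):
    print(f"{item}: {score:.2f}")
```

With these numbers the hat stays on top, while the belt climbs to second place thanks to its strong content-based score. In practice, the weight would have to be tuned against real user behavior.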

A personal matter?

Recommendation systems offer a fairly balanced synthesis between customers' wishes for a fully customized shopping experience and the need for retail companies and digital service providers to improve their sales performance in an increasingly competitive market. As highlighted by McKinsey, AI and machine learning in ecommerce have further fostered this synergy, allowing enterprises to fully personalize customer journeys end-to-end.

On the flip side, these technologies may easily create tension between sheer performance and the growing interest of the public and legislators in privacy and data protection issues, since machine learning-powered systems require huge amounts of customer data to work properly.

Not to mention that these systems are generally considered black boxes, which means they're great at interpreting data to identify relationships and patterns, but no one really knows how they end up providing a certain recommendation. In short, if we asked our usually fashionable friend Nancy to explain her recommendations, her answer would probably reflect our behavioral patterns, style, and other aspects of our life that she knows very well. A machine learning system, instead, would randomly beep.

This isn't necessarily a big deal, as we can still enjoy life without talking to a machine. But the other issues will surely need to be addressed with proper data management policies to ensure that recommendation systems deliver their consistent benefits without endangering user privacy.