Stories and Models: Making Sense out of Data

Everyone likes a good story

Olgierd Sroczynski
Towards Data Science

--

You probably know the famous scene in Mad Men where Don Draper presents Kodak’s “Carousel”. I remember it very well: when it aired I was working in an agency where everyone was watching the series and wanted to be like Don. What the brilliant Creative Director from Sterling Cooper was doing is called “storytelling”, and it holds a central place in marketing, even as marketing becomes data- or even AI-driven.

In traditional marketing you basically had to guess at the right strategy to attract new customers and to appeal to existing ones. If you made money at the end of the day, the strategy was working; if you didn’t, there was apparently a problem and your strategy had to change. What do you do when you have to guess? Well, you create a hypothetical scenario and try to play it out in your head. This is where telling a story comes in handy. You need to think of a character, a buyer persona, give them the right motives, needs and desires, and then do some reverse engineering of their thinking processes.

Sounds easy, right? The problem is that your ideas about who your clients are may be completely wrong, even if they sound right to you. Our biggest evolutionary achievement as humans is empathy: the ability to understand what others are up to, what they feel and what they think. However, this is not just something we can do: it is something we have to do constantly. This is simply how our brain works — we feel a constant hunger for meaning. So we tell stories, small ones and big ones, and we need to believe in them.

AI vs the human mind

In a broad sense, a model is a representation of a system: a metaphor. It can be a material object that represents other material objects (e.g. a model of an airplane or a car that is used to test aerodynamics), it can be a drawing, a chart or an equation. And since a model is a metaphor, it also can be a story; some great narratives — like myths — are powerful models of reality.

We create models to understand real objects, processes and connections between them. Understanding is finding a general rule that applies to similar objects, and generality also means simplification. A 1:1 model of reality would be the reality itself and it would not explain anything, so the model is always a limited and generalised image of reality.

Models are just simplified images of reality (Image by Author)

The stories we tell are about the past: the links between events that have already happened. This creates causal relations, most often in the form of a decision tree: if there’s P, there will be Q; if there’s ~P, there will be Z. This is how our mind operates — causality is one of the most crucial elements of meaning. We don’t understand probability and stochasticity in our everyday lives. Even when we consider alternative scenarios, we can focus on only a few possible outcomes.

Scientific models are very different. In most cases, the mathematical models used in physics are not causal but probabilistic. They also operate on a limited number of parameters, but that number is far larger than any human mind could ever compute. Calculations based on a model create links between events that are often not obvious to the human mind. However, using math to model reality does not mean the results can’t be translated into a story.

Let’s take a very simple traditional model for marketing segmentation — an RFM analysis. In RFM we split the whole list of clients into 125 segments by assigning five score values on each of three axes (5 × 5 × 5 = 125). As a result, the coordinates of every customer (the scoring) are determined by three values. For instance, customers with score 555 — the top customers — fall at or above the 80th percentile on each axis.

RFM analysis creates a three-dimensional space for segmentation (Image by Author).

This model doesn’t need any extra explanation or interpretation. Not just because the mathematical tools used to create this segmentation are very simple (everyone can do it in Excel), but because the meaning of the outcome lies in the assumptions of the model. We could even skip the percentiles and build it on arbitrarily chosen cut-off values. Before knowing the result of the analysis, we already know exactly what connects the people in each segment.
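To make the mechanics concrete, here is a minimal sketch of RFM scoring in Python. The toy data and the quintile-based cut-offs are assumptions for illustration; a real analysis would start from an actual transaction log.

```python
import numpy as np

def rfm_scores(recency, frequency, monetary):
    """Assign each customer a 1-5 score on each axis using quintile cut-offs.

    Lower recency (a more recent purchase) is better, so its scores are
    reversed. Returns an array of three-digit RFM codes such as 555.
    """
    def quintile_rank(values, reverse=False):
        # Rank each value, then map ranks into five equal buckets (1..5).
        order = np.argsort(np.argsort(values))
        score = np.floor(order / len(values) * 5).astype(int) + 1
        return 6 - score if reverse else score

    r = quintile_rank(np.asarray(recency), reverse=True)  # recent = high score
    f = quintile_rank(np.asarray(frequency))
    m = quintile_rank(np.asarray(monetary))
    return r * 100 + f * 10 + m

# Toy data: days since last purchase, number of orders, total spend.
recency = [3, 40, 10, 200, 7, 90, 15, 365, 1, 30]
frequency = [25, 4, 12, 1, 20, 3, 9, 1, 30, 6]
monetary = [900, 120, 450, 20, 800, 60, 300, 15, 1200, 200]

print(rfm_scores(recency, frequency, monetary))
```

Note how the interpretation is built in before we run anything: a 555 customer is, by construction, recent, frequent and high-spending.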

Using advanced AI models is quite different. Let’s say we prepare a graph structure to represent all the transaction events. It creates a network of interconnections between them. All nodes of the graph are connected in different ways: products to buyers, products from the same category, buyers from the same location, and so on. To make the graph computable and usable for machine learning, we need to transform it into n-dimensional vectors of real numbers — embeddings.

Graph embedding is a powerful machine learning technique (Image by Author).

If the graph embedding was done right — the vector representations describe the properties of the graph well — we can use the embeddings, for instance, to determine the probability that person X is going to buy product Y and create a personalized promotion, or to create a cluster segmentation. We will most probably find that there is no simple story behind it: we don’t know what the connections between people in each segment are, and more importantly, we don’t have to know. Some of the correlations and connections are random; some of them make sense to the human mind. The important question is: do they work?
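The idea can be sketched on a tiny toy graph. Production systems use learned embeddings (node2vec, graph neural networks and similar); here, as the simplest stand-in, each node is projected onto the top singular vectors of the adjacency matrix, which is a basic spectral embedding. The buyer/product nodes and edges below are invented for illustration.

```python
import numpy as np

# Toy transaction graph: nodes 0-2 are buyers, nodes 3-5 are products.
# An edge means "buyer purchased product"; buyers 0 and 1 share products.
edges = [(0, 3), (0, 4), (1, 3), (1, 4), (2, 5)]
n = 6

A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0  # undirected adjacency matrix

# Spectral embedding: keep each node's coordinates along the top-k
# singular vectors of the adjacency matrix.
k = 2
U, s, _ = np.linalg.svd(A)
embedding = U[:, :k] * s[:k]  # one k-dimensional vector per node

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

# Buyers 0 and 1 bought the same products, so their vectors coincide;
# buyer 2 bought something different, so its vector is orthogonal.
print(cosine(embedding[0], embedding[1]))
print(cosine(embedding[0], embedding[2]))
```

Nothing in the output vectors says *why* two buyers are close — the geometry simply encodes their shared purchase patterns.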

Building a narrative on data

However, the inability to answer the question “why” is really unsatisfactory. It’s not just our inherent need for meaning. We need to know the meaning of certain connections so we can decide which marketing strategy is best for our brand and how to target our next campaigns. Even if the connections that look random from our point of view are not quite explainable from our knowledge about the world, we can try to find answers by interpreting the data.

Let’s take a look at an example, where we analyze the target group selected by the model as “customers who are most probably interested in product Y”.

An analysis of the segmentation on Synerise dashboard (Screenshot by Author)

If we compare an average person from the segment to an average customer in general, we can clearly see that they:

  1. Spend more per transaction (twice as much as the average customer);
  2. Visit the store less frequently (once a month versus once a week);
  3. Visit the store more often on Fridays and Saturdays.

Based on that information we can create a story — a hypothesis about who the people most likely to buy product Y are. It might go like this: the segment contains people who buy groceries once a month, which is typical of people living in suburban areas, most probably with children. If we add a comparison of customer age, we discover that people in our segment are between 35 and 45 years old.

Is this the whole picture? Obviously not. This is another model we’ve built on top of an AI model, and as a model it is a further simplification of what the AI told us about the customers.

A story about the model is just another simplified model (Image by Author).

Such a simplification is not necessarily a bad thing, as it can inspire us to create a story that appeals to our clients. But always remember that this model should be tested, using A/B testing and a control group. Testing helps us check whether what we saw as a random connection really was random, and lets us understand our customers better.

Summary

Artificial Intelligence works differently from the human mind. AI models create an image of reality that is not always comprehensible to our minds, which are used to ordering the world according to the rules of causality.

This doesn’t mean, though, that AI shows the truth while stories are completely false — we need to learn to use both as complementary models.
