Learn the difference between deterministic and probabilistic consumer identification and why both methods may be needed for marketing to succeed.
Like most tech-defined marketing stories in the modern era, this one starts with mobile. When technologies like smartphones and tablets rose in popularity, people no longer left all of their consumer data on one device.
Data then proliferated and fragmented, and marketers had to determine where consumers went when they disappeared from their primary device. Cross-device identification, therefore, became a game of matching one device with another in hopes of marrying their data sets.
But marketers soon realized this was not enough. Users quickly began to use more than two devices, and the average person now owns four. Because consumers jump from one device to the next and leave a trail of data across multiple sources (like social media accounts), marketers realized that they needed to be able to identify who a consumer is at all points in their journey.
The Importance of a Unified View of a Consumer
Without a unified view of a consumer, carefully crafted marketing pipelines and messages have to start at square one every time a user switches to a different device. Personalized journeys can never be truly realized, resulting in someone seeing irrelevant messages or the same message repeated far beyond frequency capping guidelines. Attribution modeling then falls apart, leading to a skewed view of which media exposures were most valuable to marketers and which ones were wasted.
In other words, marketers struggle to provide a consistent brand experience as a consumer moves from device to device or from channel to channel, including online-to-offline sales, unless they have a way to determine consumer identities. They can still target individuals to some degree, but without a cross-device ID solution, most marketers feel they are still missing the mark.
To solve this problem, marketers have started to turn to two types of methods:
1. Deterministic identification, which looks for personally identifiable information (PII), like a customer log-in, to confirm identity.
2. Probabilistic identification, which uses anonymous data points (i.e., data that does not contain PII) in conjunction with identity-matching algorithms and a cross-referencing database to detect a probable unique user across devices.
There is no single right answer as to which one to use; each method offers different strengths and weaknesses depending on a brand’s goals. At a glance, the deterministic method gives a more accurate match, whereas the probabilistic method provides greater scale. In the end, both may be needed, since they can reinforce each other to provide the closest thing we currently have to a unified, omnichannel customer journey.
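One way to picture how the two methods can reinforce each other is a resolver that prefers a deterministic match and falls back to a probabilistic guess only when no log-in is available. The Python sketch below is illustrative only: the function name `resolve_identity`, the event fields, and the stand-in guesser are assumptions made for this example, not part of any particular product.

```python
def resolve_identity(event, known_logins, probabilistic_guess):
    """Prefer a deterministic match; fall back to a probabilistic one.

    Returns (identity, method), where method is "deterministic",
    "probabilistic", or "unknown".
    """
    login_id = event.get("login_id")
    if login_id in known_logins:
        # The person identified themselves, so the match is confirmed.
        return known_logins[login_id], "deterministic"
    guess = probabilistic_guess(event)
    if guess is not None:
        # No log-in was seen; rely on an inferred (unconfirmed) match.
        return guess, "probabilistic"
    return None, "unknown"


# Hypothetical data for illustration only.
known_logins = {"pat_login": "customer-42"}


def guess_from_ip(event):
    """Stand-in for a real identity-matching algorithm."""
    return "customer-42" if event.get("ip_address") == "203.0.113.5" else None


logged_in = resolve_identity({"login_id": "pat_login"}, known_logins, guess_from_ip)
anonymous = resolve_identity({"ip_address": "203.0.113.5"}, known_logins, guess_from_ip)
stranger = resolve_identity({"ip_address": "198.51.100.9"}, known_logins, guess_from_ip)
```

Returning the method alongside the identity lets downstream systems treat confirmed and inferred matches differently, for example by reserving sensitive actions for deterministic matches.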
Deterministic Consumer Identification: How It Works, Pros & Cons
Deterministic consumer identification can be readily understood because it is so familiar. The device or platform simply asks for the person to identify themselves using personal information, such as logging in with a username and password. Individuals can be asked to create a new account for every channel they use, or they can share identities across multiple platforms, such as using a social media account to log into websites and apps.
The main advantage of this method is that it confirms with greater precision that a current user corresponds to a unique identity. Because deterministic tracking relies on known data, such as a home address, email, or credit card information, it is more accurate. Marketers also benefit because they can know with greater certainty that someone is the same person no matter what device or platform they happen to be on.
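As a rough illustration of deterministic matching, the Python sketch below links devices that logged in with the same email address. It assumes the email is the shared identifier and hashes it so the raw value is not stored. The function names and sample log-in events are hypothetical, and a real system would also need consent handling and stronger protection of the identifier.

```python
import hashlib


def identity_key(email):
    """Normalize an email address and hash it so the raw value is not stored."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def link_devices(logins):
    """Group device IDs by the hashed identifier they logged in with."""
    devices_by_identity = {}
    for device_id, email in logins:
        devices_by_identity.setdefault(identity_key(email), set()).add(device_id)
    return devices_by_identity


# Hypothetical log-in events: (device ID, email used to log in).
logins = [
    ("phone-1", "Pat@Example.com"),
    ("laptop-7", " pat@example.com"),
    ("tablet-3", "sam@example.com"),
]
linked = link_devices(logins)
```

Here the phone and the laptop end up under the same key because the normalized email is identical, which is the sense in which the match is "known" rather than inferred.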
Unfortunately, a method that sounds simple on paper can get complicated quickly in real life. For one, requiring consumers to log in — and especially requiring them to create a new account — can throw up a real barrier to using a certain platform. “Registration fatigue is real,” says expert Zack Martin. He cites data showing that 75% of consumers are frustrated by password management, and 58% of them will back out of a process if it requires them to register for a new account.
“Federated” log-ins using a common source, like a social media profile, help to ease this inconvenience. They are typically built on OAuth (Open Authorization), often with an identity layer such as OpenID Connect on top. However, tech-company owners of common log-ins (e.g., Google and Facebook) are typically not eager sharers of data. They create individual walled gardens where they control the flow of data and often keep it siloed from organizations unless specific arrangements are made.
Companies can operate around these limitations with the right strategy, technology investments, and arrangements with third parties, but they may still face issues when trying to scale their marketing efforts beyond the people willing to log in at multiple points along their journey.
Probabilistic Consumer Identification: How It Works, Pros & Cons
Probabilistic consumer identification uses a method that sounds less than certain but that can actually be fairly accurate when enough data is provided. To drive this type of identification, marketers first need a database — known commonly as an identity graph — that can store all known attributes of a specific consumer or household and often the devices they commonly use.
Identifiers stored in identity graphs can include a consumer’s home IP address, unique mobile device IDs, location, WiFi network, data entered in forms, timestamps, and any stored browsing behavior or third-party cookies on their machines. Together, these signals form a sort of “fingerprint” that can predict someone’s identity or household with a fair degree of accuracy.
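To make the “fingerprint” idea concrete, the Python sketch below scores how many signals two devices share and treats them as the same probable user once the score passes a threshold. The signal names, weights, and threshold are assumptions chosen for illustration; production identity graphs typically rely on far richer data and statistically trained models.

```python
# Hypothetical weights (points out of 100): how strongly a shared signal
# suggests that two devices belong to the same person or household.
SIGNAL_WEIGHTS = {
    "ip_address": 40,
    "wifi_network": 30,
    "location": 20,
    "active_hours": 10,
}


def match_score(device_a, device_b):
    """Return the weighted share of signals on which two devices agree (0.0 to 1.0)."""
    matched = 0
    for signal, weight in SIGNAL_WEIGHTS.items():
        value_a = device_a.get(signal)
        # A signal counts only if it is present on both devices and equal.
        if value_a is not None and value_a == device_b.get(signal):
            matched += weight
    return matched / sum(SIGNAL_WEIGHTS.values())


def probably_same_user(device_a, device_b, threshold=0.6):
    """Treat two devices as one probable user when enough signals overlap."""
    return match_score(device_a, device_b) >= threshold


# Hypothetical observations for illustration only.
phone = {"ip_address": "203.0.113.5", "wifi_network": "home-net",
         "location": "Austin", "active_hours": "evening"}
laptop = {"ip_address": "203.0.113.5", "wifi_network": "home-net",
          "location": "Dallas", "active_hours": "morning"}
other = {"ip_address": "198.51.100.9", "location": "Austin"}
```

In this example the phone and laptop share an IP address and WiFi network, so they clear the threshold, while the third device matches only on location and does not. Raising the threshold trades scale for accuracy.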
The advantages of this type of consumer identification go far beyond the convenience of not requesting a login. For one, organizations can scale personalized digital marketing efforts far beyond their existing user base. For another, platforms can track repeat users without needing them to log in until they must make a purchase or access sensitive data on their actual accounts.
Additionally, the practice of building a unified data program capable of probabilistic identity matching can help organizations harness growing fields of marketing technology, like using probabilistic modeling to identify look-alike audiences with a high degree of accuracy. These organizations can also refine their use of machine learning algorithms, expanding their capability to make observations and predictions across once-disconnected data sets.
The main drawback of this approach is its high barrier to entry. Investing in technology and building the right customer data architecture are key obstacles organizations must overcome to get started. Furthermore, organizations have to ensure that their actions comply with regulations like the EU’s General Data Protection Regulation (GDPR), meaning that consumer identifiers cannot be readily obtained and shared without opt-in permission.
Finally, a separate log-in may still be needed for consumer activities like making purchases or changing account details.
Why Marketers Need to Consider Both Consumer Identification Methods
Ultimately, marketers will likely have to embrace both deterministic and probabilistic consumer identification methods. Our rapidly advancing society demands better privacy and security for our consumer data, meaning that regulations like GDPR could lead to a need for ubiquitous secure logins to confirm a user’s data-sharing preferences. Secure authentication will always be needed when accessing a customer’s unique, sensitive data, such as their payment information.
At the same time, the scalability of probabilistic consumer identification methods offers promise and convenience for all parties involved. Marketers can scale their personalization efforts without requesting incessant log-ins, and they can improve activations using anonymized data to produce contextual experiences regardless of how few touchpoints a consumer has had.
So, for an increasing number of marketers, both deterministic and probabilistic strategies are needed to power the activities that truly drive revenues and prove ROI.