Sunday, May 19, 2019

Do Customer Data Platforms Need Real-Time Processing?

Is real-time processing a requirement for a Customer Data Platform? It’s a deceptively simple question that can’t be answered without resolving two additional questions: What do we mean by real-time processing? How do we decide what is and isn’t a requirement?

Let’s tackle the definition of real time first. There are at least four different flavors of real-time processing that relate to CDPs.
  • Real-time updates. This means that new data flows into the CDP and is added to the exposed customer profiles within a few seconds. The exact speed requirement isn’t clear: although one second is widely considered to be the threshold for real-time interactions (based on a study done in 1968), longer lags are acceptable for sharing customer data between systems. In practical terms, a customer will not be annoyed if it takes a half-minute before a call center agent learns about her latest Web interaction.  Many CDP vendors offer what they call “near-real-time” updates, where the lag might be one to five minutes. This is obviously too slow to manage an interaction like online chat. But it can support asynchronous activities such as sending order confirmation emails.
Real-time updates are quite difficult. The first step, adding a new piece of data into the CDP data store, isn’t too hard for most CDPs, which simply append each item without touching existing data. But there’s usually additional processing to update customer profile elements, such as lifetime purchase value, predictive model scores, or last purchase date. This requires finding the right profile, reading the existing data, running whatever calculation is needed, and storing the new result. That takes time. And there’s often an additional step to copy data from the profile into a separate data store that’s optimized for real-time access, such as a flat file or in-memory database. This takes time too.
  • Real-time identity resolution. This is a special case of real-time updates, where the system needs to do some processing to associate a piece of data with the right customer profile. It's needed when the new data isn’t already associated with a known identifier such as a customer ID.  It may require substantial processing to run matching algorithms that test various combinations of data against existing records. Usually this process is limited to adding the data to an existing profile, but it might also re-examine all profiles to see if the new data uncovers a connection that would allow them to be merged or split. That requires still more processing. This sort of comprehensive reexamination is usually done in a batch process that might run anything from nightly to monthly.
  • Real-time access. This means an external system can query the CDP in real time for a copy of an existing customer profile. This might be done with an API call or by accessing a separate data store. Note that the profile itself isn’t necessarily being updated in real time: in fact, companies commonly run a nightly batch process that loads profiles and next-best-actions into an online data store. The data remains unchanged until the next nightly update.
  • Real-time interactions. This means an external system can send data to the CDP and receive back a profile that incorporates that data – for example, with an updated model score, product recommendation, or marketing messages. It’s another flavor of real-time updates but is limited to activities within a single system. In practice, real-time interactions often work by moving a copy of the profile into memory at the start of an interaction, updating the in-memory copy as the interaction proceeds, and then copying the final version of the profile back into the main database after the transaction is over. This avoids the need for real-time updates of the main database. Real-time interactions are often integrated with multi-step campaign flows so the user can guide the interaction process.
It’s important to recognize these distinctions and to clarify which are included when you’re discussing a real-time CDP. All systems in the CDP Institute’s Vendor Comparison Report say they do real-time access while only half do real-time interactions. Real-time data loads are almost universally available but the report doesn’t capture which systems can make the data available in real time. Nor do we ask about real-time identity resolution.
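
To make the distinctions concrete, here is a minimal Python sketch of the four flavors described above. Everything in it is an assumption made for illustration: the names `ToyCDP`, `Profile`, and `InteractionSession` are hypothetical, identity resolution is reduced to an exact lookup on a single identifier, and plain dictionaries stand in for the data stores. It is not how any particular CDP is built.

```python
from copy import deepcopy
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    customer_id: str
    lifetime_value: float = 0.0
    last_purchase: Optional[str] = None


class ToyCDP:
    """Hypothetical in-memory stand-in for a CDP (illustration only)."""

    def __init__(self):
        self.events = []    # append-only store of raw events
        self.profiles = {}  # main profile store, keyed by customer ID
        self.id_index = {}  # known identifier (e.g. email) -> customer ID
        self.serving = {}   # separate copy of profiles used for fast reads

    def resolve(self, identifier):
        # Identity resolution, reduced here to an exact lookup.
        # Real matching algorithms are far more involved.
        if identifier not in self.id_index:
            self.id_index[identifier] = f"cust-{len(self.id_index) + 1}"
        return self.id_index[identifier]

    def ingest(self, identifier, amount, date, publish=True):
        # Step 1: append the raw event without touching existing data.
        self.events.append((identifier, amount, date))
        # Step 2: find the profile, read it, recalculate, store the result.
        cid = self.resolve(identifier)
        profile = self.profiles.setdefault(cid, Profile(cid))
        profile.lifetime_value += amount
        profile.last_purchase = date
        # Step 3: copy the profile into the serving store. Skipping this
        # step mimics a system that publishes only in a later batch.
        if publish:
            self.publish(cid)
        return cid

    def publish(self, cid):
        self.serving[cid] = deepcopy(self.profiles[cid])

    def get_profile(self, cid):
        # Real-time access: a fast read that may return stale data.
        return self.serving.get(cid)


class InteractionSession:
    """Real-time interaction: work on an in-memory copy, write back at the end."""

    def __init__(self, cdp, cid):
        self.cdp = cdp
        self.working = deepcopy(cdp.profiles[cid])

    def add_purchase(self, amount, date):
        # Only the working copy changes during the interaction.
        self.working.lifetime_value += amount
        self.working.last_purchase = date
        return self.working

    def close(self):
        # Copy the final version back to the main store and republish it.
        cid = self.working.customer_id
        self.cdp.profiles[cid] = self.working
        self.cdp.publish(cid)


cdp = ToyCDP()
cid = cdp.ingest("ann@example.com", 40.0, "2019-05-01")           # real-time update
cdp.ingest("ann@example.com", 10.0, "2019-05-02", publish=False)  # loaded, not yet exposed
stale_value = cdp.get_profile(cid).lifetime_value                 # still 40.0
cdp.publish(cid)                                                  # the "nightly batch"
fresh_value = cdp.get_profile(cid).lifetime_value                 # now 50.0
```

The second `ingest` call illustrates the gap noted above: data can be loaded in real time without being made available in real time. `get_profile` answers instantly in both cases, which is why real-time access alone says little about how fresh the profile is.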

This brings us to the second question, of how to decide whether the CDP definition should include real-time capabilities (and, if so, which ones).  This matters because the market is confused by the variety of companies claiming to be a CDP.  A clear, widely understood definition is essential to reducing that confusion.

One approach is to start with current user expectations. CDP Institute research shows that users’ top priorities are data collection and unification. Real-time access and real-time recommendations rank far down their list. We can’t tell whether users include real-time updates in their definition of data collection.  My suspicion is that most do not. But I haven’t seen any reliable research.



A second approach is to look at what’s included in existing CDP systems. As we’ve already seen, all systems in our Vendor Comparison Report offer real-time access and about half offer real-time interactions. While we don’t ask directly about real-time updates, we do know that quite a few offer near-real-time updates, which presumably means they don’t do fully real-time updates. You may notice there’s some circular logic here, since we have to decide in advance which systems to examine.

A third approach is to look at common use cases. The data are again ambiguous because top applications like predictive modeling and personalization can benefit from real-time updates but don’t require them. Some standard CDP use cases do require real-time updates, such as providing call center agents with information about recent Web behaviors. But plenty of others do not, ranging from customer journey analysis to email audience selection.


Since none of these options offers much guidance, how do we decide?  I'll argue that we ultimately want to make the decision that best serves CDP users.  Many users need real-time updates and interactions but many others do not. If we want the CDP category to include systems that meet the needs of both groups, we should include non-real-time systems in the category.

But that's not enough.  We also need to help users understand which CDPs provide real-time capabilities and which specific capabilities are included.  CDP Institute is addressing this need with its RealCDP program, which will gather and present information about real-time capabilities.  We’ll also add real-time updates to the Vendor Comparison Report. Stay tuned for more information.

Saturday, May 11, 2019

Jamie's Excellent Privacy Adventure

Jamie knew something was wrong when his alarm clock didn’t greet him by name. Weird, he thought.

Things quickly got weirder.

The water in the shower was too hot.  The traffic report mentioned an accident nowhere near his route to work. When the radio played a song he had specifically banned from his playlist, he knew something serious was wrong.

“Clock, please run a system check,” Jamie muttered, annoyed.

Silence.

He said it again, louder, this time staring directly at the device on his night table.

Silence.

System must be down, he thought. I’ll deal with it later.

But then he looked out the window. The cars all had human drivers and pedestrians were not staring into their phones. He knew he had a much bigger problem.

A multiverse flip.

He'd read about it on the news.  Result of global warming or wireless signal overload or overlapping artificial realities. No one really knew.  What they did know was that people woke up in a different world from the one they’d fallen asleep in.  It seemed to be happening more often. 

Jamie picked up his phone. It didn’t recognize him but on-screen instructions let him unlock it with a thumbprint. He dialed LOST – League of Strayed Travelers – a service that spanned multiverses with enough cross-traffic to cooperate. He was lucky; the call was answered. By a human, which felt odd. But apparently that was the world he was in.

A quick conversation established that Jamie was indeed lucky: the local LOST staff included someone from his home universe. She would be Jamie’s guide. Soon a car pulled up – this one with both a driver and passenger.  Out hopped Emily, a cheerful young woman who quickly began explaining what was different.

“This is a world where privacy has been taken very seriously. Businesses and government are banned from storing any personal information beyond what’s needed for security purposes. That’s why your smartphone recognized your thumbprint. But look more closely at the phone and you’ll see it doesn’t have a record of your past calls or contacts. In the same way, we still have search engines and social networks and ecommerce, but they don’t personalize their services. It’s annoying but you get used to it. What I miss more is voice recognition, which is entirely forbidden as inherently invasive. Alexa was my best friend!”

She glanced wistfully at Jamie’s alarm clock, which he now realized was no smarter than a rock.

“But it gets much worse,” Emily continued. “Without gathering personal data, companies can’t train AI systems for things like self-driving cars, energy conservation, or personalized medicine. So we have more accidents, more pollution, and a vastly less efficient economy. Everyone has less free time and is poorer. Crime is worse because people have to carry cash and video surveillance is strictly limited. And, ironically, social media is still filled with bullying and misinformation – it turns out those have nothing to do with privacy and everything to do with human nature.”

Emily frowned briefly. Then her face brightened.

“But there’s good news, too. We’ve been researching the multiverse flips and have experimental devices that can move you from one world to another. We can’t guarantee where you’ll end up but have enough return traffic to be reasonably sure it won’t be too terrible. At least, most of the time. So there’s a risk but we can give it a shot if you like.”

On the ride back to the LOST office, Jamie and Emily chatted more about this universe, the flipping device, and their previous lives. Turned out she was from New Jersey. And married. By the time they arrived, Jamie had decided to try the new machine.

Emily gave Jamie one final smile as she closed the lid on the flipper. It was dark and warm and filled with white noise.

Then the lid was up.  Jamie was in a different room. New faces. No Emily.

A serious-looking man reached into the flipper and pulled out a package. “May I have this? It’s how we share information across the multiverses.”

The man opened the package and scanned the top document. “I see you’re Jamie. Welcome. I’m Giovanni.”

Giovanni took a closer look at the documents. “Looks like you come from a place where anything goes, privacy-wise. It’s very different here. We believe that people own their own data. But we respect their liberty, so any use is allowed if they give consent.”

Jamie was a bit groggy as he stepped from the flipper. “How’s that working out for you?”

Pretending not to notice that Jamie looked tired, Giovanni smiled and pointed to a seat. “If you please.”  His face turned serious.

“Mixed results, to tell the truth. Most people consent to pretty much everything. Their experience is much like yours – many free services but little privacy. Data breaches are common and much of the data is inaccurate. But the majority put up with it for the convenience and an occasional discount coupon.”

“And everyone else?” asked Jamie.

“There are two groups. Some people are privacy zealots, pure and simple – they won’t give up their data on principle. They pay extra for services that others get for free, and many services aren’t available to them at all.  They’re often left out of business and social events and have harder lives as a result.  Many live off the grid doing things like crafting Faraday cage handbags. At least we don’t hear their complaints.” He did not seem amused at his own joke.

“The others are people with enough power and money that they can easily afford the cost of privacy-enhanced alternatives. They have staff or bots to maintain a social media presence without exposing their own data directly. They use special devices and software that hides their identity, regularly erases their data, and maintains separate personas for different purposes. They get the best of both worlds: privacy for themselves and convenience based on the data of others.”

Jamie was puzzled. “Why is privacy protection expensive? I’d think most of the solutions are based on software.  That should be nearly free to run for everyone once it’s built.”

“I thought so too,” sighed Giovanni. “But it’s a lizard-and-the-egg sort of thing: most people won’t pay even a little extra for privacy, especially if they’re rewarded for giving it up. So the market for privacy-enhanced systems is fairly small, which means manufacturers must charge a higher price per customer, which makes the market still smaller, which drives the prices still higher. It ends up as a luxury good.” Giovanni sighed again.

“Yes, I can see how that works out,” said Jamie. He thought for a moment. “Look, it’s nice to meet you but this isn’t my home universe. Can I go back into the flipper and try again?”

“Of course,” replied Giovanni. “With your permission.”

There was time for a quick lunch while Giovanni prepared another information package. Jamie climbed back into the flipper, relaxed for a moment, and the lid was up again. Another room. More new faces.

Now a veteran, Jamie sat up and handed the information packet to the nearest person. A badge clipped to his shirt showed his name was Tim.

“Hi Tim. I’m Jamie. Where am I this time?”

Tim opened the package and read the cover sheet. He paused a moment.

“Not where you want to be, I’m afraid. But not a bad place. How much do you want to know?”

Jamie was disappointed but Tim seemed friendly enough. This room somehow seemed more cheerful than the last one.

“Well, I’m here, so I might as well find out how things work. You’re the first place where people are wearing name tags. What’s up with that?”

“Happy to explain,” said Tim. “But where are my manners? Would you like a glass of wine?”

“I prefer beer,” replied Jamie. They walked to the front of the building, where a pleasant café fronted the sidewalk. They sat with their drinks.

“Unlike the last two places you’ve been, our universe believes that some data should be shared with everyone while other data should always be kept private. Deciding where the line falls isn’t easy and I can’t say we always get it exactly right. But we keep making adjustments over time. The good thing is people mostly know what to expect and are treated fairly without taking extraordinary measures to protect themselves.”

Jamie didn’t get it. “How does that work? Everyone is wearing name tags, so clearly that’s something you’re required to share. But the tags just show first names. How do you deal with more sensitive data?”

“Excellent question,” smiled Tim, warming to his subject. “So few people really care. Have another drink.”

Tim drained his own wine glass and ordered another. Jamie was still working on his first beer. Tim continued.

“We apply what people in your universe call ‘privacy by design’.  Our badges do more than show our first name: they contain details that are shared on an as-needed basis. For example, when we entered the cafe, a sensor queried my badge and told the server that I’m allowed to order a drink. But that’s all he learned; it didn’t tell him my age, let alone the name, address, birthdate, and biometrics he’d get from your driver’s license. And if there were some other reason I wasn’t allowed to order a drink – say I was already drunk – it wouldn’t have said that, either. It would just have told the server not to serve me. So my privacy is protected even while drinking restrictions are enforced.”

Jamie pushed away his beer. “So that badge knows that you’ve been drinking? I don’t think I’d want anyone keeping a log of my alcohol consumption.”

Tim looked at his own glass. “Neither would I. But the badge doesn’t keep a log; it just monitors my blood alcohol level. And it doesn’t share that unless there’s reason, like determining whether I can legally order a drink.”

Jamie relaxed a bit.  Tim continued.

“There’s lots more information that the badge or other devices do log. My smartphone knows my location history, the Web sites I’ve visited, search queries I’ve made and much more. It uses those to make my life easier, same as in your universe. But, unlike your universe, the data never leaves my device. That way no one can use it in ways I don’t control. When someone wants to serve an ad to people who like red wine” – he lifted his glass – “they just query devices until they find a profile that fits the description. The profiles aren't stored outside the devices and there's no record of which device an ad was served to. That makes things a little harder for advertisers, who can’t control how often one person sees the same ad or connect ad views to subsequent purchases. But it still allows most kinds of behavior- and profile-based personalization.  The economy manages to function.”

Tim stopped short. “Sometimes I repeat myself. I hope I’m not boring you.”

He was, a little. But Jamie could see he loved the topic. “Well, I do want to try to get home. But maybe there’s something here I can bring back that would be useful. What else do you think I should know?”

Tim gathered his thoughts. “To quickly flesh out the picture, the same principles apply to other types of data. So, my phone knows where I live and where I am now, which lets it connect with navigation software to tell me how to get home. But it only asks the central navigation system for a route, without telling it who’s asking. So the navigation system doesn’t know anything about my movements over time. And we do let people view ads for payment, but there are strict rules against trading away personal data.”

“Who makes these rules?” asked Jamie. “In my world we’re pretty skeptical of regulators.”

Tim gave him a sharp look. “So are we. The rules come from a mix of legislators and agency staff.  There’s plenty of lobbying from all sides. As I said before, there’s lots of disagreement and they don’t always get things right. But everyone starts from the premise of putting individual interests first and business interests second. Turns out that’s a good guide for many decisions.”  Tim paused for a breath.

"And, yes, social interests like public safety come into play. So I can't drive a car if my badge says I'm legally drunk, although the badge doesn't give the car a reason.  I can actually override that rule in an emergency, but then the car also notifies the authorities and turns on special tracking devices.  So, yes, it's complicated.  But just because it’s hard doesn’t mean we shouldn’t try to make it work. You’ve already seen how poorly things turn out in worlds that apply simple solutions instead.”

“Indeed I have,” said Jamie. “I certainly don’t think my universe has it right.” He paused. “But home is home and that’s where I belong. Can we try the flipper again?”

“Of course,” replied Tim. They left the café without paying. Tim winked at Jamie. “Don’t worry. They'll charge my badge. Anonymously.”

Once more into the flipper.

Jamie’s phone buzzed to life before the lid was raised. He knew he was home.

“Good morning, Jamie,” the phone said. “You’re late for work. Shall I call you a car and let the office know you’re on your way?”

Jamie shut it off and opened the lid.