Tuesday, September 25, 2018

Salesforce Customer 360 Solution to Share Data Without a Shared Database

Salesforce has sipped the Kool-Aid: it led off the Dreamforce conference today with news of Customer 360, which aims to “help companies move beyond an app- or department-specific view of each customer by making it easier to create a single, holistic customer profile to inform every interaction”.

But they didn’t drink the whole glass. Customer 360 isn't assembling a persistent, unified customer database as described in the Customer Data Platform Institute's CDP definition. Instead, they are building connections to data that remains in its original systems – and proud of it. As Senior Vice President of Product Management Patrick Stokes says in a supporting blog post, “People talk about a ‘single’ view of the customer, which implies all of this data is stored somewhere centrally, but that's not our philosophy. We believe strongly that a graph of data about your customer combined with a standard way for each of your applications to access that data, without dumping all the data into the same repository, is the right technical approach.”

Salesforce gets lots of points for clarity.  Other things they make clear include:
  • Customer 360 is in a closed pilot release today with general availability in 2019. (Okay, saying when in 2019 might be helpful, but we’ll take it.)
  • Customer 360 currently unifies Salesforce’s B2C products, including Marketing, Customer Service, and Commerce. (Elsewhere, Salesforce does make the apparently conflicting assertions that “For customers of Salesforce B2B products, all information is in one place, in a single data model for marketing, sales, B2B commerce and service” and “Many Salesforce implementations on the B2B side, especially those with multi-org deployments, could be improved with Customer 360.” Still, the immediate point is clear.)
  • Customer 360 will include an identity resolution layer to apply a common customer ID to data in the different Salesforce systems.  (We need details but presumably those will be forthcoming.)
  • Customer 360 is only about combining data within Salesforce products, but can be extended to include API connections with other systems through Salesforce Mulesoft.  (Again, we need details.)
  • Customer 360 is designed to let Salesforce admins set up connections between the related systems: it's not an IT tool (no coding is needed) and it's not for end-users (only some prebuilt packages with particular data mappings are available).
So we have a pretty good idea of what Salesforce is doing. The question is, are they doing the right thing?

I say they’re not. The premise of the Customer Data Platform approach is that customer data needs to be extracted into a shared central database. The fundamental reason is that having the data in one place makes it easier to access because you’re not querying multiple source systems and potentially doing additional real-time processing when data is needed. A central database can do all this work in advance, enabling faster and more consistent response, and place the data in structures that are most appropriate for particular business needs.  Indeed, a CDP can maintain the same data in different structures to support different purposes. It also can retain data that might be lost in operational systems, which frequently overwrite information such as customer status or location. This information can be important to understand trends, behavior patterns, or past context needed to build effective predictions. On a simpler level, a persistent database can accept batch file inputs from source systems that don’t allow an API connection or – and this is very common – from systems whose managers won’t allow direct API connections for fear of performance issues.
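The point about overwritten data is easy to show in a few lines. What follows is only a minimal Python sketch under my own assumptions: the class names `OperationalRecord` and `PersistentProfile`, the status values, and the dates are all hypothetical and don't describe any vendor's actual data model. It contrasts a source system that keeps only the current status with a persistent store that appends each observation.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class OperationalRecord:
    """Hypothetical source system that keeps only the current value."""
    customer_id: str
    status: str

    def update(self, status: str) -> None:
        self.status = status  # the previous status is overwritten and lost


@dataclass
class PersistentProfile:
    """Hypothetical central store that appends every observed value."""
    customer_id: str
    status_history: list[tuple[date, str]] = field(default_factory=list)

    def record(self, as_of: date, status: str) -> None:
        self.status_history.append((as_of, status))

    def status_on(self, day: date) -> str | None:
        """Return the latest status observed on or before the given day."""
        known = [s for d, s in sorted(self.status_history) if d <= day]
        return known[-1] if known else None


source = OperationalRecord("c-1", "prospect")
profile = PersistentProfile("c-1")
profile.record(date(2018, 1, 5), "prospect")

source.update("active")
profile.record(date(2018, 3, 1), "active")

source.update("lapsed")
profile.record(date(2018, 9, 1), "lapsed")

print(source.status)                        # only the current value survives
print(profile.status_on(date(2018, 6, 1)))  # the earlier value is still available
```

Querying the source system in place can only ever answer "what is the status now?", while the persistent profile can also answer "what was it in June?", which is the kind of question trend analysis and predictive models tend to ask.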

Obviously these arguments are familiar to the people at Salesforce, so you have to ask why they chose to ignore them – or, perhaps better stated, what arguments they felt outweighed them. I have no inside information but suspect the fundamental reason Salesforce has chosen not to support a separate CDP-style repository is that the data movement it requires would be very expensive given the designs of their current products, creating performance issues and costs their clients wouldn't accept. It’s also worth noting that the examples Salesforce gives for using customer data mostly relate to real-time interactions such as phone agents looking up a customer record. In those situations, access to current data is essential: providing it through real-time replication would be extremely costly while reading it directly from source systems is quite simple. So if Salesforce feels real-time interactions are the primary use case for central customer data, it makes sense to take their approach and sacrifice the historical perspective and improved analytics that a separate database can provide.

It’s interesting to contrast Salesforce’s approach with yesterday’s Open Data Initiative announcement from Adobe, Microsoft and SAP. That group has taken exactly the opposite tack, developing a plan to extract data from source systems and load it into an Azure database. This is a relatively new approach for Adobe, which until recently argued – as Salesforce still does – that creating a common ID and accessing data in place was enough. That they tried and abandoned this method suggests that they found it won’t meet customer needs. It could even be cited as evidence that Salesforce will eventually reach the same conclusion. But it’s also worth noting that Adobe’s announcement focused primarily on analytical uses of the unified customer data and their strongest marketing product is Web analytics. Conversely, Salesforce’s heritage is customer interactions in CRM and email. So it may be that each vendor has chosen the approach which best supports its core products.

(An intriguing alternative explanation, offered by TechCrunch, is that Adobe, Microsoft and SAP have created a repository specifically to make it easier for clients to extract their data from Salesforce. I’m not sure I buy this but the same logic would explain why Salesforce has chosen an approach that keeps the core data locked safely within existing Salesforce systems.)

I’ll state the obvious by pointing out that companies need both analytics and interactions. We already know that many CDPs can access data in place, most commonly applied to information such as location or weather which changes constantly and is only relevant when an interaction occurs. So a hybrid approach is already common (though not universal) in the CDP world. Salesforce does say that “Customer 360 creates and stores a customer profile”, so some persistence is already built into the product. We don’t know how much data is kept in that profile and it might only be the identifiers needed for cross-system identity resolution. (That’s what Adobe stored persistently before it changed its approach.) You could view this as the seed of a hybrid solution already planted within Customer 360. But while it can probably be extended to some degree, it’s not the equivalent of a CDP that is designed to store most data centrally.

My guess is that Salesforce will eventually decide, as Adobe has already, that a large central repository is necessary. Customer 360 builds connections that are needed to support such a repository, so it can be viewed as a step in that direction, whether or not that's the intent. Since a complete solution needs both central storage and direct access, we can view the challenge as finding the right balance between the two storage models, not picking one or the other exclusively. Finding a balance isn't as much fun as having a religious war over which is absolutely correct but it's ultimately the best solution for marketers and other users.

And what does all this mean for the independent CDP market? Like Adobe yesterday, Salesforce is describing a product in its early stages – although the Salesforce approach is technically less challenging and closer to delivery. It will appeal primarily to companies that use the three Salesforce B2C systems, which I think is a relatively small subset of the business world. Exactly how non-Salesforce systems are integrated through Mulesoft isn’t yet clear: in particular, I wonder how much identity resolution will be possible.

But I still feel the access-in-place approach solves only a part of the problem addressed by CDPs and  not the most important part at that. We know from research and observation that the most common CDP use cases are analytics, not interactions: although coordinated omnichannel customer experience is everyone's ultimate goal, initial CDP projects usually focus on understanding customers and analyzing their behaviors over time. In particular, artificial intelligence relies heavily on comprehensive customer data sets that need to be assembled in a persistent data store outside of the source systems. Given the central role that AI is expected to play in the future, it’s hard to imagine marketers enthusiastically embracing a Salesforce solution that they recognize won't assemble AI training sets.  They’re more likely to invest in one solution that meets both analytical (including AI) and interaction needs. For the moment, that puts them firmly back into CDP territory (including Datorama, which Salesforce bought in August).

The big question is how long this moment lasts. Salesforce and Adobe/Microsoft/SAP will all get lots of feedback from customers once they deploy their solutions. We can expect them to be fast learners and pragmatic enough to extend their architectures in whatever ways are needed to meet customer requirements. The threat of those vendors deploying truly competitive products has always hung over the CDP industry and is now more menacing than ever.  There may even be some damage before those vendors deploy effective solutions, if they scare off investors and confuse buyers or just cause buyers to defer their decisions.  CDP vendors and industry analysts, who are already struggling to help buyers understand the nuances of CDP features, will have an even harder job to explain the strengths and weaknesses of these new alternatives. But the biggest job belongs to the buyers themselves: they're the ones who will most suffer if they pick products that don't truly meet their needs.

Monday, September 24, 2018

Adobe, Microsoft and SAP Announce Open Data Initiative: It's CDP Turf But No Immediate Threat

One of the more jarring aspects of Adobe’s briefing last week about its Marketo acquisition* was several statements suggesting that Marketo and Adobe’s other products were going to access shared customer data. This would be the Experience Cloud Profile announced in March and based on an open source data model developed jointly with Microsoft and stored on Microsoft Azure.** When I tried to reconcile Adobe’s statements with reality, the best I could come up with was they were saying that Adobe systems and Marketo would push their data into the Experience Cloud Profiles and then synchronize whatever bits they found useful with each application’s data store. That’s not the same as replacing the separate data stores with direct access to the shared Azure files but it is sharing of a sort. Whether even that level of integration is available today is unclear but if we required every software vendor to only describe capabilities that are actually in place, the silence would be deafening.

The reason the shared Microsoft project was on Adobe managers’ minds became clear today when Adobe, Microsoft and SAP announced an “Open Data Initiative” that seemed pretty much the same news as before – open source data models (for customers and other objects) feeding a system hosted on Azure. The only thing that really seemed new was SAP’s involvement. And, as became clear during analyst questions after the announcement at Microsoft’s Ignite conference, this is all in very early stages of planning.

I’ll admit to some pleasure that these firms have finally admitted the need for unified customer data, a topic close to my heart. Their approach – creating a persistent, standardized repository – is very much the one I’ve been advocating under the Customer Data Platform label. I’ll also admit to some initial fear that a solution from these vendors will reduce the need for stand-alone CDP systems. After all, stand-alone CDP vendors exist because enterprise software companies including Microsoft, Adobe and SAP have left a major need unfilled.

But in reviewing the published materials and listening to the vendors, it’s clear that their project is in very early stages. What they said on the analyst call is that engineering teams have just started to work on reconciling their separate data models – which is the heart of the matter. They didn’t put a time frame on the task but I suspect we’re talking more than a year to get anything even remotely complete. Nor, although the vendors indicated this is a high strategic priority, would I be surprised if they eventually fail to produce something workable. That could mean they produce something, but it’s so complicated and exception-riddled that it doesn’t meet the fundamental goal of creating truly standardized data.

Why I think this could happen is that enterprise-level customer data is very complicated.  Each of these vendors has multiple systems with data models that are highly tuned to specific purposes and are still typically customized or supplemented with custom objects during implementation. It’s easy to decide there’s an entity called “customer” but hard to agree on one definition that will apply across all channels and back-office processes. In practice, different systems have different definitions that suit their particular needs.

Reconciling these is the main challenge in any data integration project.  Within a single company, the solution involves detailed, technical discussions among the managers of different systems. Trying to find a general solution that applies across hundreds of enterprises may well be impossible. In practice, you’re likely to end up with data models that support different definitions in different circumstances with some mechanism to specify which definition is being used in each situation. That may be so confusing that it defeats the purpose of having shared data, which is for different people to easily make use of it.

Note that CDPs are deployed at the company level, so they don’t need to solve the multi-company problem.*** This is one reason I suspect the Adobe/Microsoft/SAP project doesn’t pose much of a threat to the current CDP vendors, at least so long as buyers actually look at the details rather than just assuming the big companies have solved the problem because they’ve announced they're working on it.

The other interesting aspect of the joint announcement was its IT- rather than marketing-centric focus. All three of the supporting quotes in the press release came from CIOs, which tells you who the vendors see as partners. Nothing wrong with that: one of the trends I see in the CDP market is a separation between CDPs that focus primarily on data management (and enterprise-wide use cases and IT departments as primary users) and those that incorporate marketing applications (and marketing use cases and marketers as users). As you may recall, we recently changed the CDP Institute definition of CDP from “marketer-controlled” to “packaged software” to reflect the use of customer data across the enterprise. But most growth in the CDP industry is coming from the marketing-oriented systems. The Open Data Initiative may eventually make life harder for the enterprise-oriented CDPs, although I’m sure they would argue it will help by bringing attention to a problem that it doesn’t really solve, opening the way to sales of their products. It’s even less likely to impact sales of the marketing-oriented CDPs, which are bought by marketing departments who want tightly integrated marketing applications.

Another indication of the mindset underlying the Open Data Initiative is this more detailed discussion of their approach, from Adobe’s VP of Platform Engineering. Here the discussion is mostly about making the data available for analysis. The exact quote “to give data scientists the speed and flexibility they need to deliver personalized experiences” will annoy marketers everywhere, who know that data scientists are not responsible for experience design, let alone delivery. Although the same post does mention supporting real-time customer experiences, it’s pretty clear from context that the core data repository is a data lake to be used for analysis, not a database to be accessed directly during real-time interactions. Again, nothing wrong with that and not all CDPs are designed for real-time interactions, either. But many are and the capability is essential for many marketing use cases.

In sum: today’s announcement is important as a sign that enterprise software vendors are (finally) recognizing that their clients need unified customer data. But it’s early days for the initiative, which may not deliver on its promises and may not promise what marketers actually want or need. It will no doubt add more confusion to an already confused customer data management landscape. But smart marketers and IT departments will emerge from the confusion with a sound understanding of their requirements and systems that meet them. So it's clearly a step in the right direction.

*I didn't bother to comment on the Marketo acquisition in detail because, let’s face it, the world didn’t need one more analysis. But now that I’ve had a few days to reflect, I really think it was a bad idea. Not because Marketo is a bad product or it doesn’t fill a big gap in the Adobe product line (B2B marketing automation). It's because filling that gap won’t do Adobe much good. Their creative and Web analysis products already gave them a presence in every marketing department worth considering, so Marketo won’t open many new doors. And without a CRM product to sell against Salesforce, Adobe still won’t be able to position itself as a Salesforce replacement. So all they bought for $4.75 billion was the privilege of selling a marginally profitable product to their existing customers. Still worse, that product is in a highly competitive space where growth has slowed and the old marketing automation approach (complex, segment-based multi-step campaign flows) may soon be obsolete. If Adobe thinks they’ll use Marketo to penetrate small and mid-size accounts, they are ignoring how price-sensitive, quality-insensitive, support-intensive, and change-resistant those buyers are. And if they think they’ll sell a lot of add-on products to Marketo customers, I’d love to know what those would be.

** I wish Microsoft would just buy Adobe already. They’re like a couple that’s been together for years and had kids but refuses to get married.

*** Being packaged software, CDPs let users implement solutions via configuration rather than custom development. This is why they’re more efficient than custom-built data warehouses or data lakes for company-level projects.

Thursday, September 06, 2018

Customer Data Platforms vs Master Data Management: How They Differ

My wanderings through the Customer Data Platform landscape have increasingly led towards the adjacent realm of Master Data Management (MDM). Many people are starting to ask whether they’re really the same thing or could at least be used for some of the same purposes.

Master Data Management can be loosely defined as maintaining and distributing consistent information about core business entities such as people, products, and locations. (Start here if you’d like to explore more formal definitions.) Since customers are one of the most important core entities, it clearly overlaps with CDP.

Specifically, MDM and CDP both require identity resolution (linking all identifiers that apply to a particular individual), which enables CDPs to bring together customer data into a comprehensive unified profile. In fact some CDPs rely on MDM products to perform this function.
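For readers who haven't seen identity resolution up close, here is a minimal Python sketch of the linking step. It assumes the matching work has already produced pairs of identifiers known to belong to the same person, and it groups them with a standard union-find structure. The function name `resolve_identities` and the identifier formats are hypothetical; real CDP and MDM products use their own, usually far more elaborate, methods.

```python
from collections import defaultdict


def resolve_identities(links):
    """Group identifiers that are connected by any chain of links.

    `links` is an iterable of (identifier_a, identifier_b) pairs, each
    asserting that the two identifiers belong to the same person.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(set)
    for identifier in list(parent):
        groups[find(identifier)].add(identifier)
    return list(groups.values())


# Hypothetical identifiers from three systems.
links = [
    ("email:ann@example.com", "crm:1001"),
    ("crm:1001", "cookie:abc"),
    ("email:bob@example.com", "crm:2002"),
]
profiles = resolve_identities(links)
print(profiles)
```

The email address and the cookie end up in the same group even though no single system ever saw both, because each was linked to the same CRM ID. That transitive linking is what makes a unified profile possible.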

MDM and (some) CDP systems also create a “golden record” containing the business’s best guess at customer attributes such as name and address. That’s the “master” part of MDM.  It often requires choosing between conflicting information captured by different systems or by the same system at different times. CDP and MDM both share that golden record with other systems to ensure consistency.
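As an illustration of what "choosing between conflicting information" can involve, here is a minimal Python sketch of one common survivorship rule: for each attribute, keep the most recent non-empty value. The rule, the function name `build_golden_record`, and the record layout are my assumptions for illustration only; actual products typically offer many rules, such as trusting one source over another.

```python
from datetime import date


def build_golden_record(observations):
    """Pick one value per attribute: the most recent non-empty observation.

    `observations` is a list of dicts with hypothetical keys
    'source', 'as_of' (a date) and 'attributes' (a dict).
    """
    golden = {}
    # Process oldest first, so later observations overwrite earlier ones.
    for obs in sorted(observations, key=lambda o: o["as_of"]):
        for name, value in obs["attributes"].items():
            if value:  # skip blanks so they never erase a known value
                golden[name] = value
    return golden


observations = [
    {"source": "crm", "as_of": date(2018, 2, 1),
     "attributes": {"name": "Ann Lee", "city": "Boston", "phone": "555-0100"}},
    {"source": "ecommerce", "as_of": date(2018, 8, 15),
     "attributes": {"name": "Ann Lee", "city": "Chicago", "phone": ""}},
]

golden = build_golden_record(observations)
print(golden)
```

Here the newer city replaces the older one, but the blank phone number in the newer record does not wipe out the phone number captured earlier.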

So how do CDP and MDM differ? The obvious answer is that CDP manages a lot more than just master data: it captures all the details of transactions and behaviors that are not themselves customer attributes. But many MDM products are components of larger data integration suites from IBM, SAP, Oracle, SAS, Informatica, Talend and others. These also manage more than the identifying attributes of core data objects. You could argue that this is a bit of a bait-and-switch: the CDP-like features in these suites are not part of their MDM products. But it does mean that the vendors may be able to meet CDP data requirements, even if you need more than their MDM module to do it.

Another likely differentiator is that MDM systems run on SQL databases and work with structured data. This is the best way to manage standardized entity attributes.  By contrast, CDPs work with structured, semi-structured and unstructured data which requires a NoSQL file system like Hadoop. But, again, the larger integration suites often support semi-structured and unstructured data and NoSQL databases.  So the boundary remains blurry.

On the practical level, MDMs are primarily tools that IT departments buy to improve data quality and consistency.  Business user interfaces are typically limited to specialized data governance and workflow functions. CDPs are designed to be managed by business users although deploying them does take some technical work. Marketing departments are the main CDP buyers and users while MDM is clearly owned by IT.

One CDP vendor recently told me the main distinction they saw was that MDM takes a very rigid approach to identity data, creating a master ID that all connected systems are required to use as the primary customer ID. He contrasted this with the CDP approach that lets each application work with its own IDs and only unifies the data within the CDP itself.  He also argued that some CDPs (including his, of course) let users apply different matching rules for different purposes, applying more stringent matches in some cases and looser matches in others. I’m not sure that all MDM systems are really this rigid.  But it’s something to explore if you’re assessing how an MDM might work in your environment.
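The idea of different matching rules for different purposes can be sketched in a few lines of Python. Everything here is hypothetical: the rule definitions, the purposes "billing" and "ad_targeting", and the field names are my own illustration, not the vendor's implementation.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def strict_match(a, b):
    """Stringent rule: exact match on a non-empty, normalized email."""
    return bool(a["email"]) and normalize(a["email"]) == normalize(b["email"])


def loose_match(a, b):
    """Looser rule: email match, or same normalized name and postal code."""
    same_name_and_zip = (
        normalize(a["name"]) == normalize(b["name"])
        and a["postal_code"] == b["postal_code"]
    )
    return strict_match(a, b) or same_name_and_zip


# Hypothetical mapping from business purpose to matching rule.
MATCH_RULES = {
    "billing": strict_match,       # a false match here is costly
    "ad_targeting": loose_match,   # a false match here is cheap
}


def is_same_customer(a, b, purpose):
    return MATCH_RULES[purpose](a, b)


web = {"name": "Ann  Lee", "email": "", "postal_code": "02139"}
crm = {"name": "ann lee", "email": "ann@example.com", "postal_code": "02139"}

print(is_same_customer(web, crm, "billing"))
print(is_same_customer(web, crm, "ad_targeting"))
```

The same pair of records is treated as two people for billing and as one person for ad targeting. A system that imposes a single master ID has to pick one answer for every use.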

Going back to practical differences, most CDPs have standard connectors for common marketing data sources, analysis tools, and execution systems. Those connectors are tuned to ingest complete data streams, not the handful of entity attributes needed for master data management.  There are certainly exceptions to this among CDPs: indeed, CDPs that focus on analytics and personalization are frequently used in combination with other CDPs that specialize in data collection. MDM vendors are less marketing-centric than CDPs so you’ll find fewer marketing-specific connectors and data models. Similarly, most MDMs are not designed to store, expose, reformat, or deliver complete data sets. But, again, MDMs are often part of larger integration suites that do offer these capabilities.

So, where does this leave a weary explorer of the CDP jungle? On one hand, MDM in itself is very different from CDP: it provides identity resolution and shares standard (“golden”) customer attributes, but doesn’t ingest, store or expose full details for all data types.  On the other hand, many MDM products are part of larger suites that do have these capabilities.

The real differentiator is focus: CDPs are built exclusively for customer data, are packaged software built for business users (mostly marketers), and have standard connectors for customer-related systems. MDM is a general-purpose tool designed as a component in systems built and run by IT departments.

Those differences won’t necessarily show up on paper, where both types of systems will check the boxes on most capabilities lists. But they’ll be clear enough as you work through the details of use cases and deployment plans.

As any good explorer will tell you, there’s no substitute for seeing the ground on foot.

Monday, September 03, 2018

Third Party Data Is Not Dead Yet

Third party data is not dead yet.

It was supposed to be. The culprit was to be the EU’s General Data Protection Regulation, which would cut off the flow of personal data to third party brokers and, even more devastatingly, prevent marketers from buying third party data for fear it wasn’t legitimately sourced. 

The expectations are real.  A recent Sizmek study found that 77% of marketers predicted data regulations such as GDPR would make targeting audiences with third party data increasingly difficult.  In a Demandbase study, 60% of respondents said that GDPR was forcing a change in their global privacy approach.  And 44% of marketers told Trusted Media Brands  that they expected GDPR would lead to more use of first party data vs. cookies.

Marketers say they're acting on these concerns by cutting back on use of third party data. Duke Fuqua’s most recent CMO Survey found that use of online (first party) customer data has grown at 63% of companies in the past two years while just 31% expanded use of third party data. Seventy percent expected to further grow first party data in the next two years compared with just 31% for third party data. A Dentsu Aegis survey had similar results: 57% of CMOs were expanding use of existing (first party) data compared with 37% expanding use of purchased data.

The irony is that reports of GDPR impact seem to have been greatly exaggerated. A Reuters Institute study found 22% fewer third party cookies on European news sites after GDPR deployment, a significant drop which nevertheless means that 78% remain.  Meanwhile, Quantcast reported that clients using its consent manager achieved a consent rate above 90%.  In other words, third party data is still flowing freely even in Europe even if the volume is down a little. The flow is even freer in the U.S., where developments like the new California privacy regulation will almost surely be watered down before taking effect, if not blocked entirely by Federal pre-emption.

Of course, what regulation can’t achieve, self-interest could still make happen. There’s at least some debate (stoked by interested parties) over whether targeting ads with third party data is really more effective than contextual targeting, which is the latest jargon for putting ads on Web pages related to the product. Online ad agency Roast and ad platform Teads did an exhaustive study that concluded contextual targeting and demographic targeting with third party data worked about equally well. The previously-mentioned Sizmek study found that 87% of marketers plan to increase their contextual targeting in the next year and 85% say brand safety is a high or critical priority. (Ads appearing on brand-unsafe Web pages is a problem when ads are targeted at individuals, a primary use for third party data.)  The Trusted Media Brands study also listed brand safety as a major concern about digital media buying (ranked third and cited by 58%) although, tellingly, ROI and viewability were higher (first and second at 62% and 59%, respectively).

But third party data isn’t going away.

It’s become increasingly central for business marketers as Account Based Marketing puts a premium on understanding potential buyers whether or not they're already in the company’s own database.  Third party data also includes intent information based on behaviors beyond the company’s own Web site. Indeed companies including Lattice Engines, Radius, 6Sense and Demandbase have all shifted much of their positioning away from predictive modeling or ad targeting based on internal data and towards the value of the data they bring.

Then again, business marketing always relied heavily on third party data. What's arguably more surprising is that consumer marketers also seem to be using it more. Remember that the CMO surveys cited earlier showed expectations for slower growth, not actual declines. There's more evidence in the steady stream of vendor announcements touting third party data applications.

Many of these announcements are from established vendors selling established applications, such as ad targeting and marketing performance measurements. For targeting, see recent announcements from TruSignal, Thunder, and AdTheorent; for attribution, see news from Viant and  IRI.

But what's most interesting are the newer applications. These go beyond lists of target customers or comparing anonymized online and offline data. They provide something that only third party data can do at scale: connect online and offline identities. This is something that companies like LiveRamp and Neustar have done for years.  But we're now seeing many interesting new players:

Bridg helps retailers to identify previously anonymous in-store customers, based on probabilistic matching against their proprietary consumer database.  It then executes tailored online marketing campaigns.

SheerID verifies the identities of online visitors, enabling marketers to safely limit offers to members of specific groups such as teachers, students, or military veterans. They do this by building connections to reference databases holding identity details.

PebblePost links previously anonymous Web visitors to postal addresses, using yet another proprietary database to make the connections. They use this to target direct mail based on Web behaviors.

You’ll have noticed that the common denominator here is a unique consumer database.  These do something not available from other third party sources or not available with the same coverage.  Products like these will keep marketers coming back for third party data whether or not privacy regulations make Web-based data gathering more difficult.  So don't cry for third party data: the truth is it never has left you.