Saturday, August 18, 2018

CDP Myths vs Realities

A few weeks ago, I critiqued several articles that attacked “myths” about Customer Data Platforms. But, on reflection, those authors had it right: it’s important to address misunderstandings that have grown as the category gains exposure. So here's my own list of CDP myths and realities. 

Myth: CDPs are all the same.
Reality: CDPs vary widely. In fact, most observers recognize this variation and quite a few consider it a failing. So perhaps the real myth is that CDPs should be the same. It’s true that the variation causes confusion and means buyers must work hard to ensure they purchase a system that fits their needs. But buyers need to match systems to their needs in every category, including those where features are mostly similar.

Myth: CDPs have no shared features.
Reality: This is the opposite of the previous myth but grows from the same underlying complaint about CDP variation. It’s also false: CDPs all do share core characteristics. They’re packaged software; they ingest and retain detailed data from all sources; they combine this data into a complete view of each customer; they update this view over time; and they expose the view to other systems. This list excludes many products from the CDP category that share some but not all of these features. But it doesn’t exclude products that share all these features and add some other ones. These additional features, such as segmentation, data analysis, predictive models, and message selection, account for most of the variation among CDP systems. Complaining that these mean CDPs are not a coherent category is like complaining that automobiles are not a category because they have different engine types, body styles, driving performance, and seating capacities. Those differences make them suitable for different purposes but they still share the same core features that distinguish a car from a truck, tractor, or airplane.

Myth: CDP is a new technology.
Reality: CDPs use modern technologies, such as NoSQL databases and API connectors. But so do other systems. What’s different about CDP is that it combines those technologies in prebuilt systems, rather than requiring technical experts to assemble them from scratch. Having packaged software to build a unified, sharable customer database is precisely the change that led to naming CDP as a distinct category in 2013.

Myth: CDPs don’t need IT support.
Reality: They sure do, but not as much. At a minimum, CDPs need corporate IT to provide access to corporate systems to acquire data and to read the CDP database. In practice, corporate IT is also often involved in managing the CDP itself. (This recent Relevancy Group study put corporate IT participation at 49%.)   But the packaged nature of CDPs means they take less technical effort to maintain than custom systems and many CDPs provide interfaces that empower business users to do more for themselves. Some CDP vendors have set their goal as complete business user self-service but I haven’t seen anyone deliver on this and suspect they never will.

Myth: CDPs are for marketing only.
Reality: It’s clear that departments outside of marketing can benefit from unified customer data and there’s nothing inherent in CDP technology that limits them to marketing applications. But it’s also true that most CDPs so far have been purchased by marketers and have been connected primarily to marketing systems. The optional features mentioned previously – segmentation, analytics, message selection, etc. – are often marketing-specific. But CDPs with those features must still be able to share their data outside of marketing or they wouldn’t be CDPs.

Myth: CDPs manage only first party, identified data.
Reality: First party, identified data is the primary type of information stored in a CDP and it’s something that other systems (notably Data Management Platforms) often handle poorly or not at all. But nothing prevents a CDP from storing third party and/or anonymous data, and some CDPs certainly do.  Indeed, CDPs commonly store anonymous first party data, such as Web site visitor profiles, which will later be converted into identified data when a customer reveals herself. The kernel of truth inside this myth is that few companies would use a CDP to store anonymous, third party data by itself.

Myth: Identity resolution is a core CDP capability.
Reality: Many CDP systems provide built-in identity resolution (i.e., ability to link different identifiers that relate to the same person). But many others do not. This is by far the most counter-intuitive CDP reality, since it seems obvious that a system which builds unified customer profiles should be able to connect data from different sources. But quite a few CDP buyers don’t need this feature, either because they get data from a single source system (e.g., ecommerce or publishing), because their company has existing systems to assemble identities (common in financial services), or because they rely on external matching systems (frequent in retail and business marketing). What nearly all CDPs do have is the ability to retain links over time, so unified profiles can be stitched together as new identifiers are connected to each customer’s master ID. One way to think about this is: the function of identity resolution is essential for building a unified customer database, but the feature may be part of a CDP or something else.
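To make the distinction between matching and retaining links concrete, here is a minimal Python sketch of the "retain links over time" part only. It assumes the decision that two identifiers belong to the same person has already been made somewhere else (inside the CDP or in an external matching system). The class name IdentityGraph, its methods, and the identifier strings are hypothetical illustrations, not any vendor's API.

```python
class IdentityGraph:
    """Toy store of identity links: every identifier resolves to one master ID.

    Hypothetical illustration; uses a union-find structure so that links
    recorded at different times accumulate into a single profile key.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, identifier):
        # An identifier seen for the first time starts as its own master.
        self._parent.setdefault(identifier, identifier)
        root = identifier
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited identifier at the root.
        while self._parent[identifier] != root:
            next_id = self._parent[identifier]
            self._parent[identifier] = root
            identifier = next_id
        return root

    def link(self, id_a, id_b):
        """Record that two identifiers belong to the same person."""
        root_a, root_b = self._find(id_a), self._find(id_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def master_id(self, identifier):
        """Return the master ID that the identifier currently resolves to."""
        return self._find(identifier)


graph = IdentityGraph()
# An anonymous Web visitor later identifies herself with an email address.
graph.link("cookie:abc", "email:pat@example.com")
# Weeks later, a loyalty number is matched to the same email.
graph.link("email:pat@example.com", "loyalty:778")
print(graph.master_id("loyalty:778") == graph.master_id("cookie:abc"))
```

The point of the sketch is that the store never decides who matches whom; it only remembers the links it is given, which is the capability nearly all CDPs share.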

Myth: CDPs are not needed if there’s an Enterprise Data Warehouse.
Reality: It’s a reasonable simplification to describe a CDP as packaged software that builds a customer-centric Data Warehouse. But a Data Warehouse is almost always limited to highly structured data stored in a relational database.  CDPs typically include large amounts of semi-structured and unstructured data in a NoSQL data store. Relational technology means changing a Data Warehouse is usually a complex, time-consuming project requiring advanced technical skill. Pushing data into a CDP is much easier, although some additional work may later be required to make it usable. Even companies with an existing Data Warehouse often find a CDP offers new capabilities, flexibility, and lower operating costs that make it a worthwhile investment.

Myth: CDPs replace application data stores.
Reality: Mea culpa: I’ve often explained CDPs by showing separate silo databases replaced by a single shared CDP.  But that’s an oversimplification to get across the concept. There are a handful of situations where a delivery system will read CDP data directly, such as injecting CDP-selected messages into a Web page or exposing customer profile details to a call center agent. But in most cases the CDP will synchronize its data with the delivery system’s existing database. This is inevitable: the delivery systems are tightly integrated products with databases optimized for their purpose. The value of the CDP comes from feeding better data into the delivery system database, not from replacing it altogether.
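A rough Python sketch of that synchronization pattern follows. It assumes both stores can be represented as dictionaries keyed by customer ID and that the delivery system only wants a few selected fields; the function name sync_profiles, the field names, and the sample records are all hypothetical.

```python
_MISSING = object()  # sentinel so a stored None is not mistaken for "absent"


def sync_profiles(cdp_profiles, delivery_store, fields):
    """Copy selected CDP fields into a delivery system's own store.

    Fields the delivery system manages itself are left untouched.
    Returns the number of delivery records that were changed.
    """
    changed = 0
    for customer_id, profile in cdp_profiles.items():
        subset = {name: profile[name] for name in fields if name in profile}
        record = delivery_store.setdefault(customer_id, {})
        if any(record.get(key, _MISSING) != value for key, value in subset.items()):
            record.update(subset)
            changed += 1
    return changed


# Hypothetical data: the CDP holds far more detail than the email system needs.
cdp = {"c1": {"email": "pat@example.com", "lifetime_value": 120, "web_events": [1, 2]}}
email_system = {"c1": {"email": "old@example.com", "opt_in": True}}

print(sync_profiles(cdp, email_system, ["email", "lifetime_value"]))
print(email_system["c1"])
```

The delivery system keeps reading its own database; the CDP's contribution is the better data pushed into it.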


Myth: CDP value depends on connecting all systems.
Reality: CDPs can deliver great value if they connect just some systems, or sometimes even if they only expose data from a single system that was otherwise inaccessible.  This matters because connecting all of a company's systems can be a huge project or even impossible if some systems are not built to integrate with others.  This shouldn't be used as an argument against CDP deployment so long as a less comprehensive implementation will still provide real value.

Myth: The purpose of CDP is to coordinate customer experience across all channels.
Reality: That's one goal and perhaps the ultimate one. But there are many other, simpler applications a CDP makes possible, such as better analytics and more accurate data shared with delivery systems. In practice, most CDP users will start with these simpler applications and add the more demanding ones over time.

Myth: The CDP is a silver bullet that solves all customer data problems.
Reality: There are plenty of problems beyond the CDP's control, such as the quality of input data and limits on execution systems.  Moreover, the CDP is just a technology and many obstacles are organizational and procedural, such as cooperation between departments, staff skills, regulatory constraints, and reward systems.  What a CDP will do is expose some obstacles that were formerly hidden by the technical difficulty of attempting the tasks they obstruct.  Identifying the problems isn't a solution but it's a first step towards finding one.

Of course, everyone knows there are no silver bullets but there's always that tiny spark of hope that one will appear.  I hesitate to quench that spark because it's one of the reasons people try new things, CDPs included.  But I think the idea of CDPs is now well enough established for marketers to absorb a more nuanced view of how they work without losing sight of their fundamental value.  Gradual deflation of expectations is preferable to a sudden collapse.  Let's hope a more realistic understanding of CDPs will ultimately lead to better results for everyone involved.

Thursday, August 02, 2018

Arm Ltd. Buys Treasure Data CDP

Customer Data Platform vendor Treasure Data today confirmed earlier reports that it is being purchased by Arm Limited, which licenses semiconductor technologies and is itself a subsidiary of the giant tech holding company SoftBank. The price was not announced but was said to be around $600 million.

The deal was the second big purchase of a Customer Data Platform vendor in a month, following Salesforce’s Datorama acquisition. Arm seems a less likely CDP buyer than Salesforce but made clear their goal is to use Treasure Data to manage Internet of Things data. That’s an excellent fit for Treasure Data’s technology, which is very good at handling large volumes of semi-structured data. Treasure Data will operate as a separate business under its existing management and will continue to sell its product to marketers as a conventional Customer Data Platform.

While Arm is an unexpected CDP buyer, the deal does illustrate some larger trends in the CDP market. One is the broadening of CDP beyond pure marketing use cases: as critics have noted, unified customer data has applications throughout an organization so it doesn’t make sense to limit CDP to marketing users. In fact, the time has probably come to remove “marketer-managed” from the formal definition of CDP.  But that’s a topic for another blog post.

A complementary trend is use of CDP technology for non-customer data. Internet of Things is obviously of growing importance and, although you might argue that IoT data is really just another type of customer data, there’s a reasonable case that the sheer volume and complexity of IoT data rightly justifies considering it a category of its own. More broadly, there are other kinds of data, such as product and location information, which also should be considered in their own terms.

What’s really going on here is that one category of CDPs – the systems that focus primarily on data management, as opposed to marketing applications – is merging with general enterprise data management systems. These are companies like Qubole and Trifacta that often use AI to simplify the process of assembling enterprise data.  These systems do for all sorts of information what a CDP does for customer information. This is a new source of competition for CDPs, especially as corporate IT departments get more involved. There are also a handful of CDP systems, including ActionIQ, Aginity, Amperity, and Reltio, that have the potential to expand beyond customer information. It’s possible that those vendors will eventually exit the CDP category altogether, leaving the field to CDPs that provide marketing-specific functions for analysis and customer engagement. (If that happens, then “marketer-managed” should stay in the definition.)

In any case, the Treasure Data acquisition is another milestone in the evolution of the CDP industry, illustrating that at least some of the systems have unique technology that is worth buying at a premium. I can imagine some of the other data-oriented vendors being purchased for similar reasons. I can also imagine acquisition of companies like Segment and Tealium that have particularly strong collections of connectors to source and target systems. That’s another type of asset that’s hard to replicate.

So we'll see how the industry evolves.  Don't be surprised if it follows several paths simultaneously: some buyers may take an enterprise-wide approach while others limit CDP use to marketing. What I don't yet see is any type of consolidation around a handful of winners who gobble up most of the market share.  That might still happen but, for now, the industry will remain vibrant and varied, as different vendors try different configurations to see which most marketers find appealing.

Wednesday, August 01, 2018

Salesforce Buys Datorama Customer Data Platform: It's Complicated

News that Salesforce had purchased Datorama crossed the wire just as I was starting on two weeks of travel, so I haven’t been able to comment until now. This purchase was noteworthy as the first big CDP acquisition by a marketing cloud vendor. That the buyer was Salesforce was even more intriguing, given that they had purchased MuleSoft in March for $6.5 billion and that Marketing Cloud CEO Bob Stutz (who announced the Datorama deal) had called CDPs “a passing fad” and said Salesforce already had “all the pieces of a CDP” in an interview in June.

The Salesforce announcement didn’t refer to Datorama as a CDP and Datorama itself doesn’t use the term either. They do meet the requirements – packaged software building a unified, persistent customer database that’s open to other systems – but are definitely an outlier. In particular, Datorama ingests all types of marketing-related data, notably including ad campaign- and segment-level performance information as well as customer-level detail. Their stated positioning as “one centralized platform for all your marketing data and decision making” sure sounds like a CDP, but their focus has been on marketing performance, analytics, and data visualization. Before the acquisition, they told me some of their clients ingest customer-level detail but most do not. So it would appear that while Salesforce’s acquisition reflects recognition of the need for a persistent, unified marketing database (something they didn’t get with MuleSoft), they didn’t buy Datorama as a way to build a Single Customer View.

Datorama’s closest competitors are marketing analysis tools like Origami Logic and Beckon. I’ve never considered either of those CDPs because they clearly do not work with customer-level detail. Datorama competes to a lesser extent with generic business intelligence systems like Looker, Domo, Tableau, and Qlik. These traditionally have limited data integration capabilities although both Qlik and Tableau have recently purchased database building products (Podium Data and Empirical Systems, respectively), suggesting a mini-trend of its own. It’s worth noting that one of Datorama’s particular strengths is use of AI to simplify integration of new data sources. The firm’s more recent announcements have touted use of AI to find opportunities for new marketing programs.

Datorama is much larger than most other CDP vendors: it ranked third (behind Tealium and IgnitionOne) in the CDP Institute’s most recent industry report, based on number of employees found in LinkedIn. The company doesn’t release revenue figures but, assuming the 360 employees currently shown on LinkedIn generate $150,000 each, it would have a run rate of $54 million. (This is a crude guess: the actual figure could easily be anywhere from $30 million to $80 million.) Sticking with the $54 million figure, the $800 million purchase price is 15x revenue, which is about what such companies cost. (MuleSoft went for 22x revenue.) The company reports 3,000 clients, which again is a lot for a CDP but gives an average of under $20,000 per client. That’s very low for an enterprise CDP. It reflects the fact that most of Datorama’s clients use it to analyze aggregated marketing data, not to manage customer-level details.
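For anyone who wants to rerun that back-of-envelope estimate with different assumptions, here is a small Python sketch. The inputs are the figures quoted above; the $150,000 revenue per employee is a guess rather than a reported number, and the function name estimate_deal is purely illustrative.

```python
def estimate_deal(employees, revenue_per_employee, purchase_price, clients):
    """Back-of-envelope deal metrics from headcount and an assumed revenue per head."""
    run_rate = employees * revenue_per_employee
    return {
        "run_rate": run_rate,
        "price_multiple": purchase_price / run_rate,
        "revenue_per_client": run_rate / clients,
    }


# Figures quoted in the text; revenue per employee is an assumption.
base = estimate_deal(
    employees=360,
    revenue_per_employee=150_000,
    purchase_price=800_000_000,
    clients=3_000,
)
print(f"run rate: ${base['run_rate']:,}")
print(f"price multiple: {base['price_multiple']:.1f}x")
print(f"revenue per client: ${base['revenue_per_client']:,.0f}")

# Sensitivity check across the plausible revenue range mentioned in the text.
for run_rate in (30_000_000, 80_000_000):
    print(f"at ${run_rate:,} revenue: {800_000_000 / run_rate:.1f}x")
```

At the $54 million estimate the multiple works out to a little under 15x and the average revenue per client to $18,000, so the conclusions in the text hold, though the multiple swings widely across the $30 million to $80 million range.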

Seeing Datorama as more of a marketing analysis system than CDP makes it a little easier to understand why Salesforce continues to work with other CDP vendors. The Datorama announcement was followed a week later by news that Salesforce Ventures had led a $23.8 million investment in the SessionM CDP, which had announced an expanded Salesforce integration just one month earlier. SessionM builds its own database but its main strength is real-time personalization and loyalty. Salesforce in June also introduced Marketing Cloud Interaction Studio, a licensed version of the Thunderhead journey orchestration engine. Thunderhead also builds its own database and I consider it a CDP although they avoid the term, reflecting their primary focus on journey mapping and orchestration. The Salesforce announcement states explicitly that the Interaction Studio will shuffle customers between campaigns defined in the Marketing Cloud’s own Journey Builder, clarifying what any astute observer already knew: that Journey Builder is really about campaign flows, not true journey management.

So, how do all these pieces fit with each other and the rest of Salesforce Marketing Cloud? It’s possible that Salesforce will let Datorama, SessionM, and Interaction Studio independently build their own isolated databases but the disadvantages of that are obvious. It’s more likely that Salesforce will continue to argue that ExactTarget should be the central customer database, something that’s been their position so far even though every ExactTarget user I’ve ever spoken with has said it doesn’t work. The best possible outcome might be for Salesforce to use Datorama as its true CDP when a client wants one, and have it feed data into SessionM, Interaction Studio, ExactTarget, and other Marketing Cloud components as needed.  We'll see if that happens: it could evolve that way even if Salesforce doesn't intend it at the start.

Looking at this from another perspective: the combination of Datorama, SessionM, and Interaction Studio (Thunderhead) almost exactly fills every box in my standard diagram of CDP functions, which distinguishes the core data processing capabilities (ingest, process, expose) from optional analytics and engagement features. Other Marketing Cloud components provide the Delivery capabilities that sit outside of the CDP, either directly (email and DMP) or through integrations. The glaring gap is identity linkage, which Datorama didn't do the last time I checked. But that's actually missing in many CDPs and often provided by third party systems. Still, you shouldn't be too surprised to see Salesforce make another acquisition to plug that hole. If you're wondering where MuleSoft fits, it may play a role in some of the data aggregation, indexing, reformatting, and exposing steps; I'm not clear how much of that is available in Datorama. But MuleSoft also has functions outside of this structure.

In short, it's quite true that Salesforce has all the components of a CDP, especially if you include Datorama in the mix.


The idea of stringing these systems together raises a general point that extends beyond Salesforce.  The reality is that almost every marketing system must import data into its own database, rather than connecting to a shared central data store. I’ll admit I’ve often drawn the picture as if there would be a direct connection between the CDP database and other applications.  This should never have been taken literally. There are indeed some situations where the CDP data is read directly, such as real time access to data about a single customer. But even those configurations usually require the CDP data to be indexed or extracted into a secondary format: absent special technology, you don’t do that sort of query directly against the primary “big data” store used by most CDPs.

Outside of those exceptions, a subset of CDP data will usually be loaded into the primary data store of the customer-facing applications (email, DMP, Web personalization, etc.). Realistically, those data stores are optimized for their own application and the applications read them directly.  There’s no practical way the applications can work without them.

This is a nuance that was rightly avoided in the early days of CDP as we struggled to explain the concept. But I think CDP is now well enough understood that we can safely add some details to the picture to make it more realistic and avoid creating false expectations. I'll try to do that in the future.