One of the unwritten laws of punditry is that one event is random, two events are interesting, and three events make a trend. By that measure, the purchases of data analytics vendors Looker by Google, Tableau by Salesforce, and Origami Logic by Intuit within a three-week span must signify something. Is it that martech suites must now include business intelligence software?
I think not. Even though the acquired products were fairly similar, each of these deals had a different motivation. Conveniently, the buyers all stated their purposes quite clearly in their announcements.
- Intuit bought Origami Logic to advance its strategy to become an “A.I.-driven expert platform”. Specifically, they see Origami Logic as providing a “strong data architecture” that will “accelerate Intuit’s ability to organize, understand, and use data to deliver personalized insights that help customers quickly achieve success and build confidence whenever they use Intuit products.” Reading between the lines, Intuit recognized its existing data architecture can’t support the kinds of analysis needed to generate AI-based recommendations for its clients, and bought Origami Logic to close that technology gap. In other words, Origami Logic will be the foundation of new Intuit products.
- Google bought Looker “to provide customers with a more comprehensive analytics solution — from ingesting and integrating data to gain insights, to embedded analytics and visualizations — enabling enterprises to leverage the power of analytics, machine learning and AI.” That’s somewhat similar to Intuit’s purpose, but Looker is building applications on top of Google Cloud’s existing, extremely powerful data management capabilities rather than providing a new data management foundation. Indeed, Looker already runs on Google Cloud. So Looker is adding another layer of value to Google Cloud, letting it meet more needs of its existing clients.
- Salesforce bought Tableau so it can “play an even greater role in driving digital transformation, enabling companies around the world to tap into data across their entire business and surface deeper insights to make smarter decisions, drive intelligent, connected customer experiences and accelerate innovation”. That’s not exactly pithy, but we’re dealing with Marc Benioff. The key is digital transformation, which lets Salesforce participate in projects beyond its current base in sales and marketing departments. That is, the purpose isn’t to add products for existing customers but to serve entirely new customers. The huge size of Tableau’s customer community – “more than 1 million passionate data enthusiasts” – was clearly a draw for Salesforce. This makes complete sense for Salesforce, which is always straining to maintain its growth rate.
Is there some commonality here? Sure: each of these vendors is striving to offer products based on advanced data management and analytics. Intuit is focused on the data management foundation while Google Cloud and Salesforce are focused more on analytics. All are acknowledging that it’s easier to buy mature technology than to build it from scratch. But of the three buyers, only Salesforce is a martech vendor and their purpose is explicitly to serve customers outside the martech user base. So whatever these deals prove, it’s not that business intelligence is the latest martech must-have.
Tuesday, June 11, 2019
Thursday, February 07, 2019
European CDP Market Is Still Behind the U.S.
I returned earlier this week from a sequence of workshops, speeches, and meetings in Europe, all focused on Customer Data Platforms. Here are some observations:
- The European CDP market is indeed behind the U.S. My own conversations are with people who already care about CDPs, so they're a very skewed sample. But vendors, consultants, agencies, and marketers I spoke with mostly agreed that the larger community is just beginning to hear about the concept. Many are seeking to position themselves as early adopters or experts, sensing a big business opportunity.
- Separate martech staff is rare. Nearly every large and mid-size company that I see in the U.S. today has someone in charge of marketing technology, and often an entire team of marketing technologists reporting to the CMO. I was told this is much less common in Europe and personally didn’t meet anyone with a martech title. Nor did I hear about powerful IT departments taking charge. Rather, it seems that marketers still mostly act on their own, which is how it worked in the U.S. until a few years ago. I did have the impression that European marketers rely more heavily on specialist consultants to help them out, but that might be biased by the fact that many of my meetings were with consultants.
- DMP means something different in Europe. We consistently heard that marketers throughout Europe, and especially in France, were oversold several years ago on Data Management Platforms as a complete solution to handle all customer data needs. This contrasts quite sharply with the U.S., where DMPs have in most cases been understood as limited to serving up digital ad audiences. European DMPs are now recognized as having failed to deliver on the broader promise, which is beyond their technical capabilities. The resulting backlash greatly damaged the image of DMP products and has left marketers looking for a new solution that is truly capable of meeting their needs. Many recognize that CDP could be this solution and are intrigued. But they're also skeptical and worried that they’ll be fooled again. This makes it harder for CDP vendors to sell their products. On the bright side, it also means the problem CDPs address is already well understood.
- CRM also means something different. Back when Bill Clinton was president, CRM was described as a trinity of sales, service, and marketing systems, with marketing much weaker than the other two. It commonly referred to B2C as well as B2B. Later, in the U.S., the term came to be more associated with B2B sales and customer service in general and the Salesforce.com Sales Cloud in particular. In Europe, CRM is used very broadly to mean any and all customer data, extending far beyond sales, service, and marketing, and including both B2B and B2C. On reflection, I may have recently been hearing people in the U.S. apply the term more broadly as well.
- Use cases are everything. We’ve seen a huge demand to present CDP use cases in the U.S. But it seemed even more pressing in Europe, perhaps because understanding of the CDP concept is weaker. One difference seemed to be that Europeans are willing to interact with vendors as a way of learning: while many U.S. buyers actively avoid vendors during the early stages of the purchase process, we heard quite a few requests in Europe to see detailed demonstrations of how individual vendors accomplish specific tasks. Maybe the European salespeople do a better job of being consultative, or maybe European buyers are less determined to find things out on their own. Or maybe it’s just my imagination.
- Immediate ROI is required. We also found a greater focus in Europe on use cases that tie directly to marketing programs, as opposed to the analytical use cases that are most common starting CDP applications in the U.S. The reason seems to be that European buyers are more insistent on finding a specific financial justification for their investment. Many U.S. buyers will accept a broader strategic justification and start with analytical use cases. This may be why European CDP vendors are more likely to offer a full scope of data, analytical, and campaign capabilities, since buying them in a single package makes it easier to tie new marketing programs directly to the CDP investment.
- National markets are distinct. Some of the big U.S. vendors are present throughout Europe, but many local vendors are largely limited to individual markets. We had some sense of this beforehand but the isolation was greater than expected. The French market in particular has its own ecosystem of CDPs and other types of software that have a major domestic position but little presence elsewhere. The Dutch, German, Nordic, and UK markets show more crossover, probably because English is widely spoken in all of them. The greater interest in CDP-based marketing programs may also encourage this, since marketing programs are closely tied to specific local markets.
- GDPR hasn’t caused much change. We had some discussions about using CDP for GDPR compliance, but privacy constraints in general rarely came up. The common attitude was that privacy rules were already tight in the countries we visited (Belgium, Netherlands, Germany and France), so GDPR hadn’t required significant adjustments. There was also some discussion about waiting to see how the rules are actually enforced, which might require further adjustments if the regulators are strict.
Summary
While these differences are interesting, they’re also fairly minor. Overall, the European marketers were feeling the same pressures as their U.S. counterparts to create unified data for better customer experiences. So while each market will have its own quirks and proceed at its own pace, it looks like they’ll follow the same general path as the U.S.
Labels:
big data,
cdp,
CDP Institute,
customer data,
customer data platform,
gdpr,
martech
Saturday, August 18, 2018
CDP Myths vs Realities
A few weeks ago, I critiqued several articles that attacked “myths” about Customer Data Platforms. But, on reflection, those authors had it right: it’s important to address misunderstandings that have grown as the category gains exposure. So here's my own list of CDP myths and realities.
Myth: CDPs are all the same.
Reality: CDPs vary widely. In fact, most observers recognize this variation and quite a few consider it a failing. So perhaps the real myth is that CDPs should be the same. It’s true that the variation causes confusion and means buyers must work hard to ensure they purchase a system that fits their needs. But buyers need to match systems to their needs in every category, including those where features are mostly similar.
Myth: CDPs have no shared features.
Reality: This is the opposite of the previous myth but grows from the same underlying complaint about CDP variation. It’s also false: CDPs all do share core characteristics. They’re packaged software; they ingest and retain detailed data from all sources; they combine this data into a complete view of each customer; they update this view over time; and they expose the view to other systems. This list excludes many products from the CDP category that share some but not all of these features. But it doesn’t exclude products that share all these features and add some other ones. These additional features, such as segmentation, data analysis, predictive models, and message selection, account for most of the variation among CDP systems. Complaining that these mean CDPs are not a coherent category is like complaining that automobiles are not a category because they have different engine types, body styles, driving performance, and seating capacities. Those differences make them suitable for different purposes but they still share the same core features that distinguish a car from a truck, tractor, or airplane.
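The shared core listed above – ingest and retain detailed data from all sources, combine it into a unified view per customer, keep that view current, and expose it to other systems – can be sketched in a few lines. This is a purely hypothetical toy, not any vendor's API:

```python
from collections import defaultdict

class MiniCDP:
    """Toy sketch of the core CDP loop: ingest, unify, expose."""
    def __init__(self):
        # One profile per customer: raw event history plus folded-in traits.
        self.profiles = defaultdict(lambda: {"events": [], "traits": {}})

    def ingest(self, customer_id, source, event):
        # Retain the detailed event from any source.
        profile = self.profiles[customer_id]
        profile["events"].append({"source": source, **event})
        # Update the unified view with any known attributes.
        profile["traits"].update(event.get("traits", {}))

    def get_profile(self, customer_id):
        # Expose the unified view to other systems.
        return self.profiles.get(customer_id)

cdp = MiniCDP()
cdp.ingest("c1", "web", {"type": "page_view", "traits": {"city": "Boston"}})
cdp.ingest("c1", "email", {"type": "open", "traits": {"email": "a@b.com"}})
profile = cdp.get_profile("c1")
```

The optional features that vary between vendors (segmentation, predictions, message selection) would all sit on top of this same unified store.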
Myth: CDP is a new technology.
Reality: CDPs use modern technologies, such as NoSQL databases and API connectors. But so do other systems. What’s different about CDP is that it combines those technologies in prebuilt systems, rather than requiring technical experts to assemble them from scratch. Having packaged software to build a unified, sharable customer database is precisely the change that led to naming CDP as a distinct category in 2013.
Myth: CDPs don’t need IT support.
Reality: They sure do, but not as much. At a minimum, CDPs need corporate IT to provide access to corporate systems to acquire data and to read the CDP database. In practice, corporate IT is also often involved in managing the CDP itself. (This recent Relevancy Group study put corporate IT participation at 49%.) But the packaged nature of CDPs means they take less technical effort to maintain than custom systems and many CDPs provide interfaces that empower business users to do more for themselves. Some CDP vendors have set their goal as complete business user self-service but I haven’t seen anyone deliver on this and suspect they never will.
Myth: CDPs are for marketing only.
Reality: It’s clear that departments outside of marketing can benefit from unified customer data and there’s nothing inherent in CDP technology that limits them to marketing applications. But it’s also true that most CDPs so far have been purchased by marketers and have been connected primarily to marketing systems. The optional features mentioned previously – segmentation, analytics, message selection, etc. – are often marketing-specific. But CDPs with those features must still be able to share their data outside of marketing or they wouldn’t be CDPs.
Myth: CDPs manage only first party, identified data.
Reality: First party, identified data is the primary type of information stored in a CDP and it’s something that other systems (notably Data Management Platforms) often handle poorly or not at all. But nothing prevents a CDP from storing third party and/or anonymous data, and some CDPs certainly do. Indeed, CDPs commonly store anonymous first party data, such as Web site visitor profiles, which will later be converted into identified data when a customer reveals herself. The kernel of truth inside this myth is that few companies would use a CDP to store anonymous, third party data by itself.
Myth: Identity resolution is a core CDP capability.
Reality: Many CDP systems provide built-in identity resolution (i.e., ability to link different identifiers that relate to the same person). But many others do not. This is by far the most counter-intuitive CDP reality, since it seems obvious that a system that builds unified customer profiles should be able to connect data from different sources. But quite a few CDP buyers don’t need this feature, either because they get data from a single source system (e.g., ecommerce or publishing), because their company has existing systems to assemble identities (common in financial services), or because they rely on external matching systems (frequent in retail and business marketing). What nearly all CDPs do have is the ability to retain links over time, so unified profiles can be stitched together as new identifiers are connected to each customer’s master ID. One way to think about this is: the function of identity resolution is essential for building a unified customer database, but the feature may be part of a CDP or something else.
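That "retain links over time" behavior amounts to keeping a persistent map from every known identifier to a master ID, so new links merge into existing clusters. A minimal hypothetical sketch (real identity graphs use probabilistic matching and far more robust data structures):

```python
class IdentityGraph:
    """Toy identity stitching: every identifier resolves to one master ID."""
    def __init__(self):
        self.master_of = {}  # identifier -> master ID

    def link(self, id_a, id_b):
        # Connect two identifiers; merge their clusters if both are known.
        master_a = self.master_of.get(id_a)
        master_b = self.master_of.get(id_b)
        if master_a and master_b:
            # Both already known: repoint b's whole cluster to a's master.
            for ident, master in self.master_of.items():
                if master == master_b:
                    self.master_of[ident] = master_a
        else:
            master = master_a or master_b or id_a
            self.master_of[id_a] = master
            self.master_of[id_b] = master

    def resolve(self, identifier):
        return self.master_of.get(identifier)

graph = IdentityGraph()
graph.link("cookie:123", "email:a@b.com")  # anonymous visitor identifies herself
graph.link("email:a@b.com", "crm:555")     # later matched to a CRM record
```

Whether this logic lives inside the CDP or in an external matching service, the CDP still stores the resulting links so the profile survives as identifiers accumulate.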
Myth: CDPs are not needed if there’s an Enterprise Data Warehouse.
Reality: It’s a reasonable simplification to describe a CDP as packaged software that builds a customer-centric Data Warehouse. But a Data Warehouse is almost always limited to highly structured data stored in a relational database. CDPs typically include large amounts of semi-structured and unstructured data in a NoSQL data store. Relational technology means changing a Data Warehouse is usually a complex, time-consuming project requiring advanced technical skill. Pushing data into a CDP is much easier, although some additional work may later be required to make it usable. Even companies with an existing Data Warehouse often find a CDP offers new capabilities, flexibility, and lower operating costs that make it a worthwhile investment.
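The schema-flexibility point can be made concrete: in a semi-structured store, each event document carries its own shape, so richer events need no schema migration, whereas a relational warehouse would require an ALTER TABLE (and often a project plan) first. A hypothetical illustration:

```python
import json

# Semi-structured event store: each document defines its own fields.
event_store = []

def record(event):
    event_store.append(json.dumps(event))

record({"type": "page_view", "url": "/pricing"})
# Later, richer events arrive with nested fields; nothing to migrate.
record({"type": "purchase", "items": [{"sku": "A1", "qty": 2}], "coupon": "SPRING"})

# Structure is imposed at read time, when the data is actually used.
purchases = [e for e in map(json.loads, event_store) if e["type"] == "purchase"]
```

This "schema on read" approach is why pushing data into a CDP is easy, and also why some extra work may be needed later to make that data usable.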
Myth: CDPs replace application data stores.
Reality: Mea culpa: I’ve often explained CDPs by showing separate silo databases replaced by a single shared CDP. But that’s an oversimplification to get across the concept. There are a handful of situations where a delivery system will read CDP data directly, such as injecting CDP-selected messages into a Web page or exposing customer profile details to a call center agent. But in most cases the CDP will synchronize its data with the delivery system’s existing database. This is inevitable: the delivery systems are tightly integrated products with databases optimized for their purpose. The value of the CDP comes from feeding better data into the delivery system database, not from replacing it altogether.
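The synchronization pattern is simple in outline: the CDP pushes selected attributes into the delivery system's own database, enriching rather than replacing it. A hypothetical sketch (field names are invented for illustration):

```python
def sync_to_delivery(cdp_profiles, delivery_db):
    """Push selected CDP attributes into the delivery system's own store."""
    for customer_id, profile in cdp_profiles.items():
        record = delivery_db.setdefault(customer_id, {})
        # Enrich, don't replace: the delivery system keeps its native data
        # and its database stays optimized for its own purpose.
        record.update({"segment": profile.get("segment"),
                       "ltv": profile.get("ltv")})

# The email platform already has its own record for this customer.
delivery_db = {"c1": {"email": "a@b.com"}}
sync_to_delivery({"c1": {"segment": "vip", "ltv": 1200}}, delivery_db)
```

The value comes from the better data flowing in, not from where it ultimately sits.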
Myth: CDP value depends on connecting all systems.
Reality: CDPs can deliver great value if they connect just some systems, or sometimes even if they only expose data from a single system that was otherwise inaccessible. This matters because connecting all of a company's systems can be a huge project or even impossible if some systems are not built to integrate with others. This shouldn't be used as an argument against CDP deployment so long as a less comprehensive implementation will still provide real value.
Myth: The purpose of CDP is to coordinate customer experience across all channels.
Reality: That's one goal and perhaps the ultimate. But there are many other, simpler applications a CDP makes possible, such as better analytics and more accurate data shared with delivery systems. In practice, most CDP users will start with these simpler applications and add the more demanding ones over time.
Myth: The CDP is a silver bullet that solves all customer data problems.
Reality: There are plenty of problems beyond the CDP's control, such as the quality of input data and limits on execution systems. Moreover, the CDP is just a technology and many obstacles are organizational and procedural, such as cooperation between departments, staff skills, regulatory constraints, and reward systems. What a CDP will do is expose some obstacles that were formerly hidden by the technical difficulty of attempting the tasks they obstruct. Identifying the problems isn't a solution but it's a first step towards finding one.
Of course, everyone knows there are no silver bullets but there's always that tiny spark of hope that one will appear. I hesitate to quench that spark because it's one of the reasons people try new things, CDPs included. But I think the idea of CDPs is now well enough established for marketers to absorb a more nuanced view of how they work without losing sight of their fundamental value. Gradual deflation of expectations is preferable to a sudden collapse. Let's hope a more realistic understanding of CDPs will ultimately lead to better results for everyone involved.
Labels:
adtech,
big data,
cdp,
customer data platform,
marketing,
martech,
madtech
Wednesday, March 28, 2018
Adobe Adds Experience Cloud Profile: Why It's Good News for Customer Data Platforms
"A CDP by any other name still stores unified customer data."
Of course, the devil is in the details but this is still a significant development. Adobe’s offering is well thought out, including not just an Azure database to provide storage but also an open source Experience Data Model to simplify sharing of ingested data and compatible connectors from SnapLogic, Informatica, TMMData, and Microsoft Dynamics to make dozens of sources immediately available. Adobe even said they’ve built in GDPR-required controls over data sharing, which is a substantial corporate pain point and key CDP use case.
The specter of competition from the big marketing clouds has always haunted the CDP market. Salesforce’s MuleSoft deal was a dodged bullet but the Adobe announcement seems like a more palpable hit.** Yet the blow is far from fatal – and could actually make the market stronger over time. Let me explain.
First the bad news: Adobe now has a reasonable product to offer clients who might otherwise be frustrated by the lack of integration of its existing Experience Cloud products. This has been a substantial and widely recognized pain point. Tony Byrne of the Real Story Group has been particularly vocal on the topic. The Experience Cloud Profile doesn’t fully integrate Adobe’s separate products, but it does seem to let them share a rich set of customer data. That’s exactly the degree of integration offered by a CDP. So any Adobe client interested in a CDP will surely take a close look at the new offering.
The good news is that not everyone is an Adobe client. It’s true that the Cloud Profile could in theory be used on its own but Adobe would need to price it very aggressively to attract companies that don’t already own other Adobe components. That could of course be an excellent acquisition strategy but we don’t know if it’s what Adobe has in mind. (I haven’t seen anything about the Cloud Profile pricing but it’s a core service of the Adobe Experience Platform, which isn’t cheap.) What this means is that Adobe is now educating the market about the value of a persistent, unified, comprehensive, open customer database – that is, about the value of CDPs. This should make it much easier for CDP vendors to sell their products to non-Adobe clients and even to compete with Adobe to deliver CDP functions to Adobe’s own clients.
I’ll admit I have a vested interest in the success of the CDP market, as inventor of the term and founder of the CDP Institute. So I’m not entirely objective here. But as CDP has climbed to the peak of the hype cycle, I’ve been exquisitely aware that it has no place to go but down – and that this is inevitable. The best CDP vendors can hope for is to exchange being a “hot product” for being an established category – something that people recognize as a standard component of a complete marketing architecture, alongside other components such as CRM, marketing automation, and Web content management. I’ve long felt that the function provided by CDP – a unified, persistent, sharable customer database – fills a need that won’t go away, regardless of whether the need is filled by stand-alone CDPs or components of larger suites like Adobe Experience Cloud. In other words, the standard diagram will almost surely include a box with that database; the question is whether the label on that box will be CDP. Adobe’s move makes it more likely the diagram will have that box. It’s up to the CDP industry to promote their preferred label.
________________________________________________________________________
*okay, the first page of a Google search. No Pulitzer Prize for this one.
** yes, I’ve just combined references to Karl Marx and William Shakespeare in the same paragraph, garnished with a freshly mixed metaphor. You’re welcome.
Labels:
big data,
cdi,
cdp,
customer data,
customer data platform,
martech,
single customer view
Friday, February 02, 2018
Celebrus CDP Offers In-Memory Profiles
It’s almost ten years to the day since I first wrote about Celebrus, which then called itself speed-trap (a term that presumably has fewer negative connotations in the U.K. than the U.S.). Back then, they were an easy-to-deploy Web site script that captured detailed visitor behaviors. Today, they gather data from all sources, map it to a client-tailored version of a 100+ table data model, and expose the results to analytics and customer engagement systems as in-memory profiles.
Does that make them a Customer Data Platform? Well, Celebrus calls itself one – in fact, they were an early and enthusiastic adopter of the label. More important, they do what CDPs do: gather, unify, and share customer data. But Celebrus does differ in several ways from most CDP products:
- in-memory data. When Celebrus described their product to me, it sounded like they don’t keep a persistent copy of the detailed data they ingest. But after further discussion, I found they really meant they don’t keep it within those in-memory profiles. They can actually store as much detail as the client chooses and query it to extract information that hasn't been kept in memory. The queries can run in real time if needed. That’s no different from most other CDPs, which nearly always need to extract and reformat the detailed data to make it available. I’m not sure why Celebrus presents themselves this way; it might be that they have traditionally partnered with companies like Teradata and SAS that themselves provided the data store, or that they partnered with firms like Pega, Salesforce, and Adobe that positioned themselves as the primary repository, or simply to avoid ruffling feathers in IT departments that didn't want another data warehouse or data lake. In any case, don’t let this confuse you: Celebrus can indeed store all your detailed customer data and will expose whatever parts you need.
- standard data model. Many CDPs load source data without mapping it to a specific schema. This helps to reduce the time and cost of implementation. But mapping is needed later to extract the data in a usable form. In particular, any CDP needs to identify core bits of customer information such as name, address, and identifiers that connect records related to the same person. Some CDPs do have elaborate data models, especially if they’re loading data from specific source systems or are tailored to a specific industry. Celebrus takes the latter path: as noted above, it maps all inputs to a client-tailored version of its 100+ table standard model. Celebrus does let users add custom fields and tables, so its standard data model doesn’t ultimately restrict what the system can store.
- real-time access. The in-memory profiles allow external systems to call Celebrus for real-time tasks such as Web site personalization or bidding on impressions. Celebrus also loads, transforms, and exposes its inputs in real time. It isn't the only CDP to do this, but it's one of just a few.
Celebrus is also a bit outside the CDP mainstream in other ways. Their clients have been largely concentrated in financial services, while most CDPs have sold primarily to online and offline retailers. While most CDPs run as a cloud-based service, Celebrus supports cloud and on-premise deployments, which are preferred by many financial services companies. Most CDPs are bought by marketing departments, but Celebrus is often purchased by customer experience, IT, analytics, and digital transformation teams and used for non-marketing applications such as fraud detection and system performance monitoring.
Other Celebrus features are found in some but not most CDPs, so they’re worth noting if they happen to be on your wish list. These include ability to scan for events and issue alerts; handling of offline as well as online identity data; and specialized functions to comply with the European Union’s GDPR privacy rules.
And Celebrus is fairly typical in limiting its focus to data assembly functions, without adding extensive analytics or customer engagement capabilities. That's particularly common in CDPs that sell to large enterprises, which is Celebrus' main market. Similarly, Celebrus is typical in providing only deterministic matching functions to assemble customer data.
So, yes, Celebrus is a Customer Data Platform. But, like all CDPs, it has its own particular combination of capabilities that should be understood by buyers who hope to find a system that fits their needs.
As I already mentioned, Celebrus is sold mostly to large enterprises with complex needs. Pricing reflects this, tending to be "in the six or seven figures" according to the company and being based on input volume, types of connected systems, and license model (term or perpetual, SaaS, on-premise, or hybrid). The company hasn’t released the number of clients but says it gathers data from "tens of thousands" of Web sites, apps, and other digital sources. Celebrus has been owned since 2011 by D4T4 Solutions (which looks like the word “data” if you use the right typeface), a firm that provides data management services and analytics.
Tuesday, January 02, 2018
What's Next for Customer Data Platforms? New Report Offers Some Clues.
The Customer Data Platform Institute released its semi-annual Industry Update today. (Download it here). It’s the third edition of this report, which means we now can look at trends over time. The two dozen vendors in the original report have grown about 25% when measured by employee counts in LinkedIn, which is certainly healthy although not the sort of hyper growth expected from an early stage industry. On the other hand, the report has added two dozen more vendors, which means the measured industry size has doubled. Total employee counts have doubled too. Since many of the new vendors were outside the U.S., LinkedIn probably misses a good portion of their employees, meaning actual growth was higher still.
The tricky thing about this report is that the added vendors aren’t necessarily new companies. Only half were founded in 2014 or later, which might mean they’ve just launched their products after several years of development. The rest are older. Some of these have always been CDPs but just recently came to our attention. This is especially true of companies from outside the U.S. But most of the older firms started as something else and reinvented themselves as CDPs, either through product enhancements or simply by adopting the CDP label.
Ultimately it’s up to the report author (that would be me) to decide which firms qualify for inclusion. I’ve done my best to list only products that actually meet the CDP definition.* But I do give the benefit of the doubt to companies that adopted the label. After all, there’s some value in letting the market itself decide what’s included in the category.
What’s most striking about the newly-listed firms is they are much more weighted towards customer engagement systems than the original set of vendors. Of the original two dozen vendors, eleven focused primarily on building the CDP database, while another six combined database building with analytics such as attribution or segmentation. Only the remaining seven offered customer engagement functions such as personalization, message selection, or campaign management. That’s 29%.**
By contrast, 18 of the 28 added vendors offer customer engagement – that’s 64%. It’s a huge switch. The added firms aren’t noticeably younger than the original vendors, so this doesn’t mean there’s a new generation of engagement-oriented CDPs crowding out older, data-oriented systems. But it does mean that more engagement-oriented firms are identifying themselves as CDPs and adding CDP features as needed to support their positioning. So I think we can legitimately view this as validation that CDPs offer something that marketers recognize they need.
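As a quick sanity check on those percentages, the vendor counts quoted from the report work out as follows (a minimal sketch using only the figures cited above):

```python
# Vendor mix figures as quoted from the CDP Institute Industry Update.
original_vendors = {"database only": 11, "database + analytics": 6, "engagement": 7}
added_vendors_total = 28
added_engagement = 18

original_total = sum(original_vendors.values())  # 24 original vendors

# Share of vendors offering customer engagement functions.
original_share = original_vendors["engagement"] / original_total
added_share = added_engagement / added_vendors_total

print(f"original: {original_share:.0%}")  # original: 29%
print(f"added: {added_share:.0%}")        # added: 64%
```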
What we don’t know is whether engagement-oriented CDPs will ultimately come to dominate the industry. Certainly they occupy a growing share. But the data- and analysis-oriented firms still account for more than half of the listed vendors (52%) and even higher proportions of employees (57%), new funding (61%) and total funding (74%). So it’s far from clear that the majority of marketers will pick a CDP that includes engagement functions.
So far, my general observation has been that engagement-oriented CDPs appeal more to mid-size firms while data and analysis oriented CDPs appeal most to large enterprises. I think the reason is that large enterprises already have good engagement systems or prefer to buy such systems separately. Smaller firms are more likely to want to replace their engagement systems at the same time they add a CDP and want to tie the CDP directly to profit-generating engagement functions. Smaller firms are also more sensitive to integration costs, although those should be fairly small where CDPs are concerned.
There’s nothing in the report to support or refute this view, since it doesn’t tell us anything about the numbers or sizes of CDP clients. But assuming it’s correct, we can expect engagement-oriented vendors to increase their share as more mid-size companies buy CDPs. We can also expect engagement-oriented systems to be more common outside the U.S., where companies are generally smaller. For what it’s worth, the report does confirm that’s already the case.
If the market does move towards engagement-oriented systems, will the current data and analytics CDPs add those features? That’s another unknown. There’s already been some movement: four of the original eleven data-only CDPs have added analytics features over the past year. But it’s a much bigger jump to add customer engagement features, and sophisticated clients won’t accept a stripped-down engagement system. We might see some acquisitions if the large data and analytics vendors want to add those features quickly. But those firms must also be careful about competing with the engagement vendors they currently connect with. Nor are they necessarily eager to lose their differentiation from the big marketing clouds. Nor is there much attraction to entering the most crowded segment of the market with a me-too product.
So most data and analytics vendors may well limit themselves to their current scope and invest instead in improving their data and analytics functions. That will limit them to the upper end of the market but it's where they sell now and offers plenty of room for growth. Certainly there’s a great deal of room for improved machine learning, attribution, scalability, speed, and automated data management. If I had to bet, I’d expect most data and analytics vendors to focus on those areas.
But I don’t have to bet and neither do you. So we’ll just wait to see what comes next. It will surely be interesting.
_________________________________________________________________________
*CDP is defined as a marketer-controlled system that builds a persistent, unified customer database that is accessible by other systems.
**To further clarify, customer engagement systems select messages for individuals or segments. Analytics systems may create segments but don't decide which messages go to which segment. And execution systems, such as email engines, Web content management, or mobile app platforms, deliver the selected messages.
Friday, September 08, 2017
B2B Marketers Are Buying Customer Data Platforms. Here's Why.
We can skip over the first 3.8 billion years of life on earth, when the answer is no one. When true CDPs first emerged from the primordial ooze, their buyers were concentrated among B2C retailers. That’s not surprising, since retailers have always been among the most data-driven marketers. They’re the R in BRAT (Banks, Retailers, Airlines, Telcos), the mnemonic I’ve long used to describe the core data-driven industries*.
What's more surprising is that the B's, A's, and T's weren't also early CDP users. I think the reason is that banks, airlines, and telcos all capture their customers’ names as part of their normal operations. This means they’ve always had customer data available and thus been able to build extensive customer databases without a CDP.
By contrast, offline retailers must work hard to get customer names and tie them to transactions, using indirect tools such as credit cards and loyalty programs. This means their customer data management has been less mature and more fragmented. (Online retailers do capture customer names and transactions operationally. And, while I don’t have firm data, my impression is that online-only retailers have been slower to buy CDPs than their multi-channel cousins. If so, they're the exception that proves the rule.)
Over the past year or two, as CDPs have moved beyond the early adopter stage, more BATs have in fact started to buy CDPs. As a further sign of industry maturity, we’re now starting to see CDPs that specialize in those industries. Emergence of such vertical systems is normal: it happens when demand grows in new segments because the basic concepts of a category are widely understood. Specialization gives new entrants a way to sell successfully against established leaders. Sure enough, we're also seeing new CDPs with other types of specialties, such as products from regional markets (France, India, and Australia have each produced several) and for small and mid-size organizations (not happening much so far, but there are hints).
And, of course, the CDP industry has always been characterized by an unusually broad range of product configurations, from systems that only build the central database to systems that provide a database, analytics, and message selection; that's another type of specialization. I recently proposed a way to classify CDPs by function on the CDP Institute blog.**
B2B is another vertical. B2B marketers have definitely been slow to pick up on CDPs, which may seem surprising given their frenzied adoption of other martech. I’d again explain this in part by the state of the existing customer data: the more advanced B2B marketers (who are the most likely CDP buyers) nearly all have a marketing automation system in place. The marketers' initial assumption would be that marketing automation can assemble a unified customer database, making them uninterested in exploring a separate CDP. Eventually they'd discover that nearly all B2B marketing automation systems are very limited in their data management capabilities. That’s happening now in many cases – and, sure enough, we’re now seeing more interest among B2B marketers in CDPs.
But there's another reason B2B marketers have been uncharacteristically slow adopters when it comes to CDPs. B2B marketers have traditionally focused on acquiring new leads, leaving the rest of the customer life cycle to sales, account, and customer success teams. So B2B marketers didn't need the rich customer profiles that a CDP creates. Meanwhile, the sales, account and customer success teams generally worked with individual and account records stored in a CRM system, so they weren't especially interested in CDPs either. (That said, it’s worth noting that customer success systems like Gainsight and Totango were on my original list of CDP vendors.)
The situation in B2B has now changed. Marketers are taking more responsibility for the entire customer life cycle and work more closely with sales, account management, and customer success teams. This pushes them to look for a complete customer view that includes data from marketing automation, CRM, and additional systems like Web sites, social media, and content marketing. That quest leads directly to CDP.
Can you guess who's leading that search? Well, which B2B marketers have been the most active martech adopters? That’s right: B2B tech marketers in general and B2B SaaS product marketers in particular. They’re the B2B marketers who have the greatest need (because they have the most martech) and the greatest inclination to try new solutions (which is why they ended up with the most martech). So it’s no surprise they’re the earliest B2B adopters of CDP too.
And do those B2B SaaS marketers have special needs in a CDP? You bet. Do we know what those needs are? Yes, but you’ll have to read my paper to find out.
_______________________________________________________
*It might more properly be FRAT, since Banking really stands for all Financial services including insurance, brokers, investment funds, and so on. Similarly, Airlines represents all of travel and hospitality, while Telco includes telephone, cable, and power utilities and other subscription networks. We should arguably add healthcare and education as late arrivals to the list. That would give us BREATH. Or, better still, replace Banks with Financial Services and you get dear old FATHER.
**It may be worth noting that part of the variety is due to the differing origins of CDP systems, which often started as products for other purposes such as tag management, big data analytics, and campaign management. That they've all ended up serving roughly the same needs is a result of convergent evolution (species independently developing similar features to serve a similar need or ecological niche) rather than common origin (related species become different over time as they adapt to different situations). You could look at new market segments as new ecological niches, which are sometimes filled by specialized variants of generic products and are other times filled by tangentially related products adapting to a new opportunity.
My point here is that there are two separate dynamics at play: the first is market readiness and the second is vendor development. Market readiness is driven by factors internal to the niche, such as the types of customer data available in an industry. Vendor development is driven by vendor capabilities and resources. One implication is that vendors from different origins could end up dominating different niches; that is, there's no reason to assume a single vendor or standard configuration will dominate the market as a whole. Then again, perhaps market segments served by different configurations are really separate markets.
Wednesday, January 18, 2017
Customer Data Platform Industry Profile: A Look Inside the Numbers
My snarky twin at the Customer Data Platform Institute just published a new report on the CDP industry. Since few industry vendors release financial or business details, the report relies on public sources including Owler for revenue estimates, Crunchbase for funding history, and LinkedIn for employee counts. Most vendors did provide client counts, and several privately shared other information where the public data was clearly wrong. You can download the report here. I'll wait while you do that. (Sound of fingers tapping.)
Okay, you've downloaded it, right? Good.
As you see, the report only presents figures for the industry as a whole. We feel those are reasonably accurate but that data for individual vendors are too unreliable to show separately. That may sound illogical but bear in mind that figures for the larger vendors are more reliable, so many errors that are significant for individual small vendors don’t materially change the total. Also remember that some vendors provided information in confidence and we made estimates of our own for some others.
I do feel I can safely publish statistics for three groups within the industry. This gives some additional insight without exposing any proprietary or misleading vendor data. The groups are based on each vendor's original business. They are:
- Tag managers. This may seem an unlikely starting point, but it actually makes sense. Tag management was originally about collecting data once (when a Web page loaded) and then sharing it with other systems that would otherwise have their own tags. This gave the Web site owner more control over what went where and reduced the number of tags on each page. The data sharing was similar to what happens in integration platforms/data hubs like Jitterbit and Zapier. So tag managers were always about data distribution. To become true CDPs, the tag vendors had to ingest data from additional sources and send the data to a persistent database. Ingesting new sources can be challenging but vendors could grow incrementally by choosing which sources to accept. Feeding a persistent database is basically just adding a new destination for data sharing. So the transition to CDP offered a reasonable path to escape being a commodity tag manager.
- Campaign managers. I’m using this term loosely to include companies that offered any sort of marketing message selection. It includes systems that do email, Web site messages, mobile app messages, and omnichannel campaigns. These vendors all started out as CDPs in the sense that they always built unified customer databases. Among other things, this meant that most included reasonably robust cross-channel identity resolution. These vendors didn’t necessarily start by sharing their database with other systems. But they do it now or I wouldn't consider them a CDP.
- Data assembly systems. This is a bit of a catch-all category but almost every system in this group was designed primarily to create a customer database that would be accessible to other systems. Intended uses included analytics, marketing execution, or both. (I say "almost" because the group includes two systems that built databases primarily to support their own attribution services.) There’s more variety within this group than the other two. But many vendors provide advanced identity resolution and all are strong at providing external access.
| Original Purpose | Vendors | Funding | 2016 Revenue | Customer Count | Revenue / Customer | Employee Count | Revenue / Employee |
|---|---|---|---|---|---|---|---|
| Tag Management | 6 | $356 million | $118 million | 13,500 | $9,000 | 840 | $141,000 |
| Campaign Management | 8 | $106 million | $108 million | 1,000 | $108,000 | 520 | $207,000 |
| Data Assembly | 13 | $246 million | $100 million | 3,000 | $34,000 | 920 | $108,000 |
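As a sanity check, the derived ratios in the table can be reproduced from the raw group figures (a quick sketch; the dollar amounts in the table are rounded):

```python
# Recompute revenue per customer and per employee from the report's
# raw group figures; the table above shows these ratios rounded.
groups = {
    "Tag Management":      {"revenue": 118_000_000, "customers": 13_500, "employees": 840},
    "Campaign Management": {"revenue": 108_000_000, "customers": 1_000,  "employees": 520},
    "Data Assembly":       {"revenue": 100_000_000, "customers": 3_000,  "employees": 920},
}

for name, g in groups.items():
    rev_per_customer = g["revenue"] / g["customers"]
    rev_per_employee = g["revenue"] / g["employees"]
    print(f"{name}: ${rev_per_customer:,.0f}/customer, ${rev_per_employee:,.0f}/employee")
```

Running this gives roughly $8,741 per customer for tag management, $108,000 for campaign management, and $33,333 for data assembly, matching the rounded table values.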
Here are some observations:
- Revenue is split about evenly among the three groups. That’s a bit surprising because tag management is an older and more established category than the others, so you might have expected it to have more revenue. Vendors in the other categories do tend to be newer, smaller, and growing more quickly.
- Tag management vendors have many more customers and earn less revenue per customer. This largely reflects the original tag management products, which are sold to many non-enterprise customers. But the tag management vendors also have hundreds of enterprise clients. Many of those clients are building the large-scale customer databases we expect to call a CDP. The tag management group also includes a couple of vendors who specialize in building CDPs for smaller companies. These are not as expensive as the enterprise installations. The campaign management vendors average about $100,000 per customer, which is what you'd expect for an enterprise CDP. Revenue per customer is just $34,000 for the data assembly vendors, but that's largely due to one vendor with 2,000 clients. Without that vendor, revenue per customer figures for the data assembly group would be $88,000. Backing out non-CDP clients is why the report puts the number of CDP customers for the entire industry at 2,500.
- Revenue per employee is generally in line with what we expect to see at growing Software as a Service companies. The standout here is the campaign management group, which has an impressively high figure of $207,000 per employee. This suggests the campaign managers have a high value-added business. The relatively low amount of external funding is more evidence that campaign managers throw off considerable cash from their own operations. The much lower revenue per employee for the data assembly companies, $108,000, is more typical of new SaaS ventures. Indeed, several of these are just starting to earn revenue from their first clients. These data assembly companies have attracted considerably more funding than the campaign managers, giving them a cushion to invest in growth. (If you’re wondering about that company with 2,000 clients, its revenue per employee is similar to others in its group.)
There are other nuances to consider in assessing these figures. For example, several vendors do business through agencies, which makes it harder to count clients and to compare revenue per client. But the over-all picture that emerges is a healthy industry that is already attracting substantial revenue and funding.
The report projects a 50% annual growth rate, which yields an estimated $1 billion revenue for 2019. The projection is based on public and private reported growth rates, which actually averaged much higher than 50% on a revenue-weighted basis. The report used 50% to be conservative. While past performance doesn't guarantee future growth, I think CDP revenues will, if anything, accelerate because most marketers still don't realize what a CDP can do for them. As more of them get the message, CDP adoption should skyrocket. So the future is bright indeed.
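The arithmetic behind that projection is straightforward: the three groups' 2016 revenue totals about $326 million, and three years of 50% compound growth multiplies it by 1.5³ = 3.375.

```python
# Back-of-envelope check of the report's projection: 2016 industry
# revenue compounded at the report's conservative 50% annual rate.
revenue_2016 = 118 + 108 + 100  # $ millions, summed from the table above

revenue = revenue_2016
for year in range(2017, 2020):
    revenue *= 1.5
    print(f"{year}: ${revenue:,.0f} million")
# Takes $326 million to about $1.1 billion by 2019 -- the report's
# rounded "$1 billion" estimate.
```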
Wednesday, November 09, 2016
ActionIQ Merges Customer Data Without Reformatting
One of the fascinating things about the Customer Data Platform Institute is how developers from different backgrounds have converged on similar solutions. The leaders of ActionIQ, for example, are big data experts: Tasso Argyros founded Aster Data, which was later purchased by Teradata, and Nitay Joffe was a core contributor to HBase and the data infrastructure at Facebook. In their previous lives, both saw marketers struggling to assemble and activate useful customer data. Not surprisingly, they took a database-centric approach to solving the problem.
What particularly sets ActionIQ apart is the ability to work with data from any source in its original structure. The system simply takes a copy of source files as they are, lets users define derived variables based on those files, and uses proprietary techniques to query and segment against those variables almost instantly. It’s the scalability that’s really important here: at one client, ActionIQ scans two billion events in a few seconds. Or, more precisely, it’s the scalability plus flexibility: because all queries work by re-reading the raw data, users can redefine their variables at any time and apply them to all existing data. Or, really, it's scalability, flexibility, and speed, because new data is available within the system in minutes.
So, amongst ActionIQ’s many advantages are scalability, flexibility, and speed. These contrast with systems that require users to summarize data in advance and then either discard the original detail or take much longer to resummarize the data if a definition changes.
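The flexibility argument is easiest to see in miniature. Here's a toy illustration of the pattern (not ActionIQ's proprietary engine, whose internals aren't public): keep raw events untouched and define derived variables as functions over them, so a definition can change at any time and apply retroactively to all existing data.

```python
from datetime import datetime, timedelta

# Raw events are stored as-is; no pre-summarization.
raw_events = [
    {"customer": "c1", "type": "purchase", "amount": 120, "ts": datetime(2016, 7, 1)},
    {"customer": "c1", "type": "purchase", "amount": 80,  "ts": datetime(2016, 10, 20)},
    {"customer": "c2", "type": "purchase", "amount": 40,  "ts": datetime(2016, 6, 5)},
]

def recent_spend(customer, as_of, window_days=90):
    """Derived variable: total purchases in a trailing window,
    recomputed from raw detail on every call."""
    cutoff = as_of - timedelta(days=window_days)
    return sum(e["amount"] for e in raw_events
               if e["customer"] == customer
               and e["type"] == "purchase"
               and e["ts"] >= cutoff)

as_of = datetime(2016, 11, 1)
print(recent_spend("c1", as_of))       # 90-day window -> 80
print(recent_spend("c1", as_of, 180))  # redefined window, same raw data -> 200
```

A system that pre-summarized "90-day spend" at load time would have to reprocess everything to answer the 180-day question; here the redefinition is free because the detail was never discarded.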
ActionIQ presents its approach as offering self-service data access for marketers and other non-technical users. That’s true insofar as marketers work with previously defined variables and audience segments. But defining those variables and segments in the first place takes the same data wrangling skills that analysts have always needed when faced with raw source data. ActionIQ reduces work for those analysts by making it easier to save and reuse their definitions. Its execution speed also reduces the cost of revising those definitions or creating alternate definitions for different purposes. Still, this is definitely a tool for big companies with skilled data analysts on staff.
The system does have some specialized features to support marketing data. These include identity resolution tools including fuzzy matching of similar records (such as different versions of a mailing address) and chaining of related identifiers (such as a device ID linked to an email linked to an account ID). It doesn’t offer “probabilistic” linking of devices that are frequently used in the same location although it can integrate with vendors who do. ActionIQ also creates correlation reports and graphs showing the relationship between pairs of user-specified variables, such as a customer attribute and promotion response. But it doesn’t offer multi-variable predictive models or machine learning.
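Chaining of related identifiers is a classic union-find problem: pairwise links (device to email, email to account) merge transitively into one identity. A minimal sketch of the generic technique, not ActionIQ's actual matching code:

```python
# Union-find over identifier strings: any chain of pairwise links
# resolves to a single representative "customer" node.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def link(a, b):
    """Record that identifiers a and b belong to the same customer."""
    parent.setdefault(a, a)
    parent.setdefault(b, b)
    parent[find(a)] = find(b)

link("device:abc123", "email:jane@example.com")
link("email:jane@example.com", "account:4471")

# All three identifiers now resolve to the same customer.
print(find("device:abc123") == find("account:4471"))  # True
```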
ActionIQ gives users an interface to segment its data directly. It can also provide a virtual database view that is readable by external SQL queries or PMML-based scoring models. Users can also export audience lists to load into other tools such as campaign managers, Web ad audiences, or Web personalization systems. None of this approaches the power of the multi-step, branching campaign flows of high-end marketing automation systems, but ActionIQ says most of its clients are happy with simple list creation. Like most CDPs, ActionIQ leaves actual message delivery to other products.
The company doesn’t publicly discuss the technical approach it takes to achieve its performance, but they did describe it privately and it makes perfect sense. Skeptics should be comforted by the founders’ technical pedigree and the system’s demonstrated performance. Similarly, ActionIQ asked me not to share screen shots of their user interface or details of their pricing. Suffice it to say that both are competitive.
ActionIQ was founded in 2014 and has been in production with its pilot client for over one year. The company formally launched its product last month.
Saturday, August 08, 2015
Intent Data Basics: Where It Comes From, What It's Good For, What To Test
Intent data is a marketer’s dream come true: rather than advertising to mass audiences in the hope of getting a handful of active buyers to identify themselves, just buy a list of those buyers and talk to them directly. It lops a whole layer off the top of the funnel and finally lets you discard the wasted half of your advertising.
But intent data is a complicated topic. It comes from different places, has different degrees of accuracy and coverage, and can be used in many different ways. Here’s a little primer on the topic.
What is intent data? It’s data that tells you an addressable individual is interested in your product. Ideally, that individual will be identifiable, meaning you have an email address, postal address, device ID, mobile app registration, phone number, or other piece of information that tells you who they are and lets you communicate with them directly. But sometimes you may only have an anonymous cookie or segment identifier that lets you reach them through only one channel and not by name.
Where does it come from? Lots of places. Behavior your company captures by itself is first-party intent data. This is probably the most reliable but only applies to people you already know. Behavior captured by others is third-party intent data. It's the most interesting because it provides new names or new information about names already in your database. Search queries are an obvious indicator of intent, but search engine vendors don’t sell lists based on query terms because they’d rather sell you ads based on those terms. So most third-party intent data is based on visits to Web pages whose content attracts prospective buyers of specific products. A smaller portion comes from other behaviors such as downloads, email clicks, or social media posts.
How is it sold? There are two primary formats. The first is ads served to people who have shown intent; this is generally called retargeting. The people are identified by cookies or device IDs and can be shown ads on any Web site that is part of the retargeter’s network. The original action could happen on your own Web site – the classic example is an abandoned shopping cart – or on some other site. Intent-based ads on social networks work roughly the same way, although the users are known individuals because they had to sign in. Retargeting is arguably a form of advertising and therefore not intent “data” at all. But I’m including it here because so many retargeting vendors describe their products in terms of intent. The second format is lists of email addresses. These are clearly data. The addresses are usually gathered through user registration with the Web site. Alternatively, the email can be derived from the cookie or device through matching services like Acxiom LiveRamp.
How reliable is it? Good question. Hopefully everyone reading knows that data isn’t perfect. But intent data is especially dicey because it comes from many different sources, some of which may be stronger indicators of intent than others. Users generally can’t see the original source. In addition, the data is aggregated by assigning the original Web content to standardized categories. These may not precisely match your own products, or they may be grouped together so that some highly relevant intent is mixed with a lot of less relevant intent. These problems are especially acute for B2B marketers, whose products may have a very narrow focus. There’s also an issue of freshness: intent can change rapidly, either because someone already made their purchase or because their interests have shifted. So behavior that isn’t gathered and processed quickly may be obsolete by the time it reaches the marketer.
How complete is it? It’s worth distinguishing completeness from reliability because completeness is a big problem on its own. Intent vendors won’t necessarily capture every person in market for a particular product. In fact, depending on the situation, they may capture just a small fraction. Some people may not visit any site in the aggregator’s network; some may not visit often enough to register the required level of interest; some may decline to provide their identity or delete their cookies. In some businesses, reaching only a small fraction of interested buyers is still very useful; in others – especially where there are relatively few buyers to begin with – the marketer may be forced to run her own outreach programs to capture as many as possible. In that case, the intent vendor's list would probably not include enough additional new names to be worth buying.
How do you use it? More ways than you might think. It’s tempting to just treat intent-based lists as sales leads. But often the quality isn't high enough for that. So the intent lists are often considered prospects to be touched through email, targeted advertising, and other low-cost media. Similarly, retargeting ads can be used to make hard sales offers or to more gently present brand messages and name-capturing content. Other applications include using presence on an intent list as a data point in a lead score, reaching out to dormant leads or current customers who suddenly register on intent lists, and tailoring messages based on which topics the intent vendor finds an individual is consuming.
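The lead-scoring use deserves a concrete sketch: intent membership adds weight alongside other signals rather than overriding them. The weights and field names below are invented for illustration; any real model would be tuned against your own conversion data.

```python
# Hypothetical additive lead score with intent-list membership as
# one signal among several. All weights are illustrative only.
def lead_score(lead):
    score = 0
    score += 20 if lead.get("on_intent_list") else 0
    score += 15 if lead.get("visited_pricing_page") else 0
    score += 10 * lead.get("email_clicks", 0)
    score += 25 if lead.get("title_matches_icp") else 0
    return score

hot = {"on_intent_list": True, "visited_pricing_page": True,
       "email_clicks": 2, "title_matches_icp": True}
dormant = {"on_intent_list": True}  # intent alone: worth a re-engagement touch

print(lead_score(hot))      # 80
print(lead_score(dormant))  # 20
```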
How do you test it? On the simplest level, you just apply the intent data to whatever type of program you’re testing (sales qualification, prospecting, lead scoring, reactivation, personalization, etc.) and read the results. Where things get a little tricky is figuring out which of the names would have registered as leads through some other channel, since they should be excluded from the analysis. Similarly, you need to carefully test how new programs like lead scoring, reactivation, or topic selection would have performed without the intent data – it may be that the good idea was the program itself. As a general rule of thumb, expect your own data to gain power as you build a longer history, so intent data is most likely to prove valuable on names early in the buying process.
Who sells it? Consumer marketers have a wide variety of intent sources, including Nielsen eXelate, Oracle Datalogix, and Neustar for lists and AdRoll, Retargeter, Fetchback, Chango and Magnetic for retargeting. B2B marketers can work with Bombora, The Big Willow, TechTarget, IDG, and Demandbase.
Labels: big data, intent data, marketing data
Thursday, September 19, 2013
New Study: Three Types of Customer Data Platform Address Cross-Channel Marketing Needs
Perhaps the most interesting discovery has been that the CDP vendors cluster into three main groups.
• B2B data enhancement. These build a large reference database of companies and employees, which they match against records imported from their clients. They generally return corrected and enhanced data and lead scores based on models built from the client’s customer files. Their reference databases are built from multiple public, commercial, and proprietary sources, and are assembled using sophisticated matching engines. Most also perform their own scans of Web sites and social networks to extract sales-relevant information such as technology use and changes that suggest buying opportunities. These vendors vary considerably in the data they return, ranging from lead scores only to recommended marketing treatments to full customer profiles. Some also provide prospect lists of companies that are not already in the client’s own database. CDP vendors in this group include Infer, Lattice Engines, Mintigo, and ReachForce.
These systems compete with non-CDP products which also add or enhance prospect records but do not maintain a database with their clients’ customers. These include Web scanning systems such as InsideView, LeadSpace, and SalesLoft, and general data compilers including NetProspex, Demandbase, Data.com, ZoomInfo, and OneSource. The predictive modeling features also compete to some degree with end-user-oriented marketing analytics and modeling software such as Birst, GoodData, Cloud9 Analytics, AutoBox, and Predixion Software. Data cleansing competitors include services from firms such as D&B, as well as data management software for technical users such as Informatica, Experian QAS, and FullContact.
• Campaigns. These systems build a multi-source marketing database from the client’s own data and either recommend marketing treatments to execution systems or execute marketing campaigns directly. These are primarily used for consumer marketing although they also have B2B clients. Most have sophisticated matching capabilities. This group includes Silverpop with its Universal Behavior feature, NICE’s Causata, AgilOne, and RedPoint.
This group competes with conventional consumer marketing automation products, which provide similar campaign management abilities but lack the CDPs' database flexibility, database management, and customer matching features.
• Audience management. These systems build a database of customers and their responses to online display advertisements. They then build models that predict the customers’ probability of responding to future advertisements and provide recommendations for how much to bid and which content to display. These systems perform the same basic functions as standard online audience management systems (Data Management Platforms, or DMPs) and provide the same very quick responses needed for real time bidding (usually under 100 milliseconds). The major difference is that they also recommend messages in other channels, such as Web site personalization or email campaigns. Like DMPs, they work primarily at the Web cookie level, can link cookies known to relate to the same customer, and can be linked to actual customer names and addresses in external systems. This group includes IgnitionOne, [x+1], and Knotice.
This group overlaps with recommendation and ad targeting engines and DMP systems. Those products provide similar functions but do not track identified individuals and are often limited to single channel executions.
Given that each group addresses a different business need, you might wonder why I think they should all be lumped together under the CDP label. Quite simply, it’s because they are all addressing a portion of the same larger problem, which is how marketers can get a complete view of their customers and use that view to coordinate treatments across channels. What marketers truly need is a combination of the features from each group: data enhancement from external sources, for consumers as well as B2B; sophisticated customer matching and treatment selection; and integration of online advertising audiences with traditional customer databases. Each of these systems has the potential to grow into a complete solution, and the normal dynamics of software industry growth will push them towards pursuing that potential. So I expect the categories to overlap increasingly over the next few years and eventually merge into complete Customer Data Platforms as I envision them.
Incidentally and tangentially related: I'll be giving a Webinar with ReachForce on October 2 on Data Quality for Hipsters, a name that started as a joke but does make the point that data quality is essential for cutting-edge marketing. YOLO, so you might as well attend. I'm already working on the mustache.
Wednesday, February 29, 2012
SAS Unveils High Performance Analytics Technology
I spent the early part of this week at SAS’s annual analysts conference, where the company reviewed the past year and presented its vision for 2012. The story this year was simple: “big data”, and SAS’s “high performance analytics” approach to taming it.
Of course, “high performance analytics” is what SAS has always done and, like “big data” itself, the term is relative. What SAS specifically presented was a re-engineering of its core analytical procedures to run in “shared nothing” multi-processor environments. Each data set is split into pieces that are loaded into separate units, processed independently and simultaneously, and then brought together for a result. SAS cited tremendous performance improvements, such as reducing the time to build a loan default model on a billion rows of records from 11 hours to 50 seconds. This obviously makes possible new, tactical applications.
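The shared-nothing pattern SAS described is the familiar split/process/combine shape. Here's a schematic in miniature (a toy aggregate, not SAS's actual HP procedures): partition the data, let each worker compute partial results independently, then merge.

```python
from multiprocessing import Pool

def partial_stats(chunk):
    # Each worker sees only its own slice of the data ("shared nothing").
    return (sum(chunk), len(chunk))

def parallel_mean(data, workers=4):
    # Split the data set into pieces, process them simultaneously,
    # then bring the partial results together.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(workers) as pool:
        partials = pool.map(partial_stats, chunks)
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count

if __name__ == "__main__":
    print(parallel_mean(list(range(100_000))))  # 49999.5
```

The speedups SAS cited come from the same principle applied at far greater scale, with data pre-distributed across processors so the split step itself is free.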
The high performance architecture is becoming available in stages as each SAS procedure is rewritten to support it. The change from a customer perspective is purposely minimal: existing SAS procedure calls are simply modified by adding an "HP" prefix. This will make it easy for clients to take advantage of the new capabilities.
The company revealed just a few new products at the conference, most notably a Visual Analytics tool that uses in-memory processing to render billion-row data sets in seconds. But the real benefit of high performance will come less from new products than from using it with existing SAS procedures and tools. The SAS product that may benefit most of all is SAS Decision Management, which creates rule-based decision flows that can call on scoring models and other analytics to help guide tactical processes. The product itself isn’t new, but high-performance analytics will let it do new things.
SAS’s “big data” story also included Hadoop integration and expanded cloud deployments. By the end of March (if I understood the roadmap correctly), SAS will be able to read from and write to Hadoop data sets, embed Hadoop commands within SAS scripts, and send SAS metadata to Hadoop. Over the coming year, it will support cloud deployments through a variety of enhancements related to virtualization, open APIs, and eventually an app marketplace. The cloud-based initiatives also support SAS’s own on-demand business, which grew 57% last year to reach more than $100 million.
These are all positive developments for SAS, which must certainly support "big data" to remain relevant. The new capabilities will also create some business changes as SAS competes more directly with companies like IBM and Oracle to embed analytics within operational processes. SAS itself noted the company is now more involved in architectural discussions of how its systems interact with the rest of the enterprise infrastructure. Other issues may include educating non-technical users and providing technology to protect privacy. SAS leaders seem to think they can leave those issues to others, but I’m not so sure.
The conference produced little news directly related to marketing systems. The company reports 38% growth in marketing applications – which it reports under the label of “customer intelligence” – so that is clearly a healthy business. But the product road maps showed just incremental improvements of existing products, without any major new offerings. Again, high-performance analytics will make new things possible without other changes in the products themselves. The high performance version of marketing optimization is due by the end of the year.
If you want more evidence of how little attention was paid to marketing systems: SAS's biggest recent piece of marketing-related news, last week’s acquisition of online ad server aiMatch, got exactly one mention during the day-long presentation and was positioned as simply filling a small gap in the marketing product line. The company did announce, very casually, that aiMatch would be extended to include ad buying optimization as well as its current ad-selling optimization. That struck me as a pretty big deal, since ad buying is the heart of an already-huge industry that’s clearly the future of marketing. Then again, it’s also an intensely competitive, heavily-funded space that’s crawling with advanced technologies. Although SAS's high performance analytics could have a huge impact on ad serving, that won't happen unless SAS makes a major commitment of people and money. We’ll see whether they make one.
Of course, “high performance analytics” is what SAS has always done and, like “big data” itself, the term is relative. What SAS specifically presented was a re-engineering of its core analytical procedures to run in “shared nothing” multi-processor environments. Each data set is split into pieces that are loaded into separate units, processed independently and simultaneously, and then brought together for a result. SAS cited tremendous performance improvements, such as cutting the time to build a loan default model on a billion records from 11 hours to 50 seconds. This obviously makes new, tactical applications possible.
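The split/process/combine pattern described above can be sketched in a few lines of Python (chosen purely for illustration — SAS's actual implementation distributes work across dedicated hardware, and a real loan default model is far beyond this toy statistic):

```python
# A minimal sketch of the "shared nothing" pattern: split the data set into
# pieces, process each piece independently and simultaneously, then bring
# the partial results together. Here the "analysis" is just a mean.
from multiprocessing import Pool

def partial_stats(chunk):
    """Each worker sees only its own slice of the data (no shared state)
    and returns a small summary that is cheap to combine."""
    return sum(chunk), len(chunk)

def parallel_mean(data, workers=4):
    # 1. Split the data set into independent pieces.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    # 2. Process each piece in its own process, all at once.
    with Pool(workers) as pool:
        results = pool.map(partial_stats, chunks)
    # 3. Combine the partial summaries into the final result.
    total = sum(t for t, _ in results)
    count = sum(c for _, c in results)
    return total / count

if __name__ == "__main__":
    print(parallel_mean(list(range(1, 101))))  # prints 50.5
```

The key property, and the reason the approach scales, is that the workers never touch each other's data; only the tiny summaries travel back to be merged.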
The high performance architecture is becoming available in stages as each SAS procedure is rewritten to support it. The change from a customer perspective is purposely minimal: existing SAS procedure calls are simply modified by adding an “HP” prefix. This will make it easy for clients to take advantage of the new capabilities.
The company revealed just a few new products at the conference, most notably a Visual Analytics tool that uses in-memory processing to render billion-row data sets in seconds. But the real benefit of high performance will come less from new products than from using it with existing SAS procedures and tools. The SAS product that may benefit most of all is SAS Decision Management, which creates rule-based decision flows that can call on scoring models and other analytics to help guide tactical processes. The product itself isn’t new, but high-performance analytics will let it do new things.
SAS’s “big data” story also included Hadoop integration and expanded cloud deployments. By the end of March (if I understood the roadmap correctly), SAS will be able to read from and write to Hadoop data sets, embed Hadoop commands within SAS scripts, and send SAS metadata to Hadoop. Over the coming year, it will support cloud deployments through a variety of enhancements related to virtualization, open APIs, and eventually an app marketplace. The cloud-based initiatives also support SAS’s own on-demand business, which grew 57% last year to reach more than $100 million.
These are all positive developments for SAS, which must certainly support "big data" to remain relevant. The new capabilities will also create some business changes as SAS competes more directly with companies like IBM and Oracle to embed analytics within operational processes. SAS itself noted that it is now more involved in architectural discussions of how its systems interact with the rest of the enterprise infrastructure. Other issues may include educating non-technical users and providing technology to protect privacy. SAS leaders seem to think they can leave those issues to others, but I’m not so sure.
The conference produced little news directly related to marketing systems. The company reports 38% growth in marketing applications – sold under the label of “customer intelligence” – so that is clearly a healthy business. But the product road maps showed just incremental improvements of existing products, without any major new offerings. Again, high-performance analytics will make new things possible without other changes in the products themselves. The high performance version of marketing optimization is due by the end of the year.
If you want more evidence of how little attention was paid to marketing systems: SAS's biggest recent piece of marketing-related news, last week’s acquisition of online ad server aiMatch, got exactly one mention during the day-long presentation and was positioned as simply filling a small gap in the marketing product line. The company did announce, very casually, that aiMatch would be extended to include ad buying optimization as well as its current ad-selling optimization. That struck me as a pretty big deal, since ad buying is the heart of an already-huge industry that’s clearly the future of marketing. Then again, it’s also an intensely competitive, heavily funded space that’s crawling with advanced technologies. Although SAS's high performance analytics could have a huge impact on ad serving, that won't happen unless SAS makes a major commitment of people and money. We’ll see whether they make one.
Sunday, May 15, 2011
DIGIDAY:TARGET, or, Yogi Berra Meets Data in the Online World.
I was scheduled to attend the DIGIDAY:TARGET conference on May 4 but wasn't able to be there. (Download the conference agenda and presentations.) Happily, my colleague and big-data guru Matt Doering was able to take my place. Here are Matt's thoughts:
Yogi Berra meets data in the online world.
At the recent Digiday:Target conference (Park Central Hotel, NYC, May 4, 2011) a moderator posed the question “Which is better: More Data, Consistent Data or Data Expertise?” Not surprisingly, there was a wide variety of opinions, both from the panel and from many attendees I talked to later in the day. Many of those I listened to were genuinely intrigued and conflicted by this question. To understand the real answer, let us first review the pros and cons of the three possible answers.
Background
More Data – Large volumes of data from varied sources.
Pros:
• Richer data content from any given data source.
• Data sources tend to enrich each other, if properly managed.
• More likely to find the outliers that can often be the real profit makers.
Cons:
• Many companies don’t have the resources to handle very large volumes of data.
• Lack of Metadata about data sources.
• No real experience with merging multiple data sources with different element codes and timeframes.
• Data hygiene can be an issue if you are working with a data set that is new to your organization.
Consistent Data – All data conforms to some industry standard. Any data not conforming to the model is discarded or reduced.
Pros:
• All data is easily understood and documented in a Metadata stack.
• Data hygiene is easy to define and enforce.
• Data processing performance profiles are well understood. This makes it very easy to scope a system or project.
Cons:
• Let’s admit it: all homogenized milk tastes the same. Where is the differentiation potential?
• In the process of conforming to a standard, more detailed data is lost. For example, if the industry standard requires that age elements be bucketed into 10-year breaks, what happens when your product offering needs 6.5-year breaks?
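The granularity point above can be seen in a toy Python sketch (the ages and the `bucket` function are invented for illustration):

```python
# Toy illustration of the granularity problem: once data has been conformed
# to coarse industry-standard buckets, finer breaks cannot be recovered.

def bucket(age, width):
    """Assign an age to a fixed-width bucket, returned as (low, high)."""
    low = age // width * width
    return (low, low + width)

ages = [31, 38]

# With the raw ages available, 6.5-year breaks separate the two customers:
print([bucket(a, 6.5) for a in ages])  # two different buckets

# But if only the 10-year buckets were stored, both customers fall in the
# 30-40 bucket and the distinction is gone for good:
print([bucket(a, 10) for a in ages])   # the same bucket twice
```

The direction matters: coarse buckets can always be rebuilt from fine-grained data, but not the other way around, which is the hidden cost of conforming too early.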
Data Expertise – Deep experience with very large data sets.
Pros:
• Small data, large data, inconsistent data are not a problem. Expertise can handle all these issues.
• These resources understand the role that standardized data plays in data analysis (like a good coat of primer on a wall) but also know that the real value is in what is different.
• Most data experts love to teach, so the entire data IQ of your organization increases.
• Able to distinguish between dirty data and gold nuggets.
Cons:
• These resources can be hard to find. It’s not a matter of having the right degree; it’s more about who they are. Just as a degree in fine arts doesn’t make you an artist, a degree in stats doesn’t make you a good data scientist. In fact, one of the best data scientists I know never took a stats course.
“It’s déjà vu all over again”
Yogi had it right. If, as I strongly believe, data expertise is of critical importance for the media world, ours is not the first industry where this has been true. A number of industries over the past 25 years have had to deal with the “big data” problem. Early examples are the classic CPG scanner data, pharmaceutical detailing data, and financial services direct marketing data sets. All these industries faced large and diverse data issues, and all succeeded in overcoming the problem with technique, not CPU power.
Now it might be tempting to claim that our space generates significantly higher volumes of data, or more diverse data, but is that really true? At first it appears so, but when you factor in the computing power available at the time, it is not that far-fetched to say the adjusted data volumes are actually very similar. Keep in mind that the data scientists of those days were working with computers with less horsepower and memory than the average iPad used by the majority of attendees at Digiday:Target.
So where do you find this expertise? Look to the industries named above. Members of the Direct Marketing Association and attendees of the NCDM (National Center for Database Marketing) conference are a good place to start. Look for people from the telecommunications industry who helped build systems to analyze Call Detail Records (CDRs). Experience in genome sequencing and pairing should also grab your attention. Do these people know clicks from conversions? Probably not, but for them more data is the breath of life. We need to recruit the talent that is out there into the industry and avoid having to reinvent it “all over again”.