I wish I could come up with a really cool reason for that, but it’s only because NGData just recently came to my attention. The company itself has been serving clients since late 2012. As my classification suggests, they are in the business of assembling client data from multiple systems, using it to build detailed customer profiles, and making the results available to execution systems like campaign management, call centers, Web sites, and mobile apps.
The technology involved is Hadoop as the primary data store and HBase to expose the profiles to external systems. Data is unified mostly with customer IDs supplied by the source systems, although NGData can also build cross reference tables to associate related IDs and do some probabilistic matching to link related devices. The profiles include both raw data and calculated metrics such as trends, exceptions, signals, affinities, predictions of fraud risk, churn likelihood, and next product to buy. The predictions can be based on conventional predictive methods like regression or on integrated machine learning. Often the company is able to use regression models that the client has built already for other data sources. The base version of the company’s system, called Lily Enterprise, has about 700 such metrics, and clients can add their own. (There’s also an open source version of Lily, which has only the data management components of Lily Enterprise.) External systems access profiles via SQL queries against HBase, API calls, or file transfers.
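For readers who want the cross-reference idea made concrete, here is a minimal sketch of how related IDs from different source systems could be grouped under one profile key. This is not NGData's code: the class name `IdCrossReference`, its methods, and the sample IDs are all hypothetical, and it covers only exact links between IDs, not the probabilistic device matching mentioned above.

```python
from collections import defaultdict


class IdCrossReference:
    """Hypothetical cross-reference table built as a union-find structure.

    Each ID is a (source_system, id) tuple; IDs linked directly or
    indirectly end up sharing one profile key.
    """

    def __init__(self):
        self._parent = {}

    def _find(self, key):
        # Register unseen IDs as their own group.
        self._parent.setdefault(key, key)
        root = key
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited ID straight at the root.
        while self._parent[key] != root:
            self._parent[key], key = root, self._parent[key]
        return root

    def link(self, id_a, id_b):
        """Record that two source-system IDs belong to the same customer."""
        root_a, root_b = self._find(id_a), self._find(id_b)
        if root_a != root_b:
            # Keep the smaller tuple as the root so keys are deterministic.
            self._parent[max(root_a, root_b)] = min(root_a, root_b)

    def profile_key(self, source_id):
        """Return the key under which this ID's unified profile is stored."""
        return self._find(source_id)

    def groups(self):
        """Return {profile_key: set of all IDs linked to it}."""
        out = defaultdict(set)
        for key in list(self._parent):
            out[self._find(key)].add(key)
        return dict(out)


xref = IdCrossReference()
xref.link(("crm", "C-1"), ("web", "cookie-9"))
xref.link(("web", "cookie-9"), ("mobile", "dev-3"))
print(xref.profile_key(("mobile", "dev-3")))
```

The point of the structure is that a link only needs to be recorded once: after the Web cookie is tied to both the CRM ID and the mobile device, all three resolve to the same profile without anyone stating the CRM-to-mobile relationship directly.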
This may all sound pretty straightforward and should be familiar to readers of this blog. But out there in the real world, most companies are still struggling with the challenges of assembling customer data and making it accessible. Those firms will find this is pretty novel stuff. NGData stresses its ability to easily add new data sources and to retain huge amounts of detail, something it inherits from Hadoop. Complex metrics like exceptions and trends are also relatively unusual. They make it much easier for execution systems to act intelligently, since the hard work of surfacing opportunities and selecting responses is largely done in advance. Predefining those metrics as part of the base system speeds initial deployment. One proof point: NGData says clients can typically start running programs based on its system in four to five months, and sometimes as quickly as three months. Conventional data warehouse projects usually take more than one year and many are never delivered.*
NGData stresses that marketers get better results when they can look at all the data associated with each individual, rather than treating people as members of broad segments. Again, no reader of this blog will disagree. But I did still like their example of using behaviors to identify people who are likely to call customer support with a particular problem, and then preempting those calls with personalized messages about how to solve the problem. Everybody wins: the customer appreciates the proactive service, and the company fields fewer phone calls. The underlying point, which NGData stresses often, is that they're going beyond insight to producing an actionable result.
NGData also highlights the value of responding to customer behaviors in real time or near real time. It can do this because it immediately updates its metrics in HBase whenever it receives new information. This is more old news, but conventional solutions often update their scores and recommendations nightly or even less often. To be fair, NGData can only update in real time if the source systems are providing immediate updates, which often isn't the case. The vendor does have its own Web tags to capture real time information about Web visits, which lets it use in-session behavior to drive product and page recommendations.
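The difference between updating on arrival and updating nightly is easy to show in a few lines. The sketch below is purely illustrative and assumes nothing about how Lily works internally: the `ProfileMetrics` class, the `on_event` function, and the in-memory `profiles` dictionary (standing in for a profile store such as HBase) are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class ProfileMetrics:
    """Hypothetical per-customer metrics that are cheap to update in place."""
    event_count: int = 0
    avg_amount: float = 0.0

    def apply(self, amount):
        # Running mean: fold in the new value without rereading old events.
        self.event_count += 1
        self.avg_amount += (amount - self.avg_amount) / self.event_count


# Stand-in for the profile store; keyed by customer ID.
profiles = {}


def on_event(customer_id, amount):
    """Update the customer's metrics as soon as an event arrives."""
    metrics = profiles.setdefault(customer_id, ProfileMetrics())
    metrics.apply(amount)
    return metrics


on_event("c1", 10.0)
latest = on_event("c1", 20.0)
print(latest.event_count, latest.avg_amount)
```

Because each metric is updated from its previous value plus the new event, the profile is current the moment the event is processed. A nightly batch would compute the same numbers, but an execution system reading the profile during the day would see yesterday's values.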
The one thing you may not expect about Lily is that it’s old-style on-premise software, not software-as-a-service. That’s mostly because current clients are big companies in financial services, telecommunications, and media, industries that have been reluctant to let precious company data outside their walls (although plenty of hackers still manage to get it, he added snarkily). It’s also not clear how much SaaS would benefit NGData, since each client’s data store would remain separate in any case and the software needs to be installed on relatively few desktops. For what it's worth, Aginity is also on-premise and also serves primarily enterprise clients.
I suspect it's no coincidence that the two "pure" Customer Data Platform vendors both focus on enterprises. Smaller firms need to justify a CDP by combining it with a specific application, but big companies can afford a separate CDP project that they'll later tie to separate execution systems. Perhaps this will change as CDPs get cheaper, the model is better understood, and integration with execution systems becomes easier.
NGData has about 25 current paying clients in Europe and the U.S. Pricing is based on volume of data and/or number of customers. Total cost for software and services starts around $150,000 to $200,000 and can go much higher. As I said, this is enterprise software.
*This is one of those things that “everybody knows” but few can document. Here’s one paper that says data warehouses “usually” take 12 to 36 months, cost $1 to $1.5 million, and have a 70% failure rate. It’s not clear where the author got his data but it all sounds about right.