Wednesday, December 26, 2007

Eventricity Lets Banks Buy, Not Build, Event-Based Marketing Systems

As you may recall from my posts on Unica and SAS, event-based marketing (also called behavior identification) seems to be gaining traction at long last. By coincidence, I recently found some notes I made two years ago about a UK-based firm named eventricity Ltd. This led to a long conversation with eventricity founder Mark Holtom, who turned out to be an industry veteran with a background at NCR/Teradata and AIMS Software, where he worked on several of the pioneering projects in the field.

Eventricity, launched in 2003, is Holtom’s effort to convert the largely custom implementations he had seen elsewhere into a more packaged software product. Similar offerings do exist, from Harte-Hanks, Teradata and Conclusive Marketing (successor to Synapse Technology) as well as Unica and SAS. But those are all part of a larger product line, while eventricity offers event-based software alone.

Specifically, eventricity has two products: Timeframe event detection and Coffee event filtering. Both run on standard servers and relational databases (currently implemented on Oracle and SQL Server). This contrasts with many other event-detection systems, which use special data structures to capture event data efficiently. Scalability doesn’t seem to be an issue for eventricity: Holtom said it processes data for one million customers (500 million transactions, 28 events) in one hour on a dual processor Dell server.

One of the big challenges with event detection is defining the events themselves. Eventricity is delivered with a couple dozen basic events, such as unusually large deposit, end of a mortgage, significant birthday, first salary check, and first overdraft. These are defined with SQL statements, which imposes some limits on both complexity and end-user control. For example, although events can consider transactions during a specified time period, they cannot use a sequence of transactions (e.g., an overdraft followed by a withdrawal). And since few marketers can write their own SQL, creation of new events takes outside help.
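To make that concrete, here is roughly what a SQL-defined event might look like. The table, columns and threshold below are my own invention for illustration; eventricity ships its own vendor-written definitions.

```python
# Hypothetical sketch of an "unusually large deposit" event as a SQL statement.
# The schema (transactions table, column names) and the 5x threshold are
# invented for illustration; date arithmetic is shown in Oracle style.
UNUSUALLY_LARGE_DEPOSIT_SQL = """
SELECT t.customer_id,
       t.txn_date AS event_date,
       t.amount   AS event_amount
FROM   transactions t
JOIN  (SELECT customer_id, AVG(amount) AS avg_deposit
       FROM   transactions
       WHERE  txn_type = 'DEPOSIT'
         AND  txn_date >= CURRENT_DATE - 90   -- fixed look-back window
       GROUP BY customer_id) hist
      ON hist.customer_id = t.customer_id
WHERE  t.txn_type = 'DEPOSIT'
  AND  t.txn_date = CURRENT_DATE
  AND  t.amount   > 5 * hist.avg_deposit      -- condition on a single transaction,
                                              -- not a sequence of transactions
"""
```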

But users do have great flexibility once the events are built. Timeframe has a graphical interface that lets users specify parameters, such as minimum values, percentages and time intervals, which are passed through to the underlying SQL. Different parameters can be assigned to customers in different segments. Users can also give each event its own processing schedule, and can combine several events into a “super event”.
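A minimal sketch of that parameterization idea, assuming a simple placeholder syntax: the segment names and values are hypothetical, and in Timeframe itself these settings come from the graphical interface rather than code.

```python
# Hypothetical per-segment parameters feeding an event's SQL template.
# Segment names, values and placeholder syntax are invented for illustration.
EVENT_TEMPLATE = """
SELECT customer_id
FROM   transactions
WHERE  txn_type = 'DEPOSIT'
  AND  txn_date >= CURRENT_DATE - {lookback_days}
GROUP BY customer_id
HAVING MAX(amount) > {min_amount}
"""

SEGMENT_PARAMETERS = {
    "mass_market": {"min_amount": 5_000,  "lookback_days": 30},
    "affluent":    {"min_amount": 25_000, "lookback_days": 30},
}

def build_event_sql(segment: str) -> str:
    """Fill the event template with the parameters assigned to a segment."""
    return EVENT_TEMPLATE.format(**SEGMENT_PARAMETERS[segment])

print(build_event_sql("affluent"))
```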

Coffee adds still more power, aimed at distilling a trickle of significant leads from the flood of raw events. This involves filters to determine which events to consider, ranking to decide which leads to handle first, and distribution to determine which channels will process them. Filters can consider event recency, rules for contact frequency, and customer type. Eligible events are ranked based on event type and processing sequence. Distribution can be based on channel capacity and channel priorities by customer segment: the highest-ranked leads are handled first.
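As a rough sketch of that filter-rank-distribute flow (the rules, field names and capacities are all assumptions of mine, not Coffee’s actual logic):

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Event:
    customer_id: str
    event_type: str
    event_date: date
    segment: str
    last_contact: date
    rank_score: int          # higher score = handled first

# Hypothetical filter rules: event recency, contact frequency, customer type.
MAX_EVENT_AGE_DAYS = 7
MIN_DAYS_SINCE_CONTACT = 30
ELIGIBLE_SEGMENTS = {"affluent", "mass_market"}

# Hypothetical channel capacities and per-segment channel assignments.
CHANNEL_CAPACITY = {"branch": 200, "call_center": 1000}
SEGMENT_CHANNEL = {"affluent": "branch", "mass_market": "call_center"}

def filter_events(events, today):
    """Keep only events that pass the recency, contact-frequency and
    customer-type filters."""
    for e in events:
        if (today - e.event_date).days > MAX_EVENT_AGE_DAYS:
            continue          # event too old to be worth a call
        if (today - e.last_contact).days < MIN_DAYS_SINCE_CONTACT:
            continue          # customer contacted too recently
        if e.segment not in ELIGIBLE_SEGMENTS:
            continue
        yield e

def distribute(events, today=None):
    """Rank the surviving events and assign them to channels until each
    channel's daily capacity is used up; highest-ranked leads go first."""
    today = today or date.today()
    remaining = dict(CHANNEL_CAPACITY)
    leads = sorted(filter_events(events, today),
                   key=lambda e: e.rank_score, reverse=True)
    assigned = []
    for lead in leads:
        channel = SEGMENT_CHANNEL[lead.segment]
        if remaining[channel] > 0:
            remaining[channel] -= 1
            assigned.append((lead, channel))
    return assigned
```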

What eventricity does not do is decide what offer should be triggered by each event. Rather, the intent is to feed the leads to call centers or account managers who will call the customer, assess the situation, and react appropriately. Several event-detection vendors share this approach, arguing that automated systems are too error-prone to pre-select a specific offer. Other vendors do support automated offers, arguing that automated contacts are so inexpensive that they are profitable even if the targeting is inexact. The counter-argument of the first group is that poorly targeted offers harm the customer relationship, so the true cost goes beyond the expense of sending the message itself.

What all event-detection vendors agree on is the need for speed. Timeframe cites studies showing that real-time reaction to events can yield an 82% success rate, vs. 70% for response within 12 hours, 25% within 48 hours and 10% within four days. Holtom argues that the difference in results between next-day response and real time (which Timeframe does not support) is not worth the extra cost, particularly since few if any banks can share and react to events across all channels in real time.

Still, the real question is not why banks won’t put in real-time event detection systems, but why so few have bought the overnight event detection products already available. The eventricity Web site cites several cases with mouth-watering results. My own explanation has long been that most banks cannot act on the leads these systems generate: either they lack the contact management systems or cannot convince personal bankers to make the calls. Some vendors have agreed.

But Holtom and others argue that the main problem is that banks build their own event-detection systems rather than purchasing someone else’s. This is certainly plausible for the large institutions. Event detection looks simple. It’s the sort of project in-house IT and analytical departments would find appealing. The problem for the software vendors is that once a company builds its own system, it’s unlikely to buy an outside product: if the internal system works, there’s no need to replace it, and if it doesn’t work, well, then, the idea has been tested and failed, hasn’t it?

For the record, Holtom and other vendors argue their experience has taught them where to look for the most important events, providing better results faster than an in-house team. The most important trick is event filtering: identifying the tiny fraction of daily events that are most likely to signal productive leads. In one example Holtom cites, a company’s existing event-detection project yielded an unmanageable 660,000 leads per day, compared with a handy 16,000 for eventricity.

The vendors also argue that buying an external system is much cheaper than building one yourself. This is certainly true, but something that internal departments rarely acknowledge, and accounting systems often obscure.

Eventricity’s solution to the marketing challenge is a low-cost initial trial, which includes in-house set-up and scanning for three to five events over a three-month period. Cost is 75,000 Euros, or about $110,000 at today’s pitiful exchange rate. Pricing on the actual software starts as low as $50,000 and would be about 250,000 Euros ($360,000) for a bank with one million customers. Implementation takes 10 to 12 weeks. Eventricity has been sold and implemented at Banca Antonveneta in Italy, and several other trials are in various stages.

Tuesday, December 18, 2007

Unica Strategy Stays the Course

I recently caught up with Unica Vice President Andrew Hally as part of my review of developments at the major marketing automation vendors. It’s been a good year for Unica, which will break $100 million in annual revenue for the first time. On the product front, they continue their long-time strategy of offering all the software a marketing department would need. This has mostly meant incremental development of existing components, including continued assimilation of past years’ acquisitions in Web analytics, email, marketing planning, lead management, and event detection. The one major acquisition in 2007 was Marketing Central, which provides hosted marketing resource management. This was well within Unica’s traditional scope, although the “hosted” part is a bit of a change. It will help Unica serve smaller companies in addition to its traditional enterprise clients. But since Unica must penetrate this segment to continue growing, this is less a detour from the company’s primary strategy than a logical extension of it.

Hally did say that Unica still sees substantial growth potential among large enterprises. He said the company still frequently finds itself replacing home-grown systems at big companies, not just earlier generations of purchased campaign management systems. This surprises me a bit, although I suppose many firms never saw the need to replace systems that were already working. I do wonder how many of the holdouts will change their minds each year. If the number is small, then Unica and its competitors will be mostly selling into a replacement market, which can’t be very large. On the other hand, continued demand for new capabilities in Web and mobile marketing should lead companies to replace even relatively new systems. So perhaps the replacement market will be bigger than it seems. Certainly Unica has added features for email and Web analytics. But it still has gaps in ad serving, keyword management, and mobile. Acquisitions in those areas would not be surprising.

Probably the most interesting development Hally reported was a sharp rise in sales for Unica’s Affinium Detect, an event-detection system based on the Elity software acquired with MarketSoft in 2005. Hally said Detect is now Unica’s third-best-selling product, with one to two dozen installations. This compares with the handful that Elity had sold when it was independent. He attributed the growth both to increased demand and to the reduced risk marketers see in buying from Unica. He also reported the product has been sold for telecommunication and credit card applications, in addition to the traditional retail banking.

While at NCDM, I also took a look at Unica’s newest module, an ad hoc reporting package called Affinium Insight. This provides some basic visualization and analysis capabilities for non-technical users. It is designed to free up marketing analysts for more demanding work by giving business users a simple graphical interface linked to data marts assembled with standard Unica technology. The interface resembles the NetTracker Web analytics system Unica acquired with Sane Solutions in 2006.

Wednesday, December 12, 2007

Advanced Analytics and Still More Reasons I Love QlikView

I’m at the National Center for Database Marketing Conference this week. NCDM is always a good place to get a feel for what’s on people’s minds.

One theme I’ve picked up is a greater interest in advanced analytics. Richard Deere of Direct Data Mining Consultants, a very experienced modeler, told me that interest in segmentation always increases during recessions, because companies are more determined to increase the return on diminished budgets. This is apparently common knowledge in the modeling community, although it was news to me. In any case, it is consistent with what I’ve been hearing from vendors both at the show and in separate conversations over the past few weeks—products for event detection, automated recommendations and contact optimization, which have existed for years but gained few users, are now showing substantial growth.

Before anyone gets too excited, let’s remember that this is off a very small base. Products that sold two installations over the past three years might have added another six in 2007. Many of these products are now offered by larger firms than before, either because the original developer was purchased by a bigger company or because a large company developed its own version. So it’s possible the growth could simply be due to better distribution and more vendor credibility, in which case it could be a one-time increase. But the vendors tell me that interest is strong across all stages of the pre-purchase pipeline, so I suspect this uptick in sales is a precursor to continued expansion.

My personal theory is that the industry has matured in the sense that there are many more people now who have been doing serious database marketing for ten or fifteen years. These people saw the benefits of advanced techniques at the handful of industry leaders early in their careers, and have now moved into senior positions at other companies where they have the experience and authority to replicate those environments.

Of course, other reasons contribute as well: much of the infrastructure is finally in place (data warehouses, modeling systems, etc.); integration is getting easier due to modern technologies like J2EE and Service Oriented Architectures; vendors are becoming more willing to open up their systems through published APIs; and the analytical systems themselves are getting more powerful, more reliable, easier to use, and cheaper. Plus, as we’re all tired of hearing, customers have higher expectations for personalized treatments and competitive pressures continue to increase. I’d still argue that the presence of knowledgeable buyers is the really critical variable. I think this is an empirically testable hypothesis, but it’s not worth the trouble of finding out whether I’m correct.

Back to NCDM. The other major conclusion I’m taking from the show is confirmation of our recent experience at Client X Client that people are very interested in accessing their data in general, and in “marketing dashboards” in particular. There were several sessions at the show on dashboards and marketing measurement, and these seemed quite well attended. There were also a number of exhibitors with dashboard-type products, including TFC, Nicto from Integrale MDB, Tableau Software, and of course our own QlikView-based Client X Client. While there are substantial differences among the offerings, they all take the databases that people have spent years developing and make them more accessible without a lot of IT support.

This has been the problem we’ve heard about constantly: the data is there, but you need a technical person to write a SQL query, build a data cube or expose some metadata to use it, and that involves waiting in a long queue. We’ve found that QlikView addresses this directly because a business analyst can load detail data (e.g. individual transactions or accounts) and analyze it without any IT involvement (other than providing the data access in the first place). The other products listed can also access detail data, although they mostly read it from conventional relational databases, which are nowhere near as fast as QlikTech’s large-volume, in-memory data engine. (Tableau does some in-memory analysis, but only on relatively small data volumes.) It’s not that I’m trying to sell QlikTech here, but it’s important to understand that its combination of in-memory processing and high scalability provides capabilities these other systems cannot. (The other systems have their own strengths, so they’re not exactly direct competitors. Tableau has vastly better visualization than QlikTech, and both TFC and Integrale provide database building services that Client X Client does not.)
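The pattern is easy to illustrate, though the sketch below is ordinary Python rather than QlikView’s own load scripts: pull the detail records into memory once, then answer ad hoc questions against them without going back to IT for new SQL or cubes. The file and column names are hypothetical.

```python
import csv
from collections import defaultdict

# Conceptual sketch only: load the detail records into memory once, then
# answer ad hoc questions without new SQL, cubes or IT involvement.
# "transactions.csv" and its columns are hypothetical.
def load_transactions(path="transactions.csv"):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def total_amount_by(rows, dimension):
    """Ad hoc aggregation over any column in the detail data."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += float(row["amount"])
    return dict(totals)

# rows = load_transactions()
# total_amount_by(rows, "region")   # slice one way...
# total_amount_by(rows, "product")  # ...then another, with no new query
```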

I’ve concluded that QlikView’s core market is the business analyst community, precisely because it empowers them to do things without waiting for IT. IT departments are less interested in the product, not because they are protecting their turf but simply because they do know how to write the SQL queries and build the cubes, so it doesn’t really let them do anything new. From the IT perspective, QlikView looks like just another tool they have to learn, which they’ll avoid if possible. You can argue (and we frequently do) that IT benefits from having business users become more self-reliant, but that hasn’t seemed to help much, perhaps because IT can’t quite believe it’s true. A more persuasive advantage for IT seems to be that they themselves can use QlikView for data exploration and prototyping, since it’s great for that kind of work (very easy, very fast, very low cost). This is a direct benefit (makes their own job easier) rather than an indirect benefit (makes someone else’s job easier), so it’s no surprise it carries more weight.

Thursday, December 06, 2007

1010data Offers A Powerful Columnar Database

Back in October I wrote here about the resurgent interest in alternatives to standard relational databases for analytical applications. Vendors on my list included Alterian, BlueVenn (formerly SmartFocus), Vertica and QD Technology. Most use some form of columnar structure, meaning data is stored so the system can load only the columns required for a particular query. This reduces the total amount of data read from disk and therefore improves performance. Since a typical analytical query might read only a half-dozen columns out of hundreds or even thousands available, the savings can be tremendous.
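A toy sketch of the principle: if each column is stored separately, a query touches only the columns it needs. This is a conceptual illustration, not any vendor’s actual storage format.

```python
# Toy column store: each column lives in its own list (on disk, its own file),
# so a query that needs two columns out of hundreds reads only those two.
table = {
    "customer_id": [1, 2, 3, 4],
    "region":      ["east", "west", "east", "south"],
    "amount":      [120.0, 85.5, 310.0, 42.0],
    # ...hundreds of other columns never touched by this query...
}

def total_by_region(columns):
    totals = {}
    # Only "region" and "amount" are accessed; other columns stay on disk.
    for region, amount in zip(columns["region"], columns["amount"]):
        totals[region] = totals.get(region, 0.0) + amount
    return totals

print(total_by_region(table))   # {'east': 430.0, 'west': 85.5, 'south': 42.0}
```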

I recently learned about another columnar database, Tenbase from 1010data. Tenbase, introduced in 2000, turns out to be a worthy alternative to better-known columnar products.

Like other columnar systems, Tenbase is fast: an initial query against a 4.3 billion row, 305 gigabyte table came back in about 12 seconds. Subsequent queries against the results were virtually instantaneous, because they were limited to the selected data and that data had been moved into memory. Although in-memory queries will always be faster, Tenbase says reading from disk takes only three times as long, which is a very low ratio. This reflects a focused effort by the company to make disk access as quick as possible.

What’s particularly intriguing is that Tenbase achieves this performance without compressing, aggregating or restructuring the input. Although indexes are used in some situations, queries generally read the actual data. Even with indexes, the Tenbase files usually occupy about the same amount of space as the input. This factor varies widely among columnar databases, which sometimes expand file size significantly and sometimes compress it. Tenbase also handles very large data sets: the largest in production is nearly 60 billion rows and 4.6 terabytes. Fast response on such large installations is maintained by adding servers that process queries in parallel. Each server contains a complete copy of the full data set.

Tenbase can import data from text files or connect directly to multi-table relational databases. Load speed is about 30 million observations per minute for fixed width data. Depending on the details, this comes to around 10 gigabytes per hour. Time for incremental loads, which add new data to an existing table, is determined only by the volume of the new data. Some columnar databases essentially reload the entire file during an ‘incremental’ update.

Regardless of the physical organization, Tenbase presents loaded data as if it were in the tables of a conventional relational database. Multiple tables can be linked on different keys and queried. This contrasts with some columnar systems that require all tables be linked on the same key, such as a customer ID.

Tenbase has an ODBC connector that lets it accept standard SQL queries. Results come back as quickly as queries in the system’s own query language. This is also special: some columnar systems run SQL queries much more slowly or won’t accept them at all. The Tenbase developers demonstrated this feature by querying a 500 million row database through Microsoft Access, which feels a little like opening the door to a closet and finding yourself in the Sahara desert.
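Because the connector speaks standard ODBC, any ODBC-capable client should be able to issue ordinary SQL. Here is a minimal sketch in Python; the DSN, table and columns are placeholders I have made up, since I haven’t seen Tenbase’s actual connection settings.

```python
import pyodbc  # generic ODBC client; works against any ODBC data source

# The DSN name, table and columns below are placeholders for illustration.
conn = pyodbc.connect("DSN=tenbase_example")
cursor = conn.cursor()

# Ordinary SQL, handled through the ODBC driver like any relational source.
cursor.execute("""
    SELECT symbol, COUNT(*) AS trades, SUM(volume) AS total_volume
    FROM   trades
    WHERE  trade_date = ?
    GROUP BY symbol
""", "2007-12-05")

for symbol, trades, total_volume in cursor.fetchall():
    print(symbol, trades, total_volume)
conn.close()
```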

Tenbase’s own query language is considerably more powerful than SQL. It gives users advanced functions for time-series analysis, which allows many types of comparisons between rows in the data set. It also contains a variety of statistical, aggregation and calculation functions. It’s still set-based rather than a procedural programming language, so it doesn’t support procedural features such as if/then logic and loops. This is one area where some other columnar databases may have an edge.
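I won’t attempt to reproduce Tenbase’s own syntax here; as a stand-in, this plain-Python snippet shows the kind of row-to-row, time-series comparison described above.

```python
# Stand-in for the kind of row-over-row, time-series comparison described
# above; this is ordinary Python, not Tenbase's query language.
prices = [
    ("2007-12-03", 100.0),
    ("2007-12-04", 102.5),
    ("2007-12-05", 101.0),
    ("2007-12-06", 104.0),
]

# Day-over-day change: each row compared with the row before it.
changes = [
    (day, price - prev_price)
    for (_, prev_price), (day, price) in zip(prices, prices[1:])
]
print(changes)  # [('2007-12-04', 2.5), ('2007-12-05', -1.5), ('2007-12-06', 3.0)]
```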

The Tenbase query interface is rather plain but it does let users pick the columns and values to select by, and the columns and summary types to include in the result. Users can also specify a reference column for calculations such as weighted averages. Results can be viewed as tables, charts or cross tabs (limited to one value per intersection), which can themselves be queried. Outputs can be exported in Excel, PDF, XML, text or CSV formats. The interface also lets users create calculated columns and define links among tables.

Under the covers, the Tenbase interface automatically creates XML statements written to the Tenbase API. Users can view and edit the XML or write their own statements from scratch. This lets them create alternate interfaces for special purposes or simply to improve the esthetics. Queries built in Tenbase can be saved and rerun either in their original form or with options for users to enter new values at run time. The latter feature gives a simple way to build query applications for casual users.

The user interface is browser-based, so no desktop client is needed. Perhaps I'm easily impressed, but I like that the browser back button actually works. This is often not the case in such systems. Performance depends on the amount of data and query complexity but it scales with the number of servers, so even very demanding queries against huge databases can be returned in a few minutes with the right hardware. The servers themselves are commodity Windows PCs. Simple queries generally come back in seconds.

Tenbase clients pay for the system on a monthly basis. Fees are based primarily on the number of servers, which is determined by the number of users, amount of data, types of queries, and other details. The company does not publish its pricing but the figures it mentioned seemed competitive. The servers can reside at 1010data or at the client site, although 1010data will manage them either way. Users can load data themselves no matter where the server is located.

Most Tenbase clients are in the securities industry, where the product is used for complex analytics. The company has recently added several customers in retail, consumer goods and health care. There are about 45 active Tenbase installations, including the New York Stock Exchange, Procter & Gamble and Pathmark Stores.