Wednesday, December 26, 2007

Eventricity Lets Banks Buy, Not Build, Event-Based Marketing Systems

As you may recall from my posts on Unica and SAS, event-based marketing (also called behavior identification) seems to be gaining traction at long last. By coincidence, I recently found some notes I made two years ago about a UK-based firm named eventricity Ltd. This led to a long conversation with eventricity founder Mark Holtom, who turned out to be an industry veteran with background at NCR/Teradata and AIMS Software, where he worked on several of the pioneering projects in the field.

Eventricity, launched in 2003, is Holtom’s effort to convert the largely custom implementations he had seen elsewhere into a more packaged software product. Similar offerings do exist, from Harte-Hanks, Teradata and Conclusive Marketing (successor to Synapse Technology) as well as Unica and SAS. But those are all part of a larger product line, while eventricity offers event-based software alone.

Specifically, eventricity has two products: Timeframe event detection and Coffee event filtering. Both run on standard servers and relational databases (currently implemented on Oracle and SQL Server). This contrasts with many other event-detection systems, which use special data structures to capture event data efficiently. Scalability doesn’t seem to be an issue for eventricity: Holtom said it processes data for one million customers (500 million transactions, 28 events) in one hour on a dual processor Dell server.

One of the big challenges with event detection is defining the events themselves. Eventricity is delivered with a couple dozen basic events, such as unusually large deposit, end of a mortgage, significant birthday, first salary check, and first overdraft. These are defined with SQL statements, which imposes some limits in both complexity and end-user control. For example, although events can consider transactions during a specified time period, they cannot use a sequence of transactions (e.g., an overdraft followed by a withdrawal). And since few marketers can write their own SQL, creation of new events takes outside help.
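
To make this concrete, here is a purely hypothetical sketch of how an “unusually large deposit” event might be expressed as a parameterized SQL statement. It is not eventricity’s actual code; the table, fields and threshold are my own inventions, with the multiple-of-average threshold left as the kind of parameter a marketer would tune.

```python
# Hypothetical illustration only: eventricity's real event definitions are not public.
# An "unusually large deposit" event as parameterized SQL, with the threshold
# (a multiple of the customer's prior average deposit) exposed as a parameter.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (customer_id INTEGER, txn_date TEXT,
                               txn_type TEXT, amount REAL);
    INSERT INTO transactions VALUES
        (1, '2007-12-01', 'DEPOSIT', 500),
        (1, '2007-12-08', 'DEPOSIT', 450),
        (1, '2007-12-20', 'DEPOSIT', 5200),   -- unusually large
        (2, '2007-12-15', 'DEPOSIT', 300);
""")

EVENT_SQL = """
    SELECT customer_id, txn_date, amount
    FROM transactions t
    WHERE txn_type = 'DEPOSIT'
      AND amount > :multiple * (SELECT AVG(amount)
                                FROM transactions
                                WHERE customer_id = t.customer_id
                                  AND txn_type = 'DEPOSIT'
                                  AND txn_date < t.txn_date)
"""

# The :multiple value is the sort of parameter a marketer would adjust per segment.
for row in conn.execute(EVENT_SQL, {"multiple": 3.0}):
    print(row)   # -> (1, '2007-12-20', 5200.0)
```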

But users do have great flexibility once the events are built. Timeframe has a graphical interface that lets users specify parameters, such as minimum values, percentages and time intervals, which are passed through to the underlying SQL. Different parameters can be assigned to customers in different segments. Users can also give each event its own processing schedule, and can combine several events into a “super event”.

Coffee adds still more power, aimed at distilling a trickle of significant leads from the flood of raw events. This involves filters to determine which events to consider, ranking to decide which leads to handle first, and distribution to determine which channels will process them. Filters can consider event recency, rules for contact frequency, and customer type. Eligible events are ranked based on event type and processing sequence. Distribution can be based on channel capacity and channel priorities by customer segment: the highest-ranked leads are handled first.
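
The filter-rank-distribute sequence is easier to picture in code. Here is a minimal, hypothetical Python sketch of that kind of pipeline; the field names, rules and capacities are my own inventions, not Coffee’s.

```python
# Hypothetical sketch of an event-filtering pipeline in the spirit of Coffee:
# filter raw events, rank the survivors, then distribute by channel capacity.
from datetime import date

events = [
    {"customer": "A", "type": "large_deposit", "date": date(2007, 12, 24), "segment": "premium"},
    {"customer": "B", "type": "first_overdraft", "date": date(2007, 12, 1), "segment": "mass"},
    {"customer": "C", "type": "mortgage_end", "date": date(2007, 12, 23), "segment": "premium"},
]

RANK = {"mortgage_end": 1, "large_deposit": 2, "first_overdraft": 3}  # lower = higher priority
CAPACITY = {"account_manager": 1, "call_center": 10}                  # daily lead capacity

def recent(e, today=date(2007, 12, 26), max_age_days=7):
    return (today - e["date"]).days <= max_age_days

# 1. Filter: keep only recent events (contact-frequency and customer-type rules would go here too).
leads = [e for e in events if recent(e)]

# 2. Rank: order eligible leads by event type.
leads.sort(key=lambda e: RANK[e["type"]])

# 3. Distribute: premium customers to account managers until capacity runs out, the rest to the call center.
assigned, used = [], {"account_manager": 0, "call_center": 0}
for lead in leads:
    channel = "account_manager" if lead["segment"] == "premium" else "call_center"
    if used[channel] >= CAPACITY[channel]:
        channel = "call_center"
    used[channel] += 1
    assigned.append((lead["customer"], lead["type"], channel))

print(assigned)  # [('C', 'mortgage_end', 'account_manager'), ('A', 'large_deposit', 'call_center')]
```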

What eventricity does not do is decide what offer should be triggered by each event. Rather, the intent is to feed the leads to call centers or account managers who will call the customer, assess the situation, and react appropriately. Several event-detection vendors share this approach, arguing that automated systems are too error-prone to pre-select a specific offer. Other vendors do support automated offers, arguing that automated contacts are so inexpensive that they are profitable even if the targeting is inexact. The counter-argument of the first group is that poorly targeted offers harm the customer relationship, so the true cost goes beyond the expense of sending the message itself.

What all event-detection vendors agree on is the need for speed. Timeframe cites studies showing that real-time reaction to events can yield an 82% success rate, vs. 70% for response within 12 hours, 25% in 48 hours and 10% in four days. Holtom argues that the difference in results between next-day response and real time (which Timeframe does not support) is not worth the extra cost, particularly since few if any banks can share and react to events across all channels in real time.

Still, the real question is not why banks won’t put in real-time event detection systems, but why so few have bought the overnight event detection products already available. The eventricity Web site cites several cases with mouth-watering results. My own explanation has long been that most banks cannot act on the leads these systems generate: either they lack the contact management systems or cannot convince personal bankers to make the calls. Some vendors have agreed.

But Holtom and others argue the main problem is that banks build their own event-detection systems rather than purchasing someone else’s. This is certainly plausible for the large institutions. Event detection looks simple. It’s the sort of project in-house IT and analytical departments would find appealing. The problem for the software vendors is that once a company builds its own system, it’s unlikely to buy an outside product: if the internal system works, there’s no need to replace it, and if it doesn’t work, well, then, the idea has been tested and failed, hasn’t it?

For the record, Holtom and other vendors argue their experience has taught them where to look for the most important events, providing better results faster than an in-house team. The most important trick is event filtering: identifying the tiny fraction of daily events that are most likely to signal productive leads. In one example Holtom cites, a company’s existing event-detection project yielded an unmanageable 660,000 leads per day, compared with a handy 16,000 for eventricity.

The vendors also argue that buying an external system is much cheaper than building one yourself. This is certainly true, but something that internal departments rarely acknowledge, and accounting systems often obscure.

Eventricity’s solution to the marketing challenge is a low-cost initial trial, which includes in-house set-up and scanning for three to five events for a three-month period. Cost is 75,000 Euros, or about $110,000 at today’s pitiful exchange rate. Pricing on the actual software starts as low as $50,000 and would be about 250,000 Euros ($360,000) for a bank with one million customers. Implementation takes ten to twelve weeks. Eventricity has been sold and implemented at Banca Antonveneta in Italy, and several other trials are in various stages.

Tuesday, December 18, 2007

Unica Strategy Stays the Course

I recently caught up with Unica Vice President Andrew Hally as part of my review of developments at the major marketing automation vendors. It’s been a good year for Unica, which will break $100 million in annual revenue for the first time. On the product front, they continue their long-time strategy of offering all the software a marketing department would need. This has mostly meant incremental development of existing components, including continued assimilation of past years’ acquisitions in Web analytics, email, marketing planning, lead management, and event detection. The one major acquisition in 2007 was Marketing Central, which provides hosted marketing resource management. This was well within Unica’s traditional scope, although the “hosted” part is a bit of a change. It will help Unica serve smaller companies in addition to its traditional enterprise clients. But since Unica must penetrate this segment to continue growing, this is less a detour from the company’s primary strategy than a logical extension of it.

Hally did say that Unica still sees substantial growth potential among large enterprises. He said the company still frequently finds itself replacing home-grown systems at big companies, not just earlier generations of purchased campaign management systems. This surprises me a bit, although I suppose many firms never saw the need to replace systems that were already working. I do wonder how many of the holdouts will change their minds each year. If the number is small, then Unica and its competitors will be mostly selling into a replacement market, which can’t be very large. On the other hand, continued demand for new capabilities in Web and mobile marketing should lead companies to replace even relatively new systems. So perhaps the replacement market will be bigger than it seems. Certainly Unica has added features for email and Web analytics. But it still has gaps in ad serving, keyword management, and mobile. Acquisitions in those areas would not be surprising.

Probably the most interesting development Hally reported was a sharp rise in sales for Unica’s Affinium Detect, an event-detection system based on the Elity software acquired with MarketSoft in 2005. Hally said Detect is now Unica’s third-best-selling product, with one to two dozen installations. This compares with the handful that Elity had sold when it was independent. He attributed the growth both to increased demand and to the reduced risk marketers see in buying from Unica. He also reported the product has been sold for telecommunication and credit card applications, in addition to the traditional retail banking.

While at NCDM, I also took a look at Unica’s newest module, an ad hoc reporting package called Affinium Insight. This provides some basic visualization and analysis capabilities for non-technical users. It is designed to free up marketing analysts for more demanding work by giving business users a simple graphical interface linked to data marts assembled with standard Unica technology. The interface resembles the NetTracker Web analytics system Unica acquired with Sane Solutions in 2006.

Wednesday, December 12, 2007

Advanced Analytics and Still More Reasons I Love QlikView

I’m at the National Center for Database Marketing Conference this week. NCDM is always a good place to get a feel for what’s on people’s minds.

One theme I’ve picked up is a greater interest in advanced analytics. Richard Deere of Direct Data Mining Consultants, a very experienced modeler, told me that interest in segmentation always increases during recessions, because companies are more determined to increase the return on diminished budgets. This is apparently common knowledge in the modeling community, although it was news to me. In any case, it is consistent with what I’ve been hearing from vendors both at the show and in separate conversations over the past few weeks—products for event detection, automated recommendations and contact optimization, which have existed for years but gained few users, are now showing substantial growth.

Before anyone gets too excited, let’s remember that this is off a very small base. Products that sold two installations over the past three years might have added another six in 2007. Many of these products are now offered by larger firms than before, either because the original developer was purchased by a bigger company or because a large company developed their own version. So it’s possible the growth could simply be due to better distribution and more vendor credibility, in which case it could be a one-time increase. But the vendors tell me that interest is strong across all stages of the pre-purchase pipeline, so I suspect this uptick in sales is a precursor to continued expansion.

My personal theory is that the industry has matured in the sense that there are many more people now who have been doing serious database marketing for ten or fifteen years. These people saw the benefits of advanced techniques at the handful of industry leaders early in their careers, and have now moved into senior positions at other companies where they have the experience and authority to replicate those environments.

Of course, other reasons contribute as well: much of the infrastructure is finally in place (data warehouses, modeling systems, etc.); integration is getting easier due to modern technologies like J2EE and Service Oriented Architectures; vendors are becoming more willing to open up their systems through published APIs; and the analytical systems themselves are getting more powerful, more reliable, easier to use, and cheaper. Plus, as we’re all tired of hearing, customers have higher expectations for personalized treatments and competitive pressures continue to increase. I’d still argue that the presence of knowledgeable buyers is the really critical variable. I think this is an empirically testable hypothesis, but it's not worth the trouble of finding out whether I’m correct.

Back to NCDM. The other major conclusion I’m taking from the show is confirmation of our recent experience at Client X Client that people are very interested in accessing their data in general, and in “marketing dashboards” in particular. There were several sessions at the show on dashboards and marketing measurement, and these seemed quite well attended. There were also a number of exhibitors with dashboard-type products, including TFC, Nicto from Integrale MDB, Tableau Software, and of course our own QlikView-based Client X Client. While there are substantial differences among the offerings, they all take the databases that people have spent years developing and make them more accessible without a lot of IT support.

This has been the problem we’ve heard about constantly: the data is there, but you need a technical person to write a SQL query, build a data cube or expose some metadata to use it, and that involves waiting on a long queue. We’ve found that QlikView addresses this directly because a business analyst can load detail data (e.g. individual transactions or accounts) and analyze it without any IT involvement (other than providing the data access in the first place). The other products listed can also access detail data, although they mostly read it from conventional relational databases, which are nowhere near as fast as QlikTech’s large-volume, in-memory data engine. (Tableau does some in-memory analysis, but only on relatively small data volumes.) It’s not that I’m trying to sell QlikTech here, but it’s important to understand that its combination of in-memory and high scalability provides capabilities these other systems cannot. (The other systems have their own strengths, so they’re not exactly direct competitors. Tableau has vastly better visualization than QlikTech, and both TFC and Integrale provide database building services that Client X Client does not.)

I’ve concluded that QlikView’s core market is the business analyst community, precisely because it empowers them to do things without waiting for IT. IT departments are less interested in the product, not because they are protecting their turf but simply because they do know how to write the SQL queries and build the cubes, so it doesn’t really let them do anything new. From the IT perspective, QlikView looks like just another tool they have to learn, which they’ll avoid if possible. You can argue (and we frequently do) that IT benefits from having business users become more self-reliant, but that hasn’t seemed to help much, perhaps because IT can’t quite believe it’s true. A more persuasive advantage for IT seems to be that they themselves can use QlikView for data exploration and prototyping, since it’s great for that kind of work (very easy, very fast, very low cost). This is a direct benefit (makes their own job easier) rather than an indirect benefit (makes someone else’s job easier), so it’s no surprise it carries more weight.

Thursday, December 06, 2007

1010data Offers A Powerful Columnar Database

Back in October I wrote here about the resurgent interest in alternatives to standard relational databases for analytical applications. Vendors on my list included Alterian, BlueVenn (formerly SmartFocus), Vertica and QD Technology. Most use some form of a columnar structure, meaning data is stored so the system can load only the columns required for a particular query. This reduces the total amount of data read from disk and therefore improves performance. Since a typical analytical query might read only a half-dozen columns out of hundreds or even thousands available, the savings can be tremendous.
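
A toy illustration of the principle (my own, not any vendor’s implementation): if each column is stored as its own file or array, a query that touches two columns never has to read the other hundreds.

```python
# Toy illustration of why columnar storage helps analytical queries:
# each column lives in its own array/file, so a query reads only what it needs.
import numpy as np

n_rows = 1_000_000
columns = {
    "amount": np.random.rand(n_rows) * 100,        # the two columns this query needs
    "region": np.random.randint(0, 5, n_rows),
    # ...imagine hundreds of other columns stored alongside, untouched by this query
}

# "SELECT SUM(amount) WHERE region = 3" touches only two of the stored columns.
mask = columns["region"] == 3
print(columns["amount"][mask].sum())
```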

I recently learned about another columnar database, Tenbase from 1010data. Tenbase, introduced in 2000, turns out to be a worthy alternative to better-known columnar products.

Like other columnar systems, Tenbase is fast: an initial query against a 4.3 billion row, 305 gigabyte table came back in about 12 seconds. Subsequent queries against the results were virtually instantaneous, because they were limited to the selected data and that data had been moved into memory. Although in-memory queries will always be faster, Tenbase says reading from disk takes only three times as long, which is a very low ratio. This reflects a focused effort by the company to make disk access as quick as possible.

What’s particularly intriguing is that Tenbase achieves this performance without compressing, aggregating or restructuring the input. Although indexes are used in some situations, queries generally read the actual data. Even with indexes, the Tenbase files usually occupy about the same amount of space as the input. This factor varies widely among columnar databases, which sometimes expand file size significantly and sometimes compress it. Tenbase also handles very large data sets: the largest in production is nearly 60 billion rows and 4.6 terabytes. Fast response on such large installations is maintained by adding servers that process queries in parallel. Each server contains a complete copy of the full data set.

Tenbase can import data from text files or connect directly to multi-table relational databases. Load speed is about 30 million observations per minute for fixed width data. Depending on the details, this comes to around 10 gigabytes per hour. Time for incremental loads, which add new data to an existing table, is determined only by the volume of the new data. Some columnar databases essentially reload the entire file during an ‘incremental’ update.

Regardless of the physical organization, Tenbase presents loaded data as if it were in the tables of a conventional relational database. Multiple tables can be linked on different keys and queried. This contrasts with some columnar systems that require all tables be linked on the same key, such as a customer ID.

Tenbase has an ODBC connector that lets it accept standard SQL queries. Results come back as quickly as queries in the system’s own query language. This is also special: some columnar systems run SQL queries much more slowly or won’t accept them at all. The Tenbase developers demonstrated this feature by querying a 500 million row database through Microsoft Access, which feels a little like opening the door to a closet and finding yourself in the Sahara desert.

Tenbase’s own query language is considerably more powerful than SQL. It gives users advanced functions for time-series analysis, which actually allows many types of comparisons between rows in the data set. It also contains a variety of statistical, aggregation and calculation functions. It’s still set-based rather than a procedural programming language, so it doesn’t support procedural features such as if/then logic and loops. This is one area where some other columnar databases may have an edge.
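
I don’t have Tenbase’s syntax to show, but the kind of inter-row, time-series comparison being described looks roughly like this pandas sketch (an analogy of my own, not Tenbase’s language), and it is the sort of thing that is awkward to express in plain SQL.

```python
# Rough analogue (in pandas, not Tenbase's own language) of the inter-row,
# time-series comparisons described above: compare each trade to the prior one.
import pandas as pd

trades = pd.DataFrame({
    "symbol": ["XYZ", "XYZ", "XYZ", "ABC", "ABC"],
    "time":   pd.to_datetime(["09:30", "09:31", "09:35", "09:30", "09:40"]),
    "price":  [10.0, 10.2, 9.8, 50.0, 51.5],
}).sort_values(["symbol", "time"])

# Row-over-row price change and gap since the previous trade, per symbol.
trades["price_change"] = trades.groupby("symbol")["price"].diff()
trades["minutes_since_prior"] = (
    trades.groupby("symbol")["time"].diff().dt.total_seconds() / 60
)
print(trades)
```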

The Tenbase query interface is rather plain but it does let users pick the columns and values to select by, and the columns and summary types to include in the result. Users can also specify a reference column for calculations such as weighted averages. Results can be viewed as tables, charts or cross tabs (limited to one value per intersection), which can themselves be queried. Outputs can be exported in Excel, PDF, XML, text or CSV formats. The interface also lets users create calculated columns and define links among tables.

Under the covers, the Tenbase interface automatically creates XML statements written to the Tenbase API. Users can view and edit the XML or write their own statements from scratch. This lets them create alternate interfaces for special purposes or simply to improve the esthetics. Queries built in Tenbase can be saved and rerun either in their original form or with options for users to enter new values at run time. The latter feature gives a simple way to build query applications for casual users.

The user interface is browser-based, so no desktop client is needed. Perhaps I'm easily impressed, but I like that the browser back button actually works. This is often not the case in such systems. Performance depends on the amount of data and query complexity but it scales with the number of servers, so even very demanding queries against huge databases can be returned in a few minutes with the right hardware. The servers themselves are commodity Windows PCs. Simple queries generally come back in seconds.

Tenbase clients pay for the system on a monthly basis. Fees are based primarily on the number of servers, which is determined by the number of users, amount of data, types of queries, and other details. The company does not publish its pricing but the figures it mentioned seemed competitive. The servers can reside at 1010data or at the client, although 1010data will manage them either way. Users can load data themselves no matter where the server is located.

Most Tenbase clients are in the securities industry, where the product is used for complex analytics. The company has recently added several customers in retail, consumer goods and health care. There are about 45 active Tenbase installations, including the New York Stock Exchange, Procter & Gamble and Pathmark Stores.

Thursday, November 29, 2007

Low Cost CDI from Infosolve, Pentaho and StrikeIron

As I’ve mentioned in a couple of previous posts, QlikView doesn’t have the built-in matching functions needed for customer data integration (CDI). This has left me looking for other ways to provide that service, preferably at a low cost. The problem is that the major CDI products like Harte-Hanks Trillium, DataMentors DataFuse and SAS DataFlux are fairly expensive.

One intriguing alternative is Infosolve Technologies. Looking at the Infosolve Web site, it’s clear they offer something relevant, since two flagship products are ‘OpenDQ’ and ‘OpenCDI’ and their tag line is ‘The Power of Zero Based Data Solutions’. But I couldn't figure out exactly what they were selling since they stress that there are ‘never any licenses, hardware requirements or term contracts’. So I broke down and asked them.

It turns out that Infosolve is a consulting firm that uses free open source technology, specifically the Pentaho platform for data integration and business intelligence. A Certified Development partner of Pentaho, Infosolve has developed its own data quality and CDI components on the platform and simply sells the consulting needed to deploy it. Interesting.

Infosolve Vice President Subbu Manchiraju and Director of Alliances Richard Romanik spent some time going over the details and gave me a brief demonstration of the platform. Basically, Pentaho lets users build graphical workflows that link components for data extracts, transformation, profiling, matching, enhancement, and reporting. It looked every bit as good as similar commercial products.

Two particular points were worth noting:

- the actual matching approach seems acceptable. Users build rules that specify which fields to compare, the methods used to measure similarity, and the similarity scores required for each field (see the sketch after this list). This is less sophisticated than the best commercial products, but field comparisons are probably adequate for most situations. Although setting up and tuning such rules can be time-consuming, Infosolve told me they can build a typical set of match routines in about half a day. More experienced or adventurous users could even do it themselves; the user interface makes the mechanics very simple. A half-day of consulting might cost $1,000, which is not bad at all when you consider that the software itself is free. The price for a full implementation would be higher since it would involve additional consulting to set up data extracts, standardization, enhancement and other processes, but the cost should still be very reasonable. You’d probably need just as much consulting with other CDI systems, where you’d also pay for the software.

- data verification and enhancement is done by calls to StrikeIron, which provides a slew of on-demand data services. StrikeIron is worth knowing about in its own right: it lets users access Web services including global address verification and corrections; consumer and business data lookups using D&B and Gale Group data; telephone verification, appends and reverse appends; geocoding and distance calculations; Mapquest mapping and directions; name/address parsing; sales tax lookups; local weather forecasts; securities prices; real-time fraud detection; and message delivery for text (SMS) and voice (IVR). Everything is priced on a per use basis. This opens up all sorts of interesting possibilities.
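
To give a feel for what rule-based matching involves, here is my own minimal sketch (not Infosolve’s OpenDQ code): the rules boil down to per-field comparison functions, weights, and a threshold.

```python
# Minimal sketch of rule-based record matching: per-field similarity functions,
# weights, and a match threshold. Not Infosolve's actual implementation.
from difflib import SequenceMatcher

def similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

MATCH_RULES = [
    # (field, compare function, weight)
    ("last_name", similarity, 0.4),
    ("street",    similarity, 0.4),
    ("zip",       lambda a, b: 1.0 if a == b else 0.0, 0.2),
]
THRESHOLD = 0.85

def is_match(rec1, rec2):
    score = sum(w * fn(rec1[f], rec2[f]) for f, fn, w in MATCH_RULES)
    return score >= THRESHOLD, round(score, 3)

a = {"last_name": "Smith", "street": "12 Main Street", "zip": "07050"}
b = {"last_name": "Smyth", "street": "12 Main St",     "zip": "07050"}
print(is_match(a, b))   # likely a match despite the spelling differences
```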

The Infosolve software can be installed on any platform that can run Java, which is just about everything. Users can also run it within the Sun Grid utility network, which has a pay-as-you-go business model of $1 per CPU hour.

I’m a bit concerned about speed with Infosolve: the company said it takes 8 to 12 hours to run a million record match on a typical PC. But that assumes you compare every record against every other record, which usually isn’t necessary. Of course, where smaller volumes are concerned, this is not an issue.
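
A common way to avoid that all-pairs comparison is “blocking”: only compare records that share a cheap key such as a postal code. A sketch of the idea (mine, not Infosolve’s) is below; the match rules from the previous sketch would then be applied only to the surviving candidate pairs.

```python
# Blocking: group records by a cheap key (here, zip code) and compare candidates
# only within a block, instead of every record against every other record.
from collections import defaultdict
from itertools import combinations

records = [
    {"id": 1, "last_name": "Smith", "zip": "07050"},
    {"id": 2, "last_name": "Smyth", "zip": "07050"},
    {"id": 3, "last_name": "Jones", "zip": "10001"},
]

blocks = defaultdict(list)
for rec in records:
    blocks[rec["zip"]].append(rec)

candidate_pairs = [pair for block in blocks.values() for pair in combinations(block, 2)]
print(len(candidate_pairs))  # 1 pair instead of 3 for the full cross-comparison
```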

Bottom line: Infosolve and Pentaho may not meet the most extreme CDI requirements, but they could be a very attractive option when low cost and quick deployment are essential. I’ll certainly keep them in mind for my own clients.

Tuesday, November 27, 2007

Just How Scalable Is QlikTech?

A few days ago, I replied to a question regarding QlikTech scalability. (See What Makes QlikTech So Good?, August 3, 2007) I asked QlikTech itself for more information on the topic but haven’t learned anything new. So let me simply discuss this based on my own experience (and, once again, remind readers that while my firm is a QlikTech reseller, comments in this blog are strictly my own.)

The first thing I want to make clear is that QlikView is a wonderful product, so it would be a great pity if this discussion were to be taken as a criticism. Like any product, QlikView works within limits that must be understood to use it appropriately. No one benefits from unrealistic expectations, even if fans like me sometimes create them.

That said, let’s talk about what QlikTech is good at. I find two fundamental benefits from the product. The first is flexibility: it lets you analyze data in pretty much any way you want, without first building a data structure to accommodate your queries. By contrast, most business intelligence tools must pre-aggregate large data sets to deliver fast response. Often, users can’t even formulate a particular query if the dimensions or calculated measures were not specified in advance. Much of the development time and cost of conventional solutions, whether based in standard relational databases or specialized analytical structures, is spent on this sort of work. Avoiding it is the main reason QlikTech is able to deliver applications so quickly.

The other big benefit of QlikTech is scalability. I can work with millions of records on my desktop with the 32-bit version of the system (maximum memory 4 GB if your hardware allows it) and still get subsecond response. This is much more power than I’ve ever had before. A 64-bit server can work with tens or hundreds of millions of rows: the current limit for a single data set is apparently 2 billion rows, although I don’t know how close anyone has come to that in the field. I have personally worked with tables larger than 60 million rows, and QlikTech literature mentions an installation of 300 million rows. I strongly suspect that larger ones exist.

So far so good. But here’s the rub: there is a trade-off in QlikView between really big files and really great flexibility. The specific reason is that the more interesting types of flexibility often involve on-the-fly calculations, and those calculations require resources that slow down response. This is more a law of nature (there’s no free lunch) than a weakness in the product, but it does exist.

Let me give an example. One of the most powerful features of QlikView is a “calculated dimension”. This lets reports construct aggregates by grouping records according to ad hoc formulas. You might want to define ranges for a value such as age, income or unit price, or create categories using if/then/else statements. These formulas can get very complex, which is generally a good thing. But each formula must be calculated for each record every time it is used in a report. On a few thousand rows, this can happen in an instant, but on tens of millions of rows, it can take several minutes (or much longer if the formula is very demanding, such as on-the-fly ranking). At some point, the wait becomes unacceptable, particularly for users who have become accustomed to QlikView’s typically immediate response.

As problems go, this isn’t a bad one because it often has a simple solution: instead of on-the-fly calculations, precalculate the required values in QlikView scripts and store the results on each record. There’s little or no performance cost to this strategy since expanding the record size doesn’t seem to slow things down. The calculations do add time to the data load, but that happens only once, typically in an unattended batch process. (Another option is to increase the number and/or speed of processors on the server. QlikTech makes excellent use of multiple processors.)
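
The trade-off is easy to illustrate outside QlikView. This pandas sketch is only an analogy (not QlikView script syntax): bucketing a value on the fly repeats the work at every query, while a precalculated column is computed once at load time and stored on the record.

```python
# Analogy in pandas (not QlikView script): an ad hoc bucket computed at query
# time versus the same bucket precalculated once at load time and stored.
import numpy as np
import pandas as pd

df = pd.DataFrame({"customer": range(1_000_000),
                   "income": np.random.randint(10_000, 250_000, 1_000_000)})

# On-the-fly "calculated dimension": evaluated every time the report runs.
report = df.groupby(pd.cut(df["income"], [0, 50_000, 100_000, np.inf],
                           labels=["low", "mid", "high"]), observed=True).size()

# Precalculated at load time: the bucket becomes an ordinary stored column,
# so later reports just group by it with no per-row arithmetic.
df["income_band"] = pd.cut(df["income"], [0, 50_000, 100_000, np.inf],
                           labels=["low", "mid", "high"])
report2 = df.groupby("income_band", observed=True).size()
```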

The really good news is you can still get the best of both worlds: work out design details with ad hoc reports on small data sets; then, once the design is stabilized, add precalculations to handle large data volumes. This is vastly quicker than prebuilding everything before you can see even a sample. It’s also something that’s done by business analysts with a bit of QlikView training, not database administrators or architects.

Other aspects of formulas and database design also become more important in QlikView as data volumes grow larger. The general solution is the same: make the application more efficient through tighter database and report design. So even though it’s true that you can often just load data into QlikView and work with it immediately, it’s equally true that very large or sophisticated applications may take some tuning to work effectively. In other words, QlikView is not pure magic (any result you want for absolutely no work), but it does deliver much more value for a given amount of work than conventional business intelligence systems. That’s more than enough to justify the system.

Interestingly, I haven’t found that the complexity or over-all size of a particular data set impacts QlikView performance. That is, removing tables which are not used in a particular query doesn’t seem to speed up that query, nor does removing fields from tables within the query. This probably has to do with QlikTech’s “associative” database design, which treats each field independently and connects related fields directly to each other. But whatever the reason, most of the performance slow-downs I’ve encountered seem related to processing requirements.

And, yes, there are some upper limits to the absolute size of a QlikView implementation. Two billion rows is one, although my impression (I could be wrong) is that could be expanded if necessary. The need to load data into memory is another limit: even though the 64-bit address space is effectively infinite, there are physical limits to the amount of memory that can be attached to Windows servers. (A quick scan of the Dell site finds a maximum of 128 GB.) This could translate into more input data, since QlikView does some compression. At very large scales, processing speed will also impose a limit. Whatever the exact upper boundary, it’s clear that no one will be loading dozens of terabytes into QlikView any time soon. It can certainly be attached to a multi-terabyte warehouse, but would have to work with multi-gigabyte extracts. For most purposes, that’s plenty.

While I’m on the topic of scalability, let me repeat a couple of points I made in the comments on the August post. One addresses the notion that QlikTech can replace a data warehouse. This is true in the sense that QlikView can indeed load and join data directly from operational systems. But a data warehouse is usually more than a federated view of current operational tables. Most warehouses include data integration to link otherwise-disconnected operational data. For example, customer records from different systems often can only be linked through complex matching techniques because there is no shared key such as a universal customer ID. QlikView doesn’t offer that kind of matching. You might be able to build some of it using QlikView scripts, but you’d get better results at a lower cost from software designed for the purpose.

In addition, most warehouses store historical information that is not retained in operational systems. A typical example is end-of-month account balance. Some of these values can be recreated from transaction details but it’s usually much easier just to take and store a snapshot. Other data may simply be removed from operational systems after a relatively brief period. QlikView can act as a repository for such data: in fact, it’s quite well suited for this. Yet in such cases, it’s probably more accurate to say that QlikView is acting as the data warehouse than to say a warehouse is not required.

I hope this clarifies matters without discouraging anyone from considering QlikTech. Yes QlikView is a fabulous product. No it won’t replace your multi-terabyte data warehouse. Yes it will complement that warehouse, or possibly substitute for a much smaller one, by providing a tremendously flexible and efficient business intelligence system. No it won’t run itself: you’ll still need some technical skills to do complicated things on large data volumes. But for a combination of speed, power, flexibility and cost, QlikTech can’t be beat.

Thursday, November 15, 2007

SAS Adds Real Time Decisioning to Its Marketing Systems

I’ve been trying to pull together a post on SAS for some time. It’s not easy because their offerings are so diverse. The Web site lists 13 “Solution Lines” ranging from “Activity-Based Management” to “Web Analytics”. (SAS being SAS, these are indeed listed alphabetically.) The “Customer Relationship Management” Solution Line has 13 subcategories of its own (clearly no triskaidekaphobia here), ranging from “Credit Scoring” to “Web Analytics”.

Yes, you read that right: Web Analytics is listed both as a Solution Line and as a component of the CRM Solution. So is Profitability Management.

This is an accurate reflection of SAS’s fundamental software strategy, which is to leverage generic capabilities by packaging them for specific business areas. The various Solution Lines overlap in ways that are hard to describe but make perfect business sense.

Another reason for overlapping products is that SAS has made many acquisitions as it expands its product scope. Of the 13 products listed under Customer Relationship Management, I can immediately identify three as based on acquisitions, and believe there are several more. This is not necessarily a problem, but it always raises concerns about integration and standardization.

Sure enough, integration and shared interfaces are two of the major themes that SAS lists for the next round of Customer Intelligence releases, due in December. (“Customer Intelligence” is SAS’s name for the platform underlying its enterprise marketing offerings. Different SAS documents show it including five or seven or nine of the components within the CRM Solution, and sometimes components of other Solution Lines. Confused yet? SAS tells me they're working on clarifying all this in the future.)

Labels aside, the biggest news within this release of Customer Intelligence is the addition of Real-Time Decision Manager (RDM), a system that…well…makes decisions in real time. This is a brand new, SAS-developed module, set for release in December with initial deployments in first quarter 2008. It is not to be confused with SAS Interaction Management, an event-detection system based on SAS's Verbind acquisition in 2002. SAS says it intends to tightly integrate RDM and Interaction Manager during 2008, but hasn’t worked out the details.

Real-Time Decision Manager lets users define the flow of a decision process, applying reusable nodes that can contain both decision rules and predictive models. Flows are then made available to customer touchpoint systems as Web services using J2EE. The predictive models themselves are built outside of RDM using traditional SAS modeling tools. They are then registered with RDM to become available for use as process nodes.
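
As a rough mental model (my own sketch, not RDM’s actual API), a flow of reusable nodes mixing business rules with a registered, externally built model might look like this:

```python
# Hypothetical sketch of a decision flow built from reusable nodes, some holding
# business rules and one wrapping an externally built predictive model.
# A mental model only, not SAS Real-Time Decision Manager's API.

def eligibility_rule(ctx):
    # Rule node: suppress offers for customers contacted in the last 30 days.
    return ctx if ctx["days_since_last_contact"] > 30 else None

def registered_model(ctx):
    # Model node: stands in for a score from a model built with external tooling.
    ctx["score"] = 0.02 * ctx["balance"] / 1000 + 0.3 * ctx["recent_web_visits"]
    return ctx

def offer_rule(ctx):
    # Rule node: turn the score into a concrete treatment.
    ctx["offer"] = "premium_card" if ctx["score"] > 0.5 else "no_offer"
    return ctx

FLOW = [eligibility_rule, registered_model, offer_rule]

def decide(ctx):
    for node in FLOW:
        ctx = node(ctx)
        if ctx is None:
            return {"offer": "no_offer"}
    return ctx

# A customer touchpoint system would call this through a web service wrapper.
print(decide({"days_since_last_contact": 45, "balance": 12_000, "recent_web_visits": 2}))
```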

RDM's reliance on externally-built models contrasts with products that automatically create and refresh their own predictive models, notably the Infor CRM Epiphany Inbound Marketing system recently licensed by Teradata (see my post of October 31). SAS says that skilled users could deploy similar self-adjusting models, which use Bayesian techniques, in about half a day in RDM. The larger issue, according to SAS, is that such models are only appropriate in a limited set of situations. SAS argues its approach lets companies deploy whichever techniques are best suited to their needs.

But the whole point of the Infor/Epiphany approach is that many companies will never have the skilled statisticians to build and maintain large numbers of custom models. Self-generating models let these firms benefit from models even if the model performance is suboptimal. They also permit use of models in situations where the cost of building a manual model is prohibitive. Seems to me the best approach is for software to support both skilled users and auto-generated models, and let firms apply whichever makes sense.
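
For what it’s worth, the “self-adjusting Bayesian” idea can be surprisingly compact. A generic sketch (not Epiphany’s or SAS’s actual algorithm) using beta-binomial updating to choose between two offers:

```python
# Generic sketch of a self-adjusting Bayesian offer selector (beta-binomial /
# Thompson sampling). Illustrative only; not Infor Epiphany's or SAS's algorithm.
import random

class SelfAdjustingOffer:
    def __init__(self, offers):
        # One Beta(1, 1) prior per offer, updated with every observed response.
        self.stats = {o: {"successes": 1, "failures": 1} for o in offers}

    def choose(self):
        # Sample a plausible response rate for each offer and play the best draw.
        draws = {o: random.betavariate(s["successes"], s["failures"])
                 for o, s in self.stats.items()}
        return max(draws, key=draws.get)

    def record(self, offer, responded):
        key = "successes" if responded else "failures"
        self.stats[offer][key] += 1   # the model adjusts itself with each outcome

selector = SelfAdjustingOffer(["cash_back_card", "travel_card"])
offer = selector.choose()
selector.record(offer, responded=False)   # feedback loop: no statistician required
```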

Back to integration. RDM runs on its own server, which is separate from the SAS 9 server used by most Customer Intelligence components. This is probably necessary to ensure adequate real-time performance. RDM does use SAS management utilities to monitor server performance. More important, it shares flow design and administrative clients with SAS Marketing Automation, which is SAS’s primary campaign management software. This saves users from moving between different interfaces and allows sharing of user-built nodes across Customer Intelligence applications.

RDM and other Customer Intelligence components also now access the same contact history and response data. This resides in what SAS calls a “lightweight” reporting schema, in contrast to the detailed, application-specific data models used within the different Customer Intelligence components. Shared contact and response data simplifies coordination of customer treatments across these different systems. Further integration among the component data models would probably be helpful, but I can't say for sure.

The December release also contains major enhancements to SAS’s Marketing Optimization and Digital Marketing (formerly E-mail Marketing) products. Optimization now works faster and does a better job finding the best set of contacts for a group of customers. Digital Marketing now includes mobile messaging, RSS feeds and dynamic Web pages. It also integrates more closely with Marketing Automation, which would generally create lists that are sent to Digital Marketing for transmission. Within Marketing Automation itself, it’s easier to create custom nodes for project flows and to integrate statistical models.

These are some pretty interesting trees, but let’s step back and look at the forest. Loyal readers of this blog know I divide a complete marketing system into five components: planning/budgets, project management, content management, execution, and analytics. SAS is obviously focused on execution and analytics. Limited content management functions are embedded in the various execution modules. There is no separate project management component, although the workflow capabilities of Marketing Automation can be applied to tactical tasks like campaign setup.

Planning and budgeting are more complicated because they are spread among several components. The Customer Intelligence platform includes a Marketing Performance Management module which is based on SAS’s generic Performance Management solutions. This provides forecasting, planning, scorecards, key performance indicators, and so on. Separate Profitability Management and Activity-Based Management modules are similarly based on generic capabilities. (If you’re keeping score at home, Profitability Management is usually listed within Customer Intelligence and Activity-Based Management is not.) Finally, Customer Intelligence also includes Veridiem MRM. Acquired in 2006 and still largely separate from the other products, Veridiem provides marketing reporting, modeling, scenarios and collaborative tools based on marketing mix models.

This is definitely a little scary. Things may not be as bad as they sound: components that SAS has built from scratch or reengineered to work on the standard SAS platforms are probably better integrated than the jumble of product names suggests. Also bear in mind that most marketing activities occur within SAS Marketing Automation, a mature campaign engine with more than 150 installations. Still, users with a broad range of marketing requirements should recognize that while SAS will sell you many pieces of the puzzle, some assembly is definitely required.

Monday, November 12, 2007

BridgeTrack Integrates Some Online Channels

What do “Nude Pics of Pam Anderson” and “Real-Time Analytics, Reporting and Optimization Across All Media Channels” have in common?

1. Both headlines are sure to draw the interest of certain readers.
2. People who click on either are likely to be disappointed.

Truth be told, I’ve never clicked on a Pam Anderson headline, so I can only assume it would disappoint. But I found the second headline irresistible. It was attached to a press release about the 5.0 release of Sapient’s BridgeTrack marketing software.

Maybe next time I’ll try Pam instead. BridgeTrack seems pretty good at what it does, but is nowhere near what the headline suggests.

First the good news: BridgeTrack integrates email, ad serving, offer pages, and keyword bidding (via an OEM agreement with Omniture) through a single campaign interface. All channels draw on a common content store, prospect identifiers, and data warehouse to allow integrated cross-channel programs. Results from each channel are posted and available for analysis in real time.

That’s much more convenient than working with separate systems for each function, and is the real point of BridgeTrack. I haven’t taken a close look at the specific capabilities within each channel but they seem reasonably complete.

But it’s still far from “optimization across all media channels”.

Let’s start with “all media channels”. Ever hear of a little thing called “television”? Most people would include it in a list of all media channels. But the best that BridgeTrack can offer for TV or any other off-line channel is a media buying module that manages the purchasing workflow and stores basic planning information. Even in the digital world, BridgeTrack does little to address organic search optimization, Web analytics, mobile phones, or the exploding realm of social networks. In general, I prefer to evaluate software based on what it does rather than what it doesn’t do. But if BridgeTrack is going to promise me all channels, I think it’s legitimate to complain when they don’t deliver.

What about “optimization”? Same story, I’m afraid. BridgeTrack does automatically optimize ad delivery by comparing results for different advertisements (on user-defined measures such as conversion rates) and automatically selecting the most successful. The keyword bidding system is also automated, but that’s really Omniture.
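
That kind of automated selection is conceptually simple; something like the sketch below (mine, not BridgeTrack’s code) routes traffic to whichever creative has the best conversion rate so far.

```python
# Sketch (not BridgeTrack's code) of greedy creative selection: serve whichever
# ad currently has the best observed rate on the user-defined success measure.
stats = {
    "banner_a": {"impressions": 1200, "conversions": 18},
    "banner_b": {"impressions": 1150, "conversions": 31},
}

def conversion_rate(s):
    return s["conversions"] / s["impressions"] if s["impressions"] else 0.0

def next_ad():
    # A production system would also keep serving the other creatives occasionally
    # to gather fresh data rather than committing entirely to the current leader.
    return max(stats, key=lambda ad: conversion_rate(stats[ad]))

print(next_ad())   # 'banner_b' -- it wins until another creative overtakes it
```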

Otherwise, all optimization is manual. For example, the press release says the BridgeTrack campaign manager “reallocates marketing dollars across channels that generate the most incremental return-on-spend.” But all it really does is present reports. Users have to interpret them and make appropriate changes in marketing programs. Similarly, email and offer page optimization means watching the results of user-defined rules and adjusting the rules manually. Rather than claiming that BridgeTrack “does” optimization, it would be more accurate to say it “enables” it through integrated real time reports and unified campaign management. Given how hard it is to assemble information and coordinate campaigns without a tool like BridgeTrack, that’s actually quite enough.

Even within its chosen channels, BridgeTrack lacks automated predictive modeling and advanced analytics in general. (The ad server does offer some cool heat maps of placement performance.) This has direct consequences, since it means the system must rely heavily on user-defined rules to select appropriate customer treatments. Unfortunately, rule management is quite limited: users don’t even get statistics on how often different rules are fired or how they perform. The problem is compounded because rules can exist at many different levels, including within content, in content templates, and in campaign flows. Understanding interactions across different levels can be difficult, yet BridgeTrack provides little assistance. The central content store helps a bit, since rules embedded in a particular piece of content are shared automatically when the content is shared. BridgeTrack managers recognize this issue and hope to improve its rule management in the future.

In fact, despite the press release, BridgeTrack managers have a fairly realistic view of the product’s actual scope. This shows in recent agreements with Unica and Omniture to integrate with their respective marketing automation and Web analytics products. Users of the combined set of products would have many of the planning, off-line marketing, project management and analytical tools that BridgeTrack itself does not provide.

(Actually, based on the July 2007 press release describing the BridgeTrack integration, Omniture positions itself as “online business optimization software” that provides “one standard view across all marketing initiatives”. That’s a bold set of claims. I’m skeptical but will wait to examine them some other day.)

BridgeTrack is a hosted solution. Pricing is designed to be comparable with the point solutions it replaces and therefore is calculated differently for specific activities: message volume for ad serving and email, cost per click for search bid management, and traffic levels for landing page hosting. The campaign manager and reporting systems support all the different channels. These are not usually sold independently but could be purchased for a monthly fee. Customer data integration, which combines BridgeTrack-generated data with information from external sources for reporting and customer treatments, is charged as a professional services project.

Tuesday, November 06, 2007

Datran Media Sells Email Like Web Ads

I wasn’t able to get to the ad:tech conference in New York City this week, but did spend a little time looking at the show sponsors’ Web sites. (Oddly, I was unable to find an online listing of all the exhibitors. This seems like such a basic mistake for this particular group that I wonder whether it was intentional. But I can’t see a reason.)

Most of the sponsors are offering services related to online ad networks. These are important but just marginally relevant to my own concerns. I did however see some intriguing information from Datran Media, an email marketing vendor which seems to be emulating the model of the online ad networks. It’s hard to get a clear picture from its Web site, but my understanding is that Datran both provides conventional email distribution services via its Skylist subsidiary and helps companies purchase use of other firms’ email lists.

This latter capability is what’s intriguing. Datran is packaging email lists in the same way as advertising space on a Web site or conventional publication. That is, it treats each list as “inventory” that can be sold to the highest bidder in an online exchange. Datran not very creatively calls this “Exchange Online”, or EO. Presumably (this is one of the things I can’t tell from the Web site) the inventory is limited by the number of times a person can be contacted within a given period.

Datran also speaks of having an email universe of over 100 million unique consumers. I can’t tell if this is its own compiled list or the sum of the lists it sells on behalf of its clients, although I’m guessing the former. The company offers event-based selections within this universe, such as people who have recently responded to an offer or made a purchase. This is more like traditional direct mail list marketing than Web ad sales, not that there’s anything wrong with that. Completing the circle, Datran also offers event-triggered programs to its conventional email clients, for retention, cross sales and loyalty building. This is not unique, but it’s still just emerging as a best practice.

From my own perspective, treating an email list as an inventory of contact opportunities exactly mirrors the way we see things at Client X Client. In our terminology, each piece of inventory is a “slot” waiting to be filled. Wasting slots is as bad as wasting any other perishable inventory, be it a Web page view, airline seat, or stale doughnut. One of the core tasks in the CXC methodology is identifying previously unrecognized slots and then attempting to wring the greatest value possible from them. It’s pleasing to see that Datran has done exactly that, even though they came up with the idea without our help.

The notion of slots also highlights another piece of ambiguity about Datran: are email customers purchasing an entire email, or an advertisement inserted into existing email? There is language on the company site that suggests both possibilities, although I suspect it’s the entire email. Actually, embedding ads in existing emails might be a more productive use of the “slots” that those emails represent, since it would allow delivery of more messages per customer. Whether it would also annoy recipients or diminish response is something that would have to be tested.

Datran offers other services related to online marketing, such as landing page optimization. This illustrates another trend: combining many different online channels and methods in a single package. This is an important development in the world of marketing software, and I plan to write more about it in the near future.

Friday, November 02, 2007

The Next Big Leap for Marketing Software

I’ve often written about the tendency of marketing automation vendors to endlessly expand the scope of their products. Overall, this is probably a good thing for their customers. But at some point, the competitive advantage of adding yet another capability probably approaches nil. If so, then what will be the next really important change in marketing systems?

My guess is it will be a coordination mechanism to tie together all of those different components – resource management, execution, analysis, and so on. Think of each function as a horse: the more horses you rope to your chariot, the harder it is to keep control.

I’m not talking about data integration or marketing planning, which are already part of the existing architectures, or even the much desired (though rarely achieved) goal of centralized customer interaction management. Those are important but too rigid. Few companies will be technically able or politically willing to turn every customer treatment over to one great system in the sky. Rather, I have in mind something lighter-handed: not a rigid harness, but a carrot in front of those horses that gets them voluntarily pulling in the same direction. (Okay, enough with the horse metaphor.)

The initial version of this system will probably be a reporting process that gathers interactions from customer contact systems and relates them to future results. I’m stating that as simply as possible because I don’t think a really sophisticated approach – say, customer lifecycle simulation models – will be accepted. Managers at all levels need to see basic correlations between treatments and behaviors. They can then build their own mental models about how these are connected. I fully expect more sophisticated models to evolve over time, including what-if simulations to predict the results of different approaches and optimization to find the best choices. But today most managers would find such models too theoretical to act on the results. I’m avoiding mention of lifetime value measures for the same reason.

So what, concretely, is involved here and how does it differ from what’s already available? What’s involved is building a unified view of all contacts with individual customers and making these easy to analyze. Most marketing analysis today is still at the program level and rarely attempts to measure anything beyond immediate results. The new system would assemble information on revenues, product margins and service costs as well as promotions. This would give a complete picture of customer profitability at present and over time. Changes over time are really the key, since they alert managers as quickly as possible to problems and opportunities.

The system will store this information at the lowest level possible (preferably, down to individual interactions) and with the greatest detail (specifics about customer demographics, promotion contents, service issues, etc.), so all different kinds of analysis can be conducted on the same base. Although the initial deployments will contain only fragments of the complete data, these fragments will themselves be enough to be useful. The key to success will be making sure that the tools in the initial system are so attractive (that is, so powerful and so easy to use) that managers in all groups want to use them against their own data, even though this means exposing that data to others in the company. (If you’re lucky enough to work in a company where all data is shared voluntarily and enthusiastically – well, congratulations.)
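
Concretely, the lowest-level building block might be no more than an interaction record like this hypothetical sketch of the kind of schema I have in mind (not any product’s data model):

```python
# Hypothetical sketch of the lowest-level record in such a system: one row per
# customer interaction, detailed enough to support many kinds of later analysis.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Interaction:
    customer_id: str
    timestamp: datetime
    channel: str                 # e.g. "email", "call_center", "web", "store"
    interaction_type: str        # e.g. "promotion", "purchase", "service_issue"
    promotion_id: Optional[str]  # which treatment, if any, was delivered
    revenue: float = 0.0
    margin: float = 0.0
    service_cost: float = 0.0

history = [
    Interaction("C123", datetime(2007, 11, 1, 9, 5), "email", "promotion", "holiday_offer"),
    Interaction("C123", datetime(2007, 11, 4, 18, 20), "web", "purchase", "holiday_offer",
                revenue=89.00, margin=31.00),
]
# Rolling these records up by customer and over time is what reveals how
# treatments relate to later revenue, margin and service cost.
```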

You may feel what I’ve described is really no different from existing marketing databases and data warehouses. I don’t think so: transactions in most marketing databases are limited to promotions and responses, while most data warehouses lack the longitudinal customer view. But even if the technology is already in place, the approach is certainly distinct. It means asking managers to look not just at their own operational concerns, but at how their activities affect results across the company and the entire customer life cycle. More concretely, it allows managers to spot inconsistencies in customer treatments from one department to the next, and to compare the long-term results (and, therefore, return on investment) of treatments in different areas. Comparing return on investment is really a form of optimization, but that’s another term we’re avoiding for the moment.

Finally, and most important of all, assembling and exposing all this information makes it easy to see where customer treatments support over-all business strategies, and where they conflict with them. This is the most important benefit because business strategy is what CEOs and other top-level executives care about—so a system that helps them execute strategy can win their support. That support is what’s ultimately needed for marketing automation to make its next great leap, from one department to the enterprise as a whole.

Wednesday, October 31, 2007

Independent Teradata Makes New Friends

I had a product briefing from Teradata earlier this week after not speaking with them for nearly two years. They are getting ready to release version 6 of their marketing automation software, Teradata Relationship Manager (formerly Teradata CRM). The new version has a revamped user interface and a large number of minor refinements such as allowing multiple levels of control groups. But the real change is technical: the system has been entirely rebuilt on a J2EE platform. This was apparently a huge effort – when I checked my notes from two years ago, Teradata was talking about releasing the same version 6 with pretty much the same changes. My contact at Teradata told me the delay was due to difficulties with the migration. She promises the current schedule for releasing version 6 by December will definitely be met.

I’ll get back to v6 in a minute, but I did want to mention the other big news out of Teradata recently: alliances with Assetlink and Infor for marketing automation enhancements, and with SAS Institute for analytic integration. Each deal has its own justification, but it’s hard not to see them as showing a new interest in cooperation at Teradata, whose proprietary technology has long kept it isolated from the rest of the industry. The new attitude might be related to Teradata’s spin-off from NCR, completed October 1, which presumably frees (or forces) management to consider options it rejected while inside the NCR family. It might also reflect increasing competition from database appliances like Netezza, DATAllegro, and Greenplum. (The Greenplum Web site offers links to useful Gartner and Ventana Research papers if you want to look at the database appliance market in more detail.)

But I digress. Let’s talk first about the alliances and then v6.

The Assetlink deal is probably the most significant yet least surprising new arrangement. Assetlink is one of the most complete marketing resource management suites, so it gives Teradata a quick way to provide a set of features that are now standard in enterprise marketing systems. (Teradata had an earlier alliance in this area with Aprimo, but that never took hold. Teradata mentioned technical incompatibility with Aprimo’s .NET foundation as well as competitive overlap with Aprimo’s own marketing automation software.) In the all-important area of integration, Assetlink and Teradata will both run on the same data structures and coordinate their internal processes, so they should work reasonably seamlessly. Assetlink still has its own user interface and workflow engine, though, so some separation will still be apparent. Teradata stressed that it will be investing to create a version of Assetlink that runs on the Teradata database and will sell that under the Teradata brand.

The Infor arrangement is a little more surprising because Infor also has its own marketing automation products (the old Epiphany system) and because Infor is more oriented to mid-size businesses than to the giant retailers, telcos, and others served by Teradata. Perhaps the separate customer bases make the competitive issue less important. In any event, the Infor alliance is limited to Infor’s real-time decision engine, currently known as CRM Epiphany Inbound Marketing, which was always Epiphany’s crown jewel. Like Assetlink, Infor gives Teradata a quick way to offer a capability (real-time interaction management, including self-adjusting predictive models) that is increasingly requested by clients and offered by competitors. Although Epiphany is also built on J2EE, the initial integration (available today) will still be limited: the software will run on a separate server using SQL Server as its data store. A later release, due in the first quarter of next year, will still have a separate server but will connect directly with the Teradata database. Even then, though, real-time interaction flows will be defined outside of Teradata Relationship Manager. Integration will be at the data level: Teradata will provide lists of customers who are eligible for different offers and will be notified of interaction results. Teradata will be selling its own branded version of the Infor product too.

The SAS alliance is described as a “strategic partnership” in the firms’ joint press release, which sounds jarring coming from two long-time competitors. Basically, it involves running SAS analytic functions inside the Teradata database. This turns out to be part of a larger SAS initiative called “in-database processing” which seeks similar arrangements with other database vendors. Teradata is simply the first partner to be announced, so maybe the relationship isn’t so special after all. On the other hand, the companies’ joint roadmap includes deeper integration of selected SAS “solutions” with Teradata, including mapping of industry-specific SAS logical data models to corresponding Teradata structures. The companies will also create a joint technical “center of excellence” where specialists from both firms will help clients improve performance of SAS and Teradata products. We’ll see whether other database vendors work this closely with SAS. In the specific area of marketing automation, the two vendors will continue to compete head-to-head, at least for the time being.

This brings us back to Teradata Relationship Manager itself. As I already mentioned, v6 makes major changes at the deep technical level and in the user interface, but the 100+ changes in functionality are relatively minor. In other words, the functional structure of the product is the same.

This structure has always been different from other marketing automation systems. What sets Teradata apart is a very systematic approach to the process of customer communications: it’s not simply about matching offers to customers, but about managing all the components that contribute to those offers. For example, communication plans are built up from messages, which contain collateral, channels and response definitions, and the collateral itself may contain personalized components. Campaigns are created by attaching communication plans to segment plans, which are constructed from individual segments. All these elements in turn are subject to cross-campaign constraints on channel capacity, contacts per customer, customer channel preferences, and message priorities. In other words, everything is related to everything else in a very logical, precise fashion – just like a database design. Did I mention that Teradata is a database company?
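
To show how systematic this gets, here is a rough Python sketch of that hierarchy as I understand it; the class and field names are my own shorthand, not Teradata’s actual data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Collateral:
    name: str
    personalized_components: List[str] = field(default_factory=list)

@dataclass
class Message:
    collateral: Collateral
    channel: str                  # e.g. "email", "direct_mail", "call_center"
    response_definition: str      # how a response to this message is recognized

@dataclass
class CommunicationPlan:
    messages: List[Message]       # communication plans are built up from messages

@dataclass
class SegmentPlan:
    segments: List[str]           # constructed from individual segments

@dataclass
class Campaign:
    communication_plan: CommunicationPlan
    segment_plan: SegmentPlan     # campaign = communication plan + segment plan

@dataclass
class CrossCampaignConstraints:
    channel_capacity: Dict[str, int]           # channel -> contacts allowed per period
    max_contacts_per_customer: int
    channel_preferences: Dict[str, List[str]]  # segment -> preferred channels
    message_priorities: Dict[str, int]         # message name -> priority
```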

This approach takes some practice before you understand how the parts are connected – again, like a sophisticated database. It can also make simple tasks seem unnecessarily complicated. But it rewards patient users with a system that handles complex tasks accurately and supports high volumes without collapsing. For example, managing customers across channels is very straightforward because all channels are structurally equivalent.

The functional capabilities of Relationship Manager are not so different from those of Teradata’s main competitors (SAS Marketing Automation and Unica). But those products have evolved incrementally, often through acquisition, and parts are still sold as separate components. It’s probably fair to say that they are not as tightly or logically integrated as Teradata’s.

This very tight integration also has drawbacks, since any changes to the data structure need careful consideration. Teradata definitely has a tendency to fit new functions into existing structures, such as setting up different types of campaigns (outbound, multi-step, inbound) through a single interface. Sometimes that’s good; sometimes it’s just easier to do different things in different ways.

Teradata has also been something of a laggard at integrating statistical modeling into its system. Even what it calls “optimization” is rule-based rather than the constrained statistical optimization offered by other vendors. I’m actually rather fond of Teradata’s optimization approaches: its ability to allocate leads across channels based on sophisticated capacity rules (e.g., minimum and maximum volumes from different campaigns; automatically sending overflow from one channel to another; automatically reallocating leads based on current work load) has always impressed me and I believe remains unrivaled. But allowing marketers to build and deploy true predictive models is increasingly important and, unless I’ve missed something, is still not offered by Teradata.
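
To give a flavor of the kind of rule-based allocation I mean, here is a simplified Python sketch in which ranked leads fill a channel to capacity and then spill to an overflow channel; the logic and names are my own illustration, not Teradata’s implementation.

```python
from typing import Dict, List

def allocate_leads(leads: List[dict],
                   capacity: Dict[str, int],
                   overflow: Dict[str, str]) -> Dict[str, List[dict]]:
    """Assign each lead to its preferred channel, spilling to the overflow
    channel once capacity runs out. Assumes the overflow chain is acyclic."""
    remaining = dict(capacity)
    assignments: Dict[str, List[dict]] = {ch: [] for ch in capacity}
    for lead in sorted(leads, key=lambda l: l["rank"]):   # highest-ranked (lowest number) first
        channel = lead["preferred_channel"]
        while remaining.get(channel, 0) <= 0 and channel in overflow:
            channel = overflow[channel]                   # follow the overflow chain
        if remaining.get(channel, 0) > 0:
            assignments[channel].append(lead)
            remaining[channel] -= 1
        # leads that find no capacity anywhere are simply dropped in this sketch
    return assignments

# Example: the call center takes the top 500 leads, the remainder go to email.
leads = [{"customer": f"c{i}", "rank": i, "preferred_channel": "call_center"}
         for i in range(1, 701)]
result = allocate_leads(leads,
                        capacity={"call_center": 500, "email": 10_000},
                        overflow={"call_center": "email"})
print(len(result["call_center"]), len(result["email"]))  # 500 200
```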

This is why the new alliances are so intriguing. Assetlink adds a huge swath of capabilities that Teradata otherwise would have created very slowly and painstakingly by expanding its core data model. Infor and SAS both address the analytical weaknesses of the existing system, while Infor in particular adds another highly desired feature without waiting to build new structures in-house. All these changes suggest a welcome sense of urgency in responding to customer needs. If this new attitude holds, it seems unlikely that Teradata will accept another two-year delay in the release of Relationship Manager version 7.

Thursday, October 25, 2007

Business Rules Forum Targets Enterprise Decisioning as the Next Big Thing

I’m headed back from the combined Business Rules Forum / Rules Technology Summit / Rules Expo conference in Orlando. The theme of the conference was ‘Enterprise Decisioning Comes of Age’. The general idea is that business rules have been applied extensively in a few areas, including fraud detection and insurance rating, and are now poised to play a larger role in coordinating decisions throughout the enterprise. This idea has been championed in particular by James Taylor, formerly of Fair Isaac and now running his own firm, Smart (Enough) Systems, who seems to have near rock-star status within this community.

This is all music to my own ears, since the Customer Experience Matrix is precisely designed as a tool to help enterprises understand how all their pieces fit together across channels and the customer life cycle. It therefore provides an essential framework for organizing, prioritizing and integrating enterprise decision projects. Whether the generic notion of ‘enterprise decision management’ really takes off as the Next Big Thing remains to be seen – it still sounds pretty geeky. One question is what it takes for CEOs to get really excited about the concept: because it is by definition enterprise wide, it takes CEO sponsorship to make it happen. Neither James nor other panelists at the discussions I heard really had an answer for that one.

Thursday, October 18, 2007

Neolane Offers a New Marketing Automation Option

Neolane, a Paris-based marketing automation software vendor, formally announced its entry to the U.S. market last week. I’ve been tracking Neolane for some time but chose not to write about it until they established a U.S. presence. So now the story can be told.

Neolane is important because it’s a full-scale competitor to Unica and the marketing automation suites of SAS and Teradata, which have pretty much had the high-end market to themselves in recent years. (You might add SmartFocus and Alterian to the list, but they sell mostly to service providers rather than end-users.) The company originally specialized in email marketing but has since broadened to incorporate other channels. Its email heritage still shows in strong content management and personalization capabilities. These are supplemented by powerful collaborative workflow, project management and marketing planning. Like many European products, Neolane was designed from the ground up to support trans-national deployments with specialized features such as multiple languages and currencies. The company, founded in 2001, now has over 100 installed clients. These include many very large firms such as Carrefour, DHL International and Virgin Megastores.

In evaluating enterprise marketing systems, I look at five sets of capabilities: planning/budgeting, project management, content management, execution, and analysis. (Neolane itself offers a different set of five capabilities, although they are pretty similar.) Let’s go through these in turn.

Neolane does quite well in the first three areas, which all draw on its robust workflow management engine. This engine is part of Neolane’s core technology, which allows tight connections with the rest of the system. By contrast, many Neolane competitors have added these three functions at least in part through acquisition. This often results in less-than-perfect integration among the suite components.

Execution is a huge area, so we’ll break it into pieces. Neolane’s roots in email result in a strong email capability, of course. The company also claims particular strength in mobile (phone) marketing, although it’s not clear this involves more than supporting SMS and MMS output formats. Segmentation and selection features are adequate but not overly impressive: when it comes to really complex queries, Neolane users may find themselves relying on hand-coded SQL statements or graphical flow charts that can quickly become unmanageable. Although most of today’s major campaign management systems have deployed special grid-based interfaces to handle selections with hundreds or thousands of cells, I didn’t see that in Neolane.

On the other hand, Neolane might argue that its content personalization features reduce the need for building so many segments in the first place. Personalization works the same across all media: users embed selection rules within templates for email, Web pages and other messages. This is a fairly standard approach, but Neolane offers a particularly broad set of formats. It also provides different ways to build the rules, ranging from a simple scripting language to a point-and-click query builder. Neolane’s flow charts allow an additional level of personalized treatment, supporting sophisticated multi-step programs complete with branching logic. That part of the system seems quite impressive.
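
For readers who haven’t seen this approach in action, here is a toy Python sketch of rule-driven personalization, where selection rules attached to a template choose the content block for each customer; it is my own illustration of the general technique, not Neolane’s scripting language or query builder.

```python
from typing import Callable, List, Tuple

# Each rule pairs a predicate on the customer record with a content block.
Rule = Tuple[Callable[[dict], bool], str]

def personalize(template: str, rules: List[Rule], customer: dict,
                default: str = "our latest catalog") -> str:
    """Fill the {offer} slot with the first matching content block.
    The same rules could drive email, Web pages or other message formats."""
    for predicate, content in rules:
        if predicate(customer):
            return template.format(name=customer["name"], offer=content)
    return template.format(name=customer["name"], offer=default)

rules: List[Rule] = [
    (lambda c: c["segment"] == "premium", "an invitation to our private sale"),
    (lambda c: c["last_purchase_days"] > 180, "a 20% win-back discount"),
]

email_template = "Dear {name}, here is {offer}."
print(personalize(email_template, rules,
                  {"name": "Ana", "segment": "premium", "last_purchase_days": 30}))
```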

Apart from personalization, Neolane doesn’t seem to offer execution features for channels such as Web sites, call centers and sales automation. Nor, so far as I can tell, does it offer real-time interaction management—that is, gathering information about customer behavior during an interaction and determining an appropriate response. This is still a fairly specialized area and one where the major marketing automation vendors are just now delivering real products, after talking about it for years. This still puts them ahead of Neolane.

Execution also includes creation and management of the marketing database itself. Like most of its competitors, Neolane generally connects to a customer database built by an external system. (The exceptions would be enterprise suites like Oracle/Siebel and SAP, which manage the databases themselves. Yet even they tend to extract operational data into separate structures for marketing purposes.) Neolane does provide administrative tools for users to define database columns and tables, so it’s fairly easy to add new data if there’s no place else to store it. This would usually apply to administrative components such as budgets and planning data or to marketing-generated information such as campaign codes.

Analytics is the final function. Neolane does provide standard reporting. But it relies on connections to third-party software including SPSS and KXEN for more advanced analysis and predictive modeling. This is a fairly common approach and nothing to particularly complain about, although you do need to look closely to ensure that the integration is suitably seamless.

Over all, Neolane does provide a broad set of marketing functions, although it may not be quite as strong as its major competitors in some areas of execution and analytics. Still, it’s a viable new choice in a market that has offered few alternatives in recent years. So for companies considering a new system, it’s definitely worth a look.

Monday, October 08, 2007

Proprietary Databases Rise Again

I’ve been noticing for some time that “proprietary” databases are making a comeback in the world of marketing systems. “Proprietary” is a loaded term that generally refers to anything other than the major relational databases: Oracle, SQL Server and DB2, plus some of the open source products like MySQL. In the marketing database world, proprietary systems have a long history tracing back to the mid-1980’s MCIF products from Customer Insight, OKRA Marketing, Harte-Hanks and others. These originally used specialized structures to get adequate performance from the limited PC hardware of the day. Their spiritual descendants today are Alterian and BlueVenn (formerly SmartFocus), both with roots in the mid-1990’s Brann Viper system and both having reinvented themselves in the past few years as low cost / high performance options for service bureaus to offer their clients.

Nearly all the proprietary marketing databases used some version of an inverted (now more commonly called “columnar”) database structure. In such a structure, data for each field (e.g., Customer Name) is physically stored in adjacent blocks on the hard drive, so it can be accessed with a single read. This makes sense for marketing systems, and analytical queries in general, which typically scan all contents of a few fields. By contrast, most transaction processes use a key to find a particular record (row) and read all its elements. Standard relational databases are optimized for such transaction processing and thus store entire rows together on the hard drive, making it easy to retrieve their contents.
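
A toy example may make the difference clearer. The Python sketch below lays the same small table out both ways, simply to show why a scan of one field touches less data in a columnar layout; it is not meant to reflect how any particular product implements the idea.

```python
# Row store: each record's fields are kept together, so an analytical scan of
# one field still touches every row in full.
rows = [
    {"customer": "Smith", "balance": 1200.0, "branch": "Leeds"},
    {"customer": "Jones", "balance":  300.0, "branch": "York"},
    {"customer": "Patel", "balance": 5400.0, "branch": "Leeds"},
]
total_row_store = sum(r["balance"] for r in rows)

# Column store: each field is kept in its own contiguous array, so the same
# query reads only the one column it needs.
columns = {
    "customer": ["Smith", "Jones", "Patel"],
    "balance":  [1200.0, 300.0, 5400.0],
    "branch":   ["Leeds", "York", "Leeds"],
}
total_column_store = sum(columns["balance"])

assert total_row_store == total_column_store == 6900.0
```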

Columnar databases themselves date back at least to mid-1970’s products including Computer Corporation of America Model 204, Software AG ADABAS, and Applied Data Research (now CA) Datacom/DB. All of these are still available, incidentally. In an era when hardware was vastly more expensive, the great efficiency of these systems at analytical queries made them highly attractive. But as hardware costs fell and relational databases became increasingly dominant, they fell by the wayside except for special situations. Their sweet spot of high-volume analytical applications was further invaded by massively parallel systems (Teradata and more recently Netezza) and multi-dimensional data cubes (Cognos Powerplay, Oracle/Hyperion EssBase, etc.). These had different strengths and weaknesses but still competed for some of the same business.

What’s interesting today is that a new generation of proprietary systems is appearing. Vertica has recently gained a great deal of attention due to the involvement of database pioneer Michael Stonebraker, architect of INGRES and POSTGRES. (Click here for an excellent technical analysis by the Winter Corporation; registration required.) QD Technology, launched last year (see my review), isn’t precisely a columnar structure, but uses indexes and compression in a similar fashion. I can’t prove it, but suspect the new interest in alternative approaches is because analytical databases are now getting so large—tens and hundreds of gigabytes—that the efficiency advantages of non-relational systems (which translate into cost savings) are now too great to ignore.

We’ll see where all this leads. One of the few columnar systems introduced in the 1990’s was Expressway (technically, a bit map index—not unlike Model 204), which was purchased by Sybase and is now moderately successful as Sybase IQ. I think Oracle also added some bit-map capabilities during this period, and suspect the other relational database vendors have their versions as well. If columnar approaches continue to gain strength, we can certainly expect the major database vendors to add them as options, even though they are literally orthogonal to standard relational database design. In the meantime, it’s fun to see some new options become available and to hope that costs will come down as new competitors enter the domain of very large analytical databases.

Friday, October 05, 2007

Analytica Provides Low-Cost, High-Quality Decision Models

My friends at DM News, which has published my Software Review column for the past fifteen years, unceremoniously informed me this week that they had decided to stop carrying all of their paid columnists, myself included. This caught me in the middle of preparing a review of Analytica from Lumina Decision Systems, a pretty interesting piece of simulation modeling software. Lest my research go to waste, I’ll write about Analytica here.

Analytica falls into the general class of software used to build mathematical models of systems or processes, and then to predict the results of a particular set of inputs. Business people typically use such software to understand the expected results of projects such as a new product launch or a marketing campaign, or to forecast the performance of their business as a whole. Such models can also simulate the lifecycle of a customer or calculate the results of key performance indicators linked on a strategy map, if the relationships among those indicators have been defined with sufficient rigor.

When the relationships between inputs and outputs are simple, such models can be built in a spreadsheet. But even moderately complex business problems are beyond what a spreadsheet can reasonably handle: they have too many inputs and outputs and the relationships among these are too complicated. Analytica makes it relatively easy to specify these relationships by drawing them on an “influence diagram” that looks like a typical flow chart. Objects within the chart, representing the different inputs and outputs, can then be opened up to specify the precise mathematical relationships among the elements.

Analytica can also build models that run over a number of time periods, using results from previous periods as inputs to later periods. You can do something like this in a spreadsheet, but it takes a great many hard-coded formulas which are easy to get wrong and hard to change. Analytica also offers a wealth of tools for dealing with uncertainties, such as many different types of probability distributions. These are virtually impossible to handle in a normal spreadsheet model.
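
To give a flavor of what handling uncertainty means in practice, here is a generic Python sketch of a multi-period model whose inputs are drawn from probability distributions rather than entered as point estimates. Analytica expresses this sort of thing through influence diagrams and its own formula language; the distributions and figures below are invented purely for illustration.

```python
import random
import statistics
from typing import List

def simulate_revenue(periods: int = 12, trials: int = 10_000) -> List[float]:
    """Roll a simple multi-period model forward many times, drawing its uncertain
    inputs from probability distributions instead of single point estimates."""
    results = []
    for _ in range(trials):
        customers = 1_000.0
        total_revenue = 0.0
        for _ in range(periods):
            churn = random.betavariate(2, 38)          # monthly churn rate, ~5% on average
            spend = random.gauss(mu=40.0, sigma=8.0)   # revenue per customer this month
            customers *= (1.0 - churn)                 # last period's base feeds this period
            total_revenue += customers * spend
        results.append(total_revenue)
    return results

outcomes = simulate_revenue()
print("mean:", round(statistics.mean(outcomes)))
print("5th percentile:", round(statistics.quantiles(outcomes, n=20)[0]))
```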

Apart from spreadsheets, Analytica sits between several niches in the software world. Its influence diagrams resemble the pictures drawn by graphics software, but unlike simple drawing programs, Analytica has actual calculations beneath its flow charts. On the other hand, Analytica is less powerful than the process modeling software used to simulate manufacturing systems, call center operations, or other business processes. That software has many very sophisticated features tailored to modeling such flows in detail: for example, ways to simulate random variations in the arrival rates of telephone calls or to incorporate the need for one process step to wait until several others have been completed. It may be possible to do much of this in Analytica, but it would probably be stretching the software beyond its natural limits.

What Analytica does well is model specific decisions or business results over time. The diagram-building approach to creating models is quite powerful and intuitive, particularly because users can build modules within their models, so a single object on a high-level diagram actually refers to a separate, detailed diagram of its own. Object attributes include inputs, outputs and formulas describing how the outputs are calculated. Objects can also contain arrays to handle different conditions: for example, a customer object might use arrays to define treatments for different customer segments. This is a very powerful feature, since it lets an apparently simple model capture a great deal of actual detail.

Setting up a model in Analytica isn’t exactly simple, although it may be about as easy as possible given the inherent complexity of the task. Basically, users place the objects on a palette, connect them with arrows, and then open them up to define the details. There are many options within these details, so it does take some effort to learn how to get what you want. The vendor provides a tutorial and detailed manual to help with the learning process, and offers a variety of training and consulting options. Although it is accessible to just about anyone, the system is definitely oriented towards sophisticated users, providing advanced statistical features and methods that no one else would understand.

The other intriguing feature of Analytica is its price. The basic product costs a delightfully reasonable $1,295. Other versions range up to $4,000, adding the ability to access ODBC data sources, handle very large arrays, and run automated optimization procedures. A server-based version costs $8,000, but only very large companies would need that one.

This pricing is quite impressive. Modeling systems can easily cost tens or hundreds of thousands of dollars, and it’s not clear they provide much more capability than Analytica. On the other hand, Analytica’s output presentation is rather limited—some basic tables and graphs, plus several statistical measures of uncertainty. There’s that statistical orientation again: as a non-statistician, I would have preferred better visualization of results.

In my own work, Analytica could definitely provide a tool for building models to simulate customers’ behaviors as they flow through an Experience Matrix. This is already more than a spreadsheet can handle, and although it could be done in QlikTech it would be a challenge. Similarly, Analytica could be used in business planning and simulation. It wouldn’t be as powerful as a true agent-based model, but could provide an alternative that costs less and is much easier to learn how to build. If you’re in the market for this sort of modeling—particularly if you want to model uncertainties and not just fixed inputs—Analytica is definitely worth a look.

Tuesday, October 02, 2007

Marketing Performance Measurement: No Answers to the Really Tough Questions

I recently ran a pair of two-day workshops on marketing performance measurement. My students had a variety of goals, but the two major ones they mentioned were the toughest issues in marketing: how to allocate resources across different channels and how to measure the impact of marketing on brand value.

Both questions have standard answers. Channel allocation is handled by marketing mix models, which analyze historical data to determine the relative impact of different types of spending. Brand value is measured by assessing the important customer attitudes in a given market and how a particular brand matches those attitudes.
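
For readers who haven’t worked with marketing mix models, the Python sketch below fits a bare-bones version: a regression of sales on spend by channel. Real models add adstock, saturation and seasonality effects, and the figures here are invented, so treat it only as a picture of the basic mechanics (it assumes the numpy library).

```python
import numpy as np

# Hypothetical history: weekly sales alongside spend in three channels (thousands).
spend = np.array([   # columns: TV, search, direct mail
    [120, 30, 15],
    [100, 45, 10],
    [ 80, 60, 20],
    [150, 20, 25],
    [110, 50, 12],
    [ 90, 55, 18],
])
sales = np.array([540, 525, 510, 580, 545, 520])

# Add an intercept column for baseline (non-marketing-driven) sales.
X = np.column_stack([np.ones(len(sales)), spend])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
baseline, tv, search, mail = coef
print(f"baseline={baseline:.0f}, per-unit impact: TV={tv:.2f}, search={search:.2f}, mail={mail:.2f}")
```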

Yet, despite my typically eloquent and detailed explanations, my students found these answers unsatisfactory. Cost was one obstacle for most of them; lack of data was another. They really wanted something simpler.

I’d love to report I gave it to them, but I couldn't. I had researched these topics thoroughly as preparation for the workshops and hadn’t found any alternatives to the standard approaches; further research since then still hasn’t turned up anything else of substance. Channel allocation and brand value are inherently complex and there just are no simple ways to measure them.

The best I could suggest was to use proxy data when a thorough analysis is not possible due to cost or data constraints. For channel allocation, the proxy might be incremental return on investment by channel: switching funds from low ROI to high ROI channels doesn’t really measure the impact of the change in marketing mix, but it should lead to an improvement in the average level of performance. Similarly, surveys to measure changes in customer attitudes toward a brand don’t yield a financial measure of brand value, but do show whether it is improving or getting worse. Some compromise is unavoidable here: companies not willing or able to invest in a rigorous solution must accept that their answers will be imprecise.
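
The channel proxy amounts to very simple arithmetic, as the sketch below shows; the budgets and ROI figures are invented, and the calculation assumes each channel’s incremental ROI stays constant as money moves, which is exactly the simplification that keeps it from being a true mix model.

```python
# Two channels with incremental ROI of 1.2 and 3.0; shift $100k from the weaker
# channel to the stronger one and watch the blended return improve.
budget = {"direct_mail": 500_000, "search": 300_000}
roi = {"direct_mail": 1.2, "search": 3.0}

def blended_return(b):
    """Spend-weighted average return across all channels."""
    return sum(b[ch] * roi[ch] for ch in b) / sum(b.values())

before = blended_return(budget)
budget["direct_mail"] -= 100_000
budget["search"] += 100_000
after = blended_return(budget)
print(round(before, 2), "->", round(after, 2))   # 1.88 -> 2.1
```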

This round of answers was little better received than the first. Even ROI and customer attitudes are not always available, and they are particularly hard to measure in multi-channel environments where the result of a particular marketing effort cannot easily be isolated. You can try still simpler measures, such as spending or responses for channel performance or market share for brand value. But these are so far removed from the original question that it’s difficult to present them as meaningful answers.

The other approach I suggested was testing. The goal here is to manufacture data where none exists, thereby creating something to measure. This turned out to be a key concept throughout the performance measurement discussions. Testing also shows that marketers are at least doing something rigorous, thereby helping satisfy critics who feel marketing investments are totally arbitrary. Of course, this is a political rather than analytical approach, but politics are important. The final benefit of testing is it gives a platform for continuous improvement: even though you may not know the absolute value of any particular marketing effort, a test tells whether one option or another is relatively superior. Over time, this allows a measurable gain in results compared with the original levels. Eventually it may provide benchmarks to compare different marketing efforts against each other, helping with both channel allocation and brand value as well.
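
As a small example of what a test actually produces, here is a Python sketch that compares response rates in two test cells with a two-proportion z-test; the counts are invented, and a real program would add sample-size planning and multi-cell designs.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple:
    """Compare response rates of two test cells; returns (rate_a, rate_b, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided test
    return p_a, p_b, p_value

# Did the new creative (cell B) beat the control (cell A)?
print(two_proportion_z(conv_a=220, n_a=10_000, conv_b=265, n_b=10_000))
```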

Even testing isn’t always possible, as my students were quick to point out. My answer at that point was simply that you have to seek situations where you can test: for example, Web efforts are often more measurable than conventional channels. Web results may not mirror results in other channels, because Web customers may themselves be very different from the rest of the world. But this again gets back to the issue of doing the best with the resources at hand: some information is better than none, so long as you keep in mind the limits of what you’re working with.

I also suggested that testing is more possible than marketers sometimes think, if they really make testing a priority. This means selecting channels in part on the basis of whether testing is possible; designing programs so testing is built in; and investing more heavily in test activities themselves (such as incentives for survey participants). This approach may ultimately lead to a bias in favor of testable channels—something that seems excessive at first: you wouldn’t want to discard an effective channel simply because you couldn’t test it. But it makes some sense if you realize that testable channels can be improved continuously, while results in untestable channels are likely to stagnate. Given this dynamic, testable channels will sooner or later become more productive than untestable channels. This holds even if the testable channels are less efficient at the start.

I offered all these considerations to my students, and may have seen a few lightbulbs switch on. It was hard to tell: by the time we had gotten this far into the discussion, everyone was fairly tired. But I think it’s ultimately the best advice I could have given them: focus on testing and measuring what you can, and make the best use possible of the resulting knowledge. It may not directly answer your immediate questions, but you will learn how to make the most effective use of your marketing resources, and that’s the goal you are ultimately pursuing.

Sunday, September 16, 2007

Tableau Software Makes Good Visualization Easy

I took a close look recently at Tableau data visualization software. I liked Tableau a lot, even though it wasn’t quite what I expected. I had thought of it as a way to build aesthetically-correct charts, according to the precepts set down by Edward Tufte and like-minded visualization gurus such as Stephen Few. But even though Tableau follows many of these principles, it is less a tool for building charts than for interactive data exploration.

This is admittedly a pretty subtle distinction, since the exploration is achieved through charts. What I mean is that Tableau is designed to make it very easy to see the results of changing one data element at a time, for example to find whether a particular variable helps to predict an outcome. (That’s a little vague; the example Tableau uses is analyzing the price of a condominium, adding variables like square footage, number of rooms, number of baths, location, etc. to see if they explain differences in the sales price.) What makes Tableau special is that it automatically redraws the graphs as the data changes, often producing a totally different format. The formats are selected according to the aforementioned visualization theories, and for the most part are quite effective.

It may be worth diving a bit deeper into those visualization techniques, although I don’t claim to be an expert. You’ve probably heard some of the gurus’ common criticisms: ‘three dimensional’ bars that don’t mean anything; pie charts and gauges that look pretty but show little information given the space they take up; radar charts that are fundamentally incomprehensible. The underlying premise is that humans are extremely good at finding patterns in visual data, so that is what charts should be used for—not to display specific information, which belongs in tables of numbers. Building on this premise, research shows that people find patterns more easily in certain types of displays: shapes, graduations of color, and spatial relationships work well, but not reading numbers, making subtle size comparisons (e.g., slices in a pie chart), or looking up colors in a key. This approach also implies avoiding components that convey no information, such as the shadows on those ‘3-d’ bar charts, since these can only distract from pattern identification.

In general, these principles work well, although I have trouble with some of the rules that result. For example, grids within charts are largely forbidden, on the theory that charts should only show relative information (patterns) and you don’t need a grid to know whether one bar is higher than another. My problem with that rule is that it’s often difficult to compare two bars that are not immediately adjacent, and a grid can help. A grid can also provide a useful reference point, such as showing ‘freezing’ on a temperature chart. The gurus might well allow grid lines in some of those circumstances.

On the other hand, the point about color is very well taken. Americans and Europeans often use red for danger and green for good, but there is nothing intuitive about those—they depend on cultural norms. In China, red is a positive color. Worse, the gurus point out, a significant portion of the population is color-blind and can’t distinguish red from green anyway. They suggest that color intensity is a better way to show gradations, since people naturally understand a continuum from light to dark (even though it may not be clear which end of the scale is good or bad). They also suggest muted rather than bright colors, since it’s easier to see subtle patterns when there is less color contrast. In general, they recommend against using color to display meaning (say, to identify regions on a bar chart) because it takes conscious effort to interpret. Where different items must be shown on the same chart, they would argue that differences in shape are more easily understood.

As I say, Tableau is consistent with these principles, although it does let users make other choices if they insist. There is apparently some very neat technology inside Tableau that builds the charts using a specification language rather than conventional configuration parameters. But this is largely hidden from users, since the graphs are usually designed automatically. It may have some effect on how easily the system can switch from one format to another, and on the range of display options.

The technical feature that does impact Tableau users is its approach to data storage. Basically, it doesn’t have one: that is, it relies on external data stores to hold the information it requires, and issues queries against those sources as required. This was a bit of a disappointment to me, since it means Tableau’s performance really relies on the external systems. Not that that’s so terrible—you could argue (as Tableau does) that this avoids loading data into a proprietary format, making it easier to access the information you need without pre-planning. But it also means that Tableau can be painfully slow when you’re working with large data sets, particularly if they haven’t been optimized for the queries you’re making. In a system designed to encourage unplanned “speed of thought” data exploration, I consider this a significant drawback.

That said, let me repeat that I really liked Tableau. Query speed will be an issue in only some situations. Most of the time, Tableau will draw the required data into memory and work with it there, giving near-immediate response. And if you really need quick response from a very large database, technical staff can always apply the usual optimization techniques. For people with really high-end needs, Tableau already works with the Hyperion multidimensional database and is building an adapter for the Netezza high speed data appliance.

Of course, looking at Tableau led me to compare it with QlikTech. This is definitely apples to oranges: one is a reporting system and the other is a data exploration tool; one has its own database and the other doesn’t. I found that with a little tweaking I could get QlikView to produce many of the same charts as Tableau, although it was certainly more work to get there. I’d love to see the Tableau interface connected with the QlikView data engine, but suspect the peculiarities of both systems make this unlikely. (Tableau queries rely on advanced SQL features; QlikView is not a SQL database.) If I had to choose just one, I would pick the greater data access power and flexibility of QlikTech over the easy visualizations of Tableau. But Tableau is cheap enough—$999 to $1,799 for a single user license, depending on the data sources permitted—that I see no reason most people who need them couldn’t have both.

Thursday, August 30, 2007

Marketing Performance Involves More than Ad Placement

I received a thoughtful e-mail the other day suggesting that my discussion of marketing performance measurement had been limited to advertising effectiveness, thereby ignoring the other important marketing functions of pricing, distribution and product development. For once, I’m not guilty as charged. At a minimum, a balanced scorecard would include measures related to those areas when they were highlighted as strategic. I’d further suggest that many standard marketing measures, such as margin analysis, cross-sell ratios, and retail coverage, address those areas directly.

Perhaps the problem is that so many marketing projects are embedded in advertising campaigns. For example, the way you test pricing strategies is to offer different prices in the marketplace and see how customers react. The same goes for product testing and cross-sales promotions. Even efforts to improve distribution are likely to boil down to campaigns to sign up new dealers, train existing ones, distribute point-of-sale materials, and so on. The results will nearly always be measured in terms of sales results, exactly as you measure advertising effectiveness.

In fact, since nearly everything is measured by advertising it and recording the results, the real problem may be how to distinguish “advertising” from the other components of the marketing mix. In classic marketing mix statistical models, the advertising component is represented by ad spend, or some proxy such as gross rating points or market coverage. At a more tactical level, the question is the most cost-effective way to reach the target audience, independent of the message content (which includes price, product and perhaps distribution elements, in addition to classic positioning). So it does make sense to measure advertising effectiveness (or, more precisely, advertising placement effectiveness) as a distinct topic.

Of course, marketing does participate in activities that are not embodied directly in advertising or cannot be tested directly in the market. Early-stage product development is driven by market research, for example. Marketing performance measurement systems do need to indicate performance in these sorts of tasks. The challenge here isn’t finding measures—things like percentage of sales from new products and number of research studies completed (lagging and leading indicators, respectively) are easily available. Rather, the difficulty is isolating the contribution of “marketing” from the contribution of other departments that also participate in these projects. I’m not sure this has a solution or even needs one: maybe you just recognize that these are interdisciplinary teams and evaluate them as such. Ultimately we all work for the same company, eh? Now let’s sing Kumbaya.

In any event, I don’t see a problem using standard MPM techniques to measure more than advertising effectiveness. But it’s still worth considering the non-advertising elements explicitly to ensure they are not overlooked.