As I’ve mentioned in a couple of previous posts, QlikView doesn’t have the built-in matching functions needed for customer data integration (CDI). This has left me looking for other ways to provide that service, preferably at a low cost. The problem is that the major CDI products like Harte-Hanks Trillium, DataMentors DataFuse and SAS DataFlux are fairly expensive.
One intriguing alternative is Infosolve Technologies. Looking at the Infosolve Web site, it’s clear they offer something relevant, since two flagship products are ‘OpenDQ’ and ‘OpenCDI’ and their tag line is ‘The Power of Zero Based Data Solutions’. But I couldn't figure out exactly what they were selling since they stress that there are ‘never any licenses, hardware requirements or term contracts’. So I broke down and asked them.
It turns out that Infosolve is a consulting firm that uses free open source technology, specifically the Pentaho platform for data integration and business intelligence. A Certified Development partner of Pentaho, Infosolve has developed its own data quality and CDI components on the platform and simply sells the consulting needed to deploy it. Interesting.
Infosolve Vice President Subbu Manchiraju and Director of Alliances Richard Romanik spent some time going over the details and gave me a brief demonstration of the platform. Basically, Pentaho lets users build graphical workflows that link components for data extracts, transformation, profiling, matching, enhancement, and reporting. It looked every bit as good as similar commercial products.
Two particular points were worth noting:
- the matching approach itself seems acceptable. Users build rules that specify which fields to compare, the methods used to measure similarity, and the similarity score required for each field (see the sketch after this list). This is less sophisticated than the best commercial products, but field-level comparison is probably adequate for most situations. Although setting up and tuning such rules can be time-consuming, Infosolve told me they can build a typical set of match routines in about half a day. More experienced or adventurous users could even do it themselves; the user interface makes the mechanics very simple. A half-day of consulting might cost $1,000, which is not bad at all when you consider that the software itself is free. The price for a full implementation would be higher, since it would involve additional consulting to set up data extracts, standardization, enhancement and other processes, but the cost should still be very reasonable. You’d probably need as much consulting with other CDI systems, where you’d pay for the software too.
- data verification and enhancement are handled by calls to StrikeIron, which provides a slew of on-demand data services. StrikeIron is worth knowing about in its own right: it lets users access Web services including global address verification and corrections; consumer and business data lookups using D&B and Gale Group data; telephone verification, appends and reverse appends; geocoding and distance calculations; MapQuest mapping and directions; name/address parsing; sales tax lookups; local weather forecasts; securities prices; real-time fraud detection; and message delivery for text (SMS) and voice (IVR). Everything is priced on a per-use basis. This opens up all sorts of interesting possibilities.
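To make the rule-based matching concrete, here is a minimal sketch of how field-level comparison rules, each with its own similarity measure, minimum score and weight, might be expressed. This is my own illustration in generic Python, not Infosolve’s or Pentaho’s actual implementation; the field names and thresholds are invented for the example.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude string similarity on normalized values, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

# Hypothetical match rules: which fields to compare, the minimum similarity
# each field must reach on its own, and how much each field counts overall.
MATCH_RULES = [
    {"field": "last_name", "min_score": 0.75, "weight": 0.4},
    {"field": "address",   "min_score": 0.80, "weight": 0.4},
    {"field": "zip",       "min_score": 1.00, "weight": 0.2},
]

def is_match(rec_a: dict, rec_b: dict, threshold: float = 0.8) -> bool:
    """Apply every rule; reject if any field misses its own minimum,
    otherwise accept when the weighted total clears the overall threshold."""
    total = 0.0
    for rule in MATCH_RULES:
        score = similarity(rec_a[rule["field"]], rec_b[rule["field"]])
        if score < rule["min_score"]:
            return False
        total += score * rule["weight"]
    return total >= threshold

a = {"last_name": "Smith", "address": "12 Main St", "zip": "07030"}
b = {"last_name": "Smyth", "address": "12 Main Street", "zip": "07030"}
print(is_match(a, b))  # True with these deliberately lenient example thresholds
```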
The Infosolve software can be installed on any platform that can run Java, which is just about everything. Users can also run it within the Sun Grid utility network, which has a pay-as-you-go business model of $1 per CPU hour.
I’m a bit concerned about speed with Infosolve: the company said it takes 8 to 12 hours to run a million-record match on a typical PC. But that assumes you compare every record against every other record, which usually isn’t necessary. Of course, where smaller volumes are concerned, this is not an issue.
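The usual way around brute-force comparison is “blocking”: derive a cheap candidate key, such as a postal code plus a surname fragment, and only compare records that share it. A rough sketch of the idea, again my own illustration with hypothetical field names:

```python
from collections import defaultdict
from itertools import combinations

def blocking_key(rec: dict) -> str:
    # Cheap key: ZIP plus the first letter of the last name. Records that
    # disagree on this key are never compared at all.
    return f'{rec["zip"]}|{rec["last_name"][:1].upper()}'

def candidate_pairs(records: list) -> list:
    """Group records by blocking key and only pair within each group,
    turning one huge all-pairs comparison into many small ones."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec)
    pairs = []
    for group in blocks.values():
        pairs.extend(combinations(group, 2))
    return pairs

records = [
    {"last_name": "Smith", "zip": "07030"},
    {"last_name": "Smyth", "zip": "07030"},
    {"last_name": "Jones", "zip": "10001"},
]
print(len(candidate_pairs(records)))  # 1 candidate pair instead of 3 for all-pairs
```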
Bottom line: Infosolve and Pentaho may not meet the most extreme CDI requirements, but they could be a very attractive option when low cost and quick deployment are essential. I’ll certainly keep them in mind for my own clients.
Tuesday, November 27, 2007
Just How Scalable Is QlikTech?
A few days ago, I replied to a question regarding QlikTech scalability. (See What Makes QlikTech So Good?, August 3, 2007.) I asked QlikTech itself for more information on the topic but haven’t learned anything new. So let me simply discuss this based on my own experience (and, once again, remind readers that while my firm is a QlikTech reseller, comments in this blog are strictly my own).
The first thing I want to make clear is that QlikView is a wonderful product, so it would be a great pity if this discussion were to be taken as a criticism. Like any product, QlikView works within limits that must be understood to use it appropriately. No one benefits from unrealistic expectations, even if fans like me sometimes create them.
That said, let’s talk about what QlikTech is good at. I find two fundamental benefits from the product. The first is flexibility: it lets you analyze data in pretty much any way you want, without first building a data structure to accommodate your queries. By contrast, most business intelligence tools must pre-aggregate large data sets to deliver fast response. Often, users can’t even formulate a particular query if the dimensions or calculated measures were not specified in advance. Much of the development time and cost of conventional solutions, whether based in standard relational databases or specialized analytical structures, is spent on this sort of work. Avoiding it is the main reason QlikTech is able to deliver applications so quickly.
The other big benefit of QlikTech is scalability. I can work with millions of records on my desktop with the 32-bit version of the system (maximum memory 4 GB if your hardware allows it) and still get subsecond response. This is much more power than I’ve ever had before. A 64-bit server can work with tens or hundreds of millions of rows: the current limit for a single data set is apparently 2 billion rows, although I don’t know how close anyone has come to that in the field. I have personally worked with tables larger than 60 million rows, and QlikTech literature mentions an installation of 300 million rows. I strongly suspect that larger ones exist.
So far so good. But here’s the rub: there is a trade-off in QlikView between really big files and really great flexibility. The specific reason is that the more interesting types of flexibility often involve on-the-fly calculations, and those calculations require resources that slow down response. This is more a law of nature (there’s no free lunch) than a weakness in the product, but it does exist.
Let me give an example. One of the most powerful features of QlikView is a “calculated dimension”. This lets reports construct aggregates by grouping records according to ad hoc formulas. You might want to define ranges for a value such as age, income or unit price, or create categories using if/then/else statements. These formulas can get very complex, which is generally a good thing. But each formula must be calculated for each record every time it is used in a report. On a few thousand rows, this can happen in an instant, but on tens of millions of rows, it can take several minutes (or much longer if the formula is very demanding, such as on-the-fly ranking). At some point, the wait becomes unacceptable, particularly for users who have become accustomed to QlikView’s typically immediate response.
As problems go, this isn’t a bad one because it often has a simple solution: instead of on-the-fly calculations, precalculate the required values in QlikView scripts and store the results on each record. There’s little or no performance cost to this strategy since expanding the record size doesn’t seem to slow things down. The calculations do add time to the data load, but that happens only once, typically in an unattended batch process. (Another option is to increase the number and/or speed of processors on the server. QlikTech makes excellent use of multiple processors.)
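To illustrate the trade-off in plain Python rather than QlikView script (so treat this as a sketch of the principle, not of QlikView’s internals): banding a value inside every report repeats the formula for every row on every refresh, while banding it once at load time and storing the result turns later reports into a simple group-by.

```python
import random
from collections import Counter

random.seed(0)
rows = [{"customer_id": i, "income": random.randint(10_000, 200_000)}
        for i in range(100_000)]

def income_band(income: int) -> str:
    # The kind of if/then/else formula you might put in a calculated dimension.
    if income < 50_000:
        return "low"
    elif income < 120_000:
        return "mid"
    return "high"

# On-the-fly: the banding formula runs for every row, every time a report asks.
def report_on_the_fly():
    return Counter(income_band(r["income"]) for r in rows)

# Precalculated: do the work once at load time and store the result on the row;
# later reports just count the stored field.
for r in rows:
    r["income_band"] = income_band(r["income"])

def report_precalculated():
    return Counter(r["income_band"] for r in rows)

print(report_on_the_fly() == report_precalculated())  # same answer, different cost
```

The real-world equivalent is exactly what the paragraph above describes: moving the if/then/else formula out of the chart’s calculated dimension and into the load script, so the value sits on each record.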
The really good news is you can still get the best of both worlds: work out design details with ad hoc reports on small data sets; then, once the design is stabilized, add precalculations to handle large data volumes. This is vastly quicker than prebuilding everything before you can see even a sample. It’s also something that’s done by business analysts with a bit of QlikView training, not database administrators or architects.
Other aspects of formulas and database design also become more important in QlikView as data volumes grow larger. The general solution is the same: make the application more efficient through tighter database and report design. So even though it’s true that you can often just load data into QlikView and work with it immediately, it’s equally true that very large or sophisticated applications may take some tuning to work effectively. In other words, QlikView is not pure magic (any result you want for absolutely no work), but it does deliver much more value for a given amount of work than conventional business intelligence systems. That’s more than enough to justify the system.
Interestingly, I haven’t found that the complexity or over-all size of a particular data set impacts QlikView performance. That is, removing tables which are not used in a particular query doesn’t seem to speed up that query, nor does removing fields from tables within the query. This probably has to do with QlikTech’s “associative” database design, which treats each field independently and connects related fields directly to each other. But whatever the reason, most of the performance slow-downs I’ve encountered seem related to processing requirements.
And, yes, there are some upper limits to the absolute size of a QlikView implementation. Two billion rows is one, although my impression (I could be wrong) is that this could be expanded if necessary. The need to load data into memory is another limit: even though the 64-bit address space is effectively infinite, there are physical limits to the amount of memory that can be attached to Windows servers. (A quick scan of the Dell site finds a maximum of 128 GB.) That could accommodate somewhat more input data than 128 GB, since QlikView does some compression. At very large scales, processing speed will also impose a limit. Whatever the exact upper boundary, it’s clear that no one will be loading dozens of terabytes into QlikView any time soon. It can certainly be attached to a multi-terabyte warehouse, but it would have to work with multi-gigabyte extracts. For most purposes, that’s plenty.
While I’m on the topic of scalability, let me repeat a couple of points I made in the comments on the August post. One addresses the notion that QlikTech can replace a data warehouse. This is true in the sense that QlikView can indeed load and join data directly from operational systems. But a data warehouse is usually more than a federated view of current operational tables. Most warehouses include data integration to link otherwise-disconnected operational data. For example, customer records from different systems often can only be linked through complex matching techniques because there is no shared key such as a universal customer ID. QlikView doesn’t offer that kind of matching. You might be able to build some of it using QlikView scripts, but you’d get better results at a lower cost from software designed for the purpose.
In addition, most warehouses store historical information that is not retained in operational systems. A typical example is end-of-month account balance. Some of these values can be recreated from transaction details but it’s usually much easier just to take and store a snapshot. Other data may simply be removed from operational systems after a relatively brief period. QlikView can act as a repository for such data: in fact, it’s quite well suited for this. Yet in such cases, it’s probably more accurate to say that QlikView is acting as the data warehouse than to say a warehouse is not required.
I hope this clarifies matters without discouraging anyone from considering QlikTech. Yes QlikView is a fabulous product. No it won’t replace your multi-terabyte data warehouse. Yes it will complement that warehouse, or possibly substitute for a much smaller one, by providing a tremendously flexible and efficient business intelligence system. No it won’t run itself: you’ll still need some technical skills to do complicated things on large data volumes. But for a combination of speed, power, flexibility and cost, QlikTech can’t be beat.
Thursday, November 15, 2007
SAS Adds Real Time Decisioning to Its Marketing Systems
I’ve been trying to pull together a post on SAS for some time. It’s not easy because their offerings are so diverse. The Web site lists 13 “Solution Lines” ranging from “Activity-Based Management” to “Web Analytics”. (SAS being SAS, these are indeed listed alphabetically.) The “Customer Relationship Management” Solution Line has 13 subcategories of its own (clearly no triskaidekaphobia here), ranging from “Credit Scoring” to “Web Analytics”.
Yes, you read that right: Web Analytics is listed both as a Solution Line and as a component of the CRM Solution. So is Profitability Management.
This is an accurate reflection of SAS’s fundamental software strategy, which is to leverage generic capabilities by packaging them for specific business areas. The various Solution Lines overlap in ways that are hard to describe but make perfect business sense.
Another reason for overlapping products is that SAS has made many acquisitions as it has expanded its product scope. Of the 13 products listed under Customer Relationship Management, I can immediately identify three as based on acquisitions, and believe there are several more. This is not necessarily a problem, but it always raises concerns about integration and standardization.
Sure enough, integration and shared interfaces are two of the major themes that SAS lists for the next round of Customer Intelligence releases, due in December. (“Customer Intelligence” is SAS’s name for the platform underlying its enterprise marketing offerings. Different SAS documents show it including five or seven or nine of the components within the CRM Solution, and sometimes components of other Solution Lines. Confused yet? SAS tells me they're working on clarifying all this in the future.)
Labels aside, the biggest news within this release of Customer Intelligence is the addition of Real-Time Decision Manager (RDM), a system that…well…makes decisions in real time. This is a brand new, SAS-developed module, set for release in December with initial deployments in first quarter 2008. It is not to be confused with SAS Interaction Management, an event-detection system based on SAS's Verbind acquisition in 2002. SAS says it intends to tightly integrate RDM and Interaction Manager during 2008, but hasn’t worked out the details.
Real-Time Decision Manager lets users define the flow of a decision process, applying reusable nodes that can contain both decision rules and predictive models. Flows are then made available to customer touchpoint systems as Web services using J2EE. The predictive models themselves are built outside of RDM using traditional SAS modeling tools. They are then registered with RDM to become available for use as process nodes.
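To picture the pattern, reusable nodes that hold either business rules or a registered predictive model, chained into a flow that a touchpoint system can call, here is a generic sketch. It is my own illustration of the general decision-flow idea, not SAS’s actual interface, data model or scoring logic.

```python
from typing import Callable, Optional

# A node takes the interaction context and either returns a decision
# (ending the flow) or None (pass control to the next node).
Node = Callable[[dict], Optional[str]]

def eligibility_rule(ctx: dict) -> Optional[str]:
    # Plain business rule: never pitch anything to customers in collections.
    return "no_offer" if ctx.get("in_collections") else None

def churn_model_node(ctx: dict) -> Optional[str]:
    # Stand-in for a registered predictive model; a real deployment would
    # call a scoring engine here rather than this toy formula.
    churn_score = 0.7 * ctx.get("complaints", 0) / 10 + 0.3 * (1 - ctx.get("tenure_years", 0) / 20)
    return "retention_offer" if churn_score > 0.5 else None

def default_offer(ctx: dict) -> Optional[str]:
    return "standard_cross_sell"

def run_flow(ctx: dict, nodes: list) -> str:
    """Walk the flow in order and return the first decision produced."""
    for node in nodes:
        decision = node(ctx)
        if decision is not None:
            return decision
    return "no_offer"

flow = [eligibility_rule, churn_model_node, default_offer]
print(run_flow({"in_collections": False, "complaints": 8, "tenure_years": 1}, flow))
```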
RDM's reliance on externally-built models contrasts with products that automatically create and refresh their own predictive models, notably the Infor CRM Epiphany Inbound Marketing system recently licensed by Teradata (see my post of October 31). SAS says that skilled users could deploy similar self-adjusting models, which use Bayesian techniques, in about half a day in RDM. The larger issue, according to SAS, is that such models are only appropriate in a limited set of situations. SAS argues its approach lets companies deploy whichever techniques are best suited to their needs.
But the whole point of the Infor/Epiphany approach is that many companies will never have the skilled statisticians to build and maintain large numbers of custom models. Self-generating models let these firms benefit from models even if the model performance is suboptimal. They also permit use of models in situations where the cost of building a manual model is prohibitive. It seems to me the best approach is for software to support both skilled users and auto-generated models, and let firms apply whichever makes sense.
Back to integration. RDM runs on its own server, which is separate from the SAS 9 server used by most Customer Intelligence components. This is probably necessary to ensure adequate real-time performance. RDM does use SAS management utilities to monitor server performance. More important, it shares flow design and administrative clients with SAS Marketing Automation, which is SAS’s primary campaign management software. This saves users from moving between different interfaces and allows sharing of user-built nodes across Customer Intelligence applications.
RDM and other Customer Intelligence components also now access the same contact history and response data. This resides in what SAS calls a “lightweight” reporting schema, in contrast to the detailed, application-specific data models used within the different Customer Intelligence components. Shared contact and response data simplifies coordination of customer treatments across these different systems. Further integration among the component data models would probably be helpful, but I can't say for sure.
The December release also contains major enhancements to SAS’s Marketing Optimization and Digital Marketing (formerly E-mail Marketing) products. Optimization now works faster and does a better job finding the best set of contacts for a group of customers. Digital Marketing now includes mobile messaging, RSS feeds and dynamic Web pages. It also integrates more closely with Marketing Automation, which would generally create lists that are sent to Digital Marketing for transmission. Within Marketing Automation itself, it’s easier to create custom nodes for project flows and to integrate statistical models.
These are some pretty interesting trees, but let’s step back and look at the forest. Loyal readers of this blog know I divide a complete marketing system into five components: planning/budgets, project management, content management, execution, and analytics. SAS is obviously focused on execution and analytics. Limited content management functions are embedded in the various execution modules. There is no separate project management component, although the workflow capabilities of Marketing Automation can be applied to tactical tasks like campaign setup.
Planning and budgeting are more complicated because they are spread among several components. The Customer Intelligence platform includes a Marketing Performance Management module which is based on SAS’s generic Performance Management solutions. This provides forecasting, planning, scorecards, key performance indicators, and so on. Separate Profitability Management and Activity-Based Management modules are similarly based on generic capabilities. (If you’re keeping score at home, Profitability Management is usually listed within Customer Intelligence and Activity-Based Management is not.) Finally, Customer Intelligence also includes Veridiem MRM. Acquired in 2006 and still largely separate from the other products, Veridiem provides marketing reporting, modeling, scenarios and collaborative tools based on marketing mix models.
This is definitely a little scary. Things may not be as bad as they sound: components that SAS has built from scratch or reengineered to work on the standard SAS platforms are probably better integrated than the jumble of product names suggests. Also bear in mind that most marketing activities occur within SAS Marketing Automation, a mature campaign engine with more than 150 installations. Still, users with a broad range of marketing requirements should recognize that while SAS will sell you many pieces of the puzzle, some assembly is definitely required.
Monday, November 12, 2007
BridgeTrack Integrates Some Online Channels
What do “Nude Pics of Pam Anderson” and “Real-Time Analytics, Reporting and Optimization Across All Media Channels” have in common?
1. Both headlines are sure to draw the interest of certain readers.
2. People who click on either are likely to be disappointed.
Truth be told, I’ve never clicked on a Pam Anderson headline, so I can only assume it would disappoint. But I found the second headline irresistible. It was attached to a press release about the 5.0 release of Sapient’s BridgeTrack marketing software.
Maybe next time I’ll try Pam instead. BridgeTrack seems pretty good at what it does, but is nowhere near what the headline suggests.
First the good news: BridgeTrack integrates email, ad serving, offer pages, and keyword bidding (via an OEM agreement with Omniture) through a single campaign interface. All channels draw on a common content store, prospect identifiers, and data warehouse to allow integrated cross-channel programs. Results from each channel are posted and available for analysis in real time.
That’s much more convenient than working with separate systems for each function, and is the real point of BridgeTrack. I haven’t taken a close look at the specific capabilities within each channel but they seem reasonably complete.
But it’s still far from “optimization across all media channels”.
Let’s start with “all media channels”. Ever hear of a little thing called “television”? Most people would include it in a list of all media channels. But the best that BridgeTrack can offer for TV or any other off-line channel is a media buying module that manages the purchasing workflow and stores basic planning information. Even in the digital world, BridgeTrack does little to address organic search optimization, Web analytics, mobile phones, or the exploding realm of social networks. In general, I prefer to evaluate software based on what it does rather than what it doesn’t do. But if BridgeTrack is going to promise me all channels, I think it’s legitimate to complain when they don’t deliver.
What about “optimization”? Same story, I’m afraid. BridgeTrack does automatically optimize ad delivery by comparing results for different advertisements (on user-defined measures such as conversion rates) and automatically selecting the most successful. The keyword bidding system is also automated, but that’s really Omniture.
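The mechanic behind that kind of ad optimization, tracking results per creative, shifting delivery toward the current winner, and keeping a small test share so the other ads continue to accumulate data, can be sketched roughly as follows. This is my own generic illustration, not BridgeTrack’s algorithm, and the numbers are invented.

```python
import random

random.seed(1)

# Results observed so far per advertisement: impressions served and conversions.
ads = {
    "ad_a": {"impressions": 1000, "conversions": 12},
    "ad_b": {"impressions": 1000, "conversions": 30},
    "ad_c": {"impressions": 200,  "conversions": 3},
}

def conversion_rate(stats: dict) -> float:
    return stats["conversions"] / stats["impressions"] if stats["impressions"] else 0.0

def pick_ad(explore_share: float = 0.1) -> str:
    """Serve the best-performing ad most of the time, but keep a small
    random share so newer or weaker ads keep accumulating data."""
    if random.random() < explore_share:
        return random.choice(list(ads))
    return max(ads, key=lambda name: conversion_rate(ads[name]))

served = [pick_ad() for _ in range(10_000)]
print({name: served.count(name) for name in ads})  # ad_b gets most of the delivery
```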
Otherwise, all optimization is manual. For example, the press release says the BridgeTrack campaign manager “reallocates marketing dollars across channels that generate the most incremental return-on-spend.” But all it really does is present reports. Users have to interpret them and make appropriate changes in marketing programs. Similarly, email and offer page optimization means watching the results of user-defined rules and adjusting the rules manually. Rather than claiming that BridgeTrack “does” optimization, it might be more accurate to say it “enables” it through integrated real-time reports and unified campaign management. Given how hard it is to assemble information and coordinate campaigns without a tool like BridgeTrack, that’s actually quite enough.
Even within its chosen channels, BridgeTrack lacks automated predictive modeling and advanced analytics in general. (The ad server does offer some cool heat maps of placement performance.) This has direct consequences, since it means the system must rely heavily on user-defined rules to select appropriate customer treatments. Unfortunately, rule management is quite limited: users don’t even get statistics on how often different rules fire or how they perform. The problem is compounded because rules can exist at many different levels, including within content, in content templates, and in campaign flows. Understanding interactions across different levels can be difficult, yet BridgeTrack provides little assistance. The central content store helps a bit, since rules embedded in a particular piece of content are shared automatically when the content is shared. BridgeTrack managers recognize this issue and hope to improve rule management in the future.
In fact, despite the press release, BridgeTrack managers have a fairly realistic view of the product’s actual scope. This shows in recent agreements with Unica and Omniture to integrate with their respective marketing automation and Web analytics products. Users of the combined set of products would have many of the planning, off-line marketing, project management and analytical tools that BridgeTrack itself does not provide.
(Actually, based on the July 2007 press release describing the BridgeTrack integration, Omniture positions itself as “online business optimization software” that provides “one standard view across all marketing initiatives”. That’s a bold set of claims. I’m skeptical but will wait to examine them some other day.)
BridgeTrack is a hosted solution. Pricing is designed to be comparable with the point solutions it replaces and therefore is calculated differently for specific activities: message volume for ad serving and email, cost per click for search bid management, and traffic levels for landing page hosting. The campaign manager and reporting systems support all the different channels. These are not usually sold independently but could be purchased for a monthly fee. Customer data integration, which combines BridgeTrack-generated data with information from external sources for reporting and customer treatments, is charged as a professional services project.
Tuesday, November 06, 2007
Datran Media Sells Email Like Web Ads
I wasn’t able to get to the ad:tech conference in New York City this week, but did spend a little time looking at the show sponsors’ Web sites. (Oddly, I was unable to find an online listing of all the exhibitors. This seems like such a basic mistake for this particular group that I wonder whether it was intentional. But I can’t see a reason.)
Most of the sponsors are offering services related to online ad networks. These are important but only marginally relevant to my own concerns. I did, however, see some intriguing information from Datran Media, an email marketing vendor which seems to be emulating the model of the online ad networks. It’s hard to get a clear picture from its Web site, but my understanding is that Datran both provides conventional email distribution services via its Skylist subsidiary and helps companies purchase use of other firms’ email lists.
This latter capability is what’s intriguing. Datran is packaging email lists in the same way as advertising space on a Web site or conventional publication. That is, it treats each list as “inventory” that can be sold to the highest bidder in an online exchange. Datran not very creatively calls this “Exchange Online”, or EO. Presumably (this is one of the things I can’t tell from the Web site) the inventory is limited by the number of times a person can be contacted within a given period.
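If my guess about the frequency limit is right, the underlying mechanics—each subscriber represents a fixed number of contact “slots” per period, and those slots go to the highest bidders until they run out—would look something like the sketch below. This is purely my own hypothetical illustration of the exchange idea, not Datran’s system; the names, caps and prices are invented.

```python
from collections import defaultdict

CONTACTS_PER_MONTH = 4  # hypothetical frequency cap per subscriber

subscribers = ["sub_1", "sub_2", "sub_3"]

# Bids from mailers for access to the list, taken in descending price order.
bids = sorted(
    [("mailer_a", 0.12), ("mailer_b", 0.09), ("mailer_c", 0.07),
     ("mailer_d", 0.05), ("mailer_e", 0.04)],
    key=lambda bid: bid[1], reverse=True,
)

contacts_used = defaultdict(int)   # slots already consumed this period
allocations = []                   # (mailer, price, subscribers reached)

for mailer, price in bids:
    # Each winning bid reaches every subscriber who still has capacity left;
    # once the caps are exhausted, the remaining inventory is simply gone.
    reached = [sub for sub in subscribers if contacts_used[sub] < CONTACTS_PER_MONTH]
    for sub in reached:
        contacts_used[sub] += 1
    allocations.append((mailer, price, len(reached)))

print(allocations)  # the lowest bidder finds no inventory left
```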
Datran also speaks of having an email universe of over 100 million unique consumers. I can’t tell if this is its own compiled list or the sum of the lists it sells on behalf of its clients, although I’m guessing the former. The company offers event-based selections within this universe, such as people who have recently responded to an offer or made a purchase. This is more like traditional direct mail list marketing than Web ad sales, not that there’s anything wrong with that. Completing the circle, Datran also offers event-triggered programs to its conventional email clients, for retention, cross sales and loyalty building. This is not unique, but it’s still just emerging as a best practice.
From my own perspective, treating an email list as an inventory of contact opportunities exactly mirrors the way we see things at Client X Client. In our terminology, each piece of inventory is a “slot” waiting to be filled. Wasting slots is as bad as wasting any other perishable inventory, be it a Web page view, airline seat, or stale doughnut. One of the core tasks in the CXC methodology is identifying previously unrecognized slots and then attempting to wring the greatest value possible from them. It’s pleasing to see that Datran has done exactly that, even though they came up with the idea without our help.
The notion of slots also highlights another piece of ambiguity about Datran: are email customers purchasing an entire email, or an advertisement inserted into existing email? There is language on the company site that suggests both possibilities, although I suspect it’s the entire email. Actually, embedding ads in existing emails might be a more productive use of the “slots” that those emails represent, since it would allow delivery of more messages per customer. Whether it would also annoy recipients or diminish response is something that would have to be tested.
Datran offers other services related to online marketing, such as landing page optimization. This illustrates another trend: combining many different online channels and methods in a single package. This is an important development in the world of marketing software, and I plan to write more about it in the near future.
Friday, November 02, 2007
The Next Big Leap for Marketing Software
I’ve often written about the tendency of marketing automation vendors to endlessly expand the scope of their products. Overall, this is probably a good thing for their customers. But at some point, the competitive advantage of adding yet another capability probably approaches nil. If so, then what will be the next really important change in marketing systems?
My guess is it will be a coordination mechanism to tie together all of those different components – resource management, execution, analysis, and so on. Think of each function as a horse: the more horses you rope to your chariot, the harder it is to keep control.
I’m not talking about data integration or marketing planning, which are already part of the existing architectures, or even the much desired (though rarely achieved) goal of centralized customer interaction management. Those are important but too rigid. Few companies will be technically able or politically willing to turn every customer treatment over to one great system in the sky. Rather, I have in mind something lighter-handed: not a rigid harness, but a carrot in front of those horses that gets them voluntarily pulling in the same direction. (Okay, enough with the horse metaphor.)
The initial version of this system will probably be a reporting process that gathers interactions from customer contact systems and relates them to future results. I’m stating that as simply as possible because I don’t think a really sophisticated approach – say, customer lifecycle simulation models – will be accepted. Managers at all levels need to see basic correlations between treatments and behaviors. They can then build their own mental models about how these are connected. I fully expect more sophisticated models to evolve over time, including what-if simulations to predict the results of different approaches and optimization to find the best choices. But today most managers would find such models too theoretical to act on the results. I’m avoiding mention of lifetime value measures for the same reason.
So what, concretely, is involved here and how does it differ from what’s already available? What’s involved is building a unified view of all contacts with individual customers and making these easy to analyze. Most marketing analysis today is still at the program level and rarely attempts to measure anything beyond immediate results. The new system would assemble information on revenues, product margins and service costs as well as promotions. This would give a complete picture of customer profitability at present and over time. Changes over time are really the key, since they alert managers as quickly as possible to problems and opportunities.
The system will store this information at the lowest level possible (preferably, down to individual interactions) and with the greatest detail (specifics about customer demographics, promotion contents, service issues, etc.), so all different kinds of analysis can be conducted on the same base. Although the initial deployments will contain only fragments of the complete data, these fragments will themselves be enough to be useful. The key to success will be making sure that the tools in the initial system are so attractive (that is, so powerful and so easy to use) that managers in all groups want to use them against their own data, even though this means exposing that data to others in the company. (If you’re lucky enough to work in a company where all data is shared voluntarily and enthusiastically – well, congratulations.)
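As a concrete (and deliberately simplified) illustration of what that unified view implies, the sketch below rolls interaction-level records from different systems up to customer-by-month profitability and reports the change from one period to the next. It is my own example; the record layout and field names are hypothetical, not a reference to any particular product.

```python
from collections import defaultdict

# Interaction-level records from different source systems, kept at the lowest
# level of detail so many kinds of analysis can run against the same base.
interactions = [
    {"customer": "c1", "month": "2007-10", "type": "purchase",  "margin": 40.0},
    {"customer": "c1", "month": "2007-10", "type": "promotion", "cost": 2.0},
    {"customer": "c1", "month": "2007-11", "type": "service",   "cost": 15.0},
    {"customer": "c2", "month": "2007-10", "type": "purchase",  "margin": 25.0},
    {"customer": "c2", "month": "2007-11", "type": "purchase",  "margin": 30.0},
]

# Roll up to customer-by-month profitability: margin earned minus promotion
# and service costs charged against that customer in that period.
profit = defaultdict(float)
for i in interactions:
    profit[(i["customer"], i["month"])] += i.get("margin", 0.0) - i.get("cost", 0.0)

# The interesting signal is the change from one period to the next.
for customer in ("c1", "c2"):
    months = sorted(m for (c, m) in profit if c == customer)
    for prev, curr in zip(months, months[1:]):
        delta = profit[(customer, curr)] - profit[(customer, prev)]
        print(customer, prev, "->", curr, f"change: {delta:+.2f}")
```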
You may feel what I’ve described is really no different from existing marketing databases and data warehouses. I don’t think so: transactions in most marketing databases are limited to promotions and responses, while most data warehouses lack the longitudinal customer view. But even if the technology is already in place, the approach is certainly distinct. It means asking managers to look not just at their own operational concerns, but at how their activities affect results across the company and the entire customer life cycle. More concretely, it allows managers to spot inconsistencies in customer treatments from one department to the next, and to compare the long-term results (and, therefore, return on investment) of treatments in different areas. Comparing return on investment is really a form of optimization, but that’s another term we’re avoiding for the moment.
Finally, and most important of all, assembling and exposing all this information makes it easy to see where customer treatments support over-all business strategies, and where they conflict with them. This is the most important benefit because business strategy is what CEOs and other top-level executives care about—so a system that helps them execute strategy can win their support. That support is what’s ultimately needed for marketing automation to make its next great leap, from one department to the enterprise as a whole.