I spoke earlier this week with SQLStream, which offers software to execute queries against data streams such as stock market prices, Web logs and credit card transactions. These queries can include on-the-fly calculations such as moving averages, as well as scans for patterns like a sequence of failed log-in attempts. Typical applications include security monitoring, fraud detection, and general business activity monitoring. Marketers can use the queries to identify new leads and select cross-sell and upsell offers. Although the connection is a little less obvious, the system can also be used as an alternative to conventional batch data preparation methods for tasks like customer data integration.
SQLStream’s particular claim to fame is that its queries are almost identical to garden-variety SQL. Other vendors in this space apparently use more proprietary approaches. I say “apparently” because I haven’t researched the competition in any depth. A quick bit of poking around was enough to scare me off: there are many vendors in the space and it is a highly technical topic. It turns out that stream processing is one type of “complex event processing,” a field which has attracted some very smart but contentious experts. To see what I mean, check out Event Processing Thinking (Opher Etzion) and Cyberstrategics Complex Event Processing Blog (Tim Bass). This is clearly not a group to mess with.
That said, SQLStream’s more or less direct competitors seem to include: Coral8, Truviso, Progress Apama, Oracle BAM, TIBCO BusinessEvents, KX Systems, StreamBase and Aleri. For a basic introduction to data stream processing, see this presentation from Truviso.
Back to SQLStream. As I said, it lets users write what are essentially standard SQL queries that are directed against a data stream rather than a static table. The data stream can be any JDBC-accessible data source, which includes most types of databases and file structures. The system can also accept streams of XML data over HTTP, which includes RSS feeds, Twitter posts and other Web sources. Its queries can also incorporate conventional (non-streaming) relational database tables, which is very useful when you need to compare streamed inputs against more or less static reference information. For example, you might want to check current activity against a customer’s six-month average bank balance or transaction rate.
The advantages of using SQL queries are that there are lots of SQL programmers out there and that SQL is relatively easy to write and understand. The disadvantage (in my opinion; not surprisingly, SQLStream didn’t mention this) is that SQL is really bad at certain kinds of queries, such as queries comparing subsets within the query universe and queries based on record sequence. Lack of sequencing may sound like a pretty big drawback for a stream processing system, but SQLStream compensates by letting queries specify a time “window” of records to analyze. This makes queries such as “more than three transactions in the past minute” quite simple. (The notion of “windows” is common among stream processing systems.) To handle subsets within queries, SQLStream mimics a common SQL technique of converting one complex query into a sequence of simple queries. In SQLStream terms, this means the output of one query can be a stream that is read by another query. These streams can be cascaded indefinitely in what SQLStream calls a “data flow architecture”. Queries can also call external services, such as address verification, and incorporate the results. Query results can be posted as records to a regular database table.
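To make the window idea concrete, here is a minimal sketch of “more than three transactions in the past minute” written in plain Python rather than SQLStream’s SQL dialect. It is not SQLStream code: the function name, the `(timestamp, account)` event shape, and the 60-second and three-transaction constants are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 60  # assumed window length ("the past minute")
THRESHOLD = 3        # alert on MORE than three transactions


def detect_bursts(events, window=WINDOW_SECONDS, threshold=THRESHOLD):
    """Yield (account, timestamp) whenever an account exceeds the threshold
    inside the trailing window.

    `events` is an iterable of (timestamp_seconds, account) pairs that
    arrive in time order, as they would from a stream.
    """
    recent = defaultdict(deque)  # account -> timestamps still inside the window
    for ts, account in events:
        q = recent[account]
        q.append(ts)
        # Drop timestamps that have aged out of the window.
        while q and q[0] <= ts - window:
            q.popleft()
        if len(q) > threshold:
            yield account, ts
```

Only the timestamps still inside the window are kept per account, which is the same reason a stream processor can bound how much data it holds.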
SQLStream does its actual processing by holding all the necessary data in memory. It automatically examines all active queries to determine how long data must be retained: thus, if three different queries need a data element for one, two and three minutes, the system will keep that data in memory for three minutes. SQLStream can run on 64-bit servers, allowing effectively unlimited memory, at least in theory. In practice, it is bound by the physical memory available: if the stream feeds more data than the server can hold, some data will be lost. The vendor is working on strategies to solve this problem, probably by retaining the overflow data and processing it later. For now, the company simply recommends that users make sure they have plenty of extra memory available.
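The retention rule described above (keep each element as long as the longest window any active query needs) can be sketched in a few lines. This is an illustration only, not how SQLStream is actually implemented; the `StreamBuffer` class and the query names are hypothetical.

```python
from collections import deque


class StreamBuffer:
    """Hold stream rows in memory only as long as some active query needs them."""

    def __init__(self, query_windows):
        # query_windows maps a query name to the window (in seconds) it reads.
        # The retention period is the longest of those windows.
        self.retention = max(query_windows.values(), default=0)
        self.rows = deque()

    def add(self, ts, row):
        """Append a row, then evict rows older than the retention period."""
        self.rows.append((ts, row))
        while self.rows and self.rows[0][0] <= ts - self.retention:
            self.rows.popleft()


# Three queries needing one, two and three minutes of data.
buffer = StreamBuffer({"q1": 60, "q2": 120, "q3": 180})
```

With these three queries the buffer keeps three minutes of data, matching the example in the text. Note that nothing here bounds total memory: a fast enough stream still fills the server, which is the limitation the vendor acknowledges.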
In addition to memory, system throughput depends on processing power. SQLStream currently runs on multi-core, single-server systems and is moving towards multi-node parallel processing. Existing systems process tens of thousands of records per second. By itself, this isn't a terribly meaningful figure, since capacity also depends on record size, query complexity, and data retention windows. In any case, the vendor is aiming to support one million records per second.
SQLStream was founded in 2002 and owns some basic stream processing patents. The product itself was launched only in 2008 and currently has about a dozen customers. Since the company is still seeking to establish itself, pricing is, in their words, “very aggressive”.
If you’re still reading this, you probably have a pretty specific reason for being interested in SQLStream or stream processing in general. But just in case you’re wondering “Why the heck is he writing about this in a marketing blog?” there are actually several reasons. The most obvious is that “real time analytics” and “real time interaction management” are increasingly prominent topics among marketers. Real time analytics provides insights into customer behaviors at either a group level (e.g., trends in keyword response) or for an individual (e.g., estimated lifetime value). Real time interaction management goes beyond insight to recommend individual treatments as the interaction takes place (e.g., which offer to make during a phone call). Both require the type of quick reaction to new data that stream processing can provide.
There is also increasing interest in behavior detection, sometimes called event driven marketing. This monitors customer behaviors for opportunities to initiate an interaction. The concept is not widely adopted, even though it has proven successful again and again. (For example, Mark Holtom of Eventricity recently shared some very solid research that found event-based contacts were twice as productive as any other type. Unfortunately the details are confidential, but if you contact Mark via Eventricity perhaps he can elaborate.) I don’t think lack of stream processing technology is the real obstacle to event-based marketing, but perhaps greater awareness of stream processing would stir up interest in behavior detection in general.
Finally, stream processing is important because so much attention has recently been focused on analytical databases that use special storage techniques such as columnar or in-memory structures. These require processing to put the data into the proper format. Some offer incremental updates, but in general the updates run as batch processes and the systems are not tuned for real-time or near-real-time reactions. So it’s worth considering stream processing systems as a complement that lets companies employ these other technologies without giving up quick response to new data.
I suppose there's one more reason: I think this stuff is really neat. Am I allowed to say that?
Thursday, January 29, 2009
Thursday, May 22, 2008
For Behavior Detection, Simple Triggers May Do the Trick
I was in the middle of writing last week’s post, on marketing systems that react to customers’ Web behavior, when I got a phone call from a friend at a marketing services agency who excitedly described his firm’s success with exactly such programs. Mostly this confirmed my belief that these programs are increasingly important. But it also prompted me to rethink the role of predictive modeling in these projects.
To back up just a bit, behavioral targeting is a hot topic right now in the world of Web marketing. It usually refers to systems that use customer behavior to predict which offers a visitor will find most attractive. By displaying the right offer for each person, rather than showing the same thing to everyone, average response rates can be increased significantly.
This type of behavioral targeting relies heavily on automated models that find correlations between a relatively small amount of data and subsequent choices. Vendors like Certona and [X+1] tell me they can usually make valuable distinctions among visitors after as few as a half-dozen clicks.
At the risk of stating the obvious, this works because the system is able to track the results of making different offers. But this simple condition is not always met. The type of behavior tracking I wrote about last week—seeing which pages a visitor selected, what information they downloaded, how long they spent in different areas of the site, how often they returned, and so on—often relates to large, considered purchases. The sales cycle for these extends over many interactions as the customer educates herself, gets others involved for their opinions and approvals, speaks with sales people, and moves slowly towards a decision. A single Web visit rarely results in an offer that is rejected or accepted on the spot. Without a set of outcomes—that is, a list of offers that were accepted or rejected—predictive modeling systems don’t have anything to predict.
If your goal is to find a way to do predictive modeling, there are a couple of ways around this. One is to tie together the string of interactions and link them with the customer’s ultimate purchase decision. This can be used to estimate the value of a lead in a lead scoring system. Another solution is to make intermediate offers during each interaction, of “products” such as white papers and sales person contacts. These could be made through display ads on the Web site or something more direct like an email or phone call. The result is to give the modeling system something to predict. You have to be careful, of course, to check the impact of these offers on the customer’s ultimate purchase behavior: a phone call or email might annoy people (not to mention reminding them that you are watching). Information such as comparisons with competitors may be popular but could lead them to delay their decision or even end up purchasing something else.
Of course, predictive modeling is not an end in itself, unless you happen to sell predictive modeling software. The business issue is how to make the best use of the information about detailed Web (and other) behaviors. This information can signal something important about a customer even if it doesn’t include response to an explicit offer.
As I wrote last week, one approach to exploiting this information is to let salespeople review it and decide how to react. This is expensive but makes sense where a small number of customers to monitor have been identified in advance. Where manual review is not feasible, behavior detection software including SAS Interaction Management, Unica Affinium Detect, Fair Isaac OfferPoint, Harte-Hanks Allink Agent, Eventricity and ASA Customer Opportunity Advisor can scan huge volumes of information for significant patterns. They can then either react automatically or alert a sales person to take a closer look.
The behavior detection systems monitor complex patterns over multiple interactions. These are usually defined in advance through sophisticated manual and statistical analysis. But trigger events can also be as basic as an abandoned shopping cart or search for information on pricing. These can be identified intuitively, defined in simple rules and captured with standard technology. What’s important is not that sophisticated analytics can uncover subtle relationships, but that access to detailed data exposes behavior which was previously hidden. This is what my friend on the phone found so exciting: it was like finding gold nuggets lying on the ground. All you had to do was look.
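To show how little machinery a basic trigger needs, here is a rough sketch of the two examples just mentioned, written as plain rules over a visitor’s session. Everything in it is assumed for illustration: the action names (`add_to_cart`, `purchase`, `view_pricing`), the idea that a session is simply a list of action names, and the `TRIGGERS` registry.

```python
def abandoned_cart(session):
    """Visitor put something in the cart but never bought."""
    return "add_to_cart" in session and "purchase" not in session


def pricing_interest(session):
    """Visitor looked for pricing information."""
    return "view_pricing" in session


# Hypothetical registry: trigger name -> rule. A marketer adds a new
# trigger by adding a new rule here.
TRIGGERS = {
    "abandoned_cart": abandoned_cart,
    "pricing_interest": pricing_interest,
}


def fired_triggers(session):
    """Return the names of all triggers that fire for this session."""
    return [name for name, rule in TRIGGERS.items() if rule(session)]
```

For example, `fired_triggers(["view_pricing", "add_to_cart"])` returns both trigger names, while a session that ends in a purchase fires only the pricing trigger. No statistics are involved; the value comes from having the detailed behavior data in the first place.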
That said, even simple behavior-based triggers need some technical support. A good marketer can easily think of triggers to consider: in fact, a good marketer can easily think of many more triggers than it’s practical to exploit. So a testing process, and system to support the process, is needed to determine which triggers are actually worth deploying. This involves setting up the triggers, reacting when they fire, and measuring the short- and long-term results. The process can never be fully automated because the trigger definitions themselves will come from humans who perceive new opportunities. But it should be as automated as possible so the company can test new ideas as conditions change over time.
Fortunately, the technical requirements for this sort of testing and execution are largely the same as the requirements for other types of marketing execution. This means that any good customer management system should already meet them. (Another way to look at it: if your customer management system can’t support this, you probably need a new one anyway.)
So my point, for once, is not that some cool new technology can make you rich. It’s that you can do cool new things with your existing technology that can make you rich. All you have to do is look.
Wednesday, December 26, 2007
Eventricity Lets Banks Buy, Not Build, Event-Based Marketing Systems
As you may recall from my posts on Unica and SAS, event-based marketing (also called behavior identification) seems to be gaining traction at long last. By coincidence, I recently found some notes I made two years ago about a UK-based firm named eventricity Ltd. This led to a long conversation with eventricity founder Mark Holtom, who turned out to be an industry veteran with background at NCR/Teradata and AIMS Software, where he worked on several of the pioneering projects in the field.
Eventricity, launched in 2003, is Holtom’s effort to convert the largely custom implementations he had seen elsewhere into a more packaged software product. Similar offerings do exist, from Harte-Hanks, Teradata and Conclusive Marketing (successor to Synapse Technology) as well as Unica and SAS. But those are all part of a larger product line, while eventricity offers event-based software alone.
Specifically, eventricity has two products: Timeframe event detection and Coffee event filtering. Both run on standard servers and relational databases (currently implemented on Oracle and SQL Server). This contrasts with many other event-detection systems, which use special data structures to capture event data efficiently. Scalability doesn’t seem to be an issue for eventricity: Holtom said it processes data for one million customers (500 million transactions, 28 events) in one hour on a dual processor Dell server.
One of the big challenges with event detection is defining the events themselves. Eventricity is delivered with a couple dozen basic events, such as unusually large deposit, end of a mortgage, significant birthday, first salary check, and first overdraft. These are defined with SQL statements, which imposes some limits in both complexity and end-user control. For example, although events can consider transactions during a specified time period, they cannot use a sequence of transactions (e.g., an overdraft followed by a withdrawal). And since few marketers can write their own SQL, creation of new events takes outside help.
But users do have great flexibility once the events are built. Timeframe has a graphical interface that lets users specify parameters, such as minimum values, percentages and time intervals, which are passed through to the underlying SQL. Different parameters can be assigned to customers in different segments. Users can also give each event its own processing schedule, and can combine several events into a “super event”.
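As a rough illustration of parameters being passed through to underlying SQL, the sketch below defines an “unusually large deposit” event once and runs it with a different threshold per customer segment. This is not eventricity’s actual schema or code: the table and column names, the segment names, and the threshold values are all hypothetical, and SQLite stands in for Oracle or SQL Server.

```python
import sqlite3

# The event is written once by someone who knows SQL...
LARGE_DEPOSIT_SQL = """
SELECT t.customer_id, t.amount
FROM transactions t
JOIN customers c ON c.customer_id = t.customer_id
WHERE c.segment = :segment
  AND t.kind = 'deposit'
  AND t.amount >= :min_amount
ORDER BY t.customer_id
"""

# ...and the marketer only adjusts parameters, per segment (assumed values).
SEGMENT_PARAMS = {
    "retail": {"min_amount": 5000},
    "private": {"min_amount": 50000},
}


def detect_large_deposits(conn):
    """Run the event once per segment with that segment's parameters."""
    hits = []
    for segment, params in SEGMENT_PARAMS.items():
        cursor = conn.execute(LARGE_DEPOSIT_SQL, {"segment": segment, **params})
        hits.extend(cursor.fetchall())
    return hits
```

The split matters because it puts the part marketers change often (thresholds, intervals) outside the SQL, so only genuinely new events need outside help.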
Coffee adds still more power, aimed at distilling a trickle of significant leads from the flood of raw events. This involves filters to determine which events to consider, ranking to decide which leads to handle first, and distribution to determine which channels will process them. Filters can consider event recency, rules for contact frequency, and customer type. Eligible events are ranked based on event type and processing sequence. Distribution can be based on channel capacity and channel priorities by customer segment: the highest-ranked leads are handled first.
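The filter, rank and distribute steps can be pictured as a small pipeline. The sketch below is a simplified stand-in, not Coffee’s actual logic: the `Lead` fields, the event priorities, the recency and contact limits, and the channel names are all assumptions, and it ignores details the product supports, such as customer type filters and channel priorities by segment.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    customer: str
    event_type: str
    age_days: int             # how long ago the event occurred
    contacts_this_month: int  # how often this customer was already contacted


# Assumed priorities: a lower number is handled first.
EVENT_RANK = {"first_overdraft": 0, "large_deposit": 1, "birthday": 2}


def filter_leads(leads, max_age_days=2, max_contacts=1):
    """Keep only recent events for customers not already over-contacted."""
    return [
        lead for lead in leads
        if lead.age_days <= max_age_days and lead.contacts_this_month < max_contacts
    ]


def rank_leads(leads):
    """Order leads by event priority; unknown event types go last."""
    return sorted(leads, key=lambda lead: EVENT_RANK.get(lead.event_type, len(EVENT_RANK)))


def distribute(leads, capacity):
    """Fill channels in the order given until each channel's capacity is used.

    `capacity` maps channel name -> number of leads it can handle, listed
    from highest-priority channel to lowest.
    """
    assignments = {channel: [] for channel in capacity}
    remaining = iter(leads)
    for channel, slots in capacity.items():
        for _ in range(slots):
            lead = next(remaining, None)
            if lead is None:
                return assignments
            assignments[channel].append(lead.customer)
    return assignments
```

Chaining the three calls, `distribute(rank_leads(filter_leads(leads)), capacity)`, sends the highest-ranked leads to the first channel and lets the rest spill over to the next; anything beyond total capacity is simply not assigned.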
What eventricity does not do is decide what offer should be triggered by each event. Rather, the intent is to feed the leads to call centers or account managers who will call the customer, assess the situation, and react appropriately. Several event-detection vendors share this approach, arguing that automated systems are too error-prone to pre-select a specific offer. Other vendors do support automated offers, arguing that automated contacts are so inexpensive that they are profitable even if the targeting is inexact. The counter-argument of the first group is that poorly targeted offers harm the customer relationship, so the true cost goes beyond the expense of sending the message itself.
What all event-detection vendors agree on is the need for speed. Timeframe cites studies showing that real time reaction to events can yield an 82% success rate, vs. 70% for response within 12 hours, 25% in 48 hours and 10% in four days. Holtom argues that the difference in results between next-day response and real time (which Timeframe does not support) is not worth the extra cost, particularly since few if any banks can share and react to events across all channels in real time.
Still, the real question is not why banks won’t put in real-time event detection systems, but why so few have bought the overnight event detection products already available. The eventricity Web site cites several cases with mouth-watering results. My own explanation has long been that most banks cannot act on the leads these systems generate: either they lack the contact management systems or cannot convince personal bankers to make the calls. Some vendors have agreed.
But Holtom and others argue the main problem is that banks build their own event-detection systems rather than purchasing someone else’s. This is certainly plausible for the large institutions. Event detection looks simple. It’s the sort of project in-house IT and analytical departments would find appealing. The problem for the software vendors is that once a company builds its own system, it’s unlikely to buy an outside product: if the internal system works, there’s no need to replace it, and if it doesn’t work, well, then, the idea has been tested and failed, hasn’t it?
For the record, Holtom and other vendors argue their experience has taught them where to look for the most important events, providing better results faster than an in-house team. The most important trick is event filtering: identifying the tiny fraction of daily events that are most likely to signal productive leads. In one example Holtom cites, a company’s existing event-detection project yielded an unmanageable 660,000 leads per day, compared with a handy 16,000 for eventricity.
The vendors also argue that buying an external system is much cheaper than building one yourself. This is certainly true, but something that internal departments rarely acknowledge, and accounting systems often obscure.
Eventricity’s solution to the marketing challenge is a low-cost initial trial, which includes in-house set-up and scanning for three to five events for a three month period. Cost is 75,000 Euros, or about $110,000 at today’s pitiful exchange rate. Pricing on the actual software starts as low as $50,000 and would be about 250,000 Euros ($360,000) for a bank with one million customers. Implementation takes ten to 12 weeks. Eventricity has been sold and implemented at Banca Antonveneta in Italy, and several other trials are in various stages.
