Showing posts with label data science. Show all posts

Saturday, November 28, 2015

Model Factory from Modern Analytics Offers High Scale Predictive Modeling for Marketers

Remember when I asked two weeks ago whether predictive models are becoming a commodity? Here’s another log for that fire: Model Factory from Modern Analytics, which promises as many models as you want for a flat fee starting at $5,000 per month. You heard that right: an all-you-can-eat, fixed-price buffet for predictive models. Can free toasters* and a loyalty card be far behind?

Of course, some buffets sell better food than others. So far as I can tell, the models produced by Model Factory are quite good. But buffets also imply eating more than you should. As Model Factory’s developers correctly point out, many organizations could healthily consume a nearly unlimited number of models. Model Factory is targeted at firms whose large needs can’t be met at an acceptable cost by traditional modeling technologies. So the better analogy might be Green Revolution scientists increasing food production to feed the starving masses.

In any case, the real questions are what Model Factory does and how. The "what" is pretty simple: it builds a large number of models in a fully automated fashion. The "how" is more complicated.  Model Factory starts by importing data in known structures, so users still need to set up the initial inputs and do things like associate customer identities from different systems. Modern Analytics has staff to help with that, but it can still be a substantial task. The good news is that set-up is done only when you’re defining the modeling process or adding new sources, so the manual work isn't repeated each time a model is built or records are scored. Better still, Modern Analytics has experience connecting to APIs of common data sources such as Salesforce.com, so a new feed from a familiar source usually takes just a few hours to set up.  Model Factory stores the loaded data in its own database. This means models can use historical data without reloading all data from scratch before each update.

Once the data flow is established, users specify the file segments to model against and the types of predictions.  The predictions usually describe likelihood of actions such as purchasing a specific product but they could be something else. Again there’s some initial skilled work to define the model parameters but the process then runs automatically. During a typical run, Model Factory evaluates the input data, does data prep such as treating outliers and transforming variables, builds new models, checks each model for usable results, and scores customer records for models that pass.
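To make that sequence concrete, here is a minimal Python sketch of a single run. It is my own illustration, not Modern Analytics' code: the function names (`cap_outliers`, `run_once`), the three-standard-deviation cap, and the log transform are all assumptions, and the model builder and quality check are passed in as placeholders.

```python
import math
import statistics


def cap_outliers(values, z=3.0):
    """Cap values lying more than z standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return list(values)
    low, high = mean - z * sd, mean + z * sd
    return [min(max(v, low), high) for v in values]


def run_once(rows, build, passes_check):
    """One hypothetical run: prep, build, check, then score only if the check passes.

    rows is a list of (spend, responded) pairs with non-negative spend;
    build returns a scoring function that takes one feature value.
    """
    spend = cap_outliers([r[0] for r in rows])       # treat outliers
    features = [math.log1p(s) for s in spend]        # transform the variable
    labels = [r[1] for r in rows]
    model = build(features, labels)                  # build a new model
    if not passes_check(model, features, labels):    # check for usable results
        return None                                  # flagged: no scores released
    return [model(x) for x in features]              # score the records
```

The point of the structure is the early return: a model that fails its check never produces scores, which is what keeps an unattended process from pushing bad numbers downstream.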

The quality check is arguably the most important part of the process, because that’s what prevents Model Factory from blindly producing bad scores due to inadequate data, quality problems, or other unanticipated issues. Model Factory flags bad models – measured by traditional statistical methods like the c-score – and gives users some information about their results. It’s then up to the human experts to dig further and either accept the model as is or make whatever fixes are required. Scores from passing models are pushed to client systems in files, API calls, or whatever else has been set up during implementation.
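For readers who haven't met it: I'm assuming the c-score here is the concordance statistic, which for a yes/no outcome is the share of (responder, non-responder) pairs where the responder got the higher score, and equals the area under the ROC curve. The sketch below computes it in plain Python and uses it as a pass/fail gate. The 0.7 cutoff and the function names are my assumptions, not Model Factory settings.

```python
def c_statistic(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative outcome")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return concordant / (len(pos) * len(neg))


def passes_quality_check(scores, labels, threshold=0.7):
    """Hypothetical gate: release scores only when the c-statistic clears the bar."""
    return c_statistic(scores, labels) >= threshold
```

A value of 0.5 means the model ranks no better than a coin flip and 1.0 means perfect ranking. The pairwise loop is fine for an illustration but far too slow for 100,000 records; a rank-based calculation would be the practical choice at that size.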

If you’ve been around the predictive modeling industry for a while, you know that automated model development has been available in different forms for a long time. Indeed, Model Factory's own core engine was introduced five years ago. What made Model Factory special, then and now, is automating the end-to-end process at high scale. How high? There's no simple answer because the company can adjust the hardware to provide whatever performance a client requires. In addition to hardware, performance is driven by types of models, number of records, and size of each record. A six-processor machine working with 100,000 large records might take 2 to 40 minutes to build each model and 30 seconds per model to score all records.**

Model Factory now runs as a cloud-based service, which lets users easily upgrade hardware to meet larger loads. A new interface, now in beta, lets end-users manage the modeling process and view the results. Even with the interface, tasks such as exploring poorly performing models take serious data science skills. So it would still be wrong to think of Model Factory as a tool for the unsophisticated. Instead, consider Model Factory as a force multiplier for companies that know what they’re doing and how to do it, but can’t execute the volumes required.

Pricing for Model Factory starts at $5,000 per month for modest hardware (4 vCPU/8Gb RAM machine with 500 Gb fast storage).  Set-up tasks are covered by an implementation fee, typically around $10,000 to $20,000. Not every company will have the appetite for this sort of system, but those that do may find Model Factory a welcome addition to their marketing technology smorgasbord.

_____________________________________________________________________________

* For the youngsters: banks used to give away free toasters to attract new customers. This was back, oh, during the 1960’s. I wasn’t there but have heard the stories.

** The exact example provided by the company was: On a 6 vCPU, 64Gb RAM machine, building 500 models on between 20K and 178K records with up to 20,000 variables per record takes an average between 2 and 40 minutes to build each model and 30 seconds per model to score all records.  This hardware configuration would cost $12,750 per month.
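A back-of-envelope calculation on the company's example, as a rough sketch. It rests on two assumptions of mine that the company did not state: that models are built one at a time, and that all 500 are rebuilt once a month.

```python
MODELS = 500              # from the company's example
BUILD_MINUTES = (2, 40)   # fastest and slowest average build time per model
SCORE_SECONDS = 30        # scoring time per model
MONTHLY_COST = 12_750     # dollars per month for this hardware
HOURS_IN_MONTH = 30 * 24  # assumption: a 30-day month


def total_hours(build_minutes):
    """Hours to build and score every model once, assuming strictly sequential work."""
    build = MODELS * build_minutes / 60
    score = MODELS * SCORE_SECONDS / 3600
    return build + score


best, worst = (total_hours(m) for m in BUILD_MINUTES)
cost_per_model = MONTHLY_COST / MODELS  # assumes one rebuild of each model per month

print(f"{best:.1f} to {worst:.1f} hours of a {HOURS_IN_MONTH}-hour month")
print(f"${cost_per_model:.2f} per model")
```

Under those assumptions a full refresh takes roughly 21 to 338 hours, so even the slow end fits inside a month on this machine, at about $25.50 per model.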

Thursday, June 11, 2015

Highspot Sales Enablement Helps Sales People Find Content and Marketers Measure What Works

“Sales enablement” is something of a catch-all term for a wide range of solutions that help sales people do their jobs better. Highspot has staked out the corner of this world occupied by systems that help sales people find the right marketing materials. It grew out of the pain that co-founder Robert Wahbe felt while running marketing for Microsoft’s Server and Tools Division, where he found no good tools to help sales people and channel partners find the right materials when they needed them.


When Highspot was founded in 2012, it focused on better content discovery for sales people. But the firm soon learned that this wasn’t enough. It has now redefined its core mission as improving results by showing which content is working.  This is currently measured by tracking how often each item is used by sales people and read by recipients. The July release will supplement this with opportunity information from Salesforce.com CRM, allowing correlation of content usage with funnel stage conversions and revenue.

Highspot mostly does what you’d expect from this sort of system: it lets users load content and sales people, tracks who sends which content to which prospects, and reports on results. Users can set up collections (called “spots”) of materials for a particular product, sales team, funnel stage, region, or any other purpose. They can find content by looking in a spot, by filtering on sales stage, industry, product, and other attributes, or by doing an “intelligent semantic search” that recommends content based on past choices by the user and others. Users can view, download, bookmark or email the selected content or do a live pitch to a prospect. The system automatically adds pitches and emails to the prospect history in Salesforce. It can also track when a piece of emailed content is opened by the prospect, how long they kept it open, and which pages they viewed. A dashboard can highlight new and featured content. The system will also analyze the inventory of available content to find gaps or redundancies by sales stage, product, region, etc.

The operational details are all nicely executed, which is probably the most important consideration for a sales enablement system: if it's not easy, sales people won’t use it. But from a technology standpoint, what’s most interesting about Highspot is what the vendor calls “content genomics”. This uses machine learning to examine each piece of content – such as each slide in a Powerpoint deck – and identify properties including text, color, graphs, and images. Different pieces are then compared to find similarities and grouped into “content families”. This approach lets Highspot recognize when a piece has been modified and reused, for example by taking a slide from one deck and adding it to another with some reformatting along the way. Identifying these relationships gives a much more accurate understanding of how often each item is used and how well it is performing. Without this grouping, results from the system could be highly misleading.
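The grouping idea is easier to see in miniature. The sketch below is a deliberately crude stand-in, not Highspot's method: it compares slides on text alone using word overlap (Jaccard similarity) and merges anything above a cutoff into one family, whereas the vendor describes machine learning over text, color, graphs, and images. The function names, slide identifiers, and the 0.6 cutoff are my assumptions.

```python
def jaccard(a, b):
    """Word-set overlap between two pieces of text, from 0.0 to 1.0."""
    a, b = set(a.lower().split()), set(b.lower().split())
    return len(a & b) / len(a | b) if a | b else 1.0


def content_families(items, threshold=0.6):
    """Group item ids whose text is similar. items maps id -> text."""
    parent = {k: k for k in items}

    def find(x):
        # Follow parent links to the family root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    ids = list(items)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if jaccard(items[a], items[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two families

    families = {}
    for k in ids:
        families.setdefault(find(k), set()).add(k)
    return list(families.values())


slides = {
    "deck1/s3": "cut costs 30% with cloud backup",
    "deck2/s7": "cut costs 30% with our cloud backup",
    "deck2/s1": "quarterly roadmap and pricing",
}
families = content_families(slides)
```

Here the lightly edited slide in the second deck lands in the same family as the original, so its usage would be counted with the original's rather than as a separate, apparently little-used item.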

Highspot now has more than 100 paying customers. The system is now sold primarily to marketing departments as the system of record for marketing content. There’s a limited function Business Edition and a full function Enterprise Edition, which includes Salesforce integration. Pricing for the Enterprise Edition isn’t published but the vendor says that, once volume discounts are included, it is usually less than the $30 per month per user charged for the Business Edition.

Tuesday, June 02, 2015

Blueshift Offers a Simple B2C Customer Data Platform


[Note: this post is from 2015.  Click here for a newer post about BlueShift.]

It’s just over two years since I started writing about Customer Data Platforms. One thing that’s become clear since then is that only big companies will purchase a marketing database by itself. Everyone else wants to combine the database with a practical application. B2B CDPs have favored analytical applications like lead scores and churn predictions. B2C CDPs have often included campaign engines that manage triggers, query-based segmentations, and multi-step program flows in addition to predictive models. But even the B2C CDPs rely on external systems such as email agents and Web content managers to deliver the campaign messages.

Blueshift fits nicely into the B2C CDP mold: it builds a multisource database, incorporates machine learning-based predictive models, uses filters to create segments, and runs multi-step campaigns that are executed by external systems in email, SMS, mobile apps, and display and Facebook retargeting. What sets Blueshift apart – and this is typical of later entrants to a new market – are a lower price point and simpler operation than early B2C CDPs like RedPoint  and AgilOne.

How low? Pricing for the most basic version of Blueshift starts at $999 per month.  The most advanced version starts at $3,999 per month for all features and up to 1 million “active users” across all channels.  (The company says that most clients are in fact larger than one million users, with the largest at 100 million.)  The fact that prices are published is itself a mark of a later entrant.

How simple? Well, one measure is implementation time. Blueshift says it can be operational in as little as one day (if data is loaded through an existing Web page tag or push-button integration with Segment) or under two weeks if some work is required. Technically, this is plausible: the system has a JSON API that can accept pretty much anything and will put it into MongoDB and/or Postgres with minimal data modeling.

Another measure of simplicity is the campaign building interface.  Blueshift lets users specify a sequence of steps by filling out forms to define the segment, channel, and content template for each step and time between steps. This is nowhere near as pretty or flexible as graphical flow charts, but does qualify as simple.
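In data terms, a form-built campaign like this is just an ordered list of steps. The sketch below shows one way to represent it and work out when each step would fire. The `Step` fields mirror the four things the forms ask for, but the class, the `schedule` function, and the sample segment and template names are my own illustration, not Blueshift's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Step:
    segment: str      # who receives this step
    channel: str      # e.g. email or SMS
    template: str     # content template to use
    wait: timedelta   # time since the previous step


def schedule(steps, start):
    """Return (send_time, step) pairs, accumulating each step's wait in order."""
    when = start
    plan = []
    for step in steps:
        when += step.wait
        plan.append((when, step))
    return plan


campaign = [
    Step("new_signups", "email", "welcome", timedelta(0)),
    Step("new_signups", "sms", "reminder", timedelta(days=3)),
    Step("no_purchase_yet", "email", "offer", timedelta(days=7)),
]
plan = schedule(campaign, datetime(2015, 6, 2, 9, 0))
```

A flat list like this cannot express branches or loops, which is the flexibility a graphical flow chart adds, but it covers a straightforward drip sequence.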

Segments are also built using forms to define one or more filters.  Again, nothing fancy but it gets the job done. What’s more important is that the segments can use a wide range of data including online behaviors, attributes from CRM and other systems, predictive model scores, and product information from catalogs.  This is what gives the system its power.  Content templates do incorporate some visualization, as well as tokens for personalization and machine learning-based product recommendations. Split testing, ecommerce integration, and predictive models for activation, churn and repeat purchase are available in advanced versions of the system. Reports show model performance and attributes, segment counts, and campaign results using several basic attribution methods.
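To show why mixing data types in a filter matters, here is a small sketch of filter-based segmentation. A segment is a list of (field, operator, value) filters that must all match. The field names, the operator table, and the `in_segment` function are hypothetical and not taken from Blueshift.

```python
import operator

# Comparison operators a filter form might offer (assumed set).
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def in_segment(profile, filters):
    """True when the profile has every filtered field and passes every filter."""
    return all(
        field in profile and OPS[op](profile[field], value)
        for field, op, value in filters
    )


# One segment mixing a model score, a CRM attribute, and an online behavior.
at_risk_shoe_browsers = [
    ("churn_score", ">=", 0.7),
    ("loyalty_tier", "==", "gold"),
    ("last_category_viewed", "==", "shoes"),
]

profiles = [
    {"id": 1, "churn_score": 0.82, "loyalty_tier": "gold", "last_category_viewed": "shoes"},
    {"id": 2, "churn_score": 0.35, "loyalty_tier": "gold", "last_category_viewed": "shoes"},
    {"id": 3, "churn_score": 0.91, "loyalty_tier": "gold"},
]
members = [p["id"] for p in profiles if in_segment(p, at_risk_shoe_browsers)]
```

Only the first profile qualifies: the second fails the score filter and the third has no browsing data, so it is excluded rather than raising an error.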

So, apart from some missing bells and whistles, what doesn’t Blueshift do? The main limit is that it works only with known individuals (i.e., those reachable through an email or SMS address, app registration, or similar identifier) and primarily in outbound channels. This means that Web display ads, site personalization, and anonymous visitor targeting aren’t part of the mix, aside from retargeting. And, while data and models are updated continuously, the system isn’t designed to manage real-time interactions.

Blueshift was launched earlier this year.  It has more than ten clients, who are mostly multi-channel marketers with a majority of revenue from mobile payments.

In sum, Blueshift isn’t the fanciest marketing system available but it provides a solid mix of highly usable features at a reasonable price. B2C marketers will find it worth a look.