Saturday, October 10, 2009

Beautiful BABI: SiSense PrismCubed Offers Business Intelligence for Business Analysts

Summary: SiSense PrismCubed offers a reasonable option for a business-analyst business intelligence system. It’s probably a little harder to use than some competitors, but gives a bit more power and flexibility in return.

SiSense PrismCubed, officially launched this past August, is another member of the growing set of business intelligence systems aimed at empowering business analysts to build their own applications. I’ve also written about QlikView and Lyza, and think there are others.

What distinguishes these tools from other business intelligence systems is they let non-technical users manipulate source data in more sophisticated ways than a spreadsheet or report writer. Specifically, data from several sources can be merged on a common key, filtered, aggregated and processed through complex formulas.

This sort of manipulation has traditionally required SQL programmers, OLAP cube designers, or similar technical experts. Allowing business analysts to do it without having to learn deep technical skills is precisely what lets them build applications with minimal external assistance. (I say "minimal" because technical staff must still handle connections to the source data.)

These systems also provide report creation and distribution. But unlike business-analyst data manipulation, those capabilities are also found in other business intelligence products.

You’ll note that my definition does NOT specify a particular database technology, such as in-memory or columnar, that the data is updated in real time, that the system is targeted at mid-sized businesses, or that results are distributed pervasively through the organization. Those have all been proposed as ways to classify business intelligence systems, and several of the products in my business-analyst business intelligence (BABI--how cute!) category fall into one or another such group. But I think it’s a mistake to focus on those other features because they don’t get at real value provided by these tools, which is the flowering of applications made possible when business analysts can create them independently.

Now that I've defined a new type of application, complete with the all-important acronym, the next step is defining an evaluation framework to help compare the competitors. I’d like to claim I do this through deep research and brilliant insights into user needs, but, in fact, I generally start with the features in the existing systems. This runs the risk of missing some critical requirement that no vendor has yet uncovered, but it saves a ton of work. And I can still argue that I’m piggybacking on the vendors’ own deep research and insights as embodied in their products.

In any event, a starter set of review criteria for BABI systems (sorry, but I find the acronym irresistible) would include:

- combine data from multiple, heterogeneous sources (relational databases, CSV files, Excel tables, etc.)

- allow non-technical users to define processing flows to manipulate the data (merge, filter, aggregate, calculate)

- present the manipulated data in a structure that’s suitable for reporting and visualization

- allow non-technical users to create applications including reports, visualizations, and (optionally) additional functions such as data refresh and export

- allow other users to view (and optionally interact with) the applications

- meet reasonable performance standards for data load, storage, response time and scalability

- use appropriate technology (actually, I don’t care if the thing is powered by hamsters. But understanding the underlying technology helps to predict where problems might arise.)

- affordable pricing (not exactly a criterion, but important nevertheless)

Obviously these criteria could be much more detailed, and no doubt they will grow over time. But for now, they provide a useful way to look at PrismCubed.

1. Combine data from multiple sources: PrismCubed provides a wizard to connect with different data sources, including SQL Server, Oracle, CSV files, Excel and Amazon S3 logs (which earns them extra coolness points). The system can read the database schemas directly, saving users the need to define basic data structures. Users have the option modify structures if they desire. A connection can be live (i.e., the source is requeried each time a report is run) or reloaded on demand from within a completed application. This provides real-time data access, which isn’t always available in business intelligence systems. The system can also reload data automatically on a user-specified schedule.

2. Allow non-technical users to manipulate source data: PrismCubed does a particularly good job here. At a basic level, users can write complex formulas to add derived fields to a table during the import process. More important, a drag-and-drop interface lets them build complex visual processing flows from standard icons including data definition, filtering, inclusion or exclusion, unions, and top or bottom selects. These flows can combine multiple data sources and include branches that generate separate output sets that are all available to use in applications.

3. Present the manipulated data for reporting: the system automatically classifies input data as dimensions (text, dates, etc.) and measures (numbers which can be aggregated). Users can override the system’s assignments and can add new dimension fields during the data load. They can create derived measures at any time. Once the load is complete, the system presents the dimensions and measures in an “ElastiCube” available for reports and other applications.

4. Create reports and other applications: the system provides a remarkably rich development environment. Users build applications by dropping different types of objects (which the vendor calls widgets) onto dashboard pages. Widgets can make selections; display data in pivot tables, charts, calendars, and images; and execute actions including refresh data, jump to different pages, query external data sources, edit data, and export to Excel. A dashboard can have multiple pages.

The primary reporting widget is the pivot table, which itself is built by dragging dimensions into rows and columns, and the measures into cell values. Users can apply filters to widgets, such as selecting the top 10 values for a dimension. These filters can be static (a fixed list) or dynamic (reselected each time the dashboard is updated). PrismCubed also provides special features for time series calculations such as period-to-period growth and differences. That's a nice touch, because those can be quite difficult to define with conventional reporting systems.

Reporting widgets can be connected to the ElastiCube dimensions and measures or directly to SQL data sources. Users can also specify whether selections made in one widget affect the data displayed in other widgets. There are actually three options here, including complete independence, direct links from one widget to another, and global impact on other widgets. This gives more flexibility than systems that automatically apply global selections, but does force users to do more work in specifying which approach they want.

Widgets, filters and other components can be stored in a central repository and reused across applications.

5. Share applications: Users can export dashboard contents to Excel tables or can copy an entire dashboard as a static PDF. Applications, including underlying ElastiCubes, can be copied and run on another user’s PC. In addition, a Web server due for release this month (October) will let dashboard creators publish their dashboards to a central server, where other users will be able to access and modify them. The server will provide fine-grained control over what different users are allowed to change.

6. Scalability and Performance: SiSense has tested the PrismCubed engine on multiple terabytes of data. It cited one client who loaded 30 million telephone call detail records in 30 to 90 minutes. Loaded data usually takes somewhat less disk space than the original source. The system currently requires a complete reload to add new data to an existing ElastiCube, although the vendor plans to add incremental appends by the end of November. Once the data is loaded, reports within applications usually update in seconds.

7. Technology: PrismCubed stores data in a columnar data structure. It also stores a dimension map for each column, but doesn’t preaggregate the data along the dimensions. As with other columnar databases, this avoids the need for specialized data structures to handle particular queries. When data has not been preloaded into the system, PrismCubed can also run the same query across multiple external data sources.

Although PrismCubed stores the entire ElastiCube on disk, it only loads into memory the columns required for a particular query. This lets it can handle larger data sets than purely in-memory systems without massive hardware. There might be some problems if the selected columns for a query exceeded the system’s available memory.

PrismCubed runs on Windows PCs with the .NET framework installed. On 64 bit systems, this means the amount of potential memory is virtually unlimited. Although PrismCubed itself is new, a previous version of the product using the ElastiCube database engine was launched in September 2008 and has more than 6,000 users.

8. Pricing: PrismCubed is priced on an annual subscription basis, which is unusual for this type of product but common among hosted BI vendors. SisSense offers several versions of PrismCubed, ranging from a free Viewer that can only access dashboards created elsewhere, to a $1,500 per year Professional edition that allows full creation of dashboards and ElastiCubes. There are also a free version (limited to 2,000 rows of data), a $300 per year Personal edition (which can create dashboards but not share them), and a $700 per year Analyzer that can build and share dashboards but not ElastiCubes. Server pricing wasn’t quite set when I spoke with SiSense but will probably be around $3,500 per year per server.

These prices are quite reasonable compared with similar vendors, even considering the recurring annual subscription fees, particularly because the end-user Viewer is free. Price details are published on the vendor’s Web site.

10 comments:

Unknown said...

Dave, I'm curious why they don't support a web-based interface. Why have a thick footprint Windows client these days?
Thanks.

David Raab said...

Hi Jerome. Good question. I'd guess that they just haven't gotten around to adding a Web-based client, given that their original product was a desktop system and the Web server is brand new. But I've forwarded the question to SiSense so hopefully they'll answer directly.

Elad Israeli said...

Hi guys,

We will be making our zero-footprint web server publicly available in the upcoming weeks (it is currently under a private beta at select customers).

This product will allow publishing of BI apps created with PrismCubed to be seamlessly deployed as interactive zero-footprint solutions.

As far as the authoring (="development") environment, we believe that the desktop is more suitable for this, considering the type of features and functionality a typical person needs when building BI apps from scratch (ETL, development, etc).

Finally, you'd be surprised how many of the BI solutions out there are not implemented as web solutions. It really depends on what you need, your budget, how many users would be using it, firewall issues, etc. But, obviously we agree with you and that is why we are releasing this new functionality in the immediate future :-)

David, thank you for the professional and thorough report.

Unknown said...

Elad,

Actually I know you're right about client-side BI applications being fairly common out there. I guess my confusion then is: where does the actual data reside. I imagine it has to be server-side if you're in the TB game right?
Also I would venture to guess that economically, web-based (hosted) solutions would be cheaper and quicker to bootstrap. I don't see where the firewall issues are on those though. From a UX perspective, I'd be interested in understanding what exactly can be done in a client app that cannot be accomplished as nicely in RIA these days.
Thanks.
J.

Elad Israeli said...

Jerome,

We separate between (roughly) three types of users:

1. Developers: These types of users usually handle the data preparation side (consolidation, cleansing, centralization, etc). This is done (and stored) using our ElastiCube feature that can be set up either locally (on a desktop) on a server to service multiple user requests.

2. Power Users: Users who use power tools for advanced analytics, reporting and publishing. They use the PrismCubed environment to perform ad-hoc analysis, create reports or establish "starting point" applications for business users.

3. Business Users: These types of users would usually prefer to use "canned" applications (in contract to free form applications) that serve a business case. An example of existing applications like this is Google Analytics.

For the first type of user (and in many cases the second type as well), using a web-based (zero footprint) environment to do this would be similar to using a web-based Visual Studio - cumbersome and unproductive.

This is not true for the third type of user that simple wants to work within the confines of a pre-package solution without too many irrelevant features to distract or complicate.

Web based solutions have a few critical advantages:

1. Easy deployment (no installation and no firewall problems due to unstandard ports and protocols).

2. They accessible from anywhere (business users typically need to see BI reports/dashboards on the road or from multiple machines without having to install anything.

3. They are platform independent.

Please note that we do not currently provide "hosted services" and maybe this is where you confusion is.

We provide a server that can be installed on premise or on a cloud. Both the data and the web server resides on a server machine. Some or our partners and OEMs use this server to provide hosted solutions to their own customers but this is not our focus as a company. We are a platform provider, and it has yet to be proven that generic hosted BI platforms are practical (actually, I tend to believe the opposite). For vertical solutions, sure. Not for platforms.

So, PrismCubed provides a server that handles data storage and access as well as access to web apps created with the PrismCubed environment.

I hope that cleared things up?

Elad

Unknown said...

Yes it did thank you. Although I would disagree with your assertion that a web-based IDE would be cumbersome and unproductive for user type 1 and 2. :)

Elad Israeli said...

Fair enough :-)

Although, take Excel and Google Spreadsheets for example. I once read somewhere that the main difference between GoogSheets and Excel is the amount of data a sheet can hold (Google's is ridiculously small).

I say to that - even if GoogSheets could handle the same amount of data as Excel (not gonna happen soon) the main difference is simply in the little things that save you 2 seconds each time you do them. Drag and drop is faster. Browsing through a big tree is faster. Copy/Paste is faster. Scrolling is smoother. The browser doesn't crash because one of its plug-ins misbehaved. Etc.

I give a lot of credit to Google's web devs and that just proves my point. No accountant would use GoogSheets as it is, even if it had unlimited sized sheets. And even if it's free.

On the other hand, Face Book would have billions of users if it was desktop based.

In our market (platform BI) we try to use what's best for each user segment, as we see them - powerful desktop-based dev tools and easily deployable web apps for the actual business applications.

Elad Israeli said...

"On the other hand, Face Book WOULDN'T have billions of users if it was desktop based"

Unknown said...

You make a good point. Clearly Excel has 85% market penetration anyway so the numbers do support your argument.

On the data sizes, Gemini they did a DIM (direct in-memory) implementation as well. I don't know that CPAs wouldn't use GoogleSheets based on data volumes though. They do use Intuit's Quicken Online and I don't know how large a dataset that supports but it can't be that huge. Also accounting data is not that large typically IMO (if you're talking about Hyperion users that's a different ballgame).

It is nice to have a fat client for people who prefer that for sure. However the costs of deployment/maintenance on these is really high (even with ClickOnce types of updates) and it's platform-tied and it's not firewall friendly etc. Even with thin clients you don't get the user mining advantages SaaS brings to you.

Also IMHO with a fat client, the users can't do work from home unless they install the bits on their personal boxes which has all sorts of nasty implications (not the least of which is security, from IT's perspective). With a web browser, I can have my people working anywhere, anytime. That's valuable. What you might lose in performance is greatly made up for with convenience. And of course, one man's slow app is another's blazing fast application :) If you have a tree that's so huge as to impact node selection, I would suggest you may have implementation issues on the caching/abstraction layers. In these situations, less is more. Ditto on D&D. In either case I would think performance is a backend issue more than UX-related (at least in my experience). And performance is a subjective issue too.

Thanks for the opportunity to share some interesting thoughts!

Elad Israeli said...

Well, I guess we're both right ;-)

Thank you as well!