
Instant Realtime BI with SOA

(Figure: BI diagram)

BI – Business Intelligence – has taken hold at almost every mid-size or larger IT organization.

It commonly means extracting key data elements from all the main systems and databases in the organization and compiling them together in a Business Intelligence data warehouse.  The primary method for doing this is ETL – extract, transform, and load – which basically means batch-style data loads performed daily, weekly, or monthly from the source systems to the data warehouse.
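The batch ETL cycle can be sketched in a few lines. This is a minimal illustration, not a real warehouse loader; the table and column names (`orders`, `fact_orders`, `amount_cents`) are hypothetical, and SQLite stands in for the source system and warehouse:

```python
import sqlite3

# Minimal batch-ETL sketch: extract rows from a "source system", transform
# them into the warehouse's model, and load them in one batch -- the
# daily/weekly cycle described above. All names are illustrative.

def run_etl(source, warehouse):
    # Extract: pull raw orders from the operational system.
    rows = source.execute(
        "SELECT order_id, cust_name, amount_cents FROM orders").fetchall()
    # Transform: map the source format to the warehouse's preferred format
    # (normalize names, convert cents to a decimal amount).
    transformed = [(oid, name.strip().upper(), cents / 100.0)
                   for oid, name, cents in rows]
    # Load: bulk-insert into the warehouse fact table.
    warehouse.executemany(
        "INSERT INTO fact_orders (order_id, customer, amount) VALUES (?, ?, ?)",
        transformed)
    warehouse.commit()
    return len(transformed)

source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE orders (order_id INTEGER, cust_name TEXT, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                   [(1, " alice ", 1999), (2, "bob", 250)])
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_orders (order_id INTEGER, customer TEXT, amount REAL)")
loaded = run_etl(source, warehouse)
print(loaded)  # 2
```

The point of the sketch is the shape of the job: it runs on a schedule, reads everything since the last run, and the business only sees the result after the batch completes.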

Setting it up is expensive and time-consuming, as it requires building a large-capacity database and an ETL process for every important data source in the company.  The ETL processes by themselves are often not enough: data duplication and data quality problems quickly float to the surface and have to be resolved to a sufficient level to continue (resolved in the data warehouse, not in the data sources).

However, it’s relatively easy to demonstrate the business value of the resulting Business Intelligence center, as that cross-system data provides business process statistics, results, and reporting that none of the systems can produce standing alone.  Further, intensive data mining can be performed that can’t be applied against the source systems (either because they’re operating live and can’t handle that depth of data access, or because the value emerges from the combinations across systems and complete business processes).

Most businesses that implement BI consider it a win and see their ROI.

Now, as I’ve been implementing a new enterprise SOA strategy at a major customer, the BI team arrived at an architecture strategy presentation meeting and asked, “OK, how is this (an enterprise SOA strategy including common entities and processes) different from BI?”

In essence they’re pointing out that they have already integrated with everything (or at least every important data source), they’ve already translated all those disparate data models into a single enterprise model (the structure of their data warehouse), and they’ve already handled the cross-system conflicts of object models and differing representations of the same data.

The primary differences between BI ETL integrations and SOA integrations are small.  One, BI is a timed, infrequent (daily or less) data extract; SOA is a real time integration.  Two, BI tends to work in a true ETL model: extract the data in the source system’s format, transform it to the data warehouse’s preferred format, and load it into the data warehouse.  SOA, once it moves beyond being a point-to-point connectivity technology, performs several transformations: first from the source format to XML, second from the source data model to a company or industry standard, and third – in legacy situations – from the company standard to the destination system’s requirements (though this last need fades with time once a company standard is adopted).
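The three-step SOA transformation chain can be sketched as plain functions. This is an illustration of the concept only; the record fields (`cno`, `nm`), the canonical entity names, and the fixed-width legacy layout are all hypothetical:

```python
import xml.etree.ElementTree as ET

# Sketch of the SOA transformation chain described above (all field and
# entity names are hypothetical): source record -> XML -> canonical
# company-standard model -> legacy destination format.

def source_to_xml(record: dict) -> str:
    # Step 1: wrap the source system's native record in XML.
    root = ET.Element("custRec")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")

def xml_to_canonical(xml_text: str) -> dict:
    # Step 2: map the source data model onto the company-standard entity.
    root = ET.fromstring(xml_text)
    return {
        "customerId": root.findtext("cno"),
        "customerName": root.findtext("nm"),
    }

def canonical_to_legacy(canonical: dict) -> str:
    # Step 3 (legacy only): reshape the standard entity into a destination
    # system's fixed-width layout; this step fades as systems adopt the standard.
    return f"{canonical['customerId']:>8}{canonical['customerName']:<20}"

record = {"cno": "42", "nm": "ACME Corp"}
legacy = canonical_to_legacy(xml_to_canonical(source_to_xml(record)))
print(repr(legacy))
```

Each step is independent, which is exactly why the third one can be dropped per destination as systems converge on the canonical model.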

But conceptually BI and SOA integrations are doing much of the same thing!

Now BI is being faced with a new requirement…realtime.  Companies are seeing such a value in BI that they’re asking why they can’t get the reports and analyses RIGHT NOW (rather than tomorrow or next week).

The SOA vendors offer a solution to part of this desire with their BAM – Business Activity Monitoring – tools.  These tools (examples include IBM WebSphere Business Monitor and Software AG webMethods Optimize) monitor data elements passing through web service requests and use them to build a real time image of what’s happening – an image that can be identical to the BI image generated 24 hours later through summarized data reporting.
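The idea behind these tools can be sketched without any vendor product: tap each service request as it passes through and keep running aggregates, so the picture is available immediately rather than in tomorrow's batch report. The service name and payload fields below are hypothetical:

```python
from collections import defaultdict

# Illustrative BAM-style monitor (not vendor code): observe the data
# elements flowing through service requests and maintain running totals --
# the realtime "image" a dashboard could render right now.

class ActivityMonitor:
    def __init__(self):
        self.counts = defaultdict(int)
        self.totals = defaultdict(float)

    def observe(self, service: str, payload: dict) -> None:
        # Called for every service request as it passes through the bus.
        self.counts[service] += 1
        self.totals[service] += payload.get("amount", 0.0)

    def snapshot(self) -> dict:
        # The current realtime picture, aggregated per service.
        return {s: {"requests": self.counts[s], "amount": self.totals[s]}
                for s in self.counts}

monitor = ActivityMonitor()
monitor.observe("submitOrder", {"amount": 100.0})
monitor.observe("submitOrder", {"amount": 50.0})
print(monitor.snapshot())  # {'submitOrder': {'requests': 2, 'amount': 150.0}}
```

Note the monitor never touches the source systems or the warehouse; it only watches traffic already flowing through the integration layer.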

However, the integration teams and Integration Competency Centers have generally been unsuccessful at selling this capability to their business users.  This isn’t because it doesn’t solve the problem, but because the business users of BI and of integration are different.  BI is actually used by key business users.  Integration generally serves IT executives as its “customer”, and therefore BAM doesn’t fit in with the integration team’s normal “offerings”.

Realtime BI has become one of the “hot” IT topics for 2011.  There are two relatively easy solutions to help BI become realtime, though they’re somewhat problematic politically, as they cut across the current organizational structures at many IT shops.

Solution #1 – Get SOA BAM tools, but give them to the BI team to use as part of its offerings to its data-hungry business customers.

Solution #2 – IT shops with a good library of existing connections and services can begin to echo certain update services directly to the data warehouse and the BI team for realtime handling.  (Realtime in this case means sent to a queue; the realtime update processes can’t stop and wait for the data warehouse to process the updates.)
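The echo pattern in Solution #2 can be sketched with an in-process queue standing in for the messaging layer. The service, field names, and the commented-out `apply_to_warehouse` loader are all hypothetical:

```python
import json
import queue

# Sketch of Solution #2: an existing update service "echoes" each update
# onto a queue for the BI team, and never blocks on the warehouse.
# All names here are illustrative; a real shop would use its messaging
# infrastructure rather than an in-process queue.

warehouse_queue: "queue.Queue[str]" = queue.Queue()

def handle_customer_update(update: dict) -> None:
    # ... the normal realtime update work happens here first ...
    # Echo: enqueue a copy and return immediately -- the realtime path
    # never stops to wait for the data warehouse to process it.
    warehouse_queue.put(json.dumps(update))

def drain_to_warehouse() -> int:
    # The BI side drains the queue at its own pace (e.g. a polling loader).
    applied = 0
    while not warehouse_queue.empty():
        record = json.loads(warehouse_queue.get())
        # apply_to_warehouse(record)  # hypothetical warehouse loader
        applied += 1
    return applied

handle_customer_update({"customerId": 42, "status": "GOLD"})
handle_customer_update({"customerId": 7, "status": "SILVER"})
applied = drain_to_warehouse()
print(applied)  # 2
```

The queue is what makes this safe politically and operationally: the update service's latency is untouched, and the warehouse side can fall behind without breaking anything.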

In both cases the BI team begins leveraging the integration team’s infrastructure.  However, the connection can bring BI the realtime capability it needs with minimal effort.

Other models, such as Change Data Capture utilities, are great for increasing vendor software sales (and they do work and are impressive tools) but aren’t necessary… IF we can get two disciplines with somewhat different historical goals to begin working together.

Those that do will get a big and relatively inexpensive win.
