Jun 21, 2012

Are Relational Databases Dead?

I met with a major vendor technology evangelist recently (actually he called himself a technology space regional CTO) on the topic of noSQL and big data. 

Now I’ll admit, like any technologist and architect, I have my areas of specialty and areas of less specific knowledge.  And as the IT industry continues to develop new technologies and approaches yearly, it’s challenging to stay up to date.

So while I know relational databases well, can write good SQL and understand architecturally when to use database power functions like triggers, store procedures and the like, as well as database implementation level issues such as clustering, failover, performance, etc, I had not been paying attention to recent industry changes such as noSQL, graph databases and big data.

This tech evangelist presented the concept of a particular noSQL tool, emphatically stating that the day of the relational database was over.  Relational databases are dead as a future technology.  He even went so far as to predict a major decline in RDBMS use (in new projects) over the next 2 years.

(That’s how you know he’s an evangelist.  Even if a wonderful new tools that makes an older tech completely replaceable with a much better approach arrives, it still takes time to propagate across the industry.)

A recent project brought be face to face with noSQL type challenges.  Businesses are demanding more and more data relationships, more interconnectivity between objects, entities, data elements.  Suddenly businesses expect to build Facebook-like relationships into various parts of their business data, and expect to be able to find and present to users, data by walking the relationship chain.

Lets take an example:

A customer sends an email about a problem with his account.  The business wants to manage the customer relationship, so a link to the email object needs to be placed in the CRM system, as does an entry of what his requests were.  His requests set off a series of business processes to handle them, and each process wants to link to the entity (the original request in the CRM system) that triggered the process.  This allows the process to report “I was started due to…”.  But the inverse is also desired, the ability for the CRM system to look at the request and see the processes started (possibly in other systems) due to the request – so they can report back to the customer (if he calls or looks up the status of his request on a customer web portal) “we did or are doing …… because of your request”.

This bi-directional many-to-many relationship structure would be complicated and tricky to implement in a traditional relational database structure.  (It could be done, but would not maintain any referential integrity.)  But that assumes operating in 1 single database and one single database schema.

What happens when these relationships need to be maintained across modules, across applications and across systems?  When it’s the CRM system and a customer request that’s invoking actions in the Sales system and Billing system?

In other words, what happens when our data relationships are cross platform?

In the past we just threw customer keys at each other.  But when the points of the data relationships move to the tens or more, throwing around keys and indirect relationships is no longer viable.

This is where the noSQL (not only SQL) tools step in.  They are all about building and walking data relationships, and dynamically building data content and data access paths.

So much of our time with relational databases is spent on keys and indices (access paths) and relationships, especially when those relationships or access paths grow beyond 2 or 3.  The recent growth of the JPA Java standard is an attempt to partially resolve this problem (let Java generate it for you).

noSQL graph databases are a full resolution of this problem.

If you haven’t looked into them and you’re into architecture, system integration and/or database design, I suggest it’s time to do so. 

There is one failing though.  Relational databases are very good at what they do, and very well understood for it and it would seem overkill to try to supplant them with noSQL tools in their primary areas of strength.  Yet at the moment there are few access tools that offer a combined RDBMS and Graph-DB approach, where I could build a single query, traverse a graph node path to a relational table row.

Whether RDBMS has a future or not, I don’t know.  But the noSQL and graph database approach is clearly worth looking into for almost any new project.