Categorize or Search?

SOA design time governance products come with a variety of methods to categorize the services or assets.  Interestingly, I’m currently working on a project involving a document management system, and the selected document management tool comes with an almost identical selection of categorization methods.  These include trees, taxonomies, and domains, among others.

In some of the earlier SOA design time governance implementations I performed, we spent significant time working on the categorizations – trying to make the catalog and information easy to traverse for the various user categories that would encounter it.  (In the case of design time governance, this might be analysts, architects, developers, QA, and IT management.)

We invariably found that designing the categories and approaches took a tremendous amount of time, with every constituency having different ideas and requesting various adjustments to the approach.  To some extent it became the never-ending quest for the perfect structure, never to be found.

We saw this same pattern emerge and fail on the World Wide Web.  Those with a little Internet history behind them may remember the early Internet indexing and search sites, such as Lycos, AltaVista, Netscape and Yahoo.  Each presented its own approach to indexing, categorizing and presenting the Internet in various taxonomies.

One day a newcomer named Google arrived, presenting one simple function…search.  Within a short time all the indexing and taxonomy sites were dead.

The way we do things is limited by our technology.  When we’re doing things manually, our technology may be bookshelves or filing cabinets, paper index cards and human retrieval methods – or even human memory capacity.  As we automate processes with newer technology, it’s perfectly normal to take the previous process and “enhance” it with the new technology – but the previous process still exists.

Once we truly understand the possibilities and capabilities of the new technology, we often supersede the previous process.  In the case of cataloging our information, our models came from books, libraries, index cards, filing cabinets, etc.  Even if we were storing our information in high speed relational databases, our access models were based on our previous process – catalogs, sorted indexes, grouped information.

Yet Google has shown us that this model is significantly less efficient, and of less value, than a capable search.

When we’re modeling today’s software abilities, user interfaces and data organization approaches, search should be the first and primary approach.  Categorization is just pushing the old, much less efficient approach forward.
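
As a rough illustration of what "search first" can mean for a service catalog, here is a minimal Python sketch of a keyword index built straight from free-text descriptions.  It is not taken from any governance or document management product; the catalog entries, the IDs and the function names are all invented for the example, and a real tool would add ranking, stemming and much more.

```python
import re
from collections import defaultdict

# Hypothetical catalog: asset IDs and descriptions are invented for illustration.
CATALOG = {
    "svc-001": "Customer billing service that generates monthly invoices",
    "svc-002": "Customer address lookup and validation service",
    "svc-003": "Invoice archive for the document management system",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(catalog):
    """Map every word to the set of asset IDs whose description contains it."""
    index = defaultdict(set)
    for asset_id, description in catalog.items():
        for token in tokenize(description):
            index[token].add(asset_id)
    return index


def search(index, query):
    """Return the sorted IDs of assets that contain every word in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(matches)


INDEX = build_index(CATALOG)
print(search(INDEX, "customer service"))
```

Nobody had to agree on a tree or a taxonomy before the catalog became findable: the index is derived from the descriptions themselves, and adding an asset only means indexing its text.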
