In existing IT systems, services tend to get created when an IT project requires access to a function or a bit of data from another system. The need is for a specific, narrow bit of functionality or a narrow data set. The internal function of the exposing system usually offers wider functionality or data.
Example: System B needs customer data from System A, specifically the name and address. System A’s internal customer query function retrieves the customer’s name and address, demographic data, and a list of related customer accounts.
System B’s project manager comes and says, "I need the customer name and address, please expose it." System A exposes it, matching exactly what System B needs rather than what System A can actually do. A year later System C needs customer data plus demographic data. Either the service is changed to match System C, requiring System B to retest as well, or a new service is created. In the first case testing costs go up; in the second, reuse is completely lost and service maintenance costs go up.
The rule is: to maximize reuse potential, a function being exposed should expose its full function set and full data set.
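The rule can be sketched in Python. This is a minimal illustration, not a real system: the `Customer` record, the in-memory store, and the function names `get_customer`, `system_b_view`, and `system_c_view` are all hypothetical. The service returns everything System A’s internal query already retrieves, and each consumer picks out only the fields it needs.

```python
from dataclasses import dataclass, field, asdict


# Hypothetical stand-in for the record System A's internal query retrieves.
@dataclass
class Customer:
    customer_id: str
    name: str
    address: str
    demographics: dict = field(default_factory=dict)
    account_ids: list = field(default_factory=list)


# Hypothetical in-memory store standing in for System A's database.
_CUSTOMERS = {
    "c-1": Customer(
        "c-1", "Pat Example", "1 Main St",
        {"age_band": "30-39"}, ["a-10", "a-11"],
    ),
}


def get_customer(customer_id: str) -> dict:
    """Exposed service: returns the full data set, not a per-consumer subset."""
    return asdict(_CUSTOMERS[customer_id])


def system_b_view(customer_id: str) -> dict:
    """System B uses only name and address and ignores the rest."""
    full = get_customer(customer_id)
    return {"name": full["name"], "address": full["address"]}


def system_c_view(customer_id: str) -> dict:
    """System C arrives a year later and also reads the demographic data."""
    full = get_customer(customer_id)
    return {
        "name": full["name"],
        "address": full["address"],
        "demographics": full["demographics"],
    }
```

In this sketch, adding `system_c_view` requires no change to `get_customer`, so System B has nothing to retest.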
There are 4 arguments made against this...
#1, it makes the response larger (often much larger), causing lots of extra network traffic. Our network will overload.
Response #1 – Internal network bandwidth is cheap and isn’t overloaded. Most backbone networks inside data centers are running at 1 Gbit/s or even 10 Gbit/s. The days of 10 Mbit/s networks are in the far past. Programming and testing time is the major expense in almost every IT shop.
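A rough back-of-the-envelope check of that claim, as a Python sketch. The payload sizes (2 KB for a narrow response, 20 KB for a full one) are made-up assumptions for illustration, and the calculation ignores latency and protocol overhead.

```python
def transfer_seconds(payload_bytes: int, link_bits_per_second: float) -> float:
    """Ideal time to push a payload onto a link (no latency, no overhead)."""
    return payload_bytes * 8 / link_bits_per_second


GBIT = 1_000_000_000  # 1 Gbit/s backbone, as mentioned in the text

# Hypothetical response sizes, chosen only for illustration.
narrow = transfer_seconds(2_000, GBIT)   # narrow response: name and address
full = transfer_seconds(20_000, GBIT)    # full response: everything System A has

extra_microseconds = (full - narrow) * 1_000_000
print(f"extra time per call: {extra_microseconds:.0f} microseconds")
```

Under these assumed sizes, a response ten times larger adds well under a millisecond per call on a 1 Gbit/s link.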
#2, processing the larger response takes extra CPU time. An extension of this complaint is that processing XML at all (rather than flat fixed-length data or a unique serialized data object) takes significant CPU time to parse and objectify.
Response #2 – XML processing tools have become orders of magnitude more efficient in the last few years, and every few years CPU power has doubled. Further, CPUs are now shipping as dual-core or quad-core as standard, meaning their parallel processing capability has multiplied by up to 4x. Net net, extra CPU power is cheap, extra manpower is expensive.
#3, logging these larger-than-needed-right-now transactions takes lots of extra disk space.
Response #3 – I just bought 1 terabyte of disk for my home for $100. That’s the cost of 1 hour of programming time in the U.S.
#4, security. If more is being exposed than needed, it’s a security risk.
Response #4 – Leave security to the SOA security layer, the ESB, and associated tools, which can handle it without development and coding time.
In essence, none of these arguments hold any water in the current IT environment, though they did 10 years ago. In the current environment people are the primary cost: programming, coding, and testing time present a large bill, and maintenance and modification efforts absorb a major part of the budget. The actual hardware is not a major cost, and with virtualization, hosting, and cloud computing it just continues to drop.
What is a major cost from the service perspective is having to touch a service later. The MORE a service is reused, the HIGHER the cost to touch it later, as either all the connected systems retest or a clone of the service is made, resulting in a permanent doubling of the maintenance cost.
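The trade-off can be put into a toy cost model. This is only a sketch: the function names, parameters, and cost units are hypothetical, and real projects would have many more factors.

```python
def cost_of_change(consumers: int, retest_cost: float, change_cost: float) -> float:
    """Modify the shared service in place: every existing consumer must retest."""
    return change_cost + consumers * retest_cost


def cost_of_clone(years: int, build_cost: float, yearly_maintenance: float) -> float:
    """Clone the service instead: one more service to maintain, every year."""
    return build_cost + years * yearly_maintenance


# Hypothetical units: the same change costs more as reuse grows.
print(cost_of_change(consumers=1, retest_cost=5, change_cost=10))   # 15
print(cost_of_change(consumers=10, retest_cost=5, change_cost=10))  # 60
print(cost_of_clone(years=5, build_cost=10, yearly_maintenance=8))  # 50
```

In this model the in-place change grows linearly with the number of consumers, while the clone keeps accruing maintenance for as long as it lives. Both are avoided when the service exposes its full data set from the start.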