
Serverless - Unintentional Granularity Problems

 

Serverless is great, but serverless functions rarely run by themselves... they're usually connected to data sources - and frequently that means SQL databases.

In the graphic attached, my cloud SQL database is collapsing under assault by a serverless function that's scaling up instances to meet demand.  Here's how we unintentionally created this problem:

To speed things up with parallel processing, we took a transactional billing file, broke it into groups of 30 transactions, loaded those into messages, and queued them.  A serverless function listens for these messages, and if messages are still waiting after a certain amount of time, another instance is spawned to begin handling the backlog.
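The batching step above can be sketched in a few lines. This is a minimal illustration, not the team's actual code; `send_message` is a hypothetical stand-in for whatever queue client is in use (for example, an SQS send call):

```python
import json

def chunk_transactions(transactions, batch_size=30):
    """Split a list of billing transactions into fixed-size batches."""
    for i in range(0, len(transactions), batch_size):
        yield transactions[i:i + batch_size]

def enqueue_batches(transactions, send_message, batch_size=30):
    """Serialize each batch as a message body and hand it to a queue client.

    `send_message` is a placeholder for the real queue API; it only
    needs to accept a string message body.  Returns the message count.
    """
    count = 0
    for batch in chunk_transactions(transactions, batch_size):
        send_message(json.dumps(batch))
        count += 1
    return count
```

With 100 transactions and a batch size of 30, this enqueues four messages: three full batches and one of 10.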

When we sent it thousands of transactions, this worked great, and the business users were suitably impressed with the processing speed (which dropped from many hours to a few minutes).  When they began sending larger transaction sets, in the tens of thousands, the processing speed remained great but the single-instance database CPU began spiking.  No problem - it's the cloud, so size up the database instance (though don't ignore the very significant cost jump of moving from small instances to medium or large ones).

But when it was sent hundreds of thousands of transactions, the single-instance database hit capacity.  [ This is a common traditional relational database problem -- you hit the capacity of CPU, memory, or IOPS, and even if you size up your database you can only push so much volume through it.  The limit can be stretched by sizing up CPU and memory if the SQL is complicated or non-optimized, and on cloud databases by provisioning more IOPS if it's poorly indexed or doing huge reads and writes, but this quickly gets very expensive and still has its 'not too big' limits. There are some tricks, such as creating read replicas, that can extend the ability to use a traditional SQL DB model. But that's not the focus of this article.]

Serverless functions are "throttle"-able (in AWS this is called concurrency), concurrency is settable per serverless function, and this is the traditional way to handle this sort of problem. (It limits performance, but there are always trade-offs.)  Which brings us to the point of this article...
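In AWS, for example, a per-function cap can be applied with the CLI's reserved-concurrency setting. A sketch of what that looks like - the function name here is hypothetical:

```shell
# Cap a hypothetical billing processor at 10 concurrent instances.
# Messages beyond that capacity wait in the queue instead of
# spawning more instances (and more database connections).
aws lambda put-function-concurrency \
    --function-name billing-processor \
    --reserved-concurrent-executions 10
```

Note this is account-level configuration, not application code; the same setting is available through the console and infrastructure-as-code tools.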

Concurrency is AT THE LEVEL of the (serverless) function.  If, for example, your function contains code that handles many different processing situations or transaction types, then because they are combined in a single deployed function, they can only be throttled at the level of the combined functionality. (If you wanted to allow up to 100 instances for transaction type one, but limit a less time-sensitive transaction type two to 10, you can't segment it: the control sits at the level of the overall deployed function.)

This means part of the consideration for how to break down and deploy serverless functions needs to be the level of granularity at which you must apply the levers of control (settings such as concurrency and function resource sizing) and monitoring.
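To make the granularity point concrete, here is a minimal sketch, with illustrative handler and transaction-type names, of the same logic packaged two ways. Only the split version lets each type carry its own concurrency setting:

```python
# Deployed as two separate functions, each handler gets its own
# concurrency cap, resource sizing, and monitoring - e.g. 100
# concurrent instances for type one, 10 for type two.

def handle_type_one(message):
    """Time-sensitive transactions: deploy with a high concurrency cap."""
    return {"type": "one", "processed": message["id"]}

def handle_type_two(message):
    """Less urgent transactions: deploy with a low concurrency cap."""
    return {"type": "two", "processed": message["id"]}

# Deployed as one combined function, the routing happens inside,
# so throttling can only apply to the whole deployment at once.
def handle_combined(message):
    if message["type"] == "one":
        return handle_type_one(message)
    return handle_type_two(message)
```

The code is identical either way; what changes is the deployment boundary, and that boundary is where the control levers attach.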

In the graphic above, the team is using a serverless application framework (Laravel Vapor) that deploys the full application into 3 separate functions: one responds to user interaction, one to queued messages, and one to command-line interaction.  So it is impossible to throttle one workload without imposing the same combined limit on everything else in that processing mode.  This team has some refactoring to do.

The granularity of the function deployment affects your ability to manage resources (and therefore cost), as well as to control and monitor your functions.

Architect and developer beware.
