Serverless is great, but serverless functions rarely run by themselves... they're usually connected to data sources - and frequently that means SQL databases.
In the graphic attached, my cloud SQL database is collapsing under assault by a serverless function that's scaling up instances to meet demand. Here's how we unintentionally created this problem:
To speed things up through parallel processing, we took a transactional billing file, broke it into groups of 30 transactions, loaded each group into a message, and queued the messages. A serverless function listens for these events, and if messages are still waiting after a certain amount of time, another instance is spawned to begin handling the waiting messages.
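The batching step described above can be sketched roughly as follows. This is a minimal illustration, not our production code: the in-memory `queue.Queue` stands in for a real cloud message queue, and the names `batches`, `enqueue_batches`, and the shape of the demo transactions are assumptions made up for the example.

```python
import json
import queue
from itertools import islice

BATCH_SIZE = 30  # group size used in the article


def batches(transactions, size=BATCH_SIZE):
    """Yield successive lists of at most `size` transactions."""
    it = iter(transactions)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def enqueue_batches(transactions, q):
    """Put one JSON message per batch on the queue; return the message count."""
    count = 0
    for chunk in batches(transactions):
        q.put(json.dumps(chunk))
        count += 1
    return count


# Hypothetical demo data: 100 transactions become 4 messages (30 + 30 + 30 + 10).
demo_queue = queue.Queue()  # stand-in for a cloud message queue (assumption)
demo_transactions = [{"id": i, "amount_cents": 100 + i} for i in range(100)]
message_count = enqueue_batches(demo_transactions, demo_queue)
```

Each message is then an independent unit of work, which is what lets the platform spin up more function instances as the queue backs up.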
When it was sent thousands of transactions, this worked great and the business users were suitably impressed with the processing time (which dropped from many hours to a few minutes). When they began sending larger transaction sets, tens of thousands, the processing speed remained great but CPU on the single-instance database began spiking. No problem - it's cloud, so size up the database instance (though don't ignore the very significant cost jump of moving from small instances to medium or large ones).
But when it was sent hundreds of thousands of transactions, the single-instance database hit capacity. [This is a common problem with traditional relational databases -- you hit the capacity of CPU, memory, or IOPS, and even if you size up your database you can only push so much volume through it. You can stretch it by adding CPU and memory if the SQL is complicated or unoptimized, and with cloud databases by provisioning more IOPS if the database is poorly indexed or doing huge reads and writes, but this quickly gets very expensive and there is still a ceiling on how big you can go. There are some tricks, such as creating read replicas, that can extend the life of a traditional SQL DB model. But that's not the focus of this article.]
Serverless functions can be throttled (in AWS this is called concurrency), and concurrency is set per serverless function; this is the traditional way to handle this sort of problem. (It limits performance, but there are always trade-offs.) But this brings us to the point of this article...
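To make the throttle concrete before moving on, here is a small local simulation. It is only a sketch under stated assumptions: a thread pool's `max_workers` stands in for the function's concurrency limit, `FakeDatabase` is a made-up class that counts how many batches hit it at once, and each "instance" is assumed to hold one database connection while it works.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeDatabase:
    """Hypothetical stand-in for the SQL database; tracks concurrent callers."""

    def __init__(self):
        self._lock = threading.Lock()
        self.active = 0
        self.peak = 0

    def run_batch(self, batch_id):
        with self._lock:
            self.active += 1
            self.peak = max(self.peak, self.active)
        time.sleep(0.01)  # pretend to do the SQL work for one batch
        with self._lock:
            self.active -= 1
        return batch_id


def process_all(num_batches, concurrency_limit):
    """Process every batch, never running more than `concurrency_limit` at once."""
    db = FakeDatabase()
    # max_workers plays the role of the function's concurrency setting.
    with ThreadPoolExecutor(max_workers=concurrency_limit) as pool:
        list(pool.map(db.run_batch, range(num_batches)))
    return db.peak


peak_capped = process_all(num_batches=50, concurrency_limit=5)
peak_wide = process_all(num_batches=50, concurrency_limit=25)
```

The capped run takes longer to drain the 50 batches, but the database never sees more than 5 callers at a time, which is exactly the trade-off described above. On AWS the real lever is the function's reserved concurrency setting rather than anything in your own code.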
Concurrency is AT THE LEVEL of the (serverless) function. If, for example, your function contains the code for many different processing situations or transaction types, then because they are combined in a single deployed function, they can only be throttled at the level of the combined functionality. (If you wanted to allow up to 100 instances for transaction type one, but limit a less time-sensitive transaction type two to 10, you can't segment it, because the control is at the level of the overall deployed function.)
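The 100-versus-10 example can be modeled in a few lines. This is an illustrative sketch only: `FunctionDeployment`, `limit_for`, and the function names are hypothetical, and the model simply records which deployed function handles which transaction type and what limit that function carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionDeployment:
    """Hypothetical description of one deployed serverless function."""
    name: str
    handles: tuple          # transaction types routed to this function
    max_concurrency: int    # the single limit that applies to all of them


def limit_for(deployments, transaction_type):
    """Return (function name, limit) that governs the given transaction type."""
    for deployment in deployments:
        if transaction_type in deployment.handles:
            return deployment.name, deployment.max_concurrency
    raise KeyError(transaction_type)


# One combined function: both types share whatever limit the function has.
combined = [FunctionDeployment("billing-all", ("type_one", "type_two"), 100)]

# Split deployment: each type gets its own lever.
split = [
    FunctionDeployment("billing-type-one", ("type_one",), 100),
    FunctionDeployment("billing-type-two", ("type_two",), 10),
]

combined_limit = limit_for(combined, "type_two")  # governed by the shared limit
split_limit = limit_for(split, "type_two")        # governed by its own limit
```

In the combined layout there is no way to express "10 for type two" without also capping type one at 10; only the split layout has a place to put that number.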
This means that deciding how to break down and deploy serverless functions has to include deciding the level of granularity at which you need to apply the levers of control (settings such as concurrency and function resource sizing) and monitoring.
In the graphic above, the team is using a serverless application framework (Laravel Vapor) that deploys the full application into 3 separate functions: one responds to user interaction, one to messages, and one to command line interaction. So it is impossible to throttle one kind of work without applying the same combined limit to everything in that processing mode. This team has some refactoring to do.
The granularity of the function deployment affects your ability to manage resources (and therefore cost), as well as to control and monitor your functions.
Architect and developer beware.