In a previous post we mentioned Microsoft Azure’s latest announcements for what they call “Azure Functions.” It’s part of their approach to the trend toward “serverless computing” — a concept taking the place of “big data” as the hot-button topic in internet development circles these days.
As we’ve discussed this emerging trend with our fellow coders and technological peers, the conversation almost always starts with what “serverless” actually means. After all, the purpose of all those super-gigantic datacenters that the big boys of tech have been building — we’re talking about the Microsofts, Googles, Amazons, and IBMs of the world — is to house literally millions of steel server boxes in seven-foot-tall server racks aligned within buildings the size of football fields.
The investment in these mega-data-mansions is enormous. As of January of this year, Microsoft alone had built 38 massive datacenters all over the world to host their Azure services — adding one every other month on average. And they have significant points-of-presence in hundreds of additional datacenters owned by others. All of this unimaginably large infrastructure is designed to bring their network edge of steel boxes closer to more customers — and their customers’ customers.
Does “serverless computing” mean that all those billions of dollars are wasted? Hardly. But it’s important that we agree on what a server is — in the cloud-computing sense — before we talk about doing away with them.
What Is a Server?
All those physical steel server boxes crammed into all those massive datacenters are really just defined units of technology resources. Traditionally, a physical server is ordered from Dell, HP, or another vendor by citing a series of specifications. You tell your Dell sales rep that you want x number of processors of a certain type and speed, y gigabytes of RAM, and z amount of storage. There are other specs, of course, but this is the meat of it.
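To underline how simple that “defined unit of resources” really is, here’s the same order sketched as a data structure — the field names are ours, invented purely for illustration, and no vendor’s actual order form looks exactly like this:

```typescript
// A server order is really just a bundle of numbers.
// All names here are invented for illustration.
interface ServerSpec {
  processorCount: number; // x processors
  processorGhz: number;   // of a certain type and speed
  ramGb: number;          // y gigabytes of RAM
  storageTb: number;      // z amount of storage
}

const order: ServerSpec = {
  processorCount: 2,
  processorGhz: 2.6,
  ramGb: 128,
  storageTb: 8,
};
```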
Then the sales rep has your order delivered before he or she goes off to a company sales retreat in Aspen … because these datacenter-class servers are freaking expensive. It’s because these servers — and the datacenters that house them — are so expensive to buy, build, AND operate that we got “the cloud.”
The Problems with the Cloud
The last eight years have seen a race to “the cloud.” Incredible technological advancements allowed websites and applications to transcend physical servers — the steel-sided boxes that housed the spinning disks, cooling fans, and circuitry. This let users and applications draw just the amount of processing and storage capacity they needed from the robust power of server-grade hardware located in hardened datacenters with super-high-bandwidth connectivity.
In other words, think of a datacenter as being the most technologically advanced — and most expensive — real estate in the world. Instead of having to buy the penthouse apartment, you get to just show up and use the pool.
This has been accomplished through the concepts and methodologies of abstraction, virtualization, and containers. Extending our analogy, this means that someone else buys that expensive penthouse and sub-divides it into lots of small studio apartments that you can rent by the day. The problem is that all those small apartments still require walls, electrical, plumbing, and air conditioning, which take up space and make it hard to move around.
This has become a friction point in cloud hosting. “Virtual” servers are still servers, even though several of them live inside one physical server (the steel box). You still need to spin up a “virtual server” or “virtual instance” and define the amount of resources available to it — how much RAM, number of CPUs, how much storage, etc. But what happens when you need a little bit more or a little bit less of something? No one wants to pay for more of the penthouse than they actually want to use at the moment. Technology advanced to address this by making those virtual servers elastic.
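To make that rigidity concrete, here’s a rough sketch of spinning up a fixed-size virtual instance through a hypothetical cloud SDK — the package, client, and method names are all invented for illustration; real providers’ APIs differ, but they all ask for the same up-front sizing:

```typescript
// Hypothetical SDK -- every name below is invented for illustration.
import { CloudClient } from "hypothetical-cloud-sdk";

const cloud = new CloudClient({ region: "us-east" });

// You must commit to a size up front, whether or not you use it all.
const vm = await cloud.createVirtualServer({
  name: "web-01",
  vcpus: 4,    // fixed at creation time
  ramGb: 16,   // fixed at creation time
  diskGb: 200, // fixed at creation time
});
// Need more RAM next month? That usually means resizing or
// recreating the instance -- not an instant operation.
```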
The Limits of “Elastic”
Real-time “elastic” environments are mostly real-time … until they aren’t. They promise to allow near “instant” access to additional resources that expand or contract the size of your virtual server on demand. But the reality is that — just like in a penthouse studio — walls don’t move themselves.
Even the fastest hosting automation has a time lag when it comes to adding CPUs or RAM. That can be a real problem during traffic spikes, and it pushes you to reserve larger amounts of resources permanently just to make sure they’re ready to go. But reserving ahead costs more, and it means paying for resources that sit idle when they’re not in use.
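To cope, operators typically pre-declare elastic scaling rules ahead of time. A sketch of what such a rule might look like — the structure here is invented for illustration and is not any specific provider’s schema:

```typescript
// A hypothetical autoscale rule -- illustrative structure only.
const autoscaleRule = {
  metric: "cpuPercent",
  scaleOutAbove: 70,    // add instances when CPU tops 70%
  scaleInBelow: 25,     // remove instances when CPU drops under 25%
  minInstances: 2,      // an always-on floor you pay for even when idle
  maxInstances: 10,     // a ceiling that caps runaway spikes
  cooldownSeconds: 300, // new capacity arrives in minutes, not milliseconds
};
```

Notice the two costs hiding in plain sight: the `minInstances` floor you pay for around the clock, and the cooldown lag before new capacity actually helps.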
But this is essentially “the cloud” — a series of virtualized servers that are relatively elastic, more economical than previous options, and that handle the data storage and heavy-lift processing for apps and remote devices from robust datacenters with tons of redundancy and connectivity.
A Weak Spot (and Inspiration) — Storage
The weak spot of this set-up used to be storage arrays. CPUs and RAM have become very advanced and rock-solid reliable. But aside from the cooling fans, disks are the only moving parts in a physical server. And even though they are FAR more durable and reliable than in the past — seriously, HUGE advancements — they are still the weak point. You can only fit so many drives in a single server box, and the fewer the drives, the greater the statistical risk that a failure causes data loss (yes, we understand the statistical limits).
One way this was addressed was by replacing spinning disks with the newer technology of solid-state drives, which are faster and have a lower catastrophic failure rate — but they degrade over time and are generally more expensive.
A second way was to separate the storage from the server. This was VERY effective, placing the disks away from the heat of CPU processing in massively redundant disk arrays that could be hot-swapped on the fly. The servers then read from and wrote to those partitioned storage units. One massive storage array could serve dozens of physical servers, reducing each server’s risk of data loss from an individual or multi-disk failure. This is the solution used by nearly all public and private cloud architectures today.
Separating the storage component from the physical server turns out to have been an important innovation that made way for the concept of serverless computing.
No Server, Just Service
It turns out those separated disk storage arrays held the key to the whole problem of wasted, shared, and elastic resources. Instead of de-coupling (abstracting) ONLY the storage, why not abstract ALL the resources from the server entirely? That’s the big idea of serverless computing.
The gurus of public cloud are developing tools to transform a need for computational resources into a series of API service requests. So instead of having to stretch the virtual walls of the cloud server your application is hosted on (elastic), your application would make an API call requesting additional CPU, more RAM, or a few extra gigabytes of storage. And because the processing or memory already exists outside the virtual and even physical walls of your server instance, those additional resources can be assigned from anywhere — even from another physical server, a different server rack, or even (theoretically) a device located in a nearby building on the same network.
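Here’s a rough sketch of what that might look like from application code. Everything below — the `platform` object, `requestResources`, the lease — is hypothetical, invented to illustrate the idea rather than to mirror any real provider’s API:

```typescript
// Every name below is hypothetical -- a sketch of the idea,
// not a real provider's API.
declare const platform: {
  requestResources(req: {
    cpuCores: number;
    ramGb: number;
    maxSeconds: number;
  }): Promise<{
    run<T>(task: () => T): Promise<T>;
    release(): Promise<void>;
  }>;
};
declare function resizeAndThumbnail(photos: string[]): void;

async function processGallery(photos: string[]): Promise<void> {
  // Ask the platform for extra compute, scoped to this one task.
  // The CPUs may physically live in another box, rack, or building.
  const lease = await platform.requestResources({
    cpuCores: 8,
    ramGb: 4,
    maxSeconds: 300, // hard cap on how long we hold the resources
  });

  try {
    await lease.run(() => resizeAndThumbnail(photos));
  } finally {
    await lease.release(); // metering stops the moment we let go
  }
}
```

The key shift: sizing decisions move out of provisioning time and into the exact line of code that needs the capacity.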
Of course, the Holy Grail vision of Grid Computing is to seamlessly allow this kind of resource allocation across vast networks separated by miles. Whether we will ever get to that point — or even want to — is a matter for another discussion.
The point of serverless computing is that processing CPUs become CPU-as-a-Service, memory becomes RAM-as-a-Service, and storage becomes Disk-as-a-Service. In the industry, we are starting to call these requests “micro-services.” And the pricing model for micro-services looks like it will be based on seconds of use. In other words, if your website application needs additional CPU to process a new photo gallery you just uploaded from your mobile phone, your site will request the CPU power it needs. If that extra processing takes 115 seconds, you will be billed for 115 seconds of extra CPU time as your API service call turns the resources on and off again.
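The metering math is as simple as it sounds. A toy calculation, with a per-second rate we made up for illustration:

```typescript
// Toy metered-billing math; the rate is invented for illustration.
const ratePerCpuSecond = 0.000017; // hypothetical $/second of extra CPU
const secondsUsed = 115;           // the photo-gallery job from above

const charge = secondsUsed * ratePerCpuSecond;
console.log(`Billed $${charge.toFixed(6)} for ${secondsUsed}s of extra CPU`);
// Billed $0.001955 for 115s of extra CPU
```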
No elastic technological yoga-stretches, no additional virtualization instances, no logging in to a management interface, no wasted idle resources, and FAR less lag (closer to instant than ever before).
Currently this will require a bit of additional coding in your app to manage micro-service requests. And this is not (yet) applicable to every project. But this is the vision of serverless computing that some of the brightest minds and largest tech companies in the world see as a way to wring even more value, economy, and scale out of their massive cloud architectures.
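For a taste of what that extra coding looks like in practice, here’s a minimal HTTP-triggered function written roughly in the style of Azure Functions’ JavaScript programming model — treat the details as a sketch, since the exact context and binding shapes vary by platform and version:

```typescript
// A minimal HTTP-triggered function, roughly in the style of
// Azure Functions' JavaScript model. No server is provisioned;
// the platform allocates resources per invocation and bills
// for execution time.
module.exports = function (context: any, req: any) {
  const name = (req.query && req.query.name) || "world";
  context.res = {
    status: 200,
    body: `Hello, ${name} -- no server was provisioned for this.`,
  };
  context.done(); // signal completion; metering stops here
};
```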
Now you know.
You can read about Microsoft’s implementation, Azure Functions, here. Thanks for reading.