Compute Fabric Efficiency: Better density or better service?
In attempts to quantify the efficiency of a compute fabric, people use various parameters and numbers. They sometimes pick a resource or two (like memory or CPU) and try to come up with a target level of utilization: say, CPU utilization should be no less than 50%. The problem with this approach is that it is hard to define and enforce a single metric across multiple resources. You could reach 50% CPU utilization, but memory could be above or below this number, because different pieces of the workload demand different amounts of each resource and it is never as simple as using standard, identical building blocks. And the more resources you add to the mix, the harder it becomes to maintain the desired efficiency.
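To make the mismatch concrete, here is a minimal sketch of a host that hits the 50% CPU target while memory lands somewhere else entirely. The capacities, the VM demands, and the `utilization` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical host capacity and per-VM demands; all numbers are illustrative.
HOST_CAPACITY = {"cpu_ghz": 40.0, "mem_gb": 256.0}

VM_DEMANDS = [
    {"cpu_ghz": 8.0, "mem_gb": 16.0},   # CPU-heavy VM
    {"cpu_ghz": 8.0, "mem_gb": 16.0},   # CPU-heavy VM
    {"cpu_ghz": 2.0, "mem_gb": 96.0},   # memory-heavy VM
    {"cpu_ghz": 2.0, "mem_gb": 96.0},   # memory-heavy VM
]


def utilization(capacity, demands):
    """Return the fraction of each resource consumed by the given demands."""
    return {
        resource: sum(d[resource] for d in demands) / cap
        for resource, cap in capacity.items()
    }


util = utilization(HOST_CAPACITY, VM_DEMANDS)
for resource, fraction in util.items():
    print(f"{resource}: {fraction:.1%}")
```

With these made-up numbers CPU sits exactly at the 50% target while memory is at 87.5%, so a single utilization figure cannot describe the host.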
Another common measure is density: the number of workload units running on each host. First, what is that unit of workload? It could be a VM or a virtual desktop, so you could say that a physical host in your infrastructure should run at least 15 VMs or desktops. But how do you arrive at that number? Some people use a simple economic argument: when they do a P2V (physical-to-virtual) migration and turn physical servers into VMs, a consolidation ratio of 10:1 or above makes a lot of economic sense, because you have just saved 9 physical boxes. However, this density number depends on many factors. A smaller box may fit only 5 VMs, whereas a larger one may fit 20. Determining the right number for a given box is not simple at all.
Second, density still assumes that these VMs are pretty much identical. That simplifies the calculation, but there is a caveat: if some VMs demand more resources than others, the density numbers can be misleading. 15 barely utilized VMs on one host won't load it at all, whereas 15 demanding VMs on another host will bring it to its knees.
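The caveat is easy to see with a little arithmetic. The sketch below compares two hosts that both run 15 VMs; the host capacity and the per-VM CPU demands are invented numbers, expressed in MHz to keep the arithmetic exact.

```python
# Hypothetical figures: one host size, two very different sets of 15 VMs.
HOST_CPU_MHZ = 48_000

idle_vms = [200] * 15     # 15 barely utilized VMs, 200 MHz each
busy_vms = [4_000] * 15   # 15 demanding VMs, 4000 MHz each


def cpu_load(vm_demands_mhz, host_capacity_mhz):
    """Fraction of host CPU the VMs ask for; above 1.0 means overcommitted."""
    return sum(vm_demands_mhz) / host_capacity_mhz


idle_load = cpu_load(idle_vms, HOST_CPU_MHZ)
busy_load = cpu_load(busy_vms, HOST_CPU_MHZ)

print(f"density {len(idle_vms)} VMs -> CPU demand {idle_load:.2%}")
print(f"density {len(busy_vms)} VMs -> CPU demand {busy_load:.2%}")
```

Both hosts report the same density of 15, yet under these assumptions the first is asked for about 6% of its CPU and the second for 125%, more than it can deliver.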
So density, while better than raw utilization, doesn't help much either. What, then, is the right way to achieve efficiency in the compute fabric? Conceptually it is simple, as we pointed out before: the purpose of the IT infrastructure is to deliver reliable application performance. So efficiency should be driven by application demand.
Applications can run inside virtual machines or in containers such as Docker, but ultimately they have to be placed across the hosts in a way that satisfies their demand and maximizes host utilization. You can't separate host utilization from application performance. So instead of improving density, you should really think about improving service delivery with the best possible resource usage. Do you know how to do that?
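As a rough illustration of demand-driven placement, the sketch below assigns each workload to the host that would remain least loaded on its most contended resource. This greedy heuristic is only one possible approach and is not a method the post prescribes; the `place` function, the host sizes, and the workload demands are all hypothetical, and it works from measured demand rather than from application performance itself.

```python
def place(workloads, hosts):
    """Greedy sketch: put each workload on the host that stays least loaded.

    workloads: {name: {resource: demand}}
    hosts:     {name: {resource: capacity}}
    Returns (placement, used), where used tracks consumed resources per host.
    """
    used = {h: {r: 0 for r in cap} for h, cap in hosts.items()}
    placement = {}

    def peak_after(host, demand):
        # Highest utilization across resources if the workload landed here.
        cap = hosts[host]
        return max((used[host][r] + demand[r]) / cap[r] for r in cap)

    def size(item):
        # Rank workloads by their largest demand relative to the biggest host.
        _, demand = item
        return max(demand[r] / max(c[r] for c in hosts.values()) for r in demand)

    # Place the largest workloads first; they are the hardest to fit.
    for name, demand in sorted(workloads.items(), key=size, reverse=True):
        peak, best = min((peak_after(h, demand), h) for h in hosts)
        if peak > 1.0:
            raise ValueError(f"no host can satisfy the demand of {name}")
        for r in demand:
            used[best][r] += demand[r]
        placement[name] = best
    return placement, used


# Illustrative inputs only.
hosts = {"A": {"cpu": 16, "mem": 64}, "B": {"cpu": 16, "mem": 64}}
workloads = {
    "db":    {"cpu": 8, "mem": 48},
    "web1":  {"cpu": 6, "mem": 8},
    "web2":  {"cpu": 6, "mem": 8},
    "cache": {"cpu": 2, "mem": 32},
}

placement, used = place(workloads, hosts)
print(placement)
print(used)
```

The point of the sketch is that the decision is driven by what each workload asks for on every resource, not by a target VM count per host. A real system would also have to react as demand changes over time.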
Image source: Scotty modelling a different kind of efficient compute fabric in the awesome reboot of Star Trek.