Last week I discussed the Manual Approach to unlocking true elasticity. This week I will continue with the second type of solution commonly seen in the industry: applying rules that kick in when a threshold is crossed (if X happens, do Y). A rule usually applies to a single resource and adjusts allocation based on that resource alone. For example, “if CPU is above 80% utilization, move one size higher within the family”, “if CPU is below 20%, move one size smaller within the family”, and so on.
The problem is that when you move up in the family you double all resources, and when you move down you cut them all in half. In the CPU example, if CPU is below 20% you will cut the memory in half without any understanding of how much memory is allocated or how it is being used. If that memory is in use and gets removed, changing the instance size will degrade the performance of the application. Conversely, if CPU is above 80% you may double the memory even if the memory currently allocated isn’t being used.
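To make the single-resource problem concrete, here is a minimal Python sketch of the CPU rule described above. The instance family, its sizes, and the workload numbers are all hypothetical and chosen only for illustration; the one property taken from the discussion is that each step in the family doubles (or halves) every resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    name: str
    vcpus: int
    memory_gib: int


# Hypothetical family (an assumption): each step up doubles every resource.
FAMILY = [
    Size("small", 2, 8),
    Size("medium", 4, 16),
    Size("large", 8, 32),
    Size("xlarge", 16, 64),
]


def cpu_rule(current: int, cpu_util: float) -> int:
    """Return the index of the size a CPU-only threshold rule would pick."""
    if cpu_util > 0.80 and current < len(FAMILY) - 1:
        return current + 1  # move one size up
    if cpu_util < 0.20 and current > 0:
        return current - 1  # move one size down
    return current


# Assumed workload: idle on CPU, but holding 24 GiB of memory.
current = 2  # "large": 8 vCPUs, 32 GiB
cpu_util = 0.15
memory_used_gib = 24

target = cpu_rule(current, cpu_util)
new_size = FAMILY[target]
memory_fits = memory_used_gib <= new_size.memory_gib

print(f"rule picks {new_size.name}; memory still fits: {memory_fits}")
```

In this made-up case the rule moves the instance down one size because CPU is idle, and the 24 GiB the application is using no longer fits in the 16 GiB it is left with. The rule never looked at memory, so it had no way of knowing.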
Because rules consider each resource in isolation, these decisions can’t be automated safely. So while the analytics run in real time, humans still have to verify the output, which means resources aren’t adjusted in real time.
The Magic Metric
In an attempt to improve upon that, some solutions extend this approach to allow thresholds on a combination of resources, or a “magic metric”. For example, magic metric X = 70% * CPU Utilization + 20% * Memory Utilization + 10% * IO Utilization, with rules defined on this metric just like the rules for a single resource. This allows you, based on your understanding of your application, to prioritize certain resources over others.
While this is better than making decisions based on a single resource, it has many of the same faults. If you decide to scale down, you still need to understand the exact consumption of every resource to choose the best instance type for each application. Most importantly, no metric, however it is defined and however many resources it combines, can assure that the application gets the resources it needs when it needs them.
As an example, let’s assume you define a metric as 50% * CPU utilization + 50% * memory utilization, scaling up when it rises above 85% and scaling down when it falls below 20%. What happens if the application is running on an instance that is significantly overprovisioned on memory but tight on CPU? The combined metric would not reach the 85% required to trigger scaling, even though the application is suffering.
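The arithmetic behind that example is easy to check. The sketch below uses the 50/50 weights and the 85%/20% thresholds from the example; the utilization figures (95% CPU, 30% memory) and the function names are assumptions picked to represent a CPU-starved, memory-heavy instance.

```python
def magic_metric(utilization: dict, weights: dict) -> float:
    """Weighted sum of per-resource utilization (values between 0.0 and 1.0)."""
    return sum(weights[resource] * utilization[resource] for resource in weights)


def decide(metric: float, scale_up_at: float = 0.85, scale_down_at: float = 0.20) -> str:
    """Apply the threshold rule to the combined metric."""
    if metric > scale_up_at:
        return "scale up"
    if metric < scale_down_at:
        return "scale down"
    return "no action"


weights = {"cpu": 0.5, "memory": 0.5}

# Assumed readings: CPU is nearly saturated, memory is mostly idle.
utilization = {"cpu": 0.95, "memory": 0.30}

metric = magic_metric(utilization, weights)
decision = decide(metric)

print(f"combined metric = {metric:.1%}, decision = {decision}")
```

With these numbers the combined metric comes out at 62.5%, well short of the 85% trigger, so the rule does nothing while the application waits on CPU. The idle memory is masking the CPU pressure.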
The Endless Sizing Loop
Another side effect of this approach is an endless sizing loop. Let’s take the example from above, applied to an application constrained on CPU but overprovisioned on memory. The combined metric might trigger a scale-up because CPU is extremely constrained, but executing it leaves memory even more overprovisioned, causing the combined metric to go below 20% and triggering a scale-down of the instance, and the merry-go-round continues…
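A toy simulation can show what such a loop looks like. Everything in this sketch is an assumption made for illustration: a two-size family where the larger size doubles every resource, demand that stays constant, utilization modeled simply as demand divided by capacity, and weights and thresholds (70/30, scale up above 60%, scale down below 40%) that are tighter than the 85%/20% example so the oscillation shows up under this simple model.

```python
# Hypothetical two-size family; the larger size doubles every resource.
SIZES = [
    {"name": "medium", "vcpus": 4, "memory_gib": 16},
    {"name": "large", "vcpus": 8, "memory_gib": 32},
]

# Assumed constant demand: heavy on CPU, light on memory.
CPU_DEMAND_VCPUS = 3.6
MEMORY_DEMAND_GIB = 2.0

# Illustrative weights and thresholds (assumptions, not recommendations).
CPU_WEIGHT, MEMORY_WEIGHT = 0.7, 0.3
SCALE_UP_AT, SCALE_DOWN_AT = 0.60, 0.40


def combined_metric(size: dict) -> float:
    """Weighted utilization, with utilization modeled as demand / capacity."""
    cpu_util = min(1.0, CPU_DEMAND_VCPUS / size["vcpus"])
    memory_util = min(1.0, MEMORY_DEMAND_GIB / size["memory_gib"])
    return CPU_WEIGHT * cpu_util + MEMORY_WEIGHT * memory_util


def next_index(index: int) -> int:
    """Apply the threshold rule once and return the new size index."""
    metric = combined_metric(SIZES[index])
    if metric > SCALE_UP_AT and index < len(SIZES) - 1:
        return index + 1
    if metric < SCALE_DOWN_AT and index > 0:
        return index - 1
    return index


# Run the rule for a few evaluation cycles and record each size visited.
history = []
index = 0
for _ in range(6):
    history.append(SIZES[index]["name"])
    index = next_index(index)

print(" -> ".join(history))
```

Under these assumptions the instance never settles: the metric is above the scale-up threshold on the smaller size and below the scale-down threshold on the larger one, so each evaluation undoes the previous one even though demand has not changed at all.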
So, Do These Rules Really Unlock Elasticity?
Unfortunately, no. Even with methods like the “magic metric”, rules will not scale. All the variations of this approach are time consuming, prone to errors, or both, and can only end in overspending or underperforming. Without a platform that understands demand in real time and is able to make incremental changes accordingly, it is impossible to be truly elastic.
Stay tuned for my final blog where I will discuss Batch Analytics.