VM Memory Management to Avoid Memory Ballooning & Swapping

Video Transcript

You know those frustrating application issues that pop up and then disappear before you can trace the root cause, like a game of whack-a-mole? The cause may not be your code, but rather VMs that are too big.

You want to get the best possible performance for your apps, and logic might lead one to believe that allocating more memory to a VM can only improve performance, but issues like memory ballooning and swapping defy that logic. Let’s see how.

A physical host with 16 cores and 64 GB of memory that has been virtualized may share its physical resources among multiple virtual machines – sometimes 10 or more. When those VMs are over-sized – with, say, a 64 GB virtual memory allocation, otherwise known as vRAM – they can claim all of the resources on the host whether or not those resources are actually needed.

One of the most powerful aspects of virtualization is that multiple VMs can share the same host machine’s underlying resources. When one VM is over-allocated memory, however, it can lead to things like memory ballooning and memory swap.

Memory ballooning works like this: when the host comes under memory pressure, the hypervisor inflates a balloon driver inside a virtual guest, reclaiming physical memory from that guest so it can be given to the VMs that need it. When the pressure subsides, the balloon deflates and the physical memory is surrendered back to the guest. The problem is that while recovering memory from the ballooned capacity is possible, it isn’t always timely. As a result, performance may vary during these burst-and-recover cycles, potentially degrading specific applications.
With many VMs demanding memory at once, the hypervisor may also have the host swap to disk to fulfill the demand, creating latency for any guest whose memory lands on disk. The I/O those swap operations require creates a problem that feeds on itself: as some workloads claim memory aggressively, other workloads are suddenly pushed from RAM to swap, increasing latency and degrading end-user performance.
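The arithmetic behind that spiral can be sketched in a few lines. This is a toy model with illustrative numbers, not any hypervisor's actual algorithm: when the combined demand of over-sized VMs exceeds the host's physical memory, the shortfall has to be served from swap, and swap-backed pages pay a disk-latency penalty instead of a RAM-latency one.

```python
HOST_RAM_GB = 64  # physical memory on the example host from earlier

def swap_shortfall(vm_demands_gb, host_ram_gb=HOST_RAM_GB):
    """Return how many GB must spill to swap when every VM
    demands its allocated memory at the same time."""
    total_demand = sum(vm_demands_gb)
    return max(0, total_demand - host_ram_gb)

# Ten right-sized 4 GB VMs fit comfortably in physical RAM...
print(swap_shortfall([4] * 10))          # 0 GB swapped
# ...but add a single over-sized 64 GB VM and 40 GB spills to disk.
print(swap_shortfall([4] * 10 + [64]))   # 40 GB swapped
```

The point of the sketch is that one over-allocated VM is enough to push the whole host past its physical capacity, so every guest on the host can feel the swap latency, not just the over-sized one.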

And think about how much harder it is to move a VM demanding 64 GB of RAM off a host that’s running hot: a VM that’s over-built with too much RAM has far fewer candidate hosts where it can land. As more workloads develop performance issues, they impact the other workloads they share resources with, increasing latency and further degrading end-user performance.

However, if we re-size the VM to, say, 4 GB of vRAM rather than 32 or 64, physical resources are allocated more efficiently, reducing latency for every VM and improving application performance and the end-user experience.

The other benefit of avoiding over-sizing is that each blade now has more room, meaning each VM has more options if it needs to move to avoid a noisy or busy neighbor. More options means better performance for every workload, including your own.
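The "more options" claim is easy to make concrete. Here is a minimal sketch, with made-up cluster numbers, that counts how many hosts have enough free memory to accept a migrating VM of a given size:

```python
def placement_options(vm_size_gb, free_ram_per_host_gb):
    """Count the hosts with enough free memory to accept
    a migrating VM of the given size."""
    return sum(1 for free in free_ram_per_host_gb if free >= vm_size_gb)

# Free memory (GB) on each of five hypothetical hosts in a cluster.
cluster_free_ram = [8, 12, 20, 6, 30]

print(placement_options(4, cluster_free_ram))    # all 5 hosts can take a 4 GB VM
print(placement_options(64, cluster_free_ram))   # no host can take a 64 GB VM
```

A right-sized 4 GB VM can move to any host in this cluster to escape a noisy neighbor; the over-sized 64 GB VM is stuck where it is.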

Cutting-edge IT operations are making use of standardized VM deployment models with smaller initial VM sizing.

This VM right-sizing improves application performance while managing resources more effectively, and it also keeps workloads that are already running optimized.

With virtualization, both virtual CPU and memory resources can be added without needing to restart the VM (a process known as “hot add”). Turbonomic is able to automate this hot-add process so that VMs seeing increased demand can get the resources they need to assure performance in real time.
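The decision behind an automated hot add can be sketched as a simple utilization check. The threshold, step size, and function names below are hypothetical illustrations, not Turbonomic's actual logic or API; on a KVM/libvirt host, for example, the resize itself could be performed live with `virsh setmem <domain> <size> --live`, provided the guest supports memory hot add.

```python
def plan_hot_add(used_gb, allocated_gb, step_gb=2, high_water=0.90):
    """Return how many GB of memory to hot-add to a VM.

    If the VM's memory utilization has crossed the high-water mark
    (an illustrative 90% here), grow it by one step; otherwise leave
    the allocation alone so the host's memory stays available to
    other workloads.
    """
    utilization = used_gb / allocated_gb
    return step_gb if utilization >= high_water else 0

print(plan_hot_add(3.8, 4.0))   # 95% utilized -> hot-add 2 GB
print(plan_hot_add(2.0, 4.0))   # 50% utilized -> add nothing
```

Starting small and growing on demand is the inverse of over-sizing up front: the VM only holds memory it has demonstrated it needs.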

Turbonomic looks at all resources holistically, in real time, and understands the tradeoffs not only between CPU and memory but also network, storage, and indeed every aspect of the data center supply chain. Turbonomic then correlates those tradeoffs with application demand at that very moment. This enables every workload to get exactly the resources it needs in real time – no more, but importantly no less – to assure performance for the end user.

By sizing VMs conservatively, we can avoid the CPU and memory resource contention that leads to latency and other performance degradation for end users, while hot-adding resources to specific VMs – like the one here – when more users suddenly increase application demand. VMs already running in the infrastructure are included in the automatic upsizing program, coupled with a downsizing program, to keep running VMs optimized and increase utilization of datacenter hardware.

In this way, your environment will remain in a Goldilocks state where no VM has too many – or too few – resources, and healthy application performance is assured, no matter how popular your apps become.

Recommended Resources

White Paper: How to Overcome the Challenges of Virtual Desktop Infrastructure – This paper explores some of the common challenges facing VDI planning, deployment, and operations, and Turbonomic’s unique approach for addressing them.
White Paper: The Evolution of Enterprise Applications and Performance – In this paper, we examine the evolution of enterprise applications, the data centers in which these applications reside, and increasing QoS expectations.