I spent a fair number of years in IT operations, working with enterprise VMware infrastructure for about a decade. During this period, I worked with development environments (with crazy stuff like developers running Visual Studio, Jenkins CI/CD pipelines, and automation testing clusters on top of Citrix XenApp farms on top of vSphere). I also worked with production infrastructures ranging from the usual CRM and ERP applications to performance-hungry financial and real-time telco-grade applications.
Irrespective of the environment, there was always that user complaining about the slowness of a particular VM. It was not a general performance issue, but one specific to a single VM. And you know what? Sometimes the user was right and the performance of the VM was subpar. The easiest “solution” would be to add more resources, and this was often the path the user pushed for: “I don’t have enough processing power, give me 4 more virtual CPUs”. Sometimes that is the proper solution. But often these are just resources going out the door, when all you need to recover the performance is to tune your virtual machine configuration.
In this article, I want to highlight 12 areas worth checking in the virtual machine configuration. If none of them helps, you can then look into changes that translate directly into real money. I will not touch any configuration at a level above the VM, nor anything at the operating system level.
Performance Tuning – Virtual Machine Configuration
Let’s start with VMware Tools. As per VMware KB 340, “VMware Tools is a suite of utilities that enhances the performance of the virtual machines guest operating system…”. This is so true, and you may observe a measurable boost of performance by updating the tools to the latest version.
Now that you have the latest VMware Tools, let’s look at the VM hardware version. How many versions are you behind? Which features do the newer hardware versions offer? Especially if you have upgraded vSphere multiple times, you have a good chance of seeing performance benefits from this upgrade. KB 2051652 presents a useful feature comparison for all available hardware versions. Before deciding to upgrade the hardware version, be sure to check KB 1010675 for any potential pitfalls.
Disable any unnecessary virtual hardware devices. Do you really need that USB controller in your VM? Most likely, you will never use it.
What network adapter have you configured for the VM? For the best performance, use the VMXNET3 paravirtualized network adapter on operating systems that support it. For a list of supported operating systems, refer to the VMware Compatibility Guide. If your operating system is not compatible with VMXNET3, use e1000e.
Which SCSI adapter have you configured in your VM? For the best performance, use the PVSCSI adapter. As with the network adapter, first check the Compatibility Guide. The PVSCSI adapter offers a significant reduction in CPU utilization compared to the default virtual storage adapters. At the same time, you may observe increased throughput.
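If you keep an inventory export of your VMs, the two adapter checks are easy to automate. Here is a minimal sketch in Python. The VmDevices record, its field names and the lowercase adapter labels are my own illustration, not vSphere API types, and the guest_supports_paravirtual flag stands in for a manual lookup in the Compatibility Guide.

```python
from dataclasses import dataclass
from typing import List

# Preferred adapter types, taken from the two tips above.
PREFERRED_NIC = "vmxnet3"
FALLBACK_NIC = "e1000e"
PREFERRED_SCSI = "pvscsi"


@dataclass
class VmDevices:
    """Hypothetical inventory record; not a vSphere API type."""
    name: str
    nic_types: List[str]
    scsi_types: List[str]
    # Assumed to be filled in after checking the Compatibility Guide.
    guest_supports_paravirtual: bool = True


def adapter_findings(vm: VmDevices) -> List[str]:
    """Return one message per adapter that differs from the recommendation."""
    findings = []
    wanted_nic = PREFERRED_NIC if vm.guest_supports_paravirtual else FALLBACK_NIC
    for nic in vm.nic_types:
        if nic.lower() != wanted_nic:
            findings.append(f"{vm.name}: NIC {nic} -> consider {wanted_nic}")
    # Only suggest PVSCSI when the guest OS is known to support it.
    if vm.guest_supports_paravirtual:
        for ctrl in vm.scsi_types:
            if ctrl.lower() != PREFERRED_SCSI:
                findings.append(
                    f"{vm.name}: SCSI controller {ctrl} -> consider {PREFERRED_SCSI}"
                )
    return findings


if __name__ == "__main__":
    vm = VmDevices("app01", nic_types=["e1000", "vmxnet3"], scsi_types=["lsilogic"])
    for line in adapter_findings(vm):
        print(line)
```

A VM that already uses the recommended adapters returns an empty list, so the output only contains the VMs worth a second look.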
How does the number of virtual CPUs relate to the physical NUMA architecture of your servers? For the best performance, size your VM to stay within a single physical NUMA node. If you have a host with six cores per NUMA node, size your VM with up to six vCPUs. Since we now live in the world of monster VMs, you should also do the same check for RAM: do you have more RAM allocated than is available in a single NUMA node?
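The NUMA check is simple arithmetic, so it can be sketched in a few lines of Python. The NumaNode record and the numa_fit_problems function are hypothetical names of mine; you would feed them the cores and memory per NUMA node of your own hosts.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NumaNode:
    """Capacity of one physical NUMA node (hypothetical record)."""
    cores: int
    memory_gb: float


def numa_fit_problems(vcpus: int, memory_gb: float, node: NumaNode) -> List[str]:
    """List the reasons why a VM does not fit inside a single NUMA node."""
    problems = []
    if vcpus > node.cores:
        problems.append(
            f"{vcpus} vCPUs exceed the {node.cores} cores of one NUMA node"
        )
    if memory_gb > node.memory_gb:
        problems.append(
            f"{memory_gb} GB RAM exceed the {node.memory_gb} GB of one NUMA node"
        )
    return problems


if __name__ == "__main__":
    # The example from the text: six cores per NUMA node, an 8-vCPU VM.
    node = NumaNode(cores=6, memory_gb=128)
    print(numa_fit_problems(vcpus=8, memory_gb=64, node=node))
```

An empty result means the VM fits in one node for both CPU and RAM.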
Do you have CPU Hot Add active in your VM configuration? As per the “Performance Best Practices” whitepaper, enabling this feature disables vNUMA for that virtual machine, resulting in the guest OS seeing a single vNUMA node. Without vNUMA support, the guest OS has no knowledge of the virtual CPU and memory topology of the ESXi host. This in turn could result in the guest OS making sub-optimal scheduling decisions, leading to reduced performance for applications running in large virtual machines. According to a post by Frank Denneman, you should disable the CPU Hot Add feature if the VM spans multiple NUMA nodes.
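The Hot Add rule combines two facts: whether the feature is enabled and whether the VM is wider than one NUMA node. A minimal Python sketch follows; the function names are mine, and I assume you already know the vCPU count, the cores per NUMA node and the Hot Add setting of the VM.

```python
import math


def nodes_spanned(vcpus: int, cores_per_node: int) -> int:
    """Minimum number of NUMA nodes needed to place all vCPUs."""
    if vcpus < 1 or cores_per_node < 1:
        raise ValueError("vcpus and cores_per_node must be positive")
    return math.ceil(vcpus / cores_per_node)


def should_disable_hot_add(vcpus: int, cores_per_node: int,
                           cpu_hot_add_enabled: bool) -> bool:
    """True when CPU Hot Add is on and the VM spans more than one NUMA node."""
    return cpu_hot_add_enabled and nodes_spanned(vcpus, cores_per_node) > 1


if __name__ == "__main__":
    # A 12-vCPU VM on six-core NUMA nodes spans two nodes.
    print(should_disable_hot_add(12, 6, cpu_hot_add_enabled=True))
```

A VM that fits in one node is not flagged, since there is no vNUMA topology to lose in that case.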
Another recommendation from the Best Practices document is to allocate to each virtual machine only as much virtual hardware as that virtual machine requires. If you have worked only in the physical world, this is counterintuitive, but it is real. Common examples include a single-threaded workload running in a multiple-vCPU virtual machine, or a multi-threaded workload in a virtual machine with more vCPUs than the workload can effectively use. These are a few of the practical examples from the VMware vSphere Optimize and Scale course. I recommend this course to any vSphere administrator.
In terms of RAM, the same recommendation applies. If you don’t have enough RAM to hold the working set of the applications, you will see performance problems due to swap being used. The other side of RAM allocation has its issues too: too much RAM is not good either. More RAM means more memory overhead. More unused RAM in the VM may also trigger memory reclamation techniques like ballooning, which in certain cases may affect your performance.
Do you use VM encryption? Starting with vSphere 6.5, VMware introduced virtual machine encryption, meaning you can encrypt at rest the disk files, snapshot files, NVRAM and swap files. However, this carries some performance penalty, because encryption and decryption are performed by the CPU and therefore incur a CPU cost.
Check your VM for any configured limits. As a best practice you should use shares instead of limits, but who knows? You may have a limit configured that prevents the VM from accessing all of its allocated resources. If the VM belongs to a resource pool, you may also need to check the limits configured at the resource pool level.
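Limits are easy to overlook because they hide in several places, so I like to collect them in one list and filter. Here is a minimal Python sketch. The Limit record and the scope labels are hypothetical; the only convention I rely on is that the vSphere API reports an unlimited allocation as -1.

```python
from typing import List, NamedTuple

# The vSphere API reports "no limit" as -1 for CPU and memory allocations.
UNLIMITED = -1


class Limit(NamedTuple):
    """One limit setting collected from a VM or a resource pool (hypothetical)."""
    scope: str     # e.g. "vm:app01" or "pool:dev"
    resource: str  # "cpu_mhz" or "memory_mb"
    value: int


def configured_limits(limits: List[Limit]) -> List[Limit]:
    """Keep only the entries where a real limit is set."""
    return [entry for entry in limits if entry.value != UNLIMITED]


if __name__ == "__main__":
    collected = [
        Limit("vm:app01", "cpu_mhz", UNLIMITED),
        Limit("vm:app01", "memory_mb", 4096),
        Limit("pool:dev", "cpu_mhz", 8000),
    ]
    for entry in configured_limits(collected):
        print(f"{entry.scope}: {entry.resource} limited to {entry.value}")
```

Including the resource pool entries in the same list covers the second half of the tip: a VM with no limit of its own can still be capped by its pool.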
When do you run your backups? They can bring significant performance penalties during the backup window. Make sure to run them at off-peak hours. Avoid scheduling backups to run simultaneously in multiple virtual machines on the same ESXi host.
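To spot backups that run at the same time on the same ESXi host, you can compare the backup windows pairwise. The sketch below is in Python and assumes you can export the schedule as a list of jobs. The BackupJob record and the host and VM names are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime
from itertools import combinations
from typing import List, NamedTuple, Tuple


class BackupJob(NamedTuple):
    """One scheduled backup window (hypothetical record)."""
    vm: str
    host: str
    start: datetime
    end: datetime


def overlapping_backups(jobs: List[BackupJob]) -> List[Tuple[str, str, str]]:
    """Return (host, vm_a, vm_b) for every pair of overlapping windows on a host."""
    by_host = defaultdict(list)
    for job in jobs:
        by_host[job.host].append(job)

    clashes = []
    for host, host_jobs in by_host.items():
        for a, b in combinations(host_jobs, 2):
            # Two windows overlap when each one starts before the other ends.
            if a.start < b.end and b.start < a.end:
                clashes.append((host, a.vm, b.vm))
    return clashes


if __name__ == "__main__":
    day = datetime(2024, 1, 1)
    jobs = [
        BackupJob("vm1", "esx01", day.replace(hour=1), day.replace(hour=3)),
        BackupJob("vm2", "esx01", day.replace(hour=2), day.replace(hour=4)),
        BackupJob("vm3", "esx02", day.replace(hour=2), day.replace(hour=4)),
    ]
    print(overlapping_backups(jobs))
```

Windows that only touch (one ends exactly when the next starts) are not reported as a clash.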
I will stop here, at a dozen tips. If you find something useful for your use case please leave a comment below 🙂