Core Dump

Perf Experiment Fail [μ]
Posted on October 10, 2022.

A recent customer incident reminded me of a funny mistake that I made 4 years ago while conducting a performance experiment. I had 2 virtual machines on VMware ESX, each backed by one virtual disk, and on each of them I was running the same workload of random writes. One of the VMs had stock bits while the other had my custom bits, which I expected to outperform the stock ones in terms of IOPS. After capturing some IOPS data from both machines and plotting it in a spreadsheet, the following IOPS graph was rendered:

Thinking that this was probably some anomaly in my initial measurement, I continued gathering data from the VMs, but the above pattern persisted. Whenever one VM was doing well, the other would experience performance degradation. After looking at the data in more detail, I was surprised to see that the sum of the two VMs' IOPS at any given moment stayed about the same. What is more, their graph lines looked like mirror images of each other, as if you had placed an imaginary horizontal mirror in the middle of the y-axis.

If you’ve already guessed what the issue was, then kudos to you. When I finally looked back at my setup, I realized that both virtual disks were carved out of the same physical drive at the VMware level, so the workload of each VM was affecting the other. Oops :)
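
In hindsight, the two symptoms described above (a roughly constant sum of IOPS and mirrored graph lines) are easy to check for numerically before trusting a comparison. The snippet below is only a rough sketch of such a check: the sample numbers are made up for illustration and are not my original measurements, and the function names and both thresholds are arbitrary assumptions rather than anything standard.

```python
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def looks_like_shared_backend(iops_a, iops_b,
                              corr_threshold=-0.9, cv_threshold=0.05):
    """Hypothetical heuristic: flag two IOPS series that mirror each other.

    Returns True when the series are strongly negatively correlated AND
    their per-sample sum barely varies (low coefficient of variation).
    Both thresholds are arbitrary assumptions.
    """
    totals = [a + b for a, b in zip(iops_a, iops_b)]
    total_cv = pstdev(totals) / mean(totals)
    return pearson(iops_a, iops_b) <= corr_threshold and total_cv <= cv_threshold


# Made-up samples, one IOPS reading per interval for each VM.
vm_stock = [5200, 3100, 4800, 2900, 5100, 3300]
vm_custom = [2900, 4950, 3150, 5200, 3000, 4700]

if looks_like_shared_backend(vm_stock, vm_custom):
    print("The VMs may be competing for the same physical resource.")
```

A positive result does not prove that the disks share a drive; it is just a hint to go and inspect the storage layout before reading anything into the numbers.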