The moment you use virtual environments, your performance becomes virtual as well. It is totally dependent on your virtual environment's resource allocation, prioritization, cycle steals, and the manner in which the other virtual machines are using shared resources such as disk. Many of these environments are ''managed'' on an ongoing basis, and since load tests tend to be run at off hours and for only a small part of the week or month, the load test VMs tend to get ''managed'' onto overloaded hosts with a large number of similar VMs.
So:
- 1 virtual sec != 1 real sec. It can range from 0.9 to 1.2 secs.
- The faster the response time, the more likely its measurement is to be inaccurate.
- There will be higher-priority VMs competing with yours.
- Nightly disk scans or antivirus scans will kick off sometime during your biggest load tests.
- Task Manager will say 99% Idle, yet mouse movements will be choppy and apps will be in Slooow Moootion....
Essentially, you can run the same test 3 times in a row and get totally different performance and performance measurements out of the system.
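One rough way to see this run-to-run variation for yourself is to time the same fixed piece of work several times and look at the spread. The sketch below is only an illustration: `fixed_workload` is a hypothetical stand-in for a real test step, and the helper names are made up for this example. Keep in mind that the timer itself runs on the VM's clock, so the numbers are indicative at best.

```python
import statistics
import time


def timed_run(workload):
    """Return the wall-clock seconds one call to workload() takes."""
    start = time.perf_counter()
    workload()
    return time.perf_counter() - start


def spread(samples):
    """Summarize run-to-run variation for a list of timings in seconds."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {
        "mean": mean,
        "min": min(samples),
        "max": max(samples),
        # Coefficient of variation: standard deviation relative to the mean.
        "cv": stdev / mean if mean else 0.0,
    }


def fixed_workload():
    """Hypothetical stand-in for a test step: a fixed amount of CPU work."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


if __name__ == "__main__":
    # Same work, three times in a row, as described above.
    timings = [timed_run(fixed_workload) for _ in range(3)]
    stats = spread(timings)
    print("runs (s):", [round(t, 4) for t in timings])
    print(f"min {stats['min']:.4f}s, max {stats['max']:.4f}s, cv {stats['cv']:.1%}")
```

If the coefficient of variation for identical work is large on the VM, that is a hint that the host is contended and that your load test numbers will carry a similar amount of noise.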
Unfortunately, many companies have bought into the virtualization Kool-Aid, including for their load test environments, and don't seem to care that their measurements are compromised because of it. Stick to real hardware, including a local disk and real video and network adapters, if you can.
If you can't, make sure whoever manages the VMs assigns yours to lightly loaded hosting servers and scatters the load generators a bit across hosts. You can also have them go through an exercise of shutting down other VMs on the server hardware hosting your VMs.
Good Luck