Friday, July 10, 2015

Don't forget to check the HCL

A few weeks ago I had a customer who added some new hardware to their VMware environment. They ran into some major problems with vMotion.

Now you're probably thinking they purchased some commodity hardware that was completely unsupported. Actually, they purchased brand new HP DL380 Gen 9 servers. All the components in this server are on the HCL. In fact, the version of VMware that they were running is still fully supported by VMware.

So where did they go wrong?
If you check the VMware HCL for an HP DL380 Gen 9 server, you will see that it is supported for ESXi versions 5.1 - 6. My customer was running ESXi 5.0. They were able to install version 5.0 and get it running without issue. Where they ran into problems was with VMware EVC mode and vMotion.
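
To make the check concrete, here is a minimal sketch of the comparison the customer skipped. The HCL itself is a web lookup on VMware's site, not something you call from Python, so the SUPPORTED_ESXI table below is a hypothetical, hand-entered copy of the range quoted above, and the function names are my own.

```python
# Hypothetical, hand-entered copy of the range quoted in this post.
# The real HCL is a web lookup, not a Python API.
SUPPORTED_ESXI = {
    "HP DL380 Gen9": ((5, 1), (6, 0)),  # (lowest, highest) supported ESXi version
}


def parse_version(text):
    """Turn '5.5' or '6' into a (major, minor) tuple so versions compare correctly."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts[:2])


def is_supported(server, esxi_version):
    """Return True if esxi_version falls inside the listed range for server."""
    low, high = SUPPORTED_ESXI[server]
    return low <= parse_version(esxi_version) <= high


print(is_supported("HP DL380 Gen9", "5.0"))  # False: what the customer was running
print(is_supported("HP DL380 Gen9", "5.5"))  # True: what we upgraded to
```

Comparing tuples instead of raw strings avoids surprises such as "5.10" sorting before "5.2".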

What is EVC mode?
EVC mode is a software solution for a hardware problem. Intel and AMD are constantly adding new features and instruction sets with each new "family" or generation of CPUs.
This is a problem for VMware virtualization when you use vMotion. For vMotion to work, all servers in a cluster must have exactly the same CPU instruction sets available. If they don't, and a VM guest is using a newer instruction set, it cannot be moved to an older CPU that doesn't have that feature. The solution that VMware came up with is EVC mode. Turning this feature on in your cluster "masks" or hides the instruction sets of the newer CPUs and only allows the VM guests to see instructions that are the same across all the hosts in the cluster.
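
One way to picture the masking is as a set intersection: the guests only get to see the features that every host has in common. The sketch below is a simplified model, not how ESXi implements EVC. The feature lists are a small, incomplete sample of real instruction set extensions (AVX arrived with Sandy Bridge; AVX2, FMA, BMI1 and BMI2 with Haswell), and the host names and function names are made up for illustration.

```python
# Small, incomplete sample of CPU features per generation (illustration only).
SANDY_BRIDGE = frozenset({"sse4.2", "aes", "avx"})
HASWELL = SANDY_BRIDGE | {"avx2", "fma", "bmi1", "bmi2"}


def evc_baseline(hosts):
    """Features common to every host in the cluster (needs at least one host)."""
    return frozenset.intersection(*hosts.values())


def masked_features(host_features, baseline):
    """Features a host has but must hide from its guests."""
    return host_features - baseline


# Hypothetical two-host cluster: one older host, one newer host.
cluster = {"gen8-host": SANDY_BRIDGE, "gen9-host": HASWELL}
baseline = evc_baseline(cluster)

print(sorted(baseline))                             # what every guest is allowed to see
print(sorted(masked_features(HASWELL, baseline)))   # what the newer host has to hide
```

In this model the newer host hides avx2, bmi1, bmi2 and fma, so a guest never starts depending on something the older host can't provide.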

Back to the Customer's Issue
My customer was running one older HP DL380 Gen 8 server that had an Intel Sandy Bridge processor. EVC mode was enabled and running in "Sandy Bridge" mode; however, EVC mode in version 5.0 is not compatible with the new Intel Haswell processors. In fact, older versions of EVC don't even know about the new instruction sets that are available in Haswell or Ivy Bridge, so EVC had no way to mask or hide them. With EVC mode enabled it seemed to work for a little while, but the customer learned that they could only vMotion from the Gen 8 host to the Gen 9 host, not the other way around. It was a one-way trip. If they tried to move a VM guest from a new host to an old host, the migration failed with an error.
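
The one-way trip makes sense if you think of it as a subset check: a guest can only land on a host that offers everything the guest can already see. This is a rough sketch of that idea, assuming the masking did nothing on the Gen 9 host. It is not the actual compatibility check vCenter runs, the feature lists are a small sample, and all names are my own.

```python
# Small, incomplete sample of CPU features per host (illustration only).
GEN8_FEATURES = frozenset({"sse4.2", "aes", "avx"})             # Sandy Bridge
GEN9_FEATURES = GEN8_FEATURES | {"avx2", "fma", "bmi1", "bmi2"}  # Haswell


def can_migrate(guest_visible, destination_features):
    """A guest can move only if the destination has every feature the guest sees."""
    return guest_visible <= destination_features


# With masking not working, a guest sees whatever the host it booted on has.
guest_on_gen8 = GEN8_FEATURES
guest_on_gen9 = GEN9_FEATURES

print(can_migrate(guest_on_gen8, GEN9_FEATURES))  # True: old host to new host works
print(can_migrate(guest_on_gen9, GEN8_FEATURES))  # False: new host to old host fails
```

If the masking had worked, the guest on the Gen 9 host would have seen only the Gen 8 feature set, and the second check would pass as well.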

The Fix
After doing some research on the issue, I checked the HCL for the Gen 9 HP server and discovered that it was not compatible with ESXi 5.0. So we proceeded to upgrade Virtual Center and the ESXi hosts to version 5.5. After completing the upgrade, enabling EVC mode in the cluster again, and cold booting a handful of VMware guests, the issue was resolved.

When you're making any hardware changes, always check the HCL first.

VMware Hardware Compatibility List
