Frequent DAG failovers in a virtualized Microsoft Exchange environment on VMware vSphere

Running Microsoft Exchange in a virtualized environment provides extra flexibility and even increased availability when running in an HA configuration. This short article covers some extra tuning that might be necessary in your environment.

The environment I’m talking about consists of two virtual Exchange 2013 servers running on VMware vSphere 5.5. Storage is provided by an iSCSI-based array, compute by HP Gen6 Intel blades.

Ever since these servers have been running, the Microsoft Cluster Service has been triggering failovers, moving all the active mailbox databases over to the second Exchange server. A snapshot creation task seems to be what triggers these failovers. As we are using Veeam for backups, we contacted them to ask whether there is a workaround for this issue.

Veeam released the following KB article, which explains how to decrease the cluster sensitivity and prevent the failovers from happening. In our case, these settings sadly didn’t solve our issues.
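To get a feel for what "decreasing the cluster sensitivity" means in practice, here is a small Python sketch. It assumes the sensitivity is governed by the cluster's SameSubnetDelay property (the heartbeat interval, 1000 ms by default) and its SameSubnetThreshold property (the number of missed heartbeats that is tolerated). The threshold of 5 and the raised values are my own illustrative assumptions, not numbers quoted from the Veeam KB article.

```python
def heartbeat_tolerance_ms(delay_ms: int, threshold: int) -> int:
    """Roughly how long a node may stay silent before the cluster
    considers it down: heartbeat interval times missed heartbeats."""
    return delay_ms * threshold


# Default heartbeat interval; the threshold of 5 is an assumption.
default_window = heartbeat_tolerance_ms(delay_ms=1000, threshold=5)

# Hypothetical relaxed setting: only the heartbeat interval is doubled.
relaxed_window = heartbeat_tolerance_ms(delay_ms=2000, threshold=5)

print(f"default window: {default_window} ms")
print(f"relaxed window: {relaxed_window} ms")
```

The point of the sketch is that a short network stall, for example while a snapshot is being created, only causes a failover when it lasts longer than this window, so raising either value buys the node more time.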

The problem seems to be dropped network packets inside the guest OS. According to this KB article by VMware, there are some issues with the VMXNET3 NIC on systems that have high traffic bursts (like Exchange).

For now, the settings from the VMware KB article seem to have solved our issue and no more failovers are happening, but if the problem arises again, I will definitely update this article.

Hopefully the possible solutions by Veeam and VMware can help you if you run into the same issue.

Got feedback? Please leave it below!

VMworld 2013: Successfully Virtualize Microsoft Exchange Server #VAPP5613

Alex Fontana, creator of the VMware documentation about virtualizing Microsoft Exchange products, walked the audience through the best practices and the pitfalls you can run into.

Some of the information was already presented last year in Barcelona, but the session still provided useful information which I will pass on to my Exchange friends. My highlights of this session:

  • Use EagerZeroThick virtual disks instead of thin provisioned virtual disks. I believe this only applies to the disks hosting mailbox databases. The reason is that there is a small performance penalty when writing new blocks to a thin provisioned disk.
  • When you have multiple virtual disks in your virtual Exchange server, configure multiple vSCSI adapters (4 is the max, so with 4 or more virtual disks, use them all and split the virtual disks among the vSCSI adapters). During testing, this brought the Exchange latency down from 60 ms to around 10 ms.
  • Exchange latency should ideally be under 10 ms and should never be more than 20 ms.
  • When using vMotion, make sure you use Jumbo Frames on the vMotion network. When this is not possible, set the ‘SameSubnetDelay’ parameter on the DAG cluster to 2000 ms instead of the default 1000 ms.
  • Load balancing using the vCloud Networking & Security Edge is supported and works fine. No need to use a hardware or third-party load balancer.
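
The advice about splitting virtual disks among the vSCSI adapters can be sketched in a few lines of Python. This is only an illustration of a simple round-robin split, assuming all disks are equally busy; the function name and the disk names are hypothetical, and nothing here talks to vSphere.

```python
from collections import defaultdict

MAX_VSCSI_ADAPTERS = 4  # maximum number of vSCSI adapters per VM, as mentioned in the session


def spread_disks(disks, max_adapters=MAX_VSCSI_ADAPTERS):
    """Assign disks round-robin to vSCSI adapters 0..n-1.

    Uses one adapter per disk until the adapter maximum is reached,
    after which the remaining disks wrap around to adapter 0 again.
    """
    adapters = min(len(disks), max_adapters)
    layout = defaultdict(list)
    for index, disk in enumerate(disks):
        layout[index % adapters].append(disk)
    return dict(layout)


# Hypothetical disk names for a mailbox server with six data disks.
disks = ["db01.vmdk", "db02.vmdk", "db03.vmdk",
         "db04.vmdk", "log01.vmdk", "log02.vmdk"]

for adapter, attached in spread_disks(disks).items():
    print(f"vSCSI adapter {adapter}: {', '.join(attached)}")
```

In a real environment you would probably weigh the disks by their I/O load rather than just counting them, but the idea stays the same: no single adapter queue should have to carry all the mailbox database traffic.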

A nice session by an experienced speaker, thanks Alex!