I want to clarify the process for performing maintenance on an ESXi host in a vCenter cluster while it still has active virtual machines on it. My experience with this type of maintenance is limited to Horizon View virtual desktops, but it likely applies to clusters hosting virtual servers as well.
The relevant VMware KB article is Replacing an ESX host in a cluster that has View Composer linked-clone pool installed (1015292):
“To replace a ESX host in a cluster with deployed View Desktops:
- Prepare the new ESX host outside the cluster, verify all datastores that are available to the old ESX hosts in the cluster are accessible to the new ESX host.
- Put the old ESX host in Maintenance Mode from the vCenter Server GUI.
- Ensure that you have selected the Move powered off and suspended virtual machines to other hosts in the cluster option.
- The running virtual machines migrate to other ESX hosts in the cluster, shutdown and suspended virtual machines and replicas are moved as well. The ESX host enters Maintenance Mode. Note: If the Replica virtual machine did not migrate then you need to unprotect it. For instructions on unprotecting the replica virtual machine, see Cannot remove source and replica virtual machines associated with View Composer desktop pools (1008704).
- Click the ESX host
- Click the Virtual Machines tab to verify that all virtual machines and replicas have been moved.
- In View 4.5 and later, if you have unprotected the replica virtual machine then you need to re-protect it. For more information, see Re-protecting a View replica virtual machine (2015006).
- Remove the ESX host in Maintenance Mode from the cluster.”
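The first step in the list above (confirming the new host can see every datastore the existing hosts use) is easy to get wrong by eye. Here is a minimal sketch of that comparison in plain Python. The host and datastore names are made up for illustration, and the inventory is typed in by hand rather than pulled from vCenter, so treat it as a checklist aid rather than a real integration.

```python
def missing_datastores(cluster_hosts, new_host_datastores):
    """Return datastores visible to any existing host but not to the new host.

    cluster_hosts: dict mapping host name -> list of datastore names
    new_host_datastores: list of datastore names the new host can access
    """
    required = set()
    for datastores in cluster_hosts.values():
        required |= set(datastores)
    return sorted(required - set(new_host_datastores))


# Hypothetical inventory, entered by hand for this example.
cluster = {
    "esx01": ["ds-replica", "ds-linked-01"],
    "esx02": ["ds-replica", "ds-linked-01", "ds-linked-02"],
}
new_host = ["ds-replica", "ds-linked-01"]

print(missing_datastores(cluster, new_host))  # ['ds-linked-02']
```

An empty list means the new host sees everything the old hosts do. Anything listed should be presented to the new host before you move on.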
So, in summary, you should be able to put the host into Maintenance Mode and, if you have the cluster configured properly, the virtual desktops that are on it should be migrated over to other available hosts in the cluster.
If you have trouble putting an ESX/ESXi host into maintenance mode, check out ESXi/ESX host fails to go into maintenance mode (1036167).
One specific instance I’d like to note from personal experience: “If an ESXi/ESX host is a part of VMware High Availability (HA) or DRS cluster, check the Admission Control settings. You may have to disable this option if there are not enough resources to ensure fail over capacity.”
This refers to the Admission Control setting in your cluster’s vSphere HA configuration.
For a more detailed explanation of Admission Control, see VMware HA Admission Control. The important parts pertaining to our topic of conversation:
“vCenter Server uses admission control to ensure that sufficient resources are available in a cluster to provide failover protection and to ensure that virtual machine resource reservations are respected.”
“Admission control imposes constraints on resource usage and any action that would violate these constraints is not permitted. Examples of actions that could be disallowed include the following:
- Powering on a virtual machine.
- Migrating a virtual machine onto a host or into a cluster or resource pool.
- Increasing the CPU or memory reservation of a virtual machine.”
The second bullet is the important one here: for a host to enter Maintenance Mode, all of the VMs assigned to that host must first be migrated to other available hosts in the cluster or powered off. If Admission Control is enabled on your cluster and a migration would violate the HA failover capacity check, the migrations will stop and you’ll likely get an error that the host could not enter maintenance mode.
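To make the failover capacity check more concrete, here is a deliberately simplified model in Python. It assumes a percentage-based policy and looks only at memory reservations. The real HA calculation also accounts for CPU, overhead and the specific policy you have chosen, so this is an illustration of the idea and not VMware’s actual algorithm. All names and numbers are hypothetical.

```python
def failover_capacity_pct(host_memory_gb, reserved_gb):
    """Percentage of cluster memory left unreserved (simplified model)."""
    usable = sum(host_memory_gb.values())
    if usable == 0:
        return 0.0
    return max(0.0, (usable - reserved_gb) / usable * 100)


def can_evacuate(host_memory_gb, leaving, reserved_gb, required_pct):
    """True if the cluster still meets required_pct once `leaving` is gone."""
    remaining = {name: gb for name, gb in host_memory_gb.items() if name != leaving}
    return failover_capacity_pct(remaining, reserved_gb) >= required_pct


# Hypothetical cluster: four hosts with 128 GB each, 300 GB reserved by VMs.
hosts = {"esx01": 128, "esx02": 128, "esx03": 128, "esx04": 128}

print(failover_capacity_pct(hosts, 300) >= 25)   # True: all four hosts present
print(can_evacuate(hosts, "esx04", 300, 25))     # False: 21.875% left, below 25%
print(can_evacuate(hosts, "esx04", 300, 20))     # True with a lower threshold
```

In this model the cluster is healthy with all four hosts, but removing one host drops the unreserved share below the configured threshold, which is the situation where the migrations stop.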
At this point, you basically have two options: modify your Admission Control Policy settings to provide more failover capacity, or shut down non-essential VMs on the host entering maintenance mode and disable Admission Control. Use caution with either approach to avoid overwhelming your other hosts once the VMs have been migrated. If you disable AC, remember to enable it again once your maintenance has finished.
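Before taking either route, it helps to estimate how much load the remaining hosts will have to absorb. The sketch below is a rough back-of-the-envelope calculation in Python, assuming memory is the constraint and that you pick your own target utilization. The function name, host names and figures are hypothetical.

```python
def memory_to_free_gb(capacity_gb, used_gb, leaving, target_util):
    """Estimate how much VM memory must be shut down before evacuating a host.

    capacity_gb / used_gb: dicts mapping host name -> GB
    leaving: name of the host entering maintenance mode
    target_util: highest average utilization you accept on remaining hosts (0-1)
    """
    remaining_capacity = sum(gb for name, gb in capacity_gb.items() if name != leaving)
    total_demand = sum(used_gb.values())  # VMs from the leaving host move over too
    allowed = remaining_capacity * target_util
    return max(0.0, total_demand - allowed)


# Hypothetical three-host cluster, 128 GB per host.
capacity = {"esx01": 128, "esx02": 128, "esx03": 128}
used = {"esx01": 90, "esx02": 80, "esx03": 100}

print(memory_to_free_gb(capacity, used, "esx03", 0.75))  # 78.0
```

A result of 0.0 means the remaining hosts can take the full load within your target. Anything above that is roughly how much non-essential VM memory you would need to power off first.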
For some good resources on vSphere HA, check out Yellow-Bricks HA DeepDive and HA cluster configuration: Requirements and steps.