This post was prompted by some questions I had about what happens when the USB stick or SD memory card running the ESXi hypervisor dies. What happens next? How can you monitor for this, and how do you recover from this situation?
Since I started using ESX in 2009, a lot of things have changed, including the storage devices used to install ESX(i) on. This post is dedicated to these storage devices and how they affect availability and manageability.
Before software-defined compute, networking and storage became hot topics, a traditional physical server used for virtualization usually consumed two physical drives in a RAID-1 configuration. You would deploy ESX(i) on them and have plenty of spare capacity, since ESX only used about 2 GB (versus roughly 150 MB for ESXi); the spare capacity was automatically used to create a VMFS datastore. A storage array was used for VM placement (no fancy VSAN back then!).
This approach costs you at least two physical drives (since you don’t want a failed disk to impact your running system) and a storage controller, all of which require power, cooling and maintenance.
As ESX(i) evolved, the reasons for deploying on local physical disks became fewer and fewer. Booting from SAN (iSCSI or FC), PXE (Auto Deploy), USB sticks or SD cards are additional ways of booting your ESXi hosts and can reduce cost. Personally, I have no experience with booting ESXi from a SAN, but I do have experience with booting from local SAS disks, USB sticks and SD memory cards.
To get back to the original question: what happens when your ESXi boot device dies? Simply put: nothing! The OS is loaded into memory during boot, so a failing boot device has no immediate impact on a running ESXi host. When you reboot the host, however, it no longer has a working boot device, so it will fail to boot.
When this happens, you should replace the boot device as soon as possible: migrate any running VMs off the ESXi host, put it into maintenance mode and shut it down so you can replace the faulty device. Afterwards you need to redeploy ESXi (from an ISO file, a bootable USB device or Auto Deploy).
But how would you know the device has failed? When using spinning disks in your ESXi host, they appear as disk devices in ESXi and are monitored by it, so a failed disk triggers a warning or critical state on the host. With removable boot devices, there is no local VMFS volume being monitored. I always wondered how to monitor these devices, since they are no longer actively used once ESXi has been loaded into RAM.
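One low-tech way to check for yourself is to scan the host's vmkernel.log for boot-device errors. The sketch below is a minimal example of that idea in Python; note that the message patterns are assumptions based on commonly reported symptoms (such as the "Bootbank cannot be found" alert) and the exact wording varies between ESXi releases, so treat the list as a starting point rather than something exhaustive.

```python
import re

# Patterns that commonly show up in vmkernel.log when a USB/SD boot
# device drops off. These are assumptions based on reported symptoms,
# not an official or exhaustive list.
FAILURE_PATTERNS = [
    re.compile(r"Bootbank cannot be found", re.IGNORECASE),
    re.compile(r"lost connectivity to .* boot", re.IGNORECASE),
]

def find_boot_device_errors(log_text):
    """Return all log lines matching a suspected boot-device failure."""
    return [
        line
        for line in log_text.splitlines()
        if any(p.search(line) for p in FAILURE_PATTERNS)
    ]

# Fabricated log snippet for illustration only:
sample = (
    "2015-06-01T10:15:02Z vmkernel: ALERT: Bootbank cannot be found "
    "at path '/bootbank'\n"
    "2015-06-01T10:15:03Z vmkernel: routine heartbeat message\n"
)
print(find_boot_device_errors(sample))
```

In practice you would feed it the contents of /var/log/vmkernel.log (or point your syslog collector at the same patterns), but relying on your hardware vendor's agents, as described below, is the more robust option.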
Recently, one of the SD cards in an ESXi host died, and my question was answered: the device is indeed monitored by ESXi, and monitoring tools like HP SIM also report a defective boot device. See the screenshots below.
To sum everything up: installing ESXi on removable media is perfectly fine, and purchasing expensive spinning disks, which consume more power and, in my opinion, require more maintenance, is unnecessary. Do you have different experiences with removable media or spinning disks? Let me know so I can update this post accordingly.
Thanks for reading!