I’ve just begun to pull together some interesting data on a series of Linux VM crashes I’ve seen. I don’t have a resolution yet, but some interesting patterns have emerged.
A CentOS 4.x or 5.x guest will crash with a message similar to the following on its console:
[<f883b299>] .text.lock.scsi_error+0x19/0x34 [scsi_mod]
[<f88c19ce>] mptscsih_io_done+0x5ee/0x608 [mptscsi] (…)
RIP [<ffffffff8014c562>] list_del+0x48/0x71 RSP <ffffffff80425d00> <0>Kernel Panic - not syncing: Fatal exception
A hard reset (i.e. pressing the reset button on the VM’s console) is required to reboot the guest.
There have been a few posts elsewhere discussing file-level recovery for Linux VMs on NetApp NFS datastores, but none that have dealt specifically with Linux LVM-encapsulated partitions.
Here’s our in-house procedure for recovery; note that we do not have FlexClone licensed on our filers.
I’ve written about running VMware ESX with VM swap on an NFS datastore previously – specifically whether or not it was supported/recommended:
After writing the second post, I thought the issue was pretty much resolved: From multiple sources, the consensus seemed to be that running ESX with VM swap on NFS would be fine. Imagine my surprise (and disappointment) at seeing the following VMware KB article 1008091, updated yesterday: An ESX virtual machine on NFS fails with swap errors. Further details are in the article itself, but VMware’s KB site is throwing intermittent errors for me at the moment, so I’ll provide the money quote:
The reliability of the virtual machine can be improved by relocating the swap file location to a non-NFS datastore. Either SAN or local storage datastores improve virtual machine stability.
Paul Manning, from VMware, in response to a question I asked in the VI:OPS forums:
The current best practice for NFS is to not seperate the VM swap space from the VMhome directory on a NFS datastore. The reason for the originial recommendation was just good old fashioned conservitiveness.
More at the forum post, including more on the reasoning for the old recommendation of separating swap when using NFS – thanks, Paul, you made my day.
Scott Lowe recently linked to a VMware KB article entitled Storing swap files on VMFS when running virtual machines from NFS. The article (from 3/31/2008) is perhaps the latest word from VMware in the frustrating back-and-forth on whether placing an ESX VM’s swap on NFS is acceptable or not.