Interesting Linux VM Crash Pattern
I’ve just begun to pull together some interesting data on a series of Linux VM crashes I’ve seen. I don’t have a resolution yet, but some interesting patterns have emerged.
A CentOS 4.x or 5.x guest will crash with a message similar to the following on its console:
[<f883b299>] .text.lock.scsi_error+0x19/0x34 [scsi_mod]
[<f88c19ce>] mptscsih_io_done+0x5ee/0x608 [mptscsi] (…)
RIP [<ffffffff8014c562>] list_del+0x48/0x71 RSP <ffffffff80425d00> <0>Kernel Panic - not syncing: Fatal exception
A hard reset (i.e. pressing the reset button on the VM’s console) is required to reboot the guest.
VMware/NFS/NetApp SnapRestore/Linux LVM Single File Recovery Notes
There have been a few posts elsewhere discussing file-level recovery for Linux VMs on NetApp NFS datastores, but none that have dealt specifically with Linux LVM-encapsulated partitions.
Here’s our in-house procedure for recovery; note that we do not have FlexClone licensed on our filers.
ESX VM swap on NFS: If it crashes, try something else
I’ve written about running VMware ESX with VM swap on an NFS datastore previously – specifically whether or not it was supported/recommended:
After writing the second post, I thought the issue was pretty much resolved: From multiple sources, the consensus seemed to be that running ESX with VM swap on NFS would be fine. Imagine my surprise (and disappointment) at seeing the following VMware KB article 1008091, updated yesterday: An ESX virtual machine on NFS fails with swap errors. Further details are in the article itself, but VMware’s KB site is throwing intermittent errors for me at the moment, so I’ll provide the money quote:
The reliability of the virtual machine can be improved by relocating the swap file location to a non-NFS datastore. Either SAN or local storage datastores improve virtual machine stability.
VMware about ESX swap on NFS: It’s okay
Paul Manning, from VMware, in response to a question I asked in the VI:OPS forums:
The current best practice for NFS is to not seperate the VM swap space from the VMhome directory on a NFS datastore. The reason for the originial recommendation was just good old fashioned conservitiveness.
More at the forum post, including more on the reasoning for the old recommendation of separating swap when using NFS – thanks, Paul, you made my day.
ESX Swap on NFS or Not?
Scott Lowe recently linked to a VMware KB article entitled Storing swap files on VMFS when running virtual machines from NFS. The article (from 3/31/2008) is perhaps the latest word from VMware in the frustrating back-and-forth on whether placing an ESX VM’s swap on NFS is acceptable or not.
VMware’s Comparison of Storage Protocol Performance
VMware has just released a paper entitled Comparison of Storage Protocol Performance (seen at Scale the Mind and blog.scottlowe.org); maybe this will help deflate some of the too-often repeated speculation that NFS is too slow for VMware ESX.