Hyper-V – Fixing Broken Volume GUIDs
Yesterday I had an interesting problem I had to fix. As you may or may not know, using Hyper-V within Failover Clustering sometimes requires the use of volume GUIDs for storage if you have more LUNs than drive letters available (like we do). What you may not know is that these GUIDs can, under some circumstances, change – completely screwing up Failover Clustering’s ability to move virtual machines between nodes.
Now, firstly – Microsoft have issued a hotfix to resolve the issue where GUIDs change. It’s post-SP2 fix KB970529 – The Volume GUID may unexpectedly change after a volume is extended on a Windows Server 2008 failover cluster node. So, if you haven’t suffered with this, apply this patch on your Windows 2008 cluster nodes now and save yourself some hassle!
However if, like me, your disks have been expanded before applying this patch and you now have either semi-working (your virtual machine works on only one or a small subset of your nodes) or completely failed virtual machines, keep reading – this is how I came to fix the issue.
First off, look at your virtual machine’s resources in Failover Cluster manager and note down the assigned storage’s volume GUID (it’ll display as Volume: \\?\Volume{some_numbers_and_letters}).

Now, to check if you’re affected by this problem, go into Hyper-V Manager and check the settings of your virtual machine’s hard disk – you’ll want to check the volume GUID matches what Failover Cluster manager thinks…
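If you'd rather not compare two long GUIDs by eye, here's a minimal Python sketch of the check. The function names are my own invention for illustration, and it assumes you've copied both paths out of the two consoles as plain text; it doesn't talk to Hyper-V or the cluster at all.

```python
import re

# Matches the {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} part of a volume GUID path.
GUID_RE = re.compile(r"\{([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})\}")


def volume_guid(path):
    """Return the lower-cased volume GUID found in path, or None if there is none."""
    match = GUID_RE.search(path)
    return match.group(1).lower() if match else None


def guids_match(cluster_path, hyperv_path):
    """True only when both paths contain a volume GUID and the GUIDs are identical."""
    a = volume_guid(cluster_path)
    b = volume_guid(hyperv_path)
    return a is not None and a == b
```

Pass it the path Failover Cluster manager shows and the path from the virtual machine's hard disk settings; a result of False means the two disagree (or one of the paths has no GUID in it).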

So, if it matches, sorry, but this isn’t going to fix your problem – there’s another cause to this. However, if they don’t match, here’s where the fun starts. The first problem you have is that the settings in the cluster are wrong. Great! However, this is relatively easily fixed, if you don’t mind using the command line.
Firstly, shut the virtual machine down and take the configuration offline. You can safely leave the storage online though (and in fact for the next bit to work you may actually need to). Log on to the node that currently holds the virtual machine (this works on both core and full installations, by the way) and run the following command:
cluster res "Virtual Machine Configuration Resource" /priv VmStoreRootPath="\\?\Volume{Volume GUID}\Virtual Machine folder"
So, for example, let’s say I have a Virtual Machine called "Rob’s Special VM", which Failover Cluster manager is telling me resides on GUID \\?\Volume{12345678-9012-3456-7890-123456789012}. The command would be:
cluster res "Rob's Special VM Configuration" /priv VmStoreRootPath="\\?\Volume{12345678-9012-3456-7890-123456789012}\Rob's Special VM"
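If you have a pile of virtual machines to fix, typing that command out by hand gets old quickly. Below is a small Python sketch that only assembles the command line as a string; it doesn't run anything. The function name is hypothetical, and it assumes the configuration resource is called "<VM name> Configuration" as in my example above, which may not hold if your resources have been renamed.

```python
def build_fix_command(vm_name, volume_guid, vm_folder):
    """Assemble the cluster.exe command line that resets VmStoreRootPath.

    Assumption: the configuration resource is named "<vm_name> Configuration".
    """
    for value in (vm_name, volume_guid, vm_folder):
        if '"' in value:
            # A double quote would break the quoting of the command line.
            raise ValueError("double quotes are not supported: %r" % value)
    resource = vm_name + " Configuration"
    # Volume GUID paths look like \\?\Volume{GUID}\folder
    store_path = "\\\\?\\Volume{" + volume_guid + "}\\" + vm_folder
    return 'cluster res "' + resource + '" /priv VmStoreRootPath="' + store_path + '"'


if __name__ == "__main__":
    print(build_fix_command(
        "Rob's Special VM",
        "12345678-9012-3456-7890-123456789012",
        "Rob's Special VM",
    ))
```

Check the printed line against what Failover Cluster manager shows before pasting it into a command prompt on the node.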
Once you’ve done this, bring the configuration back online, but don’t boot the virtual machine. At this point you might encounter a bit of an issue. I did, anyway. In Failover Cluster manager, when attempting to bring the configuration online, it fails. Not 100% sure why this happens, but suffice to say it’s Hyper-V and Failover Cluster manager playing silly buggers. If it fails, check the event log on the node that failed. There will probably be an error message like this:
The Hyper-V Virtual Machine Management service failed to register the configuration for the virtual machine ‘abcdefab-cdef-abcd-efab-cdefabcdefab’ at ‘\\?\Volume{12345678-9012-3456-7890-123456789012}\Rob’s Special VM’: Cannot create a file when that file already exists. (0x800700B7)
The problem here is that Failover Cluster and/or Hyper-V haven’t unregistered the old configuration file. Now, taking the virtual machine ID (in this example, abcdefab-cdef-abcd-efab-cdefabcdefab), navigate to the folder C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines and delete the file abcdefab-cdef-abcd-efab-cdefabcdefab.xml. You may need to do this on every node. Once this is done, hey presto, the configuration should now come online through Failover Cluster manager.
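Since you may have to repeat this on every node, here's a rough Python sketch of the clean-up. The function names are hypothetical, and it assumes the virtual machine ID is the first GUID in the error text (as it is in the message above). It checks for a symlink as well as a regular file, because the stale entry may point at a path that no longer exists.

```python
import re
from pathlib import Path

# Folder named in the post; pass a different directory to try the sketch safely.
DEFAULT_DIR = Path(r"C:\ProgramData\Microsoft\Windows\Hyper-V\Virtual Machines")

VM_ID_RE = re.compile(r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def vm_id_from_error(message):
    """Return the first GUID in the event log message, or None.

    Assumption: the virtual machine ID appears before the volume GUID.
    """
    match = VM_ID_RE.search(message)
    return match.group(0) if match else None


def remove_stale_config(vm_id, directory=DEFAULT_DIR):
    """Delete <vm_id>.xml from directory; return True if something was removed."""
    target = Path(directory) / (vm_id + ".xml")
    # is_symlink() catches a dangling link, which exists() would report as missing.
    if target.is_symlink() or target.exists():
        target.unlink()
        return True
    return False
```

Run it from an elevated session on each node, and only once the configuration resource is offline.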
But wait! You’re not finished yet! Remember earlier, in your Hyper-V configuration file? The volume GUID wasn’t right there, either! This is simple to change. Back to your Hyper-V configuration screen, you’ll be able to either change the path to the new location, or remove and re-add that disk from the new path. I chose the latter option; in the past I have had issues with permissions on the VHD file so removing it, clicking ‘apply’ and re-adding it circumvents this issue. Finally, you’ll need to change the "Snapshot Location" too as this uses the same GUID as the config disk.
Hope this saves someone else a day of hassle!
Oh, and one site that really helped me out: Hyper-V notes from the field. Although it wasn’t the answer to my particular problem, it pointed me in the right direction for the Hyper-V symlink files, which led me on to working out they were stored by the cluster service… which led me to fixing my problem. So thanks for that!