In a vSphere Metro Storage Cluster, the shared storage layer often has a 'fallback' side: the side whose LUN becomes authoritative for reading and writing in case of a site failure or a split brain.
This makes VM storage placement on the correct Datastores rather important from an availability perspective.
Until now, you had to manage intelligent VM storage-placement decisions yourself. And if you wanted to align 'compute' (where the VM is running) with where its storage falls back, you also had to take care of that yourself through some kind of automation or scripting.
This problem would be compounded if you also wanted to logically group these storage 'sides' into SDRS clusters, which you often do, especially if you have many datastores.
In the past few years, mostly in regard to vSAN and vVOLs, VMware have been pushing the use of Storage Policies, and getting us thinking towards a model of VM-policy based storage management.
Wouldn't it be great if you could leverage the new Storage Policies, to take care of your metro-cluster datastore placement? For example, by tagging datastores, and building a policy around that.
And what if you could get SDRS to automate and enforce these policy-based placement rules?
The EnforceStorageProfiles advanced setting introduced in 6.0U2 seemed to promise to do this.
However, while messing around with Storage Policies, Tagging and in particular that EnforceStorageProfiles advanced setting, I encountered some inconsistent and unexpected GUI and enforcement behavior that shows we are just not quite there yet.
This post details my findings from the lab.
The summary is as follows:
It appears that if you mix different self-tagged storage capabilities inside a storage-cluster, the cluster itself will not pass the Storage Policy compatibility check on any policy that checks for a tag that is not applied to all datastores in that cluster.
Only if all the datastores inside the storage-cluster share the same tag will the cluster report itself as compatible.
This is despite applying that tag to the storage-cluster object itself! It appears that adding or not adding these tags to the storage-cluster object has no discernible effect on the Storage Compatibility check of the policy.
This contradicts the stated purpose and potential usefulness of the EnforceStorageProfiles advanced setting.
However, individual datastores inside the storage-cluster will correctly be detected as compliant or non-compliant based on custom tags.
The failure of the compatibility check on the storage-cluster will not stop you from provisioning a new VM to that datastore cluster, but the compatibility warnings you get only apply to one or more underlying non-compatible datastores. It does not tell you which ones, though, so that can be confusing.
The advanced setting EnforceStorageProfiles will affect storage-cluster initial placement recommendations, but will not result in SDRS movements on its own when the value is set to 1 (soft enforcement).
Even EnforceStorageProfiles=2 (hard enforce) does not make SDRS automatically move a VM's storage from non-compatible to compatible datastores in the datastore-cluster. It seems to only affect initial placement. This appears to contradict the way the setting is described to function.
However, even soft enforcement will stop you from moving a VM manually to a non-compliant datastore within that storage-cluster, even though you specified an SDRS override for that VM. That is unexpected, and the kind of behavior one would only expect with a 'hard' enforce.
This may mean that while SDRS will not, of its own accord, move an already placed VM to the correct storage after the fact, it will at least prevent the VM from moving to incorrect storage.
Summed up, that means that as long as you get your initial placement right, EnforceStorageProfiles will make sure the VM's storage at least stays there. But it won't leverage SDRS to fix placements, as the setting appears to have been meant to.
Now for the details and examples:
I have 4 Datastores in my SDRS cluster:
I have applied various tags to these datastore objects. For example, the datastores whose names start with 'store1' received the following tags:
The datastores whose names start with 'store2' received the following tags:
The crucial difference here is the tag "Equalogic Store 1" vs "Equalogic Store 2".
In this default situation, the SDRS Datastore Cluster itself has no storage tags applied at all.
I have created a Storage Policy that is meant to match with datastores with the "Equalogic Store 2" tag. The idea here is that I can assign this policy to VMs, so that inside that datastore cluster those VMs will always reside on 'Store2' datastores and not on 'Store1' datastores.
I plan to have SDRS (soft) enforce this placement using the advanced option EnforceStorageProfiles=1, introduced in vSphere vCenter Server 6.0.0b.
The match for 'Equalogic Store 2' is the only rule in this policy.
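To make the intent of this single-rule policy concrete, here is a minimal Python sketch of the matching I am after. It is only a model of the idea, not of how vCenter evaluates policies: the `Datastore` class, the `matches_policy` function and the four datastore names are hypothetical stand-ins, and only the two tag names come from my lab.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Datastore:
    """Hypothetical stand-in for a tagged datastore (not a vSphere API type)."""
    name: str
    tags: Set[str] = field(default_factory=set)


def matches_policy(ds: Datastore, required_tag: str) -> bool:
    """A single-rule tag policy matches when the datastore carries the tag."""
    return required_tag in ds.tags


# Datastore names are made up for illustration; only the tag names are real.
datastores = [
    Datastore("store1-a", {"Equalogic Store 1"}),
    Datastore("store1-b", {"Equalogic Store 1"}),
    Datastore("store2-a", {"Equalogic Store 2"}),
    Datastore("store2-b", {"Equalogic Store 2"}),
]

# The policy is meant to single out the 'store2' side of the cluster.
compliant = [ds.name for ds in datastores
             if matches_policy(ds, "Equalogic Store 2")]
```

With this layout, `compliant` holds only the two 'store2' datastores, which is the set I want VMs with this policy to stay on.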
But when I check the storage compatibility, neither the datastores that have that tag nor the datastore cluster object shows up under the 'Compatible' listing.
However, under the 'Incompatible' listing, the Cluster shows up as follows:
Notice how the SDRS Cluster object appears to have 'inherited' the error conditions of both Datastores that do not have the tag.
This was unexpected.
In the available documentation for VM Storage Policies, I have not found any reference to SDRS Clusters directly. My main reference here is Chapter 20 of the vsphere-esxi-vcenter-server-601-storage-guide. Throughout the documentation, only datastore objects themselves are referenced.
The end of chapter 8 of the vsphere-esxi-vcenter-server-601-storage-guide, 'Storage DRS Integration with Storage Profiles', explains the use of the EnforceStorageProfiles advanced setting.
The odd thing is, the documentation for the PbmPlacementSolver data object (which I assume the Storage Policy placement checker is utilizing) even explicitly states that a storage pod (SDRS Cluster) is a valid 'Hub' for checking against.
But it seems that when the 'hub' is an SDRS cluster, it will produce an error for every underlying datastore that throws an error. With mixed-capability datastores in a single SDRS Cluster, depending on how specific your storage profile is, chances are it will always throw an error.
So this seems contradictory! How can we have an SDRS advanced setting that operates on a per-datastore basis, while the cluster object will likely always stop the compatibility check from succeeding?
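The cluster-level behavior I observed can be summed up in a few lines of Python. This is a sketch of what the lab showed, not of how vCenter implements the check: the function name, its return shape and the datastore names are my own assumptions.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def cluster_compatibility(
    datastore_tags: Dict[str, Set[str]],
    required_tag: str,
    cluster_tags: FrozenSet[str] = frozenset(),
) -> Tuple[bool, List[str]]:
    """Model of the observed check on an SDRS cluster 'hub'.

    The cluster surfaces one error per member datastore that lacks the tag,
    so it is only compatible when no member fails. cluster_tags is accepted
    but deliberately ignored, mirroring the lab result that tags on the
    cluster object itself had no effect.
    """
    offenders = sorted(
        name for name, tags in datastore_tags.items()
        if required_tag not in tags
    )
    return (not offenders, offenders)


# Hypothetical datastore names; only the tag names come from the lab.
mixed = {
    "store1-a": {"Equalogic Store 1"},
    "store1-b": {"Equalogic Store 1"},
    "store2-a": {"Equalogic Store 2"},
    "store2-b": {"Equalogic Store 2"},
}

mixed_result = cluster_compatibility(mixed, "Equalogic Store 2")
uniform_result = cluster_compatibility(
    {"store2-a": {"Equalogic Store 2"}, "store2-b": {"Equalogic Store 2"}},
    "Equalogic Store 2",
)
```

For the mixed cluster the model reports incompatibility with two offending datastores, which lines up with the cluster inheriting two error conditions in the 'Incompatible' listing.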
As a possible workaround for these errors, I tried applying tags to the SDRS Cluster itself. I applied both the "Equalogic Store 1" and "Equalogic Store 2" tags to the SDRS Cluster object. The idea was that the compatibility check of the storage policy would then never fail to match on either of these tags.
But alas, it seems to ignore tags you set on the SDRS Cluster itself.
Anyway, it's throwing an error, but is it really stopping SDRS from taking the policy into account, or not?
Testing SDRS Behaviors
Provision a new VM
When you select the SDRS Cluster, it throws the compatibility warning twice, without telling you which underlying datastores it is warning you about. That is not very useful!
However, it will deploy the VM without any issue.
When we check the VM, we can see that it has indeed placed the VM on a compatible Datastore.
Manual Storage-vmotion to non-compliant datastore
In order to force a specific target datastore inside an SDRS Cluster, check the 'Disable Storage DRS for this virtual machine' checkbox. This will create an override rule for this VM specifically. When we do this and select a non-compatible datastore, it throws a warning, as we might expect. But as I have chosen to override SDRS recommendations completely here, I expect to be able to just power on through this selection.
No such luck. Remember that EnforceStorageProfiles is still set to only '1', which is a soft enforcement. This is not the kind of behavior I expect from a 'soft' enforcement, especially not when I just specified that I wanted to ignore SDRS placement recommendations altogether!
I should be able to ignore these warnings, for the reasons stated above. It's a bit inconsistent that I am still prevented from overriding!
There are 2 ways around this.
First of all you can momentarily turn off SDRS completely.
You must now choose a datastore manually. Selecting the non-compatible datastore will give the warning, as expected.
But now no enforcement takes place and we are free to move the VM wherever we want.
The other workaround, which is not so much a workaround, as it is the correct way of dealing with policy-based VM placement, is to change the policy.
If you put the VM's policy back to default, it doesn't care where you move it.
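Put together, the conditions under which a manual move inside the cluster went through can be sketched as a small Python function. It only encodes what I saw in the lab, assuming the per-VM SDRS override is already ticked. The function and parameter names are hypothetical, and the `enforce_level == 0` branch is an assumption on my part (I did not test with the setting disabled).

```python
from typing import Optional, Set


def manual_move_allowed(
    sdrs_enabled: bool,
    enforce_level: int,
    vm_policy_tag: Optional[str],
    target_tags: Set[str],
) -> bool:
    """Model of whether a manual Storage vMotion inside the cluster succeeds.

    vm_policy_tag is None when the VM uses the default storage policy.
    """
    if vm_policy_tag is None:
        return True   # workaround 2: default policy, placement is not checked
    if vm_policy_tag in target_tags:
        return True   # the target datastore is compliant with the policy
    if not sdrs_enabled:
        return True   # workaround 1: SDRS off, warning only, no enforcement
    # Observed: blocked with EnforceStorageProfiles=1, despite the override.
    # Assumption: only a disabled setting (0) would let the move through.
    return enforce_level == 0


store1_tags = {"Equalogic Store 1"}

blocked_soft = not manual_move_allowed(True, 1, "Equalogic Store 2", store1_tags)
allowed_sdrs_off = manual_move_allowed(False, 1, "Equalogic Store 2", store1_tags)
allowed_default_policy = manual_move_allowed(True, 1, None, store1_tags)
```

All three results come out `True`: the move is blocked under soft enforcement, and either workaround lets it through.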
Storage DRS Movement Behaviors
When EnforceStorageProfiles=1, SDRS does not seem to move the VM, even if it is non-compliant.
Unfortunately, EnforceStorageProfiles=2 (hard enforce) does not change this behavior. I was really hoping here that it would automatically move the VM to the correct storage, but it does not, even when manually triggering SDRS recommendations.
Manual Storage-vmotion to compliant datastore
When the VM is already inside the storage-cluster, but on a non-compliant datastore, you would think it would be easy to get it back onto a compliant datastore.
It is not. When you select the datastore-cluster object as the target, it will fault on the same error as when manually moving it in the previous example. Explicit movements inside an SDRS-enabled cluster always require an override.
Create the override by selecting the checkbox again.
Don't forget to remove the override again afterwards.
Manual Storage-vmotion from external datastore to the storage-cluster
Here, SDRS will respect the storage policy and recommend initial placement on the correct compliant datastores.
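The initial placement behavior can be sketched in Python as a filter over the cluster's datastores. This is a model under stated assumptions, not SDRS's actual algorithm: the function and datastore names are hypothetical, and the fallback to non-compliant datastores under soft enforcement is how I read the setting's description, not something I tested (my cluster always had a compliant datastore available).

```python
from typing import Dict, List, Optional, Set


def initial_placement_candidates(
    datastore_tags: Dict[str, Set[str]],
    required_tag: Optional[str],
    enforce_level: int,
) -> List[str]:
    """Return the datastores this model would consider for initial placement."""
    all_names = sorted(datastore_tags)
    if enforce_level == 0 or required_tag is None:
        return all_names              # no policy awareness at all
    compliant = [n for n in all_names if required_tag in datastore_tags[n]]
    if compliant:
        return compliant              # observed: placement lands on these
    # Assumption: soft (1) falls back to any datastore, hard (2) refuses.
    return all_names if enforce_level == 1 else []


# Hypothetical datastore names; only the tag names come from the lab.
cluster = {
    "store1-a": {"Equalogic Store 1"},
    "store1-b": {"Equalogic Store 1"},
    "store2-a": {"Equalogic Store 2"},
    "store2-b": {"Equalogic Store 2"},
}

soft_candidates = initial_placement_candidates(cluster, "Equalogic Store 2", 1)
hard_candidates = initial_placement_candidates(cluster, "Equalogic Store 2", 2)
```

In this layout both enforcement levels narrow the candidates down to the two 'store2' datastores, which is consistent with the recommendations I got when moving a VM into the storage-cluster.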
Tag-based storage policies, and their use in combination with SDRS Clusters, appear to be buggy and underdeveloped. The interface feedback is inconsistent and unclear. As a result, the behavior of the EnforceStorageProfiles setting becomes unreliable.
It's hard to think of a better use case for EnforceStorageProfiles than the self-tagged SDRS datastore scenario I tried in the lab. Neither vSAN nor vVOL datastores benefit from this setting. It really only applies to 'classic' datastores in an SDRS cluster.
I have seen that self-tagging does not work correctly. But I have not yet gone back to the original use-case of Storage Profiles: VASA properties. However, with VASA advertised properties you are limited to what the VASA endpoint is advertising. Self-tagging is far more flexible, and currently the only way I can give datastores a 'side' in a shared-storage metro-cluster design.
Nothing I have read about vSphere 6.5 so far leads me to believe this situation has been improved. But I will have to wait for the bits to become available.