Using extents gives you flexibility with your volumes, and adding extents is a very simple and usually painless process. The ability to dynamically increase the total storage of a single VMFS volume is a feature we all love, but how many of us actually use it? Is it worth the risk?
Are you running extents for your VMFS volumes? Are all the extents for a single VMFS volume located on the same LUN (a single LUN that has been extended several times), or do you have multiple LUNs that make up a datastore? If your answer is the latter and you have full control over how your LUNs are provisioned, then you should be smacked. I understand that some of you are given standard LUN sizes by your SAN team and there is nothing you can do about it; this article does not concern you... yet. We will come back to that later.
The very idea of one LUN being dependent on another should be reason enough to avoid using extents in the first place. Unfortunately, this logic isn’t enough to stop a lot of people from using them. It is only after they lose an entire datastore’s worth of data that they re-evaluate their infrastructure design. This is a classic example of hindsight being 20/20. Whether the cause is a lack of education or a misunderstanding of how extents should be used, the end result is the same.
For those who use extents in a pinch because they ran out of space on a datastore, I applaud you for being resourceful enough to come up with that solution; however, you probably should have understood how VM snapshots work in the first place (i.e., snapshots are NOT a backup solution, and if you purposely create a snapshot, make sure you remove it once it has served its purpose). On the other hand, if you continue to run that extended datastore with no intention of creating a new, larger, non-extended datastore and moving the VMs to it, then you should be punished, though I hope that punishment isn’t data loss, which could lead to an RGE (Resume Generating Event).
Don’t get me wrong, VMware has vastly improved how extents function over the years. The design is actually quite robust; however, it is still susceptible to LUN dependency and all the caveats that go hand in hand with it. Fortunately, if an extent is accidentally overwritten (aside from the head extent, of course!), all VMs running on that LUN may continue to function properly. However, there is a chance that once a VM is powered down, you will no longer be able to power it back on. This is only the case if any of the blocks for that VM’s disk(s) reside on the part of the extent that has been overwritten. In a case like this, you may get lucky if you leave the VM running and use VMware Converter inside the guest OS to clone the VM to a new datastore.
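To make that dependency concrete, here is a minimal sketch in Python. It is a toy model, not how VMFS actually tracks blocks: the `Extent` class, the `affected_vms` function, the block numbers, and the LUN and VM names are all made up for illustration. It assumes that a VM is at risk only if at least one of its blocks lives on the overwritten extent, and that losing the head extent puts every VM on the datastore at risk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """One LUN's slice of the datastore (hypothetical toy model)."""
    name: str
    first_block: int
    last_block: int  # inclusive


def affected_vms(extents, vm_blocks, lost_extent):
    """Return the sorted names of VMs with at least one block on the lost extent.

    Assumption: extents[0] is the head extent, and losing it is treated
    as putting every VM on the datastore at risk.
    """
    lost = next(e for e in extents if e.name == lost_extent)
    if lost.name == extents[0].name:
        return sorted(vm_blocks)
    return sorted(
        vm
        for vm, blocks in vm_blocks.items()
        if any(lost.first_block <= b <= lost.last_block for b in blocks)
    )


# A datastore spanning three LUNs; block ranges are invented for the example.
extents = [
    Extent("lun0-head", 0, 99),
    Extent("lun1", 100, 199),
    Extent("lun2", 200, 299),
]

# Which datastore blocks each VM's disks occupy (also invented).
vm_blocks = {
    "web01": {10, 11, 12},
    "sql01": {90, 150, 151},
    "mail01": {210, 211},
}

print(affected_vms(extents, vm_blocks, "lun1"))
```

In this toy layout, overwriting `lun1` only puts `sql01` at risk, because it is the only VM with blocks in that range. The catch is that in real life you don’t get to pick which VMs have blocks on which extent.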
The bottom line is this: the risks of using extents far outweigh the rewards. Do NOT use extents in a production environment unless you have to in an emergency, such as a datastore running out of space and VMs failing to power on as a result. In a case like this, be sure to create a larger datastore and move the VMs to that new location. While you are at it, perhaps you should reconsider how you are using snapshots, as you obviously didn’t know they would grow that quickly (snapshots on SQL or MS Exchange) or were unaware of how they function in the first place. If you weren’t even aware that your VMs were running on snapshots, then you should re-evaluate who has permission to create snapshots and restrict it. Also, for the few of you whose VMs are running on a Consolidate Helper or VCB backup snapshot because of a snapshot commit error or some other error, perhaps a little VM administration would have found those snapshots before they became a problem. Just a thought... 😉
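As an example of that “little VM administration,” here is a rough Python sketch that picks snapshot disks out of a directory listing. It relies on the usual naming pattern for snapshot disks (`name-000001.vmdk` and `name-000001-delta.vmdk`). The function name `find_snapshot_disks` and the sample listing are hypothetical, and this is no substitute for checking the snapshot manager itself.

```python
import re
from collections import defaultdict

# Matches e.g. "sql01-000001.vmdk" and "sql01-000001-delta.vmdk",
# but not the base disk files "sql01.vmdk" or "sql01-flat.vmdk".
SNAPSHOT_DISK = re.compile(r"^(?P<disk>.+)-(?P<seq>\d{6})(?:-delta)?\.vmdk$")


def find_snapshot_disks(filenames):
    """Group file names that look like snapshot disks by their base disk name."""
    found = defaultdict(list)
    for name in filenames:
        match = SNAPSHOT_DISK.match(name)
        if match:
            found[match.group("disk")].append(name)
    return {disk: sorted(files) for disk, files in found.items()}


# Hypothetical listing of a datastore folder.
listing = [
    "sql01.vmdk",
    "sql01-flat.vmdk",
    "sql01-000001.vmdk",
    "sql01-000001-delta.vmdk",
    "web01.vmdk",
    "web01-flat.vmdk",
]

for disk, files in find_snapshot_disks(listing).items():
    print(disk, "is running on a snapshot:", ", ".join(files))
```

Run something like this against your VM folders on a schedule and a forgotten snapshot gets noticed long before it fills the datastore.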
And now back to those of you who have no control over the LUN sizes given to you by your SAN team. Now that you have read this article and fully understand that running production on extents is one hell of a poor choice, it is time to sit down with that team and explain to them why you need specific LUN sizes. I have fixed enough broken volumes and extents over the past few years to know that running production on extents in the first place is just a bad idea. Even if you have full backups, restoring those images will require downtime, not to mention that you will have to rebuild the volume in order to make use of the space again.
Don’t use extents! At least not permanently 😉