Sunday, March 28, 2010

VMware, NetApp De-Dup, and Effects on Exchange 2010 DAG

The first time I read about Database Availability Groups (DAG) in Exchange 2010, I instantly thought of what a great fit NetApp de-duplication would be for storing database copies. NetApp claims 10-30% de-duplication on Exchange 2010 databases, but they do not mention the space savings on identical copies of a database. Technically, each additional copy should see close to 100% space savings (after a de-dup job completes), so long as your database copies live in the same volume.

As far as I know, NetApp is currently the only storage vendor that actually recommends running de-duplication on production data, so these comparisons also cover what I call NNS (Non-NetApp Storage). I also make the assumption that when running JBOD you will be using lag database copies. Since we can (and will) take snapshots on the array, the NetApp designs have no need for lag copies; instead I have factored in an additional 20% storage for snap space on the NetApp. Additionally, I have set the single-database de-dup number at 15% for all the calculations in this article.
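For readers who want to plug in their own numbers, here is a rough sketch of how these assumptions combine. It is my own simplification, not NetApp's sizing method: it assumes every copy inside one volume de-dups down to a single copy, applies the 20% snap space as a multiplier on the de-duped footprint, and takes the baseline copy count as a hypothetical input. The function names are made up for illustration.

```python
def netapp_footprint(volumes, single_db_dedup=0.15, snap_reserve=0.20):
    """Space used on the array, in units of one database's logical size.

    Assumption: all copies inside a volume de-dup down to a single copy,
    so only the number of volumes matters, not the number of copies.
    """
    deduped_copy = 1.0 - single_db_dedup
    return volumes * deduped_copy * (1.0 + snap_reserve)


def savings_vs_baseline(baseline_copies, volumes, **kwargs):
    """Fraction of space saved versus storing baseline_copies full copies."""
    return 1.0 - netapp_footprint(volumes, **kwargs) / baseline_copies


if __name__ == "__main__":
    # Hypothetical baseline: 3 database copies plus 1 lag copy on JBOD/NNS.
    for volumes in (1, 2):
        pct = savings_vs_baseline(4, volumes) * 100
        print(f"{volumes} volume(s): {pct:.1f}% saved")
```

With this hypothetical four-copy baseline the model gives roughly 74.5% for a single volume and 49% for two volumes. It is not expected to match the figures quoted for each scenario exactly, since those depend on the copy counts and layout of each design.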

Take the example below, where we place all the database copies on the same volume but present the LUNs to the different servers in the DAG. The downside to this scenario is that if you lose a volume or aggregate, you will have lost all the protection capabilities afforded you by the database copies in the first place. Personally, I have never experienced the loss of an entire aggregate or volume, and I believe the event to be highly unlikely. This scenario boils down to your own comfort level with your back-end storage array. It should be noted that if you stretch your DAG across physical locations, you have successfully mitigated this risk. You can expect to see about a 73% space reduction over JBOD or NNS.

Scenario 1

If you do not have a DR site (bad idea), you can provide protection from an aggregate failure by putting your database copies on separate physical spindles. This example outlines the process. You do not gain the same level of data de-duplication that you do in the first scenario, but you do gain better protection. You can expect to see about a 58% space reduction over JBOD or NNS.

Scenario 2

Here is a scenario similar to the first, only this time we use VMware virtual machines configured in an HA cluster instead of physical servers. This lets us reduce our database copies by 1 per database, since we no longer need 3 copies. Microsoft recommends 3 copies so you are protected against hardware failure in the event you have a database offline for maintenance, but we gain that protection in the form of VMware HA. You could still add an additional database copy if you wish; because it de-duplicates against the existing copies, it would take up about the same amount of space as having 1 copy. Space savings are the same as scenario 1, at about 73%.

Scenario 3

My favorite scenario is listed below.  This design is easy to manage, and is a good balance between data protection and space savings.  We use the same design as the diagram above, but here we use VMware SRM to replicate our virtual machines and SnapManager for Exchange and SnapMirror to replicate the databases to a DR site.  This is the same space savings as scenario 2, but you will reclaim Microsoft licensing cost by leveraging SRM.

Scenario 4

Microsoft has done an outstanding job with Exchange 2010, giving customers all the options they need to deploy a highly reliable and redundant messaging solution. By leveraging VMware and NetApp storage you have even more options and gain even greater functionality. NetApp TR-3824, "Storage Efficiency and Best Practices for Microsoft Exchange Server 2010," does not recommend placing database copies in the same volume, but based on what I have presented here I'll leave it for you to decide the risk/reward.


brian sytsma said...

Nicely laid out. I would add that the 2 vs. 3 database copies argument is more significantly impacted by the fact that with NetApp/VMware you could easily take a snap just before any offline maintenance. HA helps too, but not if your only online db gets hosed during maintenance on the 2nd copy.

paulb said...

Well done Erick! Very well thought out. The ability to dedupe the production DAG and avoid duplicate LAG copies altogether is a very compelling argument for putting Exchange 2010 on NetApp storage.

thanks for sharing.

Dimitris Krekoukias said...

Nice. Eliminating the impact of LAG copies altogether is even more important I think than dedupe. With large DBs, keeping several LAG copies for recovery purposes will burn tons of disk and servers if you follow Microsoft's recommendation and just use servers with replicated DASD everywhere.

Of course, Microsoft doesn't much like this since it reduces the # servers needed (and therefore their revenue), but such is the price of progress.

Getting an extra 30% dedupe is the icing on the cake.


Vaughn Stewart said...

This post rocks! It demonstrates forward thinking regarding the application of mutually beneficial technologies. Stay tuned, hopefully we'll have more to share in this area in the near future.

Mark Arnold said...

This is an awesome post. As a NetApp employee I blogged the same thing, but in my usual irreverent jokey manner. For customers and anyone else who needs a professionally done article I'm just going to replace my post with a link to this!
Thank you so much for adding weight to this neat cost/disk saving measure.

Anonymous said...

I just implemented scenario 1 on NetApp and I'm seeing over 40% dedupe across the board.

kopiluwak nya said...