Practical Limits of NetApp Deduplication
I’ve blogged before about the limits of NetApp’s A-SIS (Deduplication). In practical use, however, those limits can be even lower – here’s why:
Suppose, for example, that you have a FAS2050; the maximum size FlexVol that you can dedupe is 1 TB. If the volume has ever been larger than 1 TB and then shrunk below that limit, it can’t be deduped, and, of course, you can’t grow a volume with A-SIS enabled beyond 1 TB. Fair enough, you say – but consider those limitations in the case of a volume where you aren’t sure how large it will eventually grow.
If you think your volume could eventually grow beyond 1 TB (deduped), and you’re getting a healthy 50% savings from dedupe, you’ll actually need to undo A-SIS at around 500 GB of used space, because the undeduplicated data would then already occupy about 1 TB. If you let your deduped data approach filling a 1 TB volume, you will not be able to run “sis undo” – you’ll run out of space. TR-3505 has this to say about it:
Note that if sis undo starts processing and then there is not enough space to undeduplicate, it will stop, complain with a message about insufficient space, and leave the flexible volume dense. All data is still accessible, but some block sharing is still occurring. Use “df –s” to understand how much free space you really have and then either grow the flexible volume or delete data or Snapshot copies to provide the needed free space.
Bottom line: Either be absolutely sure you won’t ever need to grow your volume beyond the A-SIS limitations of your hardware platform, or run “sis undo” before the sum of the “used” and “saved” columns of “df -s” reaches the volume limit.
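To make that arithmetic concrete, here is a minimal Python sketch of the check. It is only an illustration under stated assumptions: the function names are my own, the “used” and “saved” figures are typed in by hand from “df -s” (assumed here to be in KB), and the limit is whatever your platform's maximum deduplicated volume size is.

```python
# Hypothetical helpers, not part of any NetApp tool. All sizes are in KB,
# on the assumption that "df -s" reports "used" and "saved" in KB.
KB_PER_GB = 1024 * 1024


def undo_headroom_kb(used_kb, saved_kb, limit_kb):
    """Space left before the undeduplicated data (used + saved) hits the limit.

    A value at or below zero means "sis undo" can no longer complete
    without first deleting data or Snapshot copies.
    """
    return limit_kb - (used_kb + saved_kb)


def max_physical_kb(limit_kb, savings_ratio):
    """Deduped (physical) usage at which the logical size reaches the limit.

    savings_ratio is saved / (used + saved), e.g. 0.5 for 50% savings.
    """
    return int(limit_kb * (1.0 - savings_ratio))


if __name__ == "__main__":
    limit = 1024 * KB_PER_GB  # 1 TB limit, as on the FAS2050 in the example
    print(max_physical_kb(limit, 0.5) // KB_PER_GB, "GB physical at 50% savings")
    print(undo_headroom_kb(400 * KB_PER_GB, 400 * KB_PER_GB, limit) // KB_PER_GB,
          "GB of headroom left")
```

With 50% savings on a 1 TB limit, the first function puts the cutoff at 512 GB of physical data, which is the “500GB” figure above in round numbers. In practice you would want to act well before the headroom reaches zero.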
Postscript: If you were thinking – like I was – that ONTAP 7.3 would up the A-SIS limitations, apparently you need to think again.
Postscript 2: See also NOW KB35784, as pointed out by Dan C on Scott Lowe’s blog.
Good news, Andy. A new version of TR-3505 (rev 5) has been released and volume limits have been restated. In your example, a FAS2050 with a 1TB volume, getting 50% dedupe savings and with 500GB of physical (deduped) data will not have to be resized, but in fact could grow to a logical size of 16TB. On page 19: “The maximum shared data limit per volume for deduplication is 16TB, regardless of the platform type. Once this limit is reached, there is no more deduplication of data in the volume, but writes to the volume continue to work successfully until the volume gets completely full.” I am sorry this was not communicated very clearly in earlier versions of TR-3505, and we made sure to clarify this in Rev 5.
As far as raising the hard volume limits for dedupe, we tried like the devil to get this into 7.3 but had some technical hurdles we just couldn’t clear in time for that release. I’ve taken a vow of silence when discussing unannounced dedupe enhancements, but rest assured that we want to raise volume limits too and an announcement will be forthcoming.
Here is the link to the Rev 5 TR-3505 on the NetApp dedupe community: http://communities.netapp.com/docs/DOC-1642. It will also be posted soon in the technical library on http://www.netapp.com.
Thanks for the comment, and the link to the updated document. However, I’m not sure that it changes the situation I’m talking about (we haven’t upgraded from 7.2.4/7.2.4L1 yet): you run into the maximum deduplicated volume size, or know you will, and need to turn A-SIS off and run “sis undo” to grow the volume past that limit. That is the situation where you risk running into the following message on a 100% full volume:
[sis.undo.nospace:warning]: Undoing shared blocks on volume /vol/vol1 has aborted because there is insufficient free space. Data blocks that were still shared at the time the undo operation stopped will remain shared. Blocks already processed by the undo operation are no longer shared.