
NUMBER OF DEDUPLICATION PROCESSES
A maximum of eight deduplication processes can be run at the same time on one FAS system.
If another flexible volume is scheduled to have deduplication run while eight deduplication processes are
already running, deduplication for this additional flexible volume is queued. For example, suppose that a
user sets a default schedule (sun-sat@0) for 10 deduplicated volumes. Eight will run at midnight, and the
remaining two will be queued.
As soon as one of the eight current deduplication processes completes, one of the queued ones starts;
when another deduplication process completes, the second queued one starts.
The next time deduplication is scheduled to run on these same 10 flexible volumes, a round-robin paradigm is
used so that the same volumes are not always the first ones to run.
With Data ONTAP 7.2.X, if eight deduplication processes are already running when a command is issued to
manually start another one, the request fails and the operation is not queued. However, starting with
Data ONTAP 7.3, manually triggered deduplication runs (including those started with the sis start -s
command) are also queued if eight deduplication operations are already running.
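As an illustration, the following console sketch (assuming Data ONTAP 7-Mode command syntax and a
hypothetical flexible volume /vol/vol1) sets the default schedule and then triggers a manual run. On Data
ONTAP 7.3 and later, if eight operations are already running, the manual run is queued rather than rejected.
Set the default schedule (midnight, every day) on the volume:
   sis config -s sun-sat@0 /vol/vol1
Manually start deduplication, scanning the data that already exists in the volume:
   sis start -s /vol/vol1
Check whether the operation is actively running or still queued:
   sis status /vol/vol1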
4 DEDUPLICATION WITH OTHER NETAPP FEATURES
For the Data ONTAP versions required to run deduplication with the NetApp features described in this
section, see the section on deduplication limitations.
4.1 DEDUPLICATION AND SNAPSHOT COPIES
Deduplication operates only on data in the active file system; if that data is also locked in Snapshot
copies created before deduplication runs, the duplicate blocks cannot be freed until those copies are
deleted, so storage savings are reduced.
There are two types of data that can be locked in Snapshot copies:
• Data can be locked in a Snapshot copy if the copy is created before deduplication is run. This effect can
be mitigated by always running deduplication before a Snapshot copy is created.
• Deduplication metadata can be locked in a Snapshot copy when the copy is created. In Data ONTAP
7.2.X, all of the deduplication metadata resides in the volume. Starting with Data ONTAP 7.3.0, part of the
metadata resides in the volume, and part of it resides in the aggregate, outside the volume. The fingerprint
database and the change log files used in the deduplication process are located in the aggregate, outside
the volume, and are therefore not captured in Snapshot copies. This change enables deduplication to
achieve higher space savings. However, some other temporary metadata files created during the
deduplication operation are still placed inside the volume. These temporary metadata files are deleted
when the deduplication operation completes. (For the size of these temporary metadata files, see section
3.3.2, "Deduplication Metadata Overhead.") These temporary metadata files can be locked in Snapshot
copies if the copies are created during a deduplication operation, and they remain locked until those
Snapshot copies are deleted.
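One way to avoid locking the temporary metadata files is to create Snapshot copies only after the
deduplication operation has completed. A minimal console sketch, assuming a hypothetical volume vol1 and a
hypothetical Snapshot copy name post_dedup:
Confirm that the deduplication status for the volume is Idle rather than Active:
   sis status /vol/vol1
Create the Snapshot copy only after the operation has finished:
   snap create vol1 post_dedup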
For deduplication to provide the most benefit when used in conjunction with Snapshot copies, consider the
following best practices (a console sketch applying them follows this list):
• Run deduplication before creating new Snapshot copies.
• Remove unnecessary Snapshot copies maintained in deduplicated volumes.
• If possible, reduce the retention time of Snapshot copies maintained in deduplicated volumes.
• Schedule deduplication only after significant new data has been written to the volume.
• Configure appropriate reserve space for the Snapshot copies.
• If the space used by Snapshot copies grows to more than 100% of the Snapshot reserve, df -s reports
incorrect results, because Snapshot copies are consuming space from the active file system and the
actual savings from deduplication are therefore not reported.
• If snap reserve is 0 (as is the case in most LUN deployments), turn off the Snapshot auto-create
schedule.
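The following console sketch shows one way to apply these best practices on a hypothetical volume vol1; the
Snapshot copy names are placeholders, and the last two commands assume a LUN-style deployment in which
snap reserve is set to 0 and scheduled Snapshot copies are turned off.
Run deduplication on the new data, then create the Snapshot copy:
   sis start /vol/vol1
   snap create vol1 nightly_manual
List and remove Snapshot copies that are no longer needed:
   snap list vol1
   snap delete vol1 old_snapshot
Verify the savings reported for the volume:
   df -s /vol/vol1
For LUN deployments, set snap reserve to 0 and turn off the Snapshot auto-create schedule:
   snap reserve vol1 0
   snap sched vol1 0 0 0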