In the previous post, we covered the concepts behind Virtual Data Optimizer (VDO) and how to configure it on Red Hat Enterprise Linux (RHEL) 7.5 Beta. In the last step, we created a VDO volume.

In this post we are going to experiment and observe real-world storage savings. I am not going to use an artificially generated workload. Instead, we'll take the simplest route: put some data on the disk, then copy that same data onto the disk several times under different names. This simple exercise ensures that we are adding redundant data to the disk, which triggers deduplication.

VM SPECIFICATIONS:

           OS : RHEL 7.5 Beta pre-release (VDO is integrated in RHEL 7.5 Beta)

           RAM : 4GB

           OS DISK : 20GB

           ADDITIONAL DISK : 15GB (VirtIO-blk)

 

Before beginning the experiment, I assume you have a running VDO volume with deduplication and compression enabled. To verify this, run the following commands:

# vdo status -n <vdo_vol_name> | grep Deduplication

# vdo status -n <vdo_vol_name> | grep Compression

From the output we see that both deduplication and compression are enabled.

If either of the above options is disabled, you can enable it with the corresponding command:

 

# vdo enableCompression -n <vdo_vol_name>

# vdo enableDeduplication -n <vdo_vol_name>
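
The check-then-enable steps above can be combined into one small helper. This is only a sketch: it assumes the status text contains lines of the form "Deduplication: enabled" or "Compression: disabled", and the function names feature_enabled and suggest_enable and the volume name my_vdo are made up for illustration. It prints the command to run rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: reads `vdo status` text on stdin and succeeds if the
# line for the named feature reports "enabled".
# Assumes lines look like "Deduplication: enabled" (possibly indented).
feature_enabled() {
    grep -Eq "^[[:space:]]*$1:[[:space:]]*enabled"
}

# Hypothetical helper: prints the vdo command that would enable the feature,
# or prints nothing if the status text says it is already enabled.
suggest_enable() {
    feature=$1   # "Deduplication" or "Compression"
    volume=$2    # VDO volume name
    if ! feature_enabled "$feature"; then
        echo "vdo enable$feature -n $volume"
    fi
}

# Illustrative input only; on a real system you would pipe in
# `vdo status -n <vdo_vol_name>` instead of this printf.
printf '    Compression: disabled\n    Deduplication: enabled\n' |
    suggest_enable Compression my_vdo
```

Printing the command instead of executing it keeps the sketch safe to try on a machine without VDO installed.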


 

Before we add data to the VDO volume, let’s check the storage status:

 

# vdostats --hu

 

 


The output of "vdostats --hu" (short for "--human-readable") shows:

  1. User available space (in my case, 12.0GB)
  2. UDS metadata space (in my case, 3GB)
  3. Storage savings (in my case, 98%)

Now, we’ll test the effect of deduplication, zero block elimination, and compression.

 

  1. For this purpose, I copy a RHEL-7.5-x86_64-boot.iso file to the vdo_vol device, which is mounted at /vdo_vol.

# df -hT

 

 

 

 

 

  2. Next, I copy the same image file into the same directory three more times, each under a different name, for four copies in total (using "cp -pr <file to be copied> <new_file_name>").

# ls -l --block-size=1MB

 

 

This is the output of a directory listing (ls) before copying the data.

 

  3. Each RHEL ISO is approximately 557MB, so the total logical space consumed on disk is 557MB x 4 = 2228MB, or about 2.2GB.

 

 

 

 

This is the output of a directory listing (ls) after making three more copies of the data.
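
If you want to rehearse the copy step before touching the ISO, the sketch below does the same thing on a small scratch file. Everything in it is a stand-in: it works in a temporary directory rather than /vdo_vol, uses 1 MiB of random data instead of the ISO, and the file names are invented. It also confirms that the copies are byte-identical by comparing checksums, which is the property deduplication relies on.

```shell
#!/bin/sh
# Rehearse the copy step on a small scratch file instead of the ISO.
# The directory and file names are placeholders, not taken from the article.
workdir=$(mktemp -d)
src="$workdir/sample.img"

# 1 MiB of random data stands in for the ISO image.
dd if=/dev/urandom of="$src" bs=1024 count=1024 2>/dev/null

# Three extra copies under different names, for four files in total.
for i in 1 2 3; do
    cp -p "$src" "$workdir/sample-copy$i.img"
done

# Identical content yields a single distinct checksum across all four files.
file_count=$(ls "$workdir" | wc -l | tr -d ' ')
distinct_sums=$(cksum "$workdir"/*.img | awk '{print $1}' | sort -u | wc -l | tr -d ' ')
echo "files: $file_count, distinct checksums: $distinct_sums"

# Clean up the scratch directory.
rm -rf "$workdir"
```

On a VDO volume, files that share all of their content like this are the case where only one set of data blocks needs to be stored.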

 

  4. Next, we check the resulting space savings. Since VDO keeps only unique data blocks on the underlying media, the additional physical space used out of the 12GB of user space shouldn't be more than 500-700MB, roughly the size of a single ISO.

To verify this, we run the command:

# vdostats --hu

 

 

Status of storage on a VDO vol after adding redundant data

 

OBSERVATIONS:

  1. The initial space used by the VDO volume was 3.0 GB according to the vdostats output.
  2. The space used after the initial copy was 3.5 GB in the vdostats output.
  3. The physical space occupied by the 4 RHEL ISO images on the VDO volume is 0.5 GB, the same as a single RHEL ISO file would occupy on the underlying storage.
  4. The df -hT output shows the logical space occupied by these 4 ISO files, as seen by the file system on the VDO volume, as 2.2GB.
  5. Since VDO manages a logical-to-physical block map, df reports the logical space consumed according to the file system that resides on top of the VDO volume, while vdostats --hu reports on the physical block device as managed by VDO. Physically, a single RHEL image resides on the disk, but logically the file system sees 4 copies occupying 2.2GB.
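
The gap between the logical and physical figures above can be worked out by hand. The following is a back-of-the-envelope sketch, not how vdostats derives its own savings figure: it assumes exactly one ISO's worth of data lands on disk and ignores compression and VDO metadata. The function name savings_pct is made up for illustration.

```shell
#!/bin/sh
# Integer percentage of logical space that did not need physical blocks.
# Arguments: logical MB written by the file system, physical MB consumed.
savings_pct() {
    logical=$1
    physical=$2
    echo $(( (logical - physical) * 100 / logical ))
}

iso_mb=557                          # approximate size of one ISO, from the article
copies=4
logical_mb=$(( iso_mb * copies ))   # what df reports, about 2.2GB
physical_mb=$iso_mb                 # assumption: one copy's worth is stored

echo "logical: ${logical_mb} MB, savings: $(savings_pct "$logical_mb" "$physical_mb")%"
# prints "logical: 2228 MB, savings: 75%"
```

With four identical copies, three quarters of the logical data is redundant, which is where the 75% comes from.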

 

# df -hT

 

 

 

 

 

df -hT output (after applying the copy operation)

 

Based on this, we can conclude that VDO eliminates redundant data stored on a VDO volume: four identical ISO files consumed the physical space of only one.