Creating and Restoring a Gzip Compressed Disk Image with dd on UNIX/Linux

Creating and restoring disk images are essential tasks for developers, system administrators, and users who want to safeguard their data or replicate systems efficiently. One useful tool for this purpose is dd, which allows for low-level copying of data. In this article, we will explore how to clone and restore a partition from a compressed disk image in a UNIX/Linux operating system.

IMPORTANT: There is a risk of data loss if a mistake is made. The dd command can be dangerous if not used carefully. Specifying the wrong input or output device can result in data loss. Users should exercise caution and double-check their commands before executing them.

Cloning a Partition into a Compressed Disk Image

To clone a partition into a compressed disk image, you can use the dd and gzip commands:

dd if=/dev/SOURCE conv=sync bs=64K | gzip --stdout > /path/to/file.gzCode language: plaintext (plaintext)

This command copies the content of the block device /dev/SOURCE to the compressed file /path/to/file.gz, 64 kilobytes at a time.

Restoring a Partition from a Compressed Disk Image

To restore a partition from a file containing a compressed disk image, use the following command:

gunzip --stdout /path/to/file.gz | dd of=/dev/DESTINATION conv=sync bs=64K
Code language: plaintext (plaintext)

This command decompresses the content of the compressed file located at /path/to/file.gz and copies it to the block device /dev/DESTINATION, 64 kilobytes at a time.

More information about the dd command options

Here are additional details about the dd command options:

  • The status=progress option makes dd display transfer statistics progressively.
  • The conv=noerror option instructs dd to persist despite encountering errors. However, ignoring errors might result in data corruption in the copied image. The image could be incomplete or corrupted, especially if errors occur in critical parts of the data. This option can be added to the conv option as follows: conv=sync,noerror
  • The conv=sync option makes dd wait for both the data and the metadata to be physically written to the storage media before proceeding to the next operation. In situations where data integrity is less critical, using conv=sync can help restore as much data as possible, even from a source with occasional errors.
  • Finally, the bs=64K option instructs dd to read or write up to the specified bytes at a time (in this case, 64 kilobytes). The default value is 512 bytes, which is relatively small. It is advisable to consider using 64K or even the larger 128K. However, it’s important to note that while a larger block size speeds up the transfer, a smaller block size enhances transfer reliability.

Ensuring Data Integrity

Although the dd command automatically verifies that the input and output block sizes match during each block copy operation, it is prudent to further confirm the integrity of the copied data after completing the dd operation.

To achieve this, follow these steps:

Generate the md5sum of the source block device:

dd if=/dev/SOURCE | md5sumCode language: plaintext (plaintext)

Next, generate the md5sum of the gzip-compressed file:

gunzip --stdout /path/to/file.gz | md5sumCode language: plaintext (plaintext)

Ensure that the two md5sum fingerprints are equal. This additional verification step adds an extra layer of assurance regarding the accuracy and integrity of the copied data.