Dd - Destroyer of Disks
These notes cover useful things you can do with `dd`.
- 1 Burn Linux ISO images to a USB flash drive using Apple Mac OS X
- 2 Securely erase a drive
- 3 Erase MBR
- 4 Erase GPT (GUID Partition Table)
- 5 Fill a file with bytes
- 6 Copy a drive to an image file
- 7 Image a CD or DVD
- 8 Show progress status statistics of `dd`
Burn Linux ISO images to a USB flash drive using Apple Mac OS X
This example burns an ISO image to a USB flash drive. In this example, the source image is ubuntu-13.10-desktop-amd64.iso. On Mac OS X the iso image must first be converted to a dmg image.
diskutil list
diskutil unmountDisk /dev/disk1
hdiutil convert -format UDRW -o ubuntu-13.10-desktop-amd64 /Users/noah/Downloads/ubuntu-13.10-desktop-amd64.iso
sudo dd if=ubuntu-13.10-desktop-amd64.dmg of=/dev/rdisk1 bs=1m
Securely erase a drive
If you are in a hurry, just drill a hole through the top of the case into the platters. A professional data recovery service might be able to get some data off the platters, but it will be very expensive to do so.
You can use `dd` to destroy just the data without destroying the drive.
dd if=/dev/zero of=/dev/sda bs=1M
You can also use `cp` or `cat`:
cp /dev/zero /dev/sda
cat /dev/zero > /dev/sda
Some say you should write random data to the drive (see Tinfoil hat paranoia below), but this is nearly ten times slower using /dev/urandom than /dev/zero. The urandom device is still faster than the random device, which on older kernels blocks whenever the entropy pool runs low. It would be nearly impossible to wipe a drive using /dev/random.
dd if=/dev/urandom of=/dev/sda bs=1M
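Before pointing a wipe at a real device, it can help to rehearse on a scratch file. The following is a minimal sketch, not a definitive procedure: it assumes a temporary file from `mktemp` stands in for the drive, and the variable names are made up for illustration. It overwrites the file with zeros and then checks that no non-zero bytes remain.

```shell
#!/bin/sh
# Scratch file standing in for a drive (assumption: this is NOT a real device).
target=$(mktemp)

# Fill the stand-in "drive" with 64 KiB of random data.
dd if=/dev/urandom of="$target" bs=1024 count=64 2>/dev/null

# Overwrite it in place with zeros. count= bounds the write to the file size;
# a real device wipe omits count= and simply runs until the device is full.
dd if=/dev/zero of="$target" bs=1024 count=64 conv=notrunc 2>/dev/null

# Verify: strip every zero byte and count what is left.
size=$(wc -c < "$target" | tr -d ' ')
leftover=$(tr -d '\0' < "$target" | wc -c | tr -d ' ')
echo "size=$size bytes, non-zero bytes remaining=$leftover"

rm -f "$target"
```

The same `tr -d '\0' | wc -c` check should work on a device node, although reading a whole drive back takes about as long as wiping it.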
Tinfoil hat paranoia
It takes about 15 minutes to destroy a 1GB file using GNU `shred` (default options). It takes 30 seconds to destroy the file using `dd if=/dev/zero of=somefile bs=1024 count=1M`. This is on a laptop with a 1.6 GHz dual-core CPU, 2 GB of RAM, and a Seagate Momentus ST9160823AS drive with an ext3 filesystem -- in other words, nothing fancy.
Some people say that simply overwriting data isn't really secure because they heard that it's possible to read data that has been overwritten (this is known as data remanence). The idea that multiple random overwrites are necessary for security is a myth. This myth came about because Dr. Peter Gutmann theorized that overwritten data could be recovered through the use of magnetic force microscopy. This is an unsubstantiated theory -- no one has ever demonstrated recovering even a single bit of data using this technique. There is not even a proof of concept of this theory. It's a pipe dream. No commercial data recovery or forensics firms offer any services that can recover overwritten data. Yet, somehow this theory became accepted as fact and now everyone believes that supposedly the NSA and maybe space aliens can read overwritten bits on a drive. If your data is so sensitive that the government has to call the NSA or hire space aliens then you don't need my opinion or advice.
Most drive erasure tools use very slow methods to prevent recovery of erased data. These tools overwrite each byte repeatedly with random data. This can take hours to erase a whole drive! Yet there is not a single example of overwritten data having ever been recovered using the method described by Dr. Gutmann.
If you still prefer to use the GNU `shred` command then you may want to put this in your ~/.bashrc or alias file to make it a little more sane:
alias shred='shred --iterations=1'
For my purposes, I'm perfectly happy with using this command, which will work everywhere UNIX is found (where sdXXX is the device to erase):
dd if=/dev/zero of=/dev/sdXXX
My argument also applies to Flash memory media, which, in consumer devices, is slowly replacing magnetic media. In fact, it's probably easier to decap a flash chip and somehow read the electron potentials trapped in the floating gates. Assuming this is possible, this would still require a laboratory and lots of money. And even if someone did this it still wouldn't let anyone recover erased bits with enough of a signal to noise ratio to be useful. The problem becomes even harder with MLC flash memory, which is the most common.
Only nut-jobs worry about this.
One step disk wipe tool
This is useful if you want to reuse a lot of drives: Darik's Boot and Nuke
Erase MBR
I had Linux with GRUB installed on a machine. I needed to get rid of it and put Windows on the machine. I used a Ghost recovery disk to restore Windows on it, but Ghost didn't restore the MBR. GRUB was still lurking in the Master Boot Record. On boot GRUB would try to start but would error out. Wiping out the MBR fixed the problem. This will wipe out the boot code in the MBR of a disk (sda in this example) but keep the partition table and disk signature:
dd if=/dev/zero of=/dev/sda bs=440 count=1
If you want to erase the entire MBR, including the disk signature and partition table, then use the following command:
dd if=/dev/zero of=/dev/sda bs=512 count=1
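To see what the 440-byte wipe touches without risking a disk, here is a rough sketch on a scratch file. It assumes a made-up 512-byte "MBR" in which 0xAA bytes stand in for the boot code and 0xBB bytes stand in for the disk signature and partition table; none of these names or values come from a real disk. Note the `conv=notrunc`: it is needed on a regular file, because without it `dd` would truncate the file to 440 bytes, whereas a block device cannot be truncated.

```shell
#!/bin/sh
img=$(mktemp)

# Build a fake 512-byte MBR: 440 bytes of 0xAA, then 72 bytes of 0xBB.
dd if=/dev/zero bs=440 count=1 2>/dev/null | LC_ALL=C tr '\0' '\252' > "$img"
dd if=/dev/zero bs=72 count=1 2>/dev/null | LC_ALL=C tr '\0' '\273' >> "$img"

# Zero only the boot code area, leaving the rest of the file in place.
dd if=/dev/zero of="$img" bs=440 count=1 conv=notrunc 2>/dev/null

# Count non-zero bytes in the first 440 bytes, and non-0xBB bytes in the tail.
head_nonzero=$(dd if="$img" bs=440 count=1 2>/dev/null | tr -d '\0' | wc -c | tr -d ' ')
tail_changed=$(dd if="$img" bs=1 skip=440 2>/dev/null | LC_ALL=C tr -d '\273' | wc -c | tr -d ' ')
size=$(wc -c < "$img" | tr -d ' ')
echo "size=$size head_nonzero=$head_nonzero tail_changed=$tail_changed"

rm -f "$img"
```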
Disk signature of boot disk
Disk signature is an obscure topic. These are the 4 bytes in the MBR starting after the first 440 bytes (offset 0x01B8 through 0x01BB). Often you can mess with it without problems, but in certain circumstances Linux may need to see a specific disk signature on the boot disk. The most critical fact is that the disk signature of the primary boot disk must be unique. In days past, I did not know the significance of the disk signature, so I would often zero it out along with the MBR boot code using `dd if=/dev/zero of=/dev/sda bs=446 count=1`. That is usually harmless, but it is not guaranteed to be. It is also bad to copy a disk image including the MBR and then attach both copies to the same system, since both will carry the same disk signature. The system may fail to boot, or nothing may go wrong at all!
Do not confuse disk signature with MBR signature. The MBR signature is always 0xAA55 starting at offset 0x01FE. It is stored little endian, so 0x01FE:0x55 and 0x01FF:0xAA.
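The offsets above can be checked with `dd` and `od`. This is only a sketch on a scratch file: the disk signature value DE AD BE EF is a hypothetical placeholder, and the image is a blank 512-byte file rather than a real MBR. On a real system you would read from the device (for example /dev/sda) instead of writing anything.

```shell
#!/bin/sh
img=$(mktemp)

# Blank 512-byte stand-in for an MBR sector.
dd if=/dev/zero of="$img" bs=512 count=1 2>/dev/null

# Plant a hypothetical disk signature (DE AD BE EF) at offset 440 (0x01B8).
printf '\336\255\276\357' | dd of="$img" bs=1 seek=440 conv=notrunc 2>/dev/null
# Plant the MBR signature bytes 55 AA at offset 510 (0x01FE).
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

# Read both fields back; skip= positions the read, count= sets its length.
disk_sig=$(dd if="$img" bs=1 skip=440 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
mbr_sig=$(dd if="$img" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "disk signature bytes: $disk_sig"
echo "MBR signature bytes:  $mbr_sig"

rm -f "$img"
```

The MBR signature prints as 55 then AA because `od -tx1` shows bytes in on-disk order, which matches the little endian layout described above.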
The `ms-sys` command may be helpful in working with the MBR and disk signatures.
- BIOS Enhanced Disk Drive Services (EDD) 3.0. This protocol determines which disk the BIOS tries to boot from. It uses the disk signature bytes. These are the 4 bytes in the MBR starting after the first 440 bytes (offset 0x01B8 through 0x01BB).
Erase GPT (GUID Partition Table)
If you see this error while using fdisk then you may want to remove all trace of GPT.
WARNING: GPT (GUID Partition Table) detected on '/dev/sdb'! The util fdisk doesn't support GPT. Use GNU Parted.
To erase the GPT you need to erase the table at both the beginning and end of the disk. Use blockdev to get the size of the drive in 512-byte sectors, from which you can calculate the block number at the end of the drive.
dd if=/dev/zero of=/dev/sdb bs=512 count=2
dd if=/dev/zero of=/dev/sdb bs=512 count=2 seek=$(($(blockdev --getsz /dev/sdb) - 2))
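The seek arithmetic can be rehearsed on a scratch file. This sketch makes a few assumptions: a 100-sector file filled with 0xFF stands in for the disk, and because `blockdev --getsz` only works on block devices, the sector count is derived from the file size instead. `conv=notrunc` is added so the regular file keeps its length.

```shell
#!/bin/sh
img=$(mktemp)

# 100 sectors of 0xFF so that any zeroed region is easy to spot.
dd if=/dev/zero bs=512 count=100 2>/dev/null | LC_ALL=C tr '\0' '\377' > "$img"

# Stand-in for `blockdev --getsz`: size in 512-byte sectors.
sectors=$(( $(wc -c < "$img") / 512 ))

# Zero the first two sectors and the last two sectors.
dd if=/dev/zero of="$img" bs=512 count=2 conv=notrunc 2>/dev/null
dd if=/dev/zero of="$img" bs=512 count=2 seek=$(( sectors - 2 )) conv=notrunc 2>/dev/null

# Everything that is no longer 0xFF was zeroed; the file size must not change.
zeroed=$(LC_ALL=C tr -d '\377' < "$img" | wc -c | tr -d ' ')
last_nonzero=$(tail -c 512 "$img" | tr -d '\0' | wc -c | tr -d ' ')
size=$(wc -c < "$img" | tr -d ' ')
echo "sectors=$sectors size=$size zeroed=$zeroed last_sector_nonzero=$last_nonzero"

rm -f "$img"
```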
Error: Unable to install GRUB
While installing Ubuntu on a disk that may have been previously used, you may get this error at the very end of the installation process.
[!!] Install the GRUB boot loader on a hard disk
Unable to install GRUB in /dev/sda
Executing 'grub-install /dev/sda' failed.
This is a fatal error.
<Go Back> <Continue>
The cause of this is GPT. You must remove the partition table before you install Linux and GRUB. Do this with an Ubuntu Desktop LiveCD (Ubuntu Server CD does not have a live option for debugging... go figure.). Remember, the GPT is at both the beginning and the end of the disk. You must remove both copies.
dd if=/dev/zero of=/dev/sda bs=4096 count=35
dd if=/dev/zero of=/dev/sda bs=4096 count=35 seek=$(($(blockdev --getsz /dev/sda)*512/4096 - 35))
Note 1, you may have to zero out a larger range of blocks for the secondary GPT. This is because I am not certain of my math. Most modern disks use 4096 byte sectors internally, but may report 512 bytes to the OS. I'm not sure what size the drives use for the LBA arithmetic. I think LBA blocks are 512 bytes, but in these examples I pretend it may be 4096 bytes just to be sure. blockdev reports 512-byte sectors for the --getsz option, but the seek option in dd counts in units of bs (4096 bytes here), so the result of blockdev has to be converted to a seek point in 4096 byte blocks. Note that because I use 4096 bytes for LBA block sizes this may over-estimate the size of the GPT and so could remove more than just the GPT. For my purposes this is OK because I just want to ignore whatever was on the disk before and get GRUB to install properly. This is bad if you are trying to surgically remove the GPT while preserving all other data.
Note 2, If you are exploring GPT by using dd to dump disk information, remember to use skip instead of seek.
Note 3, While using a live CD to get a shell to do any of this, you may also need to remove device-mapper targets (a wild guess), dmsetup remove vg--vmh--root, or something like that.
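The unit conversion in Note 1 can be worked through with plain shell arithmetic. This is a sketch under assumptions: `tail_seek` is a hypothetical helper, and both sector counts are made-up examples rather than values read from a drive.

```shell
#!/bin/sh
# tail_seek SECTORS BS BLOCKS
# SECTORS is the 512-byte sector count that `blockdev --getsz` would report.
# Prints the dd seek= value that places BLOCKS blocks of BS bytes at the end.
tail_seek() {
    echo $(( $1 * 512 / $2 - $3 ))
}

# Hypothetical sector count for a 1 TB drive.
sectors=1953525168

seek_4k=$(tail_seek "$sectors" 4096 35)
seek_512=$(tail_seek "$sectors" 512 2)
echo "bs=4096 count=35 seek=$seek_4k"
echo "bs=512  count=2  seek=$seek_512"

# Rounding check with a made-up 1001-sector disk: the integer division
# rounds down, so the 4096-byte arithmetic stops short of the true end.
odd=1001
covered=$(( ($(tail_seek "$odd" 4096 35) + 35) * 4096 ))
total=$(( odd * 512 ))
echo "wipe ends at byte $covered of $total"
```

When the sector count is not a multiple of 8, the final sectors fall outside the wiped range, and the last sector is where the backup GPT header lives. Doing the arithmetic with bs=512 avoids that rounding.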
While searching for strings in the dd dumps of the GPT of a drive I noticed the following string: Hah!IdontNeedEFI. A little research shows that this is the on-disk form of the partition type GUID for a BIOS boot partition (21686148-6449-6E6F-744E-656564454649).
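Here is a small sketch of why that GUID shows up as readable text. It assumes the usual GPT on-disk layout, where the first three GUID fields are stored little endian and the last two are stored as written; `guid_disk_ascii` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the 16 on-disk bytes of a GUID as raw characters.
guid_disk_ascii() {
    # Drop the dashes, leaving 32 hex digits.
    hex=$(printf '%s' "$1" | tr -d '-')
    # Byte-swap the first three fields (4, 2 and 2 bytes); the rest is unchanged.
    ordered=$(printf '%s' "$hex" |
        sed 's/^\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)\(..\)/\4\3\2\1\6\5\8\7/')
    # Emit each hex pair as one byte by way of an octal escape.
    for pair in $(printf '%s' "$ordered" | sed 's/../& /g'); do
        printf "\\$(printf '%03o' "0x$pair")"
    done
    printf '\n'
}

guid_disk_ascii 21686148-6449-6E6F-744E-656564454649
```

This only produces readable output for GUIDs whose bytes happen to be printable ASCII, which is the joke in this one.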
Fill a file with bytes
This creates a 10MB file filled with zeros (0):
dd if=/dev/zero bs=1M count=10 of=test_data.bin
You can use /dev/zero and `tr` to generate and fill a file with any given byte constant. This creates a 10MB file filled with ones as a bit pattern (0b11111111, 0377, 255, or 0xff).
dd if=/dev/zero bs=1M count=10 | tr '\0' '\377' > test_data.bin
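The same pipeline can be wrapped in a small function. This is a sketch, assuming a hypothetical helper called `fill_file` that takes the output file, the byte value as three octal digits, and the size in KiB. The example fills a 4 KiB scratch file with 0xAA (octal 252).

```shell
#!/bin/sh
# fill_file FILE OCTAL_BYTE SIZE_KIB
# Writes SIZE_KIB kibibytes of the given byte value to FILE.
fill_file() {
    dd if=/dev/zero bs=1024 count="$3" 2>/dev/null | LC_ALL=C tr '\0' "\\$2" > "$1"
}

out=$(mktemp)
fill_file "$out" 252 4

# Show the first four bytes, then confirm every byte is 0xAA.
od -An -tx1 -N 4 "$out"
size=$(wc -c < "$out" | tr -d ' ')
other=$(LC_ALL=C tr -d '\252' < "$out" | wc -c | tr -d ' ')
echo "size=$size bytes, bytes that are not 0xAA: $other"

rm -f "$out"
```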
Filling a file with bytes other than zero can be handy for use with devices such as framebuffers where you want to clear the display but set all the pixels to white instead of black. Note that this may be larger than your framebuffer. The framebuffer device should give a harmless error when you try to write beyond the end of the framebuffer.
dd if=/dev/zero bs=1024 count=1024 | tr '\0' '\377' > /dev/fb0 2>/dev/null
This can also be done without using `dd`:
tr '\0' '\377' < /dev/zero > /dev/fb0 2>/dev/null
Copy a drive to an image file
The following will image a drive and compress the image.
dd bs=4096 conv=noerror,sync if=/dev/sda | gzip -c > drive.img.gz
The conversion options (conv) are useful when working with physical devices such as drives. The noerror option says that the copy process should continue even if there are read errors from the drive. With this option a read error will cause the rest of the current block to be skipped and a warning message printed to stderr. The sync option says that the missing data from any skipped blocks should be replaced with null bytes. This ensures that bytes in the output file are at the same offset they would have had if there had been no skipped blocks. Setting bs to the physical block size used by the drive ensures that as little data as possible is skipped due to read errors. If bs were set larger than the physical size of the block with the error, then more data than necessary would be skipped. Most drives use 4096 byte blocks. CD drives use 2048 byte blocks.
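The padding done by sync is easy to see on a tiny input. In this sketch a 3-byte string stands in for a short read, and bs=8 is an arbitrary small block size chosen only to keep the output readable.

```shell
#!/bin/sh
# A 3-byte input is shorter than the 8-byte block size, so it is a short read.
padded=$(printf 'abc' | dd bs=8 conv=sync 2>/dev/null | wc -c | tr -d ' ')
plain=$(printf 'abc' | dd bs=8 2>/dev/null | wc -c | tr -d ' ')
echo "with conv=sync: $padded bytes, without: $plain bytes"

# Dump the padded block to show the null bytes that sync appended.
printf 'abc' | dd bs=8 conv=sync 2>/dev/null | od -An -c
```

With conv=sync the short block is padded out to the full block size, which is what keeps later data at its original offset after a read error.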
Many guides written for copying drives also show the notrunc option, but as far as I can tell this option is irrelevant in this context. All notrunc does is tell dd not to truncate an existing output file before writing to it, so it cannot matter when the output is a pipe. Some people seem to believe that dd would stop the copy process if all remaining blocks in the device were filled with nulls, so the output image size would be smaller than the drive, and that specifying notrunc tells dd to continue copying even the null blocks so that the output image is identical to the contents of the drive. At least, I think that is the reasoning some people use to explain why they add the notrunc option, but I have found this not to be true. This option seems to have no effect in any of the use cases I have tested.
See Forensics, Undelete, and Data Recovery for more powerful tools, such as ddrescue, to recover damaged drives.
Restore from an image
gunzip -c drive.img.gz | dd of=/dev/sda
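The image and restore steps can be tried end to end on scratch files before trusting them with a real drive. This sketch assumes a 1 MiB file of random data stands in for /dev/sda; the file names under the temporary directory are made up.

```shell
#!/bin/sh
work=$(mktemp -d)

# Stand-in "drive": 1 MiB of random data.
dd if=/dev/urandom of="$work/drive.bin" bs=4096 count=256 2>/dev/null

# Image and compress, as in the drive example.
dd bs=4096 conv=noerror,sync if="$work/drive.bin" 2>/dev/null | gzip -c > "$work/drive.img.gz"

# Restore to a second file standing in for the target drive.
gunzip -c "$work/drive.img.gz" | dd of="$work/restored.bin" 2>/dev/null

# The restored copy should be byte-for-byte identical to the original.
if cmp -s "$work/drive.bin" "$work/restored.bin"; then
    result=identical
else
    result=different
fi
echo "round trip: $result"

rm -rf "$work"
```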
Image a CD or DVD
Basically, do this:
dd if=/dev/dvd bs=2048 count=$(isosize -d 2048 /dev/dvd) conv=noerror,sync,notrunc > disc.iso
Note that /dev/cdrom, /dev/dvd, /dev/cdrw, and /dev/scd0 are usually just sym links to /dev/sr0 or some other optical disc device.
The following shows the naive, bad way to copy a CD-ROM or DVD to an ISO file. This works, but it will often grab a few extra null blocks which will throw off the checksum of the disc image. If you burn this image onto a new disc then the checksum of the new disc will not match the checksum of the image file.
dd if=/dev/cdrom of=disc.iso
The following will create a correct image of a CD-ROM or DVD. This ensures that the image will have exactly the same md5sum or checksum value no matter what device or operating system is used to burn the image. This is a two-step process.
isoinfo -d -i /dev/cdrom
# Take the values of Logical Block Size and Volume Space Size and plug them into bs and count below:
dd if=/dev/cdrom bs=2048 count=326239 conv=noerror,sync,notrunc > disc.iso
You can do this in one line:
dd if=/dev/dvd bs=2048 count=$(isosize -d 2048 /dev/dvd) conv=noerror,sync,notrunc > disc.iso
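The effect of bounding the copy with count can be shown on scratch files. This is only an illustration under assumptions: a 5-block file of 0xFF stands in for the ISO filesystem, 3 trailing null blocks stand in for the extra blocks a drive may return, and the count of 5 is hard-coded where `isosize` would normally supply it.

```shell
#!/bin/sh
work=$(mktemp -d)

# The "real" image: 5 blocks of 2048 bytes.
dd if=/dev/zero bs=2048 count=5 2>/dev/null | LC_ALL=C tr '\0' '\377' > "$work/payload.iso"

# The "device": the same data followed by 3 extra null blocks.
cp "$work/payload.iso" "$work/device.img"
dd if=/dev/zero bs=2048 count=3 2>/dev/null >> "$work/device.img"

# Naive copy reads everything; bounded copy stops after 5 blocks.
dd if="$work/device.img" of="$work/naive.iso" 2>/dev/null
dd if="$work/device.img" of="$work/exact.iso" bs=2048 count=5 2>/dev/null

if cmp -s "$work/payload.iso" "$work/exact.iso"; then exact=match; else exact=differ; fi
if cmp -s "$work/payload.iso" "$work/naive.iso"; then naive=match; else naive=differ; fi
echo "bounded copy: $exact, naive copy: $naive"

rm -rf "$work"
```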
You can turn this into an alias. The alias, `cdgen`, generates an ISO image from a directory tree and dumps it to stdout. The alias, `cddump`, dumps an ISO image to stdout. The alias, `cdburn`, reads an ISO image from stdin and burns it to a disc. These assume the primary device, /dev/dvd, is the one you want (it works for CD as well as DVD).
alias cdgen='genisoimage -quiet -iso-level 3 -J -force-rr -l -N -d -allow-leading-dots -allow-multidot -V "`date --rfc-3339=seconds`" -r '
alias cddump='dd if=/dev/dvd bs=2048 count=$(isosize -d 2048 /dev/dvd) conv=noerror,sync,notrunc'
alias cdburn='cdrecord fs=16m speed=8 padsize=63s -pad -dao -v -'
Here are some examples of how these can be used:
cddump | md5sum
cddump > disc.iso
cat disc.iso | cdburn
Steve Litt's `rawread` script does this automatically with the added advantage that it gets the Logical Block Size as reported by the drive instead of assuming that it is 2048. However, all ISO formatted CDs and DVDs use 2048 for the Logical Block Size, so I usually just use the aliases above.
#!/bin/sh
# Dump an ISO 9660 disc to stdout using the block size and block count reported by isoinfo.
device=$1
blocksize=`isoinfo -d -i "$device" | grep "^Logical block size is:" | cut -d " " -f 5`
if test "$blocksize" = ""; then
    echo "catdevice FATAL ERROR: Blank blocksize" >&2
    exit 1
fi
blockcount=`isoinfo -d -i "$device" | grep "^Volume size is:" | cut -d " " -f 4`
if test "$blockcount" = ""; then
    echo "catdevice FATAL ERROR: Blank blockcount" >&2
    exit 1
fi
# Show the dd command on stderr, then run it.
command="dd if=$device bs=$blocksize count=$blockcount conv=notrunc,noerror"
echo "$command" >&2
$command
Steve Litt's `rawread` script can be used to do things like the following. Create an ISO disc dump:
rawread /dev/cdrom > disc.iso
Check the md5sum of the physical optical disc:
rawread /dev/cdrom | md5sum
Image a drive with compression
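dd if=/dev/sda conv=sync,noerror | gzip -c > drive.img.gz
To restore the image, leave out conv=sync. When dd reads from a pipe, sync pads any short read with null bytes, which can corrupt the restored data.
gunzip -c drive.img.gz | dd of=/dev/sda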
- Save drive geometry info because cylinder size helps determine where a partition is stored
fdisk -l /dev/sda > drive.img.info
- Help the drive image compress more by filling unallocated space with zeros. Do this before you create the backup image. Don't do this on images to be used for forensic recovery! This creates a file filled with zeros and then deletes it. Because dd exits with an error once the filesystem is full, the commands are chained with `;` rather than `&&` so that the file is still removed.
dd if=/dev/zero of=/media/drive_sda1/delete.me; sync; rm /media/drive_sda1/delete.me; sync
Image a drive over a network with `dd` and `ssh` or `nc` (netcat)
You can use netcat or SSH to copy a disk over a network. If you are doing this on a live server you should unmount the drive, switch to single user mode, or boot from a live CD. You don't strictly have to unmount the drive. You may copy a live, mounted drive, but you should expect some corrupt files. This is certainly not the correct way to do it, but I have never had a problem. When you try to mount the drive image later, it will complain that it was not cleanly unmounted or that its journal is inconsistent. It's better if the drive is not mounted or is mounted read-only.
I prefer using `ssh` over `netcat` because the entire process is started from one machine in one step and all the traffic is encrypted.
dd if=/dev/sda | gzip -c - | ssh firstname.lastname@example.org "dd of=disk.img.gz"
This example uses 192.168.1.100 for the receiving machine's IP address. Port 2222 is used as the listening port on the receiving machine. You may substitute any free port. First, start the Netcat listener on the receiver:
nc -l 2222 > disk.img.gz
Then start the pipeline for `dd|gzip|nc` on the sender:
dd bs=1M if=/dev/sda | gzip -c - | nc 192.168.1.100 2222
Show progress status statistics of `dd`
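Operations with `dd` can take a long time. Older versions of `dd` have no command-line option to print progress (GNU coreutils 8.24 and later accept `status=progress`), but you can send a running GNU `dd` process a USR1 signal to have it print its progress statistics to stderr. On BSD and Mac OS X, `dd` reports on SIGINFO instead, and USR1 will terminate it. For example, say you started `dd` and you know its PID is 15045: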
kill -USR1 15045
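Here is a fancier example that updates every 10 seconds. Note that `$!` after a backgrounded pipeline is the PID of the last command in the pipeline (`ssh` here), not `dd`, so this signals `dd` by name instead. It will signal every local `dd` process you own.
dd if=/dev/sda | gzip -c - | ssh email@example.com "dd of=disk_image.gz" &
sleep 1; while pkill -USR1 -x dd; do sleep 10; done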
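The signal mechanism can be tried safely with a copy that touches no disk. This sketch assumes GNU `dd` on Linux, copies /dev/zero to /dev/null, and captures the statistics from stderr in a scratch file. Since only one command is put in the background here, `$!` is the PID of `dd` itself.

```shell
#!/bin/sh
# Scratch file to capture dd's stderr, where the statistics are printed.
stats=$(mktemp)

# A harmless long-running copy: zeros into /dev/null.
dd if=/dev/zero of=/dev/null bs=1M 2>"$stats" &
pid=$!

sleep 1             # give dd time to install its USR1 handler
kill -USR1 "$pid"   # ask for a progress report
sleep 1             # give dd time to write it
kill "$pid"         # stop the copy
wait "$pid" 2>/dev/null || true

report=$(cat "$stats")
rm -f "$stats"
echo "$report"
```

The short sleep before the first signal matters: a USR1 that arrives before `dd` has set up its handler kills the process instead of printing statistics.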