Disk mounting

From Noah.org

Can't unmount (umount): "device is busy."

Nested mounts can cause the error "umount: foo: device is busy." when you unmount the parent directory. When unmounting a filesystem you may see something like this:

# umount /mnt/disk_image_loop
umount: /mnt/disk_image_loop: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))

You confirm that no terminal window has a shell whose working directory is under the mount point. You then find that fuser -m /mnt/disk_image_loop and lsof -n -N | grep disk_image_loop give no useful information. Checking losetup, you see that a loopback device is associated with the mount point. [What do the columns in the losetup report mean? The output fields are not documented, so apparently you are supposed to guess or go read the source code. As far as I can tell the format is loop_device: [device number of the filesystem holding the backing file]:inode_of_backing_file (file_used_for_loop).]

# losetup --all
/dev/loop0: [fc00]:25953690 (/var/disk-images/sid.img)
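
The fields of that report can be pulled apart with plain shell parameter expansion. This is a minimal sketch that assumes the older one-line format shown above and my reading of it (device number, then inode of the backing file, then its path). `parse_losetup_line` is a hypothetical helper name, not part of losetup.

```shell
# Split one line of old-style "losetup --all" output into labelled fields.
# Assumed format:  LOOPDEV: [DEVNO]:INODE (BACKING_FILE)
parse_losetup_line() {
    line=$1
    loopdev=${line%%:*}                        # text before the first colon
    rest=${line#*: }                           # drop "LOOPDEV: "
    devno=${rest#\[}; devno=${devno%%\]*}      # text between [ and ]
    inode=${rest#*\]:}; inode=${inode%% *}     # digits after "]:"
    file=${rest#*\(}; file=${file%\)*}         # text between ( and )
    printf 'loop=%s dev=%s inode=%s file=%s\n' "$loopdev" "$devno" "$inode" "$file"
}

parse_losetup_line '/dev/loop0: [fc00]:25953690 (/var/disk-images/sid.img)'
```

For the sample line this prints loop=/dev/loop0 dev=fc00 inode=25953690 file=/var/disk-images/sid.img.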

This can happen with nested mounts. When checking the output of mount it is easy to overlook additional mounts inside your mounted filesystem. For example, if you are building a root filesystem you might have /var/disk-images/sid.img mounted on /mnt/disk_image_loop and then not notice that you also have proc and devpts filesystems mounted on directories under /mnt/disk_image_loop. You must unmount these filesystems before you can unmount /mnt/disk_image_loop.

/dev/sda1 on / type ext4 (rw,noatime,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/var/disk-images/sid.img on /mnt/disk_image_loop type ext3 (rw)
/proc on /mnt/disk_image_loop type none (rw,bind)
devpts on /mnt/disk_image_loop type devpts (rw)
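
A quick way to spot nested mounts is to list every mount point at or below the directory, deepest first, which is also the order in which they have to be unmounted. This is a sketch, not a polished tool: `nested_mounts` is a made-up helper name, it reads /proc/mounts by default, and it does not decode the \040 escapes that /proc/mounts uses for spaces in paths.

```shell
# List mount points at or below DIR, children before parents.
# Usage: nested_mounts DIR [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
nested_mounts() {
    dir=${1%/}    # strip one trailing slash so "/mnt/x/" and "/mnt/x" match alike
    awk -v dir="$dir" '
        # field 2 is the mount point; keep DIR itself and anything under DIR/
        $2 == dir || index($2, dir "/") == 1 { print $2 }
    ' "${2:-/proc/mounts}" | LC_ALL=C sort -r    # reverse byte order puts children first
}
```

Something like `nested_mounts /mnt/disk_image_loop | while read -r m; do umount "$m"; done` would then unmount the whole tree. Recent util-linux versions can do much the same with `findmnt -R` and `umount -R`.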

Find open but deleted files (unlinked files that still have an inode but no directory entry)

This shows files that have been unlinked but are still open. Such a file is not actually deleted until the last process holding it open closes it or exits (descriptors are closed when the process exits, not when its parent collects the exit status with wait). This can also be handy for finding what might be keeping a filesystem from being unmounted.

lsof +L1
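
To see the effect for yourself, the following sketch (assuming Linux with /proc mounted) opens a temporary file, unlinks it, and then reads it back through the still-open descriptor. While descriptor 3 is open, the file is the kind of entry that lsof +L1 reports.

```shell
# Create a scratch file and keep it open on file descriptor 3.
tmpfile=$(mktemp)
echo "still here" > "$tmpfile"
exec 3< "$tmpfile"

# Remove the directory entry; the inode survives while fd 3 is open.
rm "$tmpfile"

# On Linux the fd symlink now carries a " (deleted)" suffix.
link_target=$(readlink "/proc/$$/fd/3")
echo "fd 3 points to: $link_target"

# The data is still readable through the descriptor.
content=$(cat <&3)
echo "content: $content"

# Closing the last descriptor is what finally frees the file.
exec 3<&-
```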

fuser still has one more trick

fuser -n file -m /root/rootfs/quark-1.img

From this I got an absurdly long list of PIDs:

# fuser -m /root/rootfs/quark-1.img
/root/rootfs/quark-1.img:     1rce     2rc     3rc     5rc     7rc     8rc     9rc    10rc    11rc    12rc    13rc    14rc    15rc    16rc    17rc    18rc    19rc    20rc    21rc    23rc    24rc    25rc    26rc    28rc    29rc    30rc    31rc    33rc    34rc    35rc    36rc    37rc    38rc    39rc    40rc    41rc    42rc    43rc    45rc    46rc    47rc    48rc    49rc    50rc    51rc    53rc    54rc    55rc    56rc    57rc    58rc    70rc    72rc    91rc    92rc   197rc   198rc   730rce   943rce  1035rce  1091rce  1093rce  1119rce  1129rce  1132rce  1136rce  1137rce  1139rce  1169rce  1170rce  1177rce  1190rce  1209rce  1261rc  1262rc  1263rc  1268rc  1274rc  1292rc  1293rc  1294rc  1295rc  1327rc  1423rce  1442rce  1536rce  1585rc  1985rc  4617rce  4977rc  7627rce  7712rce  8372rce  8375rc  8460rce  8461rce  8498rce  8499rce  8500re  8682rce 15573rce 15577rce 15578rce 16818rc 16819rc 16820rc 16843rc 19397rc 21045rce 21047rce 22367rc 22636rc 23046rce 23057re 23060re 23066rce 23068rc 23147rce 23148rce 23185rce 23186rce 23187rce 25133rc 25135rc 25136rc 25426rc 28464rce 28466rce
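
The letters after each PID are fuser's access codes, and they explain the long list: with -m, fuser reports every process using the filesystem that contains the named file, so if the image file lives on the root filesystem, nearly every process shows up with its root (r) and current directory (c) there. Here is a small sketch that decodes tokens like the ones above. `decode_fuser` is a hypothetical helper name, and the code meanings are taken from the fuser(1) man page.

```shell
# Decode fuser tokens such as "1rce" into "PID: meaning, meaning, ...".
decode_fuser() {
    for token in "$@"; do
        pid=${token%%[!0-9]*}       # leading digits
        codes=${token#"$pid"}       # trailing access letters
        desc=""
        while [ -n "$codes" ]; do
            rest=${codes#?}         # everything after the first letter
            letter=${codes%"$rest"} # the first letter itself
            codes=$rest
            case $letter in
                c) name="current directory" ;;
                e) name="executable" ;;
                f) name="open file" ;;
                F) name="open file (writing)" ;;
                r) name="root directory" ;;
                m) name="mmap or shared library" ;;
                *) name="unknown($letter)" ;;
            esac
            desc="${desc:+$desc, }$name"
        done
        printf '%s: %s\n' "$pid" "$desc"
    done
}

decode_fuser 1rce 8500re 1261rc
```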

I decided to look at one of the last PIDs, 28464, and saw this:

28464     1   0  19  0.0  0.0     4-11:34:07 root      wait                     Ss   /bin/sh -e /proc/self/fd/9

This looked odd, and having nothing better to do, I killed it with -9. All of a sudden all my shell sessions died. For a while I could not ssh back to the server, but then it seemed to recover and I was able to log in again. The mysterious stuck mount was gone and losetup -a showed no connected loops. "I fixed it!" No... uptime showed that the machine had rebooted itself. I found nothing in /var/log to give any more clues as to what happened. Apparently the crash didn't give processes any time to sync their log files.

chrooted processes cause "device is busy" error during umount

This error can happen if a process was chrooted to the mounted filesystem and left running. The process could be a daemon or just an open shell somewhere. These can be difficult to find using the usual lsof or fuser commands because these won't report anything with the name of the mount point or the mounted device name (because it was chrooted). You can demonstrate this by mounting a root filesystem and chrooting into it and then running a trivial little bash daemon.

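# Mount the image through a loop device.
mount -o loop /var/disk-images/sid.img /mnt/disk_image_loop
chroot /mnt/disk_image_loop /bin/bash
# Inside the chroot: start a detached loop that appends the date to /log.log.
( ( while true; do date >> /log.log; sleep 1; done ) & ) &
# From a shell outside the chroot: watch the log grow.
tail -n 1 /mnt/disk_image_loop/log.log
sleep 2
tail -n 1 /mnt/disk_image_loop/log.log
# The root link of the daemon points into the mounted filesystem.
ls -l /proc/*/root | grep /mnt/disk_image_loop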

This technique of searching for chrooted processes can be handy. Here is an alias for listing chrooted processes.

alias lschroot='ls -l /proc/*/root | grep -v -e "-> /$"'

This is also a useful way to view all chrooted processes:

for procpid in /proc/*/root; do
    linktarget=$(readlink "${procpid}")
    if [ "${linktarget}" != "/" ]; then
        echo "${procpid} chrooted to ${linktarget}"
    fi
done

This is equivalent, with a different style of globbing:

for procpid in /proc/[0-9]*; do
    linktarget=$(readlink "${procpid}/root")
    if [ "${linktarget}" != "/" ]; then
        echo "${procpid} chrooted to ${linktarget}"
    fi
done


This lists all the disks that Linux sees. It does not show loop devices; see the `losetup` section below for more information:

fdisk -l


Convert a VMware flat split disk image set to a raw disk image:

cat linux-server-f001.vmdk linux-server-f002.vmdk linux-server-f003.vmdk > linux-server.img

Find the start of the partitions:

fdisk -l -u linux-server.img

On older disks the first partition usually starts at sector 63 (newer partitioning tools usually start it at sector 2048). Each sector is usually 512 bytes. For a partition starting at sector 63 the offset is therefore:

echo $((63*512))

Find the start of each partition as an exact byte offset (easier than `fdisk`):

parted linux-server.img unit b print

List the next available loopback device:

losetup -f

Attach a loopback device to a partition offset inside the disk image:

losetup -o $((63*512)) /dev/loop0 linux-server.img

Create a mount point:

mkdir -p /media/adhoc

Mount the partition:

mount /dev/loop0 /media/adhoc

Unmount the partition before cleaning up the loop device:

umount /media/adhoc

Clean up the loop device:

losetup -d /dev/loop0
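
The offset arithmetic is easy to get wrong by hand, so here is a small sketch that computes the byte offset from a start sector. `partition_offset` is a made-up helper name, and the 512-byte default sector size is an assumption; check the sector size that fdisk reports for your image.

```shell
# Print the byte offset of a partition.
# Usage: partition_offset START_SECTOR [SECTOR_SIZE]   (SECTOR_SIZE defaults to 512)
partition_offset() {
    echo $(( $1 * ${2:-512} ))
}

partition_offset 63      # prints 32256
partition_offset 2048    # prints 1048576
```

It can be dropped straight into the attach step, for example `losetup -o "$(partition_offset 63)" /dev/loop0 linux-server.img`.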

losetup -- mount individual partitions in a whole disk image

If you have a whole disk image and you want to mount partitions inside that image, use `losetup` to create a loopback device for the image. For example, say you copied an entire disk using `dd` like this:

dd if=/dev/sda of=disk.img

You can later create a loop device for it and see its partitions with `fdisk` and mount those partitions individually with `mount`. Note that `fdisk -l` does not normally show loop devices. You must add an explicit path to the loop device that you want to list.

losetup /dev/loop0 disk.img
fdisk -l /dev/loop0

The previous example assumed that /dev/loop0 was free. You can use the '-f' option to automatically find a free loop device. In this example we first use the '-f' option to associate the image file with the next available loop device; then we use the '-j' option to see which loop device was associated with the file:

losetup -f disk.img
losetup -j disk.img
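
With a reasonably recent util-linux, the two steps can be combined: --show prints the device that --find picked, and --partscan asks the kernel to create a device node for each partition. The wrapper below is only a sketch; `run`, `attach_image`, and the DRY_RUN variable are hypothetical names introduced here so that the command can be previewed without root.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Attach IMAGE to the first free loop device, scan its partition table,
# and print the name of the loop device that was used.
attach_image() {
    run losetup --find --show --partscan "$1"
}

# Preview the command without touching any device.
DRY_RUN=1
attach_image disk.img
```

After a real run (DRY_RUN unset, as root) the partitions should appear as /dev/loopNp1, /dev/loopNp2, and so on, and can be mounted directly.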

mounting partitions inside a disk image without loop device

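It is also possible to mount a partition inside a disk image file directly with `mount` using the 'offset' option, but I have not had luck with this. The offset is given in bytes and must be the exact start of the partition (start sector times sector size), so an arbitrary value will not work. For a partition starting at sector 63 with 512-byte sectors:

mount -o loop,ro,offset=$((63*512)) disk.img /media/adhoc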

Disk recovery

Use `dd_rhelp`. This is a wrapper around `dd_rescue` that makes it easier to use.