Disk mounting

Can't unmount (umount): "device is busy."

Nested mounts can cause the error "umount: foo: device is busy." when you unmount the parent directory. When unmounting a filesystem you may see something like this:

# umount /mnt/disk_image_loop
umount: /mnt/disk_image_loop: device is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))

You confirm that no terminal windows have a shell whose working directory is under the mount point. You then find that fuser -m /mnt/disk_image_loop (list processes using the mounted filesystem) and lsof -n -N | grep disk_image_loop give no useful information. Checking losetup, you see that a loopback device is associated with the mount point. [The fields of the losetup report are not documented, so you are apparently supposed to guess or go read the source code. As far as I can tell they are: loop_device: [device number, in hex, of the filesystem that holds the backing file]:inode_of_backing_file (path_of_backing_file).]

# losetup --all
/dev/loop0: [fc00]:25953690 (/var/disk-images/sid.img)
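The fields in that report can be pulled apart with a few lines of shell. This is only a sketch based on my reading of the old-style losetup output; the function name decode_losetup_line is made up for illustration, and the major/minor split assumes the kernel's usual encoding of device numbers.

```shell
# Split one line of old-style "losetup --all" output into its fields.
# Assumed format:  loop_device: [hex_dev]:inode (backing_file)
decode_losetup_line() (
    line=$1
    loopdev=${line%%:*}                        # text before the first colon
    hexdev=${line#*\[}; hexdev=${hexdev%%\]*}  # text between [ and ]
    inode=${line#*\]:}; inode=${inode%% *}     # digits after "]:"
    file=${line#*\(}; file=${file%\)}          # text between ( and )
    dev=$((0x$hexdev))
    # Assumed encoding: major in bits 8-19, minor in bits 0-7 and 20-31.
    major=$(( (dev >> 8) & 0xfff ))
    minor=$(( (dev & 0xff) | ((dev >> 12) & 0xfff00) ))
    echo "loop=$loopdev dev=$major:$minor inode=$inode file=$file"
)

decode_losetup_line '/dev/loop0: [fc00]:25953690 (/var/disk-images/sid.img)'
```

If the path in parentheses is truncated or stale, the inode number can be used to look for the backing file, for example with find /var/disk-images -xdev -inum 25953690.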

The "device is busy" error can happen with nested mounts. When checking mount it is easy to overlook additional mounts inside your mounted filesystem. For example, if you are building a root filesystem you might have /var/disk-images/sid.img mounted on /mnt/disk_image_loop and then not notice that you have additional proc and devpts filesystems mounted on directories under /mnt/disk_image_loop. You must unmount these filesystems before you can unmount /mnt/disk_image_loop.

/dev/sda1 on / type ext4 (rw,noatime,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
/var/disk-images/sid.img on /mnt/disk_image_loop type ext3 (rw)
/proc on /mnt/disk_image_loop type none (rw,bind)
devpts on /mnt/disk_image_loop type devpts (rw)
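A small helper can make nested mounts harder to overlook. The following is a sketch, not a tested tool: the function name nested_mounts is hypothetical, and it assumes the mount table is readable in /proc/mounts format (device, mount point, type, ...), where spaces in paths appear as \040.

```shell
# List mount points at or below a directory, deepest first, so that they can
# be unmounted in that order. Reads /proc/mounts by default; a second
# argument substitutes another file in the same format.
nested_mounts() (
    top=${1%/}
    mounts_file=${2:-/proc/mounts}
    awk -v top="$top" '
        $2 == top || index($2, top "/") == 1 { print $2 }
    ' "$mounts_file" | LC_ALL=C sort -r
)

# Example (as root):
#   nested_mounts /mnt/disk_image_loop | while read -r m; do umount "$m"; done
```

The reverse sort works because a child mount point always has its parent as a prefix, so it sorts after the parent in ascending order. Newer versions of util-linux also have findmnt -R and umount -R, which may do the same job if they are available on your system.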

Find open but deleted files (unlinked files that still have an inode but no directory entry).

This shows files that have been unlinked but are still open. A file isn't actually deleted until the last process holding it open closes it or exits. (Descriptors are closed when the process exits, not when its parent collects the exit code with wait, so a zombie does not keep a file alive.) This can also be handy for finding what might be keeping a mount from being unmountable.

lsof +L1
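If lsof is not installed, roughly the same information can be read from /proc, where the symlink for an unlinked file ends in " (deleted)". This is a sketch under that assumption; the function name deleted_open_files is made up, and the optional argument exists only so the function can be pointed at a test directory instead of /proc.

```shell
# Print "PID FD TARGET" for every open file descriptor whose target has been
# unlinked. Run as root to see the descriptors of other users' processes.
deleted_open_files() (
    proc_root=${1:-/proc}
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # readlink fails for descriptors we may not inspect; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                pid=${fd#"$proc_root"/}
                pid=${pid%%/*}
                echo "$pid ${fd##*/} $target"
                ;;
        esac
    done
)

# Example:
#   deleted_open_files | grep /mnt/disk_image_loop
```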

Example of a hard-to-find process keeping a mount point stuck open

The following is suspicious, but I can't say for sure that it's causing problems.

# lsof +L1
COMMAND PID USER   FD   TYPE DEVICE SIZE/OFF NLINK   NODE NAME
init      1 root   18w   REG  202,1      804     0 271554 /var/log/upstart/docker.io.log.1 (deleted)

I check with ps and find that a docker.io service is running. I still can't pinpoint my stuck mount to this process, but I decide to shut it down to see if this gets me anywhere.

service docker.io stop

Now lsof +L1 returns nothing, so the docker.io service was at least the cause of the open, deleted file. After all this I find that I still cannot umount the stuck filesystem... oh well, this is a terrible example so far.

Next I try strace:

strace umount /tmp/tmp.NXMuSARwu9

That gives me a lot of output. It seems that umount decided that the filesystem was busy soon after reading the /etc/fstab file. I noticed something odd there: /proc and /sys were listed twice. Perhaps my root filesystem builder escaped its chroot and corrupted the host's /etc/fstab file.

proc /proc proc defaults 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
sysfs /sys sysfs defaults 0 0

I fixed the fstab file with no effect. I'm not sure how that would cause umount to fail, but it was another odd clue.
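Duplicate entries like these are easy to miss by eye in a longer file. Here is a sketch of a check; the function name fstab_duplicates is hypothetical, and it assumes the usual fstab layout with the mount point in the second field.

```shell
# Print every mount point that appears more than once in an fstab-style file.
# Comment lines are skipped, as are swap entries, whose second field is
# conventionally "none" or "swap" rather than a real mount point.
fstab_duplicates() {
    awk '
        /^[ \t]*#/ { next }
        NF >= 2 && $2 != "none" && $2 != "swap" {
            if (seen[$2]++ == 1) print $2
        }
    ' "${1:-/etc/fstab}"
}

# Example:
#   fstab_duplicates /etc/fstab
```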

Another concern I had was my use of fallocate to reserve a large block of filesystem storage for a virtual disk image file. This function returns quickly. My understanding, which may be wrong, is that the reservation is only noted as a credit and that actual blocks of the filesystem are not reserved until some later point. It may be that if some other process kills the process doing the fallocate before then, the kernel is left in a bad state with hung kernel threads.
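One way to check whether an image file is really backed by blocks is to compare its apparent size with the space allocated to it. The sketch below assumes GNU stat (the -c option); the function names are made up, and the demo uses a sparse file from truncate as a stand-in because fallocate is not supported on every filesystem.

```shell
# Size of the file as seen by programs that read it.
apparent_bytes() { stat -c '%s' "$1"; }

# Space actually allocated: number of blocks times the size of each block.
allocated_bytes() (
    set -- $(stat -c '%b %B' "$1")
    echo $(( $1 * $2 ))
)

# Demo: a sparse 1 MiB file has a large apparent size but few or no blocks.
demo=$(mktemp)
truncate -s 1M "$demo"
echo "apparent=$(apparent_bytes "$demo") allocated=$(allocated_bytes "$demo")"
rm -f "$demo"
```

After a successful fallocate -l on an image file I would expect the two numbers to be roughly equal. If they are, the blocks really have been allocated and the theory above about a deferred reservation probably does not hold.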

fuser still has one more trick

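This might be a red herring. The -m option treats its argument as a name on a mounted filesystem and lists every process using any file on that filesystem, so it dumps a long list of PIDs for nearly any existing file.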

fuser -n file -m /root/rootfs/quark-1.img

From this I got an absurdly long list of PIDs:

# fuser -m /root/rootfs/quark-1.img
/root/rootfs/quark-1.img:     1rce     2rc     3rc     5rc     7rc     8rc
9rc    10rc    11rc    12rc    13rc    14rc    15rc    16rc    17rc    18rc
19rc    20rc    21rc    23rc    24rc    25rc    26rc    28rc    29rc    30rc
31rc    33rc    34rc    35rc    36rc    37rc    38rc    39rc    40rc    41rc
42rc    43rc    45rc    46rc    47rc    48rc    49rc    50rc    51rc    53rc
 54rc    55rc    56rc    57rc    58rc    70rc    72rc    91rc    92rc   197rc
198rc   730rce   943rce  1035rce  1091rce  1093rce  1119rce  1129rce
1132rce  1136rce  1137rce  1139rce  1169rce  1170rce  1177rce  1190rce
1209rce  1261rc  1262rc  1263rc  1268rc  1274rc  1292rc  1293rc  1294rc
 1295rc  1327rc  1423rce  1442rce  1536rce  1585rc  1985rc  4617rce
 4977rc  7627rce  7712rce  8372rce  8375rc  8460rce  8461rce  8498rce
8499rce  8500re  8682rce 15573rce 15577rce 15578rce 16818rc 16819rc
16820rc 16843rc 19397rc 21045rce 21047rce 22367rc 22636rc 23046rce
23057re 23060re 23066rce 23068rc 23147rce 23148rce 23185rce 23186rce
 23187rce 25133rc 25135rc 25136rc 25426rc 28464rce 28466rce

I decided to look at one of the last PIDs in the list, 28464, and saw this:

  PID  PPID  NI PRI %CPU %MEM    ELAPSED USER     WCHAN  STAT COMMAND
28464     1   0  19  0.0  0.0 4-11:34:07 root     wait   Ss   /bin/sh -e /proc/self/fd/9

This looked odd and, having nothing better to do, I killed it with -9. All of a sudden all my shell sessions died. For a while I could not ssh back to the server, but then it seemed to recover and I was able to log in again. The mysterious stuck mount was gone and losetup -a showed no connected loops. "I fixed it!" No, wait... uptime showed that the machine had rebooted itself. I found nothing in /var/log to give any more clues as to what happened. Apparently the crash didn't give processes any time to sync their log files.
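The letters after each PID in the fuser listing say how the process uses the filesystem, which helps explain the long list: nearly every process has its root directory (r) and current directory (c) on the filesystem that holds the image file. The helper below is only an illustration, with a made-up function name; the meanings are taken from the fuser man page as I remember it.

```shell
# Expand a string of fuser access codes such as "rce" into words.
explain_fuser_codes() (
    codes=$1
    out=
    while [ -n "$codes" ]; do
        rest=${codes#?}          # everything after the first character
        c=${codes%"$rest"}       # the first character
        case $c in
            c) desc='current directory' ;;
            e) desc='executable being run' ;;
            f) desc='open file' ;;
            F) desc='open file for writing' ;;
            r) desc='root directory' ;;
            m) desc='mmapped file or shared library' ;;
            *) desc="unknown code $c" ;;
        esac
        out=${out:+$out, }$desc
        codes=$rest
    done
    echo "$out"
)

explain_fuser_codes rce
```

Seen this way, a plain "rc" entry only says that the process lives on the same filesystem. It says nothing about the image file itself.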

chrooted processes cause "device is busy" error during umount

This error can happen if a process was chrooted to the mounted filesystem and left running. The process could be a daemon or just an open shell somewhere. These can be difficult to find using the usual lsof or fuser commands because these won't report anything with the name of the mount point or the mounted device name (because it was chrooted). You can demonstrate this by mounting a root filesystem and chrooting into it and then running a trivial little bash daemon.

mount -o loop /var/disk-images/sid.img /mnt/disk_image_loop
chroot /mnt/disk_image_loop /bin/bash
( ( while true; do date >> /log.log; sleep 1; done ) & ) &
exit
tail -n 1 /mnt/disk_image_loop/log.log
sleep 2
tail -n 1 /mnt/disk_image_loop/log.log
ls -l /proc/*/root | grep /mnt/disk_image_loop

This technique of searching for chrooted processes can be handy. Here is an alias for listing chrooted processes.

alias lschroot='ls -l /proc/*/root | grep -v "\-[>] /$"'

This is also a useful way to view all chrooted processes:

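for procpid in /proc/*/root; do
    # readlink fails for processes we are not allowed to inspect; skip those.
    linktarget=$(readlink "${procpid}" 2>/dev/null) || continue
    if [ "${linktarget}" != "/" ]; then
        echo "${procpid} chrooted to ${linktarget}"
    fi
done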

This is equivalent, with a different style of globbing:

for procpid in /proc/[0-9]*; do
    # readlink fails for processes we are not allowed to inspect; skip those.
    linktarget=$(readlink "${procpid}/root" 2>/dev/null) || continue
    if [ "${linktarget}" != "/" ]; then
        echo "${procpid} chrooted to ${linktarget}"
    fi
done
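When the goal is to free one particular mount point, it can help to filter on that directory and to check each process's working directory as well as its root. This is a sketch along the same lines as the loops above; the function name procs_under is hypothetical, and the optional second argument is there only so the function can be tried against a fake directory tree.

```shell
# Print "PID LINK TARGET" for every process whose root or cwd is at or
# below DIR. LINK is "root" for a chrooted process and "cwd" for a process
# that is merely sitting in a directory there.
procs_under() (
    dir=${1%/}
    proc_root=${2:-/proc}
    for link in "$proc_root"/[0-9]*/root "$proc_root"/[0-9]*/cwd; do
        target=$(readlink "$link" 2>/dev/null) || continue
        case $target in
            "$dir"|"$dir"/*)
                pid=${link#"$proc_root"/}
                echo "${pid%%/*} ${link##*/} $target"
                ;;
        esac
    done
)

# Example (as root):
#   procs_under /mnt/disk_image_loop
```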

fdisk

This will list all the disks that Linux sees. This will not show loop devices. See `losetup` example for more information:

fdisk -l

losetup

Convert a VMware flat split image disk set to a raw disk image: cat linux-server-f001.vmdk linux-server-f002.vmdk linux-server-f003.vmdk > linux-server.img
Find the start of partitions: fdisk -l -u linux-server.img
On older DOS-style partition tables the first partition usually starts at sector 63 (newer tools usually align it to sector 2048). Each sector is usually 512 bytes. The offset is therefore: echo $((63*512))
Find the start of each partition down to the exact offset byte (easier than `fdisk`): parted linux-server.img unit b print
List the next available loopback device: losetup -f
Attach loopback to a partition offset inside of a disk image: losetup -o $((63*512)) /dev/loop0 linux-server.img
Create a mount point: mkdir -p /media/adhoc
Mount the partition: mount /dev/loop0 /media/adhoc
Unmount the partition before cleaning up loop device: umount /media/adhoc
Clean up the loop device: losetup -d /dev/loop0
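The byte offsets in the steps above can also be extracted by a script instead of read by eye. This sketch assumes parted's machine-readable output (the -m option), in which each partition is a line of colon-separated fields beginning with number:start:end. The function name partition_offsets is made up.

```shell
# Read "parted -m IMAGE unit B print" output on stdin and print
# "PARTITION_NUMBER START_OFFSET_IN_BYTES" for each partition.
partition_offsets() {
    awk -F: '/^[0-9]+:/ { sub(/B$/, "", $2); print $1, $2 }'
}

# Example:
#   parted -m linux-server.img unit B print | partition_offsets
#   losetup -o OFFSET /dev/loop0 linux-server.img
```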

losetup -- mount individual partitions in a whole disk image

If you have a whole disk image and you want to mount partitions inside that image then use `losetup` to create a loopback device for the image. For example, say you copied an entire disk using `dd` like this:

dd if=/dev/sda of=disk.img

You can later create a loop device for it and see its partitions with `fdisk` and mount those partitions individually with `mount`. Note that `fdisk -l` does not normally show loop devices. You must add an explicit path to the loop device that you want to list.

losetup /dev/loop0 disk.img
fdisk -l /dev/loop0

The previous example assumed that /dev/loop0 was free. You can use the '-f' option to automatically find a free loop device. In this example we first use the '-f' option to associate the image file with the next available loop device; then we use the '-j' option to see what loop device was associated with the file:

losetup -f disk.img
losetup -j disk.img
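In a script you usually want the device name in a variable. The sketch below pulls it out of the `losetup -j` report, assuming each line of the report starts with the device name followed by a colon. The function name loop_devices_from_report is hypothetical.

```shell
# Read a losetup report on stdin and print only the loop device names.
loop_devices_from_report() {
    sed -n 's|^\(/dev/loop[0-9]*\):.*|\1|p'
}

# Example (as root):
#   losetup -f disk.img
#   loopdev=$(losetup -j disk.img | loop_devices_from_report)
#   fdisk -l "$loopdev"
```

Where the installed `losetup` supports it, losetup -f --show disk.img prints the chosen device directly and avoids the second call.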

mounting partitions inside a disk image without setting up a loop device first

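It is also possible to mount partitions inside a disk image file directly with `mount` using the 'offset' option (the 'loop' option still sets up a loop device behind the scenes), but I have not had luck with this. The offset is in bytes and must be the exact start of the partition, that is, the start sector multiplied by the sector size. A value such as 1025 is not on a sector boundary, so it cannot be the start of a partition. For a partition that starts at sector 63:

mount -o loop,ro,offset=$((63*512)) disk.img /media/adhoc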
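Since a wrong offset is the most likely reason for this kind of mount to fail, it may help to compute the value and sanity-check it before running mount. This is a sketch with made-up function names. It only prints the command, and it assumes 512-byte sectors unless told otherwise.

```shell
# True if OFFSET is a whole number of sectors (default sector size 512).
is_sector_aligned() {
    [ $(( $1 % ${2:-512} )) -eq 0 ]
}

# Print (do not run) the mount command for the partition that starts at
# START_SECTOR inside IMAGE.
offset_mount_command() (
    image=$1
    start_sector=$2
    mountpoint=$3
    sector_size=${4:-512}
    offset=$(( start_sector * sector_size ))
    echo "mount -o loop,ro,offset=$offset $image $mountpoint"
)

offset_mount_command disk.img 63 /media/adhoc
```

A byte offset that fails is_sector_aligned cannot be the start of a partition, so it is worth checking any hand-typed value that way first.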

Disk recovery

Use `dd_rhelp`. This is a wrapper around `dd_rescue` that makes it easier to use.