Bash notes

From Noah.org
Revision as of 19:22, 2 August 2012 by Root (talk | contribs)


Canonical path to a file in pure POSIX

This works much like the `readlink` command, which is not POSIX and behaves differently on Linux and BSD. This is sort of silly. It's amazing how complicated this gets.

#!/bin/sh
canonical () {
        # This returns the canonical path to the argument.
        # The argument must exist as either a dir or file.
        # This works in a pure POSIX environment.
        
        if [ -d "${1}" ]; then
                # `cd` requires execute permission on the dir.
                if [ ! -x "${1}" ]; then
                        canon=""
                        return 1
                fi      
                oldwd=$(pwd)
                cd "${1}"
                canon=$(pwd)
                # Check special case of `pwd` on root directory.
                if [ -n "${canon#/*}" ]; then
                        canon=${canon}/
                fi
                cd "${oldwd}"
        else
                # At this point we know it isn't a dir.
                # But if it looks like a dir then error.
                if [ -z "${1##*/}" ]; then
                        canon=""
                        return 1
                fi
                # It looks like a path to a file.
                # Test if the path before the file is a dir.
                dirname=$(dirname "$1")
                if [ -d "${dirname}" ]; then
                        # `cd` requires execute permission on the dir.
                        if [ ! -x "${dirname}" ]; then
                                canon=""
                                return 1
                        fi
                        oldwd=$(pwd)
                        cd "${dirname}"
                        canon=$(pwd)
                        # Check special case of `pwd` on root directory.
                        if [ -z "${canon#/*}" ]; then
                                canon=/$(basename "$1")
                        else
                                canon=${canon}/$(basename "$1")
                        fi
                        cd "${oldwd}"
                else
                        # It isn't anything so error.
                        canon=""
                        return 1
                fi
        fi
        echo "${canon}"
        return 0
}
canonical "$1"
exit $?

Enter ASCII control codes and unprintable characters

If you have a filename with weird ASCII characters or unprintable characters then you may have trouble specifying the filename on the command-line. It can be difficult to even see which weird characters are in the filename when you run `ls`. The filename may have unsupported unicode or control codes embedded to deliberately make it difficult to delete or find. If the filename looks like it has command-line options embedded in it then see Removing_files_with_weird_names.

This creates an empty file with a filename that contains an ASCII Escape control code.

touch $'\033'.foobar

When you run `ls *.foobar` you will see one of the following depending on your environment setting for LS_OPTIONS:

?.foobar
\033.foobar

The second form will be shown if the --escape option is added to the `ls` command or to your LS_OPTIONS environment variable. The --escape option causes `ls` to print the octal ASCII escape code for unprintable characters.

Notice how the ESC control code was specified in the `touch` command. The string, $'\033' , is Bash's ANSI-C quoting: backslash escapes inside $'...' are replaced by the characters they name. This is one way Bash allows you to enter non-printing characters.
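
As a rough illustration, the sketch below creates such a file in a scratch directory using `printf` (which also works in shells that lack the $'...' form) and then removes it with a glob, so the control character never has to be typed. The scratch directory and the use of `mktemp -d` are assumptions made only for this demo.

```shell
#!/bin/sh
# Work in a scratch directory so the demo cannot touch real files.
scratch=$(mktemp -d) || exit 1
cd "${scratch}" || exit 1

# printf interprets the octal escape, so this yields the same
# ESC character as Bash's $'\033' form.
esc=$(printf '\033')
touch "${esc}.foobar"

# Count the matching files; the glob matches the ESC character.
count=$(ls -1 ./*.foobar | wc -l)
echo "files created: $((count))"

# Remove the file by glob instead of typing the control code.
rm -- ./*.foobar
cd / && rmdir "${scratch}"
```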

Get the really real 'real user ID'

If you run a script inside of `sudo` then the real and effective users are both 'root'. Using `id -r` doesn't work. The following will give the real user name and real uid of the user that owns the current terminal running the script. That's usually what you want.

REAL_USERNAME=$(stat -c '%U' "$(tty)")
REAL_UID=$(stat -c '%u' "$(tty)")

or

REAL_USERNAME=$(stat -c '%U' "$(readlink /proc/self/fd/0)")
REAL_UID=$(stat -c '%u' "$(readlink /proc/self/fd/0)")

Making a script safe to be run from a daemon

This will ensure that a script is totally disconnected from input and output. If a daemon runs a script in the background and that script generates output, then the daemon may block while waiting for the script to finish. The script child process will show up as <defunct>. The reason is that the kernel considers the child dead, but will not clean up the pid information until all child output has been flushed. In this case, if the child prints anything to stdout or stderr and the parent does not read that data, then the parent may block on waitpid. This is usually not a problem when a script is run from a foreground process because the parent process is connected to a TTY. The TTY will automatically read stdout and stderr, so any output from the parent or child gets flushed.

This is similar to a common problem with OpenSSH where a client will not exit even after you exit from the remote server. See "Why does ssh hang on exit?" in the OpenSSH FAQ (see also 3.10).

It can be tricky to be sure that a script never generates any output, but you can get around this problem by closing the standard file descriptors.

# Close stdin, stdout, and stderr to be sure this script 
# cannot generate output and cannot accept input.
exec 0>&- 1>&- 2>&-
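
Once the descriptors are closed, any later write to them fails with an error. A common alternative, sketched here, is to point them at /dev/null so that stray output is silently discarded. The `run_detached` name is made up for this example.

```shell
#!/bin/sh
# run_detached is a hypothetical helper: it runs a command with all
# three standard descriptors attached to /dev/null. The work is done
# in a subshell so the caller's own descriptors are left alone.
run_detached () {
    (
        exec </dev/null >/dev/null 2>&1
        "$@"
    )
}

# Nothing from this command reaches the caller's stdout or stderr.
run_detached sh -c 'echo "to stdout"; echo "to stderr" >&2'
```

The exit status of the command is still passed back, so the caller can test it as usual.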

Locking with flock to prevent multiple instances running

This is a common idiom that you see on systems where the `flock` utility is available (most Linux systems). The `flock` utility is a command-line interface to the flock system call.

Simply add the following near the top of your script. It checks whether the script was run inside of a `flock` command. If FLOCK_SET is set then the calling `flock` succeeded, so we just continue on with the script. If FLOCK_SET is not set then we have to re-run the script inside of a `flock` command, which will set FLOCK_SET. Note the use of `exec`. This replaces the current process, so if `flock` fails it will exit the entire script. It is easy to overlook this and think that there is a bug in the logic. You might otherwise think that if `flock` fails then the rest of the script would continue after the if statement. Have no fear: if the `flock` fails then the entire script will fail to run.

# Set lockfilename to suit your application.
# $0 may contain slashes, so only its base name is used.
# lockfilename="/tmp/$(basename "$0").lock"
lockfilename="/var/lock/$(basename "$0")"
if [ -z "$FLOCK_SET" ] ; then
    exec env FLOCK_SET=1 flock -n "${lockfilename}" "$0" "$@"
fi

# The rest of your script runs here.
# ...

Bashing Bash

When you get down to it, Bash is a piece of crap. It has no redeeming qualities except that it is ubiquitous. That is, in fact, the only reason I have put any effort into it at all. By the time I get deep into a project that involves a giant amount of Bash I usually regret it and wish I had just started the project off with Python.

That said, I seem to have learned a lot more Bash than I ever intended.

Arithmetic

Bash can't do floating point -- not even division and multiplication. You can pipe through `bc` or `dc`.

In this `bc` example the "scale=4" part of the expression sets the output display precision to 4 decimal places.

echo "scale=4; 1/2" | bc
.5000

If you need trig functions then you need to add the --mathlib option. The 's()' function is sine in `bc` syntax.

 echo "scale=4; s(1/2)" | bc --mathlib
.4794

The `dc` calculator is RPN. The "4 k" part of the following expression sets precision to 4 decimal places.

$ echo "4 k 1 2 / p" | dc
.5000

I find a lot of systems that do not have `bc` or `dc` installed, so I often use `awk`. For example, say some command output numbers in two fields, bytes and seconds. To get bytes per second you could use `awk`. In this example I just use `echo` to pipe sample data to `awk`:

echo "104857600 0.767972" | awk '{print $1 / $2}'

Using 1048576 bytes in a MegaByte:

echo "104857600 0.767972" | awk '{printf ("%4.2f MB/s\n", $1 / $2 / (1024*1024))}'
130.21 MB/s

Using 1000000 bytes in a MegaByte as `dd` does:

echo "104857600 0.767972" | awk '{printf ("%4.2f MB/s\n", $1 / $2 / (1000*1000))}'
136.54 MB/s
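
The `awk` approach can be wrapped in a small helper. This is only a sketch; the function name `calc` and the fixed four decimal places are choices made for this example.

```shell
#!/bin/sh
# calc: evaluate an arithmetic expression with awk and print the
# result with four decimal places. The expression is pasted into the
# awk program text, so only use it with trusted input.
calc () {
    awk "BEGIN { printf \"%.4f\\n\", ($*) }"
}

calc '1 / 2'
calc '104857600 / 0.767972 / (1024 * 1024)'
```

The first call prints 0.5000, matching the `bc` example above.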

Get a list of files for iteration

This is short and simple, plus it will ignore .svn directories.

FILES=$(find . -path "*/.svn" -prune -o -print)

This will quote the filenames. It does not use `find` and does not filter out .svn directories:

FILES=$(ls -RQal | \
       awk -v PATH=$path '{ \
                          if ($0 ~ /.*:$/) \
                              path = substr($0,1,length($0)-2); \
                          else \
                              if ($0 ~ /^-/) \
                                  printf("%s%s/%s\n", PATH, path, \
                                          substr($0, match($0,"\".*\"$")+1, RLENGTH-1) \
                                        ) \
                          }' \
      )

To make the quoted list of files work with a for loop you will need to set IFS.
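
A minimal sketch of the IFS idea follows. It swaps in a plain `find` listing (one name per line) instead of the quoted `ls` output above, and the scratch directory and file names are invented for the demo. Names that themselves contain newlines would still break this.

```shell
#!/bin/sh
# Build a scratch directory holding file names that contain spaces.
scratch=$(mktemp -d) || exit 1
touch "${scratch}/one file.txt" "${scratch}/another file.txt"

# One name per line from find.
FILES=$(find "${scratch}" -type f)

# Split only on newlines while looping, then restore IFS.
old_ifs=${IFS}
IFS='
'
count=0
for f in ${FILES}; do
    echo "found: ${f##*/}"
    count=$((count + 1))
done
IFS=${old_ifs}

rm -rf "${scratch}"
```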


Interview with Steve Bourne

Interview with Steve Bourne.

Here files

#!/bin/sh
process <<EOF
This is the
here document.
EOF

By default shell variable expansion is done inside a here document, so you must escape characters like $ and `. If the delimiter is quoted ('EOF' or "EOF") then no expansion is done and characters like $ and ` are safe. This may also be written as \EOF. If the operator is written with a dash, <<-EOF, then leading tab characters in the here document are stripped. This is simply to allow the here document to be indented and not have the indentation appear as part of the here document.
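
The difference between a quoted and an unquoted delimiter can be seen in this small sketch. The variable names are arbitrary.

```shell
#!/bin/sh
NAME=world

# Unquoted delimiter: ${NAME} is expanded.
expanded=$(cat <<EOF
hello ${NAME}
EOF
)

# Quoted delimiter: the text is taken literally.
literal=$(cat <<'EOF'
hello ${NAME}
EOF
)

echo "${expanded}"
echo "${literal}"
```

The first `echo` prints "hello world" and the second prints "hello ${NAME}" unchanged.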

#!/bin/sh
FILENAME=${1}
gnuplot <<PLOTEOF
set terminal x11 persist
set title "${FILENAME}"
plot "${FILENAME}" using 1:2 with linespoints
PLOTEOF

Assign variable the contents of a "here document"

here doc here file

Here documents or here files can be put into a variable.

#!/bin/sh
EXIT_MESSAGE=$(cat <<'END_HEREDOC'
This documents what this script does.
You don't have to worry about embedded "quotes".
This makes it easy to read in the script and easy to print.
Note that the 'END_HEREDOC' is quoted above.
Also the final closing parenthesis must come
after the 'END_HEREDOC' and on its own line.
Also note that this is printed with the 'printf' command.
An echo of the unquoted variable would collapse the new-lines.
END_HEREDOC
)

printf "%s\n" "$EXIT_MESSAGE"
exit 0

Check if a web page exists or not

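This greps the response headers for the status line. There are better ways than this: curl can report the status code directly through its --write-out '%{http_code}' option, and --fail makes it return a non-zero exit code on HTTP errors.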

curl --head --silent --no-buffer http://www.example.org/foobar.html | grep -iq "200 OK"

Example use in an 'if' statement:

if curl --head --silent --no-buffer http://www.example.org/foobar.html | grep -iq "200 OK"; then 
    echo "Web page exists"
fi

Example use handling additional HTTP response codes:

case "$(curl --head --silent --no-buffer http://www.example.org/foobar.html)" in
    *"200 OK"*) echo "200 HTTP response";;
    *"404 Not Found"*)  echo "404 HTTP response";;
    *)  exit_code=$?
        if [ ${exit_code} -ne 0 ]; then 
            echo "ERROR: curl failed with exit code ${exit_code}"
        else 
            echo "Unhandled HTTP response"
        fi
    ;;
esac
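
curl can also print just the numeric status code via `--write-out '%{http_code}'`, which avoids grepping the headers. A minimal sketch, assuming a curl that supports that option; the `http_status` and `page_exists` helper names are invented here.

```shell
#!/bin/sh
# http_status: print the HTTP status code for a URL, e.g. 200 or 404.
# The response headers are discarded; only the code is printed.
http_status () {
    curl --head --silent --output /dev/null --write-out '%{http_code}' "$1"
}

# page_exists: succeed only when the server answers 200.
page_exists () {
    [ "$(http_status "$1")" = "200" ]
}

# Example (not run here because it needs network access):
#   if page_exists http://www.example.org/foobar.html; then
#       echo "Web page exists"
#   fi
```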

Draw a circle in ASCII

This uses `awk` for the math. The sequence is incremented by a fraction, 0.4, each time so that there is some overlap to make the circle smoother. You could use `seq 1 57`, but there will be gaps and rough edges.

tput clear;(seq 1 .4 57|awk '{x=int(11+10*cos($1/9));y=int(22+20*sin($1/9));system("tput cup "x" "y";echo X")}');tput cup 22 0

Some systems don't have the `seq` command. The following will work on a greater variety of platforms:

tput clear;(yes|head -n 114|cat -n|awk '{x=int(11+10*cos($1/18));y=int(22+20*sin($1/18));system("tput cup "x" "y";echo X")}');tput cup 22 0

That will render this fine quality circle (fits on an 80x24 console):


             XXXXXXXXXXXXXXXXXX
          XXXX                 XXX
        XX                        XXX
      XX                            XX
    XX                                X
   XX                                  XX
  XX                                    X
  X                                      X
  X                                      X
  X                                      X
  X                                      X
  X                                      X
  X                                      X
  XX                                    X
   XX                                  XX
    XX                                XX
      XX                            XX
        XX                        XX
          XXXX                 XXX
             XXXXXXXXXXXXXXXXXX

There should be a way to do it without `awk`... Maybe `join` or `paste` would help in this case. Here is a start:

seq 1 56 | sed -e 's/\(.*\)/c(\1 \/ 9)/' | bc

Then it just starts to get silly:

tput clear;(seq 1 0.4 57|awk '{x=int(11+10*cos($1/9));y=int(22+20*sin($1/9));system("tput cup "x" "y";echo X")}');tput cup 8 15;echo X;tput cup 8 28;echo X;(seq 16 0.4 21.6|awk '{x=int(11+6*cos($1/3));y=int(22+12*sin($1/3));system("tput cup "x" "y";echo X")}');tput cup 22 0

Clear all environment variables

This will delete all environment variables except for a few explicitly allowed to stay.

unset $(env | grep -o '^[_[:alpha:]][_[:alnum:]]*' | grep -v -E '^PWD$|^USER$|^TERM$|^SSH_.*|^LC_.*')
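
When the goal is to run a single command with a clean environment, rather than to scrub the current shell, `env -i` is another option. A small sketch; the variable `FOO` is just a placeholder.

```shell
#!/bin/sh
FOO=bar
export FOO

# env -i starts the command with an empty environment. Variables
# listed after -i are the only ones passed through.
env -i PATH="${PATH}" TERM="${TERM:-dumb}" sh -c 'echo "FOO is [${FOO}]"'
```

This prints "FOO is []" because FOO was not passed through.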

Redirect entire output of a script from inside the script itself

#!/bin/sh

# This demonstrates printing and logging output at the same time.
# This works by starting `tee` in the background with its stdin
# coming from a named pipe that we make; then we redirect our
# stdout and stderr to the named pipe. All pipe cleanup is handled
# in a trap at exit.

# This is the exit trap handler for the 'tee' logger.
on_exit_trap_cleanup ()
{
    # Close stdout and stderr which closes our end of the pipe
    # and tells `tee` we are done.
    exec 1>&- 2>&-
    # Wait for `tee` process to finish. If we exited here then the `tee`
    # process might get killed before it had finished flushing its buffers
    # to the logfile.
    wait $TEEPID
    rm "${PIPEFILE}"
}
tee_log_output ()
{
    LOGFILE=$1
    PIPEFILE=$(mktemp -u $(basename $0)-pid$$-pipe-XXX)
    mkfifo ${PIPEFILE}
    tee ${LOGFILE} < ${PIPEFILE} &
    TEEPID=$!
    # Redirect subsequent stdout and stderr output to named pipe.
    exec > ${PIPEFILE} 2>&1
    trap on_exit_trap_cleanup EXIT
}

LOGFILE="$0-$$.log"
echo "Logging stdout and stderr output to logfile: ${LOGFILE}"
tee_log_output ${LOGFILE}
date --rfc-3339=seconds
echo "command: $0"
echo "pid:     $$"
sleep 2
date --rfc-3339=seconds

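This version requires Bash because it uses process substitution.

#!/bin/bash
# This will send output to a log file and to the screen using an
# unnamed pipe to `tee`. Process substitution is a Bash feature,
# so this will not work in a plain POSIX sh.
LOGFILE="$0-$$.log"
exec > >(tee -a "${LOGFILE}")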

date --rfc-3339=seconds
echo "command: $0"
echo "pid:     $$"
sleep 1
date --rfc-3339=seconds

Turn off bash history for a session

set +o history

Rename a group of files by extension

For example, rename all images from foo.jpg to foo_2.jpg.

This is somewhat more clear:

for filename in *.jpg ; do mv "$filename" "$(basename "$filename" .jpg)_2.jpg"; done

This is more "correct" and doesn't require `basename`:

for filename in *.jpg ; do mv "$filename" "${filename%.jpg}_2.jpg"; done

Usage Function

exit_with_usage() {
    local EXIT_CODE="${1:-0}"

    if [ ${EXIT_CODE} -eq 1 ]; then
        exec 1>&2
    fi

    echo "TODO: This script does something useful."
    echo "Usage: $0 [-h | --help]"
    echo "  -h --help         : Shows this help."

    exit "${EXIT_CODE}"
}

Special Shell Variables

$*
all parameters as a single word, joined by the first character of $IFS (when quoted as "$*")
$@
all parameters, each kept as a separate word (when quoted as "$@")
$#
the number of parameters
$-
option flags set by `set` or passed to the shell
$?
exit status of last command
$!
pid of last background command
$$
pid of this script or shell
$0
name of this script or shell
$_
last argument of the previous command (with variables expanded).
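
The difference between "$*" and "$@" is easiest to see by counting words. This is a small sketch; `count_args` and `demo` are names made up for the example.

```shell
#!/bin/sh
# count_args prints how many arguments it received.
count_args () {
    echo "$#"
}

demo () {
    # "$@" keeps each parameter as its own word.
    count_args "$@"
    # "$*" joins all parameters into a single word.
    count_args "$*"
}

demo "a b" c
```

This prints 2 and then 1.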

Variable Expansion and Substitution

Bash can do some freaky things with variables. It can do lots of other substitutions. See "Parameter Expansion" in the Bash man page.

  • ${foo#pattern} - deletes the shortest possible match from the left
  • ${foo##pattern} - deletes the longest possible match from the left
  • ${foo%pattern} - deletes the shortest possible match from the right
  • ${foo%%pattern} - deletes the longest possible match from the right
  • ${foo:=text} - Use and assign default value. If $foo exists and is not null then return $foo. If $foo doesn't exist then create it; set value to 'text'; and return 'text'.
  • ${foo:-text} - Use default value. If $foo exists and is not null then return $foo, else return 'text'. This does not create $foo.
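
The difference between the last two forms shows up in what happens to the variable afterwards. A short sketch using the same placeholder name `foo`:

```shell
#!/bin/sh
unset foo

# ${foo:-text} substitutes the default but leaves foo unset.
echo "${foo:-fallback}"
echo "foo is now: [${foo}]"

# ${foo:=text} substitutes the default and also assigns it.
echo "${foo:=assigned}"
echo "foo is now: [${foo}]"
```

The second line of output is "foo is now: []" and the last is "foo is now: [assigned]".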

variable expansion to manipulate filenames and paths

$ # strip off any one extension on a file name (not greedy).
$ MY_FILENAME=my_video.project.copy.mov
$ echo ${MY_FILENAME%.*}
my_video.project.copy

$ # strip off all extensions on a file name (greedy).
$ MY_FILENAME=my_video.project.copy.mov
$ echo ${MY_FILENAME%%.*}
my_video

$ # strip off the .tar.gz extension on a file name.
$ MY_TARBALL=openssl-1.0.0d.tar.gz
$ echo ${MY_TARBALL%.tar.gz}
openssl-1.0.0d

$ # strip off trailing slash if there is one:
$ MY_PATH=/var/log/apache2/
$ MY_PATH=${MY_PATH%/}
$ echo ${MY_PATH}
/var/log/apache2

$ # stripping it twice is harmless:
$ MY_PATH=${MY_PATH%/}
$ echo ${MY_PATH}
/var/log/apache2

$ # directory above: 
$ echo ${MY_PATH%/*}
/var/log

$ # last path element:
$ echo ${MY_PATH##*/}
apache2

$ # strip off leading slash if there is one:
$ echo ${MY_PATH#/}
var/log/apache2

$(...) versus backtick expansion for command substitution

Backtick expansion works in even the oldest Bourne shell variant. It cannot be nested without escaping the inner backticks.

echo `ls /boot/`

The $(...) form works in any POSIX Bourne shell (sh, ash, dash, bash, etc...) and nests without any escaping.

echo $(ls /boot/*$(uname -r)*)

With backticks you can still nest if you escape the inner backticks:

echo `ls /boot/*\`uname -r\`*`

quote output in echo to preserve newlines

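When a command substitution is not quoted, the shell splits its output into words, so `echo` prints them separated by spaces and the newlines are lost. This can be useful for substituting in loops. Quoting the substitution will preserve the newlines.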

This converts newlines to spaces:

echo $(ls /boot/)

The following preserves the newlines output from `ls`:

echo "$(ls /boot/)"

absolute and relative paths

Convert a relative path to an absolute path. It is stupid that there is not a command to do this. This does not affect the current working directory because the `cd` runs in a subshell. This finds the absolute full path to the directory $1:

echo "absolute path: $(cd "$1" && pwd)"

Get the absolute path of the directory that contains the currently running script.

abs_path_here=$(cd "$(dirname "$0")" && pwd)

Statements

Loop on filenames in a directory

for foo in *; do {
  echo ${foo}
}; done

Loop on lines in a file

while IFS= read -r foo; do {
  echo "${foo}"
}; done < data_file.txt

while loop

This is kind of like `watch`:

while sleep 1; do lsof|grep -i Maildir; done

read -- get input from user

In Bash, the builtin command, `read`, is used to get input from a user. It will read input into a variable named REPLY by default or into a given variable name.

read
echo $REPLY
read YN
echo $YN

Remember, `read` is a builtin command, so to get information on using it use `help read`, not `man read`.

Get input directly from a TTY -- not stdin

By default `read` will read input from stdin, but there are situations when you want to get input from the user's TTY instead of stdin. For example, say you piped output from another program into your script. The script should then ask the user, not the pipeline (what the script now sees as stdin), but a plain `read` would consume the pipeline. Another example: you want a boot script to ask the user for input before the console TTY has been opened and attached to stdin (getty). This situation came up while I was building an embedded Linux system where I needed to read input from the user through the serial port (/dev/ttyS0) during boot to allow for an optional boot sequence.

In this example, technically `read` still thinks it's reading from stdin -- we just redirect input from a tty file.

read YN < `tty`

The `tty` command will tell you which tty you are currently logged into. The console ttys are usually on '/dev/tty[0-9]+' and the virtual ttys used for SSH logins are on '/dev/pts/[0-9]+'.

$ tty
/dev/pts/13
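
Note that `tty` inspects its own stdin, so it cannot find the terminal when stdin is a pipe. A sketch that reads from the controlling terminal device, /dev/tty, instead; the `ask_tty` function and its optional device argument are hypothetical and exist mainly so the snippet can be tried against an ordinary file.

```shell
#!/bin/sh
# ask_tty PROMPT [DEVICE]
# Print PROMPT on stderr and read one line from DEVICE, ignoring
# whatever is attached to stdin. DEVICE defaults to /dev/tty.
ask_tty () {
    device=${2:-/dev/tty}
    printf '%s ' "$1" >&2
    IFS= read -r REPLY < "${device}"
    echo "${REPLY}"
}

# Example (needs a real terminal, so it is not run here):
#   some_command | { answer=$(ask_tty "Continue? [y/n]"); echo "${answer}"; }
```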

If you switch to a console screen (CTRL-ALT-F1, and ALT-F7 to return to X11) and log in, you become the owner of /dev/tty1. To see this, switch to a console and log in, then switch back to X11 (ALT-F7) and check from a shell. When you log out, /dev/tty1 will return to root ownership.

$ ll /dev/tty1
crw------- 1 root   root 4, 1 2009-01-04 06:02 /dev/tty1
$ ll /dev/tty1
crw------- 1 my_user tty 4, 1 2009-01-04 06:03 /dev/tty1

get yes/no input from user

YES=1
NO=0
# Return values must be in the range 0-255, so -1 cannot be used here.
INVALID=2
TRUE=1
FALSE=0
yesno()
{
    echo -e "$1"
    VALID_YN=$FALSE
    YN=
    rval=
    echo -e " [y/n] \c"
    read YN
    if [ -z "$YN" ]
    then
        VALID_YN=$FALSE
        rval=$INVALID
    else
        case "$YN" in
            [Yy]*)
                VALID_YN=$TRUE
                rval=$YES
                ;;
            [Nn]*)
                VALID_YN=$TRUE
                rval=$NO
                ;;
            *)
                VALID_YN=$FALSE
                rval=$INVALID
                ;;
        esac
    fi
    if [ $rval -eq $INVALID ]
    then
        echo "Invalid Response..."
    fi
    return $rval
}

read a single character key then return -- with no Enter required

The following is a discussion of `stty` command. In Bash and Korn shell you can already get a single character using `read`. The following will set the variable, CHARACTER, with a single key read from stdin: `read -r -s -n 1 CHARACTER`.

Using `stty` can get confusing because many different examples do the same things in seemingly different ways. The differences arise because the `stty` command has redundant and complementary ways of doing things. For example, `stty icanon` is the same as `stty -cbreak` and `stty raw` is the same as `stty -cooked`. Raw mode does everything that '-icanon' does, and more.

This reads a single character without echo. It works two ways. If you pass no arguments to `readc` then it will create the variable REPLY and set it to the character read from stdin. If you pass a variable name argument to `readc` then it will set the given variable name to the character read from stdin.

# This reads a single character without echo.
# If a variable name argument is given then it is set to a character read from stdin.
# else the variable REPLY is set to a character read from stdin.
# This is equivalent to `read -r -s -n 1` in Bash.
# These two examples read a single character and print it:
#     readc CHARACTER
#     echo "CHARACTER is set to ${CHARACTER}."
#     readc
#     echo "REPLY is set to ${REPLY}."
readc ()
{
    previous_stty=$(stty -g)
    stty raw -echo
    char=`dd bs=1 count=1 2>/dev/null`
    stty "${previous_stty}"

    if [ -n "$1" ] ; then
        eval "$1=\${char}"
    else
        REPLY="${char}"
    fi
}

read with a timeout

The Bash `read` built-in already has a timeout option (`-t`). The following solution will work under most POSIX Bourne style shells:

read_timeout() {
        trap : USR1
        trap 'kill "${pid}" 2>/dev/null' EXIT
        (sleep "$1" && kill -USR1 "$$") &
        pid=$!
        read "$2"
        ret=$?
        kill "${pid}" 2>/dev/null
        trap - EXIT
        return "${ret}"
}

Example usage:
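
One possible usage, sketched with an invented `greet` wrapper. The function from above is repeated so that the snippet runs on its own. Call it from the main shell rather than inside $(...), because $$ always names the top-level shell and that is where the USR1 trap must be installed.

```shell
#!/bin/sh
read_timeout() {
        trap : USR1
        trap 'kill "${pid}" 2>/dev/null' EXIT
        (sleep "$1" && kill -USR1 "$$") &
        pid=$!
        read "$2"
        ret=$?
        kill "${pid}" 2>/dev/null
        trap - EXIT
        return "${ret}"
}

# greet [SECONDS]: hypothetical example that asks for a name and
# gives up after SECONDS (default 5).
greet () {
        printf 'Name: ' >&2
        if read_timeout "${1:-5}" NAME && [ -n "${NAME}" ]; then
                echo "Hello, ${NAME}."
        else
                echo "No answer."
        fi
}

# Example: greet 10
```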


mktempdir -- Make Temp Directory

This is a fairly safe and fairly portable way to create a temporary directory with a unique filename. This does not clean up or delete the directory for you when done.

mktempdir () {
        CLEAN_NAME=$(echo $0 | sed "s/[-_.\/]//g")
        NEW_TMPDIR=${TMPDIR-/tmp}/$(date "+tmp-${CLEAN_NAME}.$$.%H%M%S")
        (umask 077 && mkdir ${NEW_TMPDIR} 2>/dev/null && echo ${NEW_TMPDIR}) || return 1
        return 0
}

Use it like this:

if ! MYTEMPDIR=$(mktempdir); then
        echo "Could not create a temporary directory."
        exit 1
fi
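
Since the directory is not cleaned up automatically, one option is to remove it from an EXIT trap. The sketch below swaps in `mktemp -d`, which is not POSIX but is widely available, in place of `mktempdir`; the `with_tempdir` name is made up for this example.

```shell
#!/bin/sh
# with_tempdir CMD [ARGS...]
# Run CMD with the path of a fresh temporary directory appended as
# its last argument, then delete the directory. The function body is
# a subshell, so the EXIT trap fires when the function finishes.
with_tempdir () (
    dir=$(mktemp -d) || exit 1
    trap 'rm -rf "${dir}"' EXIT
    "$@" "${dir}"
)

# Example: list the (empty) directory while it still exists.
with_tempdir ls -la
```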

check if running as root

if [ $(id -u) -eq 0 ]; then
    echo "You are root."
fi

Or check if not root...

if [ $(id -u) -ne 0 ]; then 
    echo "You must be root to run this."
fi

check if process is running

Show the pids of all processes with name "openvpn":

ps -C openvpn -o pid=

Show if a process with pid=12345 is running:

kill -0 12345
echo $?
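
The same check reads more naturally inside an `if`. A minimal sketch; the `is_running` helper name is invented here.

```shell
#!/bin/sh
# is_running PID: succeed if a process with that pid exists and we
# are allowed to signal it. Signal 0 performs the checks without
# actually sending anything.
is_running () {
    kill -0 "$1" 2>/dev/null
}

if is_running "$$"; then
    echo "pid $$ is running"
fi
```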

Check if a process with a given command name and pid is still running. For example, check if ssh process is running with pid 12345: "checkpid ssh 12345". Checkpid script:

#!/bin/sh
# example: checkpid ssh 12345
CMD=$1
PID=$2
for QPID in $(ps -C $CMD -o pid=); do
    if [ $QPID = $PID ]; then
        echo "running"
        exit 0
    fi
done
echo "not running"
exit 1