Shell Traps and POSIX Signals


Shell traps catch POSIX signals (and more). Signals are asynchronous notifications that inform a process, or a particular thread in it, about an event, and a trap lets a script do some work in response. But do you know about all the different signals and the ways to use the trap command?

Let’s start by looking at POSIX signals first.


POSIX Signals

The POSIX standard defines a variety of signals for asynchronous communication with a process or a specific thread in such a process. Signals are used to trigger certain actions under specific conditions, like asking a process nicely to terminate (SIGTERM), or telling it to die as fast as possible (SIGKILL).

Signal Characteristics

The main characteristics of signals are:

  • Asynchronous:
    They can be triggered at any time and, therefore, interrupt a running process during execution.
  • Default action:
    They’re not just signals to listen to, but there’s a default action associated with each one, like terminating the current process, or even ignoring the signal.
  • Interceptable:
    Most signals can be intercepted, for example with a trap in a shell script, and handled appropriately; SIGKILL and SIGSTOP are the exceptions. A process can also block or ignore certain signals altogether.

How to Send a Signal

Signals can be sent in multiple ways:

  • The kernel might send one due to error events like segmentation faults.
  • The kill command sends signals to particular processes by their pid.
  • C library functions such as raise, or the kill system call.
  • The equivalent packages in other programming languages, like os in Go.
  • User input, such as a keyboard event, e.g., ctrl-c.
  • And more…

The simplest way to send a signal from our terminal is the kill command, like sending SIGKILL (9) to process id 42023:

shell
# kill -<signal> <pid>
kill -9 42023
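
Signal names are easier to read than numbers. As a small sketch, the following sends SIGTERM by name and then inspects how the target ended. The background sleep is only a stand-in for a real process, and the "128 + signal number" exit status is the convention used by bash and dash; other shells may report it differently.

```shell
# Start a throwaway background process to signal (stand-in for a real pid).
sleep 30 &
pid=$!

# Send SIGTERM by name; `kill -TERM "$pid"` and `kill -15 "$pid"` do the same.
kill -s TERM "$pid"

# Collect the exit status without aborting if `set -e` happens to be active.
status=0
wait "$pid" || status=$?

echo "exit status: ${status}"                   # 128 + 15 for SIGTERM
echo "terminated by: $(kill -l "${status}")"    # maps the status back to a name
```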

Default Actions

If not trapped and handled differently, each signal triggers one of these default actions:

  • Terminate:
    Terminates the process immediately.
  • Ignore:
    The signal is ignored, and no further action is taken.
  • Core dump:
    Terminates the process immediately but also creates a core dump containing a copy of its memory and registers.
  • Stop:
    Suspends the current process until a SIGCONT signal is received.
  • Continue:
    If a process is currently stopped, continue it.
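
Two of these default actions can be observed from a shell with nothing but sleep and kill. This is a rough sketch: SIGURG is not covered elsewhere in this article and is picked here only because its default action is "ignore", and the exit status 137 (128 + 9) assumes bash or dash.

```shell
# SIGURG is ignored by default: the process keeps running and exits normally.
sleep 1 &
ignored_pid=$!
kill -s URG "$ignored_pid"
ignored_status=0
wait "$ignored_pid" || ignored_status=$?

# SIGKILL terminates by default, and that action cannot be changed.
sleep 30 &
killed_pid=$!
kill -s KILL "$killed_pid"
killed_status=0
wait "$killed_pid" || killed_status=$?

echo "after SIGURG:  ${ignored_status}"   # sleep finished on its own
echo "after SIGKILL: ${killed_status}"    # 128 + 9
```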

Most Common Signals

The signals that are actually available, and their numbers, depend on your POSIX-compliant operating system.

Here are the most common ones, including their default action:

Signal    Value  Action  Description
SIGHUP    1      Term    Hangup detected on controlling terminal or death of controlling process
SIGINT    2      Term    Interrupt from keyboard (ctrl-c)
SIGQUIT   3      Core    Quit from keyboard (ctrl-\)
SIGABRT   6      Core    Abort/Cancel
SIGKILL   9      Term    Termination signal to quit a process immediately
SIGUSR1   10     Term    User-defined signal 1
SIGALRM   14     Term    Timer signal from the alarm system call
SIGTERM   15     Term    Termination signal to quit gracefully
SIGTSTP   20     Stop    Stop signal from terminal (ctrl-z)

There are a lot more signals at our disposal.

If you want to check the available signals on your system, you can find them in the man page under the section “Standard signals”:

shell
man 7 signal

The section “Signal numbering for standard signals” shows the different numbering between processor architectures. However, x86 and ARM share the same numbers, so many of us won’t encounter different signal numbers for the same signal.
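
The shell can also be asked directly. As a quick sketch, kill -l translates a signal number into its name; the numbers used here are the ones from the table above that are the same on every architecture.

```shell
names=""

# Translate a few portable signal numbers into names using the shell's own table.
for number in 1 2 3 6 9 14 15; do
    name=$(kill -l "$number")
    printf '%2d -> %s\n' "$number" "$name"
    names="${names}${names:+ }${name}"
done
```

Running kill -l without an argument prints every signal the shell knows about.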


Trapping Signals

The trap command is built directly into the shell. It catches signals and executes code when triggered:

trap <action> <signals...>

Actions can range from specific tasks, like cleaning up after ourselves or ignoring signals, to resetting a signal’s action to the default behavior again.
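
A minimal sketch of the whole cycle looks like this: set an action, trigger it, and hand the signal back to its default. The variable name received is made up for the illustration, and the script sends the signal to itself via $$ only to keep the example self-contained.

```shell
received=""

# Action as an inline command string: remember the signal instead of terminating.
trap 'received="TERM"' TERM

# Send SIGTERM to this very shell; the action runs once `kill` has returned.
kill -s TERM "$$"
echo "received: ${received}"

# Hand TERM back to its default action.
trap - TERM
```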

Calling a Function

A trap action doesn’t have to call a function, because the action string is evaluated much like an argument to eval. Still, using a function makes the code more straightforward and easier to maintain.
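
Because the action is just a string that gets evaluated later, quoting decides when variables are expanded. The following sketch uses made-up variable names (target, early, late) to show the difference between double and single quotes.

```shell
target="first"
early=""
late=""

trap "early='${target}'" USR1   # double quotes: ${target} is expanded right now
trap 'late="${target}"' USR2    # single quotes: expanded when the trap runs

target="second"
kill -s USR1 "$$"
kill -s USR2 "$$"

echo "early=${early} late=${late}"   # early=first late=second
trap - USR1 USR2
```

Double quotes are convenient for freezing a value such as a temporary directory at the time the trap is set; single quotes are the safer default when the action should see the current state.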

Let’s say we have a shell script using a WORKING_DIR for its tasks, and we want to clean up after ourselves when the script is done. Adding it to the end of the script will work when the script doesn’t encounter any issues, but it doesn’t work if the script gets terminated by a signal. So, let’s add a trap for a cleanup function:

shell

WORKING_DIR=$(mktemp -d)

# Declare a function that checks that the first argument exists
# and is a directory.
# This function is used to remove temporary directories.
_cleanup () {
    if [ -z "${1}" ]; then
        return
    fi

    if [ ! -d "${1}" ]; then
        return
    fi

    rm -r -- "${1}"
    exit
}

trap "_cleanup '${WORKING_DIR}'" INT QUIT ABRT TERM TSTP

# ... the rest of the script

# Without the argument, _cleanup would return early and remove nothing.
_cleanup "${WORKING_DIR}"

This approach has some issues.

Are these the right signals, and are they all the ones we need to handle? Other signals might terminate the process, and now, we can’t use any of the trapped signals in any other way…

Also, we still have to call _cleanup at the end of the script. And in the case of a failing command, no cleanup is done.

Thankfully, trap also supports some non-signals, including one triggered on process exit.

Using Non-Signals

Depending on your shell, there are up to four additional non-signals available:

  • EXIT: Triggers on any process exit.
  • ERR: Triggers when any command completes with a non-zero status.
  • DEBUG: Triggers before every command is executed.
  • RETURN: Triggers each time a function or a sourced script returns.

With EXIT, we no longer need the final _cleanup call and can simplify the trap:

shell
trap "_cleanup '${WORKING_DIR}'" EXIT

Now, _cleanup is called no matter what!

Well, it’s not 100% correct. The trap won’t trigger on all exit conditions.

As you might have noticed in the first variant of the script, I didn’t include KILL (9) in the list of trapped signals. That’s because if you kill a process, you actually assassinate it without any chance for it to react. Also, STOP (19), the stop signal that doesn’t come from the terminal, can’t be trapped.

If a process could prevent those signals, it would open up a lot of security issues, as we always need a way to end or stop a process no matter what. The hardware equivalent would be pulling the plug as a last-ditch effort.
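
For the exits that can be trapped, the effect is easy to check. In this sketch, a subshell stands in for the whole script so that the outer shell can look at the directory afterwards; the file name scratch-file and the exit status 3 are arbitrary.

```shell
WORKING_DIR=$(mktemp -d)
script_status=0

# The parentheses run a subshell that plays the role of the script.
(
    trap 'rm -r -- "${WORKING_DIR}"' EXIT

    touch "${WORKING_DIR}/scratch-file"
    exit 3    # bail out early; the EXIT trap still runs
) || script_status=$?

if [ -d "${WORKING_DIR}" ]; then
    echo "left behind: ${WORKING_DIR}"
else
    echo "cleaned up, script exited with ${script_status}"
fi
```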

Most of the time, I’m happy with using EXIT for cleaning up. But you could also build a simple debugger for shell scripts with DEBUG, or add an error handler with ERR that prints additional information:

shell
#!/usr/bin/env bash

_print_error () {
    echo "Error occurred:"
    awk 'NR>L-4 && NR<L+4 { printf "%-5d%3s%s\n",NR,(NR==L?">>>":""),$0 }' L="$1" "$0"
}

trap '_print_error "$LINENO"' ERR

echo "This is fine..."
this-isnt-fine
echo "This is fine again"

And here’s the output:

This is fine...
test.sh: line 13: this-isnt-fine: command not found
Error occurred:
10      trap '_print_error "$LINENO"' ERR
11      
12      echo "This is fine..."
13   >>>this-isnt-fine
14      echo "This is fine again"
15     
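
The DEBUG idea mentioned above can be sketched in a similar way. This is a bash-only illustration, not a full debugger: BASH_COMMAND is a bash variable holding the command that is about to run, and the array name trace is made up.

```shell
#!/usr/bin/env bash

trace=()

# Before each simple command, bash runs this action; BASH_COMMAND holds
# the command that is about to execute.
trap 'trace+=("${BASH_COMMAND}")' DEBUG

greeting="hello"
echo "${greeting}, world"

# Switch tracing off again before printing what was recorded.
trap - DEBUG

printf 'trace: %s\n' "${trace[@]}"
```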

Disabling and Resetting Signal Actions

Setting a signal action with trap overrides the signal’s default action.

To replace a signal’s default action with a no-op, an empty string literal is used. For example, the keyboard combo ctrl-c can be disabled like this:

shell
trap "" INT

This approach can be useful to prevent a part of a script from being interrupted.

To re-enable the default action, a - (dash) action is used:

shell
# DISABLE INT FOR THE UPCOMING TWO COMMANDS
trap "" INT

uninterruptable_command_1
uninterruptable_command_2

# REENABLE DEFAULT ACTION FOR INT
trap - INT

might_be_interrupted_command

Instead of silently disabling a signal, I’d recommend handling it by printing a message that explains why it’s being ignored. As a user, I’m more likely to think something went wrong if SIGINT gets ignored, and I might resort to SIGTERM or even SIGKILL.
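
A handler along those lines could look like the following sketch. The function name _explain_ignore and the counter interrupts are hypothetical, and the script signals itself only to simulate a user pressing ctrl-c.

```shell
interrupts=0

# Hypothetical handler: explain why ctrl-c has no effect right now.
_explain_ignore () {
    interrupts=$((interrupts + 1))
    echo "Critical section in progress, ignoring ctrl-c (${interrupts}x)." >&2
}

trap _explain_ignore INT

kill -s INT "$$"    # simulate the user pressing ctrl-c
# ... the critical commands would go here ...

# Back to the default action once the critical part is done.
trap - INT
```

One caveat: unlike trap "" INT, a handler is not inherited by child processes, so an external command started in the critical section still receives ctrl-c with its default action.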

User-Defined Signals

The two signals SIGUSR1 (10) and SIGUSR2 (12) are free to use for application-defined conditions, as the system never emits them itself.

By default, both signals terminate the current process.
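
A script can give them a meaning of its own with a trap. As a sketch, the loop below reports its progress when it receives SIGUSR1; the function name _report_progress and the item names are invented, and the script signals itself where a user would normally run kill -s USR1 with the script's pid from another terminal.

```shell
processed=0
last_report=""

# Hypothetical handler: report how far the loop has come.
_report_progress () {
    last_report="processed ${processed} items"
    echo "${last_report}" >&2
}

trap _report_progress USR1

for item in alpha beta gamma; do
    : "working on ${item}"            # placeholder for the real work
    processed=$((processed + 1))

    # Stand-in for a signal sent from another terminal.
    if [ "${processed}" -eq 2 ]; then
        kill -s USR1 "$$"
    fi
done

trap - USR1
```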


Conclusion

The trap command and POSIX signals are a neat way to interact with a shell script or any program that implements a handler. For example, GNU dd prints its current progress when receiving SIGUSR1, and HAProxy uses SIGUSR1 and SIGUSR2 to stop gracefully or to reload its configuration.

Personally, in shell scripts, I almost exclusively use EXIT for cleaning up after myself or catch SIGINT for critical parts. Nevertheless, it is worth knowing about the general concept of POSIX signals and how to trap them if necessary.


Resources