gemini - jsreed5.org

Some POSIX Tricks

---

Shell

The POSIX shell special built-in utilities are:

.
:
break
continue
eval
exec
exit
export
readonly
return
set
shift
times
trap
unset

All other POSIX utilities--such as 'pwd', 'read', 'test' ('[') and 'true'--are regular utilities that a shell may implement externally. In practice, most shells include these and other utilities as built-ins.

An exit status of 0 is true/success; a nonzero exit status is false/failure. Contrast with languages that use Boolean algebra (1 is true; 0 is false).
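
As a small sketch of this convention, 'if' and '||' branch directly on a command's exit status, and '$?' holds the status of the most recent command. The 'is_even' function is a hypothetical helper written for this example.

```shell
# is_even is a hypothetical helper: it returns 0 (true) for even numbers
# and a nonzero status (false) otherwise.
is_even() {
  [ $(( $1 % 2 )) -eq 0 ]
}

# 'if' runs the 'then' branch when the exit status is 0.
if is_even 4; then
  result="even"
else
  result="odd"
fi
echo "4 is $result"

# '||' runs its right-hand side only on a nonzero status; $? still holds it.
is_even 3 || status=$?
echo "exit status for 3: $status"
```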

'set -C' prevents '>' from overwriting existing files. You can override it at run time with '>|'.
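
A minimal sketch of both behaviors follows. The scratch file name is an arbitrary choice for this example, and the failing redirection runs in a subshell so that its error cannot affect the rest of the script.

```shell
# Scratch file; the name is an assumption made for this sketch.
file="${TMPDIR:-/tmp}/noclobber_demo.$$"
echo "first" > "$file"

set -C
# With noclobber on, '>' refuses to overwrite an existing file.
if ( echo "second" > "$file" ) 2>/dev/null; then
  clobbered="yes"
else
  clobbered="no"
fi
# '>|' overrides noclobber for this one redirection.
echo "third" >| "$file"
set +C

contents=$(cat "$file")
rm -f "$file"
echo "overwritten by '>': $clobbered; final contents: $contents"
```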

You can check if a file descriptor is a terminal with '-t'. By checking if FD 0 (STDIN) is a terminal, you can determine if input is coming from an interactive session or being piped from somewhere else.

if [ -t 0 ]; then
  echo "input is coming from a terminal"
else
  echo "input is coming from a pipe"
fi

'read' and other commands will take input from STDIN, regardless of where STDIN is coming from. To force input from a terminal, redirect input directly from '/dev/tty'.

read -r input < /dev/tty

Use $PWD to get the current working directory without having to call an external utility.

echo "$PWD"

POSIX shell does not have arrays. A crude workaround is to put elements in positional arguments with 'set'.

set -- 1 0 5 2 "hello world"
echo "$5"
expr "$1" + "$3"
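
The same workaround supports a few more array-like operations. The sketch below appends an element, counts the elements, iterates over them, and drops the first one; the variable names are illustrative only.

```shell
set -- apple banana "cherry pie"

# Append: re-set the list with the old contents first.
set -- "$@" date

# Count: $# is the number of positional parameters.
count=$#

# Iterate: "$@" keeps elements that contain spaces intact.
joined=""
for item in "$@"; do
  joined="${joined}[${item}]"
done

# Drop the first element.
shift
first_after_shift=$1

echo "$count elements: $joined; first after shift: $first_after_shift"
```

Note that 'set --' replaces the script's (or function's) own arguments, so save them first if they are still needed.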

To test if a string contains a substring, use 'case'. This enables checking for several substrings at once.

string="hello world"
substring="wo"
case "$string" in
  *"$substring"*) echo "the string contains the substring" ;;
  *) echo "the string does not contain the substring" ;;
esac
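
To illustrate checking several substrings at once, patterns can be joined with '|' inside a single 'case' branch. This is a sketch with arbitrary example substrings; 'case' runs only the first branch that matches.

```shell
string="hello world"
case "$string" in
  *wo*|*xyz*) match="contains wo or xyz" ;;
  *hello*)    match="contains hello but neither wo nor xyz" ;;
  *)          match="no match" ;;
esac
echo "$match"
```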

Alternatively, remove the largest part of the string that starts with the substring and compare the result to the original string. If they are equal, the attempted removal does not change the string, so the substring is not present. This is useful for one-liners.

string="hello world"
sub="wo"
[ "$string" = "${string%%"$sub"*}" ] && echo "substring not present"

The second syntax enables you to check if a string has multiple lines without invoking 'wc'.

string="hello
world"
[ "$string" = "${string%%
*}" ] || echo "string has multiple lines"

Remove all instances of a substring from a string without using 'sed' or 'awk'. Loop continuously and use 'case' to check if the substring is present. If it is, remove the first occurrence and repeat. If it isn't, break out of the loop.

string="1f9774f4-ef87-4d32-b574-36f228414410"
sub="-"
while true; do
  case "$string" in
    *"$sub"*) string="${string%%"$sub"*}${string#*"$sub"}";;
    *) break;;
  esac
done
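
The same loop can be generalized from removal to replacement. The sketch below wraps it in a hypothetical 'replace_all' function (not a standard utility) that consumes the string from left to right, so a replacement that itself contains the substring cannot cause an endless loop. POSIX has no local variables, so 'rest' and 'out' are global, and calling the function through command substitution does spawn a subshell.

```shell
# replace_all STRING SUB REP: hypothetical helper for this sketch.
# Prints STRING with every occurrence of SUB replaced by REP.
replace_all() {
  rest=$1 out=""
  # An empty SUB would match forever; print the input unchanged instead.
  [ -n "$2" ] || { printf '%s\n' "$1"; return; }
  while :; do
    case "$rest" in
      *"$2"*)
        # Keep the text before the first SUB, add REP, continue after SUB.
        out="${out}${rest%%"$2"*}$3"
        rest="${rest#*"$2"}"
        ;;
      *) break ;;
    esac
  done
  printf '%s\n' "${out}${rest}"
}

compact=$(replace_all "1f9774f4-ef87-4d32-b574-36f228414410" "-" "")
echo "$compact"
```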

Create an empty file, or truncate an existing file, by writing ':' (no-op) to it.

: > /tmp/file.txt

'until' repeats its body for as long as the command given to it returns a nonzero exit status. This is equivalent to 'while !' but is more readable.

i=0
until [ $i -eq 10 ]; do
  echo $i
  i=$((i+1))
done

Group commands with either braces or parentheses. Braces execute the group in the current shell environment; parentheses spawn a subshell for the group. When using braces, all commands must be delimited, e.g. with a semicolon or a newline.

empty_string=0
string="nonempty"
[ -z "$string" ] && { empty_string=1; echo "string is empty"; }
[ "$string" = "hello" ] || ( hello_string="no"; echo "${hello_string}" )
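
The practical difference shows up with variable assignments. In this short sketch, the assignment made inside braces persists, while the one made inside parentheses is lost when the subshell exits.

```shell
counter=0

{ counter=1; }   # braces: runs in the current shell, so the assignment persists
after_braces=$counter

( counter=2 )    # parentheses: runs in a subshell, so the assignment is lost
after_parens=$counter

echo "after braces: $after_braces; after parentheses: $after_parens"
```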

This can be useful when several commands need to use data from a single pipe.

cat <<EOF > /tmp/file.txt
Tokyo
London
Paris
New York
EOF
sort /tmp/file.txt | {
  read -r first_entry
  echo "first entry: ${first_entry}"
  cat
}

':' behaves identically to 'true'. It saves an external call but is less readable.

while :; do
  read -r input || break
  echo "$input"
done

When writing a here-document, '<<-' strips leading tabs from the input lines, including from the delimiter. This is useful for managing indentation in a script.

cat <<-EOF
		no
	tabs
		here
	EOF

It is often tempting to reduce 'if'/'then'/'else' statements to a pseudo-ternary.

: > /tmp/file.txt
[ -f /tmp/file.txt ] && echo "file exists" || echo "file does not exist"

This can produce unexpected results. If the second command returns a nonzero exit status, the third command will execute, even if the first command returns 0.

echo "right" > /tmp/file.txt
[ -f /tmp/file.txt ] && grep -q "wrong" /tmp/file.txt || echo "file does not exist"

You can emulate the intended behavior by grouping the second command and returning 0 at the end of the group. This comes at the cost of losing the actual return value of the second command.

echo "right" > /tmp/file.txt
[ -f /tmp/file.txt ] && { grep -q "wrong" /tmp/file.txt; true; } || echo "file does not exist"
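
When the exit status of the second command matters, a plain 'if'/'else' avoids the problem entirely. This sketch uses an arbitrary scratch file name; only the file test decides which outer branch runs, and the result of 'grep' is recorded separately.

```shell
# Scratch file; the name is an assumption made for this sketch.
file="${TMPDIR:-/tmp}/ifelse_demo.$$"
echo "right" > "$file"

if [ -f "$file" ]; then
  message="file exists"
  # grep's status no longer influences the outer 'else' branch.
  if grep -q "wrong" "$file"; then
    found="yes"
  else
    found="no"
  fi
else
  message="file does not exist"
fi

rm -f "$file"
echo "$message (pattern found: $found)"
```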

Command Replacements

Replace 'xxd -p':

Create a hex dump of data using 'od', then remove spaces with 'tr'.

echo "test file" > /tmp/file.txt
od -An -tx1 /tmp/file.txt | tr -d " "

The outputted hex is identical, but the formatting is not: 'od' prints 16 bytes per line, while by default 'xxd' prints 30 bytes per line. To emulate 'xxd -p -c0' (no column size limit), remove newlines as well as spaces.

od -An -tx1 /tmp/file.txt | tr -d " \n"

Replace 'xxd -r -p':

'fold' the hex dump into single bytes and loop over them. A nested 'printf' first converts the hexadecimal to octal, then prints the raw byte corresponding to the octal. Each line outputted by 'fold' must contain exactly 2 characters; otherwise an extra partial byte will appear in the result. Hexadecimal digits must be in lowercase. The dump can contain newlines (\n) if they do not split bytes; no other non-digit characters can be present. (Note: this is orders of magnitude slower than 'xxd -r', since it spawns a separate subshell for each byte of output. It is only useful for small amounts of data.)

echo "746573742066696c650a" > /tmp/file.txt
fold -w2 /tmp/file.txt | while IFS= read -r byte; do
  [ ${#byte} -eq 2 ] && printf "%b" "\\0$(printf "%o" "0x$byte")"
done

POSIX.1-2024 introduced the dollar-single-quote syntax, which allows bytes to be printed from their hexadecimal values directly. This removes the need for a subshell and is much faster.

fold -w2 /tmp/file.txt | while IFS= read -r byte; do
  [ ${#byte} -eq 2 ] && eval "printf \"%s\" \$'\x$byte'"
done

awk

'awk' is the only POSIX source of randomness of any kind ($RANDOM is undefined). It prints random numbers between 0 and 1 using rand(). Initialize the random-number generator by calling srand() first.

awk "BEGIN{srand(); print rand()}"

To mimic the behavior of $RANDOM, set minimum and maximum values of 0 and 32767 respectively, then scale rand() between them.

awk -v min=0 -v max=32767 "BEGIN{srand(); print int(min+rand()*(max-min+1))}"

Beware: this is not cryptographically secure, and it may not be random at all! POSIX specifies that srand() uses the time of day as its default seed, which in practice is the number of seconds since the epoch. This means every 'awk' process started within the same second produces the same sequence of random numbers. Some implementations require changing seeds manually. Wait at least 1 second after each call to give the seed time to change.

awk -v min=0 -v max=32767 "BEGIN{srand(); print int(min+rand()*(max-min+1))}"
sleep 1
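
If several numbers are needed at once, one way to sidestep the seeding problem is to draw them all from a single 'awk' process, which seeds once and then advances through one sequence. This is a sketch; the variable 'n' (how many numbers to print) is introduced here for illustration.

```shell
# One awk process, one seed, n successive values from the same sequence.
numbers=$(awk -v n=5 -v min=0 -v max=32767 'BEGIN {
  srand()
  for (i = 0; i < n; i++) print int(min + rand() * (max - min + 1))
}')
echo "$numbers"
```

The values are still predictable from the seed, so this does nothing for cryptographic security.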

Because the default seed for srand() is the seconds since the epoch, you can use 'awk' to print the epoch, which POSIX 'date' does not support.

awk "BEGIN{srand(); print srand()}"

cat

POSIX allows 'cat' to buffer its output. If it is piped into another command, the command may not receive data until 'cat' outputs a sufficiently large chunk or even the whole input. Use '-u' to unbuffer.

echo "line 1" > /tmp/file1.txt
echo "line 2" > /tmp/file2.txt
cat -u /tmp/file1.txt /tmp/file2.txt | while IFS= read -r line; do
  echo "line length: ${#line}"
done

The vast majority of 'cat' implementations today do not buffer output. They accept '-u' for compatibility but internally ignore it.

date

POSIX 'date' does not support numerical timezones, so it cannot write ISO 8601-compliant local date-times. You must use UTC and hardcode the timezone string.

date -u +%Y%m%dT%H%M%SZ
date -u +%Y-%m-%dT%TZ
date -u +%Y-%m-%dT%T+00:00

expr

'expr' can match a string against a BRE (basic regular expression) with the ':' operator. If the string matches the pattern, 'expr' prints the number of characters matched and exits with status 0; otherwise it prints 0 and exits with status 1. The pattern is implicitly anchored at the start of the string.

Some implementations of 'expr' give an error if the string is empty. Use an extra fixed character to avoid this.

string="2005-05-19"
if expr "x$string" : 'x[0-9]\{4\}' > /dev/null; then
  echo "string matches regex"
else
  echo "string does not match regex"
fi
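
'expr' can also extract part of a string: when the pattern contains a '\(...\)' group, it prints the text captured by that group instead of a character count. A brief sketch, using the same fixed-character prefix and the BRE interval syntax '\{n\}':

```shell
string="2005-05-19"

# With \( \) in the pattern, expr prints the captured text.
year=$(expr "x$string" : 'x\([0-9]\{4\}\)-')
month=$(expr "x$string" : 'x[0-9]\{4\}-\([0-9]\{2\}\)')

echo "year: $year, month: $month"
```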

fc

'fc' is less compact but more robust than '!'. By default it edits the selected commands before running them.

fc 100

List a range of commands in reverse order.

fc -l -r 50 75

Execute the last command immediately.

fc -s -1

find

POSIX 'find' does not have the -mindepth or -maxdepth options. However, if a given name or path is a directory, -prune prevents its traversal. To emulate '-mindepth 1 -maxdepth 1' (list the contents of a directory but not subdirectories), prune all paths that do not match the target path for 'find'.

dir="/tmp"
find "$dir" \( ! -path "$dir" -prune \) -print

Alternatively, use a 'for' loop. Beware: if the target directory is empty, the glob (*) matches nothing and is passed through literally! Check that the result exists before operating on it. Note also that '*' does not match names beginning with a dot.

for item in /tmp/*; do
  [ -e "$item" ] && echo "$item"
done

mkfifo

FIFO (first in, first out) special files, also called named pipes, look like regular files on disk but behave like pipes. Processes can send data to each other ephemerally by reading from and writing to the FIFO. Data only flows while the FIFO is open for both reading and writing: opening it for writing blocks until a reader opens it, and opening it for reading blocks until a writer opens it. Once every writer has closed the FIFO, the reader receives EOF.

mkfifo /tmp/fifo
echo "input" > /tmp/fifo &
cat /tmp/fifo

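You can open a FIFO on a file descriptor. If you open it for both reading and writing, the FIFO does not block on writes (up to the size of the pipe buffer). POSIX leaves the result of opening a FIFO for both reading and writing undefined, although it works on common systems such as Linux. Note that 'cat' blocks in this example because the shell still holds the FIFO open for writing, so the reader never sees EOF.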

mkfifo /tmp/fifo
exec 3<> /tmp/fifo
echo "left" > /tmp/file.txt
sed "s/lef/righ/" /tmp/file.txt >&3
cat <&3

POSIX.1-2024 introduced the ability for 'read' to split logical lines with a user-specified delimiter rather than always using a newline. You can append a NUL byte to the output of a command, redirect the output to the FIFO file descriptor, then 'read' from the FIFO into a variable using a NUL delimiter. This puts the entire command output into the variable without looping over lines and without spawning a subshell.

Note: data written to a FIFO passes through a kernel buffer rather than the disk; only creating and opening the FIFO touch the filesystem. Whether this is faster than a subshell depends on the system, so measure before relying on it.

mkfifo /tmp/fifo
exec 3<> /tmp/fifo
echo "left" > /tmp/file.txt
{ sed "s/lef/righ/" /tmp/file.txt; printf "%b" "\0"; } >&3
read -r -d "" output <&3
echo "$output"

FIFOs are easily subject to race conditions. Do not perform multiple reads or writes on one FIFO simultaneously.

tee

The output of 'tee' is guaranteed to be unbuffered, unlike 'cat'.

cat <<EOF > /tmp/file.txt
name,age
Alice,24
Bob,31
Carol,39
EOF
{
  read -r header
  echo "$header" 1>&2
  tee
} < /tmp/file.txt

---

[Last updated: 2025-03-10]