Writing clean shell scripts

2016, Oct 30 - Dimitri Merejkowsky
License: CC By 4.0

I don't enjoy writing shell scripts that much.

They're often born when all you want is to automate a mundane task: "I'll just copy/paste the commands I usually run, add a few `if`s and `for`s, and that will be enough."

Well, that's how many shell scripts come into existence, I guess. But writing them is not as easy as it sounds, and there are many pitfalls to avoid.

Here are a few tips you may find useful.


Use Bash

Sometimes you'll come across a `.sh` script with a `/bin/sh` shebang. (That is, a file that starts with `#!/bin/sh`)

I believe you should not do that, unless you know what you are doing.

If your script starts with `#!/bin/sh`, it's telling the operating system that the script should be run with the `/bin/sh` binary.

POSIX says that `/bin/sh` should exist and point to a "POSIX compliant" shell.

But on Debian, it's a symlink to `/bin/dash`, and on Arch Linux, it's a symlink to `/usr/bin/bash`.

So if you use a `#!/bin/sh` shebang, be prepared to get weird errors when switching distributions, *or* prove to yourself that the code you wrote is indeed POSIX-compliant.

I find it much easier to just use a `#!/bin/bash` shebang and call it a day.
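
For instance, with that shebang you can freely use "bashisms", like the `${BASH_VERSION}` variable:

#!/bin/bash

echo "This script runs with Bash ${BASH_VERSION}"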

Enable nice options

Bash has a lot of "switches" you can activate with the `set` built-in.

(Type `set -o` to get a list of them)

Here are a few useful ones.

`set -e`

(Or `set -o errexit` if using long options[1] better suits you)

1: 0007-dont-use-short-options.gmi

Adding that at the top of the script makes sure errors are not silently ignored: the script will stop as soon as a command fails.

Note that if you *do* want to allow a command to fail, appending `|| true` does the trick:

#!/bin/bash
set -o errexit

cd path/to/foo
command-that-may-fail || true

`set -u`

(Or `set -o nounset`)

You will get an error each time you use an "unbound" variable, that is, a variable that has not been given a value yet. Note the deliberate typo in the example below (`my_optoin` instead of `my_option`):

set -o nounset

my_option="foo"

echo $my_optoin

$ bash foo.sh
foo.sh: line 4: my_optoin: unbound variable

`shopt -s failglob`

Let's say you want to do something with all the markdown files (ending with `.md`) in the current directory.

But sometimes there are no such files:

# Without `shopt -s failglob`
$ do_something *.md
# calls do_something with the literal '*.md' string

# With `shopt -s failglob`
$ do_something *.md
# fails with:
foo.sh: line 6: no match: *.md
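
By the way, nothing prevents you from combining these. With all three options enabled, the top of the script looks like this:

#!/bin/bash
set -o errexit
set -o nounset
shopt -s failglob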

Avoid useless pipes

Very often you can get rid of a pipe if you use the correct syntax.

Here are some examples:

# bad
cat foo.txt | grep bar
# better
grep bar foo.txt

# bad
grep bar foo.txt | wc -l
# better
grep -c bar foo.txt

# you want to replace 'foo' with 'bar' in the
# value of $my_var:

# bad
my_new_var=$(echo $my_var | sed -e s/foo/bar/)
# better
my_new_var=${my_var/foo/bar}

By the way, the last example is one of the many things you can do with Bash variables. Here's a list of the parameter substitutions[2] you can use.

2: http://www.tldp.org/LDP/abs/html/parameter-substitution.html
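
For instance, here are a few other substitutions that often replace a call to `sed`, `basename` or the like (plain Bash, no external command needed):

filename="backup.tar.gz"

echo "${filename%.gz}"      # strip the '.gz' suffix: 'backup.tar'
echo "${filename#backup.}"  # strip the 'backup.' prefix: 'tar.gz'
echo "${undefined:-foo}"    # 'foo' when $undefined is unset or empty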

Learn to use sub-shells

Let's say you want to run the `make` command in all the subdirectories of your current working directory.

proj_1
|_ Makefile
|_ proj_1.c
proj_2
|_ Makefile
|_ proj_2.c

You may start by writing:

for project in */; do
  cd "${project}" && make
done

But that won't work: after `cd proj_1`, the script stays in the `proj_1` directory, so you must go back to the top directory for `cd proj_2` to work.

You *could* work around that with `pushd` and `popd`, which let you maintain a "stack" of working directories, but there's an easier way:

for project in */; do
  (cd "${project}" && make)
done

By using parentheses, you've created a "sub-shell": the `cd` happens in a child process, so it won't interfere with the main script.
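
The isolation goes beyond the working directory, by the way: environment variables set inside the parentheses also vanish when the sub-shell exits. A small example (the `DEBUG` variable is made up):

(
  cd proj_1
  export DEBUG=1  # only visible inside the sub-shell
  make
)
# Back here, the working directory is unchanged and $DEBUG is unset.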

Use static analysis

Yes, you can do this for bash scripts too :)

I like to use shellcheck[3] for this.

3: https://www.shellcheck.net/
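
Using it is as simple as passing your script on the command line:

$ shellcheck foo.sh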

Here's a sample of what `shellcheck` can do:

In foo.sh line 40:
find . -name "*.back" | xargs rm
^-- SC2038: Use -print0/-0 or -exec + to allow for non-alphanumeric filenames.

read name
^-- SC2162: read without -r will mangle backslashes.

$bin/foo bar.txt
^-- SC2086: Double quote to prevent globbing and word splitting.

my_cmd *
^-- SC2035: Use ./*glob* or -- *glob* so names with dashes won't become options.

The best thing about `shellcheck` is that each error message leads you to a detailed page explaining the issue.

Be careful with coreutils

The so-called `coreutils` (`cp`, `mv`, `ls`, ...) come in several flavors: basically, there's the "GNU" flavor and the "BSD" flavor. Be careful not to use things that only work in the "GNU" version.

This can happen when you switch from Linux to OSX or vice-versa.

(for instance, `cp foo.txt bar.txt --verbose` will *not* work on OSX: BSD `cp` does not understand long options at all, and expects options to come before the arguments)
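
So when a script has to run on both flavors, stick to the short options, placed before the arguments:

# GNU-only: long option, after the arguments
cp foo.txt bar.txt --verbose

# works with both GNU and BSD flavors
cp -v foo.txt bar.txt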

Alternatives to shell scripts

Actually, I'd highly recommend using a higher-level language for this kind of task.

For instance, with Python, path.py[4] and sh[5], you can write code that "feels" like a shell script but is not:

4: https://pypi.python.org/pypi/path.py

5: https://amoffat.github.io/sh/

# In Bash:
for project in */ ; do
  (
    cd "${project}"
    git clean --force
    git reset --hard
    make
  )
done

# In Python:
import path
import sh

for project in path.Path(".").dirs():
    with project:
        sh.git.clean(force=True)
        sh.git.reset(hard=True)
        sh.make()

Note that by default `sh` swallows the output when the command is successful, but displays a nice error message when something goes wrong, which is usually what you want. If you still want to display the output of the command, you can use `print(sh.make())` or something similar.
