💾 Archived View for thrig.me › tech › subshells.gmi captured on 2024-07-09 at 01:43:11. Gemini links have been rewritten to link to archived content
A unix shell will at various points spawn a subshell to run commands. Some subshells are denoted with parentheses, so may be confused with mathematical (or C language) operator precedence. There is some precedence here, in that the subshell commands are grouped and run in a different process. Subshells will also be spawned when they may not be expected.
A common use for a subshell is to scope commands independent of the parent process, similar to the different scope of a function call. The current working directory is global to a process, so if we want to temporarily change it, one way is to spawn a subshell that operates in some other directory.
$ pwd
/tmp
$ ( cd /etc && pwd )
/etc
$ pwd
/tmp
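The same scoping applies to other per-process state. The following is a minimal sketch, not taken from the session above, that checks both the working directory and the umask; the variable names are made up for illustration.

```shell
# Record the parent's state before the subshell runs.
start_dir=$(pwd)
start_umask=$(umask)

# Inside the parentheses both the directory and the umask change,
# but only for the subshell process.
inner=$(
    ( cd / && umask 077 && printf '%s %s\n' "$(pwd)" "$(umask)" )
)

# Back in the parent nothing has moved.
end_dir=$(pwd)
end_umask=$(umask)
printf 'inside: %s\n' "$inner"
printf 'outside: %s %s\n' "$end_dir" "$end_umask"
```

The subshell reports "/" and a umask of 077 (the number of leading zeros varies by shell), while the parent still has whatever directory and umask it started with.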
More usefully this might be combined with a command that relies on the current working directory, so this is typically used with filesystem tools such as tar(1) or rsync(1). Always be sure to check that the chdir call does not fail, which is what the "&&" is for. The rsync(1) manual by contrast shows the dangerous "cd /foo; rsync ..." form. What happens when rsync runs in the wrong directory when the cd call fails?
$ (cd /etc && tar cpf - hosts) | ssh server '(cd /tmp && tar xpf -)'
This is still perhaps not useful, but shows how tar(1) can be used within subshells to move files between arbitrary directories on arbitrary hosts without changing the working directory of the parent process. The subshell on the server side is not strictly necessary here, as the remote shell exits once the command completes. Also, this example is potentially a security problem, as writing a known filename to a shared /tmp directory is a bad pattern.
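To see what the "&&" buys, here is a local sketch of the same tar pipe that does not need a server. It assumes mktemp(1) with the -d flag is available; the scratch directories, the "no-such-dir" path, and the variable names are invented for the example.

```shell
# Hypothetical scratch directories, created just for this sketch.
src=$(mktemp -d) || exit 1
dst=$(mktemp -d) || exit 1
printf 'hello\n' > "$src/hosts"

# A failed cd stops the subshell before tar can run in the wrong
# directory; the exit status of the subshell reports the failure.
if (cd "$src/no-such-dir" 2>/dev/null && tar cpf - hosts) >/dev/null; then
    guarded=ran
else
    guarded=skipped
fi

# The same pipe as above, only local: archive in one directory,
# extract in another, and the parent never changes directory.
(cd "$src" && tar cpf - hosts) | (cd "$dst" && tar xpf -)
copied=$(cat "$dst/hosts")

rm -rf "$src" "$dst"
printf '%s %s\n' "$guarded" "$copied"
```

With ";" in place of "&&" the first tar would have run in whatever directory the script happened to be in, which is the problem the rsync(1) form invites.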
Another use of a subshell is to delay actions, though there are better (if more complicated) ways to sequence events across different processes. Also note that unix is typically not a real-time operating system, so on a very busy system the delays may be much longer than expected.
$ cat inbackground
#!/bin/sh
(sleep 3 ; printf 'ONE\n') &
(sleep 1 ; printf '\nTWO\n') &
$ sh inbackground
$ 
TWO
ONE
The "&" was used here to background the entire subshell; if instead ";" was used the subsequent command would wait for the prior command to finish.
$ (sleep 3;echo 1);(sleep 1;echo 2)&
1
[1] 15662
$ 2
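If the script should not return to the prompt until the background subshells are done, one simple building block is the wait builtin. This is only a sketch; the log file is a hypothetical stand-in for whatever the delayed commands would really do, and the sleeps are shortened.

```shell
log=$(mktemp) || exit 1

# Two delayed actions, each backgrounded as a whole subshell.
(sleep 2; echo ONE >> "$log") &
(sleep 1; echo TWO >> "$log") &

# With no arguments, wait blocks until every background child of
# this shell has exited.
wait

# Sort so the check does not depend on which sleep finished first.
both=$(sort "$log" | tr '\n' ' ')
rm -f "$log"
printf '%s\n' "$both"
```

TWO will usually land in the file first, but as noted above the timing is not guaranteed, so the sketch only relies on both lines being present once wait returns.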
Different shells may display background jobs and their completion differently; the above has all been ksh on OpenBSD 7.5. Also I am simply copying and pasting from the shell; the prompt configuration is pretty minimal.
$ grep PS ~/.kshrc
[[ -z "$SSH_CLIENT" ]] && PS1='$ '
There are other differences between shells; in particular, a "while" loop fed from a pipe involves a hidden subshell that may be on one or the other side of the pipe.
$ cat while
#!/bin/sh
n=42
echo "begin $n"
echo foo | while read line; do
    n=99
    echo "while $n"
done
echo "outer $n"
$ sh while
begin 42
while 99
outer 42
Here the while loop ran in a subshell, as the assignment inside the loop did not affect the variable in the parent shell. If we instead run the above program under ZSH,
$ zsh while
begin 42
while 99
outer 99
we observe that the arrangement is reversed from that of sh: zsh runs the while loop in the parent shell and the other side of the pipe in a subshell. ZSH differs from POSIX shells in various other ways, such as not (by default) having the insane auto-split and auto-glob of unquoted variable expansions.
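When the loop must change variables in the parent shell no matter which shell runs the script, one option is to drop the pipe and feed the loop through a redirection instead. The following is a sketch of that idea using a here-document; the variable names are made up.

```shell
n=42
# Piped form: whether the loop runs in a subshell depends on the shell.
echo foo | while read -r line; do n=99; done
piped=$n

n=42
# Redirected form: there is no pipeline, so the loop runs in the
# current shell and the assignment survives.
while read -r line; do
    n=99
done <<EOF
foo
EOF
redirected=$n

printf 'piped %s, redirected %s\n' "$piped" "$redirected"
```

The piped value will be 42 or 99 depending on the shell, as shown above; the redirected value should be 99 in either case.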
How can one detect a subshell? This can be tricky, as aside from indirect reports from pwd(1) or variable assignments, the subshell carries over much that is unchanged from the parent process, such as the process ID reported by "$$". With suitable delays (so that a slow human can catch the action) process tree tools will show the different child processes.
$ echo $$;(echo $$)
23998
23998
$ (sleep 99;echo $$) &
[1] 10613
$ pstree | fgrep -2 sleep
 | \-+= 23998 jhqdoe /bin/ksh
 |   |-+= 10613 jhqdoe /bin/ksh
 |   | \--- 08286 jhqdoe sleep 99
 |   |-+= 10382 jhqdoe pstree
 |   | \-+- 65788 jhqdoe sh -c ps -kaxwwo user,pid,ppid,pgid,command ...
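A script can make a similar observation without pstree by asking a child process who its parent is. This is a rough sketch that assumes the shell really forks for a command substitution (some shells are known to avoid the fork where they can); the trailing ":" is there so the shell cannot simply exec sh in place of the subshell.

```shell
# "$$" inside a subshell still names the shell that was invoked.
in_sub=$(echo $$)

# A newly started sh reports its parent in $PPID, which here is the
# subshell created for the command substitution.
sub_pid=$(sh -c 'echo $PPID'; :)

printf 'parent %s, $$ in subshell %s, subshell pid %s\n' \
    "$$" "$in_sub" "$sub_pid"
```

The first two numbers should match and, on a shell that forks, the third should differ.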
There is also the "{}" form that replaces "()" when you want a so-called "compound construct" but without the subshell of the parentheses. A subshell is more expensive (especially on systems where forks are slow), though if you're worried about performance then the code should be written in some other language.
$ ( echo foo; echo bar; )
foo
bar
$ { echo foo; echo bar; }
foo
bar
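The output is the same for both forms, so the difference only shows up in side effects. A small sketch, with made-up variable names, of an assignment under each form:

```shell
n=1

# Parentheses: the assignment happens in a subshell and is lost.
( n=2 )
after_parens=$n

# Braces: the assignment happens in the current shell and sticks.
# Note the spaces around the braces and the ";" before the closing one.
{ n=3; }
after_braces=$n

printf 'parens %s, braces %s\n' "$after_parens" "$after_braces"
```

This prints "parens 1, braces 3". The same holds for cd, so the braces are the wrong choice when the point is to keep a directory change contained.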