💾 Archived View for thrig.me › blog › 2023 › 06 › 09 › shell-script-that-runs-forever.gmi captured on 2024-07-09 at 01:04:58. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
    #!/bin/sh
    echo yes
    sleep 42
Critics will point out that this script will not run forever due to e.g. the heat death of the universe, the hardware running the script failing, the human getting bored, etc. However, given a running system, this script will (probably) run forever, for the lite version of forever that ignores our daystar going all red giant.
The script does need a bit of help to run "forever". One lever is that shell scripts take pains to operate line-by-line, to ape an ape at a terminal. Either shells read single bytes to find where the next line is, or they read chunks and use lseek(2) to scoot the file pointer back to the start of the current line. This can be observed under ktrace(1) or similar, which is a great way to learn what system calls a process makes while going about its business, especially if there are too many lines of code to easily follow what some too-complicated shell is doing.
    $ echo echo echo | ktrace sh -
    echo
    $ kdump | fgrep 'read 1'
     32926 sh       GIO   fd 3 read 1499 bytes
     32926 sh       RET   read 1499/0x5db
     32926 sh       GIO   fd 0 read 1 bytes
     32926 sh       RET   read 1
     32926 sh       GIO   fd 0 read 1 bytes
     32926 sh       RET   read 1
    ...
`echo echo echo` is a way to make sh run `echo echo`, which of course prints "echo\n". Layers! Do try to use printf(1) instead where possible. fd 0 is by default standard input, so sh here is doing single-byte reads of the script.
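Incremental reading can also be seen without a tracer. The following is a minimal sketch, not from the original post: it writes a throwaway script (the path comes from mktemp(1)) whose second line appends a third line to the script itself. A shell that slurped the whole file before running anything would never see the new line; shells that read as they go will usually run it, though the result depends on which sh is installed.

```sh
#!/bin/sh
# Sketch: a script that grows while it runs. The file name comes from
# mktemp(1); nothing here is from the original post.
script=$(mktemp) || exit 1

cat > "$script" <<'EOF'
echo first
echo 'echo appended while running' >> "$0"
EOF

# Run it with a separate sh and capture whatever gets printed.
out=$(sh "$script")
printf '%s\n' "$out"

rm -f "$script"
```

If the second line of output is the appended echo, the shell went back to the file for more input after it had already started executing.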
Another bit of information you may need to know is that some file descriptors are shared between processes. This is what dup(2) is going on about:
The object referenced by the descriptor does not distinguish between oldd and newd in any way. Thus if newd and oldd are duplicate references to an open file, read(2), write(2) and lseek(2) calls all move a single pointer into the file, and append mode, non-blocking I/O and asynchronous I/O options are shared between the references. If a separate pointer into the file is desired, a different object reference to the file must be obtained by issuing an additional open(2) call.
http://man.openbsd.org/man2/dup.2
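The shared pointer can be poked at from the shell itself. This is a small sketch under the assumption of a POSIX-ish sh; the data file is a hypothetical temporary file from mktemp(1). The read builtin consumes one line from fd 0, and cat, which inherits the same descriptor, carries on from wherever the shell left the pointer.

```sh
#!/bin/sh
# Sketch: the shell and its child share one file offset on fd 0.
# The file name comes from mktemp(1); not from the original post.
data=$(mktemp) || exit 1
printf 'one\ntwo\nthree\n' > "$data"

{
    # The read builtin consumes exactly the first line ...
    IFS= read -r first
    # ... and cat, inheriting the same descriptor, starts at line two.
    rest=$(cat)
} < "$data"

printf 'shell read: %s\n' "$first"
printf 'cat read:\n%s\n' "$rest"

rm -f "$data"
```

Had cat been given its own open(2) of the file, for example `cat "$data"`, it would have started again from the first line.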
Therefore, given a file descriptor that is shared between a shell and some other program, the other program can periodically lseek(2) the file descriptor back to the beginning of the file, which will cause the script to run "forever".
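    #!/usr/bin/env perl
    use File::Temp 'tempfile';

    # write the script to a temporary file
    my $fh = tempfile();
    print $fh <<'FOREVER';
    #!/bin/sh
    echo yes
    sleep 42
    FOREVER
    # rewind so that sh starts reading at the first line
    seek $fh, 0, 0;

    my $pid = fork // die "Aaaaaaargh: $!\n";
    if ( $pid == 0 ) {
        # child: make the temporary file the standard input of sh
        close STDIN;
        open STDIN, '<&', $fh;
        exec 'sh';
        die "Aaaaaaargh: $!\n";
    }
    # parent: keep scooting the shared file pointer back to the start
    while (1) { seek $fh, 0, 0; sleep 1 }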
Here the standard input of sh has been wired up to the file descriptor thus shared with the controlling program; it's a lot more typing in C to do the same. Process tracing will again show the system calls Perl makes for a "<&" open call.
The code may halt—forever is not forever—if the system is so busy that 42 seconds pass before the other program can run at least one lseek(2). Unix isn't generally a real-time operating system, and systems can get slow under load. Usually someone will be reaching for the reset button.
Even more clever would be to use pipe(2) to create a communication channel for the script and the other process; the script could block on that or send state information over it for the other process to decide where to seek to. Is this practical? Probably not.
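Only the blocking half of that idea is sketched here, and with assumptions: a named pipe from mkfifo(1) stands in for pipe(2), the worker subshell stands in for the script, and the "rewind" message is made up. Deciding where to seek is left out, as plain sh has no portable way to call lseek(2).

```sh
#!/bin/sh
# Sketch: a worker that blocks on a named pipe until the controller speaks.
# mkfifo(1) stands in for pipe(2); all names here are hypothetical.
dir=$(mktemp -d) || exit 1
mkfifo "$dir/ctl"

# Worker: opening and reading the FIFO blocks until the controller writes.
( IFS= read -r msg < "$dir/ctl"; printf 'worker got: %s\n' "$msg" ) > "$dir/out" &

# Controller: send one line of "state" through the pipe, then wait.
printf 'rewind\n' > "$dir/ctl"
wait

reply=$(cat "$dir/out")
printf '%s\n' "$reply"

rm -rf "$dir"
```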
The main takeaway is that you should not edit a script while it is running, as who knows what will happen if the bytes change in the same inode. An editor that replaces the file with rename(2) is probably safe, but not all editors do that when they write. What does your editor do?
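One rough way to find out is to compare inode numbers before and after a save. The sketch below fakes both kinds of save with shell redirection and mv(1) rather than a real editor, and the file names are hypothetical; swap in an actual editor session between the two `ls -i` checks to test yours.

```sh
#!/bin/sh
# Sketch: in-place write vs. write-to-temp-then-rename, compared by inode.
# File names are made up under a mktemp(1) directory; not from the post.
dir=$(mktemp -d) || exit 1
script="$dir/script.sh"
printf 'echo v1\n' > "$script"

# Print only the inode number of a file.
inode_of() { ls -i "$1" | awk '{ print $1 }'; }

before=$(inode_of "$script")

# In-place save: same inode, so a shell with the file open can see new bytes.
printf 'echo v2\n' > "$script"
after_inplace=$(inode_of "$script")

# Rename-style save: write a new file, then mv(1) it over the old name.
printf 'echo v3\n' > "$script.tmp"
mv "$script.tmp" "$script"
after_rename=$(inode_of "$script")

printf 'before=%s in-place=%s rename=%s\n' \
    "$before" "$after_inplace" "$after_rename"

rm -rf "$dir"
```

The first two numbers should match and the third should differ: a shell that opened the old file keeps reading the old inode after a rename.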
tags #sh #unix