💾 Archived View for thrig.me › blog › 2024 › 02 › 07 › first-argument.gmi captured on 2024-07-09 at 01:17:23. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
This question came up in chat somewhere: do the arguments to a program on unix start at the 0th or 1st index of the **argv list? The answer involves "it depends" and heavy doses of opinion.
Some languages take **argv apart and put the program name in a variable distinct from the subsequent arguments; various so-called scripting languages come to mind. Here the arguments have been split off from the program name, so the two are harder to confuse.
$ sh -c 'echo $0; echo "$@"' a b c
a
b c
$ perl -E 'say $0; say "@ARGV"' a b c
-e
a b c
$ ruby -e 'puts $0; puts ARGV.to_s' a b c
-e
["a", "b", "c"]
With this behavior one could claim that the arguments to a program start at the 1st index of **argv. Interpreted languages add a small wrinkle: a "shebang" line is parsed to figure out which binary should be run with the given file, or some shell magic hands the file off to the right place.
$ cat script
#!/bin/sh
#|
eval 'exec sbcl --script "$0" ${1+"$@"}'
|#
(format t "HELLO WORLD~&")
$ chmod +x script
$ ./script
HELLO WORLD
As usual, process tracing will help show what is going on.
$ ktrace ./script
HELLO WORLD
$ kdump | perl -ne 'print if /ARGS/../execve/'
 66327 ktrace ARGS
        [0] = "/bin/sh"
        [1] = "./script"
 66327 sh RET execve JUSTRETURN
 66327 sh ARGS
        [0] = "sbcl"
        [1] = "--script"
        [2] = "./script"
 66327 sbcl NAMI "/usr/libexec/ld.so"
 66327 sbcl RET execve JUSTRETURN
Meanwhile in C the first argument is the program name, as **argv has not been unpacked. There are philosophical differences here: some hold that program behavior should not change based on the program name, while others are happy for it to.
Time To Stop Using egrep and fgrep Commands, Per GNU grep
Here's a not very good example:
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    if (strcmp(argv[0], "./goodbye") == 0) {
        printf("goodbye\n");
    } else {
        printf("hello\n");
    }
}
$ make greeting
cc -O2 -pipe -o greeting greeting.c
$ ln -s greeting goodbye
$ ./greeting
hello
$ ./goodbye
goodbye
Useful examples might include busybox, or how "cpio", "tar", and "pax" are hardlinks on OpenBSD.
$ cd /bin
$ ls -i | grep `stat -f '%i' cpio`
52062 cpio
52062 pax
52062 tar
$ cd -
So in this case argv[0] or the program name is being used as an argument to the program: depending on what the program name is, the program does different things. Not everybody uses this feature, though, and some may argue against it.
The calling side is also worth showing. Here the program name is passed as an argument, which sometimes results in a doubling of the name: the first argument to the execlp function is what to look for in PATH, and the second is the name handed to that program as its argv[0]. Perl's exec {PROGRAM} LIST form makes the same split:
...
$ mv greeting ~/bin
$ perl -e 'exec {"greeting"} "./greeting"'
hello
$ perl -e 'exec {"greeting"} "./goodbye"'
goodbye
$ rm ~/bin/greeting
With C the above looks something like the following. Note that usually there isn't a "./" used in the argument, which is part of why this is a bad example.
#include <err.h>
#include <unistd.h>

int main(void) {
    //     to run      argv[0] in what is run
    execlp("greeting", "./goodbye", (char *) 0);
    err(1, "if you got here then execlp has failed");
}