💾 Archived View for thrig.me › blog › 2024 › 02 › 07 › first-argument.gmi captured on 2024-08-18 at 18:57:16. Gemini links have been rewritten to link to archived content

First Argument Among Unixes

This question came up in chat somewhere: do the arguments to a program on unix start at the 0th or 1st index of the **argv list? The answer involves "it depends" and heavy doses of opinion.

Some languages take **argv apart and put the program name into a variable distinct from the subsequent arguments; various so-called scripting languages come to mind. With the program name split off, the remaining list holds only the arguments.

    $ sh -c 'echo $0; echo "$@"' a b c
    a
    b c
    $ perl -E 'say $0; say "@ARGV"' a b c
    -e
    a b c
    $ ruby -e 'puts $0; puts ARGV.to_s' a b c
    -e
    ["a", "b", "c"]

With this behavior one could claim that the arguments to a program start at the 1st index of **argv. Interpreted languages add a small wrinkle: either a "shebang" line is parsed to figure out what binary should be run with the given file, or there's some shell magic to get the file off to the right place.

    $ cat script
    #!/bin/sh
    #|
    eval 'exec sbcl --script "$0" ${1+"$@"}'
    |#
    (format t "HELLO WORLD~&")
    $ chmod +x script
    $ ./script
    HELLO WORLD

As usual, process tracing will help show what is going on.

    $ ktrace ./script
    HELLO WORLD
    $ kdump | perl -ne 'print if /ARGS/../execve/'
     66327 ktrace   ARGS
            [0] = "/bin/sh"
            [1] = "./script"
     66327 sh       RET   execve JUSTRETURN
     66327 sh       ARGS
            [0] = "sbcl"
            [1] = "--script"
            [2] = "./script"
     66327 sbcl     NAMI  "/usr/libexec/ld.so"
     66327 sbcl     RET   execve JUSTRETURN

Meanwhile in C the first argument is the program name, as **argv has not been unpacked. There are philosophical differences here: some hold that program behavior should not change based on the program name, while others are fine with that.

Time To Stop Using egrep and fgrep Commands, Per GNU grep

Here's a not very good example:

    #include <stdio.h>
    #include <string.h>
    int main(int argc, char *argv[]) {
        // exact match on argv[0]: only true when invoked as "./goodbye"
        if (strcmp(argv[0], "./goodbye") == 0) {
            printf("goodbye\n");
        } else {
            printf("hello\n");
        }
        return 0;
    }

greeting.c

    $ make greeting
    cc -O2 -pipe    -o greeting greeting.c
    $ ln -s greeting goodbye
    $ ./greeting
    hello
    $ ./goodbye
    goodbye

Useful examples might include busybox, or how "cpio", "tar", and "pax" are hardlinks on OpenBSD.

    $ cd /bin
    $ ls -i | grep `stat -f '%i' cpio`
    52062 cpio
    52062 pax
    52062 tar
    $ cd -

So in this case argv[0] or the program name is being used as an argument to the program: depending on what the program name is, the program does different things. Not everybody uses this feature, though, and some may argue against it.

The calling side might also be good to show. Here the program name is passed as an argument, which sometimes results in a doubling of the name: the first argument to the execlp function is what to look for in PATH, and the second argument is the program name to give that program. Perl's exec {PROGRAM} LIST form does the same thing, with the block naming what to run and the list starting at the program name:

    ...
    $ mv greeting ~/bin
    $ perl -e 'exec {"greeting"} "./greeting"'
    hello
    $ perl -e 'exec {"greeting"} "./goodbye"'
    goodbye
    $ rm ~/bin/greeting

With C the above looks something like the following. Note that usually there isn't a "./" used in the argument, which is part of why this is a bad example.

    #include <err.h>
    #include <unistd.h>
    int main(void) {
        //      to run      argv[0] in what is run
        execlp("greeting", "./goodbye", (char *) 0);
        err(1, "if you got here then execlp has failed");
    }