💾 Archived View for l-3.space › log-240212-1.gmi captured on 2024-03-21 at 14:49:55. Gemini links have been rewritten to link to archived content

View Raw

More Information

-=-=-=-=-=-=-

Time-stamp: <2024-02-12 13h42 UTC>

APL-likes as a notation for thought

I recently took it upon myself to fill a hole in my exposure to branches of the programming tree and learn an APL-like.

APL is often labelled as an "array programming language", indeed its name is derived therefor, but having spent some time learning the rudiments of one of the languages "descendent" from APL i feel that this "array language" moniker is missing the wood for the trees.

APL-likes definitely have arrays

I somewhat think that this is worth clearing up right away. It is absolutely an important quality, and one of the "features" of these languages. If you haven't seen one of these languages before, you might be thinking:

So what, language X also has arrays

I contend that language X doesn't have *arrays* in the first-class, pervasive sense that APL-likes do.

There are a few important facets to this claim. In APL-likes

let's unpack these

arrays are not nested list, and are inherently arbitrary dimensional

In language X supporting "arrays" it's quite likely that the way to obtain a two-dimensional array (hereafter table) is to nest lists:

ragged_jagged = [[1,2], [3,4,5]]

The above is likely a valid-looking assignment, syntax and possible pointer semantics notwithstanding. This is manifestly _not_ the case in APL-likes⁰. In APL-likes, a table might look like

┌─
╵ 0 1 2
  3 4 5
        ┘

This is simply a two-dimensional object, and it's no more a list of lists than a three-dimensional array

┌─
╎ 0 1 2
  3 4 5

  5 6 7
  7 8 9
        ┘

is a list of tables or a table of lists or a list of lists of lists. That is to say, there are primitive things called arrays, and although we give special names to various low dimension instances of these (dim 1 = list, dim 2 = table), and although we can nest these in funny ways, the primitive N-dimensional version of an array will always be different to funny nestings of lower-dimensional objects.

A corollary of this is that arrays have a well-defined shape, so that in particular they are entirely regular: all "rows" of a table are the same length, etc.

This may seem like pedantry, but it's necessary build up to the next facet.

built-ins are rank polymorphic

Examples are easier than definitions here, in a made-up syntax

    ┌─           ┌─
1 + ╵ 0 1 2   =  ╵ 1 2 3
      3 4 5        4 5 6
            ┘            ┘
          ┌─           ┌─
⟨ 1 2 ⟩ + ╵ 0 1 2   =  ╵ 1 2 3
            3 4 5        5 6 7
                  ┘            ┘
┌─           ┌─           ┌─
╵ 0 1 2   +  ╵ 1 2 3    = ╵ 1  3  5
  3 4 5        5 6 7        8 10 12
        ┘            ┘              ┘

In the first case + of a scalar and an table knows to add the scalar to each element of the table. In the second case, something fancy called the leading axis convention¹ dictates that the 1 is paired with the first "row" of the table and the 2 is paired with the second reducing this to the scalar and array case. In the last example, each row may be paired and their elements added point-wise.

If this is confusing, or doesn't seem to follow a rule that is discernible, allow me to invite you to look at the lucid explanation in ¹ , but you may have to learn some things besides to make sense of it.

Anyway, the point here is somehow that "rank polymorphism" means, among other things, that basic operations "do what you want"².

This and the above feature set aside "array programming languages" from "programming languages with arrays".

In my opinion, that's still not the point.

APL-likes encode the calculus of composition

The core innovation, for me anyway, that APL-likes present over other programming languages is a robust and comprehensive framework for expressing planar composition. A distillation of this claim to its essential steps might be

let's look at these one at a time.

of all the arities, binary functions are especially prevalent in programming

This is just, like, my opinion man. Nevertheless i think it's true to some extent. Certainly if we count arithmetic operations, binary functions are used more than unary functions.

It doesn't matter whether they're _the_ most prevalent, only that "doing something using two other things" is so common that having a dedicated syntax to make this ergonomic makes sense. Let's agree that it does :) Which brings us to ...

infix notation for binary functions provides a compact and clear function-application semantics

This observation is crucial, if our hypothesis is that we often want to write things that look like "f(x, y)" then why not simplify the syntax

x f y <===> f(x, y)

this is certainly very ordinary in many programming languages (think "foo = 2 + 4"), but few languages take the logical next step and allow more than logic/arithmetic functions to use infix notation; fewer still allow the user to define infix functions and apply them in this manner. Very few indeed _require_ all binary functions to be infix and all binary applications to be notated in this fashion.

The above notation extends logically to unary functions

f x <===> f(x)

which again is rather ordinary for _some_ unary functions in many programming languages (commonly unary negation, for example), but the same caveats apply. Few languages extend this to all unary functions; very few indeed _require_ all unary function applications to be of this form.³

Supposing then that we believe that this infix application syntax is useful, why then restrict this to application?

it is natural to represent composition patterns using infix notation

Recall that function application is in particular function composition. Should we identify our values x with points⁴ x : * -> X, then given f : X -> Y the value f(x) may be identified with the below composition.

f(x) <===> f∘x : * -> Y

Let's extend this to our infix binary application notation! Supposing f and h are unary functions, let's write

f g h <===> f(g, h)

which is to say

x (f g h) y <===> g(f(x), h(y))

because of course it must be! In fact this notation is already common in mathematics: we often write things like f+h to mean "the function given by (x, y) -> f(x)+h(y)". This is precisely our fgh above where g = +.

If we wished the result to be unary then we might write f+h to mean "the function given by x -> f(x)+h(x)", and this suggests that our notation should also have the semantics

(f g h) x <===> g(f(x), h(x))

On this account then, we could spell the "average" function (list -> avg(list)) as follows:

(sum ÷ length) x <===> ÷(sum(x), length(x))

and indeed so goes the famous example (here in BQN)

(+´÷≠) ⟨1, 2, 3⟩ ===> 2

This insight naturally suggests another. What if, in the above, f and h were binary? Should the expression fgh make sense? What should its semantics be? If you feel there's only one reasonable guess here, you may already be an APL'er

x (fgh) y <===> g(f(x, y), h(x, y))

this is the core of APL's insight into the calculus of composition

We have essentially just understood the utility of the so-named "phi combinator", or in APL parlance, a 3-train. Though this notation is extended slightly in two more cases ("B combinator", or 2-trains; a free atop for a 3-train) "that's it" basically. From here, having other composition combinators floating around and being able to write them into trains is what---at least on my account---gives APL-likes their unique flavour and power.

So sure, APL might _have_ arrays, but it's not _about_ arrays: it's about composition.

Happy array programming!

If you'd like to learn more

I can highly recommend

The APL wiki in general!

APL wiki article on trains

some beautiful tables and pictures from BQN docs

---

⁰ unless you want it to be

¹ BQN's excellent documentation on leading axis convention

https://mlochbaum.github.io/BQN/doc/leading.html

² and if you want things other than the default, there are ways to change that too

³ let's forgo the technical distinction here and say that a unary function application is both infix and prefix (though not postfix)

⁴ functions whose domain is the point classifier, for sets this is a/the set with one element