Time-stamp: <2024-02-22 21h26 UTC>

Formatting strings with separators in BQN

Let's open boldly, this entire post is about the following

(∨`' '⊸≠)⊸/∘»∘⥊(∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣

or as CBQN's `)explain` helpfully puts it

 (∨`' '⊸≠)⊸/∘»∘⥊(∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣
  │  │  │  │ │ │ │ │ │ │  ││   │ ││  │ │ │ │
  │  │  │  │ │ │ │ │ │ │  ││   │ ││  │ │ ∾˘│
  │  │  │  │ │ │ │ │ │ │  ││   ÷⟜3│  │ │  ││
  │  │  │  │ │ │ │ │ │ │  ││    ├─≠  │ │  ││
  │  │  │  │ │ │ │ │ │ ¯3⊸×│    │    │ │  ││
  │  │  │  │ │ │ │ │ │   ├─⌈    │    │ │  ││
  │  │  │  │ │ │ │ │ │   └───∘──┘    │ │  ││
  │  │  │  │ │ │ │ │ │       └──────⊸↑ │  ││
  │  │  │  │ │ │ │ │ │              └─∘⊢  ││
  │  │  │  │ │ │ ∘─3 │                │   ││
  │  │  │  │ │ │ └──⊸⥊                │   ││
  │  │  │  │ │ │    ├─────────────────┘   ││
  │  │  │  │ │ │    └─────────────────────┼⊣
  │ ' '⊸≠  │ │ │                          │
  ∨`   │   │ │ │                          │
   ├───┘   │ │ │                          │
   └──────⊸/ │ │                          │
          └─∘» │                          │
            └─∘⥊                          │
              ├───────────────────────────┘
╶─────────────┘

Wouldn't it help to know what problem the above piece of code is trying to solve before going further?

Yep, it sure would. Instead of that, however, let's try an example

   F ← (∨`' '⊸≠)⊸/∘»∘⥊(∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣
(∨`' '⊸≠)⊸/∘»∘⥊(⟨ ∘ 3 ⟩⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣

   ',' F¨ ⟨"1", "10", "100", "1000", "10000", "100000", "1000000"⟩
⟨ "1" "10" "100" "1,000" "10,000" "100,000" "1,000,000" ⟩

(indented line is input, un-indentend line is output)

What's going on in the above? First we assign this function to a name so that it's easier to call it. Then we apply it. How? We have the two inputs, ',' on the left and the list on the right, the function in the middle and each (¨) modifier indicating we want to apply the function to each member of the list.

Thus you might well guess that using the embedded threes and the ',', this function groups 'digits' into groups of three separated by the provided character. Well ... that's exactly it.

Training⁰ our understanding of BQN

I'm going to take various pieces of BQN for granted, but try to remind us of the larger structural parts as we go along. The first most important part is trains, see eg

2024-02-12 - APL-likes as a notation for thought

for an incompetent account. Recall from there¹ that a 3-train is a composition

f g h

which when applied dyadically, that is to two arguments, yields the arguably intuitive result

𝕨 (f g h) 𝕩 <===> g(f(𝕨, 𝕩), h(𝕨, 𝕩))

BQN (and other languages) offer up an extension of this 3-train parsing rule: a 3-train may be preceded by a monadic function m and the resulting semantics are

m f g h <===> m (f g h)                                                       ²
𝕨 (m f g h) 𝕩 <===> m(g(f(𝕨, 𝕩), h(𝕨, 𝕩)))

With this in mind, let's see whether we can descry the broad outlines of what is going on above.

Let's trim-down the picture at the top of this page for a second

(∨`' '⊸≠)⊸/∘»∘⥊(∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣
              │                         │ │
              │                         ∾˘│
              │                          ││
              │    └─────────────────────┼⊣
              │                          │
             ∘⥊                          │
             ├───────────────────────────┘

I've purposefully suppressed some detail to emphasise the important features:

Thus the overall form of our function is

m f g h

or in greater detail

m <===> (...∘⥊)
f <===> (...)
g <===> ∾˘
h <===> ⊣


m f g h <===> (...∘⥊) (...) ∾˘ ⊣

The remaining parts of interest are the leading monadic function and the the left dyadic function in the 3-train. We'll take a quick look at those in what follows, but here, it's dangerous to go alone

take this BQN documentation

The left dyadic function

known to its friends as

 ∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢
 │ │ │ │  ││   │ ││  │ │
 │ │ │ │  ││   ÷⟜3│  │ │
 │ │ │ │  ││    ├─≠  │ │
 │ │ │ ¯3⊸×│    │    │ │
 │ │ │   ├─⌈    │    │ │
 │ │ │   └───∘──┘    │ │
 │ │ │       └──────⊸↑ │
 │ │ │              └─∘⊢
 ∘─3 │                │
 └──⊸⥊                │
    ├─────────────────┘
╶───┘

Here our outermost structure is that of a 2-train. Recall that we may combine a monadic function m with a dyadic function f as `m f` with application semantics

𝕨 (m f) 𝕩 <===> m(f(𝕨,𝕩))

In the above our monadic function m is

∘‿3⊸⥊

which reshapes the list it's applied to into Nx3 (where N, or '∘' in the code above, is determined dynamically based on the length of the list). Not too much going on over there.

On the other hand the dyadic function is somewhat more worth looking at.

(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢

has two important things going on:

Here's the interesting part about about that monadic application: it's really a dyadic function being applied!

The crucial point here are the semantics of before (⊸):

f⊸g 𝕩 <===> g(f(𝕩), 𝕩)
𝕨 f⊸g 𝕩 <===> g(f(𝕨), 𝕩)

Isn't that so cool!? Incidentally the dual of before is after (⟜):

g⟜f 𝕩 <===> g(𝕩, f(𝕩))
𝕨 g⟜f 𝕩 <===> g(𝕨, f(𝕩))

and the fun observation is that using both before and after coincides with 3-trains (almost)

f⊸g⟜h 𝕩 <===> g(f(𝕩),h(𝕩)) <===> (f g h) 𝕩
𝕨 f⊸g⟜h 𝕩 <===> g(f(𝕨),h(𝕩)) <===> 𝕨 ((f⊣) g (h⊢)) 𝕩                          ³

Nice!

Putting this together, prefixes (↑) is dyadic but we combine it with before of

(¯3⊸×⌈)∘(÷⟜3≠)

which computes -3*ceiling(len( input )/3). What does taking a prefix of negative length do? Why, fill the beginning with the fill element

   ¯20↑"of course"
"           of course"

This is really the heart of the algorithm, after this we reshape the string-which-has-been-prefixed-padded-to-a-length-of-a-multiple-of-three to be a table of three-length strings, and fiddle with the table by joining on each row (∾˘) the separator (⊣).

   ∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑ "1000"
┌─
╵"  1
  000"
      ┘
   ',' ((∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣) "1000"
┌─
╵"  1,
  000,"
       ┘
   ',' (⥊(∘‿3⊸⥊(¯3⊸×⌈)∘(÷⟜3≠)⊸↑∘⊢)∾˘⊣) "1000"
"  1,000,"

... unfortunately this leaves us with an ugly prefix of blanks, as well as a trailing delimiter, which brings us to

The leading monadic function

(∨`' '⊸≠)⊸/∘»∘⥊

There's not a whole lot going on here, except once again for this story about after (⊸) so that

(∨`' '⊸≠)⊸/ 𝕩 <===> (∨`' '⊸≠ 𝕩) / 𝕩 <===> (∨`' '≠𝕩) / 𝕩

Note: don't be caught out by ' '⊸≠ , ' ' is not a function. In this case ⊸ is called bind

https://mlochbaum.github.io/BQN/doc/hook.html#bind

That's it!

I'm going to stop short of explaining the outstanding details, hopefully with the excellent documentation and some interest the rest can be understood readily. To me what turned this mundane exercise in programming (group digits interspersing separator) into something interesting and worthwhile was first and foremost the lure of leveraging a useful implicit composition semantics with a robust set of combinators to express a functional solution. That we might express our thoughts in terms of arrays was an added and pleasant bonus.

All in all, i do hope that this short post manage to rouse some interest⁴ in you dear reader and that your BQN (or APL-like) journey begins soon!

---

⁰ a soon-to-be-explained pun

¹ or better yet, elsewhere

² the language is right associative

³ the parentheses around h⊢ are redundant

⁴ i'll settle for perverse curiosity