💾 Archived View for dcreager.net › stack-graphs › concatenative-language.gmi captured on 2024-05-12 at 15:41:50. Gemini links have been rewritten to link to archived content

View Raw

More Information

⬅️ Previous capture (2023-07-22)

-=-=-=-=-=-=-

A concatenative language for stack graphs

The CC language

CC is a concatenative continuation-passing language. CC programs operate on a stack of data values. Because CC is a continuation-passing language, there is no control stack or runtime call stack. (Though see below how we could implement a ‘call’ instruction with syntactic sugar that wraps the rest of the instruction stream in a quotation.)

There are three kinds of value that can appear on the data stack:

P ≜ q
q ≜ i | q;q

A CC program ‘P’ consists of a single quotation. A clause ‘q’ is a sequence of one or more instructions.

The possible instructions are as follows. (Note that some instructions are “terminals”, which means that any instructions appearing after it in a quotation are ignored.)

General stack manipulation

Atoms

Quotations

Snapshot

TODO: Do we need snapshots?

Closure

A closure is a special tuple, which consists of a quotation and a stack (represented as a tuple). It gives you the ability to throw away your current execution state and resume somewhere else.

[This is more typically called a “continuation” in other concatenative languages, but I don't want to confuse names with the continuation used below in the execution semantics.]

Syntactic sugar

All of the following instructions don't need to be first-class in the formalism, and can be implemented in terms of other instructions:

Syntactic sugar: Pairs

We don't need pairs, just store the two elements on the stack.

Syntactic sugar: Tagged unions

We need literals for the tag values (typically ‘LEFT’ and ‘RIGHT’, which is sufficient to construct any other union)

CC execution semantics

v ≜ x | q | s
s ≜ ∘ | v | s⋅s

∘⋅s = s⋅∘ = s
s₁⋅(s₂⋅s₃) = (s₁⋅s₂)⋅s₃
(s₁,s₂,...) ≜ s₁⋅s₂⋅...

CC programs operate on, and executing one produces, a stack of values. Literal values and quotations are represented directly. Tuples are represented by a stack containing the tuple's values.

S ≜ s▹κ
κ ≜ s→s
⟦⋅⟧ ≜ S↔S

The current state of a CC program consists of a stack, along with a continuation representing any additional work that needs to be performed. The interpretation ‘I’ of each language construct is a relation mapping a before state to an after state.

These semantics are similar to that of Oxcart, which also includes a continuation as part of the before- and after-state that each semantic function operates on.

Oxcart

Note that, because of how a quotation is defined via non-determinism, its interpretation is a _relation_, and not a function, of before states to after states.
⟦drop⟧ ∋ (v⋅s)▹κ ↦ s▹κ
⟦ijump⟧ (q⋅s)▹κ = s▹κ
⟦b⟧ ≜ S→S
⟦i;b⟧ s▹κ = ⟦i⟧ s▹(λs'.⟦b⟧ s'▹κ)

The interpretation of a CC branch is also a function mapping before states to after states. It is the sequential composition of the interpretation of each instruction in the branch.

Notes

- To implement loops, you'll need to ‘dup’ the loop body's quotation before invoking it. Otherwise you'd need to have a copy of the loop body within itself, which isn't syntactically possible.

The FCC language

FCC is a “flattened” version of CC. Whereas in CC, quotations are nested directly in the instruction stream, FCC programs consist of a flat list of _labeled_ quotations. In each instruction stream, quotations are referred to by their labels, instead of being embedded directly.

This will map nicely to our existing partial path construction, which includes a source and sink node. (And as we've mentioned in a few places, graph nodes are equivalent to instruction pointers are equivalent to non-closure quotations.)

The NCC language

TODO: Move non-determinism here. CC would be fully deterministic (and would use functions instead of relations in its semantics) (Note that means CC would need to grow a conditional instruction of some kind)

P ≜ q
q ≜ b | q⊓q
b ≜ i | b;b

An NCC program ‘P’ consists of a single quotation. A clause ‘q’ is the non-deterministic choice of one or more branches. A branch ‘b’ is a sequence of instructions.