💾 Archived View for dcreager.net › stack-graphs › concatenative-language.gmi captured on 2024-09-29 at 00:09:14. Gemini links have been rewritten to link to archived content
⬅️ Previous capture (2023-07-22)
-=-=-=-=-=-=-
CC is a concatenative continuation-passing language. CC programs operate on a stack of data values. Because CC is a continuation-passing language, there is no control stack or runtime call stack. (Though see below how we could implement a ‘call’ instruction with syntactic sugar that wraps the rest of the instruction stream in a quotation.)
There are three kinds of value that can appear on the data stack:
P ≜ q q ≜ i | q;q
A CC program ‘P’ consists of a single quotation. A clause ‘q’ is a sequence of one or more instructions.
The possible instructions are as follows. (Note that some instructions are “terminals”, which means that any instructions appearing after it in a quotation are ignored.)
TODO: Do we need snapshots?
A closure is a special tuple, which consists of a quotation and a stack (represented as a tuple). It gives you the ability to throw away your current execution state and resume somewhere else.
[This is more typically called a “continuation” in other concatenative languages, but I don't want to confuse names with the continuation used below in the execution semantics.]
All of the following instructions don't need to be first-class in the formalism, and can be implemented in terms of other instructions:
We don't need pairs, just store the two elements on the stack.
We need literals for the tag values (typically ‘LEFT’ and ‘RIGHT’, which is sufficient to construct any other union)
v ≜ x | q | s s ≜ ∘ | v | s⋅s ∘⋅s = s⋅∘ = s s₁⋅(s₂⋅s₃) = (s₁⋅s₂)⋅s₃ (s₁,s₂,...) ≜ s₁⋅s₂⋅...
CC programs operate on, and executing one produces, a stack of values. Literal values and quotations are represented directly. Tuples are represented by a stack containing the tuple's values.
S ≜ s▹κ κ ≜ s→s ⟦⋅⟧ ≜ S↔S
The current state of a CC program consists of a stack, along with a continuation representing any additional work that needs to be performed. The interpretation ‘I’ of each language construct is a relation mapping a before state to an after state.
These semantics are similar to that of Oxcart, which also includes a continuation as part of the before- and after-state that each semantic function operates on.
Note that, because of how a quotation is defined via non-determinism, its interpretation is a _relation_, and not a function, of before states to after states.
⟦drop⟧ ∋ (v⋅s)▹κ ↦ s▹κ ⟦ijump⟧ (q⋅s)▹κ = s▹κ
⟦b⟧ ≜ S→S ⟦i;b⟧ s▹κ = ⟦i⟧ s▹(λs'.⟦b⟧ s'▹κ)
The interpretation of a CC branch is also a function mapping before states to after states. It is the sequential composition of the interpretation of each instruction in the branch.
- To implement loops, you'll need to ‘dup’ the loop body's quotation before invoking it. Otherwise you'd need to have a copy of the loop body within itself, which isn't syntactically possible.
FCC is a “flattened” version of CC. Whereas in CC, quotations are nested directly in the instruction stream, FCC programs consist of a flat list of _labeled_ quotations. In each instruction stream, quotations are referred to by their labels, instead of being embedded directly.
This will map nicely to our existing partial path construction, which includes a source and sink node. (And as we've mentioned in a few places, graph nodes are equivalent to instruction pointers are equivalent to non-closure quotations.)
TODO: Move non-determinism here. CC would be fully deterministic (and would use functions instead of relations in its semantics) (Note that means CC would need to grow a conditional instruction of some kind)
P ≜ q q ≜ b | q⊓q b ≜ i | b;b
An NCC program ‘P’ consists of a single quotation. A clause ‘q’ is the non-deterministic choice of one or more branches. A branch ‘b’ is a sequence of instructions.