💾 Archived View for stack.tilde.cafe › gemlog › 2022-09-29.forth.if.gmi captured on 2023-09-28 at 16:11:49. Gemini links have been rewritten to link to archived content
I crudely avoided the issue of how one would implement what I tried to describe in my previous post: a Forth in which decompiled source looks like original source. Someone called me on it, and I should clarify.
The problem is that we are trying to linearize the unlinearizable. Code, especially Forth code, is more of a tree. We read and write code linearly: a function consists of a bunch of tokens, and we think of it as an ordered sequence. But threaded code is executed as a depth-first tree walk: down into definitions until a bottom-level CODE word is reached, then up and down again, until the entire tree is walked. It's kind of awe-inspiring to visualize execution.
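To make the tree walk concrete, here is a minimal sketch in Python rather than Forth. The dictionary and the words SQUARE and FOURTH are made up for illustration; primitives stand in for CODE words and simply end the descent.

```python
# Toy dictionary: colon definitions are lists of word names,
# primitives (the bottom-level CODE words) are marked with None.
words = {
    "DUP": None,
    "*": None,
    "SQUARE": ["DUP", "*"],
    "FOURTH": ["SQUARE", "SQUARE"],
}

order = []  # records every word in the order it is entered


def walk(name):
    """Enter a word; descend into its body unless it is a primitive."""
    order.append(name)
    body = words[name]
    if body is None:      # primitive: nothing below it, go back up
        return
    for token in body:    # colon definition: walk each token in turn
        walk(token)


walk("FOURTH")
print(" ".join(order))  # FOURTH SQUARE DUP * SQUARE DUP *
```

The source of FOURTH is a flat list of two tokens, yet execution visits seven nodes: that is the gap between the linear text and the tree.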
So, whether IF jumps over some code or jumps sideways to some code somewhere else makes little difference. This opens the door for IF's clauses to live elsewhere as anonymous subroutines or, as Slava of Factor calls them, quotations.
Quotations in Factor are anonymous subroutines, compiled in-place but not executed until something like IF chooses to.
In Factor, you would say
10 3 < [ "Math is broken" print ] [ "Math is good" print ] if
Roughly equivalent to Lisp's
(if (< 10 3) (print "Math is broken") (print "Math is good"))
This seems pretty good, and works just the way I like in terms of the codestream matching your expectations of source. And it's very Forthy: a quotation leaves an address on the stack, and IF can choose to execute it or not.
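A rough Python model of that behaviour, assuming a quotation can be represented by a callable sitting on the data stack. The helper names (push, less_than, if_) are hypothetical and only mirror the Factor line above.

```python
stack = []    # data stack
output = []   # stands in for the terminal


def push(value):
    stack.append(value)


def less_than():
    b = stack.pop()
    a = stack.pop()
    stack.append(a < b)


def if_():
    """Pop two quotations and a flag, then run exactly one quotation."""
    false_quot = stack.pop()
    true_quot = stack.pop()
    flag = stack.pop()
    chosen = true_quot if flag else false_quot
    chosen()


# 10 3 < [ "Math is broken" print ] [ "Math is good" print ] if
push(10)
push(3)
less_than()
push(lambda: output.append("Math is broken"))  # pushed, not executed
push(lambda: output.append("Math is good"))    # pushed, not executed
if_()

print(output)  # ['Math is good']
```

Nothing inside either quotation runs until if_ picks one, which is the whole point.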
In practical terms, a straightforward implementation of inlining yet-unexecuted quotations requires a jump over the quotation that follows, but the jump can be made explicit with <quotation><size>... tokens, and perhaps even an <end-quotation> token which is never executed or perhaps is equivalent to a return.
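One way this could look, sketched as a tiny token interpreter in Python. The token names and the convention that the size counts the body cells including the end-quotation token are assumptions made for the example, not a description of any real Forth.

```python
code = [
    "lit", 10, "lit", 3, "<",
    "quotation", 3, '."', "Math is broken", "end-quotation",
    "quotation", 3, '."', "Math is good", "end-quotation",
    "ifelse",
    "end-quotation",   # acts as the return from the top level
]


def run(code):
    stack, rstack, out = [], [], []
    ip = 0
    while True:
        tok = code[ip]
        ip += 1
        if tok == "lit":
            stack.append(code[ip])
            ip += 1
        elif tok == "<":
            b = stack.pop()
            a = stack.pop()
            stack.append(a < b)
        elif tok == '."':
            out.append(code[ip])
            ip += 1
        elif tok == "quotation":
            size = code[ip]
            stack.append(ip + 1)   # leave the body's address on the stack
            ip += 1 + size         # and jump over the body
        elif tok == "ifelse":
            false_q = stack.pop()
            true_q = stack.pop()
            flag = stack.pop()
            rstack.append(ip)      # where to resume afterwards
            ip = true_q if flag else false_q
        elif tok == "end-quotation":
            if not rstack:         # returning from the top level
                return stack, out
            ip = rstack.pop()
        else:
            raise ValueError("unknown token: %r" % (tok,))


final_stack, printed = run(code)
print(printed)  # ['Math is good']
```

Here end-quotation is treated as a return, the second of the two options mentioned above.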
Another way to do that is to relocate quotations elsewhere or compile them elsewhere in the first place, and put IF in front of the two possible quotation references. My preferred syntax would be something like
10 3 < IFELSE { ." Math is broken" } { ." Math is good" }
Which would look like
<lit><10><lit><3><<><IFELSE><Q1><Q2>... and elsewhere, Q1 is <quotation><."> Math is broken<end-quotation> and Q2 is <quotation><."> Math is good<end-quotation>
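A Python sketch of that layout, with the two quotations stored after the main code and IFELSE reading its two references from the cells that follow it. The "exit" token and the way addresses are computed are assumptions for the example.

```python
main = ["lit", 10, "lit", 3, "<", "IFELSE", None, None, "exit"]
q1 = ["quotation", '."', "Math is broken", "end-quotation"]
q2 = ["quotation", '."', "Math is good", "end-quotation"]

code = main + q1 + q2
Q1 = len(main)             # address of the first <quotation> token
Q2 = len(main) + len(q1)   # address of the second one
code[6], code[7] = Q1, Q2  # patch the references after IFELSE


def run(code):
    stack, rstack, out = [], [], []
    ip = 0
    while True:
        tok = code[ip]
        ip += 1
        if tok == "lit":
            stack.append(code[ip])
            ip += 1
        elif tok == "<":
            b = stack.pop()
            a = stack.pop()
            stack.append(a < b)
        elif tok == '."':
            out.append(code[ip])
            ip += 1
        elif tok == "IFELSE":
            true_q, false_q = code[ip], code[ip + 1]
            ip += 2                        # resume after both references
            target = true_q if stack.pop() else false_q
            assert code[target] == "quotation"
            rstack.append(ip)
            ip = target + 1                # step past the marker token
        elif tok == "end-quotation":
            ip = rstack.pop()              # behaves like a return
        elif tok == "exit":
            return stack, out
        else:
            raise ValueError("unknown token: %r" % (tok,))


final_stack, printed = run(code)
print(printed)  # ['Math is good']
```

The main code stream now reads in the same order as the source, with no jump-over needed.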
Note the use of IFELSE, which takes two quotations, and IF, which takes one: this is required since, unlike Lisp, we do not know where expressions end... I also like curly braces, which leaves [ and ] free for switching between interpret mode and compile mode at will, as Forth allows you to.
This requires the decompiler to be able to inline the reconstructed quotation source, which is facilitated by the <quotation> token as a target of a reference, as opposed to a named function. Or we could omit such tokens and simply reverse-look-up the address to find a name, and failing that, assume a quotation.
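A minimal decompiler sketch in Python for the marker-token variant. The code layout, the "exit" token, and the function name decompile are assumptions for the example; it follows each reference, checks for the <quotation> marker, and inlines the body between curly braces.

```python
main = ["lit", 10, "lit", 3, "<", "IFELSE", None, None, "exit"]
q1 = ["quotation", '."', "Math is broken", "end-quotation"]
q2 = ["quotation", '."', "Math is good", "end-quotation"]
code = main + q1 + q2
code[6] = len(main)             # reference to Q1
code[7] = len(main) + len(q1)   # reference to Q2

# How many quotation references follow each conditional word.
OPERANDS = {"IF": 1, "IFELSE": 2}


def decompile(code, ip, terminator):
    """Rebuild source text from ip up to (not including) terminator."""
    parts = []
    while code[ip] != terminator:
        tok = code[ip]
        ip += 1
        if tok == "lit":
            parts.append(str(code[ip]))
            ip += 1
        elif tok == '."':
            parts.append('." ' + code[ip] + '"')
            ip += 1
        elif tok in OPERANDS:
            parts.append(tok)
            for _ in range(OPERANDS[tok]):
                target = code[ip]
                ip += 1
                # The marker tells us this is a quotation, not a named word.
                assert code[target] == "quotation"
                body = decompile(code, target + 1, "end-quotation")
                parts.append("{ " + body + " }")
        else:
            parts.append(tok)
    return " ".join(parts)


source = decompile(code, 0, "exit")
print(source)
```

The recursion means nested quotations would be inlined the same way, though that case is not exercised here.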
Is it worth it? It does complicate the compiler somewhat, so why bother?
The concept of quotations is actually a very powerful tool, as it creates the possibility of passing first-class code chunks around as data. It is kind of a lambda.
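As a small illustration of code passed around as data, here is a Python sketch of a user-defined combinator. The word TIMES and the stack model are hypothetical; the quotation is again just a callable on the data stack.

```python
stack = []


def times():
    """( n quot -- ) run the quotation n times."""
    quot = stack.pop()
    n = stack.pop()
    for _ in range(n):
        quot()


def double():
    """The body of a quotation like { 2 * }."""
    stack.append(stack.pop() * 2)


# 1 3 { 2 * } TIMES
stack.append(1)
stack.append(3)
stack.append(double)   # the quotation travels on the stack like any value
times()

print(stack)  # [8]
```

TIMES knows nothing about what the quotation does, which is what makes quotations behave like lambdas.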
More to come.