2020-12-30
Note: This post is out of date! Swanson no longer has an S₁ language.
In the previous post, we described S₀, and showed how it would be **absolutely disgusting** to have to program in it directly. Which is why I described it as Swanson’s “assembly language”. In this post, we’ll look into exactly _how_ the language is complicated, and use that to describe a _slightly_ better language named S₁.
Let’s dig into that some more! Here’s an incredibly simple bit of code:
6 + 4 * 3
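It’s not even a statement or function, it’s just an expression! But even this example will be quite complex in S₀, for a few reasons that combine together: there is no implicit control flow, so every step of the computation has to be extracted by hand into its own top-level block; every operation is modeled as an invocation of some invokable; and operations cannot be nested.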
The end result of all of this means that the S₀ version of our example looks something like:
module horrible_example {
    $load:
    containing ()
    receiving ($loaded, primitive.int)
    {
        $module = closure containing (primitive.int) -> main;
        -> $loaded;
    }

    main:
    containing (primitive.int)
    receiving ($finish)
    {
        value = literal "4";
        $return = closure containing ($finish) -> main@1;
        -> primitive.int from_literal;
    }

    main@1:
    containing ($finish)
    receiving ($_, $0)
    {
        primitive.int = rename $_;
        four = rename $0;
        value = literal "3";
        $return = closure containing ($finish, four) -> main@2;
        -> primitive.int from_literal;
    }

    main@2:
    containing ($finish, four)
    receiving ($_, $0)
    {
        primitive.int = rename $_;
        three = rename $0;
        $return = closure containing ($finish, primitive.int) -> main@3;
        rhs = rename three;
        -> four "*";
    }

    main@3:
    containing ($finish, primitive.int)
    receiving ($_)
    {
        twelve = rename $_;
        value = literal "6";
        $return = closure containing ($finish, twelve) -> main@4;
        -> primitive.int from_literal;
    }

    main@4:
    containing ($finish, twelve)
    receiving ($_, $0)
    {
        primitive.int = rename $_;
        six = rename $0;
        $return = closure containing ($finish, six, twelve) -> main@5;
        -> primitive.int drop;
    }

    main@5:
    containing ($finish, six, twelve)
    receiving ()
    {
        $return = closure containing ($finish) -> main@6;
        rhs = rename twelve;
        -> six "+";
    }

    main@6:
    containing ($finish)
    receiving ($_)
    {
        eighteen = rename $_;
        $return = closure containing ($finish) -> main@7;
        -> eighteen drop;
    }

    main@7:
    containing ($finish)
    receiving ($_)
    {
        -> $finish succeed;
    }
}
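Hopefully you can piece together how this faithfully implements our simple arithmetic expression: it builds the integers 4 and 3 from literals, multiplies them to get twelve, builds 6 from a literal, and adds twelve to it to get eighteen.
But not without complexity: each of those steps needs its own top-level block, with a hand-built “$return” closure to carry the remaining values forward, and invokables such as “primitive.int” hand themselves back through an output named “$_” that has to be renamed after every use.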
What does this example look like in S₁?
module horrible_example {
    $load:
    containing ()
    receiving ($loaded, primitive.int)
    {
        $module = closure containing (primitive.int) -> main;
        $loaded();
    }

    main:
    containing (primitive.int)
    receiving ($finish)
    {
        value = literal "4";
        primitive.int->from_literal(value) -> ($0 -> four);
        value = literal "3";
        primitive.int->from_literal(value) -> ($0 -> three);
        four~>"*"(rhs <- three) -> ($_ -> twelve);
        value = literal "6";
        primitive.int->from_literal(value) -> ($0 -> six);
        primitive.int~>drop();
        six~>"+"(rhs <- twelve) -> ($_ -> eighteen);
        eighteen~>drop();
        $finish~>succeed();
    }
}
Note that we’ve only “solved” one of the three complexities that we mentioned above. We’ve added back in “implicit control flow”, so that we don’t have to manually extract each step of our computation into top-level blocks. But we still model _every_ operation as an invocation of some invokable, and we still have no nesting of operations. But it’s still a substantial improvement!
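To see what “explicit control flow” means outside of Swanson, here is a rough Python analogy. It is not Swanson code, and the function names are made up for illustration. The first version leans on Python’s nesting and implicit control flow; the second hands an explicit continuation to every step, much as each S₀ block hands a “$return” closure to the invokable it calls.

```python
# Direct style: nesting and implicit control flow do all the work.
def direct():
    return 6 + 4 * 3


# Continuation-passing style: every operation takes a `ret` callback
# (playing the role of $return) instead of returning to its caller.
def mul(lhs, rhs, ret):
    return ret(lhs * rhs)


def add(lhs, rhs, ret):
    return ret(lhs + rhs)


def main(finish):
    # Each nested function stands in, very loosely, for one of the
    # main@N continuation blocks: it receives a result and names it.
    def after_mul(twelve):
        def after_add(eighteen):
            return finish(eighteen)
        return add(6, twelve, after_add)
    return mul(4, 3, after_mul)


result = main(lambda value: value)
print(direct(), result)
```

Both versions compute the same value; the difference is only in who is responsible for saying what happens next.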
The overall structure of the code is largely the same: you’ve got a module, containing a number of blocks, each of which consists of some operations. But whereas in S₀, a block consists of zero or more statements followed by exactly one invocation, an S₁ block consists of an arbitrary list of statements and calls. The only restriction is that an S₁ block must end with a call.
This call expression is the meat of S₁. Looking carefully, there are two variants, depending on whether you use “->” or “~>”. The “->” variant desugars into the “~>” variant, so let’s look at the “~>” version first:
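One way to picture the relationship between the two block shapes is to cut an S₁ block after every call, so that each piece has the S₀ shape of “some statements, then one invocation”. The Python sketch below does only that cut. The `("stmt" | "call", text)` representation and the `split_into_blocks` helper are hypothetical, invented for this illustration, and the sketch ignores the renames and closures that a real translation would also have to emit.

```python
def split_into_blocks(name, ops):
    """Cut a flat list of operations into pieces that each end with a call.

    `ops` is a list of (kind, text) pairs, where kind is "stmt" or "call".
    This is an illustrative model, not Swanson's actual representation.
    """
    if not ops or ops[-1][0] != "call":
        raise ValueError("an S1 block must end with a call")
    blocks, current = [], []
    for kind, text in ops:
        current.append(text)
        if kind == "call":
            # A call closes the current piece; later pieces get an @N suffix.
            suffix = f"@{len(blocks)}" if blocks else ""
            blocks.append((name + suffix, current))
            current = []
    return blocks


# The body of `main` from the S1 listing above.
main_ops = [
    ("stmt", 'value = literal "4"'),
    ("call", "primitive.int->from_literal(value) -> ($0 -> four)"),
    ("stmt", 'value = literal "3"'),
    ("call", "primitive.int->from_literal(value) -> ($0 -> three)"),
    ("call", 'four~>"*"(rhs <- three) -> ($_ -> twelve)'),
    ("stmt", 'value = literal "6"'),
    ("call", "primitive.int->from_literal(value) -> ($0 -> six)"),
    ("call", "primitive.int~>drop()"),
    ("call", 'six~>"+"(rhs <- twelve) -> ($_ -> eighteen)'),
    ("call", "eighteen~>drop()"),
    ("call", "$finish~>succeed()"),
]

blocks = split_into_blocks("main", main_ops)
for block_name, body in blocks:
    print(block_name, len(body))
```

The eight calls yield eight pieces, named `main` through `main@7`, which lines up with the eight `main` blocks in the S₀ listing.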
six~>"+"(rhs <- twelve) -> ($_ -> eighteen);
This call gets “translated” into an S₀ invocation, along with some additional support statements. In this case, we’re invoking the “+” branch of the value named “six” in the current environment.
The call contains what look like parameter and return value lists. The “(rhs <- twelve)” part tells us that “six +” expects an input value named “rhs”, but that the name of that input value in our environment is currently “twelve”, and so we’ll need to “rename” it before the invocation.
Similarly, the “($_ -> eighteen)” tells us that “six +” will produce an output named “$_”, but that we’d rather call that output value “eighteen” in the rest of our code, and so we’ll need another “rename” _after_ the invocation.
Most importantly, though, because this call is not the last operation in the S₁ block, we will automatically extract everything after this call into a new continuation block, and add it as an additional input value named “$return”. (“$return” is the default name for the continuation parameter; it’s not mentioned explicitly. There is additional syntax that gives you more control over how the continuation is passed in to the invocation, but we’ll ignore that for now.)
Altogether, this S₁ call gets translated into the following S₀, where the “CLOSURE” part is automatically determined by whatever values are in the environment at the time of the call, but not mentioned as an input.
    $return = closure containing (CLOSURE) -> main@6;
    rhs = rename twelve;
    -> six "+";
}

main@6:
containing (CLOSURE)
receiving ($_)
{
    eighteen = rename $_;
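The translation can be sketched as a small text generator in Python. Everything here is an assumption made for illustration: the `translate_call` helper and its argument shapes are hypothetical, and the rule used for “CLOSURE” (every name in the environment except the call’s target and the values passed as inputs, listed in sorted order) is inferred from the listings in this post rather than taken from any Swanson implementation.

```python
def translate_call(target, branch, inputs, outputs, environment, next_block):
    """Render one non-final `~>` call as S0-style text.

    inputs:      (parameter name, current local name) pairs
    outputs:     (result name, desired local name) pairs
    environment: names available just before the call
    """
    passed = {local for _, local in inputs}
    # Assumed rule: capture whatever the call itself does not consume.
    # Sorting only mirrors the ordering seen in the listings.
    closure = sorted(n for n in environment if n != target and n not in passed)
    captured = ", ".join(closure)
    received = ", ".join(result for result, _ in outputs)

    lines = [f"$return = closure containing ({captured}) -> {next_block};"]
    # Renames needed before the invocation (skipped when names already match).
    lines += [f"{param} = rename {local};" for param, local in inputs if param != local]
    lines += [f"-> {target} {branch};", "}", "",
              f"{next_block}:", f"containing ({captured})",
              f"receiving ({received})", "{"]
    # Renames needed after the invocation, at the top of the continuation.
    lines += [f"{local} = rename {result};" for result, local in outputs if result != local]
    return "\n".join(lines)


s0_text = translate_call(
    target="six",
    branch='"+"',
    inputs=[("rhs", "twelve")],
    outputs=[("$_", "eighteen")],
    environment=["$finish", "six", "twelve"],
    next_block="main@6",
)
print(s0_text)
```

With the environment of `main@5` from the S₀ listing, the inferred rule leaves only `$finish` in the closure, which matches what `main@5` passes to `main@6` there.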
As we mentioned above, _many_ invokables use an output named “$_” to “return themselves back” to the caller, as a way of getting around Swanson’s linearity. This is a common enough pattern that we’ve added syntactic sugar to S₁ to handle it. A call that uses “->” will automatically add an extra output that renames “$_” back to the name that it had before the call. That is, the following two calls are exactly equivalent:
primitive.int->from_literal(value) -> ($0 -> four);
primitive.int~>from_literal(value) -> ($_ -> primitive.int, $0 -> four);
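The desugaring is mechanical enough to write down in a few lines of Python. As before, this is only a sketch: `desugar_arrow` and `render` are hypothetical helpers, and the tuple representation of a call is made up for the example.

```python
def desugar_arrow(target, branch, inputs, outputs):
    """Turn a `->` call into the equivalent `~>` call.

    The sugar prepends one output that renames `$_` back to the
    name the target had before the call.
    """
    return target, branch, list(inputs), [("$_", target)] + list(outputs)


def render(target, branch, inputs, outputs):
    """Print a `~>` call in the surface syntax used in this post."""
    ins = ", ".join(
        param if param == local else f"{param} <- {local}"
        for param, local in inputs
    )
    outs = ", ".join(f"{result} -> {local}" for result, local in outputs)
    return f"{target}~>{branch}({ins}) -> ({outs});"


call = desugar_arrow(
    "primitive.int", "from_literal",
    inputs=[("value", "value")],
    outputs=[("$0", "four")],
)
rendered = render(*call)
print(rendered)
```

Rendering the desugared call reproduces the second of the two equivalent lines above.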
While S₁ is certainly “better” than S₀ (in that it’s less actively painful to program in it directly as a human), you might still be wondering why you’d subject yourself to it. In the introduction I mentioned that Swanson is a language _framework_, which we intend to compile or translate other higher level languages into. If that’s the case, why do we have this language that’s still so low level, instead of jumping straight to the higher level languages that are actually pleasant to use?
The main reason is that this gives us a better story for _bootstrapping_. Another goal of the framework is to work with arbitrary _host environments_, while requiring as little as possible of those hosts. The only hard and fast rule is that a host needs to be able to parse and execute S₀ code. Some primitives will need to be provided by the host, but we want to minimize the number of primitives that each host needs to implement directly. As much as possible, we want the core “standard library” of Swanson code to be written in some way that (a) doesn’t require each host to reimplement it, and (b) doesn’t “bless” any one particular higher level language (or its standard library) as the one that all other hosts have to depend on.
S₁ is intended to serve this role. One single host environment (the “bootstrap environment”) will need to implement an S₁ parser and translator _directly_. That bootstrap environment can produce S₀ translations of any “standard library” code written in S₁. And every other host environment will then have access to that code, while requiring nothing more than an S₀ parser and a small set of primitives.
That’s the vision, at least!