đŸ’Ÿ Archived View for dcreager.net â€ș 2020 â€ș 12 â€ș swanson-s1.gmi captured on 2024-12-17 at 09:16:38. Gemini links have been rewritten to link to archived content

View Raw

More Information

âŹ…ïž Previous capture (2023-07-22)

-=-=-=-=-=-=-

Swanson: S₁

2020-12-30
Note: This post is out of date! Swanson no longer has an S₁ language.

Previous: S₀

In the previous post, we described S₀, and showed how it would be **absolutely disgusting** to have to program in it directly. Which is why I described it as Swanson’s “assembly language”. In this post, we’ll look into exactly _how_ the language is complicated, and use that to describe a _slightly_ better language named S₁.

This is way too complex

Let’s dig into that some more! Here’s an incredibly simple bit of code:

6 + 4 * 3

It’s not even a statement or function, it’s just an expression! But even this example will be quite complex in S₀, for a few reasons that combine together:

The end result of all of this means that the S₀ version of our example looks something like:

module horrible_example {
  $load:
    containing ()
    receiving ($loaded, primitive.int)
  {
    $module = closure containing (primitive.int) -> main;
    -> $loaded;
  }

  main:
    containing (primitive.int)
    receiving ($finish)
  {
    value = literal "4";
    $return = closure containing ($finish) -> main@1;
    -> primitive.int from_literal;
  }

  main@1:
    containing ($finish)
    receiving ($_, $0)
  {
    primitive.int = rename $_;
    four = rename $0;
    value = literal "3";
    $return = closure containing ($finish, four) -> main@2;
    -> primitive.int from_literal;
  }

  main@2:
    containing ($finish, four)
    receiving ($_, $0)
  {
    primitive.int = rename $_;
    three = rename $0;
    $return = closure containing ($finish, primitive.int) -> main@3;
    rhs = rename three;
    -> four "*";
  }

  main@3:
    containing ($finish, primitive.int)
    receiving ($_)
  {
    twelve = rename $_;
    value = literal "6";
    $return = closure containing ($finish, twelve) -> main@4;
    -> primitive.int from_literal;
  }

  main@4:
    containing ($finish, twelve)
    receiving ($_, $0)
  {
    primitive.int = rename $_;
    six = rename $0;
    $return = closure containing ($finish, six, twelve) -> main@5;
    -> primitive.int drop;
  }

  main@5:
    containing ($finish, six, twelve)
    receiving ()
  {
    $return = closure containing ($finish) -> main@6;
    rhs = rename twelve;
    -> six "+";
  }

  main@6:
    containing ($finish)
    receiving ($_)
  {
    eighteen = rename $_;
    $return = closure containing ($finish) -> main@7;
    -> eighteen drop;
  }

  main@7:
    containing ($finish)
    receiving ($_)
  {
    -> $finish succeed;
  }
}

Hopefully you can piece together how this faithfully implements our simple arithmetic expression:

But not without complexity:

S₁ is ever so slightly better

What does this example look like in S₁?

module horrible_example {
  $load:
    containing ()
    receiving ($loaded, primitive.int)
  {
    $module = closure containing (primitive.int) -> main;
    $loaded();
  }

  main:
    containing (primitive.int)
    receiving ($finish)
  {
    value = literal "4";
    primitive.int->from_literal(value) -> ($0 -> four);
    value = literal "3";
    primitive.int->from_literal(value) -> ($0 -> three);
    four~>"*"(rhs <- three) -> ($_ -> twelve);
    value = literal "6";
    primitive.int->from_literal(value) -> ($0 -> six);
    primitive.int~>drop();
    six~>"+"(rhs <- twelve) -> ($_ -> eighteen);
    eighteen~>drop();
    $finish~>succeed();
  }
}

Note that we’ve only “solved” one of the three complexities that we mentioned above. We’ve added back in “implicit control flow”, so that we don’t have to manually extract each step of our computation into top-level blocks. But we still model _every_ operation as an invocation of some invokable, and we still have no nesting of operations. But it’s still a substantial improvement!

The overall structure of the code is largely the same: you’ve got a module, containing a number of blocks, each of which consists of some operations. But whereas in S₀, a block consists of zero or more statements followed by exactly one invocation, an S₁ block consists of an arbitrary list of statements and calls. The only restriction is that an S₁ block must end with a call.

S₁ calls

This call expression is the meat of S₁. Looking carefully, there are two variants, depending on whether you use ‘->’ or ‘~>’. The ‘->’ variant desugars into the ‘~>’ variant, so let’s look at the ‘~>’ version first:

six~>"+"(rhs <- twelve) -> ($_ -> eighteen);

This call gets “translated” into an S₀ invocation, along with some additional support statements. In this case, we’re invoking the ‘+’ branch of the value named ‘six’ in the current environment.

The call contains what look like parameter and return value lists. The ‘(rhs <- twelve)’ part tells us that ‘six +’ expects an input value named ‘rhs’ — but that the name of that input value in our environment is currently ‘twelve’, and so we’ll need to ‘rename’ it before the invocation.

Similarly, the ‘($_ -> eighteen)’ tells us that ‘six +’ will produce an output named ‘$_’ — but that we’d rather call that output value ‘eighteen’ in the rest of our code, and so we’ll need another ‘rename’ _after_ the invocation.

Most importantly, though, because this call is not the last operation in the S₁ block, we will automatically extract everything after this call into a new continuation block, and add it as an additional input value named ‘$return’. (‘$return’ is the default name for the continuation parameter; it’s not mentioned explicitly. There is additional syntax that gives you more control over how the continuation is passed in to the invocation, but we’ll ignore that for now.)

Altogether, this S₁ call gets translated into the following S₀, where the ‘CLOSURE’ part is automatically determined by whatever values are in the environment at the time of the call, but not mentioned as an input.

    $return = closure containing (CLOSURE) -> main@6;
    rhs = rename twelve;
    -> six "+";
  }

  main@6:
    containing (CLOSURE)
    receiving ($_)
  {
    eighteen = rename $_;

“Reuse” calls

As we mentioned above, _many_ invokables use an output named ‘$_’ to “return themselves back” to caller, as a way of getting around Swanson’s linearity. This is a common enough pattern that we’ve added syntactic sugar to S₁ to handle it. A call that uses ‘->’ will automatically add an extra output that renames ‘$_’ back to the name that it had before the call. That is, the following two calls are exactly equivalent:

primitive.int->from_literal(value) -> ($0 -> four);

primitive.int~>from_literal(value) -> ($_ -> primitive.int, $0 -> four);

What’s the point?

While S₁ is certainly “better” than S₀ — in that it’s less actively painful to program in it directly as a human — you might still be wondering why you’d subject yourself to it. In the introduction I mentioned that Swanson is a language _framework_, which we intend to compile or translate other higher level languages into. If that’s the case, why do we have this language that’s still so low level, instead of jumping straight to the higher level languages that are actually pleasant to use?

Introduction

The main reason is that this gives us a better story for _bootstrapping_. Another goal of the framework is to work with arbitrary _host environments_, while requiring as little as possible of those hosts. The only hard and fast rule is that a host needs to be able to parse and execute S₀ code. Some primitives will need to be provided by the host, but we want to minimize the number of primitives that each host needs to implement directly. As much as possible, we want the core “standard library” of Swanson code to be written in some way that (a) doesn’t require each host to reimplement it, and (b) doesn’t “bless” any one particular higher level language (or its standard library) as the one that all other hosts have to depend on.

S₁ is intended to serve this role. One single host environment (the “bootstrap environment”) will need to implement an S₁ parser and translator _directly_. That bootstrap environment can produce S₀ translations of any “standard library” code written in S₁. And every other host environment will then have access to that code, while requiring nothing more than an S₀ parser and a small set of primitives.

That’s the vision, at least!