Swail is the main programming language used in the implementation of XukutOS. It is a lisp that leans towards the functional side of things, but if you just want to do object-orientation, that should be fine too. "Swail" is an acronym for "Language that I have forgotten the acronym for", but retroactively it could also mean "Some Witty Acronym for an Interesting Language". These manual pages are a Swail reference, not a tutorial, and assume you are already comfortable in several programming languages, especially Common Lisp or Scheme.
The plan is for Swail to be a fully-featured general-purpose programming language for all your programming needs from within XukutOS. I'm trying to get there by implementing most of XukutOS in Swail (only the most fundamental parts are still in assembly). Swail gains functionality as XukutOS needs it, so some features are still missing. The manual describes Swail as it currently is, with a few references to some long-term plans.
The rest of this page gives an overview of the primitives available in Swail. It's a good idea to look this over before reading the rest of the Swail-related manual pages, but it is not essential that you understand all the details from this short description. Refer back to this page if you get confused.
Swail has the following basic types of objects:
A *symbol* is a very simple kind of object: it carries exactly no data, apart from its identity. Given symbols, you can test whether they are identical or distinct, and that's about it. Other uses, such as naming variables, work through support from the rest of the system. Unlike symbols in a typical lisp, Swail's symbols do not carry a print-name, which is intended to break assumptions that impede translation (between natural languages and between different computer systems). Symbols are used throughout Swail code whenever unique identifiers are needed: variable names, dictionary keys, annotating a definition, etc.
A *number* is an object storing an integer. All the obvious numerical operations (addition, subtraction, multiplication, equality testing, etc.) are available on numbers. Numbers have an immutable interface: there is no operation that modifies the value contained in a number. The implementation may have optimisations that cause number operations to return values that test as identical to other numbers, so that `2' might test as identical to the result of `1 + 1'. It is implementation-defined which values are stored in a small integer and which require a big integer. Currently, values in the interval `[-2^63, 2^63[' are stored in small integers. In the future, it may not be easy to determine which number objects are stored as a small integer and which as a big integer.
A *bool* (short for "Boolean value") represents the outcome of yes-or-no tests, such as comparison of object identity. There are only two *bool*s in existence, `#tt' ("true" or "yes") and `#ff' ("false" or "no"). They might as well have been symbols, were it not that making them a different type permits a few optimisation tricks.
A *cons* is an object referring to two (other, or the same) objects, traditionally known as the `car' and `cdr' of the cons. A cons with car `a' and cdr `b' is written `(a . b)'. Conses are used to encode list and tree structures. You can modify a cons, changing the object referred to by the car or cdr. Modification keeps all references to the cons intact; the language has no way to prevent objects containing conses from being mutated.
*List*s are not a primitive type; rather, they are a data structure built out of conses and nil (written `()'). A list (sometimes called a "cons-list" for extra clarity) is encoded as follows: if the list is empty, it is represented by nil; otherwise, let `x' be the first element and `xs' be the cons-list representation of the remaining elements, then `(x . xs)' is the cons-list representation. Often we will use an abbreviated notation for cons-lists, leaving out the dot and the inner parentheses if the cdr is a cons or nil. For example, `(a . (b . (c . ())))' is written `(a b c)', and `(a . (b . (c . d)))' is written `(a b c . d)'.
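The cons-list encoding and the abbreviated notation can be modelled outside Swail. The following is a minimal sketch in Python, not Swail code: the names `Cons`, `NIL`, `from_items` and `show` are made up for this illustration, and Python's `None` merely stands in for nil.

```python
class Cons:
    """A mutable pair of references: the car and the cdr."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

NIL = None  # stand-in for Swail's nil (assumption of this sketch)

def from_items(items, tail=NIL):
    """Encode a Python sequence as a cons-list whose last cdr is `tail`."""
    result = tail
    for item in reversed(list(items)):
        result = Cons(item, result)  # (x . xs)
    return result

def show(obj):
    """Render an object, using the abbreviated notation for conses."""
    if obj is NIL:
        return "()"
    if not isinstance(obj, Cons):
        return str(obj)
    parts = []
    while isinstance(obj, Cons):
        # Drop the dot and inner parentheses while the cdr is a cons.
        parts.append(show(obj.car))
        obj = obj.cdr
    if obj is not NIL:
        # The final cdr is neither a cons nor nil: keep the dot.
        parts += [".", show(obj)]
    return "(" + " ".join(parts) + ")"
```

With these definitions, `show(from_items(["a", "b", "c"]))` gives `(a b c)` and `show(from_items(["a", "b", "c"], "d"))` gives `(a b c . d)`. Assigning to `car` or `cdr` changes the cons in place, so every holder of a reference to it sees the change.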
A *tuple* is a sequence of referenced objects with random access: any element can be retrieved directly by its index. You could see a cons as a 2-tuple, except that it is possible to modify the size of a tuple in certain circumstances. Indexing starts at 0.
A *ref* contains a reference to one object. It could have been implemented as a 1-tuple, except for some special properties allowing for efficient use of refs in multiprocessor code.
A *span* is a (potentially) very big number whose digits can be indexed in certain bases, e.g. base 2^8, 2^16, 2^32 and 2^64. In other words: it's an array of bytes that can be indexed in byte, 2-byte, 4-byte and 8-byte increments (chosen at indexing time). Indexing starts at 0. A span is represented as a contiguous slice of a block of memory. Multiple spans can share the same block of memory, and the slices can overlap or not. Programmers are advised to be very cautious when modifying the contents of a span that shares its block with other spans, as the modifications will need to be synchronised carefully. Presumably there will be support for some sort of copy-on-write operation, in order to avoid many of these issues while retaining efficiency.
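To make the idea of overlapping slices and multi-width indexing concrete, here is a minimal model in Python (not Swail code). The class `Span` and its `get` and `set` methods are hypothetical names for this sketch. It also assumes that the index counts units of the chosen width and that the digit at index 0 is the least significant one (little-endian); the description above does not settle either point.

```python
class Span:
    """A contiguous slice of a shared block of bytes."""
    WIDTHS = (1, 2, 4, 8)  # byte, 2-byte, 4-byte and 8-byte digits

    def __init__(self, block, start, length):
        # A memoryview slice shares storage with `block`; nothing is copied.
        self.view = memoryview(block)[start:start + length]

    def _offset(self, index, width):
        if width not in self.WIDTHS:
            raise ValueError("unsupported width")
        offset = index * width  # assumption: index counts digits, not bytes
        if index < 0 or offset + width > len(self.view):
            raise IndexError("index out of range")
        return offset

    def get(self, index, width=1):
        """Read digit `index` in base 2**(8 * width)."""
        offset = self._offset(index, width)
        return int.from_bytes(self.view[offset:offset + width], "little")

    def set(self, index, value, width=1):
        """Overwrite digit `index`; visible to every span sharing the block."""
        offset = self._offset(index, width)
        self.view[offset:offset + width] = value.to_bytes(width, "little")


block = bytearray(range(1, 9))   # bytes 01 02 03 04 05 06 07 08
whole = Span(block, 0, 8)
tail = Span(block, 4, 4)         # overlaps the second half of `whole`
tail.set(0, 0xFF)                # also changes byte 4 as seen through `whole`
```

The last line is exactly the situation the warning above is about: `whole` never asked for a change, yet `whole.get(4)` now returns `0xFF`.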
A *text* is a sequence of characters that can be shown to the user, including choice of character sets, fonts, markup, etc. Texts are built as a "rope", i.e. ordered sequence, of *str*s (short for "string"). Text within a str cannot switch character set, font or markup. Strs are a low-level abstraction that is best avoided outside of text editors and similar, but the amount of work other languages put into supporting them makes them worth mentioning here.
A *macro* is a special type of function that is used to transform Swail code into other Swail code. Due to its special role in the evaluation order, there might be differences between the properties of a macro and those of a normal function that returns a piece of syntax.
A *continuation* is like a function in that it can be applied to arguments. Instead of executing code stored in the continuation, applying it makes execution go back to the point where the continuation was defined. Working with continuations directly can get terribly confusing, but they allow you to implement many different language features (nondeterminism, exceptions, handlers and restarts, ...) in a uniform way.
Alongside these general-purpose data types, Swail includes a variety of more specialised data types, described below.
A Swail program is typically given as one object, called a *form*. A form is processed by the evaluator as follows, in a kind of yin-yang mutual recursion of eval and apply (see e.g. the cover of Structure and Interpretation of Computer Programs):
- If the form is a symbol, it is looked up as a variable in the lexical environment. If there is a binding in the lexical environment, the result of evaluating is the value the name is bound to. If there is no such binding, an error is raised.
- If the form is not a symbol or a list, the result is the form itself.
- Otherwise, the form must be a list. The result of evaluating is the result of applying the car of the form to the cdr.
To apply an operator to a list of arguments, the evaluator does the following:
- If the operator is one of the special operators, the result is as defined for that specific operator.
- If the operator is the name of a macro, the macro call is expanded and the result of expanding the macro is evaluated.
- Otherwise, the operator and arguments are evaluated in the order of the list. The result of evaluating the operator is called on the results of evaluating the arguments.
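The eval and apply rules above can be sketched as a pair of mutually recursive functions. This is a toy model in Python, not the actual evaluator. Assumptions of the sketch: forms are Python lists instead of cons-lists, the lexical environment is a plain dict, only two special operators (`QUOTE` and `IF`, stand-ins for `swail:quote` and `swail:if`) are modelled, Python truthiness replaces the conversion to bool, and the empty list evaluates to itself. The `hint` field on `Symbol` exists purely for debugging the sketch; real Swail symbols carry no name.

```python
class Symbol:
    """Identity-only object; `hint` is a debugging aid of this sketch."""
    def __init__(self, hint=""):
        self.hint = hint

QUOTE, IF = Symbol("quote"), Symbol("if")  # two stand-in special operators

class Macro:
    """Wraps a Python function from argument forms to a new form."""
    def __init__(self, expander):
        self.expander = expander

def evaluate(form, env):
    if isinstance(form, Symbol):
        if form not in env:
            raise NameError("unbound variable")
        return env[form]
    if not isinstance(form, list) or not form:
        return form  # neither symbol nor (non-empty) list: self-evaluating
    return apply_operator(form[0], form[1:], env)

def apply_operator(operator, args, env):
    # 1. Special operators follow their own rules.
    if operator is QUOTE:
        return args[0]
    if operator is IF:
        if evaluate(args[0], env):
            return evaluate(args[1], env)
        return evaluate(args[2], env) if len(args) > 2 else None
    # 2. Macro names: expand the call, then evaluate the expansion.
    if isinstance(operator, Symbol) and isinstance(env.get(operator), Macro):
        return evaluate(env[operator].expander(*args), env)
    # 3. Otherwise: evaluate operator, then arguments, in list order.
    function = evaluate(operator, env)
    values = [evaluate(arg, env) for arg in args]
    return function(*values)


PLUS, X, UNLESS = Symbol("+"), Symbol("x"), Symbol("unless")
env = {
    PLUS: lambda a, b: a + b,
    X: 20,
    # Hypothetical macro: (unless c body) expands to (if c nil body).
    UNLESS: Macro(lambda c, body: [IF, c, None, body]),
}
answer = evaluate([PLUS, X, 22], env)
```

Note how `evaluate` calls `apply_operator` for lists, and `apply_operator` calls back into `evaluate` for the operator, the arguments and any macro expansion.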
Swail has the following built-in special forms:
- `(swail:def-macro name params . body)': creates a new macro with the given parameters, code and the current lexical environment as context. The macro will be in scope for the current environment.
- `(swail:do . forms)': evaluates each form in the list `forms' in turn and returns the result of evaluating the last form. If `forms' is empty, the result is `nil'.
- `(swail:dyn name)': looks up the variable `name' in the dynamic environment.
- `(swail:fn params . code)': creates a new `fn' object with the given parameters, code and the current lexical environment as context. Planned: this operator has a built-in abbreviation `λ'.
- `(swail:if cond then (optional else))': evaluates `cond' and converts the result to a bool. If the result is `#tt', the result of the form is the result of evaluating `then', otherwise the result of evaluating `else'. If `else' is omitted, the default value is `nil'. (In fact, there is also a built-in special form `swail:if-bool' which does not convert its condition to a boolean; this is mostly an implementation detail.)
- `(swail:let binders . code)': `binders' is a list of even length alternating between a name and a form. All the forms in the binders are evaluated, then `code' is evaluated with each name bound to the result of evaluating the corresponding form.
- `(swail:quote form)': evaluates to `form' but does not evaluate `form' any further. For example, `(let (x 1) (swail:quote x))' evaluates to the symbol `x' rather than the number `1'. Planned: `quote' can be abbreviated to the ' symbol, so that `'form' expands to `(quote form)'. Planned: `quote' (or a new `syntax-[quasi]quote' operator?) might annotate symbols in such a way that variables appear lexically scoped, e.g. by replacing all unbound symbols at the moment of evaluation with gensyms.
- `(swail:repeat form)': keeps evaluating `form' until control passes outside the `repeat' form.
- `(swail:set var expr)': evaluates `expr' and sets `var' to the result of evaluation.
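The flat, alternating binder list of `swail:let' and the "last result wins" rule of `swail:do' can be illustrated with a small model. This is a sketch in Python, not Swail code, under several assumptions: names are Python strings rather than symbols, the only forms are names and constants, the body of a let is treated like the forms of a do, and the description of `swail:let' is read as binding in parallel (every binder form is evaluated in the outer environment before any name is bound). The function names are made up for this illustration.

```python
def evaluate(form, env):
    """Toy evaluator: strings are variable names, anything else is constant."""
    return env[form] if isinstance(form, str) else form

def eval_do(forms, env):
    """Evaluate each form in turn; return the last result, or None for nil."""
    result = None
    for form in forms:
        result = evaluate(form, env)
    return result

def eval_let(binders, code, env):
    """Bind names from a flat (name form name form ...) list, then run code."""
    if len(binders) % 2 != 0:
        raise ValueError("binders must have even length")
    names, forms = binders[0::2], binders[1::2]
    # All forms are evaluated first, in the outer environment...
    values = [evaluate(form, env) for form in forms]
    # ...and only then are the names bound, in a fresh inner environment.
    inner = {**env, **dict(zip(names, values))}
    return eval_do(code, inner)


outer = {"x": 1}
# Models (swail:let (x 2 y x) y): y is bound to the outer x.
result = eval_let(["x", 2, "y", "x"], ["y"], outer)
```

Under the parallel reading, `result` is `1` here. A sequential reading (as in Common Lisp's `let*`) would give `2`, so this example is a useful one to try against the real implementation.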
Any questions? Contact me:
By email at vierkantor@vierkantor.com