💾 Archived View for gary.vern.cc › v3-spec.gmi captured on 2024-09-29 at 00:07:30. Gemini links have been rewritten to link to archived content

V3

A virtual machine for program "archival". The entire body of programs for the V3 architecture can be trivially ported (to a new architecture, or into the future as the system under an existing implementation changes) by implementing only this specification. That implementation can then be used to run all the programs written for V3. The ISA aims to strike a balance between being straightforward to implement and providing rich enough features to support useful applications, while still being encoded somewhat compactly.

Notation

Terminology

Cell:

Direct Address/Operand:

Immediate Address/Operand:

Instruction:

Operation:

Segment:

Architecture

Memory is a contiguous address space of 64K 16-bit cells, which are not byte addressable -- each consecutive address references the next 16-bit cell, and each cell can be one of the following:

Character strings are not packed as ASCII bytes; instead each consecutive character of a string uses a full 16-bit cell, encoded as UTF-16.

CPU

The CPU has a dual stack architecture and several additional registers. The data stack and the exit stack share $1fa cells, all mapped into the same address space. The first six cells memory-map the first six registers, for a total of $200 memory mapped cells. Registers %d and %e each refer to the next available empty cell inside the data and exit stacks respectively. These stacks grow towards each other from either end of the shared stack cells. Stack overflow occurs when the top elements collide somewhere between the two extremes, not at an arbitrary "half-way" address.
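The collision rule can be sketched in Python. This is a minimal model rather than a reference implementation: the class and method names are hypothetical, and it assumes the stack cells run from $0006 to $01ff as listed in the Memory Map section, with %d and %e each pointing at the next empty cell.

```python
DATA_BOTTOM = 0x0006  # first data stack cell, per the memory map
EXIT_BOTTOM = 0x01FF  # first exit stack cell, per the memory map


class SharedStacks:
    """Hypothetical model of two stacks sharing one block of cells."""

    def __init__(self):
        self.mem = [0] * 0x200
        self.d = DATA_BOTTOM  # %d: next empty data stack cell
        self.e = EXIT_BOTTOM  # %e: next empty exit stack cell

    def _require_free_cell(self):
        # Once %d has moved past %e, every shared cell is in use.
        if self.d > self.e:
            raise OverflowError("data and exit stacks collided")

    def push_data(self, value):
        self._require_free_cell()
        self.mem[self.d] = value & 0xFFFF
        self.d += 1  # data stack grows towards $0200

    def push_exit(self, value):
        self._require_free_cell()
        self.mem[self.e] = value & 0xFFFF
        self.e -= 1  # exit stack grows towards $0000

    def pop_data(self):
        if self.d == DATA_BOTTOM:
            raise IndexError("data stack underflow")
        self.d -= 1
        return self.mem[self.d]

    def pop_exit(self):
        if self.e == EXIT_BOTTOM:
            raise IndexError("exit stack underflow")
        self.e += 1
        return self.mem[self.e]
```

In this model either stack may use any share of the cells; an overflow is only raised once the two together have filled the whole block.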

    ~~~~~ box-diagram of CPU components and connections
                            ,>,>- - - - - - ->?+-------+?<- - - - .            
         ...  |   ...  |   / /                 |       |          ^            
    ,-> $1237 | nextop |  / /                  v       v      REGISTERS        
    |         +--------+ / /                  src     dst    +----+----+       
    |   $1236 |  arg2 -+' /                  __v__   __v__   | %a | %b |       
    |   $1235 |  arg1 -+-'  +----+----+      \    \ /    /   |    | %c-+>-.    
    |   $1234 | opcode-+--->| %i | %f |>--in->\    v    /    | %d | %e |  |    
    |         +--------+    +-v--+----+     ,->\  ALU  /     | %h |    |  |    
    |    ...  |  ...   |      |            /    \_____/      | %r | %s |  |    
    |             ^           `-----------'        v         | %t | %n |  |    
    |             |                               dst        +----+----+  |    
    |             |                                v              ^       |    
    |             `- - - - - - - - - - - - - - - < +->------------'       |    
    `--------------< %c is the address of the next opcode <---------------'

        |ADDRS|<MEMORY>|<-                    CPU                    ->|

    ~~~~~ 16-bit cpu registers
    %a     address  - general purpose register
    %b     base     - general purpose register
    %c     counter  - program counter; address of the next instruction to fetch
    %d     data     - reference to the next empty data stack slot
    %e     exit     - reference to the next empty exit stack slot
    %f     flags    - 1-bit flags set as a side effect of many operations
    %h     height   - height of the allocated stack of memory segments
    %i     instruction - instruction currently being decoded
    %m     memory   - index of the active memory segment
    %r     return   - "return" (top) element on exit stack
    %s     second   - "second" element on exit stack
    %t     top      - "top" element on data stack
    %n     next     - "next" (2nd) element on data stack

Flag Bits

1-bit flags are all packed into the %f register.

    ~~~~~
    S      sign     - msb of last alu operation result
    O      overflow - instruction result flipped msb compared to operands
    C      carry    - instruction result is larger than fits in 16-bit cell
    Z      zero     - all 0-bits in instruction result
    B      %b       - pop %b from %e on the next exit
    A      %a       - pop %a from %e on the next exit
    X      exit     - pop BAX flags from %e on the next exit

    |f e d c|b a 9 8|7 6 5 4|3 2 1 0|
    +-------+-------+-------+-------+
    |S O C Z|    *reserved*   |B A X|

There are no instructions to directly manipulate the contents of %f. The bits are set automatically according to the result of many operations. See the Flag Operations section below for details of the instructions with behaviour that depends on the bits in %f.
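As an illustration of the bit layout, the following Python sketch computes the S, O, C and Z bits for a 16-bit addition and packs them into the top nibble of a flags word. The function name is hypothetical, and the overflow rule used here (both operands share a sign that the result lacks) is an assumption about what "flipped msb compared to operands" means for addition.

```python
S_BIT, O_BIT, C_BIT, Z_BIT = 0x8000, 0x4000, 0x2000, 0x1000


def add_with_flags(dst, src, carry_in=0):
    """Return (16-bit result, SOCZ bits) for dst + src + carry_in."""
    total = dst + src + carry_in
    result = total & 0xFFFF
    flags = 0
    if result & 0x8000:
        flags |= S_BIT  # msb of the result
    # Assumption: O is set when both operands share a sign the result lacks.
    if (dst ^ result) & (src ^ result) & 0x8000:
        flags |= O_BIT
    if total > 0xFFFF:
        flags |= C_BIT  # result did not fit in one 16-bit cell
    if result == 0:
        flags |= Z_BIT
    return result, flags
```

For example, adding 1 to $7fff gives $8000 with S and O set, and adding 1 to $ffff gives $0000 with C and Z set. The B, A and X bits in the low nibble are left out of this sketch.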

Memory Map

Programs normally reside in memory at $200 or above, because the addresses below that hold the memory mapped registers and stacks, which are accessible using normal instructions.

    ~~~~~ memory mapped registers
    $0000  %f      flags
    $0001  %a      address
    $0002  %b      base
    $0003  %c      counter
    $0004  %d      data
    $0005  %e      exit
    ~~~~~ memory mapped stacks
    $0006          bottom of data stack (grows towards $0200)
     ...
    $01ff          bottom of exit stack (grows towards $0000)
    ~~~~~
    $0200 _start   start of program memory
     ...
    $ffff          end of segment
    ~~~~~

%f is mapped to $0000 so that when reading hex instructions, %a through %e are encoded as $1 through $5 in m/dst and m/src nibbles, and those encoded values map exactly to their mapped addresses in memory.

While only 63.5K cells are available to an individual program (512 addresses are taken up by memory mapped stacks and registers), there are up to $ff 64K segments available for other tasks when necessary.

Each segment provides the same access to memory mapped registers, which are synchronized with the CPU registers when switching between programs defined in those segments.

We don't map every register into memory:

Devices

In addition to the segmented address spaces available for executing programs, there is a separate address space for devices. Devices are accessed by "out" and "in" operations. Each device has 16 ports of 16 bits each, numbered $0 to $f, which can be read or written according to their specification.

    ~~~~~ devices and ports
          :system (device $0)
    $0  fatal   ,---.
    $8  color1  0a6f `-> 000000
    $9  color2  05df     aa55cc
    $a  color3  0cbf     66ddbb
    $e  debug      `---> ffffff
    $f  state
          :console (device $1)
    $0  readv   runs when there is input from console
    $1  write
    $2  error
    $f  outlen

TODO: define more devices!

Instruction Set

Each instruction fits in a 16-bit cell, where the most significant 4 bits encode the actual operation and determine how the rest of the instruction is decoded.

    ~~~~~ :operation bits
    $0000  out      - write `m/src` to `m/dst` in device address space
    $1000  in       - read from `m/src` in device address space into `m/dst`
    $2000  mov      - copy `m/src` to `m/dst`
    $3000  inv      - set `m/dst` to bitwise ~ `src`
    $4000  and      - set `m/dst` to bitwise `m/dst` & `src`
    $5000  or       - set `m/dst` to bitwise `m/dst` | `src`
    $6000  xor      - set `m/dst` to bitwise `m/dst` ^ `src`
    $7000  shf      - shift m/dst left by bits 4:7 nibble of `m/src`, and then
                      shift right by bits 0:3 of `m/src`    
    $8000  mul      - set `m/dst` to `m/dst` * `src`
    $8040  tuck     - tuck a copy of the top item of %d/%e under the `depth`-most item
    $9000  div      - set `m/dst` to `m/dst` / `src`
    $9040  roll     - remove the `depth`-most item of %d/%e and push back on top
    $a000  mod      - set `m/dst` to `m/dst` % `src`
    $b000  add      - set `m/dst` to `m/dst` + `src`
    $c000  sub      - set `m/dst` to `m/dst` - `src`
    $d000  set      - set `s/dst` to $0000 or $ffff according to flags in %f
    $dd0d  exit     - exit subroutine according to flags set in %f
    $e000  jmp      - jump to `m/src` according to flags set in %f
    $f000  call     - call subroutine at `m/src` according to flags set in %f
    $f400  xch      - exchange `m/dst` and `m/src`

There is no unary minus operation, because with 2's complement binary numbers we can use "inv" with the `onein` post setting.
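The two's complement identity behind this can be checked with a short Python sketch. The function names are hypothetical, and it assumes the auxiliary `in` line is simply added to the inverted value.

```python
def inv(src, aux_in=0):
    """Bitwise inverse of a 16-bit cell, plus the auxiliary `in` line."""
    return (~src + aux_in) & 0xFFFF


def negate(value):
    """Unary minus as "inv" with the onein post setting (`in` is 1)."""
    return inv(value, aux_in=1)
```

Under that assumption `negate(1)` is $ffff and `negate(5)` is $fffb, the 16-bit encodings of -1 and -5.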

Micro-coding of Operations

15 of the 20 available operations use the same bit-patterns to indicate how the CPU should behave for that operation; the remaining 5 operations each require their own alternative interpretations of the micro-coded bits:

    |f e d c|b a 9 8|7 6 5 4|3 2 1 0|  |f e d c|b a 9 8|7 6 5 4|3 2 1 0|
    +-------+-------+-------+-------+  +-------+-------+-------+-------+
    |  out  |x| post| m/dst | m/src |
    |  in   |x| post| m/dst | m/src |
    |  mov  |x| post| m/dst | m/src |
    |  inv  |x| post| m/dst | m/src |
    |  and  |x| post| m/dst | m/src |
    |  or   |x| post| m/dst | m/src |
    |  xor  |x| post| m/dst | m/src |
    |  shf  |x| post| m/dst | m/src |
    |  mul  |x| post| m/dst | m/src |  |  tuck |x| post| %d/%e | depth |
    |  div  |x| post| m/dst | m/src |  |  roll |x| post| %d/%e | depth |
    |  mod  |x| post| m/dst | m/src |
    |  add  |x| post| m/dst | m/src |
    |  sub  |x| post| m/dst | m/src |
    |  set  |x| post| flags | s/dst |  |  exit |  $d   | flags |  $d   |
    |  jmp  |x|c| $0| flags | m/src |  |  xch  |x|  $1 | m/dst | m/src |
    |  call |x|c|b|a| flags | m/src |

"mul" & "tuck" and "div" & "roll" use the same operation nibble, but only %d or %e are valid m/dst nibbles for "tuck" and "roll" where %d and %e are invalid m/dst nibbles for "mul" and "div"!

"xch" and "jmp" use the same operation nibble, but "jmp" always sets "a" and "b" bits to 0, where "xch" sets the "a" bit to 1.

"exit" uses bits from the otherwise nonsensical `%f set exit` encoding, though often an unconditional exit operation can be fused into the preceding instruction's x-bit instead (exceptions are when the address of an "exit" is labelled as a potential "jmp" target, or if the previous instruction already has a set x-bit).

ALU Operations

The most common machine instruction microcode bit-layout is:

    |f e d c|b a 9 8|7 6 5 4|3 2 1 0|
    +-------+-------+-------+-------+
    |  alu  |x| post| m/dst | m/src |


    ~~~~~ :micro-code
    alu    operation
                    - see the previous section: Instruction Set
    x      exit     - when set, exit the current call when the rest of this
                      instruction has been executed, see Flag Operations below
                      for more detail
    post   post adjust
                    - various options that affect the interaction of this
                      instruction with the ALU, a single instruction can only
                      specify one of these:
            $0000  zeroin   - `in` is 0
            $0100  onein    - auxiliary line `in` is 1
            $0200  signin   - `in` is S flag
            $0300  carryin  - `in` is C flag
            $0400  direct   - `m/src` is a small direct value
            $0500  onlyf    - do not write the results of the ALU operation back
                              to `m/dst`, but do set flags as if we had done so
            $0600  postinc  - post increment contents of `dst` if its `m` bit
                              is set and similarly for `src` and its `m` bit
            $0700  postdec  - as postinc, but decrementing `m` registers
    m/dst  destination
                    - set value or memory at address in register `dst`
    m/src  source   - get value or memory at address in register `src`

Normally, the `m/src` bits encode what register is being referenced. In this table, the hex columns are the encoding in the least significant nibble of the instruction, and the following symbols are the assembly read syntax -- a % sign followed by a letter refers to the content of the given register, and an @ followed by a letter refers to the content of the memory whose address is in the given register:

    $0000  @c    $0001  @a    $0002  @b    $0003  @r
    $0004  @d    $0005  @e    $0006  @t    $0007  @n    
    $0008  %t    $0009  %n    $000a  %a    $000b  %b
    $000c  %c    $000d  %d    $000e  %e    $000f  %r

...otherwise when the direct flag is set, the four `m/src` bits then encode a direct constant value:

    $0000  $0    $0001  $1    $0002  $2    $0003  $3
    $0004  $4    $0005  $7    $0006  $8    $0007  $f
    $0008  $fff1 $0009  $fff8 $000a  $fff9 $000b  $fffb
    $000c  $fffc $000d  $fffd $000e  $fffe $000f  $ffff

Note that %d and %e will pop a value from the top of the given stack, where %t and %r will use those same values but without popping.

Also "@c" really means an immediate argument from the instruction stream, i.e. the content of the address held in the program counter. Like popping from %d and %e, fetching from the instruction stream automatically increments %c -- you can think of it like a stack of op-codes growing from high memory down towards the address currently in %c.

For `m/dst` bits, the encoding is similar except that instead of adjusting %c (we have "jmp" and "call" operations for that), `m/dst` encoding references %s, the second value on exit stack.

    $0000  @c    $0010  @a    $0020  @b    $0030  @r
    $0040  @d    $0050  @e    $0060  @t    $0070  @n
    $0080  %t    $0090  %n    $00a0  %a    $00b0  %b
    $00c0  %s    $00d0  %d    $00e0  %e    $00f0  %r

With `m/src` and `m/dst` bits, notice how %a, %b, %d and %e are encoded as $a, $b, $d and $e; and similarly @a, @b, @d and @e as $1, $2, $4 and $5. This makes it a little easier to eyeball many instruction op-codes.
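To show how the fields fit together, here is a hedged Python sketch of a decoder for the common `alu | x | post | m/dst | m/src` layout only. The function and table names are hypothetical; the tables are copied from the encodings listed above, and the special layouts ("tuck", "roll", "set", "jmp", "call", "exit" and "xch") are deliberately not handled.

```python
OPERATIONS = ["out", "in", "mov", "inv", "and", "or", "xor", "shf",
              "mul", "div", "mod", "add", "sub", "set", "jmp", "call"]
POSTS = ["zeroin", "onein", "signin", "carryin",
         "direct", "onlyf", "postinc", "postdec"]
SRC = ["@c", "@a", "@b", "@r", "@d", "@e", "@t", "@n",
       "%t", "%n", "%a", "%b", "%c", "%d", "%e", "%r"]
DST = SRC[:12] + ["%s"] + SRC[13:]  # m/dst uses %s where m/src uses %c
DIRECT = [0x0, 0x1, 0x2, 0x3, 0x4, 0x7, 0x8, 0xF,
          0xFFF1, 0xFFF8, 0xFFF9, 0xFFFB, 0xFFFC, 0xFFFD, 0xFFFE, 0xFFFF]


def decode(cell):
    """Split one 16-bit instruction cell into its micro-coded fields."""
    post = POSTS[(cell >> 8) & 0x7]
    src_bits = cell & 0xF
    return {
        "op": OPERATIONS[cell >> 12],
        "x": bool(cell & 0x0800),
        "post": post,
        "dst": DST[(cell >> 4) & 0xF],
        # With the direct post setting, m/src is a small constant instead.
        "src": DIRECT[src_bits] if post == "direct" else SRC[src_bits],
    }
```

With these tables `decode(0x20DA)` reads as a "mov" from %a to %d, which matches the way the $d and $a nibbles can be eyeballed in the op-code.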

For a few operations ("in", "mov", "inv", and the stack operations "tuck" and "roll"), %d and %e in `m/dst` will push a new result value on top of the given stack, where %t and %r will overwrite that same top stack value. For all other operations, %d and %e in `m/dst` usually constitute an invalid instruction. The right side of the table in "Micro-coding of Operations" above shows some exceptions where distinct stack operations can be encoded with %d or %e in `m/dst`.

At a later date, the other bit patterns might be used to add new opcodes, so don't rely on any specific behaviour of currently "invalid" instructions.

Most op-codes with this layout set %f bits according to the result of executing the instruction (the ALU output). See the individual operation descriptions in the Assembly Syntax section below for exceptions:

    S - set to the sign bit (msb) of the result
    O - set if sign bit of the result is not the same as the operands
    C - set when result is larger than fits in 16-bit cell
    Z - set if only 0 valued bits comprise the result

Device Operations

    |f e d c|b a 9 8|7 6 5 4|3 2 1 0|
    +-------+-------+-------+-------+
    |  out  |x| post| m/dst | m/src |
    |  in   |x| post| m/dst | m/src |

These operations differ from "mov" because they operate in the device address space rather than regular memory space. "out" is device output and writes from m/src in memory address space to m/dst in device space, where "in" does the opposite: it reads from m/src device space into m/dst in memory address space.

See the "Devices" section above for more detail about devices.

Stack Operations

    |f e d c|b a 9 8|7 6 5 4|3 2 1 0|
    +-------+-------+-------+-------+
    |  tuck |x| post| %d/%e | depth |
    |  roll |x| post| %d/%e | depth |
    |  xch  |x|  $1 | m/dst | m/src |

    ~~~~~ :micro-code
    %d/%e       - what stack to affect (using m/dst encoding)
    depth       - what stack depth to tuck or roll down to

Tuck copies the top item to under the 0-based depth-most position in the selected stack. For example `1 tuck,` is like the Forth `tuck` operation, and `0 tuck` is like `dup`.

Roll removes the 0-based depth-most item entirely, and the elements above it fall down into the gap it would otherwise leave. The removed item is then pushed back on the top. `2 roll` is like the Forth `rot` operation, and `1 roll` is like `swap`.

Xch exchanges `m/dst` and `m/src`.
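The tuck and roll descriptions can be modelled on a Python list whose last element is the top of the stack. This is only a sketch with hypothetical function names; how a real implementation treats a depth beyond the stack height is not specified here, so the sketch simply raises an error.

```python
def tuck(stack, depth):
    """Copy the top item to just under the depth-most item (0-based)."""
    if depth >= len(stack):
        raise IndexError("depth reaches below the bottom of the stack")
    stack.insert(len(stack) - 1 - depth, stack[-1])


def roll(stack, depth):
    """Remove the depth-most item (0-based) and push it back on top."""
    if depth >= len(stack):
        raise IndexError("depth reaches below the bottom of the stack")
    stack.append(stack.pop(len(stack) - 1 - depth))
```

Starting from `[1, 2, 3]`, `tuck(stack, 0)` duplicates the top to give `[1, 2, 3, 3]`, and `roll(stack, 2)` on the original list rotates it to `[2, 3, 1]`.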

Flag Operations

    |f e d c|b a 9 8|7 6 5 4|3 2 1 0|
    +-------+-------+-------+-------+
    |  exit |  $d   | flags |  $d   |
    |  set  |x| post| flags | s/dst |
    |  jmp  |x|c|b|a| flags | m/src |
    |  call |x|c|b|a| flags | m/src |

    ~~~~~ :micro-code
    c      save %c  - before changing %c, push a copy of %c on %d stack
    b      save %b  - before changing %c, push a copy of %b on %e stack
    a      save %a  - before changing %c, push a copy of %a on %e stack
    flags  preconditions
            $0000   o    - only if overflow flag is 1
            $0010   l    - only if sign and overflow flags differ
            $0020   ns   - only if sign flag is 0
            $0030   nc   - only if carry flag is 0
            $0040   no   - only if overflow flag is 0
            $0050   s    - only if sign flag is 1
            $0060   le   - only if sign and overflow flags differ, or zero
                           flag is 1
            $0070   ne   - only if zero flag is 0, used for !=
            $0080   ge   - only if sign and overflow flags are equal, or zero
                           flag is 1
            $0090   g    - only if sign and overflow flags are equal
            $00a0   a    - only if carry and zero flags are 0
            $00b0   be   - only if carry or zero flag is 0
            $00c0   b    - only if carry flag is 0
            $00d0   ae   - only if carry flag is 0, or zero flag is 1
            $00e0   e    - only if zero flag is 1, used for == check
            $00f0   t    - no conditions, always push

For "set" operations, push $ffff to s/dst if flag conditions were met, $0000 otherwise.
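A small Python sketch of "set" follows, covering only a subset of the preconditions from the table above (the ones that test one or two flags directly). The names are hypothetical, and the flag bit positions are taken from the Flag Bits layout.

```python
S, O, C, Z = 0x8000, 0x4000, 0x2000, 0x1000  # bit positions within %f

# Subset of the precondition table; each entry tests a copy of %f.
CONDITIONS = {
    "o":  lambda f: bool(f & O),
    "l":  lambda f: bool(f & S) != bool(f & O),
    "ns": lambda f: not f & S,
    "nc": lambda f: not f & C,
    "no": lambda f: not f & O,
    "s":  lambda f: bool(f & S),
    "ne": lambda f: not f & Z,
    "e":  lambda f: bool(f & Z),
    "t":  lambda f: True,
}


def set_value(cc, flags):
    """Value that "set" writes to s/dst for condition cc and flags %f."""
    return 0xFFFF if CONDITIONS[cc](flags) else 0x0000
```

So with only the Z bit set, `set_value("e", Z)` yields $ffff and `set_value("ne", Z)` yields $0000.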

Like "mov" and "inv", if %d or %e are encoded into s/dst new values will be pushed to those stacks (where %t or %r will overwrite the existing top value) and for @c an immediate operand is "popped" from the instruction stream. The s/dst bits encode what register is being referenced, following the m/dst pattern but shifted right 4-bits:

    $0000  @c    $0001  @a    $0002  @b    $0003  @r
    $0004  @d    $0005  @e    $0006  @t    $0007  @n
    $0008  %t    $0009  %n    $000a  %a    $000b  %b
    $000c  %s    $000d  %d    $000e  %e    $000f  %r

For "jmp", if preconditions are not met, do nothing but continue execution from the next instruction. The `a` and `b` bits are always 0-valued, but if `c` is set, then a copy of %c is pushed to the data stack, and then %c is always set to `m/src` to continue with fetching the next instruction from that address.

For a "call" operation, if flag preconditions are not met, also do nothing more than continue program execution with the next instruction in memory. Otherwise, make a subroutine call by:

For "exit", if preconditions are not met, likewise continue execution from the next instruction. It's the caller's responsibility to save the contents of %a and %b before executing a "call" instruction, and to restore them after the called code exits. The `a` and `b` bits in the "call" instruction save the current values to %e as described in the previous paragraphs, so that when the associated "exit" operation is executed:

This allows for nested subroutine calls, where each can make their own use of %a and %b while preventing inner subroutines from corrupting the values used by outer subroutines.

As usual, immediate operands (when `m/dst` or `m/src` is @c) read a value directly from the next cell in the instruction stream.

For "jmp" and "call" operations, the `c` bit is used to jump over a block of instructions, while saving their start address to %d, which can then be used to call that block later.

Assembly Language Syntax

Comments

Comments between `;` and the following `\n` are discarded and ignored.

Every time ``` (three back-ticks) at the start of any line are encountered, commenting is toggled off and on, to encourage a more literate style. Everything from the start of an input file up-to the first ``` is a comment.
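The two comment rules can be combined in a short Python sketch. The function name is hypothetical, and the sketch makes one simplifying assumption: a `;` is always treated as the start of a comment, even if it appears inside a string or character literal.

```python
FENCE = "`" * 3  # three back-ticks, built this way to keep the sketch readable


def strip_comments(source):
    """Return the code lines of a literate source file, comments removed."""
    in_code = False  # everything before the first fence is a comment
    kept = []
    for line in source.splitlines():
        if line.startswith(FENCE):
            in_code = not in_code  # each fence line toggles commenting
            continue
        if in_code:
            code = line.split(";", 1)[0].rstrip()
            if code:
                kept.append(code)
    return kept
```

Prose outside the fenced regions never reaches the assembler in this model, which is what allows a source file to read as a document with code embedded in it.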

Labels

As instructions are assembled, the assembler keeps track of the address to be written next, for the purpose of making a symbol table of labelled addresses as each label is encountered in the input. If a symbol starts with a `:`, the remaining characters up to the next whitespace are a label for the address of the following instruction which can be referred to from assembly instructions elsewhere in the program. Forward references work as expected.

If a symbol starts with a `.`, the remaining characters up to the next whitespace are a local label for the address of the following opcode. Local labels are added to the symbol table with the current `:` label string prefixed. Within the scope of the same `:` label symbol, local `.` symbols can be referenced using just the local symbol name, but can also be referenced from anywhere else using the full prefixed name. Local labels defined before the first `:` label are illegal.

A label reference does not use the leading `:` or `.`, just the remainder of the characters in the label, and the assembler will replace that reference with the address that the label refers to.
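A possible shape for the symbol table is sketched below in Python. The function names and the addresses used in the usage note are hypothetical, the prefixed form joins the two names with a `.`, and the lookup order (local name first, then the name as written) is an assumption.

```python
def collect_labels(labelled_addresses):
    """Build a symbol table from (label token, address) pairs."""
    symbols = {}
    scope = None  # name of the most recent `:` label
    for token, address in labelled_addresses:
        if token.startswith(":"):
            scope = token[1:]
            symbols[scope] = address
        elif token.startswith("."):
            if scope is None:
                raise SyntaxError("local label before the first : label")
            symbols[scope + "." + token[1:]] = address
    return symbols


def resolve(symbols, scope, name):
    """Look up a reference made from inside the given `:` scope."""
    local = scope + "." + name
    if local in symbols:  # assumption: local names are tried first
        return symbols[local]
    return symbols[name]
```

For instance, after collecting `:errout` at $200 and `.while` at $203, a reference to `while` inside `errout` and a reference to `errout.while` from any other scope both resolve to $203.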

Whenever a new label is encountered, any "protect" flags are reset. See the "branching operations" subsection below for more details.

    :errout ; (z*--)
        0 cmp,
        end je,
        .while
            @d+ console.stderr out,
            errout jmp,                 ; recurse
        .end
        exit,

    :fatal ; (z*--)
        errout.while call,
        system.status $81 out,          ; bye

Within the scope of the `:errout` label, `while` refers to the address of the `.while` local label. From the scope of any other `:` label, the address of the earlier `while` is accessible by the prefixed reference `errout.while`.

Note that `.end` labels the `exit` operation, so it must be assembled as a separate instruction instead of setting the x-bit of the preceding instruction, otherwise it would end up labeling the same address as `:fatal`, and send the calling `je` instruction to the wrong address!

HERE

For the purposes of assembling, `HERE` is the address of the next empty cell that the assembler will write to -- be that opcodes or raw data.

Numbers

Numbers can be written in decimal, but must fit within one 16-bit cell. They are not typed so any number between -32768 and 65535 will fit.

Positive numbers can be written in hexadecimal with up to four digits following a `$` sign.

Printable characters can be written after a `'` single quote and will (in some future implementation) be translated to a UTF-16 codepoint during assembly (until then we encode as ASCII in the least significant byte of a cell).
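The three number notations could be parsed along the following lines. This Python sketch uses a hypothetical function name and assumes the interim ASCII encoding for character literals, so it does not attempt any UTF-16 handling.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_number(token):
    """Convert one numeric token to a 16-bit cell value."""
    if token.startswith("$"):
        digits = token[1:]
        if not 1 <= len(digits) <= 4 or any(ch not in HEX_DIGITS for ch in digits):
            raise ValueError("hex numbers take one to four hex digits")
        return int(digits, 16)
    if token.startswith("'"):
        if len(token) != 2:
            raise ValueError("expected one character after the quote")
        return ord(token[1])  # assumes an ASCII character for now
    value = int(token)
    if not -32768 <= value <= 65535:
        raise ValueError("number does not fit in a 16-bit cell")
    return value & 0xFFFF  # negative numbers wrap to two's complement
```

Because numbers are untyped, `-1` and `65535` both produce the same cell value, $ffff.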

When a number is encountered by the assembler, it is pushed on the operand stack where the next operation is expected to use it. To write a number directly to HERE, it should be followed by the `,` pseudo-operation, which can be written immediately following the number or separated by space.

    17 'k out,    ; output value of character `k` to device port 17
    $20 , 10,     ; assemble hex $20 followed by decimal 10 directly into memory, incrementing HERE each time

Strings

A string of non-space consecutive characters can be written after a `"` double quote. Spacing and unprintable characters will need to be written as numbers. The string will be assembled directly in place.

    "Hello, $20, "World '!, 10,

Strings cannot be passed as operands, so they are always written to consecutive cells in memory starting HERE. Take care to use the `,` pseudo-operation to write spacing and other non-string characters to memory to form the string you want, otherwise they may be accidentally pushed onto the operand stack.
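The interaction between strings, numbers and the `,` pseudo-operation can be modelled with a small Python sketch. The function names are hypothetical, only data tokens are handled (no instructions), and each character is stored as its code point in one cell.

```python
def number(text):
    """Parse a `$` hex or decimal number into a 16-bit value."""
    return int(text[1:], 16) if text.startswith("$") else int(text) & 0xFFFF


def assemble_data(source):
    """Assemble strings, numbers and `,` into (cells, operand stack)."""
    cells = []     # memory written so far, starting at HERE
    operands = []  # the assembler's operand stack
    for token in source.split():
        if token.startswith('"'):
            cells.extend(ord(ch) for ch in token[1:])  # one cell per character
            continue
        if token.startswith("'"):
            operands.append(ord(token[1]))
            token = token[2:]  # anything left should be a trailing `,`
        elif token != ",":
            body = token.rstrip(",")
            operands.append(number(body))
            token = token[len(body):]
        if token == ",":
            cells.append(operands.pop())  # write the operand to HERE
    return cells, operands
```

Feeding the example line above through `assemble_data` produces the cells for "Hello, World!" followed by a newline and leaves the operand stack empty; dropping one of the `,` operations would instead leave that value stranded on the operand stack.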

Instructions

Otherwise, all remaining input is expected to describe machine instructions using the following mnemonics. Following the Forth tradition, operations all end with a literal ',' to signify that a cell will be written with that operation and any relevant preceding modifiers, possibly followed by immediate operands in subsequent cells. The current write address is incremented accordingly, pointing to the next unused cell for any following instructions.

As already discussed, a single `,` pseudo-operation will pop a value from the operand stack and write it to HERE.

mov

Copy the value from m/src to m/dst. m/dst and m/src each take one of the valid values listed in ALU Operations. m/dst defaults to %d if not given explicitly. Remember that %d or %e for m/dst will push a new value onto that stack.

Additional "post" bits in the op-code are annotated with no more than one of:

    onein    - write +1 immediately after m/src, eg %t+1
    signin   - write +S immediately after m/src
    carryin  - write +C immediately after m/src
    direct   - assembler will set this automatically when a suitable m/src value is given
    postinc  - write + immediately after one or both operands, eg @d+
    postdec  - write - immediately after one or both operands
    onlyf    - write %f for m/dst

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst.

in & out

When written in hexadecimal, the last digit of devport is the port number between $0 and $f, and the rest must be a valid device number in the device address space. m/dst is the destination address (or register) in regular memory space where the value read from the devport is written, defaulting to %d if not given explicitly. Valid m/dst values are enumerated in the earlier section, ALU Operations.
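The devport layout amounts to a device number in the upper bits and a port number in the lowest hex digit, which a brief Python sketch makes concrete. The function names are hypothetical.

```python
def split_devport(devport):
    """Split a devport into (device number, port number)."""
    return devport >> 4, devport & 0xF


def make_devport(device, port):
    """Combine a device number and a port number ($0 to $f)."""
    return (device << 4) | port
```

Decimal 17 is $11, so `split_devport(17)` gives device $1 and port $1, which the Devices section lists as the console's write port.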

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst.

devport is the same as described above. m/src is the source value in regular memory address space that will be written to the given devport. Both operands are required.

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to devport.

inv

By default if m/dst is not given explicitly, this is assembled as if m/dst were the same as m/src. Invert all of the bits from the value in m/src, and write them to m/dst. Remember that %d or %e for m/dst will push a new value onto that stack.

Additional bits in the opcode are set according to the same annotations used in mov, above.

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst.

and, or & xor

Bitwise operations between the content of m/dst and m/src, and written back to m/dst if given, or %t otherwise. %d and %e are invalid values for m/dst, use %t and %r instead.

Additional "post" bits in the opcode are set according to the same annotations used in mov above.

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst.

shf

Shift m/dst left by a number of bits counted by the bits 4:7 nibble of m/src, and then shift right by a number of bits counted by bits 0:3 of m/src. Bits 8:15 of m/src are currently unused.
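The combined left-then-right shift is easiest to see in code. The Python sketch below uses a hypothetical function name, ignores the `in` line and the flags, and assumes the intermediate value is truncated to 16 bits before the right shift.

```python
def shf(dst, src):
    """Shift dst left by bits 4:7 of src, then right by bits 0:3 of src."""
    left = (src >> 4) & 0xF
    right = src & 0xF
    shifted = (dst << left) & 0xFFFF  # assumption: truncate to one cell first
    return shifted >> right
```

Under that assumption a single "shf" can isolate a bit field: `shf(0xABCD, 0x4C)` drops the top nibble with the left shift and then moves the next nibble down to give $000b.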

For arithmetic right shift, make sure the sign bit is set correctly from the contents of m/dst, and use the +S post bit. To shift left and set bit 0 of the result to 1, use the +C or +1 post bit.

    ~~~~~ arithmetic right shift
    0 cmp,           ; sets S from %t
    %t+S $04 shf,    ; arithmetic shift right by 4 bits
    ~~~~~ multiply a 32-bit number by 2 -- 16msb in %n, 16lsb in %t
    $10 shf,         ; left shift 1 bit, and set carry flag if result >= 65536
    %n+C $10 shf,    ; left shift 1 bit, then add 1 if carry flag is set

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst.

mul, div, mod, add & sub

Math operations between the content of m/dst and m/src, and written back to m/dst if given, or %t otherwise. %d and %e are invalid values for m/dst, use %t and %r instead.

Additional bits in the opcode are set according to the same annotations used in mov above.

Flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst.

setting flags

"cmp," and "test," are syntactic sugar for "sub," and "and,", with m/dst set to %f accordingly, so without writing a result back anywhere.

stack operations

Stack operations between the content of m/dst and m/src, and written back to m/dst if given, or %d otherwise. %d and %e are the only valid values for m/dst -- the alu bits are shared with mul and div operations where %d and %e are invalid operands.

Additional bits in the opcode are set according to the same annotations used in mov above.

For "tuck," and "roll," only, flag bits in %f are set as described in the Flag Bits section above, according to the value written to m/dst. %f is not changed by executing "xch,".

set

Set m/dst to $ffff when <cc> condition is met, $0000 otherwise. If m/dst is not given, push to %d. The list of conditions are given in the Flag Operations section above.

"set," is syntactic sugar for "st," (always set m/dst, without checking for specific flags).

There are no annotations for additional opcode bits for this operation.

%f is not changed by executing this instruction.

branching operations

Exit back to the caller when <cc> condition is met, otherwise continue executing from the next instruction.

"exit," is syntactic sugar for "et," (always exit back to the caller, without checking for specific flags). When this instruction is assembled, the x-bit of the preceding instruction is usually set: It does not assemble into an opcode with its own cell, unless it is labelled and might be the target address for a jump or call or the return address from a call, or when the preceding instruction already has a set x-bit.

%f is not changed by executing this instruction.

After fetching any immediate operand, save %c to %e -- where the next exit instruction will continue fetching instructions when the called instructions return here. Then, set %c to m/src so that execution continues from that address until the matching exit is reached. <cc> can be any of the conditions listed in the Flag Operations section above.

call is syntactic sugar for "ct" (always call, without checking for specific flags).

%f is not changed by executing this instruction.

Identical to c<cc> and call, except that %c is not saved to %e before overwriting with m/src.

%f is not changed by executing this instruction.

Push the address of the first instruction after `[` to %d, then set %c to the address of the instruction following `]`, causing all instructions between braces to be skipped. Execution continues by fetching the following instructions from the new address in %c.

The closing `]` is assembled as an exit operation, usually by setting the x-bit in the instruction immediately preceding the `]`. Like a normal exit operation, if the preceding instruction already has its x-bit set, in the case of nested consecutive closing `]` for example, or if the `]` immediately follows a label, then an individual exit instruction is assembled to the next available cell.

%f is not changed by executing this instruction.

Write a counted string directly to memory, following an unconditional `jmp` to the first instruction following `]"`. A counted string's first cell is the length of the string that immediately follows, and then that many characters written to consecutive cells. At runtime, `["` jumps past the counted string leaving the address of the initial length cell on %d.

    ; writes "Hello, World!\n" to console
    ["Hello, $20, "World! 10, ]"
    console.outlen @t out,
    console.write %d+1 out,

Notation to save a copy of reg to %e before executing any call or c<cc> instructions between here and the first following : label. Only %a and %b are valid for reg, though both can be provided with a single operation.

The assembler will set the appropriate register caller-save bits on all call instructions between a "protect" instruction and the next `:` labelled address. No actual instruction is assembled, nor is %f changed.

some one instruction forth operations

Remember, if "mov," is given only one operand then it represents m/src, and the default m/dst of %d is assumed.

    drop            %t %d mov,
    dup             0 tuck,
    LIT             @c mov,
    nip             %n %d mov,
    over            1 tuck,
    pop             %e mov,
    push            %e %d mov,
    rot             2 roll,
    swap            1 roll,
    tuck            1 tuck,
    @               @a mov,
    @+              @a+ mov,
    !               @a %d mov,
    !+              @a+ %d mov,

No support for these yet. Need a simple macro system for the assembler to make adding these easy and expandable.