This is an explanation of low-level design in XukutOS. We hope that the design works well enough that you can stick to higher-level stuff for day-to-day activities.
The calling conventions described in this manual page determine how functions use the registers available in aarch64. If XukutOS gets ported to other platforms, the calling conventions will be described accordingly.
The aarch64 architecture uses 5-bit register numbers, so an instruction can name 32 registers. Numbers 0 to 30 refer to the general-purpose registers `x0' up to `x30'. Register number 31 is either the zero register or the stack pointer, depending on the instruction. `x30' is also architecturally special: it is referred to as `lr' and stores the return address for the current function. The other registers do not have an architecturally assigned role.
XukutOS partitions registers into 3 roles: objects, numbers and special pointers. The partitioning of registers into these categories does not have to be fixed for all time, but it is fixed for now. Using registers according to their roles ensures the garbage collector knows which objects the code is dealing with. The object registers must contain a valid pointer to a memory block. The number registers contain fixnums (or pointers that can validly be converted from flat data to object pointers). The special pointers are pointers that do not necessarily refer to the start of a memory block, such as the stack pointers or the link register.
The two most important special pointers are the stack pointers `osp' and `nsp'. We have two stacks, one for storing objects and the other for storing numbers. Technically your process could use them differently (they are just flat registers as far as the garbage collector is concerned). However, all built-in code assumes these are practically unbounded, downwards-growing stacks. Also, the `oarcc' calling convention expects the `osp' to have some specific contents.
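To illustrate why the two stacks are kept apart, here is a minimal Python model. It is not XukutOS code: the class and method names are made up for this sketch, and it assumes a collector that treats every live word on the object stack as an object pointer and never follows anything on the number stack.

```python
class TwoStacks:
    """Hypothetical model of the object stack and the number stack."""

    def __init__(self, size):
        self.objects = [None] * size   # words reachable through osp
        self.numbers = [0] * size      # words reachable through nsp
        self.osp = size                # both stacks start out empty and
        self.nsp = size                # grow towards lower indices

    def push_object(self, obj):
        self.osp -= 1
        self.objects[self.osp] = obj

    def pop_object(self):
        obj = self.objects[self.osp]
        self.osp += 1
        return obj

    def push_number(self, num):
        self.nsp -= 1
        self.numbers[self.nsp] = num

    def pop_number(self):
        num = self.numbers[self.nsp]
        self.nsp += 1
        return num

    def gc_roots(self):
        # Assumption: only the live part of the object stack is scanned;
        # flat data on the number stack is never followed.
        return self.objects[self.osp:]
```

Because numbers never end up on the object stack, the collector in this model does not need any tagging to tell pointers from flat data.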
Planned: maybe we'll use frame pointers instead. Then there would be two frame pointers: `ofp' and `nfp'. The system does not care very much about the way `ofp' and `nfp' are actually used (they are just object registers as far as the garbage collector is concerned). Still, we follow a couple of guidelines to improve understandability. `ofp' is the object frame pointer, and points to the current call frame, which is an object filled with pointers. Object registers are stored in object call frames. `nfp' is the number frame pointer, and points to the current number frame. Number registers are stored in number frames. The previous call frame is stored in the first word of `ofp'. The previous number frame is stored in the second word of `ofp', if applicable; otherwise the second word is just any object pointer. The link register is stored in the first word of `nfp', if applicable; otherwise the first word is just any number. To fit the design of aarch64 better, the kernel instead has a "traditional" stack in register `sp = sp_el1' which is pushed and popped as a stack normally would be.
We use nicknames for registers that indicate their roles:
- `xzr': non-GP, special pointer to `nil' (immutable)
- `x0': `o0', `this': object
- `x1': `osp': "special" stack pointer
- `x2': `o1', `env': object
- `x3': `o2', `args': object
- `x4'-`x6': `o3'-`o5': object
- `x7': `o6', `otmp': object
- `x8'-`x15': `o7'-`o14', `oar0'-`oar7': object
- `x16'-`x24': `n0'-`n8': number
- `x25': `n9', `sys0': number
- `x26': `n10', `sys1': number
- `x27': `n11', `nar': number
- `x28': `n12', `tmp0': number
- `x29': `n13', `tmp1': number
- `x30': `n14', `lr': "special" pointer to the return address
- `sp': `nsp': "special" stack pointer
- `pc': non-GP, special pointer to the current instruction (not easily accessible)
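The nickname list above can also be written down as a lookup table. The following Python sketch is only a restatement of that list for illustration; the names `ROLES', `_add' and `lookup' are hypothetical and do not come from the XukutOS sources. The non-GP entries `xzr' and `pc' are left out.

```python
# Maps an architectural register name to (nicknames, role).
ROLES = {}

def _add(reg, names, role):
    ROLES[reg] = (tuple(names), role)

_add("x0", ["o0", "this"], "object")
_add("x1", ["osp"], "special")
_add("x2", ["o1", "env"], "object")
_add("x3", ["o2", "args"], "object")
for i in range(4, 7):            # x4-x6 are o3-o5
    _add(f"x{i}", [f"o{i - 1}"], "object")
_add("x7", ["o6", "otmp"], "object")
for i in range(8, 16):           # x8-x15 are o7-o14, also oar0-oar7
    _add(f"x{i}", [f"o{i - 1}", f"oar{i - 8}"], "object")
for i in range(16, 25):          # x16-x24 are n0-n8
    _add(f"x{i}", [f"n{i - 16}"], "number")
_add("x25", ["n9", "sys0"], "number")
_add("x26", ["n10", "sys1"], "number")
_add("x27", ["n11", "nar"], "number")
_add("x28", ["n12", "tmp0"], "number")
_add("x29", ["n13", "tmp1"], "number")
_add("x30", ["n14", "lr"], "special")
_add("sp", ["nsp"], "special")

def lookup(nickname):
    """Return the architectural register that carries a nickname."""
    for reg, (names, _role) in ROLES.items():
        if nickname in names:
            return reg
    raise KeyError(nickname)
```

For example, `lookup("oar0")' gives `x8' and `lookup("tmp1")' gives `x29'.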
At all times, from calling to returning:
- `o0'-`o14' (planned: and `ofp' and `nfp') are valid pointers to memory blocks
- `env', `sys0' and `sys1' are not modified unless specified otherwise
- `n0'-`n15' are not converted to object pointers unless doing so is valid.
- `osp' and `nsp' are valid stack pointers
- pushes to `osp' are only from the `o'-registers
- pushes to `nsp' are only from the `n'-registers
At the moment of calling:
- `env' is a valid environment structure
- `lr' is the return address
- `pc' is the first instruction of the function to call
At the moment of returning:
- `pc' is the value in `lr'
- `o6', `n12' and `n13' may be modified
- other registers as determined by the function
- *every* other register is preserved by default (including the stack pointers)
Motivation: this is a very generic calling convention that also applies the conventions used in compiled Swail code. This ensures calling the function is memory-safe, and the called function can run Swail code itself without confusion.
Versions of this calling convention are used throughout the kernel and in optimized user code, although OARCC tends to work better for object-/recursion-heavy code. The absence of argument/return register assignments makes it harder to make generic calls or do tail call optimization, but the improved efficiency and ease of calling from Assembly make up for it.
At all times, from calling to returning:
- `o0'-`o14' and `sp_el1' are valid pointers to memory blocks
- `env', `sys0' and `sys1' are not modified unless specified otherwise
- `n0'-`n15' are not converted to object pointers unless doing so is valid.
At the moment of calling:
- `env' is a valid environment structure
- `lr' is the return address
- `pc' is the first instruction of the function to call
At the moment of returning:
- `pc' is the value in `lr'
- `o6', `n12' and `n13' may be modified
- other registers as determined by the function
- *every* other register is preserved by default (including the stack pointer)
Motivation: this is ASMCC with a different stack register, which works well within the kernel. It is supposed to be easy to program against in Assembly code. Functions following this convention are suffixed with `__asmcc_k' and a description of the registers used.
At all times, from calling to returning:
- `o0'-`o14' are valid pointers to memory blocks
- `env', `sys0' and `sys1' are not modified unless specified otherwise
- `n0'-`n15' are not converted to object pointers unless doing so is valid.
- `osp' and `nsp' are valid stack pointers
- pushes to `osp' are only from the `o'-registers
- pushes to `nsp' are only from the `n'-registers
At the moment of calling:
- `env' is a valid environment structure
- `lr' is the return address
- `pc' is the first instruction of the function to call
- arguments are objects, first in `oar0' - `oar7', then on the object stack: [osp + 0W] is arg 8, [osp + 1W] is arg 9, ...
- the number of arguments is stored in `nar_32' (the high bits of this register are reserved; they might e.g. be used in the future to indicate that some arguments are passed in number or float registers instead)
- `oar' registers past `nar_32' are valid object pointers, but should not be read (these can be used for temporary values or further calls)
- if arguments are passed on the stack, these values may be modified and should be popped before returning (but see also the rules for returning)
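The argument-passing rules above can be modelled in a few lines of Python. This is a sketch under assumptions, not the actual calling sequence: the function name `place_arguments' is hypothetical, the object stack is modelled as a dict from word address to object, and `osp' is counted in words rather than bytes.

```python
NUM_ARG_REGS = 8  # oar0 - oar7

def place_arguments(args, osp, stack):
    """Model of passing `args`: returns (oar, nar_32, new_osp).

    The first eight arguments go into the oar registers, the rest go
    onto the object stack so that [osp + 0W] is arg 8, [osp + 1W] is
    arg 9, and so on.
    """
    oar = list(args[:NUM_ARG_REGS])
    extra = args[NUM_ARG_REGS:]
    # The stack grows downwards, so reserve len(extra) words below osp.
    new_osp = osp - len(extra)
    for i, value in enumerate(extra):
        stack[new_osp + i] = value
    return oar, len(args), new_osp
```

With ten arguments and `osp' at word 100, the model leaves `osp' at word 98, with arg 8 in word 98 and arg 9 in word 99. The model does not fill the unused `oar' registers; in the real convention they must still hold valid object pointers.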
At the moment of returning:
- `pc' is the value of `lr' at the moment of calling
- the returned values are objects, first in `oar0' - `oar7', then going upwards on the stack: [osp] is return value 8, [osp + 1W] is return value 9, ...
- the number of return values is stored in `nar_32' (the high bits of this register are reserved; they might e.g. be used in the future to indicate that some return values are passed in number or float registers instead)
- `oar' registers past `nar_32' are valid object pointers, but should not be read (these can be used for temporary values or further calls)
- `nsp' is preserved
- `osp' is preserved, except that any arguments have been popped and return values may have been pushed. More precisely, let `osp_pre' and `nar_32_pre' be the values of those registers at the moment of calling and `osp_post' and `nar_32_post' at the moment of returning. Define `osp_mid = if nar_32_pre < 8 then osp_pre else osp_pre + (nar_32_pre - 8)W', then `osp_post = if nar_32_post < 8 then osp_mid else osp_mid - (nar_32_post - 8)W'.
- `lr', `o6', `tmp0' and `tmp1' are not preserved (note that tmp0, tmp1 and `o6' together provide enough scratch space to call the `alloc_imm' macro for obtaining temporary storage space)
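The rule for `osp' at the moment of returning is easiest to check with a few numbers. The Python function below is a direct transcription of the `osp_mid' and `osp_post' definitions, with addresses counted in words (so `W' is 1); the function name is made up for this sketch.

```python
def osp_after_return(osp_pre, nar_32_pre, nar_32_post):
    """Compute osp_post (in words) from the values at call time."""
    # Pop the arguments that were passed on the stack, if any.
    if nar_32_pre < 8:
        osp_mid = osp_pre
    else:
        osp_mid = osp_pre + (nar_32_pre - 8)
    # Push the return values that do not fit in oar0 - oar7, if any.
    if nar_32_post < 8:
        return osp_mid
    return osp_mid - (nar_32_post - 8)
```

For example, a call with 10 arguments and 1 return value that starts with `osp' at word 98 returns with `osp' at word 100, and a call with 2 arguments and 10 return values that starts at word 100 returns at word 98.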
Summary:
14 preserved registers: `o0', `o2' - `o5', `n0' - `n8'
13 overwritten registers: `o6' - `o14', `n11' - `n14'
4 special registers: `env', `osp', `sys0', `sys1'
3 non-GP registers: `nsp', `pc', `xzr'
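As a sanity check on the summary, the Python sketch below verifies that the four groups cover every register nickname exactly once. It counts `o6' among the overwritten registers, since the rules for returning say it is not preserved; the alias table restates the nickname list, and the function name is hypothetical.

```python
preserved = ["o0"] + [f"o{i}" for i in range(2, 6)] + [f"n{i}" for i in range(9)]
overwritten = [f"o{i}" for i in range(6, 15)] + [f"n{i}" for i in range(11, 15)]
special = ["env", "osp", "sys0", "sys1"]
non_gp = ["nsp", "pc", "xzr"]

# Nicknames in the summary that stand for numbered registers.
ALIASES = {"env": "o1", "sys0": "n9", "sys1": "n10"}

def check_partition():
    """True if every register appears in exactly one group."""
    names = [ALIASES.get(n, n)
             for n in preserved + overwritten + special + non_gp]
    expected = ({f"o{i}" for i in range(15)}
                | {f"n{i}" for i in range(15)}
                | {"osp", "nsp", "pc", "xzr"})
    return len(names) == len(set(names)) and set(names) == expected
```

The group sizes come out as 14, 13, 4 and 3, for a total of 34 names: `x0' - `x30' plus `sp', `pc' and `xzr'.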
Motivation: this improves on Swail-compatible ASMCC by permitting better tail recursion: reserving registers for passing arguments means we do not need to save those registers before a tail call.
This calling convention should be used in optimized user code, unless it does so much number crunching that (Swail-compatible) ASMCC is preferred. (TODO: can we adjust the calling convention so that number registers also work?)
At all times, from calling to returning:
- `o0'-`o14', `ofp' and `nfp' are valid pointers to memory blocks
- `env', `sys0' and `sys1' are not modified unless specified otherwise
- `n0'-`n15' are not converted to object pointers unless doing so is valid.
At the moment of calling:
- `env' is a valid environment structure
- `lr' is the return address
- `pc' is the first instruction of the function to call
- arguments are objects, first in `oar0' - `oar7', then in the call frame: [ofp + 2W] is arg 8, [ofp + 3W] is arg 9, ...
- the number of arguments is stored in `nar_32' (the high bits of this register are reserved; they might e.g. be used in the future to indicate that some arguments are passed in number or float registers instead)
- `oar' registers past `nar_32' are valid object pointers, but should not be read (these can be used for temporary values or further calls)
- if arguments are passed in the call frame, this frame may be modified and should be popped before returning (but see also the rules for returning)
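In the frame-pointer variant, the location of an argument follows from its index alone. The Python sketch below models this; the function name is hypothetical, and the two header words are the previous call frame and the previous number frame, as described for `ofp' earlier in this page.

```python
NUM_ARG_REGS = 8        # oar0 - oar7
FRAME_HEADER_WORDS = 2  # word 0: previous call frame,
                        # word 1: previous number frame (or any object)

def argument_location(index):
    """Return ("reg", name) or ("frame", word offset from ofp)."""
    if index < 0:
        raise ValueError("argument index must be non-negative")
    if index < NUM_ARG_REGS:
        return ("reg", f"oar{index}")
    # Arguments 8, 9, ... follow the two header words of the frame.
    return ("frame", FRAME_HEADER_WORDS + (index - NUM_ARG_REGS))
```

So arg 7 is found in `oar7', arg 8 at [ofp + 2W] and arg 9 at [ofp + 3W].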
At the moment of returning:
- `pc' is the value of `lr' at the moment of calling
- the returned values are objects, first in `oar0' - `oar7', then in the call frame: [ofp + 2W] is return value 8, [ofp + 3W] is return value 9, ...
- the number of return values is stored in `nar_32' (the high bits of this register are reserved; they might e.g. be used in the future to indicate that some return values are passed in number or float registers instead)
- `oar' registers past `nar_32' are valid object pointers, but should not be read (these can be used for temporary values or further calls)
- `nfp' is preserved
- `ofp' is preserved, except that a frame for arguments may have been popped and a frame for return values may have been pushed. More precisely, let `ofp_pre' and `nar_32_pre' be the values of those registers at the moment of calling and `ofp_post' and `nar_32_post' at the moment of returning. Define `ofp_mid = if nar_32_pre < 8 then ofp_pre else ofp_pre[0W]', then `ofp_post = if nar_32_post < 8 then ofp_mid else new_stack_frame(size=nar_32_post - 8 + 2, prev_frame=ofp_mid, values=[whatever, ret 8, ret 9, ...])'. If a frame for arguments was passed, the contents of the frame are not preserved, and it may even be used as the frame for return values.
- `lr', `o6', `tmp0' and `tmp1' are not preserved (note that tmp0, tmp1 and `o6' together provide enough scratch space to call the `alloc_imm' macro for obtaining temporary storage space)
Summary:
14 preserved registers: `o0', `o2' - `o5', `n0' - `n8'
13 overwritten registers: `o6' - `o14', `n11' - `n14'
4 special registers: `env', `ofp', `sys0', `sys1'
3 non-GP registers: `nfp', `pc', `xzr'
Motivation: this improves on Swail-compatible ASMCC by permitting better tail recursion: reserving registers for passing arguments means we do not need to save those registers before a tail call.
This calling convention should be used in optimized user code, unless it does so much number crunching that (Swail-compatible) ASMCC is preferred. (TODO: can we adjust the calling convention so that number registers also work?)
This calling convention is not so efficient, but allows for very generic calling code.
At all times, from calling to returning:
- `o0'-`o14' are valid pointers to memory blocks
- `env', `sys0' and `sys1' are not modified unless specified otherwise
- `n0'-`n15' are not converted to object pointers unless doing so is valid.
- pushes to `osp' are only from the `o'-registers
- pushes to `nsp' are only from the `n'-registers
At the moment of calling:
- `val' is the object being applied
- `env' is a valid environment structure
- `args' is a list of function arguments
- `lr' is the return address
- `pc' is the first instruction of the application function of the object being applied
- `osp' is a valid stack pointer to object data (words in the interval [osp - N, osp[ are scratch for a large value of N)
- `nsp' is a valid stack pointer to flat data (words in the interval [nsp - N, nsp[ are scratch for a large value of N)
At the moment of returning:
- `pc' is the value in `lr'
- `o6', `n12' and `n13' may be modified
- `val' is the result of calling the function
- *every* other register is preserved
Motivation: storing the arguments in a list allows for easy varargs support without weird stack manipulations.
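To show why a list makes varargs easy, here is a small Python model. It assumes the argument list is a chain of pairs ending in `nil'; that representation and all names in the sketch are assumptions made for illustration, not a description of how Swail lists are laid out.

```python
NIL = None  # stands in for `nil'

def cons(head, tail):
    """Build one list cell."""
    return (head, tail)

def list_from(values):
    """Build an argument list from a Python sequence."""
    result = NIL
    for value in reversed(values):
        result = cons(value, result)
    return result

def apply_sum(args):
    """A varargs callee: walks `args` without touching any stack."""
    total = 0
    while args is not NIL:
        head, args = args
        total += head
    return total
```

The callee only needs the single `args' register to reach any number of arguments, so the caller never has to adjust a stack pointer for the call.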
This calling convention stands to Swail-ASM as OARCC stands to ASMCC.
Everything works like the OARCC convention, except an extra argument is added at the end: the `nar_32-1'th argument is the object being called. This argument is typically popped before the actual code gets executed.
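A short Python sketch of the extra argument: the caller appends the object being called after the ordinary arguments, and the callee takes it off again before running its body. The function names are hypothetical, and the sketch assumes that popping the object also decrements the argument count.

```python
def add_callee(args, callee):
    """Caller side: returns (all_args, nar_32) with the callee last."""
    all_args = list(args) + [callee]
    return all_args, len(all_args)

def pop_callee(all_args, nar_32):
    """Callee side: returns (callee, remaining args, new nar_32)."""
    callee = all_args[nar_32 - 1]      # the (nar_32 - 1)th argument
    return callee, all_args[:nar_32 - 1], nar_32 - 1
```

Calling an object `f' with three arguments therefore passes `nar_32 = 4', with `f' in the position of argument 3 (counting from 0).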
Any questions? Contact me:
By email at vierkantor@vierkantor.com