This is an explanation of low-level design in XukutOS. We hope that the design works well enough that you can stick to higher-level stuff for day-to-day activities.
We also have a page explaining the design and implementation from a systems perspective.
Memory in XukutOS is managed by the kernel in co-operation with a garbage collector, so you can use memory without having to know exactly when you're done with it and have to give it back to the system. Understanding the memory system is useful if you want to write assembly code, implement custom data structures, or improve the performance of the garbage collector.
The smallest subdivision that the memory system cares about is a block. Blocks are allocated, moved around and deallocated as one unit. Pointers must point to a byte within a block to be understood by the memory system. Blocks can be marked in various ways to influence how the memory system deals with them: a permanent block cannot be moved around or deallocated; an immovable block cannot be moved around. A word in memory can have one of two roles: it can be a pointer to a valid block, or it can be flat data. If a block is not flat, it must have a description, explaining to the memory system where the pointers can be found (flat data blocks can, but do not need to, have a description).
XukutOS manual → Description objects: explains how a description works
Planned: The following attributes of a block are fixed at the moment it is created:
- whether it contains object pointers or flat data
- its description
- whether it is unmarked, permanent or immovable
- its size
- its alignment
The garbage collector can (will, and should!) at any time remove blocks that are unreachable, and re-order blocks that are not marked permanent or immovable. The reachable memory at a given moment in time is defined inductively as the smallest set of blocks such that:
- the blocks pointed to by object registers (for any running process) are reachable; in particular any value in the environment is reachable (unless it is optimized away!)
- blocks marked as permanent are reachable
- if a block is pointed to by an object pointer stored within a reachable block, it is reachable
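The inductive definition above amounts to a least fixed point: start from the roots and the permanent blocks, then keep following object pointers until nothing new is found. The following is a minimal Python model of that idea, not XukutOS code; the function name `reachable_blocks', the block names and the dictionary representation of the heap are all assumptions made for illustration.

```python
from collections import deque

def reachable_blocks(roots, permanent, pointers):
    """Model of the reachable set (hypothetical helper, not a XukutOS API).

    roots:     blocks pointed to by object registers of running processes
    permanent: blocks marked permanent
    pointers:  dict mapping each block to the blocks its object pointers target
    """
    reachable = set(roots) | set(permanent)
    queue = deque(reachable)
    while queue:
        block = queue.popleft()
        # Every block pointed to from a reachable block is itself reachable.
        for target in pointers.get(block, ()):
            if target not in reachable:
                reachable.add(target)
                queue.append(target)
    return reachable

# Toy heap: "d" points at "a", but nothing reachable points at "d".
heap = {"a": ["b"], "b": ["c"], "c": [], "d": ["a"], "p": ["e"], "e": []}
live = reachable_blocks(roots=["a"], permanent=["p"], pointers=heap)
print(sorted(live))
```

In this toy heap, block "d" is not in the result even though it points to a reachable block: reachability follows pointers forwards only.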
Bad things might happen if pointers to non-reachable blocks are dereferenced in your program, and to avoid this you need to ensure you work safely.
For complicated stuff like making and working with your own data structures, there are more specialised operations available than the generic data structure functions. The drawback is that you need to ensure you work safely with memory. Memory-safe code means that all memory operations performed by the code are valid. A memory operation is reading or writing an address in memory, or writing values into locations (memory, registers) storing object pointers.
XukutOS manual → Calling conventions: which registers contain object pointers
Reading from or writing to memory is valid when:
- the pointer being read from or written to points into memory that either is compile-time allocated, or has been allocated (without being deallocated) while the pointer is held in an object register
- the target (the register receiving the read data, or the memory being written to) is flat data, or the write to an object location is valid
- if the operation writes a new description for an object, the result is as if a new object with that description was written to in one operation
Writing to an object location is valid when:
- (the value is a pointer to the start of an existing block, which is the case iff the following hold:)
- the write is a multiple of a word long and (if in memory) word-aligned
- the value written is equal to the value read in the same operation from an object location (i.e. data is copied between object registers or between object registers and object memory)
- or the value written is equal to a value read in a previous operation from an object location, being stored in between as flat data, and during the whole time between reading from the original object location and writing to the new object location, the block being pointed to was either marked permanent, or the block was reachable and marked immovable.
If you're writing Swail code, being memory safe is easy:
- don't call memory-unsafe functions
- failing that, ensure you follow the safety precautions described in the documentation of that function
Typically, each function described in another page of this manual is memory-safe; most functions described in this page are not. If a function is memory-unsafe, it is clearly noted.
Being memory safe when turning flat data into a pointer is not a local check: there is a condition that must hold at each point in the time between conversion from pointer to flat data, and back. Think carefully about your code before you try to do this.
If you're doing a sequence of operations that would be memory safe if they were instant (e.g. updating a description, and data that the description depends on), the memory lock allows you to block the garbage collector from operating. Everything that happens between locking and unlocking the memory lock is considered the same operation for the memory system. Multiple processes can hold the memory lock at the same time; only the garbage collector will be blocked.
Notifies the garbage collector that this process is going to do operations that are incompatible with the garbage collector's activities, e.g. pointers being stored in "flat" data.
Can block, because it needs to wait until the garbage collector is done.
SAFETY: safe
Notifies the garbage collector that this process is done with operations that are incompatible with the garbage collector's activities, cancelling a previous call to `lock'.
A process is allowed to call `lock' multiple times; the memory lock remains held until the process has called `unlock' at least as many times. If `unlock' is called more times than `lock', the extra calls do nothing. So locking once, unlocking twice, then locking once results in one lock being held by the current process, so that one unlock is enough to release the memory lock.
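The counting rule can be modelled as a per-process counter that never drops below zero. This is a small Python sketch of that rule only; the class name `MemoryLockCount' and its methods are hypothetical, and the real `lock' additionally waits for the garbage collector, which is not modelled here.

```python
class MemoryLockCount:
    """Per-process count of outstanding `lock' calls (illustrative model)."""

    def __init__(self):
        self.held = 0

    def lock(self):
        self.held += 1

    def unlock(self):
        # Extra unlocks do nothing: the count is never negative.
        if self.held > 0:
            self.held -= 1

    def is_held(self):
        return self.held > 0

# The scenario from the text: lock once, unlock twice, lock once.
process = MemoryLockCount()
process.lock()
process.unlock()
process.unlock()   # extra call, ignored
process.lock()
held_after_scenario = process.held
process.unlock()   # a single unlock now releases the lock
released = not process.is_held()
```

Because the extra unlock is ignored rather than remembered, the later `lock' is not cancelled in advance.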
If you want to build your own data representation formats, these functions will help you do so, mostly safely. Always take the safety notes into account.
Allocate a new memory block of at least the given size, fill it with the given word up to the indicated `size' (cut off at the end if the size is not a whole multiple of the word size) and set its description to `description'.
The value of `fill-with' will remain available until this function returns. A good choice for `fill-with' in almost all cases is to use the default, nil.
SAFETY: this function is memory-safe if filling an object with description `description' with the repeated word `fill-with' is memory-safe. Make sure there is no cut-off part of `fill-with' being interpreted as a pointer. The size of the object indicated by the description takes priority over the size as indicated by `size', after the description is written. (Planned: this last concern should be irrelevant soon.)
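To see what the cut-off means in practice, here is a Python sketch of the fill pattern. It assumes an 8-byte word and assumes that the cut-off keeps the leading bytes of the last copy of the word; neither detail is stated above, and `fill_pattern' is a hypothetical helper, not a XukutOS function.

```python
WORD_SIZE = 8  # assumption: bytes per word

def fill_pattern(fill_with: bytes, size: int) -> bytes:
    """Repeat one word up to `size' bytes, cutting the last copy short."""
    if len(fill_with) != WORD_SIZE:
        raise ValueError("fill_with must be exactly one word")
    if size < 0:
        raise ValueError("size must not be negative")
    whole_words, leftover = divmod(size, WORD_SIZE)
    # The trailing partial word is what must never be read as a pointer.
    return fill_with * whole_words + fill_with[:leftover]

pattern = fill_pattern(b"ABCDEFGH", 20)
print(pattern)
```

With a size of 20, the last four bytes are a fragment of the word, which is why the safety note warns against a description that treats that fragment as an object pointer.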
Write block[distance : distance + size(values)[ <- values.
The values are given as a span of bytes.
SAFETY: You are responsible for ensuring distance + size(values) is less than the end of the block. You are responsible for checking that this validly writes to any object locations within the block.
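The range check the caller is responsible for can be sketched in Python as follows. The half-open notation suggests that the range may stop exactly at the block's end, and the sketch assumes that reading; under a stricter reading, replace `>' with `>='. The name `write_span' and the use of a bytearray as a block are illustrative assumptions.

```python
def write_span(block: bytearray, distance: int, values: bytes) -> None:
    """Write values into block[distance : distance + len(values)), checked."""
    end = distance + len(values)
    if distance < 0 or end > len(block):
        raise IndexError("write would run past the block")
    block[distance:end] = values

block = bytearray(8)
write_span(block, 4, b"abcd")      # fits: ends exactly at the block's end
try:
    write_span(block, 6, b"abcd")  # would need bytes 6..9 of an 8-byte block
    overflow_rejected = False
except IndexError:
    overflow_rejected = True
```

Without the check, the Python slice assignment would silently grow the bytearray. A real block has no such slack, which is why the check is the caller's job.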
Write block[words(distance) : words(distance + size(values))[ <- values.
The values are given as a tuple.
SAFETY: You are responsible for ensuring words(distance + size(values)) is less than the end of the block.
Return a span containing block[distance : distance + size[.
SAFETY: You are responsible for ensuring distance + size is less than the end of the block.
Return a tuple containing block[words(distance) : words(distance + size)[.
SAFETY: You are responsible for ensuring words(distance + size) is less than the end of the block. You are responsible for checking that this validly writes to object locations in the tuple.
The system responsible for giving out new blocks when a process requires them is called the allocator. Each process, including the kernel, has at least one allocator. The allocator controls multiple memory regions, which are contiguous sections of memory, pre-subdivided into blocks. When a process needs a new block, it notifies the allocator. The allocator determines whether the block can be taken from an existing region, or asks the kernel for a new region which it can start handing out blocks from. As the garbage collector does its work, every object in a region may eventually be cleared away (through recycling or relocation), after which the region is returned to the kernel for handing out to the next allocator that needs it.
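The allocation path described above can be sketched as a toy Python model. All names here (`Region', `Allocator', `request_region') are hypothetical, the callback merely stands in for asking the kernel, and garbage collection and returning regions are left out.

```python
class Region:
    """A contiguous area pre-subdivided into equally sized blocks (model)."""

    def __init__(self, block_size, block_count):
        self.block_size = block_size
        self.free = list(range(block_count))  # indices of unallocated blocks

    def take_block(self):
        # Returns a block index, or None when the region is full.
        return self.free.pop() if self.free else None

class Allocator:
    def __init__(self, request_region):
        self.request_region = request_region  # stand-in for the kernel
        self.regions = []

    def allocate(self, block_size):
        # First try the regions this allocator already controls.
        for region in self.regions:
            if region.block_size == block_size:
                index = region.take_block()
                if index is not None:
                    return region, index
        # Otherwise ask for a new region and hand out a block from it.
        region = self.request_region(block_size)
        self.regions.append(region)
        return region, region.take_block()

allocator = Allocator(lambda size: Region(size, block_count=2))
first = allocator.allocate(16)
second = allocator.allocate(16)
third = allocator.allocate(16)   # the first region is full by now
```

With two blocks per region, the third allocation forces a second region to be requested.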
An allocator and its regions are extremely process-unsafe, meaning they cannot be used simultaneously by different processes. The blocks allocated by the allocator can be safely shared between processes.
This variable contains the allocator used by default in this process.
Rebinding this variable causes all allocations that do not explicitly invoke another allocator to use the new value as their allocator.
You can rebind this variable safely, as long as the allocator is not in use by another process.
Ask the kernel for a new allocator.
XukutOS manual → System calls: lists the system calls, including `new-allocator'.
You can rebind this definition locally, for example for testing. You should probably also re-bind `region:make' at the same time.
Assign a region to an allocator, allowing the allocator to give out blocks from that region.
A region can be assigned to exactly one allocator.
Unassign a region from an allocator, making the allocator stop giving out blocks from that region.
This does not cause any blocks in the region to be freed, although the garbage collector may decide to do so. The garbage collector can unassign regions from an allocator if it decides to relocate or recycle the objects in that region.
Returns a new block with the given description.
The block will be filled with `nil' (or `0' if it is flat).
SAFETY: You are responsible for checking that the description can apply to this new block.
Determine the region this block is located in.
This is used to determine the attributes of the block: e.g. the description, whether it is flat, ...
Ask the kernel for a new, empty memory region.
`block-size' is rounded up to the smallest allocatable unit, something like a few words. The region will hold blocks that measure, and are aligned to, that many bytes (thus, the region will be at least that size itself). Other block attributes can be set before assigning it to an allocator, starting out as:
- `flat? = #tt': whether it contains object pointers or flat data
- `descriptions = nil': stores the description of each block in the region
- `mark = nil': whether blocks in this region are permanent, immovable or unmarked
Regions are always marked permanent.
You can rebind this definition locally, for example for testing purposes. You should probably also re-bind `new-allocator' at the same time.
Returns the size in bytes of the blocks within this region.
Returns the total number of blocks in this region, both allocated and unallocated.
If `flat?' is true, the blocks in this region will be flat data; otherwise they will contain object pointers.
SAFETY: This behaves like creating all allocated blocks in this region again with the new attribute and copying the data in this region to those "new" blocks, in one operation. You are responsible for checking that this only validly writes flat data to object pointers.
Returns whether all blocks in this region will be flat data.
Change the description of blocks in this region, where `descriptions' is interpreted as follows:
- if `descriptions' is `nil', the blocks in this region have no descriptions (and must therefore be flat data).
- if `descriptions' is a tuple, it must be marked permanent and have length exactly `(block-count region)'. The `n'th block in this region has description `descriptions[n]'.
- if `descriptions' is a description, each block in this region has that description.
SAFETY: This behaves like creating all allocated blocks in this region again with the new attribute and copying the data in this region to those "new" blocks, in one operation. You are responsible for checking that the description can apply to all blocks allocated and allocatable from the region.
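The three cases for `descriptions' can be summarised as a lookup that finds the description of the `n'th block. This is a Python sketch of the rule as stated above; `Description' and `description_of' are hypothetical stand-ins, and Python's `None' plays the role of `nil'.

```python
class Description:
    """Placeholder for a description object (illustrative only)."""

    def __init__(self, name):
        self.name = name

def description_of(descriptions, n, block_count):
    """Return the description of the n-th block, or None if it has none."""
    if descriptions is None:
        # No descriptions at all: the blocks must be flat data.
        return None
    if isinstance(descriptions, tuple):
        # One entry per block; the length must match the block count exactly.
        if len(descriptions) != block_count:
            raise ValueError("tuple length must equal the block count")
        return descriptions[n]
    # A single description shared by every block in the region.
    return descriptions

pair = Description("pair")
text = Description("text")
shared = description_of(pair, 3, block_count=4)
per_block = description_of((pair, text, pair, text), 1, block_count=4)
none_at_all = description_of(None, 0, block_count=4)
```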
Returns the descriptions of all blocks in this region, with meanings as described in `set-descriptions'.
Each block in the region will be marked either unmarked (if `mark' is nil), permanent (if `mark' is `xukut:memory:permanent') or immobile (if `mark' is `xukut:memory:immobile').
SAFETY: This behaves like creating all allocated blocks in this region again with the new attribute and copying the data in this region to those "new" blocks, in one operation. Changing the mark of permanent or immobile blocks can cause safety issues later on.
Returns a new block in this region with the given description and size.
If the region does not support the description (e.g. no blocks in the region can have a description), the result is `ff'.
The block will be filled with `nil' (or `0' if it is flat).
SAFETY: You are responsible for checking that the description can apply to this new block.
Returns a new block in this region with the given description and size.
If the region does not support the description (e.g. no blocks in the region can have a description), an error occurs.
The block will be filled with `nil' (or `0' if it is flat).
SAFETY: You are responsible for checking that the description can apply to this new block.
Tries to initiate a garbage collection.
If garbage collection is started, this blocks until garbage collection is done (including waiting for the memory lock to be released by other processes). If the current process holds a memory lock, this does nothing (because blocking would cause the process to be blocked indefinitely, while holding the memory lock).
Note that the garbage collector is already scheduled to run automatically, so you probably don't *need* to call this function. A reasonable moment to call it might be when your process has finished a long computation and is now idling.
SAFETY: safe.
Return the region that the block the pointer points to is contained within.
Return the block that the pointer points to. (That is, return a pointer to the start of a block.)
Return the next index to check for pointers and the next pointer value in the block starting at the given index.
If `next-index = 0' there are no more pointers (and only one value is returned).
The values of `index' that are accepted by this function are either `0' or a value returned by `get-pointer' on this block.
This is implemented by calling a method in the block's description.
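The index protocol reads like an iterator: start at `0', pass each returned index back in, and stop when the returned index is `0'. Below is a Python sketch of a caller walking all pointers in a block. The mock `get_pointer' and its list-of-slots block are assumptions made for illustration; in XukutOS the real work is done by a method in the block's description.

```python
def get_pointer(block, index):
    """Mock of the protocol: returns (next_index, pointer), or (0,) when done.

    The block is modelled as a list of slots where strings are pointers and
    integers are flat data (an assumption for this sketch).
    """
    for position in range(index, len(block)):
        if isinstance(block[position], str):
            # position + 1 is never 0, so 0 can signal "no more pointers".
            return (position + 1, block[position])
    return (0,)

def all_pointers(block):
    found = []
    index = 0
    while True:
        result = get_pointer(block, index)
        if result[0] == 0:
            return found
        index, pointer = result
        found.append(pointer)

pointers_found = all_pointers([7, "x", 0, "y"])
print(pointers_found)
```

The caller only ever passes `0' or an index that `get_pointer' returned for the same block, matching the restriction above.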
Update the `index'th pointer in the block, corresponding to the index as passed to `block:get-pointer'.
This is implemented by calling a method in the block's description.
If the index is out of range, nothing happens.
Any questions? Contact me:
By email at vierkantor@vierkantor.com