Now that I have a working 68hc11 board, and that other boards are possible, I am interested in having a usable operating system on it. And by that I mean having the ability to load and run programs recorded on mass storage devices.
As a long-time Linux user, I now want to get a Linux/UNIX experience on it, with a shell, a usual file hierarchy, and such.
At the base of every UNIX-based system, of course, the most important component of the OS is the VFS, the virtual file system, which allows mounting storage volumes and accessing special devices.
Note: I am in no way a Linux expert, I am just an experienced enthusiast, and I only claim practical knowledge. The Linux VFS is quite large and complex and I do not master every detail of it. What I present here is a hands-on overview based on my various readings, with the simple goal to make the basic concepts more comprehensible.
I am not interested in building a CP/M or DOS-like system but it is still interesting to study how file access is generalized on these.
Both CP/M and DOS (and every Windows version that followed) have the notion of volumes. Each volume is associated with a drive letter and contains a filesystem. The path to a file has the following format:
VOLUME:/path/components
VOLUME is either a single letter in the A-Z range, corresponding to a disk drive, or a device name, like COM1:, LPT1:, CON: or NUL:. Named devices usually do not use path components (Windows complains when trying to type file > COM23:/dir/path).
So for data structures, you need
-a list of volumes (A-Z or other names), that can be an array, a linked list, etc.
-for each volume, a pointer to a structure of access routines for open, close, read, write, etc.
-for each open file, storage of the path, access rights, and a pointer to the volume so file operations for this volume can access the file
The top level of the VFS (e.g. the system calls or library functions used from user/application space) has to
-identify the volume
-find the file
-initialize a file control block
-manage the accesses using the file operations for this volume
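The data structures and steps listed above can be sketched in a few lines of C. This is only a minimal illustration of the idea, not how CP/M or DOS actually implement it: every name here (vol_ops, volume, fcb, find_volume, vfs_open, the "null" driver) is hypothetical, and only single-letter volumes are handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

struct fcb;

/* Hypothetical access routines provided by each volume driver. */
struct vol_ops {
    int (*open)(struct fcb *f);
    int (*close)(struct fcb *f);
};

/* One entry per drive letter; ops is NULL when the letter is unused. */
struct volume {
    const struct vol_ops *ops;
};

/* File control block: path, access rights, and the owning volume. */
struct fcb {
    char path[64];
    int writable;
    const struct volume *vol;
};

struct volume volumes[26];   /* A: .. Z: */

/* Step 1: identify the volume of an "X:/path" string.
 * On success, *rest points just after the colon. */
const struct volume *find_volume(const char *full, const char **rest)
{
    assert(full != NULL && rest != NULL);
    if (!isalpha((unsigned char)full[0]) || full[1] != ':')
        return NULL;
    const struct volume *v = &volumes[toupper((unsigned char)full[0]) - 'A'];
    if (v->ops == NULL)
        return NULL;
    *rest = full + 2;
    return v;
}

/* Steps 2-4: fill the control block, then let the volume find the file. */
int vfs_open(struct fcb *f, const char *full, int writable)
{
    const char *rest;
    const struct volume *v = find_volume(full, &rest);
    if (v == NULL || strlen(rest) >= sizeof f->path)
        return -1;
    strcpy(f->path, rest);
    f->writable = writable;
    f->vol = v;
    return v->ops->open(f);
}

/* A do-nothing driver, only there to make the sketch usable. */
int null_open(struct fcb *f)  { (void)f; return 0; }
int null_close(struct fcb *f) { (void)f; return 0; }
const struct vol_ops null_ops = { null_open, null_close };
```

After registering the driver with volumes['C' - 'A'].ops = &null_ops, a call like vfs_open(&f, "C:/dir/file.txt", 0) goes through the null driver, while any path on an unregistered letter is rejected before a driver is ever called.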
UNIX does not have the concept of volumes, but has mount points, which are very similar.
In UNIX-like systems (let's stay vague), a path is only made of path components separated by a slash, without a volume prefix. The top level of the hierarchy is the root, written / , as if there were a unique volume with no name.
Storage volumes are "mounted" in the file path hierarchy, where a particular empty folder can be made the root of a mass storage device.
This is just another way to express the same concepts as DOS, but in a more uniform looking way.
Under the hood, we have several important kernel objects:
Efficient management of these inodes is very important for filesystem performance. Linux keeps a cache of them for faster access. Inodes are reused if several files sharing common path components are used simultaneously by the system. This also means that some reference counting is required when deallocating components of a path.
Each inode needs to have a set of function pointers, expressing the operations that one can do on it. Operations differ according to the file system that holds the inode. These operations describe how an inode is loaded from its on-disk representation, and how it is stored, updated, and created.
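To make the two ideas above concrete (per-inode function pointers and reference counting), here is a minimal sketch. It is not the Linux definition of an inode: the structure layout, the inode_ops fields and the inode_get/inode_put helpers are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct inode;

/* Hypothetical operations, supplied by the filesystem holding the inode. */
struct inode_ops {
    int  (*load)(struct inode *ino);     /* read the inode from disk   */
    int  (*store)(struct inode *ino);    /* write the inode back       */
    void (*release)(struct inode *ino);  /* called when nobody uses it */
};

struct inode {
    unsigned long number;
    unsigned refcount;                   /* users of this cached inode */
    const struct inode_ops *ops;
};

/* Take one more reference, e.g. when a second open file shares
 * this path component. */
struct inode *inode_get(struct inode *ino)
{
    assert(ino != NULL);
    ino->refcount++;
    return ino;
}

/* Drop a reference; returns 1 if that was the last one and the
 * filesystem was asked to release the inode, 0 otherwise. */
int inode_put(struct inode *ino)
{
    assert(ino != NULL && ino->refcount > 0);
    if (--ino->refcount > 0)
        return 0;
    if (ino->ops != NULL && ino->ops->release != NULL)
        ino->ops->release(ino);
    return 1;
}

/* Demo filesystem: it only counts how many inodes were released. */
int demo_released;
void demo_release(struct inode *ino) { (void)ino; demo_released++; }
const struct inode_ops demo_ops = { NULL, NULL, demo_release };
```

With two files open under the same directory, the directory inode is taken twice with inode_get, and only the second inode_put actually hands it back to the filesystem.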
From this point, several implementations are possible.
In Linux, and most probably in other systems like the BSD family, the root of the hierarchy is required to be "hosted" on a mass storage mount point. That means that / is an actual mount point: the device to be mounted is passed on the kernel command line and mounted by the kernel.
To switch to user space, the kernel automatically executes an init program found under a "magical" name: /init in an initramfs, otherwise typically /sbin/init. The nature of init is unimportant; it is just an executable program. It can be a shell, for example.
Device entries, usually in /dev, are special entries which are neither files nor directories, but... device "files"! When accessing these, the kernel does special stuff, like finding a driver for the major and minor device numbers indicated in the device "file" entry. Traditionally, there is no device mounted at /dev, which is a normal directory holding these device "files" (modern Linux systems usually mount a RAM-backed devtmpfs there instead).
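The driver lookup by device numbers can be pictured as a table indexed by the major number, with the minor number handed to the driver to select one unit. The sketch below is a simplification under that assumption; the names (chr_driver, dev_node, register_driver, dev_open) and the number 4 used for the demo driver are illustrative and not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_MAJOR 16

/* Hypothetical character driver: one per major number. */
struct chr_driver {
    const char *name;
    int (*open)(unsigned minor_no);   /* minor selects the unit */
};

/* What a device "file" entry stores instead of file data. */
struct dev_node {
    unsigned major_no;
    unsigned minor_no;
};

const struct chr_driver *drivers[MAX_MAJOR];

/* Claim a major number for a driver; fails if out of range or taken. */
int register_driver(unsigned major_no, const struct chr_driver *drv)
{
    assert(drv != NULL);
    if (major_no >= MAX_MAJOR || drivers[major_no] != NULL)
        return -1;
    drivers[major_no] = drv;
    return 0;
}

/* Opening a device node: find the driver by major, pass it the minor. */
int dev_open(const struct dev_node *node)
{
    assert(node != NULL);
    if (node->major_no >= MAX_MAJOR || drivers[node->major_no] == NULL)
        return -1;
    return drivers[node->major_no]->open(node->minor_no);
}

/* Demo driver handling four serial ports (minors 0 to 3). */
int tty_open(unsigned minor_no) { return minor_no < 4 ? 0 : -1; }
const struct chr_driver tty_driver = { "tty", tty_open };
```

The device entry itself stays tiny: it only records two numbers, and all the behaviour lives in the driver registered for the major number.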
When a new storage device is mounted in this hierarchy, it is bound to a directory of the root filesystem, usually an empty one. While the device is not mounted, the mount point is just an ordinary directory.
When a file system is mounted, the inode on which the filesystem is mounted gets modified so its "inode_operations" and "super_operations" are able to explore the mounted file system. This includes inode allocation: if inodes in a filesystem have special needs for file management, then the file system itself will allocate enough memory to store these, and return these structures to the VFS layer. With proper type casting, the Linux VFS will use the first allocated bytes as a standard inode, but the following bytes are free to use by the specific inode_operations of this filesystem.
In Linux, mounts can be nested: since the root is itself a mount point, it is necessary to support more mount points inside mounted volumes. I have not looked at the details, but I am sure this is possible by reference counting, so a mount point stays allocated even if not used for file access, and by replacing the function pointers in these inodes.
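The "generic inode first, private bytes after" layout described above can be sketched as follows. This is a simplified model, not kernel code: the structures are reduced to a couple of fields, and myfs, myfs_i and the field names are hypothetical. As far as I know, real Linux filesystems embed the generic inode anywhere in their own structure and recover the outer structure with the container_of macro instead of a plain cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Generic part, the only thing the VFS layer knows about. */
struct inode {
    unsigned long number;
    unsigned mode;
};

/* Hypothetical allocation hooks provided by each filesystem. */
struct super_ops {
    struct inode *(*alloc_inode)(void);
    void (*free_inode)(struct inode *ino);
};

/* Filesystem-specific inode: the generic inode MUST be the first
 * member so both views share the same address. */
struct myfs_inode {
    struct inode vfs;
    unsigned first_block;    /* private data, invisible to the VFS */
};

/* Convert the generic view back to the filesystem view. */
struct myfs_inode *myfs_i(struct inode *ino)
{
    return (struct myfs_inode *)ino;
}

/* One allocation covers both the generic and the private part. */
struct inode *myfs_alloc_inode(void)
{
    struct myfs_inode *mi = calloc(1, sizeof *mi);
    return mi != NULL ? &mi->vfs : NULL;
}

void myfs_free_inode(struct inode *ino)
{
    free(myfs_i(ino));
}

const struct super_ops myfs_ops = { myfs_alloc_inode, myfs_free_inode };
```

The VFS only ever calls alloc_inode and free_inode through the operations table, so it never needs to know how large a given filesystem's inode really is.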
Here are some documents about the Linux VFS:
Overview of the Linux Virtual File System, kernel.org
A tour of the Linux VFS, tdlp.org
NuttX is an RTOS that does everything possible to stay compatible with the POSIX specifications, which means it behaves as much as possible like a "real" UNIX. However, it's made for embedding in electronic devices, so it takes different approaches to do things in a simpler and more compact way.
It also has a VFS, but there is a large difference: NuttX does not require mounting an actual storage device at the VFS root. In addition, it avoids loading an initial program from storage, and instead proceeds to run the system by executing an entry point in a monolithic image. Loading a binary is possible but is not the only (or main) mode of operation.
The NuttX root filesystem is actually a pseudo-FS, which is entirely managed in RAM. The inodes of this FS are very basic: you can just create directories and IO devices. Initially, NuttX didn't even support mounting storage devices, and it can still be built without this ability, which saves memory while retaining the unixish flavour of a VFS for IO devices. It is important to note that only this pseudo file system supports IO devices; all storage-based filesystems in NuttX only support regular files.
Because of this, there are no real inode_operations or super_operations, since these are actually very simple and built-in: inodes are just allocated as need be, there is only a handful of them, and they don't need to hold many fields. Almost everything is optional, including modification dates, because not every embedded system has an RTC.
In NuttX, mountable filesystems were added later, and this led to a different structure from Linux: when a filesystem is bound to the inode of an empty directory for mounting, a large "mountpoint_operations" is defined in the inode, which takes over the default operations used for the simple pseudo-FS, and controls everything from file opening to directory creation.
When a file is searched in the NuttX VFS, the inode hierarchy is explored until the required file is found (this happens for IO devices, usually in /dev), or until a mount point is found: this happens for mounted filesystems.
If an IO device inode is found, the file_operations that were registered on this inode when the IO device was installed are used. This is used to implement /dev/zero, /dev/ttyS0, etc.
But when a mountpoint is found, the corresponding mountpoint_operation is called, with the remaining path passed as parameter. It is then the job of the filesystem (which provided the mountpoint_operations) to handle file access.
So in a way, in NuttX, a mount point is a special kind of IO device in the form of a directory, that uses extended file_operations.
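The search described above can be sketched as a walk over a small in-RAM tree that stops early when it reaches a mount point. This is my own simplified model, not the actual NuttX code: the pnode structure, the node kinds and pseudo_lookup are hypothetical names, and the demo tree at the end only exists to show the two outcomes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum node_kind { NODE_DIR, NODE_DEVICE, NODE_MOUNTPOINT };

/* One entry of the pseudo-FS, entirely in RAM. */
struct pnode {
    const char *name;
    enum node_kind kind;
    struct pnode *child;   /* first entry of a directory        */
    struct pnode *peer;    /* next entry in the same directory  */
};

/* Walk an absolute path from root. Returns the node where the walk
 * stopped, or NULL if a component does not exist. *rest receives the
 * part of the path that was not consumed: empty for a device or a
 * directory, the path inside the volume for a mount point. */
struct pnode *pseudo_lookup(struct pnode *root, const char *path,
                            const char **rest)
{
    assert(root != NULL && path != NULL && rest != NULL);
    struct pnode *cur = root;
    while (*path == '/')
        path++;
    while (*path != '\0') {
        if (cur->kind == NODE_MOUNTPOINT)
            break;                      /* the filesystem handles the rest */
        size_t len = strcspn(path, "/");
        struct pnode *n = cur->child;
        while (n != NULL &&
               !(strlen(n->name) == len && strncmp(n->name, path, len) == 0))
            n = n->peer;
        if (n == NULL)
            return NULL;                /* no such entry */
        cur = n;
        path += len;
        while (*path == '/')
            path++;
    }
    *rest = path;
    return cur;
}

/* Demo tree:  /dev/zero (device)  and  /mnt/sd (mount point). */
struct pnode dev_zero = { "zero", NODE_DEVICE,     NULL,      NULL };
struct pnode dev_dir  = { "dev",  NODE_DIR,        &dev_zero, NULL };
struct pnode sd_mount = { "sd",   NODE_MOUNTPOINT, NULL,      NULL };
struct pnode mnt_dir  = { "mnt",  NODE_DIR,        &sd_mount, &dev_dir };
struct pnode root_dir = { "",     NODE_DIR,        &mnt_dir,  NULL };
```

Looking up /dev/zero ends on the device node with nothing left over, so its file_operations would be used directly. Looking up /mnt/sd/docs/a.txt ends on the sd node with docs/a.txt left over, which is what would be handed to the filesystem mounted there.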
A remark about how inodes are allocated: in NuttX the inode allocation happens entirely in the VFS code, and there are no super_operations to allocate filesystem-specific inodes. The inode specialization happens via an i_private pointer in the inode structure, which can be set to a pointer allocated by the IO device or filesystem code. This makes the allocation code simpler, but requires two memory allocations per inode: one for the common structure, and one for the device-specific structure. If each dynamic allocation wastes some bytes, this method is less memory-efficient, and a Linux-style filesystem-specific inode allocator would be preferable.
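The two-allocation scheme can be illustrated like this. Only the i_private field name comes from the description above; the rest (the reduced inode structure, inode_register, uart_state) is a hypothetical sketch and not the NuttX API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Common inode, always allocated by the VFS code itself. */
struct inode {
    const char *name;
    void *i_private;     /* opaque pointer owned by the driver or FS */
};

/* Device-specific state, allocated separately by the driver. */
struct uart_state {
    unsigned baud;
};

/* Allocation 1: the VFS creates the common structure and simply
 * stores whatever private pointer the driver hands over. */
struct inode *inode_register(const char *name, void *priv)
{
    assert(name != NULL);
    struct inode *ino = calloc(1, sizeof *ino);
    if (ino == NULL)
        return NULL;
    ino->name = name;
    ino->i_private = priv;
    return ino;
}
```

A driver would first allocate its own uart_state (allocation 2), then call inode_register("ttyS0", state). Each inode therefore pays the allocator's bookkeeping overhead twice, which is the cost mentioned above.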
Also, by reading the inode searching code, I am not sure that NuttX supports nested mount points.
Final remark: NuttX filesystems don't need support for special files, since the special files can be created individually in the pseudo-FS.
In this way, the NuttX VFS is a hybrid between UNIX and DOS: it uses a uniform file hierarchy like UNIX, but it also has volumes; it's just that these are given interesting names so the naming stays uniform and UNIX-like. However, with nested mounts, the Linux VFS goes one step further in complexity.
NuttX documentation:
Special Files and Device Numbers
NuttX FS, (PDF webshoot of a disappeared blog article)
Since my own system is designed for use in retro 8-bit platforms, I want to save as much memory as possible. So the NuttX way looks more natural.
I am now in the process of defining the required data structures and writing some Linux C code that will implement my system. It will then be ported to 8-bit assembly.