💾 Archived View for sprock.dev › posts › file-locks-of-nix-oses.gmi captured on 2024-08-24 at 23:40:51. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
I wrote this for my own reference, but decided to upload it here in case someone else finds it useful. Feel free to email me with corrections if you find any mistakes.
The existence of multiple conflicting APIs and implementations has resulted in *nix systems commonly supporting at least three distinct, vaguely-standard locking APIs which may or may not manipulate the same underlying locks, and one non-API convention that was a workaround for the limitations of historical *nix systems. All of the APIs discussed here provide advisory locks: programs are not prevented from ignoring the locks and using the file without restriction. Some *nix OSes support mandatory locks as well, but they are less standard and I have therefore excluded them from this discussion.
flock() is simultaneously the locking API with the most reasonable behaviour and the only one of these APIs not specified by POSIX. That being said, it is supported on Linux, macOS, and (other) BSDs, with more-or-less the same API.
It supports both exclusive and shared locks that apply to the whole file – there is no support for locking byte ranges – with each lock being held by an open file table entry. As a result, duplicate file descriptors created by dup*() or fork() share any lock on the file, while new ones created by open() have their own locking state. Unless one has a particular reason to use another API, this is probably the API that should be used.
These locks can be released manually with another call to flock(), and are automatically released when the last file descriptor for that open file table entry is closed, including when a process exits or is killed.
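The following is a minimal sketch of the flock() behaviour described above, written in Python because its standard fcntl module is a thin wrapper around the C call. The helper names (try_flock, flock_demo) are my own and purely illustrative. It takes a lock through one descriptor, then checks how a dup()ed descriptor and a second open() of the same file see it.

```python
import fcntl
import os
import tempfile


def try_flock(fd):
    """Try to take an exclusive flock() lock without blocking."""
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


def flock_demo():
    """Return (dup_shares_lock, second_open_blocked, released_via_dup)."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd_a = os.open(tmp.name, os.O_RDWR)
        fd_dup = os.dup(fd_a)                # same open file table entry as fd_a
        fd_b = os.open(tmp.name, os.O_RDWR)  # a new, independent entry
        try:
            fcntl.flock(fd_a, fcntl.LOCK_EX)
            # Locking through the duplicate just re-takes the same lock.
            dup_shares_lock = try_flock(fd_dup)
            # A separate open() conflicts, even inside the same process.
            second_open_blocked = not try_flock(fd_b)
            # Unlocking through the duplicate releases the shared lock...
            fcntl.flock(fd_dup, fcntl.LOCK_UN)
            # ...so the independent descriptor can now take it.
            released_via_dup = try_flock(fd_b)
            return dup_shares_lock, second_open_blocked, released_via_dup
        finally:
            for fd in (fd_a, fd_dup, fd_b):
                os.close(fd)
```

On a local file system I would expect flock_demo() to report all three observations as true, but treat it as an experiment to run on your own system rather than a guarantee.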
It is perhaps worth noting that some systems have implemented flock() on top of fcntl() locks (discussed next), although this does not seem to be the case on modern systems [0].
Unlike flock(), the fcntl() locking API is part of the POSIX standard. fcntl() locks can be either exclusive or shared, and additionally allow for the locking of specific byte ranges.
That being said, their behaviour is significantly less intuitive: even though these locks are manipulated through a file descriptor, they are held by the process itself. They are therefore shared by every descriptor the process has for that file, whether created by dup() or by a separate open(), but they are not inherited by a child process created by fork().
A lock on a given section of the file can be explicitly released or – rather unexpectedly – all of a process's locks on a given file are automatically released when *any* file descriptor for that file is closed [1]. These locks are therefore unsafe for use in multi-threaded contexts – where one thread might close a file descriptor for a file while another thread is still accessing the same file – and in the presence of libraries which manipulate files – which might open and close a file while the process believes it still holds the lock.
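To make the close() hazard concrete, here is a small sketch in Python. It uses fcntl.lockf(), which despite its name is Python's wrapper around fcntl() record locks. The helper names are hypothetical, and a forked child is used as the "other process", since a process never conflicts with its own fcntl() locks.

```python
import fcntl
import os
import tempfile


def another_process_can_lock(path):
    """Fork a child and report whether it can take an fcntl() lock on path."""
    pid = os.fork()
    if pid == 0:  # child: fcntl() locks are not inherited across fork()
        code = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            code = 0
        except OSError:
            code = 1
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def close_releases_lock_demo():
    """Return (held_before_close, held_after_close) for an fcntl() lock."""
    with tempfile.NamedTemporaryFile() as tmp:
        fd = os.open(tmp.name, os.O_RDWR)
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX)  # whole-file lock via fcntl()
            held_before = not another_process_can_lock(tmp.name)
            # Some unrelated code opens and closes the same file...
            os.close(os.open(tmp.name, os.O_RDONLY))
            # ...and the lock taken through fd is gone, although fd is still open.
            held_after = not another_process_can_lock(tmp.name)
            return held_before, held_after
        finally:
            os.close(fd)
```

With traditional process-owned fcntl() locks, the lock should be held before the unrelated close() and gone afterwards, without the lock holder ever being told.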
As specified by POSIX, lockf() locks are also locks on a given section of a file held by the process, and therefore act like fcntl() locks with a couple of limitations: they only support exclusive locks and operate on a range of bytes whose start is implicitly given by the current offset into the file. Some OSes implement this using the same underlying locks as fcntl(), but this is not always the case [2].
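A short sketch of the implicit-offset behaviour of lockf(), using Python's os.lockf(), which calls the C function directly. The function names and the byte ranges are arbitrary choices of mine for illustration. The parent seeks to offset 10 and locks 5 bytes, and a forked child uses F_TEST to probe which ranges are free.

```python
import os
import tempfile


def range_is_free(path, start, length):
    """Fork a child and have it probe a byte range with lockf(F_TEST)."""
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            fd = os.open(path, os.O_RDWR)
            os.lseek(fd, start, os.SEEK_SET)  # the range starts at this offset
            os.lockf(fd, os.F_TEST, length)   # raises OSError if locked elsewhere
            code = 0
        except OSError:
            code = 1
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def lockf_demo():
    """Return (bytes_0_to_9_free, bytes_10_to_14_free) while 10..14 are locked."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"x" * 32)
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDWR)
        try:
            os.lseek(fd, 10, os.SEEK_SET)
            os.lockf(fd, os.F_LOCK, 5)  # exclusive lock on bytes 10..14
            return (range_is_free(tmp.name, 0, 10),
                    range_is_free(tmp.name, 10, 5))
        finally:
            os.close(fd)
```

Note that there is no parameter for the start of the range: moving the file offset between calls changes what gets locked, which is easy to get wrong when the same descriptor is also used for reads and writes.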
One final method of locking files that I occasionally encounter in old programs and mail clients is "dotlocking", where a separate file – usually "foo.lock" – is used as the lock, with prospective lock-holders either creating it with O_CREAT and O_EXCL or using link() to hard-link a temporary file to the lock's name, either of which fails if the lock file already exists. As a result, this method only supports exclusive locking. There are other drawbacks too: it is only possible to poll the lock, with no way to block until it is released, nor is there any way to reliably release the lock if, e.g., the process is killed.
As a result of these limitations, dotlocking is only really a good choice when needing compatibility with a legacy program that already uses it, or when your program is running on a legacy system without support for a better API.
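For reference, the O_EXCL variant of dotlocking can be sketched as below. The function names, the ".lock" suffix handling, and the timeout and polling interval are all assumptions of mine; real programs that use dotlocking each have their own conventions, which you would need to match for compatibility.

```python
import os
import tempfile
import time


def acquire_dotlock(path, timeout=5.0, poll_interval=0.1):
    """Poll for path + '.lock'; return True once created, False on timeout."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists,
            # so creation and the existence check are a single atomic step.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)  # polling is the only way to wait
            continue
        os.close(fd)
        return True


def release_dotlock(path):
    """Release the lock by removing the lock file."""
    os.unlink(path + ".lock")
```

If the holder is killed between acquire_dotlock() and release_dotlock(), the lock file simply stays behind, which is exactly the stale-lock problem described above.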
NFS is one place of particular note where other locking APIs were unavailable for a long time. Early versions lacked protocol support for locking and programs instead resorted to dotlocking to (semi-)reliably lock their files. These early problems have led to a great deal of out-of-date information about the usability of locks on NFS.
When NFS2 brought preliminary support for locking, it was supported with a separate protocol connecting to a separate daemon with the hope of keeping the protocol stateless. Because this support existed in the form of byte-range locks, many early systems only supported fcntl() locks on NFS and even then, the locks were initially quite unreliable. On these systems, other locks were typically local and therefore unable to coördinate access across systems that shared the NFS file system.
Unlike its predecessors, NFS4 – first standardized in December 2000 – was a stateful protocol and brought support for locks into the NFS protocol, making them much more reliable. Furthermore, Linux 2.6.12 – released in June 2005 – allowed flock() to use NFS locking. The old behaviour persisted on some systems for years, but flock() now works on all modern *nix systems acting as NFS clients (unless locking has been disabled on the server).
It is worth noting that NFS still brings with it a few unusual behaviours: all locking APIs generally use the same locking mechanism and therefore interact even on OSes where they otherwise would not, and – as a result of the NFS protocol's handling of locks – may act as mandatory locks, not advisory ones.
I briefly discussed dotlocking above, but there is another kind of lock file that deserves mention: one where a specific file is used to *bear* a lock using another locking API, rather than *being* the lock. This can be useful if access to multiple files needs to be coördinated, or if the file that needs to be accessed might be subject to atomic replacement via rename().
While useful, this comes with its own potential footgun: the lock file must remain in place, or you risk a race condition where multiple threads/processes each believe they hold the same lock. Since the lock file and the accessed files use different file descriptors, there is no way to safely access the latter if the lock file is deleted and replaced.
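One way this pattern might look in practice is sketched below, using flock() on a dedicated lock file to guard an atomic replacement via rename(). The names hold_lock_file and replace_atomically, and the idea of passing the lock path separately, are my own illustration rather than any established API.

```python
import contextlib
import fcntl
import os
import tempfile


@contextlib.contextmanager
def hold_lock_file(lock_path):
    """Hold an exclusive flock() lock on a dedicated, long-lived lock file."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the only descriptor releases the lock. The lock file is
        # deliberately never unlinked, so every process locks the same file.
        os.close(fd)


def replace_atomically(data_path, lock_path, new_contents):
    """Replace data_path with new_contents (bytes) while holding the lock."""
    with hold_lock_file(lock_path):
        directory = os.path.dirname(data_path) or "."
        tmp_fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(tmp_fd, "wb") as tmp:
            tmp.write(new_contents)
        # rename() swaps in a new file, so a lock held on the old data file
        # itself would no longer protect anything; the lock file is unaffected.
        os.replace(tmp_path, data_path)
```

Locking the data file directly would not work here, because each replacement leaves the lock attached to a file that no longer has the expected name.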
———
[0] As reported in the Linux flock(2) man page, which notes that flock() was emulated in the GNU C library as a call to fcntl() before Linux 2.0. I am not aware of any systems that still do this. If you are aware of a modern system with this behaviour, I encourage you to get in touch.
[1] Locks held by a process aren't wholly irredeemable, but their usage would be a lot more intuitive if they were acquired using a path and only automatically released when the process exits. This does not fix all their other problems with respect to multi-threading and libraries, however.
[2] On the other hand, they may also interact with flock() locks on some systems.