💾 Archived View for tilde.pink › ~maria › log › 2021-08-13_flat_file_databases.gmi captured on 2023-03-20 at 18:37:41.

-=-=-=-=-=-=-

flat file databases

Whether it's writing a blog (gemlog, etc.), needing a config file, creating documentation or simply building a list of things, most of us start with a flat file database: a document, or an easy-to-grasp document structure. Eventually people stop using documents, especially in a business context. Then things get migrated into databases.

What if we didn't migrate the data into a database server/service? What if we attempted to build our own flat file structure that worked? Would we end up with a database, written by ourselves? And why is it so hard to stay within a flat file context?

These are good questions, especially when you're trying to be a minimalist. I am setting aside the challenges of connecting complex data, filtering it and grouping it, because those use cases are typical for people who actually need what a database offers.

I am looking at other database-backed things. The Windows registry. WordPress. Wikipedia. Various log stores.

For almost all of these there are counterparts that do not use a database and similar use cases that do. Linux has survived with a well-defined flat file structure for ages now. Patches and updates are frequent and fast. How is it that the Windows registry uses a key-value database that grows into the hundreds of megabytes and stops being efficient? How is it that WordPress reinvents the wheel of text editing just to provide an incentive to stay within a browser context (one that was originally designed to read text, not edit it, hence the name browser)?

Flat file structures are hard. They depend on the use case. As an engineer you need to think hard about how to build your things around the limitations. At the same time a file is a simple tool, one that the operating system supports natively with various features. Reading from files and writing to files efficiently isn't trivial either: you need to know how the operating system works internally. That is, if you decide to write to a disk. A ramdisk (a filesystem inside the computer's memory) behaves differently and supports different things, being ephemeral and all that.
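One example of the operating system knowledge this takes: replacing a file so that a reader never sees a half-written version. The following is a minimal sketch, assuming a POSIX-like filesystem where renaming within one directory is atomic. The helper name atomic_write is made up for illustration.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at `path` with `data` (bytes) in one atomic step.

    Hypothetical helper: write to a temporary file in the same directory,
    flush it to disk, then rename it over the target.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Ask the kernel to push the data to the storage device.
            os.fsync(tmp.fileno())
        # Readers see either the old file or the new one, never a mix.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a stray temp file behind if anything failed.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that mkstemp creates the file readable only by its owner, so a real setup would probably adjust the permissions before the rename. On a ramdisk the fsync call buys nothing, which is one of the ways the two kinds of storage differ.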

Is it really worth it to trade this type of control and simplicity for an external service? For most things a real database is overkill. Storing blog posts in a PostgreSQL database just because you can may feel like a good idea at first, but using the operating system to open a file, connecting that file descriptor to the network socket and letting the operating system handle the transfer may ultimately be faster and consume fewer resources. If you know what you're doing.
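Handing the transfer to the operating system can look roughly like this. It is a sketch, not a server: the function name send_file is made up, and it assumes an already connected, blocking stream socket. Python's socket.sendfile uses the kernel's sendfile call where the platform offers it and falls back to ordinary send calls otherwise.

```python
import socket


def send_file(conn, path):
    """Send the file at `path` over the connected socket `conn`.

    Hypothetical helper. Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        # Where os.sendfile is available, the kernel copies the data from
        # the file descriptor to the socket without a detour through
        # buffers in our own process.
        return conn.sendfile(f)
```

A Gemini or HTTP server would call this after writing its response header to the same socket. No query, no connection pool, just a path and a descriptor.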

Geminispace has made me more of an advocate for flat file databases. Honestly, as a developer you need to know about file descriptors, sockets, and all that anyway. Learning and running a database just so someone can have a browser editor to post company updates? Somehow I feel that parsing an email and storing it in a file would be more efficient.
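To make the email idea a bit more concrete, here is a rough sketch using Python's standard email package. Everything specific in it is an assumption for illustration: the function name store_email_as_post, the date-prefixed .gmi filename, and the expectation that the message carries Subject and Date headers and a plain text body.

```python
import email
import email.policy
import os
import re
from email.utils import parsedate_to_datetime


def store_email_as_post(raw, directory):
    """Parse a raw email (bytes) and store it as a gemtext file.

    Hypothetical helper. Returns the path of the file it wrote.
    """
    msg = email.message_from_bytes(raw, policy=email.policy.default)
    subject = str(msg["Subject"] or "untitled")
    # Assumption: the Date header exists and is well formed.
    date = parsedate_to_datetime(str(msg["Date"]))
    # Take the plain text part and ignore HTML and attachments.
    part = msg.get_body(preferencelist=("plain",))
    text = part.get_content() if part is not None else ""
    # Assumed naming scheme: YYYY-MM-DD_subject_in_lowercase.gmi
    slug = re.sub(r"[^a-z0-9]+", "_", subject.lower()).strip("_") or "untitled"
    path = os.path.join(directory, f"{date:%Y-%m-%d}_{slug}.gmi")
    with open(path, "w", encoding="utf-8") as f:
        f.write(f"# {subject}\n\n{text}")
    return path
```

The mail server already does authentication and delivery, so the only thing left to write is this one function and whatever pipes the incoming message into it.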