Want to look like you think different? Why use all these XMLs and JSONs yet again? That's too mainstream. Let's invent our own, like no one else does!
You know what sucks about these language-independent data formats? They suck. They are not really human-readable, are hard to serialize data with, and not that easy to write correct parsers for, either. Also, too mainstream.
Let's assume there are two computers -- `A` and `B`. They only know what types of data they are exchanging and what data to expect from each other. Humans debugging the software running on these two computers need to be able to read what's being transferred back and forth, without having to use any tools (for pretty-printing or parsing).
(De)serialization on the software side should be as simple and as fast as possible. It should be straightforward to write a parser/serializer in `awk`, to filter lines with `grep`, and to replace values with `sed`.
<data> ::= <line> | <data> <line>
<line> ::= <key> [\t] <value> [\n]
<key> ::= [^\t\n]+
<value> ::= [^\n]+

Only UTF-8 is allowed.
And that's it! As simple as possible. No escaping bullshit. One line, one key, one value! Highly readable!
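To make the grammar concrete, here is a minimal sketch of a parser and serializer in Python. The function names `parse` and `serialize` are made up for illustration. Two things are assumptions rather than part of the grammar: empty lines are skipped instead of rejected, and an empty value is accepted.

```python
def parse(text):
    """Parse the format into a list of (key, value) pairs, in input order."""
    pairs = []
    # Split on "\n" only: str.splitlines() would also break on characters
    # such as "\r" or U+2028, which the grammar allows inside a value.
    for line in text.split("\n"):
        if not line:
            continue  # assumption: skip empty lines (e.g. after the final "\n")
        # The first tab ends the key; any further tabs belong to the value.
        key, sep, value = line.partition("\t")
        if not sep or not key:
            raise ValueError(f"malformed line: {line!r}")
        pairs.append((key, value))
    return pairs


def serialize(pairs):
    """Turn (key, value) pairs back into text, one line per pair."""
    out = []
    for key, value in pairs:
        if not key or "\t" in key or "\n" in key or "\n" in value:
            raise ValueError(f"cannot serialize: {key!r}, {value!r}")
        out.append(f"{key}\t{value}\n")
    return "".join(out)


pairs = parse("name\tJohn Snow\nage\t45\n")
print(pairs)  # [('name', 'John Snow'), ('age', '45')]
```

Every value comes back as a string. Since both sides already know what types to expect, converting `"45"` to a number is the receiver's job.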
If you really want newlines in your value, use lists instead.
You can have lists like `[1, 2, 3]` or `["<I'm\tbored>", "\"Привет!\"", "No."]`; just repeat the key, one line per element:
numbers	1
numbers	2
numbers	3
what	<I'm	bored>
what	"Привет!"
what	No.
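Reading lists back is just a matter of grouping values by key. A rough sketch in Python, with hypothetical helper names `parse_lists` and `serialize_list` (neither comes from the format itself):

```python
def parse_lists(text):
    """Collect the values of each repeated key into a list, keeping order."""
    lists = {}
    for line in text.split("\n"):
        if not line:
            continue  # assumption: empty lines are skipped
        # Only the first tab separates key from value.
        key, _, value = line.partition("\t")
        lists.setdefault(key, []).append(value)
    return lists


def serialize_list(key, values):
    """Write one `key<TAB>value` line per element of the list."""
    return "".join(f"{key}\t{value}\n" for value in values)


data = serialize_list("numbers", [1, 2, 3])
print(parse_lists(data))  # {'numbers': ['1', '2', '3']}
```

The numbers come back as strings, and the tab inside `<I'm	bored>` survives untouched because only the first tab on a line is treated as the separator.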
Just as with lists, instead of forcing "objects" on our format itself, let's put it on a different layer.
<key> ::= [^\t\n]+
<nested-key> ::= <nested-key-level> | <nested-key-level> [ ] <nested-key-level>
<nested-key-level> ::= [^ \t\n]+
So a nested key is a key in which a space character separates the levels of an object. Let's see what a list of objects could look like:
human	
human age	45
human name	John Snow
human	
human name	William Budd
human age	68
human	
human age	57
human name	Yoseph Thomas Clover
This data forms a list of objects:
[
  Human{name: "John Snow", age: 45},
  Human{name: "William Budd", age: 68},
  Human{name: "Yoseph Thomas Clover", age: 57}
] :: [Human]
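Each object here starts with a `human\t` line, which is like having `key="human"` and `value=""`. Strictly speaking, that means `<value>` has to be allowed to be empty: `[^\n]*` rather than `[^\n]+`.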
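Here is one way the "different layer" could look in Python: a sketch that turns such lines into a list of dicts. The name `parse_objects` and the choice of plain dicts are assumptions for illustration. It treats a bare `human` key as the start of a new object and everything after the first space as the field name.

```python
def parse_objects(text, root):
    """Build a list of dicts from lines whose key's first level is `root`."""
    objects = []
    for line in text.split("\n"):
        if not line:
            continue  # assumption: empty lines are skipped
        key, _, value = line.partition("\t")
        levels = key.split(" ")
        if levels[0] != root:
            continue  # some other key, not part of this list
        if len(levels) == 1:
            objects.append({})  # bare root key: a new object starts here
        elif not objects:
            raise ValueError(f"field before the first {root!r} line: {line!r}")
        else:
            # Deeper levels, if any, stay joined by spaces in the field name.
            objects[-1][" ".join(levels[1:])] = value
    return objects


data = (
    "human\t\n" "human age\t45\n" "human name\tJohn Snow\n"
    "human\t\n" "human name\tWilliam Budd\n" "human age\t68\n"
)
humans = [
    {"name": h["name"], "age": int(h["age"])}  # the receiver knows the types
    for h in parse_objects(data, "human")
]
print(humans)
```

Field order inside an object does not matter, which is why `age` can come before `name` in one record and after it in the next.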
Is it efficient? It's not. But you don't have "big data" either. If you are that concerned about the number of bytes being transferred, use compression.
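As a rough illustration of the compression point, here is a sketch using Python's standard `zlib` module. The choice of zlib and the names `pack` and `unpack` are assumptions; any general-purpose compressor would do. Repeated keys are exactly the kind of redundancy such compressors handle well.

```python
import zlib


def pack(text):
    """Compress serialized data before sending it over the wire."""
    return zlib.compress(text.encode("utf-8"))


def unpack(blob):
    """Restore the original text on the receiving side."""
    return zlib.decompress(blob).decode("utf-8")


# Made-up sample data: 1000 records with the same keys repeated over and over.
text = "".join(
    f"human\t\nhuman age\t{20 + i % 60}\nhuman name\tPerson {i}\n"
    for i in range(1000)
)
blob = pack(text)
print(len(text.encode("utf-8")), "bytes raw,", len(blob), "bytes compressed")
```

The data stays readable at both ends; only the bytes in transit are compressed.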
If you care much about the speed of comparison between the key you've got and the ones you defined (to deserialize into a structure field), don't worry. `strcmp` is really fast.
Wow.