💾 Archived View for bbs.geminispace.org › u › gemalaya › 5625 captured on 2023-12-28 at 17:23:28. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
@clseibold Nice! Yes, the repo is accessible now. I saw the gemmail parser; it's really cool.
I meant gemmail. The misfin spec defines gemmail as the "mail file format" for misfin, so I assumed that this is the format the mail server should use to store individual messages. So, looking at your gembox code, gembox would be a kind of "mbox" that stores a series of gemmail-format messages separated by this magic "<=====" separator?
What about binary content: is binary forbidden in the content of messages? Anyway, it's impressive what you wrote in such a short time, kudos.
Sep 27 · 3 months ago
🚀 clseibold · Sep 27 at 10:50:
@gemalaya Yeah, the gembox is basically kinda like an mbox, but for gemmail files. I wanted something very simple. The separator could always be changed if people think it should be different.
As for binary files, that's something the spec doesn't cover, afaik. I know people were talking about other stuff, and they said they just used preformatted blocks, but I don't see how that could handle binary.
The other option is to detect whether a sent file is UTF-8 text, and if not, then assume binary. I think the main function of misfin is supposed to be gemtext (plus gemmail linetypes) only though.
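The "detect whether it is UTF-8 text" option mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name `looks_like_utf8_text` and the extra NUL-byte check are assumptions, not something the misfin spec or either server defines.

```python
def looks_like_utf8_text(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 and contains no NUL bytes."""
    # Assumption: a NUL byte is treated as a sign of binary content,
    # even though it is technically valid UTF-8.
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A receiver could fall back to a "binary" code path whenever this returns False, though whether misfin should carry such content at all is exactly the open question in this thread.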
🚀 clseibold · Sep 27 at 10:59:
Actually, I think I'm going to change my mind on making a parser for python. Someone else that likes python can do it, because I really just cannot with that language. I take one look at the misfin python code and I remember how much I despise python and why I have not touched that language in 5 years, lol.
I will do other languages though. I was thinking about C, which might not be so hard (I can actually copy some code I wrote for parsers in the past, so stretchy buffers would be pretty simple, or there's always the stb libraries too).
@clseibold Ok. I'll write the gembox writer for the python implementation.
I have the multi-mailbox server working, with certificate chain of trust verification. You just have to create "child" certificates in a separate directory, and messages are delivered to a message box according to the cert's fingerprint.
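As a rough illustration of routing by fingerprint, the sketch below maps a certificate to a mailbox directory. Everything here is an assumption made for the example: the use of SHA-256 over the DER-encoded certificate, the hex directory name, and the helper names `cert_fingerprint` and `mailbox_for`. The actual Python implementation may compute and use fingerprints differently.

```python
import hashlib
from pathlib import Path


def cert_fingerprint(cert_der: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate (assumed fingerprint scheme)."""
    return hashlib.sha256(cert_der).hexdigest()


def mailbox_for(root, cert_der: bytes) -> Path:
    """Hypothetical layout: one mailbox directory per certificate fingerprint."""
    return Path(root) / cert_fingerprint(cert_der)
```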
@clseibold Just pushed a first version of the multi-mailbox code; an example usage is at the end of the README. The mbox thing is just for debugging, I'll have the gembox format later today.
@clseibold I wrote the gembox parser/writer, and the server can now deliver the messages to a gembox file. What should the official file extension be for a gembox file? .gembox is a bit long and .gbox is ambiguous.
📷 billsmugs · Sep 27 at 17:20:
In my server, incoming messages are just stored as individual .gmi files in a mailbox-specific subfolder. I then have an inbox page on the corresponding gemini capsule (behind a client certificate) that dynamically builds links to each file in the folder (sorted by the date/time in the file name) and parses out the subject/date/sender for each one for a nice chronological listing.
This way I get nice rendering through any client (except for the misfin-specific line types, but they are easily human-readable when rendered as normal text), the only dynamic portion is the inbox page (viewing a message requires no further parsing and just serves the file from disk), and the only parsing I have to write is the sender/date lines which I know are guaranteed to be the first two lines, in that order (which is much easier than writing a full gemtext parser).
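The files-in-a-folder inbox described above might look roughly like this in Python. It is a sketch under assumptions: the function name `list_inbox`, the `*.gmi` glob, and the dict keys are made up for illustration, and it relies on the sender and timestamp being the first two lines, which is debated later in the thread.

```python
from pathlib import Path


def list_inbox(mailbox_dir):
    """Return one dict per message file, newest first."""
    entries = []
    # File names are assumed to start with a sortable date/time, so a
    # reverse lexical sort on the name puts the newest message first.
    paths = sorted(Path(mailbox_dir).glob("*.gmi"), key=lambda p: p.name, reverse=True)
    for path in paths:
        with path.open(encoding="utf-8") as handle:
            sender_line = handle.readline().rstrip("\n")
            date_line = handle.readline().rstrip("\n")
        entries.append({
            "file": path.name,
            # "<" marks a sender line and "@" a timestamp line in gemmail.
            "sender": sender_line[1:].strip() if sender_line.startswith("<") else "",
            "received": date_line[1:].strip() if date_line.startswith("@") else "",
        })
    return entries
```

Only the inbox page needs this parsing; serving an individual message is just returning the file from disk.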
Defining a storage format using a flat file with separators seems unnecessary and introduces a need to escape the separator if it's found in an incoming message. If I ever need something more complex than files in a folder I think I would just go straight to a database (e.g. sqlite) instead.
Having a public server that allows simple registration by other users would be a great boost for Misfin beyond the more technical crowd. I would be concerned about the potential legal risks of running something like that personally (e.g. GDPR requirements and the risk of potentially hosting discussions of illegal behaviour), but don't want to put you off as it would be a great service to have!
If you (or anyone) does start a public Misfin mail host, I had some thoughts in this post (around compromised certificates) that may (or may not!) be worth considering before going live:
— bbs.geminispace.org/u/billsmugs/3289
🚀 clseibold · Sep 27 at 17:34:
@billsmugs Having multiple files for emails is fine if you want to have to manage naming all of those files and putting them in their own directories for each user and then keeping track of the file names, but I feel having a single file is simpler and does not make the parser that much more complicated (it added a few lines of code). I personally don't think gembox is that complicated, but you're right that the divider needs to be something that won't appear in regular messages.
The sender and timestamp, afaik, are not guaranteed to be the first two lines, because you can have multiple senders if you are receiving a forwarded message, and you can have multiple timestamps based on how many times a user has received that message, as per the spec.
🚀 clseibold · Sep 27 at 18:55:
@billsmugs I wanted to also mention that the divider I chose for gemboxes is the exact string `<=====\n`. The `<` means that it could be conflated with a Sender line, which is a good thing because then readers that don't support gemboxes will either error out on an invalid address syntax, or they will just ignore this invalid line. But for readers that support gembox, they can easily split the file based on these and then handle each gemmail separately. Using the `<` *because* it was already a gemmail linetype was intentional for these reasons.
So the big question is how do we escape the gemmail linetype specifiers (`<`, `@` and `:` at the start of a line).
📷 billsmugs · Sep 27 at 19:31:
Using the '<' as the start of the divider is a clever solution, but I'm still not sure what the gembox format gains you in exchange for the effort? I don't keep track of any file names for messages, they have the received date/time at the start of the file name to quickly sort them but when a user requests their inbox page the server just works off the list of all files in the mailbox directory.
Having a single flat file per user does make migrating to another server easier I suppose (the user can just grab all of their messages in one go and theoretically import them into another server/client), but I'm not sure that's worth having to split the file up programmatically each time the user views their inbox?
In order to display a message listing, you only need to know the address you received the message from and the date/time it was received, which will always be the first two lines of the file I think? (Other instances of <,: and @ inside the message from forwards/replies etc could be handled by a client to render them differently, but they don't need to be parsed to display an inbox).
🚀 clseibold · Sep 27 at 19:33:
> In order to display a message listing, you only need to know the address you received the message from and the date/time it was received, which will always be the first two lines of the file I think? (Other instances of <,: and @ inside the message from forwards/replies etc could be handled by a client to render them differently, but they don't need to be parsed to display an inbox).
I don't think this is true. What I get from the examples in the spec is that the latest sender is always the last sender line, not the first. Same with timestamps. The latest timestamp is the last timestamp, not the first.
Although, it depends on if you are talking about the original sender or the person who forwarded the email to you, I suppose.
This makes sense to me because the lines are always in the order, from earliest/first to latest/last. The first sender is the first sender line, whereas the last sender is the last sender line.
You are also not guaranteed to get the timestamp within the first two lines. My current server appends timestamps after all sender lines, for example. If I wanted to switch this, then I would need to refactor my parser a bit.
As for the gembox format, it was modelled after the mbox format. mbox dates well back to Multics, probably even CTSS, well before Unix v1. I assume they chose the mbox format because it was simple to set permissions, it was just one file that had to be read, and it was fairly easy to parse, and one could move messages from/to different mbox files. It's also easier to transport mbox files. These are pretty much all of the reasons I chose to do gembox as well.
Gembox is certainly an optional thing. Servers don't need to support appending to gemboxes, so I think it's good that there are different ways of storing messages for different servers.
As for readers, supporting both individual gemmails and gemboxes allows for servers to pick and choose which they find best for their setup without worrying about having unsupported readers. These two options, individual gemmails, and a flat file format like gembox, pretty much support all the options that one could need for mail storage, imo.
🚀 clseibold · Sep 27 at 19:52:
@billsmugs I still think you're overestimating the effort it took me to make a gembox parser. It took like 2 minutes to make, lol. I mostly just wrap around my already-made gemmail parser, detect `<=====` to split stuff, and then pass each split piece to the gemmail parser. You could probably even do what I did in even less time and with even less effort by using strings.Split in golang.
Although, you do have a point in splitting up a file each time the inbox is opened. If the gembox file ends up having like thousands of gemmails in it, then I could see how this might make it slow, but modern computers are pretty darn fast, so I'd have to test to see how slow it actually would be for big files like this.
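For reference, the split-on-divider idea can be sketched in Python as below. This is not the Go code discussed in the thread: `split_gembox` is a hypothetical name, the divider is matched only as a whole line, and empty pieces are dropped because the thread does not say whether the divider comes before or after each message. The timing harness at the bottom is one way to check how slow a large gembox actually is; no result is claimed here.

```python
import time

SEPARATOR = "<=====\n"  # the exact divider string given in the thread


def split_gembox(text: str) -> list:
    """Split gembox text into individual gemmail strings."""
    messages, current = [], []
    for line in text.splitlines(keepends=True):
        if line == SEPARATOR:
            # Divider line: close the current message, if any.
            if current:
                messages.append("".join(current))
                current = []
        else:
            current.append(line)
    if current:
        messages.append("".join(current))
    return messages


if __name__ == "__main__":
    # Build a synthetic gembox with 10,000 small messages and time the split.
    one = "< a@a.com Alice\n@ 2023-09-27T00:00:00\n# Subject\nBody text\n"
    big = SEPARATOR.join([one] * 10_000)
    start = time.perf_counter()
    parts = split_gembox(big)
    elapsed = time.perf_counter() - start
    print(f"split {len(parts)} messages in {elapsed:.4f} s")
```

Each returned piece would then be handed to an ordinary gemmail parser.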
🦀 jeang3nie · Sep 27 at 22:24:
re binary data, the spec is pretty clear that's a non starter and you should just send a link to a file.
One could always work around that restriction using base32 or base64 encoding, however.
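A minimal sketch of the base64 workaround, assuming the encoded text is simply placed in the message body as wrapped lines. The helper names and the 76-character line width are illustrative choices; nothing in the spec defines an attachment convention.

```python
import base64


def encode_attachment(data: bytes, width: int = 76) -> str:
    """Encode binary data as base64 text wrapped to fixed-width lines."""
    encoded = base64.b64encode(data).decode("ascii")
    lines = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "\n".join(lines)


def decode_attachment(text: str) -> bytes:
    """Reverse encode_attachment, ignoring line breaks and other whitespace."""
    return base64.b64decode("".join(text.split()))
```

The receiving side would need to know, by some out-of-band agreement, which lines of the message are encoded data.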
📷 billsmugs · Sep 27 at 23:01:
@clseibold Having read my own messages back, I'm worried I've come across as antagonistic and/or dismissive about the gembox format, which wasn't my intention! The ease of transferring/backing up entire mailboxes and setting file permissions are definitely advantages of your system over mine that hadn't occurred to me (UNIX permissions are something I keep meaning to do more reading about in general as I know they are quite powerful but have very little knowledge of how they work and what they can do) and you're probably right that the performance impact is negligible in reality.
With regards to forwarding and sender line ordering, if Alice sends a message to Bob's server, which auto-forwards it on to me, I would expect to see the message in my inbox start with the following, with my server adding the first two lines and Bob's server adding the last two:
< b@b.com Bob
@ 2023-09-27T00:00:01
< a@a.com Alice
@ 2023-09-27T00:00:00
The example in the spec for sender lines omits timestamps but looks like it matches this order:
< development@mailing-lists.com Development mailing list
< source@example.com Source user
I understood this as Source user sending a message to the mailing list and I then receive a message from the mailing list address.
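The Alice-to-Bob-to-me example above corresponds to each receiving server prepending a sender line and a timestamp line. A small Python sketch of that reading, with a hypothetical `stamp` helper and the timestamp passed in as a plain string:

```python
def stamp(message: str, sender: str, timestamp: str) -> str:
    """Prepend a sender line and a received-timestamp line to a gemmail."""
    # Assumption: sender line first, then its timestamp, as in the example
    # above; the thread leaves open whether the spec requires this pairing.
    return f"< {sender}\n@ {timestamp}\n{message}"


original = "# Hello\nMessage body\n"
# Bob's server receives the message from Alice and stamps it.
at_bob = stamp(original, "a@a.com Alice", "2023-09-27T00:00:00")
# The final server receives the forwarded copy from Bob and stamps it again.
at_me = stamp(at_bob, "b@b.com Bob", "2023-09-27T00:00:01")
```

With this approach the most recent hop always ends up on top, so the first sender line is the address the message was most recently received from.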
🚀 clseibold · Sep 27 at 23:06:
@billsmugs I didn't read it as antagonistic, so don't worry about that :)
Also, you're absolutely correct and I misread the spec. I will have to change my server code to prepend rather than append. This is a simple change.
The thing I still have a question about is if timestamps are grouped together in reverse order like the senders are, or if the sender lines and timestamp lines are intermingled, like so:
@ Final Destination Timestamp
< Mailinglist
@ Mailinglist Timestamp
< Source user
Or if it's supposed to be more like the following:
< Mailinglist
< Source user
@ Final Destination Timestamp
@ Mailinglist Timestamp
If it's the first one, then I will have to make major changes to my parser, because currently my parser does the second option.
Personally, I don't like that the senders are in reverse order. I feel the source user should always be at the top, not as the last sender line. For mailinglists, you want the source user as the real sender, but for forwarding emails, you want the last sender (the forwarder) as the real sender, I suppose. Hm....
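One way to sidestep the grouped-versus-intermingled question is a reader that collects sender and timestamp lines independently, in file order. The sketch below assumes that approach; `collect_headers` is a hypothetical name, and it scans the whole message rather than stopping at the first body line.

```python
def collect_headers(message: str):
    """Return (senders, timestamps) in the order they appear in the message."""
    senders, timestamps = [], []
    for line in message.splitlines():
        if line.startswith("<"):
            senders.append(line[1:].strip())
        elif line.startswith("@"):
            timestamps.append(line[1:].strip())
    return senders, timestamps
```

Both layouts shown above produce the same two lists, so only the writing side has to commit to one ordering.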
In the Python implementation I wrote a gembox folder class that uses the MH mailbox format. Honestly it's much nicer because you don't have to deal with indexes and stuff.
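For readers unfamiliar with MH, the format is a directory holding one numbered file per message. The class below is a minimal sketch of that idea for gemmail text, not the class from the Python implementation; the name `MHFolder` and its methods are assumptions for illustration.

```python
from pathlib import Path


class MHFolder:
    """MH-style folder: each message is a file named 1, 2, 3, ..."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.mkdir(parents=True, exist_ok=True)

    def _numbers(self):
        # Only purely numeric file names count as messages.
        return sorted(int(p.name) for p in self.path.iterdir() if p.name.isdecimal())

    def add(self, message: str) -> int:
        """Store a message under the next free number and return that number."""
        numbers = self._numbers()
        number = numbers[-1] + 1 if numbers else 1
        (self.path / str(number)).write_bytes(message.encode("utf-8"))
        return number

    def messages(self):
        """Yield (number, text) pairs in numeric order."""
        for number in self._numbers():
            yield number, (self.path / str(number)).read_bytes().decode("utf-8")
```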
I've got a misfin gemini frontend working where you can register, read your inbox and send messages. It was easier than I expected.
I don't like to output the cert and key in the page, but Lagrange has this nice feature where it parses a cert/key pair and imports it. The other way would be to have temporary (time-limited or something) URLs that let you download your certificate. Is there any standard way of doing this with the gemini protocol?
Misfin Server Ideas — I believe I have written a basic solo-mailbox server in golang. It is running now, so people should be able to test it at my same misfin address (clseibold@auragem.letz.dev). I wanted to outline some ideas that I have for the server: 1. I want it to support both solo-mailbox and multi-mailbox setups. 2. An interesting idea came up when I compared the spec to gemini. Gemini has this proxy ability, I believe, so that it could actually proxy other gemini servers. This led...