💾 Archived View for gemi.dev › gemlog › 2023-10-15-numbering-madness.gmi captured on 2024-12-17 at 09:49:47. Gemini links have been rewritten to link to archived content


Help wanted: Recovering the actual message numbers from the Mailing List archive

2023-10-14 | #gemini #mailinglist | @Acidus

I've gotten some nice feedback on my Gemini-first archive of the Gemini Mailing List that I released a few days ago. I even implemented some suggestions like masking email addresses.

🛰🦊❤️ Orbital Fox Redux: Complete Mirror of Gemini Mailing List

However, there was one thing I really wanted to do that I was not able to accomplish: Using the same message numbering scheme. You see, Orbital Fox hosted a public HTML archive of all the mailing list messages. Each message on the mailing list was given a number and was accessible via a URL like this:

Format:
https://lists.orbitalfox.eu/archives/gemini/[YEAR]/[6 DIGIT NUMBER].html

Example:
https://lists.orbitalfox.eu/archives/gemini/2019/000046.html

Threads were represented in the HTML interface using nested lists, with links to each specific message.

Wayback machine archive of Orbital Fox's HTML threaded view for 2019

These hyperlinks to specific messages were SUPER IMPORTANT! They allowed people on the list to include hyperlinks to previous discussions or decisions when someone would ask questions or propose changes. They also help you track how ideas evolved over time. If I could use the same message numbering as Orbital Fox did, then people could still follow these hyperlinks to the message that an author was referencing.

In other words, if I could use the same numbering scheme, then the message:

https://lists.orbitalfox.eu/archives/gemini/2019/000046.html

would be available in my archive at, say:

gemini://gemi.dev/gemini-mailing-list/2019/000046.gmi

I could even *rewrite* references in the archive to point to my Gemini links, so the reference would be preserved! That would be awesome and really help the reading experience!
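As a rough sketch of what that rewriting could look like, assuming the numbers did line up: the Orbital Fox URL pattern is the one shown above, while `GEMINI_BASE` and the `.gmi` layout are only the "say" example from this post, and `rewrite_links` is a made-up name for illustration.

```python
import re

# Matches Orbital Fox archive links such as
# https://lists.orbitalfox.eu/archives/gemini/2019/000046.html
ORBITAL_FOX_LINK = re.compile(
    r"https://lists\.orbitalfox\.eu/archives/gemini/(\d{4})/(\d{6})\.html"
)

# Assumption: destination layout borrowed from the example URL above.
GEMINI_BASE = "gemini://gemi.dev/gemini-mailing-list"


def rewrite_links(text: str) -> str:
    """Replace every Orbital Fox message link in text with a Gemini link."""
    return ORBITAL_FOX_LINK.sub(
        lambda m: f"{GEMINI_BASE}/{m.group(1)}/{m.group(2)}.gmi", text
    )
```

This only helps if the 6-digit number in the Gemini URL refers to the same message as the 6-digit number in the original URL, which is exactly the problem.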

Numbering Madness

So, just use the same numbers from Orbital Fox's HTML interface, right? That can't be hard. Only it is, because the way messages were numbered in Orbital Fox's HTML archive is just madness and I can't seem to figure it out.

I assumed that these message numbers were assigned, starting with 000000, to each message on the mailing list, based on when the message was received. Open this Wayback machine version of the Threaded view from 2019:

Wayback machine archive of Orbital Fox's HTML threaded view for 2019

The first 4 messages in the first thread of the mailing list ("Let's get this list started") use the numbers 000000, 000001, 000002, and 000004. Where is message 000003? Well, message 000003 is the first message of the 2nd thread (the absolutely insane read that starts as "Text reflow woes (or: I want bullets back!)"), since it was sent before the 4th message in the first thread (which is message 000004).

So, that seems to match what I expected: Message numbers just increment. So all I need to do is sort the messages by date sent, and then assign them incremental numbers starting at 000000 right? No.
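For reference, the naive scheme I had in mind would look something like the sketch below. `Message` and `naive_numbering` are hypothetical names, and the record is deliberately minimal; real messages would come from parsing the mbox files. It numbers a single year's messages, since I don't know for sure how Orbital Fox handled year boundaries.

```python
from datetime import datetime
from typing import NamedTuple


class Message(NamedTuple):
    # Hypothetical minimal record for illustration only.
    sent: datetime
    subject: str


def naive_numbering(messages):
    """Sort by sent date and hand out 000000, 000001, ... in that order."""
    ordered = sorted(messages, key=lambda m: m.sent)
    return {f"{i:06d}": msg for i, msg in enumerate(ordered)}
```

As it turns out, this does not reproduce Orbital Fox's numbers.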

To see why, look at this Wayback machine copy of Orbital Fox's HTML archive's "date" view, which shows you all messages in a year, in the order they were received:

Orbital Fox's HTML date view for 2019

That *should* simply be a list of messages, starting at 000000.html, increasing by 1 each message. Only it's not. There are 289 messages in 2019. But the message number of the last message is 000293. Wait, what? Things seem fine up until message 000120 ("CGI suport for Gemini" from solderpunk). The message after that is 000125 ("Text format proposal (was Re: Text reflow woes (or: I want bullets back!)y)" from Jason McBrayer). After 000125 comes... 000124? And after that comes message 000126?!?! 🤬 🤬 🤬 Where are messages 000121, 000122, or 000123? Yeah, they don't seem to exist. These jumps and out-of-order numbers happen multiple times in the 2020 and 2021 archives too.
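To make hunting for these anomalies less painful, here is a small sketch of a checker. It assumes you have already pulled the message numbers out of the date view, in listed order, as plain integers; `audit_numbering` is a made-up helper name, not part of any existing tool.

```python
def audit_numbering(numbers):
    """Report unused numbers and places where the sequence steps backwards.

    numbers: message numbers as ints, in the order the date view lists them.
    """
    seen = set(numbers)
    # Numbers between the smallest and largest that never show up.
    missing = [n for n in range(min(numbers), max(numbers) + 1) if n not in seen]
    # Adjacent pairs where the later entry has a smaller number.
    out_of_order = [
        (prev, cur) for prev, cur in zip(numbers, numbers[1:]) if cur < prev
    ]
    return missing, out_of_order


# The stretch described above: 000120, 000125, 000124, 000126
missing, out_of_order = audit_numbering([120, 125, 124, 126])
print(missing)       # [121, 122, 123]
print(out_of_order)  # [(125, 124)]
```

The totals tell the same story: 289 messages numbered from 000000 with no gaps would end at 000288, so a last number of 000293 means at least five numbers went unused, assuming no number was assigned twice.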

Help Needed!

After banging my head against the wall for several hours I was getting nowhere:

Other random thoughts:

So, please, if you were involved in the early mailing list, or have any ideas on how I can map all the 6-digit numbers used by Orbital Fox's HTML interface to the actual messages I've extracted from the mbox files, PLEASE let me know!