💾 Archived View for bbs.geminispace.org › s › misfin › 20645 captured on 2024-12-17 at 15:02:31. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

misfin link specification proposal

So after some discussion in IRC as well as careful reading of the relevant parts of RFC 3986, here's what we've come up with:

misfin:// links must not be used like mailto: links to trigger the client composing a message. This is because the // portion indicates that the address is the *authority* portion of the link, which is accurate when a client is making a misfin:// request to a server but not when the address is simply being passed to another program.

I understand this is inconvenient since quite a few people (including myself) are currently using misfin:// links on their gemini sites, but this is explicitly disallowed by the URI spec so we should probably make the change.

Therefore, when placing misfin address and/or message information in a url format, the scheme misfin: should be used without a following double slash. This indicates that the address(es) which follow are part of the url's 'path'.

The addresses can be followed by a query portion which begins with a '?' character. The query is a percent-encoded version of the message text which programs handling the url may use to prefill subject line and text body, as is common with mailto: links.

There are no query parameters, however; nothing may be placed in the query which is not message text. Also, since fragment semantics are based on media type alone and are always scheme-independent, there must not be a # symbol in the url (unless the symbol is percent encoded, of course).

Note that including addresses in a misfin link is actually optional; it would be perfectly valid to write this:

misfin:?%23%20Checking%20in%0D%0AIt%27s%20been%20a%20while%20-%20are%20you%20doing%20alright%3F

In this case the handler is simply given no information about the recipients of this message and the user will have to provide this after following the link.
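A minimal Python sketch of the proposed link format, for illustration only. The function names are hypothetical, and the comma delimiter is an assumption, since the delimiter is still an open question at this point in the post. It builds a misfin: link with the addresses in the path and the percent-encoded message in the query, and parses one back while rejecting the misfin:// form and an unencoded '#'.

```python
from urllib.parse import quote, unquote, urlsplit

DELIMITER = ","  # assumption: comma, as in mailto:; not yet settled in the proposal


def build_misfin_link(addresses, message=""):
    """Build a misfin: link: addresses in the path, message text in the query."""
    link = "misfin:" + DELIMITER.join(addresses)
    if message:
        # safe="" percent-encodes every reserved character, including '#' and '?'
        link += "?" + quote(message, safe="")
    return link


def parse_misfin_link(link):
    """Return (addresses, message) for a misfin: compose link."""
    parts = urlsplit(link)
    if parts.scheme != "misfin":
        raise ValueError("not a misfin: link")
    if link.lower().startswith("misfin://"):
        raise ValueError("misfin:// puts the address in the authority, not the path")
    if "#" in link:
        raise ValueError("an unencoded '#' is not allowed")
    addresses = [a for a in parts.path.split(DELIMITER) if a]
    # The whole query is message text; there are no named parameters to split out.
    return addresses, unquote(parts.query)


message = "# Checking in\r\nIt's been a while - are you doing alright?"
print(build_misfin_link([], message))
print(parse_misfin_link("misfin:clseibold@auragem.letz.dev,mail@satch.xyz?Hello%20there"))
```

The first print reproduces the address-less example link above; the second returns both addresses and the decoded text "Hello there".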

What I'd like to hear from you:

How should the addresses be delimited? I'm leaning towards going with a comma since that's what mailto: uses, but in theory we can choose from any of these:

! $ & ' ( ) * + , ; =

Is there anything misfin: links ought to be able to handle that this proposal doesn't allow for?

Of course, any sort of comments you might have on this are welcome.

Once I hear from everyone here who has not had the chance to weigh in on IRC I will make the necessary changes to the specification or redraft as appropriate.

Also, any interested parties should feel free to join us on IRC:

libera.chat at ##misfin

RFC 3986; see sections 2.2, 3.1, 3.3

Posted in: s/misfin

🐐 satch

Oct 10 · 2 months ago · 👍 vi, lux, clseibold


🚀 clseibold · Oct 10 at 17:33:

Wanted to confirm that this syntax is parseable in Golang using the following code:

URL, _ := url.Parse("misfin:clseibold@auragem.letz.dev,mail@satch.xyz?message+body")
fmt.Printf("%s\n", URL.Scheme)
fmt.Printf("%s\n", URL.Opaque) // Split this on commas
fmt.Printf("%s\n", URL.Path) // Should always be empty
fmt.Printf("%s\n", URL.RawQuery)

In Golang, URIs that use "://" will use Host + Path fields, whereas ones that use just ":" (e.g., mailtos) will just use the Opaque field for everything between the Scheme and the Query.

Note that you will have to parse the addresses within the Opaque field yourself.

☯️ johano · Oct 10 at 17:52:

interesting you mention the URL spec... I noticed yesterday that when clicking a URL formatted as "misfin:johano@hashnix.club" Lagrange opened it and filled in "hano@hashnix.club" in its message dialog (so clearly expecting the double slash as part of the URL)

🚀 clseibold · Oct 10 at 18:13:

@johano Yeah, URLs that use "://" are handled differently by libraries from URLs that just use ":" (like mailto links). Before, we were all using "://", and since this change is new, Lagrange would be expecting our older way of doing the links with "://".

🦂 zzo38 · Oct 10 at 18:58:

I still do not intend to use Misfin, and I still think there are some problems. The specification of Misfin addresses in certificates still has not been improved, and it is still restricted to Unicode, as well as some other problems.

However, about address delimiting, I agree with you that a comma should be used. (Message-IDs should also need to be handled. MD5 isn

🐐 satch [OP] · Oct 10 at 19:36:

@clseibold here were my tests for rust’s `url` and python’s `urllib.parse`:

Rust:

use url::Url;

fn main() {
    let test_url = "misfin:some,path,with,commas";
    let parsed = Url::parse(test_url).unwrap();
    println!("Path: {}", parsed.path());
}

Prints `Path: some,path,with,commas`.

Python:

from urllib.parse import urlparse

test_url = "misfin:some,path,with,commas"
parsed_url = urlparse(test_url)
print("Path:", parsed_url.path)

Prints `Path: some,path,with,commas`.

☯️ johano · Oct 11 at 00:23:

@clseibold aha, gotcha

🕹️ skyjake [mod...] · Oct 11 at 11:09:

I'm a bit unclear on the rationale for having multiple recipients. Why is this necessary?

The client would need to make multiple requests, sending the message separately to each recipient, via a separate connection to each Misfin server, right?

@johano I'll need to fix Lagrange's URL parsing when it comes to Misfin and make it use the full-fledged parsing mode. Currently it bypasses that and just assumes "misfin://" is at the start, the rest being the recipient.
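A small Python sketch of the behavior described here, for illustration only. This is not Lagrange's code: naive_recipient is a hypothetical stand-in for "assume misfin:// is at the start", and recipient_from_link is one possible scheme-aware alternative that accepts both the old and the new link forms.

```python
from urllib.parse import urlsplit


def naive_recipient(link):
    # Hypothetical shortcut: assume the link starts with "misfin://"
    # and drop that many characters without checking.
    return link[len("misfin://"):]


def recipient_from_link(link):
    # Scheme-aware parsing: the old misfin:// form carries the address in the
    # authority (netloc), the new misfin: form carries it in the path.
    parts = urlsplit(link)
    if parts.scheme != "misfin":
        raise ValueError("not a misfin link")
    return parts.netloc or parts.path


print(naive_recipient("misfin:johano@hashnix.club"))        # hano@hashnix.club
print(recipient_from_link("misfin:johano@hashnix.club"))    # johano@hashnix.club
print(recipient_from_link("misfin://johano@hashnix.club"))  # johano@hashnix.club
```

Dropping nine characters from a link that only has a seven-character "misfin:" prefix eats the first two letters of the mailbox, which matches the "hano@hashnix.club" result reported earlier in the thread.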

🐐 satch [OP] · Oct 11 at 12:02:

@skyjake

I'm a bit unclear on the rationale for having multiple recipients. Why is this necessary?

Group threads. Having multiple recipients is built into misfin as far back as the earliest revisions of the (B) spec which I discussed with ~lem.

You're right about the separate connections. Skylab doesn't have great support for multiple recipients yet, but it does work, just with a few kinks.

I honestly don't remember what the kinks are right now, just that they exist.
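To make the "separate connections" point concrete, here is a rough Python sketch of how a client might turn a recipient list into one delivery per address. The Delivery type and plan_deliveries function are hypothetical names, and nothing here is taken from Skylab or any other client; actually opening the connection and sending the request is left out.

```python
from typing import NamedTuple


class Delivery(NamedTuple):
    host: str       # server to connect to
    recipient: str  # the single mailbox this request targets


def plan_deliveries(addresses):
    """Return one Delivery per recipient; each needs its own request."""
    deliveries = []
    for address in addresses:
        mailbox, sep, host = address.rpartition("@")
        if not sep or not mailbox or not host:
            raise ValueError(f"not a mailbox@host address: {address!r}")
        deliveries.append(Delivery(host=host, recipient=address))
    return deliveries


for delivery in plan_deliveries(["clseibold@auragem.letz.dev", "mail@satch.xyz"]):
    print(delivery.host, delivery.recipient)
```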

🐐 satch [OP] · Oct 11 at 12:08:

@skyjake Can you please deprecate misfin:// links in favor of misfin: so people are forced to switch?

🕹️ skyjake [mod...] · Oct 11 at 13:14:

@satch Yes, I can do that in the next Lagrange patch.

Having multiple recipients is built into misfin as far back as the earliest revisions of the (B) spec which I discussed with ~lem.

You are referring to the gemmail format (section 4)? This is probably where my confusion stems. As far as I understand, gemmail is used for storing messages in the serverside mailbox. However, actual Misfin requests are always targeting a single recipient, so you'd have to include the gemmail line types in sent messages as well, to let members of the group discussion be aware that multiple people are involved?

Is the official position that gemmail is not only a mailbox file format but also the format for actual messages being transmitted from the client to the server? If so, nothing would prevent the client from including fraudulent/bogus header lines in the message?

I was under the impression that the server adds the gemmail-specific lines to messages after they have been received and are being written to a mailbox file.

🚀 clseibold · Oct 11 at 13:35:

@skyjake It's technically both, but as I understand it, the gemmail format was not really required for storing messages, but one suggested way of storing them, and it's also for sending them. The recipient server adds the headers, but if a person is forwarding a message to another person, then they keep the existing headers to send with the message to another server, and that server will add the headers for the last sender and timestamp. This is *necessarily implied* when the misfin(B) spec says you can have multiple sender and timestamp lines, but the spec also explicitly uses the language for "forwarding" and retaining the original headers during forwarding too.

This example of forwarding is explicitly listed in the misfin(B) spec, too.

Yes, this does mean someone can spoof headers :D This was listed as a problem on our misfin(C) page, and it was a big concern for my mailinglist implementation, which is where the biggest problem is with spoofing these headers.

I fixed this for mailinglists by having them reject mails that are forwarded (that have headers already). As for private messages, if you trust the person who is forwarding a message to you, then you trust the headers that come with it, imo. The same applies with mailinglist "federation".

Here's Section 4.1:

Sender lines should be added by the server when saving or retransmitting a message. If a message is forwarded, the original sender line should be preserved and sent alongside, so the final recipient will see both senders:

It suggests that the original sender line should be preserved when you are forwarding a message, which necessarily implies that the gemmail format was also used in transmission.

Timestamps in 4.3 have a similar thing:

Like sender lines, timestamp lines should be added by the receiving mailserver, and only sent if forwarding a message, in which case they should be left as-is.

Finally, the last example in 4.5 shows a gemmail with two sender lines.

As for misfin(C), I believe we made a clear distinction between the format that is stored (which we don't specify in the spec, afaik) and the format that is sent in transmission.

Just to be clear, as I understand it, the recipient server always adds the headers. You only have headers in transmission when you forward mail. Clients should not be adding headers, and they will only include headers in transmission if they are forwarding from a previous server that added headers.
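The mailinglist rule mentioned above (reject mail that already carries headers) could be sketched roughly like this in Python. This is an assumption-laden illustration, not the actual mailinglist code: looks_forwarded is a hypothetical name, and the sketch assumes gemmail sender lines start with "<" and timestamp lines start with "@". Recipient lines are deliberately not checked, since the sender may include them.

```python
# Assumption: gemmail sender lines begin with "<" and timestamp lines with "@".
HEADER_PREFIXES = ("<", "@")


def looks_forwarded(message):
    """True if the message text already contains sender or timestamp lines."""
    return any(line.startswith(HEADER_PREFIXES) for line in message.splitlines())


fresh = "# Hello\nJust checking in."
forwarded = "< alice@example.org Alice\n@ 2024-10-11T13:35:00Z\n# Hello\nJust checking in."

print(looks_forwarded(fresh))      # False
print(looks_forwarded(forwarded))  # True
```

A mailinglist using a check like this would refuse the second message, since its sender and timestamp lines were not added by the list's own server.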

☯️ johano · Oct 11 at 13:48:

@skyjake yes I figured that was the case, not a bug per se, simply a different design choice that is now being superseded

🚀 clseibold · Oct 11 at 13:50:

@johano No, lem's design choice isn't being superseded. If you read the spec, the misfin(B) spec explicitly allows you to send headers during transmission when you are forwarding mails. In fact, it requires it. I have quoted the relevant parts above. (This is also still the case in misfin(C).)

If a message is forwarded, the original sender line should be preserved and sent alongside, so the final recipient will see both senders:

🚀 clseibold · Oct 11 at 13:58:

Did y'all read the same spec I read? :P

I pored over every word of this B spec to make sure I understood it, especially when we designed misfin(C). While some parts are not as well explained as they could be, I think it's fairly simple to understand and read, and some things are necessarily implied.

Recipients of course should be included in gemmails in transmission, or all members of a group wouldn't be able to reply to everyone.

Forwarding mails should also have headers to know the transmission path, or at the very least the original author, of a gemmail.

Receiving servers add headers so that the very last sender and timestamp cannot be spoofed, but yes, all of the previous ones can, which is frankly a very hard problem to solve in general, and isn't worth solving.

🕹️ skyjake [mod...] · Oct 11 at 14:18:

Thanks for the clarifications. I can certainly see how the intention was that this information is passed server-to-server when forwarding, etc.

Spoofable headers are a serious issue. It basically means that none of the header lines in a received Misfin message can be trusted. Even if you trust the sender, a software bug or human error may occur and you end up with false information.

@clseibold johano was referring to the "misfin://" vs. "misfin:" URI parsing, which was producing incorrect behavior in Lagrange.

Did y'all read the same spec I read?

I'm sure we did; however, when it comes to reading implied things between the lines, people see things differently. IMO, clients should not be allowed to include any header lines in the message if they cannot be verified the way the sender identity can. When it comes to multiple recipients, the recipient group should IMO be managed serverside only, so clients can't send invalid info.

One solution would be to have a different request type for server-to-server transmission, where header lines are permitted. One would somehow have to ascertain that only real Misfin servers can perform those requests, though.

🚀 clseibold · Oct 11 at 14:25:

@skyjake Oh, yeah, the URL links originally were "misfin://", and I think lem started that and then we all continued it, but it's not technically in the spec or the best practices, afaik. What is in the spec is the URL format used in the requests, which can be argued as a different thing.

As for the implications in the spec, while I agree that reading between the lines can be missed sometimes, I think it's much easier when the implications are of logical necessity, and that's what I meant by necessarily implied. By logical necessity, senders and timestamps and recipients have to be sent over in transmission.

I do not agree that forwarding is server-to-server by logical necessity. This implies the nonexistence of things like IMAP and POP3 (GMAP).

🚀 clseibold · Oct 11 at 14:30:

Server-to-server forwarding would also not work if a server doesn't have access to every private key of every mailbox.

I believe the reference implementation and all of the current implementations have a clear separation between client and server where the client always sends and forwards mail, and the server only receives (with exception to mailinglists).

I believe clients can also verify the existence of mail addresses. Servers are not the only ones that can do this.

When I mentioned spoofing is too hard to solve, I didn't mean the verifying of the existence of misfin addresses, which is easy, but verifying that an address actually sent a mail, which is a different thing.

🐐 satch [OP] · Oct 11 at 14:40:

I’m not totally clear on the conversation being had here but I want to address the concern of spoofed header lines.

There’s a big difference between the sender line and the recipient line. The sender line should always be added by the receiver. There is necessarily one sender of a given message. It cannot be spoofed. (Except in situations where the message is forwarded, which just means that the information you’re receiving should only be trusted as much as you trust the person who sent it. This is pretty normal and not a security concern IMO.)

The recipient line lists addresses the sender claims the message is being sent to. It’s a structured way for the sender to say who the recipients were. They can choose whether or not to include the recipients in the message, and they can absolutely lie and say they sent it to people that they did not.

In my view, this is in keeping with the baseline for these kinds of things. It’s a high bar to put on a message protocol that people sending messages should not be able to lie about these things. I also can’t see how this could be exploited to do anything except include some people in the thread who originally weren’t there.

I suppose uneducated users of misfin could be tricked into thinking that someone is in the loop in a conversation when they really aren’t. But as soon as someone who’s being tricked sends a gemmail, it would alert the person whose address is erroneously being included in the recipients.

One way we could resolve this is to have servers who receive a message send an empty gemmail to the other addresses on the recipients line to confirm receipt. I don’t feel that this is necessary but if you all are concerned, it seems like an OK solution.

🚀 clseibold · Oct 11 at 14:47:

If we care a lot about spoofing and encryption, then LATSSIAM is another protocol mk270 designed that I think takes these things into account much better, afaik. It might be worth looking at.

— Latssiam Protocol

However, LATSSIAM is much more complicated because encryption can be complicated, especially if you are dealing with verifying every single header line (not sure if LATSSIAM even does this). Not even regular email does this (although, yes, regular email is a very low bar when you can barely even verify the last sender, lol).

I agree with satch. Recipients lines are hardly a problem, imo. A person could include a recipient to one person of a group, but not the other, but then that second person wouldn't reply to the included person. If a recipient is added to a group email for everyone, then everyone can see it.

As for senders and timestamps: verifying the existence of mailboxes is easy. Verifying that a mail was actually sent by that address is much harder, and could create a lot of complicated network traffic and significantly increase the complexity of implementation, imo, because now every single server that receives a mail has to send to every server in the headers asking if they sent a mail. That increases exponentially! Yikes!

But also, we have no way of identifying mails either, because there are no message ids. So how would we ask servers whether a mailbox sent a mail? So we need to now add that. This also means these servers have to now keep track of every sent mail.

There could be a way to simplify all of this with encryption, however, which is why I suggest looking into LATSSIAM.

🕹️ skyjake [mod...] · Oct 11 at 14:57:

Thanks @satch and @clseibold for the further clarifications. I must admit my thinking was centered very much around Misfin(B) and I had not fully realized that the Misfin(C) draft addresses much of this already. Apologies!

I will need to fix Lagrange so that it follows the C spec correctly and includes the mandated three metadata lines in the message. This is not done currently. (Lagrange only switches to C when the content is too long to fit in B.)

🚀 clseibold · Oct 11 at 14:58:

Sorry y'all if I miss anything or if I add things that are missed. Gemini is a pull system, not a push system, so it's a bit hard, for me at least, to have conversations like this, lol.

🚀 clseibold · Oct 11 at 15:00:

@skyjake It's no problem. Misfin(C) doesn't fix all of the spoofing stuff, but I think it does make the misfin(B) implications more explicit, imo. It could be argued that it is one interpretation of misfin(B), with changes, lol.

If you have ideas on fixing the spoofing issue that we haven't come up with, I would very much like to hear them, only because I couldn't come up with any solution that is simple/easy or doesn't create a lot of traffic.

We did discuss message IDs and senders list spoofing. For message IDs, we didn't want to add them because it added even more changes to misfin(C) and we were trying to be conservative with the changes we made. As for spoofing, we couldn't come up with good solutions.

— Misfin C Change Reasonings Document

🚀 clseibold · Oct 11 at 15:46:

@skyjake Note that the misfin(C) spec that mk270 created has a couple of omissions, like verification messages. I cannot upload an edited document, because satch controls the document atm. Otherwise I would have this added like right this very second, because I'm usually very impatient with these types of things.

☯️ johano · Oct 11 at 16:05:

@clseibold sorry, crossed lines of communication... I was referring to changing to "misfin:" vs "misfin://"

🚀 clseibold · Oct 11 at 16:20:

@johano Yeah, sorry. I misunderstood.

🕹️ skyjake [mod...] · Oct 11 at 16:24:

@satch:

How should the addresses be delimited?

Answering the original question in the thread, I agree commas are a good choice. It is consistent with mailto/email and the intuitive way to write lists (e.g., the English language).

😺 gemalaya · Oct 18 at 16:17:

Thank you @satch for this proposal. I've made the changes in gemalaya. I think commas for delimiting addresses are ideal.

I wonder about the format of the query string. mailto: links use named query parameters (subject=, body=, cc=, etc.). For misfin maybe we don't really need to pass anything other than a message body in the query string? Using named query params has some advantages if we want to pass other values in the future.
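To illustrate the difference being discussed, here is a short Python sketch that reads a mailto-style query with named parameters and a misfin: query that is nothing but percent-encoded message text, as the proposal describes. The function names and example addresses are made up for this sketch.

```python
from urllib.parse import parse_qs, unquote, urlsplit


def mailto_fields(link):
    """mailto style: the query is a set of named parameters."""
    return parse_qs(urlsplit(link).query)


def misfin_body(link):
    """Proposed misfin: style: the whole query is the message text."""
    return unquote(urlsplit(link).query)


print(mailto_fields("mailto:someone@example.org?subject=Hello&body=How%20are%20you%3F"))
# {'subject': ['Hello'], 'body': ['How are you?']}
print(misfin_body("misfin:someone@example.org?How%20are%20you%3F"))
# How are you?
```

With the bare-body form there is no slot for a future value, which is the trade-off raised here; with named parameters every handler has to agree on the parameter names instead.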