Unicode vs. the World

🗣️ From: Björn Wärmedal (bjorn.warmedal (a) gmail.com)
📅 Sent: 2020-12-17 07:39
📧 Message 20 of 34

How does a client handle a link like the following:
=> essays/why-spaces-are-%20-in-URLs.gmi

The assumption here is that the author has not percent-encoded the link
themselves -- this is the actual filename, %20 and all.

How can the client tell whether it's percent-encoded or not? If you start
by decoding it, you distort the filename. If you just assume it isn't
percent-encoded and go ahead and encode it, you will handle this link
correctly but break any links that are already percent-encoded. I've
only done this in Python, using the urllib.parse library. I can tell
it to encode or decode, but it will do what I tell it to without
exception. It's up to me to build logic that avoids breaking the edge
cases.
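
To make the dilemma concrete, here is a minimal sketch using quote and
unquote from urllib.parse. The first path is the filename from the link
above; the second path (essays/my%20notes.gmi) is a hypothetical
already-encoded link made up for illustration. This only demonstrates the
ambiguity and is not a proposed fix.

```python
from urllib.parse import quote, unquote

# The literal filename from the link above: "%20" is part of the name itself.
literal_name = "essays/why-spaces-are-%20-in-URLs.gmi"

# Option 1: assume the link is already encoded and decode it.
# This distorts the filename, because "%20" turns into a space.
decoded = unquote(literal_name)
print(decoded)         # essays/why-spaces-are- -in-URLs.gmi

# Option 2: assume the link is raw text and encode it.
# The "%" becomes "%25", so decoding later restores the real filename.
encoded = quote(literal_name)
print(encoded)         # essays/why-spaces-are-%2520-in-URLs.gmi
print(unquote(encoded) == literal_name)   # True

# The same "always encode" rule breaks a link the author encoded already.
# (Hypothetical path, assumed here only for illustration.)
already_encoded = "essays/my%20notes.gmi"
double_encoded = quote(already_encoded)
print(double_encoded)  # essays/my%2520notes.gmi

# Decoding once now yields a name containing a literal "%20", not a space.
print(unquote(double_encoded))            # essays/my%20notes.gmi
```

Both calls succeed without complaint in every case, so nothing in the
string itself tells the client which interpretation the author meant.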

We can decide to *always* percent-encode links in gemtext (as the spec
states now) or to *never* do it, but I don't see how we can reasonably
have both. And never doing it means we can never link to a file with
spaces in the URL, and will have to percent-decode anything we copy
and paste from a web browser's address bar. There will be extra work for
authors either way.

Consider another hypothetical case:
=> teddybearoftheyear.com/vote?ew0k%20The%20Great Vote for me!

How would you solve that?
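
Part of the difficulty can be sketched in Python, assuming the gemtext
rule that the URL in a link line ends at the first whitespace and the
rest is the label. The helper split_link_line is a hypothetical name, and
raw_line is an invented unencoded variant of the link above. The sketch
shows what a client would see, not how to resolve the ambiguity.

```python
def split_link_line(line):
    """Split a gemtext link line into (url, label); label may be empty."""
    if not line.startswith("=>"):
        raise ValueError("not a link line")
    # Everything up to the first whitespace is the URL, the rest is the label.
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        raise ValueError("link line has no URL")
    url = parts[0]
    label = parts[1] if len(parts) > 1 else ""
    return url, label


# The link exactly as written above.
encoded_line = "=> teddybearoftheyear.com/vote?ew0k%20The%20Great Vote for me!"
print(split_link_line(encoded_line))
# ('teddybearoftheyear.com/vote?ew0k%20The%20Great', 'Vote for me!')

# Hypothetical variant with raw spaces in the query (assumed for illustration).
raw_line = "=> teddybearoftheyear.com/vote?ew0k The Great Vote for me!"
print(split_link_line(raw_line))
# ('teddybearoftheyear.com/vote?ew0k', 'The Great Vote for me!')
```

With raw spaces the query is cut short and the remainder lands in the
label. With %20 the split works, but the client still cannot tell whether
the query means "ew0k The Great" or the literal text "ew0k%20The%20Great".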

However much I *want* to have IRIs and IDNs in gemtext and leave the
work to clients and servers, I don't have a solution for that as an
implementer.

Cheers,
ew0k

---

Previous in thread (19 of 34): 🗣️ colecmac (a) protonmail.com (colecmac (a) protonmail.com)

Next in thread (21 of 34): 🗣️ Sean Conner (sean (a) conman.org)