💾 Archived View for rawtext.club › ~nervuri › journal › 2022-02-20_on-data-uris.gmi captured on 2022-04-29 at 11:39:59. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

On data URIs

Lagrange v1.11 will support automatically displaying inline images embedded as "data:" URIs. Skyjake's post details several benefits that data URIs may bring to Gemini:

gemini://skyjake.fi/gemlog/2022-02_our-old-friend-the-data-url.gmi

But while reading it, I couldn't stop a few thoughts from looping in the back of my mind:

Anyway, the most popular (I think) Gemini client is close to setting the expectation that images will be rendered inline if included in this way. Will clients that don't support this become second-class citizens? Well, maybe not. In a follow-up post, Skyjake notes that this feature will be disabled by default and that data URIs will only be auto-displayed if smaller than 8 KB:

gemini://skyjake.fi/gemlog/2022-02_re-holes-and-diversity.gmi

Skyjake also writes:

My hope is that the Gemini specification will set a limit for link line URI length for non-Gemini schemes as it has for the Gemini scheme. Defining an arbitrary universal limit may prevent the proper use of some schemes, but in the context of Gemtext this seems reasonable.

I agree that the specification should limit the scope for abuse. Limiting URI length is one idea; another is to add a few broad rules such as:

Alternatively, a rule specifically targeting data URIs could prohibit their inline rendering or limit it by maximum size and/or allowed MIME types.
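To make such a rule concrete, here is a rough sketch of how a client might apply it. The names, the MIME allowlist and the choice to measure the decoded payload (rather than the length of the URI itself) are my assumptions for illustration. The 8 KB threshold and the off-by-default switch mirror what Skyjake describes, but this is not Lagrange's actual code.

```python
import base64
from urllib.parse import unquote_to_bytes

# Hypothetical policy values, not taken from any specification.
MAX_INLINE_BYTES = 8 * 1024
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "image/gif"}


def parse_data_uri(uri):
    """Split a data URI into (mime_type, payload_bytes), per RFC 2397."""
    if not uri.lower().startswith("data:"):
        raise ValueError("not a data URI")
    header, sep, payload = uri[5:].partition(",")
    if not sep:
        raise ValueError("data URI has no comma")
    params = header.split(";")
    # RFC 2397 defaults to text/plain when the media type is omitted.
    mime_type = params[0].strip().lower() or "text/plain"
    if any(p.strip().lower() == "base64" for p in params[1:]):
        # binascii.Error (a ValueError subclass) is raised on bad input.
        data = base64.b64decode(unquote_to_bytes(payload), validate=True)
    else:
        data = unquote_to_bytes(payload)
    return mime_type, data


def may_render_inline(uri, enabled=False):
    """Apply the hypothetical rule: opt-in, allowlisted type, small payload."""
    if not enabled:
        return False
    try:
        mime_type, data = parse_data_uri(uri)
    except ValueError:
        return False
    return mime_type in ALLOWED_MIME_TYPES and len(data) < MAX_INLINE_BYTES
```

A client that prohibits inline rendering altogether would simply never pass enabled=True. Anything that fails the check can still be shown as an ordinary link.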

Adding such things to the specification is perhaps inelegant, but it is better than leaving these holes unplugged.
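The URI length limit could be checked while parsing link lines. A minimal sketch follows. The 1024-byte figure is the limit the Gemini protocol already places on request URIs. Applying it to every scheme in gemtext links is the hypothetical rule, and the function names are made up for the example.

```python
# 1024 bytes is the Gemini protocol's limit for request URIs; extending it
# to all schemes in link lines is an assumption made for this sketch.
MAX_LINK_URI_BYTES = 1024


def parse_link_line(line):
    """Return (uri, label) for a gemtext link line, or None if it is not one."""
    if not line.startswith("=>"):
        return None
    parts = line[2:].strip().split(maxsplit=1)
    if not parts:
        return None
    uri = parts[0]
    label = parts[1] if len(parts) > 1 else ""
    return uri, label


def link_within_limit(line):
    """True if the line is a link whose URI fits the length limit."""
    parsed = parse_link_line(line)
    if parsed is None:
        return False
    uri, _ = parsed
    return len(uri.encode("utf-8")) <= MAX_LINK_URI_BYTES
```

Under such a limit, a base64 data URI could carry only roughly 750 bytes of payload, which rules out all but the tiniest images.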

However, irrespective of the specification, feature creep will be hard to stop. I mean, this sounds reasonable:

I mentioned this is a "minor" change, and here's why: Lagrange has supported decoding data URLs since the very first release (v0.1), thanks to the gemini.conman.org client torture test. Since all "image/*" responses can now potentially become inlined when opened, the new feature is that data URLs, whose data is already in memory, can be inlined whenever the user wants them to be.

If the client supports image rendering and the image is already in memory, why not take the final step and render it inline? Ah, there goes that thought loop again...

I've opened an issue @ GitLab for further discussion:

https://gitlab.com/gemini-specification/gemini-text/-/issues/16

I'll end with Solderpunk's reaction upon learning of data URIs:

https://lists.sr.ht/~adnano/gemini/%3C3c2683fe-243d-4f22-a4f1-6e79f5031fc9%40email.android.com%3E#%3C20200529093520.GB2354@SDF.ORG%3E

...let's not take delight in actively trying to crush the spirit of this thing? I mean, the web is right there if that's what you want.
Mostly I was just annoyed at learning that these data:// URLs exist. I hadn't been aware of them. And frankly...what the hell? Who thought it was a good idea to turn a harmless way to indicate *where* some data can be found into a way to shoehorn in the data itself? What are we supposed to use in a context where we don't want that to be a possibility? Bah!
I tried *so* hard to avoid this, but you just can't. This data:// URL thing is a monster. No RFC puts a limit on the allowed length of a URL. And the data:// scheme includes MIME types. So they are a vehicle for arbitrary content of arbitrary size. Turns out this entire time Gemini - and even its hypothetical deliberately stripped-back cousin Mercury - has allowed embedding of inline images, audio and videos. It's just impossible to avoid this stuff unless you throw out URLs entirely and start with something else, in which case the effort to learn the spec and implement it skyrockets because you can't leverage existing knowledge and code. I was naive to think the internet was made out of little do-one-thing-and-do-it-well components you could compose in novel ways to build things of deliberately limited scope. Turns out it's made of a few massively overpowered blobs and it's impossible to build anything small.
Yes, alright, to some extent abuse of this loophole will be limited by the fact that all this content has to be downloaded in a single file in a single transaction before any of it can be displayed, so the user experience will be miserable if there is more than just a little bit of embedded content. We won't quite end up as bad as the web. But still! This is just unspeakably frustrating.

https://lists.sr.ht/~adnano/gemini/%3C7754AECC-1076-4ABA-ABE2-3B4F7FA0DA33%40gmail.com%3E#%3C20200529154517.GC28953@SDF.ORG%3E

The issue is that the history of the web demonstrates that the most powerful/inclusive interpretation of a spec tends to become the only acceptable implementation over a long enough timeline. Everybody builds their content for that interpretation, and more conservative clients come to be considered "broken". It's like trying to surf the modern web with cookies and JS turned off: nothing works. The only hope is to design specs where the most powerful interpretation is within acceptable limits. Which seems to me to be impossible in a world where URLs can be harmless pointers to network resources *or* arbitrarily large chunks of data of arbitrary but unambiguous type.
In that crazy world, our only hope is a strong cultural norm of "No, don't do that!". It's true that maybe that will work better for Gemini than it did for the web, because, you know, the web is actually there alongside Gemini and people who really want the worst of the web will just stick with it and leave us alone.
But I really didn't want to just rely on politely asking people not to do certain things, but to make it impossible or very difficult to do them at the protocol level. I know you can never *really* do that, people can ignore RFCs and implement totally broken stuff and the internet police don't come and arrest them. But I had hoped we could get really close to that ideal.