Titan, the proposed upload protocol for Gemini

Alex Schroeder <alex (a) gnu.org>

Hi all
People on the IRC channel asked me to send an email to the list
regarding the Titan protocol I've been using for my Gemini Wiki
project.

Background

I'm really interested in wikis. They seem to me to be the simplest form
of collaboration in text. Simple both because you don't need a lot
of technology to make it work, and also simple because all of the
structure, all of the processes, are things that the software doesn't
provide: it's up to people to organize themselves. I wrote a bit more
about it on my site, if you're interested.
https://alexschroeder.ch/wiki/2020-06-15_Why_Wiki%e2%80%bd
gemini://alexschroeder.ch/2020-06-15_Why_Wiki%e2%80%bd

At first, I proposed something I called Gemini+Write, and implemented
it for my site, and for a Gemini Client I was using. It was inspired by
my idea for Gopher Wiki. I know, I know. I keep bashing on that wiki
idea. Anyway, the idea is simple: to *read* from a site, you open a
connection and send a selector. To *write*, you do the same and then,
on subsequent lines, you send more stuff like MIME-type, file-size,
password, and so on. It wasn't pretty.
https://alexschroeder.ch/wiki/2017-12-30_Gopher_Wiki
gemini://alexschroeder.ch/2017-12-30_Gopher_Wiki
https://alexschroeder.ch/wiki/2020-06-04_Gemini_Upload
gemini://alexschroeder.ch/2020-06-04_Gemini_Upload

In any case, Sean Conner picked up on the idea, and started a long
thread back in June. It went a lot further than what I had in mind: it
mimicked HTTP methods (GET, PUT, POST), discussed the differences
between PUT and POST, and proposed some form of authentication that
used client certificates to allow users to edit just their pages.
https://lists.orbitalfox.eu/archives/gemini/2020/001611.html
https://lists.orbitalfox.eu/archives/gemini/2020/001657.html

Matthew Greybosch proposed to name this protocol "Titan" in one of the
replies to Sean Conner's post.
https://lists.orbitalfox.eu/archives/gemini/2020/001615.html

There was some resistance to the idea, and my mail setup was apparently
causing some users a bit of grief, and for both of these reasons I
didn't get involved in the discussion. I wanted to get a Gemini Wiki
working and that was all I cared about.

Titan

Sean Conner's Titan proposal had some ideas I liked and some I didn't
care for. The shortest summary is the following, from the second of his
posts linked above:

  titan://example.com/post-handler/endpoint?size=1234&mime=text/plain
  titan://example.com/path/to/new/resource;size=1234&mime=text/plain
  titan://example.com/path/to/remove;size=0

  The logic goes something like this [2]:

  if the request has a query, it's an upload of data---accept data.
  if the request has no query, and the path parameter (marked by ';')
    doesn't exist---error.
  if the request has no query, and the path parameter exists:
    if size==0, delete the resource
    if size>0, accept data and make the resource available.
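
The decision logic quoted above can be sketched in a few lines of Python.
This is only an illustration: the function name classify_request and the
returned labels are made up here, and treating a missing or non-numeric
size as an error is an assumption the quoted text does not spell out.

```python
import re
from urllib.parse import urlsplit


def classify_request(url: str) -> str:
    """Return 'upload', 'create', 'delete' or 'error' for a titan:// URL."""
    parts = urlsplit(url)
    if parts.query:
        # A query means data is being sent to a handler: accept data.
        return "upload"
    # Path parameters are whatever follows the first ';' in the path.
    _, marker, raw_params = parts.path.partition(";")
    if not marker:
        return "error"
    params = {}
    for item in re.split("[;&]", raw_params):
        key, _, value = item.partition("=")
        params[key] = value
    size = params.get("size", "")
    if not (size.isascii() and size.isdigit()):
        # Assumption: a missing or malformed size is treated as an error.
        return "error"
    return "delete" if int(size) == 0 else "create"
```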

I decided that what I really needed was just the second example he
provided:

titan://example.com/path/to/new/resource;size=1234&mime=text/plain

I also needed some form of authorization (are you allowed to edit the
site) and I didn't really care for authentication (are you the person
you claim to be). I imagined my Gemini Wiki working without necessarily
identifying users. I needed some sort of token, a kind of password. My
Oddmuse wiki has used this system since 2003: there are editor and admin
passwords, and knowing one of these passwords allows you to do things
reserved to editors and admins, but these passwords aren't tied to
usernames. They're basically just tokens. You can pass them to a friend
and say: here, join us!

Thus, for my purposes I need the following:

titan://example.com/path/to/new/resource;size=1234;mime=text/plain;token=hello

Some people used to web URIs might be wondering: where is the question
mark between path and query? The point is that this isn't a query. If
you look at the URI RFC 3986 section 3.3, you'll see what I mean.

"URI producing applications often use the reserved characters allowed
in a segment to delimit scheme-specific or
dereference-handler-specific subcomponents. For example, the semicolon
(";") and equals ("=") reserved characters are often used to delimit
parameters and parameter values applicable to that segment. The comma
(",") reserved character is often used for similar purposes. For
example, one URI producer might use a segment such as "name;v=1.1" to
indicate a reference to version 1.1 of "name", whereas another might
use a segment such as "name,1.1" to indicate the same. Parameter types
may be defined by scheme-specific semantics, but in most cases the
syntax of a parameter is specific to the implementation of the URI's
dereferencing algorithm."
https://tools.ietf.org/html/rfc3986#section-3.3

So that's how I think about it. Let's look at this example again:

titan://example.com/path/to/new/resource;size=1234;mime=text/plain;token=hello

We want to create a new revision of "resource" and the necessary
parameters, as defined by the Titan protocol, are size, MIME-type, and
token.
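
As a rough sketch of how a server might take such a URL apart, the
following Python splits the path from its semicolon-delimited
parameters. The function name parse_titan_url is hypothetical, falling
back to port 1965 (the Gemini default) is an assumption, and for
simplicity everything after the first ";" in the path is treated as
parameters.

```python
from urllib.parse import unquote, urlsplit


def parse_titan_url(url: str):
    """Split a titan:// URL into (host, port, path, params)."""
    parts = urlsplit(url)
    if parts.scheme != "titan":
        raise ValueError("not a titan URL: " + url)
    # The first piece is the resource path, the rest are key=value pairs.
    path, *raw_params = parts.path.split(";")
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key] = unquote(value)
    # Assumption: fall back to the Gemini default port if none is given.
    return parts.hostname, parts.port or 1965, unquote(path), params
```

Called with the example above, this yields the path
/path/to/new/resource and the three parameters size, mime and token as
strings; converting size to a number is left to the caller.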

Now, you might wonder: why size? The server needs to know when the
transmission of the client ends. Remember, if connectivity is bad,
content can arrive in chunks interspersed with silence. Is the client
done? In text based transmissions, we have more options. A Ctrl-D might
be used to indicate the end-of-file (EOF). In your ASCII man page it
might say "EOT (end of transmission)". In very old tools like Berkeley
mail you'd end your letter with a single dot on a line. The Gopher spec
also says that a single dot on a line ends the transmission. All of
these solutions won't work when we're talking about binary files,
however. And yes, I want a wiki where people can upload images, or
audio files, or videos, PDF files. Therefore, no particular byte or
pattern can be used. We need to tell the server how many bytes to
expect, and then the server reads exactly that many bytes, with a
timeout in case the transmission just isn't going to work.
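
Reading "exactly that many bytes, with a timeout" might look like the
following Python sketch. The name recv_exactly and the default timeout
of ten seconds are assumptions for illustration; note that the timeout
here applies to each individual read, not to the upload as a whole.

```python
import socket


def recv_exactly(sock: socket.socket, size: int, timeout: float = 10.0) -> bytes:
    """Read exactly `size` bytes, or raise if the peer stalls or hangs up."""
    # Each recv() call raises a timeout error if nothing arrives in time.
    sock.settimeout(timeout)
    chunks = []
    remaining = size
    while remaining > 0:
        # Data may arrive in pieces, so keep reading until we have it all.
        chunk = sock.recv(min(remaining, 4096))
        if not chunk:
            raise ConnectionError(f"connection closed with {remaining} bytes missing")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Because the loop never looks at the content, it works the same for text,
images, or any other binary data.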

You might also wonder: why MIME-type? Can't we just rely on file name
extensions? I guess we could. But why not use MIME-types? I've gotten
used to them. My wiki serves images called "foo" and tells your browser
via the MIME-type that it's an image/png. It allows my wiki to store
this MIME-type and serve it back in a response when clients request a
file. Otherwise I'd have to guess. I've had unhappy experiences with
/etc/mailcap files. All the programming languages do "something",
merging system mailcap files with user mailcap files and programming
language library mailcap files, and mailcap files are so much more than
just an association of file name extensions and MIME-types. They're a
nightmare. Compared to that complexity (which is often invisible to
developers, granted), I find explicit MIME-types easier to deal with.
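
A minimal sketch of what storing the MIME-type buys you: the server
keeps it next to the content and puts it into the Gemini response
header when the page is requested, with no guessing from the file name.
The in-memory dictionary and the function names are hypothetical
stand-ins for whatever storage a real wiki uses.

```python
# Hypothetical in-memory page store: page name -> (MIME type, content).
pages = {}


def save_page(name: str, mime: str, data: bytes) -> None:
    """Remember the content together with the MIME type the client sent."""
    pages[name] = (mime, data)


def gemini_response(name: str) -> bytes:
    """Build a Gemini response, using the stored MIME type as the meta field."""
    if name not in pages:
        return b"51 Not found\r\n"
    mime, data = pages[name]
    return f"20 {mime}\r\n".encode("utf-8") + data
```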

Upload

OK, so now we've talked about the request: The client sends
titan://example.com/path/to/new/resource;size=1234;mime=text/plain;token=hello
followed by the CR LF sequence we all know, and then it sends the 1234
bytes it promised to send. That's it.
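
Seen from the client, the request described above could be assembled
and sent roughly like this. Both function names are made up for this
sketch, and it assumes the server answers with a Gemini-style response
and then closes the connection. Setting verify=False skips certificate
checks, which only makes sense against a test server with a self-signed
certificate.

```python
import socket
import ssl


def build_titan_request(url: str, data: bytes, mime: str, token: str) -> bytes:
    """Return the request line (URL plus parameters, CR LF) followed by the data."""
    header = f"{url};size={len(data)};mime={mime};token={token}\r\n"
    return header.encode("utf-8") + data


def send_titan(host: str, port: int, request: bytes, verify: bool = True) -> bytes:
    """Send a prepared request over TLS and return whatever the server replies."""
    context = ssl.create_default_context()
    if not verify:
        # For self-signed test certificates only.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=10) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(request)
            # Assumption: the server closes the connection after replying.
            return tls.makefile("rb").read()
```

The size is computed from the data itself, so the number in the request
line always matches the bytes that follow.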

Now, the client-server interaction can be a bit tricky if you're writing a
client: how do you handle an error message after the first line?
Because there will be errors. My wiki has errors for resources that
aren't editable, sizes that exceed the max size I'm willing to accept,
MIME-types I'm not willing to accept, and tokens that are not correct.
There might be more. I'm using status code 59 (bad request) for them
all. I know that Baschdel has been arguing for better error codes. The
Baschdel proposal
argued for special error codes: ES for wrong size, EM for wrong MIME
type, and E_ for whatever else. Baschdel also argued that it would be
nice if a server replied with WR to say that it will in fact accept the
upload. I confess, I didn't really care for that and decided to write a
very simple implementation.
https://alexschroeder.ch/wiki/Baschdels_spin_on_Gemini_uploading
gemini://alexschroeder.ch/Baschdels_spin_on_Gemini_uploading
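
The kind of checks described above, all answered with status 59, might
look like this on the server side. The limits, the accepted MIME-types,
the token list and the wording of the messages are hypothetical values
chosen for illustration, not what any particular server uses.

```python
# Hypothetical server settings.
MAX_SIZE = 10_000
ALLOWED_MIME = {"text/plain", "text/gemini"}
TOKENS = {"hello"}


def check_upload(params: dict):
    """Return a Gemini error line, or None if the upload may proceed."""
    size = params.get("size", "")
    if not (size.isascii() and size.isdigit()):
        return "59 Missing or invalid size\r\n"
    if int(size) > MAX_SIZE:
        return f"59 Size exceeds {MAX_SIZE} bytes\r\n"
    if params.get("mime") not in ALLOWED_MIME:
        return "59 MIME type not accepted\r\n"
    if params.get("token") not in TOKENS:
        return "59 Token not accepted\r\n"
    return None
```

A client only has to look at the first two characters of the reply to
know that the upload was refused; the rest of the line is for humans.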

Request for Comments (RFC)

So there you have it. Uploading! Titan!

What do you think?

I'd love to see more clients supporting it. I have two independent
implementations to show how it works, both for servers and clients,
four in total:

My wiki uses server code that interfaces with the underlying Oddmuse
wiki engine to edit pages.
https://oddmuse.org/wiki/Gemini_Server
gemini://alexschroeder.ch/

My Gemini Wiki uses stand-alone code to just serve .gmi files as a
wiki, allowing users to create new ones and to edit existing ones. Note
that it is hosted on the same domain, but on a different port.
https://alexschroeder.ch/cgit/gemini-wiki/about/
gemini://alexschroeder.ch:1968/

My two Bash functions "gemini" and "titan" allow a super simple
interaction with Gemini (for reading) and Titan (for writing).
https://alexschroeder.ch/cgit/gemini-titan/about/

I also love Emacs and so I wrote some code that allows me to edit wiki
pages from Elpher.
https://alexschroeder.ch/cgit/gemini-write/about/

Anyway, let me know what you think. Let me know what you'd like to
implement for your servers and clients and what you'd need to add or
drop from this proposal.

Here's what I can see:

- the size parameter can be made optional if you only care about text
uploads
- the MIME-type can be made optional if you only care about Gemtext
- the token can be made optional if you want to rely on client
certificates

What else?

Hope to hear from you all,
Alex


Alex Schroeder <alex (a) gnu.org>

Upon reading my mail the next day I'd like to add a few points.
https://lists.orbitalfox.eu/archives/gemini/2020/002034.html


Titan, the name

I don't mind lel's Java client being called "Titan". After all, there
are many things that are called Titan, and Apache also named their
webserver httpd. And, like I said, originally I called the protocol
gemini+write... So, I don't know? I like the name Titan, of course.


gnutls-cli

Here's an example of me updating a page with the content of test.txt
using gnutls-cli:

(sleep 1; \
 echo "titan://alexschroeder.ch:1968/raw/Test;mime=text/plain;size=$(wc --bytes < test.txt);token=hello"; \
 cat test.txt) | \
 gnutls-cli --insecure localhost:1965

I'm using --insecure because of the self-signed certificate, and I'm
using sleep 1 because gnutls-cli (at least on my system) isn't
immediately ready to accept output.

The drawback with this solution is that there's plenty of gnutls-cli
info shown that you don't really care about.


openssl

Here's an example of me updating a page with the content of test.txt
using openssl:

echo "titan://alexschroeder.ch:1968/raw/Test;mime=text/plain;size=$(wc --bytes < test.txt);token=hello" \
  | cat - test.txt \
  | openssl s_client --quiet --connect alexschroeder.ch:1968 2>/dev/null

Here I can use the --quiet flag to reduce openssl's output, and as the
rest is printed on stderr I can redirect stderr to /dev/null in order
to make the output really quiet.


Gemini Wiki on the web

The Gemini Wiki also serves HTTP (although it's an extremely simple
HTTP server: no content negotiation). That's why you can visit
gemini://alexschroeder.ch:1968 on the web using
https://alexschroeder.ch:1968. Gemini Wiki installations just have to
make sure to use Let's Encrypt certificates and not self-signed ones,
because browsers are really picky and display all sorts of warnings.
Gemini Wiki only serves the web as a second-class citizen. There are no
HTML forms to edit pages from the web. That only works using the Titan
protocol.


Cheers
Alex


lel <lel (a) envs.net>

On Sat, Jul 04, 2020 at 11:09:18AM +0200, Alex Schroeder wrote:
> I don't mind lel's Java client being called "Titan". After all, there
lol I'm sincerely hoping mine is never found by anyone,
but Andrew J's looks polished and written with love!

best,
lel


colecmac@protonmail.com <colecmac (a) protonmail.com>

Thanks for making this post and explaining Titan!

I think it'd be great to eventually use client certificates for
authentication, but I definitely understand why you went with the
simpler token system for this initial implementation, it's a lot
easier to handle.

makeworld


Alex Schroeder <alex (a) gnu.org>

On Sat, 2020-07-04 at 15:17 +0000, colecmac at protonmail.com wrote:
> I think it'd be great to eventually use client certificates for
> authentication, but I definitely understand why you went with the
> simpler token system for this initial implementation, it's a lot
> easier to handle.

Oh, sure. I do think those are two different use cases, though:
authentication and authorization. It's not just easier to implement
(because I don't really understand the nitty gritty of TLS) but it also
involves a design goal of mine regarding anonymous collaboration.

That's how I understand client certificates, in any case:

If we're using temporary client certificates, then nothing is gained.
In a wiki context, we don't need to know about sessions. We don't need
to know that this edit was made by the same person that made the other
edit a few minutes ago. (And in fact, Gemini Wiki computes a changing
"code" for each contributor and stores that, hopefully allowing us to
discern visually whether two edits made a few minutes apart are
probably made by the same person without allowing us to know whether
two edits made days apart were made by the same person.)
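
One way such a changing code could work, purely as an illustration and
not a description of how Gemini Wiki computes it: hash the contributor's
network address together with the current date and a server-side
secret, and keep only a few characters. The function name, the
parameters and the choice of one day as the period are all assumptions.

```python
import hashlib
from datetime import date


def contributor_code(peer_address: str, day: date, secret: bytes) -> str:
    """Return a short code that is stable within one day and changes the next."""
    material = secret + day.isoformat().encode("ascii") + peer_address.encode("utf-8")
    # Only a short prefix is kept: enough to compare edits by eye.
    return hashlib.sha256(material).hexdigest()[:8]
```

Two edits from the same address on the same day get the same code. On
another day the code is different, and without the secret the two cannot
be linked.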

If we're using permanent client certificates, we've again gained
perpetual sessions which we don't need, or we're using the client
certificates to hand out capabilities: who is an admin, who is an
editor, who is a visitor, who is banned. We're using them for
authorization. I'd like to avoid thinking about that as part of Titan,
though. It's already part of Gemini. If anybody needs this, for example
to restrict people to uploading to their own area of a site, then
that's fine. That's already covered by Gemini, as far as I'm concerned.

For my use case, tokens straddle that middle ground of anonymous
editing, web of trust, friends of friends, and password protection. We
can pass tokens on to friends and friends of friends, and if it breaks
down, we will issue a new set of tokens. These will again spread from
person to person (or to anybody who knows where to look them up). That
is, in any case, how I would like it to be.

And if we really want, then of course we can tie tokens to users, make
them into bearer tokens like they are used in OAuth2: somewhere, you
make an account, you generate a token, and use it for your Gemini
client. On the original site, you can invalidate the token, or the
person running the server can invalidate the token, the token can be
associated with different capabilities, and so on. Tokens allow us to
go there... but we don't have to.

Anyway, Gemini Wiki has the notion of "spaces": separate wikis hosted
within the same software. These spaces could be tied to client
certificates, for example, allowing people to have personal wikis that
are effectively read-only for others. The software isn't there, yet,
but it's something on my radar, for sure. I'm expecting that this will
work without having to change anything about Titan because client
certificates are already part of Gemini, and if I go there, I might
decide that those spaces either don't need tokens, or I might decide
that within those spaces, the people with the right client certificate
are admins and get to generate and invalidate local tokens for their
friends and family. Then we can have both: authentication for admins,
and simple token-based authorization for collaborators.

Cheers
Alex


defdefred <defdefred (a) protonmail.com>

Hello,

Reading the thread, I just think that magic numbers are a nice way to
find the file format, too.
https://en.wikipedia.org/wiki/Magic_number_(programming)
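
A small sketch of the magic-number idea: compare the first bytes of an
upload against a few well-known file signatures. The table below covers
only a handful of formats, and the function name and the fallback type
are assumptions for illustration.

```python
# Well-known leading byte sequences and the MIME type they indicate.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def sniff_mime(data: bytes, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the first bytes, or return the default."""
    for magic, mime in MAGIC_NUMBERS:
        if data.startswith(magic):
            return mime
    return default
```

Plain text formats such as Gemtext have no signature, so a check like
this can complement an explicit MIME-type but not replace it.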

freD.

