The Guppy Protocol Specification v0.3

(I don't know if anyone is interested in this, but let's give it a try. It started as a thought experiment, when I looked for 'lighter' protocols I can implement on the Pico W. I see Spartan mentioned here and there, and I wonder if there's any interest in going even more ... spartan. If you find this interesting, useful or fun, and have ideas how to improve this protocol, I'd love to hear from you at my-first-name@dimakrasner.com!)

Overview

Guppy is a simple unencrypted client-to-server protocol for downloading text and for text-based interfaces that require upload of short input. It uses UDP and is inspired by TFTP, DNS and Spartan. The goal is to design a simple, text-based protocol that is easy to implement and can be used to host a "guplog" even on a microcontroller (like a Raspberry Pi Pico W, ESP32 or ESP8266) and serve multiple requests using a single UDP socket.

Requests are always sent as a single packet, while responses can be chunked and each chunk must be acknowledged by the client. The protocol is designed for short-lived sessions that transfer small text files, therefore it doesn't allow failed downloads to be resumed, and doesn't allow upload of big chunks of data.

Out-of-order transmission of chunked responses should allow extremely fast transfer of small textual documents, especially if the network is reliable. However, this requires extra code complexity, memory and bandwidth in both clients and servers. Simple implementations can achieve slow but reliable TFTP-like transfers with minimal amounts of code.

Changelog

v0.3:

v0.2:

(Response to feedback from slondr)

v0.1:

Sample Implementation

Sample server in Go, with out-of-order transmission of up to 8 response chunks, 512b each

Sample client in C, with buffering of 8 response chunks, up to 4K each

git clone -b guppy --recursive https://github.com/dimkr/gplaces
cd gplaces
make PREFIX=/tmp/guppy CONFDIR=/tmp/guppy/etc install
/tmp/guppy/bin/gplaces guppy://hd.206267.xyz

Terminology

"Must" means a strict requirement, a rule all conformant Guppy clients and servers must obey.

"Should" means a recommendation, something minimal clients or servers should do.

"May" means a soft recommendation, something good clients or servers should do.

URLs

If no port is specified in a guppy:// URL, clients and servers must fall back to 6775 ('gu').
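The port fallback can be sketched in a few lines of Python. This is only an illustration: the function name split_guppy_url and the constant DEFAULT_PORT are made up for this example, and it assumes Python's standard urllib.parse is good enough for guppy:// URLs.

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 6775  # fallback port named in the specification


def split_guppy_url(url):
    """Return (host, port, path, query) for a guppy:// URL (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "guppy":
        raise ValueError("not a guppy:// URL: " + url)
    if not parts.hostname:
        raise ValueError("URL has no host: " + url)
    # Fall back to the default port when the URL does not specify one.
    port = parts.port if parts.port is not None else DEFAULT_PORT
    return parts.hostname, port, parts.path, parts.query
```

For example, split_guppy_url("guppy://localhost/a") yields port 6775, while an explicit port in the URL is kept as-is.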

MIME Types

Interactive clients must be able to display text/plain documents.

Interactive clients must be able to parse text/gemini (without the Spartan := type) documents and allow users to follow links.

If encoding is unspecified via the charset parameter of the MIME type field, the client must assume it's UTF-8. Clients which support ASCII but do not support UTF-8 may render documents with replacement characters.

Download vs. Upload

In Guppy, all URLs can be (theoretically) accompanied by user-provided input. The client must give the user a way to send a request with user-provided input to the URL of any link line.

Server authors should inform users when input is required and describe what kind of input, using the link's user-friendly description.

If input is expected but not provided by the user, the server must respond with an error packet.

Security and Privacy

The protocol is unencrypted, and these concerns are beyond the scope of this document.

Limits

Clients and servers may restrict packet size, to allow slower but more reliable transfer. However, servers must transmit responses larger than 512 bytes in chunks of at least 512 bytes. If the response is less than 512 bytes, servers must send it as one piece, without continuation packets.
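A minimal sketch of how a server might split a response into chunks, in Python. It assumes the 512-byte figure refers to the data part of each packet (the specification does not say whether the header counts) and that the last chunk may be shorter than 512 bytes. The names split_response and MIN_CHUNK_SIZE are hypothetical.

```python
MIN_CHUNK_SIZE = 512  # minimum chunk size from the Limits section


def split_response(body, chunk_size=MIN_CHUNK_SIZE):
    """Split a response body (bytes) into the data parts of successive packets."""
    if chunk_size < MIN_CHUNK_SIZE:
        raise ValueError("chunks must be at least 512 bytes")
    if not body:
        return [b""]  # an empty response still needs one success packet
    # A body shorter than chunk_size naturally ends up in a single piece.
    return [body[i:i + chunk_size] for i in range(0, len(body), chunk_size)]
```

A 1200-byte body is split into pieces of 512, 512 and 176 bytes, while a 100-byte body is returned as one piece.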

Requests (the URL plus 2 bytes for the trailing \r\n) must fit in 2048 bytes.

Clients and servers should handle timeouts gracefully and must distinguish between a timeout and proper termination of the session.

Packet Order

Servers should transmit multiple packets at once, instead of waiting for the client to acknowledge a packet before sending the next one.

Servers may limit the number of packets awaiting acknowledgement from the client, and wait with sending of the next continuation packets until the client acknowledges some or even all unacknowledged packets.

The server must not treat an acknowledgement of packet n+1 as an acknowledgement of packet n: a lost continuation packet n still needs to be retransmitted.

Clients should re-transmit request and acknowledgement packets after a while, if nothing is received from the server.

Trivial clients may ignore out-of-order packets and wait for the server to retransmit any packet that was received but ignored, at the cost of slow transfer speeds.

Clients that receive continuation or end-of-file packets before the success packet should cache and acknowledge the packets, to prevent the server from sending them again and reduce overall transfer time.

Clients may limit the number of buffered packets and keep up to x chunks of the response in memory, when the server transmits many out-of-order packets. However, clients that save a limited number of out-of-order packets must leave room for the first response packet instead of failing when many continuation packets exhaust the buffer.

Clients should start displaying the response as soon as the first chunk is received.
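The buffering rules above could be sketched as follows, in Python. This is one possible design under stated assumptions, not the sample client's behaviour: the class name Reassembler, its methods and the max_buffered parameter are all hypothetical. One slot is always kept free for the packet the client needs next (the success packet, if it has not arrived yet).

```python
class Reassembler:
    """Buffers out-of-order chunks and releases them in sequence order."""

    def __init__(self, max_buffered=8):
        self.max_buffered = max_buffered
        self.next_seq = None  # unknown until the success packet arrives
        self.pending = {}     # sequence number -> data

    def on_success(self, seq, data):
        """Handle the success packet; return the chunks ready for display."""
        if self.next_seq is not None:
            return []  # duplicate success packet, already handled
        self.next_seq = seq
        self.pending[seq] = data
        return self._drain()

    def on_chunk(self, seq, data):
        """Handle a continuation or EOF packet; return (acknowledge, ready)."""
        if self.next_seq is not None and seq < self.next_seq:
            return True, []  # already displayed: acknowledge again
        if seq in self.pending:
            return True, []  # already buffered: acknowledge again
        if seq != self.next_seq and len(self.pending) >= self.max_buffered - 1:
            # Buffer is full apart from the reserved slot: ignore the packet
            # and let the server retransmit it later.
            return False, []
        self.pending[seq] = data
        return True, self._drain()

    def _drain(self):
        ready = []
        while self.next_seq is not None and self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
```

Chunks returned in the ready list can be displayed immediately. An empty chunk in that list corresponds to the end-of-file packet.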

Packet Types

There are 7 packet types: request, success, continuation, end-of-file, acknowledgement, redirect and error.

All packets begin with a "header", followed by \r\n.

TL;DR -

Requests

	url\r\n

The query part specifies user-provided input, percent-encoded.

The server must respond with a success, redirect or error packet.

The client may attempt re-transmission of a request packet if no response is received after a while and the server must ignore duplicate request packets.
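As an illustration, a client could build the request packet like this in Python. The helper name build_request is hypothetical. The sketch assumes the input is appended to the URL as a percent-encoded query, as in the examples later in this document, and enforces the 2048-byte limit from the Limits section.

```python
from urllib.parse import quote

MAX_REQUEST_SIZE = 2048  # URL plus the trailing \r\n, from the Limits section


def build_request(url, user_input=None):
    """Return the bytes of a request packet (hypothetical helper)."""
    if user_input is not None:
        # safe="" percent-encodes every reserved character, including "/".
        url = url + "?" + quote(user_input, safe="")
    packet = url.encode("utf-8") + b"\r\n"
    if len(packet) > MAX_REQUEST_SIZE:
        raise ValueError("request does not fit in 2048 bytes")
    return packet
```

With the URL guppy://localhost/a and the input "b c", this produces the packet guppy://localhost/a?b%20c\r\n.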

Success

	seq type\r\n
	data

The sequence number is an arbitrary number between 2 and 2147483647 (the maximum value of a signed 32-bit integer), followed by a space character (0x20 byte). A sequence number may begin with the digit 1: clients must not confuse success packets with sequence number 10 or 123 with error packets.

The type field specifies the response MIME type and must not be empty.

The server may send a chunked response, by sending one or more continuation packets.

Continuation

	seq\r\n
	data

The sequence number must increase by 1 in every continuation packet.

The client must ignore packets where the sequence number is not the sequence number of the previous packet plus 1.

If the client begins to display a textual response before the entire response has been received, it must not assume that the response is split on a line boundary: a long line may be sent in multiple response packets.

End-of-file

	seq\r\n

The server must mark the end of the transmission by sending a continuation packet without any data, even if the response fits in a single packet.

Clients must wait for the "end of file" packet, to differentiate between timeout, a partially received response and a successfully received response.
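The success, continuation and end-of-file packets of one response can be sketched together, in Python. The function name build_response_packets is hypothetical, and the sketch assumes the response was already split into chunks of bytes.

```python
def build_response_packets(first_seq, mime_type, chunks):
    """Return the success, continuation and EOF packets for one response."""
    if first_seq < 2:
        raise ValueError("sequence numbers 0 and 1 mean redirect and error")
    if not mime_type:
        raise ValueError("the type field must not be empty")
    # Success packet: "seq type\r\n" followed by the first chunk.
    packets = [b"%d %s\r\n" % (first_seq, mime_type.encode("ascii")) + chunks[0]]
    seq = first_seq
    for chunk in chunks[1:]:
        seq += 1  # the sequence number increases by 1 in every packet
        packets.append(b"%d\r\n" % seq + chunk)
    # End-of-file: a continuation packet without data, always sent.
    packets.append(b"%d\r\n" % (seq + 1))
    return packets
```

Called with 566837578, "text/gemini" and the chunks "# Title 1\n", "Paragraph 1" and "\n", it produces the four server packets of the "Success - Multiple Packets" example.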

Acknowledgement

	seq\r\n

The client must acknowledge every success, continuation or EOF packet by echoing its sequence number back to the server.

The server should wait for the client to acknowledge the previous chunk of the response (the success packet or the previous continuation packet) before sending the next continuation packet, to avoid waste of network bandwidth.

The server may attempt re-transmission of a success, continuation or EOF packet after a while, if it is not acknowledged by the client.

The client may attempt re-transmission of an acknowledgement packet and the server must ignore acknowledgement packets it's not waiting for.
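On the server side, acknowledgement handling could look roughly like the Python sketch below. The class AckTracker and its methods are hypothetical. Each packet is tracked on its own, so an acknowledgement of a later packet never covers an earlier one, and acknowledgements the server is not waiting for are ignored.

```python
class AckTracker:
    """Tracks packets of one session that still await acknowledgement."""

    def __init__(self):
        self.unacked = {}  # sequence number -> packet bytes

    def sent(self, seq, packet):
        """Remember a packet that was just transmitted."""
        self.unacked[seq] = packet

    def on_ack(self, packet):
        """Return True if the packet acknowledged something still pending."""
        header, sep, rest = packet.partition(b"\r\n")
        if not sep or rest or not header.isdigit():
            return False  # not an acknowledgement packet: ignore it
        # Unknown or duplicate acknowledgements are ignored.
        return self.unacked.pop(int(header), None) is not None

    def to_retransmit(self):
        """Return the packets to send again after a timeout, in order."""
        return [self.unacked[seq] for seq in sorted(self.unacked)]
```

A real server would also need a timer per session and a limit on the number of retransmissions, which this sketch leaves out.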

Redirect

	0 url\r\n

The URL may be relative.

The client must inform the user of the redirection.

The client may remember redirected URLs. The server must not assume clients don't do this.

The client should limit the number of redirects during the handling of a single request.

The client may forbid a series of redirects, and may prompt the user to confirm each redirect.
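Redirect handling on the client could be sketched like this, in Python. The names resolve_redirect, follow_redirects, MAX_REDIRECTS and the fetch callable are hypothetical, and the limit of 5 is an arbitrary choice. Python's urljoin() only resolves relative references for schemes it knows about, so the sketch borrows "http" while joining and assumes the request URL starts with guppy://.

```python
from urllib.parse import urljoin, urlsplit

MAX_REDIRECTS = 5  # arbitrary limit chosen for this example


def resolve_redirect(request_url, target):
    """Resolve a redirect target against the guppy:// URL that was requested."""
    if urlsplit(target).scheme:
        return target  # already an absolute URL
    joined = urljoin("http" + request_url[len("guppy"):], target)
    return "guppy" + joined[len("http"):]


def follow_redirects(url, fetch, max_redirects=MAX_REDIRECTS):
    """Call fetch(url) until it returns something other than a redirect.

    fetch is a hypothetical callable that returns ("redirect", target)
    or any other tuple describing the response.
    """
    for _ in range(max_redirects + 1):
        result = fetch(url)
        if result[0] != "redirect":
            return url, result
        # A real client must also inform the user of the redirection here.
        url = resolve_redirect(url, result[1])
    raise RuntimeError("too many redirects")
```

With the request URL guppy://localhost/a, the relative target /b resolves to guppy://localhost/b.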

Error

	1 error\r\n

Clients must display the error to the user.
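To show how a client might tell the server's packet types apart, here is a small Python sketch. The function parse_packet and the tuples it returns are hypothetical. It relies only on the layouts described above: a header of 0 or 1 followed by a space means redirect or error, any other number followed by a space means success, and a bare number means continuation or end-of-file.

```python
def parse_packet(packet):
    """Classify a packet received from the server (hypothetical helper)."""
    header, sep, data = packet.partition(b"\r\n")
    if not sep:
        raise ValueError("missing \\r\\n after the header")
    first, space, rest = header.partition(b" ")
    if not space:
        seq = int(first)  # raises ValueError if the header is not a number
        if not data:
            return ("eof", seq, b"")
        return ("continuation", seq, data)
    # Compare the whole field, so that 10 or 123 is not mistaken for an error.
    if first == b"0":
        return ("redirect", rest.decode("utf-8"))
    if first == b"1":
        return ("error", rest.decode("utf-8"))
    return ("success", int(first), rest.decode("utf-8"), data)
```

Only the first \r\n separates the header from the data, so data that itself contains \r\n is left untouched.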

Examples

Success - Single Packet

If the URL is guppy://localhost/a and the response is "# Title 1\n":

	> guppy://localhost/a\r\n
	< 566837578 text/gemini\r\n# Title 1\n
	> 566837578\r\n
	< 566837579\r\n
	> 566837579\r\n

Success - Single Packet with User Input

If the URL is guppy://localhost/a and input is "b c":

	> guppy://localhost/a?b%20c\r\n
	< 566837578 text/gemini\r\n# Title 1\n
	> 566837578\r\n
	< 566837579\r\n
	> 566837579\r\n

Success - Multiple Packets

If the URL is guppy://localhost/a and the response is "# Title 1\nParagraph 1\n":

	> guppy://localhost/a\r\n
	< 566837578 text/gemini\r\n# Title 1\n
	> 566837578\r\n
	< 566837579\r\nParagraph 1
	> 566837579\r\n
	< 566837580\r\n\n
	> 566837580\r\n
	< 566837581\r\n
	> 566837581\r\n

Success - Multiple Packets With Out of Order Packets

If the URL is guppy://localhost/a and the response is "# Title 1\nParagraph 1\n":

	> guppy://localhost/a\r\n
	< 566837578 text/gemini\r\n# Title 1\n
	< 566837579\r\nParagraph 1
	> 566837578\r\n
	< 566837579\r\nParagraph 1
	> 566837579\r\n
	< 566837579\r\nParagraph 1
	< 566837580\r\n\n
	> 566837580\r\n
	< 566837581\r\n
	> 566837581\r\n

Success - Multiple Packets With Unreliable Network

If the URL is guppy://localhost/a and the response is "# Title 1\nParagraph 1\n":

	> guppy://localhost/a\r\n
	< 566837578 text/gemini\r\n# Title 1\n
	> 566837578\r\n
	< 566837578 text/gemini\r\n# Title 1\n (acknowledgement arrived after the server re-transmitted the success packet)
	< 566837579\r\nParagraph 1
	< 566837579\r\nParagraph 1 (first continuation packet was lost)
	> 566837579\r\n
	< 566837580\r\n\n
	> 566837580\r\n
	> 566837580\r\n (first acknowledgement packet was lost and the client re-transmitted it while waiting for a continuation or EOF packet)
	< 566837581\r\n (server sends EOF after receiving the re-transmitted acknowledgement packet)
	< 566837581\r\n (first EOF packet was lost while server waits for client to acknowledge EOF)
	> 566837581\r\n

Redirect - Absolute URL

	> guppy://localhost/a\r\n
	< 0 guppy://localhost/b\r\n

Redirect - Relative URL

	> guppy://localhost/a\r\n
	< 0 /b\r\n

Error

	> guppy://localhost/search\r\n
	< 1 No search keywords specified\r\n

Response to feedback

Using an error packet to tell the user they need to request a URL with input mixes telling the user there was an error (e.g. something broke) with an instruction to the user

This is intentional: errors are for the user, not for the client. They should be human-readable.

This also means that for any URL that should accept user input, the author would need to configure the Guppy server to return an error, which is kind of onerous.

Gemini servers respond with status code 1x when they expect input but none is provided. This is a similar mechanism, but without introducing a special status code requiring clients to implement the "retry the request after showing a prompt" logic.

Using an Error Packet to signify user input:
* user downloads gemtext
...

There's a missing first step here: user follows a link that says "enter search keywords" or "new post", then decides to attach input to the request.