💾 Archived View for perplexing.space › 2020 › cgi-programming-in-2020.gmi captured on 2022-07-16 at 16:08:17. Gemini links have been rewritten to link to archived content


CGI Programming in 2020

2020-08-09

All the talk happening in Gemini-space about the slow internet, retro-computing, etc. got me thinking about a technology I've been curious about but never tried — CGI.

I can't help but harbor some doubt that commonly accepted "best practices" are overkill for the day-to-day sorts of computing I usually do. A stack like Python's Flask running under Gunicorn, reverse proxied by Nginx (commonly held up as "the right way to do things"), is a huge amount of overhead and infrastructure for simple web applications. Getting back to basics provides a nice opportunity to put these doubts to the test.

I recently signed up for SDF (which isn't yet in Gemini-space, but is active in the gophersphere) to try out an active pubnix:

SDF.org

I have been delighted with the experience of using a well administered Unix system. Things that have always seemed awkward or convoluted on my own like mail were effortless on SDF. Additionally there is support for CGI programs as part of the available web hosting. With this in mind I set about recreating a standard application I tend to remake every time I learn a new web technology — the pastebin.

I really like using SQLite and wanted to work out a pastebin using just a shell script and the SQLite command line, but got scared off by the possibilities around SQL injection and escaping query parameters. Instead I settled for Python and the built-in CGI and SQLite modules for query string parsing and parameterized queries.
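To illustrate why parameterized queries sidestep the injection problem that scared me off the shell-script approach, here is a small sketch using an in-memory database (the table and values are stand-ins, not the real paste.db):

```python
import sqlite3

# In-memory database standing in for paste.db (illustration only).
db = sqlite3.connect(":memory:")
db.execute("create table paste (id text, content text)")
db.execute("insert into paste (id, content) values ('abcd', 'hello')")

# A hostile query-string value that would break naive string interpolation.
evil_id = "' OR '1'='1"

# With a ? placeholder the value is passed as data, never parsed as SQL,
# so the injection attempt simply matches nothing.
rows = db.execute("select content from paste where id = ?", (evil_id,)).fetchall()
print(rows)  # []
```

Had the query been built by pasting `evil_id` directly into the SQL string, the `OR '1'='1'` clause would have matched every row instead.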

Implementation

The first part is a barebones "schema.sql" to create a SQLite database and populate a single default record:

schema.sql

drop table if exists paste;

create table paste (
  id      text,
  content text,
  added   datetime default (datetime('now'))
);

insert into paste (id, content) values ('default', 'not found');

Next is the "new.cgi" script for uploading pasted content to the database:

new.cgi

#!/usr/pkg/bin/python3.6

import cgi
import sqlite3
from random import randint

if __name__ == '__main__':
    c = sqlite3.connect('paste.db')
    text = cgi.FieldStorage().getvalue('text') or ""
    # Random four-digit hex id for the new paste.
    paste_id = ''.join([f"{randint(0,15):1x}" for _ in range(4)])
    # Expire pastes older than five days on every upload (no cron on SDF).
    c.execute("delete from paste where julianday('now') - julianday(added) > 5 and id <> 'default'")
    c.execute("insert into paste (id, content) values (?, ?)", (paste_id, text))
    c.commit()
    # Reclaim the space freed by the delete above.
    c.executescript("vacuum")
    c.close()
    url = f"http://cyanocitta.sdf.org/paste/fetch.cgi?id={paste_id}"
    print(f"Content-Type: text/plain; charset=UTF-8\nStatus: 303 See other\nLocation: {url}\n\n{url}")

I am reasonably happy with the result here. There are a few oddities that make it just interesting enough to discuss, first is the delete call. SDF does not support cron jobs for regular members, which was my initial idea for keeping the database size within reason. Rather than clean old pastes on a schedule I figured I could invoke the delete operation for each new paste. In this way there will never be more than 5 days of pastes at a time.
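The expiry logic works because julianday() returns a fractional day count, so subtracting two values yields an age in days. A quick sketch with an in-memory database and made-up ids shows the cleanup predicate keeping a fresh paste and dropping a backdated one:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""create table paste (
    id text, content text, added datetime default (datetime('now')))""")

# One fresh paste and one backdated six days (hypothetical data).
db.execute("insert into paste (id, content) values ('new1', 'recent')")
db.execute("insert into paste (id, content, added) "
           "values ('old1', 'stale', datetime('now', '-6 days'))")

# Same cleanup predicate as new.cgi: drop anything older than 5 days.
db.execute("delete from paste where "
           "julianday('now') - julianday(added) > 5 and id <> 'default'")

ids = [row[0] for row in db.execute("select id from paste")]
print(ids)  # ['new1']
```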

The vacuum call is a little unfortunate, due to the way SQLite persists on-disk, delete does not really resize the database until a vacuum call is made, during which the database file is moved around on disk. This is inherently slow compared to the bare insert that is the heart of things, but a "real-world" requirement to keeping things running with the limited resources available.
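The effect is easy to observe: after a delete, SQLite keeps the freed pages on an internal freelist and the file on disk stays the same size until vacuum rebuilds it. A sketch using a throwaway database file (paths are illustrative only):

```python
import os
import sqlite3
import tempfile

# Throwaway on-disk database (illustration only).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
db = sqlite3.connect(path)
db.execute("create table paste (id text, content text)")
db.executemany("insert into paste values (?, ?)",
               [(str(i), "x" * 1024) for i in range(1000)])
db.commit()

db.execute("delete from paste")
db.commit()
before = os.path.getsize(path)   # freed pages stay allocated after delete

db.execute("vacuum")             # rebuilds the file, releasing the space
after = os.path.getsize(path)
db.close()
print(before > after)  # True
```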

Lastly is the retrieval portion:

fetch.cgi

#!/usr/pkg/bin/python3.6

import cgi
import sys
import sqlite3

# Reopen stdout line-buffered with an explicit UTF-8 encoding.
sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf8', buffering=1)

if __name__ == '__main__':
    c = sqlite3.connect('paste.db')
    paste_id = cgi.FieldStorage().getvalue('id') or "default"
    text, *_ = c.execute(
        """select 'Content-Type: text/plain; charset=UTF-8' || char(10) || char(10) ||
           coalesce((select content from paste where id = ?), 'not found')""", (paste_id,)
    ).fetchone()
    c.close()
    sys.stdout.write(text)

Another small hiccup in the otherwise simple implementation: I haven't worked out exactly how the terminal encoding works for CGI programs on SDF. There seemed to be two options. The one I've used above sets the encoding on STDOUT using the sys module. The alternative is to set PYTHONIOENCODING as an environment variable when invoking Python, but that requires a second CGI program that only wraps the above, setting the variable before invocation.

There is no real reason to construct the HTTP response in SQL other than I thought it would be a funny twist on how web programming is usually done in Python.
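For contrast, the conventional version assembles the headers in Python: fetch the content with a plain query, then prepend the header block and blank line. A sketch with in-memory stand-in data:

```python
import sqlite3

def fetch(db, paste_id):
    # Plain parameterized lookup; the 'not found' fallback replaces
    # the coalesce() trick used in the SQL version.
    row = db.execute("select content from paste where id = ?",
                     (paste_id,)).fetchone()
    return row[0] if row else "not found"

def render(content):
    # CGI response: header line, blank line, then the body.
    return "Content-Type: text/plain; charset=UTF-8\n\n" + content

db = sqlite3.connect(":memory:")
db.execute("create table paste (id text, content text)")
db.execute("insert into paste values ('abcd', 'hello')")
print(render(fetch(db, 'abcd')), end="")
```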

Thoughts

I can't help but wonder how far CGI programming could take us in some of the mundane uses of the web. It wouldn't work well at humongous scale, but how many things _need_ to be big? Why not be satisfied with a smaller-scale version of the internet? The entire program doesn't _do_ much of anything, but it fulfills a purpose, and in this way it feels nice to have accomplished the task in a few dozen lines of code. Even nicer, there is no special configuration or "deployment": just ensure the files have a ".cgi" extension and I'm off to the races.

CGI isn't without rough edges, but it feels very much like the old web. It is the sort of thing that anyone might pick up and get running in a weekend. The Gemini protocol feels like HTTP in a way; not HTTP/2, which is hugely complicated, but approachable and comprehensible with a minimum of reading. I should probably try porting my new CGI program to Gemini. The experience has been so easy it might be elucidating to port it to a new protocol and platform.