#!/usr/bin/env perl
# Copyright (C) 2017–2020 Alex Schroeder

# This program is free software: you can redistribute it and/or modify it under
# the terms of the GNU Affero General Public License as published by the Free
# Software Foundation, either version 3 of the License, or (at your option) any
# later version.
#
# This program is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
# FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
# details.
#
# You should have received a copy of the GNU Affero General Public License along
# with this program. If not, see .

=encoding utf8

=head1 Phoebe

Phoebe serves a wiki as a Gemini site.

It does two and a half things:

=over

=item It's a program that you run on a computer and other people connect to it
using their L in order to read the pages on it.

=item It's a wiki, which means that people can edit the pages without needing an
account. All they need is a client that speaks both L and L, and the password.
The default password is "hello". 😃

=item People can also access it using a regular web browser. They'll get a very
simple, read-only version of the site.

To take a look for yourself, check out the test wiki via the L or via L.

=back

=head2 What are pages written in?

Pages are written in gemtext, a lightweight hypertext format. You can use your
favourite text editor to write them.

A text line is a paragraph of text.

    This is a paragraph.
    This is another paragraph.

A link line starts with "=>", a space, a URL, optionally followed by whitespace
and some text; the URL can be absolute or relative.

    => http://transjovian.org/ The Transjovian Council on the web
    => Welcome Welcome to The Transjovian Council

A line starting with "```" toggles preformatting on and off.

    Example:
    ```
    ./phoebe
    ```

A line starting with "#", "##", or "###", followed by a space and some text is a
heading.


    ## License
    The GNU Affero General Public License.

A line starting with "*", followed by a space and some text is a list item.

    * one item
    * another item

A line starting with ">", followed by a space and some text is a quote.

    The monologue at the end is fantastic, with the city lights and the rain.
    > I've seen things you people wouldn't believe.

=head2 How do you edit a Phoebe wiki?

You need to use a Titan-enabled client.

Known clients:

=over

=item This repository comes with a Perl script called L to upload files

=item L is an extension for the Emacs Gopher and Gemini client L

=item L are two shell functions that allow you to download and upload files

=back

=head2 What is Titan?

Titan is a companion protocol to Gemini: it allows clients to upload files to
Gemini sites, if servers allow this. On Phoebe, you can edit "raw" pages. That
is, at the bottom of a page you'll see a link to the "raw" page. If you follow
it, you'll see the page content as plain text. You can submit a changed version
of this text to the same URL using Titan. There is more information for
developers available L.

=head2 Dependencies

Perl libraries you need to install if you want to run Phoebe:

=over

=item L

=item L

=item L

=item L

=item L

=item L

=back

I'm going to be using F and F in the L instructions, so you'll need those tools
as well. And finally, when people download their data, the code calls C.

On Debian:

    sudo apt install \
      libalgorithm-diff-xs-perl \
      libfile-readbackwards-perl \
      libfile-slurper-perl \
      libmodern-perl-perl \
      libnet-server-perl \
      liburi-escape-xs-perl \
      curl openssl tar

The F script I use to generate F also requires L and L.

=head2 Quickstart

Right now there aren't any releases. You just get the latest version from the
repository and that's it. I'm going to assume that you're going to create a new
user just to be safe.

    sudo adduser --disabled-login --disabled-password phoebe
    sudo su phoebe
    cd

Now you're in your home directory, F. We're going to install things right here.

First, get the source code:

    curl --output phoebe https://alexschroeder.ch/cgit/phoebe/plain/phoebe

Since Phoebe traffic is encrypted, we need to generate a certificate and a key.
These are both stored in PEM files. To create your own copies of these files
(and you should!), use "make cert" if you have a copy of the Makefile. If you
don't, use this:

    openssl req -new -x509 -newkey ec \
      -pkeyopt ec_paramgen_curve:prime256v1 \
      -days 1825 -nodes -out cert.pem -keyout key.pem

This creates a certificate and a private key, both of them unencrypted, using
elliptic curves of a particular kind, valid for five years.

You should have three files, now: F, F, and F. That's enough to get started!
Start the server:

    perl phoebe

This starts the server in the foreground. Open a second terminal and test it:

    echo gemini://localhost \
      | openssl s_client --quiet --connect localhost:1965 2>/dev/null

You should see a Gemini page starting with the following:

    20 text/gemini; charset=UTF-8
    Welcome to Phoebe!

Success!! 😀 🚀🚀

Let's create a new page using the Titan protocol, from the command line:

    echo "Welcome to the wiki!" > test.txt
    echo "Please be kind." >> test.txt
    echo "titan://localhost/raw/Welcome;mime=text/plain;size="`wc --bytes < test.txt`";token=hello" \
      | cat - test.txt | openssl s_client --quiet --connect localhost:1965 2>/dev/null

You should get a nice redirect message:

    30 gemini://localhost:1965/page/Welcome

You can check the page, now:

    echo gemini://localhost:1965/page/Welcome \
      | openssl s_client --quiet --connect localhost:1965 2>/dev/null

You should get back a page that starts as follows:

    20 text/gemini; charset=UTF-8
    Welcome to the wiki!
    Please be kind.

Yay! 😁🎉 🚀🚀

Let me return to the topic of Titan-enabled clients for a moment. With those,
you can do simple things like this:

    echo "Hello! This is a test!" \
      | titan --url=localhost/test --token=hello

Or this:

    titan --url=localhost/test --token=hello test.txt

That makes it a lot easier to upload new content! 😅

If you have a bunch of Gemtext files in a directory, you can upload them all in
one go:

    titan --url=titan://localhost/ --token=hello *.gmi

=head2 Wiki Directory

Your home directory should now also contain a wiki directory called F. In it,
you'll find a few more files:

=over

=item F is the directory with all the page files in it; each file has the C
extension and should be written in Gemtext format

=item F is a file listing all the pages in your F directory for quick access; if
you create new files in the F directory, you should delete the F file – it will
get regenerated when needed; the format is one page name (without the C<.gmi>
extension) per line, with lines separated from each other by a single C<\n>

=item F is the directory with all the old revisions of pages in it – if you've
only made one change, then it won't exist; if you don't care about the older
revisions, you can delete them; assuming you have a page called C and edit it
once, you have the current revision as F, and the old revision in F (the page
name turns into a subdirectory and each revision gets an appropriate number)

=item F is the directory with all the uploaded files in it – if you haven't
uploaded any files, then it won't exist; you must explicitly allow MIME types
for upload using the C<--wiki_mime_type> option (see I below)

=item F is the directory with all the meta data for uploaded files in it – there
should be a file here for every file in the F directory; if you create new files
in the F directory, you should create a matching file here; if you have a file F
you want to create a file F containing the line C

=item F is a file listing all the changes made to the wiki; if you make changes
to the files in the F or F directory, they aren't going to be listed in this
file and thus people will be confused by the changes you made – your call
(but in all fairness, if you're collaborating with others you probably shouldn't
do this); the format is one change per line, with lines separated from each
other by a single C<\n>, and each line consisting of time stamp, pagename or
filename, revision number if a page or 0 if a file, and the numeric code of the
user making the edit (see L below), all separated from each other with a
C<\x1f>

=item F probably doesn't exist, yet; it is an optional file containing Perl code
where you can add new features and change how Phoebe works (see L below)

=back

=head2 Options

Phoebe has a bunch of options, and it uses L in the background, which has even
more options. Let's try to focus on the options you might want to use right
away.

Here's an example:

    perl phoebe \
      --wiki_token=Elrond \
      --wiki_token=Thranduil \
      --wiki_page=Welcome \
      --wiki_page=About

Here's the documentation for the most useful options:

=over

=item C<--wiki_token> is for the token that users editing pages have to provide;
the default is "hello"; you can use this option multiple times and give
different users different passwords, if you want

=item C<--wiki_page> is an extra page to show in the main menu; you can use this
option multiple times; this is ideal for general items like I or I

=item C<--wiki_main_page> is the page containing your header for the main page;
that's where you would put your ASCII art header, your welcome message, and so
on, see L below

=item C<--wiki_mime_type> is a MIME type to allow for uploads; text/plain is
always allowed and doesn't need to be listed; you can also just list the type
without a subtype, e.g.
C will allow all sorts of images (make sure random people can't use your server
to exchange images – set a password using C<--wiki_token>)

=item C<--host> is the hostname to serve; the default is C – you probably want
to pick the name of your machine, if it is reachable from the Internet; if you
use it multiple times, each host gets its own wiki space (see C<--wiki_space>
below)

=item C<--port> is the port to use; the default is 1965

=item C<--wiki_dir> is the wiki data directory to use; the default is either the
value of the C environment variable, or the "./wiki" subdirectory

=item C<--wiki_space> adds an extra space that acts as its own wiki; a
subdirectory with the same name gets created in your wiki data directory and
thus you shouldn't name spaces like any of the files and directories already
there (see L); note that settings such as C<--wiki_page> and
C<--wiki_main_page> apply to all spaces, but the page content will be different
for every wiki space

=item C<--cert_file> is the certificate PEM file to use; the default is F

=item C<--key_file> is the private key PEM file to use; the default is F

=item C<--log_level> is the log level to use, 0 is quiet, 1 is errors, 2 is
warnings, 3 is info, and 4 is debug; the default is 2

=back

=head2 Running Phoebe as a Daemon

If you want to start Phoebe as a daemon, the following options come in handy:

=over

=item C<--setsid> makes sure Phoebe runs as a daemon in the background

=item C<--pid_file> is the file where the process id (pid) gets written once the
server starts up; this is useful if you run the server in the background and you
need to kill it

=item C<--log_file> is the file to write logs into; the default is to write log
output to the standard error (stderr)

=item C<--user> and C<--group> might come in handy if you start Phoebe using a
different user

=back

=head2 Using systemd

In this case, we don't want to daemonize the process. Systemd is going to handle
that for us. There's more documentation L.

Basically, this is the template for our service:

    [Unit]
    Description=Phoebe
    After=network.target
    [Service]
    Type=simple
    WorkingDirectory=/home/phoebe
    ExecStart=/home/phoebe/phoebe
    Restart=always
    User=phoebe
    Group=phoebe
    [Install]
    WantedBy=multi-user.target

Save this as F, and then link it:

    sudo ln -s /home/phoebe/phoebe.service /etc/systemd/system/

Reload systemd:

    sudo systemctl daemon-reload

Start Phoebe:

    sudo systemctl start phoebe

Check the log output:

    sudo journalctl --unit phoebe

=head2 Security

The server uses "access tokens" to check whether people are allowed to edit
files. You could also call them "passwords", if you want. They aren't associated
with a username. You set them using the C<--wiki_token> option. By default, the
only password is "hello". That's why the Titan command above contained
"token=hello". 😊

If you're going to check up on your wiki often (daily!), you could just tell
people about the token on a page of your wiki. Spammers would at least have to
read the instructions and in my experience they hardly ever do.

You could also create a separate password for every contributor and when they
leave the project, you just remove the token from the options and restart
Phoebe. They will no longer be able to edit the site.

=head2 Privacy

The server only actively logs changes to pages. It calculates a "code" for every
contribution: it is a four digit octal code. The idea is that you could colour
every digit using one of the eight standard terminal colours and thus get little
four-coloured flags.

This allows you to make a pretty good guess about edits made by the same person,
without telling you their IP number.

The code is computed as follows: the IP number is turned into a 32-bit number
using a hash function, converted to octal, and the first four digits are the
code. Thus all possible IP numbers are mapped into 8⁴=4096 codes.

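To get a feeling for how such a code comes about, here is a minimal shell
sketch of the steps just described. It is not what Phoebe itself runs: the
function name C<ip_code> is made up for this example, and the CRC-32 checksum
printed by C<cksum> merely stands in for whatever hash function the server
really uses, so the codes it prints will not match the ones in your log.

```shell
#!/bin/sh
# Hypothetical helper: derive a four-digit octal code from an IP address.
# Assumption: cksum (CRC-32) is only a stand-in for Phoebe's actual hash.
ip_code() {
  # cksum prints "<checksum> <byte count>"; keep the 32-bit checksum only
  num=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
  # convert to octal, pad to at least four digits, keep the first four
  printf '%04o\n' "$num" | cut -c 1-4
}

# The same address always yields the same code
ip_code 127.0.0.1
ip_code ::1
```

Since only four octal digits are kept, different addresses can end up with the
same code. That is the point: the code hints at "probably the same person"
without storing the address itself.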
If you increase the log level, the server will produce more output, including
information about the connections happening, like C<2020/06/29-15:35:59 CONNECT
SSL Peer: "[::1]:52730" Local: "[::1]:1965"> and the like (in this case C<::1>
is my local address so that isn't too useful, but it could also be your
visitor's IP number, in which case you will need to tell them about it in order
to comply with the L).

=head2 Files

If you allow uploads of binary files, these are stored separately from the
regular pages; the wiki doesn't keep old revisions of files around. If somebody
overwrites a file, the old revision is gone.

You definitely don't want random people uploading all sorts of images, videos
and binaries to your server. Make sure you set up those L using
C<--wiki_token>!

=head2 Main Page and Title

The main page will include ("transclude") a page of your choosing if you use the
C<--wiki_main_page> option. This also sets the title of your wiki in various
places like the RSS and Atom feeds.

In order to be more flexible, the name of the main page does not get printed. If
you want it, you need to add it yourself using a header. This allows you to keep
the main page in a page called "Welcome" containing some ASCII art such that the
word "Welcome" does not show on the main page. This assumes you're using
C<--wiki_main_page=Welcome>, of course.

If you have pages with names that start with an ISO date like 2020-06-30, then
I'm assuming you want some sort of blog. In this case, up to ten of them will be
shown on your front page.

=head2 GUS and robots.txt

There are search engines out there that will index your site. Ideally, these
wouldn't index the history pages and all that: they would only get the list of
all pages, and all the pages. I'm not even sure that we need them to look at all
the files. The L lets you control what the bots ought to index and what they
ought to skip. It doesn't always work.

Here's my suggestion:

    User-agent: *
    Disallow: raw/*
    Disallow: html/*
    Disallow: diff/*
    Disallow: history/*
    Disallow: do/changes*
    Disallow: do/all/changes*
    Disallow: do/all/latest/changes*
    Disallow: do/rss
    Disallow: do/atom
    Disallow: do/all/atom
    Disallow: do/new
    Disallow: do/more/*
    Disallow: do/match
    Disallow: do/search
    # allowing do/index!
    Crawl-delay: 10

In fact, as long as you don't create a page called C then this is what gets
served. I think it's a good enough way to start. If you're using spaces, the C
pages of all the spaces are concatenated.

If you want to be more paranoid, create a page called C and put this on it:

    User-agent: *
    Disallow: /

Note that if you've created your own C page, and you haven't decided to disallow
them all, then you also have to do the right thing for all your spaces, if you
use them at all.

=head2 Limited, read-only HTTP support

You can actually look at your wiki pages using a browser! But beware: these days
browsers will refuse to connect to sites that have self-signed certificates.
You'll have to click buttons and make exceptions and all of that, or get your
certificate from Let's Encrypt or the like. Anyway, it works in theory. If you
went through the L, visiting C should work!

Notice that Phoebe doesn't have to live behind another web server like Apache or
nginx. It's a (simple) web server, too!

Here's how you could serve the wiki both on Gemini, and the standard HTTPS port,
443:

    sudo ./phoebe --port=443 --port=1965 \
      --user=$(id --user --name) --group=$(id --group --name)

We need to use F because all the ports below 1024 are privileged ports and that
includes the standard HTTPS port. Since we don't want the server itself to run
with all those privileges, however, I'm using the C<--user> and C<--group>
options to change the effective user and group ID. The F command is used to get
your own user and group names.
If you've followed the L and created a separate C user, you could simply use
C<--user=phoebe> and C<--group=phoebe> instead. 👍

=head2 Configuration

This section describes some hooks you can use to customize your wiki using the F
file. Once you're happy with the changes you've made, reload the server to make
it read the config file. You can do that by sending it the HUP signal, if you
know the pid, or if you have a pid file:

    kill -s SIGHUP `cat phoebe.pid`

Here are the ways you can hook into Phoebe code:

=over

=item C<@init> is a list of code references allowing you to change the
configuration of the server; it gets executed as the server starts, after
regular configuration

=item C<@extensions> is a list of code references allowing you to handle
additional URLs; return 1 if you handle a URL; each code reference gets called
with $self, the first line of the request (a Gemini URL, a Gopher selector, a
finger user, an HTTP request line), and a hash reference for the headers (in the
case of HTTP requests)

=item C<@main_menu> adds more lines to the main menu, possibly links that aren't
simply links to existing pages

=item C<@footer> is a list of code references allowing you to add things like
licenses or contact information to every page; each code reference gets called
with $self, $host, $space, $id, $revision, and $format ('gemini' or 'html') used
to serve the page; return a gemtext string to append at the end; the alternative
is to overwrite the C