💾 Archived View for jfh.me › posts › 2020-11-02-gemini-dianostics.gmi captured on 2022-04-29 at 11:23:45. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
I saw that the Gemini documentation has a client torture test. That made me curious if there was a server torture test as well. I found a reference to a test in the Gemini mailing list.
https://github.com/michael-lazar/jetforce/blob/master/jetforce_diagnostics.py
I went to look at ~jetforce_diagnostics.py~ and of course encountered a 404. After looking through the git history and release notes, I found that the diagnostics script had been pulled into its own repo.
https://github.com/michael-lazar/gemini-diagnostics
After taking a quick look at the code, I tried it out and tested my server. The script found a lot of issues with my own server implementation. Some of the findings seem to be false positives, but others seem to be real discrepancies between Melchoir's implementation and the spec. Here are the issues that the script pointed out:
1. [TLSRequired] Non-TLS requests should be refused
2. [HomepageRedirect] A URL with no trailing slash should redirect to the canonical resource
3. [RequestMissingCR] A request without a <CR> should timeout
4. [URLInvalidUTF8Byte] Send a URL containing a non-UTF8 byte sequence
5. [URLAboveMaxSize] Send a 1025 byte URL, above the maximum allowed size
6. [URLWrongPort] A URL with an incorrect port number should be rejected
7. [URLWrongHost] A URL with a foreign hostname should be rejected
8. [URLSchemeHTTP] Send a URL with an HTTP scheme
9. [URLSchemeHTTPS] Send a URL with an HTTPS scheme
10. [URLSchemeGopher] Send a URL with a Gopher scheme
11. [URLEmpty] Empty URLs should not be accepted by the server
12. [URLInvalid] Random text should not be accepted by the server
13. [URLDotEscape] A URL should not be able to escape the root using dot notation
That's a lot of errors!
Fixing the errors here was pretty easy. My basic server was very easygoing when it came to the format of the requests. The only thing I was really doing was making sure directory traversal was not possible. My error codes weren't quite right and I wasn't enforcing checks on the hostname, port numbers, and scheme.
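The checks described above can be sketched roughly as follows. This is only an illustration, not Melchoir's actual code: the function name ~check_request~, the constant ~MAX_URL_BYTES~, and the meta strings are all made up for this example, and the choice of status 59 (bad request) versus 53 (proxy request refused) is my own reading of the Gemini spec rather than what the diagnostics script necessarily expects.

```python
from urllib.parse import unquote, urlsplit

MAX_URL_BYTES = 1024  # maximum request URL length in the Gemini spec


def check_request(raw_url: bytes, host: str, port: int = 1965):
    """Return (status, meta) for a rejected request, or None if it looks acceptable.

    Hypothetical helper: the name, signature, and meta strings are assumptions.
    """
    if len(raw_url) > MAX_URL_BYTES:
        return 59, "URL too long"
    try:
        url = raw_url.decode("utf-8")
    except UnicodeDecodeError:
        return 59, "URL is not valid UTF-8"
    try:
        parts = urlsplit(url)
        req_host = parts.hostname  # lowercased by urlsplit
        req_port = parts.port      # raises ValueError on a malformed port
    except ValueError:
        return 59, "Malformed URL"
    if not parts.scheme or not req_host:
        return 59, "Absolute URL required"  # covers empty and random text
    if parts.scheme != "gemini":
        return 53, "Unsupported scheme"
    if req_host != host.lower():
        return 53, "Foreign host"
    if req_port not in (None, port):
        return 53, "Wrong port"
    # Reject ".." path segments, including percent-encoded ones like %2e%2e.
    if ".." in unquote(parts.path).split("/"):
        return 59, "Path escapes root"
    return None
```

Something like ~check_request(b"gemini://example.org/", "example.org")~ would pass, while an ~http://~ URL or a URL for another host would be refused before the server ever touches the filesystem.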
The expected behavior when a request doesn't end with ~\r\n~ was a little surprising. I didn't expect the client to get a timeout or an empty response if the request didn't end with ~\r\n~. I guess it kind of makes sense. Perhaps in the early phases of the protocol it's better to be a bit strict.
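One way to get that behavior is to keep reading until ~\r\n~ shows up and give up quietly if it never does. The sketch below assumes a plain blocking socket. The name ~read_request_line~, the default timeout, and the size cap are invented for illustration, and the timeout applies to each ~recv~ call rather than to the request as a whole.

```python
import socket

MAX_REQUEST_BYTES = 1024 + 2  # URL limit plus the trailing CRLF


def read_request_line(conn: socket.socket, timeout: float = 5.0):
    """Return the request URL as bytes, or None if no CRLF-terminated line arrives.

    Hypothetical helper: the name and defaults are assumptions for this sketch.
    """
    conn.settimeout(timeout)  # applies to every recv() below
    buf = b""
    try:
        while b"\r\n" not in buf:
            if len(buf) > MAX_REQUEST_BYTES:
                return None  # too much data without a line ending
            chunk = conn.recv(1024)
            if not chunk:
                return None  # client closed the connection early
            buf += chunk
    except socket.timeout:
        return None  # never saw <CR><LF>; the caller can just close
    return buf.split(b"\r\n", 1)[0]
```

With this shape, a request that ends in a bare ~\n~ simply stalls until the timeout fires and the client gets no response at all, which seems to be what the RequestMissingCR check is looking for.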