💾 Archived View for gemi.dev › gemlog › 2023-06-20-capsule-health-tool.gmi captured on 2024-07-09 at 00:15:38. Gemini links have been rewritten to link to archived content

New Feature: 🩺 Capsule Health Report

2023-06-20 | #kennedy | @Acidus

I built a new report into my search engine Kennedy: the Capsule Health report. This checks all the URLs of a capsule for connection errors, broken or missing status codes, and "52 Gone" responses.

For any URL that does have a problem, you can click on the URL and get a list of the inbound links to that page. Personally I've used this to fix internal links that are broken, links that I've typo-ed by mistake, and even to put redirects in place to correct old, inbound links to pages that have moved.

To use it, go to Kennedy, select the Capsule Health Report, and enter the domain name of a capsule.

Kennedy Search Engine

And surprisingly, no, I didn't write this feature while waiting for Call of Duty to download and install. I actually wrote it several weeks back, but just hadn't gotten around to writing about it yet. Thanks to Michael Nordmeyer for encouraging me to write more about my Kennedy work.

I just want to fucking play Call of Duty on my day off

How does it work?

It's pretty simple really. One of the outputs I create when crawling Geminispace is a graph of how pages link to each other. This is stored in a database. I can then run a simple query to find all URLs in a capsule that either had a connection problem or had a bad/broken status code. I can also run a query that says "what URLs link to this URL" to build the list of inbound links to the problem page.
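
To make the two queries concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not Kennedy's actual schema or code: the `pages` and `links` tables, their columns, and the example URLs are all assumptions made up for illustration.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory link graph. The schema is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pages (
            url TEXT PRIMARY KEY,
            host TEXT,
            status INTEGER,    -- Gemini status code, NULL if no response
            conn_error TEXT    -- full connection error text, NULL if none
        );
        CREATE TABLE links (source TEXT, target TEXT);
    """)
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", [
        ("gemini://example.org/", "example.org", 20, None),
        ("gemini://example.org/old.gmi", "example.org", 51, None),
        ("gemini://other.example/links.gmi", "other.example", 20, None),
        ("gemini://other.example/down.gmi", "other.example", None, "timed out"),
    ])
    conn.executemany("INSERT INTO links VALUES (?, ?)", [
        ("gemini://example.org/", "gemini://example.org/old.gmi"),
        ("gemini://other.example/links.gmi", "gemini://example.org/old.gmi"),
    ])
    return conn

def problem_urls(conn, host):
    """URLs on a capsule with a connection problem or a 4x/5x status."""
    rows = conn.execute(
        "SELECT url, status, conn_error FROM pages "
        "WHERE host = ? "
        "AND (conn_error IS NOT NULL OR status BETWEEN 40 AND 59) "
        "ORDER BY url",
        (host,),
    )
    return rows.fetchall()

def inbound_links(conn, url):
    """Answer 'what URLs link to this URL?' from the link graph."""
    rows = conn.execute(
        "SELECT source FROM links WHERE target = ? ORDER BY source",
        (url,),
    )
    return [row[0] for row in rows]
```

Calling `problem_urls(conn, "example.org")` on the demo data returns the one broken page, and `inbound_links` on that page lists the two pages that link to it.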

What specifically does it look for?

The report is divided into three sections.

First are connection errors, which are pretty straightforward. This flags things like DNS names that don't resolve, TCP connections that close prematurely or time out, or TLS errors. I include the full error in the report to help you troubleshoot.
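
As a rough illustration of how a crawler might sort these failures, here is a small Python sketch that maps standard library exceptions to the kinds of connection errors listed above. The function names and labels are hypothetical, and this is only one way to do it, not a description of how Kennedy classifies errors.

```python
import socket
import ssl

def classify_connection_error(exc):
    """Map an exception raised during a fetch to a coarse label."""
    if isinstance(exc, socket.gaierror):
        return "dns"          # hostname did not resolve
    if isinstance(exc, ssl.SSLError):
        return "tls"          # handshake or certificate problem
    if isinstance(exc, (TimeoutError, socket.timeout)):
        return "timeout"      # connect or read timed out
    if isinstance(exc, ConnectionError):
        return "connection"   # reset, refused, aborted, broken pipe
    return "other"

def describe(exc):
    """Keep the full error text next to the label for troubleshooting."""
    return f"{classify_connection_error(exc)}: {exc}"
```

The order of the checks matters a little: `socket.gaierror` and `ssl.SSLError` are both subclasses of `OSError`, so the specific checks come first and the generic `ConnectionError` check comes last.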

Second are Gemini status codes that indicate broken or missing content. These are status codes in the 4x and 5x range. By far the most common error here is 51, which is the Gemini equivalent of an HTTP 404. In fact, 51 is the second most common status code across all of Geminispace, with 20 being the first. A 51 means the resource is missing, presumably temporarily, due to a typo or error. But also reported here are things like CGI errors, bad requests, and server errors.

52 Gone!

The final section of the report is specifically for the "52 Gone" status code. Originally I had these in the previous section. However, as Michael Nordmeyer pointed out to me, Gemini defines the response code "52 GONE" as a way for capsule owners to purposely say "this content has been removed and isn't coming back." So content that has a 52 status code may not be a mistake, and really shouldn't be considered an error. However, it's possible that it is a mistake. So I decided to create a separate section of just URLs that return a 52 status code, so you can manually review them and ensure you want those URLs to be "gone."
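
Putting the three sections together, the sorting logic could look something like the Python sketch below. The function name, the section labels, and the choice to let a connection error take priority are my assumptions for illustration; the status code names come from the Gemini specification.

```python
# Gemini failure status codes and their names from the protocol spec.
STATUS_NAMES = {
    40: "TEMPORARY FAILURE",
    41: "SERVER UNAVAILABLE",
    42: "CGI ERROR",
    43: "PROXY ERROR",
    44: "SLOW DOWN",
    50: "PERMANENT FAILURE",
    51: "NOT FOUND",
    52: "GONE",
    53: "PROXY REQUEST REFUSED",
    59: "BAD REQUEST",
}

def report_section(status=None, conn_error=None):
    """Pick the report section for one crawled URL, or None if healthy."""
    if conn_error is not None:
        return "connection errors"
    if status == 52:
        # Possibly intentional, so it gets its own section for review.
        return "gone"
    if status is not None and 40 <= status <= 59:
        return "broken or missing"
    return None
```

For example, `report_section(status=51)` lands in the "broken or missing" section, while `report_section(status=52)` is kept apart in "gone" so it can be reviewed by hand.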

Enjoy! And if you haven't already, give Kennedy a try:

Kennedy Search Engine