💾 Archived View for kennedy.gemi.dev › archive › faq.gmi captured on 2023-09-08 at 15:51:26. Gemini links have been rewritten to link to archived content

-=-=-=-=-=-=-

🏎 Delorean Time Machine FAQ

What is the Delorean Time Machine?

The Delorean Time Machine (Delorean for short) is a historical archive of content from Geminispace. It's like the Internet Archive's Wayback Machine, just for Gemini. It allows you:

How does Delorean work?

Delorean tracks a list of URLs in Geminispace. For each URL, Delorean stores "snapshots." A snapshot is the content that URL returned at a specific date and time. As content changes over time, Delorean stores the newer versions alongside the old ones. By looking at the different snapshots, you can view a URL's content and see how it has changed over time, even if the original capsule is no longer available.

Where does Delorean's data come from?

To build its search engine database, Kennedy crawls capsules and downloads each URL's content. Delorean imports this content into its archive.

Can I exclude my content from Delorean?

Absolutely! Delorean respects a capsule's robots.txt file, which allows you to tell crawlers, search engines, and archivers what content can be included or excluded.

robots.txt for Gemini

Content passes through two filters before being archived in Delorean. First, since all content archived by Delorean comes as a byproduct of the crawler for Kennedy, any part of a capsule that is excluded from search engines will not appear in Delorean. All content blocked by any "Disallow" rules targeting all user-agents or the "indexer" virtual user-agent will not be crawled by Kennedy, and so it cannot appear in Delorean.

The second filter is also controlled by robots.txt. Since Delorean is building an archive, it also respects any "archiver" rules in robots.txt. When importing content from Kennedy, Delorean checks the capsule's robots.txt file for "Disallow" rules targeting the "archiver" virtual user-agent, and will not import any content that is blocked.
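
The two filters can be sketched in Python as follows. This is an illustrative approximation, not Kennedy's or Delorean's real code: `parse_robots`, `is_blocked`, and `may_archive` are hypothetical helpers, and the sketch assumes simple path-prefix matching on "Disallow" lines only.

```python
def parse_robots(text: str) -> dict[str, list[str]]:
    """Map each lowercase user-agent to its list of Disallow path prefixes."""
    rules: dict[str, list[str]] = {}
    agents: list[str] = []   # user-agents the current group applies to
    in_rules = False         # True once the current group has a Disallow line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:     # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
            rules.setdefault(value.lower(), [])
        elif key == "disallow":
            in_rules = True
            if value:        # an empty Disallow blocks nothing
                for agent in agents:
                    rules[agent].append(value)
    return rules


def is_blocked(rules: dict[str, list[str]], path: str, agents: tuple[str, ...]) -> bool:
    """True if any Disallow prefix for any of the given agents matches path."""
    return any(
        path.startswith(prefix)
        for agent in agents
        for prefix in rules.get(agent, [])
    )


def may_archive(robots_text: str, path: str) -> bool:
    rules = parse_robots(robots_text)
    # Filter 1: paths blocked for all user-agents or "indexer" are never crawled.
    if is_blocked(rules, path, ("*", "indexer")):
        return False
    # Filter 2: paths blocked for "archiver" are crawled but not imported.
    return not is_blocked(rules, path, ("archiver",))
```

With a robots.txt that contains only an "archiver" group disallowing `/`, the first filter lets every path through (so it can still be crawled for search), while `may_archive` returns `False` for all of them.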

How do I exclude my capsule from Delorean's archive?

Create a `robots.txt` file in the root directory of your capsule if it doesn't already exist. Add these lines to it:

User-agent: archiver
Disallow: /

Some content I wanted to exclude is in Delorean! How can I remove it?

No worries, mistakes happen. Drop me an email and I can remove it.

Email me

How can my capsule appear in search results, but not in Delorean's archive?

You can use "archiver" rules in robots.txt to control what Delorean archives. If you leave content open to the "indexer" user-agent but disallow it for the "archiver" user-agent, it will appear in search engine results but not in Delorean's archive.

What content is archived?

If it's served over Gemini, is smaller than 5 MiB, and isn't excluded by robots.txt, it should be archived.

More specifically:

If a URL meets those requirements and is not excluded by robots.txt rules for indexers or archivers, it will appear in Delorean.
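
The three conditions stated above (served over Gemini, smaller than 5 MiB, not excluded by robots.txt) can be written as a small check. This is only a sketch under those stated conditions; the `is_archivable` function and its parameters are hypothetical, and any further requirements Delorean applies are not represented here.

```python
from urllib.parse import urlsplit

MAX_BYTES = 5 * 1024 * 1024  # 5 MiB


def is_archivable(url: str, size_bytes: int, excluded_by_robots: bool) -> bool:
    """Apply the three stated conditions to one fetched URL."""
    if urlsplit(url).scheme != "gemini":  # must be served over Gemini
        return False
    if size_bytes >= MAX_BYTES:           # must be smaller than 5 MiB
        return False
    return not excluded_by_robots         # must not be blocked by robots.txt
```

The `excluded_by_robots` flag is assumed to be computed separately from the capsule's robots.txt rules for indexers and archivers.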

How far back does the Archive go?

The oldest content in the archives is from September 2020, with most content appearing by mid-2022.

The completeness of the archive reflects the history of the Kennedy crawler's capabilities and how often I run it.

What happens when this baby hits 88 miles-per-hour?

You're gonna see some serious shit