👽 acidus

I used all the content from previous crawls by my search engine Kennedy to construct a searchable archive! It's just like the Internet Archive's Wayback Machine, but for Gemini. 2 million+ URLs/versions, going back to Jan 2022.

Example: gemini://kennedy.gemi.dev/archive/cached?url=gemini%3a%2f%2fdrewdevault.com%2f&t=637898933801779424&raw=False

You can enter an exact URL, or just search for part of a URL, like searching for a domain name.

gemini://kennedy.gemi.dev/

2 years ago · 👍 cobradile94, skyjake, jo, astroseneca, warpengineer, mozz, moddedbear, devyl

6 Replies

👽 danrl

@acidus love it! Searching without being afraid of everyone easily finding all my past mishaps! Keep up the good work, and thanks for running your services for all of us to use · 2 years ago

👽 acidus

@danrl There sure is. The Gemini companion spec for robots.txt covers this specifically.

gemini://gemini.circumlunar.space/docs/companion/robots.gmi

Basically, set "archiver" exclusion rules in your robots.txt. That lets your capsule be crawled and indexed by search engines without being archived. Or just deny all crawlers. Whatever you feel most comfortable with.
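For example (sketching with the "archiver" virtual user-agent described in that companion spec), a robots.txt that leaves a capsule crawlable for search engines but keeps it out of archives could look like:

```
# Block only the "archiver" virtual user-agent;
# search-engine crawlers remain allowed
User-agent: archiver
Disallow: /
```

To deny all crawlers instead, use "User-agent: *" with "Disallow: /".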

I'll also add some clearer language, plus a way to contact me to opt out · 2 years ago

👽 acidus

@skyjake Fixed. Thanks · 2 years ago

👽 danrl

Is there a way to opt out capsules? Like with the Internet Archive, which used to follow robots.txt (but then stopped, which I find quite audacious, but that's a different story) · 2 years ago

👽 skyjake

This is the archive _verision_ of

A small typo... · 2 years ago

👽 cobradile94

That's awesome! Gemini needs to be preserved the same way the Clearnet is! I really appreciate the work you're doing here! 😁 · 2 years ago