💾 Archived View for gmn.clttr.info › sources › geminispace.info.git › tree › docs › handling-robots.… captured on 2024-02-05 at 10:05:12.


# robots.txt handling

Before crawling any content on a (sub)domain, GUS fetches that (sub)domain's robots.txt.
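As a rough illustration of "one robots.txt per (sub)domain", the sketch below derives the robots.txt URL from any page URL on a host. The function name `robots_url` is hypothetical and is not taken from the GUS code base; it only assumes that robots.txt lives at the root of each host.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper for illustration; not part of GUS.
    """
    parts = urlsplit(page_url)
    # netloc keeps the full host name (and port, if any), so every
    # subdomain maps to its own robots.txt.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"
```

Two pages on the same host yield the same robots.txt URL, while a page on a different subdomain yields a different one.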

GUS honors the following User-agents:


## robots.txt caching

Each fetched robots.txt is cached for the duration of the current crawl only.
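A per-crawl cache of this kind could look roughly like the following sketch. It is an assumption about the general technique, not the actual GUS implementation: the class name `CrawlRobotsCache` and the injected `fetch` callable are hypothetical, and parsing is delegated to Python's standard `urllib.robotparser`. Creating a new instance for each crawl is what limits the cache to the current crawl.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class CrawlRobotsCache:
    """Parsed robots.txt files, kept in memory for one crawl (hypothetical)."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        # fetch is a hypothetical callable that returns the robots.txt
        # body for a given URL (an empty string if there is none).
        self._fetch = fetch
        self._parsers: Dict[str, RobotFileParser] = {}

    def can_fetch(self, user_agent: str, url: str) -> bool:
        parts = urlsplit(url)
        host = parts.netloc
        if host not in self._parsers:
            # First URL seen on this host: fetch and parse robots.txt once.
            body = self._fetch(f"{parts.scheme}://{host}/robots.txt")
            parser = RobotFileParser()
            parser.parse(body.splitlines())
            self._parsers[host] = parser
        # Later URLs on the same host reuse the cached parser.
        return self._parsers[host].can_fetch(user_agent, url)
```

Because nothing is written to disk, the cached rules disappear when the crawl ends and the next crawl fetches every robots.txt again.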