💾 Archived View for gmn.clttr.info › sources › geminispace.info.git › tree › docs › handling-robots.… captured on 2023-01-29 at 05:05:10.

# robots.txt handling

Before crawling any content on a (sub)domain, GUS fetches that (sub)domain's robots.txt.
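A minimal sketch of the per-(sub)domain lookup described above, not GUS's actual code. It assumes robots.txt lives at the root of each host; the helper name `robots_url` and the example hosts are hypothetical.

```python
from urllib.parse import urlsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url.

    Hypothetical helper: assumes robots.txt sits at the root of each host,
    so every (sub)domain maps to its own robots.txt.
    """
    parts = urlsplit(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


# A domain and one of its subdomains yield two separate robots.txt URLs.
print(robots_url("gemini://example.org/docs/page.gmi"))
print(robots_url("gemini://sub.example.org/index.gmi"))
```

Because the lookup is keyed on the full host name, rules published on one subdomain never apply to another in this sketch.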

GUS honors the following User-agents:


## robots.txt caching

Each fetched robots.txt is cached for the duration of the current crawl only.
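The caching behaviour can be sketched as an in-memory cache that is created when a crawl starts and discarded when it ends. This is an illustration under stated assumptions, not GUS's implementation: the class `CrawlRobotsCache`, the `fetch_text` callable, and `fake_fetch` are hypothetical, and the rules are parsed with Python's standard `urllib.robotparser`.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class CrawlRobotsCache:
    """Holds parsed robots.txt rules for one crawl (hypothetical helper)."""

    def __init__(self, fetch_text: Callable[[str], str]) -> None:
        # fetch_text is an assumed callable: given a host name,
        # it returns the body of that host's robots.txt.
        self._fetch_text = fetch_text
        self._parsers: Dict[str, RobotFileParser] = {}

    def rules_for(self, url: str) -> RobotFileParser:
        """Return the parsed rules for url's host, fetching at most once."""
        host = urlsplit(url).netloc
        if host not in self._parsers:
            parser = RobotFileParser()
            parser.parse(self._fetch_text(host).splitlines())
            self._parsers[host] = parser
        return self._parsers[host]


# Stand-in for a real network fetch; it records which hosts were requested.
fetched_hosts = []


def fake_fetch(host: str) -> str:
    fetched_hosts.append(host)
    return "User-agent: *\nDisallow: /private/\n"


cache = CrawlRobotsCache(fake_fetch)  # one cache object per crawl
cache.rules_for("gemini://example.org/a.gmi")
cache.rules_for("gemini://example.org/b.gmi")      # served from the cache
cache.rules_for("gemini://sub.example.org/c.gmi")  # different host, new fetch
print(fetched_hosts)  # ['example.org', 'sub.example.org']
```

Creating a fresh `CrawlRobotsCache` for the next crawl forces every robots.txt to be fetched again, which matches the "current crawl only" lifetime described above.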