Hi,

simple question: is the following robots.txt format valid in a form that the "disallow" is applied to all User-agents mentioned before?

---
User-agent: researcher
User-agent: indexer
User-agent: archiver
Disallow: about
---

or do I need to be more chatty?

---
User-agent: researcher
Disallow: about
User-agent: indexer
Disallow: about
User-agent: archiver
Disallow: about
---

kind regards,
René
On Wed, Jan 27, 2021 at 10:40:39AM +0100, René Wagner <rwagner at rw-net.de> wrote a message of 23 lines which said:

> simple question:

Complicated answers:

> is the following robots.txt format valid in a form that the
> "disallow" is applied to all User-agents mentioned before?

1) There is no standard for robots.txt.

2) There is not yet an "official" adaptation to Gemini, just proposals.
It was thus said that the Great René Wagner once stated:

> Hi,
>
> simple question:
> is the following robots.txt format valid in a form that the "disallow" is applied to all User-agents mentioned before?
> ---
> User-agent: researcher
> User-agent: indexer
> User-agent: archiver
> Disallow: about
> ---

That will work, but you need to add a leading '/' to the Disallow line:

Disallow: /about

That will match any request starting with '/about', like '/about', '/aboutthis', '/about/that', etc.

> or do i need to be more chatty?
> ---
> User-agent: researcher
> Disallow: about
> User-agent: indexer
> Disallow: about
> User-agent: archiver
> Disallow: about
> ---

That will work too (same thing about the Disallow: line though). You can read more about it at <http://www.robotstxt.org/>.

-spc
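[To illustrate the behaviour described above: a crawler following the original robotstxt.org convention treats consecutive User-agent lines as one record, so the Disallow lines that follow apply to every listed agent, and matching is a simple path-prefix test. The following is a minimal, hypothetical Python sketch of such a parser, not taken from any real crawler.]

```python
def parse_robots(text):
    """Parse robots.txt into {user-agent: [disallowed prefixes]}.

    Consecutive User-agent lines open one record; the Disallow lines
    that follow apply to every agent named in that record.
    """
    rules = {}
    current_agents = []
    collecting_agents = True
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if not collecting_agents:
                current_agents = []  # a Disallow ended the previous record
            collecting_agents = True
            current_agents.append(value)
            rules.setdefault(value, [])
        elif field == "disallow":
            collecting_agents = False
            if value:  # an empty Disallow value means "allow everything"
                for agent in current_agents:
                    rules[agent].append(value)
    return rules


def is_allowed(rules, agent, path):
    """Prefix match: 'Disallow: /about' also blocks '/aboutthis'."""
    return not any(path.startswith(p) for p in rules.get(agent, []))
```

Under this reading, the grouped form and the "chatty" form produce identical rule tables, which is why both versions of the file behave the same.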
On Wed, Jan 27, 2021 at 05:56:04AM -0500, Sean Conner <sean at conman.org> wrote a message of 33 lines which said:

> That will work too (same thing about the Disallow: line though). You can
> read more about it at <http://www.robotstxt.org/>.

But do note that many Gemini capsules do not follow this specification but one of the others (typically more complicated).
Thanks for the replies. I've opted for the first version for the moment.

Of course, no one knows how exactly the crawlers out there are implemented, or whether they obey robots.txt at all. At least I can serve a valid robots.txt now.

cheers,
René