💾 Archived View for soviet.circumlunar.space › oak › mailinglist › 27.gmi captured on 2024-06-19 at 22:49:58. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
Date: Wed, 27 Jan 2021 10:40:39 +0100 (CET)
Hi,
simple question:
Is the following robots.txt format valid, in the sense that the "Disallow" applies to all the User-agents listed before it?
---
User-agent: researcher
User-agent: indexer
User-agent: archiver
Disallow: about
---
Or do I need to be more chatty?
---
User-agent: researcher
Disallow: about
User-agent: indexer
Disallow: about
User-agent: archiver
Disallow: about
---
kind regards
René
--------
Date: Wed, 27 Jan 2021 11:27:41 +0100
On Wed, Jan 27, 2021 at 10:40:39AM +0100,
René Wagner <rwagner at rw-net.de> wrote
a message of 23 lines which said:
> simple question:
Complicated answers:
> Is the following robots.txt format valid, in the sense that the
> "Disallow" applies to all the User-agents listed before it?
1) There is no standard for robots.txt.
2) There is not yet an "official" adaptation to Gemini, just
proposals.
--------
Date: Wed, 27 Jan 2021 05:56:04 -0500
It was thus said that the Great René Wagner once stated:
> Hi,
> simple question:
> Is the following robots.txt format valid, in the sense that the "Disallow" applies to all the User-agents listed before it?
> ---
> User-agent: researcher
> User-agent: indexer
> User-agent: archiver
> Disallow: about
> ---
That will work, but you need to add a leading '/' to the Disallow line:
Disallow: /about
That will match any request starting with '/about', like '/about',
'/aboutthis', '/about/that', etc.
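To illustrate the prefix matching just described, here is a minimal sketch using Python's standard `urllib.robotparser`. It shows only how that one parser behaves; nothing guarantees that a given Gemini crawler does the same. The helper name `allowed` and the extra path `/other` are made up for this example.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The grouped form from the question, with the leading '/' added.
WITH_SLASH = """\
User-agent: researcher
User-agent: indexer
User-agent: archiver
Disallow: /about
"""

# The same file as originally posted, without the leading '/'.
WITHOUT_SLASH = WITH_SLASH.replace("Disallow: /about", "Disallow: about")

if __name__ == "__main__":
    for path in ("/about", "/aboutthis", "/about/that", "/other"):
        print(
            path,
            "with slash:", allowed(WITH_SLASH, "indexer", path),
            "without slash:", allowed(WITHOUT_SLASH, "indexer", path),
        )
```

With this parser, a rule is compared against the start of the request path, so 'about' without the slash never matches a path such as '/about' and the page stays fetchable.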
> Or do I need to be more chatty?
> ---
> User-agent: researcher
> Disallow: about
> User-agent: indexer
> Disallow: about
> User-agent: archiver
> Disallow: about
> ---
That will work too (same thing about the Disallow: line though). You can
read more about it at <http://www.robotstxt.org/>.
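As a rough check that the grouped and the repeated forms lead to the same decisions, the following sketch feeds both to Python's standard `urllib.robotparser` and compares the results. It assumes the corrected `Disallow: /about` line, and it relies on this particular parser accepting records that are not separated by blank lines; other implementations may be stricter. The helper `decisions` and the path `/index.gmi` are invented for the example.

```python
from urllib.robotparser import RobotFileParser

AGENTS = ("researcher", "indexer", "archiver")

# Form 1: several User-agent lines sharing one Disallow line.
GROUPED = "\n".join([f"User-agent: {a}" for a in AGENTS] + ["Disallow: /about"])

# Form 2: one User-agent/Disallow pair per crawler.
REPEATED = "\n".join(f"User-agent: {a}\nDisallow: /about" for a in AGENTS)


def decisions(robots_txt, paths=("/about", "/index.gmi")):
    """Map every (agent, path) pair to the parser's allow/deny decision."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {(a, p): parser.can_fetch(a, p) for a in AGENTS for p in paths}


if __name__ == "__main__":
    print("same decisions:", decisions(GROUPED) == decisions(REPEATED))
```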
-spc
--------
Date: Wed, 27 Jan 2021 12:12:54 +0100
On Wed, Jan 27, 2021 at 05:56:04AM -0500,
Sean Conner <sean at conman.org> wrote
a message of 33 lines which said:
> That will work too (same thing about the Disallow: line though). You can
> read more about it at <http://www.robotstxt.org/>.
But do note that many Gemini capsules do not follow this specification
but one of the others (typically more complicated).
--------
Date: Wed, 27 Jan 2021 15:38:48 +0100 (CET)
Thanks for the replies.
I've opted for the first version for now.
Of course, no one knows exactly how the crawlers out there are implemented, or whether they obey robots.txt at all.
At least I can serve a valid robots.txt now.
cheers
René
--------