Crawlers on Gemini and best practices

> On Dec 10, 2020, at 14:43, Stephane Bortzmeyer <stephane at sources.org> wrote:
> 
> The spec is quite vague about the *order* of directives.

Perhaps of interest:

While in the standard implementation the first matching robots.txt
pattern always wins, Google's implementation differs: an Allow pattern
with an equal or greater number of characters in its directive path wins
over a matching Disallow pattern. Bing uses whichever of the Allow and
Disallow directives is the most specific.

To be compatible with all robots, if one wants to allow single files
inside an otherwise disallowed directory, it is necessary to place the
Allow directive(s) first, followed by the Disallow.
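
For example, a capsule that wants to expose a single document inside an
otherwise blocked directory (the paths here are hypothetical) would
write:

    User-agent: *
    # Allow comes first so first-match crawlers see it before the Disallow
    Allow: /private/public-note.gmi
    Disallow: /private/

First-match crawlers reach the Allow line first; Google's longest-match
rule prefers it because its path is longer; and Bing's most-specific
rule picks it as well, so all three interpretations agree.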

http://en.wikipedia.org/wiki/Robots_exclusion_standard

Also:

https://developers.google.com/search/reference/robots_txt
