💾 Archived View for bbs.geminispace.org › u › jsreed5 › 16905 captured on 2024-08-18 at 23:47:02. Gemini links have been rewritten to link to archived content
-=-=-=-=-=-=-
My intermittent capsule outages are being caused by what appears to be a very aggressive crawler. The capsule's robots.txt file tells bots not to index my CGI scripts, but this crawler is ignoring the file and sending multiple requests per second against my scripts, which overloads the server and causes it to crash. I've temporarily solved the problem by blocking the crawler entirely; I'll look for a more permanent solution.
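A minimal sketch of the kind of stopgap described above, assuming the server can call a check before it runs a CGI script. It is not the capsule's actual configuration: the `allow_request` function, the limits, and the placeholder address (from the TEST-NET-3 documentation range) are all hypothetical.

```python
import time
from collections import defaultdict, deque

BLOCKED_IPS = {"203.0.113.7"}   # placeholder address, not a real crawler
MAX_REQUESTS = 2                # hypothetical limit: requests allowed per window
WINDOW_SECONDS = 1.0            # hypothetical window length in seconds

_recent = defaultdict(deque)    # client IP -> timestamps of recently served requests


def allow_request(ip, now=None):
    """Return True if a CGI request from `ip` should be served."""
    # Outright block: the temporary fix from the post.
    if ip in BLOCKED_IPS:
        return False
    now = time.monotonic() if now is None else now
    stamps = _recent[ip]
    # Drop timestamps that have fallen out of the sliding window.
    while stamps and now - stamps[0] >= WINDOW_SECONDS:
        stamps.popleft()
    # Too many recent requests from this client: refuse this one.
    if len(stamps) >= MAX_REQUESTS:
        return False
    stamps.append(now)
    return True
```

The per-client limit is one possible shape for a more permanent solution, since it slows down any client that ignores robots.txt without a hand-maintained blocklist. A Gemini server could answer refused requests with status 44 (SLOW DOWN).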
May 13 · 3 months ago · 👍 stack, tepez, requiem
That’s the way to do it! Also, can you publish which crawler it is and what IP it is from? Maybe the creator will see it here…
🚀 jsreed5 [OP] · May 13 at 14:24:
Good point! The crawler's IP address is 104.207.150.107.
Reverse DNS resolves to celery.eu.org. Over HTTP it says 'unplanned maintenance', and copy-pasting the IP into the browser redirects you to a rickroll. TBH I would just keep the domain in your blacklist for now.
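For anyone who wants to repeat the reverse DNS check, here is a small sketch using Python's standard `socket.gethostbyaddr`. The `reverse_dns` wrapper and its `lookup` parameter are hypothetical names added for illustration, and DNS records change, so a lookup today may not return the hostname quoted above.

```python
import socket


def reverse_dns(ip, lookup=socket.gethostbyaddr):
    """Return the reverse-DNS hostname for `ip`, or None if the lookup fails."""
    try:
        # gethostbyaddr returns (hostname, alias list, address list).
        hostname, _aliases, _addresses = lookup(ip)
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
    return hostname


if __name__ == "__main__":
    # IP address taken from the thread; requires network access to resolve.
    print(reverse_dns("104.207.150.107"))
```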