Getting slammed by a client

1. Sean Conner (sean (a) conman.org)


  Is anyone else getting slammed by 70.113.100.216?  Does anyone know who
this is?  I don't mind my server being used by clients as a test platform,
but making a few hundred requests per second is a bit much.  Also, a bunch
of bogus requests.

  Sigh.

  I don't really want to block it, but whoever is responsible for it is
doing a bad job at parsing URLs and a request of "" is NOT a valid Gemini
request.
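
  For server authors, the check implied here can be sketched as follows.  This is a minimal illustration, not code from any server in this thread: the function name check_request and the error texts are hypothetical.  It assumes the rules from the Gemini specification that a request is one absolute URL of at most 1024 bytes followed by CRLF, and that status 59 means BAD REQUEST.

```python
from urllib.parse import urlsplit

MAX_URL_BYTES = 1024  # URL length limit given in the Gemini specification


def check_request(raw: bytes) -> str:
    """Return "" if the request line looks valid, else a status 59 header.

    Hypothetical helper for illustration only.
    """
    if not raw.endswith(b"\r\n"):
        return "59 Request must end with CRLF\r\n"
    url_bytes = raw[:-2]
    # An empty request (just CRLF) is rejected here.
    if not url_bytes or len(url_bytes) > MAX_URL_BYTES:
        return "59 Empty or oversized request\r\n"
    try:
        parts = urlsplit(url_bytes.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return "59 Malformed URL\r\n"
    # A Gemini request must carry an absolute URL, not a bare path.
    if not parts.scheme or not parts.netloc:
        return "59 Absolute URL required\r\n"
    return ""
```

  Rejecting early like this keeps bogus requests away from the file system and any CGI handlers.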

  On the plus side, it did flush out a bug in GLV-1.12556, so there's that.

  -spc


2. Sigrid Solveig Haflínudóttir (ftrvxmtrx (a) gmail.com)

On 25 July 2020 01:53:56 CEST, Sean Conner <sean at conman.org> wrote:
>
>Is anyone else getting slammed by 70.113.100.216?  Does anyone know who
>this is?  I don't mind my server being used by clients as a test
>platform,
>but making a few hundred requests per second is a bit much.  Also, a
>bunch
>of bogus requests.
>
>  Sigh.
>
>  I don't really want to block it, but whoever is responsible for it is
>doing a bad job at parsing URLs and a request of "" is NOT a valid
>Gemini
>request.
>
>On the plus side, it did flush out a bug in GLV-1.12556, so there's
>that.
>
>  -spc

Getting that slam as well here.


3. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

Same here, over 1494 requests since Jul 23 17:22:40. It ended
at Jul 24 19:41:05 for me. That's in EST timezone.

And yes, I can confirm the URL parsing was horribly done. Approximately
300 of those requests (20%) resulted in status 51 (NOT FOUND) errors,
seemingly all because of incorrectly constructed relative URLs. If this happens
again I will likely ban the IP.
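
For crawler authors, here is a minimal sketch of resolving a relative link against the page it came from. The function name resolve is hypothetical. One assumption worth flagging: Python's urllib.parse.urljoin only resolves relative references for schemes it knows about, and gemini is not among them, so this sketch borrows the http scheme for the join and restores the original scheme afterwards.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def resolve(base: str, ref: str) -> str:
    """Resolve link `ref` found on page `base` (hypothetical helper)."""
    base_parts = urlsplit(base)
    # Temporarily present the base as http so urljoin applies the
    # standard relative-reference rules.
    http_base = urlunsplit(("http",) + tuple(base_parts)[1:])
    joined = urljoin(http_base, ref)
    joined_parts = urlsplit(joined)
    # Restore the original scheme unless `ref` carried its own scheme.
    if joined_parts.scheme == "http" and not urlsplit(ref).scheme:
        joined = urlunsplit((base_parts.scheme,) + tuple(joined_parts)[1:])
    return joined
```

Appending a link to the current URL by plain string concatenation, instead of resolving it like this, is one way to end up with ever-growing paths that return NOT FOUND.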

Thanks for bringing this to my/our attention, I wasn't aware of it until now.
Do you have some monitoring system in place? How did you find out?

Cheers,
makeworld


4. Paul Boyd (boyd.paul2 (a) gmail.com)

I had a few of these too. Not nearly as many though. Request counts by
minute (UTC):

      3 2020-07-24T18:16
      2 2020-07-24T18:24
      1 2020-07-24T18:27
      1 2020-07-24T19:36
     29 2020-07-24T19:59
      1 2020-07-24T20:00
     22 2020-07-24T23:25
      3 2020-07-24T23:33
      2 2020-07-24T23:34
      1 2020-07-24T23:40
     24 2020-07-24T23:41
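
A tally like the one above can be produced with a few lines of Python. This is only a sketch: the log layout (ISO-8601 UTC timestamp, then client IP, then the request, separated by whitespace) is an assumption, and the function name requests_per_minute is hypothetical. Adjust the field positions to whatever your server actually logs.

```python
from collections import Counter


def requests_per_minute(lines, ip):
    """Count requests from `ip` per minute.

    Assumes each log line looks like:
        2020-07-24T18:16:27Z 70.113.100.216 gemini://example.org/
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1] == ip:
            # The first 16 characters of the timestamp are "YYYY-MM-DDTHH:MM".
            counts[fields[0][:16]] += 1
    return sorted(counts.items())
```

Calling it with open("gemini.log") as the lines argument and the offending address as ip gives (minute, count) pairs in chronological order; the file name is a placeholder.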

On Fri, Jul 24, 2020 at 8:46 PM <colecmac at protonmail.com> wrote:

> Same here, over 1494 requests since Jul 23 17:22:40. It ended
> at Jul 24 19:41:05 for me. That's in EST timezone.
>
> And yes, I can confirm the URL parsing was horribly done, approximately
> 300 of those requests (20%) resulted in 51 NOT FOUND errors, seemingly
> all because of incorrectly constructed relative URLs. If this happens
> again I will likely ban the IP.
>
> Thanks for bringing this to my/our attention, I wasn't aware of it until now.
> Do you have some monitoring system in place? How did you find out?
>
> Cheers,
> makeworld
>

5. btilley (a) gatech.edu (btilley (a) gatech.edu)

On Sat, Jul 25, 2020 at 02:19:53AM +0200, Sigrid Solveig Haflínudóttir wrote:
> On 25 July 2020 01:53:56 CEST, Sean Conner <sean at conman.org> wrote:
> >
> >Is anyone else getting slammed by 70.113.100.216?  Does anyone know who
> >this is?  I don't mind my server being used by clients as a test
> >platform,
> >but making a few hundred requests per second is a bit much.  Also, a
> >bunch
> >of bogus requests.
> >
> >  Sigh.
> >
> >  I don't really want to block it, but whoever is responsible for it is
> >doing a bad job at parsing URLs and a request of "" is NOT a valid
> >Gemini
> >request.
> >
> >On the plus side, it did flush out a bug in GLV-1.12556, so there's
> >that.
> >
> >  -spc
> 
> Getting that slam as well here.

Same at my gemini server.

Brad


6. cmccabe (a) rawtext.club (cmccabe (a) rawtext.club)

I had a few too:

      1 2020-07-23T21:13:27Z
      1 2020-07-24T18:16:29Z
     43 2020-07-24T18:16:27Z
     69 2020-07-24T18:16:28Z


On Fri, Jul 24, 2020 at 08:54:33PM -0400, Paul Boyd wrote:
> I had a few of these too. Not nearly as many though. Request counts by
> minute (UTC):
> 
>       3 2020-07-24T18:16
>       2 2020-07-24T18:24
>       1 2020-07-24T18:27
>       1 2020-07-24T19:36
>      29 2020-07-24T19:59
>       1 2020-07-24T20:00
>      22 2020-07-24T23:25
>       3 2020-07-24T23:33
>       2 2020-07-24T23:34
>       1 2020-07-24T23:40
>      24 2020-07-24T23:41
> 
> On Fri, Jul 24, 2020 at 8:46 PM <colecmac at protonmail.com> wrote:
> 
> > Same here, over 1494 requests since Jul 23 17:22:40. It ended
> > at Jul 24 19:41:05 for me. That's in EST timezone.
> >
> > And yes, I can confirm the URL parsing was horribly done, approximately
> > 300 of those requests (20%) resulted in 51 NOT FOUND errors, seemingly
> > all because of incorrectly constructed relative URLs. If this happens
> > again I will likely ban the IP.
> >
> > Thanks for bringing this to my/our attention, I wasn't aware of it until now.
> > Do you have some monitoring system in place? How did you find out?
> >
> > Cheers,
> > makeworld
> >


7. Sean Conner (sean (a) conman.org)

It was thus said that the Great colecmac at protonmail.com once stated:
> Same here, over 1494 requests since Jul 23 17:22:40. It ended
> at Jul 24 19:41:05 for me. That's in EST timezone.

  You probably didn't have a large site.  Now that I have added 20 years of
blog entries, it only stopped when I finally blocked it at the firewall
about fifteen minutes ago.  230K packets (12M of network traffic) blocked,
and it shows no sign of letting up, which tells me it's not being monitored.

> Thanks for bringing this to my/our attention, I wasn't aware of it until now.
> Do you have some monitoring system in place? How did you find out?

  It's rather unorthodox, but I forward my server logs to my home server
(not just Gemini, anything that's logged via syslog()). There, I have a
program that displays the logs in real time and color coded [1][2].  I'm not


  Ooh, it's still going, 400k packets, 21M bytes.

  -spc

[1]	I have two systems running, each with their own display.  I run the
	program on both, leaving the window open.  I use said windows as a
	type of "screen saver", but it's also useful to see what's going on
	with both my public server and home systems (ah, six different IP
	addresses tried to log in via SSH---they'll be blocked soon enough).

[2]	The custom syslog system I have is here:

		https://github.com/spc476/syslogintr

	The actual program is

		https://github.com/spc476/syslogintr/blob/master/realtime.lua


8. Hannu Hartikainen (hannu.hartikainen+gemini (a) gmail.com)

Thanks for pointing this out. I never read logs if everything works, so...

I'm getting lots of requests to urls like
gemini://hannuhartikainen.fi/twinwiki/Welcome,%20visitors%21/twinwiki/Welcome%252C%2520visitors%2521/_history/twinwiki/_edit/twinwiki/_create/twinwiki/_create/twinwiki/_help/twinwiki/_help/twinwiki/_index/twinwiki/_edit/twinwiki/_history/twinwiki/_edit/twinwiki/_create/twinwiki/_help/twinwiki/_history/twinwiki/_help/twinwiki/_create/twinwiki/_history/twinwiki/_history/twinwiki/_history/twinwiki/_create/twinwiki/_create/twinwiki/_index/twinwiki/_history/twinwiki/_edit/twinwiki/_edit/twinwiki/_history/twinwiki/_help/twinwiki/_help/twinwiki/_history/twinwiki/_create/twinwiki/_help/twinwiki/_edit/twinwiki/_index/twinwiki/_history

Oops, I've written bugs once again! I do have this robots.txt, though:

User-agent: gus
Allow: /

User-agent: *
Disallow: /

(I guess I should disallow even gus from twinwiki, or at least any
non-content pages.)
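
For crawler authors, honoring a robots.txt like the one above takes very little code. The sketch below uses Python's standard urllib.robotparser; the helper name may_fetch and the user agent "examplebot" are hypothetical, and fetching robots.txt over Gemini is left out.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt quoted above.
ROBOTS_TXT = """\
User-agent: gus
Allow: /

User-agent: *
Disallow: /
"""


def may_fetch(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `user_agent` is allowed to request `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With these rules, may_fetch("gus", ...) is allowed for any path, while any other user agent such as "examplebot" is refused everywhere.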

The crawler also breaks ansi.hrtk.in (which disallows even gus in
robots.txt) for other users while it is crawling. I couldn't figure out how
to make Jetforce stop streaming if the client closes the connection. The code
is here if someone has pointers:
https://github.com/dancek/ansimirror/blob/master/ansimirror.py

Anyone have experience fighting misbehaving crawlers? Should we
develop low-resource honeypots to exhaust crawler resources? Or start
maintaining a community blacklist?

-Hannu


9. Martin Keegan (martin (a) no.ucant.org)

On Sat, 25 Jul 2020, Hannu Hartikainen wrote:

> Should we develop low-resource honeypots to exhaust crawler resources? 
> Or start maintaining a community blacklist?

I'd start with hardening Gemini servers ...

The server I've written has a bunch of hard limits (in particular, on the 
duration of requests and the number of concurrent CGI processes). It might 
be advisable to add some more back-pressure for the case where user-agents 
are generating obviously redundant requests to an excessive degree. I'd 
put that before trying honeypots and blacklists.
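
The two hard limits mentioned above (request duration and number of concurrent handlers) can be sketched with asyncio as below. This is not the code of the server described here; the names run_limited, MAX_CONCURRENT and MAX_SECONDS and their values are hypothetical.

```python
import asyncio

MAX_CONCURRENT = 4   # hypothetical cap on simultaneous handlers
MAX_SECONDS = 10.0   # hypothetical cap on the duration of one request


async def run_limited(handler, slots, max_seconds=MAX_SECONDS):
    """Run handler() under a concurrency cap and a time limit.

    `slots` is an asyncio.Semaphore shared by all requests, for example
    asyncio.Semaphore(MAX_CONCURRENT).  Returns None if the handler
    overran the time limit.
    """
    async with slots:  # waits here while all slots are in use
        try:
            return await asyncio.wait_for(handler(), timeout=max_seconds)
        except asyncio.TimeoutError:
            return None
```

Waiting on the semaphore is a simple form of back-pressure: an overeager client queues up behind its own earlier requests instead of spawning more work.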

Mk

-- 
Martin Keegan, +44 7779 296469, @mk270, https://mk.ucant.org/


10. Julien Blanchard (julien (a) typed-hole.org)

Got a little more than 9K requests in a few hours from this IP, at least
it's good to know that my server on a Pi Zero handled this "load"
gracefully.


11. mojibake (mojibake (a) riseup.net)

Thanks for flagging - also getting slammed by this IP.

If anyone has any decent automated methods for picking up and banning this 
traffic from gemini servers, plz share. I wonder if it's possible to 
configure a fail2ban jail for gemini servers? Anyone tried this? I'm not 
familiar enough with the server-side yet to know how I might go about this.

On 25/07/2020 10:32, Julien Blanchard wrote:
> Got a little more than 9K requests in a few hours from this IP, at least
> it's good to know that my server on a Pi Zero handled this "load"
> gracefully.


12. Alex Schroeder (alex (a) gnu.org)

2200 hits in the last few days...

I'm going to set up a fail2ban rule.  Adapt the datepattern to your
logfiles. Basically any successful connection counts as "a failed login
attempt". Of these, you may have 20 in a 40s time window, which is what
I think is a reasonable upper limit for humans and bots. If you're
crawling the site faster than that, you get banned for 10min by the
firewall.


/etc/fail2ban/jail.d/alex.conf:

[alex-gemini]
enabled = true
port    = 1965
logpath = /home/alex/farm/gemini-wiki.log
findtime = 40
maxretry = 20


/etc/fail2ban/filter.d/alex-gemini.conf:

[Init]
# 2018/08/25-09:08:55 CONNECT TCP Peer: "[000.000.000.000]:56281" Local: "[000.000.000.000]:70"
datepattern = ^%%Y/%%m/%%d-%%H:%%M:%%S

[Definition]
# ANY match in the logfile counts!
failregex = CONNECT TCP Peer: "\[<HOST>\]:\d+"
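
For readers without fail2ban, the policy these settings express can be approximated in a few lines of Python. This is a sketch of the idea only, not how fail2ban is implemented: the class name WindowBan is hypothetical, the 600 second ban is the 10 minutes mentioned above, and banning on the 20th hit inside the window is an assumption about where the threshold falls.

```python
from collections import defaultdict, deque

FINDTIME = 40   # seconds, as in the jail above
MAXRETRY = 20   # connections tolerated inside one window
BANTIME = 600   # seconds; the 10 minute ban mentioned above


class WindowBan:
    """Sliding-window connection counter with temporary bans."""

    def __init__(self):
        self.hits = defaultdict(deque)   # host -> recent connection times
        self.banned_until = {}           # host -> time the ban expires

    def record(self, host, now):
        """Record one connection; return True if `host` is banned at `now`."""
        if self.banned_until.get(host, 0) > now:
            return True
        window = self.hits[host]
        window.append(now)
        # Forget connections that have fallen out of the window.
        while window and window[0] <= now - FINDTIME:
            window.popleft()
        if len(window) >= MAXRETRY:
            self.banned_until[host] = now + BANTIME
            window.clear()
            return True
        return False
```

A client making one request every five seconds never trips this, while twenty requests in twenty seconds does.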


I also activated the recidive rule in fail2ban. This means that people
who get banned by fail2ban repeatedly get banned for even longer times
(hours instead of minutes). This is in the first file again:


/etc/fail2ban/jail.d/alex.conf:

[recidive]
enabled = true


I use this system for my websites, my gopher sites, and now for gemini,
too. The attached image shows what this looks like over time, using
Munin. As you can see, almost all the bans are due to the websites.

Cheers
Alex




-------------- next part --------------
A non-text attachment was scrubbed...
Name: Image-X1D7N0.png
Type: image/png
Size: 36409 bytes
Desc: not available
URL: <https://lists.orbitalfox.eu/archives/gemini/attachments/20200725/b32ea25d/attachment-0001.png>


13. Alex Schroeder (alex (a) gnu.org)

(Resending because it seems that my mail disappeared somewhere.)

2200 hits in the last few days...

I'm going to set up a fail2ban rule.  Adapt the datepattern to your
logfiles. Basically any successful connection counts as "a failed login
attempt". Of these, you may have 20 in a 40s time window, which is what
I think is a reasonable upper limit for humans and bots. If you're
crawling the site faster than that, you get banned for 10min by the
firewall.


/etc/fail2ban/jail.d/alex.conf:

[alex-gemini]
enabled = true
port    = 1965
logpath = /home/alex/farm/gemini-wiki.log
findtime = 40
maxretry = 20


/etc/fail2ban/filter.d/alex-gemini.conf:

[Init]
# 2018/08/25-09:08:55 CONNECT TCP Peer: "[000.000.000.000]:56281" Local: "[000.000.000.000]:70"
datepattern = ^%%Y/%%m/%%d-%%H:%%M:%%S

[Definition]
# ANY match in the logfile counts!
failregex = CONNECT TCP Peer: "\[<HOST>\]:\d+"


I also activated the recidive rule in fail2ban. This means that people
who get banned by fail2ban repeatedly get banned for even longer times
(hours instead of minutes). This is in the first file again:


/etc/fail2ban/jail.d/alex.conf:

[recidive]
enabled = true


I use this system for my websites, my gopher sites, and now for gemini,
too.

Cheers
Alex


14. Solderpunk (solderpunk (a) posteo.net)

And my axe!

Errm, by which I mean, yes, gemini.circumlunar.space has not been left
out of this and I've seen just over 3,000 requests from that IP.

On Sat Jul 25, 2020 at 11:18 AM CEST, Martin Keegan wrote:

> I'd start with hardening Gemini servers ...

Yes, indeed.  The spec does have a status code (44) meaning "slow down!"
intended to allow machine-readable communication of a rate limiting
policy to bots.  Of course, it can be ignored by malicious bots, but it
should help against well-meaning but incompetent bots.  And, if nothing
else, serving up code 44 with no response body over and over again
rather than serving up actual resources should spare the server some
traffic, and also might slow down discovery of new URLs to hit by an
overactive crawler.

Anyway, to the best of my knowledge no server actually implements a rate
limiting scheme using this status code, but I suppose the time has come,
as it inevitably would.  Posts like this one have been common for years
on the gopher mailing list, which is precisely the reason I added such a
status code.  Let's see if it helps any...
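
A rate limiter built on status 44 could look roughly like the sketch below. It is an illustration, not an existing server's implementation: the class name SlowDown and the one-request-per-interval policy are hypothetical choices. It relies on the spec's definition that the META field of a 44 response is an integer number of seconds the client must wait.

```python
import math
import time


class SlowDown:
    """Allow each client one request per `min_interval` seconds."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.last_seen = {}   # client IP -> time of last accepted request

    def check(self, client_ip):
        """Return a status 44 header if the client must wait, else None."""
        now = self.clock()
        last = self.last_seen.get(client_ip)
        if last is not None and now - last < self.min_interval:
            wait = max(1, math.ceil(self.min_interval - (now - last)))
            # Header only; no response body follows a 44.
            return f"44 {wait}\r\n"
        self.last_seen[client_ip] = now
        return None
```

Rejected requests do not reset the timer here, so a client that actually waits the advertised number of seconds gets through on its next attempt.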

Cheers,
Solderpunk


15. mojibake (mojibake (a) riseup.net)

That's great - Thanks Alex. Will try this.

On 25/07/2020 14:32, Alex Schroeder wrote:
> (Resending because it seems that my mail disappeared somewhere.)
>
> 2200 hits in the last few days...
>
> I'm going to setup a fail2ban rule.  Adapt the datepattern to your
> logfiles. Basically any successful connection counts as "a failed login
> attempt". Of these, you may have 20 in a 40s time window, which is what
> I think is a reasonable upper limit for humans and bots. If you're
> crawling the site faster than that, you get banned for 10min by the
> firewall.
>
>
> /etc/fail2ban/jail.d/alex.conf:
>
> [alex-gemini]
> enabled = true
> port    = 1965
> logpath = /home/alex/farm/gemini-wiki.log
> findtime = 40
> maxretry = 20
>
>
> /etc/fail2ban/filter.d/alex-gemini.conf:
>
> [Init]
> # 2018/08/25-09:08:55 CONNECT TCP Peer: "[000.000.000.000]:56281" Local: "[000.000.000.000]:70"
> datepattern = ^%%Y/%%m/%%d-%%H:%%M:%%S
>
> [Definition]
> # ANY match in the logfile counts!
> failregex = CONNECT TCP Peer: "\[<HOST>\]:\d+"
>
>
> I also activated the recidive rule in fail2ban. This means that people
> who get banned by fail2ban repeatedly get banned for even longer times
> (hours instead of minutes). This is in the first file again:
>
>
> /etc/fail2ban/jail.d/alex.conf:
>
> [recidive]
> enabled = true
>
>
> I use this system for my websites, my gopher sites, and now for gemini,
> too.
>
> Cheers
> Alex
>
>
>
>
>
>


16. colecmac (a) protonmail.com (colecmac (a) protonmail.com)

Solderpunk wrote:

> > I'd start with hardening Gemini servers ...
>
> [...]
>
> Anyway, to the best of my knowledge no server actually implements a rate
> limiting scheme using this status code, but I suppose the time has come,
> as it inevitably would.

I have filed a Jetforce issue to add this feature.

https://github.com/michael-lazar/jetforce/issues/39

I'd be happy to see it in Molly Brown as well, and lots of other Gemini
servers!


makeworld


17. Sean Conner (sean (a) conman.org)

It was thus said that the Great Sean Conner once stated:
> 
>   Is anyone else getting slammed by 70.113.100.216?  Does anyone know who
> this is?  I don't mind my server being used by clients as a test platform,
> but making a few hundred requests per second is a bit much.  Also, a bunch
> of bogus requests.

  Okay, I was able to track down the party responsible.  I won't go into how
I did it, or who they are, because I don't want anyone piling on the party
responsible.  I sent a message to said party about the issues, and
hopefully I will hear a response back, or at the very least, the issues
will be resolved silently.

  -spc


18. Petite Abeille (petite.abeille (a) gmail.com)



> On Jul 26, 2020, at 03:02, Sean Conner <sean at conman.org> wrote:
> 
> Okay, I was able to track down the party responsible.  

Well done! :)
