2006-03-23 Wikis

CommunityWiki and Oddwiki are being slammed. Yesterday I fixed an infinite recursive include problem and set a limit of 10 processes (to guard against cross-script infinite recursive include problems).

I ran *top* and saw that we had **load > 60**!

I checked **tail -f access.log** and noticed an IP number.

I added it to .htaccess:

Order Allow,Deny
Allow from all
Deny from 84.19.180.57

And soon enough, load was coming down again:

top - 04:07:02 up 1 day, 14:08, 11 users,  load average: 45.27, 59.65, 54.06
Tasks: 264 total,   2 running, 256 sleeping,   0 stopped,   6 zombie
Cpu(s):  5.8% us,  1.3% sy,  0.0% ni, 19.6% id, 73.1% wa,  0.0% hi,  0.2% si
Mem:   2056776k total,  1348628k used,   708148k free,    47676k buffers
Swap:  7783452k total,   450160k used,  7333292k free,   495848k cached

When I looked at the log using my leech-detector script (see WebServerLogs), I saw this:

alex@mirabel:~$ tail -n 10000 websites/web_logfiles/communitywiki.org/access.log | leech-detector | head
        84.19.180.57       9946     0K  99%    0.0s  403 (100%)
         66.249.65.9         17     5K   0%   11.1s  200 (82%), 302 (11%), 400 (5%)
        64.203.57.78         12     2K   0%    5.9s  200 (66%), 304 (16%), 302 (16%)
        194.3.93.136          7     9K   0%    1.7s  200 (85%), 301 (14%)
      195.101.248.97          4     8K   0%    1.3s  200 (100%)
        209.191.83.3          2     9K   0%   22.0s  200 (100%)
       64.158.138.84          2     2K   0%    4.0s  200 (100%)
        72.14.199.12          1     9K   0%          200 (100%)
      82.123.123.155          1     9K   0%          200 (100%)
         83.90.57.26          1     9K   0%          200 (100%)
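
The leech-detector script itself is not shown in this post. As a rough idea of the kind of per-IP summary it prints, here is a minimal Python sketch. It assumes the log layout seen in the sample lines further down (virtual host, client IP, ident, user, timestamp, request, status, bytes); the name `summarize` and the regular expression are made up for illustration and are not taken from the real script.

```python
import re
from collections import Counter, defaultdict

# Assumed layout: vhost, client IP, ident, user, [timestamp], "request", status, bytes, ...
LINE_RE = re.compile(r'^\S+ (\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')

def summarize(lines):
    """Return (ip, hits, share_percent, status_counts) tuples, busiest IP first."""
    statuses = defaultdict(Counter)
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # silently skip lines that do not fit the assumed layout
            ip, status = m.groups()
            statuses[ip][status] += 1
    total = sum(sum(c.values()) for c in statuses.values())
    rows = [(ip, sum(c.values()), 100 * sum(c.values()) // total, c)
            for ip, c in statuses.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows
```

Feeding it the last 10000 lines of the access log (for example via `summarize(open(path))` on a trimmed copy) would give hits, share, and status code counts per IP, which is enough to spot a single client dominating the traffic.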

I wondered – what the fuck is the stupid asshole leeching!?

www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BulletSummaryBlock;revision=17 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BlogControlledByWiki;revision=58 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BulletSummaryBlock;revision=18 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BlogControlledByWiki;revision=57 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BlogControlledByWiki;revision=8 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BulletSummaryBlock;revision=14 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BlogControlledByWiki;revision=60 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BulletSummaryBlock;revision=19 HTTP/1.1" 403 303 0 "-"
 "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"
www.communitywiki.org 84.19.180.57 - - [23/Mar/2006:04:15:44 -0700]
 "GET /en?action=browse;id=BlogWiki HTTP/1.1" 403 303 0 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)"

They are getting everything – old revisions, current stuff, everything.

A little **whois 84.19.180.57** tells me what I wanted to know: IP range (84.19.176.0 - 84.19.191.255) and company (Keyweb AG IP Network). They do hosting, so they or one of their customers has written a bad bad spider...
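
I only blocked the single address, but if the whole range ever had to be handled, the first and last address from whois can be turned into CIDR notation. A small sketch using Python's standard `ipaddress` module; the helper name `range_to_cidrs` is mine:

```python
import ipaddress

def range_to_cidrs(first, last):
    """Convert an inclusive first-last address range into a list of CIDR blocks."""
    networks = ipaddress.summarize_address_range(
        ipaddress.ip_address(first), ipaddress.ip_address(last))
    return [str(net) for net in networks]

# The range reported by whois for the offending address.
blocks = range_to_cidrs("84.19.176.0", "84.19.191.255")
print(blocks)
```

For this range the result is a single /20 block, which also makes it easy to check whether some other address belongs to the same hoster.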

How many hits?

alex@mirabel:~$ grep 84.19.180.57 websites/web_logfiles/communitywiki.org/access.log | wc -l
100889
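
Note that the grep pattern treats each dot as "any character" and matches anywhere in the line, so the count can include a few stray lines. A stricter count compares the client field exactly. A minimal sketch, assuming the client IP is the second whitespace-separated field as in the log lines above; `count_hits` is a made-up name:

```python
def count_hits(lines, ip):
    """Count log lines whose client field (second column) equals ip exactly."""
    count = 0
    for line in lines:
        fields = line.split(None, 2)  # only the first two fields are needed
        if len(fields) >= 2 and fields[1] == ip:
            count += 1
    return count
```

For a log dominated by one address the difference to the grep count is probably tiny, but the exact match avoids counting other addresses that merely contain the same digits.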

I wonder how the script reacted:

alex@mirabel:~$ grep 84.19.180.57 websites/web_logfiles/communitywiki.org/access.log | leech-detector
        84.19.180.57     102769     5K 100%   21.7s  403 (42%), 200 (26%), 503 (21%), 301 (4%), 404 (3%), 302 (1%), 400 (0%), 501 (0%)

Hm, so they did get a lot of 200 OK responses. 503 Service Unavailable is what you get when you send too many requests; I would have expected more of those. Amazing how many 403 Forbidden responses they are getting – it seems that now that they are being denied by the server, they’re just speeding up.

alex@mirabel:~$ grep 84.19.180.57 websites/web_logfiles/communitywiki.org/access.log | leech-detector
        84.19.180.57     111378     4K 100%   20.1s  403 (46%), 200 (24%), 503 (20%), 301 (3%), 404 (3%), 302 (0%), 400 (0%), 501 (0%)

See how within minutes they sent another 8600 or so requests (111378 − 102769 = 8609)...

Maybe somebody ordered his IE to prepare the site for off-line reading... 😄

​#Wikis