Some more observations about the MJ12Bot

I received another reply from MJ12Bot [1] about their badly written bot [2]. It just said the person responsible for handling enquiries was out of the office for the day and I should expect a response tomorrow. We shall see. In the meantime, I decided to check some of the other bots hitting my site and see how well they fare, request-wise. I'm using last month's logs for this, so these results cover 30 days of traffic.

Table: Top 10 bots hitting The Boston Diaries
requests	percentage	user agent
------------------------------
46334	19	The Knowledge AI
38097	16	Mozilla/5.0 (compatible; SemrushBot/3~bl; +http://www.semrush.com/bot.html)
17130	7	Mozilla/5.0 (compatible; BLEXBot/1.0; +http://webmeup-crawler.com/)
15928	7	Mozilla/5.0 (compatible; AhrefsBot/6.1; +http://ahrefs.com/robot/)
12358	5	Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
8929	4	Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)
8908	4	Gigabot
7872	3	Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)
6942	3	Barkrowler/0.9 (+http://www.exensa.com/crawl)
4737	2	istellabot/t.1.13
------------------------------
167235	70	Total (out of 239641)
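
For anyone wanting to produce a similar tally from their own logs, here is a minimal sketch in Python. It assumes Apache-style "combined" log lines, where the user agent is the last quoted field; this post doesn't say what log format or tooling produced the numbers above, and the names `LOG_LINE` and `top_agents` are made up for illustration.

```python
import re
from collections import Counter

# Assumption: Apache "combined" format, i.e. a line ending in
#   "REQUEST" STATUS BYTES "REFERER" "USER-AGENT"
# Agents containing escaped quotes are not handled by this simple pattern.
LOG_LINE = re.compile(r'"[^"]*" (\d{3}) \S+ "[^"]*" "([^"]*)"$')

def top_agents(lines, n=10):
    """Return (requests, percentage, user agent) rows for the n busiest agents."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line.rstrip("\n"))
        if match:                      # skip lines that don't parse
            counts[match.group(2)] += 1
    total = sum(counts.values())
    return [(count, round(100 * count / total), agent)
            for agent, count in counts.most_common(n)]
```

Feeding it an open log file, as in `top_agents(open("access.log"))` (the file name is hypothetical), would give rows in the same requests/percentage/user agent order as the table.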

So let's see some results:

Table: Results of bot queries
Bot	200	%	301	%	304	%	400	%	403	%	404	%	410	%	500	%	Total	%
------------------------------
The Knowledge AI	42676	92.1	3352	7.2	0	0.0	127	0.3	4	0.0	170	0.4	5	0.0	0	0.0	46334	100.0
SemrushBot/3~bl	36088	94.7	1873	4.9	0	0.0	110	0.3	0	0.0	21	0.1	5	0.0	0	0.0	38097	100.0
BLEXBot/1.0	16633	97.1	208	1.2	124	0.7	114	0.7	0	0.0	46	0.3	5	0.0	0	0.0	17130	100.0
AhrefsBot/6.1	15840	99.4	78	0.5	0	0.0	4	0.0	0	0.0	5	0.0	0	0.0	1	0.0	15928	99.9
bingbot/2.0	12304	99.6	35	0.3	0	0.0	6	0.0	0	0.0	3	0.0	5	0.0	0	0.0	12353	99.9
MegaIndex.ru/2.0	8412	94.2	456	5.1	0	0.0	24	0.3	0	0.0	36	0.4	1	0.0	0	0.0	8929	100.0
Gigabot	8428	94.6	448	5.0	0	0.0	23	0.3	0	0.0	7	0.1	2	0.0	0	0.0	8908	100.0
MJ12bot/v1.4.8	2015	25.6	175	2.2	0	0.0	2	0.0	0	0.0	5680	72.2	0	0.0	0	0.0	7872	100.0
Barkrowler/0.9	6604	95.1	300	4.3	0	0.0	10	0.1	0	0.0	28	0.4	0	0.0	0	0.0	6942	99.9
istellabot/t.1.13	4705	99.3	28	0.6	0	0.0	0	0.0	0	0.0	0	0.0	0	0.0	4	0.1	4737	100.0
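
The per-status percentages in a table like this are just each count divided by that bot's total. A rough sketch of the arithmetic in Python, assuming the (user agent, status) pairs have already been pulled out of the logs somehow; `status_breakdown` is a hypothetical helper, not the script used for this post.

```python
from collections import Counter, defaultdict

def status_breakdown(records):
    """Map each agent to {status: (count, percent of that agent's requests)}."""
    per_agent = defaultdict(Counter)
    for agent, status in records:
        per_agent[agent][status] += 1

    table = {}
    for agent, statuses in per_agent.items():
        total = sum(statuses.values())
        table[agent] = {
            status: (count, round(100 * count / total, 1))
            for status, count in sorted(statuses.items())
        }
    return table

# The MJ12bot row from the table above, expanded back into individual records.
mj12 = ([("MJ12bot/v1.4.8", 200)] * 2015 + [("MJ12bot/v1.4.8", 301)] * 175 +
        [("MJ12bot/v1.4.8", 400)] * 2 + [("MJ12bot/v1.4.8", 404)] * 5680)
print(status_breakdown(mj12)["MJ12bot/v1.4.8"][404])   # (5680, 72.2)
```

Rounding each cell separately like this is also why some rows in the table add up to 99.9 rather than 100.0.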

Percentage-wise, of the top 10 bots hitting my blog (and in fact, these are the top ten clients hitting my blog), MJ12Bot is just bad, with 72% of its requests being bad. It's hard to say what the second worst one is, but I'll have to give it to “The Knowledge AI” bot (and my search-fu is failing me in finding anything about this one). Percentage-wise, it's about on par with the others, but some of its requests are also rather odd.

It appears to be a similar problem to MJ12Bot's, but one that doesn't happen nearly as often.

Now, this isn't to say I don't have some legitimate “not found” (404) results. I did come across some actual valid 404 results on my own blog.

Some are typos, some are placeholders for links I forgot to add. And those I can fix. I just wish someone would fix MJ12Bot. Not because it's bogging down my site with unwanted traffic, but because it's just bad at what it does.

[1] https://mj12bot.com/

[2] /boston/2019/07/09.1
