The Ins and Outs of Calculating Browser Usage

I spent the past few hours writing a program to parse the browser string from the web server log files. Why didn't I use an existing web analyzer package? I wanted the browser strings rewritten to carry correct information, in a more consistent style. This meant changing a string from, say:

Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Q312461)

to

MSIE/6.0 Windows/98
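
The program itself isn't shown here, so the following is only a minimal sketch of that kind of rewrite, in Python. The two regular expressions and the name `normalize` are assumptions made for illustration; they handle the MSIE case above and fall back to the leading product token for everything else.

```python
import re

# Hypothetical patterns; the real program's rules are not given in this post.
_MSIE = re.compile(r"MSIE (\d+(?:\.\d+)*)")
_WINDOWS = re.compile(r"Windows (NT )?([\w.]+)")


def normalize(agent: str) -> str:
    """Rewrite a raw browser string as 'Browser/Version OS/Version'."""
    browser = "-/-"
    os_name = "-/-"

    msie = _MSIE.search(agent)
    if msie:
        # IE announces itself as "Mozilla/4.0 (compatible; MSIE x.y; ...)"
        browser = f"MSIE/{msie.group(1)}"
    else:
        # Otherwise keep the leading product token, e.g. "Googlebot/2.1"
        first = agent.split(" ", 1)[0] if agent else ""
        if "/" in first:
            browser = first
        elif first:
            browser = f"{first}/-"

    win = _WINDOWS.search(agent)
    if win:
        prefix = "WindowsNT" if win.group(1) else "Windows"
        os_name = f"{prefix}/{win.group(2)}"

    return f"{browser} {os_name}"


print(normalize("Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Q312461)"))
# prints "MSIE/6.0 Windows/98"
```

A real version would need many more rules (Gecko, Opera, Mac and Linux platforms, and so on); this only covers enough to reproduce the example.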

This also means I can generate decent stats about the popularity of certain browsers on the fly (using the Unix command line, I can pull out the browser string, feed it through the newly written program, then count unique browsers more easily). An initial run through last month's log file for my blog:

Table: Browser Statistics for The Boston Diaries
# Hits 	Browser/Version	OS/Version
1,228 	Googlebot/2.1 	-/-
748 	MSIE/6.0 	WindowsNT/5.1
712 	MSIE/6.0 	Windows/98
641 	MSIE/6.0 	WindowsNT/5.0
476 	Mercator/2.0 	-/-
371 	MSIE/5.5 	Windows/98
303 	MSIE/5.0 	Windows/98
302 	MSIE/5.5 	WindowsNT/5.0
238 	-/- 	-/-
216 	MSIE/5.01 	WindowsNT/5.0
137 	ia_archiver/- 	-/-
113 	Syndic8/1.0 	-/-
101 	NCSA/- 	-/-
101 	MSIE/5.01 	Windows/98
100 	MSIE/6.0 	WindowsNT/4.0
99 	Mozilla/3.01 	-/-
89 	Gecko/20020529 	Linux/i686
88 	Gecko/20020523 	WindowsNT/5.0
81 	MSIE/5.14 	Mac_PowerPC/-
79 	Mozilla/5.0 	-/-
68 	SlySearch/1.2 	-/-
66 	MSIE/5.5 	Windows/95
62 	MSIE/5.5 	WindowsNT/4.0
62 	Gecko/20020529 	PPC/Mac
61 	Openfind/- 	-/-
55 	MSIE/5.0 	Mac_PowerPC/-
49 	Indy-Library/- 	-/-
48 	Gecko/20020510 	Linux/i686
42 	Mozilla/3.0 	-/-
41 	sitecheck.internetseer.com/- 	-/-
40 	Gecko/20020311 	WindowsNT/5.1
38 	MSIE/5.01 	Windows/95
36 	bumblebee@relevare.com/- 	-/-
33 	Gecko/20020530 	WindowsNT/5.0
28 	bumblebee/1.0 	-/-
28 	Gecko/20020510 	WinNT4.0/-
27 	Opera/6.02 	Windows/2000
27 	MSIE/5.0 	WindowsNT/4.0
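
I did the pull-out-and-count step with the Unix command line, but the same idea can be sketched in Python. This assumes the Apache "combined" log format, where the browser string is the last double-quoted field on each line (an assumption; adjust for your own logs). The names `agent_from_log_line` and `count_browsers` are made up for the example, and the normalizing step is left as a parameter.

```python
import re
from collections import Counter

# Assumption: "combined" log format, browser string is the last quoted field.
_LAST_QUOTED = re.compile(r'"([^"]*)"\s*$')


def agent_from_log_line(line):
    """Return the last double-quoted field of a log line, or '-' if none."""
    match = _LAST_QUOTED.search(line)
    return match.group(1) if match else "-"


def count_browsers(lines, normalize=lambda agent: agent):
    """Count hits per (optionally normalized) browser string, most hits first."""
    counts = Counter(normalize(agent_from_log_line(line)) for line in lines)
    return counts.most_common()


# Made-up sample lines, not taken from the real log file.
sample = [
    '10.0.0.1 - - [01/Jun/2002:00:00:01 -0400] "GET / HTTP/1.0" 200 1234 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '10.0.0.2 - - [01/Jun/2002:00:00:02 -0400] "GET / HTTP/1.1" 200 1234 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
    '10.0.0.1 - - [01/Jun/2002:00:00:03 -0400] "GET /robots.txt HTTP/1.0" 200 42 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]

for agent, hits in count_browsers(sample):
    print(hits, agent, sep="\t")
```

Passing a normalizing function as the second argument gives counts per rewritten string rather than per raw string.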

This gives a decent flavor of what's being used to view my site (out of the 7,943 hits last month, about 16% were from the Google spider [1]), but one of the primary reasons I did this was to see just how many people are still using older browsers like Netscape 4x or Internet Explorer 4x (which would show up as Mozilla/4.x and MSIE/4.x respectively). So, stripping out the operating system column and looking at only the major version numbers, we get:

Table: More Specific Browser Statistics for The Boston Diaries
# Hits 	Browser/major Version
2,210 	MSIE/6 
1,671 	MSIE/5 
1,228 	Googlebot/2 
543 	Gecko/- 
476 	Mercator/2 
238 	-/- 
142 	Opera/6 
141 	Mozilla/3 
137 	ia_archiver/- 
134 	Mozilla/4 
113 	Syndic8/1 
101 	NCSA/- 
79 	Mozilla/5 
68 	SlySearch/1 
61 	Openfind/- 
49 	Indy-Library/- 
45 	MSIE/4 
41 	sitecheck.internetseer.com/- 
37 	Netscape6/6.2 
36 	bumblebee@relevare.com/- 
28 	bumblebee/1 
26 	linkhype.com/1 
26 	Netscape/7 
24 	BlogBot/1 
22 	Win32/- 
22 	Konqueror/3.0 
20 	Frontier/8.0 
16 	Internet/- 
16 	Ask-Jeeves/- 
15 	Mozilla/- 
14 	Microsoft/- 
14 	Konqueror/2.2 
12 	w3m/0.2 
12 	obidos/bot 
12 	Mozilla/4.7C-CCK-MCD 
11 	myownhomeblogindexingservicecrawler/- 
11 	htdig/3.1 
10 	Mozilla/3.x 
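
Dropping the operating system column and keeping only the major version can be sketched like this. It is a rough approximation, not the exact rules I used: treating an eight-digit version as a build date and replacing it with "-" is an assumption that happens to reproduce the Gecko/- row, and the names `major_only` and `collapse` are made up for the example.

```python
from collections import Counter


def major_only(entry):
    """Turn 'MSIE/6.0 WindowsNT/5.1' into 'MSIE/6' (OS column dropped)."""
    fields = entry.split()
    browser = fields[0] if fields else "-/-"
    name, _, version = browser.partition("/")
    major = version.split(".", 1)[0] if version else "-"
    # Assumption: eight digits is a build date (e.g. 20020529), not a version.
    if major.isdigit() and len(major) == 8:
        major = "-"
    return f"{name}/{major}"


def collapse(rows):
    """Sum hit counts for rows that share the same browser/major version."""
    totals = Counter()
    for entry, hits in rows:
        totals[major_only(entry)] += hits
    return totals.most_common()


# Three rows copied from the first table.
rows = [
    ("MSIE/6.0 WindowsNT/5.1", 748),
    ("MSIE/6.0 Windows/98", 712),
    ("MSIE/6.0 WindowsNT/5.0", 641),
]
print(collapse(rows))
# prints [('MSIE/6', 2101)]
```

The total in the second table is higher than 2,101 because it also includes the MSIE/6 rows not shown in this three-row sample.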

The bad news: 48% of the browsers were Internet Explorer 5x or 6x (although, surprisingly enough, I did get five hits from a Mozilla [2] based browser under OS/2). The good news, though, is that 58% of the hits were from browsers capable of viewing CSS (Cascading Style Sheets) without crashing. And speaking of horrible browsers that can't support CSS, about 2.5% were running Netscape 4x or IE 4x (they can see the site, only it doesn't look that great).
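
For anyone wanting to check the arithmetic, the Internet Explorer figure comes from dividing the MSIE/5 and MSIE/6 rows of the second table by the 7,943 total hits. A small sketch (the helper name `share` is made up for the example):

```python
def share(hits, total):
    """Percentage of total hits, as a float."""
    return 100.0 * hits / total


TOTAL_HITS = 7943          # total hits last month, from the text above
ie_5_and_6 = 2210 + 1671   # MSIE/6 and MSIE/5 rows of the second table

print(f"IE 5x/6x: {share(ie_5_and_6, TOTAL_HITS):.1f}%")
# prints "IE 5x/6x: 48.9%"
```

The other percentages depend on which rows get grouped together (which strings count as CSS capable, for instance), so they can't be recomputed from the table alone.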

I also checked the log file for Spring's [3] site (Hi honey!). 53% of her visitors are using Internet Explorer 5 or higher, or Mozilla (or Netscape 6 and higher). Only about 3% are using Netscape 4x or Internet Explorer 4x, which is pretty much on par with my site (the rest are mostly robots or experimental browsers).

[1] http://www.googlebot.com/bot.html

[2] http://www.mozilla.org/

[3] http://www.springdew.com/
