The Ins and Outs of Calculating Browser Usage

I spent the past few hours writing a program to parse the browser string from the web server log files. Why didn't I use an existing web analyzer package? I wanted the browser strings rewritten to carry correct information, in a more consistent style. This meant changing it from, say:

Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Q312461)

to:

MSIE/6.0 Windows/98
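
For illustration, here is a minimal sketch of that kind of rewrite in Python. The original program isn't shown in this post, so the function name normalize, the two regular expressions, and the choice to handle only MSIE on Windows are all assumptions; anything else falls through to the "-/-" placeholder used in the tables.

```python
import re

# Hypothetical patterns: they cover only the MSIE-on-Windows case.
MSIE_RE = re.compile(r"MSIE (\d+(?:\.\d+)*)")
WINDOWS_RE = re.compile(r"Windows (NT )?(\d+(?:\.\d+)*)")


def normalize(user_agent):
    """Rewrite a raw browser string as 'Browser/Version OS/Version'."""
    browser = "-/-"   # placeholder when no browser is recognized
    os_name = "-/-"   # placeholder when no operating system is recognized

    msie = MSIE_RE.search(user_agent)
    if msie:
        browser = "MSIE/" + msie.group(1)

    windows = WINDOWS_RE.search(user_agent)
    if windows:
        # "Windows NT 5.1" becomes WindowsNT/5.1; "Windows 98" becomes Windows/98.
        family = "WindowsNT" if windows.group(1) else "Windows"
        os_name = family + "/" + windows.group(2)

    return browser + " " + os_name


raw = "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Q312461)"
print(normalize(raw))
```

A real version would need many more rules (robots, Gecko, Opera, Mac and Linux platforms); this only shows the shape of the transformation.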

This also means I can generate decent stats about the popularity of certain browsers on the fly (using the Unix command line, I can pull out the browser string, feed that through the newly written program, then count unique browsers more easily). An initial run through last month's log file for my blog:

Table: Browser Statistics for The Boston Diaries
# Hits 	Browser/Version	OS/Version
1,228 	Googlebot/2.1 	-/-
748 	MSIE/6.0 	WindowsNT/5.1
712 	MSIE/6.0 	Windows/98
641 	MSIE/6.0 	WindowsNT/5.0
476 	Mercator/2.0 	-/-
371 	MSIE/5.5 	Windows/98
303 	MSIE/5.0 	Windows/98
302 	MSIE/5.5 	WindowsNT/5.0
238 	-/- 	-/-
216 	MSIE/5.01 	WindowsNT/5.0
137 	ia_archiver/- 	-/-
113 	Syndic8/1.0 	-/-
101 	NCSA/- 	-/-
101 	MSIE/5.01 	Windows/98
100 	MSIE/6.0 	WindowsNT/4.0
99 	Mozilla/3.01 	-/-
89 	Gecko/20020529 	Linux/i686
88 	Gecko/20020523 	WindowsNT/5.0
81 	MSIE/5.14 	Mac_PowerPC/-
79 	Mozilla/5.0 	-/-
68 	SlySearch/1.2 	-/-
66 	MSIE/5.5 	Windows/95
62 	MSIE/5.5 	WindowsNT/4.0
62 	Gecko/20020529 	PPC/Mac
61 	Openfind/- 	-/-
55 	MSIE/5.0 	Mac_PowerPC/-
49 	Indy-Library/- 	-/-
48 	Gecko/20020510 	Linux/i686
42 	Mozilla/3.0 	-/-
41 	-/-
40 	Gecko/20020311 	WindowsNT/5.1
38 	MSIE/5.01 	Windows/95
36 	-/-
33 	Gecko/20020530 	WindowsNT/5.0
28 	bumblebee/1.0 	-/-
28 	Gecko/20020510 	WinNT4.0/-
27 	Opera/6.02 	Windows/2000
27 	MSIE/5.0 	WindowsNT/4.0
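
The counting step behind a table like this one can be sketched in Python as well. This is not the Unix pipeline actually used; tally and format_table are hypothetical helpers, and the input is assumed to be one already-normalized browser string per line.

```python
from collections import Counter


def tally(normalized_lines):
    """Count identical normalized browser strings, most frequent first."""
    counts = Counter(
        line.strip() for line in normalized_lines if line.strip()
    )
    return counts.most_common()


def format_table(rows):
    """Render (string, hits) pairs as tab-separated rows like the table above."""
    out = []
    for name, hits in rows:
        # Split 'Browser/Version OS/Version' into its two columns.
        browser, _, os_name = name.partition(" ")
        out.append(f"{hits:,}\t{browser}\t{os_name or '-/-'}")
    return "\n".join(out)


sample = [
    "MSIE/6.0 Windows/98",
    "Googlebot/2.1 -/-",
    "MSIE/6.0 Windows/98",
]
print(format_table(tally(sample)))
```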

This gives a decent flavor of what's being used to view my site (out of the 7,943 hits last month, about 16% were from the Google spider [1]), but one of the primary reasons I did this was to see just how many people are still using older browsers like Netscape 4x or Internet Explorer 4x (which would show up as Mozilla/4.x and MSIE/4.x respectively). So, stripping out the operating system column and looking at only the major version numbers, we get:

Table: More Specific Browser Statistics for The Boston Diaries
# Hits 	Browser/Major Version
2,210 	MSIE/6 
1,671 	MSIE/5 
1,228 	Googlebot/2 
543 	Gecko/- 
476 	Mercator/2 
238 	-/- 
142 	Opera/6 
141 	Mozilla/3 
137 	ia_archiver/- 
134 	Mozilla/4 
113 	Syndic8/1 
101 	NCSA/- 
79 	Mozilla/5 
68 	SlySearch/1 
61 	Openfind/- 
49 	Indy-Library/- 
45 	MSIE/4 
37 	Netscape6/6.2 
28 	bumblebee/1 
26 	Netscape/7 
24 	BlogBot/1 
22 	Win32/- 
22 	Konqueror/3.0 
20 	Frontier/8.0 
16 	Internet/- 
16 	Ask-Jeeves/- 
15 	Mozilla/- 
14 	Microsoft/- 
14 	Konqueror/2.2 
12 	w3m/0.2 
12 	obidos/bot 
12 	Mozilla/4.7C-CCK-MCD 
11 	myownhomeblogindexingservicecrawler/- 
11 	htdig/3.1 
10 	Mozilla/3.x 
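
Collapsing the first table into this one amounts to dropping the OS column, cutting each version down to its major number, and summing the hits. A rough Python sketch follows; major_version and collapse are hypothetical names, and treating an eight-digit version as a build date (as with Gecko) is my assumption about how "Gecko/-" came about, not something stated above. It also will not reproduce every row of the table exactly (a few entries there keep a minor version).

```python
from collections import Counter


def major_version(browser_field):
    """Reduce 'Name/Version' to 'Name/Major', e.g. 'MSIE/5.01' to 'MSIE/5'."""
    name, _, version = browser_field.partition("/")
    major = version.split(".", 1)[0] if version else "-"
    # Assumption: an 8-digit numeric version is a build date, not a release.
    if len(major) == 8 and major.isdigit():
        major = "-"
    return f"{name}/{major or '-'}"


def collapse(rows):
    """Sum hits per browser/major version from (hits, 'Name/Version') pairs."""
    totals = Counter()
    for hits, browser_field in rows:
        totals[major_version(browser_field)] += hits
    return totals.most_common()


rows = [
    (748, "MSIE/6.0"),
    (712, "MSIE/6.0"),
    (371, "MSIE/5.5"),
    (303, "MSIE/5.0"),
    (89, "Gecko/20020529"),
]
for name, hits in collapse(rows):
    print(f"{hits:,}\t{name}")
```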

The bad news: 48% of the hits were from Internet Explorer 5x or 6x (although, surprisingly enough, I did get five hits from a Mozilla [2] based browser under OS/2). The good news, though, is that 58% of the hits were from browsers capable of viewing CSS (Cascading Style Sheets) without crashing. And speaking of horrible browsers that can't support CSS, about 2.5% of the hits were from Netscape 4x or IE 4x (those visitors can see the site, only it doesn't look that great).
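
Each of these shares is just the hits for a group divided by the total hits. As a small worked sketch in Python (share is a hypothetical helper; the inputs are the 7,943 total and rows from the tables above, which list only the most frequent entries, so results computed from them may differ slightly from the rounded figures quoted):

```python
TOTAL_HITS = 7943  # total hits for the month, as stated above


def share(hits, total=TOTAL_HITS):
    """Return hits as a percentage of total."""
    return 100.0 * hits / total


# Internet Explorer 5 and 6 rows from the second table.
ie_5_and_6 = 2210 + 1671
print(f"MSIE 5 and 6: {share(ie_5_and_6):.1f}%")

# Googlebot row from the second table.
print(f"Googlebot: {share(1228):.1f}%")
```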

I also checked the log file for Spring's [3] site (Hi honey!). 53% of her visitors are using Internet Explorer 5 or higher, or Mozilla (or Netscape 6 and higher). Only about 3% are using Netscape 4x or Internet Explorer 4x, which is pretty much on par with my site (the rest are mostly robots or experimental browsers).



