2019-03-01 Podcast Numbers

I was wondering about the download statistics for my tiny little podcast. How would I figure this out? On my server, I keep four days of access logs. (See Privacy Policy for more information.) I posted the latest episode three days ago, and I verified that the oldest of my log files does not mention it. That means I didn’t miss any of the downloads.

So I grepped through the logs for the MP3 file and saved the matching lines to look through.
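A rough sketch of that step could look like the following. The helper name `collect_hits`, the log directory, and the MP3 file name are all assumptions for illustration; only the name of the resulting `.log` file comes from this post. The sketch also assumes that rotated logs may be gzip-compressed.

```shell
# collect_hits PATTERN LOGFILE...
# Print every line that mentions PATTERN, reading plain and
# gzip-compressed log files alike. (Hypothetical helper.)
collect_hits() {
  pattern=$1
  shift
  for f in "$@"; do
    case $f in
      *.gz) gzip -dc -- "$f" ;;  # rotated, compressed log
      *)    cat -- "$f" ;;       # current, plain log
    esac
  done | grep -F -- "$pattern"
}

# Possible usage; the path and the MP3 name are guesses:
# collect_hits 20-halberds-and-helmets.mp3 /var/log/apache2/access.log* \
#   > 20-halberds-and-helmets.log
```

Using `grep -F` treats the file name as a fixed string, so the dots in it are not interpreted as regular expression wildcards.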

`wc -l 20-halberds-and-helmets.log` tells me that there are 48 lines. It’s not a lot but it’s what I’ve got.

A quick inspection shows that I have to discard HEAD requests. I should just be counting GET requests!

`grep GET 20-halberds-and-helmets.log | wc -l` gives me 37 lines.
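A plain `grep GET` also matches lines where those three letters show up somewhere else, such as in a user agent string. As a slightly stricter sketch, the requests can be tallied by method instead. This assumes the common or combined log format, where the request is the first double-quoted field; the function name `count_methods` is made up for this example.

```shell
# Tally requests by HTTP method, reading log lines on stdin.
# Assumes the request ("GET /path HTTP/1.1") is the first
# double-quoted field of each line.
count_methods() {
  awk -F'"' '{ split($2, req, " "); n[req[1]]++ }
             END { for (m in n) print n[m], m }' | sort -rn
}

# Possible usage:
# count_methods < 20-halberds-and-helmets.log
```

The output has one line per method, most frequent first, so the HEAD requests can be read off directly instead of being filtered out by hand.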

Further inspection shows that quite a few of these requests have the status 206 (“partial content”), which means a single application is downloading the file in several parts. But how do I figure out which of them belong together? I’m trying to do this without looking at IP addresses.
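To see how many requests are affected, the GET requests can be tallied by status code. This is only a sketch: it assumes the common or combined log format, where the status code follows the quoted request, and `count_get_status` is a hypothetical name.

```shell
# Tally GET requests by status code, reading log lines on stdin.
# Assumes the status code is the first word after the quoted request.
count_get_status() {
  awk -F'"' '$2 ~ /^GET / { split($3, rest, " "); n[rest[1]]++ }
             END { for (s in n) print n[s], s }' | sort -k2n
}

# Possible usage:
# count_get_status < 20-halberds-and-helmets.log
```

Each output line is a count followed by a status code, so the share of 206 responses among all GET requests is visible at a glance.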

Visually, it looks like I can determine what’s going on by looking at the *minutes* and the *user agent* for all the 206 results. Let’s try this. (And yes, I did have a HEAD request with a 206 result!)

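grep 'GET.* 206 ' 20-halberds-and-helmets.log | perl -e '
# Print the minute and the user agent name for each partial GET request.
while (<STDIN>) {
  chomp;
  # Capture the minutes from the timestamp and, from the last quoted
  # field, the user agent up to the first slash.
  my ($minute, $ua) =
    /\[\d\d\/\w+\/\d\d\d\d:\d\d:(\d\d).*"([^"\/]*)[^"]*"$/;
  print "$minute $ua\n";
}
'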

And here’s the result, with an arrow indicating the rows I consider to be “duplicates.”

20 Mozilla
44 AppleCoreMedia
44 AppleCoreMedia ←
39 iTMS
44 AppleCoreMedia
44 AppleCoreMedia ←
44 AppleCoreMedia ←
46 Mozilla
46 Mozilla ←
46 Mozilla ←
19 Mozilla

Manually counting them, I think we could get away with saying that we need to discount three AppleCoreMedia and two Mozilla results, right?
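The arrows above mark lines that repeat the line directly before them, and that is the kind of repetition `uniq -c` counts. As a sketch of how the manual count could be automated (the function name `count_repeats` is invented here, and treating "same minute, same user agent, consecutive lines" as one download is the same assumption as above):

```shell
# Read "MINUTE USER-AGENT" lines on stdin and report, per user agent,
# how many lines merely repeat the line directly above them.
count_repeats() {
  uniq -c | awk '
    $1 > 1 {
      # Fields: run length, minute, then the user agent (may have spaces).
      ua = $3
      for (i = 4; i <= NF; i++) ua = ua " " $i
      extra[ua] += $1 - 1
    }
    END { for (ua in extra) print extra[ua], ua }
  ' | sort -k2
}

count_repeats <<'EOF'
20 Mozilla
44 AppleCoreMedia
44 AppleCoreMedia
39 iTMS
44 AppleCoreMedia
44 AppleCoreMedia
44 AppleCoreMedia
46 Mozilla
46 Mozilla
46 Mozilla
19 Mozilla
EOF
```

For the list above this reports three repeats for AppleCoreMedia and two for Mozilla, matching the arrows.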

So let’s count the hits per *user agent* and then we’ll correct for the partial content results above.

grep GET 20-halberds-and-helmets.log | perl -e '
# Count GET requests per user agent name (the part before the first slash).
my %count;
my $total = 0;
while (<STDIN>) {
  chomp; my ($ua) = /"([^"\/]*)[^"]*"$/;
  $count{$ua}++;
  $total++;
}
# List the user agents, most frequent first, followed by the total.
for my $ua (sort {$count{$b} <=> $count{$a}} keys %count) {
  printf("%5d %s\n", $count{$ua}, $ua);
}
print "---- --------------------\n";
printf("%5d total\n", $total);
'

This would be the result without correcting for the partial content:

   10 Mozilla
    5 AppleCoreMedia
    4 Pocket Casts
    3 Overcast
    3 PodcastAddict
    2 Dalvik
    2 okhttp
    2 Googlebot-Video
    2 stagefright
    1 AndroidDownloadManager
    1 iTMS
    1 Player FM
    1 iCatcher!
---- --------------------
   37 total

Making the correction I mentioned above:

    8 Mozilla ←
    4 Pocket Casts
    3 Overcast
    3 PodcastAddict
    2 AppleCoreMedia ←
    2 Googlebot-Video
    2 stagefright
    2 okhttp
    2 Dalvik
    1 AndroidDownloadManager
    1 iCatcher!
    1 iTMS
    1 Player FM
---- --------------------
   32 total ←

The result shows that *Pocket Casts* is popular. I guess it’s a podcatcher. *iCatcher!* is the one I use. 🙂

I’m surprised that *Mozilla* is up there. When I looked at the details of the user agent strings, I noticed that they mostly belong to bots.

What the hell is *Googlebot-Video* doing here? Is Google offering audio search results somewhere? Using podcasts to train their AI overlords?

All the other user agents look like legitimate tools, frameworks, programming languages, libraries, etc.
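A crude way to split full user agent strings into likely bots and everything else is a keyword match. The sketch below is only a heuristic: the keyword list and the function name `classify_agents` are assumptions, and bots that don't identify themselves will land in the "other" bucket.

```shell
# Count full user agent strings (one per line on stdin) by rough category.
# The keyword list is a guess, not an authoritative bot detector.
classify_agents() {
  awk '{
    label = (tolower($0) ~ /bot|crawl|spider/) ? "bot" : "other"
    n[label]++
  }
  END { printf "%d bot\n%d other\n", n["bot"], n["other"] }'
}

# Possible usage, assuming the combined log format, where the user
# agent is the sixth field when splitting on double quotes:
# grep '"GET ' 20-halberds-and-helmets.log |
#   awk -F'"' '{ print $6 }' | classify_agents
```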

That’s why I think that 32 people listened to episode 20 of the podcast, and a few probably didn’t listen to all of it.

​#Halberds and Helmets Podcast ​#Podcast ​#Administration

Comments

(Please contact me if you want to remove your comment.)

I use PodBean as my catcher, but I’m not sure what engine it uses for download.

– Shelby 2019-03-01 18:28 UTC

---

Hey! I’m the guy using Overcast! Home, work, and mobile.

I absolutely love your podcast, and your content holistically. More soon?

– Tim McDowell 2019-06-04 07:28 UTC

---

Thanks! Maybe. I wanted to talk about Mass Combat but then last session the players opted not to use the rules so I’m a bit stumped. 😆

– Alex Schroeder 2019-06-04 17:52 UTC