---------------------------------------- Rating DVDs with Python February 8th, 2020 ----------------------------------------
Rating DVDs with Python
Recently, my kids were talking about buying some DVDs of their
favourite movies. My son was still missing parts IV to VI of Star Wars,
and considered a boxed edition with all movies. Buying new stuff is
evil, so I pointed my browser to our local craigslist (finn.no in
Norway). A while later, I'd found what I consider the perfect deal: A
huge box with almost 200 DVDs, including ALL movies that my kids
wanted to buy. And all that for almost nothing (we're talking the price
of two(!) Star Wars boxed editions for all 198 DVDs). Most of the movies
in the collection are Limited Editions or Director's Cuts or whatever
you call all these fancy boxes. Hard to say no to this!
So, how do you handle so many movies? Where to start?
I jotted down a list of all movie titles in emacs. To prioritize and
skip the worst movies (time is precious), I like to refer to IMDB
ratings. Everything below 6.5 is a no-go. But how was I going to
retrieve all those ratings? Searching for all movies on imdb.com would
take hours.
Python to the rescue. There is a neat module called IMDbPY that is
able to retrieve information about movies from IMDB. There is lots of
information available, but I was only interested in the rating. I soon
found out, though, that the canonical title was nice to have. After all, my
list was just the result of me punching in all titles as best as I
could. To make things a little bit worse, some of the DVDs had German
titles, some Norwegian titles. IMDB tends to do an OK job with
converting these titles to the original, but not always.
So here is a small snippet of my list:
------
Charlie's Angels
Cliffhanger
Cloverfield
Conair
Corps Bride
D-War
Da Vinci-Koden
Danes With Wolves
Das Boot
------
=> gopher://:/
The script itself consists of only a few lines of Python. Most of it is
error handling, in case IMDB could not find the movie or returned
incomplete information. Here is the code. It's not pretty. You have been
warned!
------
from sys import stdin
from imdb import IMDb

def main():
    lines = stdin.readlines()
    for i in range(len(lines)):
        lines[i] = lines[i].replace('\n','')
    ia = IMDb()
    for movie_name in lines:
        movies = ia.search_movie(movie_name)
        # No hit? Use the original name and continue
        if len(movies) == 0:
            print("|", movie_name, "| ? |")
            continue
        # Usually, the first match is what we're looking for:
        movie = movies[0]
        if movie:
            ia.update(movie, ['vote details'])
            dem = movie.get('demographics')
            if dem:
                print("|", movie['canonical title'], "|", movie.get('demographics')['imdb users']['rating'], "|")
            else:
                # No rating - seems like some movies have bad data
                print("|", movie_name, "| ? |")
        else:
            print("|", movie_name, "| ? |")

if __name__ == '__main__':
    main()
------
Now, to convert my list into an org-mode-table, I just did the
following:
cat ~/Sync/org-files/movies.org | python3 movie-org.py > rated-movies.org
Now it was just a matter of opening the resulting file in emacs and
pressing TAB to align all columns nicely. Resulting in:
------
| Superman Returns                           | 6.0 |
| As It Is in Heaven                         | 7.5 |
| Terminator 2: Judgment Day                 | 8.5 |
| Lincoln Rhyme: Hunt for the Bone Collector | 6.6 |
| Borgias, The                               | 7.9 |
| Cell, The                                  | 6.3 |
| Dark Knight, The                           | 9.0 |
| Day After Tomorrow, The                    | 6.4 |
------
So here we go! A list of movies with ratings.
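Applying the 6.5 cut-off by eye works fine for a short list, but with
almost 200 rows a few more lines of Python can do it. What follows is
only a sketch and not part of the original script: the helper names
parse_row and worth_watching are made up for illustration, and it
assumes every row has exactly two cells (title and rating), with "?"
standing in for a missing rating.

```python
def parse_row(line):
    """Split an org-mode row like '| Title | 7.5 |' into (title, rating).

    Hypothetical helper. Returns None for lines that are not two-cell
    rows, and a rating of None when the rating cell is not a number
    (for example the '?' placeholder).
    """
    cells = [cell.strip() for cell in line.strip().strip('|').split('|')]
    if len(cells) != 2:
        return None
    title, rating = cells
    try:
        return title, float(rating)
    except ValueError:
        return title, None


def worth_watching(lines, threshold=6.5):
    """Keep rows rated at or above the threshold, best rated first."""
    rows = [row for row in map(parse_row, lines) if row is not None]
    keep = [(title, rating) for title, rating in rows
            if rating is not None and rating >= threshold]
    return sorted(keep, key=lambda row: row[1], reverse=True)
```

Feeding it the lines of rated-movies.org (for instance via
open(...).read().splitlines()) gives back a list of (title, rating)
pairs. Movies without a rating are dropped here, which may or may not
be what you want; they could just as well be kept for a manual lookup.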
That said, I'm not very fond of watching movies. I guess that's
because I'm sitting in front of a screen all day at work. Also,
watching movies is way too passive for me. Playing with my computer is
not. Before I met my wife, I had hardly watched any movies at home,
except some rentals that I watched together with friends. In fact, I
watched my first DVD together with my wife in my late 20s.
Anyway, this script was a nice little project!
Oh, one more thing: I wrote a little script that exposes parts of the
Internet Movie Database on Gopher. I call it the Gopher Movie Database.
Here you go: