The New York Times released a fascinating Netflix mashup that shows the 10 most rented movies of 2009 in any ZIP code in the United States. You can also slice the data by movie, ordered by popularity, and even view a heat map for any given ZIP code.
I know Netflix has an API, but because 2009 is over, I wouldn’t expect the New York Times to have made REST API requests more than once, when it first collected the data. Building a map of the past is much easier than showing what’s happening at this very moment.
By examining Safari’s Activity window, I can see several assets are loaded for the Flash map:
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/data/cities.txt
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/data/ranks.amf.swf
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/data/scores_and_images.txt
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/data/top_movies.txt
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/data/zips.amf.swf
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/data/zips_ms1.swf
- http://graphics8.nytimes.com/packages/flash/newsgraphics/2010/0108-netflix/NetflixGraphic2.swf
The text files appear to be tab-delimited, which suggests the Flash presentation’s code fetches each file from its URL, loops through it line by line, and splits each line on tabs so it can treat every field differently. For example, cities.txt holds a rudimentary profile of each metropolitan area, complete with latitude and longitude coordinates. scores_and_images.txt and top_movies.txt seem to hold similarly formatted data. The hard part would be mapping the ZIP codes, which could have been done by hand, or at least collected once by hand and reused for this project. It’s difficult to say.
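To make the line-by-line, split-on-tabs idea concrete, here is a minimal Python sketch. It is not the Times’s ActionScript, and the sample rows, the header row and the column names (name, lat, lon) are all my own assumptions for illustration; I haven’t confirmed the real layout of cities.txt.

```python
import csv
import io

# Hypothetical sample data: the real cities.txt layout is not confirmed,
# so the header row and the name/lat/lon columns are assumptions.
SAMPLE = "name\tlat\tlon\nNew York\t40.71\t-74.01\nBoston\t42.36\t-71.06\n"


def parse_tab_delimited(text):
    """Return one dict per data row, keyed by the header row's column names."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row) for row in reader]


cities = parse_tab_delimited(SAMPLE)
for city in cities:
    # Every field arrives as a string, so coordinates are converted explicitly.
    print(city["name"], float(city["lat"]), float(city["lon"]))
```

If the real files turn out to have no header row, csv.reader with the same delimiter would return plain lists of fields instead.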
The Flash map was created by Matthew Bloch, Amanda Cox, Jo Craven McGinty and Kevin Quealy. It’s a shame that many of these crucial people don’t have stronger Web presences through which to share their secrets. Interestingly, at least one of them is a traditional reporter, which means the data could have been obtained by other means, especially because I don’t see a ZIP code or location-based REST method in the Netflix API. I didn’t know gumption was still alive.
I would be remiss if I didn’t look at the top 10 movies rented in my ZIP code, and seeing as the maps are limited to 12 metropolitan areas, I’m lucky I can see mine:
- The Curious Case of Benjamin Button
- Slumdog Millionaire
- Gran Torino
- Burn After Reading
- Changeling
- Doubt
- Milk
- Body of Lies
- Seven Pounds
- The Wrestler
Like everywhere else, The Curious Case of Benjamin Button was the most popular. I caught most of it on a 14-hour plane trip last year. It wasn’t that good.
Update on 1/21/10: The Society for News Design posted an article in which The New York Times’s own Kevin Quealy discusses the process by which the Netflix interactive graphic was created. As I suspected, they did not use the Netflix API but spoke with Netflix directly. They also used MapShaper, Matthew Bloch’s mapping framework, to map polygons onto ZIP code areas.
It’s discouraging to learn that replicating something this cool would mean learning GIS and ArcView, but at least I know the tools are there if I want to take the plunge. Realistically, though, there’s no reason a Flash designer or ActionScript developer has to be a geography guru, too.
