Presence of faces in House of Cards TV Series by episode

I was wondering whether the presence of faces in video content is an indicator of anything and, if so, of what. So I decided to scan episodes of a popular TV series, count the faces in the video frames second by second, and compare the charts for the individual episodes. Here is the result of this research.

I decided to analyze House of Cards, partly because it’s a great series, but also because it’s character-focused, so there are many scenes with a lot of people. I built an interactive viewer that shows which faces were recognized at a particular point in time in Episode 3, which contains a variety of scenes with many people in them.

Launch the viewer, or continue reading for a short description of the technology.

Launch the House of Cards Face Recognition Interactive Viewer

Technology

To pull this off, I used the OpenCV computer vision library, which is quite capable at recognizing faces. As the computer watches TV, this tool scans every frame for faces and, if it finds any, reports the relevant rectangles so they can be drawn or extracted and saved.
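
A minimal sketch of what such a frame-by-frame scan could look like in Python. It assumes the `opencv-python` package and its bundled frontal-face Haar cascade; the post does not say which detector or parameters were actually used, so the function names and the `scale_factor` and `min_neighbors` values here are illustrative only.

```python
def box_corners(box):
    """Convert an (x, y, w, h) rectangle into its top-left and bottom-right corners."""
    x, y, w, h = box
    return (x, y), (x + w, y + h)


def detect_faces_per_frame(video_path, scale_factor=1.1, min_neighbors=5):
    """Yield (frame_index, [(x, y, w, h), ...]) for every frame of the video.

    Assumption: a Haar cascade detector; the original tool may differ.
    """
    import cv2  # imported here so the helper above works without OpenCV

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    capture = cv2.VideoCapture(video_path)
    frame_index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of file or read error
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            boxes = cascade.detectMultiScale(
                gray, scaleFactor=scale_factor, minNeighbors=min_neighbors
            )
            yield frame_index, [tuple(int(v) for v in box) for box in boxes]
            frame_index += 1
    finally:
        capture.release()
```

Each reported rectangle can then be drawn with the corners from `box_corners`, or used to crop the face out of the frame.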

Here’s a screenshot of a scene in a church. It’s immediately apparent that the tool does not do a very good job, as many faces remain unrecognized. Still, many are recognized.

Recognized faces in the church scene, Episode 3

In the frame below, more faces are recognized.

hoc-0201

There are also many false positives. The computer sometimes thinks that something is a face when it most certainly is not, as in the picture below. If one looks carefully, one can sometimes see something face-like in these rectangles.

hoc-0018

To construct the viewer, I extracted individual faces from frames so I could display them on the page. They are of various sizes and look like this:

[Image: extracted face thumbnails]

To construct the charts, I just counted the faces in each second, then displayed the time series for each episode.
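
The counting step could be sketched like this in Python. The post does not say how the frames within one second were combined, so taking the largest per-frame count in each second is an assumption, and `faces_per_second` is a hypothetical name.

```python
def faces_per_second(frame_counts, fps):
    """Collapse per-frame face counts into one value per second.

    frame_counts: iterable of (frame_index, number_of_faces_in_that_frame).
    Assumption: a second is represented by its busiest frame.
    """
    seconds = {}
    for frame_index, count in frame_counts:
        second = int(frame_index // fps)
        seconds[second] = max(seconds.get(second, 0), count)
    if not seconds:
        return []
    # Seconds with no scanned frames are reported as zero faces.
    return [seconds.get(s, 0) for s in range(max(seconds) + 1)]
```

The resulting list is the time series for one episode, with one entry per second of video.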

Results

This is the final chart. It’s a series of timelines that show how many faces were recognized per second. Why are some lines orange, and some yellow?

As the scanning of video frames progressed, some faces were recognized in only one frame in an entire second (there are 23 frames in each). Other faces were recognized in more frames, and others in yet more. I thought this would be a good indicator of face detection reliability, but that’s not so. If it tells anything, it’s how steady the camera was in that section.
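
One way to measure in how many frames of a second a face shows up is sketched below in Python. Treating detections whose centers fall into the same coarse grid cell as the same face is purely an assumption made for this example, as are the `cell` size and the function name; the post does not describe how faces were matched between frames.

```python
from collections import defaultdict


def frames_seen_per_face(detections, fps, cell=80):
    """Count, per second, in how many frames each face location was detected.

    detections: iterable of (frame_index, [(x, y, w, h), ...]).
    Returns {second: {grid_cell: number_of_frames}}.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for frame_index, boxes in detections:
        second = int(frame_index // fps)
        for x, y, w, h in boxes:
            # Assumption: the grid cell of the box center identifies the face.
            key = ((x + w // 2) // cell, (y + h // 2) // cell)
            seen[second][key].add(frame_index)
    return {
        second: {key: len(frames) for key, frames in faces.items()}
        for second, faces in seen.items()
    }
```

A face seen in one frame and a face seen in most frames of the same second could then be drawn in different colors. A moving camera shifts faces between cells, which fits the observation that the measure says more about camera steadiness than about detection reliability.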

House of Cards face recognition charts by episode

My inspiration was small multiples, a visualization technique which allows for easier comparison of several datasets from the same domain. Wikipedia says:

A small multiple (sometimes called trellis chart, lattice chart, grid chart, or panel chart) is a series or grid of small similar graphics or charts, allowing them to be easily compared. The term was popularized by Edward Tufte.

According to Tufte (Envisioning Information, p. 67):

At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution.

 

As always, if anyone is interested in the code, email me. My address is on the About page.

Slovenian business activity by city as animated heatmaps

A few months ago, while researching the business hours of various categories of establishments in Slovenia, I thought it would be nice to visualize the density of open establishments on a map. I decided on a heatmap style, although I later discovered that my chosen implementation had some drawbacks.

Getting the data

Data on the business hours of commercial establishments is traditionally not open, for many reasons. Two of them are that (1) this information can be commercially exploited, and (2) opening hours can change frequently, so keeping the database current and reliable takes considerable effort from its owner.

First I toyed with the idea of crawling the entire directory of odpiralnicasi.com. Then I actually thought about making a version for London, Amsterdam or San Francisco with Yelp data, for which I would have to crawl an entire Yelp city directory, a task I’m not sure would succeed. Yelp would probably block my IP before I could harvest a significant portion of what interested me.

So I decided I would use the Najdi.si maps business directory. Disclosure: I work there, so I have access to the database with various business data, which is being kept current.

For every company, I took only the name, geo coordinates, business hours and business category, then I constructed the animated maps. Before I delve into that, here is a short video of economic activity in Slovenia over the course of a typical Monday.

Economic activity in Slovenia from Marko O’Hara on Vimeo.

The animated chart at the bottom shows the number of active establishments in various economic categories, such as Restaurants and catering, Industry, Shopping, etc. The full list is:

  • blue: Computers and IT,
  • red: Restaurants and catering,
  • green: Home and garden,
  • yellow: Beauty and health,
  • pink: General business,
  • orange: Free time,
  • violet: Industry,
  • magenta: Culture and schooling

Rendering the maps and constructing the visualization

Rendering one frame for one city at a specific time is just a matter of setting the appropriate latitude, longitude and zoom level on the map, selecting the desired time and plotting all establishments that are open at that time. I used Processing to do that, and for the heat map part I used this excellent example by Philipp Seifried. As a finishing touch, I made the maps switch between day and night styles at appropriate times.
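
Selecting the establishments to plot for one frame could be sketched as follows. This is Python rather than the Processing code used for the project, and the record layout (`name`, `opens`, `closes`) as well as the rule that identical opening and closing times mean "open around the clock" are assumptions made for the example.

```python
from datetime import time


def is_open(opens, closes, at):
    """Return True if a business with the given hours is open at time `at`."""
    if opens == closes:
        return True  # assumption: identical times mean open around the clock
    if opens < closes:
        return opens <= at < closes
    # Hours that run past midnight, e.g. 18:00 to 02:00.
    return at >= opens or at < closes


def open_establishments(establishments, at):
    """Keep only the establishments that are open at time `at`."""
    return [e for e in establishments if is_open(e["opens"], e["closes"], at)]
```

Calling `open_establishments` once per rendered time step gives the set of points that feed the heat map for that frame.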

To render an entire video, I had to write a parallel rendering queue, lest the rendering of a single video take an eternity. The Eclipse project is available by email request.
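
The idea of a parallel rendering queue can be illustrated with a small Python sketch. It is not the Eclipse project mentioned above: `render_frame` is a hypothetical placeholder that only returns a file name, and a thread pool is used to keep the example self-contained, whereas CPU-heavy rendering would more likely need separate processes.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(minute_of_day):
    """Hypothetical placeholder for the real renderer; returns the frame's file name."""
    return "frame_%04d.png" % minute_of_day


def render_all(minutes, workers=4, render=render_frame):
    """Render the frames for the given times in parallel, keeping them in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands the work to the pool but yields results in input order,
        # so the frames can be assembled into a video directly.
        return list(pool.map(render, minutes))
```

For example, `render_all(range(0, 1440))` would produce one frame per minute of the day.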

To complicate things a bit I decided to include up to four different places on the same map, so the viewer could compare opening hours in Ljubljana in different economic categories, or see how different cities woke up and went to sleep at different times.

A typical frame looks like this:

Video frame / comparison of business activity in Nova Gorica, Koper, Celje and Novo Mesto at noon

Here’s an example for different economic activities in Ljubljana:

Economic activity in Ljubljana – four categories from Marko O’Hara on Vimeo.

  • top left: General business
  • top right: Restaurants and catering
  • bottom left: Industry
  • bottom right: Beauty and health

Here’s a comparison between Ljubljana and the city of Maribor:

opentimes ljmb.mp4 from Marko O’Hara on Vimeo.

  • left: Ljubljana
  • right: Maribor

And here is a comparison of business activity in Nova Gorica, Koper, Celje and Novo Mesto:

opentimes kpnmceng.mp4 from Marko O’Hara on Vimeo.

 

  • top left: Nova Gorica
  • top right: Koper
  • bottom left: Novo Mesto
  • bottom right: Celje

 

Commentary

I mostly did this to be able to visually compare levels of business activity in Ljubljana. First of all, the heatmap technique I employed here turned out to be somewhat unreliable for video purposes, because it colors the dots relative to the highest concentration in the frame. But the concentration and absolute numbers of active businesses change from frame to frame, so it can seem that there’s more activity at night than during the day.
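
The effect can be shown with a small Python sketch that uses made-up density numbers. It compares scaling each frame by its own peak, which is what the problem described above amounts to, with scaling all frames by one peak taken over the whole video; the function names are hypothetical.

```python
def normalize_per_frame(frames):
    """Scale every frame by its own maximum, so each frame peaks at 1.0."""
    return [
        [value / max(frame) if max(frame) else 0.0 for value in frame]
        for frame in frames
    ]


def normalize_globally(frames):
    """Scale every frame by the maximum over all frames of the video."""
    peak = max(max(frame) for frame in frames)
    return [[value / peak if peak else 0.0 for value in frame] for frame in frames]


# Made-up densities for two map cells: a busy daytime frame and a quiet night frame.
day, night = [100, 50], [4, 1]
per_frame = normalize_per_frame([day, night])
overall = normalize_globally([day, night])
```

With per-frame scaling the busiest cell is equally "hot" at night and during the day, even though far fewer businesses are open. With one scale for the whole video the night frame stays visibly cooler.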

Even so, it’s clear that restaurants, bars and clubs are still pretty much open when other activity starts to die down.

This is Ljubljana at noon, again:

  • top left: General business
  • top right: Restaurants and catering
  • bottom left: Industry
  • bottom right: Beauty and health

The big spot in the northeast is the mall region, where an untold number of businesses operate in ten or more big malls. Business concentration there dwarfs everything else in the city, except maybe in the industrial category.

lj at 11 h

Below is Ljubljana at eight o’clock in the evening. Pretty much everything has closed down except for eating and drinking, and maybe the cinema in the mall.

lj20h

Below: Ljubljana at ten o’clock in the evening. Some businesses don’t close down at all. I double-checked the primary data source and it’s true: there are cleaning services that stay open during the night, etc.

lj22h

I’m relatively satisfied with the results, except for the heatmap issue. I may correct that if I get the data for a bigger city.