
Grouping countries according to flag similarity


This topic is apparently interesting enough to warrant its own discussion on Quora. People there rely on the keen observational powers of the human mind, but for this article I tried to group the flags algorithmically.

I plotted the results on the map below. Countries with the same color have similar flags. The brighter the color, the bigger the group of countries with similar flags.

Launch the interactive viewer to explore the groups yourself.

Countries by flag similarity

Here are some flag groups. To see them all, click the image above.

[Images: flags_7, flags_80, flags_124, flags_27]

How I grouped the flags

I used a machine learning algorithm called k-means clustering. It’s really a rudimentary exercise, but the results are good enough to publish on this wee blog.

The algorithm accepts the items to be grouped as vectors, so I had to vectorize the images first, that is, convert each one into a long list of numbers. Each image was partitioned into a grid of 24 x 24 cells, which I found to be enough for simple flags, and the average color value of each cell was computed. These color values were converted into the HSB color space, experimentally weighted, and copied into a vector. The vectors were then fed into the k-means algorithm with the requested number of clusters set to 120 (there are 240 different flags). You can see the results in the viewer.
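For readers who want to try this, here is a minimal sketch of the two steps in plain Python. It is not the code behind the viewer. The function names, the nested-list image format, the default weights of 1.0, the fixed iteration count, and the random initialization are all my assumptions for illustration; the weights actually used in the post were tuned by hand and are not given here.

```python
import colorsys
import random


def flag_to_vector(pixels, grid=24, weights=(1.0, 1.0, 1.0)):
    """Turn an image into a flat list of weighted HSB values, one triple per grid cell.

    pixels is a list of rows, each row a list of (r, g, b) tuples with values 0-255.
    The weights are placeholders, not the values used in the post.
    """
    height, width = len(pixels), len(pixels[0])
    vector = []
    for gy in range(grid):
        # Pixel rows covered by this grid row (always at least one).
        y0 = gy * height // grid
        y1 = max((gy + 1) * height // grid, y0 + 1)
        for gx in range(grid):
            x0 = gx * width // grid
            x1 = max((gx + 1) * width // grid, x0 + 1)
            cell = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(cell)
            # Average the cell in RGB, scaled to 0..1, then convert to HSB (HSV).
            r = sum(p[0] for p in cell) / n / 255.0
            g = sum(p[1] for p in cell) / n / 255.0
            b = sum(p[2] for p in cell) / n / 255.0
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            vector.extend((h * weights[0], s * weights[1], v * weights[2]))
    return vector


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Basic k-means: returns one cluster label per input vector."""
    rng = random.Random(seed)
    # Start from k randomly chosen input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        for i, v in enumerate(vectors):
            labels[i] = min(range(k), key=lambda c: squared_distance(v, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

With every flag loaded into that pixel format, something like `kmeans([flag_to_vector(f) for f in flags], k=120)` would give one label per flag, and flags sharing a label form a group. Note that this sketch treats hue as an ordinary number, so it ignores the fact that hue wraps around.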

The number of clusters was set experimentally, and the clustering is not perfect. For example, the Canadian flag is grouped with some very unlikely lookalikes.
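I settled on the number by looking at the results, but if you want a number to compare between runs, one common aid is the within-cluster sum of squares: the lower it is for a given number of clusters, the tighter the groups. The sketch below is an assumption about how you might measure that, not something I used for the map, and the function name is made up.

```python
def within_cluster_ss(vectors, labels):
    """Sum of squared distances from each vector to the mean of its cluster."""
    clusters = {}
    for v, lab in zip(vectors, labels):
        clusters.setdefault(lab, []).append(v)
    total = 0.0
    for members in clusters.values():
        # Mean of the cluster, computed dimension by dimension.
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum((a - b) ** 2 for v in members for a, b in zip(v, centroid))
    return total
```

The value always shrinks as the number of clusters grows, so it only helps when comparing a few candidate values side by side, and it is no substitute for eyeballing the groups.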

See also my other post on k-means clustering: K-means clustering with Processing.js.

 
