Books I read in 2015

This is a result of

find *.mobi  -newermt "2015-01-01 00:00"

command in my Kindle Documents folder. If anything, it shows I’m a compulsive reader, preferring Nordic crime novels, cyberpunk sci-fi and historical and natural sciences non-fiction. I probably read ten more programming-oriented books in PDFs, which I’m not including here.
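For readers without a Unix shell, roughly the same listing can be produced in Python. This is a minimal sketch, not the command I actually ran; the function name `files_newer_than` is made up for illustration, and it assumes file modification times are a fair proxy for when a book was added.

```python
from datetime import datetime
from pathlib import Path


def files_newer_than(folder, pattern, cutoff):
    """Return names of files in `folder` matching `pattern` modified after `cutoff`."""
    threshold = cutoff.timestamp()
    return sorted(
        path.name
        for path in Path(folder).glob(pattern)
        if path.is_file() and path.stat().st_mtime > threshold
    )


if __name__ == "__main__":
    # Same intent as: find *.mobi -newermt "2015-01-01 00:00"
    for name in files_newer_than(".", "*.mobi", datetime(2015, 1, 1)):
        print(name)
```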

My favorites were novels by Don Winslow and Richard K. Morgan. I also read most of the essays by Miles Mathis, about which I’m still forming an opinion.

127 books in all, in no particular order:

Acid Dreams – Martin A. Lee
A Killing Winter – Tom Callaghan
Altered Carbon – Richard K. Morgan
A Nasty Piece of Work_ A Novel – Robert Littell
Angelica’s Smile – Andrea Camilleri
Anne Holt – What is Mine (Punishment)
A Noble Killing – Barbara Nadel
Artificial Intelligence for Humans, Volume 2- Heaton, Jeff
Artificial Intelligence for Humans, Volume 3 – Jeff Heaton
Aurora – Kim Stanley Robinson
Autumn Killing_ A Thriller – Mons Kallentoft
Axis – Robert Charles Wilson
Beyond Words_ What Animals Think and Feel – Carl Safina
Black Man – Richard K. Morgan
Blackwater – Kerstin Ekman
Blessed Are Those Who Thirst – Anne Holt
Blind Goddess_ A Hanne Wilhelmsen Novel – Anne Holt
Blood on Snow_ A Novel – Jo Nesbo
Broken Angels – Richard K. Morgan
Burned – Thomas Enger
Cibola Burn_ Book Four of the E – James S. A. Corey
Closed for Winter – Jorn Lier Horst
Cold Hearts – Gunnar Staalesen
Complicity – Iain Banks
Consider Phlebas – Iain M. Banks
Dark Secrets – Michael Hjorth
Deadline – Barbara Nadel
Dead of Night – Barbara Nadel
Death by Design – Barbara Nadel
Death of the Demon_ A Hanne Wilhelmsen N – Anne Holt
Destination Void – Frank Herbert
Dope, Inc. Britain Opium War against the U – LaRouche
Drug War Capitalism – Dawn Paley
Epiphany of the Long Sun – Gene Wolfe
Excession – Iain M. Banks
Feersum Endjinn – Iain M. Banks
Inspector of the Dead – David Morrell
Inversions – Iain M. Banks
Killing Pablo – Mark Bowden
Legends – Robert Littell
Litany of the Long Sun – Gene Wolfe
Look to Windward – Iain M. Banks
Magicians of the Gods_ The Forg – Graham Hancock
Mao – Jung Chang
Marid Audran 01 – When Gravity Fails – George Alec Effinger
Marid Audran 02 – A Fire in the Sun – George Alec Effinger
Marid Audran 03 – The Exile Kiss – George Alec Effinger
Market Forces – Richard K. Morgan
Matter – Iain M. Banks
Midnight Sun_ Blood on Snow 2 – Jo Nesbo
Mirrors – Eduardo Galeano
Mother Russia – Robert Littell
Murder as a Fine Art – David Morrell
Nemesis Games – James S. A. Corey
Pierced – Thomas Enger
Poseidon’s Wake – Alastair Reynolds
Pretty Dead Things – Barbara Nadel
Satori – Don Winslow
Savage Continent – Lowe, Keith
Scarred_ A Novel – Thomas Enger
Seveneves – Neal Stephenson
Shadow and Claw – Gene Wolfe
Shibumi – Trevanian
Silenced – Kristina Ohlsson
Smilla’s Sense of Snow – Peter Hoeg
Snowblind – Jonasson, Ragnar
Spin – Wilson, Robert Charles
Spring Tide – Borjlind, Cilla
Summer Death_ A Thriller – Mons Kallentoft
Superintelligence – Nick Bostrom
Sweet Reason – Robert Littell
Sword and Citadel – Gene Wolfe
The Abrupt Physics of Dying – Paul E. Hardisty
The Algebraist – Iain M. Banks
The Cartel_ A Novel – Don Winslow
The Caveman – Jorn Lier Horst
The Cold Commands – Richard K. Morgan
The Company – Robert Littell
The Consorts of Death – Gunnar Staalesen
The Dark Defiles (Land Fit for Heroes #3) – Richard K. Morgan
The Dosadi Experiment – Frank Herbert
The Eiger Sanction – Trevanian
The Ethnic Cleansing of Palestine – Ilan Pappe
The Fifth Profession – David Morrell
The Fire Witness_ A Novel – Lars Kepler
The Girl in the Spider’s Web (Millennium s – David Lagercrantz
The Gulag Archipelago, Volume 1_ An Expe – Solzhenitsyn, Aleksandr
The Gulag Archipelago, Volume 2_ An Expe – Solzhenitsyn, Aleksandr
The Hanging Girl – Jussi Adler-Olsen
The Healer – Antti Tuomainen
The House of the Scorpion – Farmer, Nancy
The Hydrogen Sonata – Iain M. Banks
The Intruder – Hakan Ostlundh
The Jesus Incident – Frank Herbert
The Lady from Zagreb – Philip Kerr
The Life-Changing Magic of Tidying Up_ The – Kondo, Marie
The Loo Sanction – Trevanian
The Lord of Opium – Farmer, Nancy
The Master Algorithm – Pedro Domingos
The New York Trilogy – Paul Auster
The Nightmare_ A Novel – Lars Kepler
The Once and Future Spy – Robert Littell
The Patience of the Spider – Andrea Camilleri
The Player of Games – Iain M. Banks
The Power of the Dog – Don Winslow
The Real Odessa – Goni, Uki
The Shape of Water – Andrea Camilleri
The Snack Thief – Andrea Camilleri
The State Of The Art – Iain M. Banks
The Steel Remains – Richard K. Morgan
The Summer of Katya – Trevanian
The Urth of the New Sun – Gene Wolfe
The Viper – Hakan Ostlundh
The White Plague – Frank Herbert
The Winter of Frankie Machine – Don Winslow
The Writing on the Wall – Gunnar Staalesen
Third Voice – Borjlind, Cilla
Thirteen – Richard K. Morgan
Vicious Circle – Robert Littell
Visiting Professor_ Novel of Chao – Robert Littell
Vortex – Robert Charles Wilson
We Shall Inherit the Wind – Gunnar Staalesen
‘What Do You Care What Other People Think_ – Richard P Feynman
Whipping Star – Frank Herbert
Woken Furies – Richard K. Morgan
Getting MEAN – Simon Holmes
Yours Until Death – Gunnar Staalesen

 

Analysis of traffic violations in Slovenia between beginning of 2012 and end of 2014

This is my first attempt to use open data for data visualization in web presentation and for a mobile app. The idea was to cross-pollinate promotion, but it didn’t go so well – more on this later.

The analysis is published at a separate URL because it relies heavily on JavaScript, which complicates things in WordPress. Click the link above or the big parking ticket image to read it.

Parking ticket

According to data provided by the state police, the highway authority and local traffic wardens, a little less than a million traffic violations were recorded between the start of 2012 and September 2014. Given that there are 1,300,000 registered vehicles and 1,400,000 active driving licenses in the country, this is a lot. The large majority of them are parking and toll tickets.

In the main article, there are a lot of images and charts. For example, I analyzed data for major towns in Slovenia to get the streets with the highest number of issued traffic tickets. Here’s an example for Ljubljana:

Streets with parking tickets in Ljubljana – click to read article

I had temporal data for each issued ticket, so I could also show on which streets you are more likely to be ticketed in the morning, at midday or in the evening. In the image below, morning is blue, midday is yellow, and evening is red.

Tickets issued by hour – click for main article

This is, however, only the beginning. Here are questions I tried to answer:

  • Are traffic wardens and traffic police just another type of tax collector for the state and counties?
  • Do traffic wardens really issue more tickets now than in the past, or is that just my perception?
  • Which zones in bigger towns are especially risky, should you forget to pay for parking?
  • Are traffic wardens more active in specific time intervals?
  • Do the police set speed traps in the locations with the most traffic accidents? What about DUI checks?
  • How does temperature influence the number of issued traffic tickets?
  • Does the moon influence the number of issued traffic tickets? If so, which types?
  • Where and when are drivers most at risk of encountering other drunk drivers?
  • Where does the highway authority check for toll, and when to hit the road if one does not want to pay it?
  • How can we drive safer using open data?

Be sure to read the main article to see all the visualizations and interactive maps. There are also videos, for example this one, showing how the ticketing territory expanded through time in Ljubljana:

Parkirne kazni v Ljubljani 2012 – 2014 from Marko O’Hara on Vimeo.

Some other highlights:

The big finding was a sharp increase in the number of parking tickets issued in Ljubljana by the end of 2013, which coincides with the publication of the debt that the county had run into:

Increase of parking tickets issued in Ljubljana

There’s an interactive map showing the quadrants with the most DUI tickets and their distribution by day of the week and month of the year:

DUI distribution

Mobile app for Android

Mobile app for android - start screen
Mobile app for android – map

I also wrote an Android mobile app (get it on Google Play if you are interested) that locates the user and shows locations of violations of selected type on the map, as well as a threat assessment, should she want to break the law. Here’s the description on Google Play:

The app helps the user find out where and when were traffic tickets issued in Slovenia, thus facilitating safer driving. 
Ticket database is limited to territory of Republic of Slovenia.

Choose between these issued citations to show in app:
– parking
– speeding
– driving while using a cellphone
– ignoring safety belt laws
– unpaid toll
– DUI
and traffic accidents.

The app will locate you, fetch data about traffic citations issued in your vicinity, and show them on map. To see citations, that were issued somewhere else, click on map. Additionally available is summary of threat level, derived from statistical data, collected by government agencies.

Locating the user and showing dots on map wasn’t really a challenge, but I wanted to show a realistic threat assessment, based on location and time. To do that, I wrote an API method that calculates the number of tickets issued on the same day of week in the same hour interval and then draws a simple gauge.
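The counting step behind the gauge can be sketched as follows. This is only an illustration under stated assumptions: the function names, the three-hour slot parameter and the way the count is scaled against the busiest slot are hypothetical, since the real API method is not shown here.

```python
from datetime import datetime


def tickets_in_slot(ticket_times, when, slot_hours=3):
    """Count tickets issued on the same weekday and in the same hour slot as `when`."""
    slot = when.hour // slot_hours
    return sum(
        1
        for t in ticket_times
        if t.weekday() == when.weekday() and t.hour // slot_hours == slot
    )


def threat_level(count, busiest_slot_count):
    """Scale a slot count to 0..1 against the busiest slot (an assumed normalization)."""
    if busiest_slot_count <= 0:
        return 0.0
    return min(1.0, count / busiest_slot_count)


# Hypothetical sample data: three Monday tickets around noon, one on a Tuesday.
tickets = [
    datetime(2014, 3, 3, 12, 15),
    datetime(2014, 3, 10, 13, 40),
    datetime(2014, 3, 17, 14, 5),
    datetime(2014, 3, 4, 12, 30),
]
now = datetime(2014, 3, 24, 12, 0)  # a Monday at noon
count = tickets_in_slot(tickets, now)
print(count, threat_level(count, busiest_slot_count=6))
```

With `slot_hours=3`, the hours 12, 13 and 14 fall into the same slot, which matches a noon to 3 PM window.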

Let’s say, for example, that you find yourself in the center of Ljubljana on Monday at noon, don’t have the money for parking fee, and you really only want to take a box to a friend who lives there. You’ll be gone for ten minutes only, so should you risk not paying the parking fee?

The app finds out the total number of tickets issued on Mondays in the three-hour period between noon and 3 PM, then graphically shows the threat level along with some distributions, something like this:

Threat assessment

It works pretty well, and I use it sometimes, although I admit that its use cases may be marginal for the majority of the population. It does get ten new installs a day, although I don’t know how long this trend will continue.

I did send out press releases and mounted a moderate campaign on Twitter (here’s the app’s account), but it amounted to precious little. Maybe the timing was bad: I launched it during the Christmas holidays, when Internet usage is low. Or maybe this type of app just isn’t that interesting.

I’m currently working on an analysis of parking tickets for New York City; maybe that will be more interesting. There were, after all, more than nine million tickets issued there, and the data is much richer.

Stay tuned!

Voting and attendance in Slovenian Parliament from 2004 to current term

In Slovenia, we have a love/hate relationship with our politicians. We hate them, because at almost every single step they take, they let us know they are corrupt and can easily get away with it. But in each new election new faces appear, promptly get elected and are hailed as saviors who will finally clean the Augean stables of greed and corruption that have been accumulating for too long.

Most emotions are reserved for those in the front row, mainly government members. Members of parliament are somehow exempted, as they are not so widely known, and they are not monitored properly, at least in my book. There is a site that contains session records per member and per session, but it’s not widely known either. It was the inspiration for this attempt to present members’ activity in an easily understandable, graphic way for the current term and a few past terms.

See the interactive version: Slo / Eng

Interest groups

The main idea was to group the members of parliament by the similarity of their voting records. Most members are bound by strict voting discipline, imposed by the parties they belong to. This way the parties can guarantee that a given act will pass and become law. But is this really so? I tried to use a simple machine learning technique to answer that question. First I collected all the voting results from a parliamentary term and sorted them in chronological order, then applied the technique (k-means clustering, for the technically minded). The number of groups was set to ten, but I could increase it to see smaller groups, maybe factions inside parties or cross-party interest groups.
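For the technically minded, here is a minimal sketch of k-means on voting records. It is not the code used for the charts. It assumes each member is encoded as a vector with 1 for yea, -1 for nay and 0 for absent or abstained (the encoding is my assumption), and it uses two groups instead of ten to keep the sample small. The member names are placeholders.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(vectors, k, iterations=50, seed=0):
    """Plain k-means: returns one cluster label per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each member to the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = mean_vector(members)
    return labels


# Hypothetical voting records: one row per member, one column per vote.
votes = {
    "member_a": [1, 1, -1, 1, 1],
    "member_b": [1, 1, -1, 1, 0],
    "member_c": [-1, -1, 1, -1, -1],
    "member_d": [-1, 0, 1, -1, -1],
}
names = list(votes)
labels = kmeans([votes[n] for n in names], k=2)
for name, label in zip(names, labels):
    print(name, "-> group", label)
```

Changing `k` is the knob mentioned above: larger values split the chamber into smaller groups.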

Below you can see an example of two groups from the recent term.

Here is the first:

And here is another:

It’s apparent that groups do not contain representatives from one party only, and the visual representation imparts a feel for the differences in voting. As I mentioned above, I arbitrarily constructed ten groups, but a serious researcher would play and tinker with the number, as every clustering technique is an exploratory process and must be iterated upon for best results. It’s interesting that the results also show other parliamentary tactics. This one below could be interpreted as obstruction, or simply passivity or indifference. So what is it? To ask this question is to answer it, I guess.

To put it in context, this is a group of left-wing opposition representatives during a period when they were in a heavy minority.

Indifference or obstruction?

In contrast, this is the right-wing voting machine that prevailed:

A disciplined voting machine

The contrast between these two groups is so dramatic that it would be funny, if these were funny affairs. While the opposition was idling away, the majority voted into existence law after law that, together, still influence the lives of the Slovenian citizenry. In the interactive version (English) you can explore what the votes were about by simply moving the mouse over the horizontal stripes.

Attendance record

Session attendance is another telling indicator of a particular representative’s zeal in upholding democracy and serving the interests of his constituency. It’s already apparent from the charts above, but I still constructed a separate graphic for it. It’s sorted by presence and more easily readable.

It has to be noted that some representatives were excused from voting sessions for various periods of time. Among them are those who became ministers and those who replaced them in the parliamentary seat, not having been there before.

Here’s an example from the recent term. At the bottom, you can see two blocks with alternating presence. That’s because there were two governments. When the first one fell, the ministers returned to their seats; those who had originally replaced them returned to the party’s roster; new ministers were sworn in and left their seats; and new replacements came from the opposite camp.

attendanceEN

Yes-men and rebels

Another interesting statistic is which representatives cast the most yea or nay votes. I don’t really know how to interpret this, but I computed it nevertheless. One could say that in terms with only one government, the members of the ruling majority with the most yea votes are those who unquestioningly toe the party line. Conversely, those with the most nay votes are the most fervent members of the opposition. In terms with two governments, this is a little less clear-cut: one would have to separate the timelines and run the statistics on subperiods for each government. I didn’t do this, but a serious researcher would. I made this report to let the representatives know that they are being monitored, but it’s the task of an investigative journalist to delve into the data and interpret it in a meaningful way. I don’t have time for this, and I don’t know the particulars of daily politics here well enough to be able to do that.

But I’m offering the database to anyone who would like to do that. Send me a mail for details, I’ll gladly oblige.

Here are a few simple pie charts that illustrate what I just wrote:

Yes men and rebels

Unity index

While programming, it struck me that I could calculate a synthetic measure that would show the unity in the parliament. The reasoning goes: if the vote was unanimous, the parliament as a whole was united in the cause at hand. But if half of the representatives voted yea and the other half nay, the parliament was divided. So I constructed a timeline of all voting sessions and colored every session according to this measure: blue for a unanimous vote, red for an evenly split vote, and violet hues for the nuances of disharmony in between.

Additionally, the bar heights indicate the presence ratio. Lower heights obviously mean lower presence.
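One way to turn this reasoning into numbers is sketched below. The exact formula and color blend used for the charts are not given here, so both are assumptions: unity is taken as the absolute yea/nay margin divided by the votes cast, and the color is a straight blend from red to blue. The 90 seats in the usage lines are just sample input.

```python
def unity(yea, nay):
    """1.0 for a unanimous vote, 0.0 for an even split (assumed formula)."""
    cast = yea + nay
    if cast == 0:
        return 0.0
    return abs(yea - nay) / cast


def unity_color(index):
    """Blend from red (index 0) through violet to blue (index 1) as an RGB tuple."""
    red = round(255 * (1 - index))
    blue = round(255 * index)
    return (red, 0, blue)


def presence_ratio(present, seats):
    """Bar height: share of seats whose holders took part in the vote."""
    return present / seats


print(unity(80, 0), unity_color(unity(80, 0)))    # unanimous vote
print(unity(40, 40), unity_color(unity(40, 40)))  # evenly split vote
print(presence_ratio(80, 90))
```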

In some terms, the presence falls toward the end, and the proportion of red bars increases. This means that the representatives lost heart and abandoned their posts, and those who stayed quarreled bitterly.

Here are these graphics for various terms. They are stretched to the same length. Perhaps a more correct but less visually appealing approach would be not to stretch them, so the length of each term would be apparent.

IV (2004 – 2008) – PM Janez Janša
V (2008 – 2011) – PM Borut Pahor – ended prematurely
VI (2011 – 2014) – PM Janez Janša, PM Alenka Bratušek – ended prematurely
VII (present) – probable PM Miro Cerar

Session timelines and voting networks

The drive behind this section was to find out whether attendance falls as a session progresses into the small hours. I found that not to be so, which is encouraging in a way. These charts at least show which sessions were bitterly contested, and which were almost unanimous. You can see examples of both behaviors in the graphic below.

sessions

Going one step further, I constructed a separate network for each session, in which a representative is connected to a proposition if he or she voted for it, and otherwise not.
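The construction rule can be sketched in a few lines. This is an illustration only: the record layout `(representative, proposition, vote)` and all names in the sample are hypothetical, and the real networks were laid out with a graph tool afterwards.

```python
from collections import defaultdict


def build_vote_network(votes):
    """Return (edges, supporters): rep-proposition edges and a per-proposition index."""
    edges = []
    supporters = defaultdict(set)
    for representative, proposition, vote in votes:
        if vote == "yea":  # only votes in favour create an edge
            edges.append((representative, proposition))
            supporters[proposition].add(representative)
    return edges, dict(supporters)


# Hypothetical session records: (representative, proposition, vote).
session = [
    ("rep_1", "proposition_A", "yea"),
    ("rep_2", "proposition_A", "yea"),
    ("rep_3", "proposition_A", "nay"),
    ("rep_3", "proposition_B", "yea"),
]
edges, supporters = build_vote_network(session)
print(edges)
print(supporters)
```

Representatives who share many propositions end up close together once a force-directed layout is applied, which is what makes the groups visible.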

Networks are a little bit messy, and people tend not to understand them well. The network below shows three groups of representatives (you can zoom in and out in the interactive version). They are grouped close to the propositions they voted for. So this is another opportunity to find interest groups on the micro level, for each proposition. Some propositions don’t have a name, just a date. That’s not my fault but the parliament’s, as they didn’t bother to publish the names on the web.

Seating order

Finally, here are some heatmaps for various variables, mapped onto the seating order. The first is partitioned according to the representatives’ party. Sorry, no legend here. You can mouse over in the interactive version to show details.

The second is attendance heatmap. Green is full attendance, red is total absence, and there’s a linear color scale between them. This one provides at-a-glance overview of attendance of entire party blocks.
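A linear scale of this kind can be sketched as below. It assumes attendance is expressed as a share between 0 and 1 and that the blend runs directly between pure red and pure green in RGB; the function name is made up for illustration.

```python
def attendance_color(attendance):
    """Map attendance in [0, 1] to RGB: 0 -> red, 1 -> green, linear in between."""
    share = min(1.0, max(0.0, attendance))  # clamp out-of-range input
    red = round(255 * (1 - share))
    green = round(255 * share)
    return (red, green, 0)


print(attendance_color(1.0))  # full attendance
print(attendance_color(0.0))  # total absence
print(attendance_color(0.8))
```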

The next two are yea and nay heatmaps, so you can see which party blocks mostly voted yea, and which nay. They are normalized to their local maxima for visual appeal, but a more correct approach would be not to normalize them, so it would be apparent that a nay vote is much less frequent than a yea. Why that is, I have no idea, but I imagine there must be a lot of technical votes, for example for establishing presence.

seatsEN

These seating orders are approximate, as I couldn’t get them for past terms from the parliament. They asserted that they didn’t have them, and claimed they don’t even have the current one, even though it’s published on their own website. There were more lies, but I won’t go into that here. They are, after all, in power, and I’m just a blogger.

Why they should engage in such behaviour is beyond me. Maybe they think that the information is theirs and should be kept from the public.

Again, if anyone needs the MongoDB database, drop me a note. My email address is on the About page.

Slovenian business activity by city as animated heatmaps

A few months ago, while researching the business hours of various categories of establishments in Slovenia, I thought it would be nice to visualize the density of open establishments on a map. I decided on a heatmap style, although I later discovered that my chosen implementation had some drawbacks.

Getting the data

Data on the business hours of commercial establishments is traditionally not open, for many reasons. Two of them are that (1) this information can be commercially exploited, and (2) opening hours can change frequently, so keeping the database current and reliable takes the owner considerable effort.

First I toyed with the idea of crawling the entire directory of odpiralnicasi.com. Then I thought about making a version for London, Amsterdam or San Francisco with Yelp data, for which I would have to crawl an entire Yelp city directory, a task I’m not sure would succeed. Yelp would probably block my IP before I could harvest a significant portion of what interested me.

So I decided I would use the Najdi.si maps business directory. Disclosure: I work there, so I have access to the database with various business data, which is being kept current.

For every company, I took out only the name, geo coordinates, business hours and business category, then I constructed the animated maps. Before I delve into that, here is a short video of economic activity in Slovenia over the course of a typical Monday.

Economic activity in Slovenia from Marko O’Hara on Vimeo.

The animated chart at the bottom shows the number of active establishments in various economic categories, such as Restaurants and catering, Industry, Shopping, etc. The full list is:

  • blue: Computers and IT,
  • red: Restaurants and catering,
  • green: Home and garden,
  • yellow: Beauty and health,
  • pink: General business,
  • orange: Free time,
  • violet: Industry,
  • magenta: Culture and schooling

Rendering the maps and constructing the visualization

Rendering one frame in one city at a specific time is just a matter of setting the appropriate latitude, longitude and zoom level on the map, selecting the desired time and plotting on the map all establishments that are open at that time. I used Processing to do that, and for the heat map part I used this excellent example by Philipp Seifried. As a finishing touch, I made the maps switch between day and night styles at appropriate times.
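The selection step for a single frame might look like the sketch below. It is written in Python rather than Processing, and the record layout (`lat`, `lon`, and `hours` keyed by weekday) is a hypothetical stand-in for the real database schema. Opening intervals that run past midnight are not handled.

```python
from datetime import time


def is_open(hours, weekday, moment):
    """True if `moment` falls inside any opening interval for `weekday` (0 = Monday)."""
    return any(start <= moment < end for start, end in hours.get(weekday, []))


def open_establishments(establishments, weekday, moment):
    """Return (lat, lon) points of all establishments open at the given time."""
    return [
        (e["lat"], e["lon"])
        for e in establishments
        if is_open(e["hours"], weekday, moment)
    ]


# Hypothetical records; the field names are illustrative, not the real schema.
establishments = [
    {"name": "Bakery", "lat": 46.05, "lon": 14.51,
     "hours": {0: [(time(6, 0), time(14, 0))]}},
    {"name": "Bar", "lat": 46.06, "lon": 14.50,
     "hours": {0: [(time(18, 0), time(23, 59))]}},
]
print(open_establishments(establishments, weekday=0, moment=time(12, 0)))
```

The returned points are what would be handed to the heatmap layer for that frame.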

To produce an entire video, I had to write a parallel rendering queue, lest the rendering of a single video take an eternity. The Eclipse project is available by email request.
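The idea of such a queue can be sketched with a worker pool. This is not the Eclipse project mentioned above, just a Python illustration: `render_frame` is a stand-in that only builds a file name, and the ten-minute step and four workers are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def render_frame(minute_of_day):
    """Stand-in for the real renderer: returns the frame's file name."""
    hours, minutes = divmod(minute_of_day, 60)
    return f"frame_{hours:02d}{minutes:02d}.png"


def render_video_frames(step_minutes=10, workers=4):
    """Render one frame per time step in parallel, returned in chronological order."""
    moments = range(0, 24 * 60, step_minutes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands frames to free workers but yields results in input order.
        return list(pool.map(render_frame, moments))


frames = render_video_frames()
print(len(frames), frames[0], frames[-1])
```

For CPU-heavy rendering in Python, `ProcessPoolExecutor` from the same module would likely be the better fit; threads are used here only to keep the sketch simple.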

To complicate things a bit I decided to include up to four different places on the same map, so the viewer could compare opening hours in Ljubljana in different economic categories, or see how different cities woke up and went to sleep at different times.

A typical frame looks like this:

Video frame / comparison of business activity in Nova Gorica, Koper, Celje and Novo Mesto at noon

Here’s an example for different economic activities in Ljubljana:

Economic activity in Ljubljana – four categories from Marko O’Hara on Vimeo.

  • top left: General business
  • top right: Restaurants and catering
  • bottom left: Industry
  • bottom right: Beauty and health

Here’s a comparison between Ljubljana and the city of Maribor:

opentimes ljmb.mp4 from Marko O’Hara on Vimeo.

  • left: Ljubljana
  • right: Maribor

And here a comparison of business activity in Nova Gorica, Koper, Celje and Novo Mesto:

opentimes kpnmceng.mp4 from Marko O’Hara on Vimeo.

 

  • top left: Nova Gorica
  • top right: Koper
  • bottom left: Novo Mesto
  • bottom right: Celje

 

Commentary

I mostly did this to be able to visually compare levels of business activity in Ljubljana. First of all, the heatmap technique I employed here turned out to be somewhat unreliable for video purposes, because it colors the dots relative to the highest concentration in each frame. But the concentration and the absolute numbers of active businesses change from frame to frame, so it can seem that there’s more activity at night than during the day.
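The effect is easy to reproduce with made-up numbers. The sketch below contrasts scaling each frame by its own maximum with scaling all frames by one maximum for the whole video; the density values are invented, and the second function is only one possible fix, not something I have implemented.

```python
def normalize_per_frame(frames):
    """Scale each frame by its own maximum (the behaviour described above)."""
    result = []
    for frame in frames:
        peak = max(frame) or 1  # avoid dividing by zero on an empty map
        result.append([value / peak for value in frame])
    return result


def normalize_globally(frames):
    """Scale every frame by the maximum over the whole video."""
    peak = max(max(frame) for frame in frames) or 1
    return [[value / peak for value in frame] for frame in frames]


# Hypothetical density grids (flattened): busy noon versus quiet night.
noon = [200, 50, 10]
night = [4, 1, 0]
print(normalize_per_frame([noon, night]))
print(normalize_globally([noon, night]))
```

With per-frame scaling the hottest spot at night is as bright as the hottest spot at noon; with global scaling the night frame stays visibly dimmer.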

Even so, it’s clear that restaurants, bars and clubs are still pretty much open when other activity starts to die down.

This is Ljubljana at noon, again:

  • top left: General business
  • top right: Restaurants and catering
  • bottom left: Industry
  • bottom right: Beauty and health

The big spot in the northeast is the mall region, where an untold number of businesses operate in ten or more big malls. The business concentration there dwarfs everything else in the city, except maybe in the industrial category.

lj at 11 h

Below is Ljubljana at eight o’clock in the evening. Pretty much everything has closed down except for eating and drinking, and maybe the cinema theater in the mall.

lj20h

Below: Ljubljana at ten o’clock in the evening. Some businesses don’t close down at all. I double checked the primary data source and it’s true. There are cleaning services that stay open during the night, etc.

lj22h

I’m relatively satisfied with the results, except for the heatmap issue. I may correct that if I get the data for a bigger city.

Food democracy: foodstuffs according to their democratic value

Update: we won the competition!

This is a contribution to the Memefest 2013 Food Democracy competition, made in collaboration with Miha Mazzini. Food democracy generally means more involved citizen participation in food production and the supply chain, but here we have a different take on the topic. In Miha’s words, taken from the project form:

Describe your idea and concept of your work in relation to the festival outlines:

Brain has developed as an organ to help us fill the stomach. A lonely stomach hunting and gathering in the savannah has less chance to survive than a group of them, so the societies have developed.

So, the content of the stomachs must mirror the structure of the society – what are the preferred foods for authoritarian regimes and what for democracies?

We took all of the recipes from Food.com and democracy indexes of The Economist and Wikipedia; we linked national cooking recipes with the countries, split recipes into ingredients and added democracy indexes to them.

What kind of communication approach do you use?
Spoof scientific report on real data.

What are in your opinion concrete benefits to the society because of your communication?

To see how very sweet life in democracy really is.

What did you personally learn from creating your submitted work?

Person should choose their restaurant even more carefully than the country.

Why is your work, GOOD communication WORK?

It’s fun getting some food for thought.

Launch the interactive visualization.

Food democracy network

More details below.

Main idea:
To rank foodstuffs according to their democratic value, if such exists.

Data sources:
Allrecipes.com and food.com were crawled, and structured information (ingredients, national provenance, …) was extracted from individual recipe pages (~150,000 of them).
Economist Intelligence Unit for democracy indices of individual countries.

Construction:
A network (graph in math parlance) was constructed such that each recipe’s country was associated with one of four main nodes, which represent four democracy groups: authoritarian (0-2.5 on EIU scale), poor (2.5 – 5 on EIU scale), good (5 – 7.5 on EIU scale) and democratic (7.5 – 10 on EIU scale). Then the recipe ingredients were connected to one of these groups. Finally, ForceAtlas2 algorithm was run on the network, producing the result you see in the visualization.
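The grouping step can be sketched as follows. The original crawler and graph construction were written in Java, so this Python version is only an illustration. The ranges above overlap at their edges, so the sketch assumes a score exactly on a boundary (2.5, 5, 7.5) belongs to the higher group. The country names, scores and recipes are made up.

```python
def democracy_group(eiu_score):
    """Bucket an EIU democracy index (0-10) into the four groups named above."""
    if eiu_score < 2.5:
        return "authoritarian"
    if eiu_score < 5:
        return "poor"
    if eiu_score < 7.5:
        return "good"
    return "democratic"


def ingredient_edges(recipes, country_scores):
    """Yield (ingredient, group) edges for every recipe with a known country score."""
    for recipe in recipes:
        score = country_scores.get(recipe["country"])
        if score is None:
            continue  # skip recipes whose country has no index
        group = democracy_group(score)
        for ingredient in recipe["ingredients"]:
            yield (ingredient, group)


# Hypothetical inputs; scores and recipes are made up for illustration.
country_scores = {"Country A": 8.1, "Country B": 3.2}
recipes = [
    {"country": "Country A", "ingredients": ["sugar", "butter"]},
    {"country": "Country B", "ingredients": ["rice"]},
]
print(list(ingredient_edges(recipes, country_scores)))
```

The resulting edge list is the kind of input a layout algorithm such as ForceAtlas2 would then arrange.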

Tools:
Java for crawling the net and original graph construction, Gephi for graph processing and original visualization, sigma.js for web presentation.

Authors:
Miha Mazzini (concept), Marko Plahuta (concept and programming / visualization / web presentation)

What we really found:
That the freshest, most unprocessed food is apparently very undemocratic, which is a side effect of poor countries usually not having a democratic form of government.