Category: Visualizations

Analysis of traffic violations in Slovenia between beginning of 2012 and end of 2014


This is my first attempt to use open data for data visualization in a web presentation and in a mobile app. The idea was for the two to promote each other, but it didn’t go so well – more on this later.

The analysis is published at a separate URL because it relies heavily on JavaScript, which complicates things in WordPress. Click the link above or the big image of a parking ticket to read it.

Parking ticket

According to data provided by the state police, the highway authority, and local traffic wardens, just under a million traffic violations were recorded between the start of 2012 and September 2014. Given that there are 1,300,000 registered vehicles and 1,400,000 active driving licenses in the country, this is a lot. The large majority of them are parking and toll tickets.

In the main article, there are a lot of images and charts. For example, I analyzed data for major towns in Slovenia to get the streets with the highest number of issued traffic tickets. Here’s an example for Ljubljana:

Streets with parking tickets in Ljubljana – click to read article

I had temporal data for each issued ticket, so I could also show on which streets you are more likely to be ticketed in the morning, midday or evening. On the image below, morning is blue, midday is yellow, and evening is red.

Tickets issued by hour – click for main article

This is, however, only the beginning. Here are questions I tried to answer:

  • Are traffic wardens and traffic police just another type of tax collector for the state and counties?
  • Do traffic wardens really issue more tickets now than in the past, or is that just my perception?
  • Which zones in bigger towns are especially risky, should you forget to pay for parking?
  • Are traffic wardens more active in specific time intervals?
  • Do the police set speed traps in the locations with the most traffic accidents? What about DUI checks?
  • How does temperature influence the number of issued traffic tickets?
  • Does the moon influence the number of issued traffic tickets? If so, which types?
  • Where and when are drivers most at risk of encountering other drunk drivers?
  • Where does the highway authority check for toll payment, and when should you hit the road if you do not want to pay it?
  • How can we drive safer using open data?

Be sure to read the main article to see all the visualizations and interactive maps. There are also videos, for example this one, showing how the ticketing territory expanded through time in Ljubljana:

Parkirne kazni v Ljubljani 2012 – 2014 from Marko O’Hara on Vimeo.

Some other highlights:

The big finding was a sharp increase in the number of parking tickets issued in Ljubljana by the end of 2013, which coincides with the publication of the debt that the county had run into:

Increase of parking tickets issued in Ljubljana

There’s an interactive map showing the quadrants with the most DUI tickets and their distribution by day of the week and month of the year:

DUI distribution

Mobile app for Android

Mobile app for Android – start screen
Mobile app for Android – map

I also wrote an Android mobile app (get it on Google Play if you are interested) that locates the user and shows locations of violations of selected type on the map, as well as a threat assessment, should she want to break the law. Here’s the description on Google Play:

The app helps the user find out where and when traffic tickets were issued in Slovenia, thus facilitating safer driving.
The ticket database is limited to the territory of the Republic of Slovenia.

Choose between these issued citations to show in app:
– parking
– speeding
– driving while using a cellphone
– ignoring safety belt laws
– unpaid toll
– DUI
and traffic accidents.

The app will locate you, fetch data about traffic citations issued in your vicinity, and show them on the map. To see citations that were issued somewhere else, click on the map. Also available is a summary of the threat level, derived from statistical data collected by government agencies.

Locating the user and showing dots on a map wasn’t really a challenge, but I wanted to show a realistic threat assessment based on location and time. To do that, I wrote an API method that calculates the number of tickets issued on the same day of the week in the same hour interval; the app then draws a simple gauge.

Let’s say, for example, that you find yourself in the center of Ljubljana on Monday at noon, don’t have money for the parking fee, and really only want to take a box to a friend who lives there. You’ll be gone for only ten minutes, so should you risk not paying the parking fee?

The app finds out the total number of tickets issued on Mondays in the three-hour period between noon and 3 PM, then graphically shows the threat level along with some distributions, something like this:

Threat assessment
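
The calculation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the app's actual API method: the function names, the sample timestamps, and the idea of scaling the gauge against the busiest weekday/interval slot are all assumptions made here.

```python
from datetime import datetime

BUCKET_HOURS = 3  # three-hour intervals, as in the noon-to-3 PM example


def bucket(ts):
    """Return (weekday, interval index) for a timestamp."""
    return ts.weekday(), ts.hour // BUCKET_HOURS


def threat_level(ticket_times, now):
    """Count tickets in the same weekday/interval slot as `now`.

    Returns (count, level), where level is the count divided by the count
    of the busiest slot. That scaling is an assumption of this sketch.
    """
    counts = {}
    for ts in ticket_times:
        key = bucket(ts)
        counts[key] = counts.get(key, 0) + 1
    if not counts:
        return 0, 0.0
    same_slot = counts.get(bucket(now), 0)
    return same_slot, same_slot / max(counts.values())


# Made-up tickets issued near one location.
tickets = [
    datetime(2014, 3, 3, 12, 15),   # Monday, 12-15h
    datetime(2014, 3, 10, 13, 40),  # Monday, 12-15h
    datetime(2014, 3, 10, 14, 5),   # Monday, 12-15h
    datetime(2014, 3, 4, 9, 30),    # Tuesday, 9-12h
    datetime(2014, 3, 8, 20, 0),    # Saturday, 18-21h
]

# A Monday at noon falls into the busiest slot of this toy data set.
count, level = threat_level(tickets, datetime(2014, 12, 29, 12, 0))
print(count, level)
```

The level between 0 and 1 is what a gauge could display; a real implementation would first restrict the tickets to those issued near the user's location.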

It works pretty well, and I use it sometimes, although I admit that its use cases may be marginal for the majority of the population. It does get ten new installs a day, although I don’t know how long this trend will continue.

I did send out press releases and mounted a moderate campaign on Twitter (here’s the app’s account), but it amounted to precious little. Maybe the timing was bad – I launched it during the Christmas holidays, when Internet usage is low. Or this type of app just isn’t so interesting.

I’m currently working on an analysis of parking tickets for New York City; maybe that will be more interesting. There were, after all, more than nine million tickets issued there, and the data is much richer.

Stay tuned!

Mobile apps


Redar

Mobile app screens

Click image to download, or click here.

The app helps the user find out where and when traffic tickets were issued in Slovenia, thus facilitating safer driving.
The ticket database is limited to the territory of the Republic of Slovenia.

Choose between these issued citations to show in app:
– parking
– speeding
– driving while using a cellphone
– ignoring safety belt laws
– unpaid toll
– DUI
and traffic accidents.

The app will locate you, fetch data about traffic citations issued in your vicinity, and show them on the map. To see citations that were issued somewhere else, click on the map. Also available is a summary of the threat level, derived from statistical data collected by government agencies. It includes:

– overall threat assessment of being issued a citation, should you violate traffic laws
– average interval between citations issued at that location
– date of the most recent citation issued, relative to the data source update
– number of citations issued in the vicinity
– distance to the closest issued citation

And several statistical distributions of issued citations, such as:
– by day of the week (how many on Monday, Tuesday, …)
– by hour of the day (how many in intervals between 9-12h, …)
– by month of the year (how many in January, February, …)
– by weather conditions (how many in rain, snow, clear weather)
– by temperature (how many in the temperature interval between 5-10 Celsius, …)

The same information is also available in an address list, ordered by the number of citations issued.

Data sources

Many thanks to the traffic wardens, police, and other officials who supplied the raw data used to build this app:
– traffic wardens of Ljubljana,
– state police,
– DARS,
– Parkings of Ljubljana,
– traffic wardens of Maribor,
– traffic wardens of Kranj,
– traffic wardens of Celje,
– traffic wardens of Novo Mesto,
– traffic wardens of Nova Gorica

Data was acquired for the time interval between 2012 and the end of 2014.
The database contains a little less than a million traffic citations.

Discovering and visualizing songs with similar trends on the British Top 40 Charts from 1990 to 2014


I often wondered what the average lifetime of a pop song on the charts is. If one follows music, it becomes intuitively apparent that there are in fact several types of hits. Some stay on the charts for many weeks, and others barely make it, then immediately slip out.

So I set about discovering groups of songs with similar trends as they moved through the weekly British Top 40 Chart from 1990 to 2014. A total of 1284 different songs appeared on the charts in that period. After a series of experiments, I settled, somewhat arbitrarily, on 100 groups. Position data for each song was collected across the weeks, then the songs were grouped using k-means clustering.

The result is part interactive, part static visualization, consisting of an exploratory chart and 100 small charts showing each separate group.

Check it out here! Or click the image below.

Song trends over time in a typical group

To group the songs, the data was first scraped from www.officialcharts.com, then arranged in a format suitable for k-means clustering. The visualization was constructed with d3.
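
As a rough illustration of what “a format suitable for k-means clustering” could look like, the sketch below turns each song's weekly positions into a fixed-length vector and groups the vectors with a plain k-means loop. The padding value, the vector length, the sample songs, and all function names are assumptions for this example; the original analysis used 100 groups and its own data preparation, which is not shown in this post.

```python
import random

OFF_CHART = 41  # assumed filler for weeks in which a song is outside the Top 40
WEEKS = 6       # assumed fixed trajectory length for this toy example


def to_vector(positions):
    """Pad or trim a song's weekly positions to a fixed-length vector."""
    padded = list(positions[:WEEKS])
    return padded + [OFF_CHART] * (WEEKS - len(padded))


def distance2(a, b):
    """Squared Euclidean distance between two trajectories."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns a cluster index per vector."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each song joins its nearest centroid.
        labels = [min(range(k), key=lambda c: distance2(v, centroids[c]))
                  for v in vectors]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up chart runs: weekly positions from the week of entry onward.
songs = {
    "long runner A": [5, 3, 2, 2, 4, 6],
    "long runner B": [8, 4, 3, 3, 5, 9],
    "one-week wonder A": [38],
    "one-week wonder B": [35, 40],
}
labels = kmeans([to_vector(p) for p in songs.values()], k=2)
for name, label in zip(songs, labels):
    print(label, name)
```

With real data, the interesting choices are the ones hidden in `to_vector`: how long a trajectory to keep and how to represent weeks off the chart both change which songs end up looking similar.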

And here are some of the small multiples.

Some of the 100 different groups. Click image for more.

Conspiracy theories as network graphs: antigravity, Illuminati/NWO, JFK, 9/11, chemtrails visualized


This is an attempt at visualizing different conspiracy theories. The visualization tries to show the interconnectedness of actors, organizations and concepts in each one, so a network graph was chosen as the mode of presentation. The presented theories are: The Antigravity Drive, Chemtrails, The Cabal (American deep state from the JFK assassination to 9/11), The Illuminati/New World Order, and the most recent, the Malaysia Airlines Flight MH370 disappearance. In a way, it’s a progression from the previous network visualization about the PRISM scandal, which was once also considered a conspiracy theory.

I chose this topic because these theories have always attracted me as alternative explanations of things that I couldn’t understand in the official versions of events. That is not to say that I necessarily believe in any of them. For example, I’d be hard pressed to believe in the Moon Landing Hoax theory, which I first included here because of the relative ease of gathering source material, but later discarded because of its relatively low value. The Flight MH370 theory has extremely low credibility too, and I wonder what I’ll think when this post is a year old.

Some of the others, for example Nick Cook’s antigravity drive thesis, are extremely well researched, and many eminent scientists appear to believe at least part of it, if we are to believe his book The Hunt for Zero Point: Inside the Classified World of Antigravity Technology.

Launch the Conspiracy Theory Explorer! Use Chrome, if possible. Firefox and IE are terribly slow.

Conspiracy Theory Explorer – launch viewer!

Conspiracies and conspiracy theories

A conspiracy, according to Wikipedia, “may also refer to a group of people who make an agreement to form a partnership in which each member becomes the agent or partner of every other member and engage in planning or agreeing to commit some act.” This is a pretty broad definition. It can apply to a government, a company, or any group of people who are trying to further an agenda, be it good or bad for their natural or social environment. But anything labelled as a conspiracy almost always has an evil association, for example “A civil conspiracy or collusion is an agreement between two or more parties to deprive a third party of legal rights or deceive a third party to obtain an illegal objective.” (Wikipedia – civil conspiracy), or “In criminal law, a conspiracy is an agreement between two or more persons to commit a crime at some time in the future.” (Wikipedia – criminal conspiracy).

A conspiracy theory is therefore an attempt at explaining a real or imagined conspiracy. In this sense, even official stories of various incidents are conspiracy theories, unless they are well founded in evidence and irrefutable facts. In a free society, a kind of market of conspiracy theories then forms, in which those with better means, but also more vested interests, compete for the public’s attention with other bodies of citizenry, whose interests and aims can differ significantly. For example, a government can execute a false flag attack, as the Nazis did in Poland at the beginning of WW2, and spin a theory that the other party did it, in order to go to war and grab land. The public may then be motivated to concoct a variety of counter-theories with various motives – simply seeking the truth, overthrowing the government by exposing the lies it tells, furthering some commercial agenda, for example selling books, or purely personal paranoid agendas, which serve no one other than the authors and their need to sustain their delusions.

Let me briefly explain the theories I used in this visualization. The first two are quite believable.

The Cabal: the story of American deep state and events from JFK assassination to 9/11 attacks

JFK Assassination in Dallas

How a secret cabal of highly influential men formed behind the US Government for the purpose of killing President Kennedy, and how it later evolved into a secret government that controls most aspects of American politics and life. In it: the JFK and RFK assassinations, the presidential careers of the Bushes, Clinton, and Obama, the Oklahoma City bombing, the 9/11 plot, and the murder of countless witnesses, politicians, and journalists who sought to expose them, including Sen. Paul Wellstone and even Hunter S. Thompson. Everything, according to the authors, has been an inside job.

The research was done by Mark Gorton, and the material for the visualization comes from his two essays (Fifty Years of Deep State and The Political Dominance of The Cabal), available on the Internet, but also from these books he references: “How the CIA Controlled The House Select Committee on Assassinations”, Chapter 17 of “The Taking of America 1-2-3” by Richard Sprague, The Road to 9/11: Wealth, Empire, and the Future of America by Peter Dale Scott, Defrauding America: Encyclopedia of Secret Operations by the CIA, DEA, and Other Covert Agencies by Rodney Stich, George Bush: The Unauthorized Biography by Webster Tarpley and Anton Chaitkin, Dark Alliance: The CIA, the Contras, and the Crack Cocaine Explosion by Gary Webb, and Compromised: Clinton, Bush and the CIA by Terry Reed.

The Antigravity Drive

The Henge – a test site for Die Glocke?

A story of how the Nazi regime allegedly developed a form of antigravity propulsion in total secrecy, made possible by a strictly compartmentalized environment imposed on the German war production efforts by the SS. The technology was then seized by the US military and other allies after the war and developed further in utmost secrecy. The first such machines ever seen were the so-called foo fighters. These balls of light, sighted and documented by various US Air Force pilots, flew in parallel with bombers and fighter planes, and frequently executed seemingly impossible air maneuvers. Also mentioned are a mythical machine, Die Glocke (The Bell), which ran on red mercury and was responsible for the deaths of several scientists due to the extreme radiation it produced, and the discoveries of Viktor Schauberger. His implosion engine, which drew heavily on vortex physics, was allegedly successful and produced two flying prototypes. The US military immediately grabbed and classified much of this work, and it remains secret to this day. It’s said to be employed in the B-2 bomber and various flying craft sighted around Area 51 in Nevada. The story also goes on to mention modern experiments in antigravity physics, notably those performed by Evgeniy Podkletnov, which allegedly succeeded in reducing gravity over a spinning superconducting electromagnet by two percent.

The material for this story comes entirely from Nick Cook’s book The Hunt for Zero Point: Inside the Classified World of Antigravity Technology, although there are many other books on this topic, for example Tom Agoston’s Blunder!: How the U.S. gave away Nazi supersecrets to Russia, Dr. Paul LaViolette’s Secrets of Antigravity Propulsion: Tesla, UFOs, and Classified Aerospace Technology, Joseph Farrell’s The SS Brotherhood of the Bell: The Nazis’ Incredible Secret Technology and Reich of the Black Sun: Nazi Secret Weapons & the Cold War Allied Legend, and some others. All of them are well-researched and worth reading.

Chemtrails

Chemtrails

A popular conspiracy theory about the fat trails that civilian airliners leave in their wake. These chemical trails – as opposed to regular vapor contrails – are said to contain microbiological material and heavy metals, which seem to serve a variety of purposes. Among them: population reduction through novel diseases, such as Morgellons disease, which causes plastic fibers to grow through the skin; weather engineering for the purpose of military dominance by the U.S.; geoengineering to further reduce population; facilitation of communication with deeply submerged military submarines; and straight mind control in conjunction with HAARP.

Material came from assorted Internet sources, most notably Rense.com, and the book Chemtrails Confirmed by William Thomas. There are other books, for example Chemtrails, HAARP, and the “Full Spectrum Dominance” of Planet Earth by Elana Freeland, and What In The World Are They Spraying? by G. Edward Griffin. Morgellons disease is expounded on in the book How to Get Your Life Back From Morgellons and Other Skin Parasites by Richard L. Kuhns.

Illuminati / New World Order

Illuminati / NWO

How a handful of secret societies dominate the world. The plot allegedly has its roots in the Bavarian Illuminati society, started in the eighteenth century by Adam Weishaupt. They were eradicated, but some claim they survived in a covert form, forging an alliance with international bankers. Most big world events since then were planned in advance, among them the advent of Communism, Nazism and Zionism, both World Wars, and the third one too. Says Pike: “The Third World War must be fomented by taking advantage of the differences caused by the “agentur” of the “Illuminati” between the political Zionists and the leaders of Islamic World. The war must be conducted in such a way that Islam (the Moslem Arabic World) and political Zionism (the State of Israel) mutually destroy each other. Meanwhile the other nations, once more divided on this issue will be constrained to fight to the point of complete physical, moral, spiritual and economical exhaustion…We shall unleash the Nihilists and the atheists, and we shall provoke a formidable social cataclysm which in all its horror will show clearly to the nations the effect of absolute atheism, origin of savagery and of the most bloody turmoil.”

In recent times, the organizations that further Illuminati goals are the Council on Foreign Relations, the Trilateral Commission and the Bilderbergers. Here are some books: The Illuminati: Facts & Fiction by Mark Dice, and The Illuminati original by Adam Weishaupt.

Malaysia Airlines Flight MH370 disappearance

Boeing 777 (symbolic picture)

A recent theory about the whereabouts of the missing plane. On board, there seemed to be an awful lot of technical personnel involved in developing military hardware. They supposedly worked for a company named Freescale Semiconductor, which was in a patent dispute with the Rothschild family. According to the story, Israeli agents and elements of the US military hijacked the plane and secretly flew it to the Diego Garcia military base in the Indian Ocean to debrief the experts and possibly use the plane in another 9/11-style attack in the future.

Construction and visualization of the networks

A few words for the technologically minded. The networks were constructed by text-mining the source material, isolating known entities in sentences by means of massive dictionaries, connecting them in subnetworks (each sentence – one subnetwork), and finally adding them to the master network for that topic. Only sentence-length subnetworks were constructed, although it would probably be more fruitful to connect entities within paragraphs too. That would, however, yield too convoluted a master network, so I stayed with sentences for clarity.

The dictionaries were automatically generated from the source texts, then edited. Many synonyms had to be added, since my dictionary-generating technique relies more on brute force than on the semantic aspects of text. Again, the connections are not semantic, which means that if there was a sentence “The Illuminati are NOT connected with the CFR”, Illuminati and CFR would still be connected. Here I’m relying on the power of statistics: in the majority of sentences, the entities that appear together are in fact connected. For the minority in which they are not, the bonds between them are too weak to influence the big picture.
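
The sentence-level approach described above can be sketched roughly as follows. This is not the original project code: the tiny dictionary, the function names, and the last two sample sentences are made up for illustration, and the sentence splitting is deliberately naive.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical dictionary: surface form (lower case) -> canonical entity.
DICTIONARY = {
    "illuminati": "Illuminati",
    "cfr": "CFR",
    "council on foreign relations": "CFR",  # synonym folded into one entity
    "bilderbergers": "Bilderberg Group",
}


def entities_in(sentence):
    """Return the set of known entities mentioned in one sentence."""
    text = sentence.lower()
    return {entity for term, entity in DICTIONARY.items()
            if re.search(r"\b" + re.escape(term) + r"\b", text)}


def build_network(text):
    """Connect every pair of entities sharing a sentence; count co-occurrences."""
    edges = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        # One sentence yields one fully connected subnetwork,
        # which is merged into the master network by adding to the edge weights.
        for a, b in combinations(sorted(entities_in(sentence)), 2):
            edges[(a, b)] += 1
    return edges


sample = ("The Illuminati are NOT connected with the CFR. "
          "The Council on Foreign Relations and the Bilderbergers meet often. "
          "Some say the Illuminati steer the CFR.")
network = build_network(sample)
for (a, b), weight in network.items():
    print(a, "<->", b, weight)
```

Note that the negated first sentence still contributes an edge between Illuminati and CFR, which is exactly the non-semantic behavior described above; the edge weight is what lets frequent co-occurrences outweigh such cases.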

I did try to process volumes of text with a natural language processing framework, namely Apache OpenNLP, but got frustrated with the amount of work that would be needed for this little hobby project. I’d need to train the classifiers to extract named entities, which is no small feat, and I’d probably not use them again. To gain some insight into the types of connections between these entities, I tried parsing the sentences into parse trees and then extracting relationships, but parsing tech is not very accurate. It would probably do, again relying on the power of statistics, but the sheer number of relationship types would add little to the visual value of the graphs, so I decided that I’d do this with a simpler project first. The logic I wrote is still in the project source code, so if anyone is interested, mail me (About page) and I’ll send it your way. Same goes for the graph files and the categorized dictionaries.

Finally, the topic networks were exported as subgraphs, so that every node in the network is represented by a subgraph. These subgraphs are added into – or removed from – the master graph by the client. The networks in the browser are managed by sigma.js. Preliminary analysis was done in Gephi; I recommend Network Graph Analysis and Visualization with Gephi by Ken Cherven.
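
One possible way to produce such per-node subgraphs is sketched below. The JSON layout and the function name are hypothetical, chosen only for this illustration; the actual export format used with sigma.js is not described in this post.

```python
import json


def export_subgraphs(edges):
    """Map each node to a JSON document holding the node and its direct neighbours.

    `edges` maps (node_a, node_b) pairs to a weight. The "nodes"/"edges"
    layout of the output is an assumption of this sketch.
    """
    neighbours = {}
    for (a, b), weight in edges.items():
        neighbours.setdefault(a, []).append((b, weight))
        neighbours.setdefault(b, []).append((a, weight))
    subgraphs = {}
    for node, links in neighbours.items():
        subgraphs[node] = json.dumps({
            "nodes": [{"id": node}] + [{"id": other} for other, _ in links],
            "edges": [{"source": node, "target": other, "size": w}
                      for other, w in links],
        })
    return subgraphs


# Toy master network with two weighted edges.
edges = {("CFR", "Illuminati"): 2, ("Bilderberg Group", "CFR"): 1}
subgraphs = export_subgraphs(edges)
cfr = json.loads(subgraphs["CFR"])
print(len(cfr["nodes"]), "nodes,", len(cfr["edges"]), "edges")
```

With one small document per node, the client only has to fetch the subgraph of the node that was clicked and merge it into (or remove it from) whatever is currently displayed.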

Additionally, geographic entities were extracted for each node. These are represented on a small map at the bottom of the screen. The map is managed by d3.js.

Interacting with visualization

There are two modes – reading the story or exploring on your own. Switch between them by clicking the button at the top right of the graph. While you read the story, the graph will change in real time as you scroll down the text. If you choose to explore, you can click on terms, and their subgraphs will be interactively added to the master graph.

Clicking on a graph node will expand it (load its associated nodes and display them, if previously not loaded), or delete it, if it was already loaded, at the same time showing the text from which its existence was text-mined.

There’s no way for the user to control the map. It’s there for informative and decorative purposes.

There’s more help in the main visualization, check it out!

Addresses with most registered companies in Slovenian towns


An article appeared that attempted to expose questionable practices of some Slovenian entrepreneurs. The scheme goes like this: establish a company, perform some work, bleed it dry, then establish a new one and move all workers into it, at the same time avoiding paying benefits and a sizable portion of salaries. When the new company has served its purpose, establish another one, and so on, for as long as it works. These companies are frequently registered at the same address.

The article says that there are as many as 120 companies registered in one residential building. But because of a weakness of the law, state inspectors can’t put an end to such practice.

I wanted to see these addresses on a map, so here’s an attempt. For every address with more than five companies, there’s a dot, with color and radius proportional to the number of companies registered there. The biggest dots represent business buildings, in which predominantly legitimate businesses reside. My data sources didn’t allow for filtering out just the residential buildings.
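
The filtering and sizing rule can be sketched like this. The scale factor, the function name, and the sample addresses are invented for the example; only the more-than-five threshold comes from the description above.

```python
from collections import Counter

MIN_COMPANIES = 5         # only addresses with more than five companies get a dot
RADIUS_PER_COMPANY = 0.5  # assumed scale factor, in pixels per company


def address_markers(registrations):
    """Return {address: dot radius} for addresses above the threshold.

    `registrations` is an iterable of (company name, address) pairs.
    """
    counts = Counter(address for _, address in registrations)
    return {address: n * RADIUS_PER_COMPANY
            for address, n in counts.items() if n > MIN_COMPANIES}


# Made-up registry: eight companies at one address, five at another.
registrations = ([(f"Company {i}", "Example Street 1") for i in range(8)]
                 + [(f"Firm {i}", "Sample Road 2") for i in range(5)])
markers = address_markers(registrations)
print(markers)
```

The address with exactly five companies is dropped, since the rule is "more than five"; the same count could drive the dot color as well as the radius.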

You can see the standalone map here. (In Slovene.)

Interactive map showing addresses with most companies

Clicking on a marker displays a popup with a list of companies, sorted by date of establishment – youngest first. There’s also a chart of predominant business categories at that address. The categories that the article mentions as most prone to the scheme in question are Construction and Retail. So even if this map can’t really show the locations of these questionable companies, it may help in their discovery. If there’s a big dot with predominantly these categories, there’s a certain possibility that some of these fraudulent companies are there.

Most addresses shown here of course don’t have anything to do with any illegal activity.

Data source: Zemljevid Najdi.si.
