
My experience with publishing data visualizations on the web


After a year of publishing data visualizations and learning a lot in the process, I think I can share a thing or two about the best publishing practices I’m aware of so far.

Here’s the Google Analytics Audience Overview for virostatiq.com over a little more than a year of operation. The big spikes all come from social media. The biggest one happened when some bigger players linked to one of my posts. The lulls in activity coincide with my holidays.

There are two more charts from my site further down. One shows the number of visits from social media, the other the top referring sites.

Google Analytics - Audience Overview for virostatiq.com

So you’ve made a data visualization, and now you’d like to share it with the right people so they can appreciate the findings or the execution. Where do you publish it?

Thinking about it, there are several groups one should target. Some effort is required to distribute your work appropriately.

Your target groups

Data visualization enthusiasts 

People who enjoy a well-executed visualization, and who may care about the actual content only insofar as it’s presented well and fairly. Be aware that data visualization is a geekish thing.

There are several such communities:


Dataisbeautiful on Reddit – a subreddit dedicated to sharing data visualizations. Infographics are not accepted there. You may want to publish the link at a time that maximizes visibility for North American visitors. Subscription is required before posting.
A word of caution – you shouldn’t use Reddit exclusively for self-promotion. Posting only links to your own work will be considered spamming sooner or later, especially if you cross-post the same link to other subreddits, so tread lightly, and do engage in discussion whenever possible. The FAQ says that it’s barely acceptable to post one link to your site per five links to others. Conspicuous violation of these rules will get you banned, or worse yet, shadowbanned, which means you won’t even know your posts don’t appear on Reddit.



Visualizing.org – a very nice website allowing users to submit various kinds of visualizations. They also run competitions from time to time with awards up to $5,000. If your visualization gets featured, that can substantially add to your traffic, as they also have an active Twitter account and regularly publish what they think is best. If you make it to that list, your reach may expand significantly.
They also partner with academic organizations and big companies, so this is definitely the site to publish on.



Visual.ly – another website like Visualizing.org, but they also run a business. They will help you execute a project for a fee, and their galleries are well visited. After you submit a visualization, an editor reviews it and either approves it or not. I’ve had several rejected, mainly because they were wrongly categorized or fell in a gray zone. For example, I submitted a blog post with an embedded interactive map, and it was rejected. Then I made a separate page with the map, resubmitted it and got approved. Sometimes they mark a well-executed visualization as a Staff Pick, which can bring more attention from users.
Another thing about this site is that it may have really high-profile visitors, such as journalists who write for mainstream media. In fact, one of my most visited posts – My heart rate during latest episode of Game of Thrones – got spotted there by a Popular Science journalist, who published it in an article, which attracted more journalists from other media, and they published separate articles of their own, all linking to the original.
Another well-received post was a map of building ages in my city (a Staff Pick), which attracted the attention of Wired’s science editor, who was putting together a Wired Map blog post about such maps. After a few emails back and forth, they featured my map on the Wired site (see the Media page for a list of all such articles).



Visualoop – I’m not sure how to reliably submit your work to this site, but they do have a contact form at the bottom. They included me two or three times in their weekly reviews, probably after spotting what I published on other sites. I recommend following their monthly dataviz calendar of events; it lists a lot of conferences and hackathons.


Various blogs and Twitter accounts

Some specialize in infographics, not making a distinction between that and a data visualization, others post just map-related stuff, and some are just run by geeks who enjoy something novel or cool. Search for them on Google.
As for Twitter accounts, search for tags such as #dataviz, #ddj and #data-journalism, then follow frequent posters and institutions. Also try to find the accounts of data journalists, professionals, and the technologies you used in your work, for example Sigma.js. Follow, retweet, etc. If even one of them retweets your link, you can see a hundred times more exposure than usual. Also, try to patiently build and cultivate your online community. This is an area in which I’m lacking, as I’m more work oriented. Read articles such as these.

Kantar Dataisbeautiful awards

Data visualization competitions

It’s good to apply for as many of these as you can, even if you have to pay a symbolic fee. If nothing else, you may get longlisted, and your link will be displayed on a prestigious page, thus exposing your work to more interested people.

Try these:
Knight – Mozilla Fellowships – this is not strictly an award, but they offer fellowships at prestigious news organizations around the world. They only select up to six people a year for a ten-month term, but I think that’s a great opportunity.
Information is Beautiful Awards – There’s a low fee to enter. Yearly.
Global Editors Network Data Journalism Awards – link is for 2014, admission is free.
Visualizing.org Open Challenges – interesting challenges with handsome awards.
Urban Data Challenge – they supply urban data, your job is to visualize it. Yearly.

More events at Visualoop/Events.
Be on the lookout for various visualization hackathons.

I also recommend subscribing to the DashingD3 newsletter. Go to DashingD3.com; there’s a sign-up form at the bottom. They also have another newsletter for D3 freelance opportunities.

Journalists interested in data visualization

Data-driven journalism is an emerging trend. Most big publishing houses create prestigious visualizations that garner a lot of online interest. Guardian, Bloomberg, Washington Post and New York Times come to mind first, but there are many more, so there are naturally many journalists who are looking for a scoop in this area. You can find these people on Twitter, but there’s also a newsletter which some of them read. Subscribe to http://okfn.org/, and send your posts to data-driven-journalism@lists.okfn.org.

Business-oriented people

Join LinkedIn’s Data visualization group. Actually, I got the idea for this post from a past discussion there. It has a ton of mostly business-oriented posts and resources. You can also post your creations there. There are also a lot of people there who might need a service you provide. More on this in the section about monetizing your work below.

Academia and government

In my experience, this is an organic thing. If your visualizations have educational value and you post them regularly, academics will notice and contact you. I was once contacted by an official from Canada’s health department who used my findings from this post in his presentation at an international conference.

People interested in content and findings of your data visualization

This is the trickiest part, and possibly the most rewarding. There’s a lot of trial and error and improvisation involved here, but try posting on online forums, in Facebook groups and other communities, and emailing editors at news organizations. I once posted a link to My heart rate during latest episode of Game of Thrones to a Game of Thrones forum and got an overwhelming response.
Be careful though, as there’s a thin line that separates rightful enthusiasm from obnoxious spamming. In an ideal world, you would be an active member of these communities before you posted your link there.

Social networking

Social media visits to virostatiq-com

I mentioned social networking above, but I feel this topic requires separate treatment. Examine the chart above to get an idea of the relative importance of these media.

Make sure you add sharing links to all your visualizations to make it easy for visitors to share them. They probably won’t use the buttons, but some of them will be reminded that they can share, and will do it their own way.

  • Facebook – try to join various data visualization groups and post there. Here are some: Gephi, Urban Data Visualization, Infographics and Data Visualization. Be aware that some of these groups are private. Also, post on your wall (obviously) and, carefully, on the walls of organizations or pages that publish content related to your work.
    Another strategy is to ask friends who might be opinion makers to post links to your work. I have such a friend, and when he does it, it makes a huge difference, something like a factor of ten, because his posts reach other opinion makers, who repost them.
  • Google+ – consider creating a page with your efforts, and link it with your blog or site as per instructions. That will bolster your search results on Google and give you another avenue for showcasing your work and promotion. For example, here’s my page.
  • Twitter – I mentioned Twitter strategy above, so again, read articles such as these.
  • LinkedIn – join groups and post your work. This is a good place to develop business leads. Complete your profile and publish a link to it on your blog.
  • Pinterest – create an account and pin static images of your work to a board with an appropriate name, for example “data visualizations”.
  • Tumblr – consider creating a separate blog there and repost everything.
  • StumbleUpon – submit every link you produce. I had moderate success with Interactive timeline of the PRISM scandal.
  • Digg – submit all your links. Data visualizations frequently appear on the Digg front page, although I haven’t made it there yet, so I can’t give a firsthand account of what kind of traffic you can expect.
  • GitHub – if you’re an accomplished programmer, clean your code and commit it there. I’m not, so I don’t. But it surely helps, especially if you manage to put together a library that others will use.


Search engine optimization is an area in which I don’t excel, but it’s a game that can potentially make a huge difference. Be sure to optimize your code and include meta tags. If your work relies on AJAX, read Google’s guidelines for indexing such sites.

For more information, turn to bloggers who make money from their sites; there are tons of super useful resources out there. I’m a one-man band, so it’s hard for me to keep current on all this in addition to technology and content.

Book to read: The Art of SEO by Eric Enge.

Monetizing your work

If visualizing data is more of a hobby than your primary work area, this article about reinventing yourself might boost your courage. In any case, don’t expect an avalanche of business opportunities and money from your hobby. Some might materialize though. So far I’ve had the pleasure of doing three projects for a small fee, and there’s another one in the works. A relatively well-known social network from Seattle contacted me to make a map. They saw my gallery over at Visualizing.org and proposed some business, and I accepted. Needless to say, anything made for real production must be super tight, so there was a learning opportunity.

Some friends suggested that I display ads on my site. I won’t – firstly because I don’t believe that data vis enthusiasts would click on any, and secondly because I don’t have enough traffic to warrant inclusion. It would just be silly. The most I did was enroll in the Amazon Associates program and place some links in posts to see what kind of revenue we’d be talking about. It’s of no consequence, but I might continue to do that, if only for the information value of the books advertised.

Half a year after starting this blog, I won an award at the Memefest Friendly Competition about Food Democracy. I went to Australia on their budget. That’s pretty cool. Now I consider my blog and my hobby a potential vehicle for enriching my life in such unexpected ways.

Book to read: How to Make Money with Your Blog: The Ultimate Reference Guide for Building, Optimizing, and Monetizing Your Blog.

Other considerations

  • Optimize for mobile! There are times when half of the visitors to my site are on mobile devices. So make your visualizations responsive, and design the user interface carefully so that it handles touch events.
  • Do not cut corners. A week more of programming can make the difference between a featured visualization and a mediocre one that gets buried under other submissions within a day.
  • Content is king. Ever heard this phrase? I did, but I had trouble understanding it. It means that a mediocre but tight piece of work on a superhot topic can be a hundred times more interesting than a perfectly executed job on uninteresting data.

Top referring sites to virostatiq.com

Here’s one last chart to sum up. It’s self-explanatory and gives a little more perspective on the topic at hand.

Referrals to virostatiq.com in one year

How to publish Gephi graphs on MapBox with TileMill


This is a simple how-to on publishing Gephi graphs as a tile-based, zoomable map suitable for online presentation. It’s intended for Gephi users who find other solutions lacking, or who would simply like to learn to publish to a free, cloud-based service.

If you have a Gephi graph and would like to publish it, you can export node and edge data in one of the standard formats and use it to render tiles for Leaflet, Google Maps or other suitable APIs. This requires a server to store the tiles and some knowledge to render them and use the API.

There’s another possibility – you can use TileMill to render the tiles and the free online service MapBox to host and display them. So here’s how to do it. I used this procedure to render the map here.

Exporting the graph data to TileMill

There are several ways to do that. This one is relatively easy, but it requires some programming knowledge, as you have to use the Gephi Toolkit.

First make the graph. You can use Gephi or the Gephi Toolkit. I use Gephi, since it lets you visually inspect the graph, run additional corrective layout algorithms and so on. Save the graph as a .gephi file.


The graph is already spatialized and analyzed for modularity classes, so the node information in the .gephi file contains at least coordinates (x, y), modularity classes, labels and sizes. TileMill can import CSV, so we’ll export the node data to CSV using the Gephi Toolkit.

Here is the Java code:

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

import au.com.bytecode.opencsv.CSVWriter;
import org.gephi.graph.api.DirectedGraph;
import org.gephi.graph.api.GraphController;
import org.gephi.graph.api.GraphModel;
import org.gephi.graph.api.Node;
import org.gephi.io.exporter.api.ExportController;
import org.gephi.project.api.ProjectController;
import org.gephi.statistics.plugin.Modularity;
import org.openide.util.Lookup;

public class RGraph {
    private static final String root = "C:\\Users\\solipsy\\Documents\\!Data\\Gephi\\";

    public void openGephi(File file) {
        // Open the saved .gephi project, which also sets the current workspace.
        // (Assumes Gephi Toolkit 0.8.x, where openProject returns a Runnable.)
        ProjectController pc = Lookup.getDefault().lookup(ProjectController.class);
        pc.openProject(file).run();
        if (pc.getCurrentProject() == null) {
            return;
        }

        GraphModel graphModel = Lookup.getDefault().lookup(GraphController.class).getModel();

        // See if the graph is well imported
        DirectedGraph graph = graphModel.getDirectedGraph();
        System.out.println("Nodes: " + graph.getNodeCount());
        System.out.println("Edges: " + graph.getEdgeCount());

        // Export node data to a tab-separated CSV file
        try (CSVWriter writer = new CSVWriter(new FileWriter(root + file.getName() + ".csv"), '\t')) {
            String[] header = "Latitude#Longitude#Modularity#Size#Label".split("#");
            writer.writeNext(header);
            for (Node n : graphModel.getGraph().getNodes().toArray()) {
                float x = n.getNodeData().x();
                float y = n.getNodeData().y();
                // Keep only nodes farther than 3500 units from the origin
                if (Math.sqrt(x * x + y * y) > 3500) {
                    Object modClass = n.getAttributes().getValue(Modularity.MODULARITY_CLASS);
                    String[] entry = new String[5];
                    entry[0] = String.valueOf(x);
                    entry[1] = String.valueOf(y);
                    entry[2] = String.valueOf(modClass);
                    entry[3] = String.valueOf(n.getNodeData().getRadius());
                    entry[4] = String.valueOf(n.getNodeData().getLabel());
                    writer.writeNext(entry);
                    System.out.println("modclass: " + modClass
                            + "\tx: " + x
                            + "\ty: " + y
                            + "\tsize: " + n.getNodeData().getRadius()
                            + "\tlabel: " + n.getNodeData().getLabel());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Also export a PNG preview of the graph
        ExportController ec = Lookup.getDefault().lookup(ExportController.class);
        try {
            ec.exportFile(new File(root + file.getName() + System.currentTimeMillis() + ".png"));
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
You’ll need to add the Gephi Toolkit and OpenCSV to your project. Incorporate the export class and call the “openGephi” method. Correct the paths to reflect the directory structure on your computer. The method should produce a tab-separated CSV file with the following columns: latitude (x), longitude (y), modularity class, size, label. Open the file in Notepad and remove the quotes. Now it’s ready to import into TileMill.
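If you’d rather not clean the file by hand, the quote removal can be scripted. Here’s a minimal sketch using only the standard library. The class name QuoteStripper is made up for illustration, and it assumes that none of your labels contains a double quote you want to keep.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class QuoteStripper {

    /** Removes every double quote from one line of the exported file. */
    public static String stripQuotes(String line) {
        return line.replace("\"", "");
    }

    /** Rewrites the file in place with all double quotes removed. */
    public static void stripFile(Path csv) throws IOException {
        // Same platform default encoding that FileWriter used for the export.
        Charset charset = Charset.defaultCharset();
        List<String> cleaned = new ArrayList<>();
        for (String line : Files.readAllLines(csv, charset)) {
            cleaned.add(stripQuotes(line));
        }
        Files.write(csv, cleaned, charset);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("Usage: java QuoteStripper <path-to-exported-csv>");
            return;
        }
        stripFile(Paths.get(args[0]));
    }
}
```

Run it with the path of the exported CSV as the only argument. It overwrites the file, so keep a copy if you want the original.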

Importing data into TileMill and setting up a project


This is not what you’ll get when you import the data; it’s what a finished project looks like. First, open TileMill and add a new layer:


Give it a name and choose “900913” (Google) in the SRS dropdown. That’ll place your graph right in the center of the map. You’ll notice that it’s just a tiny dot on the first zoom level. Zoom in until you can clearly see the distinct dots. Zoom some more to decide on the zoom bracket for your map, then set it using the slider. For the map above, I used zooms from 14 to 19. You should really use this option, or else the map will be huge, take up thousands of gigabytes and render for a year. You should also mark the part of the map you want to export later. Shift and drag around your graph to select the smallest possible area.
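To get a feel for why the zoom bracket and the selected area matter so much, here’s a rough back-of-the-envelope sketch. In the usual web map tiling scheme, zoom level z splits the world into 2^z by 2^z tiles, so the tile count quadruples with every level. The class and method names are made up for illustration, and the selection estimate is only approximate; it is not how TileMill itself computes its export size.

```java
public class TileCount {

    /** Tiles covering the whole world at one zoom level: 2^z * 2^z = 4^z. */
    public static long tilesAtZoom(int zoom) {
        return 1L << (2 * zoom);
    }

    /** Total tiles for the whole world across an inclusive zoom range. */
    public static long tilesInRange(int minZoom, int maxZoom) {
        long total = 0;
        for (int z = minZoom; z <= maxZoom; z++) {
            total += tilesAtZoom(z);
        }
        return total;
    }

    /**
     * Rough tile count for a selection spanning the given fractions of the
     * world's width and height. Ignores tile-grid alignment, so the real
     * count can be one row or column higher.
     */
    public static long tilesForSelection(double widthFraction, double heightFraction, int zoom) {
        long perSide = 1L << zoom;
        long cols = (long) Math.ceil(widthFraction * perSide);
        long rows = (long) Math.ceil(heightFraction * perSide);
        return Math.max(1, cols) * Math.max(1, rows);
    }

    public static void main(String[] args) {
        System.out.println("Whole world, zoom 14-19: " + tilesInRange(14, 19));
        // Hypothetical selection covering 0.1% of the world's width and height.
        System.out.println("Small selection, zoom 19: " + tilesForSelection(0.001, 0.001, 19));
    }
}
```

The whole-world figure for zooms 14 to 19 runs into hundreds of billions of tiles, which is why a tight selection around the graph is worth the few seconds it takes.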

TileMill project settings

The metatile setting is important, but leave it at 1 for now. It’s used to prevent marker and label clipping at closer zoom levels. A larger value means less clipping, but also a less responsive map during editing.

Now it’s time to style your map so the nodes and labels are displayed in correct sizes and colors.

Styling the map

TileMill uses something called CartoCSS for styling labels, lines, markers, etc. It’s a simple conditional CSS. You can adjust values for each zoom level, and that’s what we are going to do. We’ll use markers to display the nodes, and set marker sizes so that they reflect the values in the “Size” column of your CSV file.


We’ll have to set the marker size to read the data in that column. This is for the highest zoom level. Marker sizes shrink by a factor of 2 with each lower zoom level, so for zoom levels 19 and 18 the marker size is specified like this:

[zoom = 19] {
marker-width: [Size] * 8;
}

[zoom = 18] {
marker-width: [Size] * 4;
}


You can guess the rest; it’s just a matter of halving the marker size for each lower zoom level. Unfortunately, it’s impossible to do something like that for labels, so we have to generate a list of node size brackets and corresponding text sizes for each zoom level separately. I use Excel to do this, but maybe it’d be better to just write another method to generate all that during export. So, for zoom 19 we have:

[zoom = 19] {
marker-width: [Size] * 8;
[Size >0][Size <= 5] {text-size:8 }
[Size >5][Size <= 10] {text-size:13 }
[Size >10][Size <= 20] {text-size:20 }
[Size >20][Size <= 30] {text-size:40 }
[Size >30][Size <= 40] {text-size:60 }
[Size >40][Size <= 50] {text-size:80 }
[Size >50][Size <= 60] {text-size:100 }
[Size >60][Size <= 70] {text-size:120 }
[Size >70][Size <= 80] {text-size:140 }
[Size >80][Size <= 90] {text-size:160 }
[Size >90][Size <= 100] {text-size:180 }
[Size >100][Size <= 110] {text-size:200 }
[Size >110][Size <= 120] {text-size:220 }
[Size >120][Size <= 130] {text-size:240 }
[Size >130][Size <= 140] {text-size:260 }
[Size >140][Size <= 150] {text-size:280 }
[Size >150][Size <= 160] {text-size:300 }
[Size >160][Size <= 170] {text-size:320 }
[Size >170][Size <= 180] {text-size:340 }
[Size >180][Size <= 190] {text-size:360 }
[Size >190][Size <= 200] {text-size:380 }
[Size >200][Size <= 210] {text-size:400 }
[Size >210][Size <= 220] {text-size:420 }
[Size >220][Size <= 230] {text-size:440 }
[Size >230][Size <= 240] {text-size:460 }
[Size >240][Size <= 250] {text-size:480 }
[Size >250][Size <= 260] {text-size:500 }
[Size >260][Size <= 270] {text-size:520 }
[Size >270][Size <= 280] {text-size:540 }
[Size >280][Size <= 290] {text-size:560 }
[Size >290][Size <= 300] {text-size:580 }
[Size >300][Size <= 310] {text-size:600 }
}

and for 18:

[zoom=18] {
marker-width: [Size] * 4;
[Size >0][Size <= 5] {text-size:2 }
[Size >5][Size <= 10] {text-size:5 }
[Size >10][Size <= 20] {text-size:10 }
[Size >20][Size <= 30] {text-size:20 }
[Size >30][Size <= 40] {text-size:30 }
[Size >40][Size <= 50] {text-size:40 }
[Size >50][Size <= 60] {text-size:50 }
[Size >60][Size <= 70] {text-size:60 }
[Size >70][Size <= 80] {text-size:70 }
[Size >80][Size <= 90] {text-size:80 }
[Size >90][Size <= 100] {text-size:90 }
[Size >100][Size <= 110] {text-size:100 }
[Size >110][Size <= 120] {text-size:110 }
[Size >120][Size <= 130] {text-size:120 }
[Size >130][Size <= 140] {text-size:130 }
[Size >140][Size <= 150] {text-size:140 }
[Size >150][Size <= 160] {text-size:150 }
[Size >160][Size <= 170] {text-size:160 }
[Size >170][Size <= 180] {text-size:170 }
[Size >180][Size <= 190] {text-size:180 }
[Size >190][Size <= 200] {text-size:190 }
[Size >200][Size <= 210] {text-size:200 }
[Size >210][Size <= 220] {text-size:210 }
[Size >220][Size <= 230] {text-size:220 }
[Size >230][Size <= 240] {text-size:230 }
[Size >240][Size <= 250] {text-size:240 }
[Size >250][Size <= 260] {text-size:250 }
[Size >260][Size <= 270] {text-size:260 }
[Size >270][Size <= 280] {text-size:270 }
[Size >280][Size <= 290] {text-size:280 }
[Size >290][Size <= 300] {text-size:290 }
[Size >300][Size <= 310] {text-size:300 }
}

Then just continue dividing, until you reach your last zoom level. It’s important not to use too many intervals in a zoom level, or TileMill will crash, at least on Windows.
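The bracket lists are mechanical enough to generate instead of maintaining them in Excel. Here’s a minimal sketch of such a generator. It rests on a few assumptions of mine: brackets of uniform width (the “step” parameter), a text size equal to twice the bracket’s lower bound at the highest zoom level, and both text size and marker width halving with each lower level. It does not reproduce the hand-picked sizes for the smallest brackets in the lists above, and the names CartoGen and zoomBlock are made up for illustration.

```java
public class CartoGen {

    /** Formats whole numbers without a trailing ".0" (8 instead of 8.0). */
    private static String fmt(double value) {
        return value == Math.rint(value) ? String.valueOf((long) value) : String.valueOf(value);
    }

    /**
     * Builds one [zoom = N] block: a marker-width rule plus one text-size
     * rule per Size bracket (lower, lower + step].
     */
    public static String zoomBlock(int zoom, int maxZoom, int maxSize, int step) {
        // Both scales halve with every zoom level below maxZoom.
        double scale = Math.pow(2, zoom - maxZoom);
        StringBuilder css = new StringBuilder();
        css.append("[zoom = ").append(zoom).append("] {\n");
        css.append("  marker-width: [Size] * ").append(fmt(8 * scale)).append(";\n");
        for (int lower = 0; lower < maxSize; lower += step) {
            int upper = lower + step;
            // Assumed rule: twice the lower bound at maxZoom, never below 1.
            long textSize = Math.max(1, Math.round(2 * lower * scale));
            css.append("  [Size > ").append(lower).append("][Size <= ").append(upper)
               .append("] { text-size: ").append(textSize).append("; }\n");
        }
        css.append("}\n");
        return css.toString();
    }

    public static void main(String[] args) {
        // Zoom bracket 14 to 19 and Size values up to 310, as in the example map.
        for (int zoom = 19; zoom >= 14; zoom--) {
            System.out.print(zoomBlock(zoom, 19, 310, 10));
        }
    }
}
```

Paste the output where the hand-written blocks would go. If TileMill struggles with the number of intervals, pass a larger step to get fewer brackets per zoom level.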

Now set up colors. If you used modularity, look up the numbers for the modularity classes in Gephi and use them in CSS. In my example, I use the Modularity column in CSS to determine node colors:

[Modularity = 43]  {marker-fill:#0000FF;}
[Modularity = 72]  {marker-fill:#008B00;}
[Modularity = 5]  {marker-fill:#EEB422;}
[Modularity = 3]  {marker-fill:#8E388E;}
[Modularity = 9]  {marker-fill:#FF1493;}
[Modularity = 0]  {marker-fill:#CD0000;}
[Modularity = 25]  {marker-fill:#388E8E;}
[Modularity = 38]  {marker-fill:#8E8E38;}
[Modularity = 6]  {marker-fill:#1E90FF;}
[Modularity = 10]  {marker-fill:#000080;}
[Modularity = 7]  {marker-fill:#00EE00;}
[Modularity = 21]  {marker-fill:#B8860B;}
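With many modularity classes, these rules can also be generated. The sketch below assigns colors from a palette to a list of class numbers, cycling through the palette if there are more classes than colors. The class name and the cycling behavior are my own assumptions for illustration; the class numbers and colors in the example are the first four from the list above.

```java
import java.util.Arrays;
import java.util.List;

public class ModularityColors {

    /** One marker-fill rule per modularity class, cycling through the palette. */
    public static String rules(List<Integer> classes, List<String> palette) {
        StringBuilder css = new StringBuilder();
        for (int i = 0; i < classes.size(); i++) {
            String color = palette.get(i % palette.size());
            css.append("[Modularity = ").append(classes.get(i))
               .append("] { marker-fill: ").append(color).append("; }\n");
        }
        return css.toString();
    }

    public static void main(String[] args) {
        List<Integer> classes = Arrays.asList(43, 72, 5, 3);
        List<String> palette = Arrays.asList("#0000FF", "#008B00", "#EEB422", "#8E388E");
        System.out.print(rules(classes, palette));
    }
}
```

You still need to look up the class numbers in Gephi (or collect them from the Modularity column of the CSV) and pick colors that are easy to tell apart.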

Before exporting, don’t forget to set the metatile size to at least 7 to prevent clipping.

Exporting to MapBox

Create an account on MapBox. It’s free for up to 50 MB of data and 5000 views in a rolling month.

In TileMill, select “Export / MBTiles”. Name your map, review the settings and click “Export”, if you want to save tiles to disk, or “Upload” if you want them to immediately appear in your MapBox account. Then make your map visible and share!