5 Tools to make government (or any) data attractive and engaging

For my JSK Fellowship at Stanford I decided to work on a topic that has fascinated me for as long as I can remember: the future of cities. Specifically, I’m interested in the role that local media will play in the future of our urban areas, and how media can nurture a more engaged citizenship. I explored a number of possible approaches to this question and became interested in the potential of data, or more precisely, open data, to become a source for citizen engagement.

For the past six months, I have been working on a project to make data about cities more engaging, a broad scope that I have been working to refine. I have been learning how to wrestle databases with MySQL and Sequel Pro, studying the basics (and not-so-basics) of interactive mapping tools, wrapping my head around the capabilities of OCR scanning technology, refreshing my HTML, CSS and JavaScript skills, practicing Human-Centric Design Thinking®, GitHub a-forking, data Tableau-ing, spreadsheet GoogleFusion-ing, and many more things that would probably have sounded like gibberish to me a couple of years ago.

So if data isn’t the area in which I’m most experienced, why tackle this issue at all? It’s a long story. The short version is that this project draws extensively from three particular aspects of my background: I have led and edited four editorial projects during the last decade (three print, one digital), co-founded one design studio, and studied social sciences during my college days. In other words, I’m not approaching this as a data project, but rather, as an opportunity to find a way to display data in an editorially compelling, visually striking, people-centric, entertaining and statistically sound way. Easy peasy, right?

I’m currently finishing up a prototype for this project, which we have called Citymatter, at our website. During the research for this project I’ve had a great deal of fun learning to use the tools that will help bring it to life, and I have become especially excited by the enormous possibilities for making data look beautiful using simple and mostly freely available tools. So, without further ado, here’s an annotated list of five tools to make any data look stunning:

1. Google Fusion Tables

Dan Nguyen’s amazing Public Affairs Data Journalism class at Stanford, taught during the Fall quarter, was the first step in sharpening my data chops. Within the first 15 minutes, I learned that Google Drive can be connected to a number of file formats or “apps” including, for the purposes of this post, an app for data viz. Google Fusion Tables can take simple data from a spreadsheet or even data uploaded in any comma-separated file format and turn it into an embeddable graph or map. It offers options such as pie charts, bar charts, lineplots, scatterplots, timelines, and geographical maps.

2. TileMill (formerly by Mapbox)

TileMill is a great place to ‘viz’ your data viz: it lets you build beautiful maps for your data rather than just mapping it out on Google. Now a totally open-source project, TileMill allows you to add your data sources in layers, style them using CartoCSS, and even add tooltips and legends to your maps. You can also export them in many different formats. Bottom line: take data, output beautiful custom maps. And if the instructions all sound like Martian to you, no need to worry: there are a million tutorials out there.

3. CartoDB

Another great open-source tool is CartoDB, which you can use to store and visualize geospatial data for display in a web browser. It’s great for analysis as well, allowing you to perform visual queries to gain insights from your data. CartoDB operates a ‘freemium’ model and is free for up to 50MB of data. The company recently announced that you can now open U.S. “open government” data files on its platform.

4. Tableau

Tableau has a series of products for data visualization, many of which focus on business intelligence. It allows you to explore and analyze your data quickly, slicing it in different ways much like you might with Excel’s pivot tables. But unlike Excel, the results show up visually and lightning fast, and you can use “Big Data” sized sets. Most of its products don’t come cheap, with the Tableau Online hosted server starting at $500 per year per user. However, they do have “Tableau Public,” for use with open data, which is free up to 1GB. As an aside, the Tableau Public user base seems to be concentrated among “Bloggers, Journalists, Political Junkies, Quantified Selves, Sports Fans, and Nonprofits.”
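
If “slicing it in different ways” sounds abstract, here is a rough sketch of what a pivot table does, written in plain Python rather than Tableau or Excel. The rows, the column names and the `pivot_sum` helper are all made up for illustration; this only shows the general idea of regrouping the same rows along different dimensions, not how Tableau works internally.

```python
from collections import defaultdict


def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair, pivot-table style."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert the nested defaultdicts into plain dicts for easier reading.
    return {r: dict(cols) for r, cols in table.items()}


# Hypothetical sample data: complaint counts per neighborhood and year.
rows = [
    {"neighborhood": "Downtown", "year": 2013, "complaints": 40},
    {"neighborhood": "Downtown", "year": 2014, "complaints": 55},
    {"neighborhood": "Midtown", "year": 2013, "complaints": 10},
    {"neighborhood": "Midtown", "year": 2014, "complaints": 15},
    {"neighborhood": "Downtown", "year": 2014, "complaints": 5},
]

# The same rows, sliced two ways: neighborhoods as rows, then years as rows.
by_neighborhood = pivot_sum(rows, "neighborhood", "year", "complaints")
by_year = pivot_sum(rows, "year", "neighborhood", "complaints")

print(by_neighborhood)
print(by_year)
```

Swapping `row_key` and `col_key` is the code equivalent of dragging a field from rows to columns in a pivot table.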

5. OpenRefine (formerly Google Refine)

Let’s take a step back from data visualization, and return to the data itself. Public, free, and open data … well, it can get messy. OpenRefine is an open-source tool designed to help you clean up messy data. It allows you to group identical cells in a particular column, showing you the number of rows in each group. This is useful in catching inconsistencies, and cleaning them up to make the data useful and easy to manipulate in all these other wonderful data visualization tools above.
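
To make the grouping idea concrete, here is a minimal sketch in plain Python of what that kind of column grouping boils down to. It is not OpenRefine’s own code or API; the sample rows, the `text_facet` helper and the `normalize` rule are assumptions invented for this example.

```python
from collections import Counter


def text_facet(rows, column):
    """Group identical cell values in one column and count the rows in each group."""
    return Counter(row[column] for row in rows)


def normalize(value):
    """One possible cleanup rule: collapse stray whitespace and unify capitalization."""
    return " ".join(value.split()).title()


# Hypothetical sample data: the same city typed three different ways.
rows = [
    {"city": "San Jose", "permits": 12},
    {"city": "san jose", "permits": 7},
    {"city": "San Jose ", "permits": 3},
    {"city": "Palo Alto", "permits": 5},
]

# Before cleanup, each spelling variant shows up as its own group.
facet = text_facet(rows, "city")
for value, count in facet.most_common():
    print(repr(value), count)

# After cleanup, the variants collapse into a single group per city.
cleaned = Counter(normalize(row["city"]) for row in rows)
print(cleaned)
```

Seeing four groups where you expected two is exactly the kind of inconsistency the grouping view is meant to surface; the cleanup rule you then apply depends on your data.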

Bonus: Python

Python is obviously useful for a hell of a lot more than just gathering data for visualizations, but eventually you’ll need to learn some of this nasty stuff if you want to build truly powerful applications. A great place to start is Codecademy.

(@gophermagazine) co-founded the online culture magazine The Gopher Illustrated.