From the data to the story: A typical ddj workflow in R

by Marie-Louise Timcke 0 Comments
From the data to the story: A typical ddj workflow in R

R is getting more and more popular among Data Journalists worldwide, as Timo Grossenbacher from SRF Data pointed out recently in a talk at useR!2017 conference in Brussels. Working as a data trainee at Berliner Morgenpost’s Interactive Team, I can confirm that R indeed played an important role in many of our lately published projects, for example when we identified the strongholds of german parties. While we also use the software for more complex statistics from time to time, something that R helps us with on a near-daily basis is the act of cleaning, joining and superficially analyzing data. Sometimes it’s just to briefly check if there is a story hiding in the data. But sometimes, the steps you will learn in this tutorial are just the first part of a bigger, deeper data analysis.

JavaScript: Interactive Leaflet.js map with search bar

JavaScript: Interactive Leaflet.js map with search bar

In our previous posts about Leaflet.js, we coded an interactive marker map and learned how to update our data with a google spreadsheet. In this short tutorial, we will show you how to make your map searchable so users can find a specific marker.

We’ll use the code of the already finished map that’s hosted on our GitHub page so we don’t have to start at the very beginning again. Simply download the directory to your computer and open it in your text editor. If you’ve never created an interactive map with Leaflet.js before, please have a look at the two tutorials mentioned above before proceeding with this one.

JavaScript: Loop through a Google Spreadsheet

JavaScript: Loop through a Google Spreadsheet

In our post Coding marker maps with Leaflet.js we filled a map visualization with information given directly in the code, but we also learned how to loop through data in a JSON file to create markers and pop-ups. This time, we will loop through an public Google Spreadsheet. We will do this with the same example map as in the previous post. But this little trick will not only help you with building marker maps but can come in handy for any JavaScript application you need to automate the handling of your data.

So let’s start right away with setting up a data table with Google Spreadsheets.

JavaScript: Coding marker maps with Leaflet.js

JavaScript: Coding marker maps with Leaflet.js

Showing locations on a map can be pretty cool to provide some context for your story or to give your reader an overview of where the story takes place. A good way to build a simple, yet responsive and professional looking map is to use the JavaScript library Leaflet.

In an earlier post, “Your first choropleth map“, we used Leaflet as well, but coded the map using the Leaflet R package, which works like a wrapper to translate the more common Leaflet functions into R syntax. It’s very useful if you’re more used to R syntax and don’t want to learn JavaScript anytime soon. But using the original JS library and coding the map with JavaScript will give us way more freedom when customizing the map, which is why we’ll try it that way today.

As an example, let’s start with a map of the locations of some data journalism newsrooms in the German speaking area. As always you can find all the code of this tutorial on our GitHub page.

This is what the finished map will look like:

R: Your first web application with shiny

R: Your first web application with shiny

Data driven journalism doesn’t necessarily involve user interaction. The analysis and its results may be enough to write a dashing article without ever mentioning a number. But let’s face it: We love to interact with data visualizations! To build those, some basic knowledge of JavaScript and HTML is usually required.
What? Your only coding skills are a bit of R? No problemo! What if I told you there was a way to interactively show users your most interesting R-results in a fancy web app?

Shiny to the rescue

Shiny is a highly customizable web application framework that turns your analysis into an interactive web app. No HTML, no JavaScript, no CSS required — although you can use it to expand your app. Also, the layout is responsive (although it’s not perfect for every phone).

In this tutorial, we will learn step by step how to code the shiny app on Germany’s air pollutants emissions that you can see below.


R: Tidy Data

R: Tidy Data

Unfortunately, data comes in all shapes and sizes. Especially when analyzing data from authorities. You’ll have to be able to deal with pdfs, fused table cells and frequent changes in terms and spelling.

When I analyzed the swiss arms export data as an intern at SRF Data, we had to work with scanned copies of data sheets that weren’t machine-readable, datasets with either french, german or french and german countrynames in the same column as well as fused cells and changing spelling of the categories.

R: plotting with the ggplot2 package

R: plotting with the ggplot2 package

While crunching numbers, a visual analysis of your data may help you get an overview of your data or compare filtered information at a glance. Aside from the built-in graphics package, R has many additional packages to help you with that.
We want to focus on ggplot2 by Hadley Wickham, which is a very nice and quite popular graphics package.

Ggplot2 is based on a kind of statistical philosophy from a book I really recommend reading. In The Grammar of Graphics, author Leland Wilkinson goes deep into the structure of quantitative plotting. As a product, he establishes a rulebook for building charts the right way. Hadley Wickham built ggplot2 to follow these aesthetics and principles.