A web scraping toolkit for journalists

A web scraping toolkit for journalists

Web scraping is one of the most useful and least understood methods for journalists to gather data. It’s the thing that helps you when, in your online research, you come across information that qualifies as data, but does not have a handy “Download” button. Here’s your guide on how to get started — without any coding necessary.

Note: This tutorial was first published in our data-driven Advent calendar 2018 behind door number 13. You can check it out here.

From the data to the story: A typical ddj workflow in R

From the data to the story: A typical ddj workflow in R

R is getting more and more popular among Data Journalists worldwide, as Timo Grossenbacher from SRF Data pointed out recently in a talk at useR!2017 conference in Brussels. Working as a data trainee at Berliner Morgenpost’s Interactive Team, I can confirm that R indeed played an important role in many of our lately published projects, for example when we identified the strongholds of german parties. While we also use the software for more complex statistics from time to time, something that R helps us with on a near-daily basis is the act of cleaning, joining and superficially analyzing data. Sometimes it’s just to briefly check if there is a story hiding in the data. But sometimes, the steps you will learn in this tutorial are just the first part of a bigger, deeper data analysis.