One Article


Scraping [ˈskreɪpɪŋ]: Automatically extracting data from the Internet. This can be useful when, for example, your government releases a dataset but does not make it easy to download.

Scraping usually involves two steps: first you download a web page, and then you process its source code to extract the data. There are tools to help you, but a little programming experience usually comes in handy. A good library for downloading a web page is Python’s requests package. To process the content further, a basic knowledge of web programming is very helpful. BeautifulSoup (bs4) is a great Python package for processing HTML pages. Often the data already comes in JSON format and can thus be extracted directly.
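
The two steps can be sketched roughly as follows. This is a minimal illustration, not a complete scraper: the function names (download_page, extract_table_rows, parse_json_payload), the sample HTML, and the idea that the data sits in a table are all assumptions made for the example. It requires the requests and beautifulsoup4 packages to be installed.

```python
import json

import requests
from bs4 import BeautifulSoup


def download_page(url: str) -> str:
    """Step 1: download the raw source of a web page."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.text


def extract_table_rows(html: str) -> list[list[str]]:
    """Step 2: collect the cell text of every table row in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["td", "th"])]
        if cells:
            rows.append(cells)
    return rows


def parse_json_payload(text: str):
    """Shortcut: if the server already returns JSON, no HTML parsing is needed."""
    return json.loads(text)


# Hypothetical sample standing in for a downloaded page, so the sketch
# runs without network access.
SAMPLE_HTML = """
<table>
  <tr><th>Region</th><th>Population</th></tr>
  <tr><td>North</td><td>1200</td></tr>
  <tr><td>South</td><td>3400</td></tr>
</table>
"""

rows = extract_table_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

For a real page you would pass the result of download_page(url) to extract_table_rows instead of the sample string. Which tags to search for depends entirely on how the page in question is built, so inspecting its source in the browser first is usually worthwhile.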