Extracting geodata from OpenStreetMap with Osmfilter
When working on map-related projects, I often need specific geographical data from OpenStreetMap (OSM) for a certain area. For a recent project, I needed all the roads in Germany in a format I could work with in a GIS program. So how do I get the data? With a useful little program called Osmfilter.
There are various sites that provide OSM datasets for certain areas. However, these datasets include ALL the geodata OSM provides: houses, streets, rivers and so on, basically everything you see on a normal map. In addition, the file is in a format that is rather inconvenient to use directly. One fairly complicated way to proceed would be to load the dataset into a database and query what you need. This can be time-consuming and, depending on the power of your PC, impossible.
If you only need to get info about small areas, I recommend using the site overpass turbo. For larger datasets, there is Osmfilter, which easily lets you filter OSM geodata. With a little help from two other free tools, you will have a dataset you can work with in no time.
What you’ll need for this tutorial
- the following tools:
- osmfilter -> wiki.openstreetmap.org/wiki/Osmfilter
- osmconvert -> wiki.openstreetmap.org/wiki/Osmconvert
- ogr2ogr -> www.gdal.org/index.html
- some very basic knowledge of the command line / shell / bash.
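Before starting, it can help to check that the three tools are actually on your PATH. The following is a minimal sketch, not part of any of the tools; the helper name missing_tools is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every given tool
# that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing="$(missing_tools osmfilter osmconvert ogr2ogr)"
if [ -n "$missing" ]; then
    echo "Not installed or not on PATH:"
    echo "$missing"
else
    echo "All three tools are available."
fi
```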
Get an OSM Dataset
To start out, we need to download an OSM dataset, which is saved in a format called pbf (a compressed format for large OSM datasets). For this tutorial, I will use a dataset provided by geofabrik, but there are other sources, too. Let’s download the pbf file for Liechtenstein and save it in a folder of your choice. Once you have downloaded the data, open the shell and go to the folder where your new dataset is stored with the command
cd path/to/folder
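If you prefer to do the download from the shell as well, a sketch like the one below could work. The URL is an assumption based on how the geofabrik download server was laid out when I last looked, so check their download page if it fails. The helper name fetch_pbf is made up for this example, and it relies on curl being installed.

```shell
# Assumed download URL; check the geofabrik download page if it has moved.
PBF_URL="https://download.geofabrik.de/europe/liechtenstein-latest.osm.pbf"
PBF_FILE="${PBF_URL##*/}"   # file name part of the URL

# Hypothetical helper: download the extract into the current folder
# unless a file with that name is already there.
fetch_pbf() {
    if [ -f "$PBF_FILE" ]; then
        echo "$PBF_FILE already exists, skipping download"
    else
        curl --fail --location --output "$PBF_FILE" "$PBF_URL"
    fi
}
```

After pasting this into your shell, run fetch_pbf inside the folder you just changed to.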
Prepare your data
Osmfilter only supports the file formats osm and o5m. For fast data processing, using o5m is recommended. You can convert your pbf file to o5m with osmconvert in your shell like this:
osmconvert liechtenstein-latest.osm.pbf -o=liechtenstein.o5m
This translates to: Use the program osmconvert to convert the file called liechtenstein-latest.osm.pbf to an o5m file called liechtenstein.o5m. The -o stands for output.
You will now have the same dataset in the o5m format in the same folder.
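If you have several pbf files in the folder, the same osmconvert call can be wrapped in a small loop. This is only a sketch: the helper names o5m_name and convert_all are made up, and the "-latest" suffix handling assumes geofabrik-style file names.

```shell
# Hypothetical helper: derive the .o5m name from a .osm.pbf name, e.g.
# liechtenstein-latest.osm.pbf -> liechtenstein.o5m
o5m_name() {
    base="${1##*/}"            # strip any leading directory
    base="${base%.osm.pbf}"    # strip the extension
    base="${base%-latest}"     # strip the "-latest" suffix, if present
    printf '%s.o5m\n' "$base"
}

# Hypothetical helper: convert every .osm.pbf file in the current folder.
convert_all() {
    for pbf in *.osm.pbf; do
        [ -e "$pbf" ] || continue   # the pattern matched no files
        osmconvert "$pbf" -o="$(o5m_name "$pbf")"
    done
}
```

Running convert_all in the tutorial folder should give the same liechtenstein.o5m as the single command above.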
Filter your data
Now, you can filter your geodata using the shell that should still be open. The osmfilter command logic is built up as follows:
osmfilter || input data || some filter commands || output data
Let’s look at the part about the filter commands. Here is where you can tell the program which parts of the dataset you need by writing --keep=DATA_I_WANT. Here is an example which creates a file for you called buildings.osm that contains all the buildings (and only the buildings) from the Liechtenstein dataset:
osmfilter liechtenstein.o5m --keep="building=" -o=buildings.osm
To find out how features are stored and classified in OSM, you can go to this site and look up the feature you want. Tip: You can head over to overpass turbo to test your query on a small area of your choice by using the wizard.
Of course you can do much more with Osmfilter. You can specify which building type you want. For example, you might only want to look at schools:
--keep="building=school"
You can query multiple features by chaining them. For example if you want all the schools and universities:
--keep="building=school =university"
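The --keep fragments are not full commands on their own, so here is a sketch of how one could be wrapped into a complete osmfilter call. The function name keep_features and the output file name in the usage line are made up for this example.

```shell
# Hypothetical helper: write all features matching a --keep expression
# to an .osm file.
# Arguments: input .o5m file, keep expression, output .osm file.
keep_features() {
    osmfilter "$1" --keep="$2" -o="$3"
}
```

For the schools and universities example, the call would be: keep_features liechtenstein.o5m "building=school =university" schools_universities.osm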
You can exclude things by adding the --drop option. For example, if you don’t want buildings that are warehouses but want to keep everything else:
--drop="building=warehouse"
You can reduce the final file size by dropping extra data on the authors and the version by adding:
--drop-author --drop-version
You can, of course, combine these options. Here is a query that gives you all the highway types that cars can use in Liechtenstein, without the ones where motor vehicles are not allowed:
osmfilter liechtenstein.o5m --keep="highway=motorway =motorway_link =trunk =trunk_link =primary =primary_link =secondary =secondary_link =tertiary =tertiary_link =living_street =residential =residential_link =unclassified" --drop="motor_vehicle=no" --drop-author --drop-version -o=streets_liechtenstein.osm
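A long --keep expression like this is hard to edit on one line. One way to keep it manageable is to build it from a list in a small script. This is a sketch only: the names ROAD_CLASSES, highway_filter and filter_streets are made up, and the values are simply the ones from the command above.

```shell
# Road classes from the command above, grouped so the list is easy to edit.
ROAD_CLASSES="
motorway motorway_link
trunk trunk_link
primary primary_link
secondary secondary_link
tertiary tertiary_link
living_street residential residential_link unclassified
"

# Hypothetical helper: turn the list into an osmfilter expression of the
# form "highway=motorway =motorway_link ...".
highway_filter() {
    filter="highway"
    for class in $ROAD_CLASSES; do   # unquoted on purpose: split on whitespace
        filter="$filter=$class "
    done
    printf '%s\n' "${filter% }"      # drop the trailing space
}

# Hypothetical helper: run the street query.
# Arguments: input .o5m file, output .osm file.
filter_streets() {
    osmfilter "$1" --keep="$(highway_filter)" --drop="motor_vehicle=no" \
        --drop-author --drop-version -o="$2"
}
```

Calling filter_streets liechtenstein.o5m streets_liechtenstein.osm should then be equivalent to the one-line command.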
You can find more on the filtering options, with some examples, on the Osmfilter site.
Convert to a useful format
As a final step, you can convert your osm file to the most widely supported geodata format called a Shapefile (the GIS program QGIS can handle osm files, but it sometimes doesn’t work well with large datasets). You can convert your osm file to a Shapefile with the program ogr2ogr like this:
ogr2ogr -skipfailures -f "ESRI Shapefile" streets_shapefiles streets_liechtenstein.osm
The above command converts the file streets_liechtenstein.osm to the shapefile format and tells it to store it in a folder called streets_shapefiles. In the newly created folder you will find four different shapefiles (one for every geometry type). In case of the streets, we are only interested in the file lines.shp. You can open this file in a GIS program of your choice.
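If you only care about the lines, ogr2ogr also accepts layer names after the source file, so the other geometry types can be skipped. The sketch below assumes the layer is called lines, as the file name lines.shp suggests. The helper names list_layers and lines_to_shapefile are made up for this example.

```shell
# Hypothetical helper: list the layers that GDAL sees in an .osm file.
list_layers() {
    ogrinfo -ro -q "$1"
}

# Hypothetical helper: convert only the "lines" layer to a Shapefile.
# Arguments: input .osm file, output folder.
lines_to_shapefile() {
    ogr2ogr -skipfailures -f "ESRI Shapefile" "$2" "$1" lines
}
```

For the streets, that would be lines_to_shapefile streets_liechtenstein.osm streets_shapefiles, with list_layers streets_liechtenstein.osm to double-check the layer names first.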
Ogr2ogr also allows you to convert your newly created Shapefiles to other geodata formats that you may need, such as GeoJSON, CSV and many more. Have a look at the ogr2ogr website for more info. If you’re tired of using the shell, try the online tool Mapshaper, which allows you to convert your Shapefile to formats such as GeoJSON, SVG and CSV. The file size for Mapshaper is limited, but I have tried it with files bigger than 1 GB.
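As a sketch of such a conversion with ogr2ogr, the helper below picks the output driver from the file extension. The function name convert_shapefile and the file names in the usage line are made up. For CSV I assume you want the geometry written as WKT text, which is what the layer creation option GEOMETRY=AS_WKT asks for.

```shell
# Hypothetical helper: convert a Shapefile to GeoJSON or CSV,
# depending on the extension of the output file.
# Arguments: input .shp file, output file.
convert_shapefile() {
    case "$2" in
        *.geojson) ogr2ogr -f "GeoJSON" "$2" "$1" ;;
        *.csv)     ogr2ogr -f "CSV" "$2" "$1" -lco GEOMETRY=AS_WKT ;;
        *)         echo "unsupported output format: $2" >&2; return 1 ;;
    esac
}
```

For example: convert_shapefile streets_shapefiles/lines.shp streets.geojson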
Have fun filtering OSM and happy mapping!