FOSS4G 2023

The state of OpenStreetMap buildings: completeness assessment using remote sensing data
06-30, 14:30–15:00 (Europe/Tirane), UBT C / N110 - Second Floor

OpenStreetMap (OSM) is the largest crowd-sourced mapping effort to date, with an infrastructure network that is considered near-complete. Mapping started as it does on any crowd-sourced information platform: the community expanded OSM wherever there was a collective interest. Initial efforts were concentrated around universities or the hometowns of mappers. Events such as natural disasters can also trigger a major update. The recent earthquakes in Turkey and Syria led to a massive contribution by the Humanitarian OSM Team (HOT) of more than 1.7 million buildings in the region in less than a month after the event. This type of activity results in a map of non-uniform completeness, with some areas having all building footprints mapped while other areas remain incomplete or even untouched. Currently, with 550 million footprints, OSM contains between a quarter and half of the building footprints in the world, if we estimate that there are around 1-2 billion buildings in total.

A global view of the local completeness of buildings in OSM did not yet exist. Unlike other efforts that only look at a subset of OSM building data (Biljecki & Ang 2020; Orden et al., 2020; Zhou et al., 2020), we have used the Global Human Settlement Layer (GHSL) to estimate the completeness of the entire dataset. The remote sensing dataset is distributed on a grid of approximately 100 x 100 meter tiles. In each tile of the grid, the built area of GHSL is compared to the total area of OSM building footprints. The computed ratio is measured against a completeness threshold that is calibrated using areas that were manually assessed.
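The per-tile comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the `Tile` class, the function names, and the threshold value of 0.5 are assumptions made for the example. The abstract only states that the threshold is calibrated against manually assessed areas, without giving its value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tile:
    """One grid tile of roughly 100 x 100 m (hypothetical container)."""
    ghsl_built_m2: float      # built-up area reported by GHSL for the tile
    osm_footprint_m2: float   # summed area of OSM building footprints in the tile


def completeness_ratio(tile: Tile) -> Optional[float]:
    """Ratio of OSM footprint area to GHSL built area, or None if GHSL sees nothing built."""
    if tile.ghsl_built_m2 <= 0:
        return None
    return tile.osm_footprint_m2 / tile.ghsl_built_m2


def classify(tile: Tile, threshold: float = 0.5) -> str:
    """Label a tile by comparing its ratio to a threshold.

    The default threshold is a placeholder; a real value would come from
    calibration on manually assessed areas.
    """
    ratio = completeness_ratio(tile)
    if ratio is None:
        return "no built area"
    return "complete" if ratio >= threshold else "incomplete"


tiles = [
    Tile(ghsl_built_m2=4000.0, osm_footprint_m2=3000.0),
    Tile(ghsl_built_m2=4000.0, osm_footprint_m2=500.0),
    Tile(ghsl_built_m2=0.0, osm_footprint_m2=0.0),
]
labels = [classify(t) for t in tiles]
```

Because GHSL also counts non-building structures, a ratio well below 1 can still correspond to a fully mapped tile, which is one reason to calibrate the threshold instead of assuming a ratio of 1.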

Using information derived from remote sensing datasets can be problematic. GHSL does not only measure building footprints; it includes any human-built structure, including infrastructure and industrial areas. In addition, due to sub-optimal input data or failing algorithms, the dataset is not of the same quality as the crowd-sourced data in OSM in areas that are complete. Even with these limitations, a comprehensive global completeness assessment is created. The assessment should not be used as ground truth, but rather as a reflection on the OSM building dataset as it is and as a guideline for future priorities. Statistics on regional completeness can be created, and the quality of GHSL could be assessed in countries that are considered to be complete, such as France or the Netherlands.

I'm a geodata expert residing in Berlin, and over the last years I have worked mainly on BIM and GIS. In my current occupation as a geospatial data scientist at the GFZ German Research Centre for Geosciences, I look into exposure modeling for natural hazards. Exposure models typically get distributed only about once a decade, contain data aggregated by district, and cover mere regions or at best countries. In our research we are particularly interested in obtaining a building-by-building, up-to-date, global model. All projects are based on open data and are fully open-source.

I work as a senior scientist at the German Research Centre for Geosciences in Potsdam, Germany, in the section "Earthquake Hazard and Risk Dynamics". My current main project is the first dynamic exposure model on the building scale for different natural hazards. I also work on assessments of the recording quality of seismic networks, on earthquake forecasting for hazard modeling, and on testing frameworks for earthquake forecast models. I have been an active OSM mapper since 2009. I have combined my enthusiasm for OSM with my scientific work, which has resulted in the building exposure model being based on OSM data. During my scientific career, I have lived in Switzerland, the USA and Japan and worked in the respective earthquake institutes of these countries.
