FOSS4G 2023

OpenStreetMap as an input source for producing governmental datasets: the case of the Italian Military Geographic Institute
06-29, 16:00–16:30 (Europe/Tirane), UBT E / N209 - Floor 3

The collection, curation and publication of geospatial information was for centuries the sole prerogative of public sector organisations, and such data has traditionally been considered the reference source for datasets and cartographic outputs. However, new geospatial data sources (e.g. from the private sector and citizen-generated [1]) have emerged that are challenging the role of the public sector [2]. In response, governments are exploring new ways of managing the creation and update of their geospatial datasets [3].
Datasets of high relevance are increasingly produced by both private companies and crowdsourced initiatives. For example, in 2022 Microsoft released Microsoft Building Footprints, a dataset of around 1 billion building footprints extracted from Bing Maps imagery from 2014 to 2022. More recently, in December 2022, Amazon Web Services (AWS), Meta, Microsoft, and TomTom founded the Overture Maps Foundation (https://www.linuxfoundation.org/press/linux-foundation-announces-overture-maps-foundation-to-build-interoperable-open-map-data), a joint initiative in partnership with the Linux Foundation that aims to curate and release worldwide map data aggregated from multiple input sources, including civic organisations and open data sources, especially OpenStreetMap.
These initiatives aim to improve the coverage of existing governmental geospatial information through the release of open data and a strong dependency on OpenStreetMap. In particular, the Overture initiative has the explicit goal to add quality checks, data integration, and alignment of schemas to OSM data.
Recently, the Italian Military Geographic Institute (IGM, one of the governmental mapping agencies in Italy) has released a multi-layer dataset called “Database di Sintesi Nazionale” (DBSN, https://www.igmi.org/en/dbsn-database-di-sintesi-nazionale). The DBSN is intended to include geospatial information relevant to analysis and representation at the national level, with the additional purpose of deriving maps at the scale 1:25,000 through automatic procedures. The DBSN builds on various information sources, with regional geotopographic data as the primary source and products from other national public bodies (e.g. cadastral maps) as additional sources. The source of each feature is recorded in a dedicated attribute field, as a list of codes referencing the various sources. Among the external sources integrated into the DBSN, OpenStreetMap was explicitly considered and used.
One of the elements of novelty, at least in the Italian context, is the release of the DBSN under the ODbL licence (https://opendatacommons.org/licenses/odbl), a consequence of the fact that the inclusion of OSM data requires derivative products to be released under the same licence.
Currently, the DBSN includes data covering only 12 out of the 20 Italian regions (Abruzzo, Basilicata, Calabria, Campania, Lazio, Marche, Molise, Puglia, Sardegna, Sicilia, Toscana, Umbria). The remaining ones will be released in the near future.
The datasets were downloaded from the official IGM website in January 2023.
The DBSN schema is a subset of the specifications defined in the “Catalogue of Spatial Data - Content Specifications for Geotopographic Databases” (Decree of 10 November 2011) and is composed of 10 layers, 29 themes and 91 classes. We compared it with the OpenStreetMap specifications (based on the community-based tagging scheme at https://wiki.openstreetmap.org/wiki/Map_Features) and selected two main themes (buildings and streets).
The analysis was performed through a set of Python scripts available under the open source WTFPL licence at https://github.com/napo/dbsnosmcompare.
First, we analysed where OSM data was used as the primary source of information for buildings and streets in the IGM database. The percentage of buildings derived from OSM is minimal, ranging from 0.01% in Umbria to 1.3% in Marche; for streets, the differences between regions are much larger, ranging from almost 0% in Abruzzo and Calabria to 94% in Umbria.
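As an illustration of this first step, the share of features attributed to OSM can be computed from the per-feature source attribute. The following is only a minimal sketch and not the code in the repository mentioned above: the field name `source`, the code `"osm"`, the comma-separated encoding and the rule that the first listed code is the primary source are all assumptions made for the example, and the sample records are invented.

```python
# Hypothetical code identifying OSM in the source attribute; the actual
# DBSN code list is not reproduced here.
OSM_CODE = "osm"


def osm_share(features, source_field="source", osm_code=OSM_CODE):
    """Return the percentage of features whose primary source is OSM.

    Assumption: the source attribute is a comma-separated list of codes
    and the first code is treated as the primary source.
    """
    total = 0
    from_osm = 0
    for feature in features:
        total += 1
        raw = feature.get(source_field) or ""
        codes = [code.strip() for code in raw.split(",") if code.strip()]
        if codes and codes[0] == osm_code:
            from_osm += 1
    # Avoid a division by zero when a layer is empty.
    return 100.0 * from_osm / total if total else 0.0


# Invented toy records, one dictionary per feature.
sample = [
    {"id": 1, "source": "regional"},
    {"id": 2, "source": "osm"},
    {"id": 3, "source": "regional,osm"},  # OSM listed, but not as primary
    {"id": 4, "source": "cadastre"},
]
print(osm_share(sample))
```

Running the function once per region and per theme would give the kind of per-region percentages reported above.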
Second, we calculated the area covered by buildings and the length of streets in both the IGM and OSM databases, to assess how complete OSM is compared to the official IGM dataset.
In the 12 regions, the area covered by buildings in OSM is on average about 55% of the corresponding area in IGM, while the length of streets in OSM is about 78% of the corresponding length in IGM. However, these numbers vary widely among regions, ranging between 32% in Calabria and 105% in Puglia for buildings, and between 46% in Calabria and 103% in Umbria for streets.
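The completeness figures above are ratios of totals: OSM building area (or street length) divided by the corresponding IGM total. The sketch below shows the building case under stated assumptions: footprints are given as simple polygon rings in a projected coordinate system with units in metres, areas are computed with the shoelace formula, and the function names and toy squares are hypothetical rather than taken from the study.

```python
def polygon_area(ring):
    """Planar area of a simple polygon ring using the shoelace formula.

    Assumes projected coordinates (e.g. metres); works whether or not the
    ring repeats its first vertex at the end.
    """
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def completeness_percent(osm_rings, reference_rings):
    """OSM total area as a percentage of the reference total area."""
    osm_total = sum(polygon_area(ring) for ring in osm_rings)
    reference_total = sum(polygon_area(ring) for ring in reference_rings)
    if reference_total == 0:
        raise ValueError("reference dataset has zero total area")
    return 100.0 * osm_total / reference_total


# Invented toy footprints: one 10 m x 10 m square in OSM,
# two such squares in the reference dataset.
osm_buildings = [[(0, 0), (10, 0), (10, 10), (0, 10)]]
reference_buildings = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(20, 0), (30, 0), (30, 10), (20, 10)],
]
print(completeness_percent(osm_buildings, reference_buildings))
```

Because the ratio compares totals and not matched features, it can exceed 100% when OSM contains more mapped area than the reference, as in the Puglia figure for buildings.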
These first results show that the main source of information in the DBSN (namely the official regional data) is highly variable across the 12 regions, which required the IGM to find additional data sources to fill the gaps. OSM plays a minor role in integrating buildings into the database, while it demonstrates a high potential for contributing street information.
Results also show that, even where OSM contributes only a small share, some elements that are present in OSM are still not included in the DBSN. This can be due to at least two reasons: (i) the current workflow for selecting elements in OSM (through tags) does not include some potentially relevant elements; (ii) the (ideally) daily update of OSM is able to bring new features into the database at a pace that cannot be matched by the IGM, or by governmental organisations in general.
This study highlights the importance that OpenStreetMap has achieved as a reference source of geospatial information for governmental bodies as well, providing evidence of its contribution to the national database of the IGM. It also paves the way for improving OpenStreetMap itself by importing data for various layers, benefiting from the release of the DBSN under the ODbL licence.

Dr. Marco Minghini obtained a BSc degree (2008), an MSc degree (2010) and a PhD degree (2014) in Environmental Engineering at Politecnico di Milano. From 2014 to 2018 he was a Postdoctoral Research Fellow at the GEOlab of Politecnico di Milano, Italy. Since 2018 he has worked as a Scientific Project Officer at the European Commission - Joint Research Centre (JRC) in Ispra, Italy, focusing on (geo)data interoperability, sharing and standardisation in support of European data spaces, and contributing to the operation and evolution of the INSPIRE infrastructure. He is an advocate of open source software and open data. He has been an OSGeo Charter Member since 2015, and is an active OpenStreetMap (OSM) contributor, a Voting Member of the Humanitarian OpenStreetMap Team, and Chair of the ISPRS ICWG "Openness in Geospatial Science and Remote Sensing". He is a regular participant and presenter at global and local FOSS4G events. He was the Secretary and organiser of FOSS4G Europe 2015.

Alessandro Sarretta is a researcher at the Italian National Research Council (CNR), since 2019 in the Research Institute for Geo-hydrological Protection, in Padua, previously at the Institute of Marine Sciences, in Venice. He deals, in marine/coastal and now geomorphological domains, with environmental data management and processing, Spatial Data Infrastructures, implementation of Decision Support Systems, standards and interoperability of research data. He is interested and involved in various fields of "openness", from open source software to open science, open knowledge and participatory mapping (OpenStreetMap).