Marco Minghini


Sessions

07-16
16:30
30min
Monitoring the FAIRness of geospatial data: Lessons learnt from the European Union
Marco Minghini

Over the last decade, the global landscape of data sharing has gone through profound changes that also applied to the geospatial domain, where traditional Spatial Data Infrastructures (SDIs) have progressively evolved into multifaceted data sharing ecosystems [1]. These ecosystems embraced fundamentally new elements in terms of: (big) data sources (e.g. from research, Earth Observation, Internet of Things devices, crowdsourcing initiatives, synthetic data from Artificial Intelligence/Machine Learning algorithms or Digital Twins); technology and infrastructures (e.g. cloud/edge/fog architectures, standards to encode and share data, AI/ML models); actors such as private companies and citizens becoming valuable providers of data and services; legislation (e.g. to open up data, foster data sharing and protect privacy); and business models and governance mechanisms [2,3].
Within this modern and dynamic context, it becomes increasingly important to set up targeted, fit-for-purpose, and efficient mechanisms to monitor the status and evolution of SDIs and extract the insights needed by policy and decision makers. In this work, we present the experience of monitoring the implementation of the European SDI established after the INSPIRE Directive, reflect on the lessons learnt from the process, and distill some recommendations for future policy-relevant scientific work.
In force since 2007, the INSPIRE Directive [4] set the legal basis to create an interoperable pan-European SDI based on the SDIs of the European Union (EU) Member States, with legal and technical requirements on the FAIRness (Findability, Accessibility, Interoperability, Reusability) of data: discoverability through metadata, accessibility through network services, and interoperability through common data models. The status of implementation for each Member State is assessed through an annual monitoring exercise, in which 19 indicators defined in a legal act [5] are calculated based on the metadata harvested each year from the Member States' catalogues. These are grouped in 5 categories focused on: i) availability of datasets, ii) conformity of metadata, iii) conformity of datasets, iv) accessibility of datasets, and v) conformity of network services. Since the entry into force of the legal act on INSPIRE monitoring [5] in 2019, the Joint Research Centre (JRC) of the European Commission has calculated those indicators for 6 consecutive years, thus producing a valuable time series from which to derive insights and lessons learnt. In the following, these are grouped in three categories: i) geospatial resources (data, metadata and services); ii) tools/technology; and iii) community/governance.
Regarding the resources shared by the Member States, INSPIRE implementation has advanced overall in the last 6 years. Although it remains heterogeneous across countries, aggregated statistics on the indicators show that more datasets have been made available and that these are increasingly interoperable and accessible. Nevertheless, several challenges remain: i) some indicators rely on self-declarations of conformity made by data providers, which have proven to be unreliable; ii) indicators are provider-centric, i.e. they describe the offer from data providers but not the actual adoption and reuse of the data by the public; iii) they do not analyse the quality of the datasets; and iv) while capturing the number of available datasets, they do not capture the presence or absence of specific datasets. Research should investigate ways to address these challenges without increasing the required effort, e.g. by leveraging new, AI-based solutions.
Tools and technology have played a crucial role in the INSPIRE monitoring process. The calculation of indicators has been fully automated, and the software stack has evolved over the years and currently consists entirely of open source applications: the INSPIRE Geoportal based on GeoNetwork, the INSPIRE Reference Validator based on the ETF, and a set of custom-made open source Python and SQL scripts. The open source nature of the components, with clear release processes allowing data providers to test their implementations in advance, ensures objectivity, transparency and reproducibility of results. The use of a reference validation tool also brings legal certainty to the process. Additionally, Large Language Models (in particular, the open-source Mixtral) have proven extremely useful in refining, testing, and validating results. Finally, the monitoring process has benefited from integrating newly developed standards such as OGC API - Features for data sharing and GeoPackage for data encoding, enabling Member States to streamline and modernise their infrastructures in a legally viable way.
Finally, the success of the INSPIRE monitoring exercise, an iterative process undergoing incremental changes over the years, relies on establishing a continuous dialogue and building trust with the relevant community. This is achieved through a clear governance structure, provision of open and scientifically sound guidance on indicator calculation, clear explanation of results, and targeted, country-specific feedback on potential improvement areas. The process has also contributed significantly to the evolution of the open source tools mentioned above.
The lessons learnt from this unique SDI initiative, with no equivalent in temporal and spatial extent, can inform and benefit similar initiatives, while also highlighting the need for specific scientific and technological advancements, including in the field of open source, to further reduce the distance between data and (data-driven) decision-making.

Academic track
PA01 (Quarticle)
07-17
12:00
30min
Unlocking the value of geospatial data: early insights from the EU Open Data Directive
Marco Minghini

Published in 2019, the Open Data Directive (Directive 2019/1024) introduced the notion of high-value datasets. These are datasets from EU public sector organisations that – thanks to their reuse, especially by small and medium-sized enterprises (SMEs) – hold the potential to generate significant socioeconomic or environmental benefits as well as innovative services. As such, the Directive required that high-value datasets be made available free of charge, under open licenses (CC BY 4.0 or any equivalent or less restrictive license), via Application Programming Interfaces (APIs) and, where relevant, as a bulk download.
While the Directive only listed the six categories of high-value datasets (Geospatial, Earth observation and environment, Meteorological, Statistics, Companies and company ownership, Mobility), the subsequent Implementing Regulation – in force since February 2023 and applicable from June 2024 – provided the actual list of datasets to be made available by EU Member States, together with the requirements for their publication, e.g. in terms of granularity, key attributes and metadata. Three out of the six categories of high-value datasets (Geospatial, Earth observation and environment, and Mobility) include datasets of a geospatial nature, which were purposely defined to match the datasets already in scope of the INSPIRE Directive (Directive 2007/2/EC). In force since 2007 and currently under revision within the GreenData4All initiative, the INSPIRE Directive established a pan-European spatial data infrastructure. INSPIRE was mostly focused on achieving data discoverability, accessibility and interoperability, but it did not provide requirements on data licensing. As a result, datasets are made available under several different reuse conditions, and only a portion of them are open data. The high-value datasets Implementing Regulation is therefore expected to add an open license requirement to INSPIRE data, thus opening up new opportunities for all stakeholders interested in reusing EU public sector data.
This talk will first describe the high-value datasets Regulation and provide an overview of the geospatial datasets in scope and the requirements on their provision. Afterwards, it will present the results of the first reporting exercise from EU Member States due in February 2025, compare the results with the indicators measuring the implementation of the INSPIRE Directive, and reflect on lessons learnt and emerging good practices. Finally, the talk will zoom in on specific and significant examples of EU public sector data that have been unlocked thanks to the Regulation and hold the potential to drive meaningful impact in various fields.

FOSS4G ‘Made in Europe’
EL11 (Geosolutions)