Reproducibility in geospatial research: a case study. FOSS4G 2026 general tracks

Reproducibility in geospatial research: a case study.
2026-09-01 16:00–16:30, Room2

Reproducibility in geospatial science is hindered by opaque data, proprietary software, and complex machine learning workflows. This talk highlights challenges in deep learning reproducibility and presents practical strategies, tools, and documentation practices to create transparent, repeatable experiments using open‑source technologies.

Computational reproducibility is the ability of different researchers to obtain results consistent with the original work using the same input data and methods. To repeat the experiments, the input data and methods must be transparent. However, transparency and reproducibility remain a challenge across several research domains, including geoscience, hindering trust in findings.
In geospatial science, numerous methods, such as statistical analysis and machine learning, are used to analyse geospatial and remote sensing data and extract valuable insight from them. This analysis can be performed with GIS software or through custom code. While sharing code and datasets may appear sufficient for reproducing research, researchers may still encounter compatibility issues when executing the code, e.g., incorrect file paths, missing libraries, or differences in the computational environment. In addition to these technical barriers, reproducibility is hindered by the unavailability of data and code, the use of proprietary software, and the time required to reproduce others’ work. Hence, open-source technologies are crucial for implementing reproducible workflows.
Recently, machine learning, especially deep learning (DL), has been applied in numerous geoscience studies. However, ensuring reproducibility is more challenging in this context because these workflows are complex and involve many components and configurations, some of which can exhibit non-deterministic behaviour.
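One common source of non-deterministic behaviour is unseeded random number generation (weight initialisation, data shuffling, augmentation). The following is a minimal sketch, not the workflow used in the case study, of fixing seeds before an experiment. The names `seed_everything` and `toy_shuffle` are hypothetical and introduced only for illustration; NumPy and PyTorch are seeded only if they happen to be installed.

```python
import os
import random


def seed_everything(seed: int) -> None:
    """Seed the random number generators available in this environment."""
    random.seed(seed)
    # Affects hash randomisation only in subprocesses started afterwards,
    # not in the interpreter that is already running.
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np  # optional dependency

        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch  # optional dependency

        torch.manual_seed(seed)
    except ImportError:
        pass


def toy_shuffle(n: int, seed: int) -> list:
    """Stand-in for a data-shuffling step: same seed, same order."""
    seed_everything(seed)
    order = list(range(n))
    random.shuffle(order)
    return order


if __name__ == "__main__":
    # Two runs with the same seed produce the same ordering.
    print(toy_shuffle(10, seed=42) == toy_shuffle(10, seed=42))
```

Seeding alone does not guarantee identical results on every hardware and library combination. PyTorch additionally offers `torch.use_deterministic_algorithms(True)` to request deterministic kernels where they exist, at a possible cost in speed.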
Some of the main challenges in machine learning that affect reproducibility are model uncertainty and the training method. In this talk, we discuss how to mitigate these challenges and explore best practices for documenting model architecture and computational environments. We describe our experience using open-source technologies and best practices for reproducibility while developing a deep learning workflow for a case study.
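As an illustration of documenting a computational environment, the sketch below records the Python version, the operating system, and the installed versions of selected packages using only the standard library. The function name `capture_environment` and the package list are assumptions made for this example, not part of the case study.

```python
import json
import platform
import sys
from importlib import metadata


def capture_environment(packages):
    """Return a dictionary describing the interpreter, OS, and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
    }


if __name__ == "__main__":
    # Hypothetical package list; adjust to the libraries an experiment uses.
    report = capture_environment(["torch", "numpy", "notebook"])
    print(json.dumps(report, indent=2))
```

Storing such a report next to the results of each run makes it easier to spot version differences when someone else cannot reproduce an outcome.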

Level of technical complexity: 2 - intermediate

Indicate what is (are) the open source project(s) essential in your talk:

Python
PyTorch
Jupyter Notebook

Rosa Aguilar

Presenting author: Rosa Aguilar
Assistant professor at University of Twente, The Netherlands. Formal Education: PhD in Urban Planning – Computer Science background.

Rosa is an Assistant Professor at the University of Twente, where she coordinates the GeoAI module of the UNIGIS Master's programme. She is an active member of the QGIS community and brings a genuine enthusiasm for working with communities in participatory contexts. Her research focuses on developing machine learning models that support evidence-based decisions — and beyond her academic work, she is a dedicated advocate for women in STEM.
