2026-09-02 –, Dahlia2
Reproducibility in geospatial science is hindered by opaque data, proprietary software, and complex machine learning workflows. This talk highlights challenges in deep learning reproducibility and presents practical strategies, tools, and documentation practices to create transparent, repeatable experiments using open‑source technologies.
Computational reproducibility is the ability of different researchers to obtain results consistent with the original work using the same input data and methods. For the experiments to be repeated, the input data and methods must be transparent. However, transparency and reproducibility remain a challenge across several research domains, including geoscience, which hinders trust in findings.
In geospatial science, numerous methods, such as statistical analysis and machine learning, are used to analyse geospatial and remote sensing data and extract valuable insight from them. This analysis can be performed using GIS software or custom code. While sharing code and datasets may appear sufficient for reproducing the research, researchers may still encounter compatibility issues when executing the code, e.g., incorrect file paths, missing libraries, or differences in the computational environment. In addition to these technical barriers, reproducibility is hindered by the unavailability of data and code, the use of proprietary software, and the time required to reproduce others’ work. Hence, open-source technologies are crucial for implementing reproducible workflows.
Recently, machine learning, especially deep learning (DL), has been applied in numerous geoscience studies. However, ensuring reproducibility is more challenging in this context because these workflows are complex and involve many components and configurations, some of which can exhibit non-deterministic behaviour.
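One common source of the non-deterministic behaviour mentioned above is unseeded random number generation. The following is a minimal sketch, not the workflow presented in the talk: the helper name `set_seed` and the default seed are hypothetical, and NumPy and PyTorch are treated as optional so the function still runs where they are not installed. Even with these settings, some GPU operations may remain non-deterministic.

```python
import random


def set_seed(seed: int = 42) -> dict:
    """Seed the random number generators that are available.

    Returns a dict reporting which libraries were seeded.
    `set_seed` is an illustrative helper, not part of any library.
    """
    seeded = {"python": True, "numpy": False, "torch": False}

    # Python's built-in generator (used e.g. for shuffling file lists).
    random.seed(seed)

    try:
        import numpy as np

        # Legacy global NumPy generator, still used by many libraries.
        np.random.seed(seed)
        seeded["numpy"] = True
    except ImportError:
        pass

    try:
        import torch

        # Seeds the CPU generator and the generators of all CUDA devices.
        torch.manual_seed(seed)
        # Prefer deterministic kernels; warn_only=True emits a warning
        # instead of raising when an operation has no deterministic variant.
        torch.use_deterministic_algorithms(True, warn_only=True)
        # Disable cuDNN autotuning, which can select different kernels per run.
        torch.backends.cudnn.benchmark = False
        seeded["torch"] = True
    except ImportError:
        pass

    return seeded
```

Calling `set_seed(...)` once at the top of a training script or notebook, and recording the seed value alongside the results, makes it possible to repeat a run under the same random initialisation and data shuffling.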
Some of the main challenges in machine learning that affect reproducibility are model uncertainty and the training method. In this talk, we discuss how to mitigate these challenges and explore best practices for documenting model architecture and computational environments. We describe our experience using open-source technologies and best practices for reproducibility while developing a deep learning workflow for a case study.
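As an illustration of documenting the computational environment, the sketch below records the Python version, the platform, and the installed versions of a few packages using only the standard library. The function name `environment_snapshot` and the default package list are assumptions made for this example, not the tooling used in the case study.

```python
import json
import platform
from importlib import metadata


def environment_snapshot(packages=("numpy", "torch", "notebook")) -> dict:
    """Collect basic facts about the current computational environment.

    `packages` is an illustrative default; list whatever the workflow imports.
    """
    snapshot = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": {},
    }
    for name in packages:
        try:
            # Version of the installed distribution, as reported by its metadata.
            snapshot["packages"][name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages explicitly instead of failing.
            snapshot["packages"][name] = None
    return snapshot


if __name__ == "__main__":
    # Print the snapshot as JSON so it can be stored next to the results.
    print(json.dumps(environment_snapshot(), indent=2))
```

Storing such a snapshot with each experiment complements, rather than replaces, a pinned dependency file or container image.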
Python
PyTorch
Jupyter Notebook
Presenting author: Rosa Aguilar
Assistant professor at University of Twente, The Netherlands. Formal Education: PhD in Urban Planning – Computer Science background.
Rosa is an Assistant Professor at the University of Twente, where she coordinates the GeoAI module of the UNIGIS Master's programme. She is an active member of the QGIS community and brings a genuine enthusiasm for working with communities in participatory contexts. Her research focuses on developing machine learning models that support evidence-based decisions. Beyond her academic work, she is a dedicated advocate for women in STEM.