11-20, 13:30–13:55 (Pacific/Auckland), WG126
This presentation covers the development of a national data cube for Estonia that integrates remote sensing data using open-source tools. The cube provides analysis-ready data for biodiversity and carbon research and addresses the storage and processing hurdles that come with large Earth Observation datasets. User-friendly tools and cloud computing improve data access, supporting informed decision-making for sustainable development.
Introduction
Addressing global environmental challenges like land use and climate change requires timely, accurate information. Earth Observation (EO) data, from satellites and UAVs, is essential for monitoring these dynamics. Thanks to open data policies and advancements in software and cloud computing, EO data enhances environmental management and policy assessment, contributing to sustainable development. However, there are technical challenges, including data storage and analysis, and the need for computational architectures that handle large datasets.
Traditional data cubes often lack the readiness needed for advanced AI and machine learning techniques, which require structured, rich datasets. User-friendly platforms with intuitive access and customizable tools are crucial for researchers and policymakers.
Our project aims to create a comprehensive data cube for Estonia, using remote sensing data, other geospatial data, and open-source tools to advance research on biodiversity and carbon dynamics. The fusion of LiDAR, radar, and passive remote sensing offers untapped potential for modeling, and multi-temporal datasets can be used to predict vegetation and environmental variables effectively.
Data and Methods
We incorporated data from Sentinel-1, Sentinel-2, Landsat, and high-resolution airborne LiDAR. We used Google Earth Engine and Python for data pre-processing. Additionally, digital elevation models and the Estonian soil map were used to prepare the data cube layers. The study area was divided into manageable tiles using a spatial grid, and each tile was written as a Cloud Optimized GeoTIFF (COG) at 10 m resolution to facilitate efficient processing and downloading.
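The tiling step can be illustrated with a minimal sketch. It assumes a projected coordinate system in metres and a hypothetical tile size of 10 km; neither the actual grid definition nor the tile size is stated here, and the `Tile` and `make_tiles` names are illustrative only. The sketch computes tile extents; writing the COGs themselves would be done with a raster library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    col: int
    row: int
    minx: float
    miny: float
    maxx: float
    maxy: float
    size_px: int  # number of pixels along one tile side


def make_tiles(minx, miny, maxx, maxy, tile_size_m=10_000.0, pixel_size_m=10.0):
    """Cover a projected bounding box with square tiles aligned to a regular grid.

    tile_size_m is a hypothetical value chosen for illustration.
    """
    if tile_size_m % pixel_size_m != 0:
        raise ValueError("tile size must be a whole multiple of the pixel size")
    # Snap the grid origin down to a multiple of the tile size so that
    # tiles from different runs share the same alignment.
    x0 = math.floor(minx / tile_size_m) * tile_size_m
    y0 = math.floor(miny / tile_size_m) * tile_size_m
    ncols = math.ceil((maxx - x0) / tile_size_m)
    nrows = math.ceil((maxy - y0) / tile_size_m)
    size_px = int(tile_size_m / pixel_size_m)
    tiles = []
    for row in range(nrows):
        for col in range(ncols):
            tx = x0 + col * tile_size_m
            ty = y0 + row * tile_size_m
            tiles.append(Tile(col, row, tx, ty, tx + tile_size_m, ty + tile_size_m, size_px))
    return tiles


if __name__ == "__main__":
    for tile in make_tiles(0, 0, 25_000, 15_000):
        print(tile)
```

Aligning tile edges to whole multiples of the pixel size keeps neighbouring tiles on a shared pixel grid, so layers from different sensors can be stacked without resampling at tile borders.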
LiDAR data allowed us to calculate biodiversity-relevant indices, including ecosystem height, cover, and structural complexity. These were processed with tools like PDAL and laspy for precise classification and filtering.
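As a rough illustration of such indices, the sketch below derives three per-cell metrics from a list of return heights above ground. The specific definitions (95th-percentile height, fraction of returns above 2 m as cover, Shannon entropy of the vertical profile as structural complexity) are common choices assumed here for illustration, not necessarily the ones used in the project. In practice the points would be read, classified, and height-normalised with PDAL or laspy first; here the heights are a plain Python list.

```python
import math


def percentile(sorted_vals, q):
    """Linearly interpolated percentile of an ascending list, q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("no values")
    pos = (len(sorted_vals) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def structure_metrics(heights, cover_threshold=2.0, bin_size=1.0):
    """Summarise vegetation structure from heights above ground (metres).

    cover_threshold and bin_size are assumed values for this sketch.
    """
    hs = sorted(h for h in heights if h >= 0)
    if not hs:
        raise ValueError("no valid returns")
    n = len(hs)
    # Height: 95th percentile is less sensitive to outliers than the maximum.
    height_p95 = percentile(hs, 95)
    # Cover: share of returns above the threshold height.
    cover = sum(1 for h in hs if h > cover_threshold) / n
    # Complexity: Shannon entropy of returns per vertical height bin.
    counts = {}
    for h in hs:
        b = int(h // bin_size)
        counts[b] = counts.get(b, 0) + 1
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return {"height_p95": height_p95, "cover": cover, "complexity": entropy}


if __name__ == "__main__":
    print(structure_metrics([0.0, 0.0, 5.0, 15.0]))
```

Each metric would be computed per 10 m cell and written as one layer of the cube.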
The data cube runs on a high-performance cloud platform, using S3 storage for COGs and libraries such as rasterio to gather metadata. This metadata is integrated into a STAC-compatible web service, enabling seamless access through platforms like QGIS and Python for efficient querying and processing.
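To show what the catalogued metadata might look like, the following sketch builds a minimal STAC Item for one COG. It assumes the bounding box is already in longitude/latitude (WGS 84) and is passed in directly; in a real pipeline those values would be read from the file, for example with rasterio. The function name, the item id, and the `s3://example-bucket/...` path are hypothetical.

```python
import json
from datetime import datetime, timezone

COG_MEDIA_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"


def make_stac_item(item_id, href, bbox, acquired):
    """Build a minimal STAC 1.0.0 Item dict for a single COG asset.

    bbox is (min_lon, min_lat, max_lon, max_lat) in WGS 84.
    """
    if acquired.tzinfo is None:
        raise ValueError("acquired must be timezone-aware")
    minx, miny, maxx, maxy = bbox
    # Closed polygon ring: the first and last positions are identical.
    ring = [[minx, miny], [maxx, miny], [maxx, maxy], [minx, maxy], [minx, miny]]
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "bbox": [minx, miny, maxx, maxy],
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {
            "datetime": acquired.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        },
        "links": [],
        "assets": {
            "data": {"href": href, "type": COG_MEDIA_TYPE, "roles": ["data"]}
        },
    }


if __name__ == "__main__":
    item = make_stac_item(
        "tile_0_0",                              # hypothetical item id
        "s3://example-bucket/tile_0_0.tif",      # hypothetical asset location
        (24.0, 58.0, 24.2, 58.1),
        datetime(2023, 7, 1, tzinfo=timezone.utc),
    )
    print(json.dumps(item, indent=2))
```

Because the Item is plain GeoJSON with a link to the asset, STAC-aware clients can search by area and time and then read only the needed parts of the COG over HTTP.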
Data Cube Access
Our data cube portal (https://geokuup.ee/estonia), developed with the Phoenix framework, uses MapLibre for data visualization. This setup supports quick visualizations and queries and organizes datasets into collections for user convenience. Users can also create custom collections, tailoring datasets to specific research needs, which makes the system more flexible.
By adopting best practices in geospatial data management, we leveraged open-source tools like GeoServer and pygeoapi, along with the Pangeo ecosystem, to streamline processing. The Phoenix framework offers a robust and efficient solution for managing concurrent users, ensuring stability and performance.
Outcomes and Future Work
The data cube provides high-resolution spatial data for academic and governmental purposes, with a strong focus on biodiversity and carbon research. It offers a scalable solution that can be extended to other research domains by incorporating additional data layers.
Future work will focus on processing data into multiple resolutions and expanding the range of datasets and workflows to enhance data retrieval and analysis. This will further support informed decision-making and sustainable development initiatives, empowering researchers and policymakers with timely environmental information.
Alex is an Associate Professor in Geoinformatics and a Distributed Spatial Systems Researcher with many years of experience in open-source geospatial data management and web- and cloud-based geoprocessing with a particular focus on land use, soils, hydrology, hydrogeology and water quality data. His interests include Discrete Global Grid Systems (DGGS), OGC standards and web-services for environmental and geo-scientific data sharing, modelling workflows and interactive geo-scientific visualisation.
Alex completed a Marie Skłodowska-Curie Actions (MSCA) Individual Fellowship with our Landscape Geoinformatics working group on improving standardised data preparation, parameterization and parallelisation for hydrological and water quality modelling across scales. He has now started a 5-year project on spatial modelling of soil properties using machine learning.
I'm a geospatial scientist specialising in big geospatial data management, data quality, and environmental modelling. Over the last five years, my research has focused on machine learning-driven predictive spatial modelling. I'm passionate about open-source tools and open data, and an avid participant in the 30DayMapChallenge. I serve as a Professor in Geoinformatics and lead the Landscape Geoinformatics Lab at the University of Tartu.