Christoph Friedrich
Studied B.Sc. Geoinformatics at ifgi / Uni Münster, then M.Sc. Informatics, also in Münster. Now working at the Earth Observation Research Cluster / Remote Sensing Department of the University of Würzburg. Got hooked on processing big geodata in the cloud through involvement in the openEO project as a student in Edzer Pebesma's group. Now building the datacube for the AgriSens project.
Sessions
Application-oriented research projects often involve diverse consortium members, from universities and research institutions to authorities and domain end users. This requires integrating very heterogeneous data sources, facilitating their combined processing, and presenting the results in adequate ways. The challenge is not only the technical realisation itself, but perhaps even more the design of solutions that cater to this wide user group, balancing feature-richness with ease of use. Data is abundant and processing power plentiful, but the results still need to go the last mile to the final user.
One such research project is “AgriSens DEMMIN 4.0”, which is advancing remote sensing for the digitalisation of agricultural crop production. The user group therefore includes programmers, domain scientists, and farmers alike. To date, agriculture does not yet widely take advantage of EO products. Therefore, the project not only addresses the creation of novel remote-sensing-based application techniques and their implementation, but places an equally strong emphasis on the development of an accompanying data integration and visualisation system. In this work, we describe how our architecture – consisting of several pieces of free and open source geo software – closes the gap between data providers and information consumers by facilitating the necessary analysis steps and combining them with adequate presentation to decision makers.
In our IT architecture, we address this problem with one central datacube, which acts as a cloud-based geospatial data holding and computation platform. It gathers a multitude of data, ranging from optical and radar raster imagery through weather data to in-situ field measurements, and pre-processes it into an interoperable, analysis-ready state. These resources can then be accessed through APIs for external use, or computations can be carried out directly on the datacube and the results immediately visualised with tools hosted on the same server.
The whole system is hosted at the Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ). Apart from utilising the computing resources available there, this also opens up synergies with existing projects: we can make use of the enormous amount of EO data already available in the LRZ’s “Data Science Storage” and the DLR’s “terrabyte” platform. These storage systems are mounted directly into our server, so the datacube can access petabytes of imagery without duplicating it, saving costs and emissions.
At the core of our infrastructure is an instance of the “Open Data Cube” (ODC) software package. Metadata is ingested into its PostgreSQL database and can be retrieved via the built-in web-based data discovery application “ODC Explorer” or via an API endpoint implementing the emerging STAC standard (which in turn can be accessed via the “STAC Browser” web application or any other compatible software). All raster data is provided in the Cloud-Optimised GeoTIFF (COG) format, allowing efficient access even from remote machines.
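For illustration, the following sketch shows how such a STAC endpoint and the COG assets behind it could be queried from Python with pystac-client and rasterio; the endpoint URL, collection id and asset key are placeholders, not the project's actual values.

```python
# Hypothetical sketch: discover scenes via the STAC API and open one COG asset.
# Endpoint URL, collection id and asset key are assumed placeholders.
from pystac_client import Client
import rasterio

catalog = Client.open("https://datacube.example.org/stac")   # placeholder endpoint
search = catalog.search(
    collections=["s2_l2a"],                                   # assumed collection id
    bbox=[13.0, 53.8, 13.3, 54.0],                            # rough extent around DEMMIN
    datetime="2022-06-01/2022-06-30",
)

for item in search.items():
    href = item.assets["B04"].href                            # assumed asset key (red band)
    with rasterio.open(href) as src:
        # COGs allow reading only the overviews/windows needed, even over HTTP
        print(item.id, src.width, src.height, src.crs)
```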
Our main interface for scientific computation is JupyterHub, enabling collaborative work across institutions. For each user, a dedicated JupyterLab instance is spawned in its own Docker container; it can access all the data of the previously mentioned storage systems and has a defined amount of computing resources allocated to it. Users can write their code in Python or R, arguably the most popular programming languages in the EO community, which offer straightforward packages for connecting to an ODC, namely “datacube” and “odcR”. This way, scientists get a familiar professional online analysis environment in which they can work close to the data and use the full power of these programming languages and their EO-friendly ecosystems.
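As a rough sketch of what such a notebook session can look like using the “datacube” package (the product and measurement names below are assumptions, not the actual catalogue entries):

```python
# Hypothetical notebook snippet using the "datacube" Python package inside JupyterLab.
import datacube

dc = datacube.Datacube(app="agrisens-example")

ds = dc.load(
    product="s2_l2a",                        # assumed ODC product name
    measurements=["red", "nir"],
    x=(13.0, 13.3), y=(53.8, 54.0),          # lon/lat extent near DEMMIN
    time=("2022-06-01", "2022-06-30"),
    output_crs="EPSG:32633",
    resolution=(-10, 10),
)

# Example analysis working directly on the returned xarray Dataset
ndvi = (ds.nir - ds.red) / (ds.nir + ds.red)
ndvi.mean(dim="time").plot()
```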
Another option for working with the data is openEO, the new standardised way to interact with big-EO-data cloud processing backends. We support this standard via the “openEO Spring Driver”, an adapter that translates user-submitted openEO process graphs into analysis code that can be run on the ODC. For compatibility with legacy software, it is also possible to request rendered images of pre-configured analysis algorithms via WMS, which are served by an instance of the powerful “datacube-ows” package.
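A minimal sketch of how a user could submit such a process graph through the openEO Python client follows; the backend URL and collection id are placeholders, and authentication is omitted:

```python
# Hypothetical openEO client session; URL and collection id are placeholders.
import openeo

connection = openeo.connect("https://openeo.example.org")

cube = connection.load_collection(
    "SENTINEL2_L2A",                                           # assumed collection id
    spatial_extent={"west": 13.0, "south": 53.8, "east": 13.3, "north": 54.0},
    temporal_extent=["2022-06-01", "2022-06-30"],
    bands=["B04", "B08"],
)

# Build a process graph client-side; it is only executed on the backend
ndvi = cube.ndvi(nir="B08", red="B04").max_time()
ndvi.download("ndvi_june.tiff", format="GTiff")
```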
The final goal, however, is to bring these results to farmers and other end users, who cannot or do not want to deal with complex interfaces. The highly technical tools described above are therefore not sufficient and need to be accompanied by easy-to-use graphical interfaces. Drawing on the possibilities of modern web technologies, we realise these as purpose-built web apps. Arranged around an OpenLayers-powered map component, data products are streamed in COG format from the datacube and displayed together with the additional tooling needed for interpretation. For example, in this fashion we realised a demonstrator showcasing the results of a water balance model for irrigated potato fields.
Another output of the project is the “FieldMApp”, a mobile application designed to be used by farmers both in the field and in the office to digitise and monitor areas of lower yield within crop fields. To evaluate plant vitality, a vegetation-index-based raster product is calculated on the datacube from the latest Sentinel-2 imagery and long-term crop-specific averages. Because the tablet application is built with the platform-agnostic Flutter framework, whose standard mapping component does not yet support COGs, it was necessary to fall back on WMS for serving the raster data. As mentioned above, this is easily possible by configuring “datacube-ows” on top of ODC.
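To illustrate this WMS path, the snippet below shows how a client could request the rendered vegetation-index product from a datacube-ows endpoint using OWSLib; the endpoint URL, layer name and extent are placeholders:

```python
# Hypothetical WMS GetMap request against a datacube-ows endpoint; all values are placeholders.
from owslib.wms import WebMapService

wms = WebMapService("https://datacube.example.org/wms", version="1.3.0")

img = wms.getmap(
    layers=["ndvi_anomaly"],                         # assumed layer name
    srs="EPSG:3857",
    bbox=(1447153, 7087516, 1471153, 7111516),       # example Web Mercator extent
    size=(512, 512),
    format="image/png",
    transparent=True,
)
with open("ndvi_anomaly.png", "wb") as f:
    f.write(img.read())
```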
Overall, the challenge of integrating diverse data sources, processing chains and custom-tailored interfaces – as typically encountered in interdisciplinary research projects – requires complex solutions, but can be met quite well with a datacube approach built from free and open source geo software components. Our integrated system successfully demonstrates such a use case for the domain of remote sensing in agriculture.