Eike Hinderk Jürrens

Eike Hinderk Jürrens is a software engineer who has worked in the geoinformatics field for more than 15 years.


Session

07-01
15:30
30min
Optimizing resource usage of interoperable geospatial processing infrastructures with Kubernetes
Eike Hinderk Jürrens, Martin Pontius

The remote execution of processing workflows is a common task in many spatial data infrastructures and projects. To increase interoperability, the Open Geospatial Consortium (OGC) published the OGC API - Processes standard in 2021. Its RESTful design and use of JavaScript Object Notation (JSON) encoding make it well suited to cloud environments. pygeoapi is an open-source implementation of this standard.
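As a rough illustration of the RESTful, JSON-based design, the following Python sketch builds (but does not send) an asynchronous execute request as described in OGC API - Processes Part 1. The base URL, the process identifier "hello-world" and the input "name" are placeholders chosen for this example and do not come from the talk.

```python
import json
import urllib.request


def build_execute_request(base_url, process_id, inputs):
    """Build a POST request for /processes/{processId}/execution."""
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/processes/{process_id}/execution",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Ask the server to run the job asynchronously if it supports that.
            "Prefer": "respond-async",
        },
    )


# Placeholder endpoint and process; nothing is sent over the network here.
request = build_execute_request(
    "https://example.org/pygeoapi", "hello-world", {"name": "world"}
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would submit the job; a server that accepts asynchronous execution is expected to answer with a job resource that the client can poll.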

pygeoapi is built around plugins, and a manager plugin is responsible for managing process jobs. A common feature of the built-in managers is that processing jobs are executed directly within the pygeoapi Python environment. Hence, a job with high resource demands drives up the resource requirements and usage of the pygeoapi instance itself. Recognizing the limitations of pygeoapi’s built-in job managers regarding isolation and resource handling, we developed the pygeoapi-K8s-manager.

Shared resources and missing job isolation are among the disadvantages of executing jobs inside the pygeoapi environment. Because some processes in our projects are resource-intensive and must run on a schedule, we had to run a “heavy” pygeoapi deployment in our cluster. We also needed to execute processes in runtime environments other than Python, e.g., models written in CUDA Fortran.

We decided to decouple the management and execution layers to address these demands. Because we had already deployed a pygeoapi instance in our Kubernetes (K8s) cluster, it made sense to take advantage of the cluster's processing capabilities. Our team used K8s CronJobs for process scheduling and K8s Jobs for execution. The Kubernetes API server handled the process management, and pygeoapi provided the interface. The resulting generic “pygeoapi-k8s-manager” was developed based on EOX IT Services GmbH’s “pygeoapi-kubernetes-papermill”.
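To make the split between scheduling and execution more concrete, here is a minimal sketch that builds batch/v1 Job and CronJob manifests as plain Python dictionaries. It is not taken from the pygeoapi-k8s-manager code base; the function names, the image name, the arguments and the schedule are all hypothetical.

```python
import json


def job_spec(image, args):
    """Return the 'spec' of a batch/v1 Job that runs one container to completion."""
    return {
        "backoffLimit": 0,  # do not retry a failed pod
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "process", "image": image, "args": args}],
            }
        },
    }


def job_manifest(name, image, args):
    """One-off execution: a Kubernetes Job."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": job_spec(image, args),
    }


def cronjob_manifest(name, schedule, image, args):
    """Scheduled execution: a CronJob wrapping the same Job spec."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {"schedule": schedule, "jobTemplate": {"spec": job_spec(image, args)}},
    }


# Hypothetical image and arguments, for illustration only.
job = job_manifest("flood-model-run", "example.org/flood-model:latest", ["--region", "demo"])
cron = cronjob_manifest("flood-model-nightly", "0 2 * * *", "example.org/flood-model:latest", ["--region", "demo"])
print(json.dumps(cron, indent=2))
```

Such manifests could be submitted with kubectl or a Kubernetes client library. Because the container image is arbitrary, the process itself does not have to be written in Python.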

By decoupling management and execution, we were able to express complex process requirements, such as the use of GPUs, via Job properties. An autoscaler installed in the cluster reacts to these properties and provisions the requested resources on demand.
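A GPU requirement of this kind can be sketched as a resource limit on the Job's container. The example below assumes the resource name "nvidia.com/gpu", which the NVIDIA device plugin exposes; the talk does not state which resource name or autoscaler is used, and the helper with_gpu and the image name are hypothetical.

```python
import copy


def with_gpu(job, gpu_count=1):
    """Return a copy of a Job manifest whose first container requests GPUs."""
    patched = copy.deepcopy(job)
    container = patched["spec"]["template"]["spec"]["containers"][0]
    limits = container.setdefault("resources", {}).setdefault("limits", {})
    # Assumed resource name; it depends on the device plugin in the cluster.
    limits["nvidia.com/gpu"] = str(gpu_count)
    return patched


# Minimal hypothetical Job manifest used as input.
base_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "model-run"},
    "spec": {
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "process", "image": "example.org/model:latest"}],
            }
        }
    },
}

gpu_job = with_gpu(base_job)
print(gpu_job["spec"]["template"]["spec"]["containers"][0]["resources"])
```

If no node can currently satisfy the request, the pod stays pending. A cluster autoscaler typically treats that as the signal to add a suitable node.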

Our team implemented two processes: a HelloWorld-K8s process and a process to run generic images. The first process demonstrates how to run a preconfigured image. The generic process enables image configuration via the pygeoapi configuration file.

We will present the current pygeoapi-K8s-manager implementation, outline future development plans, and illustrate its application through example use cases such as data ingestion, flood modelling and ship voyage optimization workflows. Attendees will gain practical insights into how Kubernetes and OGC API - Processes can improve their geospatial data processing workflows, e.g., by reducing resource requirements. The talk will cover sustainable resource management and explain the operation of pygeoapi in a cloud-native environment. Through this conference, we aim to encourage wider adoption, feedback, and contributions to these ongoing developments.

Open standards and interoperability for geospatial
Auditorium