FOSS4G 2022 general tracks

Spatial data processing with workflow engines
2022-08-25, 10:10–10:15 (Europe/Rome), Room 4

Workflow engines like Apache Airflow are now commonly used in data engineering. They provide an infrastructure for setting up, executing, and monitoring a defined sequence of tasks arranged as a workflow application. Tasks and dependencies are defined declaratively or in a programming language like Python. Airflow established the use of directed acyclic graphs (DAGs) for workflow orchestration.
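To illustrate the DAG idea without tying it to a particular engine, here is a minimal sketch using only the Python standard library (`graphlib`, Python 3.9+). The task names are hypothetical and not taken from the talk. Engines such as Airflow add scheduling, retries and monitoring on top of this basic ordering step.

```python
from graphlib import TopologicalSorter

# Hypothetical spatial workflow: each task maps to the set of tasks it depends on.
workflow = {
    "download_tiles": set(),
    "reproject": {"download_tiles"},
    "clip_to_boundary": {"reproject"},
    "compute_statistics": {"clip_to_boundary"},
    "build_overviews": {"reproject"},
    "publish": {"compute_statistics", "build_overviews"},
}


def execution_order(dag):
    """Return one valid task order; raises graphlib.CycleError if the graph has a cycle."""
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)
```

Several orders can be valid for the same graph, since independent tasks such as `compute_statistics` and `build_overviews` may run in either order or in parallel.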

This talk compares a selection of the many available open source workflow engines that are especially suited for workflows containing spatial data processing. It compares the well-known Apache Airflow engine with Dagster, another solution using DAGs, and a BPMN-based workflow engine using Celery as a distributed task queue.

In the same space, there is the new OGC API - Processes standard, a modern REST API for wrapping computational tasks into executable processes. This talk gives an overview of the API and shows possible integrations with available workflow engines.
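As a rough sketch of what calling such an API can look like, the following builds (but does not send) an execute request against the `/processes/{processId}/execution` path defined by OGC API - Processes Part 1. The server URL, the process id `buffer` and the input name `distance` are assumptions made for illustration; a real server advertises its own processes and inputs under `/processes`.

```python
import json
from urllib.request import Request


def build_execute_request(base_url, process_id, inputs):
    """Build (but do not send) an OGC API - Processes execute request."""
    url = f"{base_url.rstrip('/')}/processes/{process_id}/execution"
    # The execute body carries the process inputs under the "inputs" key.
    body = json.dumps({"inputs": inputs}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical server, process id and input, used only for illustration.
req = build_execute_request("https://example.org/ogcapi", "buffer", {"distance": 10})
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it. A workflow engine task could wrap a call like this to delegate one processing step to a remote service.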

Pirmin has been a geospatial software developer for more than 15 years. He has contributed to GDAL, QGIS, T-Rex and several other projects. Pirmin is co-founder of Sourcepole, a Swiss company providing GIS services and solutions.
