From Cron Job to Self-Healing Pipeline, using Argo and STAC for EO Data Ingestion.
2026-06-29, A13

Building analysis-ready Earth observation products starts well before any algorithm runs. Source data need to be accessible, complete, and up to date. That sounds obvious, but doing it reliably across multiple satellite missions while backfilling years of historical archives is not an easy task.

This talk is about how we built that foundation. The starting point is a simple Argo CronWorkflow that queries a STAC API and downloads one day of data to S3. Nothing impressive, but Argo already gives you things a cron job doesn't: built-in retries, a web UI showing exactly which step failed, and the full log. Your Python script doesn't change; you're just not the (only) one watching it anymore.
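
As a rough illustration of the daily query described above, the request body for a STAC API item search covering one day could be built as follows. This is a minimal sketch, not the code used in the project: the function name `one_day_search_body`, the example collection ID, and the page size are assumptions. Only the `collections`, `datetime`, and `limit` fields come from the STAC API item search specification.

```python
from datetime import date


def one_day_search_body(collection: str, day: date, limit: int = 100) -> dict:
    """Build the JSON body of a STAC API item search for one UTC day.

    Hypothetical helper: the name and the default page size are assumptions.
    """
    start = f"{day.isoformat()}T00:00:00Z"
    end = f"{day.isoformat()}T23:59:59Z"
    return {
        "collections": [collection],
        # STAC expects an RFC 3339 interval written as "start/end".
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


if __name__ == "__main__":
    # "example-collection" is a placeholder, not a real collection ID.
    print(one_day_search_body("example-collection", date(2024, 1, 1)))
```

The resulting dictionary would be sent as JSON to the API's `/search` endpoint. The download to S3 and the Argo CronWorkflow definition are left out of this sketch.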

This talk follows what happened when we scaled this up across various satellite products. Each problem we ran into pushed us to add something: fan-out parallelism when sequential backfills were taking days, STAC as a logbook of what had already been ingested and what is missing, and eventually an observability layer when we needed to understand periods of higher error rate.
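
The two ideas in this paragraph, splitting a backfill into independent units and using a catalog as a logbook, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: the helper names `days_between` and `find_gaps` are hypothetical, and item IDs are assumed to match between the source catalog and the catalog of ingested data.

```python
from datetime import date, timedelta


def days_between(start: date, end: date) -> list[date]:
    """Return every day from start to end, inclusive.

    Each day is an independent unit of work, so a workflow engine can
    fan out over this list instead of looping sequentially.
    """
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def find_gaps(source_ids, ingested_ids) -> list[str]:
    """Return IDs present in the source catalog but missing from ours, sorted."""
    return sorted(set(source_ids) - set(ingested_ids))


if __name__ == "__main__":
    # Placeholder dates and IDs, for illustration only.
    print(len(days_between(date(2024, 1, 30), date(2024, 2, 2))))
    print(find_gaps(["item-a", "item-b", "item-c"], ["item-b"]))
```

In a setup like the one described, the list of days would typically be handed to the workflow engine as a parameter list for parallel steps, and the gap list would drive the next round of retries.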

The combination of autonomous backfill and automated monitoring creates a system that self-corrects at two levels: individual failed items are retried via STAC gap detection, while systemic issues surface in daily reports for human intervention.
All the tools are open source: Argo Workflows, STAC API, Python, Kubernetes, and CI pipelines. Attendees will leave with a concrete understanding of what Argo Workflows gives you at each stage of complexity, from replacing a cron job to running a system you can trust "unsupervised".


Indicate what is (are) the open source project(s) essential in your talk:

Argo Workflows, STAC, Python, Kubernetes

Give indication of resources (video, web pages, papers, etc.) to read in advance, that will help get up to speed on advanced topics:

Resources to read in advance:
- https://argo-workflows.readthedocs.io/
- https://stacspec.org/
- https://cloudnativegeo.org/, background on STAC and cloud-native EO patterns
- Radiant Earth: STAC Tutorial (stacspec.org/tutorials)

Assign a number between 1 and 4 indicating the level of technical complexity of your contribution:

2: some technical/thematic skills required

Select at least one general theme that best defines your proposal:

Data collection, data sharing, big data, data exploitation platforms, Applications of FOSS4G (disaster management, cartography, environment monitoring etc)

Under which license do you make your contribution available? The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials as well as the video recording and live transmission of the presentation:

CC BY

Loïc is a Cloud Engineer at Development Seed with a blend of scientific data expertise and cloud infrastructure experience. Before moving into software engineering, he spent over a decade as a physical oceanographer, building data-processing systems for large geospatial datasets from ocean robots, research vessels, buoys, and satellites.

At Development Seed, Loïc designs and builds scalable cloud platforms that help organizations process and access Earth observation datasets.