11-19, 16:00–16:25 (Pacific/Auckland), WG607
A retrospective on developing cirrus, a cloud-native framework for building STAC-based data orchestration pipelines. We'll look at how its design and architecture evolved over five years of development, and at some lessons learned adapting to changes in the ecosystem and in requirements.
Cirrus is an open source, cloud-native framework for orchestrating geospatial data pipelines built using the concept of STAC (SpatioTemporal Asset Catalog) workflows. It provides a flexible and modular approach to deploying and managing serverless pipelines in AWS via Python components and a Terraform-based deployment mechanism. Cirrus enables scalable, repeatable data processing workflows in the cloud, and is designed to help teams transform, validate, and catalog geospatial data in STAC-compliant formats at scales both large and small.
Over the past five years, cirrus has evolved from a directory of loosely organized bits of configuration and components built on top of the Serverless Framework to a robust, open source, cloud-native data pipeline management system. In this talk, I’ll share my journey maintaining and evolving cirrus from what I inherited to its current state, and the lessons I’ve learned along the way.
Together we’ll explore cirrus’ origins, its original architecture, and the challenges that architecture posed. We’ll examine the decision to shift away from duplicating deployment code via the configuration merge system of the first cirrus CLI, and the benefits and pitfalls that decision brought with it. We’ll trace some of the tooling and ideas that spun out along the way, like stac-task and swoop. Finally, we’ll look at the version 1.0 release’s move away from the Serverless Framework, the decoupling of the deployment logic from the codebase, the new cirrus Terraform module, and how this 1.0 release has prompted a reconsideration of what actually constitutes cirrus now.
Whether you're maintaining your own internal tooling, building cloud-native data processing pipelines, or just trying to keep an open source project healthy through shifting technical landscapes, this talk will offer practical insights drawn from real-world experience. We'll cover the technical decisions, tradeoffs, and lessons learned, especially those relevant to anyone maintaining cloud-native tooling in a fast-moving ecosystem.
Jarrett Keifer is a Senior Geospatial Software Engineer at Element 84, a commercial geospatial consultancy that uses open source to build effective customer solutions. His interests include education and outreach, geospatial data formats, and high-performance systems/network programming. He enjoys designing systems to operate at scale, particularly to support remote sensing data processing and earth science applications, and has over ten years of experience contributing to open source projects.