Anatomy of a file
2026-09-03, Conference Management Room4

Raster, vector, point cloud: every geospatial format solves the same core problems, including linearization, chunking, compression, and metadata. Let’s build a format from scratch to see these concerns in practice.


Every geospatial format exists to solve the same fundamental problem: how do we persist spatial (and sometimes temporal) data so it can be efficiently stored, shared, and queried? These concerns apply to all data types: raster, vector, point cloud, etc. Yet as a community, we often treat formats as islands, each with its own ecosystem, tooling, and mental model.

In this talk, we start from data. We have geospatial data and we need to persist it. How do we linearize multidimensional data into bytes? How do we chunk it for partial reads? Are we going to support writing in parallel? What encoding and compression strategies will we use, and what tradeoffs do they carry? As we work through these questions, the need for metadata arises (coordinate systems, data types, spatial indices, encoding indicators), because without it, the data is useless. Piece by piece, we build up the anatomy of a file.
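To make these questions concrete, here is a minimal sketch of a toy raster format in Python. It is not the format built in the talk and does not correspond to any real specification: the magic bytes `TOYG`, the JSON header layout, and the function names `write_toy`, `read_meta`, and `read_chunk` are all assumptions made for illustration. The sketch linearizes a 2D grid in row-major order, chunks it into strips of rows, compresses each chunk with zlib, and records the metadata and a chunk index needed to read one chunk without decoding the rest.

```python
import json
import struct
import zlib

MAGIC = b"TOYG"  # hypothetical magic number for this toy format


def write_toy(grid, chunk_rows, crs="EPSG:4326"):
    """Serialize a 2D list of uint16 values into bytes."""
    height, width = len(grid), len(grid[0])

    chunks = []
    for r0 in range(0, height, chunk_rows):
        rows = grid[r0:r0 + chunk_rows]
        # Linearize: row-major order, little-endian unsigned 16-bit.
        flat = [value for row in rows for value in row]
        raw = struct.pack(f"<{len(flat)}H", *flat)
        # Compress each chunk independently so it can be read alone.
        chunks.append(zlib.compress(raw))

    # Chunk index: (offset, length) relative to the start of the data section.
    index, offset = [], 0
    for chunk in chunks:
        index.append([offset, len(chunk)])
        offset += len(chunk)

    # Metadata: without it the bytes that follow cannot be interpreted.
    meta = json.dumps({
        "width": width,
        "height": height,
        "dtype": "uint16",
        "chunk_rows": chunk_rows,
        "compression": "zlib",
        "crs": crs,
        "index": index,
    }).encode("utf-8")

    # Layout: magic | header length (uint32) | JSON header | chunk data.
    return MAGIC + struct.pack("<I", len(meta)) + meta + b"".join(chunks)


def read_meta(blob):
    """Return the parsed header and the byte offset of the data section."""
    if blob[:4] != MAGIC:
        raise ValueError("not a toy-format file")
    (meta_len,) = struct.unpack_from("<I", blob, 4)
    meta = json.loads(blob[8:8 + meta_len].decode("utf-8"))
    return meta, 8 + meta_len


def read_chunk(blob, i):
    """Partial read: decode only chunk i, located via the index."""
    meta, data_start = read_meta(blob)
    offset, length = meta["index"][i]
    start = data_start + offset
    raw = zlib.decompress(blob[start:start + length])
    values = struct.unpack(f"<{len(raw) // 2}H", raw)
    width = meta["width"]
    # Undo the linearization: split the flat values back into rows.
    return [list(values[r:r + width]) for r in range(0, len(values), width)]


if __name__ == "__main__":
    grid = [[r * 10 + c for c in range(4)] for r in range(5)]
    blob = write_toy(grid, chunk_rows=2)
    print(read_chunk(blob, 1))  # [[20, 21, 22, 23], [30, 31, 32, 33]]
```

Even this toy version forces the tradeoffs named above. Smaller chunks make partial reads cheaper but compress worse and enlarge the index. Placing the index in a header ahead of the data means the writer must know every chunk's size before writing the first byte, which complicates parallel writes.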

We then map what we've built to real formats. The same structures appear in each: chunking strategies, index structures, encoding schemes, metadata organization. The specific tradeoffs differ, but the underlying anatomy is consistent across raster, vector, and beyond.

This consistency raises a question worth confronting: if formats share this much of their architecture, why does our tooling treat them as fundamentally different things? Perhaps we should recognize that our ecosystems have built walls around format tradeoffs that the underlying anatomy doesn't justify.


Level of technical complexity: 2 - intermediate

I make my conference contribution available under the CC BY 4.0 license. The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials as well as the video recording and live transmission of the presentation:

Jarrett Keifer is a Senior Geospatial Software Engineer at Element 84, a commercial geospatial consultancy that uses open source to build effective customer solutions. His interests include education and outreach, geospatial data formats, and high-performance systems/network programming. He enjoys designing systems to operate at scale, particularly to support remote sensing data processing and earth science applications, and has over ten years of experience contributing to open source projects.
