Exploring Cloud-Native Geospatial Formats: Hands-on with Vector Data Workshop FOSS4G 2025

11-18, 09:00–12:00 (Pacific/Auckland), WF503

Dig into geospatial vector formats—including GeoJSON, WKT/WKB, and cloud-native GeoParquet—using Python to see in detail how vector features are stored in each format and to understand what cloud-native means for vector data.

Cloud-native geospatial is all the rage these days, and for good reason. As file sizes grow, layer counts increase, and analytical methods become more complex, the traditional download-to-the-desktop approach is quickly becoming untenable for many applications. It's no surprise, then, that users are turning to cloud-based tools to scale out their analyses, or that traditional tooling is adopting new ways of finding and accessing data from cloud-based sources. But as we move from opening whole files to grabbing ranges of bytes off remote servers, it becomes all the more important to understand exactly how cloud-native data formats store data and what tools do to access it.

This workshop digs into how cloud-native geospatial data formats are enabling new operational paradigms, with a particular focus on vector data formats. Unlike its raster workshop counterpart, this workshop will be a bit more experimental. Vector data formats tend towards greater complexity than raster formats, so how deep we get into each topic will depend on the audience’s interests and the time available. Broad themes to explore might include:

GeoJSON: what it is, what it represents, and why it is not cloud-native
Well-Known Text/Binary (WKT/WKB): how these geometry encodings work and why they are important in GeoParquet
GeoParquet: how Parquet stores data, how geo maps into that paradigm, and what it takes to read a subset of data from a Parquet file
Other cloud-native formats like FlatGeobuf, PMTiles, etc.
Practical considerations when using these formats
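
To make the first two themes concrete, here is a minimal sketch, not taken from the workshop notebooks, that writes one point as GeoJSON, WKT, and WKB using only the Python standard library. The coordinates and the `decode_wkb_point` helper are illustrative assumptions; the WKB layout shown (one byte-order byte, a 4-byte geometry type, then two 8-byte doubles) is the standard 2D point encoding.

```python
import json
import struct

# A single point feature as GeoJSON: plain text that must be parsed as a whole.
# The coordinates are illustrative values (roughly Auckland), not workshop data.
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [174.7633, -36.8485]},
    "properties": {"name": "example"},
}
geojson_text = json.dumps(feature)

lon, lat = feature["geometry"]["coordinates"]

# The same geometry as Well-Known Text.
wkt = f"POINT ({lon} {lat})"

# The same geometry as Well-Known Binary:
#   1 byte   byte order (1 = little-endian)
#   4 bytes  geometry type as uint32 (1 = Point)
#   16 bytes x and y as float64
wkb = struct.pack("<BIdd", 1, 1, lon, lat)


def decode_wkb_point(buf: bytes) -> tuple[float, float]:
    """Decode a 2D WKB point (hypothetical helper for illustration)."""
    endian = "<" if buf[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(endian + "I", buf, 1)
    if geom_type != 1:
        raise ValueError(f"expected Point (1), got geometry type {geom_type}")
    x, y = struct.unpack_from(endian + "dd", buf, 5)
    return x, y


print(len(geojson_text), "bytes of GeoJSON text")
print(wkt)
print(len(wkb), "bytes of WKB:", wkb.hex())
print(decode_wkb_point(wkb))
```

The WKB form is fixed-width and binary, which is part of why it fits naturally into a column of a Parquet file, whereas the GeoJSON text has to be read and parsed in full before any single feature can be used.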

The content of this workshop aims to be more than theoretical: a strong goal is to be as hands-on with these formats as possible by working with them in Python without any format-specific geospatial libraries. We’ll interact with object storage directly, pulling down files and fragments and inspecting them, to build a working understanding of what common higher-level tooling does under the hood and abstracts away from users.
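
As a rough illustration of what pulling down fragments can look like, the sketch below locates the footer of a Parquet file using byte ranges alone. It relies on one fact about the format: an unencrypted Parquet file ends with a 4-byte little-endian footer length followed by the magic bytes `PAR1`. Everything else is an assumption made for the example: `read_range` stands in for whatever ranged GET an object store offers, `range_header` and `footer_range` are hypothetical helpers, and the "file" is a fake in-memory blob rather than real Parquet data.

```python
import struct
from typing import Callable

PARQUET_MAGIC = b"PAR1"


def range_header(start: int, end: int) -> dict[str, str]:
    """Build an HTTP Range header for bytes start..end (inclusive)."""
    return {"Range": f"bytes={start}-{end}"}


def footer_range(read_range: Callable[[int, int], bytes], size: int) -> tuple[int, int]:
    """Return the inclusive byte range of the footer metadata.

    read_range(start, end) must return bytes start..end inclusive,
    mirroring HTTP Range semantics; size is the total object size.
    """
    # The last 8 bytes are: uint32 footer length + b"PAR1".
    tail = read_range(size - 8, size - 1)
    if tail[4:] != PARQUET_MAGIC:
        raise ValueError("missing PAR1 magic at end of file")
    (footer_len,) = struct.unpack("<I", tail[:4])
    start = size - 8 - footer_len
    return start, size - 9


# In-memory stand-in for a remote object (placeholder bytes, not real Parquet).
fake_footer = b"\x00" * 10
blob = (
    PARQUET_MAGIC
    + b"column-chunk-bytes"
    + fake_footer
    + struct.pack("<I", len(fake_footer))
    + PARQUET_MAGIC
)


def read_range(start: int, end: int) -> bytes:
    """Pretend ranged read against the in-memory blob."""
    return blob[start : end + 1]


start, end = footer_range(read_range, len(blob))
print(range_header(start, end))
print(read_range(start, end) == fake_footer)
```

Against real object storage, `read_range` would issue a GET with the header from `range_header`; two small requests (tail, then footer) are enough to learn where every column chunk lives without downloading the whole file.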

Prerequisites

This workshop expects some familiarity with geospatial programming in Python and a basic understanding of the vector data model and its utility. Most of the notebook code is already provided, so gaps in understanding need not prevent you from completing the exercises. That said, some knowledge of geospatial vector formats and tooling is quite helpful.
