The meta-problem in geospatial data
2026-08-31, 701

We will make an attempt to understand geospatial data forever!


It's easy to get lost in the ocean of geospatial data. We have multiple file formats, metadata specifications, new ideas claiming to be more efficient than the others, cults of supporters, and waves of dislike. Add a gazillion data providers from different countries, software that does different things with the data, and much confusion among the people.

Whenever I dig into a subject because I have a problem, I find many details that make me realize the game here is different. Lately, I've been dealing with metadata from multiple imagery providers that dump a wide range of vector and raster data formats (TIFF, NITF, SHP, KML, GML ...) alongside common metadata formats like XML and JSON. As diversity increases, you encounter more problems with tasks that are supposed to be easy, like opening a file with Python or just displaying it in QGIS. When you move data from system to system (download, transform, upload), you realize your assumptions about GDAL don't always hold.
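To give a taste of why "just opening a file" is less obvious than it sounds, here is a minimal sketch that guesses a file's format from its first bytes, using only the Python standard library. The function name `sniff_format` and the signature table are my own illustration, not how GDAL picks a driver; the table covers only a few of the formats named above and will happily misjudge anything unusual.

```python
# Leading bytes ("magic numbers") for a few of the formats mentioned above.
# Illustrative and far from complete.
SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"II+\x00", "BigTIFF (little-endian)"),
    (b"MM\x00+", "BigTIFF (big-endian)"),
    (b"NITF", "NITF"),
    (b"\x00\x00\x27\x0a", "ESRI Shapefile (.shp)"),  # file code 9994, big-endian
]


def sniff_format(path):
    """Return a rough guess of the file format based on its first bytes."""
    with open(path, "rb") as f:
        head = f.read(64)

    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name

    # Text-based formats: drop a UTF-8 byte order mark and leading whitespace.
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]
    text = head.lstrip()
    if text.startswith(b"<"):
        return "XML-like (could be KML, GML, or plain XML metadata)"
    if text.startswith((b"{", b"[")):
        return "JSON-like (could be GeoJSON, STAC, or plain JSON metadata)"
    return "unknown"
```

Note that the text-based branches cannot tell KML from GML from a provider's own XML metadata; that takes actually parsing the content, which is part of the confusion.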

We will dig in together and handle various datasets using GDAL, Python (on Docker), and QGIS on your local machine. We will only try to perform very simple tasks, like visualizing the datasets and maybe reading them. By tweaking a couple of things, we'll see how these tools break, or why they don't. In parallel, we'll discuss specifications like STAC and come to appreciate why they are needed as we progress.
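For a first feel of what a specification like STAC asks for, here is a rough sketch of a STAC Item written as a plain Python dict, with a tiny check for the top-level fields the Item spec requires. The id, coordinates, date, and asset URL are made-up placeholders, and `missing_item_fields` is a hypothetical helper of mine, not a real validator; the official JSON Schemas check far more than this.

```python
# Top-level fields a STAC Item must carry, per the Item specification.
REQUIRED_ITEM_FIELDS = (
    "type", "stac_version", "id", "geometry", "properties", "links", "assets",
)


def missing_item_fields(item):
    """List required fields that are absent from a STAC Item-like dict."""
    missing = [key for key in REQUIRED_ITEM_FIELDS if key not in item]
    # 'datetime' must be present inside 'properties'.
    if "datetime" not in item.get("properties", {}):
        missing.append("properties.datetime")
    # 'bbox' is required whenever the geometry is not null.
    if item.get("geometry") is not None and "bbox" not in item:
        missing.append("bbox")
    return missing


# A made-up item; every value below is a placeholder for illustration.
item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "example-scene-001",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [13.0, 52.0], [14.0, 52.0], [14.0, 53.0], [13.0, 53.0], [13.0, 52.0],
        ]],
    },
    "bbox": [13.0, 52.0, 14.0, 53.0],
    "properties": {"datetime": "2024-01-01T00:00:00Z"},
    "links": [],
    "assets": {
        "image": {
            "href": "https://example.com/scene.tif",
            "type": "image/tiff; application=geotiff",
        },
    },
}

print(missing_item_fields(item))
```

The point is not the check itself but the shape: one predictable place for geometry, time, and file links, whatever the provider's original metadata looked like.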

The idea of the workshop is to get confused. Whatever your level, you are welcome to get confused here, and I'd encourage you to do so.

PS: I'm not an expert on anything


Level of the workshop: 1 - beginner

Pre-requirements for attendees:

Docker, QGIS, and some instructions to complete before the workshop, which will be shared later.

What skills do participants require to have?:

Curiosity

I work as a data engineer at UP42.
I'm a lifelong amateur actor.