The ecosystem of geospatial machine learning tools in the Pangeo world
10-18, 13:50– (Pacific/Auckland), Te Iringa

Several open source tools are enabling the shift to cloud-native geospatial machine learning workflows. Stream data from STAC APIs, generate machine-learning-ready chips on the fly, and train models for different downstream tasks! Find out about advances in the Pangeo ML community towards scalable GPU-native workflows.


This talk presents an overview of open source Python packages in the Pangeo (big data geoscience) Machine Learning community. For reading and writing, kvikIO allows low-latency data transfers from Zarr archives via NVIDIA GPUDirect Storage. Once tensors are loaded into xarray data structures, xbatcher slices the arrays efficiently into batches that can be iterated over. To connect the pieces, zen3geo acts as the glue between geospatial libraries, with custom data pipelines that range from reading STAC items and rasterizing vector geometries to stacking multi-resolution datasets. Learn more as the Pangeo community develops tutorials at Project Pythia, and join in to hear about the challenges and ideas on scaling machine learning in the geosciences with the Pangeo ML Working Group.
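
To make the idea of chips concrete, here is a minimal sketch in plain Python of cutting fixed-size chips out of a 2D grid with a sliding window. It only illustrates the concept and is not xbatcher's API: the function name `iter_chips`, its `chip_size` and `stride` parameters, and the toy 4x4 raster are all hypothetical, and the real libraries work on labelled xarray objects rather than nested lists.

```python
from typing import Iterator, List

Grid = List[List[int]]


def iter_chips(grid: Grid, chip_size: int, stride: int) -> Iterator[Grid]:
    """Yield square chips of side `chip_size`, moving the window by `stride`.

    Hypothetical helper for illustration only; windows that would run past
    the edge of the grid are skipped.
    """
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    for top in range(0, n_rows - chip_size + 1, stride):
        for left in range(0, n_cols - chip_size + 1, stride):
            # Slice the rows first, then the columns within each row.
            yield [row[left:left + chip_size] for row in grid[top:top + chip_size]]


# Toy 4x4 "raster" holding the values 0..15 in row-major order (assumed data).
raster = [[r * 4 + c for c in range(4)] for r in range(4)]

# Non-overlapping 2x2 chips: the stride equals the chip size.
chips = list(iter_chips(raster, chip_size=2, stride=2))

# Overlapping chips: a stride smaller than the chip size.
overlapping = list(iter_chips(raster, chip_size=2, stride=1))
```

Because the chips come from a generator, they are produced one at a time as the consumer asks for them, which is the same lazy pattern that lets a training loop pull batches without holding every chip in memory.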

Geospatial Data Scientist/Machine Learning Engineer at Development Seed, developing tools for cloud-native geospatial machine learning! Open source maintainer for Python libraries like zen3geo, xbatcher and PyGMT. Check out my profile on GitHub and LinkedIn.