09-11, 14:00–14:30 (America/Chicago), Grand C
Discover a fast, scalable hazard mapping service built with Python, the cloud and the FOSS4G stack. Explore how cloud-optimized GeoTIFFs, virtual rasters and the GDAL stack enable efficient location-based querying of global hazard data.
The field of disaster resilience involves querying and computing on large volumes of meteorological and climatological data. This data is often multi-dimensional and spatio-temporal, and is traditionally stored in large, monolithic, record-based formats such as CSV or netCDF. These formats can be slow and inefficient to access, especially for specific locations or time periods, which makes on-demand access and computation a challenge. This talk explores the methodology we built using the FOSS4G stack to process, store and query such voluminous geospatial data for both on-demand and batch processing.
The service, called Hazard Map Service, uses Python, Kubernetes and the cloud-native geospatial stack. Its first component is a data-processing pipeline that converts raw data in multiple formats and resolutions into a standardized raster format for perils such as flood and wind. The rasters are stored in a cloud-optimized layout in cloud object storage. The second component exposes a REST API that clients call to stream the hazard data from object storage on demand.
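To make the on-demand lookup concrete: a point query against a georeferenced raster comes down to converting a coordinate into a pixel row and column, which is what allows a reader to fetch only the relevant block of a cloud-optimized file instead of the whole raster. The following is a minimal sketch of that conversion, assuming a north-up raster described by GDAL's six-value geotransform. The function name and the 0.01 degree global grid are illustrative assumptions, not details of the Hazard Map Service.

```python
import math


def pixel_for_point(lon, lat, geotransform):
    """Return (row, col) of the pixel containing (lon, lat) in a north-up raster.

    geotransform follows GDAL's six-value convention:
    (origin_x, pixel_width, row_rotation, origin_y, col_rotation, pixel_height),
    where pixel_height is negative for north-up rasters.
    """
    origin_x, pixel_w, rot_x, origin_y, rot_y, pixel_h = geotransform
    if rot_x != 0 or rot_y != 0:
        raise ValueError("this sketch only handles north-up rasters")
    col = math.floor((lon - origin_x) / pixel_w)
    row = math.floor((lat - origin_y) / pixel_h)
    return row, col


# Hypothetical global grid at 0.01 degree resolution, origin at the top-left corner.
GLOBAL_GT = (-180.0, 0.01, 0.0, 90.0, 0.0, -0.01)

# A point in Chicago (assumed example coordinates).
row, col = pixel_for_point(-87.6298, 41.8781, GLOBAL_GT)
```

With the row and column known, a raster library can read a single pixel or a small window around it. How the service itself performs that read is not described here.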
The talk details how we engineered this service, including the spatial file indexing system (built on Open Location Codes), raster formats, asynchronous GDAL calls, the optimization experiments we ran, their outcomes and lessons learned. These insights aim to aid the development of similar fast, lean and cost-effective raster lookup services using the cloud-native geospatial stack.
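As a rough illustration of how an index built on Open Location Codes could map a query point to a raster file, the sketch below computes the first four code digits for a coordinate, which identify a 1 x 1 degree cell, and uses them as a file key. The alphabet and the 20 degree and 1 degree cell sizes come from the Open Location Code specification. The 1 degree tile granularity, the key layout and the function names are assumptions made for this example, since the abstract does not describe the actual index.

```python
import math

# Digit alphabet defined by the Open Location Code specification.
OLC_ALPHABET = "23456789CFGHJMPQRVWX"


def olc_prefix(lat, lon):
    """Return the first four Open Location Code digits (a 1 x 1 degree cell)."""
    # Shift to non-negative ranges; clip latitude so 90 falls in the top cell.
    lat_cell = min(max(math.floor(lat + 90), 0), 179)
    # Wrap longitude so that 180 maps to the same cell as -180.
    lon_cell = math.floor(lon + 180) % 360
    # First pair: 20-degree cells. Second pair: 1-degree cells within them.
    return (
        OLC_ALPHABET[lat_cell // 20] + OLC_ALPHABET[lon_cell // 20]
        + OLC_ALPHABET[lat_cell % 20] + OLC_ALPHABET[lon_cell % 20]
    )


def tile_key(peril, lat, lon):
    """Hypothetical object-storage key for the raster tile covering a point."""
    return f"{peril}/{olc_prefix(lat, lon)}.tif"
```

For example, `tile_key("flood", 41.8781, -87.6298)` yields `flood/86HJ.tif`, so a lookup for a point in Chicago would only need to open that one tile. Because nearby points share a code prefix, the same scheme could be coarsened or refined by using fewer or more digits.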