FOSS4G 2022 general tracks

James Hilton

James Hilton is a principal research scientist in Data61, CSIRO. His current role involves the development of methods and analytical frameworks for geospatial analysis as well as modelling of natural hazards such as wildfires and floods.

Sessions

08-24
11:30
30min
Geostack: a high performance geospatial processing, modelling and analysis framework
James Hilton, Nikhil Garg

Large geospatial data sets generated by modern remote sensing and environmental modelling provide new opportunities for analysts, scientists, and researchers. However, the size of these data sets can present challenges due to the computation and resource management required for analytics and processing. Current solutions for processing such data sets largely focus on horizontal scaling, for example on distributed systems such as the cloud, without fully exploiting the opportunities offered by modern computing architecture. Furthermore, the variety of formats and types of geospatial data often results in complex processing workflows composed of multiple tools for reading and writing, transformation, processing, and resource management. We present an introduction, overview, and demonstration of the open-source Geostack framework (gitlab.com/geostack/library). This has been developed to simplify many common operations, to provide economy of code, and to transparently take advantage of modern CPU/GPU hardware. We have aimed to provide three main routes to simplify and accelerate geospatial processing. These are: 1) a unified interface to read vector and raster data and interoperate between them, with no software dependencies for common geospatial data formats, 2) treatment of all data as objects independent of geospatial transforms, with transparent resource management through an underlying tile-based caching system, and with reprojection and interpolation carried out where needed, 3) extensive use of OpenCL to provide computational acceleration and automatic processing vectorisation on GPUs and multi-core CPUs, and to allow user-defined scripts to be executed over these objects. The framework also includes many common geospatial operations as well as several base geospatial solvers (including moving fronts, flow networks, and particle modelling) accelerated using OpenCL.
Geostack is a C++ API with Python bindings; the code examples and demonstrations are presented in Python. The Python bindings are available through conda and are fully interoperable with common Python libraries including numpy, gdal, xarray, netcdf, geopandas and sqlite, allowing users to use as much or as little of the Geostack functionality as required. We present demonstrations of several common geospatial tasks with benchmark comparisons to alternative workflows.
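
To illustrate the general idea behind the tile-based caching mentioned in the abstract, the following is a minimal sketch in plain Python. It does not use or reproduce the Geostack API: the names TileCache and synthetic_loader, the tile size, and the least-recently-used eviction policy are all assumptions made for illustration only. A raster is accessed by cell index, tiles are loaded on demand, and the least recently used tile is dropped once the cache is full.

```python
from collections import OrderedDict


class TileCache:
    """Hypothetical LRU cache of fixed-size raster tiles, loaded on demand."""

    def __init__(self, loader, tile_size=256, max_tiles=4):
        self.loader = loader        # callable: (tx, ty, tile_size) -> 2D list of values
        self.tile_size = tile_size
        self.max_tiles = max_tiles
        self._tiles = OrderedDict() # (tx, ty) -> tile, ordered oldest to newest use
        self.loads = 0              # number of times the loader was called

    def _tile(self, tx, ty):
        key = (tx, ty)
        if key in self._tiles:
            # Cache hit: mark the tile as most recently used.
            self._tiles.move_to_end(key)
            return self._tiles[key]
        # Cache miss: load the tile, then evict the oldest tile if over capacity.
        tile = self.loader(tx, ty, self.tile_size)
        self.loads += 1
        self._tiles[key] = tile
        if len(self._tiles) > self.max_tiles:
            self._tiles.popitem(last=False)
        return tile

    def value(self, col, row):
        """Return the raster value at a global (col, row) cell index."""
        tx, ox = divmod(col, self.tile_size)
        ty, oy = divmod(row, self.tile_size)
        return self._tile(tx, ty)[oy][ox]


def synthetic_loader(tx, ty, size):
    """Stand-in for reading a tile from disk: each cell holds col + row."""
    return [
        [(tx * size + i) + (ty * size + j) for i in range(size)]
        for j in range(size)
    ]


cache = TileCache(synthetic_loader, tile_size=4, max_tiles=2)
first = cache.value(5, 2)   # loads tile (1, 0)
second = cache.value(6, 3)  # same tile, served from the cache
```

In this sketch the caller only ever asks for cell values; which tiles are resident in memory is handled by the cache, which is the sense in which resource management can be made transparent to the user.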

State of software
General online