Scalable geospatial processing using dask and mapchete
07-04, 11:30–12:00 (Europe/Tallinn), GEOCAT (301)

Dask is a flexible parallel computing library that integrates seamlessly with popular Python data science tools. With its task graph and parallel computation capabilities, Dask excels at managing large-scale computations both on a local machine and on a computing cluster.

Mapchete, an open-source Python library, specializes in parallelizing geospatial raster and vector processing tasks. Its strength lies in efficiently tiling and processing geospatial data, which makes it a valuable asset for handling vast datasets such as satellite imagery, elevation models, and land cover classifications.
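
As a rough illustration of the tiling idea, the sketch below splits a bounding box into square tiles and processes them in parallel. It uses only the Python standard library and does not use Mapchete's or Dask's actual API; the names `Tile`, `tiles_for_bounds`, `process_tile` and `run` are hypothetical, and the per-tile "processing" is a placeholder that merely computes the tile area.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, NamedTuple, Tuple

Bounds = Tuple[float, float, float, float]  # (left, bottom, right, top)


class Tile(NamedTuple):
    """Hypothetical tile record: grid position plus geographic bounds."""
    row: int
    col: int
    bounds: Bounds


def tiles_for_bounds(bounds: Bounds, tile_size: float) -> Iterator[Tile]:
    """Yield square tiles covering the bounding box, row by row from the top."""
    left, bottom, right, top = bounds
    n_cols = math.ceil((right - left) / tile_size)
    n_rows = math.ceil((top - bottom) / tile_size)
    for row in range(n_rows):
        for col in range(n_cols):
            t_left = left + col * tile_size
            t_top = top - row * tile_size
            yield Tile(row, col, (t_left, t_top - tile_size, t_left + tile_size, t_top))


def process_tile(tile: Tile) -> Tuple[Tuple[int, int], float]:
    """Placeholder for real raster or vector work: return the tile's area."""
    left, bottom, right, top = tile.bounds
    return (tile.row, tile.col), (right - left) * (top - bottom)


def run(bounds: Bounds, tile_size: float, workers: int = 4) -> Dict[Tuple[int, int], float]:
    """Process every tile independently and collect results keyed by (row, col)."""
    tiles = list(tiles_for_bounds(bounds, tile_size))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return dict(executor.map(process_tile, tiles))


# A 4 x 2 degree area cut into 1-degree tiles.
results = run((0.0, 0.0, 4.0, 2.0), tile_size=1.0)
```

Because each tile is processed independently of the others, the same pattern could in principle be handed to a task scheduler instead of a local thread pool.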

This talk delves into the integration of these two technologies, showcasing how their combined capabilities can be used to conduct large-scale processing of geospatial data. It will also show how we at EOX currently deploy our infrastructure and the challenges we face when using it to process the cloudless satellite mosaics under the EOxCloudless product umbrella.

Joachim Ungar is Lead Cartographer and Geospatial IT Engineer at EOX, an open-source-oriented software company based in Vienna. He has an MSc degree in Cartography and Geoinformation and specializes in large-scale processing of geodata, mainly satellite imagery, using GDAL and Python. He has enthusiastically participated in and contributed to FOSS4G events since 2011.