FOSS4G 2022 general tracks

State of GeoPandas and friends
2022-08-24, 14:15–14:45 (Europe/Rome), Room Limonaia

GeoPandas is one of the core packages in the Python ecosystem to work with geospatial vector data. By combining the power of several open source geo tools (GEOS/Shapely, GDAL/fiona, PROJ/pyproj) and extending the pandas data analysis library to work with geographic objects, it is designed to make working with geospatial data in Python easier. GeoPandas enables you to easily do operations in Python that would otherwise require desktop applications like QGIS or a spatial database such as PostGIS.

This talk will give an overview of recent developments in the GeoPandas community, both in the project itself as in the broader ecosystem of packages on which GeoPandas depends or that extend GeoPandas. We will highlight some changes and new features in recent GeoPandas versions, such as the new interactive explore() visualisation method, improvements in joining based on proximity, better IO options for PostGIS and Apache Parquet and Feather files, and others. But some of the important improvements coming to GeoPandas are happening in other packages. The Shapely 2.0 release is nearing completion, and will provide fast vectorized versions of all its geospatial functionalities. This will help to substantially improve the performance of GeoPandas. In the area of reading and writing traditional GIS files using GDAL, the pyogrio package is being developed to provide a speed-up on that front. Another new project is dask-geopandas, which is merging the geospatial capabilities of GeoPandas with the scalability of Dask. This way, we can achieve parallel and distributed geospatial operations.

I am a core contributor to Pandas and Apache Arrow and maintainer of GeoPandas. I did a PhD at Ghent University and VITO in air quality research, worked at the Paris-Saclay Center for Data Science, and currently, I am a freelance software developer and teacher and working for Voltron Data.

This speaker also appears in: