Vitor George
Software Engineer at Development Seed
Sessions
Visualizing and analyzing large-scale geospatial datasets, such as Overture Maps' global dataset of more than 3 billion features, grows more challenging as these datasets continue to expand. This presentation introduces Lonboard, an open-source Python library designed to address this challenge by enabling fast, interactive visualization of geospatial vector data within Jupyter notebooks.
Lonboard's performance and ease of use stem from its architecture, built on four key technologies: deck.gl for GPU-accelerated rendering, GeoArrow for efficient in-memory representation, GeoParquet for optimized file storage, and anywidget for seamless Jupyter integration. This combination allows Lonboard to move data from Python to JavaScript, and then to the GPU, without passing through a text-based intermediate format.
Unlike existing solutions that rely on slower GeoJSON encoding, Lonboard employs a fully binary pipeline. It leverages GeoPandas as the primary user interface, internally managing conversions to GeoArrow and GeoParquet for efficient data transport and rendering. This approach not only accelerates data processing but also significantly reduces the data size transferred to the browser.
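To give a rough sense of why a binary pipeline transfers less data than GeoJSON, the sketch below encodes the same set of points both ways and compares the byte counts. It is a toy illustration using only the Python standard library: the helper names (`make_points`, `encode_geojson`, `encode_binary`, `decode_binary`) are hypothetical, and the packed float64 buffer only mimics the idea of a contiguous coordinate array. It is not GeoArrow's actual memory layout or Lonboard's implementation.

```python
import json
import struct


def make_points(n):
    """Generate n deterministic (lon, lat) pairs for the comparison."""
    return [
        (-180.0 + (i * 0.37) % 360.0, -90.0 + (i * 0.19) % 180.0)
        for i in range(n)
    ]


def encode_geojson(points):
    """Encode points as a GeoJSON FeatureCollection (UTF-8 text)."""
    features = [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
        }
        for lon, lat in points
    ]
    collection = {"type": "FeatureCollection", "features": features}
    return json.dumps(collection).encode("utf-8")


def encode_binary(points):
    """Pack coordinates as interleaved little-endian float64 values.

    Assumption: this stands in for a binary coordinate buffer; a real
    GeoArrow array carries additional metadata and buffers.
    """
    flat = [coord for point in points for coord in point]
    return struct.pack(f"<{len(flat)}d", *flat)


def decode_binary(buf):
    """Recover the (lon, lat) pairs from the packed buffer."""
    count = len(buf) // 8  # 8 bytes per float64
    flat = struct.unpack(f"<{count}d", buf)
    return list(zip(flat[0::2], flat[1::2]))


points = make_points(10_000)
geojson_bytes = encode_geojson(points)
binary_bytes = encode_binary(points)

print(f"GeoJSON: {len(geojson_bytes)} bytes")
print(f"Binary:  {len(binary_bytes)} bytes")
```

The binary buffer costs exactly 16 bytes per point (two float64 values), while the GeoJSON version repeats keys and structural punctuation for every feature and must also be parsed as text on the receiving side. The round trip through `decode_binary` returns the original coordinates unchanged.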
We will demonstrate Lonboard's capabilities using Overture Maps data, which is provided in GeoParquet format as monthly releases. This showcase will highlight how Lonboard's simple interface allows researchers and data scientists to effortlessly visualize and interact with cloud-native, optimized geospatial data at a global scale.
Every day, almost 2.5 million edits are made in OpenStreetMap. To maintain the high quality and reliability of OSM data, keeping track of changes is crucial. In 2024, a new OSM data pipeline was built for OSMCha, one of the main OpenStreetMap data validation tools. This pipeline is fully open-source, and it allows us to visualize and run data quality checks on each edit made to OSM. In addition to the set of open-source tools and the Kubernetes deployment infrastructure, the resulting data is available for free under the AWS Open Data program. We'll share how this new pipeline streamlines data integrity checks and enables developers to build downstream cloud-native applications that monitor the changes happening in OpenStreetMap. We will also demo Gradient, a web application that displays OSM edits using this new real-time OSM pipeline.