FOSS4G 2023

Jashanpreet Singh

Over the past five years, I have honed my skills in the geospatial domain, gaining diverse experience in climate tech startups, agritech solutions, and public urban transport planning. Throughout these experiences, I have relied heavily on Postgres + PostGIS, Python, and AWS technologies to drive my work.

My experience extends to working with satellite data (including Sentinel and Landsat), geospatial data modeling, and handling large datasets at scale in the cloud using Docker, Python, S3, etc.

I am most interested in building a generic spatio-temporal database that can handle a wide variety of data and use cases (building a digital earth in Postgres).


Sessions

06-29
15:00
30min
Leveraging the Power of Uber H3 Indexing Library in Postgres for Geospatial Data Processing
Jashanpreet Singh

The Uber H3 library is a powerful geospatial indexing system that offers a versatile and efficient way to index and query geospatial data. It provides a hierarchical indexing scheme that allows for fast and accurate calculations of geospatial distances, as well as easy partitioning of data into regions. In this proposal, we suggest using the Uber H3 indexing library in Postgres for geospatial data processing.
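To make the idea of a hierarchical index concrete, here is a minimal sketch in plain Python. It does not use H3 itself: it substitutes a simple square grid with quadkey-style IDs (H3 uses hexagons and its own ID scheme), and the function names `cell_id`, `parent`, and `bucket` are hypothetical. The point it illustrates is only that a coarser parent cell can be derived from a finer cell ID, and that points can be partitioned by grouping on the ID.

```python
from collections import defaultdict


def cell_id(lat, lng, resolution):
    """Return a quadkey-style ID: one digit (0-3) per resolution level.

    Stand-in for an H3 index; each level halves the bounding box in
    both latitude and longitude.
    """
    south, north, west, east = -90.0, 90.0, -180.0, 180.0
    digits = []
    for _ in range(resolution):
        mid_lat = (south + north) / 2
        mid_lng = (west + east) / 2
        quadrant = 0
        if lat >= mid_lat:
            quadrant += 2
            south = mid_lat
        else:
            north = mid_lat
        if lng >= mid_lng:
            quadrant += 1
            west = mid_lng
        else:
            east = mid_lng
        digits.append(str(quadrant))
    return "".join(digits)


def parent(cell, resolution):
    """Return the ancestor of `cell` at a coarser resolution (an ID prefix)."""
    return cell[:resolution]


def bucket(points, resolution):
    """Partition (name, lat, lng) tuples by the cell that contains them."""
    buckets = defaultdict(list)
    for name, lat, lng in points:
        buckets[cell_id(lat, lng, resolution)].append(name)
    return dict(buckets)
```

Grouping on the ID turns "which points are in the same region" into an equality comparison, which is the property that makes such indexes convenient for partitioning.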

Postgres is an open-source relational database management system that provides robust support for geospatial data processing through the PostGIS extension. PostGIS enables the storage, indexing, and querying of geospatial data in Postgres, and it offers a range of geospatial functions to manipulate and analyze geospatial data.

However, the performance of PostGIS can be limited when dealing with large datasets or complex queries. This is where the Uber H3 library can be of great use. By integrating Uber H3 indexing with Postgres, we can improve the performance of PostGIS, especially for operations that involve partitioning of data and distance calculations.
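As a rough sketch of what such an integration could look like, the snippet below holds SQL for the h3-pg extension and runs it through any Python DB-API cursor. It assumes h3-pg version 4 function names (`h3_lat_lng_to_cell`, `h3_grid_disk`) and its `h3_postgis` companion extension; the table `stops`, its columns, the resolution 9, and the helper names are hypothetical and not taken from the talk. The query is a coarse candidate filter by cell, not an exact distance test.

```python
# Assumes the h3-pg extension (v4 function names) and PostGIS are available.
# The table "stops" with columns stop_id and geom (Point, SRID 4326) is hypothetical.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS h3",
    "CREATE EXTENSION IF NOT EXISTS h3_postgis CASCADE",
    "ALTER TABLE stops ADD COLUMN IF NOT EXISTS h3_cell h3index",
    # Resolution 9 is an arbitrary choice for illustration.
    "UPDATE stops SET h3_cell = h3_lat_lng_to_cell(geom, 9)",
    "CREATE INDEX IF NOT EXISTS stops_h3_cell_idx ON stops (h3_cell)",
]

# Candidate lookup: rows whose cell lies within k rings of the query cell.
NEARBY_QUERY = """
SELECT s.stop_id
FROM stops AS s
WHERE s.h3_cell IN (
    SELECT h3_grid_disk(
        h3_lat_lng_to_cell(
            ST_SetSRID(ST_MakePoint(%(lng)s, %(lat)s), 4326), 9),
        %(k)s
    )
)
"""


def prepare_h3_column(cursor):
    """Run the one-off setup statements on a DB-API cursor."""
    for statement in SETUP_STATEMENTS:
        cursor.execute(statement)


def find_nearby(cursor, lat, lng, k=1):
    """Return stop IDs in the query point's cell or within k rings of it."""
    cursor.execute(NEARBY_QUERY, {"lat": lat, "lng": lng, "k": k})
    return [row[0] for row in cursor.fetchall()]
```

Because the lookup is an indexed equality match on `h3_cell`, it can serve as a cheap pre-filter before an exact PostGIS distance check if one is needed.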

We propose to demonstrate the use of the Uber H3 indexing library in Postgres for geospatial data processing through a series of examples and benchmarks. The proposed presentation will showcase the benefits of using Uber H3 indexing for geospatial data processing in Postgres, such as improved query performance and better partitioning of data. We will also discuss the potential use cases and applications of this integration, such as location-based services, transportation, and urban planning.

The proposed presentation will be of interest to developers, data scientists, and geospatial analysts who work with geospatial data in Postgres. It will provide a practical guide to integrating Uber H3 indexing with Postgres, and offer insights into the performance gains and applications of this integration.

Use cases & applications
UBT C / N110 - Second Floor
06-29
16:30
30min
Time series raster data in PostgreSQL with the TimescaleDB and postgis_raster
Jashanpreet Singh

Raster data is a type of digital image data that is stored and processed as a grid of cells, each of which represents a specific area or location in the image. This grid is known as a raster or pixel grid, and each cell contains a value that represents a characteristic of the corresponding area or location in the image, such as color, elevation, temperature, or other attributes. Depending on the resolution of the data, raster file sizes can vary from a few MB to a few GB. Reading data from a large collection of rasters that also has a time dimension is therefore challenging.
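The size range follows directly from the grid dimensions. As a small worked example (the helper name `raster_size_bytes` is hypothetical, and the figures ignore compression and file headers), the uncompressed size of a raster is width × height × bands × bytes per cell:

```python
def raster_size_bytes(width, height, bands, bytes_per_cell):
    """Uncompressed size of a raster grid, ignoring headers and compression."""
    return width * height * bands * bytes_per_cell


# A single 10 m Sentinel-2 band: 10980 x 10980 cells, 16-bit values.
one_band = raster_size_bytes(10980, 10980, 1, 2)

# The same grid with four such bands.
four_bands = raster_size_bytes(10980, 10980, 4, 2)

print(round(one_band / 1024**2), "MiB")
print(round(four_bands / 1024**2), "MiB")
```

Multiplying a per-scene size like this by the number of acquisition dates gives a sense of how quickly a time series of rasters grows.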

PostgreSQL can be used to store time series raster datasets, which are raster datasets that have a time dimension associated with them. This can be useful for storing and analyzing raster data that changes over time, such as satellite images, climate data, or land cover change data.

To store time series raster datasets in PostgreSQL, we will use the postgis_raster extension, which provides support for storing and manipulating raster data in the database, and the TimescaleDB extension to add time series functionality to PostgreSQL, allowing us to store and query raster data with a time dimension.

Using the TimescaleDB extension, we will partition the raster table by converting it to a hypertable, the structure TimescaleDB uses to store and process time series data efficiently. This can help reduce query time.
For aggregated values from raster data over time and space, we will use the continuous aggregate feature of TimescaleDB, a form of materialized view, to pre-compute and store aggregates of the raster data over time.
Moreover, TimescaleDB supports compression, which can be very helpful when the data is large, as is usually the case with raster datasets in Postgres. Compression saves space in the database and can speed up some queries.
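One possible shape of the setup described above is sketched below as SQL run through a Python DB-API cursor. The table `raster_tiles`, the view `tile_daily_mean`, their columns, and the 30-day compression interval are all hypothetical and not the presenter's schema. The TimescaleDB calls (`create_hypertable`, `time_bucket`, `add_compression_policy`) and the PostGIS `ST_SummaryStats` function exist, but whether a raster function is accepted inside a continuous aggregate, and how well raster columns compress, are assumptions to verify against the installed versions.

```python
# Hypothetical schema: table, column, and view names are illustrative only.
STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS postgis_raster CASCADE",
    "CREATE EXTENSION IF NOT EXISTS timescaledb",
    """
    CREATE TABLE IF NOT EXISTS raster_tiles (
        acquired_at timestamptz NOT NULL,
        tile_id     integer     NOT NULL,
        rast        raster      NOT NULL
    )
    """,
    # Partition the table by time by turning it into a hypertable.
    "SELECT create_hypertable('raster_tiles', 'acquired_at', if_not_exists => TRUE)",
    # Continuous aggregate: daily mean cell value per tile.
    # WITH NO DATA defers the initial materialization.
    """
    CREATE MATERIALIZED VIEW tile_daily_mean
    WITH (timescaledb.continuous) AS
    SELECT time_bucket(INTERVAL '1 day', acquired_at) AS day,
           tile_id,
           avg((ST_SummaryStats(rast)).mean) AS mean_value
    FROM raster_tiles
    GROUP BY day, tile_id
    WITH NO DATA
    """,
    # Enable compression, then compress chunks older than 30 days.
    """
    ALTER TABLE raster_tiles
    SET (timescaledb.compress, timescaledb.compress_segmentby = 'tile_id')
    """,
    "SELECT add_compression_policy('raster_tiles', INTERVAL '30 days')",
]


def apply_schema(cursor):
    """Execute each statement in order and return how many were run."""
    for statement in STATEMENTS:
        cursor.execute(statement)
    return len(STATEMENTS)
```

With a driver such as psycopg, this would typically be run on a connection in autocommit mode, since some of these statements may not be allowed inside a transaction block.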

The proposed presentation will be of interest to developers, data scientists, and geospatial analysts who work with raster datasets. It will provide a practical guide to querying raster datasets in PostgreSQL with the TimescaleDB and postgis_raster extensions.

Use cases & applications
UBT C / N110 - Second Floor