FOSS4G 2023

Time series raster data in PostgreSQL with the TimescaleDB and postgis_raster
2023-06-29, 16:30–17:00 (Europe/Tirane), UBT C / N110 - Second Floor

Raster data is a type of digital image data that is stored and processed as a grid of cells, each of which represents a specific area or location in the image. This grid is known as a raster or pixel grid, and each cell contains a value that represents a characteristic of the corresponding area or location in the image, such as color, elevation, temperature, or other attributes. Depending upon the resolution of the data these raster file sizes can vary from a few MBs to few GBs. Hence reading data from a large set of raster dataset which has time dimension associated with it is challenging.

PostgreSQL can be used to store time series raster datasets, which are raster datasets that have a time dimension associated with them. This can be useful for storing and analyzing raster data that changes over time, such as satellite images, climate data, or land cover change data.

To store time series raster datasets in PostgreSQL, we will use the postgis_raster extension, which provides support for storing and manipulating raster data in the database, and the TimescaleDB extension to add time series functionality to PostgreSQL, allowing us to store and query raster data with a time dimension.

Using the TimeScaleDb extension we will partition the raster table by converting it to hypertable which is what TimescaleDB uses to optimally store and process time series data. This can help us to optimize query time.
For aggregated values from raster data over time and space, we will use the Continuous aggregate feature of TimescaleDB which is a form of materialized view to pre-compute and store raster data over time.
Moreover, TimescaleDB allows compression of data which can be very helpful in cases where the data is huge which is usually the case with raster datasets in postgres saving us space in the Database and optimizing some queries.

The proposed presentation will be of interest to developers, data scientists, and geospatial analysts who work with Raster datasets. It will provide a practical guide to querying the raster datasets in PostgreSQL with TimescaleDB and postgis_raster extension.

Over the past five years, I have honed my skills in the geospatial domain, gaining diverse experience in Climate tech startups, Agritech solutions, and Public urban transport planning. Throughout these experiences, I have heavily relied on Postgres + PostGIS, Python, and AWS technologies to drive my work.

My experience extends to working with satellite data (including Sentinel and Landsat), geospatial data modeling, and handling large datasets at scale in the cloud using Docker, Python, S3, etc

I am most interested in building a generic spatial-temporal database that can handle a wide variety of data and use cases (building digital earth in postgres)

This speaker also appears in: