FOSS4G NA 2024

Vector DBs and why should geospatial people care
09-09, 09:00–12:00 (America/Chicago), Deloitte Conference Room

Get started with shiny new AI/ML! This workshop is a hands-on introduction to AI/ML vector data and its use cases. Starting with unstructured data sets, we will create vectors, put them in PostgreSQL, and then use pgvector (and PostGIS) for analysis.


The recent rise of “AI,” and its potential impact, has made it the focus of much discussion (and hype). Vector databases play a vital role in this emerging area of technology, particularly in search and AI workflows. Some data stores have been built de novo just for vector data, and most of the major traditional data stores have added vector capabilities as well.

This workshop will get you hands-on with vector data and some of its use cases. You will be able to answer questions such as: What are these “vectors,” where do they come from, how do you query them, what are their use cases, and what role are they going to play in your existing infrastructure?
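As a rough illustration of what "querying vectors" means, the sketch below ranks a few items by cosine distance to a query vector. The three-dimensional vectors and the item names are invented for illustration; real embeddings come from a model and typically have hundreds of dimensions. This is not the workshop's code, only a minimal sketch of the idea.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query, items, k=2):
    """Return the names of the k items whose vectors are closest to the query."""
    ranked = sorted(items, key=lambda name: cosine_distance(query, items[name]))
    return ranked[:k]


# Toy 3-dimensional "embeddings", made up for this example only.
toy_vectors = {
    "river": [0.9, 0.1, 0.0],
    "lake": [0.8, 0.2, 0.1],
    "highway": [0.1, 0.9, 0.2],
}

query = [0.85, 0.15, 0.05]  # hypothetical vector for a water-related search
result = nearest(query, toy_vectors)
print(result)
```

A vector database performs the same kind of ranking, but over many more rows and usually with an index so that it does not have to compare the query against every stored vector.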

We will look at some unstructured data sets, then at their vectors, next put them in a vector store (PostgreSQL with pgvector), and finally run some fun and interesting queries. Bring your laptop and the ability to read Python code. If you are comfortable writing Python, you should be able to extend the code.

Come get your hands dirty with this innovative technology, discuss its pros and cons, and get a sense of how to pick a starting vector database.

Our vector database will be PostgreSQL with both PostGIS and pgvector installed. Both of the data sets used in the examples are open data sets:
1. https://alex.macrocosm.so/download
2. https://github.com/cvdfoundation/google-landmark

The programming will be in Python and SQL. We will run everything in devcontainers on GitHub.
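To give a feel for how Python and SQL might fit together, here is a minimal sketch that combines a PostGIS distance filter with a pgvector similarity ordering. The `landmarks` table, its columns, the 3-dimensional vector size, and the parameter values are all hypothetical and not taken from the workshop materials. The placeholder style (`%(name)s`) assumes a psycopg-style driver; the sketch only builds the SQL and parameters and does not connect to a database.

```python
def to_pgvector_literal(vec):
    """Format a Python sequence as pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"


# Hypothetical schema: pgvector registers itself as the "vector" extension.
SCHEMA_SQL = """
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS landmarks (
    id bigserial PRIMARY KEY,
    name text,
    geom geography(Point, 4326),
    embedding vector(3)  -- toy size; real embeddings are much larger
);
"""

# Keep rows within radius_m meters of a point, then order by cosine
# distance (pgvector's <=> operator) to the query embedding.
QUERY_SQL = """
SELECT name
FROM landmarks
WHERE ST_DWithin(
    geom,
    ST_SetSRID(ST_MakePoint(%(lon)s, %(lat)s), 4326)::geography,
    %(radius_m)s
)
ORDER BY embedding <=> %(query_vec)s::vector
LIMIT 5;
"""

# Hypothetical parameter values for illustration.
params = {
    "lon": -90.1994,
    "lat": 38.6270,
    "radius_m": 5000,
    "query_vec": to_pgvector_literal([0.85, 0.15, 0.05]),
}

print(params["query_vec"])
```

With a live connection, a driver call along the lines of `cursor.execute(QUERY_SQL, params)` would run the query. pgvector also provides `<->` for Euclidean distance and `<#>` for negative inner product if cosine distance is not the right fit.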