Portable Spatial-Semantic RAG for 3D City Models Using DuckDB
2026-09-02, Dahlia1

“Find me sunny buildings near Hiroshima Station.”
This may sound easy for AI, but it requires handling spatial relationships and 3D information, such as which way building surfaces face, and neither is straightforward.
This talk presents a portable DuckDB-based approach to 3D city model search that combines spatial, semantic, and geometric cues.


The Problem
Imagine asking: “Find me sunny buildings near Hiroshima Station.”
It sounds simple, but it actually involves two different problems.

The first is a spatial problem, such as “near Hiroshima Station.” Standard RAG ranks text by similarity, so it cannot evaluate spatial relationships such as distance, containment, or area-based search. For example, if you ask for buildings near Hiroshima Station, it treats “near” as just another word.

The second is a 3D problem, such as “sunny.” To answer this, we need some understanding of which way building surfaces face. Project PLATEAU, Japan’s urban digital twin initiative, provides detailed 3D building data in CityGML, including semantic surface labels for walls, roofs, and ground surfaces. In practice, PLATEAU data is often converted to simpler formats like GeoPackage so it works in standard GIS tools. In this conversion, all surfaces of a building are merged into one geometry. The result is much easier to work with, but the surface type labels are lost along the way.

Our Approach
We built a portable workflow that handles both spatial and semantic search in a single pipeline.

To address the first problem — spatial relationships such as “near Hiroshima Station” — we use DuckDB with its spatial and vss extensions so that geometry, attributes, embeddings, and indexes can be managed in a single .duckdb file. This allows geocoding, location-based filtering, and semantic ranking to be handled in one flow, without setting up a separate database server.
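
As a rough illustration of the filter-then-rank flow described above, the sketch below uses plain Python rather than DuckDB so that it runs without any extension installed. The station coordinates are approximate, and the building records, the two-dimensional stand-in "embeddings", and the `search` helper are all hypothetical; in the workflow itself these steps are handled inside DuckDB with the spatial and vss extensions.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def search(buildings, center, radius_m, query_vec, k=3):
    """Hypothetical helper: spatial filter first, then semantic ranking."""
    lat0, lon0 = center
    nearby = [
        b for b in buildings
        if haversine_m(lat0, lon0, b["lat"], b["lon"]) <= radius_m
    ]
    nearby.sort(key=lambda b: cosine(query_vec, b["vec"]), reverse=True)
    return nearby[:k]

# Approximate location of Hiroshima Station (illustrative only).
STATION = (34.3977, 132.4755)

# Toy records; "vec" stands in for a real text embedding.
BUILDINGS = [
    {"id": "A", "lat": 34.3980, "lon": 132.4760, "vec": [0.9, 0.1]},
    {"id": "B", "lat": 34.3985, "lon": 132.4750, "vec": [0.1, 0.9]},
    {"id": "C", "lat": 34.3500, "lon": 132.4500, "vec": [1.0, 0.0]},  # far away
]

results = search(BUILDINGS, STATION, radius_m=500, query_vec=[1.0, 0.0])
print([b["id"] for b in results])
```

Building C matches the query vector best but lies several kilometres away, so the spatial filter removes it before ranking. This is the behaviour that text similarity alone cannot provide.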

To address the second problem, conditions related to 3D building shape such as “sunny”, we added a step that recovers the lost surface information from the geometry itself. More specifically, we analyze each polygon face in the 3D geometry, compute its normal vector, classify the face as roof, ground, or wall, and estimate the compass direction each wall faces. We then encode this recovered information, together with attribute data, into searchable building descriptions and use them to generate embeddings.
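
The face-level step can be sketched roughly as follows. This is a minimal illustration, not the implementation used in the talk: it assumes a projected coordinate system with x pointing east, y north, and z up, and assumes each face ring is wound counter-clockwise when viewed from outside the building so that normals point outward. The normal is computed with Newell's method, and the 0.5 threshold on the vertical component and the eight compass bins are hypothetical choices.

```python
import math

COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]

def face_normal(ring):
    """Unit normal of a polygon ring [(x, y, z), ...] via Newell's method.

    Returns None for degenerate faces (zero area).
    """
    nx = ny = nz = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1, z1 = ring[i]
        x2, y2, z2 = ring[(i + 1) % n]
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        return None
    return (nx / length, ny / length, nz / length)

def classify_face(ring, vertical_threshold=0.5):
    """Return (label, direction): label is 'roof', 'ground', or 'wall'.

    direction is a compass bin for walls and None otherwise.
    vertical_threshold is an assumed cut-off on the normal's z component.
    """
    normal = face_normal(ring)
    if normal is None:
        return (None, None)
    nx, ny, nz = normal
    if nz >= vertical_threshold:
        return ("roof", None)
    if nz <= -vertical_threshold:
        return ("ground", None)
    # Azimuth in degrees clockwise from north (x = east, y = north).
    azimuth = math.degrees(math.atan2(nx, ny)) % 360.0
    return ("wall", COMPASS[round(azimuth / 45.0) % 8])

# Faces of a unit box, each wound counter-clockwise as seen from outside.
roof = [(0, 0, 3), (1, 0, 3), (1, 1, 3), (0, 1, 3)]
ground = [(0, 1, 0), (1, 1, 0), (1, 0, 0), (0, 0, 0)]
south_wall = [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)]

print(classify_face(roof), classify_face(ground), classify_face(south_wall))
```

If the input rings are wound the other way, every normal flips and roofs are reported as ground, so the winding convention of the converted data would need to be checked before relying on the labels.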

This face-level step is implemented by combining DuckDB with Python-based geometry processing. As a result, the overall workflow remains portable without requiring a separate spatial database server.
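
One possible way to turn recovered face labels and attributes into a searchable description is sketched below. The attribute keys (`id`, `usage`, `height_m`), the wording of the output, and the choice to pick the dominant direction by face count rather than by wall area are all assumptions made for illustration; the talk's actual description format may differ.

```python
from collections import Counter

def describe_building(attrs, faces):
    """Build a short text description from attributes and face labels.

    attrs: dict with hypothetical keys 'id', 'usage', 'height_m'.
    faces: list of (label, direction) tuples, e.g. ('wall', 'S').
    """
    # Count how many wall faces point in each compass direction.
    walls = Counter(d for label, d in faces if label == "wall" and d)

    parts = [f"Building {attrs['id']}"]
    if "usage" in attrs:
        parts.append(f"usage: {attrs['usage']}")
    if "height_m" in attrs:
        parts.append(f"height: {attrs['height_m']} m")
    if walls:
        dominant, _ = walls.most_common(1)[0]
        parts.append(f"walls facing: {', '.join(sorted(walls))}")
        parts.append(f"dominant wall direction: {dominant}")
    return "; ".join(parts) + "."

example_attrs = {"id": "b1", "usage": "office", "height_m": 12.5}
example_faces = [
    ("roof", None),
    ("ground", None),
    ("wall", "S"),
    ("wall", "S"),
    ("wall", "N"),
]

description = describe_building(example_attrs, example_faces)
print(description)
```

A string like this would then be passed to an embedding model, so that a query mentioning sunlight or south-facing walls has text to match against.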

What This Talk Covers
We validated this workflow on PLATEAU data covering the area around Hiroshima Station. The talk introduces the full workflow, shares what worked well and what was difficult, and discusses how this approach can be used in practice.


Level of technical complexity: 3 - advanced

Give indication of resources (video, web pages, papers, etc.) to read in advance, that will help get up to speed on advanced topics:

Project PLATEAU
https://www.mlit.go.jp/plateau/en/

DuckDB Spatial Extension
https://duckdb.org/docs/stable/core_extensions/spatial/overview

DuckDB VSS Extension
https://duckdb.org/docs/stable/core_extensions/vss

Indicate what is (are) the open source project(s) essential in your talk:
  • DuckDB (with spatial and vss extensions)
  • Python
  • MapLibre GL JS
  • PLATEAU GIS Converter
I make my conference contribution available under the CC BY 4.0 license. The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials as well as the video recording and live transmission of the presentation.

I am an engineer specializing in GIS, remote sensing, and AI at a Japanese aerial surveying company. I combine satellite imagery, aerial photography, and machine learning to tackle real-world challenges through geospatial technology.
