Build a GraphRAG To Bring Geospatial Awareness to LLM Agents
11-03, 09:00–12:00 (America/New_York), Lake Thoreau

Graph-based retrieval-augmented generation (GraphRAG) can capture the semantics of geoinformatics data, which can be leveraged by LLMs. We will demonstrate how to build a GraphRAG and how to write prompts so that an LLM Agent can utilize it.


Large Language Models (LLMs) have the potential to address pressing global challenges, including public health, economic development, and disaster resilience. To be used effectively in these domains, however, LLMs require access to comprehensive geoinformatics data that is integrated by location. Such data enables LLMs to provide near-real-time decision support for problems that vary with local geographic context. Achieving this sustainably and at scale requires feature-level metadata, yet geospatial data is currently managed at the feature class level.

For many years, we have treated data quality as an analytics problem, delegating dirty data to the data team for cleanup in the data warehouse or lake. This approach is not suitable for AI applications. GenAI applications operate in real time, making decisions on the fly. If the data is incorrect, incomplete, or poorly structured, AI will not rectify it. Instead, it will make erroneous decisions more rapidly. You can’t wait until the analytics layer to ensure data quality when AI agents need to reason, plan, and act in real time.

Building on the workshop given at FedGeoDay 2025, this session teaches participants a hands-on approach that uses open-source software to publish feature-level metadata following the data mesh architecture pattern. We will build a spatial knowledge graph (SKG) from feature classes. The SKG represents semantic geospatial relationships across entire networks of features in multiple domains, and it can be updated in real time to reveal the downstream impact or cumulative effect of events of interest.
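
To give a feel for what a feature-level graph looks like, here is a minimal sketch in plain Python. It is not the workshop's actual tooling. The feature records, the `http://example.org/skg/` namespace, and the `feeds` relationship are all hypothetical; only the GeoSPARQL terms (`geo:Feature`, `geo:hasGeometry`, `geo:asWKT`) come from the standard vocabulary. Each feature becomes a handful of triples, and a breadth-first walk over one relationship approximates a "downstream impact" query.

```python
from collections import deque

# GeoSPARQL and RDF vocabulary IRIs are standard; EX is a made-up namespace.
GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/skg/"

# Hypothetical features taken from three different feature classes.
FEATURES = [
    {"id": "substation-1", "cls": "Substation", "wkt": "POINT(-77.35 38.95)"},
    {"id": "hospital-7", "cls": "Hospital", "wkt": "POINT(-77.36 38.96)"},
    {"id": "shelter-3", "cls": "Shelter", "wkt": "POINT(-77.34 38.94)"},
]

# Hypothetical dependency edges as (upstream id, downstream id).
FEEDS = [("substation-1", "hospital-7"), ("hospital-7", "shelter-3")]


def build_triples(features, feeds):
    """Turn feature records and dependency edges into (s, p, o) triples."""
    triples = []
    for f in features:
        subject = EX + f["id"]
        geometry = subject + "/geom"
        triples.append((subject, RDF_TYPE, EX + f["cls"]))
        triples.append((subject, RDF_TYPE, GEO + "Feature"))
        triples.append((subject, GEO + "hasGeometry", geometry))
        # In real RDF this object would be a literal typed geo:wktLiteral.
        triples.append((geometry, GEO + "asWKT", f["wkt"]))
    for upstream, downstream_id in feeds:
        triples.append((EX + upstream, EX + "feeds", EX + downstream_id))
    return triples


def downstream(triples, start, predicate):
    """Breadth-first walk along `predicate` edges; returns nodes in visit order."""
    edges = {}
    for s, p, o in triples:
        if p == predicate:
            edges.setdefault(s, []).append(o)
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


triples = build_triples(FEATURES, FEEDS)
impacted = downstream(triples, EX + "substation-1", EX + "feeds")
print(impacted)
```

In practice the triples would live in an RDF store and the traversal would be a query; the tuple list here only stands in for that store so the idea can be run without any dependencies.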

We will then publish the SKG as a GraphRAG and learn to write prompts that teach an LLM Agent to generate GeoSPARQL queries from natural language and then convert the query results into readable text. Finally, we will learn how to display the features referenced in natural-language answers on a web map!
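
The three steps in this last part (prompting for a GeoSPARQL query, turning results into text, and mapping the features) can be sketched roughly as follows. This is an illustration under stated assumptions, not the workshop's material. The prompt wording, the schema string, the `?name` and `?wkt` variable names, and the sample results are all hypothetical. The results are assumed to arrive in the standard SPARQL JSON results layout, and only WKT `POINT` geometries in longitude/latitude order are handled. No LLM or SPARQL endpoint is called; the sketch shows only the text going in and the data coming out.

```python
import json
import re

# Hypothetical prompt; the two PREFIX lines use the standard GeoSPARQL IRIs.
PROMPT_TEMPLATE = """You translate questions into GeoSPARQL queries.
Use only these prefixes and the classes and properties listed in the schema.

PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

Schema:
{schema}

Rules:
- Return a single SELECT query and nothing else.
- Always select ?name and ?wkt so the results can be mapped.

Question: {question}
"""

# Matches "POINT(x y)", tolerating an optional CRS IRI before it.
POINT_RE = re.compile(r"POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)")


def build_prompt(question, schema):
    """Fill the template with a schema summary and the user's question."""
    return PROMPT_TEMPLATE.format(schema=schema, question=question)


def bindings_to_text(results):
    """Summarize SPARQL JSON results as one readable sentence."""
    names = [row["name"]["value"] for row in results["results"]["bindings"]]
    if not names:
        return "No matching features were found."
    return "Matching features: " + ", ".join(names) + "."


def bindings_to_geojson(results):
    """Convert rows with ?name and ?wkt into a GeoJSON FeatureCollection."""
    features = []
    for row in results["results"]["bindings"]:
        match = POINT_RE.search(row["wkt"]["value"])
        if match is None:
            continue  # this sketch skips anything that is not a POINT
        lon, lat = float(match.group(1)), float(match.group(2))
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": row["name"]["value"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Hypothetical schema summary and hand-written results for illustration.
schema = "ex:Hospital (geo:Feature), ex:name, geo:hasGeometry / geo:asWKT"
prompt = build_prompt("Which hospitals are within the flood zone?", schema)

sample_results = {
    "head": {"vars": ["name", "wkt"]},
    "results": {"bindings": [
        {"name": {"type": "literal", "value": "Hospital 7"},
         "wkt": {"type": "literal", "value": "POINT(-77.36 38.96)"}},
    ]},
}

answer = bindings_to_text(sample_results)
geojson = bindings_to_geojson(sample_results)
print(answer)
print(json.dumps(geojson, indent=2))
```

The resulting FeatureCollection is the kind of object most web mapping libraries accept directly as a layer, which is why the prompt asks the model to always return a geometry alongside each name.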