Awakening Dormant Geospatial Data: Structuring Large-Scale Government Documents with LLM
FOSS4G 2026 general tracks

2026-09-02 15:30–16:00, Ran1

We use LLMs to extract and structure geospatial data buried in 100K–1M+ PDF and Office files held by Japan's Ministry of Land, Infrastructure, Transport and Tourism (MLIT), enabling visualization, spatial analysis, and evidence-based policymaking. The talk demonstrates this through real-world use cases, with no coding required.

By combining LLM-based structuring with spatial joins, we achieve robust data integration that goes beyond simple text matching.
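
To illustrate why a spatial join can succeed where text matching fails, here is a minimal sketch in plain Python. It is not the implementation presented in the talk: the district polygons, record fields, and function names are all hypothetical, and it assumes each extracted record has already been geocoded to a longitude/latitude pair. Records are assigned to a district by location, so inconsistent or missing place names in the source documents do not affect the result.

```python
# Hypothetical district polygons as lists of (longitude, latitude) vertices.
DISTRICTS = {
    "District A": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "District B": [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
}

# Hypothetical records extracted from documents and already geocoded.
# The free-text "place" values are deliberately inconsistent.
RECORDS = [
    {"doc": "report_001.pdf", "place": "Dist. A riverside", "lon": 1.0, "lat": 1.0},
    {"doc": "sheet_002.xlsx", "place": "unnamed site", "lon": 3.0, "lat": 0.5},
    {"doc": "memo_003.docx", "place": "District A (old name)", "lon": 5.0, "lat": 5.0},
]


def point_in_polygon(x, y, polygon):
    """Ray casting: toggle 'inside' each time a ray cast to the right crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(records, districts):
    """Attach the name of the containing district to each record, or None."""
    joined = []
    for rec in records:
        match = next(
            (name for name, poly in districts.items()
             if point_in_polygon(rec["lon"], rec["lat"], poly)),
            None,
        )
        joined.append({**rec, "district": match})
    return joined


if __name__ == "__main__":
    for row in spatial_join(RECORDS, DISTRICTS):
        print(row["doc"], "->", row["district"])
```

The third record mentions "District A" in its text but lies outside both polygons, so the join leaves it unassigned. Matching on the place-name string alone would have placed it in District A.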

Key features:

Batch structuring and high-speed parallel processing of PDF, Excel, Word, and PowerPoint files using LLMs
Data cleansing, geocoding, and spatial joins to reconstruct documents as geospatial data
Spatial analysis and visualization leveraging the rich geographic density unique to MLIT datasets
End-to-end pipeline from data extraction through anonymization to open data publication
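
As a rough sketch of the batch structuring step listed above, the following Python example fans a list of files out to a thread pool. It is an assumption-laden illustration, not the talk's pipeline: `structure_document` is a hypothetical placeholder that makes no LLM call, and the supported extensions and worker count are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import PurePath

# Hypothetical set of file types the pipeline accepts.
SUPPORTED = {".pdf", ".xlsx", ".docx", ".pptx"}


def structure_document(path):
    """Placeholder for the LLM structuring step (hypothetical).

    A real pipeline would send the file's contents to an LLM and parse the
    response. This stub only records the file type so the sketch runs offline.
    """
    suffix = PurePath(path).suffix.lower()
    if suffix not in SUPPORTED:
        return {"source": path, "status": "skipped"}
    return {"source": path, "status": "structured", "format": suffix.lstrip(".")}


def structure_batch(paths, max_workers=8):
    """Process files concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(structure_document, paths))


if __name__ == "__main__":
    for item in structure_batch(["plan.pdf", "budget.xlsx", "notes.txt"]):
        print(item)
```

Threads are a reasonable fit here on the assumption that the structuring step is dominated by waiting on network calls to an LLM service. A CPU-bound step such as local OCR would more likely call for a process pool.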

Who Should Attend:

Government and municipal officials interested in digital transformation and data infrastructure
Researchers and think tank professionals involved in EBPM
Data engineers, GIS developers, and no-code/low-code developers
Startups and corporate representatives working on projects that utilize open or public data

Level of technical complexity: 1 - beginner

I make my conference contribution available under the CC BY 4.0 license. The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials, and the video recording and live transmission of the presentation.

Kazuma Tsuchiya

I am leading the development of an application to be introduced in the talk.

Kasra Mahsouli

Tomohiro Akiya

I'm a software engineer at Eukarya Inc. I handle the development of LINKS-Veda, a government-facing platform built on GCP with LLM and OCR capabilities.
