2026-09-03, Conference Management Room3
Nominatim struggles with CJK place name search. Searching for "広島平和記念資料館" (Hiroshima Peace Memorial Museum) returns no results.
Its token-based search requires exact token matches and depends on complete alternative name data - which is often missing in OSM.
I demonstrate how PostgreSQL extensions can fix this with minimal code changes.
Nominatim is the default geocoder for OpenStreetMap and works well for many languages.
However, searching for place names in CJK (Chinese, Japanese, Korean) languages often produces poor results.
The Problem
Nominatim's token-based search has several limitations:
- Search queries must exactly match registered tokens.
- Alternative name fields (alt_name, short_name) could help, but they are often left empty.
- In rare cases, ICU transliteration produces incorrect values for the original name.
When any of these issues applies, searching for "広島平和記念資料館" (Hiroshima Peace Memorial Museum) returns no results - even though two matching features exist.
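As a rough illustration of the first limitation, the toy sketch below contrasts exact token matching with the n-gram style matching that full-text engines commonly use. It is not Nominatim's tokenizer: the index entry, the partial query, and all function names are hypothetical, and bigram containment is only an approximation of substring search.

```python
def exact_token_match(query_tokens, indexed_tokens):
    """Return True only if every query token equals some indexed token."""
    return all(token in indexed_tokens for token in query_tokens)


def bigrams(text):
    """Split text into overlapping two-character pieces."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def bigram_match(query, name):
    """Return True if every bigram of the query also occurs in the name."""
    query_grams = bigrams(query)
    return bool(query_grams) and query_grams <= bigrams(name)


# Hypothetical index entry: the full name stored as a single token.
indexed_tokens = {"広島平和記念資料館"}
# Hypothetical partial name typed by a user.
query = "平和記念資料館"

exact_hit = exact_token_match([query], indexed_tokens)
ngram_hit = bigram_match(query, "広島平和記念資料館")
```

Under these assumptions the exact comparison misses the feature while the bigram comparison finds it, which is the gap the full-text approach is meant to close.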
The Solution
I added full-text search capability to Nominatim using a PostgreSQL extension.
With this extension and a small set of code changes, searching for "広島平和記念資料館" now returns the two expected results quickly.
Specifically, I achieved full-text search that works alongside Nominatim's existing token-based search with the following changes:
- Adding one text column to the existing search table
- Creating an index on that column
- Modifying three source files (a few lines each)
(This is a proof-of-concept patch. Further work would be needed for upstream inclusion in Nominatim.)
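The database side of the changes listed above might look roughly like the following sketch, which only builds the SQL statements as strings. The table name `search_name`, the column name `name_text`, the `place_id` column, and the helper names are assumptions made for illustration; the actual patch may differ. `USING pgroonga` and the `&@` operator are PGroonga's index method and keyword-match operator.

```python
def fulltext_patch_sql(table="search_name", column="name_text"):
    """Build the schema statements: extension, one text column, one index.

    The default table and column names are assumptions, not taken from the patch.
    """
    return [
        "CREATE EXTENSION IF NOT EXISTS pgroonga;",
        f"ALTER TABLE {table} ADD COLUMN {column} text;",
        f"CREATE INDEX {table}_{column}_pgroonga_idx "
        f"ON {table} USING pgroonga ({column});",
    ]


def fulltext_query_sql(table="search_name", column="name_text"):
    """Build a lookup that uses PGroonga's keyword-match operator (&@).

    $1 is PostgreSQL's positional parameter; a driver may use another style.
    """
    return f"SELECT place_id FROM {table} WHERE {column} &@ $1;"


statements = fulltext_patch_sql()
lookup = fulltext_query_sql()
```

The statements could then be run once at import time and the lookup issued next to the existing token-based query, with the search string bound as the parameter.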
The advantage of this approach is that search accuracy improves using existing data alone - without requiring additional alt_name or short_name entries in OSM.
Why PostgreSQL Extensions
Nominatim's strength is that it runs on PostgreSQL alone, without requiring additional services.
My approach preserves this by using a PostgreSQL extension rather than adding an external search engine.
This keeps the deployment simple and the architecture unchanged.
PostgreSQL offers many extensions that could improve geocoding - not just for CJK, but for fuzzy matching, typo tolerance, and more.
I suggest that Nominatim could benefit from a plugin mechanism that allows communities to plug in the extensions best suited for their language and use case.
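One possible shape for such a plugin mechanism is sketched below, purely as a thought experiment: a small registry in which each entry names an extension, an index definition, and a match operator. Nothing like this exists in Nominatim today; `SearchPlugin`, `PLUGINS`, and the table and column names are hypothetical. The PGroonga entry uses its `&@` operator and the pg_trgm entry uses its `%` similarity operator with a `gin_trgm_ops` index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchPlugin:
    """Hypothetical description of one extension-backed search strategy."""
    extension: str  # name passed to CREATE EXTENSION
    index_sql: str  # template with {table} and {column} placeholders
    operator: str   # SQL operator comparing the column to the query

    def setup_sql(self, table, column):
        """Return the statements that enable the extension and build the index."""
        return [
            f"CREATE EXTENSION IF NOT EXISTS {self.extension};",
            self.index_sql.format(table=table, column=column),
        ]

    def where_clause(self, column):
        """Return the WHERE fragment; $1 is the bound search string."""
        return f"{column} {self.operator} $1"


# A community would pick the entry that suits its language and use case.
PLUGINS = {
    "fulltext": SearchPlugin(
        "pgroonga",
        "CREATE INDEX ON {table} USING pgroonga ({column});",
        "&@",
    ),
    "fuzzy": SearchPlugin(
        "pg_trgm",
        "CREATE INDEX ON {table} USING gin ({column} gin_trgm_ops);",
        "%",
    ),
}
```

Keeping each strategy as plain data would leave the core search code unchanged while letting a deployment choose full-text matching for CJK names or trigram similarity for typo tolerance.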
Broader Impact
This is not a Japan-only issue.
My analysis confirms that CJK search limitations and sparse alternative name data are common across Japanese, Chinese, and Korean OSM data.
The approach presented is applicable to all three languages.
I also discuss the complementary need for better OSM data.
Technical improvements and community data contributions go hand in hand - improving search technology helps even when data is incomplete, and enriching data makes all search approaches work better.
Open Source Projects
- Nominatim ( https://nominatim.org/ )
- PostgreSQL ( https://www.postgresql.org/ )
- PGroonga ( https://pgroonga.github.io/ ) - used as the PostgreSQL extension in this proof of concept
- OpenStreetMap ( https://www.openstreetmap.org )
- Nominatim documentation: https://nominatim.org/release-docs/latest/api/Search/
- PGroonga documentation: https://pgroonga.github.io/
Software engineer in Japan.
Working on FOSS development, especially full-text search.