Eurostat vs OSM vs Census: Choosing Open Mobility Data for Urban Function Maps
2026-09-03, Conference Management Room2

Which open mobility dataset should you trust for urban function analysis? Using Copernicus Urban Atlas polygons, we compare Eurostat experimental MNO statistics, OpenStreetMap GPS traces, and census commuting flows. You will learn the biases of each source and get a fully reproducible Python/PostGIS workflow.


Open mobility data is everywhere, but different sources answer different questions -- and the differences are easy to miss until results conflict. This talk presents a reproducible, open-source workflow to compare three widely accessible mobility proxies on a common spatial reference.

Using Copernicus Urban Atlas polygons and the DEGURBA classification (derived from 1 km² population grid cells), we convert each source into comparable density-normalised temporal indicators (day/night ratios, intraday profiles, weekday/weekend patterns). The three sources:

  • Eurostat experimental MNO statistics (aggregated, anonymised mobile network operator statistics published by national statistical offices)
  • OpenStreetMap public GPS traces (community-contributed trace archive; participation bias applies)
  • Census commuting flows from Eurostat (static origin-destination baseline)
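As a rough illustration of what "density-normalised temporal indicators" can mean in practice, the sketch below derives a day/night ratio and an intraday share profile from 24 hourly presence counts for one polygon. The function names, the 08:00-20:00 day window, and the sample numbers are assumptions made for this example; the talk does not specify them.

```python
from statistics import mean

# Assumed day window (08:00-19:59); the talk does not define one.
DAY_HOURS = range(8, 20)


def density_profile(hourly_counts, area_km2):
    """Convert 24 hourly presence counts into counts per km²."""
    if len(hourly_counts) != 24:
        raise ValueError("expected 24 hourly values")
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return [count / area_km2 for count in hourly_counts]


def day_night_ratio(profile):
    """Mean daytime density divided by mean night-time density."""
    day = mean(profile[h] for h in DAY_HOURS)
    night = mean(profile[h] for h in range(24) if h not in DAY_HOURS)
    return day / night if night > 0 else float("inf")


def intraday_shares(profile):
    """Rescale a profile so its 24 values sum to 1 (shape, not volume)."""
    total = sum(profile)
    return [v / total for v in profile] if total > 0 else [0.0] * 24


# Hypothetical office-like polygon of 2 km²: quiet at night, busy by day.
office_counts = [10] * 8 + [100] * 12 + [10] * 4
office_profile = density_profile(office_counts, area_km2=2.0)
office_ratio = day_night_ratio(office_profile)
office_shares = intraday_shares(office_profile)
```

A ratio well above 1 points to a daytime-dominated polygon, and a ratio below 1 to a night-time (residential-like) one. The same functions could be applied to weekday and weekend counts separately to obtain a weekday/weekend pattern.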

We apply a simple clustering step (HDBSCAN) to group similar temporal profiles into functional signatures (residential, office, late-evening activity, mixed-use), and use UMAP only as a visual explanation aid. Instead of selling one "best" dataset, we provide a practical decision guide: which source is best for presence vs flows, what biases to expect, and how to combine sources when a single source falls short.

Early findings: MNO statistics capture temporal presence well but availability and comparability vary by country; OSM GPS traces reflect contributor behaviour more than population-level patterns; census flows miss intraday dynamics but anchor the OD baseline. Where sources agree, classification is robust; where they diverge, the divergence reveals structural data limitations worth knowing.
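One simple way to quantify whether two sources "agree" for a polygon is to correlate their intraday profiles. The sketch below uses Pearson correlation with a cut-off of 0.8; both the metric and the threshold are assumptions for illustration, and the talk does not state how agreement is measured.

```python
from math import sqrt

# Hypothetical cut-off; not taken from the talk.
AGREEMENT_THRESHOLD = 0.8


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / sqrt(var_x * var_y)


def sources_agree(profile_a, profile_b, threshold=AGREEMENT_THRESHOLD):
    """True when two intraday profiles have a similar shape."""
    return pearson(profile_a, profile_b) >= threshold


# Made-up profiles for one polygon: same shape at different volumes,
# plus an inverted (night-heavy) profile.
mno_profile = [1.0] * 8 + [5.0] * 12 + [1.0] * 4
osm_profile = [3.0 * v for v in mno_profile]
inverted_profile = [5.0] * 8 + [1.0] * 12 + [5.0] * 4
```

Because correlation ignores scale, `mno_profile` and `osm_profile` count as agreeing even though their volumes differ, which suits sources with very different sample sizes.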

What you take away:

  • A decision matrix with concrete rules of thumb (e.g., intraday presence -- start with MNO; commuting structure -- census OD; fine-grain routes/activities -- OSM, with known participation bias)
  • Typical biases and coverage gaps of each source across different European urban contexts
  • A reproducible pipeline: Python + PostGIS + OSRM + QGIS-ready layers and scripts
  • An open repository with the full pipeline, to be published (Docker/conda, pinned environments, end-to-end scripts to regenerate figures and maps)
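The rules of thumb in the first takeaway can be written down as a small lookup table. The sketch below encodes only the three rules and caveats named in this abstract; the dictionary layout and the `recommend` helper are hypothetical, not part of the published pipeline.

```python
# Question -> (suggested starting source, main caveat), as stated in the abstract.
DECISION_MATRIX = {
    "intraday presence": (
        "MNO statistics",
        "availability and comparability vary by country",
    ),
    "commuting structure": (
        "census OD flows",
        "static baseline that misses intraday dynamics",
    ),
    "fine-grain routes/activities": (
        "OSM GPS traces",
        "known participation bias",
    ),
}


def recommend(question):
    """Return the suggested starting source for an analysis question."""
    key = question.strip().lower()
    if key not in DECISION_MATRIX:
        raise KeyError(f"no rule of thumb for: {question!r}")
    source, caveat = DECISION_MATRIX[key]
    return f"{source} (caveat: {caveat})"


print(recommend("Intraday presence"))
```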

This talk is for anyone choosing open mobility data for urban analysis who wants to stop guessing and start comparing.


Level of technical complexity: 2 - intermediate

Give indication of resources (video, web pages, papers, etc.) to read in advance, that will help get up to speed on advanced topics:

Optional advance reading (for context, not required):

1) Eurostat / ESS position paper on MNO data for official statistics (methodology, challenges, comparability):
https://ec.europa.eu/eurostat/documents/7870049/17468840/KS-FT-23-001-EN-N.pdf

2) DEGURBA methodology (Degree of Urbanisation; derived from 1 km² population grid cells):
https://ec.europa.eu/eurostat/web/degree-of-urbanisation/methodology

3) Copernicus Land Monitoring Service – Urban Atlas product page:
https://land.copernicus.eu/en/products/urban-atlas

4) Copernicus/CLMS data policy (attribution, modifications, no-endorsement):
https://land.copernicus.eu/en/data-policy

5) OpenStreetMap copyright & license (ODbL, attribution requirements):
https://www.openstreetmap.org/copyright

6) OSRM backend (routing engine for travel-time/distance):
https://github.com/Project-OSRM/osrm-backend

Indicate what is (are) the open source project(s) essential in your talk:

PostgreSQL + PostGIS; QGIS; Python; GeoPandas; OSRM; OpenStreetMap; Docker; scikit-learn; hdbscan; umap-learn

I make my conference contribution available under the CC BY 4.0 license. The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials as well as the video recording and live transmission of the presentation:

Senior Geospatial Analyst at Rockup, building neighborhood analytics tools. Previously Geospatial Researcher (R&D) at Habidatum, developing cross-country urban mobility pipelines for European policy institutions — OD matrices, temporal land-use profiling, service accessibility mapping across 16 countries. Former geospatial data scientist at Yandex, where I built GeoAI prediction models and led spatial feature engineering for service expansion. Invited lecturer on geospatial data science (MIPT Deep Learning School) and QGIS (RheinMain University, Germany). Jury member at IAAC Barcelona. Daily tools: Python, GeoPandas, PostGIS, QGIS. Admitted to MSc Geomatics at TU Delft. I run URBAN_MASH, a geoanalytics community (2,200+ subscribers).
