FOSS4G 2022 general tracks

Not too big, not too small: open source geospatial units that are just right
08-25, 17:55–18:00 (Europe/Rome), Room 4

Publicly available data tends to be spatially aggregated to administrative units, limiting the feasibility of nuanced analyses that reflect the natural state of communities and provide actionable insights for a wide range of stakeholders. While higher resolution data is generally available within government agencies, access for external researchers is limited due to well-established privacy concerns. Inspired by our own use case of developing a regional quality of life metric for neighborhoods in Denmark, our team at Aalborg University’s Department of the Built Environment, in collaboration with data.org’s Growth and Recovery Challenge, and Data Clinic, set out to develop and open source not only foundational granular spatial units and data that adhere to privacy laws, but also the accompanying methodology that has the potential for broad applicability in other countries.

In this presentation, we will demonstrate the methodology’s generalizability, particularly across common European land use and geographical features, and show how the resulting high-resolution shapefiles and community data can become crucial tools for government decision-makers, community organizations, and researchers in their efforts to increase transparency and engage in practical, actionable research.

Focused initially on our Denmark use case, we algorithmically create spatial units with minimum household and population counts from country-wide hectare cell level data. Our approach uses data on road networks and administrative boundaries to create socially meaningful component polygons, built with tools based on existing open source packages available in R and Python. The hectare cells are then mapped onto the polygons and clustered using the max-p regionalization algorithm, with constraints on the minimum population and household counts, to arrive at the final set of spatial units.
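To make the clustering step concrete, here is a minimal Python sketch of threshold-constrained region growing. It is not the project's pipeline and not the full max-p algorithm: it only mimics the greedy construction idea (grow contiguous regions until both minimums are met, then attach leftover cells to a neighboring region) and skips the optimization that max-p performs. The function name `grow_regions`, the 4x4 toy grid, and the counts and thresholds are all hypothetical, chosen purely for illustration.

```python
def grow_regions(cells, neighbors, min_pop, min_hh):
    """Greedy, contiguity-preserving grouping of cells into regions.

    cells:     {cell_id: (population, households)}
    neighbors: {cell_id: set of adjacent cell_ids}
    Returns (regions, leftover): a list of member lists, plus any cells
    that could not be attached to a region.
    """
    label = {}    # cell_id -> index into regions
    regions = []

    for seed in sorted(cells):
        if seed in label:
            continue
        members, claimed = [seed], {seed}
        pop, hh = cells[seed]
        frontier = sorted(n for n in neighbors[seed] if n not in label)
        # Absorb adjacent unassigned cells until both minimums are met.
        while (pop < min_pop or hh < min_hh) and frontier:
            nxt = frontier.pop(0)
            if nxt in claimed:
                continue
            claimed.add(nxt)
            members.append(nxt)
            pop += cells[nxt][0]
            hh += cells[nxt][1]
            frontier.extend(sorted(
                n for n in neighbors[nxt]
                if n not in label and n not in claimed))
        if pop >= min_pop and hh >= min_hh:
            for m in members:
                label[m] = len(regions)
            regions.append(members)
        # Otherwise the cells stay unassigned and are handled below.

    # Attach leftover cells to an adjacent region, so that no region
    # falls below the minimum counts.
    leftover = [c for c in sorted(cells) if c not in label]
    changed = True
    while leftover and changed:
        changed = False
        for c in list(leftover):
            adjacent = sorted({label[n] for n in neighbors[c] if n in label})
            if adjacent:
                label[c] = adjacent[0]
                regions[adjacent[0]].append(c)
                leftover.remove(c)
                changed = True
    return regions, leftover


# Hypothetical toy input: a 4x4 grid of hectare cells, each with
# 30 residents and 12 households, connected by shared edges.
size = 4
cells = {(r, c): (30, 12) for r in range(size) for c in range(size)}
neighbors = {
    (r, c): {(r + dr, c + dc)
             for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
             if (r + dr, c + dc) in cells}
    for (r, c) in cells
}
regions, leftover = grow_regions(cells, neighbors, min_pop=100, min_hh=40)
for members in regions:
    print(len(members), sum(cells[m][0] for m in members))
```

In a real pipeline the adjacency would come from the road-bounded polygons rather than a regular grid, and an existing max-p implementation (for example the one in the PySAL ecosystem) would likely replace this greedy pass, since max-p additionally tries to maximize the number of regions that satisfy the constraints.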

To improve the accessibility of this data to not just researchers but also administrative decision-makers, community organizations, and the general public, we are developing an online tool to explore and visualize indicators within the resulting fine-grained regions such as disposable income, educational level, housing prices, migration rates, distances to public institutions, and labor market attachments in Denmark. Regional inequality in Denmark has increased over time, and with the help of this tool, we hope to provide the ability to study these key metrics both within and across municipal regions. In the development of the tool, we prioritize user feedback and common use cases to ensure both applicability and longevity.

This project has been developed with an open-source mindset by: 1) creating flexible open data resources that can adapt to a wide range of public use cases; 2) open sourcing the methodology for use in other countries and regions; and 3) enabling the use of existing open data and tools such as OpenStreetMap, R, and Python in the pipeline.

We firmly believe that the project has the potential to improve knowledge sharing and collaboration between GIS experts, decision-makers, researchers and the general public not only in Denmark, but also in Europe and beyond.

Postdoc at Aalborg University (BUILD), PhD in Social Statistics from the University of Manchester.

Fields of interest include:
- Bayesian methods, structural equation modelling and spatial modelling
- Survey research methods and measurement invariance in the social and political sciences
- Data visualisation and dashboards

Head of Operations of Data Clinic | Two Sigma