FOSS4G 2022 academic track

Analysis of the spatiotemporal accumulation process of Mapillary data and its relationship with OSM road data: A case study in Japan
08-24, 15:20–15:25 (Europe/Rome), Room Hall 3A

Japan's open infrastructure map development using OpenStreetMap (OSM) was triggered by the Great East Japan Earthquake in 2011, which led to a widespread understanding of the activity. By the end of September 2019, more than 35,000 unique users had made some kind of contribution, and the data is still being updated daily. In addition, the Mapillary project (Juhász and Hochmair, 2016; Mahabir et al., 2020), which started in April 2014, is a location-based landscape photo-sharing service that, like OSM, is crowdsourced and allows users to post photos of places around the world, not just on roads.

This activity has started to spread in Asia, especially in Japan, where the number of contributors and the number of photos taken are rapidly increasing (Ma et al., 2020). These voluntary crowdsourcing activities are a strong incentive to create micro-scale road data, especially data that public agencies cannot maintain or update. On the other hand, most research on Mapillary to date has concerned technical methodologies, such as extracting ground objects from Mapillary images with deep learning, while approaches such as local comparison of contributor-generated data, which are common in OSM research, have made little progress. This study addresses that gap using data collected from September 2014 onward.

In this study, we obtained about 41.7 million log records, through the Search Images API of Mapillary API ver3, for images taken in Japan from September 2014 to September 2019. Then, together with the line data of OSM roads for the same period, we spatially analyzed the maintenance status of Mapillary and OSM road data by municipality in Japan, mainly with QGIS, considering the time series and user trends. The nationwide dataset is so large that spatial analysis with a conventional database (PostGIS) is difficult, so we converted the data to FlatGeobuf format, which has been attracting attention recently, and added various attributes that can be analyzed spatially in QGIS. The added attributes include the administrative name of the local government and, from a separately obtained OSM dump file, the type, version, last editor, and update date of the road nearest to each photo location (maximum search radius of 50 m).
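The nearest-road attribute join with a 50 m search radius can be illustrated with a minimal sketch. This is not the authors' QGIS workflow: it assumes coordinates already projected to metres, uses a brute-force search instead of a spatial index, and the names `nearest_road`, `roads`, and the sample attribute values are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all in projected metres)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        # Degenerate segment: both endpoints coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_road(photo_xy, roads, max_dist_m=50.0):
    """Return the closest road within max_dist_m, or None if there is none."""
    best, best_d = None, max_dist_m
    for road in roads:
        coords = road["coords"]
        for a, b in zip(coords, coords[1:]):
            d = point_segment_distance(photo_xy, a, b)
            if d <= best_d:
                best, best_d = road, d
    return best

# Hypothetical OSM-like road records (attribute values are made up).
roads = [
    {"highway": "primary", "version": 3, "coords": [(0.0, 0.0), (100.0, 0.0)]},
    {"highway": "footway", "version": 1, "coords": [(0.0, 40.0), (100.0, 40.0)]},
]

match = nearest_road((50.0, 10.0), roads)
print(match["highway"] if match else "no road within 50 m")
```

A photo point farther than 50 m from every road gets no match, which mirrors the maximum search radius described above.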

Some of the results of the analysis are as follows. About 1,500 unique contributors participated in the maintenance of Mapillary data across Japan over the five years, and the top 20 users generated about 90% of the data. The top three contributors each shared more than 5 million images. For comparison, about 4,800 contributors were involved in the OSM road data, suggesting that Mapillary data is generated by roughly one third as many users as OSM.
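A concentration figure such as "the top 20 users generated about 90% of the data" is a top-N share over per-user image counts. The sketch below shows one way to compute it; the function name `top_n_share` and the toy log are hypothetical and do not reproduce the study's data.

```python
from collections import Counter

def top_n_share(user_ids, n):
    """Fraction of all records contributed by the n most active users."""
    counts = Counter(user_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in counts.most_common(n))
    return top / total

# Toy log: one user id per uploaded image (made-up values).
log = ["a"] * 6 + ["b"] * 3 + ["c"]

print(f"unique contributors: {len(set(log))}")
print(f"share of top 2 users: {top_n_share(log, 2):.0%}")
```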

We extracted the major contributors for each of the 1,700 municipalities in Japan and found that about 50 users were involved. Although the Mapillary data in Japan is supported by fewer contributors than the OSM data, analyzing the data on a micro-regional basis allowed us to bring to light a picture of the contributors in each region.

In terms of the number of Mapillary images taken and their spatial characteristics, the number of images taken on major roads (equivalent to OSM's highway = trunk or primary) in non-urban areas in the Tohoku region (especially Fukushima Prefecture: about 6 million images, Iwate Prefecture: about 4 million images) and the Kansai region (Kyoto Prefecture: about 3 million images) is outstanding, while the number of images taken on sidewalks (highway = sidewalk) in the metropolitan areas of Tokyo and Osaka is low. In those metropolitan areas, the data supplements the OSM data for sidewalks (highway=path, footway, unclassified) and other road types that exist in reality but are not well developed in the OSM data. In the paper, we plan to describe the local activities in Fukushima Prefecture and Kyoto City, where Mapillary activities are particularly active, in addition to comparisons at the national and municipal levels. We also focus on the temporal transition of data maintenance. Specifically, we analyzed the relationship between OSM data and the points where Mapillary images were taken using time series clustering.
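The abstract does not state which time series clustering method was used. As an illustration only, the sketch below groups municipalities by the shape of their cumulative image-count curves using a small k-means with Euclidean distance, which is a stand-in technique. The municipality labels, the counts, and the function names are all hypothetical.

```python
def cumulative_share(counts):
    """Turn per-period counts into a cumulative curve that ends at 1.0."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    running, curve = 0, []
    for c in counts:
        running += c
        curve.append(running / total)
    return curve

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(series, k, iterations=20):
    """Plain k-means; the first k series seed the centroids (deterministic)."""
    centroids = [list(s) for s in series[:k]]
    labels = [0] * len(series)
    for _ in range(iterations):
        # Assign each series to its closest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(s, centroids[j]))
                  for s in series]
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(series, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up image counts per period for four hypothetical municipalities.
counts_by_area = {
    "A": [90, 5, 3, 2],   # most images uploaded early
    "B": [2, 3, 5, 90],   # most images uploaded late
    "C": [80, 10, 5, 5],
    "D": [1, 4, 10, 85],
}

names = list(counts_by_area)
curves = [cumulative_share(counts_by_area[name]) for name in names]
labels = kmeans(curves, k=2)
for name, label in zip(names, labels):
    print(name, "cluster", label)
```

Normalizing to cumulative shares makes areas comparable regardless of their total image volume, so the clusters reflect when data accumulated and not how much.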

This study is a multifaceted spatial analysis of long-term photography logs from Mapillary and the first study to reveal macro trends across Japan, as well as more local trends, in combination with attributes of road data from municipalities and OSM. In addition, by using distributed processing methods such as tiling technology and FlatGeobuf to obtain a large dataset of more than 41 million POIs (Points of Interest) from APIs and analyze it spatially in QGIS, we were able to process the data without requiring a large-scale server. This is also a significant achievement. Finally, since the Mapillary log data used for the analysis is large-scale, we are planning to provide both archived data and spatially aggregated GIS data.
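The abstract does not spell out the tiling scheme. One common reading is to split a large area into standard web-map (XYZ) tiles so that each tile can be requested and processed independently. The sketch below uses the usual Web Mercator tile formula; the bounding box, the zoom level, and the function names are assumptions for illustration.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Web Mercator XYZ tile index containing a longitude/latitude."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the east or south edge stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

def tiles_for_bbox(west, south, east, north, zoom):
    """All (x, y) tiles at the given zoom that intersect a bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)   # tile y grows southward
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return [(x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]

# Hypothetical one-degree box and zoom level, chosen only for the example.
tiles = tiles_for_bbox(139.0, 35.0, 140.0, 36.0, zoom=8)
print(len(tiles), "tiles to request and process independently")
```

Each tile then becomes a unit of work that fits in memory on an ordinary machine, which is in keeping with the goal of avoiding a large-scale server.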

From the Mapillary POI data (41,765,634 points) for all of Japan used in the analysis, we provide spatially aggregated data in FlatGeobuf format at the municipal level (232.2 MB; 32 attribute values) and at the 1-km grid level (126.5 MB; 20 attribute values) via a GitHub repository:
https://github.com/tossetolab/mapillary-analysis-japan.

See also: Slide (1.7 MB)

Dr Toshikazu Seto is an Associate Professor at Komazawa University, Japan. He is a member of OSGeo.JP and OpenStreetMap Foundation Japan, and an OSGeo Foundation charter member. He is a social geographer and geographical information scientist. In recent years, he has been engaged in research on participatory GIS and civic-tech open data.