Geodaysit 2023

The use of open-source machine learning techniques for urban features extraction
06-14, 15:15–15:30 (Europe/London), Sala Videoconferenza @ PoliBa

This research aimed to identify urban features that matter for sustainable development in the urban landscape of Turin, Italy, using machine learning techniques. Specifically, the study sought to identify physical and social elements such as buildings, roads, vegetation, and open land, with the goal of contributing to more sustainable urban environments.
The study employed the open-source platform QGIS together with Orfeo Toolbox (OTB), a software library for processing images from Earth observation satellites. OTB offers various algorithms, including filtering, feature extraction, segmentation, and classification. The primary dataset used for classification consisted of three-band (RGB) orthophotos at a resolution of 25 cm.
A challenge arose when classifying pavement and flat roofs, two features that are prevalent in modern urban areas and have similar radiometric content in the spectral domain. Flat roofs play a significant role in sustainable urban environments, as they can host green roofs that improve energy efficiency and reduce the urban heat island effect. Additionally, in Italy, where most old roofs are made of “terracotta” tiles, flat roofs are a relatively new feature of the urban landscape. Identifying flat roofs can therefore help monitor changes in urban morphology and land use over time.
To address this challenge, a digital elevation model (DEM) with a ground sampling distance of 50 cm/pixel was added as a 4th band. Its main purpose was to create an integrated dataset that also carries elevation information, which helped distinguish pavement from flat roofs based on their height difference. Adding the DEM as a 4th band increased the dimensionality and complexity of the data, since each pixel is now described by four inputs: R, G, B, and elevation. The integrated dataset was then classified pixel by pixel with the random forest algorithm in OTB, a machine-learning algorithm that combines multiple decision trees to create a robust classifier.
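The band-stacking step can be illustrated with a minimal sketch in plain Python. It is not the OTB workflow used in the study. It assumes that the DEM grid is aligned with the orthophoto and that each 50 cm DEM cell is simply repeated over the 2x2 block of 25 cm pixels it covers (nearest-neighbour resampling); the abstract does not state how the two resolutions were reconciled. The function names and the toy values are hypothetical.

```python
def upsample_nearest(grid, factor):
    """Repeat every cell `factor` times along rows and columns."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def stack_rgb_dem(rgb, dem, factor=2):
    """Return one [R, G, B, elevation] feature vector per orthophoto pixel.

    `factor` is the ratio between the DEM and orthophoto pixel sizes
    (50 cm / 25 cm = 2 for the resolutions quoted in the abstract).
    """
    dem_fine = upsample_nearest(dem, factor)
    if len(dem_fine) != len(rgb) or len(dem_fine[0]) != len(rgb[0]):
        raise ValueError("DEM and orthophoto grids do not line up")
    return [
        [list(pixel) + [height] for pixel, height in zip(rgb_row, dem_row)]
        for rgb_row, dem_row in zip(rgb, dem_fine)
    ]


# Toy 2x2 orthophoto tile (25 cm pixels) covered by a single 50 cm DEM cell.
# All values are made up for illustration.
rgb_tile = [
    [(120, 118, 115), (122, 119, 117)],
    [(121, 120, 116), (119, 117, 114)],
]
dem_tile = [[12.5]]  # elevation in metres

features = stack_rgb_dem(rgb_tile, dem_tile)
```

Each entry of `features` is the four-value input that a pixel-based classifier would see, so two grey surfaces with near-identical RGB values can still differ in their fourth component.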
Five classes were generated for analysis using the unsupervised k-means algorithm from OTB: buildings, flat roofs, roads, vegetation, and open land. These classes represent the most common urban features of the study area, a linear concentration of urban settlements along major transportation routes. The random forest algorithm was then trained on these classes using a subset of the integrated dataset as training data. The trained model was used to classify the rest of the dataset, resulting in the final classification map.
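For readers unfamiliar with k-means, the clustering idea can be sketched as follows. This is a generic, plain-Python version of Lloyd's algorithm, not OTB's implementation, and the function names, the seed, and the toy pixel values are all assumptions made for illustration. Note that k-means only returns numbered clusters; it is assumed here that meaningful class names are attached to them afterwards.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Group feature vectors into k clusters with plain Lloyd's k-means."""
    rng = random.Random(seed)
    # Start from k randomly chosen pixels as initial cluster centres.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy [R, G, B, elevation] pixels: two vegetation-like, two roof-like.
# Values are made up and are not taken from the study.
pixels = [
    [30, 120, 40, 0.0],
    [35, 125, 45, 0.2],
    [200, 200, 200, 12.0],
    [205, 198, 202, 11.5],
]
labels, centroids = kmeans(pixels, k=2)
```

In this sketch the raw digital numbers dominate the distance because the elevation values are much smaller; a real workflow would probably rescale the bands first.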
Applying the random forest algorithm to the integrated dataset significantly improved the results, increasing the overall classification accuracy from 0.83 to 0.90. Notably, the accuracy for the road class rose from 0.796 to 0.944, while that for the flat roof class improved from 0.598 to 0.773. These results provide strong evidence for the effectiveness of open-source platforms and tools like OTB in identifying urban features. Furthermore, adding more bands, such as the DEM, can enhance the potential of these methods for creating more accurate and detailed maps of urban environments.
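As a reminder of how figures of this kind are typically derived, the sketch below computes overall accuracy and a per-class score from a confusion matrix. The abstract does not say which per-class metric was reported, so the per-class value here is assumed to be recall (often called producer's accuracy). The labels are toy data, not results from the study, and the function names are hypothetical.

```python
def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix


def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix, classes):
    """Share of each reference class that was labelled correctly."""
    return {
        name: (matrix[i][i] / sum(matrix[i]) if sum(matrix[i]) else float("nan"))
        for i, name in enumerate(classes)
    }


# Toy validation labels, made up for illustration only.
classes = ["road", "flat roof"]
reference = ["road"] * 4 + ["flat roof"] * 4
predicted = ["road", "road", "road", "flat roof",
             "flat roof", "flat roof", "road", "flat roof"]

matrix = confusion_matrix(reference, predicted, classes)
accuracy = overall_accuracy(matrix)
recall = per_class_recall(matrix, classes)
```

Running the same computation on the RGB-only and the RGB plus DEM classifications, against one shared set of validation samples, is what makes a before-and-after comparison meaningful.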
This study departs from traditional land cover and land use classification methods that rely on pixel-based classification using only spectral information. Such methods assign a single class to each pixel based on its spectral signature alone, which may not fully capture the spatial variability and heterogeneity of urban features. In particular, discriminating between spectrally similar features such as pavement and flat roofs requires more than spectral information.
It is worth noting that this study focused on identifying a broad set of urban features: buildings, flat roofs, roads, vegetation, and open land. If the goal is instead to identify one specific feature, such as only roofs or only roads, the inclusion of irrelevant features in the dataset may introduce redundant data and decrease the overall accuracy of the classification. Future studies may therefore need to explore more advanced algorithms, such as convolutional neural networks, to improve the accuracy and efficiency of identifying specific urban features.