UNS Novi Sad: LiDAR Dataset for point cloud classification in urban areas
07-17, 14:45–14:50 (Europe/Sarajevo), PA01

1.1 Introduction
Automatic and reliable 3D point cloud classification is a crucial yet challenging task with applications across various domains, including urban planning, 3D modeling, and the development of smart cities. Airborne LiDAR (Light Detection and Ranging) has emerged as an efficient and effective tool for conducting large-scale 3D surveys of urban areas, offering high spatial resolution and accurate data collection. Over the years, numerous algorithms and methodologies have been proposed for point cloud classification. Despite advancements in machine learning and deep learning, this task remains a significant challenge in the geospatial community.
One of the primary challenges is the limited availability of labelled data for training classification algorithms. Publicly accessible, large-scale datasets are essential for developing and benchmarking new methods. Several datasets have been introduced, such as ISPRS Vaihingen (Niemeyer et al., 2014), LASDU (Ye et al., 2020), and AHN3 (AHN, 2024), but each has limitations. For instance, LASDU and AHN3 are valuable resources for point cloud classification, but they lack a comprehensive set of urban-specific classes, which limits their utility in capturing the complexity of dense urban environments.
The ISPRS benchmark dataset is the most commonly used in research in this field. It provides a point cloud classified into nine classes, with x, y, z, intensity, return number, and number of returns as features. A synchronized orthophoto of the same area is also available, providing valuable context for classification tasks. However, this dataset also presents challenges, including a highly unbalanced class distribution and a relatively small number of points, which is a particular limitation for training deep learning methods.

In this paper, we introduce an aerial LiDAR point cloud dataset, UNS Novi Sad, designed specifically for the classification of complex urban environments. The dataset comprises over five million points, classified into seven distinct classes, and is focused on the City of Novi Sad, which is known for its unique urban morphology. The city’s layout reflects the architectural and planning styles typical of Southeastern Europe in the post-World War II era, featuring a mix of high-density residential blocks, green spaces, wide boulevards, narrow streets, and diverse building types. These characteristics ensure that the dataset captures a wide range of structural and spatial variations. In addition to providing new data, we evaluate the PointNet and PointNet++ algorithms for classification of the proposed dataset.
1.2 Study area
The study area lies within the urban area of Novi Sad (Figure 1) and consists of Liman, located in the southeast part of the city, and the left bank of the Danube, with high residential blocks, spacious green areas, and boulevards. The topography of the study area is flat, with an average elevation of 77 m. The ALS point cloud data were collected using a Riegl LMS-Q680i laser scanner and a DigiCam H39 digital camera onboard a helicopter.
The total number of annotated points is 5.4 million. The dataset is divided into two .las files, one for training and one for testing.
Table 1. Dataset characteristics
Dataset     Num. of points    Point density [pts/m²]
Training    4.8 M             37
Test        0.6 M             35
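As a rough cross-check of Table 1, the point counts and densities together imply the approximate area covered by each file. The sketch below assumes that density is simply the number of points divided by the covered area, which may not be how the reported densities were computed. The function name is illustrative, and the inputs are the rounded values from the table.

```python
def implied_area_m2(num_points: float, density_pts_per_m2: float) -> float:
    """Area implied by a point count and a mean point density.

    Assumes density = points / area; the result is only approximate
    because the table values are rounded.
    """
    if density_pts_per_m2 <= 0:
        raise ValueError("density must be positive")
    return num_points / density_pts_per_m2


# Rounded values taken from Table 1.
train_area = implied_area_m2(4.8e6, 37)
test_area = implied_area_m2(0.6e6, 35)

# 1 hectare = 10,000 m2
print(f"training: {train_area / 1e4:.1f} ha")
print(f"test:     {test_area / 1e4:.1f} ha")
```

Under this assumption the training file would cover roughly 13 ha and the test file roughly 1.7 ha.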

In the .las file, each point is assigned the following attributes: position (X, Y, Z coordinates in the UTM 34N projection, EPSG:32634), intensity, return number, number of returns, classification, scan angle rank, time, and RGB.
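The attributes above are stored per point record, while the coordinate scale factors, offsets, and point record format are stored once in the file header. The following is a minimal sketch that reads a few header fields with the Python standard library only. It assumes the LAS 1.2 public header layout (the LAS version of the dataset is not stated here), and the function names and the file name in the usage comment are hypothetical. In practice a dedicated LAS library would normally be used instead.

```python
import struct

LAS_HEADER_SIZE = 227  # size in bytes of the LAS 1.2 public header block


def read_las_header(raw: bytes) -> dict:
    """Parse selected fields of a LAS public header (LAS 1.2 layout assumed)."""
    if len(raw) < LAS_HEADER_SIZE or raw[:4] != b"LASF":
        raise ValueError("not a LAS header")
    major, minor = struct.unpack_from("<BB", raw, 24)
    # Point data format (1 byte), record length (2 bytes), point count (4 bytes).
    point_format, record_length, point_count = struct.unpack_from("<BHI", raw, 104)
    scale = struct.unpack_from("<3d", raw, 131)   # X, Y, Z scale factors
    offset = struct.unpack_from("<3d", raw, 155)  # X, Y, Z offsets
    return {
        "version": (major, minor),
        "point_format": point_format,
        "record_length": record_length,
        "point_count": point_count,
        "scale": scale,
        "offset": offset,
    }


def to_real_coordinate(stored: int, scale: float, offset: float) -> float:
    """LAS stores X, Y, Z as integers: real = stored * scale + offset."""
    return stored * scale + offset


# Hypothetical usage (the file name is an assumption):
# with open("train.las", "rb") as f:
#     header = read_las_header(f.read(LAS_HEADER_SIZE))
#     print(header["point_format"], header["point_count"])
```

Because coordinates are stored as scaled integers, the header values are needed to recover the UTM coordinates of each point.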
For labeling, a combination of automatic, semi-automatic, and manual classification was used. We selected the classes with a focus on different applications, such as mapping, urban planning, and forestry monitoring. The points are classified into seven classes: ground, roads, parking, pedestrian lanes, buildings, high vegetation, and cars. The training and test datasets have similar class distributions, except for pedestrian lanes and cars. The high vegetation class contains the largest number of points. This is expected, since multiple returns are characteristic of this class and it is a commonly occurring class in this type of city. The car class accounts for only 2.64 % of all labeled points, making it one of the most challenging classes to detect. This class imbalance should be taken into account during training and testing.
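One common way to account for class imbalance during training is to weight each class inversely to its frequency. The sketch below is illustrative only and is not necessarily the approach used for this dataset. The label list is a made-up toy example, not the actual class counts, and the function names are hypothetical.

```python
from collections import Counter


def class_shares(labels):
    """Fraction of points belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


def inverse_frequency_weights(labels):
    """Weight w_c = N / (K * n_c), so a perfectly balanced set gives w_c = 1."""
    shares = class_shares(labels)
    n_classes = len(shares)
    return {name: 1.0 / (n_classes * share) for name, share in shares.items()}


# Toy labels (hypothetical counts, chosen only to illustrate the effect).
toy_labels = ["high vegetation"] * 6 + ["buildings"] * 3 + ["cars"] * 1

for name, weight in inverse_frequency_weights(toy_labels).items():
    print(f"{name}: {weight:.2f}")
```

Rare classes such as cars receive the largest weights, so their misclassifications contribute more to a weighted loss.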
1.3 Classification
To provide a brief evaluation of the proposed dataset, we performed supervised classification of the points. PointNet is a neural network architecture that classifies raw point clouds directly. PointNet++ applies PointNet to local neighbourhoods to capture local features. The evaluation metrics are recall, precision, and F1 score.
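The three metrics can be computed per class from the true and predicted labels of the test points. The following is a minimal sketch using the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = harmonic mean of the two). The function name and the example labels are hypothetical and do not reflect results on this dataset.

```python
def per_class_metrics(y_true, y_pred):
    """Return {class: (precision, recall, f1)} from two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    metrics = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[cls] = (precision, recall, f1)
    return metrics


# Made-up labels for five points, used only to illustrate the calculation.
y_true = ["cars", "cars", "buildings", "buildings", "buildings"]
y_pred = ["cars", "buildings", "buildings", "buildings", "cars"]

for cls, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
    print(f"{cls}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the metrics per class, rather than only overall accuracy, keeps the performance on rare classes visible.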



AHN, 2024. Actueel Hoogtebestand Nederland. https://www.ahn.nl/. (1 November 2024)

Niemeyer, J., Rottensteiner, F., Soergel, U., 2014: Contextual classification of lidar data and building object detection in urban areas. ISPRS J. Photogramm. Remote Sens., 87, 152-165. doi.org/10.1016/j.isprsjprs.2013.11.001

Ye, Z., Xu, Y., Huang, R., Tong, X., Li, X., Liu, X., Luan, K., Hoegner, L., Stilla, U., 2020: LASDU: A Large-Scale Aerial LiDAR Dataset for Semantic Labeling in Dense Urban Areas. ISPRS Int. J. Geo-Inf., 9(7), 450. doi.org/10.3390/ijgi9070450

I make my conference contribution available under the CC BY 4.0 license. The conference contribution comprises the abstract, the text contribution for the conference proceedings, the presentation materials, and the video recording and live transmission of the presentation.

Gordana Jakovljevic, Ph.D., is an assistant professor at the University of Banja Luka, Bosnia and
Herzegovina. Her practical and theoretical research interests lie in the fields of remote sensing, deep
learning, and environmental protection, especially water management. The primary aim of her
research is to develop a standardized, clearly defined methodology for the automated processing of remote
sensing data in real or near-real time, in order to increase the usability of such data in
environmental management and decision-making. She is a member of FIG Commission 4 and of FIG Working
Group 4.3 (Mapping plastic).