FOSS4G 2024 Academic Track

10:30
30min
Poster Session I

Analyzing Randomness in Point Patterns: An Algorithmic Approach

  • Tony Sampaio, Earth Science Department, Department of Geography, Federal University of Paraná, Brazil; Spatial Pattern Analysis and Thematic Cartography Lab, Federal University of Paraná, Brazil
  • Cláudia M. Viana, Centre of Geographical Studies, Institute of Geography and Spatial Planning, University of Lisbon, Portugal; Associate Laboratory Terra, Lisbon, Portugal
  • Eduardo Gomes, Centre of Geographical Studies, Institute of Geography and Spatial Planning, University of Lisbon, Portugal; Associate Laboratory Terra, Lisbon, Portugal
  • Silvana Camboin, Geodetic Science Graduate Program, Department of Geomatics, Federal University of Paraná, Brazil; Open Geospatial Lab, Federal University of Paraná, Brazil
  • Fábio Breunig, Earth Science Department, Department of Geography, Federal University of Paraná, Brazil; Spatial Pattern Analysis and Thematic Cartography Lab, Federal University of Paraná, Brazil
  • Edenilson Nascimento, Earth Science Department, Department of Geography, Federal University of Paraná, Brazil; Spatial Pattern Analysis and Thematic Cartography Lab, Federal University of Paraná, Brazil
  • Elaine de Cacia de Lima Frick, Earth Science Department, Department of Geography, Federal University of Paraná, Brazil; Spatial Pattern Analysis and Thematic Cartography Lab, Federal University of Paraná, Brazil; Geography Teaching Laboratory, Federal University of Paraná, Brazil
  • Jorge Rocha, Centre of Geographical Studies, Institute of Geography and Spatial Planning, University of Lisbon, Portugal; Associate Laboratory Terra, Lisbon, Portugal

Performance Benchmarking for Resource Allocation Optimization in GeoNode Ecosystems on Kubernetes Clouds

  • Marcel Wallschläger, Leibniz Centre for Agricultural Landscape Research (ZALF), Müncheberg, Germany
  • Igo Silva de Almeida, Leibniz Centre for Agricultural Landscape Research (ZALF), Müncheberg, Germany
  • Xenia Specka, Leibniz Centre for Agricultural Landscape Research (ZALF), Müncheberg, Germany

Applying Spatio-Temporal Analysis for Data Mining on Shooting Data

  • Felipe Sodré Mendes Barros, Facultad de Ciencias Forestales, Universidad Nacional de Misiones, Argentina
  • Terine Husek Coelho, Instituto Fogo Cruzado, Rio de Janeiro, Brazil
  • Iris Rosa, Instituto Fogo Cruzado, Rio de Janeiro, Brazil
  • Davi Santos, Instituto Fogo Cruzado, Rio de Janeiro, Brazil
Academic Track
Main Hall
13:00
60min
Lunch
Room I
13:00
60min
Lunch
Room II
13:00
60min
Lunch
Room III
13:00
60min
Lunch
Room IV
13:00
60min
Lunch
Room V
14:00
30min
A Spatial Data Infrastructure using Modern Standards: Lessons Learned from the eMOTIONAL Cities Project
Antonio Cerciello, Joana Simoes

Standards in the Geographic Information Systems (GIS) domain are crucial for ensuring interoperability, data consistency, and efficiency across diverse applications and platforms. As in other domains, they are necessary to ensure that different GIS software can work together. Moreover, the continuous improvement and development of these standards are essential to keep pace with evolving technologies and user requirements, enhancing the overall functionality and usability of GIS. By adhering to and advancing these standards, the GIS community can foster innovation, support informed decision-making, and address complex geospatial challenges more effectively.
It is therefore important to be conservative, relying on widely supported standards, while remaining open to emerging technologies, preparing for the next technological leap and welcoming it proactively. The design of the eMOTIONAL Cities Spatial Data Infrastructure (SDI) is built around this approach (Simoes, J., and Cerciello, A. (2022). Serving Geospatial Data Using Modern and Legacy Standards: a Case Study from the Urban Health Domain. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 48, 419-425).

The environment we live in affects our mental health and well-being. The eMOTIONAL Cities project has set out to understand how the natural and built environment can shape the feelings and emotions of those who experience it. It does so with a cross-disciplinary and data-driven approach, which has resulted in numerous datasets from more "traditional" GIS-based fields like Urban Planning, as well as from other fields like Neuroscience. The common denominator between all these datasets is the geospatial dimension. One of the main goals of the project is to assemble these disparate datasets in a common SDI, in order to enable scientists, and eventually the general public, to discover and access the data for the purposes of analysis and decision making.
The OGC API is a family of modern Standards from the Open Geospatial Consortium (OGC), which leverage modern web technologies like OpenAPI, REST and JSON (Percivall, G. (2017) OGC® Open Geospatial APIs - White Paper). Although very appealing to web developers, they are relatively new compared to the OGC Web Services (OWS) like WFS, WMS or WMTS, which have been in the GIS domain for more than twenty years. When we started the project, we were unsure whether it would be possible to set up an SDI purely based on OGC API, both because of the maturity of the Standards and because of the availability and Technology Readiness Level (TRL) of implementations. This led us to initially create an SDI that contains both a modern and a legacy stack (Simoes and Cerciello, 2022). However, in the past two years we have seen huge developments in OGC API Standards, with many Standards having parts approved and implementations catching up on those developments. One implementation in particular, pygeoapi (https://pygeoapi.io/), is exemplary in terms of the Standards development process, participating actively in the OGC Code Sprints and acting as an Early Implementer (EI) and even a Reference Implementation (RI) of several OGC API Standards. It embodies the new OGC paradigm, where the development of a Standard goes hand in hand with the development of implementations, resulting in published Standards that are market ready.
The eMOTIONAL Cities SDI demonstrates that it is now possible to share geospatial data using OGC API with Free and Open Source Software (FOSS). We selected Standards to enable the publication of feature data (OGC API - Features), tiles of geospatial information (OGC API - Tiles), sensor data (SensorThings API) and metadata (OGC API - Records). Although that was not the case when we started this work, they are now all approved Standards. The SDI uses a stack of FOSS software, with pygeoapi at its core and several supporting services. In order to ease the deployment and reproducibility of the system, the services were virtualized into Docker containers and orchestrated using docker-compose. This resulted in a system that is infrastructure agnostic and can be deployed in any Cloud Service Provider (CSP) in a matter of minutes. The code is available on GitHub under an MIT license (https://github.com/emotional-cities/openapi-sdi) and released on Zenodo with DOI 10.5281/zenodo.6591179. We have also set up pipelines that enable both humans and machines to ingest data and metadata into the SDI, and written extensive documentation on how to access the SDI using clients such as QGIS, MapStore or Jupyter notebooks.
The SDI is live at http://emotional.byteroad.net/ and includes 97 datasets from five different cities (Lisbon, London, Copenhagen, Tartu and Lansing). It has collections that characterize the physical environment (e.g. Normalized Difference Vegetation Index (NDVI), Annual mean NO2 concentration), the built environment (e.g. Buildings with repair needs ratio, Average age of buildings), socio-economic aspects (e.g. Area Deprivation Index, Number of People Travel by Bicycle to Work) and health data (e.g. Crude percent of adults with depression, Mortality rate), as well as results of experiments (e.g. London outdoor walk test data: Air Quality Temperature; London outdoor walk test data: Sound Pressure levels). The data can be discovered and queried in the OGC API - Records searchable catalogue: https://emotional.byteroad.net/catalogue
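
To illustrate what accessing such an SDI looks like in practice, the short sketch below lists the feature collections exposed by the pygeoapi endpoint and retrieves a few items from one of them; it is a minimal example only, and the collection identifier is a placeholder to be replaced by one discovered through the catalogue.

```python
import requests

BASE = "https://emotional.byteroad.net"  # eMOTIONAL Cities SDI landing page

# List the available feature collections (OGC API - Features / Common)
collections = requests.get(f"{BASE}/collections", params={"f": "json"}, timeout=30).json()
for c in collections.get("collections", []):
    print(c["id"], "-", c.get("title", ""))

# Fetch a handful of features from one collection (identifier is hypothetical)
items = requests.get(
    f"{BASE}/collections/example_collection/items",
    params={"f": "json", "limit": 5},
    timeout=30,
).json()
for feature in items.get("features", []):
    print(feature["id"], feature["geometry"]["type"])
```
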
In this article we would like to share our journey during the process of implementing the SDI, and how we navigated the technological and human challenges of adopting emerging technologies in constant development. We hope the results of this project can encourage scientists, urban planners and other experts who deal with geospatial data in some way to embark on a similar journey and contribute towards making geospatial information FAIR, i.e. Findable, Accessible, Interoperable and Reusable. At the same time, we hope to promote a family of GIS standards (OGC API) that seeks to mitigate the steep learning curve that has always characterized GIS standards.

Academic Track
Room V
14:00
30min
Democratizing AI, making geotechnology accessible to all
Thomaz Franklin de Souza Jorge, Cauã Guilherme Miranda, João, Igor Augusto da Costa Nunes, Lucas Alvarenga Lopes, Gabriel Viterbo

With the increasing presence of spatialized data and information in people's daily lives, the constant need to make data-driven decisions, and the expansion of artificial intelligence technologies in society, this work seeks a technological solution focused on simplifying geospatial analyses. The goal is to democratize access to and understanding of these resources for common users, without the need for advanced knowledge of the specific geographic information tools currently most used.

To this end, a system was developed that transforms natural language questions directly into SQL queries, specifically using PostgreSQL/PostGIS (Li and Jagadish, 2014; Ramsey, 2007). This system is based on a chat model built on Gemini, which interprets user queries and generates the corresponding SQL queries. The back-end API executes these queries and returns the results, which are visualized in an intuitive and interactive graphical interface. This allows for dynamic exploration of geospatial data, facilitating the analysis and visualization of complex information without the need for advanced technical knowledge in SQL. The integration of natural language processing (NLP) and geospatial database queries represents a significant innovation. This system reduces the learning curve associated with traditional GIS tools, making the technology accessible to a broader audience (Craglia et al., 2012). By using the Gemini model, the system can understand and process a wide range of natural language inputs, translating them into precise SQL queries that interact with the geospatial database.

To demonstrate the system's effectiveness, a case study was conducted using a database composed of 20 tables containing data released by the National Water Agency (ANA), the Brazilian Institute of Geography and Statistics (IBGE), and the Energy Research Company (EPE), which were adjusted for reading by the system. This database includes a data dictionary that provides detailed information on what each value represents and its corresponding context.

The results were evaluated based on the accuracy of the answers given to 192 questions posed within the context of the case study. Out of these 192 questions, 167 answers were correct, yielding an accuracy rate of 87%, with detailed visualization in the developed interface of the geometries and information required by the user's query. Enhanced accessibility for non-technical users is one of the most significant benefits identified in this work, as there is no need for in-depth technical knowledge of spatial data filters, in addition to the reduced query time and the ability to generate valuable insights from these data sets.

This work also highlights the importance of an intuitive and interactive graphical interface. The interface allows the visualization of layers and tables resulting from the query, enabling users to dynamically explore, filter, and manipulate the data according to their needs, and to obtain insights that would be difficult to achieve without advanced knowledge of GIS tools (Li and Wang, 2013).

It is important to note that this work represents a first step in an initiative around this technological solution model implementing artificial intelligence, from which it was possible to identify several points of improvement, not only in the model but also in the databases and their construction and acquisition process. The case study demonstrated that the system can accurately interpret and execute user queries, providing reliable and relevant results. With this approach, new possibilities are opened for the exploration and analysis of geospatial data, enhancing decision-making based on information obtained from various areas such as environmental monitoring, urban planning, and territorial planning. Future work should focus on improving the system's capabilities, expanding its application domains, and exploring new ways to integrate emerging technologies, continuing to drive innovation in this critical area.
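
As a rough illustration of the pipeline described above, and not the authors' actual implementation, the sketch below shows how a natural-language question might be turned into a SQL query by a Gemini chat model and executed against a PostgreSQL/PostGIS database; the schema hint, table names, connection string and model identifier are all placeholders.

```python
import google.generativeai as genai  # assumes the google-generativeai client
import psycopg2

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-1.5-flash")  # model name is illustrative

# Hypothetical excerpt of the data dictionary handed to the model
SCHEMA_HINT = """
Table municipios(nome text, uf text, geom geometry(MultiPolygon, 4674))
Table hidreletricas(nome text, potencia_mw numeric, geom geometry(Point, 4674))
"""

def question_to_sql(question: str) -> str:
    prompt = (
        "You translate questions into a single PostGIS SQL query.\n"
        f"Schema:\n{SCHEMA_HINT}\n"
        f"Question: {question}\n"
        "Return only the SQL, with no explanation or markdown."
    )
    return model.generate_content(prompt).text.strip()

def run_query(sql: str):
    with psycopg2.connect("dbname=geodb user=reader") as conn:
        with conn.cursor() as cur:
            cur.execute(sql)
            return cur.fetchall()

sql = question_to_sql("Which hydroelectric plants are located in the state of Pará?")
print(sql)
print(run_query(sql))
```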

Academic Track
Room II
14:00
30min
Integrating, Processing and Presenting Big Geodata with Earth Observation Datacubes in an Interdisciplinary Research Context
Christoph Friedrich

Application-oriented research projects often involve diverse consortium members, from universities to research institutions, to authorities, to domain end users. This requires the integration of very heterogeneous data sources, the facilitation of their combined processing, and the presentation of the results in adequate ways. The challenge here is not only the technical realisation itself, but maybe even more to design solutions catering to this wide user group, balancing feature-richness with easy usability. Data is abundant, and processing plentiful, but it all needs to go the last mile to the final user.

One such research project is “AgriSens DEMMIN 4.0”, which is advancing remote sensing for digitalisation in agricultural crop production. Thus, the user group includes programmers and domain scientists as well as farmers. To date, agriculture is not yet widely taking advantage of EO products. Therefore, the project not only addresses the creation of novel remote-sensing-based application techniques and their implementation, but puts an equally distinct emphasis on the development of an accompanying data integration and visualisation system. In this work, we describe how our architecture – consisting of several pieces of free and open source geo software – closes the gap between data providers and information consumers, as it facilitates the necessary analysis steps and combines them with adequate presentation to decision makers.

In our IT architecture, we utilise one central datacube to tackle this problem; it acts as a cloud-based geospatial data holding and computation platform. It gathers a multitude of data, ranging from optical and radar raster imagery through weather data to in-situ field measurements, and pre-processes it into an interoperable, analysis-ready state. These resources can then be accessed through APIs for external usage, or computations can be carried out directly on the datacube and the results immediately visualised with tools hosted on the same server.

The whole system is located at the Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ). Apart from utilising the computing resources available there, this also opens up synergies with already-existing projects: We can make use of the enormous amount of EO data that is already available within the LRZ’s “Data Science Storage” and the DLR’s “terrabyte” platform. These storages are directly mounted into our server so that the datacube can access petabytes of imagery without having to duplicate it again, saving costs and emissions.

At the core of our infrastructure is an instance of the “Open Data Cube” (ODC) software package. Metadata is ingested into its PostgreSQL database and can be retrieved via the built-in web-based data discovery application “ODC Explorer” or via an API endpoint of the emergent STAC standard (which in turn can be accessed via the “STAC Browser” web application or any other compatible software). All raster data is provided in the Cloud-Optimised GeoTIFF (COG) format, allowing efficient access even from remote machines.

Our main interface for scientific computation is Jupyter Hub, enabling collaborative work across institutions. For each user, a dedicated Jupyter Lab instance is spawned in its own Docker container that can access all the data of the previously mentioned storages and has a certain amount of computing resources allocated to it. Users can write their code in Python or R, arguably the most popular programming languages in the EO community, which offer straightforward packages for connecting to an ODC, namely “datacube” and “odcR”. This way, scientists receive the typical professional online analysis environment in which they can work closely to the data and use all the powers of these programming languages and their EO-friendly ecosystem.
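
As an illustration of this working mode (a minimal sketch, not the project's actual notebooks), the snippet below connects to an Open Data Cube instance from a Jupyter session, loads a small Sentinel-2 subset and derives NDVI; the product name, band names and extent are hypothetical and depend on how the datacube is indexed.

```python
import datacube

dc = datacube.Datacube(app="agrisens-demo")  # uses the ODC configuration of the environment

# Load a small Sentinel-2 cube over the test site (product, bands and extent are placeholders)
ds = dc.load(
    product="s2_l2a",
    x=(13.00, 13.10),
    y=(53.85, 53.95),
    time=("2023-05-01", "2023-06-30"),
    output_crs="EPSG:32633",
    resolution=(-10, 10),
    measurements=["red", "nir"],
)

# Per-pixel NDVI as an xarray DataArray, then a simple spatial-mean time series
ndvi = (ds.nir - ds.red) / (ds.nir + ds.red)
print(ndvi.mean(dim=["x", "y"]).to_pandas())
```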

Another option to work with the data is via openEO, the new standardised way to interact with big EO data cloud processing backends. We incorporate this standard by utilising the “openEO Spring Driver”, an adapter to translate user-submitted openEO process graphs into analysis code that can be run using ODC. For compatibility with legacy software, it is also possible to request rendered images of pre-configured analysis algorithms via WMS, which are being served by an instance of the powerful “datacube-ows” package.

The final goal, however, is to connect farmers and other end users to these results, who cannot or do not want to deal with complex interfaces. Therefore, these highly technical tools are not sufficient, but need to be accompanied by easy-to-use graphical interfaces. Drawing on the possibilities of modern web technologies, we realise these through purpose-built web apps. Arranged around an OpenLayers-powered map component, data products are streamed in COG format from the datacube and displayed along with the needed additional tooling for interpretation. For example, in this fashion we realised a demonstrator showcasing the results of a water balance model for irrigated potato fields.

Another outlet of the project is the “FieldMApp”, a mobile application designed to be used by farmers both in the field as well as in the office to digitise and monitor areas of lower yield within crop fields. For evaluating plant vitality, a vegetation-index-based raster product is calculated on the datacube using the latest Sentinel-2 imagery and long-term crop-specific averages. Due to the tablet application being programmed with the platform-agnostic Flutter framework, but its standard mapping component not yet supporting COGs, it was necessary to resort to WMS for serving the raster data. As mentioned above, this is comfortably possible by configuring “datacube-ows” on top of ODC.

Overall, the challenge of interoperating various data supplies, processing chains and custom-tailored interfaces – as typically encountered in interdisciplinary research projects – requires complex solutions, but can be achieved quite well by utilising a datacube approach with free and open source geo software building blocks. Our integrated system successfully demonstrates such a use case for the domain of remote sensing in agriculture.

Academic Track
Room III
14:00
30min
The relationship between rural credit and deforestation
George Porto Ferreira

Tropical forests host half of Earth’s biodiversity (Dirzo & Raven, 2003), 62% of global terrestrial vertebrate species (Pillay et al., 2022), and play a crucial role as a carbon sink (Mitchard, 2018). Despite their importance, every year, 3 to 4 million hectares of primary tropical forests are lost, mainly in Brazil, Indonesia, and the Democratic Republic of Congo (DRC) (Hansen et al., 2013; Seymour, 2022), contributing to 22% of total greenhouse gas (GHG) emissions worldwide along with agriculture, forestry and other land use (AFOLU) (IPCC, 2023).

Preventing deforestation requires understanding its root causes, particularly the capital availability to the farm sector. In many tropical countries, rural credit is available as loans at subsidized interest rates to improve agricultural production or support agricultural costs (Servo, 2019). However, these loans may be leading to more deforestation. Some studies have analyzed this issue on a municipal scale, but few peer-reviewed studies have linked rural credit to individual property-scale deforestation. Recently, the NGO Greenpeace (Greenpeace, 2024) and the Climate Policy Initiative (Mourão et al., 2024) published two studies showing the relationship between rural credit and deforestation. Understanding this relationship can improve public policies to prevent deforestation from happening even before it starts.

Methods
In this study, I used open data and FOSS4G to quantify the amount of rural credit released to rural properties where deforestation occurred. The datasets came from different open data sources. The Central Bank of Brazil provides rural credit data through the SICOR system. The National Institute for Space Research (INPE) provides deforestation data in the TerraBrasilis system. The Brazilian Forest Service provides each property's Rural Environmental Registry (CAR) record, which contains its boundaries. The Brazilian Institute of Geography and Statistics (IBGE) provides administrative boundaries (state and municipality).

Using the terra library in R, I processed the datasets from the three states that contributed the most to deforestation: Rondônia, Mato Grosso, and Pará. I used a SpatiaLite database and the QGIS Geographic Information System to check the results. The novelty here is that, by using R scripts, it was possible to rebuild the relational database from SICOR in a geospatial environment, providing a reproducible environment. All steps are described below.

First, using R, all the data needed for the analysis were downloaded from their sources and loaded into the R environment. The second step, still using R, was to recreate the SICOR, CAR, and PRODES deforestation tables and load them into a SpatiaLite (SQLite) database. This step provides a valuable tool for monitoring by both environmental agencies and the banks that provide rural credit loans.

The next step was to intersect the deforestation data with the CAR property boundaries, calculating the amount of deforestation on each property using PRODES data between 2008 and 2023. Next, the total number of loans between 2013 and 2023 was identified for each property. All these steps were processed using the terra library in R.
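
The processing itself was done with the terra package in R; purely for illustration, a loose Python/GeoPandas analogue of the intersection step (with hypothetical file names, column names and projection) could look like this:

```python
import geopandas as gpd

# Hypothetical inputs: PRODES deforestation polygons (2008-2023) and CAR property boundaries
prodes = gpd.read_file("prodes_2008_2023.gpkg")
car = gpd.read_file("car_properties.gpkg").to_crs(prodes.crs)

# Intersect deforestation polygons with property boundaries
clipped = gpd.overlay(prodes, car[["cod_imovel", "geometry"]], how="intersection")

# Compute areas in an equal-area projection (South America Albers is illustrative)
clipped["area_km2"] = clipped.to_crs("ESRI:102033").area / 1e6

# Total deforested area per property, ready to be joined with SICOR loan records
deforestation_by_property = clipped.groupby("cod_imovel")["area_km2"].sum()
print(deforestation_by_property.sort_values(ascending=False).head())
```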

Results
In 1992, the Brazilian Parliament enacted Law 4,829, creating subsidies for rural credit, known as the Safra Plan. The interest rates of the Safra Plan have always been significantly lower than those practiced in the market. In March 2019, for example, while the average interest rate on loans for non-rural purposes stood at 31.6% per year, rural credit at market rates was observed at 10.8% per year, and even lower with controlled rates, which averaged 6.1% per year (Servo, 2019).

The results show that from 2013 to 2023, more than BRL 17 billion was loaned to properties with some deforestation in these three states (RO, PA, MT). Counting deforestation from August 2008 to July 2023, the total amount of clearing on properties that received rural credit in those same three states is 8,197 km², representing 8.5% of all deforestation in the period.

Academic Track
Room I
14:30
30min
Advancing Geospatial Data Integration: The Role of Prompt Engineering in Semantic Association with chatGPT
Fabíola Andrade

Semantic interoperability is essential for integrating open geospatial collaborative and official data. While geosemantics has long been a topic of discussion, recent research has explored automated semantic integration without fully leveraging the capabilities of large language models (LLMs) in artificial intelligence. This study investigates using chatGPT-4 to semantically associate OpenStreetMap (OSM) tags with the Brazilian topographic mapping model, the Technical Specification for Structuring Vector Geospatial Data (ET-EDGV). Focusing on five classes within the buildings category, the study tested three data structuring methods: spreadsheets, OWL ontology, and XML. Results indicated that ontology and XML formats produced more accurate semantic associations than spreadsheets, with OWL yielding the most coherent results. These findings underscore the importance of properly structured data to capture hierarchical relationships between concepts better. The study also noted the need for precise and detailed queries, highlighting some limitations in chatGPT's ability to understand complex geospatial model inputs. Further research is recommended to enhance LLMs' potential in facilitating semantic interoperability and to explore the role of prompt engineering in optimizing these interactions.
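
Purely as an illustration of the kind of structured prompt the study describes (the actual prompts, class definitions and interaction mode are not reproduced here), one might pass an OWL fragment of the ET-EDGV model together with an OSM tag and ask a GPT-4 class model for the best association, for example via the OpenAI Python client:

```python
from openai import OpenAI  # assumes the openai>=1.0 client; reads OPENAI_API_KEY from the environment

client = OpenAI()

# Hypothetical OWL fragment describing one ET-EDGV buildings class
owl_fragment = """
<owl:Class rdf:about="#Edif_Ensino">
  <rdfs:subClassOf rdf:resource="#Edificacao"/>
  <rdfs:comment>Building used for educational activities</rdfs:comment>
</owl:Class>
"""

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "You associate OpenStreetMap tags with classes of the ET-EDGV model."},
        {"role": "user",
         "content": f"ET-EDGV classes (OWL):\n{owl_fragment}\n"
                    "Which ET-EDGV class best matches the OSM tag building=school? "
                    "Answer with the class name and a one-sentence justification."},
    ],
)
print(response.choices[0].message.content)
```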

Academic Track
Room II
14:30
30min
From cave buffer zones to protected areas: speleology data management with free and open-source software (FOSS)
Alexandre Assuncao

The field of speleology is dependent upon the accurate mapping and analysis of data to gain an understanding of subterranean environments. Free and open-source software (FOSS) has facilitated advancements in spatial data management, offering robust tools for data collection, analysis, and fieldwork. Software solutions such as PostgreSQL with PostGIS, QGIS, GRASS, and QField facilitate efficient geospatial data management and data collection. The accurate location determination of caves is of paramount importance, given their significant ecological, historical, and cultural value. In Brazil, the implementation of rigorous environmental legislation has resulted in the establishment of a 250-meter buffer zone surrounding caves. This regulatory measure is designed to ensure the protection of these vulnerable ecosystems and to regulate activities within their vicinity. The radius may be modified based on the findings of environmental studies, thereby ensuring the preservation of caves while facilitating socio-economic development. This study presents EspeleoVale, a software as a service (SaaS) solution hosted on AWS. It employs open source scripting languages and frameworks, including AngularJS, PostgreSQL with PostGIS, and MapStore2 integrated with GeoServer, to effectively manage and visualize speleological data for a mining company in Brazil. By employing SQL queries and spatial functions, users can visualize cave locations and their restricted areas, thereby facilitating the assessment of project impacts on these geomorphological features. In this study, examples of visualization and spatial analysis are presented for five hypothetical caves and a hypothetical project, returning intersections, differences and merged areas, which are vital for the environmental protection of caves and for understanding project restrictions. Thus, the integration of an RDBMS for spatial analysis and FOSS tools for data visualization fosters new developments and promotes more efficient speleology data management.
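
A minimal sketch of the kind of spatial query the abstract alludes to, buffering cave entrances by 250 m and intersecting the result with a project footprint, is shown below using psycopg2; the table names, column names and SRIDs are hypothetical.

```python
import psycopg2

QUERY = """
WITH buffers AS (
    SELECT c.id,
           c.name,
           ST_Buffer(c.geom::geography, 250)::geometry AS buffer_geom  -- 250 m protection zone
    FROM caves c
)
SELECT b.id,
       b.name,
       ST_Area(ST_Intersection(b.buffer_geom, p.geom)::geography) AS overlap_m2
FROM buffers b
JOIN projects p ON ST_Intersects(b.buffer_geom, p.geom)
WHERE p.id = %s;
"""

# Hypothetical connection string; assumes both layers are stored in EPSG:4326
with psycopg2.connect("dbname=espeleovale user=reader") as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY, (1,))
        for cave_id, name, overlap in cur.fetchall():
            print(cave_id, name, f"{overlap:.1f} m2 inside the project area")
```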

Academic Track
Room I
14:30
30min
Photogrammetry and 3D Modelling Applied to the Creation of Virtual Reality in Realistic Environments: Analysis of Free Software for Image Processing
Felipe Oliveira Silva, Rosangela Leal

3D modeling involves the three-dimensional representation of characters or scenes, providing a greater visualization of details for the object being represented, creating the concept of depth. This concept also opens up a vast array of applications that a simple 2D drawing would be unable to present. This type of representation is widely used in various fields, such as the entertainment industry (e.g., films and games), automotive engineering, architecture/engineering, etc., having diverse purposes and applications. 3D modeling can be achieved through different methodologies, with the primary distinction between them being the intended use of the modeled object. Notable methodologies include Box Modeling, Digital Sculpting, and Poly-by-Poly modeling.
Traditionally, photogrammetry was defined as the 'science and art of obtaining reliable measurements through photographs' (American Society of Photogrammetry). Although it is a powerful technique for capturing environmental detail, there is inherent complexity in its equipment, both hardware and software, with a significant acquisition cost and the need for specialized knowledge for effective use by its operators. However, with technological advancements, the availability of equipment with higher processing capacity at lower costs and of more user-friendly software has facilitated wider dissemination and use among a larger number of users. Among these technologies, the introduction of drones into the technical and professional market, with new forms of application and use of photography, has spurred new growth in the use of photogrammetry across various professional sectors and the development of techniques for processing small-format images.
Thus, 3D modeling through photogrammetry allows for the acquisition of large-scale data, enabling detailed studies, sometimes at centimeter or even millimeter scales, with a level of detail that would previously have been impossible or impractical. This contributes positively to feasibility studies, risk analysis, project presentation, among other applications in various fields such as Civil Engineering, Architecture, and Surveying. The combination of these technologies not only enhances project visualization but also amplifies the collection of geospatial data, promoting a more comprehensive and precise approach.
Virtual Reality (VR) is a technology that creates a simulated environment through electronic devices, thereby providing users with a new way of visualization, whether applied to video games or integrated with other fields. The combination of this technology with models obtained from photogrammetry provides realistic environments with impressive detail richness.
This study will address 3D modeling through the close-range technique, using terrestrial photographs. The research is divided into four stages. The first three stages involve processing these images using various types of hardware and software. For the hardware, both processing power and image capture quality are considered, aiming to demonstrate the best possible results at a low cost. Regarding software, the use of open-source programs from various developers was explored, with the intention of making comparisons and achieving the best results among them.
In the fourth and final stage of the research, a market evaluation will be conducted to understand the needs of professionals in the field concerning this technology. To carry out the work, photographs were initially taken with a Canon EOS 200D of the building housing the Graduate Program in Modeling (PPGM) at the State University of Feira de Santana (UEFS). Subsequently, additional images were obtained using other capture equipment, such as mobile phones, following the same research line. A total of 100 photos were collected and processed using three open-source software programs: Meshroom, Colmap, and Regard3D, which are noted for their prominence and positive recommendations among free software options.
The goal was to calibrate parameters to achieve the best possible model, considering software and hardware limitations. With the obtained results, a comparison was made to determine which software offered the best outcome, combining modeling quality, ease of post-processing, and compatibility with the graphics engine (Engine) that will be used for creating the realistic environment. This engine is called Unreal Engine, developed by Epic Games, widely used in video game development but with significant potential for application in fields such as Civil Engineering, Architecture, and Surveying.
Thus, the research could delve into the combination of modeling obtained from photogrammetry with virtual reality. One of the software programs used, which demonstrated good performance, was Regard3D, designed for creating 3D models from two-dimensional images. A machine with low processing hardware was used specifically to compare these results with those from more modern computers. The configuration used is as follows:
• Processor: Intel Pentium G620
• Motherboard: DXH61Z M2 Duex
• Graphics Card: RX580 8GB MingZhou
• RAM: 16GB
Parameter selection was carried out iteratively in successive stages to improve processing quality. It was observed that processing times were high, particularly in specific stages such as mesh computation. During this process, the software analyzes the provided images to find correspondences, known as interest points, which are distinct and uniquely characterized points in the images. Across the various software programs, the most commonly used algorithms for point detection are SIFT (Scale-Invariant Feature Transform) (Lowe, 1999) and ORB (Oriented FAST and Rotated BRIEF) (Rublee et al., 2011), which describe interest points as feature vectors used to find common points between images. After detection, filtering is performed to discard points that are misaligned relative to the others.
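
For readers unfamiliar with this step, the short OpenCV sketch below (not part of the photogrammetry packages tested, and with placeholder file names) detects ORB interest points in two overlapping photographs and matches them, which is conceptually what the feature-detection stage of these pipelines does:

```python
import cv2

# Load two overlapping photographs in grayscale (file names are placeholders)
img1 = cv2.imread("photo_001.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("photo_002.jpg", cv2.IMREAD_GRAYSCALE)

# Detect ORB keypoints and compute their binary descriptors
orb = cv2.ORB_create(nfeatures=5000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching with Hamming distance and cross-checking to drop weak matches
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} tentative correspondences; best distance: {matches[0].distance}")
```
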
Additionally, significant time was observed in the densification process, which involves increasing the number of points in a 3D model to add more interest points for better image quality. Various techniques are used in this process, with interpolation being particularly noteworthy, as it uses the characteristics of nearby points to estimate geometry and generate additional points.
Even using minimal parameters, model generation time was long on this computer configuration, with high RAM usage. Significant storage was required, as the models grew at each processing stage, reaching approximately 20 GB in the final stage. However, the results were satisfactory compared to the paid software available on the market. Therefore, the use of photogrammetry-generated models, combined with virtual reality to create realistic virtual environments, can be considered positive, achieved at low cost with free and open-source software.

Academic Track
Room III
14:30
30min
Systematic Technology Review of OGC Standards and OSGeo Projects
Luiz Fernando Satolo

OGC Standards and OSGeo Projects have been widely applied to different kinds of geospatial data and extended for the implementation of geospatial data science environments. However, there is no review comprehensively summarising and discussing the progress of these open source technologies for publishing geospatial databases on the Web. The proposed Systematic Technology Review is a stylized version of the Systematic Literature Review, covering the documentation of OGC Standards and OSGeo Projects. The search strategy consisted of screening the OGC and OSGeo websites for the latest version of each OGC Standard's implementation (or community) specification and each OSGeo Project's developer manual. This review considered the technologies published until June 2024. A total of 80 OGC Standards and 52 OSGeo Projects were identified.

To recognize the main topics of each technology in detail, the documentation was analysed by Latent Dirichlet Allocation (LDA) using the scikit-learn package in Python. Grid search was used to find the optimal hyperparameters for the number of components and the decay of the learning rate. With the maximum number of iterations set to 100, the best model was obtained with 8 components and 0.1 learning decay. Then, the most probable topic was predicted for each documentation. The network of similarities arising from LDA was exported to Gephi for visualisation, where the ForceAtlas2 layout algorithm was used to create a weighted undirected graph, keeping only edges with weight greater than 0.33.

The latest developments in terms of OGC Standards for data encoding took place in the GeoPackage standard. For accessing, processing or visualising data, the trend was the development of OGC API related standards. However, GML is the most implemented OGC Standard for data encoding in OSGeo Projects, along with Web Services like WMS, WFS, WCS and WPS for accessing, processing and visualising the data. Community Standards represented less than 10% of the OGC Standards, while Community Projects represented more than 50% of the OSGeo Projects.

The adoption of these technologies was evaluated based on the number of GitHub forks and stars, as well as Docker pulls. With more than 100 million pulls, PostGIS is the most downloaded OSGeo Project, followed by GeoNetwork and Open Data Cube, with more than 5 million pulls each. However, many of the analysed technologies lacked an official Docker image. In terms of GitHub forks and stars, the most shared and favoured OSGeo Project is OpenLayers, followed by QGIS and GDAL.

The Latent Dirichlet Allocation analysis found eight topics underlying the OGC Standards and OSGeo Projects. The keywords of the top four topics were conformance, layer, tile and response. Based on the analysis of the Implementation Standard and Community Standard documentations, the most similar OGC Standards were OGC API - Tiles and Two Dimensional Tile Matrix Set. On the other hand, based on the analysis of developer manuals, the most similar OSGeo Projects were GDAL and MDAL. The strongest relationship between an OGC Standard and an OSGeo Project occurred between WPS and ZOO-Project, followed by WPS and PyWPS. Overall, the OSGeo Project most closely related to the entire set of OGC Standards was rasdaman, followed by MapServer and deegree.

Notably, a large group of standards and projects showed scarce connections, mainly those that are domain specific, like PubSub, LAS and PipelineML among the OGC Standards and Giswater and MobilityDB among the OSGeo Community Projects, or those that are the basis of the other technologies, like the Simple Features, WKT and Coordinate Transformation standards and the PROJ and PostGIS projects. The presented Systematic Technology Review can promote the evolution of the current OGC Standards and OSGeo Projects, as well as the development of new technologies. It can also support developers of new solutions in the geospatial community. Specifically, this review is the basis for the proposal of a new library for the integrated access of INPE's environmental databases. An important limitation of this systematic review is that it was not possible to find any PDF documentation for almost 20% of the existing technologies, which were excluded from the analysis.
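
A condensed sketch of the topic-modelling step, assuming the documentation has already been converted to plain text (the document loader, preprocessing and the exact hyperparameter grid are simplified here), could look like this with scikit-learn:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import GridSearchCV

docs = load_documentation_texts()  # hypothetical helper returning one string per document

# Bag-of-words representation of the documentation corpus
vectorizer = CountVectorizer(stop_words="english", max_df=0.9, min_df=2)
X = vectorizer.fit_transform(docs)

# Grid search over the number of topics and the learning-rate decay
lda = LatentDirichletAllocation(max_iter=100, learning_method="online", random_state=0)
param_grid = {"n_components": [4, 6, 8, 10], "learning_decay": [0.5, 0.7, 0.9]}
search = GridSearchCV(lda, param_grid, cv=3)
search.fit(X)

best = search.best_estimator_
doc_topics = best.transform(X)          # document-topic distributions
main_topic = doc_topics.argmax(axis=1)  # most probable topic per document
print(search.best_params_, main_topic[:10])
```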

Academic Track
Room V
15:00
30min
Deep Pavements Framework: Combining AI Tools and Collaborative Terrestrial Imagery for Pathway Mapping
Kauê de Moraes Vestena
  1. Introduction & Related Work
    Pedestrian mobility is crucial in urban environments, and its promotion can contribute to the achievement of many UN SDGs (Adriazola-Steil et al., 2021). Mapping, which enables public scrutiny and long-term optimized planning, is indispensable in this context.
    With the widespread availability of large sets of open street-level imagery, such as Mapillary, there is now a significant opportunity for data extraction, which also presents significant challenges (Ma et al., 2019). The richness of detail in these urban landscape representations can help us better understand the peculiarities of urban environments. The scope of this project is to make use of them for the study of pathways, focusing in particular on the verification of their existence, their categorization (road, sidewalk, or general footpath), and the identification of their surface material. Since pedestrian crossings are part of both the car and pedestrian networks, and road characteristics (such as material and width) significantly impact pedestrian safety, it is worth noting that the study of roads is also fundamental to pedestrian infrastructure (Mesfin & Denbi, 2022). However, the central aspect remains sidewalks, often the most ubiquitous type of pedestrian thoroughfare (Kim, 2019), a valuable space for sociability (Osman, 2016), whose "health" is symptomatic of how pedestrian-friendly the city is (Mesfin & Denbi, 2022).
    Despite the importance of knowledge about them for understanding the urban environment, pavements are often poorly mapped (Vestena et al., 2023). Even fewer works delve into the problem of pathway surface identification: Zhou et al. (2023) used conventional Convolutional Neural Networks (CNNs) to identify pavement classes limited to asphalt, gravel, and cement; Zhang et al. (2022) used a similar approach to identify asphalt-only damage such as "potholes" and "patches"; only Mesquita et al. (2022) and Hosseini et al. (2022) performed pixel-level identification, although the first was limited to a "paved" versus "unpaved" categorization, while the second, despite having a more comprehensive approach, uses a categorization focused on New-York-centred classes and only classifies sidewalks. There is still a gap in approaches considering standardized surface types and generalized path detection.
  2. The Framework
    We propose the Deep Pavements Framework to address these issues. It is a modular project, with each part contributing to the solution of a different challenge. The first module is the Surface-patches Dataset, labeled following the OpenStreetMap surface=* tag standard and currently supporting the categories "asphalt", "cobblestone", "compacted", "concrete plates", "concrete", "grass", "gravel", "ground", "paving stones", and "sett". The second module is the Runner, which processes the data for a given region. The third is the Sample-picker, which generates random samples for dataset generation. There is also the Sample-labeler, to label samples interactively, and a central module to guide the potential user through the project's usage. Each module relies upon a different set of dependencies, thus reducing runtime issues. It is important to highlight the primary usage of containerizing engines, i.e., Docker.
    Beyond modularity, Deep Pavements has the following core design principles: 1) complete openness, meaning that all its dependencies must have a broadly permissive license that enables even commercial usage; 2) ease of reproducibility, through a straightforward setup and a well-documented command line interface (CLI); 3) evolvability: State-of-the-Art (SOTA) algorithms are constantly changing, so at each new release of the Runner/Sample-picker images a new set of tools can be employed while keeping the same CLI; nevertheless, the user is still able to use a previous release; 4) standard-anchoring, with classes that have been agreed upon by the broad crowd-sourced knowledge base constituted by the OSM community (Rahmig & Kludge, 2013; Mooney & Minghini, 2017).
    The implementation of the main modules (Runner/Sample-picker) uses open-vocabulary AI algorithms to perform the data extraction, following this workflow: 1) Grounding DINO (Liu et al., 2023), based on free-text input, detects the bounding boxes of the detections; 2) Segment Anything (Kirillov et al., 2023) transforms each bounding box into a mask; 3) three different versions of the CLIP algorithm (Radford et al., 2021) test whether the detection is a hallucination; 4) if confirmed, a specialized version of CLIP is used to finally check the surface material, using the largest rectangle cheaply clipped from the detection, ensuring the use of the patch whose texture is least hindered by the effects of perspective (Lederman & Klatzky, 1995) and that is free of no-data pixels, another potential source of classification jeopardy (Kang et al., 2019).
    The results of experiments with the presented design mainly point out the need for hallucination testing. This procedure acts as a shield against one of the main drawbacks of open-vocabulary algorithms (Ben-Kish et al., 2024). The use of this particular type of algorithm was essential due to its flexibility (Wang et al., 2024) and the potential for better semantic understanding of the scene due to its embedded language model (Eichstaedt et al., 2021). There is also the possibility of allowing the user to opt out of some or all of the OSM standardized classes, which can be helpful in scenarios with regional uniqueness.
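
    As a hedged illustration of the final surface-classification step only (not the full Grounding DINO and Segment Anything pipeline, and not the specialized CLIP model the project trains), a zero-shot CLIP query over a cropped ground patch could be sketched as follows; the image path and label phrasing are placeholders.

```python
from PIL import Image
from transformers import pipeline  # assumes the Hugging Face transformers library

# Zero-shot image classification with an off-the-shelf CLIP checkpoint
classifier = pipeline("zero-shot-image-classification",
                      model="openai/clip-vit-base-patch32")

# OSM surface=* values used as candidate labels (subset shown)
surfaces = ["asphalt", "cobblestone", "compacted", "concrete", "grass",
            "gravel", "ground", "paving stones", "sett"]

# Hypothetical crop: the largest clean rectangle extracted from a detected pathway
patch = Image.open("pathway_patch.jpg")
results = classifier(patch, candidate_labels=[f"a photo of a {s} surface" for s in surfaces])
print(results[0])  # highest-scoring surface label with its score
```
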
  3. Final Remarks
    Deep Pavements is an innovative and comprehensive toolset under continuous development, with all modules maintained on GitHub and the central module available at https://github.com/kauevestena/deep_pavements_project. The framework enables creating pavement data that is seamlessly pluggable into OSM.
    As future challenges, we plan to filter low-quality images that can occur in the primary data source (Ma et al., 2019); to detect other visually identifiable pavement traits, such as decay, standardizing them with OSM tags such as smoothness=*; and to integrate photogrammetric tools to obtain additional modeling of pavements, with the main interest being the measurement of pavement width, one of the most relevant pieces of information for accessibility assessment (Kim et al., 2011).
Academic Track
Room III
15:00
30min
Free and Open-Source Software Solutions in the GeoRondônia Project: Efficiency in Georeferencing of Rural Settlements with QGIS and GeoINCRA plugin
Leandro França, Dra. Ranieli dos Anjos de Souza, Valdir Moura, Marcelo Vinicius Assis de Brito, Bárbara Laura Tavares

The National Institute of Colonization and Agrarian Reform (INCRA) is a body of the Brazilian federal government, linked to the Ministry of Agriculture, Livestock and Supply (MAPA). Its main mission is to execute national agrarian policy, promoting agrarian reform and land planning in Brazil. INCRA works on several fronts to guarantee territorial regularization, sustainable development and social justice in the countryside.

The Amazon is a strategic region for INCRA due to its unique characteristics and specific challenges. The body works on land regularization to guarantee the legal security of squatters and combat land grabbing, thus contributing to environmental preservation, and thus, favors the control and monitoring of rural properties, helping to combat illegal deforestation and environmental degradation.

For regularization to become a reality, INCRA has signed Decentralized Execution Terms in several Brazilian states. In Rondônia, the partnership was established with the Federal Institute of Education, Science and Technology of Rondônia (IFRO), through TED 20/2021/INCRA-SEDE/IFRO, called GeoRondônia Project. The project aims to serve and document more than 25,000 families and rural properties. To achieve this, the necessary steps are the georeferencing of properties, rural environmental registration and occupational supervision.

The georeferencing of rural properties is a complex process that requires knowledge in surveying, legal aspects, precision and efficiency. With the growing demand for land regularization in Brazil, especially in large areas such as the Amazon, it is essential to seek solutions that reduce costs, automate tasks and ensure data quality. This article presents the productivity gains achieved by the GeoRondônia project, which uses QGIS and the GeoINCRA plugin to automate tasks. The methodology developed involves data validation and quality control, with automation implemented in Python, ensuring accuracy in property certification and the large-scale generation of .ODS spreadsheets (certification document).

This is an innovative methodology, unavailable in any other Geographic Information System, open or private, whose steps were developed to meet the highest technical quality standards for georeferencing projects with large volumes of data, such as in GeoRondônia, which has worked in settlements that have more than 2,000 properties.

Initially, all georeferenced data processing was carried out manually. For this reason, the Space Research Group (GREES/IFRO), together with collaborators from the Federal Institute of Paraíba (IFPB), has supported the project's actions for the development of innovative technology, in order to optimize the certification of rural properties using free tools.

The biggest challenges of the project are time and qualified labor. Therefore, aiming to increase productivity, the Free and Open Source Software Solution called FOSSS GeoRondônia was developed, integrating features of QGIS and, mainly, the GeoINCRA plugin. This set of functions enabled gains in time, in the professional qualification of employees and in finances, as it is completely free, and it can be replicated in any other project involving georeferencing of large data volumes.

To optimize demands, the main steps of FOSSS GeoRondônia, after adjusting the observations tracked in the field to meet INCRA's positional accuracies, are: a) elimination of topological errors in the geometries of the settlement database; b) elimination of errors when filling in vertices and limits in the settlement database; c) elimination of errors in the name of the property and .ODS spreadsheet; d) generation of .ODS spreadsheets in an automated way and launch in INCRA's Land Management System (SIGEF).
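
As a minimal sketch of step (d), automatically generating an .ODS spreadsheet, one could use pandas with its ODF writer, as below; the layer, column names and output layout are hypothetical and do not reproduce the GeoINCRA plugin itself.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical layer of surveyed vertices (points) for one property
vertices = gpd.read_file("parcel_vertices.gpkg")

# Assemble the columns typically needed for a certification spreadsheet (illustrative layout)
table = pd.DataFrame({
    "vertex_code": vertices["codigo"],
    "longitude": vertices.geometry.x,
    "latitude": vertices.geometry.y,
    "sigma_m": vertices["precisao"],
})

# Write one .ODS file per property, replacing a previously manual step (requires odfpy)
table.to_excel("parcel_0001.ods", engine="odf", index=False)
```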

The results obtained with FOSSS GeoRondônia are notable:
  • process automation, with the assembly of the database with greater security, precision and practicality;
  • automation in product generation, where .ODS spreadsheets that were previously prepared manually are now generated automatically;
  • reduction of spreadsheet preparation time by around 70%;
  • elimination of topological and writing errors, ensuring accurate and consistent data;
  • cost savings from using QGIS and the GeoINCRA plugin, eliminating the need for expensive licenses and allowing a more efficient allocation of public resources.

Currently, the GeoRondônia Project has already organized the databases for 8 Settlement Projects, comprising around 1,500 properties. A very productive result was the field collection, processing and submission of 3 new Settlements (574 properties) in the state of Rondônia in record time for launch by the Federal Government, all carried out between April and May 2024.

The processing of georeferenced data at a very large scale using FOSSS GeoRondônia allowed the generation of precise products and their reliable submission to SIGEF, a workflow that can be replicated for any project in Brazil. This has promoted greater efficiency in the services performed and has helped INCRA meet the large volume of Agrarian Reform demands, transforming rural settlements into true agents of sustainability and productivity.

Based on this, we are interested in presenting this new functionality to the national and international audience at FOSS4G who are looking for free solutions to different demands, in order to promote the dissemination and replication of FOSSS GeoRondônia.

Academic Track
Room I
15:00
30min
The Use of GeoAI Techniques for Gathering, Storing, and Analyzing Historical Agroecological Data
Cláudia M. Viana

Most historical sources, available in multiple formats (e.g., tabular and analog data), contain valuable geographic information. This data can be transformed to generate both quantitative and qualitative insights, enabling the creation of digital maps and unlocking significant potential for scientific analysis. However, the use of historical data presents several challenges: 1. Sources need to be digitized; 2. Collections are often spread across multiple archives; 3. Metadata is often unavailable; 4. Standardizing diverse sources and quantitatively reconstructing data from various periods is difficult; 5. The reliability of historical data can be uncertain; 6. There is limited spatial resolution; and 7. Inaccuracies and text legibility issues are common. These challenges underscore the need for novel methodologies aimed at enhancing the quality and quantity of such sources. This paper presents the findings of the exploratory project AgroecoDecipher (2022.09372.PTDC), dedicated to extracting a comprehensive database from historical textual records and analogue map files to trace agroecological patterns. Employing an exploratory methodology grounded in artificial intelligence (AI) and Geographic Information Systems (GIS), the projected solutions include the establishment of routines based on AI tools that combine GIS, machine learning (ML), and Large Language Models (LLMs). Approximately 271 survey books from the 1950s were digitized at the municipal level, with a total sheet count exceeding 42,000. Additionally, more than 100 analogue maps were digitized, processed, and vectorized, resulting in a detailed geodatabase map archive. The results are promising and demonstrate that the integration of AI and geospatial tools has proven essential in transforming raw historical data.

Academic Track
Room II
15:30
15min
Coffee-Break
Room I
15:30
15min
Coffee-Break
Room II
15:30
15min
Coffee-Break
Room III
15:30
15min
Coffee-Break
Room IV
15:30
15min
Coffee-Break
Room V
15:30
30min
Poster Session II

Building an AI Dataset to Estimate Vegetation Carbon Using a QGIS-Based Annotation Tool

  • Yoon Jeongho, Korea Environment Institute, Korea
  • Lee Sanghyuk, Korea Environment Institute, Korea
  • Son Seung-woo, Korea Environment Institute, Korea

Violence: Types, Distribution, and Frequency. A Study on the Re-concentration of Homicides in the City of Rosario, Argentina (2007-2023)

  • Silvina Meritano, National Scientific and Technical Research Council (CONICET), Argentina; Centre for Research and Studies on Culture and Society (CIECS-UNC), Córdoba, Argentina

Use of Vegetation Indices for Monitoring Forest Cover in a Conservation Unit in the Far West of the State of Acre

  • Ananda Kellen Silva Rocha, Federal University of Acre, Cruzeiro do Sul, AC, Brazil
  • Anelena Lima de Carvalho, Professor, Federal University of Acre, Cruzeiro do Sul, AC, Brazil
Academic Track
Main Hall
15:45
30min
Natural Language Processing and Voice Recognition for Geolocation and Geospatial Visualization in Notebook Environment
Nathan Damas

Innovations, such as voice recognition and natural language processing (NLP), have significantly impacted various fields by enabling more natural interactions between humans and machines (Mahmoudi et al., 2023). In geoinformatics, these advances are crucial for visualising geospatial data, allowing the creation of interactive and dynamic maps (Craglia et al., 2012). Online mapping applications, like OpenStreetMap (OSM), have democratised spatial information by enabling public participation in its creation and maintenance (Haklay, 2010). Geolocation is essential in contemporary applications, such as navigation, emergency services, and location-based services. Google Colaboratory (or Colab) Notebook Environment stands out in promoting open science due to its accessibility, ease of use, and collaborative capabilities, and enabling the embodiment of the FAIR principles (Camara et al., 2021). This study aims to develop a voice interaction application in Google Colab Notebook Environment to answer the question: "Is it possible to develop a voice command application for geolocation and visualisation of geospatial data within the Google Colab environment?" The methodology includes FOSS libraries and tools such as geopy, speech_recognition, ffmpeg, librosa, and flask, subdivided into six stages: Audio Data Acquisition, Audio Processing, Speech Recognition, Geocoding, Visualization, and Interface Development. The complete code, under an open license, and how to reproduce this work are available on GitHub. Audio capture is performed using the Web Speech API in JavaScript (JS), which allows real-time voice recognition and integration with the MediaDevices API to access the user's microphone. This method provides an interface for high-quality audio recording, essential for speech recognition and geocoding accuracy. Audio processing involves converting the ".webm" format to ".wav" using ffmpeg, efficiently maintaining the original audio quality. The Librosa library loads the audio, adjusts the sampling rate, and extracts relevant features from the audio signal, such as spectrograms (Bisong, 2019). Speech recognition is performed with the SpeechRecognition library in Python, which provides an interface for various speech recognition services, including the Google Web Speech API. This choice is due to its high accuracy and support for multiple languages, ensuring the system's flexibility and accessibility to a diverse audience (Nassif et al., 2019). Geocoding transforms textual descriptions of locations into geographic coordinates, allowing the visual representation of these locations on an interactive map. The geopy library and the Nominatim service from OSM are used to convert addresses into latitude and longitude coordinates (Mooney & Corcoran, 2012). For the visualisation of geocoded data, a web server was implemented using Flask, a microframework for Python that allows the creation of lightweight and efficient web applications. The user interface was developed with HTML, CSS, and JS, providing an intuitive and interactive experience. The results show that the user and machine interaction occurred satisfactorily. The first message displayed to the user instructs them to slowly state the name of the city, state, or country they wish to geolocate. The use of JS and the Web Speech API allowed the system to detect specific voice commands to start and stop recording, as indicated by the interface colors and states. 
Capturing clear and intelligible audio is crucial for the subsequent steps. When the start command is recognised, the interface changes to indicate that the recording is in progress. The message "Command recognised: starting recording" confirms that the command was detected correctly. If the voice command is not recognised, the interface displays a message asking the user to repeat the command. After recording, the audio is saved in ".webm" format; if a previous audio file exists, it is automatically overwritten, which simplifies file management and avoids the accumulation of unnecessary data. Next, the audio is converted to ".wav" format using the ffmpeg library. The audio is then transcribed with the SpeechRecognition interface to the Google Web Speech API, and the interface displays the transcription in the recognised language, along with confirmation of the geocoded location and its respective latitude and longitude. This visual feedback proved essential for the user to confirm that the entered information was recognised, improving the system's usability. The displayed information includes city, region, country, latitude, and longitude. The interactive map allows the user to visualise and interact with the located area, adjusting the zoom level and receiving a voice message informing them of the map's current zoom level. This work presented the integration of tools that advance human-computer interaction in geoinformatics, offering an intuitive and accessible interface for users of different technical proficiency levels. The results confirm the feasibility of voice-command geolocation in Google Colab, a platform that can be used for education, research, collaboration, and sharing in science, enabling this work's reproducibility. Future research can improve voice interaction features, explore geolocation methods such as bounding boxes, and reduce dependence on JS and Flask. Improving the requirements for peripheral devices could further increase the system's accuracy, accessibility, and user experience. The importance of geospatial accessibility lies in enhancing service provision, urban planning, and social inclusion, facilitating mobility for people with disabilities, and improving urban infrastructure (Han et al., 2020).
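The abstract describes a Flask server that delivers the geocoded result to the HTML/CSS/JS interface; a minimal sketch of such an endpoint, with an illustrative route and query parameter rather than the authors' actual code, could be:

```python
from flask import Flask, jsonify, request
from geopy.geocoders import Nominatim

app = Flask(__name__)
geolocator = Nominatim(user_agent="voice-geolocation-demo")  # illustrative user agent

@app.route("/geocode")
def geocode():
    # The transcribed place name is passed as a query parameter by the front end
    place_name = request.args.get("q", "")
    location = geolocator.geocode(place_name)
    if location is None:
        return jsonify({"error": "location not found"}), 404
    # The HTML/JS interface can use these coordinates to centre the interactive map
    return jsonify({"place": location.address,
                    "lat": location.latitude,
                    "lon": location.longitude})

if __name__ == "__main__":
    app.run(port=5000)
```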

Academic Track
Room II
15:45
30min
The Web GIS as an Auxiliary Management Tool for In-Person Teaching with Technological Mediation in the Amazon Rainforest: The Case of CEMEAM, Amazonas, Brazil.
Alexandre Donato da Silva

The Amazon faces significant logistical challenges due to its size and complex geographical features, such as the dense rainforest and vast hydrographic network. In Brazil, Amazonas stands out for its ecological diversity and difficulties of access, requiring innovative solutions for educational development. The Amazonas Education Media Centre (CEMEAM) uses In-Person Teaching with Technological Mediation (IPTTM) to overcome these obstacles, promoting digital inclusion and democratizing access to education. Implemented in 2007 by the Amazonas State Secretary of Education and School Sports (SEDUC-AM), it combines face-to-face classes with satellite videoconferences and other media, reaching more than 25,000 students in an area of 1,571,000 km². To help with spatial management, CEMEAM was presented with a proposal for a Geographic Information System in a web environment (WebGIS) using free software and plugins (QGIS and QGIS Cloud) as well as freely accessible products. The WebGIS enabled a variety of analyses and proved it could contribute to more efficient spatial management, adapted to regional specificities, such as the isolation of localities and the dependence of students on river transportation. Although it is not yet a formalized institutional tool, the WebGIS demonstrates the potential of geotechnologies in educational management in the Amazon, serving as an important experiment to support CEMEAM in its needs.

Academic Track
Room I
15:45
30min
WebODM free software as a tool for digital aerial photogrammetric processing: employability in scientific productions
Fabrício Lisboa Vieira Machado

With the advancement of technologies applied to Global Navigation Satellite Systems and the popularization of Remotely Piloted Aircraft, aerial photogrammetry has experienced significant advances, especially in geographic and environmental research. Drones equipped with high-resolution sensors have revolutionized data collection, essential for topographic mapping and vegetation analysis. However, many technologies for digital processing and spatial analysis of original images are proprietary and expensive, making access difficult for institutions and researchers, especially in Brazil. This scenario makes free and open-source software, such as WebODM, an important alternative to democratize access to high-quality tools. In view of this, this article analyzes the applicability of WebODM in academic works, through bibliometric analysis in scientific repositories. The search included variations in the software nomenclature and the data were analyzed quantitatively. The Elsevier and Scopus databases led with 59 and 39 publications, respectively, while the national Scielo Brasil database had only one article hosted. There has been a gradual increase in the number of publications involving WebODM since 2016, with a peak of 38 papers in 2023. In comparison, the term "Agisoft Metashape", a proprietary solution for digital aerial photogrammetric processing, returned 1,595 publications in the same search portals. As an initial contribution to the understanding of the state of the art, it was observed that the thematic axes involving remote sensing, photogrammetry, precision agriculture and agricultural management were those that concentrated the largest number of scientific productions on WebODM in the investigated period, exceeding 30 papers. It is concluded that WebODM has stood out as a relevant tool in scientific research, especially for the processing of images derived from drones. Future studies should qualitatively evaluate the results obtained with the use of WebODM in comparison with proprietary software.

Academic Track
Room III
16:15
16:15
30min
GeoAI in resource-constrained environments.
marc böhlen

Advances in spatial and spectral resolution in private sector satellite imagery, together with geography-aware algorithms, have created new avenues for the use of Artificial Intelligence (AI) in geospatial applications, sometimes referred to as AI4Geo. However, these advancements are accompanied by significant costs in the procurement of data, computing resources, communication infrastructure and human expertise. We describe a case study in central Bali in which we developed multiple AI4Geo approaches to assist the WISNU foundation, a Non-Governmental Organization in Bali, Indonesia, in its ongoing efforts to manage community resources and to perform land mapping across small villages in Bali.

Concepts
The concept we explore here is multipath AI4Geo, which seeks to find the "best" approach to AI4Geo for resource-constrained environments. The assumption that larger models are always better does not hold where AI4Geo, trained on data from dominant western institutions, is applied in the majority world. Some of the most ambitious AI4Geo models are trained for land cover categories that are mostly of interest to the Northern Hemisphere. Given this imbalance, we ask how participants from low-resourced environments can best make use of AI4Geo.

Methodology
Based on field data from a study site in Bali, Indonesia, we have developed multiple open source AI4Geo land cover approaches to find the best way to represent agroforestry, a key indicator of sustainable and robust food production. We compare the image segmentation results from small models such as Random Forests (RF) and Support Vector Machines (SVM) with large models such as U-Net and ResNet152, not only along established model performance metrics such as the F1 score, but also in terms of their suitability for use in low-resource conditions. These conditions generally include a limited ability to collect large datasets, limited computational infrastructure, limited AI expertise and limited internet connectivity. We then describe a mixed-method, multi-pathway approach that produces good AI4Geo results while building the NGO's capacity to continue integrating AI4Geo into its operations and to plan for an even more challenging AI4Geo future dominated by large, homogenizing AI models.
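The small-model experiments are based on the Orfeo library (see the links below); purely as an illustration of the comparison logic, a scikit-learn sketch with synthetic stand-ins for the 8-band pixel spectra and field labels could look like this:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# X: per-pixel spectra (n_pixels, n_bands), y: land cover labels from field data.
# Random data here stands in for the 8-band imagery and labelled field samples.
rng = np.random.default_rng(0)
X = rng.random((5000, 8))
y = rng.integers(0, 5, 5000)  # e.g. 5 land cover classes, one of them agroforestry

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for name, model in [("RF", RandomForestClassifier(n_estimators=200, random_state=0)),
                    ("SVM", SVC(kernel="rbf"))]:
    model.fit(X_train, y_train)
    scores = f1_score(y_test, model.predict(X_test), average=None)
    print(name, "per-class F1:", np.round(scores, 2))
```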

Here are links to code experiments and instructions on generating the required input data for the U-Net model from geospatial shapefiles.

Small models (RF, SVM based on the Orfeo library)
https://github.com/realtechsupport/cocktail/tree/main/code

Large models (Custom designed U-Net and SATLAS based ResNet models)
https://github.com/realtechsupport/cocktail/tree/main/satlas_test
https://github.com/realtechsupport/cocktail/blob/main/sandbox/working_model/working_model_inference.ipynb

Results
While RF, SVM and U-Net approaches were all able to detect agroforestry in 8-band, 3-meter spatial resolution datasets provided by Planet Labs, we found that the SVM algorithm was most responsive to the limitations of our dataset while producing useful results that we could verify in the field. SVM was furthermore painless to update with additional field data. Figure 1 summarizes the results from the image segmentation after model training.

While U-Net's F1 score for agroforestry exceeds that of RF and SVM, the agroforestry extent it predicts is likely an overestimate. We believe this to be the case because the U-Net architecture ingested patches of 16 x 16 pixels, and these dimensions exceed the size of the smaller agroforestry plots detected in the field. The choice of the input patch size was in turn a function of the dimensions of the U-Net architecture, which was selected for its ability to minimize loss during training across all land cover categories.
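To make the patch-size issue concrete, the sketch below shows generic tiling of a multi-band scene into 16 x 16-pixel patches with numpy; the array is a placeholder, and this is not the project's own data-preparation code (which is linked above):

```python
import numpy as np

def tile_patches(raster, patch=16):
    """Split a (bands, height, width) array into non-overlapping patch x patch tiles."""
    bands, h, w = raster.shape
    tiles = []
    for row in range(0, h - patch + 1, patch):
        for col in range(0, w - patch + 1, patch):
            tiles.append(raster[:, row:row + patch, col:col + patch])
    return np.stack(tiles)  # shape: (n_tiles, bands, patch, patch)

# Placeholder array standing in for an 8-band, 3 m resolution scene
scene = np.zeros((8, 512, 512), dtype=np.float32)
patches = tile_patches(scene)
print(patches.shape)  # (1024, 8, 16, 16)
```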

As opposed to the three other models listed above, the large ResNet152 model was not trained on Planet Labs satellite imagery but on Sentinel-2 imagery. Because Sentinel-2 only has a maximum spatial resolution of 10 m/pixel, it is not able to distinguish small-scale landscape features such as agroforestry, which typically occupies small plots in random arrangements. While the ResNet model was trained on the largest dataset, with over 300 distinct labels across 137 classes represented across 64 million images, the class labels are not tuned to the spectral signatures of agroforestry and deliver only crude results in our selected study area, as Figure 2 shows. Moreover, the ResNet152 model that supports multispectral Sentinel-2 input has over 80 million trainable parameters, exceeding our bespoke U-Net model by more than an order of magnitude, thus making its use more costly.

While we have not fine-tuned the ResNet152 model with our own highest-resolution Planet Labs data due to spatial resolution mismatches, it seems clear that the effort would exceed the capacities of our partner organization WISNU. Our dilemma is that the most promising large models are unwieldy and not adapted to our land cover conditions, while the smaller models we have end-to-end control over can be tuned with smaller datasets but run the risk of becoming obsolete over time in the AI arms race, where larger and more powerful models become standard-bearers. While the agroforestry-specific results we observe are characteristic of our study area and the constraints our project operates under, the homogenizing forces of large models pose a condition that all AI4Geo operations face. For that reason, the territory of this project is significant beyond the immediate results we produce.

Our solution to this dilemma is two-fold: we deploy multi-pathway AI4Geo across various levels of technical complexity while retaining agency for local stakeholders. The WISNU foundation performs preliminary studies of Sentinel-2 satellite imagery in the QGIS environment to survey sites and build simple datasets. It then uses QGIS-integrated small-model approaches such as Random Forests to build baseline segmentation maps of a given area. The research team then collects Planet Labs-based higher-resolution data and uses the cocktail suite of models, including U-Net, to deepen the study results. In parallel, we jointly use the SATLAS ResNet models to find synergies in those results. Across the approaches, we build land cover analyses that optimize limited resources while producing solid analytical results.

Academic Track
Room III
16:15
30min
OpenStreetMap in the Training of Local Actors in Projects Aimed at Community-Based Interventions in Favelas: A Systematic Review Supported by AI
Patricia Lustosa Brito, Pedro Melhado

Public participation is of utmost importance for community mobilization and engagement: through residents' networks and relationships, both within and outside the community, space is created through social action. According to Goodchild (2007), there is a demand to generate information that helps vulnerable communities to strengthen relationships with the government responsible for promoting important interventions to bring about change. Each resident's sensitivity and ability to report needs from their own perception can be used to understand the needs of their community. Therefore, it is important to use open and free mapping tools that represent the community's demands, producing information that allows collective autonomy to carry out strategies that involve the public authorities, through co-optation and coalitions, aimed at community-based interventions (Silva, 2014).

In this sense, OpenStreetMap (OSM) stands out as a tool for collaborative mapping and community interventions in highly vulnerable urban areas, such as favelas. Given OSM's participatory and open nature, it allows the creation of maps with records of features of various kinds, but it is also of great value as a platform for community training and the development of geospatial skills (Bortolini and Camboin 2019).

The objective of this work is to analyze, from the literature, how OpenStreetMap has been used for training and empowering local actors in projects aimed at community-based interventions in favelas.

A systematic literature review was carried out with the support of artificial intelligence tools (ChatGPT-4, Elicit, Semantic Scholar, ChatPDF), bibliography management software (Zotero), and software for visualizing bibliometric networks (VOSviewer), combined with other research methodologies (P.I.C.O., Bardin) to assist in the overall evaluation of the literature. Although AI tools are powerful aids to the review, they do not replace the need for critical judgment and human expertise, and they demand solid knowledge of both the content and scientific methodology.

The following keywords were used in the first search on the Web of Science (WoS) and Scopus platforms: [Collaborative mapping, Community intervention, OpenStreetMap, Community empowerment, Community mobilization, Citizen participation]. In this first search, some filters were established, such as a publication date within the last 10 years and open-access articles. Based on these filters, 43 articles fitting these specifications were found in the Web of Science, with the vast majority in English. This first literature search also revealed the need to increase the number of articles most closely aligned with the theme, so the keywords were adjusted based on these articles. In this second search, other databases were also included, such as Scopus and Google Scholar. The artificial intelligence tools Elicit and Semantic Scholar were essential to find articles using the keywords that were most repeated in the main articles. Still in this second search, these adjusted keywords were entered into ChatGPT-4 to generate search strings under the acronym P.I.C.O. (Population; Intervention; Comparison; Outcome) for use in the Web of Science.

Keywords for the second search: [Collaborative mapping, informal settlements, urban slums, OpenStreetMap, public participation, community engagement, community-based intervention, community intervention].
Final search string: [("Collaborative mapping" OR "participatory mapping") AND ("community intervention" OR "community-based intervention") AND ("OpenStreetMap" OR OSM) AND ("community empowerment" OR "empowerment") AND ("community mobilization" OR "community engagement") AND ("citizen participation" OR "public participation") AND (favelas OR "informal settlements" OR "urban slums")]. Finally, twenty articles were identified in the Web of Science, Google Scholar, and Scopus databases.

Based on these 43 articles, a synthesis framework is being built that aims to systematize information about the works found, recording: 1) Source/Base/Collection, 2) Reference according to ABNT, 3) Name of the Journal, 4) Contact of the main author - email, 5) Country of affiliation of the authors, 6) Country of the mapped community, 7) Problem/Objective/Hypothesis, 8) Methodology, 9) Materials used, 10) Techniques used, 11) Main results, 12) Does it work with a favela? 13) What is the nature of the community-based intervention? 14) Did favela residents operate OpenStreetMap? 15) Where is the favela and/or intervention? 16) If mapping in a favela, what features and attributes were mapped? 17) Were integrated digital and analog cartographic technologies used? Which ones? 18) Were methods used for community appropriation of cartographic tools and data? 19) Was educational material provided? Indicate the link. 20) Was a method for evaluating the tools and processes used implemented? 21) Were community impact indicators used? 22) Are effective impacts felt by the community reported? Which ones?

By constructing this framework, we are evaluating aspects such as the geographic diversity in the use of OSM, indicating the platform's flexibility and adaptability, or the still limited participation of local actors in mapping their communities. By compiling data on the nature of community-based interventions, techniques and methodologies used, and community impact indicators, we aim to identify common patterns in the types of interventions that have been most effective. Furthermore, the analysis of reported impacts can indicate tangible benefits of these projects.

The analysis of how projects addressed training and education, including the provision of educational materials and methods for community appropriation of cartographic tools and data, can indicate strategies used to empower local communities. We are also analyzing the features and attributes mapped specifically in favelas, to identify the main challenges and specific needs of these areas, supporting the indication of demands for improvements in methodologies and mapping tools in these urban contexts.

Thus, we are building a comprehensive framework on the current state of the use of OpenStreetMap for training and intervention in favelas, identifying gaps, challenges, and opportunities for future research and projects.

Academic Track
Room I
16:45
16:45
30min
Bridging the Gap: GPSSample – An Innovative Tool for Enumeration and Sampling in Health Surveys
Amber Dismer, Joel Adegoke

Abstract:
Global Gap: Updated population estimates and total households (HHs) per area, typically obtained through a census, are used to construct the unbiased sampling frames necessary for accurate estimates in health surveys. These population and HH estimates are used to select a smaller representative sample of enumeration areas (EAs) to visit for a public health study. Several methods exist to select which HHs to visit at the EA level, including systematic sampling (every nth HH), geographically sampling structures using satellite imagery, segmenting an area, and mapping and listing all HHs in the selected EA. Accurate estimates are essential for public health programming and response; however, forty-two countries have not conducted a census for over a decade (United Nations Statistics Division 2024). Instead, programs often generate an accurate sampling frame by enumerating HHs within selected EAs, obtaining answers to eligibility criteria, drawing a sample from this enumerated list, and navigating back to selected HHs. This requires considerable resources. To date, no free mobile-based application exists to streamline these processes.
Solution: The GPSSample application is a user-friendly sampling solution for selecting HHs within EAs. The study administrator creates a configuration in GPSSample, defining the eligibility screening questions and specifying the number or percentage of HHs to return to in each EA. An example screening question for an immunization coverage survey is: "Are there any children living in this HH between six and fifty-nine months old?". In GPSSample, teams can rapidly enumerate HHs in an EA and collect answers to these screening questions. Teams send encrypted data to a supervisor via a local-only mobile hotspot joined through QR codes. Next, the supervisor presses a button, easily generating a simple random sample from the sampling frame of enumerated HHs. The selected HH list is sent to the teams, who use GPSSample to navigate back to the selected HHs to conduct surveys. GPSSample integrates seamlessly with survey applications, including ODK Collect and Kobo Toolbox, opening the second app's designated form. Users send unique HH IDs and cluster data from GPSSample to the specified HH survey form. Upon saving the HH survey, teams are returned to the GPSSample app to mark the status of the HH. Teams use a map and a list view of selected HHs for monitoring field work. Supervisors can view EA- and study-level summary statistics in GPSSample to monitor field work.
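GPSSample itself is an Android application written in Kotlin; purely as an illustration of the sampling step described above (a simple random sample of eligible enumerated households per EA, drawn by count or percentage), a minimal Python sketch with hypothetical record fields might look like this:

```python
import random

def sample_households(enumerated, n=None, pct=None, seed=None):
    """Draw a simple random sample of eligible households from one EA's enumeration list."""
    eligible = [hh for hh in enumerated if hh["eligible"]]
    k = n if n is not None else round(len(eligible) * pct / 100)
    rng = random.Random(seed)
    return rng.sample(eligible, min(k, len(eligible)))

# Illustrative enumeration records: household id, screening answer, GPS position
enumerated = [
    {"hh_id": "EA01-001", "eligible": True,  "lat": -1.2921, "lon": 36.8219},
    {"hh_id": "EA01-002", "eligible": False, "lat": -1.2923, "lon": 36.8222},
    {"hh_id": "EA01-003", "eligible": True,  "lat": -1.2925, "lon": 36.8226},
]
selected = sample_households(enumerated, pct=50, seed=42)
print([hh["hh_id"] for hh in selected])
```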
Furthermore, the GPSSample app can be used in surveys lacking any advance information on population or areas of concern; it is not necessary for a country to have conducted a census. Supervisors can draw an area within GPSSample onsite, segment the area, and assign segments to field teams to rapidly enumerate locations before sampling them for the assessment or survey. This novel capacity in GPSSample highlights its flexibility and potential for use in outbreak investigations and emergency responses where HHs may be damaged or destroyed. Additionally, food outbreak investigations may be conducted at market stalls or stores that are not collated in a central list. While designed in the public health context, GPSSample is useful for other disciplines.

GPSSample is a free Android 8+ application available in six languages: English, French, Spanish, Portuguese, Russian, and Bahasa. It is designed for field practitioners with limited mobile networks or Wi-Fi. The app was developed in the open-source Kotlin language and uses the open-source SQLite database. User guides, the GPSSample Decoder application for decrypting data, demonstration videos, and Quarto analyses are available through the GPSSample GitHub site.
Use Cases and Road Map: Development lessons learned will be presented from two public health GPSSample application pilots in India and Kenya. We aim to engage the FOSS4G development community to enhance GPSSample’s geospatial functionalities and learn best practices on maintaining and updating open-source code. GPSSample currently uses Mapbox. Ideally in a future version, users will also be able to select OpenStreetMap for the base map and the app will include navigation using offline turn-by-turn instructions.

References
United Nations Statistics Division, 2024: 2020 World Population and Housing Census Programme. Available at: https://unstats.un.org/unsd/demographic-social/census/censusdates/. Accessed 24 June 2024.

Academic Track
Room IV
16:45
30min
Georeferencing of Urban Trees Using Drones and Ground-Level Imaging, and Classification of Their Species by Machine Learning
Paulo Roberto Ferreira Maciel, Rodrigo Smarzaro

Forest registration is essential for effectively managing natural resources, enabling improved tree management (Kattenborn, Eichel and Fassnacht 2019). This process simplifies urban planning, allowing for a more conscientious approach and significantly contributing to the preservation of green areas. The proposal to reduce environmental impact, survey time, and required effort (Barbosa et al. 2018; Li et al. 2015; Beloiu et al. 2023) has motivated the growing use of computer vision for these tasks. Today, this represents a true cartographic revolution. These innovations enhance the quality of life in cities by providing accurate and up-to-date data to support critical decisions (Barbosa et al. 2018).
This work aims to detect, classify, and georeference trees in urban environments using image segmentation algorithms applied to aerial and street-level images. Several studies use aerial images (Beloiu et al. 2023; Wäldchen and Mäder 2018; Mlenek, Dalla Corte and Santos 2020), but our approach seeks to improve the detection and identification of tree species by combining street-level images with aerial images. Our model will be developed with the algorithms that present the best metrics for species segmentation and classification based on related studies. The project also prioritizes using free and open-source software in its development. This not only democratizes access to robust monitoring and analysis tools but also encourages collaboration and innovation in the geospatial community, aligning with the values of FOSS4G.
We will apply pre-processing techniques to the images to enhance the model's accuracy, including geometric and atmospheric correction with the QGIS software. Gaussian filters will also be applied to reduce noise, and contrast adjustments to make edges and textures more distinct. After this step, we will proceed to the feature extraction stage for automatic species identification using a machine-learning model. Given the increasing need for environmental preservation and sustainable management, the identification and classification of tree species has become a solid ally of ecological conservation, positively impacting urban quality of life.
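As an illustration of the noise reduction and contrast adjustment steps described above (the geometric and atmospheric corrections are handled in QGIS), a minimal OpenCV sketch with a placeholder file name could be:

```python
import cv2

# Load an RGB orthomosaic tile or street-level image (placeholder file name)
img = cv2.imread("tree_image.jpg")

# Gaussian filter to reduce noise, as described above
denoised = cv2.GaussianBlur(img, (5, 5), 0)

# Contrast adjustment with CLAHE on the lightness channel to sharpen edges and textures
lab = cv2.cvtColor(denoised, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(lab)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = cv2.merge((clahe.apply(l), a, b))
result = cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)
cv2.imwrite("tree_image_preprocessed.jpg", result)
```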
To map the urban area of Rio Paranaíba, an unmanned aerial vehicle (UAV) equipped with a high-resolution camera was used, capturing images with a 3.5 cm resolution. The UAV was operated autonomously, flying in parallel strips over the city. A 70% overlap between the images was used, resulting in a precise orthomosaic of the region and favoring more accurate georeferencing of the trees. OpenDroneMap software was used to create the orthomosaic, and georeferencing was performed with GPS during the flight. Street-level images were obtained with a camera that provides 360º coverage. For species classification, a training dataset was created from samples collected in the field, both aerial and ground-level. Various machine learning algorithms, such as Random Forest, Support Vector Machine (SVM), and Convolutional Neural Networks (CNN), were researched and evaluated for their accuracy in species classification.
Tree identification through images of trunks and leaves presents significant challenges due to high intraclass variability and high interclass similarity. High intraclass variability refers to the substantial differences between images of trunks or leaves of the same tree species caused by lighting variations, capture angle, and tree condition. On the other hand, high interclass similarity refers to the very similar visual characteristics between different species, making it difficult to distinguish one from another based solely on appearance. Additionally, improper color balance adjustments by cameras can introduce unwanted shades, such as a greenish tint, further complicating accurate classification. These combined factors make using deep learning for tree classification a complex and challenging problem (Cotrim et al. 2019). The approach proposed here, which combines remote sensing of aerial and ground-level images with advanced machine learning techniques, is expected to represent a significant advance in tree species classification. It allows for detailed analysis of trunk and leaf textures, potentially improving species identification accuracy significantly. Studies such as those by Kattenborn, Eichel and Fassnacht (2019) have demonstrated that CNN-based segmentation (U-Net) can achieve 84% accuracy in vegetation classification using high spatial resolution RGB images. The U-Net is widely recognized for its effectiveness in image segmentation tasks, especially in high-precision and detail scenarios. Its architecture captures complex features, making it ideal for detecting and classifying specific elements in high-resolution images. Additionally, the U-Net has shown consistent results in various remote sensing applications, making it a reliable choice for geospatial data analysis projects. Therefore, adopting the U-Net in the project can ensure superior tree species identification and mapping performance. This work aligns closely with the themes addressed at the FOSS4G event, as it demonstrates the practical application of free and open-source software tools in an environmental monitoring context. QGIS, OpenDroneMap, and OpenStreetMap exemplify how open technologies can be integrated to solve complex georeferencing and species classification problems. Furthermore, the focus on urban areas and the combination of drone and street-view data provides valuable insights for the geospatial community, showing the feasibility and benefits of free software for urban and environmental applications.

Academic Track
Room I
17:15
17:15
30min
Study Case of Erizo Juan Santamaría: from free map to official cartography
Jaime Gutiérrez Alfaro

This article presents the case of the mapping of the informal settlement Erizo Juan Santamaría. The neighborhood went from being an empty space on digital maps to being part of the official cartography of Costa Rica. The mapping was carried out using technologies based on free/open software and participatory cartography methodologies; the work was done jointly by the people who live in the community and Laboratorio Experimental (LabExp), a research and extension project of the public university Instituto Tecnológico de Costa Rica. The active participation of the community in the process was key for the Municipal Council of Alajuela, where the neighborhood is located, to make official the traces and names of the streets and alleys of Erizo Juan Santamaría for municipal purposes. Furthermore, at the request of the municipal council, the National Nomenclature Commission approved the names at the national level.

The informality of the Erizo Juan Santamaría neighborhood lies in the fact that the people who live in the space do not own the land. The territory where the neighborhood is located belongs to two public institutions: one part to the Municipality of Alajuela and the other to the National Institute of Housing and Urbanism (INVU). In the 1970s, the first families began to occupy the territory where the settlement is currently located. Since then, the inhabitants of Erizo Juan Santamaría have met their basic common infrastructure needs themselves, as well as managed access to public services. The two public institutions that own the land, as well as the neighboring neighborhoods, hold views of and interests in the informal settlement that are manifested in a tense relationship involving marginalization, manipulation, stigmatization, and invisibility.

In 2017, LabExp and representatives of the neighborhood agreed to work together on a four-year university extension project aiming to make the informal settlement visible to decision-making institutions and neighboring neighborhoods through maps. Until then, the neighborhood was not represented on commercial digital maps or on the free OpenStreetMap map. LabExp proposed a work plan based on participatory processes, the use of free software, and open geospatial data.

Two elements were prioritized for mapping, considering their relevance for the community in its relationship with the different decision-making actors. The first was the houses: since INVU was interested in developing a project to improve the neighborhood's housing infrastructure, the institution would carry out a census, and through a number on each house the map could be linked to the census data. The second was the streets and alleys, with the intention that neighbors improve the way in which they gave their home addresses when requesting services. At all times, OpenStreetMap was considered the repository where the collected data would be stored. The mapping process was carried out with free and open tools from the OSM ecosystem: OSMTracker to capture GPS data in the field, Fieldpapers to collect data in workshops and conversations with neighbors, JOSM to edit the OSM map, and QGIS both to create maps for capturing data and to create maps for disseminating the mapping process. The mapping activities and dynamics included free cartography workshops with students at the local school, field trips, and unstructured playful dynamics with children in the neighborhood.

In addition to the mapping, two activities were key to fostering a feeling of ownership of the process among the residents of the neighborhood and to disseminating the partial and final results. The first was the production of short videos in which the community's inhabitants narrate their reality regarding infrastructure, show the neighborhood, and describe the relationship with the decision-making institutions, in such a way that they link these experiences with the mapping process. The second activity was a voting process to choose names for streets and alleys. Each person in the neighborhood had the opportunity to make name proposals for the mapped transit spaces. Subsequently, the residents of the neighborhood were called to elections. One Sunday morning, each person had the opportunity to express their will, voting together for the names of their streets and alleys.

The mapping process was completed by 2021, and Erizo Juan Santamaría appeared on the digital maps. In OSM, the houses were included with their respective numbering according to the needs of the INVU, together with the streets and alleys with the names selected by the inhabitants, elements of public infrastructure, trees, and the proper name of the neighborhood. The community was also represented on other commercial maps. Thanks to the dissemination of the short videos and press releases in the University's and national media, the mapping process of Erizo Juan Santamaría became known to the members of the Municipal Council of Alajuela. The Council dedicated an entire session to hearing about the project and agreed to make official the names of the streets and alleys decided by the neighbors in the voting process. In addition, the Council had the names made official by the National Nomenclature Commission of the National Geographic Institute.
The case of Erizo Juan Santamaría is a unique example in the country in which, through participatory cartography, the production of free geospatial data contributed to official cartography. The visibility of the neighborhood on digital maps makes it easier for the inhabitants to access services that were previously denied or restricted due to the insecurity that people offering the service felt about visiting the neighborhood, partly due to stigmatization and partly because the location corresponded to an empty space on the digital map. Given the increasing use of digital maps to access services and make decisions, it is important to discuss the right of communities to appear on digital maps.

Academic Track
Room I
17:45
17:45
30min
SIMMAM 3.0 – Updating the Toolbox for the Conservation of Marine Mammals
Alencar Cabral

Marine mammals occur in low densities and usually in areas that are difficult to access. One of the main sources of information on marine mammals is stranded animals. However, strandings are rare events, and to be biologically meaningful they need to be accumulated over large distances, long times, or both. This work describes SIMMAM (Sistema de Apoio ao Monitoramento de Mamíferos Marinhos), a project aimed at organizing a database of marine mammal sightings and strandings along the Brazilian coast, available at https://libgeo.univali.br/simmam. It began as an internal research project by UNIVALI but is now used by IBAMA and ICMBio. Its initial implementation has already been described [Moraes, 2005; Barreto et al., 2006]. However, it has been almost completely rewritten since then, and SIMMAM 3.0 now conforms to the Darwin Core (DwC) standard, an international scientific initiative of the Taxonomic Databases Working Group (TDWG). The data architecture adopted is compatible with GBIF, which allows SIMMAM to become a data publisher of marine mammal occurrences.
For the development of SIMMAM 3.0, free and open-source tools were adopted. On the server side, PHP 7.4 was used with the Symfony 5.x framework. For the web client side, the site is rendered on the server and delivered to the browser as an HTML + JavaScript page with Bootstrap 5. The data exchange API was also implemented in PHP, following the XML standard of DwC. Data is stored in PostgreSQL 11.x with PostGIS 3.x, which allows manipulation of geospatial data. Tables were structured according to the DwC standard to reduce the complexity of the communication API.
As all occurrences in SIMMAM need to have a geographic position, the main interface for users to view the data is an interactive WebGIS. The implemented WebGIS has filters by taxon and by type of occurrence (sighting, stranding, incidental capture) to allow users to focus on specific data. To better display areas with a high density of records without generating visual clutter, the occurrence layer is clustered, grouping and ungrouping records according to the zoom level. Leaflet [Agafonkin, 2020] was used with the OpenStreetMap base map, as it is a modern map engine, has functionality optimized for mobile devices, and does not have any external dependencies. Leaflet supports multiple layers and is compatible with Open Geospatial Consortium (OGC) standards, with support for map mosaics, georeferenced images, WMS [Leaflet, 2020] and GeoJSON [IETF, 2021].
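SIMMAM itself implements this filtering in PHP/Symfony; purely as a language-neutral illustration of the taxon and occurrence-type filter described above, a minimal Python/psycopg2 sketch with hypothetical table and column names could look like this:

```python
import json
import psycopg2

# Hypothetical table and column names; SIMMAM's actual API is written in PHP/Symfony.
QUERY = """
    SELECT json_build_object(
        'type', 'Feature',
        'geometry', ST_AsGeoJSON(geom)::json,
        'properties', json_build_object('taxon', scientific_name, 'type', occurrence_type)
    )
    FROM occurrences
    WHERE scientific_name = %s AND occurrence_type = %s;
"""

conn = psycopg2.connect("dbname=simmam user=reader")  # placeholder connection string
with conn, conn.cursor() as cur:
    cur.execute(QUERY, ("Megaptera novaeangliae", "sighting"))
    features = [row[0] for row in cur.fetchall()]

# A GeoJSON FeatureCollection like this could feed the clustered Leaflet layer
print(json.dumps({"type": "FeatureCollection", "features": features}, indent=2))
```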
One key aspect of biological information is taxonomic identification. To avoid taxonomic instability, SIMMAM uses the taxonomic list provided by the Integrated Taxonomic Information System – ITIS (www.itis.gov). As the taxonomic classification of mammals is very stable, it was decided to keep a copy of the ITIS database locally to reduce latency, updating it on demand.
The types of occurrence records currently supported by SIMMAM are stranding, incidental capture and sighting. All these occurrence records have fields for defining the best taxonomic level, georeferencing the occurrence, describing biological material collected, and identifying the person responsible for the data. Stranding and incidental capture records contain information regarding the state of the animal (alive or dead), the condition of the carcass (decomposition stage), sex and length. For sightings it is possible to record environmental parameters such as weather condition, sea state and wind speed, as well as whether it was a single animal or part of a group, and the group size.
The first version of SIMMAM was made available in 2007 to the Centro Mamíferos Aquáticos (CMA), which started to use it as the main tool to integrate data for the Brazilian Stranding Network of Aquatic Mammals (Rede de Encalhes de Mamíferos Aquáticos do Brasil, REMAB). In the same year, SIMMAM was presented to the then General Coordination of Oil and Gas (Coordenação Geral de Petróleo e Gás – CGPEG), currently the General Coordination of Marine and Coastal Enterprises (Coordenação-Geral de Licenciamento Ambiental de Empreendimentos Marinhos e Costeiros – CGMAC), which used it to aggregate and organize marine mammal sighting data generated by marine mammal observers [Barreto et al., 2019; Britto, 2009]. Presently, sighting data are regularly uploaded to SIMMAM directly by the licensed companies.
As of June 2024, SIMMAM has 423 active users and holds 75,340 aquatic mammal records. Of these, 61% are private, but this proportion differs greatly depending on the type of record. For strandings, which are for the most part submitted by research institutions, 91% are private, as they are the results of individual efforts. For sightings, however, 61% are public, as they come mostly from the oil industry as part of the environmental licensing of their operations and mirror the public reports that have been delivered to IBAMA. As mentioned before, all the data held in the SIMMAM database, regardless of public availability, can be seen by the Brazilian environmental agencies (IBAMA and ICMBio).
The option to allow government agencies to use the whole dataset is extremely important for management purposes, as it enables environmental agencies to use even unpublished data generated by research institutions. But as the data is not available to the general public, this does not compromise its future use in academic publications. Also, a limited visualization of private data in the WebGIS, in which details of the record such as species and date are not shown, serves as an indication to other researchers that a specific institution has data on marine mammals in a specific area, fostering collaborations among institutions.
We believe that presenting this work at FOSS4G 2024 will allow us to discuss SIMMAM with the geospatial community and receive input to further improve the system. It shows a successful implementation of open geospatial technologies that is being used both by government and the academic community.

Academic Track
Room V
17:45
30min
Urban Cycling: Intelligent Bicycle Sensors for Road Safety and Sustainability
Felix Erdmann, Luis Fernando Villaça Meyer, Beatriz Gonçalves

1. Introduction

Urban transportation is transforming with a focus on sustainability and smart city initiatives. Cycling, a key element of sustainable urban mobility, needs robust infrastructure and reliable data for growth and integration into city planning. Despite advancements in sensor technology and (geo)data analytics, there is a gap in comprehensive collection and use of cycling-specific environmental, safety, and pathway data.

One major deterrent for citizens to using bicycles is the perceived danger in traffic. Identifying insecure sections is crucial to improving cycling infrastructure. Suitable countermeasures can change negative perceptions and promote cycling as a safe and sustainable mode of transport. Traditionally, only actual crashes are included in the official data informing city planning decisions. However, analysing high-risk occurrences like near-miss incidents, which greatly impact the perceived danger, can provide a more accurate understanding of cycling safety.

A number of projects already use different technologies to gather and provide data on bicycle safety and urban mobility, but the combination of environmental and road safety aspects is unique. Projects examining cyclist safety, particularly dangerously close overtaking manoeuvres, often involve remote data processing with machine learning or human analysis. Using live video or images from bike-mounted smartphones is effective but creates data overhead and privacy concerns. Additionally, microcontroller-based sensing systems can be complex to assemble, requiring technical skills and special equipment.

The objective of this work is to address the aforementioned gap by developing an innovative bicycle sensor system that leverages embedded artificial intelligence (AI) to process sensor data on the device. This approach has the potential to reduce data overhead and address privacy concerns while simultaneously providing actionable insights. Our work has the potential to make significant contributions to traffic and transport planning by providing valuable insights into traffic patterns and road safety concerns using extensive spatial datasets gathered by citizens.

System Design

At the core of our system is a microcontroller unit (MCU) of the senseBox family. The senseBox is a versatile, open hardware electronics kit specifically designed for citizen science projects and educational initiatives, with an emphasis on environmental monitoring and data collection.

The following environmental sensors are used:
- Temperature & rel. Humidity (HDC1080)
- Particulate Matter (SPS30)
- Acceleration (MPU6050)
- Time-of-Flight (ToF) ranging (VL53L8CX)

Moreover, battery management, Bluetooth Low Energy (BLE), and OLED-Display modules are included for connectivity and user feedback. All parts fit into a custom designed, 3D printed enclosure which is attached to the seat post of a bicycle.

The device communicates via BLE with an open-source smartphone app, which receives the sensor data and combines it with geolocation data. Datasets are recorded and saved on the smartphone, but can also be uploaded to openSenseMap as open data during the ride. Users can control levels of privacy (e.g. by setting privacy zones) to foster digital sovereignty.

2.1. Machine Learning on the Bike

We introduce two approaches to utilise machine learning capabilities with TensorFlow Lite on the sensor device: overtaking detection and road surface/quality classification. By processing the data directly on the device instead of sending it to larger servers, bandwidth and energy consumption are kept minimal.
In the low-resolution depth images recorded by the 8x8 multizone ranging ToF sensor, overtaking vehicles can be detected using shallow neural networks. This has already been described and implemented as a standalone solution in (Scharf et al., 2024), but integrating it into the mobile sensor system raises considerations of available processing capacity, suitable inference times and necessary accuracy, which will be addressed as part of this work.
To classify the road surface and its quality, the acceleration sensor will be used. While raw acceleration values can indicate the roughness of a road, surface classifications and quality estimations can reveal deviations between intended and actual surfaces. Using acceleration values and geolocation data, we will explore training a machine learning model using OpenStreetMap surface information as ground truth data.
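As an illustration of the on-device approach, the sketch below defines a shallow Keras network over single 8x8 depth frames and converts it to TensorFlow Lite; the architecture, file name and (commented-out) training data are illustrative assumptions, not the authors' deployed model:

```python
import tensorflow as tf

# Shallow CNN over single 8x8 depth frames from the multizone ToF sensor
# (architecture and training data are illustrative, not the deployed model).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8, 8, 1)),
    tf.keras.layers.Conv2D(8, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # overtaking vs. no overtaking
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(depth_frames, labels, epochs=20)  # labelled frames from field recordings

# Convert to TensorFlow Lite for on-device inference
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
with open("overtaking_detector.tflite", "wb") as f:
    f.write(converter.convert())
```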

3. Workshops

Engaging citizens in data collection, problem identification, and the construction of sensor stations empowers them and fosters the generation of new ideas. Our solution is a solder-free, easy-to-assemble mobile sensor device. We conduct a workshop in São Paulo, Brazil, where 20 participants build and mount their own mobile sensor device on bicycles. Afterwards, they collect environmental and bicycle-specific sensor data. Follow-up workshops in Münster, Germany will allow the comparison of the contrasting bicycle infrastructures in these cities, as well as the general urban environment differences, and will provide valuable data and insights into participants' perceptions.
This collaborative effort enhances participants' understanding of scientific methods and urban mobility challenges while ensuring that the collected data reflects cyclists' authentic experiences. By involving citizens as active contributors, we aim to bridge the gap between scientific research and community needs, fostering a more inclusive and participatory approach to urban mobility solutions.
After the workshops we conduct user studies with the workshop participants on the following topics:
Usability: Through surveys at the end of each session and interviews, participants provide feedback on assembling, mounting and connecting the bicycle sensor device.
Trust in Data: Participants review the data on their recorded dangerous overtaking manoeuvres and road surface types and compare its accuracy with their own perceptions.

4. Conclusion and Future Work

This comprehensive evaluation aims to provide a thorough understanding of both the user experience and the technical performance of the system, ultimately providing a data-driven foundation for improvements in urban mobility solutions. Insights gained from this work will inform future iterations of the project, ensuring the system collects high-quality data and meets the needs of cyclists, thereby effectively enhancing urban mobility and road safety not only for cyclists but for all users of the urban mobility system. Future work will include the development of an open-source, bike-related data analysis platform as a recommender system for bike infrastructure measures in cities.

Academic Track
Room I
No sessions on Thursday, Dec. 5, 2024.
No sessions on Friday, Dec. 6, 2024.
No sessions on Saturday, Dec. 7, 2024.
No sessions on Sunday, Dec. 8, 2024.