12-05, 17:00–17:30 (America/Belem), Room V
In today's fast-paced, data-driven environment, organizations that use geospatial data for analysis are challenged by managing complex datasets and frequent updates. Geospatial data provides valuable context and information, enhancing applications in various domains such as logistics, urban planning, environmental monitoring, and marketing.
Traditionally, many organizations have relied on no-code geospatial software with click-based interfaces. Due to their accessibility and user-friendly interface, these tools allow team members to visualize and manipulate geospatial data without the need for programming knowledge. However, as data starts to become more complex, these tools often present scalability limitations, restricting the full potential of geospatial data applications.
This paper explores the benefits of transitioning to a hybrid approach by integrating Python and open-source geospatial libraries into the data processing phase of geospatial analysis. By presenting the possible advantages gained and providing a hands-on example of Python use in geospatial data, the aim of this paper is to demonstrate how Python can play a pivotal role in overcoming the limitations of no-code solutions.
Python can enhance the data extraction phase, enabling integration with various data sources and APIs and connecting to external databases and web services. This capability supports consistent data exchange and real-time data integration. This phase can also be automated, summarizing all steps into a script that can be applied to every new dataset.
The processed data can be visualized using various libraries in a Python environment or used as input for traditional geospatial software. The hybrid approach leverages the user-friendly visualization tools of no-code software while enabling more sophisticated data processing capabilities.
The output data can also be used as input for developing custom algorithms, including the integration of machine learning models and artificial intelligence. This step enables a wide range of applications, such as urban feature prediction, classification or segmentation of remote sensing data, and clustering of spatial data.
Adopting a hybrid approach significantly enhances an organization's analytical capabilities. These advanced analyses provide deeper insights into spatial patterns and trends that manual methods alone may not reveal.
The hands-on example, based on census data from the Brazilian Institute of Geography and Statistics (IBGE), demonstrates geospatial data processing with Python and the GeoPandas library, both open-source solutions. This example will include the use of Python in geospatial data processing through the following steps: data extraction, data processing, and customized algorithm application.
In conclusion, integrating Python into these workflows enhances flexibility and analytical capabilities, allowing organizations to innovate in their solutions and create new opportunities for products and services. Coding elevates data-driven decision-making and enables more sophisticated and scalable analyses, particularly when dealing with large and complex datasets. This case study serves as an inspiring example for organizations and researchers aiming to maximize the potential of their geospatial data, highlighting the significant benefits of combining traditional geospatial software with powerful open-source tools.
This work received financial support from the State of São Paulo Research Foundation (FAPESP) (grant 2023/15663-7, 2024/05553-2, 2024/05727-0, 2024/05481-1).
Data scientist with over five years of experience in geospatial analysis, finance, marketing, and software development industry. Academic background in Architecture and Urbanism, with a PhD in Architecture and Urbanism, a Master's in Civil Engineering, and a Master's in Data Science, the latter with research focused on Deep Learning classification models applied to urban environment metrics, using satellite and street view images.