Integrating Earth Observation Data for Enhanced Health Response Systems: The EODCtHRS component of the HARMONIZE Project
Karine Ferreira, Marcos Lima Rodrigues, Adeline Marinho Maciel, Miguel Monteiro, Gabriel Sansigolo, Yuri Domaradzki Moreira Nunes, Ana Claudia Rorato Vitor, Luana Becker da Luz, Rachel Lowe
The lack of an integrated understanding of the connections between extreme weather events, environmental degradation, socioeconomic disparities, and their impacts on infectious disease outbreaks heightens the risk of disease spread. This issue is particularly critical in the Latin America and the Caribbean (LAC) region, where vulnerable communities have been more frequently affected by these events. The goal of the HARMONIZE project is to create digital toolkits that stakeholders in climate change hotspots can use to combine environmental, climate, and health data cost-effectively, in order to monitor a set of diseases affected by climate change and to issue alerts about them.
This talk will give an overview of the Earth Observation Data Cube tuned for Health Response Systems (EODCtHRS), a component of the HARMONIZE Project. The EODCtHRS presents a technical-scientific proposal, termed the HARMONIZE Instance, composed of back-end and front-end solutions developed using free and open-source software. These solutions provide integration and interoperability between specific sets of health, environmental, and climate data and the digital infrastructure of the Brazil Data Cube (BDC) project of the National Institute for Space Research (INPE).
The development of this proposal was divided into four working streams: the Drone, Health, Climate, and Data Science Environment modules. Furthermore, we developed a custom version of the web platform for visualizing and analyzing these data sources (https://brazildatacube.dpi.inpe.br/harmonize/dev/portal/explore). It is based on BDC Explorer 3.0 (https://brazildatacube.dpi.inpe.br/portal/explore), which offers improved capabilities for discovering, visualizing, and downloading data cubes built from remote sensing images. An ALPHA version of the HARMONIZE Instance has been produced.
The core foundation of this platform is the SpatioTemporal Asset Catalog (STAC) specification, which defines a way to store and search data using spatial and temporal operations. STAC enables the harmonization of data from different sources and maintains interoperability between all parts of the system. The solution uses a suite of technologies from the Python and R environments, in addition to PostgreSQL/PostGIS and GeoServer, which are needed to store and publish the data collections.
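To make the idea of spatial and temporal search concrete, the following minimal sketch builds the JSON body of a STAC API item search, using the standard `collections`, `bbox`, `datetime`, and `limit` fields. The endpoint URL, the collection identifier, and the bounding box are placeholders chosen for illustration and are not taken from the HARMONIZE or BDC services. The request object is built but not sent.

```python
import json
from datetime import date
from urllib.request import Request

# Hypothetical endpoint and collection ID, used only for illustration.
STAC_SEARCH_URL = "https://example.org/stac/v1/search"
COLLECTION_ID = "example-health-indicators"


def build_search_body(collections, bbox, start, end, limit=10):
    """Build the JSON body of a STAC API item search (POST /search)."""
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if start > end:
        raise ValueError("start date must not be after end date")
    return {
        "collections": list(collections),
        # Spatial filter: [west, south, east, north] in WGS 84 degrees.
        "bbox": [west, south, east, north],
        # Temporal filter: closed interval written as "start/end" in UTC.
        "datetime": f"{start.isoformat()}T00:00:00Z/{end.isoformat()}T23:59:59Z",
        "limit": limit,
    }


def build_search_request(url, body):
    """Wrap the search body in an HTTP POST request without sending it."""
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative box over northern Brazil; the coordinates are rough values.
body = build_search_body(
    [COLLECTION_ID], [-59.0, -10.0, -46.0, 3.0], date(2023, 1, 1), date(2023, 12, 31)
)
request = build_search_request(STAC_SEARCH_URL, body)
print(json.dumps(body, indent=2))
```

Because every collection is queried through the same search fields, a client written this way would not need to know whether a collection holds drone imagery, climate rasters, or health indicators.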
Below we present a brief description of each working stream:
Module 1 - Drone image: The main goal of drone image integration in the context of EODCtHRS is to provide a data infrastructure that meets the demands of health surveillance, especially in areas considered hotspots of climate change. To that end, we started exploring the integration of images generated by fieldwork campaigns in some locations of Pará State. The processing of these images relies on auxiliary information (course angle and flight height) and on EXIF and TIFF metadata tags to support the conversion of the raw images into Cloud Optimized GeoTIFF (COG) files, a format well suited to integration with the STAC specification implemented by the BDC infrastructure. In addition, mosaics were created using the OpenDroneMap application. The ALPHA version of these data collections (scenes and mosaics) has been published as layers in GeoServer, with the associated metadata available in STAC catalogs.
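As an illustration of how flight height and course angle can help place a raw image on the ground, the sketch below computes the ground sampling distance and the rotated footprint of a single frame. It is not the project's processing chain: it assumes a nadir-pointing camera, flat terrain, a projected coordinate system in meters, and an image whose top edge points along the course direction. All function and parameter names are hypothetical.

```python
import math


def ground_sampling_distance(height_m, focal_mm, sensor_width_mm, image_width_px):
    """Ground size of one pixel in meters, for a nadir camera over flat terrain."""
    return (sensor_width_mm * height_m) / (focal_mm * image_width_px)


def footprint_corners(center_e, center_n, course_deg, gsd_m, width_px, height_px):
    """Corner coordinates (easting, northing) of the image footprint.

    course_deg is measured clockwise from north. Corners are returned in the
    order top-left, top-right, bottom-right, bottom-left of the image.
    """
    half_w = gsd_m * width_px / 2.0
    half_h = gsd_m * height_px / 2.0
    theta = math.radians(course_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    for dx, dy in [(-half_w, half_h), (half_w, half_h),
                   (half_w, -half_h), (-half_w, -half_h)]:
        # Rotate the image-frame offset clockwise by the course angle.
        east = center_e + dx * cos_t + dy * sin_t
        north = center_n - dx * sin_t + dy * cos_t
        corners.append((east, north))
    return corners


# Illustrative values only; real values would come from EXIF tags and flight logs.
gsd = ground_sampling_distance(
    height_m=100.0, focal_mm=10.0, sensor_width_mm=10.0, image_width_px=1000
)
corners = footprint_corners(0.0, 0.0, 90.0, gsd, width_px=1000, height_px=500)
print(gsd, corners)
```

With the footprint known, the four corners could serve as ground control for georeferencing the frame before it is written as a COG.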
Module 2 - Health data: This module integrates health data for the EODCtHRS, including information from different stakeholders, mainly Fiocruz's Health Information Laboratory (LIS) and the InfoDengue initiative. Both projects produce health indicators, considering the impacts of environmental and climate change on the Brazilian population. The module also covers the development of two main packages.
The first, called EODCtHRS Health Indicator Processing (EHIPR), was developed in Python to read health indicators from CSV and Parquet files, aggregate them spatially and temporally, spatialize them, and load them into the HARMONIZE platform. The second, called EODCtHRS Data PUblisher (EDPU), is a Python package that publishes HARMONIZE datasets as layers in GeoServer and their metadata in a STAC catalog, making them available in the HARMONIZE Explorer. All data sources (drone images, climate indicators, and health indicators) used the EDPU package to publish the ALPHA version of the collections produced in the context of the HARMONIZE project.
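The kind of spatial and temporal aggregation described above can be sketched with the Python standard library alone. The example below is not the EHIPR implementation: the column names (`municipality_code`, `date`, `cases`) and the sample records are invented for illustration. It relies on one known convention, namely that the first two digits of a seven-digit IBGE municipality code identify the state.

```python
import csv
import io
from collections import defaultdict

# Invented sample records; column names are hypothetical.
SAMPLE_CSV = """municipality_code,date,cases
1500001,2023-01-03,4
1500002,2023-01-17,6
1500001,2023-02-02,1
1300001,2023-01-09,3
"""


def aggregate_cases(csv_text):
    """Sum case counts per (state code, year-month).

    Spatial aggregation: municipality to state, using the first two digits
    of the municipality code. Temporal aggregation: day to calendar month.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        state_code = row["municipality_code"][:2]
        year_month = row["date"][:7]  # "YYYY-MM" prefix of an ISO date
        totals[(state_code, year_month)] += int(row["cases"])
    return dict(totals)


totals = aggregate_cases(SAMPLE_CSV)
for (state_code, year_month), cases in sorted(totals.items()):
    print(state_code, year_month, cases)
```

Joining the resulting table to state or municipality geometries by code is what would then turn the indicator into a map layer.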
Module 3 - Climate data: This module integrates climatological data for the EODCtHRS, enabling queries to be executed directly through access interfaces and eliminating the need for data transfer. Within the project's scope, we consider two groups of products: those produced by the Fiocruz team from the ERA5-Land reanalysis dataset of the Copernicus Climate Change Service (C3S), which is implemented by the European Centre for Medium-Range Weather Forecasts (ECMWF), and those made available by the Center for Weather Forecasting and Climate Studies (CPTEC/INPE), namely SAMeT and MERGE.
Within this module, we developed the EODCtHRS R Climate Processing Package (rclimpr) to generate climate indicators. The rclimpr package uses scripts to extract indicators such as temperature and precipitation from netCDF files through spatial and temporal aggregations (by epidemiological week and by month). It outputs raster files in COG format and vector files in formats such as GeoJSON and Shapefile, providing data formats suitable for analysis and visualization.
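The temporal aggregation by epidemiological week can be illustrated as follows. The rclimpr package itself is written in R, so this Python sketch only shows the idea and is not its code. It assumes the Sunday-to-Saturday convention in which week 1 is the first week with at least four days in January, and it works on an in-memory list of daily values instead of netCDF files. All names are hypothetical and the sample values are invented.

```python
from collections import defaultdict
from datetime import date, timedelta


def first_epiweek_start(year):
    """Sunday that opens epidemiological week 1 of the given year.

    Under the assumed convention, week 1 is the week containing January 4.
    """
    jan4 = date(year, 1, 4)
    days_since_sunday = (jan4.weekday() + 1) % 7  # Monday=0 ... Sunday=6
    return jan4 - timedelta(days=days_since_sunday)


def epiweek(day):
    """Return (epidemiological year, week number) for a calendar date."""
    if day >= first_epiweek_start(day.year + 1):
        return day.year + 1, 1  # late December already in next year's week 1
    year = day.year
    start = first_epiweek_start(year)
    if day < start:  # early January still in the previous year's last week
        year -= 1
        start = first_epiweek_start(year)
    return year, (day - start).days // 7 + 1


def aggregate_by_epiweek(daily):
    """Mean temperature and total precipitation per epidemiological week.

    daily is an iterable of (date, temperature_c, precipitation_mm) tuples.
    """
    temps = defaultdict(list)
    rain = defaultdict(float)
    for day, temp_c, precip_mm in daily:
        key = epiweek(day)
        temps[key].append(temp_c)
        rain[key] += precip_mm
    return {
        key: {
            "mean_temp_c": sum(temps[key]) / len(temps[key]),
            "total_precip_mm": rain[key],
        }
        for key in sorted(temps)
    }


# Invented daily values for one location.
daily = [
    (date(2024, 1, 1), 26.0, 5.0),
    (date(2024, 1, 6), 28.0, 0.0),
    (date(2024, 1, 7), 30.0, 12.5),
]
weekly = aggregate_by_epiweek(daily)
print(weekly)
```

Aggregating climate indicators on the same weekly calendar as the health indicators is what allows the two to be compared week by week.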
Module 4 - Data Science Environment: The Geospatial Data Science Environment (BDC-Lab) aims to provide a set of geospatial data analysis tools integrated with BDC data. It avoids the need to download large amounts of Earth Observation data and allows researchers to carry out in-depth analyses using tools such as RStudio, QGIS, Metview, VSCode, and Jupyter Notebooks, with several R and Python geospatial libraries pre-installed. It is currently in an experimental phase, in which some users are testing its functionalities and providing feedback for its improvement.
This talk proposal presents an overview of a software environment developed to harmonize Earth observation, environmental, climate, and health data, with the aim of providing ways to visualize, analyze, and monitor the spread of diseases in climate change hotspots in the LAC region and to issue alerts about it. The development of the HARMONIZE Instance has demonstrated the utility of geoservices and related technologies, with standard infrastructure and protocols, as an effective way to harmonize different data formats from diverse data sources in the health context.
The HARMONIZE project is financed by the Wellcome Trust (https://wellcome.org), grant number 224694/Z/21/Z, through the Foundation for Scientific and Technological Development in Health (FIOTEC), project ID ICICT-002-FEX-22. It is coordinated by Prof. Rachel Lowe, leader of the Global Health Resilience Team in the Earth Sciences Department of the Barcelona Supercomputing Center (BSC).