11-04, 15:30–16:00 (America/New_York), Lake Anne
DustCast is a heterogeneous ensemble model that forecasts monthly dust concentrations across the Arabian Peninsula from open-source ERA5, MERRA-2, and Indian Ocean Dipole (IOD) data. Combining MLR, KNN, DT, and RF with RMSE-based weights, it reaches an R² of 0.887 for surface dust and 0.984 for the atmospheric column, and it captures seasonal dust patterns.
This study presents DustCast, an ensemble machine learning (ML) model developed to forecast monthly atmospheric dust concentrations across the Arabian Peninsula (AP). Motivated by the increasing frequency and intensity of dust storms in the region and their adverse impacts on health, agriculture, and the environment, the model integrates multiple free and open-source meteorological and aerosol datasets, including ERA5 reanalysis, MERRA-2 aerosol diagnostics, and the Indian Ocean Dipole (IOD) index. The methodology employs a heterogeneous parallel ensemble framework that combines four regression techniques: multiple linear regression (MLR), K-nearest neighbors (KNN), decision tree (DT), and random forest (RF). Each base model is weighted according to its performance as measured by root mean squared error (RMSE). Spatial aggregation on the H3 hexagonal grid system provides efficient and precise data binning and analysis. For surface dust concentrations, MLR and RF are the strongest individual models, and the aggregated ensemble prediction achieves an RMSE of 0.00972 µg/m³ and an R² of 0.887, outperforming each base learner. For the atmospheric column, DustCast derives most of its predictive contribution from DT and RF, and the ensemble prediction achieves an RMSE of 0.00550 mg/m² and an R² of 0.984. The DustCast ensemble captures seasonal patterns of dust mobilization across the AP. It performs particularly well during the summer months (JJA), when the Shamal wind strongly influences sand and dust storms throughout the region. It also captures dust events associated with frontal systems and dynamic pressure gradients throughout the year.
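The RMSE-based weighting described above can be illustrated with a minimal sketch. The abstract does not state the exact weighting formula, so this example assumes normalized inverse-RMSE weights (lower error, larger weight). The function names `rmse`, `inverse_rmse_weights`, and `ensemble_predict` are hypothetical, and the numbers are made-up toy values, not DustCast data or results.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def inverse_rmse_weights(rmses):
    """Assumed scheme: weight each model by 1/RMSE, normalized to sum to 1."""
    inv = {name: 1.0 / max(r, 1e-12) for name, r in rmses.items()}  # guard against RMSE == 0
    total = sum(inv.values())
    return {name: v / total for name, v in inv.items()}

def ensemble_predict(predictions, weights):
    """Weighted average of the base-model predictions, sample by sample."""
    n = len(next(iter(predictions.values())))
    return [sum(weights[m] * predictions[m][i] for m in predictions) for i in range(n)]

# Toy validation targets and base-model predictions (illustrative values only).
y_val = [1.0, 2.0, 3.0, 4.0]
val_preds = {
    "MLR": [1.1, 2.1, 2.9, 4.1],
    "KNN": [1.5, 2.5, 3.5, 4.5],
    "DT":  [0.8, 2.2, 3.3, 3.7],
    "RF":  [1.0, 2.1, 3.0, 3.9],
}

rmses = {name: rmse(y_val, p) for name, p in val_preds.items()}
weights = inverse_rmse_weights(rmses)
ensemble = ensemble_predict(val_preds, weights)

for name in val_preds:
    print(f"{name}: RMSE={rmses[name]:.4f}, weight={weights[name]:.3f}")
```

In practice the weights would be fitted on a held-out validation split and then applied to the base models' forecasts on new data. Other schemes, such as inverse squared RMSE, are equally compatible with the description in the abstract.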