Geodaysit 2023

Ramin Heidarian Dehkordi


Investigating PLSR and RF for retrieving wheat crop traits in a field phenotyping experiment using full-range hyperspectral data: performance assessment and modelling interpretation
Ramin Heidarian Dehkordi, Mirco Boschetti, Gabriele Candiani, Federico Carotenuto, Carla Cesaraccio, Andrea Genangeli, Beniamino Gioli, Donato Cillis, Marina Ranghetti

Crop trait monitoring is a fundamental step for controlling crop productivity in the context of precision agriculture and field phenotyping. The use of hyperspectral data in machine learning regression algorithms (MLRAs) has attracted increasing attention as a way to alleviate the challenges associated with traditional crop trait measurements. However, how well such hyperspectral-based MLRA models retrieve crop traits in the presence of the well-known natural variations in structural and biochemical crop properties remains largely unexplored. This experiment was therefore set up to assess whether full-range hyperspectral data, acquired by a handheld spectrometer (Spectral Evolution; 350-2500 nm) and used as inputs to partial least squares regression (PLSR) and random forest (RF) models, can model different wheat crop traits at the canopy level. The examined crop traits were leaf area index (LAI), canopy water content (CWC), canopy chlorophyll content (CCC), and canopy nitrogen content (CNC). As an overarching objective, this approach allowed us to compare the performance of the two MLRA models while also focusing on the physical interpretation of the modelling results for each crop trait.
Overall, PLSR provided markedly higher accuracy than RF for all the crop traits, as tested with a cross-validation strategy. More precisely, PLSR achieved R² (resp. nRMSE%) values of 0.72 (11.97), 0.77 (10.89), 0.70 (14.61), and 0.74 (14.38) for LAI, CWC, CCC, and CNC, respectively. All PLSR models showed robust prediction capability, with RPD values greater than 1.4; among them, CWC had excellent prediction performance with an RPD higher than 2. RF, by contrast, yielded less predictive models, with R² (resp. nRMSE%) values of 0.59 (14.59), 0.42 (17.42), 0.50 (18.86), and 0.42 (21.41) for LAI, CWC, CCC, and CNC, respectively. RF models for LAI and CCC showed good prediction capabilities (RPD > 1.4), whilst the RF models for CWC and CNC were not reliable (RPD < 1.4).
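The three accuracy measures quoted above can be computed from cross-validated predictions along the following lines. This is a minimal sketch, not the authors' code: the function name `regression_metrics` is hypothetical, R² is assumed to be the coefficient of determination, nRMSE is assumed to be normalized by the range of the observations (the abstract does not state the convention), and RPD is taken as the sample standard deviation of the observations divided by the RMSE.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE, nRMSE (%) and RPD for paired observations and predictions.

    Assumptions (not stated in the abstract): nRMSE is normalized by the
    observed range, and RPD uses the sample standard deviation.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "nrmse_pct": 100.0 * rmse / (max(observed) - min(observed)),
        "rpd": statistics.stdev(observed) / rmse,
    }


# Hypothetical values, for illustration only (not data from the study).
example = regression_metrics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.0])
```

Under these definitions, RPD and R² are linked by RPD = sqrt(n / (n - 1)) / sqrt(1 - R²), so for a large sample an R² of 0.77 corresponds to an RPD of roughly 2.1, while an R² of 0.42 corresponds to roughly 1.3.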
In general, RF band importance and PLSR regression coefficient results revealed physically meaningful and consistent patterns for each specific crop trait. Specific wavelengths in the SWIR (1716-1745 nm) and NIR (1057-1120 nm) regions, along with the Green and Red-Edge bands, showed the highest importance for LAI retrieval. Water absorption regions around 910 nm and 1200 nm, as well as the Red-Edge and Visible parts of the spectrum, were of higher importance for the retrieval of CWC. For CCC retrieval, the best-performing bands were situated in the Red-Edge and Green spectral channels. For CNC retrieval, SWIR spectral regions between 1600-1800 nm and 2100-2300 nm appeared to be important (in particular with respect to the other traits), alongside the Red-Edge part of the spectrum.
We demonstrated that full-range hyperspectral data in combination with MLRAs can provide accurate estimates of wheat crop traits at the canopy level. The value of using hyperspectral data in MLRAs was further highlighted by the physically meaningful modelling results, which were in accordance with the subtle structural and biochemical crop properties. Our results suggest that such spectroscopic, hyperspectral-based MLRA approaches could be a powerful tool to accurately monitor crop status throughout the cropping season, to improve high-throughput phenotyping activities, and to further aid precision agricultural practices.

AIT Contribution
Sala Videoconferenza @ PoliBa
Spectroscopic Determination of Crop Residue Cover using Exponential-Gaussian Optimization of absorption features and Random Forest
Ramin Heidarian Dehkordi, Monica Pepe, Katayoun Fakherifard

Non-photosynthetic vegetation (NPV) is a key variable in remote sensing of conservation agriculture and, more recently, of carbon farming, owing to its important role in water, nutrient and carbon cycling. For this reason, both the mapping and the characterization of NPV are relevant topics in the exploitation of Earth Observation (EO) data for agriculture monitoring.
Studies on NPV mapping with EO data benefit from the availability of hyperspectral data because of its high spectral resolution, particularly at wavelengths from 1.6 to 2.3 µm, where the spectral features of carbon-based constituents of plants are distinctive. The launch of new-generation hyperspectral satellites, such as PRISMA (PRecursore IperSpettrale della Missione Applicativa) and, more recently, EnMAP (Environmental Mapping and Analysis Program), offers research opportunities in a field that was previously investigated mainly by proximal and aerial sensing.
Early studies already proved the potential of PRISMA for NPV mapping, owing to the prominence of the key cellulose-lignin absorption feature at 2.1 µm. More recent studies on PRISMA make use of machine learning regression algorithms (MLRAs) trained either on radiative transfer model simulations or on Exponential Gaussian Optimization (EGO) of specific absorption features in sensed data.
The second approach is the one proposed in this study. It aims to determine Crop Residue Cover (CRC) from PRISMA hyperspectral imagery in two steps: i) an Exponential Gaussian Optimization to model pre-selected absorption features, which also reduces the spectral dimension; ii) a Random Forest performing non-linear regression to predict and map CRC.
For the training phase, this study exploits an extensive and well-documented spectral library, namely “Reflectance spectra of agricultural field conditions supporting remote sensing evaluation of non-photosynthetic vegetation cover”, made available online by USGS. It consists of 916 in situ surface reflectance spectra collected using a proximal full-range spectroradiometer (350 to 2500 nm). Spectra are annotated with the corresponding fractions of NPV, Soil and (if any) Green Vegetation, as estimated by point sampling of digital photographs of the radiometer field-of-view.
This spectral library was resampled to PRISMA spectral resolution prior to the Exponential Gaussian Optimization (EGO) on 4 spectral intervals of interest, already tested in previous studies and corresponding to the absorption bands of cellulose-lignin, plant pigments, vegetation water content and clays.
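The resampling method is not described in the abstract. One common way to bring fine-resolution field spectra to a satellite sensor's bands is to weight them with a Gaussian spectral response per band, which can be sketched as follows. The function name `resample_gaussian`, the band centers and the 12 nm full width at half maximum are illustrative assumptions, not the actual PRISMA band table.

```python
import math


def resample_gaussian(wavelengths, reflectance, band_centers, fwhm):
    """Resample a spectrum to sensor bands using Gaussian response weights.

    Assumes every band shares one FWHM (a simplification) and that each band
    center lies within the sampled wavelength range.
    """
    # Convert full width at half maximum to the Gaussian standard deviation.
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    resampled = []
    for center in band_centers:
        weights = [math.exp(-0.5 * ((w - center) / sigma) ** 2) for w in wavelengths]
        total = sum(weights)
        # Weighted mean of the reflectance under the band's response curve.
        resampled.append(sum(wt * r for wt, r in zip(weights, reflectance)) / total)
    return resampled


# Hypothetical 1 nm field spectrum (350 to 2500 nm) with a linear trend.
field_wavelengths = list(range(350, 2501))
field_reflectance = [0.1 + 0.0001 * (w - 350) for w in field_wavelengths]

# Illustrative band centers only; real centers come from the sensor metadata.
sensor_reflectance = resample_gaussian(
    field_wavelengths, field_reflectance, [1000.0, 1650.0, 2100.0], fwhm=12.0
)
```

In practice the per-band centers and widths delivered with each image would replace the single `fwhm` value used here.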
The EGO algorithm fits each continuum-removed absorption feature with 4 parameters (absorption band depth, center, width and asymmetry); since this is done for each of the 4 spectral intervals, it yields 16 parameters. This is a much smaller space than that of the input spectra (around 230 bands). This parameter space was used to train a Random Forest to model the regression between Crop Residue Cover percentage and the EGO parameters, achieving a coefficient of determination of around 0.8 (RPD ≈ 2.1; MSE ≈ 0.02) on the test set.
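The continuum removal that precedes the EGO fit can be sketched as follows. This is a minimal illustration, assuming a straight-line continuum between the two endpoints of the spectral interval (a convex-hull continuum is also common). The function names and the sample spectrum are hypothetical, and the EGO curve fitting itself is not reproduced here: only band depth and center are read directly from the continuum-removed values, whereas width and asymmetry would come from the fitted curve.

```python
def continuum_removed(wavelengths, reflectance):
    """Divide reflectance by a straight line joining the interval endpoints."""
    x0, x1 = wavelengths[0], wavelengths[-1]
    r0, r1 = reflectance[0], reflectance[-1]
    slope = (r1 - r0) / (x1 - x0)
    return [r / (r0 + slope * (x - x0)) for x, r in zip(wavelengths, reflectance)]


def band_depth_and_center(wavelengths, reflectance):
    """Return (depth, center) of the deepest point of the continuum-removed feature."""
    cr = continuum_removed(wavelengths, reflectance)
    i_min = min(range(len(cr)), key=cr.__getitem__)
    return 1.0 - cr[i_min], wavelengths[i_min]


# Hypothetical spectrum around the 2.1 µm cellulose-lignin feature (nm, reflectance).
wl = [2000.0, 2050.0, 2100.0, 2150.0, 2200.0]
refl = [0.40, 0.36, 0.30, 0.37, 0.44]
depth, center = band_depth_and_center(wl, refl)
```

For this made-up spectrum the feature has a depth of about 0.286 centered at 2100 nm. Repeating the full 4-parameter fit over the 4 intervals gives the 16-element feature vector passed to the Random Forest.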
The RF model was first validated against an independent spectral library of around 100 spectra, collected during a proximal sensing survey with a portable full-range spectroradiometer at a large farm test site (3800 ha) in Jolanda di Savoia (Italy). These spectra are likewise annotated with Crop Residue Cover percentages and were resampled to PRISMA spectral resolution. The model performance on this dataset agrees with that obtained on the USGS spectral library test set.
Finally, the regression model was applied to a PRISMA image acquired over the Jolanda di Savoia farm on June 21st 2021 for CRC mapping. The resulting map was validated against field observations: the CRC map shows values and patterns in good agreement with ground data, confirming the encouraging prediction capabilities of the model.
In conclusion, the proposed regression approach, trained on a spectral library, is predictive, as shown on an independent spectral data set and on the PRISMA image. Further work will encompass testing the robustness of the model by collecting field ground data of Crop Residue Cover at the PRISMA scale; monitoring CRC dynamics on PRISMA time series; and using Radiative Transfer Model simulations to enlarge the training set, accounting also for different factors controlling reflectance (e.g. soil moisture).

AIT Contribution
Sala Videoconferenza @ PoliBa