Incorporating Uncertainty in Machine Learning Models to Improve Early Detection of Flavescence Dorée: A Demonstration of Applicability

Nuzzi, Cristina (Methodology); Pasinetti, Simone (Formal Analysis)
2025-01-01

Abstract

Early detection of Flavescence dorée leaf symptoms remains an open problem for the research community. This work addresses it with a methodology that uses per-pixel data obtained from hyperspectral imaging to produce features suitable for machine learning training. Because asymptomatic samples resemble healthy samples, we propose “uncertainty-aware” models that account for the probability of the samples being similar and output an “unclassified” category when the uncertainty between multiple classes is too high. The original dataset of leaf hypercubes was collected in a Pinot Noir field in northern Italy during 2023 and 2024 and comprises 201 hypercubes divided equally into three classes (“healthy”, “asymptomatic”, “diseased”). Each of the 10 vegetation indices yields 4 feature predictors (the 25th, 50th, and 75th population percentiles and the population mean), for a total of 40 predictors per leaf. Because of the small number of samples, the uncertainty of the input data could not be estimated reliably. We therefore adopted a double Monte Carlo procedure. First, we generated 30,000 synthetic hypercubes and computed the per-class variance of each feature predictor. Second, we used this variance (serving as the uncertainty of the input data) to generate 60,000 new predictors starting from the data in the test dataset. The trained models were then tested on these new data, and their predictions were further examined with a Bayesian test for validation purposes. The proposed method notably improves recognition of “asymptomatic” samples with respect to the original models. The best model structure is the Decision Tree, achieving a prediction accuracy for “asymptomatic” samples of 75.7% against the original 49.3% for the Ensemble of Bagged Decision Trees (ML4) and of 44.6% against the original 13.2% for the Coarse Decision Tree (ML1).
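
The two ideas in the abstract that lend themselves to a short illustration are the 40-predictor feature vector and the Monte Carlo perturbation of test inputs. The Python sketch below is only a reading of the abstract, not the authors' implementation. The function names, the Gaussian perturbation model, the vote-share threshold `min_share`, and the `predict` callable are all assumptions introduced here; the abstract does not state how the “unclassified” decision or the Bayesian test is actually carried out.

```python
import numpy as np

UNCLASSIFIED = -1  # hypothetical label for the "unclassified" output


def leaf_predictors(index_maps):
    """Summarise per-pixel vegetation-index values for one leaf.

    index_maps: array of shape (n_indices, n_pixels).
    Returns 4 * n_indices values, ordered per index as
    [25th percentile, 50th percentile, 75th percentile, mean].
    """
    index_maps = np.asarray(index_maps, dtype=float)
    quartiles = np.percentile(index_maps, [25, 50, 75], axis=1)  # (3, n_indices)
    mean = index_maps.mean(axis=1)[None, :]                      # (1, n_indices)
    return np.concatenate([quartiles, mean], axis=0).T.ravel()


def uncertainty_aware_predict(predict, x, sigma, n_draws=1000,
                              n_classes=3, min_share=0.6, seed=None):
    """Perturb one predictor vector and vote over the model's answers.

    predict: hypothetical callable mapping an (n, d) array to n integer labels.
    x:       predictor vector of shape (d,).
    sigma:   per-predictor standard deviation (assumed Gaussian noise),
             e.g. the square root of a per-class variance estimate.
    Returns (label, vote_shares); label is UNCLASSIFIED when no class
    reaches min_share of the votes (an assumed decision rule).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    draws = rng.normal(loc=x, scale=sigma, size=(n_draws, x.size))
    labels = np.asarray(predict(draws), dtype=int)
    shares = np.bincount(labels, minlength=n_classes) / n_draws
    best = int(shares.argmax())
    return (best if shares[best] >= min_share else UNCLASSIFIED), shares
```

With 10 vegetation indices, `leaf_predictors` returns the 40 values per leaf described above. In `uncertainty_aware_predict`, a sample sitting near a class boundary splits its votes across classes and is returned as unclassified, while a sample far from any boundary keeps its original label.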
Files in this item:
sensors-25-07493-v2.pdf (open access)
Description: Article
Type: Post-print document
License: PUBLIC - Public with copyright
Size: 1.36 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/637186
Warning: the data displayed have not been validated by the university.