
Computer Vision Foundation Models in Endoscopy: Proof of Concept in Oropharyngeal Cancer

Piazza, Cesare;
2024-01-01

Abstract

Objectives: To evaluate the performance of vision transformer-derived image embeddings for distinguishing between normal and neoplastic tissues in the oropharynx and to investigate the potential of computer vision (CV) foundation models in medical imaging.

Methods: Computational study using endoscopic frames with a focus on the application of a self-supervised vision transformer model (DINOv2) for tissue classification. High-definition endoscopic images were used to extract image patches that were then normalized and processed using the DINOv2 model to obtain embeddings. These embeddings served as input for a standard support vector machine (SVM) to classify the tissues as neoplastic or normal. The model's discriminative performance was validated using an 80-20 train-validation split.

Results: From 38 endoscopic NBI videos, 327 image patches were analyzed. The classification results in the validation cohort demonstrated high accuracy (92%) and precision (89%), with a perfect recall (100%) and an F1-score of 94%. The receiver operating characteristic (ROC) curve yielded an area under the curve (AUC) of 0.96.

Conclusion: The use of large vision model-derived embeddings effectively differentiated between neoplastic and normal oropharyngeal tissues. This study supports the feasibility of employing CV foundation models like DINOv2 in the endoscopic evaluation of mucosal lesions, potentially augmenting diagnostic precision in Otorhinolaryngology.

Level of Evidence: 4. Laryngoscope, 2024.

A computational study using endoscopic frames with a focus on the application of a trained self-supervised vision transformer model for tissue classification. This study supports the feasibility of employing vision foundation models in the endoscopic evaluation of mucosal lesions.
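As a quick consistency check on the reported validation metrics, the F1-score can be recomputed from the precision and recall stated in the abstract, since F1 is their harmonic mean. The sketch below is illustrative only: the function and variable names are hypothetical, and it uses only the rounded percentages given above, not the study's underlying confusion matrix (which the abstract does not report).

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1-score)."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported in the abstract for the validation cohort.
reported_precision = 0.89
reported_recall = 1.00

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")  # prints "F1 = 0.94"
```

The result, 2 × 0.89 × 1.00 / 1.89 ≈ 0.94, agrees with the reported F1-score of 94%.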
Files in this record:
The Laryngoscope - 2024 - Paderno - Computer Vision Foundation Models in Endoscopy Proof of Concept in Oropharyngeal.pdf
Access: authorized users only
Type: Full Text
License: DRM not defined
Size: 9.42 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/605845