Bimodal ECG and PCG Cardiovascular Disease Detection: Exploring the Potential and Modality Contribution
Calzoni A.; Savardi M.; Silvestri M.; Benini S.; Signoroni A.
2025-01-01
Abstract
Early detection of cardiovascular diseases (CVDs) is crucial for improving patient outcomes and alleviating healthcare burdens. Electrocardiograms (ECGs) and phonocardiograms (PCGs) offer low-cost, non-invasive, and easily integrable solutions for preventive care settings. In this work, we propose a novel bimodal deep learning model that combines ECG and PCG signals to enhance the early detection of CVDs. To address the challenge of limited bimodal data, we fine-tuned a Convolutional Neural Network (CNN) pre-trained on large-scale audio recordings, leveraging all publicly available unimodal PCG datasets. This PCG branch was then integrated with a 1D-CNN ECG branch via late fusion. Evaluated on an augmented version of MITHSDB, currently the only publicly available bimodal dataset, our approach achieved an AUROC of 96.4%, significantly outperforming ECG-only and PCG-only models by approximately 3 and 11 percentage points, respectively. To interpret the model's decisions, we applied three explainability techniques, quantifying the relative contributions of the electrical and acoustic features. Furthermore, by projecting the learned embeddings into two dimensions using UMAP, we revealed clear separation between normal and pathological samples. Our results conclusively demonstrate that combining ECG and PCG modalities yields substantial performance gains, with explainability and visualization providing critical insights into model behavior. These findings underscore the importance of multimodal approaches for CVD diagnosis and prevention, and strongly motivate the collection of larger, more diverse bimodal datasets for future research.

| File | Size | Format | |
|---|---|---|---|
| s10916-025-02245-5.pdf (open access; Type: Full Text; License: DRM not defined) | 1.39 MB | Adobe PDF | View/Open |
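
The abstract describes combining an ECG branch and a PCG branch via late fusion and comparing models by AUROC, but it does not say how the fusion is implemented. As a minimal sketch only, the Python below assumes the simplest decision-level variant, a weighted average of per-sample probabilities from the two branches, and computes AUROC by pairwise comparison of positive and negative scores. The function names `late_fusion` and `auroc`, the weight `w_ecg`, and all numeric values are hypothetical toy illustrations, not the authors' method or data.

```python
from typing import List, Sequence


def late_fusion(p_ecg: Sequence[float], p_pcg: Sequence[float],
                w_ecg: float = 0.5) -> List[float]:
    """Decision-level fusion: weighted average of two branches' probabilities.

    Assumption: each branch outputs one pathology probability per sample.
    """
    if len(p_ecg) != len(p_pcg):
        raise ValueError("both branches must score the same samples")
    return [w_ecg * a + (1.0 - w_ecg) * b for a, b in zip(p_ecg, p_pcg)]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy scores (hypothetical, not from the paper): 1 = pathological, 0 = normal.
labels = [1, 1, 1, 0, 0, 0]
p_ecg = [0.9, 0.4, 0.7, 0.5, 0.2, 0.3]
p_pcg = [0.6, 0.8, 0.3, 0.2, 0.4, 0.1]
fused = late_fusion(p_ecg, p_pcg)

for name, scores in [("ECG", p_ecg), ("PCG", p_pcg), ("fused", fused)]:
    print(f"{name}: AUROC = {auroc(labels, scores):.3f}")
```

In this toy case each branch misranks a different sample, so averaging corrects both errors. Whether a real bimodal model gains in the same way depends on how complementary the two modalities' errors are.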
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


