The 12-lead electrocardiogram (ECG) remains a cornerstone of cardiovascular diagnostics, with over 300 million performed annually worldwide. Existing AI solutions for automated interpretation often lack generalizability, remain closed source, and depend on supervised learning requiring extensive labelled datasets.
We developed and validated two open-source foundation models for ECG interpretation: DeepECG-SL, trained using traditional supervised learning, and DeepECG-SSL, a self-supervised model leveraging contrastive learning and masked-lead modeling. Both models predict 77 cardiac conditions derived from American Heart Association recommendations.
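Masked-lead modeling can be sketched minimally: hide a few of the 12 leads and score the network on reconstructing them from the remaining leads. The helper names and masking scheme below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def mask_leads(ecg, n_masked=3, rng=None):
    """Zero out n_masked randomly chosen leads of a (12, T) ECG array.

    Returns the masked copy and a boolean mask of the hidden leads.
    Illustrative helper; the paper's masking scheme may differ.
    """
    if rng is None:
        rng = np.random.default_rng()
    mask = np.zeros(ecg.shape[0], dtype=bool)
    mask[rng.choice(ecg.shape[0], size=n_masked, replace=False)] = True
    masked = ecg.copy()
    masked[mask] = 0.0
    return masked, mask

def reconstruction_loss(pred, target, mask):
    """MSE computed on the masked leads only -- the self-supervised objective."""
    return float(np.mean((pred[mask] - target[mask]) ** 2))

# Toy 12-lead record: the model sees `masked` and is scored on the hidden leads.
ecg = np.random.default_rng(0).standard_normal((12, 5000))
masked, mask = mask_leads(ecg, n_masked=3, rng=np.random.default_rng(1))
```

Because the reconstruction target comes from the signal itself, no diagnostic labels are needed during pre-training, which is what lets DeepECG-SSL exploit the larger 1.9 million-ECG corpus.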
Training Data: Over 1 million ECGs from the Montreal Heart Institute (MHI-ds), with DeepECG-SSL additionally pre-trained on 1.9 million ECGs combining MHI-ds, Code-15, and MIMIC-IV datasets.
External Validation: Both models were validated across 11 geographically diverse cohorts totaling 881,403 ECGs.
Multilingual Capability: We developed a BERT-based classifier trained on 640,518 paragraph-label pairs to enable automated diagnostic extraction from both English and French ECG reports.
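The paragraph-to-label extraction task can be illustrated with a toy multi-label text classifier. Here a bag-of-words model stands in for the fine-tuned BERT encoder, and the example reports and label names are invented; the real system maps 640,518 paragraph-label pairs onto the 77 AHA-derived diagnoses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Invented English and French report snippets with invented label names.
reports = [
    "Sinus rhythm. No acute abnormality.",
    "Atrial fibrillation with rapid ventricular response.",
    "Rythme sinusal. Bloc de branche droit.",
    "Fibrillation auriculaire.",
]
labels = [["sinus_rhythm"], ["afib"], ["sinus_rhythm", "rbbb"], ["afib"]]

# Binarize the label sets, then train one classifier per label.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)
clf.fit(reports, y)
pred = clf.predict(["Atrial fibrillation noted."])
```

A BERT encoder replaces the TF-IDF features in the actual pipeline, which is what gives the classifier its cross-lingual robustness.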
Both models achieved consistently high performance across internal and external datasets:
Performance remained robust across diagnostic categories: rhythm disorders (AUROC >0.92), conduction abnormalities (>0.96), and chamber enlargement (>0.92).
We evaluated both models on emerging biomarker applications beyond traditional ECG interpretation:
Left Ventricular Ejection Fraction (LVEF)
5-Year Atrial Fibrillation Risk Prediction (iAF5)
Long QT Syndrome (LQTS)
The SSL advantage widened as the fine-tuning dataset shrank, demonstrating particular value for rare diseases and data-limited clinical applications.
We developed an automated three-step preprocessing pipeline enabling deployment across heterogeneous ECG acquisition systems.
This pipeline improved cross-dataset AUROC by up to 0.251, addressing a fundamental challenge in ECG-AI deployment.
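A harmonization pipeline of this kind typically resamples, filters, and normalizes each record. The three steps and parameters below are assumptions for illustration; the released pipeline in the DeepECG_Docker repository is authoritative.

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample_poly

def preprocess(ecg, fs_in, fs_out=250):
    """Harmonize a (leads, samples) ECG recorded at fs_in Hz.

    Hypothetical three-step sketch, not the published pipeline.
    """
    # 1) Resample every acquisition system to a common rate.
    x = resample_poly(ecg, fs_out, fs_in, axis=1)
    # 2) Band-pass filter to remove baseline wander and high-frequency noise.
    b, a = butter(3, [0.5, 40.0], btype="bandpass", fs=fs_out)
    x = filtfilt(b, a, x, axis=1)
    # 3) Per-lead z-score normalization.
    mu = x.mean(axis=1, keepdims=True)
    sd = x.std(axis=1, keepdims=True) + 1e-8
    return (x - mu) / sd

# A 12-lead record at 500 Hz becomes a 250 Hz, zero-mean, unit-variance array.
out = preprocess(np.random.default_rng(3).standard_normal((12, 5000)), fs_in=500)
```

Whatever the exact steps, the point of the reported +0.251 AUROC gain is that mismatched sampling rates, filters, and amplitude scales between vendors are a dominant failure mode for cross-site ECG-AI.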
Using the Equalized Odds framework, both models demonstrated strong fairness across demographic groups.
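Equalized odds requires that true-positive and false-positive rates match across demographic groups. A minimal sketch of the corresponding gap metric (illustrative, not the paper's code):

```python
import numpy as np

def equalized_odds_gap(y_true, y_pred, group):
    """Largest between-group difference in TPR or FPR for a binary label.

    A value near 0 means the predictions satisfy equalized odds.
    Illustrative helper, not the paper's implementation.
    """
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs, fprs = [], []
    for g in np.unique(group):
        m = group == g
        pos, neg = m & (y_true == 1), m & (y_true == 0)
        tprs.append(y_pred[pos].mean() if pos.any() else np.nan)
        fprs.append(y_pred[neg].mean() if neg.any() else np.nan)
    return max(np.nanmax(tprs) - np.nanmin(tprs),
               np.nanmax(fprs) - np.nanmin(fprs))
```

For example, a model whose sensitivity is 0.95 in one sex and 0.80 in the other has a gap of 0.15 even if overall AUROC is identical, which is why per-group error rates rather than aggregate metrics are audited.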
DeepECG-SL is 60 times smaller than DeepECG-SSL and 29 times faster at inference, cutting CO₂ emissions by up to a factor of 9.7 on equivalent tasks.
Direct comparison across shared diagnostic classes demonstrated consistent superiority of DeepECG models over ECGFounder and ECG-FM on external datasets. DeepECG-SSL achieved net reclassification improvements ranging from +0.113 to +1.20 across overlapping labels.
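Note that category-free (continuous) NRI is bounded by [-2, 2] rather than [0, 1], which is why values above 1, such as +1.20, are possible. A minimal sketch, assuming the standard category-free definition (the paper's exact computation may differ):

```python
import numpy as np

def continuous_nri(y, p_old, p_new):
    """Category-free net reclassification improvement.

    NRI = (up - down among events) + (down - up among non-events),
    where up/down mean the new model's risk estimate moved in that
    direction; the sum ranges over [-2, 2].
    """
    y, p_old, p_new = map(np.asarray, (y, p_old, p_new))
    up, down = p_new > p_old, p_new < p_old
    ev, ne = y == 1, y == 0
    nri_events = up[ev].mean() - down[ev].mean()
    nri_nonevents = down[ne].mean() - up[ne].mean()
    return nri_events + nri_nonevents
```

A model that raises every event's risk estimate and lowers every non-event's achieves the maximum NRI of 2.0, so +1.20 on an overlapping label indicates a large, consistent shift in the right direction.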
Self-supervised learning enables development of generalizable, high-performance, and fair ECG models. DeepECG-SSL excels at adapting to novel tasks when annotated data are limited, while DeepECG-SL offers a lightweight alternative suitable for resource-constrained environments. Both models maintain robust fairness across demographic groups.
By releasing model weights, preprocessing tools, and validation code, we aim to support robust, data-efficient AI diagnostics across diverse clinical environments.
Code: https://github.com/HeartWise-AI/DeepECG_Docker/tree/main