Impact of ECG data format on the performance of machine learning models for the prediction of myocardial infarction

Bellfield, RAA, Ortega-Martorell, S, Lip, GYH, Oxborough, D and Olier, I (2024) Impact of ECG data format on the performance of machine learning models for the prediction of myocardial infarction. Journal of Electrocardiology, 84. pp. 17-26. ISSN 0022-0736

Impact of ECG data format on the performance of machine learning models for the prediction of myocardial infarction.pdf - Published Version
Available under License Creative Commons Attribution.

Abstract

Background: We aim to determine which electrocardiogram (ECG) data format is optimal for machine learning (ML) modelling in the context of myocardial infarction prediction. We also address the auxiliary objective of evaluating the viability of using digitised ECG signals for ML modelling.

Methods: Two ECG arrangements were used, displaying 10 s and 2.5 s of data for each lead. For each arrangement, conservative and speculative data cohorts were generated from the PTB-XL dataset, with 8358 and 11,621 ECGs in the conservative and speculative cohorts, respectively. All ECGs were represented in three different data formats: Signal ECGs, Image ECGs, and Extracted Signal ECGs. ML models were trained using the three data formats in both data cohorts.

Results: For ECGs that contained 10 s of data, Signal and Extracted Signal ECGs were optimal and statistically similar, with AUCs [95% CI] of 0.971 [0.961, 0.981] and 0.974 [0.965, 0.984], respectively, for the conservative cohort, and 0.931 [0.918, 0.945] and 0.919 [0.903, 0.934], respectively, for the speculative cohort. For ECGs that contained 2.5 s of data, the Image ECG format was optimal, with AUCs of 0.960 [0.948, 0.973] and 0.903 [0.886, 0.920] for the conservative and speculative cohorts, respectively.

Conclusion: When available, Signal ECG data should be preferred for ML modelling. If it is not available, the optimal format depends on the data arrangement within the ECG: if the Image ECG contains 10 s of data for each lead, the Extracted Signal ECG is optimal; if it contains only 2.5 s, the Image ECG is optimal for ML performance.
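The abstract reports each AUC with a 95% confidence interval but does not state how the intervals were obtained. As a minimal sketch, assuming a percentile bootstrap over test cases (an assumption for illustration, not necessarily the authors' method), an AUC and its interval could be computed along these lines. The function names `auc` and `bootstrap_auc_ci` and the toy data are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus an approximate percentile-bootstrap interval.

    Assumption: cases are resampled with replacement; resamples that
    contain only one class are discarded and redrawn.
    """
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined without both classes
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(labels, scores), lo, hi


if __name__ == "__main__":
    # Toy data only; these are not values from the study.
    toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.55, 0.6]
    point, lo, hi = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500)
    print(f"AUC {point:.3f} [{lo:.3f}, {hi:.3f}]")
```

In practice, `labels` would hold the test-set outcome (1 for myocardial infarction, 0 otherwise) and `scores` the model's predicted probabilities. An interval from one data format can then be compared against another, as in the reported Signal versus Extracted Signal results.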

Item Type: Article
Uncontrolled Keywords: 1102 Cardiorespiratory Medicine and Haematology; Cardiovascular System & Hematology
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; R Medicine > R Medicine (General)
Divisions: Computer Science & Mathematics; Nursing & Allied Health; Sport & Exercise Sciences
Publisher: Elsevier
SWORD Depositor: A Symplectic
Date Deposited: 11 Mar 2024 11:37
Last Modified: 26 Mar 2024 14:00
DOI or ID number: 10.1016/j.jelectrocard.2024.03.005
URI: https://researchonline.ljmu.ac.uk/id/eprint/22764