Comparing the performance of statistical, machine learning, and deep learning algorithms to predict time-to-event: A simulation study for conversion to mild cognitive impairment

Billichova, M, Coan, LJ, Czanner, S ORCID: 0000-0002-8471-6895, Kovacova, M, Sharifian, F and Czanner, G ORCID: 0000-0002-1157-2093 (2024) Comparing the performance of statistical, machine learning, and deep learning algorithms to predict time-to-event: A simulation study for conversion to mild cognitive impairment. PLoS One, 19. e0297190. ISSN 1932-6203

journal.pone.0297190.pdf - Published Version
Available under License Creative Commons Attribution.

Publisher URL: http://doi.org/10.1371/journal.pone.0297190

Abstract

Mild Cognitive Impairment (MCI) is a condition characterized by a decline in cognitive abilities, specifically in memory, language, and attention, beyond what is expected from normal aging. Detecting MCI is crucial for providing appropriate interventions and slowing the progression of dementia. Several automated algorithms exist for prediction from time-to-event data, but it is not clear which is best for predicting the time to conversion to MCI. It is also unclear whether algorithms with fewer training weights are less accurate. We compared three algorithms, ordered from smaller to larger numbers of training weights: a statistical predictive model (Cox proportional hazards model, CoxPH), a machine learning model (Random Survival Forest, RSF), and a deep learning model (DeepSurv). To compare the algorithms under different scenarios, we created a simulated dataset based on the Alzheimer NACC dataset. We found that the CoxPH model was among the best-performing models in all simulated scenarios. With a larger sample size (n = 6,000), the deep learning algorithm (DeepSurv) exhibited accuracy (73.1%) comparable to that of the CoxPH model (73%). In the past, ignoring heterogeneity in the CoxPH model led to the conclusion that deep learning methods are superior. We found that when the CoxPH model accounts for heterogeneity, its accuracy is comparable to that of DeepSurv and RSF. Furthermore, when unobserved heterogeneity is present, such as features missing from the training data, all three models showed a similar drop in accuracy. This simulation study suggests that in some applications an algorithm with a smaller number of training weights is not disadvantaged in terms of accuracy. Since algorithms with fewer weights are inherently easier to explain, this study can help artificial intelligence research develop a principled approach to comparing statistical, machine learning, and deep learning algorithms for time-to-event predictions.
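
As an illustration only (this is not the authors' code), the effect of a missing feature on a ranking-based accuracy measure can be sketched in Python. The sketch assumes an exponential proportional hazards model with two hypothetical features `x1` and `x2`, arbitrary coefficients and censoring rate, and Harrell's concordance index as the accuracy measure; none of these choices are taken from the paper.

```python
import math
import random


def simulate(n, seed=0):
    """Draw (time, event, x1, x2) rows from an exponential proportional hazards model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Hypothetical baseline hazard and coefficients, chosen for illustration.
        hazard = 0.1 * math.exp(0.8 * x1 + 0.8 * x2)
        event_time = rng.expovariate(hazard)
        censor_time = rng.expovariate(0.05)  # independent random censoring
        observed = event_time <= censor_time
        rows.append((min(event_time, censor_time), observed, x1, x2))
    return rows


def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the higher risk fails first."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # the earlier time in a pair must be an observed event
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties in risk count as half
    return concordant / comparable


rows = simulate(500)
times = [r[0] for r in rows]
events = [r[1] for r in rows]
full_risk = [0.8 * r[2] + 0.8 * r[3] for r in rows]  # both features observed
partial_risk = [0.8 * r[2] for r in rows]            # x2 missing from the model

c_full = concordance_index(times, events, full_risk)
c_partial = concordance_index(times, events, partial_risk)
print(f"C-index, all features: {c_full:.3f}")
print(f"C-index, x2 missing:   {c_partial:.3f}")
```

In this toy setting the risk score that omits `x2` ranks patients less accurately than the score that uses both features, which mirrors in miniature the kind of accuracy drop the abstract reports under unobserved heterogeneity. A fitted CoxPH, RSF, or DeepSurv model would supply its own risk scores in place of the known coefficients used here.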

Item Type:	Article
Uncontrolled Keywords:	Humans; Algorithms; Artificial Intelligence; Machine Learning; Cognitive Dysfunction; Deep Learning; General Science & Technology
Subjects:	B Philosophy. Psychology. Religion > BF Psychology; Q Science > QA Mathematics > QA75 Electronic computers. Computer science; R Medicine > RC Internal medicine > RC0321 Neuroscience. Biological psychiatry. Neuropsychiatry
Divisions:	Computer Science and Mathematics
Publisher:	Public Library of Science
Date of acceptance:	1 January 2024
Date of first compliant Open Access:	8 March 2024
Date Deposited:	08 Mar 2024 11:00
Last Modified:	04 Jul 2025 13:45
DOI or ID number:	10.1371/journal.pone.0297190
Editors:	Chen, ACH
URI:	https://researchonline.ljmu.ac.uk/id/eprint/22752