Explainable Machine Learning Model for Alzheimer Detection Using Genetic Data: A Genome-Wide Association Study Approach

Khater, T; Ansari, S; Saad Alatrany, A; Alaskar, H; Mahmoud, S; Turky, A; Tawfik, H; Almajali, E; Hussain, A

Khater, T, Ansari, S, Saad Alatrany, A, Alaskar, H, Mahmoud, S, Turky, A, Tawfik, H, Almajali, E and Hussain, A ORCID: 0000-0001-8413-0045 (2024) Explainable Machine Learning Model for Alzheimer Detection Using Genetic Data: A Genome-Wide Association Study Approach. IEEE Access, 12. pp. 95091-95105.

Explainable Machine Learning Model for Alzheimer Detection Using Genetic Data A Genome-Wide Association Study Approach.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.

Publisher URL: http://doi.org/10.1109/access.2024.3410135

Abstract

Recent research has shown that machine learning systems applied to genetic data can reliably detect Alzheimer's disease. The interpretability of these models, however, has been a challenge, as they frequently provide little insight into the features that contribute to their predictions. Explainable machine learning has been proposed as a solution to this problem, since it enables the identification of significant attributes and makes the prediction process more transparent. In this study, Genome-Wide Association Studies were used to identify genetic variants associated with Alzheimer's disease, using the Alzheimer's Disease Neuroimaging Initiative dataset and quality control methods to ensure the validity and reliability of the findings. The results indicate strong associations between certain genetic variants and Alzheimer's disease, highlighting the potential of Genome-Wide Association Studies as a valuable tool for identifying and predicting this disease. After the genetic data were analyzed, machine learning algorithms were used to train models to detect Alzheimer's disease. The Support Vector Machine was the best-performing model, achieving 89% accuracy. Explainable machine learning has the potential to increase the accuracy and interpretability of Alzheimer's disease detection models, giving significant insights to both academics and physicians. The explanation of the Support Vector Machine model reveals that rs4821510 is the most important SNP for detecting Alzheimer's disease. In addition, the SHAP method shows that rs429358 is an indicator of Alzheimer's disease, whereas rs4821510 is present in healthy individuals. These findings suggest that explainable machine learning can play an important role in accurately detecting Alzheimer's disease and identifying critical genetic markers associated with it.
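
As an illustration of the association step mentioned in the abstract, the following is a minimal sketch of a per-SNP allelic chi-square test between cases and controls. The abstract does not state which association test, genotype coding, or quality control thresholds the study used, so the test choice, the 0/1/2 minor-allele coding, the function names, and the toy genotypes below are all assumptions made for illustration only.

```python
import math


def allele_counts(genotypes):
    """Return (minor, major) allele counts for genotypes coded 0, 1, 2.

    The coding (number of minor-allele copies per person) is an assumption.
    """
    minor = sum(genotypes)
    major = 2 * len(genotypes) - minor
    return minor, major


def allelic_chi_square(case_genotypes, control_genotypes):
    """Pearson chi-square test (1 degree of freedom) on the 2x2 allele table."""
    a, b = allele_counts(case_genotypes)     # cases: minor, major
    c, d = allele_counts(control_genotypes)  # controls: minor, major
    n = a + b + c + d
    row_case, row_ctrl = a + b, c + d
    col_minor, col_major = a + c, b + d
    if 0 in (row_case, row_ctrl, col_minor, col_major):
        # Monomorphic SNP or empty group: the table carries no evidence.
        return 0.0, 1.0
    chi2 = n * (a * d - b * c) ** 2 / (row_case * row_ctrl * col_minor * col_major)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical toy genotypes for one SNP (not data from the study).
cases = [2, 1, 1, 2, 1, 0, 2, 1, 1, 2]
controls = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
chi2, p = allelic_chi_square(cases, controls)
print(f"chi2={chi2:.2f}, p={p:.4g}")
```

In a genome-wide setting this test would be repeated for every SNP that passes quality control, with the significance threshold corrected for the number of tests. SNPs that pass could then serve as input features for a classifier such as a Support Vector Machine.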

Item Type:	Article
Uncontrolled Keywords:	08 Information and Computing Sciences; 09 Engineering; 10 Technology
Subjects:	Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QH Natural history > QH426 Genetics
Divisions:	Computer Science and Mathematics
Publisher:	Institute of Electrical and Electronics Engineers (IEEE)
Date of acceptance:	29 May 2024
Date of first compliant Open Access:	1 August 2024
Date Deposited:	01 Aug 2024 15:46
Last Modified:	04 Jul 2025 10:15
DOI or ID number:	10.1109/ACCESS.2024.3410135
URI:	https://researchonline.ljmu.ac.uk/id/eprint/23864
