Machine Learning Approaches for the Prediction of Obesity using Publicly Available Genetic Profiles

Curbelo Montañez, CA, Fergus, P, Hussain, A and Al-Jumeily, D (2017) Machine Learning Approaches for the Prediction of Obesity using Publicly Available Genetic Profiles. In: Neural Networks (IJCNN). (2017 International Joint Conference on Neural Networks, 14 May 2017 - 19 May 2017, Anchorage, Alaska, USA).

Machine Learning Approaches for the Prediction of Obesity using Publicly available genetic profiles.pdf - Accepted Version

Abstract

This paper presents a novel approach based on the analysis of genetic variants from publicly available genetic profiles and a manually curated database, the National Human Genome Research Institute Catalog. Using data science techniques, genetic variants are identified in the collected participant profiles and then indexed as risk variants in the National Human Genome Research Institute Catalog. The indexed genetic variants, or Single Nucleotide Polymorphisms (SNPs), are used as inputs to various machine learning algorithms for the prediction of obesity. The body mass index status of participants is divided into two classes, Normal Class and Risk Class. Dimensionality reduction is performed to generate a set of principal variables (13 SNPs) to which the machine learning methods are applied. The models are evaluated using receiver operating characteristic curves and the area under the curve. Machine learning techniques including gradient boosting, generalized linear models, classification and regression trees, K-nearest neighbours, support vector machines, random forests and multilayer neural networks are comparatively assessed in terms of their ability to identify the most important factors among the initial 6622 variables describing genetic variants, age and gender, and to classify a subject into one of the body mass index related classes defined in this study. Our simulation results indicated that the support vector machine achieved a high accuracy of 90.5%.
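
The abstract names the area under the receiver operating characteristic curve as its evaluation measure. As a minimal sketch of what that quantity computes for a two-class problem, the following Python snippet derives the AUC from pairwise comparisons of classifier scores (the Mann-Whitney formulation). The function name `roc_auc`, the 0/1 encoding of Normal Class and Risk Class, and the labels and scores are all assumptions made for illustration; none of them come from the paper, and this is not presented as the authors' implementation.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney pairwise formulation.

    labels: 1 for Risk Class, 0 for Normal Class (illustrative encoding).
    scores: classifier scores, higher meaning more likely Risk Class.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Count (positive, negative) pairs ranked correctly; ties count one half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels and scores, for illustration only (not data from the paper).
labels = [0, 0, 1, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 means every Risk Class example is scored above every Normal Class example, while 0.5 corresponds to a ranking no better than chance.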

Item Type: Conference or Workshop Item (Paper)
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QH Natural history > QH426 Genetics
Divisions: Computer Science & Mathematics
Publisher: IEEE
Date Deposited: 07 Feb 2017 09:48
Last Modified: 13 Apr 2022 15:15
URI: https://researchonline.ljmu.ac.uk/id/eprint/5450