

Evaluation of Phenotype Classification Methods for Obesity using Direct to Consumer Genetic Data

Curbelo Montañez, CA, Fergus, P, Hussain, A, Al-Jumeily, D, Tevfik Dorak, M and Abdullah, R (2017) Evaluation of Phenotype Classification Methods for Obesity using Direct to Consumer Genetic Data. In: Lecture Notes in Computer Science. pp. 350-362. (2017 International Conference on Intelligent Computing, 07 August 2017 - 10 August 2017, Liverpool, UK).

Evaluation of Phenotype Classification Methods for Obesity using Direct to customer genetic data.pdf - Accepted Version



Direct-to-Consumer genetic testing services are becoming increasingly widespread. Consumers of such services are sharing their genetic and clinical information with the research community to facilitate the extraction of knowledge about different conditions. In this paper, we build on these services to analyse the genetic data of people with different BMI levels and to determine the immediate and long-term risk factors associated with obesity. Using web scraping techniques, we create a dataset containing publicly available information about 230 participants from the Personal Genome Project. We then analyse the dataset to identify genetic variants associated with high BMI levels, following standard quality control and association analysis protocols for Genome Wide Association Analysis. Finally, we apply a combination of Recursive Feature Elimination feature selection and a Support Vector Machine with a Radial Basis Function kernel to the filtered dataset. Our approach identifies obesity-related genetic variants that can be used as features when predicting individual obesity susceptibility. The results reveal that the subset of features obtained through Recursive Feature Elimination does not improve the performance of the classifier when compared to the full set of genetic variants identified by logistic regression.
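
As an illustration of the logistic regression association step mentioned in the abstract, the following is a minimal sketch of a single-variant test. It is not the authors' pipeline: the function name, the additive 0/1/2 genotype coding, the Wald test, and the toy data are all assumptions made for this example, and a real analysis would normally use a dedicated GWAS tool and adjust for covariates.

```python
import math


def snp_logistic_association(genotypes, phenotypes, max_iter=50, tol=1e-10):
    """Fit logit P(case) = b0 + b1 * genotype by Newton-Raphson.

    genotypes:  minor-allele counts per person (assumed additive coding 0/1/2)
    phenotypes: 1 for case (e.g. high BMI), 0 for control
    Returns (b1, standard error of b1, two-sided Wald p-value).
    Assumes the variant is polymorphic in the sample and the classes are not
    perfectly separated; otherwise the fit is not defined.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(genotypes, phenotypes):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p           # gradient w.r.t. intercept
            g1 += (y - p) * x     # gradient w.r.t. genotype effect
            h00 += w              # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: inverse information matrix times gradient
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) < tol and abs(step1) < tol:
            break
    # Variance of b1 from the information matrix of the last iteration
    se_b1 = math.sqrt(h00 / det)
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return b1, se_b1, p_value


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper
    toy_genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
    toy_phenotypes = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1]
    beta, se, p = snp_logistic_association(toy_genotypes, toy_phenotypes)
    print(f"log odds ratio per allele: {beta:.3f}, SE: {se:.3f}, p: {p:.3f}")
```

Running such a test once per variant and keeping those below a chosen significance threshold would yield a candidate feature set of the kind the abstract describes as input to the classifier; the threshold itself is not stated here and would have to be taken from the paper.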

Item Type: Conference or Workshop Item (Paper)
Additional Information: The final publication is available at Springer via https://doi.org/10.1007/978-3-319-63312-1_31
Uncontrolled Keywords: 08 Information And Computing Sciences
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science; R Medicine > R Medicine (General)
Divisions: Computer Science & Mathematics
Publisher: Springer Verlag (Germany)
Date Deposited: 05 May 2017 10:43
Last Modified: 20 May 2024 13:09
DOI or ID number: 10.1007/978-3-319-63312-1_31
URI: https://researchonline.ljmu.ac.uk/id/eprint/6365