Data quality in the human and environmental health sciences: Using statistical confidence scoring to improve QSAR/QSPR modeling

Steinmetz, FP, Madden, JC and Cronin, MTD (2015) Data quality in the human and environmental health sciences: Using statistical confidence scoring to improve QSAR/QSPR modeling. Journal of Chemical Information and Modeling, 55 (8). pp. 1739-1746. ISSN 1549-960X

Full text: just accepted document for sympletic.pdf - Accepted Version (1MB)

Abstract

An increasing number of toxicity data are becoming publicly available, allowing for in silico modeling. However, questions often arise as to how to incorporate data quality and how to deal with contradictory data when more than one data point is available for the same compound. In this study, two well-known and well-studied QSAR/QSPR models, for skin permeability and aquatic toxicology, were investigated in the context of statistical data quality. In particular, the potential benefits of incorporating the statistical Confidence Scoring (CS) approach within modeling and validation were assessed. As a result, robust QSAR/QSPR models were created for the skin permeability coefficient and for the toxicity of nonpolar narcotics in the Aliivibrio fischeri assay. CS-weighted linear regression for training and CS-weighted root mean square error (RMSE) for validation were statistically superior to standard linear regression and standard RMSE. Strategies are proposed for interpreting data with high and low CS, and for dealing with large datasets containing multiple entries.
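To illustrate the two weighted quantities the abstract names, the following is a minimal sketch of a weighted least-squares line fit and a weighted RMSE. It is not the authors' implementation: this record does not say how the Confidence Score is calculated, so the weights `w` below are hypothetical placeholders, and the function names and data values are invented for illustration only.

```python
import math


def weighted_linear_fit(x, y, w):
    """Fit y = slope * x + intercept by weighted least squares.

    w holds one non-negative weight per data point. Here the weights
    stand in for confidence scores (an assumption, not the paper's formula).
    """
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def weighted_rmse(y_true, y_pred, w):
    """Root mean square error in which each residual is scaled by its weight."""
    squared = sum(wi * (yt - yp) ** 2 for wi, yt, yp in zip(w, y_true, y_pred))
    return math.sqrt(squared / sum(w))


# Invented toy data: the last point is an outlier given zero confidence.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 20.0]
w = [1.0, 1.0, 1.0, 0.0]  # hypothetical confidence weights

slope, intercept = weighted_linear_fit(x, y, w)
predictions = [slope * xi + intercept for xi in x]
error = weighted_rmse(y, predictions, w)
```

With equal weights both functions reduce to ordinary least squares and the standard RMSE, so the effect of the weighting can be checked by comparing the two runs on the same data.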

Item Type: Article
Subjects: R Medicine > RS Pharmacy and materia medica
Divisions: Pharmacy & Biomolecular Sciences
Publisher: American Chemical Society
Date Deposited: 29 Jul 2015 08:00
Last Modified: 09 Mar 2022 09:53
DOI or ID number: 10.1021/acs.jcim.5b00294
URI: https://researchonline.ljmu.ac.uk/id/eprint/1757