Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph

Martiniano, R, Garrison, E, Jones, ER, Manica, A and Durbin, R (2020) Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph. Genome Biology, 21 (250). pp. 1-18. ISSN 1474-7596

Removing reference bias and improving indel calling in ancient DNA data analysis by mapping to a sequence variation graph.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Background: During the last decade, the analysis of ancient DNA (aDNA) sequence has become a powerful tool for the study of past human populations. However, the degraded nature of DNA means that aDNA molecules are short and frequently mutated by post-mortem chemical modifications. These features decrease read mapping accuracy and increase reference bias, in which reads containing non-reference alleles are less likely to be mapped than those containing reference alleles. Alternative approaches have been developed to replace the linear reference with a variation graph which includes known alternative variants at each genetic locus. Here, we evaluate the use of variation graph software vg to avoid reference bias for aDNA and compare with existing methods.

Results: We use vg to align simulated and real aDNA samples to a variation graph containing 1000 Genomes Project variants and compare with the same data aligned with bwa to the human linear reference genome. Using vg leads to a balanced allelic representation at polymorphic sites, effectively removing reference bias, and more sensitive variant detection in comparison with bwa, especially for insertions and deletions (indels). Alternative approaches that use relaxed bwa parameter settings or filter bwa alignments can also reduce bias but can have lower sensitivity than vg, particularly for indels.

Conclusions: Our findings demonstrate that aligning aDNA sequences to variation graphs effectively mitigates the impact of reference bias when analysing aDNA, while retaining mapping sensitivity and allowing detection of variation, in particular indel variation, that was previously missed.
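
As an illustration of what "balanced allelic representation" can mean in practice, the sketch below pools read counts over heterozygous sites and reports the fraction of reads carrying the alternate allele. With no reference bias that fraction should sit near 0.5, and values below 0.5 indicate that reads with the reference allele are over-represented. This is a minimal sketch, not the paper's own procedure: the `SiteCounts` type, the `alt_allele_fraction` helper, and all read counts are hypothetical and invented purely for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class SiteCounts:
    """Read counts at one heterozygous site (hypothetical illustration type)."""
    ref_reads: int  # reads supporting the reference allele
    alt_reads: int  # reads supporting the alternate allele


def alt_allele_fraction(sites: Iterable[SiteCounts]) -> float:
    """Return the pooled fraction of reads carrying the alternate allele."""
    sites = list(sites)
    ref_total = sum(s.ref_reads for s in sites)
    alt_total = sum(s.alt_reads for s in sites)
    total = ref_total + alt_total
    if total == 0:
        raise ValueError("no reads cover the supplied sites")
    return alt_total / total


# Invented counts, NOT results from the paper: one set stands in for a
# linear-reference alignment, the other for a variation-graph alignment.
linear_counts = [SiteCounts(12, 8), SiteCounts(9, 6), SiteCounts(15, 10)]
graph_counts = [SiteCounts(10, 10), SiteCounts(8, 7), SiteCounts(12, 13)]

linear_fraction = alt_allele_fraction(linear_counts)
graph_fraction = alt_allele_fraction(graph_counts)

print(f"linear reference: {linear_fraction:.2f}")
print(f"variation graph:  {graph_fraction:.2f}")
```

Pooling counts across sites, rather than averaging per-site fractions, keeps low-coverage sites from dominating the estimate, which matters for aDNA where many sites are covered by only one or two reads.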

Item Type: Article
Uncontrolled Keywords: 05 Environmental Sciences, 06 Biological Sciences, 08 Information and Computing Sciences
Subjects: Q Science > QH Natural history > QH301 Biology
Q Science > QH Natural history > QH426 Genetics
Divisions: Biological & Environmental Sciences (from Sep 19)
Publisher: BMC
Date Deposited: 15 Jul 2020 14:34
Last Modified: 22 Aug 2022 09:30
DOI or ID number: 10.1186/s13059-020-02160-7
URI: https://researchonline.ljmu.ac.uk/id/eprint/13309