Chromosome-scale genome assembly and annotation of two geographically distinct strains of malaria vector Anopheles albimanus

Derilus, D, Weedall, GD (ORCID: 0000-0002-8927-1063), Vandewege, MW, Batra, D, Sheth, M, Rowe, LA, Escalante, AA, Lenhart, A and Impoinvil, LM (2025) Chromosome-scale genome assembly and annotation of two geographically distinct strains of malaria vector Anopheles albimanus. Scientific Reports, 15 (1). ISSN 2045-2322

Chromosome-scale genome assembly and annotation of two geographically distinct strains of malaria vector Anopheles albimanus.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Anopheles albimanus is one of the principal malaria vectors in the Americas and exhibits phenotypic variation across its geographic distribution. High-quality reference genomes from geographically distant populations are essential to deepen our understanding of the biology, evolution, and genetic variation of this important malaria vector. In this study, we applied long-read PacBio and short-read Illumina sequencing technologies to assemble the complete genomes of two reference strains of An. albimanus, Stecla (originating from El Salvador) and Cartagena (originating from Colombia), and investigated the structural features of these genomes, including gene content, transposable elements (TEs), genetic variation, and structural rearrangements. Our hybrid assembly approach generated reference-quality genomes for each strain and recovered ~96% of the expected genome size. The genome assemblies of Stecla and Cartagena consisted of 109 and 149 scaffolds, with estimated genome sizes of 167.5 Mbp (N50 = 88 Mbp) and 167.1 Mbp (N50 = 87 Mbp), respectively. They exhibited a high level of completeness and contained fewer gaps and ambiguous bases than either of the two previously published reference genomes for this species, suggesting a considerable improvement in the quality and completeness of the assemblies. A total of 12,082 and 12,120 protein-coding genes were predicted in Stecla and Cartagena, respectively. TE analyses indicated that more repetitive content was captured in the long-read assemblies. The assembled genomes shared 98.12% pairwise identity, and synteny analyses suggested that gene position was conserved between the two strains. These newly assembled genomes will serve as an important resource for future research in comparative genomics, proteomics, epigenetics, transcriptomics, and functional analysis of this important malaria vector.
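
The abstract reports assembly contiguity as N50. As a reminder of what that statistic measures, here is a minimal sketch of the standard N50 definition: the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name and the example lengths are hypothetical and are not taken from the paper; this is not the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    N50 is the length L such that scaffolds of length >= L
    together cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mbp (illustrative only).
example_lengths = [50, 30, 20, 10, 5]
print(n50(example_lengths))  # 30: 50 + 30 = 80 first reaches half of 115
```

A large N50 relative to total assembly size, as reported above, indicates that most of the sequence sits in a few very long scaffolds, which is what a chromosome-scale assembly is expected to show.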

Item Type: Article
Uncontrolled Keywords: Animals; Anopheles; Malaria; DNA Transposable Elements; Genome, Insect; Genetic Variation; Molecular Sequence Annotation; Chromosomes, Insect; Mosquito Vectors; 31 Biological Sciences; 3102 Bioinformatics and Computational Biology; 3105 Genetics; Vector-Borne Diseases; Human Genome; Rare Diseases; Biotechnology; Genetics; Infectious Diseases; Infection; 3 Good Health and Well Being
Subjects: Q Science > QH Natural history > QH301 Biology
Divisions: Biological and Environmental Sciences (from Sep 19)
Publisher: Nature Research
Date of acceptance: 7 May 2025
Date of first compliant Open Access: 2 July 2025
Date Deposited: 02 Jul 2025 11:29
Last Modified: 03 Jul 2025 12:45
DOI or ID number: 10.1038/s41598-025-01713-9
URI: https://researchonline.ljmu.ac.uk/id/eprint/26693