Engels, I, Burnett, A, Robert, P, Pironneau, C, Abrams, G, Bouwmeester, R, Van der Plaetsen, P, Di Modica, K, Otte, M, Straus, LG, Fischer, V, Bray, F, Mesuere, B, De Groote, I ORCID: 0000-0002-9860-0180, Deforce, D, Daled, S and Dhaenens, M
(2025)
Classification of Collagens via Peptide Ambiguation, in a Paleoproteomic LC-MS/MS-Based Taxonomic Pipeline.
Journal of Proteome Research, 24 (4).
pp. 1907-1925.
ISSN 1535-3893
engels-et-al-2025-classification-of-collagens-via-peptide-ambiguation-in-a-paleoproteomic-lc-ms-ms-based-taxonomic.pdf - Published Version. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
Liquid chromatography-mass spectrometry (LC-MS/MS) extends the matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) Zooarcheology by Mass Spectrometry (ZooMS) "mass fingerprinting" approach to species identification by providing fragmentation spectra for each peptide. However, ancient bone samples generate sparse data containing only a few collagen proteins, rendering target-decoy strategies unusable and increasing uncertainty in peptide annotation. To ameliorate this issue, we present a ZooMS/MS data pipeline that builds on a manually curated Collagen database and comprises two novel algorithms: isoBLAST and ClassiCOL. isoBLAST first extends peptide ambiguity by generating all "potential peptide candidates" isobaric to the annotated precursor. The exhaustive set of candidates created is then used to retain or reject different potential paths at each taxonomic branching point from superkingdom to species, until the greatest possible specificity is reached. Uniquely, ClassiCOL allows for the identification of taxonomic mixtures, including contaminated samples, as well as suggesting taxonomies not represented in sequence databases, including extinct taxa. All considered ambiguity is then graphically represented with clear prioritization of the potential taxa in the sample. Using public as well as in-house data acquired on different instruments, we demonstrate the performance of this universal postprocessing and explore the identification of both genetic and sample mixtures. Diet reconstruction from 40,000-year-old cave hyena coprolites illustrates the exciting potential of this approach.
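The "peptide ambiguation" idea in the abstract can be illustrated with a minimal Python sketch. This is not the published isoBLAST implementation: the function names, the small substitution table, and the 0.001 Da tolerance are assumptions chosen for illustration. The residue masses are standard monoisotopic values. The sketch expands an annotated peptide into candidates with the same precursor mass, using three well-known isobaric cases (I/L, N versus GG, Q versus AG or GA).

```python
from itertools import product

# Standard monoisotopic residue masses in Da (subset sufficient for the example).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "E": 129.04259,
    "R": 156.10111,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a linear peptide

# Hypothetical, deliberately small substitution table (an assumption of this
# sketch): each residue maps to alternatives with the same elemental composition.
ISOBARIC = {
    "I": ["I", "L"],
    "L": ["L", "I"],
    "N": ["N", "GG"],
    "Q": ["Q", "AG", "GA"],
}


def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def isobaric_candidates(seq: str, tol: float = 0.001) -> list[str]:
    """Enumerate sequences within `tol` Da of the mass of `seq`.

    Only forward substitutions from the table are tried, so a "GG" in the
    input is not collapsed back to "N".
    """
    target = peptide_mass(seq)
    options = [ISOBARIC.get(aa, [aa]) for aa in seq]
    found = set()
    for combo in product(*options):
        candidate = "".join(combo)
        # Re-check the mass so the table cannot introduce a non-isobaric hit.
        if abs(peptide_mass(candidate) - target) <= tol:
            found.add(candidate)
    return sorted(found)


if __name__ == "__main__":
    for peptide in ("GPL", "AN", "Q"):
        print(peptide, "->", isobaric_candidates(peptide))
```

In a pipeline of the kind the abstract describes, each candidate would then be matched against the collagen database, so that a single annotated spectrum can support or reject several taxonomic branches rather than only the one implied by the search engine's first annotation.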
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Algorithms; Amino Acid Sequence; Animals; Chromatography, Liquid; Collagen; Databases, Protein; Humans; Liquid Chromatography-Mass Spectrometry; Peptides; Proteomics; Tandem Mass Spectrometry; 3401 Analytical Chemistry; 31 Biological Sciences; 34 Chemical Sciences; 03 Chemical Sciences; 06 Biological Sciences; Biochemistry & Molecular Biology |
Subjects: | Q Science > QH Natural history > QH301 Biology |
Divisions: | Biological and Environmental Sciences (from Sep 19) |
Publisher: | American Chemical Society (ACS) |
Date of acceptance: | 5 March 2025 |
Date of first compliant Open Access: | 2 July 2025 |
Date Deposited: | 02 Jul 2025 11:39 |
Last Modified: | 03 Jul 2025 12:45 |
DOI or ID number: | 10.1021/acs.jproteome.4c00962 |
URI: | https://researchonline.ljmu.ac.uk/id/eprint/26694 |