Title: An Algorithm for Random Match Probability Calculation from Peptide Sequences
Authors: August E. Woerner,1* F. Curtis Hewitt,2 Myles W. Gardner,2 Michael A. Freitas,3,4 Kathleen Q. Schulte,2 Danielle S. LeSassier,2 Maryam Baniasad,3 Andrew J. Reed,3 Megan E. Powals,2 Alan R. Smith,2 Nicolette C. Albright,2 Benjamin C. Ludolph,2 Liwen Zhang,3 Leah W. Allen,2 Katharina Weber,2 Bruce Budowle1
1. Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX
2. Signature Science, LLC, Austin, TX
3. The Ohio State University, Columbus, OH
4. The Ohio State University Wexner Medical Center, Columbus, OH
* Corresponding author
Abstract: For the past three decades, forensic genetic investigations have focused on elucidating DNA signatures. While DNA has a number of desirable properties (e.g., presence in most biological materials, a chemistry amenable to analysis, and well-developed statistics), it also has limitations. DNA may be present in low quantity in some tissues, such as hair, and in some tissues it may degrade more readily than its protein counterparts. Recent research efforts have shown the feasibility of performing protein-based human identification in cases in which DNA recovery is challenging; however, methods for assessing the rarity of a given protein profile have not been addressed adequately. In this paper, an algorithm is proposed for computing a random match probability (RMP) from a genetically variable peptide signature. The approach described herein explicitly models proteomic error and genetic linkage, makes no assumptions as to allelic drop-out, and maps the observed proteomic alleles to their expected protein products from DNA, which, in turn, permits standard corrections for population structure and finite database sizes. To assess the feasibility of this approach, RMPs were estimated from peptide profiles of skin samples from 25 individuals of European ancestry. A total of 126 common peptide alleles were used, yielding a mean RMP of approximately 10^-2.