Minireview Published in ASM Infection and Immunity: Categorizing Sequences of Concern by Function To Better Assess Mechanisms of Microbial Pathogenesis


To identify sequences with a role in microbial pathogenesis, we assessed the adequacy of their annotation by existing controlled vocabularies and sequence databases. Our goal was to regularize descriptions of microbial pathogenesis for improved integration with bioinformatic applications. Here, we review the challenges of annotating sequences for pathogenic activity. We describe the categorization of more than 2,750 sequences of pathogenic microbes through a controlled vocabulary called Functions of Sequences of Concern (FunSoCs). These terms make the sequences easy to describe for both humans and machines. We provide a subset of 220 fully annotated sequences in the supplemental material as examples. The use of this compact (∼30 terms), controlled vocabulary has potential benefits for research in microbial genomics, public health, biosecurity, biosurveillance, and the characterization of new and emerging pathogens.



Gene D. Godbold (a), Anthony D. Kappell (b), Danielle S. LeSassier (b), Todd J. Treangen (c), Krista L. Ternus (b)

a Signature Science, LLC, Charlottesville, VA, USA
b Signature Science, LLC, Austin, TX, USA
c Department of Computer Science, Rice University, Houston, Texas, USA