Artificial intelligence (AI) models increasingly augment the day-to-day operations of individuals, enterprises, and government organizations around the world. In scientific research, AI models can automate or semi-automate practical chemical and biological tasks. In this context, responsible AI model developers must ensure that the tool works as intended and install safeguards to mitigate potential vulnerabilities, risks, and unintended behaviors. Signature Science, LLC subject matter experts support AI stakeholders by evaluating the efficacy of these safety measures and the accuracy with which the model handles expert-level scientific content.

Signature Science’s chemistry and biology researchers and CBRNE threat subject matter experts combine their hands-on, practical lab experience with capabilities in AI benchmarking, evaluations, and assessments. We can apply our current, practicing chemistry and biology expertise and operational laboratories to facilitate national security-related AI evaluations.
SIGNATURE SCIENCE OFFERS
Red Teaming
Signature Science collaborates with AI developers to rigorously test AI systems and to identify and address vulnerabilities, risks, and unintended behaviors. Through red teaming, responsible developers can test the limits of the model, mitigate identified risks, implement safeguards, and enhance the model’s ability to handle unexpected inputs.
Question and Answer Pair Development
Targeted question and answer pairs can be used both to evaluate the knowledge base of an AI system and to train it to respond accurately to user queries, to retrieve relevant information when asked a similar question, and to help prevent dangerous misuse. Our chemistry and biology experts can develop challenging multiple-choice or open-ended questions across chemistry and biology sub-domains to evaluate model performance relative to human expert performance or other reference models.
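As an illustration only, the comparison of model performance against a human expert baseline on multiple-choice questions could be sketched as follows. This is a minimal sketch under stated assumptions, not Signature Science's actual tooling: the `MCQuestion` class, the `accuracy` function, the question IDs, and all responses are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MCQuestion:
    """One multiple-choice item (hypothetical structure for illustration)."""
    qid: str
    subdomain: str
    answer: str  # letter of the correct option, e.g. "B"


def accuracy(questions: List[MCQuestion], responses: Dict[str, str]) -> float:
    """Fraction of questions answered correctly; a missing response counts as wrong."""
    if not questions:
        raise ValueError("at least one question is required")
    correct = sum(
        1
        for q in questions
        if responses.get(q.qid, "").strip().upper() == q.answer
    )
    return correct / len(questions)


# Placeholder items and responses; none of these come from a real evaluation.
questions = [
    MCQuestion("q1", "analytical chemistry", "B"),
    MCQuestion("q2", "molecular biology", "D"),
    MCQuestion("q3", "organic chemistry", "A"),
    MCQuestion("q4", "microbiology", "C"),
]
model_responses = {"q1": "B", "q2": "D", "q3": "C", "q4": "c"}
expert_responses = {"q1": "B", "q2": "D", "q3": "A", "q4": "C"}

model_acc = accuracy(questions, model_responses)    # 3 of 4 correct
expert_acc = accuracy(questions, expert_responses)  # 4 of 4 correct
gap = model_acc - expert_acc                        # negative: model trails the expert baseline

print(f"model={model_acc:.2f} expert={expert_acc:.2f} gap={gap:+.2f}")
```

A real evaluation would use many more items per sub-domain and report uncertainty alongside the point estimates; the sketch shows only the basic bookkeeping.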
Automated Grading Rubric Development
AI model assessments benefit from clear, objective, and measurable criteria by which submissions can be evaluated consistently. Science-based question and answer pairs often have numerous feasible correct answers, which complicates the development of an automated grading rubric. Our experts work to refine each question and ensure that its answer rubric is specific and comprehensive while maintaining compatibility with automated grading systems.
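One way to accommodate several feasible correct answers in an automated rubric is to list every accepted phrasing for each criterion and award points when any of them appears in the response. The sketch below is a hypothetical illustration of that idea, not the rubric format Signature Science uses: the `Criterion` class, the `grade` function, and the benign example question about visualizing DNA fragments are all assumptions made for this example.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Criterion:
    """One rubric line with every phrasing accepted as correct (hypothetical)."""
    name: str
    accepted: Tuple[str, ...]
    points: int = 1


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(cleaned.split())


def grade(answer: str, rubric: List[Criterion]) -> Tuple[int, int, List[str]]:
    """Return (points earned, points possible, names of criteria met)."""
    norm = normalize(answer)
    earned = 0
    met = []
    for criterion in rubric:
        # A criterion is met if any accepted phrasing occurs in the answer.
        if any(normalize(phrase) in norm for phrase in criterion.accepted):
            earned += criterion.points
            met.append(criterion.name)
    possible = sum(c.points for c in rubric)
    return earned, possible, met


# Example question (assumed): "How would you separate DNA fragments by size
# and visualize them?" Several methods and stains are equally acceptable.
rubric = [
    Criterion("separation method",
              ("gel electrophoresis", "agarose gel", "polyacrylamide gel")),
    Criterion("visualization",
              ("ethidium bromide", "SYBR Safe", "SYBR Green", "GelRed")),
]

strong = grade("Run the sample on an agarose gel and stain with SYBR-Safe.", rubric)
weak = grade("Use a centrifuge.", rubric)
print(strong)  # (2, 2, ['separation method', 'visualization'])
print(weak)    # (0, 2, [])
```

Plain substring matching is a deliberate simplification: it cannot detect negation ("do not use an agarose gel") or paraphrases absent from the accepted list, which is part of why expert refinement of both the question and the rubric matters.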
Human Uplift Studies
We design approaches to empirically test the biological and chemical laboratory uplift capabilities of AI models and investigate potential areas of concern as they relate to biological and chemical threats. By employing human uplift studies that ask participants to perform practical chemical and biological research tasks in a laboratory, AI stakeholders can gain a better understanding of the real-world impact of AI assistance on carrying out complex laboratory protocols.
Laboratory Verification of AI BioScience
As your team works to automate biological processes by developing and employing custom agentic AI models, you need to understand whether your tool is performing the intended biodesign tasks accurately and safely. With AI EmpiriTest™, we offer a suite of in-laboratory tests to assess your model’s ability to execute a range of molecular design tasks. In our corporate laboratories, Signature Science’s expert scientists will follow your tool’s protocol, rapidly sourcing the suggested reagents, custom DNA, CRISPR, proteins, etc., and executing the laboratory work dictated by the model to deliver timely real-world empirical data on your AI model’s performance. This benchmark assessment will simultaneously serve to screen your tool for biosafety risks and inform you of whether and to what degree your tool enables a novice biologist to design and perform synthetic biology at a high-risk level.

For more information about AI Safety Consultation Services:

Alan Smith, PMP
AI Safety Lead, Chemistry

Danielle LeSassier, PhD
AI Biosafety Lead