k2v: A Containerized Workflow for Creating VCF Files from Kintelligence Targeted Sequencing Data

Abstract

The ForenSeq Kintelligence kit developed by Verogen is a targeted Illumina sequencing assay that genotypes 10,230 single nucleotide polymorphisms (SNPs) designed for forensic genetic genealogy, forensic DNA phenotyping, and ancestry inference. We developed k2v, a containerized workflow for creating standard, specification-compliant variant call format (VCF) files from the custom output data produced by the Kintelligence Universal Analysis Software. VCF files produced with k2v can be used with the many existing, widely used, community-developed tools for manipulating and analyzing genetic data in the standard VCF format. Here we describe the k2v implementation, demonstrate its usage, and use the VCF files produced by k2v to demonstrate three downstream analyses that existing tools can perform directly on VCF input: concordance analysis, ancestry inference, and relationship estimation. k2v is distributed as a Docker container available on Docker Hub. Documentation and source code for k2v are freely available under the GNU General Public License (GPL-3.0) here.
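
To make the idea of converting per-SNP genotype calls into VCF concrete, the following is a minimal Python sketch. It is not the k2v implementation: the `SnpCall` record, its field names, and the `to_vcf` and `gt_field` functions are hypothetical and do not reflect the actual layout of the Universal Analysis Software report. The sketch only assumes that each SNP has a known chromosome, position, identifier, reference allele, alternate allele, and two observed bases.

```python
from typing import Iterable, NamedTuple


class SnpCall(NamedTuple):
    """One genotyped SNP. Field names are hypothetical, not the UAS report layout."""
    chrom: str
    pos: int
    rsid: str
    ref: str
    alt: str
    alleles: str  # the two observed bases, e.g. "AG"


def gt_field(call: SnpCall) -> str:
    """Encode observed bases as a VCF GT string (0 = REF, 1 = ALT, . = neither)."""
    codes = {call.ref: "0", call.alt: "1"}
    # Sort so heterozygotes are always written as 0/1 rather than 1/0.
    return "/".join(sorted(codes.get(base, ".") for base in call.alleles))


def to_vcf(calls: Iterable[SnpCall], sample: str) -> str:
    """Render calls as a single-sample VCF 4.2 text with only a GT field."""
    header = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT",
                   "QUAL", "FILTER", "INFO", "FORMAT", sample]),
    ]
    rows = [
        "\t".join([c.chrom, str(c.pos), c.rsid, c.ref, c.alt,
                   ".", ".", ".", "GT", gt_field(c)])
        for c in calls
    ]
    return "\n".join(header + rows) + "\n"


if __name__ == "__main__":
    # Made-up example records for illustration only.
    example = [
        SnpCall("1", 1000, "rs0000001", "A", "G", "AG"),
        SnpCall("2", 2000, "rs0000002", "C", "T", "TT"),
    ]
    print(to_vcf(example, "sample1"), end="")
```

A production workflow would also need contig header lines, coordinate-sorted records, and handling of no-calls, all of which this sketch leaves out.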

Read full paper here.


Authors

Stephen D. Turner (a), Michelle A. Peck (a)

(a) Signature Science, LLC, Charlottesville, VA