DOI: 10.26508/lsa.202302181 ISSN: 2575-1077

A comparative study of structural variant calling in WGS from Alzheimer’s disease families

John S Malamon, John J Farrell, Li Charlie Xia, Beth A Dombroski, Rueben G Das, Jessica Way, Amanda B Kuzma, Otto Valladares, Yuk Yee Leung, Allison J Scanlon, Irving Antonio Barrera Lopez, Jack Brehony, Kim C Worley, Nancy R Zhang, Li-San Wang, Lindsay A Farrer, Gerard D Schellenberg, Wan-Ping Lee, Badri N Vardarajan
  • Health, Toxicology and Mutagenesis
  • Plant Science
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Ecology

Detecting structural variants (SVs) in whole-genome sequencing poses significant challenges. We present a protocol for variant calling, merging, genotyping, sensitivity analysis, and laboratory validation for generating a high-quality SV call set in whole-genome sequencing from the Alzheimer’s Disease Sequencing Project comprising 578 individuals from 111 families. Employing two complementary pipelines, Scalpel and Parliament, for SV/indel calling, we assessed sensitivity through sample replicates (N = 9) with in silico variant spike-ins. We developed a novel metric, D-score, to evaluate caller specificity for deletions. The accuracy of deletions was evaluated by Sanger sequencing. We generated a high-quality call set of 152,301 deletions of diverse sizes. Sanger sequencing validated 114 of 146 detected deletions (78.1%). Scalpel excelled in accuracy for deletions ≤100 bp, whereas Parliament was optimal for deletions >900 bp. Overall, 83.0% and 72.5% of calls by Scalpel and Parliament were validated, respectively, including all 11 deletions called by both Parliament and Scalpel between 101 and 900 bp. Our flexible protocol successfully generated a high-quality deletion call set and a truth set of Sanger sequencing–validated deletions with precise breakpoints spanning 1–17,000 bp.
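As a small worked illustration of the figures quoted above, the following Python sketch recomputes the Sanger validation rate (114 of 146) and assigns a deletion to one of the three size ranges mentioned in the abstract. The function names `validation_rate` and `size_bin` are hypothetical helpers introduced here for illustration only; they are not part of the published protocol, and the D-score is not reproduced because its definition is not given in this text.

```python
def validation_rate(validated: int, tested: int) -> float:
    """Percentage of tested calls that were confirmed, to one decimal place."""
    if tested <= 0:
        raise ValueError("tested must be a positive count")
    if not 0 <= validated <= tested:
        raise ValueError("validated must be between 0 and tested")
    return round(100.0 * validated / tested, 1)


def size_bin(length_bp: int) -> str:
    """Assign a deletion length to one of the size ranges named in the abstract.

    The boundaries (100 bp and 900 bp) come from the abstract; the labels
    themselves are an assumption made for this sketch.
    """
    if length_bp < 1:
        raise ValueError("deletion length must be at least 1 bp")
    if length_bp <= 100:
        return "<=100 bp"
    if length_bp <= 900:
        return "101-900 bp"
    return ">900 bp"


if __name__ == "__main__":
    # 114 of 146 Sanger-tested deletions were validated.
    print(validation_rate(114, 146))
    # Example lengths chosen arbitrarily to show each size range.
    for length in (1, 100, 101, 900, 901, 17000):
        print(length, size_bin(length))
```

Under these assumptions, `validation_rate(114, 146)` reproduces the 78.1% reported above, and the size ranges correspond to where the abstract reports Scalpel performing best (≤100 bp), Parliament performing best (>900 bp), and the intermediate range where calls made by both tools were all validated.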
