DOI: 10.1002/qub2.42 ISSN: 2095-4689

SimHOEPI: A resampling simulator for generating single nucleotide polymorphism data with a high‐order epistasis model

Yahan Li, Xinrui Cai, Junliang Shang, Yuanyuan Zhang, Jin‐Xing Liu
  • Applied Mathematics
  • Computer Science Applications
  • Biochemistry, Genetics and Molecular Biology (miscellaneous)
  • Modeling and Simulation

Abstract

Epistasis is a ubiquitous phenomenon in genetics and is considered one of the main factors in current efforts to explain the missing heritability of complex diseases. Simulated data are crucial for evaluating epistasis detection tools in genome‐wide association studies (GWAS). Existing simulators typically suffer from two limitations: they lack support for high‐order epistasis models containing multiple single nucleotide polymorphisms (SNPs), and they cannot generate simulation SNP data independently. In this study, we propose a simulator, SimHOEPI, which can calculate penetrance tables of high‐order epistasis models from either prevalence or heritability and uses a resampling strategy to generate simulation data independently. Highlights of SimHOEPI are the preservation of realistic minor allele frequencies in the sampled data, the accurate calculation and embedding of high‐order epistasis models, and acceptable simulation time. A series of experiments was carried out to verify these properties from different aspects. Experimental results show that SimHOEPI can generate simulation SNP data independently with high‐order epistasis models, suggesting that it may serve as an alternative simulator for GWAS.
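To make the link between a penetrance table, prevalence, and heritability concrete, the following is a minimal sketch and not SimHOEPI's actual code. It assumes independent SNPs in Hardy-Weinberg equilibrium, takes prevalence as the frequency-weighted mean penetrance, and takes heritability as the variance of penetrance across genotype combinations divided by K(1 - K), where K is the prevalence; the abstract does not state which definitions SimHOEPI uses. The function names, the minor allele frequencies, and the three-SNP threshold model are all hypothetical.

```python
from itertools import product


def genotype_freqs(mafs):
    """Joint genotype frequencies for independent SNPs under
    Hardy-Weinberg equilibrium (an assumption of this sketch).
    Genotypes are coded 0, 1, 2 = number of minor alleles."""
    per_snp = [((1 - m) ** 2, 2 * m * (1 - m), m ** 2) for m in mafs]
    freqs = {}
    for combo in product(range(3), repeat=len(mafs)):
        f = 1.0
        for snp, g in enumerate(combo):
            f *= per_snp[snp][g]
        freqs[combo] = f
    return freqs


def prevalence(freqs, penetrance):
    """Frequency-weighted mean of the penetrance table."""
    return sum(freqs[g] * penetrance[g] for g in freqs)


def heritability(freqs, penetrance):
    """Variance of penetrance across genotype combinations,
    scaled by K * (1 - K), where K is the prevalence."""
    k = prevalence(freqs, penetrance)
    var = sum(freqs[g] * (penetrance[g] - k) ** 2 for g in freqs)
    return var / (k * (1 - k))


# Hypothetical third-order model: risk is elevated only when all
# three SNPs carry at least one minor allele.
mafs = [0.2, 0.3, 0.4]
freqs = genotype_freqs(mafs)
penetrance = {g: (0.5 if all(x > 0 for x in g) else 0.05) for g in freqs}

print("prevalence  :", prevalence(freqs, penetrance))
print("heritability:", heritability(freqs, penetrance))
```

A simulator that takes prevalence or heritability as input has to solve the inverse problem, choosing penetrance values so that one of these quantities hits a target; the sketch only evaluates the forward direction for a given table.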
