Journal of Proteomics & Bioinformatics

Open Access

ISSN: 0974-276X

Abstract

Conducting Genome-Wide Association Studies: Epistasis Scenarios

Philip Cooley, Nathan Gaddis, Ralph Folsom and Diane Wagener

This paper investigates epistatic scenarios in the context of genome-wide association studies (GWAS), using a qualitative association model to assess which statistical models reliably predict associations between a qualitative phenotype (i.e., a disease diagnosis) and a pair of interacting genes. We employed the concept of relative risk: the probability of a positive diagnosis given a mutated (risk) genotype, divided by the probability of a positive diagnosis when no risk genotype is present.
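To make the relative-risk definition concrete, the following minimal sketch computes it from diagnosis counts. The function name, its parameters and the example counts are illustrative assumptions, not values taken from the study.

```python
def relative_risk(cases_risk, total_risk, cases_nonrisk, total_nonrisk):
    """Ratio of P(diagnosis | risk genotype) to P(diagnosis | no risk genotype)."""
    p_risk = cases_risk / total_risk            # probability of diagnosis among risk-genotype carriers
    p_nonrisk = cases_nonrisk / total_nonrisk   # baseline probability among non-carriers
    return p_risk / p_nonrisk


# Hypothetical counts: 40 of 200 carriers diagnosed versus 100 of 1000 non-carriers.
rr = relative_risk(40, 200, 100, 1000)
print(f"relative risk = {rr:.2f}")
```

A relative risk of 1 means the genotype does not change the probability of diagnosis; values above 1 indicate elevated risk for carriers.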

We used a Monte Carlo-based simulation approach to generate synthetic data corresponding to a variety of possible epistatic models (EMs). Our method took into account the strength of association, disease prevalence in non-risk populations and, most importantly, the inheritance patterns of the epistatic genes. We analyzed the simulated gene data to assess how these individual factors influenced statistical power in the context of GWAS.
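A simulation of this kind can be sketched as follows. This is not the authors' implementation: it assumes Hardy-Weinberg genotype frequencies, two independent loci, and one particular inheritance pattern (risk requires at least one risk allele at both loci). All function names and parameter values are hypothetical.

```python
import random


def draw_genotype(rng, maf):
    """Number of risk alleles (0, 1 or 2) at one locus, assuming Hardy-Weinberg equilibrium."""
    return int(rng.random() < maf) + int(rng.random() < maf)


def at_risk(g1, g2):
    """Assumed epistatic model: risk requires at least one risk allele at BOTH loci."""
    return g1 >= 1 and g2 >= 1


def simulate(n, maf1, maf2, prevalence, rel_risk, seed=0):
    """Return n records (g1, g2, is_case) drawn under the assumed model."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        g1 = draw_genotype(rng, maf1)
        g2 = draw_genotype(rng, maf2)
        # Baseline prevalence applies to non-risk genotypes; risk genotypes scale it by rel_risk.
        p = min(1.0, prevalence * rel_risk) if at_risk(g1, g2) else prevalence
        records.append((g1, g2, rng.random() < p))
    return records


# Hypothetical settings: 1000 subjects, 5% baseline prevalence, relative risk of 2.
records = simulate(n=1000, maf1=0.3, maf2=0.3, prevalence=0.05, rel_risk=2.0, seed=42)
print(sum(case for _, _, case in records), "cases out of", len(records))
```

Swapping in a different `at_risk` function is one way to represent other inheritance patterns, and repeating the simulation across many seeds gives an empirical estimate of statistical power for a given test.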

Using simulated data provides two distinct advantages. First, the association-affecting factors are isolated and can be linked to the affecting locus. Second, we can use any specific statistical method to perform the assessment. The simulated dataset provides a truth set for assessing the effect of statistical method choice on association sensitivity, and it highlights the role of errors in disease diagnosis and of incorrect genotype assignments.

The results indicate that the most powerful statistical methods for predicting associations between phenotypes and genotypes in epistatic scenarios are statistical models that simultaneously test for associations involving both interacting loci. This result is not surprising and has been reported by others: two-gene models produce better predictions of association than single-gene models. The significance of this study is twofold. First, it incorporates recently developed statistical methods in the comparison analysis. Second, it documents the extent to which single-gene models fail to predict associations between interacting genes and phenotypes constructed to be associated with low risk.
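The contrast between single-gene and two-gene tests can be illustrated with a plain Pearson chi-square statistic. This generic test stands in for the methods compared in the study, which are not named here. The counts are hypothetical and deliberately extreme: each locus has no marginal effect, so all of the signal lies in the interaction.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for {genotype: [controls, cases]}."""
    rows = [r for r in table.values() if sum(r) > 0]
    col_totals = [sum(r[j] for r in rows) for j in range(2)]
    total = sum(col_totals)
    stat = 0.0
    for r in rows:
        for j in range(2):
            expected = sum(r) * col_totals[j] / total
            if expected > 0:
                stat += (r[j] - expected) ** 2 / expected
    return stat, len(rows) - 1  # two columns, so df = number of rows - 1


def tabulate(records, key):
    """Count controls and cases per genotype class, where key(g1, g2) picks the class."""
    table = {}
    for g1, g2, case in records:
        table.setdefault(key(g1, g2), [0, 0])[int(case)] += 1
    return table


# Hypothetical counts per (locus-1 carrier, locus-2 carrier): (controls, cases).
counts = {(0, 0): (40, 10), (0, 1): (10, 40), (1, 0): (10, 40), (1, 1): (40, 10)}
records = [(g1, g2, case)
           for (g1, g2), (n_ctrl, n_case) in counts.items()
           for case, n in ((False, n_ctrl), (True, n_case))
           for _ in range(n)]

single_stat, single_df = chi_square(tabulate(records, lambda g1, g2: g1))
joint_stat, joint_df = chi_square(tabulate(records, lambda g1, g2: (g1, g2)))
print(single_stat, single_df)  # 0.0 1: the single-locus test sees no association
print(joint_stat, joint_df)    # 72.0 3: the joint two-locus test detects it
```

Real epistatic models are rarely this clean, but the example shows why a test restricted to one locus can miss an association that a joint test of both loci picks up.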
