Abstract
The aim of this paper is to discuss the association between SNP genotype data and a disease. In genetic association studies, statistical analyses with multiple markers have been shown to be more powerful, efficient, and biologically meaningful than single-marker association tests. As the number of genetic markers considered is typically large, we cluster them and then study the association between groups of markers and disease. We propose a two-step procedure. First, a Bayesian nonparametric cluster estimate under normalized generalized gamma process mixture models is introduced, so that the information from large-scale SNP data can be incorporated with a much smaller number of explanatory variables. Then, thanks to the introduction of a genetic score, we study the association between the relevant disease response and groups of markers using a logit model. Inference is obtained via an MCMC truncation method recently introduced in the literature. We also provide a review of the state of the art of Bayesian nonparametric cluster models and algorithms for the class of mixtures adopted here. Finally, the model is applied to a genome-wide association study of Crohn’s disease in a case-control setting.
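
The second step of the procedure (group-level genetic scores fed into a logit model) can be illustrated with a minimal sketch. It is not the paper's method: the marker-to-cluster labels are taken as given, where the paper estimates them with a Bayesian nonparametric mixture model; the score is assumed to be the sum of minor-allele counts within a group, a definition the abstract does not state; and the logit model is fitted by plain maximum likelihood rather than by MCMC. The names `genetic_scores` and `fit_logit` and the toy data are hypothetical.

```python
import math


def genetic_scores(genotypes, labels, n_groups):
    """Per-subject score for each marker group.

    genotypes: one row per subject, entries are minor-allele counts (0, 1, 2).
    labels:    cluster index of each marker (assumed to come from step one).
    Assumption: the score is the sum of allele counts within the group.
    """
    scores = []
    for row in genotypes:
        s = [0.0] * n_groups
        for count, k in zip(row, labels):
            s[k] += count
        scores.append(s)
    return scores


def fit_logit(X, y, lr=0.1, n_iter=2000):
    """Fit a logit model with intercept by gradient ascent on the log-likelihood.

    This is a maximum-likelihood stand-in for the Bayesian inference
    described in the abstract. Returns [intercept, coef_1, ..., coef_p].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            eta = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            resid = yi - 1.0 / (1.0 + math.exp(-eta))
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Toy data: 6 subjects, 4 markers, markers grouped into 2 clusters.
toy_genotypes = [
    [0, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 2, 0],
    [2, 1, 0, 0],
    [0, 0, 0, 2],
    [2, 2, 1, 1],
]
toy_labels = [0, 0, 1, 1]          # hypothetical cluster estimate
toy_status = [0, 0, 1, 1, 0, 0]    # 1 = case, 0 = control

toy_scores = genetic_scores(toy_genotypes, toy_labels, n_groups=2)
toy_beta = fit_logit(toy_scores, toy_status)
```

Replacing thousands of marker columns with one score per group is what lets the regression use far fewer explanatory variables than there are markers; a positive fitted coefficient would indicate that a higher score in that group goes with higher odds of disease.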
Original language | English |
---|---|
Title of host publication | Nonparametric Bayesian Methods in Biostatistics and Bioinformatics |
Pages | 115-134 |
Number of pages | 20 |
DOI | |
Publication status | Published - 2015 |