Bayesian nonparametric clustering and association studies for candidate SNP observations

C. Wang, F. Ruggeri, C.K. Hsiao, Raffaele Argiento*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Clustering is often considered the first step in the analysis when dealing with an enormous amount of Single Nucleotide Polymorphism (SNP) genotype data. The lack of biological information could affect the outcome of such a procedure. Even if a clustering procedure has been selected and performed, the impact of its uncertainty on the subsequent association analysis is rarely assessed. In this research we first propose a model to cluster SNP data, then we assess the association between the clusters and a disease. In particular, we adopt a Dirichlet process mixture model, which has two advantages over the usual clustering methods: the number of clusters need not be known and fixed in advance, and the variation in the assignment of SNPs to clusters can be accounted for. In addition, once a clustering of SNPs is obtained, we design an individualized genetic score quantifying the SNP composition in each cluster for every subject, so that we can set up a generalized linear model for association analysis that incorporates the information from a large-scale SNP dataset with a much smaller number of explanatory variables. The inference on cluster allocation, the strength of association of each cluster (the collective effect of SNPs in the same cluster), and the susceptibility of each SNP are based on posterior samples from Markov chain Monte Carlo methods and the Binder loss information. We exemplify this Bayesian nonparametric strategy in a genome-wide association study of Crohn’s disease in a case-control setting.
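As an illustration of how a single clustering can be extracted from posterior samples under Binder loss, the following is a minimal sketch and not the authors' implementation. It assumes equal misclassification costs and restricts the search to the sampled partitions; the function names and the toy allocation draws are hypothetical.

```python
import numpy as np

def posterior_similarity(samples):
    """Estimate pairwise co-clustering probabilities.

    samples: (n_draws, n_items) array of integer cluster labels,
    one row per MCMC draw (hypothetical input format).
    """
    samples = np.asarray(samples)
    # same[t, i, j] is True when items i and j share a cluster in draw t
    same = samples[:, :, None] == samples[:, None, :]
    return same.mean(axis=0)

def binder_loss(labels, psm):
    """Expected Binder loss of one partition, assuming equal costs.

    For each pair, the loss is (1 - p_ij) if the pair is clustered
    together and p_ij otherwise, i.e. |co_ij - p_ij|.
    """
    labels = np.asarray(labels)
    co = (labels[:, None] == labels[None, :]).astype(float)
    upper = np.triu_indices(len(labels), k=1)  # each pair counted once
    return np.abs(co[upper] - psm[upper]).sum()

def binder_estimate(samples):
    """Return the sampled partition with the smallest expected Binder loss."""
    samples = np.asarray(samples)
    psm = posterior_similarity(samples)
    losses = np.array([binder_loss(draw, psm) for draw in samples])
    return samples[int(np.argmin(losses))], psm

# Toy allocation draws for four items (made up for illustration only)
draws = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 2, 2],
])
estimate, psm = binder_estimate(draws)
print(estimate)
```

Searching only among the sampled partitions is a common shortcut; the abstract does not say whether the authors optimize over a wider set of partitions.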
Original language: English
Pages (from-to): 19-35
Number of pages: 17
Journal: International Journal of Approximate Reasoning
Volume: 80
Issue number: n/a
Publication status: Published - 2017

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Artificial Intelligence
  • Applied Mathematics

Keywords

  • Bayesian Clustering
  • Bayesian Nonparametric
  • Dirichlet process mixture model
  • GWAS
  • Logistic regression
  • Random partitions
