Context-Specific Bayesian Clustering for Gene Expression Data
Y. Barash and N. Friedman
In Proc. Fifth Annual Inter. Conf.
on Computational Molecular Biology (RECOMB 2001).
Abstract
The recent growth in genomic data and in measurements of genome-wide
expression patterns makes it possible to examine gene regulation by transcription
factors using computational tools. In this work, we present a class
of mathematical models that help in understanding the connections between
transcription factors and functional classes of genes based on genetic
and genomic data. These models represent the joint distribution of
transcription factor binding sites and of expression levels of a gene
in a single model. Learning a combined probability model of binding
sites and expression patterns enables us to improve the clustering of
the genes based on the discovery of putative binding sites and to
detect which binding sites and experiments best characterize a
cluster. To learn such models from data, we introduce a new search
method that rapidly learns a model according to a Bayesian score. We
evaluate our method on synthetic data as well as on real data and
analyze the biological insights it provides.
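
As an illustration of what a joint model of binding sites and expression levels can look like, here is a minimal sketch. It is not the model from the paper: it assumes a plain naive-Bayes-style mixture in which, given a gene's cluster, each putative binding site is an independent Bernoulli variable and each expression measurement is an independent Gaussian. The function names, the parameter layout, and all numeric values are hypothetical and chosen only for the example.

```python
import math

def log_joint(cluster, sites, expr, params):
    """Log of P(cluster, sites, expr) under the assumed factorization."""
    p = params[cluster]
    total = math.log(p["prior"])
    # Binding sites: assumed independent Bernoulli variables given the cluster.
    for present, theta in zip(sites, p["site_prob"]):
        total += math.log(theta if present else 1.0 - theta)
    # Expression levels: assumed independent Gaussians given the cluster.
    for x, (mu, sigma) in zip(expr, p["expr"]):
        total += -0.5 * math.log(2 * math.pi * sigma ** 2)
        total -= (x - mu) ** 2 / (2 * sigma ** 2)
    return total

def cluster_posterior(sites, expr, params):
    """Posterior over clusters for one gene, normalized in log space."""
    logs = [log_joint(c, sites, expr, params) for c in range(len(params))]
    m = max(logs)
    weights = [math.exp(v - m) for v in logs]
    z = sum(weights)
    return [w / z for w in weights]

# Hypothetical parameters: two clusters, two binding sites, two experiments.
PARAMS = [
    {"prior": 0.5, "site_prob": [0.9, 0.1], "expr": [(2.0, 1.0), (0.0, 1.0)]},
    {"prior": 0.5, "site_prob": [0.1, 0.9], "expr": [(-2.0, 1.0), (0.0, 1.0)]},
]

# A made-up gene: first site present, second absent, induced in experiment 1.
posterior = cluster_posterior([1, 0], [1.8, 0.1], PARAMS)
```

In this sketch the second experiment has the same distribution in both clusters, so it contributes nothing to the posterior. That loosely mirrors the idea of detecting which binding sites and experiments characterize a cluster, although the sketch does not learn that structure from data.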
nir@cs.huji.ac.il