Thursday, October 28, 2004

UCSB CS Colloquium

UCSB CS Colloquium: "Eric Xing
University of California Berkeley

Date: Monday, April 12
Time: 3:00-4:00
Place: Engineering 1, 2114

I discuss two probabilistic modeling problems arising in metazoan genomic analysis: identifying motifs and cis-regulatory modules (CRMs) from transcriptional regulatory DNA sequences, and inferring haplotypes from genotypes of single nucleotide polymorphisms.

Motif and CRM identification is important for understanding the gene regulatory network underlying metazoan development and functioning. I discuss a modular Bayesian model that captures rich structural characteristics of the transcriptional regulatory sequences and supports a variety of tasks such as learning motif representations, model-based motif and CRM prediction, and de novo motif detection.

Haplotype inference is essential for the understanding of genetic variation within and among populations, with important applications to the genetic analysis of disease propensities and other complex traits. I discuss a Bayesian model based on a prior constructed from a Chinese restaurant process -- a nonparametric prior which provides control over the size of the unknown pool of population haplotypes, and on a likelihood that allows statistical errors in the haplotype/genotype relationship.

Our models use the 'probabilistic graphical model' formalism, a formalism that exploits the conjoined talents of graph theory and probability theory to build complex models out of simpler pieces. I discuss the mathematical underpinnings for the models, how they formally incorporate biological prior knowledge about the data, and the related computational issues.
Eric Xing received his B.S. with honors in Physics and Biology from Tsinghua University, his Ph.D. in Molecular Biology and Biochemistry "
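
For readers unfamiliar with the Chinese restaurant process mentioned in the abstract, here is a minimal Python sketch of how one draw from it works. This is not the speaker's model, only the generic prior: each new item joins an existing group with probability proportional to that group's size, or starts a new group with probability proportional to a concentration parameter. The function name `sample_crp` and the parameters `n_items`, `alpha`, and `seed` are my own illustrative choices. In the haplotype setting one could loosely read "groups" as distinct population haplotypes, but that mapping is an assumption on my part.

```python
import random


def sample_crp(n_items, alpha, seed=None):
    """Draw one random partition of n_items from a Chinese restaurant process.

    alpha is the concentration parameter: larger values tend to give more groups.
    Returns (assignments, counts), where assignments[i] is the group index of
    item i and counts[k] is the number of items in group k.
    """
    if n_items < 0 or alpha <= 0:
        raise ValueError("n_items must be >= 0 and alpha must be > 0")

    rng = random.Random(seed)
    assignments = []
    counts = []

    for _ in range(n_items):
        # With i items already placed, existing group k is chosen with
        # probability counts[k] / (i + alpha) and a new group with
        # probability alpha / (i + alpha). random.choices normalizes
        # the weights for us.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new group
        else:
            counts[k] += 1    # join an existing group
        assignments.append(k)

    return assignments, counts


if __name__ == "__main__":
    assignments, counts = sample_crp(20, alpha=1.0, seed=0)
    print("group sizes:", counts)
```

The number of groups is not fixed in advance; it grows slowly with the number of items, which is the sense in which the abstract describes the prior as controlling the size of an unknown pool.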

1 comment:

stars2man said...

BioMed Central | Full text | Computational detection of genomic cis-regulatory modules applied to body patterning in the early Drosophila embryo: "Predicting and understanding transcriptional regulation is a fundamental problem in biology. We have designed new algorithms for the detection of cis-regulatory modules in the genomes of higher eukaryotes which is a first step in unraveling transcriptional regulatory networks. We have demonstrated, in the case of body patterning in the Drosophila embryo, that our algorithms allow the genome-wide identification of regulatory modules when the motifs for the transcription factors are known (algorithm Ahab), or when only related modules are known (customized Gibbs sampler in conjunction with Ahab), or when only genomic sequence is analyzed with Argos. We believe that Ahab overcomes many problems of recent studies and we estimated the false positive rate to be about 50%. Argos is the first successful attempt to predict regulatory modules using only the genome without training data. All our results and module predictions across the Drosophila genome are available at The Ahab code is available upon request from the authors."