Uncovering gene regulatory network using sparse Bayesian factor model
The response of cells to changing endogenous or exogenous conditions is governed by intricate networks of gene regulation, most notably regulation by transcription factors (TFs) and microRNAs (miRNAs). Due to technical limitations, the protein-level expression of TFs is difficult to measure, which makes computational reconstruction of transcriptional networks a difficult task.
In this dissertation, the author proposes a novel Bayesian factor model for uncovering transcriptional networks regulated by TFs from microarray data, in which the TF activities are modeled as the unknown factors. To enable accurate estimation of the TF activities and to model the sparsity of TF regulation, prior knowledge of TF regulation is integrated within the Bayesian framework and modeled by a spike-slab prior. Three particular aspects of TF regulation are investigated. First, the author investigates correlated TF activities, where the correlation among non-negative TF activities is modeled by a Dirichlet mixture of rectified Gaussian distributions (DMrG). Second, the author investigates the modeling of the clustering effect among biological samples, which arises, for instance, when samples come from patients with the same cancer subtype. To this end, a DMrG prior is placed on the data samples. Third, the author investigates the modeling of cooperative transcriptional regulation by TFs and miRNAs, for which a hybrid Bayesian factor model is proposed. For all three investigations, the author develops the respective Gibbs sampling solutions for model parameter inference. The validity and effectiveness of the proposed Gibbs sampling solutions are demonstrated on simulated systems. The developed models are then applied to cancer expression profiling data.
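As an illustration of the kind of generative structure described above, the following minimal sketch simulates expression data from a sparse factor model. It is not the dissertation's implementation: the function name, the parameter values, the linear form Y = AX + E, and the use of a single rectified Gaussian in place of the DMrG mixture are all assumptions made for brevity.

```python
import numpy as np

def simulate_sparse_factor_model(n_genes=50, n_tfs=5, n_samples=20,
                                 prior_prob=0.2, noise_sd=0.1, seed=0):
    """Simulate Y = A @ X + E under illustrative (assumed) priors.

    A: gene-by-TF loading matrix with a spike-slab prior.
    X: TF-by-sample activities, non-negative (rectified Gaussian).
    """
    rng = np.random.default_rng(seed)

    # Spike-slab loadings: each entry is exactly zero (spike) unless its
    # indicator z is on, in which case it is drawn from a Gaussian slab.
    # prior_prob stands in for prior knowledge of TF-gene regulation.
    z = rng.random((n_genes, n_tfs)) < prior_prob
    slab = rng.normal(0.0, 1.0, size=(n_genes, n_tfs))
    A = np.where(z, slab, 0.0)

    # Non-negative TF activities: a single rectified Gaussian, max(0, N(mu, sd)).
    # Assumption: the dissertation uses a Dirichlet mixture of these (DMrG).
    X = np.maximum(0.0, rng.normal(0.5, 1.0, size=(n_tfs, n_samples)))

    # Additive Gaussian measurement noise on the expression data.
    E = rng.normal(0.0, noise_sd, size=(n_genes, n_samples))
    Y = A @ X + E
    return Y, A, X, z

if __name__ == "__main__":
    Y, A, X, z = simulate_sparse_factor_model()
    print("expression matrix shape:", Y.shape)
    print("fraction of non-zero loadings:", z.mean())
```

In a setting like this, inference would run in the opposite direction: given only Y and prior regulation probabilities, a Gibbs sampler would alternate between drawing the indicators, the loadings, and the activities from their conditional distributions.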
The novelty and significance of this work are threefold. First, a Bayesian sparse factor modeling framework is proposed to model TF-mediated regulation in the absence of knowledge of TF activities. Second, based on this framework, three different aspects of TF regulation are investigated: the correlated behavior of transcription factors, the clustering effect among biological samples, and the cooperative transcriptional regulation by both transcription factors and miRNAs. These are timely and open questions in molecular biology. Third, the application of the proposed models to cancer profiling data is new and suggests an alternative method for personalized cancer prognosis and diagnosis.