Bayesian model selection in finite mixture regression
Finite mixture regression is a common method for accounting for heterogeneity in the relationship between a response variable and predictor variables. One goal of this dissertation is variable selection within each component of a finite mixture regression, which has not previously been studied from a Bayesian perspective. We propose an approach that embeds variable selection into the data augmentation method, iteratively updating the estimation in two steps: estimating the parameters of each component and determining the latent membership of each observation. Componentwise variable selection is achieved by imposing priors or procedures designed for parsimony in the first step. Because the two steps are separated, the approach offers the freedom to choose from a wide variety of variable selection techniques. In particular, we illustrate how four popular techniques can be implemented within the proposed approach: Stochastic Search Variable Selection, the g-prior, Reversible Jump Markov chain Monte Carlo, and the Bayesian LASSO. A simulation study assesses the performance of the approach under a variety of scenarios, examining the accuracy of variable selection and clustering along with several other diagnostics, including MCMC convergence tests and posterior predictive model checking. The simulations show that the approach successfully identifies important variables even in noisy scenarios. When applied to real datasets, the approach selects quite different variables across components, providing additional scientific insight.
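The two-step data augmentation scheme described above, with Stochastic Search Variable Selection in the parameter-update step, can be sketched as follows. This is a minimal illustration only: the simulated two-component data, the spike-and-slab variances `tau0`/`tau1`, the fixed error variance, and the number of iterations are all assumptions for the sketch, not the dissertation's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: two components with different active predictors (assumed setup).
n, p, K = 200, 4, 2
true_beta = np.array([[3.0, 0.0, 0.0, 0.0],
                      [0.0, 0.0, -3.0, 0.0]])
X = rng.normal(size=(n, p))
z_true = rng.integers(0, K, size=n)
y = np.einsum("ij,ij->i", X, true_beta[z_true]) + rng.normal(scale=0.5, size=n)

# SSVS spike-and-slab prior: beta_kj ~ (1-gamma_kj) N(0, tau0^2) + gamma_kj N(0, tau1^2).
tau0, tau1, sigma2 = 0.01, 5.0, 0.25        # assumed hyperparameters; sigma2 fixed for simplicity
beta = rng.normal(size=(K, p))              # random init to break label symmetry
gamma = np.ones((K, p), dtype=int)
pi_k = np.full(K, 1.0 / K)

def sample_z(y, X, beta, pi_k, sigma2, rng):
    """Step 2 of data augmentation: draw latent memberships given component parameters."""
    mu = X @ beta.T                                              # (n, K) component means
    logp = -0.5 * (y[:, None] - mu) ** 2 / sigma2 + np.log(pi_k)
    logp -= logp.max(axis=1, keepdims=True)
    prob = np.exp(logp)
    prob /= prob.sum(axis=1, keepdims=True)
    u = rng.random(len(y))
    return (prob.cumsum(axis=1) > u[:, None]).argmax(axis=1)     # inverse-CDF draw per row

for it in range(300):
    z = sample_z(y, X, beta, pi_k, sigma2, rng)
    for k in range(K):
        Xk, yk = X[z == k], y[z == k]
        # Step 1: componentwise SSVS update of (beta_k, gamma_k) given current memberships.
        D = np.where(gamma[k] == 1, tau1 ** 2, tau0 ** 2)        # current prior variances
        V = np.linalg.inv(Xk.T @ Xk / sigma2 + np.diag(1.0 / D))
        V = (V + V.T) / 2                                        # symmetrize for sampling
        m = V @ Xk.T @ yk / sigma2
        beta[k] = rng.multivariate_normal(m, V)
        # Update each inclusion indicator from its conditional (slab vs. spike) odds.
        for j in range(p):
            l1 = -0.5 * beta[k, j] ** 2 / tau1 ** 2 - np.log(tau1)
            l0 = -0.5 * beta[k, j] ** 2 / tau0 ** 2 - np.log(tau0)
            gamma[k, j] = rng.random() < 1.0 / (1.0 + np.exp(l0 - l1))
    # Update mixing weights from a Dirichlet(1, ..., 1) prior.
    pi_k = rng.dirichlet(1.0 + np.bincount(z, minlength=K))
```

Swapping the SSVS step for a g-prior, RJMCMC, or Bayesian-LASSO update would change only the inner block, which is the separation the paragraph above describes.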
The other goal of this dissertation is to determine the number of components K when it is unknown a priori, another model selection issue in finite mixture models. We propose a new method that borrows the reductive procedure of Sahu and Chen (2003) and employs a new distance measure based on posterior predictive replicates called "SWAP" replicates. These differ from commonly used posterior predictive replicates in that SWAP replicates are generated using parameters from other components. At each MCMC iteration, the measure is used to judge whether two components are close enough to be collapsed, and the posterior probability of collapsing then indicates whether there is strong evidence for reducing K. Simulation results show that the proposed method performs well and is more stable than AIC and BIC. When applied to real data, the method gives a reasonable estimate and agrees more closely with AIC than with other criteria.
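The abstract does not spell out the SWAP discrepancy, so the sketch below is one plausible reading: for each MCMC draw, generate an ordinary replicate of component k's data from its own parameters and a SWAP replicate from component l's parameters, compare the two mean-squared discrepancies, and report the fraction of draws in which the SWAP replicate fits nearly as well. The function name, the mean-squared discrepancy, and the threshold `eps` are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def swap_collapse_prob(y, X, z_draws, beta_draws, sigma2_draws, k, l, eps):
    """Fraction of MCMC draws in which component l's parameters reproduce
    component k's data about as well as k's own parameters do (a hedged
    reading of the SWAP replicate idea; eps is an assumed closeness margin)."""
    close = []
    for z, beta, s2 in zip(z_draws, beta_draws, sigma2_draws):
        idx = z == k
        if idx.sum() == 0:
            continue
        Xk, yk = X[idx], y[idx]
        # Ordinary replicate: component k's observations from k's own parameters.
        rep_own = Xk @ beta[k] + rng.normal(scale=np.sqrt(s2), size=idx.sum())
        # SWAP replicate: the same observations, but from component l's parameters.
        rep_swap = Xk @ beta[l] + rng.normal(scale=np.sqrt(s2), size=idx.sum())
        d_own = np.mean((yk - rep_own) ** 2)
        d_swap = np.mean((yk - rep_swap) ** 2)
        close.append(d_swap - d_own < eps)
    return float(np.mean(close))

# Stylized check with frozen draws: nearly identical components should collapse,
# clearly distinct ones should not (all values below are illustrative).
n, T = 100, 200
X = rng.normal(size=(n, 2))
z = np.zeros(n, dtype=int)                          # all observations in component 0
beta_close = np.array([[2.0, 0.0], [2.0, 0.0]])     # identical components
beta_far = np.array([[2.0, 0.0], [-2.0, 3.0]])      # well-separated components
y = X @ beta_close[0] + rng.normal(size=n)
draws = lambda beta: ([z] * T, [beta] * T, [1.0] * T)
p_same = swap_collapse_prob(y, X, *draws(beta_close), k=0, l=1, eps=1.0)
p_far = swap_collapse_prob(y, X, *draws(beta_far), k=0, l=1, eps=1.0)
```

A high collapse probability (as for `p_same`) would be the "strong evidence" the paragraph above describes for merging two components and reducing K by one.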