Clustering observations across partially exchangeable groups of data is a routine task in Bayesian nonparametrics. Previously proposed models allow for clustering across groups by sharing atoms in the group-specific mixing measures. However, exact atom sharing can be overly rigid when groups differ subtly, introducing a trade-off between clustering and density estimates and fragmenting across-group clusters, particularly at larger sample sizes. We introduce the hierarchical shot-noise Cox process (HSNCP) mixture model, where group-specific atoms concentrate around shared centers through a kernel. This enables accurate density estimation within groups and flexible borrowing across groups, overcoming the density-clustering trade-off of previous approaches. Our construction, built on the shot-noise Cox process, remains analytically tractable: we derive closed-form prior moments and the inter-group correlation, and we obtain the marginal law and predictive distribution of the latent parameters as well as the posterior of the mixing measures given those parameters. We develop an efficient conditional MCMC algorithm for posterior inference. We assess the performance of the HSNCP model through simulations and an application to a large galaxy dataset, demonstrating balanced across-group clusters and improved density estimates compared with the hierarchical Dirichlet process, including under model misspecification.