# 3. Gibbs Sampling

Blei's original LDA paper ("Latent Dirichlet Allocation") gives the posterior distribution of the latent variables as follows. The core inference problem is: given a document, how do we infer the latent variables?

$p\left( {\theta ,z|w,\alpha ,\beta } \right) = {{p\left( {\theta ,z,w|\alpha ,\beta } \right)} \over {p\left( {w|\alpha ,\beta } \right)}}$

$p\left( {\theta ,z,w|\alpha ,\beta } \right) = p\left( {\theta |\alpha } \right)\prod\limits_{n = 1}^N {p\left( {{z_n}|\theta } \right)p\left( {{w_n}|{z_n},\beta } \right)}$

$p\left( {w|\alpha ,\beta } \right) = {{\Gamma \left( {\sum\nolimits_i {{\alpha _i}} } \right)} \over {\prod\nolimits_i {\Gamma \left( {{\alpha _i}} \right)} }}\int {\left( {\prod\limits_{i = 1}^k {\theta _i^{{\alpha _i} - 1}} } \right)\left( {\prod\limits_{n = 1}^N {\sum\limits_{i = 1}^k {\prod\limits_{j = 1}^V {{{\left( {{\theta _i}{\beta _{ij}}} \right)}^{w_n^j}}} } } } \right)d\theta }$

OK. Given a document collection, $w$ is the observed variable, and $\alpha$ and $\beta$ are prior parameters set from experience; the remaining variables $z$, $\theta$, and $\varphi$ are latent and must be estimated from the observations. From LDA's graphical model, the joint distribution of all variables can be written as:

$p\left( {{{\vec w}_m},{{\vec z}_m},{{\vec \theta }_m},\Phi |\vec \alpha ,\vec \beta } \right) = \left( {\prod\limits_{n = 1}^{{N_m}} {p\left( {{w_{m,n}}|{{\vec \varphi }_{{z_{m,n}}}}} \right)p\left( {{z_{m,n}}|{{\vec \theta }_m}} \right)} } \right) \cdot p\left( {{{\vec \theta }_m}|\vec \alpha } \right) \cdot p\left( {\Phi |\vec \beta } \right)$

Note: in the formula above and in what follows, ${z_{m,n}}$ is equivalent to the ${z_{i,j}}$ defined earlier, ${w_{m,n}}$ to ${w_{i,j}}$, ${\vec \varphi _{{z_{m,n}}}}$ to ${\phi _{{z_{i,j}}}}$, and ${\theta _m}$ to ${\theta _i}$.


Conditioned on the topic assignments and $\Phi$, the corpus likelihood factorizes over all $W$ word tokens:

$p\left( {\vec w|\vec z,\Phi } \right) = \prod\limits_{i = 1}^W {p\left( {{w_i}|{z_i}} \right)} = \prod\limits_{i = 1}^W {{\varphi _{{z_i},{w_i}}}}$

Define the Dirichlet delta function, the normalization constant of the Dirichlet distribution:

$\Delta \left( {\vec \alpha } \right) = \int {\prod\limits_{k = 1}^{\dim \vec \alpha } {p_k^{{\alpha _k} - 1}} \,d\vec p}$

Similarly, the probability of the topic assignments given $\Theta$ factorizes over documents and topics:

$p\left( {\vec z|\Theta } \right) = \prod\limits_{i = 1}^W {p\left( {{z_i}|{d_i}} \right)} = \prod\limits_{m = 1}^M {\prod\limits_{k = 1}^K {p\left( {{z_i} = k|{d_i} = m} \right)} } = \prod\limits_{m = 1}^M {\prod\limits_{k = 1}^K {\theta _{m,k}^{n_m^{\left( k \right)}}} }$

Integrating out $\Theta$ and $\Phi$ then gives the joint distribution of topic assignments and words as a product of Dirichlet delta ratios:

$p\left( {\vec z,\vec w|\vec \alpha ,\vec \beta } \right) = \prod\limits_{z = 1}^K {{{\Delta \left( {{{\vec n}_z} + \vec \beta } \right)} \over {\Delta \left( {\vec \beta } \right)}}} \prod\limits_{m = 1}^M {{{\Delta \left( {{{\vec n}_m} + \vec \alpha } \right)} \over {\Delta \left( {\vec \alpha } \right)}}}$
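This joint is convenient to evaluate in log space, using $\log \Delta(\vec \alpha) = \sum\nolimits_k \log \Gamma(\alpha_k) - \log \Gamma(\sum\nolimits_k \alpha_k)$. Below is a minimal sketch (plain Python; the function and variable names are my own, not from the cited papers) that computes $\log p(\vec z,\vec w|\vec \alpha,\vec \beta)$ from the count matrices:

```python
from math import lgamma

def log_delta(alpha):
    # log Δ(α) = Σ_k log Γ(α_k) − log Γ(Σ_k α_k)
    return sum(lgamma(a) for a in alpha) - lgamma(sum(alpha))

def log_joint(n_mk, n_kt, alpha, beta):
    """log p(z, w | α, β) from the count matrices:
    n_mk[m][k] = tokens in document m assigned to topic k,
    n_kt[k][t] = occurrences of term t assigned to topic k."""
    lp = 0.0
    # topic-term part: Σ_k [ log Δ(n_k + β) − log Δ(β) ]
    for counts in n_kt:
        lp += log_delta([c + b for c, b in zip(counts, beta)]) - log_delta(beta)
    # doc-topic part: Σ_m [ log Δ(n_m + α) − log Δ(α) ]
    for counts in n_mk:
        lp += log_delta([c + a for c, a in zip(counts, alpha)]) - log_delta(alpha)
    return lp
```

Each ratio $\Delta(\vec n + \vec \beta)/\Delta(\vec \beta)$ is at most 1, so the log joint is always non-positive.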

Given the assignments ${{\vec z}_m}$, the posterior of each document's topic distribution is again a Dirichlet:

$p\left( {{{\vec \theta }_m}|{{\vec z}_m},\vec \alpha } \right) = {1 \over {{Z_{{\theta _m}}}}}\prod\limits_{n = 1}^{{N_m}} {p\left( {{z_{m,n}}|{{\vec \theta }_m}} \right)} \cdot p\left( {{{\vec \theta }_m}|\vec \alpha } \right) = Dir\left( {{{\vec \theta }_m}|{{\vec n}_m} + \vec \alpha } \right)$

Similarly, the posterior of each topic's word distribution is a Dirichlet:

$p\left( {{{\vec \varphi }_k}|\vec z,\vec w,\vec \beta } \right) = {1 \over {{Z_{{\varphi _k}}}}}\prod\limits_{\left\{ {i:{z_i} = k} \right\}} {p\left( {{w_i}|{{\vec \varphi }_k}} \right) \cdot p\left( {{{\vec \varphi }_k}|\vec \beta } \right)} = Dir\left( {{{\vec \varphi }_k}|{{\vec n}_k} + \vec \beta } \right)$

Using the expectation of the Dirichlet distribution,

$E\left( {\vec p} \right) = \left( {{{{\alpha _1}} \over {\sum\nolimits_{i = 1}^K {{\alpha _i}} }},{{{\alpha _2}} \over {\sum\nolimits_{i = 1}^K {{\alpha _i}} }},...,{{{\alpha _K}} \over {\sum\nolimits_{i = 1}^K {{\alpha _i}} }}} \right)$

the posterior means give point estimates of the document-topic and topic-word parameters:

${\theta _{m,k}} = {{n_m^{\left( k \right)} + {\alpha _k}} \over {\sum\nolimits_{k = 1}^K {n_m^{\left( k \right)} + {\alpha _k}} }}$

${\varphi _{k,t}} = {{n_k^{\left( t \right)} + {\beta _t}} \over {\sum\nolimits_{t = 1}^V {n_k^{\left( t \right)} + {\beta _t}} }}$
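Both estimators can be read straight off the count matrices. A small sketch (plain Python; the counts and names are illustrative, not from the cited papers):

```python
def estimate_theta(n_mk, alpha):
    """θ_{m,k} = (n_m^(k) + α_k) / Σ_k (n_m^(k) + α_k)."""
    return [[(c + a) / sum(ci + ai for ci, ai in zip(row, alpha))
             for c, a in zip(row, alpha)] for row in n_mk]

def estimate_phi(n_kt, beta):
    """φ_{k,t} = (n_k^(t) + β_t) / Σ_t (n_k^(t) + β_t)."""
    return [[(c + b) / sum(ci + bi for ci, bi in zip(row, beta))
             for c, b in zip(row, beta)] for row in n_kt]
```

Each row of the result is a normalized distribution, so it sums to 1 by construction.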

Finally, the full conditional for a single topic assignment, where $\neg i$ denotes counts with token $i$ excluded, is:

$p\left( {{z_i} = k|{{\vec z}_{\neg i}},\vec w} \right) \propto {{n_{m,\neg i}^{\left( k \right)} + {\alpha _k}} \over {\sum\nolimits_{k = 1}^K {n_{m,\neg i}^{\left( k \right)} + {\alpha _k}} }} \cdot {{n_{k,\neg i}^{\left( t \right)} + {\beta _t}} \over {\sum\nolimits_{t = 1}^V {n_{k,\neg i}^{\left( t \right)} + {\beta _t}} }}$
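Sampling each $z_i$ from this conditional while maintaining the count matrices yields the collapsed Gibbs sampler for LDA. The sketch below is a minimal illustration under simplifying assumptions (symmetric scalar priors $\alpha$ and $\beta$, plain Python lists, names of my own choosing), not a production implementation:

```python
import random

def gibbs_lda(docs, K, V, alpha, beta, n_iter=100, seed=0):
    """Collapsed Gibbs sampling for LDA with symmetric priors.
    docs: list of documents, each a list of term ids in [0, V)."""
    rng = random.Random(seed)
    n_mk = [[0] * K for _ in docs]       # doc-topic counts n_m^(k)
    n_kt = [[0] * V for _ in range(K)]   # topic-term counts n_k^(t)
    n_k = [0] * K                        # tokens assigned to each topic
    z = [[0] * len(d) for d in docs]
    # random initialization of topic assignments
    for m, doc in enumerate(docs):
        for i, t in enumerate(doc):
            k = rng.randrange(K)
            z[m][i] = k
            n_mk[m][k] += 1; n_kt[k][t] += 1; n_k[k] += 1
    for _ in range(n_iter):
        for m, doc in enumerate(docs):
            for i, t in enumerate(doc):
                # remove token i from the counts (the "¬i" statistics)
                k = z[m][i]
                n_mk[m][k] -= 1; n_kt[k][t] -= 1; n_k[k] -= 1
                # unnormalized full conditional p(z_i = k | z_¬i, w)
                w = [(n_mk[m][j] + alpha) * (n_kt[j][t] + beta) / (n_k[j] + V * beta)
                     for j in range(K)]
                # draw the new topic by inverse-CDF sampling
                r = rng.random() * sum(w)
                k = K - 1
                for j, wj in enumerate(w):
                    r -= wj
                    if r <= 0:
                        k = j
                        break
                z[m][i] = k
                n_mk[m][k] += 1; n_kt[k][t] += 1; n_k[k] += 1
    return z, n_mk, n_kt
```

After the loop, the count matrices can be plugged into the ${\theta _{m,k}}$ and ${\varphi _{k,t}}$ estimators above to read off the learned distributions.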

# References

1. Gregor Heinrich, "Parameter Estimation for Text Analysis": http://www.arbylon.net/publications/text-est.pdf

2. rickjin, "LDA数学八卦" (LDA Math Notes): http://vdisk.weibo.com/s/q0sGh/1360334108?utm_source=weibolife

3. 马晨 (Ma Chen), "LDA算法漫游指南" (A Roaming Guide to the LDA Algorithm): https://yuedu.baidu.com/ebook/d0b441a8ccbff121dd36839a.html
