So, if I propose $\theta^*$ from $\text{Dirichlet}(\theta_i \times \tau)$, where $\tau$ is some scalar, does the Metropolis-Hastings ratio need to account for that scalar, i.e.
$$\text{Dirichlet}(\theta_i|\theta^*\times\tau) \big/ \text{Dirichlet}(\theta^*|\theta_i \times\tau)$$
or should I drop the $\tau$'s from the Metropolis-Hastings ratio? I went through the proof for the MH algorithm and this is not immediately clear.
For a function $f(\cdot)$ proportional to the probability density of interest, and a proposal function $g(\cdot\mid\cdot)$ proportional to a conditional density, the Metropolis-Hastings algorithm operates with the Markov kernel $K(\cdot\mid\cdot)$ defined as
$$K(\theta'|\theta)=\alpha(\theta'|\theta)g(\theta'|\theta)+(1-\bar\alpha(\theta))\delta_\theta(\theta')$$
where $\delta_\theta$ denotes the Dirac mass at $\theta$ and
$$\alpha(\theta'|\theta) =\dfrac{f(\theta')/f(\theta)}{g(\theta'|\theta)/g(\theta|\theta')}\wedge 1$$
with
$$\bar\alpha(\theta) = \int \left\{\dfrac{f(\theta')/f(\theta)}{g(\theta'|\theta)/g(\theta|\theta')}\wedge 1\right\} g(\theta'|\theta)\,\text{d}\theta'$$
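As an illustration of how this kernel is simulated, here is a minimal Python sketch of a single transition. The names `mh_step`, `log_f`, `propose` and `log_g` are hypothetical and chosen only for this example, and working on the log scale is a numerical convenience rather than part of the definition above.

```python
import math
import random


def mh_step(theta, log_f, propose, log_g, rng=random):
    """One Metropolis-Hastings transition starting from theta.

    log_f(x):        log of the unnormalised target density f
    propose(x, rng): draws theta' from g(. | x)
    log_g(y, x):     log of the proposal density g(y | x)
    Returns (next_state, accepted).
    """
    theta_new = propose(theta, rng)
    # log of [f(theta') / f(theta)] / [g(theta' | theta) / g(theta | theta')]
    log_ratio = (log_f(theta_new) - log_f(theta)
                 - log_g(theta_new, theta) + log_g(theta, theta_new))
    log_alpha = min(0.0, log_ratio)  # the "wedge 1" on the log scale
    # Accept with probability alpha; otherwise stay at theta,
    # which is the Dirac mass term of the kernel.
    if rng.random() < math.exp(log_alpha):
        return theta_new, True
    return theta, False
```

Note that `log_g` is evaluated in both directions, so any part of the proposal density that depends on the conditioning value enters the acceptance probability.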
Therefore, when $g(\theta'|\theta)$ corresponds to the density of the Dirichlet $\mathcal D_p(\tau\theta)$ distribution with $\tau>0$, the ratio $g(\theta'|\theta)/g(\theta|\theta')$ is
$$\frac{\Gamma(\tau\sum_{i=1}^p\theta_i)}{\Gamma(\tau\sum_{i=1}^p\theta'_i)}\prod_{i=1}^p \frac{\Gamma(\tau\theta'_i)\,(\theta'_i)^{\tau\theta_i-1}}{\Gamma(\tau\theta_i)\,\theta_i^{\tau\theta'_i-1}}$$
and there is no immediate simplification, so the scalar $\tau$ has to stay in the Metropolis-Hastings ratio.
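To check numerically that the scalar matters, the log of this proposal ratio can be evaluated for a few values of $\tau$. The following is a rough sketch using only the Python standard library. The function names and the two test points are made up for illustration, and the sampler relies on the standard construction of a Dirichlet draw as normalised independent Gamma variates.

```python
import math
import random


def dirichlet_logpdf(x, alpha):
    """Log density of Dirichlet(alpha) at x, with x in the open simplex."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def log_proposal_ratio(theta_new, theta, tau):
    """log of g(theta' | theta) / g(theta | theta') for the D_p(tau * theta) proposal."""
    forward = dirichlet_logpdf(theta_new, [tau * t for t in theta])
    backward = dirichlet_logpdf(theta, [tau * t for t in theta_new])
    return forward - backward


def propose(theta, tau, rng=random):
    """Draw theta' from D_p(tau * theta) by normalising independent Gamma variates."""
    draws = [rng.gammavariate(tau * t, 1.0) for t in theta]
    total = sum(draws)
    return [d / total for d in draws]


# Illustrative points on the simplex (assumed values, not from the question).
theta = [0.2, 0.3, 0.5]
theta_new = [0.25, 0.35, 0.40]
# The log ratio takes a different value for each tau.
for tau in (1.0, 10.0, 100.0):
    print(tau, log_proposal_ratio(theta_new, theta, tau))
```

For very small values of $\tau\theta_i$ a Gamma draw can underflow to zero, so a practical implementation would probably need to guard against proposals on the boundary of the simplex.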