Markov-Chain Monte Carlo Sampling | Fernando Villanea


Markov-Chain Monte Carlo (MCMC) is a sampling algorithm that has enabled modern Bayesian demographic methods by providing an approximation of the likelihood equations that is more practical to compute. A typical MCMC algorithm begins by sampling a genealogy, tests its fit to the data, and then proposes a new genealogy by making a small random change to the topology of the genealogical tree. The chain accepts the new genealogy with a probability calculated from the ratio of the Bayesian posteriors of the new genealogy, P(x’), and the old genealogy, P(x). If the new posterior is at least as high (P(x’)/P(x) ≥ 1), the chain adopts the new genealogy and abandons the previous one. If the new posterior is lower (P(x’)/P(x) < 1), the chain may still accept the new genealogy, with a probability set by the ratio (in the standard Metropolis rule, equal to it). This is critical in ensuring that the posterior distribution is composed of a mix of diverse genealogies, each represented in proportion to its posterior probability, in true Bayesian form.
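The accept-or-reject step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used by any particular demographic software: the "genealogies" are replaced by a toy set of five states with made-up posterior weights, the proposal is assumed to be symmetric (so the plain Metropolis rule applies), and all names (`metropolis_accept`, `run_chain`, `WEIGHTS`, `log_post`, `propose`) are hypothetical.

```python
import math
import random


def metropolis_accept(log_post_new, log_post_old, rng):
    """Decide whether the chain moves to the proposed state.

    Works on log posteriors, so the ratio P(x')/P(x) becomes a difference.
    """
    log_ratio = log_post_new - log_post_old
    if log_ratio >= 0:
        # Ratio >= 1: the proposal is at least as good, always accept.
        return True
    # Ratio < 1: accept with probability equal to the ratio.
    return rng.random() < math.exp(log_ratio)


def run_chain(log_post, propose, start, n_steps, seed=0):
    """Run one chain and return every visited state, in order."""
    rng = random.Random(seed)
    state = start
    visited = []
    for _ in range(n_steps):
        candidate = propose(state, rng)
        if metropolis_accept(log_post(candidate), log_post(state), rng):
            state = candidate
        # A rejected proposal means the current state is visited again.
        visited.append(state)
    return visited


# Toy stand-in for genealogies: five states with assumed, unnormalized
# posterior weights. State 2 holds 4/10 of the total weight.
WEIGHTS = [1.0, 2.0, 4.0, 2.0, 1.0]


def log_post(state):
    return math.log(WEIGHTS[state])


def propose(state, rng):
    # Small random change: step to a neighbouring state on a ring.
    # Stepping left or right is equally likely, so the proposal is symmetric.
    return (state + rng.choice((-1, 1))) % len(WEIGHTS)


visited = run_chain(log_post, propose, start=0, n_steps=50_000)
share_of_best = visited.count(2) / len(visited)
print(f"share of steps spent in state 2: {share_of_best:.3f}")
```

Because worse states are sometimes accepted, the chain does not simply climb to state 2 and stay there. Instead, the share of steps spent in each state should approach that state's share of the total weight (about 4/10 for state 2) as the chain lengthens.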

MCMC samples step by step, starting at a random genealogy and generally moving toward better-fitting ones. This is far from a perfect process, for two main reasons. First, an individual chain may get “stuck” moving between closely related genealogies, which cannot be considered truly independent for the purposes of parameter estimation. For this reason, MCMC analysis records only a fraction of the visited trees; for example, only every 1,000th tree may contribute to the posterior distribution, on the assumption that the intervening steps let the chain reach less closely related (though still not truly independent) genealogies. Second, each chain starts from a random point, which may not be a very likely tree. For this reason, MCMC analysis often discards the first few thousand steps (called burn-in) and includes only later sampled genealogies in the distribution, on the assumption that the chain will have moved to a likelier group of trees by then. To further alleviate these limitations, a typical analysis will run thousands of chains, each contributing to the posterior distribution of genealogies. The posterior distribution generated by MCMC sampling should have the same shape as the true posterior distribution (one composed of all possible genealogies) while being composed of far fewer genealogies, which makes the likelihood calculations computationally practical.
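Burn-in, thinning, and pooling across chains amount to simple bookkeeping on the list of visited states. The sketch below assumes each chain is stored as a Python list in visiting order. The function names (`keep_samples`, `pool_chains`) and the parameters `burn_in` and `thin` are hypothetical, and suitable values depend on the analysis.

```python
def keep_samples(visited, burn_in, thin):
    """Drop the first `burn_in` states, then keep every `thin`-th state."""
    return visited[burn_in::thin]


def pool_chains(chains, burn_in, thin):
    """Apply the same burn-in and thinning to each chain and pool the rest."""
    pooled = []
    for visited in chains:
        pooled.extend(keep_samples(visited, burn_in, thin))
    return pooled


# Tiny illustration with integers standing in for sampled genealogies.
chain_a = list(range(10))        # states 0..9
chain_b = list(range(100, 110))  # states 100..109
print(keep_samples(chain_a, burn_in=4, thin=2))            # [4, 6, 8]
print(pool_chains([chain_a, chain_b], burn_in=4, thin=3))  # [4, 7, 104, 107]
```

With the figures mentioned in the text, a call might look like `keep_samples(visited, burn_in=5000, thin=1000)`, where 5,000 is only an assumed stand-in for "the first few thousand steps".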