In Bayesian probability theory, a marginal likelihood function is a likelihood function integrated over some variables, typically model parameters. "Integrated likelihood" is a synonym for marginal likelihood. "Evidence" is also sometimes used as a synonym, but this usage is somewhat idiosyncratic. "Marginal likelihood" is the most commonly used of these three terms.
For any likelihood function of two or more variables, marginal likelihoods with respect to any subset of the variables can be defined. Let a denote the subset of variables marginalized (i.e., integrated out). Let b denote the other variables. Let x denote observed data. Given the likelihood function p(x|a, b), the marginal likelihood of b is
$$p(x \mid b) = \int p(x \mid a, b)\, p(a \mid b)\, da,$$
where p(a|b) is the distribution of a conditional on b. The marginal likelihood of a is computed in an analogous way, by exchanging the roles of a and b.
In a widely used application, the marginalized variables are parameters for a particular type of model, and the remaining variable is the identity of the model itself. In this case, the marginal likelihood is the probability of the data given the model type, not assuming any particular model parameters. Writing θ for the model parameters, the marginal likelihood for the model M is
$$p(x \mid M) = \int p(x \mid \theta, M)\, p(\theta \mid M)\, d\theta.$$
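As an illustration of this integral, the following sketch evaluates the marginal likelihood for one case where it is available in closed form. The setup is an assumption made for the example, not part of the text above: the data x are k successes in n Bernoulli trials, θ is the success probability, and the prior p(θ|M) is a Beta(alpha, beta) distribution. The function names are hypothetical. A midpoint-rule integration over θ is included only as a cross-check of the closed form.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def marginal_likelihood(k, n, alpha=1.0, beta=1.0):
    """p(k | M) for k successes in n trials with theta ~ Beta(alpha, beta).

    Closed form: C(n, k) * B(k + alpha, n - k + beta) / B(alpha, beta).
    """
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    log_ml = log_binom + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)
    return math.exp(log_ml)

def midpoint_check(k, n, alpha=1.0, beta=1.0, steps=100_000):
    """Approximate the integral of p(k | theta) p(theta | M) with the midpoint rule."""
    h = 1.0 / steps
    prior_norm = math.exp(log_beta(alpha, beta))
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h  # midpoint of the i-th subinterval of (0, 1)
        likelihood = math.comb(n, k) * theta**k * (1 - theta)**(n - k)
        prior = theta**(alpha - 1) * (1 - theta)**(beta - 1) / prior_norm
        total += likelihood * prior * h
    return total

exact = marginal_likelihood(3, 10)   # uniform prior: equals 1 / (n + 1) = 1/11
approx = midpoint_check(3, 10)
print(exact, approx)
```

With the uniform prior (alpha = beta = 1) the marginal likelihood is 1/(n + 1) for every k, which shows that the result does not depend on any particular value of θ.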
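This quantity is important because the posterior odds ratio for a model M1 against another model M2 involves a ratio of marginal likelihoods, the so-called Bayes factor:
$$\frac{p(M_1 \mid x)}{p(M_2 \mid x)} = \frac{p(M_1)}{p(M_2)} \, \frac{p(x \mid M_1)}{p(x \mid M_2)},$$
which can be stated schematically as
$$\text{posterior odds} = \text{prior odds} \times \text{Bayes factor}.$$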
The Bayes factor is an object of central importance in Bayesian model comparison.
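A small sketch of a Bayes factor calculation may make the relation between posterior odds, prior odds, and marginal likelihoods concrete. The two models here are assumptions chosen for the example: M1 fixes the success probability of a coin at 0.5 (so it has no free parameters and its marginal likelihood is simply its likelihood), while M2 places a uniform prior on the success probability, for which the marginal likelihood of k successes in n trials is 1/(n + 1). All function names are hypothetical.

```python
import math

def ml_fixed(k, n, theta=0.5):
    """Marginal likelihood of M1: theta is fixed, so nothing is integrated out."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

def ml_uniform(k, n):
    """Marginal likelihood of M2: binomial likelihood integrated over a uniform prior."""
    return 1.0 / (n + 1)

def bayes_factor(k, n):
    """Bayes factor for M1 against M2: the ratio of their marginal likelihoods."""
    return ml_fixed(k, n) / ml_uniform(k, n)

def posterior_odds(k, n, prior_odds=1.0):
    """Posterior odds of M1 against M2 = prior odds * Bayes factor."""
    return prior_odds * bayes_factor(k, n)

# 3 successes in 10 trials: the data mildly favour the fixed-theta model.
print(bayes_factor(3, 10))
# 9 successes in 10 trials: the data favour the model with a free theta.
print(bayes_factor(9, 10))
```

A Bayes factor above 1 favours M1 and a value below 1 favours M2. With equal prior probabilities for the two models, the posterior odds equal the Bayes factor.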
Unfortunately, marginal likelihoods are generally difficult to compute. Exact solutions are known only for a small class of distributions. In general, some kind of numerical integration method is needed: either a general method such as Gaussian quadrature or Monte Carlo integration, or a method specialized to statistical problems such as the Laplace approximation or Gibbs sampling.
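Two of the approaches mentioned above can be sketched on a toy problem. The example below is an illustration under stated assumptions rather than a recommended implementation: the data are k successes in n Bernoulli trials, the prior on θ is uniform on (0, 1), and the exact answer 1/(n + 1) is therefore known for comparison. The first function is a simple Monte Carlo estimate that averages the likelihood over draws from the prior; the second is a Laplace approximation around the mode θ = k/n, which requires 0 < k < n. Function names, sample size, and seed are hypothetical choices.

```python
import math
import random

def likelihood(theta, k, n):
    """Binomial likelihood p(k | theta) for k successes in n trials."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

def monte_carlo_ml(k, n, samples=200_000, seed=0):
    """Estimate p(k | M) by averaging the likelihood over draws from the uniform prior."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        total += likelihood(rng.random(), k, n)
    return total / samples

def laplace_ml(k, n):
    """Laplace approximation to p(k | M) under a uniform prior (needs 0 < k < n).

    The log integrand is expanded to second order around its mode theta = k / n,
    and the resulting Gaussian is integrated analytically.
    """
    mode = k / n
    # Magnitude of the second derivative of the log integrand at the mode.
    curvature = k / mode**2 + (n - k) / (1 - mode)**2
    return likelihood(mode, k, n) * math.sqrt(2 * math.pi / curvature)

exact = 1.0 / 11.0                 # closed form for n = 10 with a uniform prior
print(exact, monte_carlo_ml(3, 10), laplace_ml(3, 10))
```

In this small example the Monte Carlo estimate approaches the exact value as the number of samples grows, while the Laplace approximation carries a fixed error of several percent because the integrand is not exactly Gaussian. Sampling from the prior becomes inefficient when the likelihood is sharply peaked relative to the prior, which is one reason more specialized methods are used in practice.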
- Charles S. Bos. "A comparison of marginal likelihood computation methods". In W. Härdle and B. Ronz, editors, COMPSTAT 2002: Proceedings in Computational Statistics, pp. 111-117. 2002. (Also available as a preprint on the web.)
- The on-line textbook: Information Theory, Inference, and Learning Algorithms, by David J.C. MacKay.