Bayesian Estimation

- Introduction to Bayesian Estimation
- Bayes' Theorem
- Choice of a prior distribution
- Point Estimation
- Bayesian Credible Interval
For two events $A$ and $B$ with positive probability,
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$
Example
Suppose we have one fair coin and one biased coin which lands Heads with probability $3/4$. One of the coins is picked at random and flipped three times. It lands Heads all three times. Given this information, what is the probability that the coin which was picked is the fair one?
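A solution sketch via Bayes' theorem, writing $F$ for the event that the fair coin was picked (a label of our own) and $HHH$ for three Heads:
$$P(F \mid HHH) = \frac{P(HHH \mid F)\,P(F)}{P(HHH \mid F)\,P(F) + P(HHH \mid F^c)\,P(F^c)} = \frac{\frac{1}{2}\left(\frac{1}{2}\right)^{3}}{\frac{1}{2}\left(\frac{1}{2}\right)^{3} + \frac{1}{2}\left(\frac{3}{4}\right)^{3}} = \frac{8}{35} \approx 0.23.$$
Even after three Heads in a row, the picked coin is the fair one with probability about $0.23$, because the prior put equal mass on both coins.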
Suppose now we have many coins with different head probabilities. We have prior information that most of the coins are close to being unbiased. How can we incorporate such prior information in our inference for the parameter?
Idea: treat a parameter $\theta$ as a random variable. Make an inference for $\theta$ based on the distribution of $\theta$ given the observed data (a.k.a. the posterior distribution).
- $\pi(\theta)$ is the pmf/pdf of the prior distribution of $\theta$.
- $f(x \mid \theta)$ is the conditional pmf/pdf of $X$ given $\theta$. Viewed as a function of $\theta$, this is the likelihood function of $\theta$.
- $\pi(\theta \mid x)$ is the conditional pmf/pdf of $\theta$ given the observed data $x$. This is the pmf/pdf of the posterior distribution of $\theta$ given the data.
By Bayes' theorem,
$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\,\pi(\theta)}{\int f(x \mid t)\,\pi(t)\,dt} \propto f(x \mid \theta)\,\pi(\theta),$$
with the integral replaced by a sum when the prior is discrete.
Common ways to choose a prior distribution:
- A prior based on existing information (e.g., previous studies).
- A non-informative prior: a prior that carries little or no information about $\theta$ (e.g., a flat prior).
- A conjugate prior: a prior for which the posterior belongs to the same family of distributions as the prior (illustrated below).
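As an illustration of conjugacy, consider the many-coins setting above (a sketch in our own notation; the Beta-binomial pair is not used elsewhere in these notes). If the Heads probability $\theta$ has a Beta prior and we observe $x$ Heads in $n$ flips, then
$$\theta \sim \mathrm{Beta}(a, b), \quad X \mid \theta \sim \mathrm{Bin}(n, \theta) \quad\Longrightarrow\quad \pi(\theta \mid x) \propto \theta^{x}(1-\theta)^{n-x} \cdot \theta^{a-1}(1-\theta)^{b-1},$$
so $\theta \mid X = x \sim \mathrm{Beta}(a + x,\, b + n - x)$: the posterior is again a Beta distribution. Taking $a = b$ large gives a prior concentrated near $1/2$, one way to encode the belief that most coins are close to fair.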
A useful trick for posterior distribution computations: note that a pmf/pdf is completely specified by the part of the pmf/pdf which involves the input variable. Thus we first find the form of the posterior up to proportionality, $\pi(\theta \mid x) \propto f(x \mid \theta)\,\pi(\theta)$, and then simply find a "constant" (constant with respect to $\theta$, which is, of course, actually some function of $x$) such that the expression integrates to 1.
Example. Albert becomes rich after inventing a new type of toothpaste and invests all his money in a random stock. Let $X$ be a random variable that represents the value of the stock today, and let $\theta$ be the long-term mean value of the stock. He found that the value of the stock today was \$80. He thinks that this price may be a random realization around the long-term mean $\theta$, and so he assumes that the conditional distribution of $X$ given $\theta$ is $N(\theta, \sigma^2)$ with $\sigma$ a known constant. Also, based on the past prices of the stock, it seems reasonable to assume that the long-term mean value is a random variable following $N(\mu_0, \tau^2)$ with $\mu_0$ and $\tau$ known constants. Find the posterior distribution of $\theta$.
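A solution sketch using the proportionality trick (the standard normal–normal conjugate computation, in the notation just introduced):
$$\pi(\theta \mid x) \propto f(x \mid \theta)\,\pi(\theta) \propto \exp\!\left(-\frac{(x-\theta)^2}{2\sigma^2}\right)\exp\!\left(-\frac{(\theta-\mu_0)^2}{2\tau^2}\right) \propto \exp\!\left(-\frac{(\theta-\mu_1)^2}{2v^2}\right),$$
where completing the square in $\theta$ gives
$$\mu_1 = \frac{x/\sigma^2 + \mu_0/\tau^2}{1/\sigma^2 + 1/\tau^2}, \qquad v^2 = \frac{1}{1/\sigma^2 + 1/\tau^2}.$$
Hence $\theta \mid X = 80 \sim N(\mu_1, v^2)$ with $x = 80$: the posterior mean is a precision-weighted average of the observation and the prior mean $\mu_0$.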
Bayesian inferences on the parameter are solely based on the posterior distribution of $\theta$. If we have to make a "best guess" at $\theta$ (point estimation), which value should we choose?
The three most common summary statistics for posterior distributions are the posterior mean, the posterior median, and the posterior mode.
Remark: the "best" guess clearly depends on the penalties incurred by incorrect guesses.
We use a "loss" function $L(\theta, a)$ to measure the penalty of choosing $a$ as an estimate when $\theta$ is the true value of the parameter.
Examples: the squared error loss $L(\theta, a) = (\theta - a)^2$ and the absolute error loss $L(\theta, a) = |\theta - a|$.
Definition (Bayes estimator). A Bayes estimator under a loss function $L$ is an estimator $\hat{\theta}$ that minimizes the posterior expected value of the loss, $E[L(\theta, a) \mid x]$, over the choice of $a$.
Example: compute the Bayes estimator for the long-term mean $\theta$ in the previous example, when the loss function is the squared error loss.
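A solution sketch, reusing $\mu_1$ and $v^2$ from the posterior derived above. The posterior expected squared error loss decomposes as
$$E\big[(\theta - a)^2 \mid x\big] = \mathrm{Var}(\theta \mid x) + \big(E[\theta \mid x] - a\big)^2,$$
which is minimized at $a = E[\theta \mid x]$. So the Bayes estimator under squared error loss is the posterior mean,
$$\hat{\theta} = E[\theta \mid X = 80] = \mu_1 = \frac{80/\sigma^2 + \mu_0/\tau^2}{1/\sigma^2 + 1/\tau^2}.$$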
A Bayesian counterpart to a confidence interval is a Bayesian credible interval.
Definition ($100(1-\alpha)\%$ credible interval). A $100(1-\alpha)\%$ Bayesian credible (or posterior) interval is defined to be any interval $(a, b)$ such that $P(a \le \theta \le b \mid x) = 1 - \alpha$.
Remark (difference from a confidence interval):
- Credible interval: given the observed data, the probability that the true parameter is contained within a $100(1-\alpha)\%$ credible interval is $100(1-\alpha)\%$.
- Confidence interval: over repeated sampling, $100(1-\alpha)\%$ of the constructed confidence intervals are expected to contain the true (fixed) parameter.
Example: In the example of Albert's stock investment, find the 95% posterior interval for the long-term mean $\theta$.
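A solution sketch, again using the normal posterior $\theta \mid X = 80 \sim N(\mu_1, v^2)$ derived above. For a normal posterior, an equal-tailed 95% credible interval is the posterior mean plus or minus $1.96$ posterior standard deviations:
$$P\big(\mu_1 - 1.96\,v \le \theta \le \mu_1 + 1.96\,v \;\big|\; X = 80\big) = 0.95,$$
so the 95% posterior interval is $\big(\mu_1 - 1.96\,v,\; \mu_1 + 1.96\,v\big)$.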