Conjugate relationship between likelihoods and priors are important in MCMC and offer a handy way to find values for the parameters of posterior distributions in closed form, on the back of a cocktail napkin. Here we practice that important bar skill. Go over your notes on conjugate priors before diving in to this problem set. For each of the problems below you should:
Example plotting code:
Suppose the prior distribution of \(\alpha\) is a beta(4,5) and the posterior distribution of \(\alpha\) given the data \(y\) is a beta(20,3). Below is code that plots both of these distributions and finds the posterior 95% credible interval for \(\alpha\).
prior <- function(x) dbeta(x, 4, 5)
post <- function(x) dbeta(x, 20, 3)
plot(prior, 0, 1, ylim=c(0, 7), main = "Beta prior (dash) and Beta posterior (red)", xlab = expression(alpha), ylab = "Density",
lty = 2, lwd = 2)
plot(post, 0, 1, add = TRUE, col = "red", lwd = 2)
# 95% CI from posterior
round(qbeta(c(.025,.975),20,3),2)
## [1] 0.71 0.97
Thus, after observing the data we have 95% confidence (or belief) that \(\alpha\) is between 0.71 and 0.97.
Recall that we can use the binomial - beta conjugate relationship to find the parameters of the beta posterior using
\[\phi\sim\textrm{beta}\big(\alpha_{prior}+y,\beta_{prior}+n-y\big)\]Parameter of interest: \(\phi\) = probability that a person will vote “yes” on upcoming referendum (or equivalently, the proportion of the population that intends to vote “yes”)
Prior distribution: \(\phi \sim\) beta\((1,1)\) (Note, this is equivalent to a uniform distribution on the interval (0,1).)
Data model: \(y \sim\) binomial\((10,\phi)\)
Posterior distribution: \((\phi | y = 3) \sim\) beta\((4,8)\)
Note: no calculations were required for determining the posterior distribution. We identified the beta-binomial prior/data model on the distribution worksheet and knew the posterior was a beta distribution. The parameters of the posterior were determined using the prior hyperparameter and the observed data.
Now, let’s plot the prior and posterior distributions.
aPrior <- 1
bPrior <- 1
y <- 3
n <- 10
(aPost <- aPrior + y)
## [1] 4
(bPost <- bPrior + n - y)
## [1] 8
prior <- function(x) dbeta(x, aPrior, bPrior)
post <- function(x) dbeta(x, aPost, bPost)
plot(prior, 0, 1, ylim=c(0, 3), main = "Beta prior (dash) and beta posterior (red)", xlab = expression(phi), ylab = "Density",
lty = 2, lwd = 2)
plot(post, 0, 1, add = TRUE, col = "red", lwd = 2)
Now let’s find the posterior 95% credible interval.
bounds95 <- qbeta(c(0.025,0.975), aPost, bPost)
bounds95
## [1] 0.1092634 0.6097426
Finally, using the posterior distribution, let’s calculate the probability the vote passes.
1 - pbeta(.5, aPost, bPost)
## [1] 0.1132812
For those that are interested, below is the derivation of the posterior distribution: \[\begin{eqnarray} \big[\phi \mid y, n\big] & \propto & \textrm{binomial}\big(y\mid\phi,n\big)\,\textrm{beta}\big(\phi\mid \alpha_{prior},\beta_{prior}\big)\\[1em] & \propto & \phi^y \big(1-\phi\big)^{n-y}\phi^{\alpha_{prior}-1}\big(1-\phi\big)^{\beta_{prior}-1}\\[1em] & \propto & \phi^{\alpha_{prior}+y-1}\big(1-\phi\big)^{\beta_{prior}+n-y-1}\\[1em] & = & \frac{\Gamma\big(\alpha_{prior}+y+\beta_{prior}+n-y\big)}{\Gamma\big(\alpha_{prior}+y\big)\Gamma\big(\beta_{prior}+n-y\big)} \phi^{\alpha_{prior}+y-1}\big(1-\phi\big)^{\beta_{prior}+n-y-1}\\[1em] & = & \textrm{beta}\big(\alpha_{prior}+y,\beta_{prior}+n-y\big)\\[1em] & = & \textrm{beta}\big(1+3,1+10-3\big)\\[1em] & = & \textrm{beta}\big(4,8\big) \end{eqnarray}\]
ClimateVote
data frame from the BayesNSF
package). Each voter is asked whether they will vote “yes” in an
upcoming referendum on funding for local climate adaptation. Find the
posterior probability that the referendum passes. The poll results are
stored in the data frame ClimateVote
. In this case, assume
we have already conducted the study from question 1 above. Thus, we seek
to use the posterior distribution from the previous scenario as the
prior for this new analysis, answer questions a-d. In addition:aPrior <- 4
bPrior <- 8
(y <- sum(ClimateVote$vote))
## [1] 38
(n <- dim(ClimateVote)[1])
## [1] 100
(aPost <- aPrior + y)
## [1] 42
(bPost <- bPrior + n - y)
## [1] 70
Prior distribution: \(\phi \sim\) beta\((4,8)\) (posterior from above)
Data model: \(y \sim\) binomial\((100,\phi)\)
Posterior distribution: \((\phi | y = 38) \sim\) beta\((42,70)\)
prior <- function(x) dbeta(x, aPrior, bPrior)
post <- function(x) dbeta(x, aPost, bPost)
postFlat <- function(x) dbeta(x, 1 + y, 1 + n - y)
plot(prior, 0, 1, ylim=c(0, 10), main = "Beta prior (dash) and beta posterior (red, blue)", xlab = expression(phi), ylab = "Density", lty = 2, lwd = 2)
plot(post, 0, 1, add = TRUE, col = "red", lwd = 2)
plot(postFlat, 0, 1, add = TRUE, col = "blue", lwd = 2)
Now let’s find the posterior 95% credible interval.
bounds95 <- qbeta(c(0.025,0.975), aPost, bPost)
bounds95
## [1] 0.2880426 0.4661870
Finally, using the posterior distribution, let’s calculate the probability the vote passes.
1 - pbeta(.5, aPost, bPost)
## [1] 0.003792406
For those that are interested, below is the derivation of the posterior distribution viewing the data as coming from Bernoulli distributions rather than a binomial. \[\begin{eqnarray} \big[\phi \mid \mathbf{z}\big] & \propto & \prod_{i=1}^{100}\textrm{Bernoulli}\big(z_{i}\mid\phi)\,\textrm{beta}\big(\phi\mid \alpha_{prior},\beta_{prior}\big)\\[1em] & \propto & \prod_{i=1}^{100}\phi^{z_{i}}\big(1-\phi\big)^{1-z_{i}}\phi^{\alpha_{prior}-1}\big(1-\phi\big)^{\beta_{prior}-1}\\[1em] & \propto & \phi^{\alpha_{prior}-1+\sum_{i=1}^{100}z_{i}}\big(1-\phi\big)^{\beta_{prior}-1+\sum_{i=1}^{100}(1 - z_{i})}\\[1em] & \propto & \phi^{\alpha_{prior}+y-1}\big(1-\phi\big)^{\beta_{prior}+100-y-1}\\[1em] & = & \frac{\Gamma\big(\alpha_{prior}+y+\beta_{prior}+100-y\big)}{\Gamma\big(\alpha_{prior}+y\big)\Gamma\big(\beta_{prior}+100-y\big)} \phi^{\alpha_{prior}+y-1}\big(1-\phi\big)^{\beta_{prior}+100-y-1}\\[1em] & = & \textrm{beta}\big(\alpha_{prior}+y,\beta_{prior}+100-y\big)\\[1em] & = & \textrm{beta}\big(4+38,8+100-38\big)\\[1em] & = & \textrm{beta}\big(42,70\big)\\[1em] && \textrm{where}\quad y = \sum_{i=1}^{100}z_{i} \end{eqnarray}\]
Let’s first identify the data model and appropriate conjugate prior.
Data model: \(y_i \sim\) Poisson(\(\lambda\)) for \(i=1,...,5\)
Conjugate prior: \(\lambda \sim\) gamma(\(a_{prior}\),\(b_{prior}\))
Now, let’s use moment-matching to convert the prior mean \(\mu=3\) and standard deviation \(\sigma = 1.5\) to \(a_{prior}\) and \(b_{prior}\).
mu <- 3
sigma <- 1.5
(aPrior <- mu^2/sigma^2)
## [1] 4
(bPrior <- mu/sigma^2)
## [1] 1.333333
Using the gamma-Poisson conjugate relationship, we know the posterior distribution is \(\lambda | y_1,...,y_5 \sim\) gamma(\(a_{prior} + \sum y_i, b_{prior} +n\)).
y <- c(15, 8, 6, 11, 4 )
(aPost <- aPrior + sum(y))
## [1] 48
(bPost <- bPrior + length(y))
## [1] 6.333333
prior <- function(x) dgamma(x, aPrior, bPrior)
post <- function(x) dgamma(x, aPost, bPost)
plot(prior, 0, 20, main = "Gamma prior (dash) and gamma posterior (red)", xlab = expression(lambda),
ylab = "Density", lty = 2, lwd = 2, ylim = c(0, 0.4))
plot(post, 0, 20, add = TRUE, col = "red", lwd = 2)
bounds95 <- qgamma(c(0.025,0.975), aPost, bPost)
bounds95
## [1] 5.588117 9.868427
For those interested, the derivation of the posterior distribution is below: \[\begin{eqnarray} \big[\lambda \mid y\big] & \propto & \prod_{i=1}^{n}\textrm{Poisson}\big(y_{i}\mid\lambda\big)\,\textrm{gamma}\big(\lambda\mid\alpha_{prior},\beta_{prior}\big)\\[1em] & = & \prod_{i=1}^{n}\lambda^{y_{i}}e^{-\lambda}\lambda^{\alpha_{prior}-1}e^{-\beta_{prior}\lambda}\\[1em] & = & \lambda^{\alpha_{prior}-1 + \sum_{i=1}^{n}y_{i}}e^{-\beta_{prior}\lambda-n\lambda}\\[1em] & = & \lambda^{\alpha_{prior}-1 + \sum_{i=1}^{n}y_{i}}e^{-(n + \beta_{prior})\lambda}\\[1em] & = & \textrm{gamma}\big(\alpha_{prior} + \sum_{i=1}^{n}y_{i}, n + \beta_{prior}\big)\\[1em] & = & \textrm{gamma}\big(4 + 44, 5 + 1.333\big)\\[1em] & = & \textrm{gamma}\big(48, 6.333\big) \end{eqnarray}\]
SolsticeTemp
data frame from the
BayesNSF
package). You seek to learn about mean temperature
\(\mu\) for the past 50 years, but
suppose that we know the standard deviation of the distribution
that gave rise to our observed measurements and it is \(\sigma=15\). In a separate study,
researchers found mean temperature to be 30 \((\mu_{prior}=30)\) with a standard
deviation of 12 \((\sigma_{prior}=12)\). Given this prior
information, answer questions a-d.Parameter of interest: \(\mu\) = true mean temperature
Data model: \(y_i \sim\) normal(\(\mu,15^2\)) for \(i=1,...,50\)
Prior: \(\mu \sim\) normal\((30,12^2)\)
(varKnown <- 15^2)
## [1] 225
(n <- dim(SolsticeTemp)[1])
## [1] 50
(varPrior <- 12^2)
## [1] 144
(muPrior <- 30)
## [1] 30
(y <- sum(SolsticeTemp$temp))
## [1] 1538.549
(muPost <-((muPrior/varPrior) + y/varKnown)/((1/varPrior) + (n/varKnown)))
## [1] 30.74762
(varPost <- 1/((1/varPrior) + (n/varKnown)))
## [1] 4.363636
Posterior: \(\mu | \boldsymbol{y} \sim\) normal(30.75,4.36)
prior <- function(x) dnorm(x, muPrior, varPrior^.5)
post <- function(x) dnorm(x, muPost, varPost^.5)
plot(prior, 0, 60, main = "Normal prior (dash) and normal posterior (red)", xlab = expression(mu),
ylab = "Density", lty = 2, ylim = c(0, 0.2), lwd = 2)
plot(post, 0, 60, add = TRUE, col = "red", lwd = 2)
bounds95 <- qnorm(c(0.025,0.975), muPost, varPost^.5)
bounds95
## [1] 26.65338 34.84185
For those interested, the derivation of the posterior distribution is below: \[\begin{eqnarray} \big[\mu \mid y\big] & \propto & \prod_{i=1}^{n}\textrm{normal}\big(y_{i}\mid\mu, \sigma^{2}\big)\,\textrm{normal}\big(\mu\mid\mu_{prior},\sigma_{prior}^{2}\big)\\[1em] & = & \textrm{normal}\Bigg(\cfrac{\bigg(\cfrac{\mu_{prior}}{\sigma_{prior}^{2}} +\cfrac{\sum_{i=1}^{n}y_{i}}{\sigma^{2}}\bigg)}{\bigg( \cfrac{1}{\sigma_{prior}^{2}} +\cfrac{n}{\sigma^{2}} \bigg)} ,\bigg(\cfrac{1}{\sigma_{prior}^{2}}+{\cfrac{n}{\sigma^{2}}}\bigg)^{-1}\Bigg)\\[1em] & = & \textrm{normal}\Bigg(\cfrac{\bigg(\cfrac{30}{144} +\cfrac{1538.549}{225}\bigg)}{\bigg( \cfrac{1}{144} +\cfrac{50}{225} \bigg)} ,\bigg(\cfrac{1}{144}+{\cfrac{50}{225}}\bigg)^{-1}\Bigg)\\[1em] & = & \textrm{normal}\big(30.75, 4.36 \big) \end{eqnarray}\]
Chrome
data frame from the
BayesNSF
package. You seek to learn about the variance
\(\sigma^2\) of the distribution that
gave rise to these data (about which you know very little before the
study), but you already know that in a typical stainless-steel welding
workplace, airborne concentrations have a known mean of \(\mu=225\) μg/m3. Given this prior
information, answer questions a-d. Hint: the R library
actuar
has the function dinvgamma()
for
calculating probability densities of the inverse gamma
distribution.Parameter of interest: \(\sigma^2\) = variance of airborne chromium concentrations
Data model: \(y_i \sim\) normal(225,\(\sigma^2\)) for \(i=1,...,100\)
Prior: \(\sigma^2 \sim\) inverse-gamma\((.001,.001)\) (vague prior for variances)
muKnown <- 225
n <- nrow(Chrome)
alphaPrior <- 0.001 #shape
betaPrior <- 0.001 #scale
(alphaPost <-alphaPrior + (n/2))
## [1] 50.001
(betaPost <- betaPrior + sum((Chrome$conc-muKnown)^2)/2)
## [1] 110410.8
Posterior: \(\sigma^2 | \boldsymbol{y} \sim\) inverse-gamma(alphaPost,betaPost)
library(actuar)
##
## Attaching package: 'actuar'
## The following objects are masked from 'package:stats':
##
## sd, var
## The following object is masked from 'package:grDevices':
##
## cm
prior <- function(x) dinvgamma(x, shape = alphaPrior, scale = betaPrior)
post <- function(x) dinvgamma(x, shape = alphaPost, scale = betaPost)
plot(prior, 0, 4000, main = "Inverse Gamma Prior (dash) and Inverse Gamma Posterior (red)", xlab = expression(sigma^2),
ylab = "Density", lty = 2, lwd = 2, ylim = c(0,.0013))
plot(post, 0, 4000, add = TRUE, col = "red", lwd = 2)
bounds95 <- qinvgamma(c(0.025,.975), alphaPost, scale = betaPost)
bounds95
## [1] 1704.351 2975.084
For those interested, the derivation of posterior distribution is given below: \[\begin{eqnarray} \big[\sigma^2 \mid y\big] & \propto & \prod_{i=1}^{n}\textrm{normal}\big(y_{i}\mid\mu, \sigma^{2}\big)\,\textrm{inverse gamma}\big(\sigma^2|\alpha_{prior},\beta_{prior}\big)\\[1em] & = & \textrm{inverse gamma}\Bigg(\bigg(\alpha_{prior} + \cfrac{n}{2}\bigg),\bigg(\beta_{prior} + \cfrac{\Sigma_{i=1}^n(y_i-\mu)^2}{2}\bigg)\Bigg)\\[1em] & = & \textrm{inverse gamma}\Bigg(\bigg(0.001 + \cfrac{n}{2}\bigg),\bigg(0.001 + \cfrac{\Sigma_{i=1}^n(y_i-225)^2}{2}\bigg)\Bigg) \end{eqnarray}\]