Tuesday, November 4, 2014

GLM and Natural parameter family

GLM and Exponential Family

GLM

Ordinary linear regression assumes that y changes by a constant amount whenever x changes by a constant amount. This assumption fails in the following situations: 1. The response has a restricted range, for example life or death, coded as 0 or 1. 2. We need to estimate a probability, which must lie in [0,1]. 3. The data are not normally distributed, for example the number of new Ebola victims each day in Mali.

A generalized linear model (GLM) solves this problem by specifying $g(\mu)=x\beta$, which means that a function of the mean, rather than the mean itself, changes by a constant amount when x changes by a constant amount.
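To see the difference numerically, here is a minimal Python sketch. The coefficients `b0` and `b1` are made up for illustration, and the logit is just one possible choice of $g$; nothing here is fitted to real data.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
b0, b1 = -2.0, 0.8

def linear_mean(x):
    """Ordinary linear regression: the mean itself equals x*beta."""
    return b0 + b1 * x

def logit(mu):
    """One possible link function g(mu) for a mean restricted to (0, 1)."""
    return math.log(mu / (1.0 - mu))

def logistic_mean(x):
    """GLM with a logit link: g(mu) = x*beta, so mu = 1 / (1 + exp(-x*beta))."""
    eta = b0 + b1 * x
    return 1.0 / (1.0 + math.exp(-eta))

for x in [0, 2, 4, 6, 8]:
    mu = logistic_mean(x)
    # Columns: x, linear mean, logistic mean, g(logistic mean)
    print(x, round(linear_mean(x), 3), round(mu, 3), round(logit(mu), 3))
```

In this sketch the linear mean eventually leaves [0,1], while the logistic mean stays inside it and only $g(\mu)$ moves in equal steps of `b1` per unit of x.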

I hope you are not bored yet. Let's look at the form of a generalized linear model, which will help you understand the logistic, Poisson, and probit models. A GLM is made up of the following:

1. $\eta=x\beta$, which is called the systematic component.

2. $g(\mu)=\eta$, which is called the link function.

3. Last but not least, the random component: the distribution of the response comes from the natural exponential family.
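The three pieces can be sketched in code. The example below assumes a Poisson response with a log link, which is only one possible combination; the data and the coefficient values in `beta` are invented for illustration and are not estimates.

```python
import math

def linear_predictor(x, beta):
    """Systematic component: eta = x . beta (x carries a leading 1 for the intercept)."""
    return sum(xj * bj for xj, bj in zip(x, beta))

def inverse_log_link(eta):
    """Link function g(mu) = log(mu) = eta, so mu = exp(eta)."""
    return math.exp(eta)

def poisson_log_pmf(y, mu):
    """Random component: log of the Poisson pmf with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def log_likelihood(beta, rows, ys):
    """Sum the log pmf over all observations for a given beta."""
    total = 0.0
    for x, y in zip(rows, ys):
        mu = inverse_log_link(linear_predictor(x, beta))
        total += poisson_log_pmf(y, mu)
    return total

# Made-up data: each row is (1, x), so beta[0] acts as the intercept.
rows = [(1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
ys = [1, 3, 6]
beta = [0.1, 0.9]  # hypothetical coefficients, not fitted values

print(log_likelihood(beta, rows, ys))
```

Fitting a GLM amounts to choosing `beta` so that this log-likelihood is as large as possible; the sketch only evaluates it at one point.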

Natural Parameter Family

Any distribution whose pdf can be written in the following form belongs to the natural parameter family.

$f(y_i,\theta,\phi)=\exp(\frac{y_i\theta-b(\theta)}{a(\phi)}+c(y_i,\phi))$

$\theta$: the natural parameter, a function of $\mu$

$a(\phi)$: a function of the dispersion parameter $\phi$

The normal distribution is a member of this family: $\frac{1}{\sqrt{2\pi\sigma^2}}\exp(-\frac{(y_i-\mu)^2}{2\sigma^2})$=$\exp(\frac{y_i\theta-\theta^2/2}{\sigma^2}-\frac{y_i^2}{2\sigma^2}-\frac{1}{2}\log(2\pi\sigma^2))$

Here we can see:

$\theta:\mu$

$b(\theta):\theta^2/2$

$a(\phi):\sigma^2$

$c(y_i,\phi):-\frac{y_i^2}{2\sigma^2}-\frac{1}{2}\log(2\pi\sigma^2)$
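As a quick sanity check, the sketch below compares the usual normal density with its natural parameter form at a few arbitrary points. It assumes the standard normalising constant $\frac{1}{\sqrt{2\pi\sigma^2}}$, so that $c(y_i,\phi)=-\frac{y_i^2}{2\sigma^2}-\frac{1}{2}\log(2\pi\sigma^2)$; the function names are my own.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Textbook normal density with mean mu and variance sigma2."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def normal_expfam(y, mu, sigma2):
    """Same density written as exp((y*theta - b(theta)) / a(phi) + c(y, phi))."""
    theta = mu                      # natural parameter
    b = theta ** 2 / 2              # b(theta)
    a = sigma2                      # a(phi)
    c = -y ** 2 / (2 * sigma2) - 0.5 * math.log(2 * math.pi * sigma2)
    return math.exp((y * theta - b) / a + c)

# Arbitrary test points (y, mu, sigma^2).
for y, mu, sigma2 in [(0.3, 1.0, 2.0), (-1.5, 0.0, 0.5), (4.0, 3.0, 1.0)]:
    print(normal_pdf(y, mu, sigma2), normal_expfam(y, mu, sigma2))
```

The two columns should agree up to floating-point rounding.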

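The binomial distribution also belongs to the natural exponential family:

$\binom{n}{y_i}p^{y_i}(1-p)^{n-y_i}$=$\exp[y_i\log(\frac{p}{1-p})+n\log(1-p)+\log\binom{n}{y_i}]$

Let $\theta=\log(\frac{p}{1-p})$, so that $1-p=\frac{1}{1+e^\theta}$. Then:

$\binom{n}{y_i}p^{y_i}(1-p)^{n-y_i}=\exp[y_i\theta-n\log(1+e^\theta)+\log\binom{n}{y_i}]$

Here:

$\theta:\log(\frac{p}{1-p})$, where $p=\mu/n$ (in the Bernoulli case $n=1$ this is $\log(\frac{\mu}{1-\mu})$)

$b(\theta):n\log(1+e^\theta)$

$a(\phi):1$

$c(y_i,\phi):\log\binom{n}{y_i}$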
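The same kind of check works for the binomial case. This sketch assumes $c(y_i,\phi)=\log\binom{n}{y_i}$ (the log of the binomial coefficient, without an extra factor of $n$) and uses function names of my own choosing.

```python
import math

def binomial_pmf(y, n, p):
    """Textbook binomial pmf for y successes in n trials."""
    return math.comb(n, y) * p ** y * (1 - p) ** (n - y)

def binomial_expfam(y, n, p):
    """Same pmf written as exp(y*theta - b(theta) + c(y)), with a(phi) = 1."""
    theta = math.log(p / (1 - p))           # natural parameter (log odds)
    b = n * math.log(1 + math.exp(theta))   # b(theta)
    c = math.log(math.comb(n, y))           # c(y, phi)
    return math.exp(y * theta - b + c)

# Arbitrary test points (y, n, p).
for y, n, p in [(0, 5, 0.2), (3, 10, 0.5), (7, 7, 0.9)]:
    print(binomial_pmf(y, n, p), binomial_expfam(y, n, p))
```

Both columns should match up to rounding, which would not be the case if the last term were multiplied by $n$.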

Exponential distribution

$\lambda\exp(-\lambda y_i)=\exp[-\lambda y_i+\log(\lambda)]$

Write the mean as $\mu=\frac{1}{\lambda}$:

$\lambda\exp(-\lambda y_i)=\exp[-\frac{1}{\mu}y_i-\log(\mu)]$

Let $\theta=-\frac{1}{\mu}$:

$\lambda\exp(-\lambda y_i)=\exp[\theta y_i+\log(-\theta)]$

Here:

$\theta:-\frac{1}{\mu}$

$b(\theta):-\log(-\theta)$

$a(\phi):1$
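A small numerical sketch can confirm this rewriting. It also checks a standard property of this family that the post does not derive, namely that the mean equals the derivative $b'(\theta)$; here that derivative is approximated by a central difference. The value of `mu` and the helper names are arbitrary choices for illustration.

```python
import math

def b(theta):
    """b(theta) for the exponential distribution; requires theta < 0."""
    return -math.log(-theta)

def expfam_density(y, theta):
    """Density written as exp(theta*y - b(theta)), with a(phi) = 1 and c = 0."""
    return math.exp(theta * y - b(theta))

def b_prime(theta, h=1e-6):
    """Central-difference approximation of b'(theta)."""
    return (b(theta + h) - b(theta - h)) / (2 * h)

mu = 2.5            # arbitrary mean for the demonstration
lam = 1.0 / mu      # rate parameter lambda
theta = -1.0 / mu   # natural parameter

y = 0.7
print(expfam_density(y, theta), lam * math.exp(-lam * y))  # two forms of the density
print(b_prime(theta), mu)                                  # b'(theta) versus the mean
```

Each printed pair should agree, the second one only up to the accuracy of the finite difference.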

Canonical Link

If the link function $g(\mu)$ equals the natural parameter $\theta$ in the natural exponential family form, that is $\eta=g(\mu)=\theta$, the link is called the canonical link.

Distribution | Canonical Link
Normal | $\theta=\mu$
Poisson | $\theta=\log(\mu)$
Exponential | $\theta=-\frac{1}{\mu}$
Bernoulli | $\theta=\log(\frac{\mu}{1-\mu})$
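The table can be written down as pairs of functions: the canonical link, which maps the mean $\mu$ to $\theta$, and its inverse, which maps $\theta$ back to $\mu$. The dictionary name and the `round_trip` helper below are hypothetical, introduced only for this sketch.

```python
import math

# Each entry: (canonical link mu -> theta, inverse link theta -> mu).
CANONICAL_LINKS = {
    "normal": (lambda mu: mu, lambda theta: theta),
    "poisson": (lambda mu: math.log(mu), lambda theta: math.exp(theta)),
    "exponential": (lambda mu: -1.0 / mu, lambda theta: -1.0 / theta),
    "bernoulli": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda theta: 1.0 / (1.0 + math.exp(-theta)),
    ),
}

def round_trip(name, mu):
    """Apply the canonical link and then its inverse; should give back mu."""
    link, inverse = CANONICAL_LINKS[name]
    return inverse(link(mu))

# Arbitrary valid means for each distribution.
for name, mu in [("normal", -1.2), ("poisson", 3.0), ("exponential", 2.5), ("bernoulli", 0.3)]:
    print(name, mu, round_trip(name, mu))
```

Each mean has to lie in the range its distribution allows (positive for Poisson and exponential, strictly between 0 and 1 for Bernoulli), otherwise the link is undefined.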

You can get the canonical link of a natural exponential family distribution by reorganizing its density in the way shown in the examples above.

P.S. You might have heard of the identity link, $\eta=\mu$. The normal distribution's canonical link is also the identity link.

Why do we use the canonical link?

How do we estimate the parameters?

How do we do inference on them?

What is the relation between GLM and OLM (the ordinary linear model)?

See you next time!

