Hacking GLMs to fit population growth models

r
Published

February 24, 2020

I’m currently teaching Ecological Statistics and Data, a class I inherited from Lee Brown and Elizabeth Crone. In a lecture on population dynamics, they do some really cool things with generalized linear model—things that I don’t think are standard practice and as far as I can tell from googling, aren’t well documented. And let me tell you, I did a lot of googling to make sure I understood this stuff before teaching it. So, I thought I’d put it up on the blog for others.

Data

I’ll be using data on the Northern Rocky Mountain grey wolf population. You can read more about the history of these wolves here.

library(tidyverse)
wolves <- read_csv("NRMwolves.csv") %>%
  mutate(year_post = year - 1982) 
head(wolves)
# A tibble: 6 × 8
   year num.wolves MT.wolves WY.wolves ID.wolves OR.wolves WA.wolves year_post
  <dbl>      <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
1  1982          8         8        NA        NA        NA        NA         0
2  1983          6         6        NA        NA        NA        NA         1
3  1984          6         6        NA        NA        NA        NA         2
4  1985         13        13        NA        NA        NA        NA         3
5  1986         15        15        NA        NA        NA        NA         4
6  1987         10        10        NA        NA        NA        NA         5

Exponential growth

Exponential growth describes unregulated reproduction and is described by the equation:

\[ N_{T} = \lambda^TN_0 \]

where \(\lambda\) is the population growth rate, \(T\) is a number of time steps (e.g. years) and \(N_0\) is the population at some initial time.

Hacking an exponential growth rate GLM

We can take advantage of a log-link to linearize this equation:

\[ log(N_T) = log(N_0) + log(\lambda)\times T \]

Compare to a generic GLM equation with a log-link:

\[ log(y)= \beta_0 + \beta_1x_1 \]

Here’s the glm for an exponential growth model fit to the wolf data:

m_exp <- glm(num.wolves ~ year_post, 
             family = poisson(link = "log"), data = wolves)
exp(coef(m_exp))
(Intercept)   year_post 
  19.615307    1.176583 

The backtransformed intercept is the estimate for \(N_0\), the estimated number of wolves at year_post = 0

The backtransformed coefficient for year_post = \(\lambda\), the population growth rate

Process error

The exponential growth model fit above is an observation error model. It assumes variation from predicted values are due to inaccuracies in estimating the number of wolves.

A process error model estimates a population growth that depends on current population size. This can be modeled as a rate, \(N_{t+1}/N_{t}\).

Hacking a process error GLM

Again, we can use a log-link to linearize this:

\[ log(N_{t+1}/N_{t}) = \beta_0 \]

\[ log(N_{t+1}) +log(N_{t}) = \beta_0 \] \[ log(N_{t+1}) = \beta_0 + log(N_{t}) \]

The \(log(N_t)\) term, which has no coefficient associated with it, is an offset. We can hack a glm to fit this model like so:

wolves2 <- wolves %>%
  mutate(num.prev = lag(num.wolves)) %>% #create a column of lagged wolf numbers
  filter(!is.na(num.prev))

m_process <- glm(num.wolves ~ 1, offset = log(num.prev),
                 family = poisson(link = "log"), data = wolves2)

The backtransformed intercept is the yearly rate of increase (\(N_{t+1}/N_t\))

exp(coef(m_process))
(Intercept) 
   1.108238 

\[ N_{t+1} = e^{\beta_0} \times N_{t} \]

So, if there are 13 wolves in 1985, how many would it predict in 1986?

1.108238 * 13
[1] 14.40709

Ricker model

Finally, the most complicated, possibly mind blowing example of hacking a GLM. This one took me quite a while to wrap my head around.

A Ricker model takes carrying capacity into account and allows growth rate to change as the population increases. It approximates logistic growth.

\[ N_{t+1}=N_{t}e^{r\left(1-{\frac {N_{t}}{K}}\right)} \] where \(r = ln(\lambda)\) and \(K\) is the carrying capacity

Hacking a Ricker model GLM

Linearizing using a log-link (please tell me if I got the math wrong in the comments):

\[ log(N_{t+1})=log(N_{t}) + log\left( (e^r)^{\left(1-{\frac {N_{t}}{K}}\right)}\right) \]

\[ log(N_{t+1})=log(N_{t}) + log (e^r)\times{\left(1-{\frac {N_{t}}{K}}\right)} \]

\[ log(N_{t+1})= r-\frac{r}{K}N_{t} + \textrm{offset}[log(N_{t})] \]

We can model this with the following GLM:

m_rick <- glm(num.wolves ~ num.prev, offset = log(num.prev),
              family = poisson, data = wolves2)

\(r = \beta_0\)

coef(m_rick)[1]
(Intercept) 
  0.3119624 

\(\beta_1 = -r/K\)

coef(m_rick)[2]
     num.prev 
-0.0001683389 

Which means that \(K = -\beta_0/\beta_1\)

-coef(m_rick)[1]/coef(m_rick)[2]
(Intercept) 
    1853.18 

🤯