
Poisson-distributed data with an offset

    How to analyse ratios? For example, how to analyse changes in population density according to a certain predictor? Population density equals number of individuals over a certain area, so it is tempting to treat densities as count data, therefore analysing them as:

    glm(density ~ predictor, family="poisson") 

    Where:

    density = abundance/land_area

    This would be incorrect for at least two reasons. Firstly, a Poisson distribution only describes non-negative integers, whereas densities are usually fractional. This can be addressed by using family="quasipoisson". A quasi-Poisson model can deal with non-integers, but it still doesn’t address another important aspect: a population density of 0.5 can be the result of one individual per 2 km², or 100 individuals per 200 km², etc. The ratio is the same, but the latter case should have more weight on the outcome of the analysis, because what we observe over 200 km² is more representative than what we observe over 2 km². In other words: if we observe something over a large area, that’s more likely to represent what happens across the entire landscape than if we observed it over a small area. Using an offset addresses both issues:

    glm(abundance ~ predictor, family="poisson", offset = log(land_area))
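    As a minimal sketch of the call above, the following simulates counts whose expected value is proportional to the area surveyed, then fits the offset model. The data frame d, the sample size, the seed, and the coefficients -0.7 and 0.3 are all made up for illustration; only abundance, predictor, and land_area come from the text.

```r
set.seed(1)                                  # arbitrary seed, for reproducibility
n <- 500                                     # hypothetical number of plots
d <- data.frame(
  predictor = rnorm(n),
  land_area = runif(n, min = 1, max = 200)   # plot sizes in km2 (made up)
)

# Simulated density (individuals per km2) depends on the predictor via a log link
true_density <- exp(-0.7 + 0.3 * d$predictor)
d$abundance  <- rpois(n, lambda = true_density * d$land_area)

# Model the raw counts and let the offset account for the area surveyed
fit <- glm(abundance ~ predictor, family = "poisson",
           offset = log(land_area), data = d)
coef(fit)   # estimates should land close to the simulated -0.7 and 0.3
```

    Because the response is the raw count, the model stays a genuine Poisson model, and the coefficients are still interpreted on the log-density scale.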

    The offset is essentially the denominator by which we are dividing abundance. It also means that observations from larger areas, which have larger expected counts, carry more information in the fit, much as if they were given more weight. This way, R not only gets the densities right, but also gives more importance to densities measured over larger areas, thus increasing the power of the analysis.
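    To illustrate the point about importance, here is a small sketch using the two cases from the example above: 1 individual per 2 km2 and 100 individuals per 200 km2. The data frames small and large, and the choice of five plots each, are assumptions made for illustration. Strictly speaking, the extra influence comes from the larger counts that larger areas produce, which shows up as a smaller standard error.

```r
# Five small plots (1 individual per 2 km2) and five large plots
# (100 individuals per 200 km2): the same density, 0.5, in every plot.
small <- data.frame(abundance = rep(1, 5),   land_area = rep(2, 5))
large <- data.frame(abundance = rep(100, 5), land_area = rep(200, 5))

fit_small <- glm(abundance ~ 1, family = "poisson",
                 offset = log(land_area), data = small)
fit_large <- glm(abundance ~ 1, family = "poisson",
                 offset = log(land_area), data = large)

# Both intercepts estimate log(density), so exp() gives back the density
est <- c(small = unname(coef(fit_small)), large = unname(coef(fit_large)))
# Standard errors of the intercepts, taken from the fitted covariance matrices
se  <- c(small = sqrt(vcov(fit_small)[1, 1]),
         large = sqrt(vcov(fit_large)[1, 1]))

exp(est)   # estimated densities
se         # the large plots should give the smaller standard error
```

    Both fits estimate the same density, but the estimate from the large plots is far more precise, which is the behaviour described above.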

    Note that the offsets must be provided on the scale of the link function, which is why I used log(land_area) rather than land_area in the code above.
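    The reason is that, with the log link, the model is log(E[abundance]) = b0 + b1 * predictor + log(land_area), which rearranges to E[abundance] / land_area = exp(b0 + b1 * predictor). The sketch below, on simulated data with made-up coefficients, checks that passing the offset as an argument and writing it as an offset() term in the formula give the same fit. The names fit_arg and fit_term are hypothetical.

```r
set.seed(2)                                        # arbitrary seed
d <- data.frame(predictor = rnorm(200),
                land_area = runif(200, 1, 200))    # made-up plot sizes
d$abundance <- rpois(200, exp(-0.7 + 0.3 * d$predictor) * d$land_area)

# Offset passed as an argument, on the log (link) scale
fit_arg  <- glm(abundance ~ predictor, family = "poisson",
                offset = log(land_area), data = d)

# The same offset written as a term in the formula
fit_term <- glm(abundance ~ predictor + offset(log(land_area)),
                family = "poisson", data = d)

all.equal(coef(fit_arg), coef(fit_term))   # the two fits should agree
```

    Either form works for fitting. The formula version keeps the offset visible in the model specification, which some people find easier to read.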

    Some useful explanations and examples can be found here, here, here, and here (the last one explains how to predict data correctly in R from models that use an offset).
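    As a rough sketch of prediction from an offset model (my own illustration on simulated data, not taken from the pages linked above), the example below writes the offset as an offset() term in the formula, so that predict() picks up land_area from newdata. The data frames new_unit and new_50km2 are hypothetical.

```r
set.seed(3)                                        # arbitrary seed
d <- data.frame(predictor = rnorm(200),
                land_area = runif(200, 1, 200))    # made-up plot sizes
d$abundance <- rpois(200, exp(-0.7 + 0.3 * d$predictor) * d$land_area)

fit <- glm(abundance ~ predictor + offset(log(land_area)),
           family = "poisson", data = d)

# newdata must contain the offset variable as well as the predictor
new_unit  <- data.frame(predictor = c(-1, 0, 1), land_area = 1)
new_50km2 <- data.frame(predictor = c(-1, 0, 1), land_area = 50)

# With land_area = 1 the offset is log(1) = 0, so predictions are densities
density_hat <- predict(fit, newdata = new_unit,  type = "response")
# With land_area = 50 predictions are expected counts over 50 km2
count_hat   <- predict(fit, newdata = new_50km2, type = "response")

density_hat
count_hat
```

    Setting land_area to 1 in newdata is a convenient way to get predictions as densities per unit area rather than as counts.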

    An argument similar to the one I have just described applies to observations from other distributions, such as the binomial.
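    For the binomial case, a proportion of 0.5 can come from 1 success in 2 trials or from 100 successes in 200 trials. One common way to keep the denominators in the model is to supply successes and failures with cbind() rather than the bare proportion. The sketch below uses made-up data, and the names few, many, successes, and trials are hypothetical.

```r
# The same proportion, 0.5, observed with few trials and with many trials
few  <- data.frame(successes = rep(1, 5),   trials = rep(2, 5))
many <- data.frame(successes = rep(100, 5), trials = rep(200, 5))

# cbind(successes, failures) tells glm() how many trials lie behind each row
fit_few  <- glm(cbind(successes, trials - successes) ~ 1,
                family = "binomial", data = few)
fit_many <- glm(cbind(successes, trials - successes) ~ 1,
                family = "binomial", data = many)

# plogis() back-transforms the logit-scale intercepts to proportions
p_hat <- c(few = plogis(unname(coef(fit_few))),
           many = plogis(unname(coef(fit_many))))
# Standard errors of the intercepts on the logit scale
se    <- c(few = sqrt(vcov(fit_few)[1, 1]),
           many = sqrt(vcov(fit_many)[1, 1]))

p_hat
se     # more trials should give the smaller standard error
```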
