*I wrote the following notes as a memorandum for myself, but I am sharing them in the hope that they may be useful for others too. These notes are in no way meant to replace the study of specialised literature, a selection of which is listed at the end of this article.*

Response variables can show a non-linear relationship with one or more continuous explanatory variables. Generalised additive models (hereafter: GAM) provide a tool for modelling such relationships. Generalised Additive Models are "generalized linear models with a linear predictor involving a sum of smooth functions of covariates" (S.N. Wood, 2017). In other words, they estimate functional relationships between predictors and the response variable by approximating continuous predictors using smooth curves, or "smoothers". Smoothers are functions which are, themselves, sums of functions. The functions that, summed together, produce a smoother are called its basis functions.

How can it be that a linear model can describe non-linear relationships? This calls for a refresher of what a linear model is. Linear models are models that describe a continuous response variable as a function of a linear combination of parameters and explanatory variables. A linear combination is "an expression constructed from a set of terms by multiplying each term by a constant and adding the results" (source). For example,

y = a + bx

is a linear model. On the contrary,

y = a · exp(bx)

is not, because y is equated to an expression that does not fit the definition of a linear combination. The polynomial model

y = a + bx + cx^2

describes a U-shaped relationship between y and x, but it is a linear model, because the right side of the model is a linear combination of parameters and predictors. This shows that, given a response variable y and a predictor x, the relationship between y and x can be non-linear in shape and still be described using a linear model. The term "linear", referred to a model, describes how the model is "assembled", not the shape of the relationship between its variables.
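To see this in practice, a minimal sketch with simulated data (all values arbitrary): because the polynomial is linear in its parameters, it can be fitted with ordinary least squares via `lm()`.

```r
# Simulated data, purely for illustration
set.seed(1)
x <- seq(-2, 2, length.out = 100)
y <- 1 + 2 * x + 3 * x^2 + rnorm(100, sd = 0.5)

# The second-degree polynomial is linear in its parameters a, b, c,
# so ordinary least squares can fit it:
fit <- lm(y ~ x + I(x^2))
coef(fit)  # estimates should be close to a = 1, b = 2, c = 3
```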

The polynomial model discussed above is an additive model itself. Polynomials follow the general expression of an additive model:

y = f_1(x) + f_2(x) + … + f_n(x)

In the case of a second-degree polynomial:

y = a + bx + cx^2 = f_1(x) + f_2(x)

where f_1(x) = a + bx and f_2(x) = cx^2.

The problem with polynomial models is that they get more and more "wiggly" as their degree increases, and this out-of-control "wiggliness" often affects their ability to approximate data well. To overcome this issue, GAMs are fitted using splines, namely piecewise polynomial functions. To avoid overfitting, models with a large basis size (i.e. many basis functions) are assigned a smoothing penalty which prevents excessive "wiggliness".
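As a minimal sketch with simulated data (all values arbitrary): in mgcv, the basis size k caps how wiggly the smoother can be, while the smoothing penalty, estimated here by REML, reins the wiggliness in.

```r
library(mgcv)  # recommended package, included with standard R installations

set.seed(2)
x <- seq(0, 1, length.out = 200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)

# k = 20 sets the basis size (an upper bound on wiggliness);
# the smoothing parameter, estimated by REML, penalises
# wiggliness that the data do not support
m <- gam(y ~ s(x, k = 20), method = "REML")
m$sp  # the estimated smoothing parameter
```

Larger values of the smoothing parameter correspond to smoother (less wiggly) fits.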

The penalisation operates by calculating a penalty matrix based on the integral of the squared derivative (by default, the second derivative) of each basis function.

Effective Degrees of Freedom (EDF), estimated as the maximum number of coefficients in the model minus the model's constraints, are a proxy for the degree of non-linearity in stressor-response relationships: the larger the EDF, the wigglier the model is.
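A sketch of EDF in action, with simulated data: a clearly non-linear signal yields EDF well above 1, while an essentially linear one is shrunk towards a straight line (EDF close to 1).

```r
library(mgcv)

set.seed(3)
x <- runif(300)
y_curved <- sin(2 * pi * x) + rnorm(300, sd = 0.2)  # non-linear signal
y_straight <- 2 * x + rnorm(300, sd = 0.2)          # linear signal

m_curved <- gam(y_curved ~ s(x), method = "REML")
m_straight <- gam(y_straight ~ s(x), method = "REML")

summary(m_curved)$edf    # well above 1: a wiggly smooth
summary(m_straight)$edf  # close to 1: effectively a straight line
```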

The **mgcv** package is a powerful tool for fitting GAMs in R. It can:

- handle both normally distributed and non-normally distributed data;
- fit complex models including both continuous and categorical data, interactive effects, and more;
- fit mixed-effect GAMs.
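For example (variable names and simulated data are purely illustrative), a Poisson GAM combining a smooth term, a linear term, and a random intercept per group:

```r
library(mgcv)

# Simulated count data: non-linear effect of temperature, linear
# effect of depth, random intercept per site (all names illustrative)
set.seed(4)
n <- 300
temperature <- runif(n, 5, 25)
depth <- runif(n, 0, 50)
site <- factor(sample(letters[1:6], n, replace = TRUE))
site_effect <- rnorm(6, sd = 0.3)[as.integer(site)]
mu <- exp(0.5 + sin(temperature / 4) - 0.02 * depth + site_effect)
counts <- rpois(n, mu)

m <- gam(counts ~ s(temperature) + depth + s(site, bs = "re"),
         family = poisson, method = "REML")
summary(m)
```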

**Interaction between smooth terms** can be modelled in **mgcv** in two main ways:

- using two-dimensional thin plate regression smoothers, with the syntax **s(x1, x2)**. This can work if the predictors have the same rate of change (e.g. some geographic coordinate systems);
- using tensor products, which create a new set of basis functions that allows each smoother to have its own penalty matrix. Tensor products can be implemented with **te()**, **ti()** (for when main effects are also present), or **t2()**. See: Pedersen et al. (2019), p. 7.
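A sketch of the two approaches on simulated predictors with deliberately different scales (all values arbitrary):

```r
library(mgcv)

set.seed(5)
n <- 400
x1 <- runif(n)           # ranges over [0, 1]
x2 <- runif(n, 0, 100)   # deliberately on a much larger scale
y <- sin(2 * pi * x1) * cos(x2 / 20) + rnorm(n, sd = 0.2)

# Isotropic thin plate smooth: a single penalty, implicitly assuming
# that a unit change in x1 is comparable to a unit change in x2
m_iso <- gam(y ~ s(x1, x2), method = "REML")

# Tensor product: separate marginal bases and penalties per predictor,
# so the difference in scales is handled automatically
m_te <- gam(y ~ te(x1, x2), method = "REML")

AIC(m_iso, m_te)
```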

**Hierarchical Generalised Additive Models (HGAM)**

Hierarchical Generalised Additive Models (HGAM) make it possible to account for group structure in the data and can follow one of the following five structures:

*1) Models with a single common smoother for all observations (G)*

```
gam(y ~ s(x)
    + s(fac, bs="re")) # "re" is short for "random effect"
```

*2) Models with a global smoother for all observations, plus group-level smoothers all assumed to have the same wiggliness (GS)*

```
gam(y ~ s(x, m=2)
    + s(x, fac, bs="fs", m=2))
# "fs" is short for "factor-smoother [interaction]"

gam(y ~ te(x1, x2, bs="tp", m=2)
    + t2(x1, x2, fac, bs=c("tp", "tp", "re"),
         m=2, full=TRUE)
)
```

The argument m specifies the order of the derivatives on which the penalty matrix is calculated. I take Pedersen et al.'s word for the choice of the values of m.

*3) GI: similar to GS but allowing group-level smoothers to differ in wiggliness.*

```
gam(y ~ s(x, m=2)
    + s(x, by=fac, bs="tp", m=1)
    # "tp" is short for "thin plate [regression spline]"
    + s(fac, bs="re"))
```

*4) S: similar to GS but without a global smoother term.*

```
gam(y ~ s(x, fac, bs="fs"))
gam(y ~ t2(x1, x2, fac, bs=c("tp", "tp", "re")))
```

*5) I: similar to GI but without a global smoother term.*

```
gam(y ~ fac
+ s(x, by=fac))
gam(y ~ fac
+ te(x1, x2, by=fac))
```

**Model comparison**

These models can be compared by their AIC (Pedersen et al. (2019), p. 20), but model selection should also take the goals of the study into account. For example, models S and I cannot be used to predict the predictor-response relationship for groups that are not among those used for estimating the model parameters. Similarly, GS models can be used to simulate functional variation for unobserved group levels, while GI models cannot.
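For instance, an AIC comparison of two nested GAMs fitted to data from mgcv's built-in example generator `gamSim()`:

```r
library(mgcv)

# Simulated data from mgcv's built-in example generator
set.seed(6)
dat <- gamSim(1, n = 200, verbose = FALSE)

m1 <- gam(y ~ s(x0) + s(x1), data = dat, method = "REML")
m2 <- gam(y ~ s(x0) + s(x1) + s(x2), data = dat, method = "REML")

AIC(m1, m2)  # the model with the lower AIC is preferred
```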

Cross-validation is recommended for assessing the predictive power of a model.

The functions **gam.check()** and **k.check()** allow one to check whether the family distribution chosen for the residuals is appropriate, whether the basis size is adequate, and whether the model is overfitting the data.

Other aspects to keep in mind during model selection are **bias-variance** trade-offs and the trade-off between **complexity and computational costs** [Pedersen et al. (2019): p. 26 ss., p. 31 ss.].
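A sketch of these checks on simulated data:

```r
library(mgcv)

set.seed(7)
dat <- gamSim(1, n = 200, verbose = FALSE)
m <- gam(y ~ s(x0) + s(x1) + s(x2), data = dat, method = "REML")

# Residual diagnostics plus a test of the basis dimensions; a k-index
# well below 1 with a low p-value suggests k may be too small
gam.check(m)
k.check(m)
```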

**Alternative formulations**

- **bam()**, for large datasets and large numbers of groups;
- **gamm()**, which uses **nlme**;
- **gamm4()**, which uses package **lme4**.
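For example, a sketch of **bam()** on a larger simulated dataset (sizes arbitrary):

```r
library(mgcv)

# Larger simulated dataset (sizes arbitrary)
set.seed(8)
n <- 10000
x <- runif(n)
fac <- factor(sample(letters[1:20], n, replace = TRUE))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# bam() uses fast REML ("fREML") and, with discrete = TRUE,
# discretised covariates to cut memory use and fitting time
m <- bam(y ~ s(x) + s(fac, bs = "re"), discrete = TRUE)
```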

**Useful literature**

- Simon Wood's "Generalized Additive Models: An Introduction with R" is an essential textbook for understanding the theory behind GAMs and how to implement them correctly in R;
- "Hierarchical generalized additive models in ecology: an introduction with mgcv" by Pedersen, Miller, Simpson, and Ross (2019);
- An introduction to GAM by Jacolien van Rij (2015), which I find useful because it delves into the different ways of modelling interactions between categorical and continuous predictors in the context of GAM;
- Analysing time series with GAMs: an introductory article by Peter Laurinec with a focus on modelling time series.