Generalized Additive Models

Generalized additive models (GAMs) and generalized additive mixed models (GAMMs) handle nonlinear relationships and can include random effects (e.g., site or tree identity) to account for hierarchical structures and temporal or spatial dependencies, making them well suited to modeling complex dendrochronological data. The growthTrendR package implements a suite of GAM/GAMM models to accommodate different types of datasets. Here, we use a single model, gamm_spatial, to demonstrate how to generate a model fitting and diagnostic report from raw data.

prepare data for model

library(growthTrendR)
library(data.table)  # provides fread() and setorder()

# load the processed ring-width measurements
dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR"))

# climate
dt.clim <- fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR"))

# merge tree metadata with the ring-width series
dt.samples_clim <- merge(
  dt.samples_trt$tr_all_wide[, c("uid_site", "site_id", "latitude", "longitude",
                                 "species", "uid_tree", "uid_radius")],
  dt.samples_trt$tr_all_long$tr_7_ring_widths,
  by = "uid_radius"
)

# calculate basal area increment (BAI)
dt.samples_clim <- calc_bai(dt.samples_clim)

dt.samples_clim <- merge(dt.samples_clim, dt.clim, by = c("site_id", "year"))
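
For orientation, basal area increment is commonly derived from cumulative ring widths as BAI_t = pi * (R_t^2 - R_{t-1}^2), where R_t is the stem radius at the end of year t. The sketch below applies that textbook formula to hypothetical ring widths. It is not the implementation of calc_bai(), which may treat units, missing pith, or bark differently. The column names mirror those in the model formulas, and reading ba_cm2_t_1 as the previous year's basal area is an assumption.

```r
# Hypothetical ring widths (mm) for one radius, ordered pith to bark
rw_mm <- c(2.0, 1.8, 1.5, 1.2)

radius_cm  <- cumsum(rw_mm) / 10       # cumulative radius, converted mm -> cm
ba_cm2     <- pi * radius_cm^2         # basal area at the end of each year
ba_cm2_t_1 <- c(0, head(ba_cm2, -1))   # basal area at the end of the previous year
bai_cm2    <- ba_cm2 - ba_cm2_t_1      # basal area increment

bai_demo <- data.frame(ageC = seq_along(rw_mm), rw_mm, ba_cm2_t_1, bai_cm2)
print(bai_demo)
```

Under this assumption ba_cm2_t_1 is zero in the first year, so log(ba_cm2_t_1) is not finite there, which would be consistent with dropping rows where ageC == 1 before fitting a model that contains log(ba_cm2_t_1).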

fitting model

This example uses gamm_spatial; other functions with the same arguments (gamm_radius, gamm_site, bam_spatial, gam_mod) may be used depending on the data and analysis goals.

setorder(dt.samples_clim, uid_tree, year)

# Remove ageC == 1 prior to fitting log-scale models.
dt.samples_clim <- dt.samples_clim[ageC > 1]
m.sp <- gamm_spatial(
  data = dt.samples_clim,
  resp_scale = "resp_log",
  m.candidates = c("bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)",
                   "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD")
)

arguments

resp_scale

The function provides three options for specifying the response variable; choose the one that best suits the modelling purpose:

“resp_gaussian”: the response variable is used on its original scale and is modelled under a Gaussian distribution with an identity link (no transformation applied).

“resp_log”: the response variable is log-transformed prior to modelling. The transformed response is then assumed to follow a Gaussian distribution and is fitted using an identity link.

“resp_gamma”: the response variable is kept on its original scale, and the model is fitted under a Gamma distribution with a log link, appropriate for strictly positive and right-skewed data.
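
The three options correspond to standard family and link choices. The sketch below spells them out with plain mgcv::gam() calls on simulated data. It is an illustration under that reading, not the internals of gamm_spatial; the data and variable names (x, y, d) are hypothetical.

```r
library(mgcv)

set.seed(1)
# Simulated, strictly positive response (hypothetical data, not from the package)
n <- 200
x <- runif(n, 1, 50)
y <- exp(0.5 + sin(x / 8) + rnorm(n, sd = 0.2))
d <- data.frame(x = x, y = y)

# "resp_gaussian": response on its original scale, Gaussian with identity link
m_gaussian <- gam(y ~ s(x), family = gaussian(link = "identity"), data = d)

# "resp_log": log-transformed response, Gaussian with identity link
m_log <- gam(log(y) ~ s(x), family = gaussian(link = "identity"), data = d)

# "resp_gamma": response on its original scale, Gamma with log link
m_gamma <- gam(y ~ s(x), family = Gamma(link = "log"), data = d)

# Family and link used by each fit
sapply(list(resp_gaussian = m_gaussian, resp_log = m_log, resp_gamma = m_gamma),
       function(m) paste(m$family$family, m$family$link))
```

One practical difference: predictions from the log-response fit are on the log scale, and exponentiating them does not recover the mean of the response, whereas the Gamma fit with a log link models the mean directly.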

m.candidates

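A character vector of candidate model formulas. Write the response variable on its original scale in every formula (e.g., bai_cm2, not log(bai_cm2)), even when using the option “resp_log”; in that case the log transformation appears in the fitted formula, as in the model summary of the report.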
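
Because candidate formulas keep the response on its original scale, a log-response fit needs the left-hand side wrapped in log() at some point. How gamm_spatial does this internally is not shown here; the sketch below demonstrates one base R way to do it with update(), using a hypothetical helper named to_log_response.

```r
# Candidate formulas written with the response on its original scale
m.candidates <- c("bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)",
                  "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD")

# Hypothetical helper: wrap the left-hand side in log(), keep the right-hand side
to_log_response <- function(f) update(as.formula(f), log(.) ~ .)

log_formulas <- lapply(m.candidates, to_log_response)
print(log_formulas[[2]])
```

The second element has log(bai_cm2) on the left-hand side and the same predictors as the original candidate, which matches the form of the formula shown in the model summary of the report.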

generate report

generate_report(robj = m.sp)

Modeling and Diagnostics Report

This report presents the results of a generalized additive model (GAM) analysis.
The objective is to evaluate predictor contributions, describe the functional forms of relationships,
and assess the adequacy of the fitted model.
The following sections include a model summary, smooth term importance, partial effect visualizations,
and diagnostic checks to ensure the robustness of the analysis.

Model Summary

The following output provides the summary of the fitted GAM.
It includes estimated coefficients, smooth terms, approximate significance of predictors,
and overall model fit statistics.

#> 
#> Family: gaussian 
#> Link function: identity 
#> 
#> Formula:
#> log(bai_cm2) ~ log(ba_cm2_t_1) + s(ageC) + FFD + s(uid_site.fac, 
#>     bs = "re")
#> 
#> Parametric coefficients:
#>                  Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)     -0.240826   0.343068  -0.702   0.4834    
#> log(ba_cm2_t_1)  0.540556   0.078397   6.895 5.04e-11 ***
#> FFD             -0.003385   0.001400  -2.418   0.0164 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Approximate significance of smooth terms:
#>                   edf Ref.df     F  p-value    
#> s(ageC)         5.885  5.885 4.194 0.000709 ***
#> s(uid_site.fac) 1.768  2.000 5.625 0.001394 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> R-sq.(adj) =  0.924   
#>   Scale est. = 0.069672  n = 243
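
Because this model is fitted to log(bai_cm2), the parametric coefficients act multiplicatively on BAI. The sketch below back-transforms two estimates copied from the coefficient table above, holding all other terms fixed. It is a reading aid under the usual log-linear interpretation, not output of generate_report().

```r
# Estimates copied from the parametric coefficient table above
b_ffd    <- -0.003385   # coefficient of FFD
b_log_ba <-  0.540556   # coefficient of log(ba_cm2_t_1)

# One-unit increase in FFD: percent change in BAI
pct_change_ffd <- (exp(b_ffd) - 1) * 100

# 1% increase in ba_cm2_t_1: percent change in BAI
pct_change_ba <- (1.01^b_log_ba - 1) * 100

print(round(c(FFD = pct_change_ffd, ba_cm2_t_1 = pct_change_ba), 2))
```

On this reading, each additional unit of FFD is associated with a decrease in BAI of roughly 0.34%, and a 1% larger ba_cm2_t_1 with an increase of roughly 0.54%.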

Importance of Smooth Terms

The relative contribution of predictors is evaluated by calculating the importance percentage of each smooth term, based on the ssq method. This indicates how much each smooth term contributes to explaining variation in the response.

Relative importance of smooth terms
Term	Score (%)
s(ageC)	59.7
s(uid_site.fac)	40.3
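
The report does not define the ssq method. One plausible reading, assumed here, is the sum of squares of each smooth term's contribution to the linear predictor, normalized so that the smooth terms total 100%. The sketch below computes that quantity with mgcv on simulated data; the data are hypothetical, and the result may not match what generate_report() calculates.

```r
library(mgcv)

set.seed(42)
# Simulated data (hypothetical): a strong smooth effect of x1, a weaker one of x2
n  <- 300
x1 <- runif(n)
x2 <- runif(n)
y  <- sin(2 * pi * x1) + 0.3 * x2^2 + rnorm(n, sd = 0.2)
m  <- gam(y ~ s(x1) + s(x2), data = data.frame(y, x1, x2))

# Per-observation contribution of each term to the linear predictor
term_fit <- predict(m, type = "terms")

# Keep smooth terms only and take the sum of squares of each contribution
smooth_cols <- grepl("^s\\(", colnames(term_fit))
ssq <- colSums(term_fit[, smooth_cols, drop = FALSE]^2)

# Express each term's share as a percentage of the total
importance_pct <- round(100 * ssq / sum(ssq), 1)
print(importance_pct)
```

Scores of this kind are relative among the smooth terms only, so parametric terms such as log(ba_cm2_t_1) or FFD are not part of the comparison.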

Partial Effects of Smooth Terms

Partial effect plots illustrate the shape of the relationship between the response and each predictor, while holding other predictors constant. These visualizations help identify nonlinear trends and assess whether effects are monotonic, threshold-like, or more complex.

Model Diagnostics

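Diagnostic checks evaluate whether the fitted GAM meets the assumptions of independence and normality and whether the basis dimension (k) of each smooth is large enough. Residual plots, the k-index, and Q-Q plots provide evidence of model adequacy or of potential misfit, such as a smooth that is too restricted because k is too low.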

#> 
#> 'gamm' based fit - care required with interpretation.
#> Checks based on working residuals may be misleading.
#> Basis dimension (k) checking results. Low p-value (k-index<1) may
#> indicate that k is too low, especially if edf is close to k'.
#> 
#>                   k'  edf k-index p-value
#> s(ageC)         9.00 5.88    0.97    0.28
#> s(uid_site.fac) 3.00 1.77      NA      NA
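
A table of this form can be produced directly with mgcv. The sketch below uses k.check() on simulated data to compare a deliberately small basis with a more generous one. The data are hypothetical; for a model fitted with gamm(), the checks would be run on its $gam component, subject to the caveat about working residuals printed above.

```r
library(mgcv)

set.seed(7)
# Simulated data (hypothetical) with a fairly wiggly signal
n <- 400
x <- runif(n)
y <- sin(6 * pi * x) + rnorm(n, sd = 0.3)
d <- data.frame(x = x, y = y)

# Deliberately small basis versus a more generous one
m_small <- gam(y ~ s(x, k = 4),  data = d, method = "REML")
m_large <- gam(y ~ s(x, k = 20), data = d, method = "REML")

# k.check() returns the k', edf, k-index and p-value table that gam.check() prints
k_small <- k.check(m_small)
k_large <- k.check(m_large)
print(k_small)
print(k_large)
```

When edf sits close to k' and the p-value is low, refitting with a larger k and checking whether the smooth changes is the usual next step.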