R: GAM convergence and performance issues

gam.convergence {mgcv}

R Documentation

GAM convergence and performance issues

Description

When fitting GAMs there is a tradeoff between speed of fitting and probability of fit convergence. The default fitting options specified by gam.method (the default for the method argument of gam) always favour certainty of convergence over speed of fit. In the additive model case this means using the fitting routine magic rather than the slightly faster routine mgcv. In the Generalized Additive Model case it means using `outer' iteration in preference to `performance iteration': see gam.outer for details.

It is possible for the default `outer' iteration to fail while finding initial smoothing parameters using a few steps of performance iteration (if you get a convergence failure message from magic when outer iterating, then this is what has happened): lower outerPIsteps in gam.control to fix this.

There are two things that you can do to speed up GAM fitting. (i) Change the method argument to gam so that `performance iteration' is used in place of the default outer iteration. See the perf.magic option under gam.method, for example. Performance iteration usually converges well and is quick. (ii) For large datasets it may be worth changing the smoothing basis to bs="cr" (see s for details) for 1-d smooths, and using te smooths in place of s smooths for smooths of more than one variable. This is because the default thin plate regression spline basis "tp" is costly to set up for large datasets (much more than 1000 observations, say). Alternatively see the last few examples for gam.
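As a rough illustration of option (ii), the sketch below fits the same kind of model with "cr" bases for the 1-d smooths and with a te term for a smooth of two variables. The simulated data and the names dat, fit_cr and fit_te are made up for this example; only s, te, gam and the bs argument come from the text above.

```r
library(mgcv)

# Hypothetical simulated data: a few thousand observations of two covariates.
set.seed(1)
n <- 2000
x <- runif(n)
z <- runif(n)
y <- sin(2 * pi * x) + exp(z) * x + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x = x, z = z)

# 1-d smooths using the cubic regression spline basis instead of the default "tp".
fit_cr <- gam(y ~ s(x, bs = "cr") + s(z, bs = "cr"), data = dat)

# A smooth of two variables expressed as a tensor product smooth, in place of s(x, z).
fit_te <- gam(y ~ te(x, z), data = dat)

summary(fit_cr)
summary(fit_te)
```

Wrapping each call in system.time() is a simple way to check whether the change of basis actually helps on a given dataset.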

If the GAM estimation process fails to converge when using performance iteration, then switch to outer iteration via the method argument of gam (see gam.method). If it still fails, try increasing the number of IRLS iterations (see gam.control) or perhaps experiment with the convergence tolerance.
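A minimal sketch of passing a larger iteration limit and a different convergence tolerance through gam.control is given below. The data are simulated and the values 500 and 1e-6 are arbitrary illustrations, not recommendations; the epsilon argument name is assumed to be the convergence tolerance meant above, so check gam.control in the installed version.

```r
library(mgcv)

# Hypothetical Poisson data with a smooth effect of x.
set.seed(2)
n <- 400
x <- runif(n)
y <- rpois(n, lambda = exp(1 + sin(2 * pi * x)))
dat <- data.frame(y = y, x = x)

# Allow more IRLS iterations and set the convergence tolerance explicitly.
ctrl <- gam.control(maxit = 500, epsilon = 1e-6)

fit <- gam(y ~ s(x), family = poisson, data = dat, control = ctrl)

# Convergence flag stored on the fitted object, if present in this version.
fit$converged
```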

If you still have problems, it's worth noting that a GAM is just a (penalized) GLM and the IRLS scheme used to estimate GLMs is not guaranteed to converge. Hence non-convergence of a GAM may relate to a lack of stability in the basic IRLS scheme. It is therefore worth trying to establish whether the IRLS iterations are capable of converging. To do this, fit the problematic GAM with all smooth terms specified with fx=TRUE so that the smoothing parameters are all fixed at zero. If this `largest' model can converge, then the maintainer would quite like to know about your problem! If it doesn't converge, then it's likely that your model is just too flexible for the IRLS process itself.

Having tried increasing maxit in gam.control, there are several other possibilities for stabilizing the iteration. It is possible to try (i) setting lower bounds on the smoothing parameters using the min.sp argument of gam: this may or may not change the model being fitted; (ii) reducing the flexibility of the model by reducing the basis dimensions k in the specification of s and te model terms: this obviously changes the model being fitted somewhat; (iii) introducing a small regularization term into the fitting via the irls.reg argument of gam.control: this option obviously changes the nature of the fit somewhat, since parameter estimates are pulled towards zero by doing this.
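The diagnostic fit and the three stabilizing options might look roughly as follows. This is only a sketch on simulated binary data: the data, the object names and the numerical values for min.sp, k and irls.reg are illustrative assumptions, and how much effect each option has may depend on the mgcv version and on the fitting method in use.

```r
library(mgcv)

# Hypothetical binary response with smooth effects of x and z.
set.seed(3)
n <- 400
x <- runif(n)
z <- runif(n)
eta <- -1 + 2 * sin(2 * pi * x) + z
y <- rbinom(n, 1, plogis(eta))
dat <- data.frame(y = y, x = x, z = z)

# Diagnostic: all smooths unpenalized (fx = TRUE), so only the basic IRLS
# scheme is exercised and no smoothing parameters are estimated.
fit_fx <- gam(y ~ s(x, fx = TRUE) + s(z, fx = TRUE),
              family = binomial, data = dat)

# (i) Lower bounds on the smoothing parameters: one value per smoothing
# parameter, here one for each of the two s terms.
fit_minsp <- gam(y ~ s(x) + s(z), family = binomial, data = dat,
                 min.sp = c(1e-3, 1e-3))

# (ii) Reduced basis dimension k for each smooth.
fit_k <- gam(y ~ s(x, k = 5) + s(z, k = 5),
             family = binomial, data = dat)

# (iii) A small regularization term in the IRLS fitting.
fit_reg <- gam(y ~ s(x) + s(z), family = binomial, data = dat,
               control = gam.control(irls.reg = 1e-4))
```

If fit_fx itself fails to converge, options (ii) and (iii) are the ones that directly reduce the flexibility the IRLS iteration has to cope with.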

Usually, a major contributor to fitting difficulties is that the model is a very poor description of the data.

Author(s)

Simon N. Wood simon.wood@r-project.org

[Package mgcv version 1.3-12 Index]