R: Constructor functions for smooth terms in a GAM

smooth.construct {mgcv}

R Documentation

Constructor functions for smooth terms in a GAM

Description

Smooth terms in a GAM formula are turned into smooth specification objects of class xx.smooth.spec during processing of the formula. Each of these objects is converted to a smooth object using an appropriate smooth.construct function. New smooth classes can be added by writing a new smooth.construct method function and a corresponding Predict.matrix method function (see example code below).

In practice, smooth.construct is usually called via the wrapper function smoothCon.

Usage

smooth.construct(object,data,knots)

Arguments

object

is a smooth specification object, generated by an s or te term in a GAM formula. Objects generated by s terms have class xx.smooth.spec where xx is given by the bs argument of s (this convention allows the user to add their own smoothers). If object is not class tensor.smooth.spec it will have the following elements:

term

The names of the covariates for this smooth, in an array.

bs.dim

Argument k of the s term generating the object. This is the dimension of the basis used to represent the term (or, arguably, 1 greater than the basis dimension for cc terms).

fixed

TRUE if the term is to be unpenalized, otherwise FALSE.

dim

the number covariates of which this smooth is a function.

p.order

the order of the smoothness penalty or 0 for autoselection of this. This is argument m of the s term that generated object.

the name of any by variable to multiply this term as supplied as an argument to s. "NA" if there is no such term.

full.call

The full version of the s term, with all defaults expanded explicitly.

label

A suitable label for use with this term.

null.space.dim

The dimension of the null space of the wiggliness penalty.

If object is of class tensor.smooth.spec then it was generated by a te term in the GAM formula, and specifies a smooth of several variables with a basis generated as a tensor product of lower dimensional bases. In this case the object will be different and will have the following elements:

margin

is a list of smooth specification objects of the type listed above, defining the bases which have their tensor product formed in order to construct this term.

term

is the array of names of the covariates that are arguments of the smooth.

is the name of any by variable, or "NA".

is an array, the elements of which indicate whether (TRUE) any of the margins in the tensor product should be unpenalized.

full.call

The full version of the s term, with all defaults expanded explicitly.

label

A suitable label for use with this term.

dim

is the number of covariates of which this smooth is a function.

TRUE if multiple penalties are to be used.

TRUE if 1-D marginal smooths are to be re-parameterized in terms of function values.

data a data frame in which the covariates and any by variable can be found.

knots an optional data frame specifying knot locations for each covariate. If it is null then the knot locations are generated automatically.

Details

The returned objects for the built in smooth classes have the following extra elements. cr.smooth objects (generated using bs="cr") have an additional array xp giving the knot locations used to generate the basis.

cyclic.smooth objects (generated using bs="cc") have an array xp of knot locations and a matrix BD used to define the basis (BD transforms function values at the knots to second derivatives at the knots).

tprs.smooth objects require several items to be stored in order to define the basis. These are:

shift

A record of the shift applied to each covariate in order to center it around zero and avoid any co-linearity problems that might otehrwise occur in the penalty null space basis of the term.

A matrix of the unique covariate combinations for this smooth (the basis is constructed by first stripping out duplicate locations).

The matrix mapping the t.p.r.s. parameters back to the parameters of a full thin plate spline.

null.space.dimension

The dimension of the space of functions that have zero wiggliness according to the wiggliness penalty for this term.

Again, these extra elements would be found in the elements of the margin list of tensor.smooth class object.

The classes cs.smooth and ts.smooth are modifications of the classes cr.smooth and tprs.smooth respsectively: they differ only in having an extra shrinkage component added to their penalties, so that automatic model selection can effectively remove such terms from a model altogether.

Value

The input argument object, assigned a new class to indicate what type of smooth it is and with at least the following items added:

`X`	The model matrix from this term.
`C`	The matrix defining any constraints on the term - usually a one row matrix giving the column sums of the model matrix, which defines the constraint that each term should sum to zero over the covariate values.
`S`	A list of positive semi-definite penalty matrices that apply to this term. The list will be empty if the term is to be left un-penalized.
`rank`	an array giving the ranks of the penalties.
`df`	the degrees of freedom associated with this term (at least when unpenalized).

Usually the returned object will also include extra information required to define the basis, and used by Predict.matrix methods to make predictions using the basis. See the Details section for the information included for the built in smooth classes.
tensor.smooth returned objects will additionally have each element of the margin list updated in the same way. tensor.smooths also have a list, XP, containing re-parameterization matrices for any 1-D marginal terms re-parameterized in terms of function values. This list will have NULL entries for marginal smooths that are not re-parameterized, and is only long enough to reach the last re-parameterized marginal in the list.

Author(s)

Simon N. Wood simon.wood@r-project.org

References

Wood, S.N. (2000) Modelling and Smoothing Parameter Estimation with Multiple Quadratic Penalties. J.R.Statist.Soc.B 62(2):413-428

Wood, S.N. (2003) Thin plate regression splines. J.R.Statist.Soc.B 65(1):95-114

Wood, S.N. (in press) Stable and efficient multiple smoothing parameter estimation for generalized additive models. J. Amer. Statist. Ass.

The p-spline code given in the example is based on:

Eilers, P.H.C. and B.D. Marx (1996) Flexible Smoothing with B-splines and Penalties. Statistical Science, 11(2):89-121

http://www.stats.gla.ac.uk/~simon/

Examples

# adding "p-spline" classes and methods

smooth.construct.ps.smooth.spec<-function(object,data,knots)
# a p-spline constructor method function
{ require(splines)
  if (length(object$p.order)==1) m<-rep(object$p.order,2) 
  else m<-object$p.order  # m[1] - basis order, m[2] - penalty order
  nk<-object$bs.dim-m[1]  # number of interior knots
  if (nk<=0) stop("basis dimension too small for b-spline order")
  x <- get.var(object$term,data)  # find the data
  xl<-min(x);xu<-max(x);xr<-xu-xl # data limits and range
  xl<-xl-xr*0.001;xu<-xu+xr*0.001;dx<-(xu-xl)/(nk-1) 
  if (!is.null(knots)) k <- get.var(object$term,knots) 
  else k<-NULL
  if (is.null(k)) 
  k<-seq(min(x)-dx*(m[1]+1),max(x)+dx*(m[1]+1),length=nk+2*m[1]+2)   
  if (length(k)!=nk+2*m[1]+2) 
  stop(paste("there should be ",nk+2*m[1]+2," supplied knots"))
  object$X<-spline.des(k,x,m[1]+2,x*0)$design # get model matrix
  if (!object$fixed)       
  { S<-diag(object$bs.dim);if (m[2]) for (i in 1:m[2]) S<-diff(S)
    object$S<-list(t(S)%*%S)  # get penalty
    object$S[[1]] <- (object$S[[1]]+t(object$S[[1]]))/2 # exact symmetry
  }
  object$rank<-object$bs.dim-m[2]  # penalty rank 
  object$null.space.dim <- m[2]  # dimension of unpenalized space  
  object$knots<-k;object$m<-m      # store p-spline specific info.
  object$C<-matrix(colSums(object$X),1,object$bs.dim) #constraint
  object$df<-ncol(object$X)-1      # maximum DoF
  if (object$by!="NA")  # deal with "by" variable 
  { by <- get.var(object$by,data) # find by variable  
    if (is.null(by)) stop("Can't find by variable")
    object$X<-as.numeric(by)*object$X # form diag(by)%*%X
  }
  class(object)<-"pspline.smooth"  # Give object a class
  object
}

Predict.matrix.pspline.smooth<-function(object,data)
# prediction method function for the p.spline smooth class
{ require(splines)
  x <- get.var(object$term,data)
  spline.des(object$knots,x,object$m[1]+2,x*0)$design
}

# an example, using the new class....
set.seed(0);n<-400;
x0 <- runif(n, 0, 1);x1 <- runif(n, 0, 1)
x2 <- runif(n, 0, 1);x3 <- runif(n, 0, 1)
f <- 2 * sin(pi * x0)
f <- f + exp(2 * x1) - 3.75887
f <- f+0.2*x2^11*(10*(1-x2))^6+10*(10*x2)^3*(1-x2)^10-1.396
e <- rnorm(n)*2
y <- f + e
b<-gam(y~s(x0,bs="ps",m=2)+s(x1,bs="ps",m=c(1,3))+
         s(x2,bs="ps",m=2)+s(x3,bs="ps",m=2))
plot(b,pages=1)
# another example using tensor products of the new class

test1<-function(x,z,sx=0.3,sz=0.4)
{ (pi**sx*sz)*(1.2*exp(-(x-0.2)^2/sx^2-(z-0.3)^2/sz^2)+
  0.8*exp(-(x-0.7)^2/sx^2-(z-0.8)^2/sz^2))
}
n<-400
x<-runif(n);z<-runif(n);
f <- test1(x,z)
y <- f + rnorm(n)*0.1
b <- gam(y~te(x,z,bs=c("ps","ps"),m=c(2,2)))
vis.gam(b)

[Package mgcv version 1.3-12 Index]