rcorr {Hmisc} | R Documentation |
rcorr computes a matrix of Pearson's r or Spearman's rho rank correlation
coefficients for all possible pairs of columns of a matrix. Missing values
are deleted in pairs rather than deleting all rows of x having any missing
variables. Ranks are computed using efficient algorithms (see reference 2),
using midranks for ties.
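The difference between pairwise and row-wise deletion can be illustrated with base R alone. The following is a minimal sketch that uses cor from the stats package rather than rcorr itself; the matrix m and its values are made up for illustration.

```r
# Toy matrix with one missing value in column z (illustrative data only)
m <- cbind(x = c(-2, -1, 0, 1, 2),
           v = c(1, 2, 3, 5, 9),
           z = c(1, 2, 4, 3, NA))

# Pairwise deletion: each correlation uses every row that is complete
# for that particular pair of columns
r_pairwise <- cor(m, use = "pairwise.complete.obs")

# Row-wise (listwise) deletion: a row with any NA is dropped for all pairs
r_listwise <- cor(m, use = "complete.obs")

# x and v have no missing values, so pairwise deletion keeps all 5 rows
# for that pair, while row-wise deletion keeps only the first 4
r_pairwise["x", "v"]
r_listwise["x", "v"]
```

Pairs that involve z use the same 4 rows under either rule, so only the x, v entry differs between the two matrices.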
spearman2 computes the square of Spearman's rho rank correlation and a
generalization of it in which x can relate non-monotonically to y. This is
done by computing the Spearman multiple rho-squared between
(rank(x), rank(x)^2) and y.
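The quadratic rank generalization can be sketched in base R as an ordinary regression on ranks. This is an illustration of the idea, not the spearman2 source; it assumes that y is also replaced by its ranks, and the variable names are hypothetical.

```r
# U-shaped relationship: y is not monotonic in x
x <- c(-2, -1, 0, 1, 2)
y <- c(4, 1, 0, 1, 4)

rx <- rank(x)  # midranks are used for ties by default
ry <- rank(y)  # assumption: y is ranked as well

# Ordinary rho^2 (the p=1 case): squared Pearson correlation of the ranks
rho2_linear <- cor(rx, ry)^2

# Quadratic generalization (the p=2 case): multiple R^2 from regressing
# rank(y) on rank(x) and rank(x)^2
rho2_quad <- summary(lm(ry ~ rx + I(rx^2)))$r.squared

rho2_linear
rho2_quad
```

For these symmetric data the ordinary rho^2 is essentially zero, while the quadratic version picks up most of the association.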
When x is categorical, a different kind of Spearman correlation used in the
Kruskal-Wallis test is computed (and spearman2 can do the Kruskal-Wallis
test). This is done by computing the ordinary multiple R^2 between k-1
dummy variables and rank(y), where x has k categories. x can also be a
formula, in which case each predictor is correlated separately with y,
using non-missing observations for that predictor.
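The link between the dummy-variable R^2 and the Kruskal-Wallis test can be checked with base R. The sketch below uses made-up data and the chi-square form of the statistic returned by kruskal.test; it is not a description of what spearman2 itself reports.

```r
# Hypothetical data: a 3-level categorical predictor and a numeric response
g <- factor(c("a", "a", "a", "b", "b", "b", "c", "c", "c"))
y <- c(1.2, 2.3, 0.7, 3.1, 4.8, 2.9, 6.0, 5.5, 7.1)
n <- length(y)

# lm() expands the factor into k-1 dummy variables, so this is the
# ordinary multiple R^2 between those dummies and rank(y)
r2 <- summary(lm(rank(y) ~ g))$r.squared

# Kruskal-Wallis chi-square statistic from base R
H <- unname(kruskal.test(y, g)$statistic)

# The two quantities are related by H = (n - 1) * R^2
c(H = H, from_r2 = (n - 1) * r2)
```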
print and plot methods allow one to easily print or plot the results of
spearman2(formula). The adjusted rho^2 is also computed, using the same
formula used for the ordinary adjusted R^2. The F test uses the unadjusted
R^2. For plot, a dot chart is drawn which by default shows, in sorted
order, the adjusted rho^2.
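For reference, the ordinary adjusted R^2 formula and the corresponding F statistic can be written as two small helpers. This is a sketch under the assumption that p counts the regression degrees of freedom (1 for the ordinary rho^2, 2 for the quadratic version, k-1 for a categorical predictor); adj_r2 and f_stat are hypothetical names, not Hmisc functions.

```r
# Ordinary adjusted R^2 for n observations and p regression d.f.
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# F statistic built from the unadjusted R^2, with p and n - p - 1 d.f.
f_stat <- function(r2, n, p) {
  (r2 / p) / ((1 - r2) / (n - p - 1))
}

adj_r2(0.5, n = 11, p = 2)
f_stat(0.5, n = 11, p = 2)
```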
spearman computes Spearman's rho on non-missing values of two variables.
spearman.test is a simple version of spearman2.default.
rcorr(x, y, type=c("pearson","spearman"))

## S3 method for class 'rcorr':
print(x, ...)

spearman2(x, ...)

## Default S3 method:
spearman2(x, y, p=1, minlev=0, exclude.imputed=TRUE, ...)

## S3 method for class 'formula':
spearman2(x, p=1, data, subset, na.action, minlev=0,
          exclude.imputed=TRUE, ...)

## S3 method for class 'spearman2.formula':
print(x, ...)

## S3 method for class 'spearman2.formula':
plot(x, what=c('Adjusted rho2','rho2','P'), sort.=TRUE, main, xlab, ...)

spearman(x, y)

spearman.test(x, y, p=1)
x |
a numeric matrix with at least 5 rows and at least 2 columns (if
y is absent). For spearman2, the first argument may be a vector
of any type, including character or factor. It may also be a
formula, in which case spearman2.formula is invoked and each
predictor on the right-hand side of the formula is correlated
separately with the response variable. For print, x is an object
produced by rcorr or spearman2. For plot, x
is a result returned by spearman2. For spearman and
spearman.test, x is a numeric vector, as is y.
|
type |
specifies the type of correlations to compute. Spearman correlations are the Pearson linear correlations computed on the ranks of non-missing elements, using midranks for ties. |
y |
a numeric vector or matrix which will be concatenated to x. If
y is omitted for rcorr, x must be a matrix.
|
p |
for numeric variables, specifies the order of the Spearman rho^2 to
use. The default is p=1 to compute the ordinary rho^2. Use p=2
to compute the quadratic rank generalization to allow
non-monotonicity. p is ignored for categorical predictors.
|
data, subset, na.action |
the usual options for models. Default for na.action is to retain
all values, NA or not, so that NAs can be deleted in only a pairwise
fashion.
|
minlev |
minimum relative frequency that a level of a categorical predictor
should have before it is pooled with other categories (see
combine.levels) in spearman2. The default, minlev=0, causes no pooling.
|
exclude.imputed |
set to FALSE to include imputed values (created by impute) in the calculations.
|
what |
specifies which statistic to plot |
sort. |
set sort.=FALSE to suppress sorting variables by the statistic being plotted
|
main |
main title for plot. Default title shows the name of the response variable. |
xlab |
x-axis label. Default constructed from what.
|
... |
other arguments that are passed to dotchart2
|
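The description of type above, Spearman correlations as Pearson correlations computed on midranks, can be verified with base R. This sketch uses cor and rank from base R with illustrative data that contain ties.

```r
# Illustrative vectors with tied values
x <- c(3, 1, 4, 1, 5, 9, 2, 6)
y <- c(2, 7, 1, 8, 2, 8, 1, 8)

# rank() assigns midranks (the average rank) to ties by default
rank(c(10, 20, 20, 30))

# Pearson correlation of the midranks ...
rho_from_ranks <- cor(rank(x), rank(y))

# ... agrees with base R's Spearman correlation
rho_spearman <- cor(x, y, method = "spearman")

all.equal(rho_from_ranks, rho_spearman)
```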
Uses midranks in case of ties, as described by Hollander and Wolfe.
P-values are approximated by using the t distribution.
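A t approximation for a correlation P-value can be sketched as follows. The helper r_to_p is a hypothetical name, and the sketch assumes the usual two-sided form with n - 2 degrees of freedom; it is compared against base R's cor.test rather than against rcorr.

```r
# Two-sided P-value for a correlation r based on n pairs,
# assuming t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 d.f.
r_to_p <- function(r, n) {
  t_stat <- r * sqrt((n - 2) / (1 - r^2))
  2 * pt(-abs(t_stat), df = n - 2)
}

# Illustrative data
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 1.9, 3.6, 3.2, 5.8, 5.1, 7.4, 6.9)

p_sketch <- r_to_p(cor(x, y), length(x))
p_base   <- cor.test(x, y)$p.value  # Pearson test from the stats package

all.equal(p_sketch, p_base)
```

For Spearman correlations the same helper would be applied to the correlation of the ranks, which makes the result an approximation rather than an exact test.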
rcorr returns a list with elements r, the matrix of correlations, n, the
matrix of number of observations used in analyzing each pair of variables,
and P, the asymptotic P-values. Pairs with fewer than 2 non-missing values
have the r values set to NA. The diagonals of n are the number of non-NAs
for the single variable corresponding to that row and column.
spearman2.default (the function that is called for a single x, i.e., when
there is no formula) returns a vector of statistics for the variable.
spearman2.formula returns a matrix with rows corresponding to predictors.
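The structure of the n element can be reproduced in base R by counting, for every pair of columns, the rows in which both are non-missing. This is a sketch of the idea only, not the rcorr implementation; the names ok and n_pairs are hypothetical.

```r
# Same toy variables as in the examples; z has one missing value
m <- cbind(x = c(-2, -1, 0, 1, 2),
           y = c(4, 1, 0, 1, 4),
           z = c(1, 2, 3, 4, NA),
           v = c(1, 2, 3, 4, 5))

# 1 where a value is present, 0 where it is NA
ok <- !is.na(m)

# Cross-product of the indicator matrix: entry [i, j] is the number of
# rows where columns i and j are both non-missing; the diagonal is the
# number of non-NAs for each single variable
n_pairs <- crossprod(ok * 1)
n_pairs
```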
Frank Harrell
Department of Biostatistics
Vanderbilt University
f.harrell@vanderbilt.edu
Hollander M. and Wolfe D.A. (1973). Nonparametric Statistical Methods. New York: Wiley.
Press W.H., Flannery B.P., Teukolsky S.A. and Vetterling W.T. (1988). Numerical Recipes in C. Cambridge: Cambridge University Press.
hoeffd, cor, combine.levels, varclus, dotchart2, impute
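library(Hmisc)

x <- c(-2, -1, 0, 1, 2)
y <- c(4, 1, 0, 1, 4)
z <- c(1, 2, 3, 4, NA)
v <- c(1, 2, 3, 4, 5)

# Correlation matrix with pairwise deletion of the NA in z
rcorr(cbind(x,y,z,v))

# Spearman rho^2 for a single predictor
spearman2(x, y)

# Quadratic rank generalization for each predictor, shown as a dot chart
plot(spearman2(z ~ x + y + v, p=2))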