TwoSampleTests {fBasics} | R Documentation |
A collection and description of functions for
two-sample statistical tests. The functions test
for distributional equivalence, for differences
in location, variance, and scale, and for correlations.
Distributional Equivalence:
ks2Test | Two sample Kolmogorov-Smirnov test. |
Difference in Locations:
tTest | The t test, |
kw2Test | the Kruskal–Wallis test. |
Difference in Variance:
varfTest | The variance F test, |
bartlett2Test | the Bartlett test, |
fligner2Test | the Fligner–Killeen test. |
Difference in Scale:
ansariTest | The Ansari–Bradley test, |
moodTest | the Mood test. |
Correlations:
pearsonTest | Pearson's coefficient, |
kendallTest | Kendall's tau, |
spearmanTest | Spearman's rho. |
Test Distributions:
[dpq]ansariw | Distribution of the Ansari W statistic. |
ks2Test(x, y, title = NULL, description = NULL)
tTest(x, y, title = NULL, description = NULL)
kw2Test(x, y, title = NULL, description = NULL)
varfTest(x, y, title = NULL, description = NULL)
bartlett2Test(x, y, title = NULL, description = NULL)
fligner2Test(x, y, title = NULL, description = NULL)
ansariTest(x, y, title = NULL, description = NULL)
moodTest(x, y, title = NULL, description = NULL)
pearsonTest(x, y, title = NULL, description = NULL)
kendallTest(x, y, title = NULL, description = NULL)
spearmanTest(x, y, title = NULL, description = NULL)
dansariw(x = NULL, m, n = m)
pansariw(q = NULL, m, n = m)
qansariw(p, m, n = m)
description |
optional description string, or a vector of character strings. |
m, n |
[*ansariw] - |
p |
[qansariw] - a numeric vector of probabilities. |
q |
[pansariw] - a numeric vector of quantiles. |
title |
an optional title string; if not specified, the name of the input data is deparsed. |
x, y |
a numeric vector of data values.
[bartlett2Test][fligner2Test][kw2Test] - here x is a list, where each element is either a vector
or an object of class timeSeries. y is only used
in the two-sample situation, where x and y
are two vectors or objects of class timeSeries.
[dansariw] - a numeric vector of quantiles. |
The tests may be of interest for many financial
and economic applications, especially for the
comparison of two time series. The tests are grouped
according to their functionalities.
Distributional Equivalence:
The function ks2Test performs a two-sample Kolmogorov–Smirnov test
of the null hypothesis that the two data samples x and y come from
the same distribution. This common distribution need not be normal
and is otherwise left unspecified.
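As a rough illustration of what the test measures, the sketch below uses `ks.test` from base R's stats package rather than ks2Test itself. Whether ks2Test returns identical numbers is an assumption; the sample data and seed are made up for the example. The statistic D is the largest vertical gap between the two empirical distribution functions, which the last lines recompute by hand.

```r
# Two-sample Kolmogorov-Smirnov test with base R (stats::ks.test).
# Assumption: this stands in for ks2Test; the data are simulated.
set.seed(1)
x <- rnorm(50)        # standard normal sample
y <- rt(50, df = 3)   # heavier-tailed sample

ks <- ks.test(x, y)   # H0: x and y come from the same distribution
ks$statistic          # D statistic
ks$p.value

# Recompute D as the largest gap between the two empirical CDFs,
# evaluated at every pooled observation.
grid <- sort(c(x, y))
D <- max(abs(ecdf(x)(grid) - ecdf(y)(grid)))
```

No distribution family is passed to the call, which matches the point above that the common distribution is left unspecified.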
Differences in Location:
The function tTest can be used to determine whether the two sample
means are equal for unpaired data sets. Two variants are used,
assuming equal or unequal variances.
The function kw2Test performs a Kruskal–Wallis rank sum
test of the null hypothesis that the central tendencies or medians of
two samples are the same. The alternative is that they differ.
Note that it is not assumed that the two samples are drawn from the
same distribution. It is also worth knowing that the test assumes
that the variables under consideration have underlying continuous
distributions.
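The two t test variants and the rank-based alternative can be sketched with base R's stats functions. This is an illustration under the assumption that `t.test` and `kruskal.test` stand in for tTest and kw2Test; the simulated data are not from the original page.

```r
# Location tests with base R; assumed stand-ins for tTest and kw2Test.
set.seed(2)
x <- rnorm(50, mean = 0,   sd = 1)
y <- rnorm(50, mean = 0.5, sd = 2)

t_equal   <- t.test(x, y, var.equal = TRUE)    # pooled-variance t test
t_unequal <- t.test(x, y, var.equal = FALSE)   # Welch t test
kw        <- kruskal.test(list(x, y))          # Kruskal-Wallis rank sum test

# Degrees of freedom: n1 + n2 - 2 for the pooled test,
# a smaller (usually non-integer) value for the Welch test.
t_equal$parameter
t_unequal$parameter
kw$p.value
```

Passing the samples as a list mirrors the list form of x described in the arguments for kw2Test.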
Differences in Variances:
The function varfTest can be used to compare the variances of two
normal samples by performing an F test. The null hypothesis is that
the ratio of the variances of the populations from which they were
drawn is equal to one.
The function bartlett2Test performs Bartlett's test of the
null hypothesis that the variances in each of the samples are the
same. Equality of variances across samples is also called
homogeneity of variances. Note that Bartlett's test is
sensitive to departures from normality. That is, if the samples
come from non-normal distributions, then Bartlett's test may simply
be testing for non-normality. The Levene test (not yet implemented)
is an alternative to the Bartlett test that is less sensitive to
departures from normality.
The function fligner2Test performs the Fligner–Killeen test of
the null hypothesis that the variances in each of the two samples
are the same.
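The three variance tests can be compared side by side using base R's stats functions. The sketch assumes that `var.test`, `bartlett.test`, and `fligner.test` behave like varfTest, bartlett2Test, and fligner2Test; the simulated samples are for illustration only.

```r
# Variance tests with base R; assumed stand-ins for the fBasics wrappers.
set.seed(3)
x <- rnorm(50, sd = 1)
y <- rnorm(50, sd = 1.5)

f  <- var.test(x, y)                 # F test, H0: variance ratio equals one
bt <- bartlett.test(list(x, y))      # Bartlett's test of homogeneity
fk <- fligner.test(list(x, y))       # Fligner-Killeen test

# The F statistic is simply the ratio of the two sample variances.
f$statistic
var(x) / var(y)

c(F = f$p.value, Bartlett = bt$p.value, Fligner = fk$p.value)
```

Running the three tests on the same data makes it easy to see how the normal-theory tests and the rank-based Fligner–Killeen test can disagree when the samples are far from normal.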
Differences in Scale:
The function ansariTest performs the Ansari–Bradley two-sample
test for a difference in scale parameters. Note that this test has been
completely reimplemented, based on the statistics and p-values computed
by algorithm AS 93. For series x and y of any size, the test returns
the exact p-value together with its asymptotic limit. Unlike the
function ansari.test implemented in R's stats package, the procedure
is not limited to series shorter than 50 observations. For the test
statistic, the following functions are available: dansariw, pansariw,
and qansariw.
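For comparison, base R's `ansari.test` reports either the exact or the asymptotic p-value, not both in one result. By default it computes the exact p-value only when both samples contain fewer than 50 values and there are no ties. The sketch below requests each variant explicitly on small simulated samples; it does not use ansariTest or the [dpq]ansariw functions.

```r
# Ansari-Bradley test with base R, exact versus asymptotic p-value.
# Assumption: shown only for comparison with ansariTest.
set.seed(4)
x <- rnorm(30)
y <- rnorm(30, sd = 2)

ab_exact <- ansari.test(x, y, exact = TRUE)    # exact null distribution
ab_asym  <- ansari.test(x, y, exact = FALSE)   # normal approximation

# Same test statistic, two ways of computing the p-value.
ab_exact$statistic
c(exact = ab_exact$p.value, asymptotic = ab_asym$p.value)
```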
The function moodTest performs another
two-sample test for a difference in scale parameters. The underlying
model is that the two samples are drawn from f(x-l) and
f((x-l)/s)/s, respectively, where l is a common
location parameter and s is a scale parameter. The null
hypothesis is s=1.
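The location-scale model can be simulated directly. In the sketch below, f is taken to be the standard normal density, and the values of l and s are chosen arbitrarily for illustration; base R's `mood.test` is used as an assumed stand-in for moodTest.

```r
# Mood test under the location-scale model, using base R.
# Assumptions: f is the standard normal density; l and s are made up.
set.seed(5)
l <- 1                      # common location parameter
s <- 3                      # scale parameter (H0 would be s = 1)
x <- l + rnorm(200)         # drawn from f(x - l)
y <- l + s * rnorm(200)     # drawn from f((x - l) / s) / s

md <- mood.test(x, y)       # H0: s = 1
md$statistic
md$p.value
```

Setting s to 1 in the simulation gives data generated under the null hypothesis.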
Correlations:
The function pearsonTest tests for association
between paired samples, using Pearson's product moment
correlation coefficient.
The function kendallTest performs Kendall's tau test.
The function spearmanTest performs Spearman's rho test.
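The three association measures can be obtained from base R's `cor.test` by switching its `method` argument. The sketch assumes this mirrors pearsonTest, kendallTest, and spearmanTest; the paired data are simulated for the example.

```r
# Correlation tests with base R; assumed stand-ins for the fBasics wrappers.
set.seed(6)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50, sd = 0.8)   # paired sample correlated with x

pearson  <- cor.test(x, y, method = "pearson")    # product moment correlation
kendall  <- cor.test(x, y, method = "kendall")    # Kendall's tau
spearman <- cor.test(x, y, method = "spearman")   # Spearman's rho

# Estimated coefficients (named cor, tau, and rho).
c(pearson$estimate, kendall$estimate, spearman$estimate)
```

All three tests require x and y to have the same length, since the observations are treated as pairs.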
The output report differs from the one R produces for S3 objects
of class "htest". The classical tests presented here return an S4
object of class "fHTEST". The object contains the following slots:
@call |
the function call. |
@data |
the data as specified by the input argument(s). |
@test |
a list whose elements contain the results from the statistical test. The information provided is similar to a list object of class "htest". |
@title |
a character string with the name of the test. This can be overwritten by specifying a user-defined input argument. |
@description |
a character string with an optional user-defined description. By default, only the date on which the test was applied is returned. |
statistic |
the value(s) of the test statistic. |
p.value |
the p-value(s) of the test. |
parameters |
a numeric value or vector of parameters. |
estimate |
a numeric value or vector of sample estimates. |
conf.int |
a numeric two-row vector or matrix of 95% confidence levels. |
method |
a character string indicating what type of test was performed. |
data.name |
a character string giving the name(s) of the data. |
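Since the page describes the test results as similar to a list of class "htest", the element names above can be explored on a plain base R result. The sketch below does this for `t.test`; how the same elements are reached inside an "fHTEST" object is not shown here.

```r
# Elements of a base R "htest" result, matching the names listed above.
set.seed(7)
x <- rnorm(50)
y <- rnorm(50)

res <- t.test(x, y)

res$statistic    # value of the test statistic
res$p.value      # p-value of the test
res$parameter    # degrees of freedom
res$estimate     # sample means of x and y
res$conf.int     # confidence interval for the difference in means
res$method       # type of test performed
res$data.name    # name(s) of the data

# The confidence level is stored as an attribute of the interval.
attr(res$conf.int, "conf.level")
```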
Some of the test implementations are selected from R's ctest
package.
R-core team for the tests from R's ctest package,
Diethelm Wuertz for the Rmetrics R-port.
Conover, W.J. (1971); Practical Nonparametric Statistics, John Wiley & Sons, New York.
Durbin, J. (1961); Some Methods of Constructing Exact Tests, Biometrika 48, 41–55.
Durbin, J. (1973); Distribution Theory Based on the Sample Distribution Function, SIAM, Philadelphia.
Lehmann, E.L. (1986); Testing Statistical Hypotheses, John Wiley & Sons, New York.
Moore, D.S. (1986); Tests of the Chi-Squared Type, in: D'Agostino, R.B. and Stephens, M.A., eds., Goodness-of-Fit Techniques, Marcel Dekker, New York.
## SOURCE("fBasics.15E-TwoSampleTests")

## x, y -
   xmpBasics("\nStart: Create two Samples > ")
   x = rnorm(50)
   y = rnorm(50)

## ks2Test -
   xmpBasics("\nNext: Distributional Tests > ")
   ks2Test(x, y)

## tTest | kw2Test -
   xmpBasics("\nNext: Location Tests > ")
   tTest(x, y)
   kw2Test(x, y)

## varfTest, bartlett2Test | fligner2Test -
   xmpBasics("\nNext: Variance Tests > ")
   varfTest(x, y)
   bartlett2Test(x, y)
   fligner2Test(x, y)

## ansariTest | moodTest -
   xmpBasics("\nNext: Scale Tests > ")
   ansariTest(x, y)
   moodTest(x, y)

## pearsonTest | kendallTest | spearmanTest -
   xmpBasics("\nNext: Correlation Tests > ")
   pearsonTest(x, y)
   kendallTest(x, y)
   spearmanTest(x, y)