R: One Sample Tests

OneSampleTests {fBasics}

R Documentation

One Sample Tests

Description

A collection and description of one sample tests for testing normality and for detecting non-randomness in observations.

The functions for testing normality are:

`normalTest`	test suite for some normality tests,
`ksnormTest`	Kolmogorov-Smirnov normality test,
`shapiroTest`	Shapiro-Wilk's test for normality,
`jarqueberaTest`	Jarque–Bera test for normality,
`dagoTest`	D'Agostino normality test.

Functions for high precision Jarque Bera LM and ALM tests:

`jbTable`	Table of finite sample p values for the JB test,
`pjb`	Computes probabilities for the Jarque Bera Test,
`qjb`	Computes quantiles for the Jarque Bera Test,
`jbTest`	Performs finite sample adjusted JB LM and ALM test.

Additional functions for testing normality from the 'nortest' package:

`adTest`	Anderson–Darling normality test,
`cvmTest`	Cramer–von Mises normality test,
`lillieTest`	Lilliefors (Kolmogorov-Smirnov) normality test,
`pchiTest`	Pearson chi–square normality test,
`sfTest`	Shapiro–Francia normality test.

More tests ...

`runsTest`	Runs test for detecting non-randomness,
`gofnorm`	Prints a report on 13 different tests of normality.

Usage

normalTest(x, method = c("ks", "sw", "jb", "da"))
 
ksnormTest(x, title = NULL, description = NULL)
shapiroTest(x, title = NULL, description = NULL)
jarqueberaTest(x, title = NULL, description = NULL)
dagoTest(x, title = NULL, description = NULL)

jbTable(type = c("LM", "ALM"), size = c("all", "small"))
pjb(q, N = Inf, type = c("LM", "ALM")) 
qjb(p, N = Inf, type = c("LM", "ALM"))
jbTest(x, title = NULL, description = NULL)

adTest(x, title = NULL, description = NULL)            
cvmTest(x, title = NULL, description = NULL)      
lillieTest(x, title = NULL, description = NULL) 
pchiTest(x, title = NULL, description = NULL)    
sfTest(x, title = NULL, description = NULL)  

runsTest(x)
gofnorm(x, doprint = TRUE)

Arguments

`description`	optional description string, or a vector of character strings.
`doprint`	if TRUE, an exhaustive report is printed.
`method`	[normalTest] - indicates four different methods for the normality test, `"ks"` for the Kolmogorov-Smirnov one–sample test, `"sw"` for the Shapiro-Wilk test, `"jb"` for the Jarque-Bera Test, and `"da"` for the D'Agostino Test. The default value is `"ks"`.
`N`	an integer value specifying the sample size.
`p`	a numeric vector of probabilities. Missing values are not allowed.
`q`	vector of quantiles or test statistics. Missing values are not allowed.
`size`	[jbTable] - a character string denoting the size of the table. If set to `"all"`, all data from the table are used; if set to `"small"`, only a small part of the data is returned.
`title`	an optional title string; if not specified, the name of the input data is deparsed.
`type`	[jbTable][pjb][qjb] - a character string specifying the type of the Jarque Bera test statistic. `"LM"` denotes the Lagrange multiplier test, and `"ALM"` the adjusted Lagrange multiplier test.
`x`	a numeric vector of data values or a S4 object of class `timeSeries`.

Details

The hypothesis tests may be of interest for many financial and economic applications, especially for the investigation of univariate time series returns.

Normal Tests:

Several tests are available for checking whether the records of a data set are normally distributed. The input to all these functions may be either a plain vector x or a univariate time series object x of class timeSeries.

First, there is a wrapper function, normalTest, which calls one of two normality tests, either the Shapiro–Wilk test or the Jarque–Bera test. This wrapper was introduced for compatibility with S-Plus' FinMetrics package.
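As an illustration of what one of the wrapped tests measures, the Jarque–Bera statistic in its textbook form can be sketched in base R. This is a minimal sketch under stated assumptions, not the Rmetrics source: the variable names are invented here, the p-value is the asymptotic chi-squared one, and the commented wrapper call assumes that fBasics is installed.

```r
# Textbook Jarque-Bera statistic from sample skewness and kurtosis (base R only).
set.seed(123)
x <- rnorm(200)
n <- length(x)

d    <- x - mean(x)
m2   <- mean(d^2)               # second central moment
skew <- mean(d^3) / m2^(3/2)    # sample skewness (0 for a normal distribution)
kurt <- mean(d^4) / m2^2        # sample kurtosis (3 for a normal distribution)

# The statistic grows with squared skewness and squared excess kurtosis.
jb <- n * (skew^2 / 6 + (kurt - 3)^2 / 24)

# Asymptotic p-value: chi-squared with 2 degrees of freedom.
p_asymptotic <- pchisq(jb, df = 2, lower.tail = FALSE)

# With fBasics attached, the wrapper would be called as:
# normalTest(x, method = "jb")
```

For small samples the asymptotic p-value can be inaccurate, which is the motivation for the finite sample functions jbTable, pjb, qjb and jbTest.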

Also available are the Kolmogorov–Smirnov one sample test and the D'Agostino normality test.

The remaining five normal tests are the Anderson–Darling test, the Cramer–von Mises test, the Lilliefors (Kolmogorov–Smirnov) test, the Pearson chi–square test, and the Shapiro–Francia test. They call functions from R's contributed package nortest. The difference from the original test functions implemented in R and in contributed R packages is that the Rmetrics functions accept time series objects as input and give a more detailed output report.

The Anderson-Darling test is used to test if a sample of data came from a population with a specific distribution, here the normal distribution. The adTest goodness-of-fit test can be considered as a modification of the Kolmogorov–Smirnov test which gives more weight to the tails than does the ksnormTest.
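The contrast between the two statistics can be made concrete with a small base R sketch. It assumes that the data are standardized with the sample mean and standard deviation (as in the Lilliefors variant) and uses the standard textbook formulas; it is not taken from the adTest or ksnormTest sources, and the variable names are illustrative.

```r
# Anderson-Darling A^2 and Kolmogorov-Smirnov D computed by hand (base R only).
set.seed(1)
x <- rnorm(100)
n <- length(x)

z <- sort((x - mean(x)) / sd(x))   # standardized order statistics
p <- pnorm(z)                      # fitted normal CDF at each order statistic
i <- seq_len(n)

# A^2 sums log-probabilities, so deviations where p is close to 0 or 1
# (the tails) contribute strongly.
A2 <- -n - mean((2 * i - 1) * (log(p) + log(1 - rev(p))))

# D is the largest vertical gap between the empirical and the fitted CDF,
# with every part of the distribution weighted equally.
D <- max(i / n - p, p - (i - 1) / n)
```

The value of D agrees with the statistic reported by `ks.test(z, "pnorm")`; the p-value from that call does not account for the estimated parameters.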

Runs Test:

The runs test can be used to decide if a data set is from a random process. A run is defined as a series of increasing values or a series of decreasing values. The number of increasing, or decreasing, values is the length of the run. In a random data set, the probability that the (i+1)-th value is larger or smaller than the i-th value follows a binomial distribution, which forms the basis of the runs test.
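Counting runs as defined above takes only a few lines of base R. The sketch below uses the classical normal approximation for the number of runs up and down, with mean (2n - 1)/3 and variance (16n - 29)/90; whether runsTest reports exactly this statistic is an assumption, and the variable names are illustrative.

```r
# Count runs of increasing and decreasing values and compare with
# the expectation under randomness (base R only).
set.seed(42)
x <- rnorm(100)
n <- length(x)

s <- sign(diff(x))          # +1 for an increase, -1 for a decrease
r <- rle(s)                 # consecutive equal signs form one run
n_runs      <- length(r$lengths)
longest_run <- max(r$lengths)

# Normal approximation for the number of runs up and down.
expected <- (2 * n - 1) / 3
variance <- (16 * n - 29) / 90
z        <- (n_runs - expected) / sqrt(variance)
p_value  <- 2 * pnorm(-abs(z))
```

Too few runs point to trending or positively dependent data, too many to oscillation; both show up as a large absolute value of z.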

Report from gofnorm Tests:

The function reports about the following goodness-of-fit tests for normality:

1	Omnibus Moments Test for Normality
2	Geary's Test of Normality
3	Studentized Range for Testing Normality
4	D'Agostino's D-Statistic Test of Normality
5	Kuiper V-Statistic Modified to Test Normality
6	Watson U-Squared-Statistic Modified to Test Normality
7	Durbin's Exact Test (Normal Distribution)
8	Anderson-Darling Statistic Modified to Test Normality
9	Cramer-Von Mises W-Squared-Statistic to Test Normality
10	Kolmogorov-Smirnov D-Statistic to Test Normality
11	Kolmogorov-Smirnov D-Statistic (Lilliefors Critical Values)
12	Chi-Square Test of Normality (Equal Probability Classes)
13	Shapiro-Francia W-Test of Normality for Large Samples

The functions are implemented from the GRASS GIS software package, an Open Source project available under the GNU GPL license.
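To give a feel for one entry in the list, Geary's ratio (test 2) can be sketched in base R. This is only the textbook definition of the ratio, the mean absolute deviation divided by the root mean squared deviation; the GRASS-derived routine may standardize or report the statistic differently, and the variable names are illustrative.

```r
# Geary's ratio: mean absolute deviation over root mean squared deviation.
set.seed(7)
x <- rnorm(1000)
d <- x - mean(x)

geary_a      <- mean(abs(d)) / sqrt(mean(d^2))
normal_value <- sqrt(2 / pi)   # limiting value of the ratio for normal data

# Heavy-tailed data pull the ratio below this value,
# light-tailed data push it above.
difference <- geary_a - normal_value
```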

Value

In contrast to R's output report from S3 objects of class "htest" a different output report is produced. The tests here return an S4 object of class "fHTEST". The object contains the following slots:

`@call`	the function call.
`@data`	the data as specified by the input argument(s).
`@test`	a list whose elements contain the results from the statistical test. The information provided is similar to a list object of class "htest".
`@title`	a character string with the name of the test. This can be overwritten specifying a user defined input argument.
`@description`	a character string with an optional user defined description. By default just the current date when the test was applied will be returned.
`statistic`	the value(s) of the test statistic.
`p.value`	the p-value(s) of the test.
`parameters`	a numeric value or vector of parameters.
`estimate`	a numeric value or vector of sample estimates.
`conf.int`	a numeric two row vector or matrix of 95% confidence levels.
`method`	a character string indicating what type of test was performed.
`data.name`	a character string giving the name(s) of the data.
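The difference between the two result types shows up mainly in how the values are accessed. The following sketch compares list-style access on a base R "htest" object with slot access on an "fHTEST" object; the fBasics part runs only if the package is installed, and the variable names are illustrative.

```r
set.seed(1)
x <- rnorm(100)

# Base R: S3 object of class "htest", elements accessed with $.
base_res <- shapiro.test(x)
w_base   <- unname(base_res$statistic)
p_base   <- base_res$p.value

# fBasics: S4 object of class "fHTEST", slots accessed with @,
# and the test results with $ inside the @test list.
if (requireNamespace("fBasics", quietly = TRUE)) {
  res <- fBasics::shapiroTest(x)
  print(res@title)
  print(res@test$statistic)
  print(res@test$p.value)
}
```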

The meaning of the elements of the @test slot is the following:

`ksnormTest`	returns the values for the 'D' statistic and p-values for the three alternatives 'two-sided', 'less' and 'greater'.
`shapiroTest`	returns the values for the 'W' statistic and the p-value.
`jarqueberaTest`, `jbTest`	return the values for the 'Chi-squared' statistic with 2 degrees of freedom, and the asymptotic p-value. jbTest is the finite sample version of the Jarque Bera Lagrange multiplier, LM, and adjusted Lagrange multiplier test, ALM.
`dagoTest`	returns the values for the 'Chi-squared', the 'Z3' (Skewness) and 'Z4' (Kurtosis) statistic together with the corresponding p values.
`adTest`	returns the value for the 'A' statistic and the p-value.
`cvmTest`	returns the value for the 'W' statistic and the p-value.
`lillieTest`	returns the value for the 'D' statistic and the p-value.
`pchiTest`	returns the value for the 'P' statistic and the p-values for the adjusted and not adjusted test cases. In addition, the number of classes is printed, taking the default value due to Moore (1986) computed from the expression n.classes = ceiling(2 * (n^(2/5))), where n is the number of observations.
`sfTest`	returns the value for the 'W' statistic and the p-value.
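The class count rule quoted for pchiTest, and the two p-values it refers to, can be reproduced with a short base R sketch. It assumes equal-probability classes under the fitted normal distribution and treats the adjusted case as removing two further degrees of freedom for the estimated mean and standard deviation; the variable names are illustrative and the code is not the pchiTest source.

```r
# Pearson chi-square normality statistic with Moore's default class count.
set.seed(11)
x <- rnorm(150)
n <- length(x)

n_classes <- ceiling(2 * n^(2/5))     # 15 classes for n = 150

# Assign each observation to one of n_classes equal-probability classes
# of the fitted normal distribution.
cls      <- floor(1 + n_classes * pnorm(x, mean(x), sd(x)))
cls      <- pmin(cls, n_classes)      # guard against pnorm() rounding to 1
counts   <- tabulate(cls, nbins = n_classes)
expected <- n / n_classes

P <- sum((counts - expected)^2 / expected)

# Adjusted: two degrees of freedom removed for the estimated parameters.
p_adjusted   <- pchisq(P, df = n_classes - 3, lower.tail = FALSE)
p_unadjusted <- pchisq(P, df = n_classes - 1, lower.tail = FALSE)
```

For the same value of P the adjusted p-value is never larger than the unadjusted one, because it refers to a chi-squared distribution with fewer degrees of freedom.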

Note

Some of the test implementations are selected from R's ctest and nortest packages.

Author(s)

R-core team for the tests from R's ctest package,
Adrian Trapletti for the runs test from R's tseries package,
Juergen Gross for the normal tests from R's nortest package,
James Filliben for the Fortran program producing the runs report,
Paul Johnson for the Fortran program producing the gofnorm report,
Diethelm Wuertz and Helmut Katzgraber for the finite sample JB tests,
Diethelm Wuertz for the Rmetrics R-port.

References

Anderson T.W., Darling D.A. (1954); A Test of Goodness of Fit, JASA 49:765–69.

Conover, W. J. (1971); Practical nonparametric statistics, New York: John Wiley & Sons.

D'Agostino R.B., Pearson E.S. (1973); Tests for Departure from Normality, Biometrika 60, 613–22.

D'Agostino R.B., Rosman B. (1974); The Power of Geary's Test of Normality, Biometrika 61, 181–84.

Durbin J. (1961); Some Methods of Constructing Exact Tests, Biometrika 48, 41–55.

Durbin J. (1973); Distribution Theory Based on the Sample Distribution Function, SIAM, Philadelphia.

Geary R.C. (1947); Testing for Normality, Biometrika 36, 68–97.

Lehmann E.L. (1986); Testing Statistical Hypotheses, John Wiley and Sons, New York.

Linnet K. (1988); Testing Normality of Transformed Data, Applied Statistics 32, 180–186.

Moore, D.S. (1986); Tests of the chi-squared type, In: D'Agostino, R.B. and Stephens, M.A., eds., Goodness-of-Fit Techniques, Marcel Dekker, New York.

Shapiro S.S., Francia R.S. (1972); An Approximate Analysis of Variance Test for Normality, JASA 67, 215–216.

Shapiro S.S., Wilk M.B., Chen V. (1968); A Comparative Study of Various Tests for Normality, JASA 63, 1343–72.

Thode H.C. (2002); Testing for Normality, Marcel Dekker, New York.

Weiss M.S. (1978); Modification of the Kolmogorov-Smirnov Statistic for Use with Correlated Data, JASA 73, 872–75.

Wuertz D., Katzgraber H.G. (2005); Precise finite-sample quantiles of the Jarque-Bera adjusted Lagrange multiplier test, ETHZ Preprint.

Examples

## SOURCE("fBasics.15D-OneSampleTests")

## Series:
   xmpBasics("\nStart: Create Series > ")
   x = rnorm(100)
   
## ksnormTest - 
   xmpBasics("\nNext: Kolmogorov - Smirnov One-Sample Test > ")
   ksnormTest(x)

## shapiroTest - 
   xmpBasics("\nNext: Shapiro - Wilk Test > ")
   shapiroTest(x)

## jarqueberaTest -
   xmpBasics("\nNext: Jarque - Bera Test > ")
   jarqueberaTest(x)
   jbTest(x)
   
## dagoTest -
   xmpBasics("\nNext: D'Agostino Test > ")
   dagoTest(x)

## adTest -
   xmpBasics("\nNext: Anderson - Darling Test > ")
   adTest(x)            

## cvmTest -
   xmpBasics("\nNext: Cramer - von Mises Test > ")
   cvmTest(x)      

## lillieTest -
   xmpBasics("\nNext: Lilliefors (KS) Test > ")
   lillieTest(x) 

## pchiTest - 
   xmpBasics("\nNext: Pearson Chi-Squared Test > ")
   pchiTest(x)    

## sfTest - 
   xmpBasics("\nNext: Shapiro - Francia Test > ")
   sfTest(x)  

## gofnorm -
   xmpBasics("\nNext: Goodness-of-Fit Test for Normality > ")  
   gofnorm(x, doprint = TRUE)
   
## runsTest -
   xmpBasics("\nNext: Runs Test > ")
   runsTest(x)

[Package fBasics version 221.10065 Index]