Basic usage

Available test statistics

The package currently provides seven test statistics, implemented as R functions with the following names:

I1, given by the formula: \[ \frac{1}{n {n\choose{2k}}} \sum\limits_{\mathcal{I}_{2k}} \sum\limits_{i_{2k+1}=1}^n I\{|X_{(k),X_{i_1},\ldots,X_{i_{2k}}}| < |X_{i_{2k+1}}|\}- I\{|X_{(k+1),X_{i_1},\ldots,X_{i_{2k}}}| < |X_{i_{2k+1}}|\}\]
K1, given by the formula: \[\sup\limits_{t>0}\left|\frac{1}{{n\choose{2k}}} \sum\limits_{\mathcal{I}_{2k}} I\{|X_{(k),X_{i_1},\ldots,X_{i_{2k}}}| < t\}- I\{|X_{(k+1),X_{i_1},\ldots,X_{i_{2k}}}| < t \}\right|\]
I2, given by the formula: \[\frac{1}{n^4} \sum\limits_{i,j,a,b=1}^n I\{|X_i - X_j| < X_a+X_b\}- I\{|X_i + X_j| < X_a+X_b\}\]
I2U, given by the formula: \[\frac{1}{n\choose4} \sum_{1\leq i<j<a<b\leq n} I\{|X_i - X_j| < X_a+X_b\}- I\{|X_i + X_j| < X_a+X_b\} \]
I2HU, given by the formula: \[\frac{1}{n^2{n\choose2}} \sum_{1\leq i < j \leq n}\sum_{a,b=1}^n I\{|X_i - X_j| < X_a+X_b\}- I\{|X_i + X_j| < X_a+X_b\}\]
K2, given by the formula: \[\sup\limits_{t>0}\frac{1}{n^2} \left| \sum\limits_{i,j=1}^n I\{|X_i - X_j| < t\}- I\{|X_i + X_j| < t\}\right|\]
K2U, given by the formula: \[\sup\limits_{t>0}\frac{1}{n\choose2} \left| \sum\limits_{1\leq i < j \leq n} I\{|X_i - X_j| < t\}- I\{|X_i + X_j| < t\}\right|\]
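
To make the supremum in K2 more concrete, here is a minimal base-R sketch that transcribes the formula directly. The name k2_sketch is hypothetical and not part of the package, and the package's own K2 may well be implemented differently. The sketch relies on the observation that the difference of the two indicator sums is a step function of t that changes only at the values |X_i - X_j| and |X_i + X_j|, so it is enough to look just above each of those points.

```r
# Hypothetical helper (not a package function): direct transcription of K2.
k2_sketch <- function(x) {
  d <- as.vector(abs(outer(x, x, "-")))  # |X_i - X_j| for all i, j (including i = j)
  s <- as.vector(abs(outer(x, x, "+")))  # |X_i + X_j| for all i, j
  v <- sort(unique(c(d, s)))             # points where the step function can change
  # For t just above a breakpoint v, I{. < t} equals I{. <= v},
  # which is the proportion that ecdf() returns (denominator n^2).
  max(abs(ecdf(d)(v) - ecdf(s)(v)))
}

k2_sketch(c(1, 2))   # all-positive sample
k2_sketch(c(-1, 1))  # sample symmetric around zero
```

The approach costs O(n^2) memory because of the two outer() calls, so it is only meant for small samples.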

Alongside them, the Kolmogorov-Smirnov, sign, and Wilcoxon test statistics are available as R functions with the following names:

KS, which is given by the formula: \[ \sup_t\left|F_n(t)-(1-F_n(-t))\right| \]
SGN, given by the formula: \[ \frac1n \sum_{i=1}^nI\{X_i > 0\} - \frac12 \]
WCX, given by the formula: \[ \frac{1}{{n\choose2}} \sum_{1\leq i<j\leq n} I\{X_i+X_j > 0\} - \frac12 \]
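
The SGN and WCX formulas are short enough to write out in base R. The sketch below is only a transcription of the two formulas above; the names sgn_sketch and wcx_sketch are hypothetical and are not the package's functions.

```r
# Hypothetical helpers (not package functions), transcribed from the formulas.

# SGN: proportion of positive observations, centred at 1/2.
sgn_sketch <- function(x) mean(x > 0) - 0.5

# WCX: proportion of pairs i < j with a positive sum, centred at 1/2.
wcx_sketch <- function(x) {
  s <- outer(x, x, "+")             # all pairwise sums X_i + X_j
  mean(s[upper.tri(s)] > 0) - 0.5   # keep only the pairs with i < j
}

x <- c(-3, 1, 2)
sgn_sketch(x)  # two of three observations are positive: 2/3 - 1/2
wcx_sketch(x)  # pair sums are -2, -1, 3: 1/3 - 1/2
```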

Let’s show the basic usage of these functions.

library(symmetry)
set.seed(1) # for reproducibility
X <- rnorm(50)

I1(X, k=1)

## [1] 0.07266939

I1(X, k=2)

## [1] 0.08693539

K1(X, k=1)

## [1] 0.1771429

K1(X, k=2)

## [1] 0.1928571

I2(X)

## [1] 0.01486112

I2U(X)

## [1] 0.005749023

I2HU(X)

## [1] 0.007902041

K2(X)

## [1] 0.044

K2U(X)

## [1] 0.03265306

KS(X)

## [1] 0.22

SGN(X)

## [1] 0.02

WCX(X)

## [1] 0.102449

These functions can be used with the function Tvalues to simulate the distribution of a test statistic under a specified null distribution. A simple example follows.

A simple use case

Let’s simulate the distribution of the test statistic I1 under the standard normal null distribution by calling the Tvalues function (see ?Tvalues for more information). The simulation can take a couple of minutes, depending on the number of simulations and the parameters used.

# we'll get a 10k element vector with the T values of samples of size 20 from
# a standard normal distribution. Note that I1 takes a parameter k.
t0 <- Tvalues(10000, 20, list(name='norm'), list(name='I1', k=2))

# let's do that again, but store the result in a variable t1
t1 <- Tvalues(10000, 20, list(name='norm'), list(name='I1', k=2))

# let's calculate the power of the test using t0 vs t1; both come from the
# same distribution, so this estimates the size of the test.
test_power(t0, t1) # should be close to 0.05

## [1] 0.0476
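
For readers who want to see the idea behind this workflow without the package, here is a minimal base-R sketch of the same Monte Carlo scheme. Everything in it is an assumption made for illustration: simulate_stat, power_sketch and wcx_sketch are hypothetical names, and the rejection rule (reject when |T| exceeds the simulated 0.95 quantile of |T| under the null) is one plausible choice, not necessarily what test_power does internally.

```r
# Hypothetical helpers for illustration only; they are not the package's
# Tvalues() or test_power().

# WCX-type statistic transcribed from its formula (pairs with i < j).
wcx_sketch <- function(x) {
  s <- outer(x, x, "+")
  mean(s[upper.tri(s)] > 0) - 0.5
}

# Draw n_sim samples of size n with rgen() and compute stat() on each.
simulate_stat <- function(n_sim, n, rgen, stat) {
  replicate(n_sim, stat(rgen(n)))
}

# Assumed rejection rule: reject when |T| exceeds the (1 - alpha) quantile
# of |T| simulated under the null.
power_sketch <- function(t0, t1, alpha = 0.05) {
  crit <- quantile(abs(t0), 1 - alpha, names = FALSE)
  mean(abs(t1) > crit)
}

set.seed(1)
t0    <- simulate_stat(2000, 20, rnorm, wcx_sketch)  # null sample
t1    <- simulate_stat(2000, 20, rnorm, wcx_sketch)  # second null sample
t_alt <- simulate_stat(2000, 20, function(n) rnorm(n, mean = 1), wcx_sketch)

size_hat  <- power_sketch(t0, t1)     # estimated size, expected near alpha
power_hat <- power_sketch(t0, t_alt)  # estimated power against a shifted normal
```

Comparing two independent null samples, as in the example above, is a sanity check: the rejection rate should land near the nominal level. Replacing the second sample with draws from an asymmetric alternative gives an actual power estimate.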

We can, of course, pass parameters for the distribution. Let’s use I2 this time, for demonstration purposes.

# we'll get a 10k element vector with the T values of samples of size 20 from
# a logistic distribution with location 0 and scale 1.
t0 <- Tvalues(10000, 20, list(name='logis', loc=0, sca=1), list(name='I2'))

# let's do that again, but store the result in a variable t1
t1 <- Tvalues(10000, 20, list(name='logis', loc=0, sca=1), list(name='I2'))

# let's calculate the power of the test using t0 vs t1; both come from the
# same distribution, so this estimates the size of the test.
test_power(t0, t1) # should be close to 0.05

## [1] 0.0504

There is also a parallel version of the Tvalues function, called parTvalues, which can improve the speed of execution in certain (but not all!) cases. Let’s use the K2 statistic this time.

# we'll get a 10k element vector with the T values of samples of size 20 from
# a logistic distribution with location 0.5 and scale 1.
t0 <- parTvalues(10000, 20, list(name='logis', loc=0.5, sca=1), list(name='K2'))

# let's do that again, but store the result in a variable t1
t1 <- parTvalues(10000, 20, list(name='logis', loc=0.5, sca=1), list(name='K2'))

# let's calculate the power of the test using t0 vs t1; both come from the
# same distribution, so the rejection rate should match the nominal level.
test_power(t0, t1) # should be close to 0.05

## [1] 0.0509
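
To illustrate why a parallel version helps only in some cases, here is a minimal sketch of a parallel simulation using the parallel package that ships with R. It is not how parTvalues is implemented (that is not shown in this document); par_simulate, one_draw and sgn_sketch are hypothetical names, and the choice of two workers is an assumption.

```r
library(parallel)  # ships with R

# Hypothetical helpers for illustration; not the package's parTvalues().
sgn_sketch <- function(x) mean(x > 0) - 0.5

# One simulation step: draw a logistic sample and compute the statistic.
one_draw <- function(i, n, stat) stat(rlogis(n, location = 0.5, scale = 1))

par_simulate <- function(n_sim, n, stat, workers = 2) {
  cl <- makeCluster(workers)          # starting worker processes has a fixed cost
  on.exit(stopCluster(cl))            # always shut the workers down
  clusterSetRNGStream(cl, iseed = 1)  # separate random-number streams per worker
  parSapply(cl, seq_len(n_sim), one_draw, n = n, stat = stat)
}

t_par <- par_simulate(1000, 20, sgn_sketch)
```

Starting the workers and shipping the tasks to them takes time, so for a cheap statistic or a small number of simulations the serial version can be faster. The parallel version tends to pay off when each evaluation of the statistic is expensive.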

Blagoje Ivanovic

2017-05-01
