mtsknn.neq {MTSKNN}    R Documentation

A robust multivariate two-sample test based on k-nearest neighbors for unbalanced sample sizes

Description

The function tests whether two samples share the same underlying distribution, using a k-nearest-neighbors approach. The approach is robust when the two sample sizes are unbalanced.

Usage

mtsknn.neq(x,y,k,clevel)

Arguments

x A matrix or data frame.
y A matrix or data frame.
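k An integer. The number of nearest neighbors used by the test.
clevel The significance level of the test (the package calls it the confidence level). Default value is 0.05.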

Details

The matrices or data frames x and y are the two samples to be tested. Each row holds the coordinates of one data point, so x and y must have the same number of columns. The integer k is the number of nearest neighbors used in the testing procedure.
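To make the row-per-point layout and the role of k concrete, the following base R sketch computes the classical nearest-neighbor coincidence proportion described in Schilling (1986): the share of k-nearest-neighbor pairs in the pooled sample whose two points come from the same sample. It is an illustration only and is not the code used by mtsknn.neq, whose robust statistic for unbalanced designs may differ. The helper name knn_same_sample_prop is hypothetical and is not part of MTSKNN, and Euclidean distance is an assumption of the sketch.

```r
# Hypothetical helper, not part of MTSKNN: proportion of k-nearest-neighbor
# pairs in the pooled sample that join two points from the same sample.
knn_same_sample_prop <- function(x, y, k) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  stopifnot(ncol(x) == ncol(y), k >= 1, k < nrow(x) + nrow(y))

  pooled <- rbind(x, y)                          # one data point per row
  label  <- rep(c(1L, 2L), c(nrow(x), nrow(y)))  # sample membership of each row

  d <- as.matrix(dist(pooled))                   # Euclidean distances (assumption)
  diag(d) <- Inf                                 # a point is not its own neighbor

  same <- vapply(seq_len(nrow(pooled)), function(i) {
    nn <- order(d[i, ])[seq_len(k)]              # indices of the k nearest neighbors
    sum(label[nn] == label[i])                   # neighbors from the same sample
  }, integer(1))

  sum(same) / (nrow(pooled) * k)
}

# Usage: two small samples, each stored with one point per row.
set.seed(1)
x <- matrix(rnorm(40), 20, 2)
y <- data.frame(a = rnorm(20, mean = 5), b = rnorm(20, mean = 5))
knn_same_sample_prop(x, y, k = 3)
```

A proportion close to 1 means that points are mostly surrounded by points from their own sample, which is evidence against a common distribution. When the sample sizes are very different, the proportion is large even under the null hypothesis, because most neighbors of any point come from the larger sample.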

Value

The test result at the level given by clevel: whether the null hypothesis that the two samples share the same distribution is rejected or accepted.

Note

This test is appropriate for the unbalanced case, where the two sample sizes differ substantially. Another robust test is mtsknn.neq.

Author(s)

Peng Dai and Wei Dou <wei.dou@yale.edu>

References

Schilling, M. F. (1986). Multivariate two-sample tests based on nearest neighbors. J. Amer. Statist. Assoc., 81, 799-806.

Henze, N. (1988). A multivariate two-sample test based on the number of nearest neighbor type coincidences. Ann. Statist., 16, 772-783.

Chen, L. and Dou, W. (2009). Robust multivariate two-sample tests based on k nearest neighbors for unbalanced designs. Manuscript.

Examples


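## Example of two samples from the same distribution
## (independent t-distributed coordinates, df = 5), with unbalanced sizes:
n <- 100
x <- matrix(rt(2 * n, df = 5), n, 2)            # n points in 2 dimensions
y <- matrix(rt(2 * 15 * n, df = 5), 15 * n, 2)  # 15 * n points in 2 dimensions
mtsknn.neq(x, y, 3)

## Example of two samples from different distributions
## (t with df = 10 versus standard normal):
n <- 100
x <- matrix(rt(2 * n, df = 10), n, 2)
y <- matrix(rnorm(2 * 15 * n), 15 * n, 2)
mtsknn.neq(x, y, 3)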


[Package MTSKNN version 0.0-1 Index]