Tests for Covariance Matrices in High Dimension with Less Sample Size
CIRJE-F-933

Muni S. Srivastava
Department of Statistics, University of Toronto, 100 St George Street, Toronto, Ontario, CANADA M5S 3G3.

Hirokazu Yanagihara
Department of Mathematics, Hiroshima University, 1-3-1 Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8626, JAPAN.

Tatsuya Kubokawa
Faculty of Economics, University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, JAPAN.

Email addresses: srivasta@utstat.toronto.edu (Muni S. Srivastava), yanagi@math.sci.hiroshima-u.ac.jp (Hirokazu Yanagihara), tatsuya@e.u-tokyo.ac.jp (Tatsuya Kubokawa)

June 2014

CIRJE Discussion Papers can be downloaded without charge from: http://www.cirje.e.u-tokyo.ac.jp/research/03research02dp.html
Discussion Papers are a series of manuscripts in their draft form. They are not intended for circulation or distribution except as indicated by the author. For that reason Discussion Papers may not be reproduced or distributed without the written consent of the author.

Abstract

In this article, we propose tests for covariance matrices of high dimension with fewer observations than the dimension, for a general class of distributions with positive definite covariance matrices. In the one-sample case, tests are proposed for sphericity and for the hypothesis that the covariance matrix Σ is an identity matrix, by providing an unbiased estimator of tr[Σ^2] under the general model which requires no more computing time than the one available in the literature for the normal model. In the two-sample case, tests for the equality of two covariance matrices are given. The asymptotic distributions of the proposed tests in the one-sample case are derived under the assumption that the sample size N = O(p^δ), 1/2 < δ < 1, where p is the dimension of the random vector, and O(p^δ) means that N/p goes to zero as N and p go to infinity. Similar assumptions are made in the two-sample case.

Keywords: Asymptotic distributions, covariance matrix, high dimension, non-normal model, sample size smaller than dimension, test statistics.
AMS 2000 subject classifications: Primary 62H15; secondary 62F05.

1. Introduction

In analyzing data, certain assumptions made implicitly or explicitly should be ascertained. For example, in comparing the performances of two groups based on observations from both groups, it is necessary to ascertain whether the two groups have the same variability. If they have the same variability, we can use the usual t-statistics to verify that both groups have the same average performance, and if the variability is not the same, we are required to use Behrens-Fisher type statistics. When observations are taken on several characteristics of an individual, we write them as observation vectors. In this case, we are required to check whether the covariance matrices of the two groups are the same, by using Wilks' (1946) likelihood ratio test statistics, provided the number of characteristics, say p, is much smaller than the number of observations in each group, say N_1 and N_2. In this article, we consider the case when p is larger than N_1 and N_2. The problems of large p and very small sample size are frequently encountered in statistical data analysis these days. For example, recent advances in technology to obtain DNA microarrays have made it possible to measure quantitatively the expression of thousands of genes.
These observations are, however, correlated with each other, as the genes are from the same subject. Since the number of subjects available for taking the observations is so small as compared to the number of gene expressions, multivariate theory for large p and small sample size N needs to be applied in the analysis of such data. Alternatively, one may try to reduce the dimension by using the false discovery rate (FDR) proposed by Benjamini and Hochberg (1995), provided the observations are equally positively related, as shown by Benjamini and Yekutieli (2001), or apply the average false discovery rate (AFD) proposed by Srivastava (2010). The above AFD or FDR methods do not guarantee that the dimension p can be reduced to a dimension which is substantially smaller than N.

The development of statistical theory for analyzing high-dimensional data has taken a jump start since the publication of a two-sample test by Bai and Saranadasa (1996), which also included the two-sample test proposed by Dempster (1958, 1960). Substantial progress has been made in providing powerful tests for testing that the mean vectors are equal in two or several samples; see Srivastava and Du (2008), Srivastava (2009), Srivastava, Katayama and Kano (2013), Yamada and Srivastava (2012) and Srivastava and Kubokawa (2013). In the context of inference on means of high-dimensional distributions, multiple tests have also been used; see Fan, Hall and Yao (2007) and Kosorok and Ma (2007) among others.

All the methods of inference on means mentioned above require some verification of the structure of the covariance matrix in the one-sample case, and verification of the equality of the two covariance matrices in the two-sample case. The objective of the present article is to present some methods of verification of these hypotheses. Below, we describe these problems in terms of hypothesis testing.

Consider the problem of testing hypotheses regarding the covariance matrix Σ of a p-dimensional observation vector based on N independent and identically distributed (i.i.d.) observation vectors x_j, j = 1, ..., N. In particular, we consider the problem of testing the hypothesis that Σ = λI_p, λ > 0 and unknown, and that of testing that Σ = I_p; the first hypothesis is called the sphericity hypothesis. We also consider the problem of testing the equality of the covariance matrices Σ_1 and Σ_2 of two groups, when N_1 i.i.d. observation vectors are obtained from the first group and N_2 i.i.d. observation vectors are obtained from the second group. It will be assumed that N_1 ≤ N_2, 0 < N_1/N_2 ≤ 1 and N_i/p → 0 as (N_1, N_2, p) → ∞.

We begin with the description of the model for the one-sample case. Let x_j, j = 1, ..., N, be i.i.d. observation vectors with mean vector µ and covariance matrix Σ = FF, where F is the unique symmetric positive definite square root of Σ, that is,
\[
\Sigma = \Gamma D_\lambda \Gamma' = \Gamma D_\lambda^{1/2}\Gamma'\,\Gamma D_\lambda^{1/2}\Gamma' = FF,
\]
where D_λ = diag(λ_1, ..., λ_p) and ΓΓ' = I_p. We assume that the observation vectors x_j are given by
\[
x_j = \mu + F u_j, \quad j = 1, \ldots, N, \tag{1.1}
\]
with
\[
E(u_j) = 0, \quad \mathrm{Cov}(u_j) = I_p, \tag{1.2}
\]
and, for integers γ_1, ..., γ_p with 0 ≤ Σ_{k=1}^p γ_k ≤ 8 and j = 1, ..., N,
\[
E\Big[\prod_{k=1}^{p} u_{jk}^{\gamma_k}\Big] = \prod_{k=1}^{p} E\big(u_{jk}^{\gamma_k}\big), \tag{1.3}
\]
where u_{jk} is the kth component of the vector u_j = (u_{j1}, ..., u_{jp})'.
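As a concrete illustration (ours, not the authors'), the following minimal numpy sketch draws a sample from the model (1.1)-(1.3); the function names are our assumptions, and the chi-square case matches Case 3 of the simulations in Section 5:

```python
import numpy as np

def symmetric_sqrt(Sigma):
    """Unique symmetric positive definite square root F with Sigma = F F."""
    lam, Gamma = np.linalg.eigh(Sigma)
    return (Gamma * np.sqrt(lam)) @ Gamma.T     # Gamma D^{1/2} Gamma'

def sample_model(N, mu, Sigma, rng, dist="normal"):
    """Draw N rows x_j = mu + F u_j with u_jk i.i.d., mean 0, variance 1."""
    p = len(mu)
    F = symmetric_sqrt(Sigma)
    if dist == "normal":
        U = rng.standard_normal((N, p))
    elif dist == "chisq8":                      # Case 3: u = (chi2_8 - 8)/4
        U = (rng.chisquare(8, (N, p)) - 8.0) / 4.0
    else:
        raise ValueError(dist)
    return mu + U @ F                           # row j is (mu + F u_j)'

rng = np.random.default_rng(1)
p, N = 50, 20
X = sample_model(N, np.zeros(p), np.eye(p), rng, dist="chisq8")
```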
It may be noted that the condition (1.3) implies the existence of the moments of u_{jk}, k = 1, ..., p, up to order eight. For comparison with the normal distribution, we shall write the fourth moment of u_{jk} as E[u_{jk}^4] = K_4 + 3. For the normal distribution, K_4 = 0; in the general case, K_4 ≥ −2. We may also note that instead of Σ = F^2, we may consider, as in Srivastava (2009), Σ = CC', where C is a p × p non-singular matrix, but this increases the algebraic manipulations with no apparent gain in showing that the proposed tests can be used in non-normal situations. We are interested in the following testing of hypothesis problems in the one-sample case:

Problem (1) H_1 : Σ = λI_p, λ > 0 vs A_1 : Σ ≠ λI_p.
Problem (2) H_2 : Σ = I_p vs A_2 : Σ ≠ I_p.

These problems have been considered many times in the statistical literature. More recently, Onatski, Moreira and Hallin (2013) and Cai and Ma (2013) have proposed tests for the above problems under the assumption that the observation vectors are normally distributed. It has, however, been shown by Srivastava, Kollo and von Rosen (2011) that many of these tests are not robust against departure from normality. The objective of this paper is to propose tests for the above two problems under the assumptions (1.1)-(1.3), which include multivariate normal distributions as well as many others.

The test of Onatski et al. (2013) is based on the largest eigenvalue of the sample covariance matrix S = N^{-1}Σ_{i=1}^N x_i x_i', where the x_i are i.i.d. from N_p(0, Σ), for testing Σ = σ^2 I_p against the alternative that Σ = σ^2(I_p + θvv'), v'v = 1. However, Berthet and Rigollet (2013) argue that the largest eigenvalue cannot discriminate between the null and the alternative hypotheses, since λ_max(S) → ∞ as p/N → ∞; hence its fluctuations are too large, and a much larger θ would be required to discriminate between the two hypotheses; see also Baik and Silverstein (2006). Cai and Ma (2013) proposed a test based on U-statistics for testing the hypothesis that Σ = I_p based on N i.i.d. observations from N(0, Σ). For the normal distribution, the assumption that the mean vector of the observations is 0 makes no difference in the use of the proposed U-statistics, as the observation matrix can be transformed by a known orthogonal matrix of Helmert type to obtain n = N − 1 observable i.i.d. observation vectors with mean 0 and the same covariance matrix Σ. But for non-normal distributions with mean vector µ ≠ 0, this U-statistic cannot be used for testing the above hypothesis. Thus, the U-statistic used by Chen, Zhang and Zhong (2010) is needed to test the above hypothesis, which requires computing time of the order O(N^4). Our proposed test requires computing time of the order O(N^2).

In the two-sample case, we have N_1 and N_2 independently distributed p-dimensional observation vectors x_{ij}, j = 1, ..., N_i, i = 1, 2, with N_i < p, N_1 ≤ N_2, 0 < N_1/N_2 ≤ 1, and N_i/p → 0 as (N_1, N_2, p) → ∞, with mean vectors µ_i and covariance matrices Σ_i = F_i^2, i = 1, 2, each satisfying the conditions of the model described in (1.1)-(1.3), with µ_i and F_i in place of µ and F, and u_{ij} = (u_{ij1}, ..., u_{ijp})' in place of u_j. We consider tests for testing the hypothesis H_3 vs A_3 described in Problem (3) below:

Problem (3) H_3 : Σ_1 = Σ_2 vs A_3 : Σ_1 ≠ Σ_2.

Problem (3) has recently been considered by Cai, Liu and Xia (2013). Following Jiang (2004), they proposed a test against a sparse alternative rather than the general alternative given above.
In this article, we propose a test along the lines of Schott (2007), using an estimator of the squared Frobenius norm of Σ_1 − Σ_2 under the assumptions given in (1.1)-(1.3), as has been done in Li and Chen (2012) using U-statistics. However, the computing time for the Li and Chen statistic is of the order O(N^4), while for the proposed test it is only O(N^2).

For testing the hypotheses in Problems (1)-(2), in the one-sample case, we make the following assumptions:

Assumption (A)
(i) N = O(p^δ), 1/2 < δ < 1.
(ii) 0 < a_2 < ∞, a_4/p = o(1), where a_i = tr[Σ^i]/p, i = 1, 2, 3, 4.
(iii) For Σ = (σ_ij), p^{-2} Σ_{i,j}^p σ_ij^4 = o(1).

For testing the hypothesis given in Problem (3) for the two-sample case, Assumption (A) applies to both covariance matrices Σ_1 and Σ_2, and the sample sizes are comparable, as stated below:

Assumption (B)
(i) Assumption (A) holds for both covariance matrices Σ_1 and Σ_2, with a_{ij} = tr[Σ_j^i]/p and Σ_j = (σ_{jkℓ}), i = 1, 2, 3, 4, j = 1, 2.
(ii) N_1 ≤ N_2, 0 < N_1/N_2 ≤ 1.

The organization of the paper is as follows. In Section 2, we give notations and preliminaries for the one-sample testing problems. In Section 3, we propose tests and give their asymptotic distributions based on the asymptotic theory given in Section 6. The problem of testing the equality of two covariance matrices is considered in Section 4. Simulation results showing power and attained significance level, the so-called ASL, are given in Section 5. Section 6 gives the general asymptotic theory under which the proposed statistics are shown to be normal. In Section 7, we give results on moments of quadratic forms for a general class of distributions. The paper concludes in Section 8.

2. Notations and Preliminaries in the One-Sample Case

Let x_1, ..., x_N be independently and identically distributed p-dimensional observation vectors with mean vector µ and covariance matrix Σ = F^2 satisfying the conditions of the model (1.1)-(1.3). Let
\[
\bar{x} = \frac{1}{N}\sum_{j=1}^{N} x_j, \qquad V = \sum_{j=1}^{N} y_j y_j', \qquad y_j = x_j - \bar{x}, \quad j = 1, \ldots, N. \tag{2.1}
\]
It is well known that n^{-1}V, n = N − 1, is an unbiased estimator of the covariance matrix Σ for any distribution. Since our focus in this paper is on testing hypotheses on a covariance matrix or the equality of two covariance matrices, the matrix V plays an important role. In this paper, we consider tests based on an estimator of the squared Frobenius norm as a distance between the hypothesis H : Σ = I_p and the alternative A : Σ ≠ I_p; the squared Frobenius norm (divided by p) is given by p^{-1}tr[(Σ − I_p)^2] = p^{-1}tr[Σ^2] − 2p^{-1}tr[Σ] + 1. Thus, for notational convenience, we introduce the notation
\[
a_i = \frac{1}{p}\mathrm{tr}[\Sigma^i], \quad i = 1, \ldots, 8. \tag{2.2}
\]
We estimate a_1 and a_2 by
\[
\hat{a}_1 = \frac{1}{np}\mathrm{tr}[V], \quad n = N-1, \tag{2.3}
\]
and
\[
\hat{a}_{2s} = \frac{1}{(n-1)(n+2)p}\Big\{\mathrm{tr}[V^2] - \frac{1}{n}(\mathrm{tr}[V])^2\Big\}, \tag{2.4}
\]
respectively. Srivastava (2005) has shown that â_1 and â_2s are unbiased and consistent estimators of a_1 and a_2 under the assumption of normality and Assumption (A). That is,
\[
E(\hat{a}_1) = a_1, \quad \mathrm{Var}(\hat{a}_1) = \frac{2a_2}{np}, \qquad E(\hat{a}_{2s}) = a_2, \quad \mathrm{Var}(\hat{a}_{2s}/a_2) = \frac{4}{n^2} + o(n^{-2}).
\]
However, for the model (1.1)-(1.3) and Σ = (σ_ij),
\[
E(\hat{a}_{2s}) = \frac{n\,K_4}{N(N+1)p}\sum_{i=1}^{p}\sigma_{ii}^2 + a_2,
\]
as shown in Section 2.1. Hence
\[
E[\hat{a}_{2s} - a_2] = O\Big(n^{-1}p^{-1}\sum_{i=1}^{p}\sigma_{ii}^2\Big),
\]
a bias of exact order n^{-1} which, after the n-scaling used in the test statistics below, does not go to zero even when Σ = λI_p. Thus, the test statistic based on â_2s cannot be asymptotically normally distributed with the correct centering. Hence, we need to find an unbiased estimator of a_2 for the general class of distributions given by (1.1)-(1.3), or an estimator with bias of the order O(n^{-1-ε}), ε > 0.
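For reference, the estimators (2.3) and (2.4) admit a direct transcription in code. The following is a minimal numpy sketch of ours (the function names are our own); it works with the N × N Gram matrix so that no p × p matrix is ever formed:

```python
import numpy as np

def a1_hat(X):
    """Unbiased estimator of a1 = tr(Sigma)/p from (2.3); X is N x p."""
    N, p = X.shape
    n = N - 1
    Y = X - X.mean(axis=0)             # rows y_j = x_j - xbar
    M = Y @ Y.T                        # N x N; tr[M] = tr[V]
    return np.trace(M) / (n * p)

def a2s_hat(X):
    """Normal-theory estimator of a2 = tr(Sigma^2)/p from (2.4)."""
    N, p = X.shape
    n = N - 1
    Y = X - X.mean(axis=0)
    M = Y @ Y.T                        # tr[M^2] = tr[V^2]
    trV, trV2 = np.trace(M), np.sum(M * M)
    return (trV2 - trV**2 / n) / ((n - 1) * (n + 2) * p)
```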
We propose an unbiased estimator â_2, defined in (2.5). Its unbiasedness is shown in Section 2.1, and the variances of â_1 and â_2 and the covariance Cov(â_1, â_2) are given in the subsequent sections. We define the estimator of a_2 by
\[
\hat{a}_2 = \frac{1}{f}\Big\{(N-2)n\,\mathrm{tr}[V^2] - Nn\,\mathrm{tr}[D^2] + (\mathrm{tr}[V])^2\Big\}
          = \frac{1}{f}\Big\{(N-2)n\,\mathrm{tr}[M^2] - Nn\,\mathrm{tr}[D^2] + (\mathrm{tr}[D])^2\Big\}, \tag{2.5}
\]
where f = pN(N−1)(N−2)(N−3), M = Y'Y, Y = (y_1, ..., y_N), and D = diag(y_1'y_1, ..., y_N'y_N); namely, D denotes the N × N diagonal matrix with diagonal elements y_1'y_1, ..., y_N'y_N. It will be shown in Section 2.1 that â_2 is an unbiased estimator of a_2 = tr[Σ^2]/p, from which an unbiased estimator of tr[Σ^2] is given by p â_2. It may be noted that it takes no longer to compute â_2 in (2.5) than to compute â_2s in (2.4). It may also be noted that, from a computational viewpoint, the expression in the second line of (2.5) is better suited, as all the matrices involved are N × N, while the expression in the first line is a mixture of N × N and p × p matrices.

2.1. Unbiasedness of the estimator â_2

In this subsection, we show that the estimator â_2 defined in (2.5) is unbiased. For this, we need to compute the expected values of tr[V^2], tr[D^2], and (tr[V])^2. Note that
\[
x_j - \bar{x} = x_j - \frac{1}{N}\sum_{k=1}^{N} x_k = \frac{n}{N}\big(x_j - \bar{x}_{(j)}\big), \qquad \bar{x}_{(j)} = \frac{1}{n}(N\bar{x} - x_j), \quad n = N-1.
\]
Also note that x_j − x̄ does not depend on the mean vector µ given in the model (1.1), and thus we assume without any loss of generality that µ = 0. Then E(tr[D^2]) is expressed as
\[
E(\mathrm{tr}[D^2]) = \sum_{j=1}^{N} E\big[(x_j-\bar{x})'(x_j-\bar{x})\big]^2
 = \Big(\frac{n}{N}\Big)^4 \sum_{j=1}^{N} E\big[x_j'x_j - 2x_j'\bar{x}_{(j)} + \bar{x}_{(j)}'\bar{x}_{(j)}\big]^2
 = \Big(\frac{n}{N}\Big)^4 \sum_{j=1}^{N} E\big[u_j'\Sigma u_j - 2u_j'\Sigma \bar{u}_{(j)} + \bar{u}_{(j)}'\Sigma \bar{u}_{(j)}\big]^2,
\]
where ū_{(j)} = n^{-1}Σ_{k≠j}u_k. Expanding the square, this is rewritten as
\[
\Big(\frac{n}{N}\Big)^4 \sum_{j=1}^{N} E\big[(u_j'\Sigma u_j)^2 + 4(u_j'\Sigma \bar{u}_{(j)})^2 + (\bar{u}_{(j)}'\Sigma \bar{u}_{(j)})^2\big]
+ \Big(\frac{n}{N}\Big)^4 \sum_{j=1}^{N} E\big[-4(u_j'\Sigma u_j)(u_j'\Sigma \bar{u}_{(j)}) + 2(u_j'\Sigma u_j)(\bar{u}_{(j)}'\Sigma \bar{u}_{(j)})\big]
+ \Big(\frac{n}{N}\Big)^4 \sum_{j=1}^{N} E\big[-4(u_j'\Sigma \bar{u}_{(j)})(\bar{u}_{(j)}'\Sigma \bar{u}_{(j)})\big].
\]
Hence, using the results on the moments of quadratic forms given in Section 7 (the terms that are odd in u_j or in ū_{(j)} have expectation zero), we get
\[
E(\mathrm{tr}[D^2]) = \frac{n^4}{N^3}\Big\{E\big[(u_j'\Sigma u_j)^2\big] + 4E\big[(u_j'\Sigma \bar{u}_{(j)})^2\big] + E\big[(\bar{u}_{(j)}'\Sigma \bar{u}_{(j)})^2\big]\Big\}
 + \frac{2n^4}{N^3}E\big[(u_j'\Sigma u_j)(\bar{u}_{(j)}'\Sigma \bar{u}_{(j)})\big]
 = \frac{n(n^3+1)}{N^3}K_4\sum_{i=1}^{p}\sigma_{ii}^2 + \frac{2n^2}{N}\mathrm{tr}[\Sigma^2] + \frac{n^2}{N}(\mathrm{tr}[\Sigma])^2. \tag{2.6}
\]
Following the above derivation, we obtain
\[
E\big[(\mathrm{tr}[V])^2\big] = \frac{n^2}{N}K_4\sum_{i=1}^{p}\sigma_{ii}^2 + 2n\,\mathrm{tr}[\Sigma^2] + n^2(\mathrm{tr}[\Sigma])^2, \tag{2.7}
\]
\[
E\big(\mathrm{tr}[V^2]\big) = \frac{n^2}{N}K_4\sum_{i=1}^{p}\sigma_{ii}^2 + nN\,\mathrm{tr}[\Sigma^2] + n(\mathrm{tr}[\Sigma])^2. \tag{2.8}
\]
Collecting the terms according to the formula for â_2 in (2.5), we find that the coefficients of K_4 Σ_{i=1}^p σ_ii^2 and of (tr[Σ])^2 are zero, while the coefficient of tr[Σ^2] is N(N−1)(N−2)(N−3)/f = 1/p. Hence
\[
E(\hat{a}_2) = a_2,
\]
proving that â_2 is an unbiased estimator of a_2.

Next, we show that â_2s defined in (2.4) is a biased estimator of a_2. From (2.7) and (2.8), we get
\[
E(\hat{a}_{2s}) = \frac{1}{(n-1)(n+2)p}E\Big\{\mathrm{tr}[V^2] - \frac{1}{n}(\mathrm{tr}[V])^2\Big\}
 = \frac{1}{(n-1)(n+2)p}\Big\{\frac{n^2-n}{N}K_4\,p\,a_{20} + (nN-2)\,\mathrm{tr}[\Sigma^2]\Big\}
 = \frac{n}{(n+2)N}K_4a_{20} + \frac{n^2+n-2}{(n-1)(n+2)}a_2
 = \frac{n}{N(N+1)}K_4a_{20} + a_2, \qquad a_{20} = \frac{1}{p}\sum_{i=1}^{p}\sigma_{ii}^2. \tag{2.9}
\]
Thus, unless K_4 = 0, â_2s is not an unbiased estimator of a_2, as has been shown in Srivastava, Kollo and von Rosen (2011) for Σ = λI_p.
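The second line of (2.5) translates directly into code. The following numpy sketch of ours (the name a2_hat is an assumption) computes â_2 using only N × N matrices, so that the cost is O(N^2 p), as for â_2s:

```python
import numpy as np

def a2_hat(X):
    """Unbiased estimator of a2 = tr(Sigma^2)/p in (2.5), general model (1.1)-(1.3)."""
    N, p = X.shape
    n = N - 1
    f = p * N * (N - 1) * (N - 2) * (N - 3)
    Y = X - X.mean(axis=0)
    M = Y @ Y.T                        # M = Y'Y in the paper's notation (N x N)
    d = np.diag(M)                     # diagonal of D: the y_j' y_j
    trM2 = np.sum(M * M)               # tr[M^2] = tr[V^2]
    trD2 = np.sum(d * d)               # tr[D^2]
    trD = d.sum()                      # tr[D] = tr[V]
    return ((N - 2) * n * trM2 - N * n * trD2 + trD**2) / f
```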
2.2. Variance of â_1

In this section, we derive the variance of â_1. The matrix of N independent observation vectors x_1, ..., x_N is given by X = (x_1, ..., x_N), with E[X] = (µ, ..., µ) = µ1', where 1 is an N-vector of ones, 1 = (1, ..., 1)', and Cov(x_i) = Σ, i = 1, ..., N. Let A = I_N − N^{-1}11', where I_N is the N × N identity matrix. Then, with n = N − 1, we define S = n^{-1}V, which can be written as
\[
S = \frac{1}{n}XAX' = \frac{1}{n}(X-\mu 1')A(X-\mu 1')'.
\]
Thus S does not depend on the mean vector µ, and in the following calculations we shall assume that µ = 0. Thus,
\[
S = \frac{1}{n}\sum_{i=1}^{N}x_ix_i' - \frac{1}{nN}\sum_{j,k}^{N}x_jx_k'
  = \frac{1}{N}\sum_{i=1}^{N}x_ix_i' - \frac{1}{nN}\sum_{j\neq k}^{N}x_jx_k',
\]
and
\[
\hat{a}_1 = \frac{1}{p}\mathrm{tr}[S] = \frac{1}{Np}\sum_{i=1}^{N}x_i'x_i - \frac{1}{Npn}\sum_{j\neq k}^{N}x_j'x_k
 = \frac{1}{Np}\sum_{i=1}^{N}x_i'x_i - \frac{2}{Npn}\sum_{j<k}^{N}x_j'x_k,
\]
where the x_i are i.i.d. with mean vector µ = 0 and covariance matrix Σ = FF. We note that Cov(x_i'x_i, x_j'x_k) = 0 for j ≠ k, and Var(Σ_{j<k}^N x_j'x_k) = (Nn/2)tr[Σ^2]. Hence, from Lemma 7.1 in Section 7,
\[
\mathrm{Var}\Big(\sum_{i=1}^{N}x_i'x_i\Big) = N\,\mathrm{Var}(x_i'x_i) = N\,\mathrm{Var}(u_i'\Sigma u_i) = N\Big(K_4\sum_{i=1}^{p}\sigma_{ii}^2 + 2\,\mathrm{tr}[\Sigma^2]\Big).
\]
Thus, with
\[
a_{20} = \frac{1}{p}\sum_{i=1}^{p}\sigma_{ii}^2 \quad\text{and}\quad a_2 = \frac{1}{p}\mathrm{tr}[\Sigma^2], \tag{2.10}
\]
we get
\[
\mathrm{Var}(\hat{a}_1) = \frac{1}{Np}K_4a_{20} + \frac{2}{Np}a_2 + \frac{2}{Nnp}a_2
 = \frac{2a_2}{np} + \frac{K_4a_{20}}{Np} = \frac{1}{np}\Big(2a_2 + \frac{n}{N}K_4a_{20}\Big). \tag{2.11}
\]
Thus,
\[
\hat{a}_1 = \hat{a}_1^{*} + O_p(N^{-1}p^{-1/2}), \tag{2.12}
\]
where
\[
\hat{a}_1^{*} = \frac{1}{Np}\sum_{i=1}^{N}(x_i-\mu)'(x_i-\mu). \tag{2.13}
\]
We state these results in the following theorem.

Theorem 2.1 For the general model given in (1.1)-(1.3) and under Assumption (A), the means of â_1 and of â_1^* are both a_1, and â_1^* − â_1 = O_p(N^{-1}p^{-1/2}). The variance of â_1 is given by Var(â_1) = (2a_2 + nK_4a_{20}/N)/(np) ≡ C_{11}/(np).

2.3. Variance of â_2

In this section, we derive the variance of â_2, the estimator of a_2 given in (2.5). Since
\[
\mathrm{tr}[V^2] = \mathrm{tr}\Big[\Big(\sum_{i=1}^{N}y_iy_i'\Big)^2\Big] = \sum_{i=1}^{N}(y_i'y_i)^2 + \sum_{i\neq j}^{N}(y_i'y_j)^2, \qquad
(\mathrm{tr}[V])^2 = \Big(\sum_{i=1}^{N}y_i'y_i\Big)^2 = \sum_{i=1}^{N}(y_i'y_i)^2 + \sum_{i\neq j}^{N}(y_i'y_i)(y_j'y_j),
\]
we can rewrite â_2, with f = pN(N−1)(N−2)(N−3), as
\[
\hat{a}_2 = \frac{1}{f}\Big[(N-2)n\sum_{i\neq j}^{N}(y_i'y_j)^2 - (2n-1)\sum_{i=1}^{N}(y_i'y_i)^2 + \sum_{i\neq j}^{N}(y_i'y_i)(y_j'y_j)\Big]
 = \frac{1}{pN(N-3)}\sum_{i\neq j}^{N}(y_i'y_j)^2 - \frac{2n-1}{f}\sum_{i=1}^{N}(y_i'y_i)^2 + \frac{1}{f}\sum_{i\neq j}^{N}(y_i'y_i)(y_j'y_j).
\]
From the Markov inequality and (2.6), we find that for every ε > 0,
\[
P\Big\{(2n-1)f^{-1}\sum_{i=1}^{N}(y_i'y_i)^2 > \varepsilon\Big\} \le \frac{2n\,E(\mathrm{tr}[D^2])}{f\varepsilon}
 \le \frac{2}{N(N-2)(N-3)\varepsilon}\big(NK_4a_{20} + 2Na_2 + pNa_1^2\big)
 = O(N^{-2}) + O(pN^{-2}) = O(p^{1-2\delta}) = o(1),
\]
since N = O(p^δ) for δ > 1/2. Similarly,
\[
P\Big\{f^{-1}\sum_{i\neq j}^{N}(y_i'y_i)(y_j'y_j) > \varepsilon\Big\} \le \frac{N(N-1)(\mathrm{tr}[\Sigma])^2}{f\varepsilon} = \frac{p\,a_1^2}{(N-2)(N-3)\varepsilon} = o(1).
\]
Hence,
\[
\hat{a}_2 = \frac{1}{pN(N-3)}\sum_{i\neq j}^{N}(y_i'y_j)^2 + o_p(1)
 = \frac{1}{pN(N-3)}\sum_{i\neq j}^{N}\big\{(x_i-\bar{x})'(x_j-\bar{x})\big\}^2 + o_p(1)
 = \frac{1}{pN(N-1)}\sum_{i\neq j}^{N}\big\{(x_i-\mu)'(x_j-\mu)\big\}^2 + o_p(1) = \hat{a}_2^{*} + o_p(1),
\]
where
\[
\hat{a}_2^{*} = \frac{1}{pnN}\sum_{i\neq j}^{N}\big\{(x_i-\mu)'(x_j-\mu)\big\}^2.
\]
Thus, Var(â_2) = Var(â_2^*)(1 + o(1)). To find the variance of â_2^*, we define
\[
z_{ij} = \frac{1}{\sqrt{p}}(x_i-\mu)'(x_j-\mu) = \frac{1}{\sqrt{p}}u_i'\Sigma u_j
\]
from the model (1.1). Thus â_2^* = (Nn)^{-1}Σ_{i≠j}^N z_ij^2 and E(z_ij^2) = a_2. Hence, from the results on moments given in Section 7, we get
\[
\mathrm{Var}(\hat{a}_2^{*}) = E\Big[\frac{1}{nN}\sum_{i\neq j}^{N}(z_{ij}^2 - a_2)\Big]^2
 = \frac{2}{N^2n^2}\sum_{i\neq j}^{N}E(z_{ij}^2-a_2)^2 + \frac{4}{N^2n^2}\sum_{i\neq j\neq k}^{N}E\big[(z_{ij}^2-a_2)(z_{ik}^2-a_2)\big]
\]
\[
 = \frac{2}{Nn}\mathrm{Var}(z_{ij}^2) + \frac{4(N-2)}{Nn}\mathrm{Cov}(z_{ij}^2, z_{ik}^2) \quad (i \neq j \neq k)
 = \frac{2}{Nnp^2}\mathrm{Var}\big[(u_i'\Sigma u_j)^2\big] + \frac{4(N-2)}{Nnp^2}\Big\{E\big[(u_i'\Sigma u_j)^2(u_i'\Sigma u_k)^2\big] - (\mathrm{tr}[\Sigma^2])^2\Big\}.
\]
For any i ≠ j ≠ k, the second term of the above expression satisfies
\[
\frac{4(N-2)}{Nnp^2}\Big\{E\big[(u_i'\Sigma u_j)^2(u_i'\Sigma u_k)^2\big] - (\mathrm{tr}[\Sigma^2])^2\Big\}
 = \frac{4(N-2)}{Nnp^2}\Big\{K_4\sum_{\ell=1}^{p}\{(\Sigma^2)_{\ell\ell}\}^2 + 2\,\mathrm{tr}[\Sigma^4]\Big\},
\]
since E[(u_i'Σu_j)^2(u_i'Σu_k)^2] = E[u_i'Σu_j u_j'Σu_i u_i'Σu_k u_k'Σu_i] = E[(u_i'Σ^2u_i)^2] = K_4 Σ_{ℓ=1}^p {(Σ^2)_{ℓℓ}}^2 + 2tr[Σ^4] + (tr[Σ^2])^2. For any i ≠ j, from Lemmas 7.1 and 7.2, we get
\[
\mathrm{Var}\big[(u_i'\Sigma u_j)^2\big] = K_4^2\sum_{k,\ell}^{p}\sigma_{k\ell}^4 + 6K_4\sum_{k=1}^{p}\{(\Sigma^2)_{kk}\}^2 + 2(\mathrm{tr}[\Sigma^2])^2 + 6\,\mathrm{tr}[\Sigma^4].
\]
Hence,
\[
\mathrm{Var}(\hat{a}_2^{*}) = \frac{2}{Nnp^2}\Big\{K_4^2\sum_{i,j}^{p}\sigma_{ij}^4 + 6K_4\sum_{i=1}^{p}\{(\Sigma^2)_{ii}\}^2 + 2(\mathrm{tr}[\Sigma^2])^2 + 6\,\mathrm{tr}[\Sigma^4]\Big\}
 + \frac{4(N-2)}{Nnp^2}\Big\{K_4\sum_{i=1}^{p}\{(\Sigma^2)_{ii}\}^2 + 2\,\mathrm{tr}[\Sigma^4]\Big\},
\]
which is rewritten as
\[
\mathrm{Var}(\hat{a}_2^{*}) = \frac{4a_2^2}{Nn} + \frac{4(N+1)}{Nnp^2}K_4\sum_{i=1}^{p}\{(\Sigma^2)_{ii}\}^2 + \frac{4(2N-1)}{Nnp^2}\mathrm{tr}[\Sigma^4] + \frac{2K_4^2}{Nnp^2}\sum_{i,j}^{p}\sigma_{ij}^4.
\]
Since p^{-2}Σ_{i,j}σ_ij^4 = o(1) by Assumption (A)(iii),
\[
\mathrm{Var}(\hat{a}_2^{*}) = \frac{4a_2^2}{n^2} + \frac{4}{np}K_4b_4 + \frac{8}{np}a_4 + o(n^{-2})
 = \frac{4a_2^2}{n^2} + \frac{4}{np}(K_4b_4 + 2a_4) + o(n^{-2}) \equiv \frac{1}{n^2}C_{22} + o(n^{-2}),
\]
where C_{22} = 4[a_2^2 + (n/p)(K_4b_4 + 2a_4)], b_4 = p^{-1}Σ_{i=1}^p{(Σ^2)_{ii}}^2 and a_4 = p^{-1}tr[Σ^4]. The above result is stated in the following theorem.

Theorem 2.2 Under the general model given in (1.1)-(1.3) and under Assumption (A), the variance of â_2^* is approximated as
\[
\mathrm{Var}(\hat{a}_2^{*}) = \frac{4}{n^2}\Big\{a_2^2 + \frac{n}{p}(K_4b_4 + 2a_4)\Big\} + o(n^{-2}) = \frac{C_{22}}{n^2} + o(n^{-2}),
\]
where C_{22}, b_4 and a_4 have been defined above.

2.4. Covariance between â_1 and â_2

In this section, we derive an expression for the covariance between â_1 and â_2, which is needed to obtain the joint distribution of â_1 and â_2. Since â_1 and â_2 are asymptotically equivalent to â_1^* and â_2^* respectively, we obtain Cov(â_1^*, â_2^*). In terms of the model (1.1),
\[
\hat{a}_1^{*} = \frac{1}{Np}\sum_{\ell=1}^{N}u_\ell'\Sigma u_\ell \quad\text{and}\quad \hat{a}_2^{*} = \frac{1}{pnN}\sum_{j\neq k}^{N}(u_j'\Sigma u_k)^2.
\]
Thus,
\[
\mathrm{Cov}(\hat{a}_1^{*},\hat{a}_2^{*}) = \frac{1}{N^2np^2}E\Big[\sum_{\ell=1}^{N}u_\ell'\Sigma u_\ell \sum_{j\neq k}^{N}(u_j'\Sigma u_k)^2\Big] - a_1a_2
 = \frac{1}{N^2np^2}E\Big[\sum_{j\neq k}^{N}\Big\{\sum_{\ell\neq j,k}(u_\ell'\Sigma u_\ell)(u_j'\Sigma u_k)^2 + 2(u_j'\Sigma u_j)(u_j'\Sigma u_k)^2\Big\}\Big] - a_1a_2
\]
\[
 = \frac{N-2}{N}a_1a_2 + \frac{2}{Np^2}\Big\{K_4\sum_{r=1}^{p}\sigma_{rr}(\Sigma^2)_{rr} + 2\,\mathrm{tr}[\Sigma^3] + \mathrm{tr}[\Sigma]\,\mathrm{tr}[\Sigma^2]\Big\} - a_1a_2
 = \frac{2}{np}(K_4b_3 + 2a_3) \equiv \frac{1}{np}C_{12},
\]
where C_{12} = 2(K_4b_3 + 2a_3), b_3 = p^{-1}Σ_{r=1}^p σ_rr(Σ^2)_{rr} and a_3 = p^{-1}tr[Σ^3]. The above result is stated in the following theorem.

Theorem 2.3 Under the general model given in (1.1)-(1.3) and under Assumption (A), the covariance between â_1^* and â_2^* is given by
\[
\mathrm{Cov}(\hat{a}_1^{*},\hat{a}_2^{*}) = \frac{2}{np}(K_4b_3 + 2a_3) = \frac{C_{12}}{np},
\]
where b_3 and a_3 are defined above.

Corollary 2.1 From the results of Theorems 2.1-2.3, the covariance matrix of (â_1^*, â_2^*)' is approximated as
\[
\mathrm{Cov}\Big[\binom{\hat{a}_1^{*}}{\hat{a}_2^{*}}\Big] = \begin{pmatrix} C_{11}/(np) & C_{12}/(np) \\ C_{12}/(np) & C_{22}/n^2 \end{pmatrix} + o(n^{-2}) = C + o(n^{-2}). \tag{2.14}
\]
It will be shown in Section 6 that as (N, p) → ∞, (â_1^*, â_2^*)' is asymptotically distributed as bivariate normal with mean vector (a_1, a_2)' and covariance matrix C as given above in (2.14).

3. Tests for Testing that Σ = λI_p and Σ = I_p

In this section, we consider the model described in (1.1)-(1.3) and propose tests for the two hypotheses, namely for testing sphericity and for testing the hypothesis that Σ = I_p. The sphericity hypothesis is considered first in Section 3.1, and the hypothesis that Σ = I_p is considered in Section 3.2.

3.1. Testing sphericity

For finite p and N → ∞, John (1971) proposed a locally best invariant test based on the statistic
\[
U = \frac{\mathrm{tr}[S^2]/p}{(\mathrm{tr}[S]/p)^2} - 1, \qquad S = \frac{1}{n}V, \tag{3.1}
\]
and showed that for finite p, as n → ∞, (Np/2)U →_d χ^2_d under the hypothesis that Σ = λI_p, where d = p(p+1)/2 − 1 and →_d denotes convergence in distribution.
Ledoit and Wolf (2002) showed that for (n, p) → ∞ such that p/N → c, the modified statistic
\[
T_{LW} = NU - p \overset{d}{\to} N(1, 4) \tag{3.2}
\]
under the hypothesis that Σ = λI_p; the distribution of this statistic when Σ ≠ λI_p was not given, but was later given in Srivastava (2005). Srivastava (2005) showed that from the Cauchy-Schwarz inequality
\[
\frac{a_2}{a_1^2} = \frac{p^{-1}\sum_{i=1}^{p}\lambda_i^2}{\big(p^{-1}\sum_{i=1}^{p}\lambda_i\big)^2} \ge 1, \quad\text{with equality if and only if } \lambda_i = \lambda, \tag{3.3}
\]
where the λ_i are the eigenvalues of Σ. Thus a measure of sphericity is given by
\[
M_1 = \frac{a_2}{a_1^2} - 1, \tag{3.4}
\]
which is equal to zero if and only if λ_1 = ... = λ_p = λ. Thus, Srivastava (2005) proposed a test based on the unbiased and consistent estimators â_1 and â_2s, defined in (2.3) and (2.4) respectively, under the assumption that the observations are i.i.d. N_p(µ, Σ). This test statistic is given by
\[
T_{1s} = \frac{n}{2}\Big(\frac{\hat{a}_{2s}}{\hat{a}_1^2} - 1\Big), \tag{3.5}
\]
and Srivastava (2005) showed that as (n, p) → ∞, T_{1s} →_d N(0, 1) under the hypothesis that Σ = λI_p. The asymptotic distribution of T_{1s} when Σ ≠ λI_p was also given. It has been shown in (2.9), however, that the estimator â_2s is not unbiased for the general model given in (1.1)-(1.3).

An unbiased estimator of a_2 or tr[Σ^2] can be obtained by using Hoeffding's (1948) U-statistics; for details, see Fraser (1957, Chapter 4), Serfling (1980) and Lee (1990). For example, if the mean vector µ is zero, tr[Σ^2] can be estimated by
\[
\widehat{\mathrm{tr}[\Sigma^2]} = \frac{1}{N(N-1)}\sum_{i\neq j}^{N}(x_i'x_j)^2,
\]
as has been done by Ahmad, Werner and Brunner (2008) in connection with testing mean vectors in high-dimensional data. The computation of this estimator takes time of order O(N^2), the same as in calculating â_2s. But when µ is not known, an unbiased estimator of tr[Σ^2] using Hoeffding's U-statistic is given by
\[
\widehat{\mathrm{tr}[\Sigma^2]} = \frac{1}{4N(N-1)(N-2)(N-3)}\sum_{i\neq j\neq k\neq \ell}^{N}\big[(x_i-x_j)'(x_k-x_\ell)\big]^2
\]
\[
 = \frac{1}{N(N-1)}\sum_{i\neq j}^{N}(x_i'x_j)^2 - \frac{2}{N(N-1)(N-2)}\sum_{i\neq j\neq k}^{N}x_i'x_jx_j'x_k
 + \frac{1}{N(N-1)(N-2)(N-3)}\sum_{i\neq j\neq k\neq \ell}^{N}x_i'x_jx_k'x_\ell,
\]
where the sums are over distinct indices, as used by Chen, Zhang and Zhong (2010), who replaced â_2s in Srivastava's statistic T_{1s} by the above estimator divided by p. The above estimator of tr[Σ^2], however, has a summation over four indices, and thus requires computing time of O(N^4), which is not easy to compute.
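For illustration, the µ = 0 version of the U-statistic above can be computed in O(N^2 p) time by working with the Gram matrix rather than with an explicit double loop; a minimal numpy sketch of ours:

```python
import numpy as np

def tr_sigma2_ustat_mean0(X):
    """U-statistic (1/(N(N-1))) sum_{i != j} (x_i' x_j)^2 for tr(Sigma^2),
    valid when mu = 0; vectorized so the cost is O(N^2 p)."""
    N = X.shape[0]
    G = X @ X.T                                       # Gram matrix, (i,j) entry x_i' x_j
    off = np.sum(G * G) - np.sum(np.diag(G) ** 2)     # keep only i != j terms
    return off / (N * (N - 1))
```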
Thus, in this paper, we use the estimator â_2 in place of â_2s and propose the statistic
\[
T_1 = \frac{n}{2}\Big(\frac{\hat{a}_2}{\hat{a}_1^2} - 1\Big), \tag{3.6}
\]
which gives a one-sided test, since we are testing the hypothesis H_1 : M_1 = 0 against the alternative A_1 : M_1 > 0. In Section 6, we show that as (N, p) → ∞, (â_1, â_2) has a bivariate normal distribution with covariance matrix C given in (2.14). Thus, following the delta method for obtaining the asymptotic distribution, as in Srivastava (2005) or Srivastava and Khatri (1979, page 59, Theorem 2.10.2), the variance of â_2/â_1^2 is approximated as Var(â_2/â_1^2) = τ^2(n, p) + o(n^{-2}), where
\[
\tau^2(n,p) = \Big(\frac{-2a_2}{a_1^3},\ \frac{1}{a_1^2}\Big)
\begin{pmatrix} C_{11}/(np) & C_{12}/(np) \\ C_{12}/(np) & C_{22}/n^2 \end{pmatrix}
\Big(\frac{-2a_2}{a_1^3},\ \frac{1}{a_1^2}\Big)'
 = \frac{4a_2^2C_{11}}{npa_1^6} - \frac{4a_2C_{12}}{npa_1^5} + \frac{C_{22}}{n^2a_1^4}
\]
\[
 = \frac{4a_2^2}{n^2a_1^4} + \frac{1}{np}\Big\{\frac{4a_2^2(K_4a_{20}+2a_2)}{a_1^6} - \frac{8a_2(K_4b_3+2a_3)}{a_1^5} + \frac{4(K_4b_4+2a_4)}{a_1^4}\Big\}
 = \frac{4a_2^2}{n^2a_1^4} + \frac{K_4}{np}\Big(\frac{4a_2^2a_{20}}{a_1^6} - \frac{8a_2b_3}{a_1^5} + \frac{4b_4}{a_1^4}\Big) + \frac{1}{np}\Big(\frac{8a_2^3}{a_1^6} - \frac{16a_2a_3}{a_1^5} + \frac{8a_4}{a_1^4}\Big). \tag{3.7}
\]
Thus, we get the following result, stated as Theorem 3.1.

Theorem 3.1 As (n, p) → ∞, the asymptotic distribution of the statistic â_2/â_1^2 under Assumption (A) and for the distribution given in (1.1)-(1.3) is given by
\[
\big(\hat{a}_2/\hat{a}_1^2 - a_2/a_1^2\big)/\tau(n,p) \overset{d}{\to} N(0,1),
\]
where τ^2(n, p) is defined in (3.7).

Under the hypothesis that Σ = λI_p, we have a_2/a_1^2 = 1, a_{20}/a_1^2 = 1, b_3/a_1^3 = 1, b_4/a_1^4 = 1 and a_4/a_1^4 = 1. Hence, under the hypothesis, the variance of â_2/â_1^2, denoted by Var_0, is given by
\[
\mathrm{Var}_0(\hat{a}_2/\hat{a}_1^2) = \frac{4}{n^2}. \tag{3.8}
\]
Hence, we get the following result, stated as Corollary 3.1.

Corollary 3.1 The asymptotic distribution of the test statistic T_1 when Σ = λI_p, λ > 0, as (n, p) → ∞, is given by T_1 →_d N(0, 1).

3.2. Testing Σ = I_p

In this section, we consider the problem of testing the hypothesis that Σ = I_p. Nagao (1973) proposed the locally most powerful test given by
\[
\tilde{T}_1 = \frac{1}{p}\mathrm{tr}[(S-I_p)^2], \tag{3.9}
\]
and showed that as N goes to infinity while p remains fixed, the limiting null distribution is given by
\[
\frac{Np}{2}\tilde{T}_1 \overset{d}{\to} Y_{p(p+1)/2}, \tag{3.10}
\]
where Y_d denotes a χ^2 random variable with d degrees of freedom. Ledoit and Wolf (2002) modified this statistic and proposed the statistic
\[
W = \frac{1}{p}\mathrm{tr}[(S-I_p)^2] - \frac{p}{N}\hat{a}_1^2 + \frac{p}{N}, \tag{3.11}
\]
and showed that as (N, p) → ∞ in a manner that p/N → c,
\[
nW - p \overset{d}{\to} N(1, 4)
\]
under the hypothesis that Σ = I_p. The test proposed by Nagao was based on an estimate of the distance
\[
M_2 = \frac{1}{p}\mathrm{tr}[(\Sigma-I_p)^2] = \frac{1}{p}\big(\mathrm{tr}[\Sigma^2] - 2\,\mathrm{tr}[\Sigma] + p\big) = a_2 - 2a_1 + 1. \tag{3.12}
\]
Srivastava (2005) proposed a test based on the unbiased and consistent estimators â_1 and â_2s under normality. This test is given by
\[
T_{2s} = \frac{n}{2}(\hat{a}_{2s} - 2\hat{a}_1 + 1). \tag{3.13}
\]
As (N, p) → ∞, it has been shown to be normally distributed under the assumption that the observations are normally distributed. We propose the statistic
\[
T_2 = \frac{n}{2}(\hat{a}_2 - 2\hat{a}_1 + 1) \tag{3.14}
\]
and show that asymptotically, as (N, p) → ∞, T_2 is normally distributed. It may be noted that T_2 also gives a one-sided test.

From the covariance matrix of (â_1, â_2) given in (2.14), the variance of â_2 − 2â_1 can be approximated as Var(â_2 − 2â_1) = η^2(n, p) + o(n^{-2}), where
\[
\eta^2(n,p) = \mathrm{Var}(\hat{a}_2) + 4\,\mathrm{Var}(\hat{a}_1) - 4\,\mathrm{Cov}(\hat{a}_1,\hat{a}_2)
 = \frac{4a_2^2}{n^2} + \frac{8a_4}{np} + \frac{8a_2}{np} - \frac{16a_3}{np} + \frac{4K_4}{np}(b_4 + a_{20} - 2b_3). \tag{3.15}
\]
We state these results in Theorem 3.2.

Theorem 3.2 Under Assumption (A) and for the distribution given in (1.1)-(1.3), the asymptotic distribution of the statistic â_2 − 2â_1 as (n, p) → ∞ is given by
\[
\big\{(\hat{a}_2 - 2\hat{a}_1) - (a_2 - 2a_1)\big\}/\eta(n,p) \overset{d}{\to} N(0,1), \tag{3.16}
\]
where η^2(n, p) = Var(â_2 − 2â_1) is given in (3.15).

Since under the hypothesis Σ = I_p we have a_2 − 2a_1 = −1 and η^2(n, p) = 4/n^2, we get the following corollary.

Corollary 3.2 The asymptotic distribution of the test statistic T_2 when Σ = I_p is given by T_2 →_d N(0, 1), as (n, p) → ∞.
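Collecting the pieces, the two proposed one-sample statistics can be sketched as follows (our illustration; it assumes the helper functions a1_hat and a2_hat sketched in Section 2, and scipy only for the normal tail probability):

```python
from scipy.stats import norm

def one_sample_tests(X):
    """Sphericity statistic T1 = (n/2)(a2/a1^2 - 1) of (3.6) and the identity
    statistic T2 = (n/2)(a2 - 2 a1 + 1) of (3.14); both are N(0,1) under H."""
    n = X.shape[0] - 1
    a1, a2 = a1_hat(X), a2_hat(X)          # helpers sketched in Section 2
    T1 = (n / 2.0) * (a2 / a1**2 - 1.0)
    T2 = (n / 2.0) * (a2 - 2.0 * a1 + 1.0)
    # both tests are one-sided: reject for large values of the statistic
    return {"T1": T1, "p1": norm.sf(T1), "T2": T2, "p2": norm.sf(T2)}
```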
4. Tests for the Equality of Two Covariance Matrices

In this section, we consider the problem of testing the hypothesis of the equality of two covariance matrices Σ_1 and Σ_2, when N_1 i.i.d. p-dimensional observation vectors x_{1j}, j = 1, ..., N_1, are obtained from the first group following the model (1.1)-(1.3), with F replaced by F_1, µ by µ_1 and u_j by u_{1j}, where Σ_1 = F_1^2. Similarly, N_2 i.i.d. p-dimensional vectors x_{2j}, j = 1, ..., N_2, are obtained from the second group following the model (1.1)-(1.3), with F replaced by F_2, µ by µ_2 and u_j by u_{2j}, where Σ_2 = F_2^2. The sample mean vectors are given by
\[
\bar{x}_1 = \frac{1}{N_1}\sum_{j=1}^{N_1}x_{1j} \quad\text{and}\quad \bar{x}_2 = \frac{1}{N_2}\sum_{j=1}^{N_2}x_{2j}.
\]
Similarly, we define the sample covariance matrices S_1 and S_2 through V_1 and V_2, given by
\[
V_1 = \sum_{j=1}^{N_1}y_{1j}y_{1j}', \qquad V_2 = \sum_{j=1}^{N_2}y_{2j}y_{2j}', \qquad
S_1 = \frac{1}{n_1}V_1, \quad S_2 = \frac{1}{n_2}V_2, \quad n_i = N_i - 1, \ i = 1, 2,
\]
where y_{1j} = x_{1j} − x̄_1, j = 1, ..., N_1, and y_{2j} = x_{2j} − x̄_2, j = 1, ..., N_2. Under the normality assumption, the unbiased and consistent estimators of a_{1i} = tr[Σ_i]/p and a_{2i} = tr[Σ_i^2]/p are denoted by â_{1i} and â_{2is} respectively, obtained by using V_i in place of V and N_i or n_i = N_i − 1 in place of N, i = 1, 2. The unbiased estimator of a_{2i} under the general model is denoted by â_{2i}. Thus,
\[
\hat{a}_{1i} = \frac{1}{n_ip}\mathrm{tr}[V_i], \qquad
\hat{a}_{2is} = \frac{1}{(n_i-1)(n_i+2)p}\Big\{\mathrm{tr}[V_i^2] - \frac{1}{n_i}(\mathrm{tr}[V_i])^2\Big\},
\]
and
\[
\hat{a}_{2i} = \frac{1}{f_i}\Big\{(N_i-2)n_i\,\mathrm{tr}[V_i^2] - N_in_i\,\mathrm{tr}[D_i^2] + (\mathrm{tr}[V_i])^2\Big\},
\]
where, for i = 1, 2,
\[
f_i = pN_i(N_i-1)(N_i-2)(N_i-3), \qquad D_i = \mathrm{diag}(y_{i1}'y_{i1},\ldots,y_{iN_i}'y_{iN_i}) : N_i \times N_i.
\]
To test the hypothesis stated in Problem (3), namely the hypothesis Σ_1 = Σ_2 = Σ, say, against the alternative Σ_1 ≠ Σ_2, Schott (2007) proposed the statistic
\[
T_{Sc} = \frac{\hat{a}_{21s} + \hat{a}_{22s} - 2\,\mathrm{tr}[V_1V_2]/(pn_1n_2)}{2\hat{a}_{2s}(1/n_1 + 1/n_2)}, \tag{4.1}
\]
where
\[
\hat{a}_{2s} = \frac{1}{n_1+n_2}\big(n_1\hat{a}_{21s} + n_2\hat{a}_{22s}\big) \tag{4.2}
\]
is the estimator of a_2 = p^{-1}tr[Σ^2] under the hypothesis that Σ_1 = Σ_2 = Σ. It may be noted that the square of the expression in the denominator of T_{Sc} is an estimate of the variance of the statistic in the numerator. Using the new unbiased estimators of the a_{2i}, we obtain the statistic
\[
T_3 = \frac{\hat{a}_{21} + \hat{a}_{22} - 2\,\mathrm{tr}[V_1V_2]/(pn_1n_2)}{\sqrt{\widehat{\mathrm{Var}}_0(\hat{q}_3)}}, \tag{4.3}
\]
where \widehat{Var}_0(q̂_3) denotes the estimated variance, under the hypothesis Σ_1 = Σ_2 = Σ, of the numerator
\[
\hat{q}_3 = \hat{a}_{21} + \hat{a}_{22} - \frac{2}{pn_1n_2}\mathrm{tr}[V_1V_2].
\]
The variance of q̂_3, as shown in Sections 4.1 and 4.2, is given by
\[
\mathrm{Var}_0(\hat{q}_3) = \mathrm{Var}(\hat{a}_{21}) + \mathrm{Var}(\hat{a}_{22}) + \Big(\frac{2}{pn_1n_2}\Big)^2\mathrm{Var}(\mathrm{tr}[V_1V_2]) - \frac{4}{pn_1n_2}\sum_{i=1}^{2}\mathrm{Cov}(\hat{a}_{2i},\mathrm{tr}[V_1V_2])
 = \frac{4a_2^2}{n_1^2} + \frac{4a_2^2}{n_2^2} + \frac{8a_2^2}{n_1n_2} + o(N_1^{-1}), \quad N_1 \le N_2.
\]
Thus, assuming that N_i/p → 0, i = 1, 2, as (N_1, N_2, p) → ∞, the test statistic T_3 defined in (4.3) takes the form
\[
T_3 = \frac{\hat{a}_{21} + \hat{a}_{22} - 2\,\mathrm{tr}[V_1V_2]/(pn_1n_2)}{2\hat{a}_2(1/n_1 + 1/n_2)}, \tag{4.4}
\]
where
\[
\hat{a}_2 = \frac{1}{n_1+n_2}\big(n_1\hat{a}_{21} + n_2\hat{a}_{22}\big). \tag{4.5}
\]
It may be noted that T_3 gives a one-sided test; that is, the hypothesis is rejected if T_3 > z_{1-α}, where z_{1-α} is the upper 100(1 − α)% point of the standard normal distribution.
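A minimal sketch of (4.4)-(4.5) (ours; it assumes the a2_hat helper sketched in Section 2 and computes tr[V_1V_2] from the N_1 × N_2 cross-Gram matrix, so that no p × p matrix is formed):

```python
import numpy as np

def T3_stat(X1, X2):
    """Two-sample statistic (4.4): N(0,1) under H3: Sigma_1 = Sigma_2."""
    (N1, p), (N2, _) = X1.shape, X2.shape
    n1, n2 = N1 - 1, N2 - 1
    a21, a22 = a2_hat(X1), a2_hat(X2)            # unbiased estimators of tr(Sigma_i^2)/p
    Y1 = X1 - X1.mean(axis=0)
    Y2 = X2 - X2.mean(axis=0)
    C = Y1 @ Y2.T                                # N1 x N2 cross-Gram matrix
    trV1V2 = np.sum(C * C)                       # tr[V1 V2] = ||Y1 Y2'||_F^2
    a2_pool = (n1 * a21 + n2 * a22) / (n1 + n2)  # pooled estimator (4.5)
    num = a21 + a22 - 2.0 * trV1V2 / (p * n1 * n2)
    return num / (2.0 * a2_pool * (1.0 / n1 + 1.0 / n2))
```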
4.1. Evaluation of the Variance of tr[V_1V_2]

To evaluate the variance of tr[V_1V_2], we note that F_1 = F_2 = F under the hypothesis Σ_1 = Σ_2 = Σ = F^2, and asymptotically
\[
V_1 = F\sum_{i=1}^{N_1}u_{1i}u_{1i}'F, \qquad V_2 = F\sum_{j=1}^{N_2}u_{2j}u_{2j}'F.
\]
Thus,
\[
\mathrm{tr}[V_1V_2] = \mathrm{tr}\Big[\sum_{i=1}^{N_1}u_{1i}u_{1i}'\,\Sigma\sum_{j=1}^{N_2}u_{2j}u_{2j}'\,\Sigma\Big] = \sum_{i=1}^{N_1}u_{1i}'Cu_{1i},
\]
where C = (c_{ij}) = ΣBΣ and B = Σ_{j=1}^{N_2}u_{2j}u_{2j}'. Hence,
\[
\mathrm{Var}_0(\mathrm{tr}[V_1V_2]) = E\Big[\mathrm{Var}\Big(\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big)\Big] + \mathrm{Var}\Big(E\Big[\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big]\Big).
\]
Note that
\[
E\Big[\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big] = N_1\,\mathrm{tr}[C] = N_1\sum_{j=1}^{N_2}u_{2j}'\Sigma^2u_{2j},
\]
so that
\[
\mathrm{Var}\Big(E\Big[\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big]\Big) = N_1^2N_2\Big(K_4\sum_{i=1}^{p}\{(\Sigma^2)_{ii}\}^2 + 2\,\mathrm{tr}[\Sigma^4]\Big).
\]
Thus, under Assumption (A),
\[
\frac{1}{p^2n_1^2n_2^2}\mathrm{Var}\Big(E\Big[\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big]\Big) \le \frac{4}{N_2p^2}(K_4+2)\,\mathrm{tr}[\Sigma^4] = o((np)^{-1}),
\]
and hence
\[
\frac{1}{p^2n_1^2n_2^2}\mathrm{Var}(\mathrm{tr}[V_1V_2]) = \frac{1}{p^2n_1^2n_2^2}E\Big[\mathrm{Var}\Big(\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big)\Big] + o((np)^{-1}).
\]
For a given C = ΣBΣ, where B = Σ_{j=1}^{N_2}u_{2j}u_{2j}',
\[
\mathrm{Var}\Big(\sum_{i=1}^{N_1}u_{1i}'Cu_{1i}\,\Big|\,C\Big) = N_1\Big(K_4\sum_{i=1}^{p}c_{ii}^2 + 2\,\mathrm{tr}[C^2]\Big) = N_1\Big(K_4\sum_{i=1}^{p}c_{ii}^2 + 2\,\mathrm{tr}[\Sigma^2B\Sigma^2B]\Big).
\]
Let Σ = (σ_1, ..., σ_p). Then,
\[
\sum_{i=1}^{p}c_{ii}^2 = \sum_{i=1}^{p}(\sigma_i'B\sigma_i)^2 = \sum_{i=1}^{p}\Big(\sum_{j=1}^{N_2}(\sigma_i'u_{2j})^2\Big)^2
 = \sum_{i=1}^{p}\sum_{j=1}^{N_2}(\sigma_i'u_{2j})^4 + \sum_{i=1}^{p}\sum_{j\neq k}^{N_2}(\sigma_i'u_{2j})^2(\sigma_i'u_{2k})^2.
\]
For the leading term,
\[
E\Big[\sum_{i=1}^{p}\sum_{j=1}^{N_2}(\sigma_i'u_{2j})^4\Big] = N_2\sum_{i=1}^{p}E\big[(u_{2j}'\sigma_i\sigma_i'u_{2j})^2\big]
 = N_2\Big(K_4\sum_{i=1}^{p}\sum_{j=1}^{p}\{(\sigma_i\sigma_i')_{jj}\}^2 + 3\sum_{i=1}^{p}(\sigma_i'\sigma_i)^2\Big)
 \le N_2(K_4+3)\sum_{i=1}^{p}(\sigma_i'\sigma_i)^2 \le N_2(K_4+3)\,\mathrm{tr}[\Sigma^4],
\]
since Σ^4 = Σ_{i}σ_iσ_i'σ_iσ_i' + Σ_{i≠j}σ_iσ_i'σ_jσ_j'. Similarly,
\[
E(\mathrm{tr}[C^2]) = E\big(\mathrm{tr}[\Sigma^2B\Sigma^2B]\big)
 = E\Big[\sum_{j=1}^{N_2}(u_{2j}'\Sigma^2u_{2j})^2 + \sum_{j\neq k}^{N_2}(u_{2j}'\Sigma^2u_{2k})^2\Big]
 = N_2\Big(K_4\sum_{r=1}^{p}\{(\Sigma^2)_{rr}\}^2 + 2\,\mathrm{tr}[\Sigma^4] + (\mathrm{tr}[\Sigma^2])^2\Big) + N_2n_2\,\mathrm{tr}[\Sigma^4].
\]
Thus,
\[
\frac{1}{p^2n_1^2n_2^2}\mathrm{Var}_0(\mathrm{tr}[V_1V_2]) = \frac{N_1}{p^2n_1^2n_2^2}\Big\{K_4E\Big[\sum_{i=1}^{p}c_{ii}^2\Big] + 2E(\mathrm{tr}[C^2])\Big\} + o((np)^{-1}) = \frac{2a_2^2}{n_1n_2} + o(n^{-2}),
\]
since, after dividing by p^2n_1^2n_2^2, the only term not of smaller order under Assumption (B) is the one arising from 2N_2(tr[Σ^2])^2.

4.2. Evaluation of the covariance between â_{2i} and tr[V_1V_2]

The covariance between â_{21} and tr[V_1V_2] under the hypothesis is given by
\[
C_{12}^{(1)} \equiv \mathrm{Cov}_0\big[\hat{a}_{21},\ \mathrm{tr}[V_1V_2]/(N_1N_2p)\big]
 = \frac{1}{N_1^2N_2n_1p^2}E\Big[\sum_{j\neq k}^{N_1}(u_{1j}'\Sigma u_{1k})^2\,\mathrm{tr}\Big(\Sigma\sum_{i=1}^{N_1}u_{1i}u_{1i}'\,\Sigma\sum_{\ell=1}^{N_2}u_{2\ell}u_{2\ell}'\Big)\Big] - a_2^2
 = \frac{1}{N_1^2n_1p^2}E\Big[\sum_{j\neq k}^{N_1}(u_{1j}'\Sigma u_{1k})^2\,\mathrm{tr}\Big(\Sigma^2\sum_{i=1}^{N_1}u_{1i}u_{1i}'\Big)\Big] - a_2^2,
\]
since the u_{1j} and u_{2ℓ} are independently distributed and â_{21} is distributed independently of the u_{2ℓ}. Hence,
\[
C_{12}^{(1)} = \frac{1}{N_1^2n_1p^2}E\Big[\sum_{i\neq j\neq k}^{N_1}(u_{1j}'\Sigma u_{1k})^2(u_{1i}'\Sigma^2u_{1i}) + 2\sum_{j\neq k}^{N_1}(u_{1j}'\Sigma u_{1k})^2(u_{1j}'\Sigma^2u_{1j})\Big] - a_2^2
\]
\[
 = \frac{N_1n_1(n_1-1)}{N_1^2n_1}a_2^2 + \frac{2N_1n_1}{N_1^2n_1}\Big(\frac{K_4\sum_{i=1}^{p}\{(\Sigma^2)_{ii}\}^2}{p^2} + \frac{2\,\mathrm{tr}[\Sigma^4]}{p^2} + a_2^2\Big) - a_2^2
 = \Big(\frac{n_1-1}{N_1} + \frac{2}{N_1} - 1\Big)a_2^2 + \frac{2}{N_1}\times o(1) = o(N_1^{-1}),
\]
since (n_1 − 1)/N_1 + 2/N_1 − 1 = 0 and K_4Σ_{i}{(Σ^2)_{ii}}^2/p^2 + 2tr[Σ^4]/p^2 = o(1) under Assumption (A).

5. Attained Significance Level and Power

In this section, we compare the performance of the proposed test statistics with the tests given under the normality assumption. The attained significance level at the nominal value α = 0.05 and the power are investigated in finite samples by simulation. The attained significance level (ASL) is defined by α̂_T = #(T_H > z_α)/r_1 for Problems (1) and (2), and by α̂_T = #(T_H^2 > χ^2_{1,α})/r_1 for Problem (3), where the T_H are values of the test statistic T computed from data simulated under the null hypothesis H, r_1 is the number of replications, z_α is the 100(1 − α)% quantile of the standard normal distribution, and χ^2_{1,α} is the 100(1 − α)% quantile of the chi-square distribution with one degree of freedom. The ASL assesses how close the null distribution of T is to its limiting null distribution. From the same simulation, we also obtain ẑ_α for Problems (1) and (2) and χ̂^2_{1,α} for Problem (3) as the 100(1 − α)% sample quantiles of the empirical null distribution, and define the attained power by β̂_T = #(T_A > ẑ_α)/r_2 for Problems (1) and (2), and β̂_T = #(T_A^2 > χ̂^2_{1,α})/r_2 for Problem (3), where the T_A are values of T computed from data simulated under the alternative hypothesis A. In our simulation, we set r_1 = 10,000 and r_2 = 5,000. It may be noted that, irrespective of the ASL of any statistic, the power has been computed so that all the statistics in the comparison have the same specified significance level, as the cut-off points have been obtained by simulation. The ASL gives an idea of how close the actual level is to the specified significance level. If it is not close, the only choice left is to obtain the cut-off point from simulation, not from the asymptotic distribution.
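A minimal Monte Carlo harness for the ASL and the empirical critical value might look as follows (a sketch under our own conventions; stat and sampler are hypothetical user-supplied callables returning a test statistic and a simulated data set, respectively):

```python
import numpy as np
from scipy.stats import norm

def attained_significance_level(stat, sampler, r1=10_000, alpha=0.05, seed=0):
    """Fraction of null replications with stat(X) > z_alpha, plus the empirical
    100(1-alpha)% critical value used for the power comparison of Section 5."""
    rng = np.random.default_rng(seed)
    z_alpha = norm.ppf(1.0 - alpha)
    t = np.array([stat(sampler(rng)) for _ in range(r1)])
    return np.mean(t > z_alpha), np.quantile(t, 1.0 - alpha)
```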
It is common in practice, although not recommended, to depend on the asymptotic distribution rather than relying on simulations to determine the ASL. Throughout the simulation, we let µ = 0 without loss of generality. For j = 1, ..., N, u_j = (u_{ij}) given in the model (1.1) is generated in four cases: one is the normal case and the others are non-normal cases.

(Case 1) u_{ij} ~ N(0, 1),
(Case 2) u_{ij} = (ν_{ij} − 32)/8 for ν_{ij} ~ χ²_32,
(Case 3) u_{ij} = (ν_{ij} − 8)/4 for ν_{ij} ~ χ²_8,
(Case 4) u_{ij} = (ν_{ij} − 2)/2 for ν_{ij} ~ χ²_2,

where χ²_m denotes the chi-square distribution with m degrees of freedom, so that the u_{ij} are standardized. Since the skewness and kurtosis (K_4 + 3) of χ²_m are (8/m)^{1/2} and 3 + 12/m respectively, χ²_2 has higher skewness and kurtosis than χ²_8 and χ²_32. Following (1.1), x_j is generated by x_j = Fu_j for Σ = F^2.

[1] Testing Problems (1) and (2). For these testing problems, the null and alternative hypotheses we treat are
H : Σ = I_p and A : Σ = diag(d_1, ..., d_p), d_i = 1 + (−1)^{i+1}(p − i)/(2p).
We compare the two tests T_{1s} and T_1, given in (3.5) and (3.6), for Problem (1), and the tests T_{2s} and T_2, given in (3.13) and (3.14), for Problem (2). It is noted that the 95% point of the standard normal distribution is 1.64485. The simulation results are reported in Tables 1-4 for Problems (1) and (2). From the tables, it is observed that the attained significance levels (ASL) of the proposed tests T_1 and T_2 are close to the specified level, while the ASL values of the tests T_{1s} and T_{2s}, proposed under the normal distribution, are much inflated in Cases 3 and 4. Concerning the powers, all tests have similar performance, although the proposed tests are slightly more powerful in Case 4.

[2] Testing Problem (3). For this testing problem, the covariance matrix Σ we treat here is of the form
\[
\Sigma(\rho) = \mathrm{diag}(\sigma_1,\ldots,\sigma_p)\,\big(\rho^{|i-j|^{1/10}}\big)\,\mathrm{diag}(\sigma_1,\ldots,\sigma_p),
\]
where (ρ^{|i−j|^{1/10}}) denotes the p × p matrix with (i, j)th element ρ^{|i−j|^{1/10}}, and σ_i = 1 + (−1)^{i+1}U_i/2 for a random variable U_i having the uniform distribution U(0, 1). Then we consider the null and alternative hypotheses given by
H : Σ_1 = Σ_2 = Σ(0.1) and A : Σ_1 = Σ(0.1), Σ_2 = Σ(0.3).
We compare the two tests T_{Sc} and T_3, given in (4.1) and (4.4). The simulation results are reported in Table 5. From the table, it is seen that the ASL of the proposed test T_3 is closer to the nominal level than that of T_{Sc} in Cases 3 and 4. Concerning the powers, both tests have similar performance, although the proposed test is slightly more powerful in Case 4.
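For completeness, the two simulation designs above can be generated as follows (our sketch; the function names are assumptions, and the exponent |i−j|^{1/10} follows our reading of the display above):

```python
import numpy as np

def sigma_rho(p, rho, rng):
    """Two-sample design: Sigma(rho) = D R D with R_ij = rho^(|i-j|^(1/10))
    and D = diag(sigma_1,...,sigma_p), sigma_i = 1 + (-1)^(i+1) U_i/2."""
    i = np.arange(p)                                  # 0-based; (-1)^i matches (-1)^(i+1) 1-based
    R = rho ** (np.abs(i[:, None] - i[None, :]) ** 0.1)
    s = 1.0 + (-1.0) ** i * rng.uniform(size=p) / 2.0
    return (s[:, None] * R) * s[None, :]

def diag_alternative(p):
    """One-sample alternative: d_i = 1 + (-1)^(i+1)(p - i)/(2p), i = 1..p."""
    i = np.arange(1, p + 1)
    return np.diag(1.0 + (-1.0) ** (i + 1) * (p - i) / (2.0 * p))
```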
6. Asymptotic Distributions

In this section, we show that all the test statistics proposed in Sections 3 and 4 are asymptotically normally distributed as (N_1, N_2, p) go to infinity. The test statistics depend on â_{2i} and â_{1i}, i = 1, 2, or simply on (â_2, â_1) in the one-sample case. We shall consider (â_{21}, â_{11}), or equivalently (â_{21}^*, â_{11}^*) in probability. To obtain asymptotic normality, we consider a linear combination l_1â_{21}^* + l_2â_{11}^* of â_{21}^* and â_{11}^*, where we assume without any loss of generality that l_1^2 + l_2^2 = 1. We shall show that for all l_1 and l_2, l_1â_{21} + l_2â_{11} is asymptotically normally distributed, from which the joint normality of â_{21} and â_{11} follows.

Note that
\[
\hat{a}_{21}^{*} - a_2 = \frac{1}{N_1n_1p}\sum_{i\neq j}^{N_1}(u_{1i}'\Sigma u_{1j})^2 - a_2
 = \frac{1}{N_1n_1p}\sum_{i\neq j}^{N_1}\big\{(u_{1i}'\Sigma u_{1j})^2 - u_{1j}'\Sigma^2u_{1j}\big\} + A,
\]
where
\[
A = \frac{1}{N_1n_1p}\sum_{i\neq j}^{N_1}\big(u_{1j}'\Sigma^2u_{1j} - \mathrm{tr}[\Sigma^2]\big) = \frac{1}{N_1p}\sum_{i=1}^{N_1}\big(u_{1i}'\Sigma^2u_{1i} - \mathrm{tr}[\Sigma^2]\big)
\]
and
\[
\mathrm{Var}(A) = \frac{1}{(N_1p)^2}\sum_{i=1}^{N_1}\mathrm{Var}(u_{1i}'\Sigma^2u_{1i}) \le \frac{4N_1}{(N_1p)^2}(K_4+2)\,\mathrm{tr}[\Sigma^4] = o(N_1^{-1}),
\]
since tr[Σ^4]/p^2 = o(1) from Assumption (A). Thus, A → 0 in probability. Hence,
\[
n_1(\hat{a}_{21}^{*} - a_2) = \frac{1}{N_1p}\sum_{i=2}^{N_1}\sum_{j=1}^{i-1}2\big\{(u_{1i}'\Sigma u_{1j})^2 - u_{1j}'\Sigma^2u_{1j}\big\} = \sum_{i=2}^{N_1}\xi_i,
\]
where ξ_i ≡ 2(N_1p)^{-1}Σ_{j=1}^{i-1}{(u_{1i}'Σu_{1j})^2 − u_{1j}'Σ^2u_{1j}}. Let ℑ_i^{(p)} be the σ-algebra generated by the random vectors u_{11}, ..., u_{1i}, i = 1, ..., N_1, and let (Ω, ℑ, P) be the probability space, where Ω is the whole space and P is the probability measure. Let ∅ be the null set. Then, with ℑ_0^{(p)} = (∅, Ω) = ℑ_{-1}^{(p)}, we find that ℑ_0^{(p)} ⊂ ℑ_1^{(p)} ⊂ ... ⊂ ℑ_{N_1}^{(p)} ⊂ ℑ, and E(ξ_i | ℑ_{i-1}) = 0. Let B_ℓ = Σu_{1ℓ}u_{1ℓ}'Σ for ℓ = j, k. Then
\[
E(\xi_i^2\,|\,\mathfrak{F}_{i-1}) = \Big(\frac{2}{N_1p}\Big)^2\Big[\sum_{j=1}^{i-1}\mathrm{Var}(u_{1i}'B_ju_{1i}\,|\,\mathfrak{F}_{i-1}) + 2\sum_{j<k}^{i-1}\mathrm{Cov}(u_{1i}'B_ju_{1i},\ u_{1i}'B_ku_{1i}\,|\,\mathfrak{F}_{i-1})\Big]
\]
\[
 = \Big(\frac{2}{N_1p}\Big)^2\Big[\sum_{j=1}^{i-1}\Big(K_4\sum_{\ell=1}^{p}\{(B_j)_{\ell\ell}\}^2 + 2\,\mathrm{tr}[B_j^2]\Big) + 2\sum_{j<k}^{i-1}\Big(K_4\sum_{\ell=1}^{p}(B_j)_{\ell\ell}(B_k)_{\ell\ell} + 2\,\mathrm{tr}[B_jB_k]\Big)\Big],
\]
where (B_j)_{ℓℓ} is the (ℓ, ℓ)th diagonal element of the matrix B_j = ((B_j)_{ℓr}). Thus,
\[
E(\xi_i^2) \le \frac{4(K_4+2)}{N_1^2p^2}\Big\{(i-1)E(u_j'\Sigma^2u_j)^2 + (i-1)(i-2)\,\mathrm{tr}[\Sigma^4]\Big\}
 \le \frac{4(K_4+2)}{p^2}\Big\{(K_4+2)\,\mathrm{tr}[\Sigma^4] + (\mathrm{tr}[\Sigma^2])^2 + \mathrm{tr}[\Sigma^4]\Big\} = O\big((p^{-1}\mathrm{tr}[\Sigma^2])^2\big).
\]
Hence, the sequence {ξ_i, ℑ_i} is a sequence of square integrable martingale differences; see Shiryaev (1984) or Hall and Heyde (1980). Similarly, it can be shown that
\[
\sum_{i=0}^{N_1}E(\xi_i^2\,|\,\mathfrak{F}_{i-1}) \overset{p}{\to} \sigma_0^2
\]
for some finite constant σ_0^2. Thus, it remains to show that Lindeberg's condition, namely
\[
\sum_{i=0}^{N_1}E\big[\xi_i^2\,I(|\xi_i| > \epsilon)\,\big|\,\mathfrak{F}_{i-1}\big] \overset{p}{\to} 0,
\]
is satisfied. It is known (see, e.g., Srivastava, 2009) that this condition is satisfied if we show that Σ_{i=0}^{N_1}E(ξ_i^4) → 0 as N_1 → ∞.

Next, we evaluate E(ξ_i^4). Note that
\[
\xi_i^2 = \Big(\frac{2}{N_1p}\Big)^2\Big[\sum_{j=1}^{i-1}c_{ij}^2 + 2\sum_{j<k}^{i-1}c_{ij}c_{ik}\Big], \qquad
c_{ij} = u_{1i}'B_ju_{1i} - \mathrm{tr}[B_j], \quad B_j = \Sigma u_{1j}u_{1j}'\Sigma.
\]
Hence, from an inequality in Rao (2002),
\[
\xi_i^4 \le 2\Big(\frac{2}{N_1p}\Big)^4\Big[\Big(\sum_{j=1}^{i-1}c_{ij}^2\Big)^2 + 4\Big(\sum_{j<k}^{i-1}c_{ij}c_{ik}\Big)^2\Big]
 = 2\Big(\frac{2}{N_1p}\Big)^4\Big[\sum_{j=1}^{i-1}c_{ij}^4 + 6\sum_{j<k}^{i-1}c_{ij}^2c_{ik}^2 + 4\sum_{j<k,\,l<r}c_{ij}c_{ik}c_{il}c_{ir}\Big],
\]
and
\[
E(\xi_i^4) \le 32\Big(\frac{1}{N_1p}\Big)^4\Big[\sum_{j=1}^{i-1}E(c_{ij}^4) + 6\sum_{j<k}^{i-1}E(c_{ij}^2c_{ik}^2)\Big]
 \le 96\,\frac{1}{(N_1p)^4}\,E\Big[\Big(\sum_{j=1}^{i-1}c_{ij}^2\Big)^2\Big].
\]
Hence, using again an inequality in Rao (2002), we get
\[
\sum_{i=1}^{N_1}E(\xi_i^4) \le \frac{192}{N_1p^4}E(c_{ij}^4) \quad (i \neq j),
\]
which is of order O(N_1^{-1}), i.e., converges to zero as (N_1, p) → ∞ under Assumption (A), by using the results on the moments given in Section 7, since E(c_{ij}^4)/p^4 = O(1). For example, for some constant γ, we get from Corollary 7.2 and the fact that N = O(p^δ), δ > 1/2,
\[
\frac{1}{p^4}E\big[(u_{1j}'\Sigma^2u_{1j})^4\big] \le \frac{\gamma}{p^4}\Big[(\mathrm{tr}[\Sigma^2])^4 + (\mathrm{tr}[\Sigma^2])^2\,\mathrm{tr}[\Sigma^4] + (\mathrm{tr}[\Sigma^4])^2 + \mathrm{tr}[\Sigma^2]\,\mathrm{tr}[\Sigma^6] + \mathrm{tr}[\Sigma^8]\Big],
\]
which is of order O(1) under Assumption (A), because
\[
\frac{1}{p^3}\mathrm{tr}[\Sigma^6] \le \frac{1}{p^3}\mathrm{tr}[\Sigma^2]\,\mathrm{tr}[\Sigma^4] = a_2\,\frac{a_4}{p} \to 0, \qquad
\frac{1}{p^4}\mathrm{tr}[\Sigma^8] \le \frac{1}{p^4}(\mathrm{tr}[\Sigma^4])^2 = \Big(\frac{a_4}{p}\Big)^2 p^0 \cdot p^0 = \frac{a_4^2}{p^2} \to 0.
\]
Similarly, for some constant γ_1,
\[
\frac{1}{p^4}E\Big\{E\big[(u_{1i}'B_ju_{1i})^4\,\big|\,B_j\big]\Big\} \le \frac{\gamma_1}{p^4}E\Big[2(\mathrm{tr}[B_j])^4 + (\mathrm{tr}[B_j])^2\,\mathrm{tr}[B_j^2] + (\mathrm{tr}[B_j^2])^2 + \mathrm{tr}[B_j^4]\Big]
 = \frac{5\gamma_1}{p^4}E\big[(u_{1j}'\Sigma^2u_{1j})^4\big],
\]
since B_j has rank one, so that tr[B_j^k] = (u_{1j}'Σ^2u_{1j})^k. Hence E[(u_{1i}'B_ju_{1i})^4]/p^4 = O(1).

Similarly,
\[
\sqrt{N_1p}\,(\hat{a}_{11}^{*} - a_1) = \frac{1}{\sqrt{N_1p}}\sum_{i=1}^{N_1}\big(u_{1i}'\Sigma u_{1i} - \mathrm{tr}[\Sigma]\big) = \sum_{i=1}^{N_1}\xi_{2i},
\]
where ξ_{2i} = (N_1p)^{-1/2}(u_{1i}'Σu_{1i} − tr[Σ]).
It can be checked that E(ξ_{2i} | ℑ_{i-1}) = E(ξ_{2i}) = 0 and
\[
E(\xi_{2i}^2\,|\,\mathfrak{F}_{i-1}) = \mathrm{Var}(\xi_{2i}) = \frac{1}{N_1p}\mathrm{Var}(u_{1i}'\Sigma u_{1i}) = \frac{1}{N_1p}\Big(K_4\sum_{j=1}^{p}\sigma_{jj}^2 + 2\,\mathrm{tr}[\Sigma^2]\Big).
\]
Hence, Σ_{i=1}^{N_1}E(ξ_{2i}^2 | ℑ_{i-1}) = p^{-1}[K_4Σ_{j=1}^p σ_jj^2 + 2tr[Σ^2]] < ∞. Similarly, it can be shown that Σ_{i=1}^{N_1}E(ξ_{2i}^4) → 0 as (N_1, p) → ∞. Thus, asymptotically, as (N_1, N_2, p) → ∞,
\[
\Omega^{-1/2}\binom{\hat{a}_{11} - a_1}{\hat{a}_{21} - a_2} \overset{d}{\to} N_2(0, I_2),
\]
where Ω = Ω^{1/2}Ω^{1/2} is given by
\[
\Omega = \begin{pmatrix} C_{11}^{(1)}/(N_1p) & C_{12}^{(1)}/(N_1p) \\ C_{12}^{(1)}/(N_1p) & C_{22}^{(1)}/N_1^2 \end{pmatrix}.
\]
Thus, the asymptotic distribution of â_{21}/â_{11}^2 is normal with mean a_{21}/a_{11}^2 and variance ν_1^2 given by
\[
\nu_1^2 = \big(-2a_2/a_1^3,\ 1/a_1^2\big)\,\Omega\,\big(-2a_2/a_1^3,\ 1/a_1^2\big)'
 = \frac{4a_2^2C_{11}^{(1)}}{a_1^6N_1p} - \frac{4a_2C_{12}^{(1)}}{a_1^5N_1p} + \frac{C_{22}^{(1)}}{a_1^4N_1^2}
 = \frac{1}{N_1^2a_1^4}\Big(C_{22}^{(1)} - \frac{4N_1a_2}{pa_1}C_{12}^{(1)} + \frac{4N_1a_2^2}{pa_1^2}C_{11}^{(1)}\Big).
\]

7. Moments of Quadratic Forms

In this section, we give the moments of quadratic forms, which are useful for evaluating the variances of â_1 and â_2. For proofs, see Srivastava (2009) and Srivastava and Kubokawa (2013).

Lemma 7.1 Let u = (u_1, ..., u_p)' be a p-dimensional random vector such that E(u) = 0, Cov(u) = I_p, E(u_i^4) = K_4 + 3, i = 1, ..., p, and E(u_i^au_j^bu_k^cu_ℓ^d) = E(u_i^a)E(u_j^b)E(u_k^c)E(u_ℓ^d), 0 ≤ a + b + c + d ≤ 4, for all distinct i, j, k, ℓ. Then, for any p × p symmetric matrices A = (a_{ij}) and B = (b_{ij}) of constants, we have
\[
E(u'Au)^2 = K_4\sum_{i=1}^{p}a_{ii}^2 + 2\,\mathrm{tr}[A^2] + (\mathrm{tr}[A])^2, \qquad
\mathrm{Var}(u'Au) = K_4\sum_{i=1}^{p}a_{ii}^2 + 2\,\mathrm{tr}[A^2],
\]
\[
E\big[u'Au\ u'Bu\big] = K_4\sum_{i=1}^{p}a_{ii}b_{ii} + 2\,\mathrm{tr}[AB] + \mathrm{tr}[A]\,\mathrm{tr}[B].
\]

Corollary 7.1 Let ū = N^{-1}Σ_{i=1}^N u_i, where u_1, ..., u_N are independently and identically distributed. Then
\[
\mathrm{Var}(\bar{u}'A\bar{u}) = \frac{K_4}{N^3}\sum_{i=1}^{p}a_{ii}^2 + \frac{2}{N^2}\mathrm{tr}[A^2].
\]

Lemma 7.2 Let u and v be independently and identically distributed random vectors with zero mean vector and covariance matrix I_p. Then, for any p × p symmetric matrix A = (a_{ij}),
\[
\mathrm{Var}\big[(u'Av)^2\big] = K_4^2\sum_{i,j}^{p}a_{ij}^4 + 6K_4\sum_{i=1}^{p}\{(A^2)_{ii}\}^2 + 6\,\mathrm{tr}[A^4] + 2(\mathrm{tr}[A^2])^2.
\]
Note that for any symmetric matrix A = (a_1, ..., a_p) = A', we have (A^2)_{ii} = a_i'a_i = Σ_{j=1}^p a_{ij}^2, and Σ_{i=1}^p{(A^2)_{ii}}^2 = Σ_{i=1}^p(Σ_{j=1}^p a_{ij}^2)^2 = Σ_{i,j,k}^p a_{ij}^2a_{ik}^2.
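The moment formulas of this section are easy to check numerically. A minimal Monte Carlo sketch of ours for the variance formula in Lemma 7.1, using standardized χ²_8 components (for which K_4 = 12/8):

```python
import numpy as np

def check_lemma_7_1(p=6, reps=200_000, seed=0):
    """Compare the empirical Var(u'Au) with K4 * sum_i a_ii^2 + 2 tr[A^2]."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((p, p)); A = (A + A.T) / 2.0    # symmetric A
    U = (rng.chisquare(8, (reps, p)) - 8.0) / 4.0           # mean 0, variance 1
    q = np.einsum("ni,ij,nj->n", U, A, U)                   # u' A u per draw
    K4 = 12.0 / 8.0                                         # E u^4 - 3 for standardized chi2_8
    theory = K4 * np.sum(np.diag(A) ** 2) + 2.0 * np.trace(A @ A)
    return q.var(), theory
```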
8. Concluding Remarks

In this paper, we have proposed a new estimator of p^{-1}tr[Σ^2] which is unbiased and consistent for a general class of distributions that includes the normal distribution. The computing time for this estimator is the same as for the one used in the literature under the normality assumption. Using this new estimator, we modified the tests proposed by Srivastava (2005) for testing the sphericity of the covariance matrix Σ and for testing Σ = I_p. The performance of these two modified tests is compared by simulation. It is shown that the attained significance levels (ASL) of the proposed tests are close to the specified level, while for the tests proposed under the normal distribution the ASL can be as high as 83.61% under the chi-square distribution with 2 degrees of freedom. Thus, the modified proposed tests are robust against departure from normality without losing power.

Acknowledgments. We thank the associate editor and the two reviewers for their kind suggestions, for pointing out a misprint, and for providing some recent references. The research of M.S. Srivastava was supported in part by NSERC of Canada, and that of T. Kubokawa by Grant-in-Aid for Scientific Research (21540114 and 23243039) from the Japan Society for the Promotion of Science.

References

Ahmad, M., Werner, C., and Brunner, E. (2008). Analysis of high dimensional repeated measures designs: the one sample case. Comput. Statist. Data Anal., 53, 416–427.
Bai, Z.D., and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statist. Sinica, 6, 311–329.
Baik, J., and Silverstein, J.W. (2006). Eigenvalues of large sample covariance matrices of spiked population models. J. Multivariate Anal., 97, 1382–1408.
Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. Royal Statist. Soc., B57, 289–300.
Benjamini, Y., and Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist., 29, 1165–1188.
Berthet, Q., and Rigollet, P. (2013). Optimal detection of sparse principal components in high dimension. Ann. Statist., 41, 1780–1815.
Cai, T.T., and Ma, Z. (2013). Optimal hypothesis testing for high dimensional covariance matrices. Bernoulli, 19, 2359–2388.
Cai, T., Liu, W., and Xia, J. (2013). Two-sample covariance matrix testing and support recovery in high-dimensional and sparse settings. J. Amer. Statist. Assoc., 108, 265–277.
Chen, S.X., Zhang, L.-X., and Zhong, P.-S. (2010). Tests for high-dimensional covariance matrices. J. Amer. Statist. Assoc., 105, 810–819.
Dempster, A.P. (1958). A high dimensional two-sample significance test. Ann. Math. Statist., 29, 995–1010.
Dempster, A.P. (1960). A significance test for the separation of two highly multivariate small samples. Biometrics, 16, 41–50.
Fan, J., Hall, P., and Yao, Q. (2007). To how many simultaneous hypothesis tests can normal, Student's t or bootstrap calibration be applied? J. Amer. Statist. Assoc., 102, 1282–1288.
Fraser, D.A.S. (1957). Nonparametric Methods in Statistics. Wiley, New York.
Hall, P., and Heyde, C.C. (1980). Martingale Limit Theory and Its Application. Academic Press, New York.
Hoeffding, W. (1948). A class of statistics with asymptotically normal distributions. Ann. Math. Statist., 19, 293–325.
Jiang, T. (2004). The asymptotic distributions of the largest entries of sample correlation matrices. Ann. Applied Probab., 14, 865–880.
John, S. (1971). Some optimal multivariate tests. Biometrika, 58, 123–127.
Kosorok, M., and Ma, S. (2007). Marginal asymptotics for the large p, small n paradigm: With applications to microarray data. Ann. Statist., 35, 1456–1486.
Ledoit, O., and Wolf, M. (2002). Some hypothesis tests for the covariance matrix when the dimension is large compared to the sample size. Ann. Statist., 30, 1081–1102.
Lee, A.J. (1990). U-Statistics: Theory and Practice. Marcel Dekker, New York.
Li, J., and Chen, S.X. (2012). Two sample tests for high-dimensional covariance matrices. Ann. Statist., 40, 908–940.
Nagao, H. (1973). On some test criteria for covariance matrix. Ann. Statist., 1, 700–709.
Onatski, A., Moreira, M.J., and Hallin, M. (2013). Asymptotic power of sphericity tests for high-dimensional data. Ann. Statist., 41, 1204–1231.
Rao, C.R. (2002). Linear Statistical Inference and Its Applications (paperback ed.). John Wiley & Sons, New York.
Schott, J.R. (2007). A test for the equality of covariance matrices when the dimension is large relative to the sample size. Comput. Statist. Data Anal., 51, 6535–6542.
Serfling, R.J. (1980). Approximation Theorems of Mathematical Statistics. Wiley, New York.
Shiryaev, A.N. (1984). Probability (2nd ed.). Springer-Verlag, New York.
Srivastava, M.S. (2005). Some tests concerning the covariance matrix in high-dimensional data. J. Japan Statist. Soc., 35, 251–272.
Srivastava, M.S. (2009). A test of the mean vector with fewer observations than the dimension under non-normality. J. Multivariate Anal., 100, 518–532.
Srivastava, M.S. (2010). Controlling the average false discovery in large-scale multiple testing. J. Statist. Res., 44, 85–102.
Srivastava, M.S., and Du, M. (2008). A test for the mean vector with fewer observations than the dimension. J. Multivariate Anal., 99, 386–402.
Srivastava, M.S., Katayama, S., and Kano, Y. (2013). A two sample test in high dimensional data. J. Multivariate Anal., 114, 349–358.
Srivastava, M.S., and Khatri, C.G. (1979). An Introduction to Multivariate Statistics. North-Holland, New York.
Srivastava, M.S., Kollo, T., and von Rosen, D. (2011). Some tests for the covariance matrix with fewer observations than the dimension under non-normality. J. Multivariate Anal., 102, 1090–1103.
Srivastava, M.S., and Kubokawa, T. (2013). Tests for multivariate analysis of variance in high dimension under non-normality. J. Multivariate Anal., 115, 204–216.
Wilks, S.S. (1946). Sample criteria for testing equality of means, equality of variances, and equality of covariances in a normal multivariate distribution. Ann. Math. Statist., 17, 257–281.
Yamada, T., and Srivastava, M.S. (2012). A test for the multivariate analysis of variance in high-dimension. Comm. Statist. Theory Methods, 41, 2602–2612.

Table 1: ASL and powers given in percentage (%) of the tests T_1s and T_1 for Problem (1), as well as the tests T_2s and T_2 for Problem (2), under Case 1, N(0,1)

               ASL in H                        Power in A
   p    N    T_1s   T_1    T_2s   T_2       T_1s    T_1    T_2s    T_2
  20   10    5.10   6.71   5.50   7.39     11.02  10.53   10.86  10.43
  20   20    5.18   6.38   6.02   7.04     20.98  18.69   19.76  17.89
  20   40    5.36   5.78   5.68   6.23     48.92  47.22   47.47  45.11
  20   60    5.37   5.79   5.97   6.35     76.25  74.91   74.06  72.94
  40   10    5.32   7.14   6.02   7.68     10.86  10.34   10.15   9.65
  40   20    5.51   6.01   5.88   6.43     19.59  18.88   19.28  18.29
  40   40    4.85   5.23   5.00   5.69     49.18  48.02   48.80  46.81
  40   60    5.27   5.33   5.69   5.74     76.00  75.00   74.29  74.24
  60   10    5.05   7.05   5.36   7.51     11.45   9.94   11.17  10.01
  60   20    5.12   5.97   5.52   6.36     20.73  20.23   19.83  19.39
  60   40    4.85   5.30   5.09   5.54     50.13  48.27   49.51  48.13
  60   60    5.05   5.20   5.23   5.44     76.77  75.65   76.12  75.27
 100   10    5.13   7.19   5.28   7.36     10.40   9.96   10.39   9.63
 100   20    5.08   6.08   5.29   6.26     19.99  18.60   19.88  18.29
 100   40    5.33   5.81   5.33   5.92     47.83  46.49   47.32  46.15
 100   60    4.70   5.15   4.70   5.25     77.43  75.93   77.01  75.48
 200   10    5.38   7.09   5.55   7.20     10.91  10.74   10.72  10.62
 200   20    5.08   6.17   5.27   6.24     21.07  18.86   20.52  18.69
 200   40    4.66   5.21   4.76   5.32     50.66  48.19   49.96  47.86
 200   60    5.00   5.39   5.01   5.46     77.39  75.61   77.18  75.75

Table 2: ASL and powers given in percentage (%) of the tests T_1s and T_1 for Problem (1), as well as the tests T_2s and T_2 for Problem (2), under Case 2, χ²_32

               ASL in H                        Power in A
   p    N    T_1s   T_1    T_2s   T_2       T_1s    T_1    T_2s    T_2
  20   10    6.50   6.41   7.37   7.19     10.82  10.89   10.50   9.85
  20   20    7.43   6.72   8.33   7.40     20.51  18.56   18.63  17.75
  20   40    7.50   6.10   8.37   6.90     47.54  45.89   44.61  44.10
  20   60    7.60   5.74   8.49   6.45     74.22  74.09   71.90  72.41
  40   10    7.19   7.19   7.94   7.56     11.17  10.43   10.50   9.78
  40   20    7.81   6.60   8.40   6.95     19.20  18.09   18.24  18.04
  40   40    7.48   5.85   8.16   6.39     47.25  45.83   45.22  43.98
  40   60    7.53   5.73   7.96   6.00     74.05  73.03   72.40  72.07
  60   10    6.85   7.18   7.27   7.37     11.40  10.38   11.40  10.24
  60   20    7.54   6.12   7.90   6.42     19.89  18.79   19.16  18.69
  60   40    7.72   5.80   7.91   5.84     47.34  46.42   46.02  45.57
  60   60    7.38   5.42   7.65   5.71     75.40  75.11   74.46  74.22
 100   10    6.82   6.63   6.96   6.98     12.21  11.33   11.95  10.86
 100   20    6.93   5.88   7.07   6.11     21.77  20.36   21.71  20.33
 100   40    7.49   5.88   7.68   6.05     46.88  45.52   46.03  44.86
 100   60    7.17   5.39   7.35   5.56     75.89  75.74   75.26  75.03
 200   10    6.89   7.07   7.15   7.23     11.48  10.64   11.45  10.70
 200   20    7.47   6.30   7.39   6.33     19.44  18.93   19.61  18.76
 200   40    7.46   5.88   7.50   5.94     46.13  45.32   45.95  45.44
 200   60    7.21   5.32   7.31   5.30     77.05  76.27   76.89  76.19
Table 3: ASL and powers given in percentage (%) of the tests T_1s and T_1 for Problem (1), as well as the tests T_2s and T_2 for Problem (2), under Case 3, χ²_8

               ASL in H                        Power in A
   p    N    T_1s   T_1    T_2s   T_2       T_1s    T_1    T_2s    T_2
  20   10   12.82   6.89  14.84   7.91     10.78  11.26    9.79  10.67
  20   20   15.03   6.72  17.22   7.75     19.25  18.58   16.47  17.15
  20   40   17.87   6.98  19.62   8.02     42.33  42.21   36.66  40.35
  20   60   17.86   6.68  19.98   7.72     66.10  68.09   61.49  65.93
  40   10   14.38   7.14  15.74   7.46     10.41  10.50   10.27  10.08
  40   20   17.37   6.98  18.28   7.30     18.25  18.22   17.43  17.77
  40   40   18.07   6.57  19.48   7.22     43.15  42.81   40.59  41.32
  40   60   18.55   6.36  19.61   6.83     69.05  69.60   66.57  68.01
  60   10   14.63   7.55  15.76   7.74      9.76   9.74    9.60   9.74
  60   20   16.71   6.33  17.20   6.86     18.12  18.80   17.39  18.59
  60   40   18.04   6.21  18.86   6.56     44.34  44.09   42.93  43.56
  60   60   18.50   5.89  19.20   6.09     72.60  73.01   71.32  72.41
 100   10   15.27   7.17  15.59   7.53     10.65  10.28   10.11  10.19
 100   20   16.69   5.73  17.08   5.89     19.77  19.72   18.85  19.86
 100   40   17.55   5.89  18.17   6.02     45.93  45.71   45.68  44.94
 100   60   18.25   5.86  18.68   6.09     74.29  73.42   73.24  73.27
 200   10   15.57   7.10  15.76   7.36      9.90  10.04    9.85  10.02
 200   20   16.68   6.01  16.93   6.07     20.91  19.72   20.44  19.79
 200   40   17.23   5.85  17.61   5.89     46.73  45.57   46.02  45.92
 200   60   18.09   5.22  18.22   5.48     75.52  75.41   74.91  75.04

Table 4: ASL and powers given in percentage (%) of the tests T_1s and T_1 for Problem (1), as well as the tests T_2s and T_2 for Problem (2), under Case 4, χ²_2

               ASL in H                        Power in A
   p    N    T_1s   T_1    T_2s   T_2       T_1s    T_1    T_2s    T_2
  20   10   42.16  10.24  42.93  11.31      8.19   8.51    7.41   8.53
  20   20   55.24  10.59  56.78  11.96     11.84  13.91    9.98  12.89
  20   40   63.40  11.00  65.79  12.94     22.42  28.02   16.68  26.08
  20   60   66.69  11.01  69.74  13.27     37.48  45.73   27.85  43.16
  40   10   50.09   9.19  49.20   9.60      8.76  10.30    8.33  10.32
  40   20   61.01   9.37  61.32  10.48     13.03  15.36   11.25  14.56
  40   40   71.25   9.22  72.11  10.36     25.41  32.21   20.78  31.30
  40   60   73.99   9.02  75.36  10.44     44.74  54.76   36.57  52.50
  60   10   51.85   9.30  51.20   9.34      8.57   9.43    8.51   8.81
  60   20   64.54   8.67  64.22   9.09     12.84  16.41   11.39  15.99
  60   40   72.86   8.98  73.41   9.72     28.19  34.88   23.34  33.96
  60   60   76.84   8.35  77.50   9.27     49.91  59.88   42.20  58.27
 100   10   53.80   8.44  53.48   8.64      9.26  10.05    9.18  10.26
 100   20   67.71   7.99  67.35   8.63     13.54  16.74   12.54  16.46
 100   40   75.83   7.76  75.94   8.37     31.83  39.01   27.61  37.99
 100   60   79.22   7.42  79.63   7.81     54.03  63.45   49.08  62.65
 200   10   56.32   7.82  55.93   8.11      8.82   9.86    8.84   9.74
 200   20   69.45   7.12  69.67   7.24     14.26  17.59   13.51  17.93
 200   40   78.90   6.50  78.91   6.67     33.76  41.56   31.50  41.35
 200   60   81.15   6.17  81.18   6.32     61.95  70.88   59.13  70.42

Table 5: Critical values, ASL and powers given in percentage (%) of the tests T_Sc and T_3 for Problem (3)

                     Critical Value        ASL in H        Power in A
   N     p           T_Sc      T_3       T_Sc    T_3      T_Sc    T_3
 Case 1: N(0,1)
  20    40           1.4359   1.4740     2.64    2.99    25.96   25.32
  40    80           1.5489   1.5919     3.92    4.29    69.48   68.50
  60   120           1.6208   1.6501     4.70    5.07    92.46   92.02
  80   200           1.5983   1.6186     4.60    4.74    99.30   99.24
 Case 2: χ²_32
  20    40           1.5380   1.5127     3.46    3.49    24.98   24.54
  40    80           1.6945   1.6178     5.65    4.75    68.74   68.58
  60   120           1.7807   1.6894     6.54    5.48    91.92   91.98
  80   200           1.7424   1.6257     6.07    4.80    99.04   99.04
 Case 3: χ²_8
  20    40           1.8263   1.5557     7.98    3.83    20.04   23.10
  40    80           2.0472   1.6527    11.48    5.08    64.74   67.54
  60   120           2.1271   1.6485    13.13    5.07    89.26   91.08
  80   200           2.1575   1.6757    12.86    5.32    98.48   98.78
 Case 4: χ²_2
  20    40           2.6124   1.7723    31.78    6.46    11.64   19.58
  40    80           3.4193   1.8795    51.70    7.21    43.84   60.76
  60   120           3.6721   1.9193    59.15    7.95    78.16   88.58
  80   200           3.6834   1.7761    62.21    6.48    96.42   98.72