
Eigenvalues and Condition Numbers of Complex Random Matrices

T. Ratnarajah R. Vaillancourt M. Alvo

CRM-3022 April 2004

This work was partially supported by the Natural Sciences and Engineering Research Council of Canada and the Centre de recherches mathématiques of the Université de Montréal.
Department of Mathematics and Statistics, University of Ottawa, 585 King Edward Ave., Ottawa ON K1N 6N5, Canada.
T. Ratnarajah is now with ECIT, Queen's University of Belfast, Belfast BT7 1NN, Northern Ireland, UK ([email protected]).

Abstract

In this paper, the distributions of the largest and smallest eigenvalues of complex Wishart matrices and the condition number of complex Gaussian random matrices are derived. These distributions are represented by complex hypergeometric functions of matrix arguments, which can be expressed in terms of complex zonal polynomials. Several results on complex hypergeometric functions and complex zonal polynomials are derived and used to evaluate these distributions. Finally, applications of these distributions in numerical analysis and statistical hypothesis testing are mentioned.

Keywords. complex random matrix, complex Wishart matrix, complex zonal polynomials, eigenvalue distribution, condition number distribution

AMS Mathematics Subject Classification. 15A52, 60E05, 62H10, 65F15

To appear in SIAM J. Matrix Anal. Applic.

Résumé

We derive the distribution of the largest and of the smallest eigenvalue of complex Wishart matrices and obtain the condition number of complex Gaussian random matrices. These distributions are represented by means of complex hypergeometric functions and of complex zonal polynomials; several results are derived for these functions and polynomials, with which the above distributions are evaluated. Applications of these distributions to numerical analysis and to statistical hypothesis testing are mentioned.

1 Introduction

In this work, we investigate the distributions of the eigenvalues and condition number of complex random matrices and their applications to numerical analysis. In contrast to the literature in [3], we consider random matrices whose elements are complex Gaussian distributed with zero mean and arbitrary covariance matrices. This will enable us to consider the beautiful but difficult theory of complex zonal polynomials (also called Schur polynomials [10]), which are symmetric polynomials in the eigenvalues of a complex matrix [12]. Complex zonal polynomials enable us to represent the distributions of the eigenvalues of these complex random matrices as infinite series. In statistics, random eigenvalues are used in hypothesis testing, principal component analysis, canonical correlation analysis, multiple discriminant analysis, etc. (see [12]). In nuclear physics, random eigenvalues are used to model nuclear energy levels and level spacings [11]. Moreover, the zeros of the Riemann zeta function are modeled using random eigenvalues [11].

Let an $n \times m$ complex Gaussian random matrix $A$ be distributed as $A \sim CN(0, I_n \otimes \Sigma)$ with mean $E\{A\} = 0$ and covariance $\operatorname{cov}\{A\} = I_n \otimes \Sigma$. Then the matrix $W = A^H A$ is called the complex central Wishart matrix and its distribution is denoted by $CW_m(n, \Sigma)$. The condition number, $\operatorname{cond}(A)$, of a matrix $A$ is defined as the positive square root of the ratio of the largest to the smallest eigenvalue of the positive definite Hermitian matrix $W = A^H A$. Thus

$$\operatorname{cond}(A) = \sqrt{\lambda_{\max}/\lambda_{\min}} = \|A\|_2\, \|A^{-1}\|_2, \qquad \operatorname{cond}(W) = \operatorname{cond}(A)^2, \qquad (1)$$

where the 2-norms of the matrix $A$ and of the vector $x$ are

$$\|A\|_2 = \sup_{x \neq 0} \|Ax\|_2 / \|x\|_2 \qquad \text{and} \qquad \|x\|_2 = \left( |x_1|^2 + |x_2|^2 + \cdots + |x_n|^2 \right)^{1/2},$$

respectively. We assume that the eigenvalues of $W$ are ordered in strictly decreasing order, $\lambda_{\max} = \lambda_1 > \cdots > \lambda_m = \lambda_{\min} > 0$, since the probability that any eigenvalues of $W$ are equal is zero.

The condition number of a random matrix gives valuable information on the convergence rate of iterative methods in optimization algorithms and on the reliability of the solutions of linear systems of equations. The distributions of $\lambda_{\max}$ and $\lambda_{\min}$ and the condition number density of random matrices are studied in [3] (and references given therein) for $\Sigma = I$. The singular value distribution of Gaussian random matrices is given in [13] for $\Sigma = I$. Note that the singular values of a complex Gaussian random matrix $A$ are equal to the square roots of the eigenvalues of the complex Wishart matrix $W = A^H A$. The asymptotic distribution of the largest eigenvalue of a complex Wishart matrix is given in [6] when $m$ and $n$ are large and $\Sigma = I$. In [7], the largest and smallest eigenvalue distributions of a complex Wishart matrix are studied for $\Sigma = \sigma^2 I$. Here, we derive the distributions of the largest and smallest eigenvalues of complex Wishart matrices and the condition number density of complex random matrices for arbitrary $\Sigma$. Applications of these distributions are also given.

This paper is organized as follows. Section 2 provides the necessary tools for deriving the eigenvalue and condition number distributions of complex central Wishart matrices. Complex central Wishart matrices are studied in Section 3, where their largest and smallest eigenvalue distributions are derived. The condition number density is derived in Section 4 and a numerical example is given.
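As a quick illustration of these definitions, the following sketch (ours, not from the paper; the illustrative choice is $\Sigma = I$) samples a complex Gaussian matrix and checks numerically that $\operatorname{cond}(A) = \sqrt{\lambda_{\max}/\lambda_{\min}}$ agrees with the 2-norm condition number, and that $\operatorname{cond}(W) = \operatorname{cond}(A)^2$.

```python
# Sketch: verify cond(A) = sqrt(lambda_max/lambda_min) and cond(W) = cond(A)^2
# for a complex Gaussian matrix with Sigma = I (illustrative choice).
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 3

# n x m matrix with i.i.d. CN(0, 1) entries (unit-variance complex Gaussian).
A = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)

W = A.conj().T @ A                 # complex central Wishart matrix
lam = np.linalg.eigvalsh(W)        # real eigenvalues, ascending order
cond_A = np.sqrt(lam[-1] / lam[0])

print(cond_A, np.linalg.cond(A))   # matches the 2-norm condition number of A
print(lam[-1] / lam[0], cond_A**2) # cond(W) = cond(A)^2
```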

2 Preliminaries

In this section, we derive several results on complex hypergeometric functions and complex zonal polynomials that will be used to evaluate the subsequent distributions. First, we define the multivariate hypergeometric coefficients $[a]_\kappa^{(\alpha)}$, which frequently occur in integrals involving zonal polynomials. Let $\kappa = (k_1, \ldots, k_m)$ be a partition of the integer $k$ with $k_1 \geq \cdots \geq k_m \geq 0$ and $k = k_1 + \cdots + k_m$. Then [1]

$$[a]_\kappa^{(\alpha)} = \prod_{i=1}^m \left( a - \frac{i-1}{\alpha} \right)_{k_i},$$

where $(a)_k = a(a+1)\cdots(a+k-1)$ and $\alpha = 1$ for complex and $\alpha = 2$ for real multivariate hypergeometric coefficients, respectively. In this paper we only consider the complex case; therefore, for notational simplicity, we drop the superscript [8], i.e.,

$$[a]_\kappa := [a]_\kappa^{(1)} = \prod_{i=1}^m (a - i + 1)_{k_i} = \frac{\mathcal{C}\Gamma_m(a, \kappa)}{\mathcal{C}\Gamma_m(a)},$$

where

$$\mathcal{C}\Gamma_m(a, \kappa) = \pi^{m(m-1)/2} \prod_{i=1}^m \Gamma(a + k_i - i + 1), \qquad \Re(a) > (m-1),$$

and $\mathcal{C}\Gamma_m(a)$ denotes the complex multivariate gamma function

$$\mathcal{C}\Gamma_m(a) = \pi^{m(m-1)/2} \prod_{k=1}^m \Gamma(a - k + 1), \qquad \Re(a) > (m-1).$$

Moreover,

$$\mathcal{C}\Gamma_m(a, -\kappa) = \pi^{m(m-1)/2} \prod_{i=1}^m \Gamma(a - m - k_i + i), \qquad \Re(a) > (m-1) + k_1.$$
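These coefficients are straightforward to compute. The sketch below (the helper names are ours, not the paper's) implements $(a)_k$, $[a]_\kappa$ and $\mathcal{C}\Gamma_m(a, \kappa)$, and checks the identity $[a]_\kappa = \mathcal{C}\Gamma_m(a, \kappa)/\mathcal{C}\Gamma_m(a)$.

```python
# Sketch: the Pochhammer symbol (a)_k, the complex multivariate coefficient
# [a]_kappa, and CGamma_m(a, kappa); helper names are ours.
import math

def poch(a, k):
    """Rising factorial (a)_k = a (a+1) ... (a+k-1)."""
    out = 1.0
    for j in range(k):
        out *= a + j
    return out

def coeff(a, kappa):
    """[a]_kappa = prod_{i=1}^m (a - i + 1)_{k_i} (complex case, alpha = 1)."""
    return math.prod(poch(a - i, k_i) for i, k_i in enumerate(kappa))

def cgamma(a, m, kappa=None):
    """CGamma_m(a, kappa) = pi^{m(m-1)/2} prod_{i=1}^m Gamma(a + k_i - i + 1).

    The loop index i below is 0-based, so the gamma argument is a + k_i - i.
    """
    kappa = kappa if kappa is not None else [0] * m
    return math.pi ** (m * (m - 1) / 2) * math.prod(
        math.gamma(a + kappa[i] - i) for i in range(m))

a, m, kappa = 5.0, 3, [3, 1, 0]
print(coeff(a, kappa))                     # 840.0
print(cgamma(a, m, kappa) / cgamma(a, m))  # same value: [a]_kappa
```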

The complex zonal polynomials (also called Schur polynomials [10]) of a complex matrix $X$ are defined in [5] by

$$C_\kappa(X) = \chi_{[\kappa]}(1)\, [\kappa](X), \qquad (2)$$

where $\chi_{[\kappa]}(1)$ is the dimension of the representation $[\kappa]$ of the symmetric group, given by

$$\chi_{[\kappa]}(1) = k!\, \frac{\prod_{i<j}^m (k_i - k_j - i + j)}{\prod_{i=1}^m (k_i + m - i)!}, \qquad (3)$$

and $[\kappa](X)$ is the character of the representation $[\kappa]$ of the linear group, given as a symmetric function of the eigenvalues $\lambda_1, \ldots, \lambda_m$ of $X$ by

$$[\kappa](X) = \frac{\det\left( \lambda_i^{k_j + m - j} \right)}{\det\left( \lambda_i^{m - j} \right)}. \qquad (4)$$

Note that both the real and complex zonal polynomials are particular cases of the Jack polynomials $C_\kappa^{(\alpha)}(X)$ for general $\alpha$; see [1] and [15] for details. Again, $\alpha = 1$ for complex and $\alpha = 2$ for real zonal polynomials, respectively. For the same reason as before, we shall drop the superscript of the Jack polynomials, as we did in equation (2), i.e., $C_\kappa(X) := C_\kappa^{(1)}(X)$. The following basic properties are given in [5]:

$$(\operatorname{tr} X)^k = \sum_\kappa C_\kappa(X) \qquad \text{and} \qquad \int_{U(m)} C_\kappa(A X B X^H)\,(dX) = \frac{C_\kappa(A)\, C_\kappa(B)}{C_\kappa(I_m)}, \qquad (5)$$

where $(dX)$ is the invariant measure on the unitary group $U(m)$, normalized to make the total measure unity, and

$$C_\kappa(I_m) = 2^{2k}\, k! \left[ \tfrac{1}{2} m \right]_\kappa^{(2)} \frac{\prod_{i<j}^r (2k_i - 2k_j - i + j)}{\prod_{i=1}^r (2k_i + r - i)!},$$

where

$$\left[ \tfrac{1}{2} m \right]_\kappa^{(2)} = \prod_{i=1}^r \left( \tfrac{1}{2}(m - i + 1) \right)_{k_i}.$$

Note that the partition $\kappa$ of $k$ has $r$ nonzero parts.
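Equations (2)–(4) and the first property in (5) can be checked numerically for small cases. The sketch below (our implementation of (2)–(4); all function names are ours) computes $\chi_{[\kappa]}(1)$ and the Schur polynomial $[\kappa](X)$ from the eigenvalues of $X$, and verifies $(\operatorname{tr} X)^k = \sum_\kappa C_\kappa(X)$.

```python
# Sketch: complex zonal polynomials C_kappa(X) = chi_[kappa](1) * [kappa](X)
# via eqs. (2)-(4), and a check of (tr X)^k = sum_kappa C_kappa(X) from (5).
import math
import numpy as np

def partitions(k, max_parts):
    """Partitions of k into at most max_parts parts (non-increasing tuples)."""
    def gen(rem, max_part, parts):
        if rem == 0:
            yield tuple(parts)
            return
        if len(parts) == max_parts:
            return
        for p in range(min(rem, max_part), 0, -1):
            yield from gen(rem - p, p, parts + [p])
    return gen(k, k, [])

def chi_dim(kappa, k, m):
    """chi_[kappa](1), eq. (3), with kappa padded to m parts (0-based indices)."""
    kp = list(kappa) + [0] * (m - len(kappa))
    num = math.prod(kp[i] - kp[j] - i + j
                    for i in range(m) for j in range(i + 1, m))
    den = math.prod(math.factorial(kp[i] + m - i - 1) for i in range(m))
    return math.factorial(k) * num // den

def schur(kappa, lam):
    """[kappa](X), eq. (4): bialternant ratio in the eigenvalues lam of X."""
    m = len(lam)
    kp = list(kappa) + [0] * (m - len(kappa))
    num = np.linalg.det(np.array([[x ** (kp[j] + m - j - 1) for j in range(m)]
                                  for x in lam]))
    den = np.linalg.det(np.array([[x ** (m - j - 1) for j in range(m)]
                                  for x in lam]))
    return num / den

m, k = 2, 3
lam = [0.7, 1.9]                  # distinct eigenvalues of some 2 x 2 matrix X
total = sum(chi_dim(p, k, m) * schur(p, lam) for p in partitions(k, m))
print(total, sum(lam) ** k)       # both equal (tr X)^3
```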

The probability distributions of random matrices are often derived in terms of hypergeometric functions of matrix arguments. The following definitions of hypergeometric functions with a single and a double matrix argument are due to Constantine [2] and Baker [1].

Definition 1. The hypergeometric function of one complex matrix is defined as

$${}_pF_q^{(\alpha)}(a_1, \ldots, a_p; b_1, \ldots, b_q; X) = \sum_{k=0}^\infty \sum_\kappa \frac{[a_1]_\kappa^{(\alpha)} \cdots [a_p]_\kappa^{(\alpha)}}{[b_1]_\kappa^{(\alpha)} \cdots [b_q]_\kappa^{(\alpha)}} \frac{C_\kappa^{(\alpha)}(X)}{k!}, \qquad (6)$$

where $X \in \mathbb{C}^{m \times m}$ and $\{a_i\}_{i=1}^p$ and $\{b_i\}_{i=1}^q$ are arbitrary complex numbers. Note that $\sum_\kappa$ denotes summation over all partitions $\kappa$ of $k$, and $\alpha = 1$ and $2$ for complex and real hypergeometric functions, respectively.

In this paper we consider only the complex case, and hence we shall drop the superscript, i.e., ${}_pF_q := {}_pF_q^{(1)}$. Note that none of the parameters $b_i$ is allowed to be zero or an integer or half-integer $\leq m - 1$; otherwise some of the terms in the denominator will be zero [12].

Remark 1. The convergence of (6) is as follows [12]: (i) If $p \leq q$, then the series converges for all $X$. (ii) If $p = q + 1$, then the series converges for $\rho(X) < 1$, where the spectral radius $\rho(X)$ of $X$ is the maximum of the absolute values of the eigenvalues of $X$. (iii) If $p > q + 1$, then the series diverges for all $X \neq 0$, unless it terminates. Note that the series terminates when some of the numerator coefficients $[a_j]_\kappa$ in the series vanish. Special cases are

$${}_0F_0(X) = \operatorname{etr}(X), \qquad {}_1F_0(a; X) = \det(I - X)^{-a},$$

and

$${}_0F_1\left(n; Z Z^H\right) = \int_{U(n)} \operatorname{etr}\left(ZE + \overline{ZE}\right)(dE),$$

where $Z$ is an $m \times n$ complex matrix with $m \leq n$, $\operatorname{etr}$ denotes the exponential of the trace, $\operatorname{etr}(\cdot) = \exp(\operatorname{tr}(\cdot))$, and $\overline{ZE}$ denotes the complex conjugate of $ZE$.

Definition 2. The complex hypergeometric function of two complex matrices is defined by

$${}_pF_q(a_1, \ldots, a_p; b_1, \ldots, b_q; X, Y) = \sum_{k=0}^\infty \sum_\kappa \frac{[a_1]_\kappa \cdots [a_p]_\kappa}{[b_1]_\kappa \cdots [b_q]_\kappa} \frac{C_\kappa(X)\, C_\kappa(Y)}{k!\, C_\kappa(I_m)}, \qquad (7)$$

where $X, Y \in \mathbb{C}^{m \times m}$. The splitting formula is

$$\int_{U(m)} {}_pF_q\left(A E B E^H\right)(dE) = {}_pF_q(A, B).$$

The following propositions and corollaries are required in the sequel.

Proposition 1. If $Y$ and $Z$ are $m \times m$ Hermitian matrices with $\Re(Z) > 0$, then

$$\int_{X^H = X > 0} \operatorname{etr}(-XZ)\,(\det X)^{a-m}\, C_\kappa(XY)\,(dX) = \mathcal{C}\Gamma_m(a, \kappa)\,(\det Z)^{-a}\, C_\kappa\left(Y Z^{-1}\right), \qquad (8)$$

where $\Re(a) > (m-1)$, and

$$\int_{X^H = X > 0} \operatorname{etr}(-XZ)\,(\det X)^{a-m}\, C_\kappa\left(X^{-1} Y\right)(dX) = \mathcal{C}\Gamma_m(a, -\kappa)\,(\det Z)^{-a}\, C_\kappa(Y Z), \qquad (9)$$

where $\Re(a) > (m-1) + k_1$.

Proof. Let $Z = I$ and let $f(Y)$ denote the left side of (8). Then

$$f\left(E Y E^H\right) = \int_{X^H = X > 0} \operatorname{etr}(-X)\,(\det X)^{a-m}\, C_\kappa\left(X E Y E^H\right)(dX), \qquad E \in U(m).$$

If $X = E W E^H$, then $(dX) = (dW)$ and $f(E Y E^H) = f(Y)$. This implies that $f$ is a symmetric function of $Y$. Moreover, $(dE)$ is the normalized invariant measure on the unitary group $U(m)$. Therefore we have

$$f(Y) = \int_{U(m)} f(Y)\,(dE) = \int_{X^H=X>0} \operatorname{etr}(-X)\,(\det X)^{a-m} \int_{U(m)} C_\kappa\left(X E Y E^H\right)(dE)(dX) = \int_{X^H=X>0} \operatorname{etr}(-X)\,(\det X)^{a-m}\, \frac{C_\kappa(X)\, C_\kappa(Y)}{C_\kappa(I_m)}\,(dX) = \frac{f(I_m)\, C_\kappa(Y)}{C_\kappa(I_m)}. \qquad (10)$$

On the one hand, from Definition 7.2.1 in [12] we have

$$f(Y) = \frac{f(I_m)}{C_\kappa(I_m)}\, d_\kappa\, y_1^{k_1} \cdots y_m^{k_m} + \text{terms of lower weight}. \qquad (11)$$

On the other hand, using Lemma 7.2.6 in [12] we have

$$f(Y) = \int_{X^H=X>0} \operatorname{etr}(-X)\,(\det X)^{a-m}\, C_\kappa(XY)\,(dX) = d_\kappa\, y_1^{k_1} \cdots y_m^{k_m} \int_{X^H=X>0} \operatorname{etr}(-X)\,(\det X)^{a-m}\, x_{11}^{k_1-k_2} \left( \det \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \end{pmatrix} \right)^{k_2-k_3} \cdots (\det X)^{k_m}\,(dX) + \text{terms of lower weight}.$$

Substituting $X = T^H T$ and evaluating this integral, we obtain

$$f(Y) = d_\kappa\, y_1^{k_1} \cdots y_m^{k_m}\, \mathcal{C}\Gamma_m(a, \kappa) + \text{terms of lower weight}. \qquad (12)$$

Equating the coefficients of $y_1^{k_1} \cdots y_m^{k_m}$ in (11) and (12) and using (10), we obtain

$$f(Y) = \mathcal{C}\Gamma_m(a, \kappa)\, C_\kappa(Y).$$

The rest of the proof, for general $Z$, is obtained by substituting $X = Z^{-1/2} V Z^{-1/2}$. Similarly, we can prove the second part.

The following corollary follows from the second part of Proposition 1 by letting $Y = I$.

Corollary 1. Let $Z$ be an $m \times m$ Hermitian matrix with $\Re(Z) > 0$. Then

$$\int_{X^H = X > 0} \operatorname{etr}(-XZ)\,(\det X)^{a-m}\, C_\kappa\left(X^{-1}\right)(dX) = \frac{(-1)^k\, \mathcal{C}\Gamma_m(a)}{[-a + m]_\kappa}\,(\det Z)^{-a}\, C_\kappa(Z) \qquad (13)$$

for $\Re(a) > k_1 + (m-1)$, where $\kappa = (k_1, \ldots, k_m)$.

Proof. The result follows by noting that

$$\mathcal{C}\Gamma_m(a, -\kappa) = \frac{(-1)^k\, \mathcal{C}\Gamma_m(a)}{[-a + m]_\kappa}.$$
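For $m = 1$ and $\kappa = (k)$, equation (8) reduces to a scalar gamma integral, since $C_\kappa(x) = x^k$ and $\mathcal{C}\Gamma_1(a, \kappa) = \Gamma(a + k)$. The following quick check (ours; it assumes SciPy is available) confirms it by numerical quadrature.

```python
# Sketch: m = 1 sanity check of eq. (8):
#   int_0^inf exp(-x z) x^{a-1} (x y)^k dx = Gamma(a + k) z^{-a} (y / z)^k.
import math
from scipy.integrate import quad

a, k, y, z = 2.5, 3, 0.8, 1.7
lhs, _ = quad(lambda x: math.exp(-x * z) * x ** (a - 1) * (x * y) ** k,
              0, math.inf)
rhs = math.gamma(a + k) * z ** (-a) * (y / z) ** k
print(lhs, rhs)   # agree to quadrature accuracy
```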

Proposition 2. Let $Y$ be an $m \times m$ Hermitian matrix. Then the following hold:

$$\int_{0 < X < I_m} (\det X)^{a-m}\, \det(I_m - X)^{b-m}\, C_\kappa(XY)\,(dX) = \frac{\mathcal{C}\Gamma_m(a, \kappa)\, \mathcal{C}\Gamma_m(b)}{\mathcal{C}\Gamma_m(a + b, \kappa)}\, C_\kappa(Y) \qquad (14)$$

for $\Re(a) > (m-1)$ and $\Re(b) > (m-1)$. Moreover,

$$\int_{0 < X < I_m} (\det X)^{a-m}\, \det(I_m - X)^{b-m}\, C_\kappa\left(X^{-1} Y\right)(dX) = \frac{\mathcal{C}\Gamma_m(a, -\kappa)\, \mathcal{C}\Gamma_m(b)}{\mathcal{C}\Gamma_m(a + b, -\kappa)}\, C_\kappa(Y) \qquad (15)$$

for $\Re(a) > (m-1) + k_1$ and $\Re(b) > (m-1)$.

Proof. As in the proof of Proposition 1, if $f(Y)$ denotes the left side of (14), then $f(Y) = f(E Y E^H)$ for all $E \in U(m)$, and $f(Y)\, C_\kappa(I_m) = f(I_m)\, C_\kappa(Y)$. Letting $Z = I$ and $Y = I$ in (8), with $a$ replaced by $a + b$, and using $f(W) = f(I_m)\, C_\kappa(W)/C_\kappa(I_m)$, we obtain

$$\mathcal{C}\Gamma_m(a+b, \kappa)\, f(I_m) = \int_{W^H = W > 0} \operatorname{etr}(-W)\,(\det W)^{a+b-m}\, f(W)\,(dW) = \int_{W^H=W>0} \operatorname{etr}(-W)\,(\det W)^{a+b-m} \int_{0<X<I_m} (\det X)^{a-m} \det(I_m - X)^{b-m}\, C_\kappa(WX)\,(dX)(dW).$$

Let $X = W^{-1/2} U W^{-1/2}$. Then $(dX) = (\det W)^{-m}\,(dU)$ and

$$\mathcal{C}\Gamma_m(a+b, \kappa)\, f(I_m) = \int_{W^H=W>0} \operatorname{etr}(-W) \int_{0<U<W} (\det U)^{a-m}\, \det(W - U)^{b-m}\, C_\kappa(U)\,(dU)(dW) = \int_{U^H=U>0} \operatorname{etr}(-U)\,(\det U)^{a-m}\, C_\kappa(U)\,(dU) \times \int_{V^H=V>0} \operatorname{etr}(-V)\,(\det V)^{b-m}\,(dV) \quad (\text{letting } V = W - U) = \mathcal{C}\Gamma_m(a, \kappa)\, C_\kappa(I_m)\, \mathcal{C}\Gamma_m(b).$$

This completes the proof, i.e.,

$$f(I_m) = \frac{\mathcal{C}\Gamma_m(a, \kappa)\, \mathcal{C}\Gamma_m(b)}{\mathcal{C}\Gamma_m(a + b, \kappa)}\, C_\kappa(I_m).$$

Similarly, we can prove the second part.

If $b = m$, then we have the following corollary.

Corollary 2. If $Y$ is an $m \times m$ Hermitian matrix, then

$$\int_{0<X<I_m} (\det X)^{a-m}\, C_\kappa(XY)\,(dX) = \frac{\mathcal{C}\Gamma_m(a)\, \mathcal{C}\Gamma_m(m)}{\mathcal{C}\Gamma_m(a + m)} \frac{[a]_\kappa}{[a + m]_\kappa}\, C_\kappa(Y) \qquad (16)$$

for $\Re(a) > (m-1)$.

Proof. The result follows by noting that $\mathcal{C}\Gamma_m(a, \kappa) = [a]_\kappa\, \mathcal{C}\Gamma_m(a)$.

3 The complex central Wishart matrix

In this section, we describe the complex central Wishart distribution and give the joint eigenvalue density of the complex central Wishart matrix. The largest and smallest eigenvalue distributions are derived in Subsections 3.1 and 3.2, respectively. These distributions are used in testing hypotheses on the structure of the covariance matrix $\Sigma$. The complex central Wishart distribution is defined as follows.

Definition 3. Let $W = A^H A$, where the $n \times m$ matrix $A$ is distributed as $A \sim CN(0, I_n \otimes \Sigma)$. Then $W$ is said to have the complex central Wishart distribution $W \sim CW_m(n, \Sigma)$ with $n$ degrees of freedom and covariance matrix $\Sigma$.

Let $W \sim CW_m(n, \Sigma)$ with $n \geq m$. Then the density of $W$ is given by [5]

$$f(W) = \frac{1}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n}\, \operatorname{etr}\left(-\Sigma^{-1} W\right)(\det W)^{n-m}. \qquad (17)$$

Moreover, $W$ is an $m \times m$ positive definite Hermitian matrix with real eigenvalues. The joint density of the eigenvalues $\lambda_1 > \cdots > \lambda_m > 0$ of $W$ is

$$f(\Lambda) = \frac{\pi^{m(m-1)}\,(\det \Sigma)^{-n}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \prod_{k=1}^m \lambda_k^{n-m} \prod_{k<l}^m (\lambda_k - \lambda_l)^2\; {}_0F_0\left(-\Lambda, \Sigma^{-1}\right), \qquad (18)$$

where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)$. If $W \sim CW_m(n, \sigma^2 I_m)$ with $n \geq m$, then the joint density of its eigenvalues is

$$f(\Lambda) = \frac{\pi^{m(m-1)}\,(\sigma^2)^{-nm}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \prod_{k=1}^m \lambda_k^{n-m} \prod_{k<l}^m (\lambda_k - \lambda_l)^2\, \exp\left(-\frac{1}{\sigma^2} \sum_{k=1}^m \lambda_k\right). \qquad (19)$$
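A complex central Wishart matrix with covariance $\Sigma$ is easy to simulate directly from Definition 3: draw $A$ with i.i.d. $CN(0, \Sigma)$ rows and form $W = A^H A$. The sketch below (our helper names and sampling convention) does this and returns the ordered eigenvalues; we use such simulations informally to cross-check the distributions derived next.

```python
# Sketch: sample W ~ CW_m(n, Sigma) as W = A^H A, with the rows of A
# i.i.d. CN(0, Sigma); helper names are ours.
import numpy as np

def sample_wishart_eigs(n, sigma, rng):
    m = sigma.shape[0]
    g = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    c = np.linalg.cholesky(sigma)      # Sigma = C C^H, C lower triangular
    a = g @ c.T                        # each row is C times a CN(0, I) vector
    w = a.conj().T @ a                 # W ~ CW_m(n, Sigma)
    return np.sort(np.linalg.eigvalsh(w))[::-1]   # lambda_1 > ... > lambda_m

rng = np.random.default_rng(1)
sigma = np.array([[1.0, 0.25 + 0.25j], [0.25 - 0.25j, 1.0]])
print(sample_wishart_eigs(10, sigma, rng))
```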

3.1 Distribution of λ_max

In this subsection, we derive the distribution of the largest eigenvalue, $\lambda_{\max}$, of a complex central Wishart matrix and apply it to hypothesis testing. The following theorem is needed.

Theorem 1. Let $W \sim CW_m(n, \Sigma)$ $(n \geq m)$ and let $\Omega$ be an $m \times m$ positive definite matrix. Then the probability $P(W < \Omega)$ is given by

$$P(W < \Omega) = \frac{\mathcal{C}\Gamma_m(m)\,(\det \Omega)^n}{\mathcal{C}\Gamma_m(n + m)\,(\det \Sigma)^n}\; {}_1F_1\left(n; n + m; -\Omega \Sigma^{-1}\right), \qquad (20)$$

where

$${}_1F_1(a; b; X) = \sum_{k=0}^\infty \sum_\kappa \frac{[a]_\kappa}{[b]_\kappa} \frac{C_\kappa(X)}{k!}.$$

Proof. Using the Wishart density (17) we can write $P(W < \Omega)$ as

$$P(W < \Omega) = \frac{1}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n} \int_{0 < W < \Omega} \operatorname{etr}\left(-\Sigma^{-1} W\right)(\det W)^{n-m}\,(dW).$$

The change of variable $W = \Omega^{1/2} X \Omega^{1/2}$ leads to the differential form $(dW) = (\det \Omega)^m\,(dX)$. Hence,

$$P(W < \Omega) = \frac{(\det \Omega)^n}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n} \int_{0<X<I} \operatorname{etr}\left(-\Omega^{1/2} \Sigma^{-1} \Omega^{1/2} X\right)(\det X)^{n-m}\,(dX) = \frac{(\det \Omega)^n}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n} \sum_{k=0}^\infty \sum_\kappa \frac{1}{k!} \int_{0<X<I} (\det X)^{n-m}\, C_\kappa\left(-\Omega^{1/2} \Sigma^{-1} \Omega^{1/2} X\right)(dX) = \frac{\mathcal{C}\Gamma_m(m)\,(\det \Omega)^n}{\mathcal{C}\Gamma_m(n+m)\,(\det \Sigma)^n} \sum_{k=0}^\infty \sum_\kappa \frac{[n]_\kappa}{[n+m]_\kappa} \frac{C_\kappa\left(-\Omega \Sigma^{-1}\right)}{k!} = \frac{\mathcal{C}\Gamma_m(m)\,(\det \Omega)^n}{\mathcal{C}\Gamma_m(n+m)\,(\det \Sigma)^n}\; {}_1F_1\left(n; n+m; -\Omega \Sigma^{-1}\right).$$

Note that Corollary 2 is used in this proof.

The following corollary follows from Theorem 1.

Corollary 3. Let $W \sim CW_m(n, \Sigma)$ $(n \geq m)$. If $\lambda_{\max}$ is the largest eigenvalue of $W$, then its distribution is given by

$$P(\lambda_{\max} < x) = \frac{\mathcal{C}\Gamma_m(m)\, x^{mn}}{\mathcal{C}\Gamma_m(n+m)\,(\det \Sigma)^n}\; {}_1F_1\left(n; n+m; -x\, \Sigma^{-1}\right). \qquad (21)$$

The density of $\lambda_{\max}$ is obtained by differentiating (21) with respect to $x$.

Proof. The inequality $\lambda_{\max} < x$ is equivalent to $W < xI$. Therefore, the result follows by letting $\Omega = xI$ in Theorem 1.

The distributional result in Corollary 3 can be used to test hypotheses about $\Sigma$ using statistics which are functions of $\lambda_{\max}$. For example, consider the null hypothesis $H_0: \Sigma = I_m$. A test of size $\alpha$ based on the largest eigenvalue $\lambda_{\max}$ rejects $H_0$ if $\lambda_{\max} > \lambda(\alpha, m, n)$, where $\lambda(\alpha, m, n)$ is the upper $100\alpha\%$ point of the distribution of $\lambda_{\max}$ when $\Sigma = I_m$, i.e., $P_{I_m}(\lambda_{\max} > \lambda(\alpha, m, n)) = \alpha$. The power function of this test is given by $\beta(\Sigma) = P_\Sigma(\lambda_{\max} > \lambda(\alpha, m, n))$, which depends on $\Sigma$ only through its eigenvalues. The percentage points and the power can be computed using the distribution function given in Corollary 3, or estimated by simulation, as sketched below.
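Since the series in (21) must in practice be truncated, a Monte Carlo estimate of the percentage point is a useful cross-check. A minimal sketch (ours) under $H_0: \Sigma = I_m$:

```python
# Sketch: Monte Carlo estimate of the upper 100*alpha% point lambda(alpha, m, n)
# of lambda_max under H0: Sigma = I_m.
import numpy as np

rng = np.random.default_rng(2)
m, n, alpha, trials = 3, 10, 0.05, 20000

lmax = np.empty(trials)
for t in range(trials):
    a = (rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))) / np.sqrt(2)
    lmax[t] = np.linalg.eigvalsh(a.conj().T @ a).max()

crit = np.quantile(lmax, 1 - alpha)   # estimate of lambda(alpha, m, n)
print(crit)                           # reject H0 if observed lambda_max > crit
```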

3.2 Distribution of λ_min

In this subsection, we derive the distribution of the smallest eigenvalue, $\lambda_{\min}$, of a complex central Wishart matrix and use it to test the structure of the covariance matrix $\Sigma$, as explained in the previous subsection. In addition, the distribution of $\lambda_{\min}$ is useful in principal component analysis, where it is of interest to find out how many eigenvalues of $\Sigma$ are significant. The following theorem is used to derive the distribution of $\lambda_{\min}$.

Theorem 2. Let $W \sim CW_m(n, \Sigma)$ $(n \geq m)$ and let $\Omega$ be an $m \times m$ positive definite matrix. Then the probability $P(W > \Omega)$ can be written as a finite series, i.e.,

$$P(W > \Omega) = \operatorname{etr}\left(-\Omega \Sigma^{-1}\right) \sum_{k=0}^{m(n-m)} {\sum_\kappa}' \frac{C_\kappa\left(\Omega \Sigma^{-1}\right)}{k!}, \qquad (22)$$

where ${\sum_\kappa}'$ denotes summation over the partitions $\kappa = (k_1, \ldots, k_m)$ of $k$ with $k_1 \leq n - m$.

Proof. Using the Wishart density (17) we can write the probability $P(W > \Omega)$ as

$$P(W > \Omega) = \frac{1}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n} \int_{W > \Omega} \operatorname{etr}\left(-\Sigma^{-1} W\right)(\det W)^{n-m}\,(dW). \qquad (23)$$

The change of variable $W = \Omega^{1/2}(I + X)\Omega^{1/2}$ leads to the differential form $(dW) = (\det \Omega)^m\,(dX)$. Hence,

$$P(W > \Omega) = \frac{\operatorname{etr}\left(-\Omega\Sigma^{-1}\right)(\det \Omega)^n}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n} \int_{X>0} \operatorname{etr}\left(-\Omega^{1/2} \Sigma^{-1} \Omega^{1/2} X\right)(\det X)^{n-m} \left(\det\left(I + X^{-1}\right)\right)^{n-m}(dX) = \frac{\operatorname{etr}\left(-\Omega\Sigma^{-1}\right)(\det \Omega)^n}{\mathcal{C}\Gamma_m(n)\,(\det \Sigma)^n} \sum_{k=0}^{m(n-m)} {\sum_\kappa}' \frac{[-(n-m)]_\kappa\,(-1)^k}{k!} \int_{X>0} \operatorname{etr}\left(-\Omega^{1/2}\Sigma^{-1}\Omega^{1/2} X\right)(\det X)^{n-m}\, C_\kappa\left(X^{-1}\right)(dX) = \operatorname{etr}\left(-\Omega\Sigma^{-1}\right) \sum_{k=0}^{m(n-m)} {\sum_\kappa}' \frac{C_\kappa\left(\Omega\Sigma^{-1}\right)}{k!}.$$

In this proof we have used the expansion

$$\left(\det\left(I + X^{-1}\right)\right)^{n-m} = {}_1F_0\left(-(n-m); -X^{-1}\right) = \sum_{k=0}^{m(n-m)} {\sum_\kappa}' \frac{[-(n-m)]_\kappa\, C_\kappa\left(X^{-1}\right)(-1)^k}{k!}$$

and Corollary 1. Note that if any part of $\kappa$ is greater than $(n - m)$, then $[-(n-m)]_\kappa = 0$; therefore, the series for ${}_1F_0$ reduces to a finite series.

The distribution of the smallest eigenvalue is given in the following corollary.

Corollary 4. Let $W \sim CW_m(n, \Sigma)$. If $\lambda_{\min}$ is the smallest eigenvalue of $W$, then

$$P(\lambda_{\min} > x) = \operatorname{etr}\left(-x\, \Sigma^{-1}\right) \sum_{k=0}^{m(n-m)} {\sum_\kappa}' \frac{C_\kappa\left(x\, \Sigma^{-1}\right)}{k!}, \qquad (24)$$

where ${\sum_\kappa}'$ denotes summation over the partitions $\kappa = (k_1, \ldots, k_m)$ of $k$ with $k_1 \leq n - m$. The density of $\lambda_{\min}$ is obtained by differentiating (24) with respect to $x$ and then changing the sign.

Proof. The inequality $\lambda_{\min} > x$ is equivalent to $W > xI$. Therefore, the result follows by letting $\Omega = xI$ in Theorem 2.

As a numerical example, we compute the smallest eigenvalue distribution of the complex central Wishart matrix for $m = 2$, $n = 10$ and

$$\Sigma = \begin{pmatrix} 1 & 0.25 + 0.25i \\ 0.25 - 0.25i & 1 \end{pmatrix}.$$

The distribution function is $F = P(\lambda_{\min} < x) = 1 - P(\lambda_{\min} > x)$, where $P(\lambda_{\min} > x)$ is given in (24). Figure 1 shows this distribution of $\lambda_{\min}$; a Monte Carlo cross-check is sketched after Figure 1.


Figure 1: The smallest eigenvalue distribution of the complex central Wishart matrix.
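The curve in Figure 1 can be cross-checked by simulation. A Monte Carlo sketch (ours) for the same $m$, $n$ and $\Sigma$:

```python
# Sketch: empirical F(x) = P(lambda_min < x) for m = 2, n = 10 and the Sigma
# above, to be compared with the series formula (24) and Figure 1.
import numpy as np

rng = np.random.default_rng(3)
sigma = np.array([[1.0, 0.25 + 0.25j], [0.25 - 0.25j, 1.0]])
c = np.linalg.cholesky(sigma)
n, trials = 10, 20000

lmin = np.empty(trials)
for t in range(trials):
    g = (rng.standard_normal((n, 2)) + 1j * rng.standard_normal((n, 2))) / np.sqrt(2)
    a = g @ c.T                        # rows i.i.d. CN(0, Sigma)
    lmin[t] = np.linalg.eigvalsh(a.conj().T @ a).min()

for x in (1, 2, 4, 6, 8, 10):
    print(x, np.mean(lmin < x))        # empirical F(x), cf. Figure 1
```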

4 Distribution of cond(A)

Many scientific problems lead to solving a random system of linear equations. The condition number distribution of this random matrix indicates how many digits of numerical precision are lost due to ill conditioning. In addition, if a random system is solved by an iterative technique, then the condition number distribution describes the speed of convergence of this iterative method (e.g., the conjugate gradient method). The condition number, $\operatorname{cond}(A)$, can also be defined (see [14] and [3]) as the smallest number such that

$$\frac{\|\delta x\|}{\|x\|} \leq \operatorname{cond}(A)\, \frac{\|\delta b\|}{\|b\|}$$

for all $x$ and $\delta x$ such that $Ax = b$ and $A(x + \delta x) = b + \delta b$. By taking the logarithm on both sides, we have

$$(\log \|\delta x\| - \log \|x\|) - (\log \|\delta b\| - \log \|b\|) \leq \log \operatorname{cond}(A).$$

This shows that the number of correct digits in $x$ can differ from the number of correct digits in $b$ by at most $\log \operatorname{cond}(A)$. In [14], the loss of precision is denoted by $\log \operatorname{cond}(A)$. Problems where $\operatorname{cond}(A)$ is large are referred to as ill conditioned, and such problems are characterized by very elongated elliptical level sets. Iterative methods converge slowly for these problems. These facts illustrate the importance of the condition number distribution for solving random systems.

If $A \sim CN(0, I_n \otimes \Sigma)$ or $A \sim CN(0, I_n \otimes \sigma^2 I)$, then the condition number distributions of $A$ and $W = A^H A$ are not available in the literature. We derive these distributions in the sequel. First, we derive the joint density $f(\lambda_{\max}, \lambda_{\min})$ of the extreme eigenvalues of the complex central Wishart matrix $W = A^H A$. This will enable us to compute the distribution of the condition number of the random matrix $A$. The following two lemmas are required in the sequel.
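Before turning to the lemmas, here is a small numerical illustration (ours, with arbitrarily chosen sizes) of the perturbation bound above: it solves a perturbed complex system and compares the relative error in $x$ with $\operatorname{cond}(A)$ times the relative perturbation in $b$, printing $\log_{10} \operatorname{cond}(A)$ as the number of decimal digits at risk.

```python
# Sketch: check ||dx||/||x|| <= cond(A) ||db||/||b|| on a random complex system.
import numpy as np

rng = np.random.default_rng(4)
m = 4
A = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
b = rng.standard_normal(m) + 1j * rng.standard_normal(m)
db = 1e-8 * (rng.standard_normal(m) + 1j * rng.standard_normal(m))

x = np.linalg.solve(A, b)
dx = np.linalg.solve(A, b + db) - x

lhs = np.linalg.norm(dx) / np.linalg.norm(x)
rhs = np.linalg.cond(A) * np.linalg.norm(db) / np.linalg.norm(b)
print(lhs <= rhs)                     # the perturbation bound holds
print(np.log10(np.linalg.cond(A)))   # decimal digits of precision at risk
```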

Lemma 1. Let $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_m)$ and $D_1 = \{1 > \lambda_1 > \cdots > \lambda_m > 0\}$. Then

$$\int_{D_1} (\det \Lambda)^{a-m}\, \det(I - \Lambda)^{b-m} \prod_{k<l}^m (\lambda_k - \lambda_l)^2\, C_\kappa(\Lambda) \prod_{k=1}^m d\lambda_k = \frac{\mathcal{C}\Gamma_m(m)}{\pi^{m(m-1)}} \frac{\mathcal{C}\Gamma_m(a, \kappa)\, \mathcal{C}\Gamma_m(b)}{\mathcal{C}\Gamma_m(a + b, \kappa)}\, C_\kappa(I). \qquad (25)$$

Proof. The result follows by letting $Y = I$ and $X = E \Lambda E^H$ in (14) and using the differential form

$$(dX) = \prod_{k<l}^m (\lambda_k - \lambda_l)^2\,(d\Lambda)\left(E^H dE\right) \qquad \text{with} \qquad \int_{U(m)} \left(E^H dE\right) = \frac{2^m \pi^{m^2}}{\mathcal{C}\Gamma_m(m)}.$$

We must then divide the left side of (14) by $(2\pi)^m$.
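For $m = 1$, (25) reduces to a beta integral: the left side is $\int_0^1 \lambda^{a-1}(1-\lambda)^{b-1}\lambda^k\, d\lambda$ and the right side is $\Gamma(a+k)\Gamma(b)/\Gamma(a+b+k)$. A quick numerical check (ours, assuming SciPy is available):

```python
# Sketch: m = 1 check of (25): beta integral vs. Gamma(a+k)Gamma(b)/Gamma(a+b+k).
import math
from scipy.integrate import quad

a, b, k = 2.3, 3.1, 2
lhs, _ = quad(lambda t: t ** (a - 1) * (1 - t) ** (b - 1) * t ** k, 0, 1)
rhs = math.gamma(a + k) * math.gamma(b) / math.gamma(a + b + k)
print(lhs, rhs)   # agree
```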

Lemma 2. Let $Z = \operatorname{diag}(\lambda_2, \ldots, \lambda_m)$, $Z_1 = \operatorname{diag}(1, \lambda_2, \ldots, \lambda_m)$ and $D_2 = \{1 > \lambda_2 > \cdots > \lambda_m > 0\}$. Then

$$\int_{D_2} (\det Z)^{a-m} \prod_{k=2}^m (1 - \lambda_k)^2 \prod_{2 \leq k < l}^m (\lambda_k - \lambda_l)^2\, C_\kappa(Z_1) \prod_{k=2}^m d\lambda_k = (ma + k)\, \frac{\mathcal{C}\Gamma_m(m)}{\pi^{m(m-1)}} \frac{\mathcal{C}\Gamma_m(a, \kappa)\, \mathcal{C}\Gamma_m(m)}{\mathcal{C}\Gamma_m(a + m, \kappa)}\, C_\kappa(I). \qquad (26)$$

Proof. Let $b = m$ in (25) and substitute $\lambda_k = \lambda_1 \mu_k$, $k = 2, \ldots, m$. Then the left side of (25) becomes

$$\int_0^1 \lambda_1^{ma+k-1}\, d\lambda_1 \int_{D_2} (\det Z)^{a-m} \prod_{k=2}^m (1 - \mu_k)^2 \prod_{2 \leq k < l}^m (\mu_k - \mu_l)^2\, C_\kappa(Z_1) \prod_{k=2}^m d\mu_k, \qquad (27)$$

where now $Z = \operatorname{diag}(\mu_2, \ldots, \mu_m)$ and $Z_1 = \operatorname{diag}(1, \mu_2, \ldots, \mu_m)$. The result follows by noting that $\int_0^1 \lambda_1^{ma+k-1}\, d\lambda_1 = 1/(ma + k)$.

The following theorem describes the joint density of the extreme eigenvalues of the central complex Wishart matrix.

Theorem 3. Let $W \sim CW_m(n, \Sigma)$. The joint density of $\lambda_1\ (= \lambda_{\max})$ and $\lambda_m\ (= \lambda_{\min})$ of $W$ is given by

$$f(\lambda_1, \lambda_m) = \frac{\pi^{m(m-1)}\,(\det \Sigma)^{-n}\, e^{-m\lambda_1}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \sum_{k=0}^\infty \sum_\kappa \frac{\lambda_1^{mn+k-1}}{k!} \frac{C_\kappa\left(\Sigma^{-1}\right)}{C_\kappa(I)} \sum_{t=0}^\infty \sum_{\tau, \phi} \frac{[m - n]_\tau\, g_{\kappa,\tau}^\phi}{t!} \left(1 - \frac{\lambda_m}{\lambda_1}\right)^{(m-1)(m+1)+t+k-1} \times \left[(m-1)(m+1) + k + t\right] \frac{\mathcal{C}\Gamma_{m-1}(m-1)}{\pi^{(m-1)(m-2)}} \frac{\mathcal{C}\Gamma_{m-1}(m+1, \phi)\, \mathcal{C}\Gamma_{m-1}(m-1)}{\mathcal{C}\Gamma_{m-1}(2m, \phi)}\, C_\phi(I), \qquad (28)$$

where $g_{\kappa,\tau}^\phi$ is the coefficient of $C_\phi$ (defined in the proof).

Proof. Consider equation (18). By making the transformations $\lambda_1 = \lambda_1$, $\mu_k = 1 - \lambda_k/\lambda_1$, $k = 2, \ldots, m$, we obtain the joint density of $\lambda_1, \mu_2, \ldots, \mu_m$ as

$$\frac{\pi^{m(m-1)}\,(\det \Sigma)^{-n}\, e^{-m\lambda_1}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \sum_{k=0}^\infty \sum_\kappa \lambda_1^{mn+k-1}\,(\det H)^2\, \det(I - H)^{n-m}\, \frac{C_\kappa(H)\, C_\kappa\left(\Sigma^{-1}\right)}{k!\, C_\kappa(I)} \prod_{i>j=2}^m (\mu_i - \mu_j)^2,$$

for $0 < \lambda_1 < \infty$ and $0 < \mu_2 < \cdots < \mu_m < 1$, where $H = \operatorname{diag}(\mu_2, \ldots, \mu_m)$. We have [9]

$$\det(I - H)^{n-m}\, C_\kappa(H) = \sum_{t=0}^\infty \sum_\tau \frac{[m - n]_\tau}{t!}\, C_\tau(H)\, C_\kappa(H) = \sum_{t=0}^\infty \sum_{\tau, \phi} \frac{[m - n]_\tau\, g_{\kappa,\tau}^\phi}{t!}\, C_\phi(H),$$

where $g_{\kappa,\tau}^\phi$ is the coefficient of $C_\phi(H)$ in the product $C_\kappa(H)\, C_\tau(H)$, with $\phi = (f_1, \ldots, f_m)$, $f_1 \geq \cdots \geq f_m \geq 0$ and $\sum_{i=1}^m f_i = k + t$. Again, by making the transformations $\lambda_1 = \lambda_1$, $\nu_k = \mu_k/\mu_m$, $k = 2, \ldots, m-1$, and $\mu_m = \mu_m$, we obtain the joint density of $\lambda_1, \nu_2, \ldots, \nu_{m-1}$ and $\mu_m$ as

$$\frac{\pi^{m(m-1)}\,(\det \Sigma)^{-n}\, e^{-m\lambda_1}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \sum_{k=0}^\infty \sum_\kappa \frac{\lambda_1^{mn+k-1}}{k!} \frac{C_\kappa\left(\Sigma^{-1}\right)}{C_\kappa(I)} \sum_{t=0}^\infty \sum_{\tau,\phi} \frac{[m - n]_\tau\, g_{\kappa,\tau}^\phi}{t!}\, \mu_m^{(m-1)(m+1)+t+k-1} \times (\det Z)^2\, C_\phi(Z_1) \prod_{i=2}^{m-1} (1 - \nu_i)^2 \prod_{i>j=2}^{m-1} (\nu_i - \nu_j)^2,$$

where $Z = \operatorname{diag}(\nu_2, \ldots, \nu_{m-1})$ and $Z_1 = \operatorname{diag}(1, \nu_2, \ldots, \nu_{m-1})$. Integrating with respect to $\nu_2, \ldots, \nu_{m-1}$ and using Lemma 2, we obtain the joint density of $\lambda_1$ and $\mu_m$ as

$$g(\lambda_1, \mu_m) = \frac{\pi^{m(m-1)}\,(\det \Sigma)^{-n}\, e^{-m\lambda_1}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \sum_{k=0}^\infty \sum_\kappa \frac{\lambda_1^{mn+k-1}}{k!} \frac{C_\kappa\left(\Sigma^{-1}\right)}{C_\kappa(I)} \sum_{t=0}^\infty \sum_{\tau,\phi} \frac{[m - n]_\tau\, g_{\kappa,\tau}^\phi}{t!}\, \mu_m^{(m-1)(m+1)+t+k-1} \left[(m-1)(m+1)+k+t\right] \times \frac{\mathcal{C}\Gamma_{m-1}(m-1)}{\pi^{(m-1)(m-2)}} \frac{\mathcal{C}\Gamma_{m-1}(m+1, \phi)\, \mathcal{C}\Gamma_{m-1}(m-1)}{\mathcal{C}\Gamma_{m-1}(2m, \phi)}\, C_\phi(I). \qquad (29)$$

Finally, the result follows by substituting $\mu_m = 1 - \lambda_m/\lambda_1$.

Theorem 4. Let $W = A^H A \sim CW_m(n, \Sigma)$. Since $\operatorname{cond}(A)^2 = \lambda_1/\lambda_m$, the density of $y = 1 - 1/\operatorname{cond}(A)^2$ is given by

$$f(y) = \frac{\pi^{m(m-1)}\,(\det \Sigma)^{-n}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)} \sum_{k=0}^\infty \sum_\kappa \frac{\Gamma(mn+k)}{m^{mn+k}\, k!} \frac{C_\kappa\left(\Sigma^{-1}\right)}{C_\kappa(I)} \sum_{t=0}^\infty \sum_{\tau,\phi} \frac{[m - n]_\tau\, g_{\kappa,\tau}^\phi}{t!}\, y^{(m-1)(m+1)+t+k-1} \times \left[(m-1)(m+1)+k+t\right] \frac{\mathcal{C}\Gamma_{m-1}(m-1)}{\pi^{(m-1)(m-2)}} \frac{\mathcal{C}\Gamma_{m-1}(m+1, \phi)\, \mathcal{C}\Gamma_{m-1}(m-1)}{\mathcal{C}\Gamma_{m-1}(2m, \phi)}\, C_\phi(I). \qquad (30)$$

Proof. The result follows by integrating (29) with respect to $\lambda_1$ and substituting $y = \mu_m$. Note that we have

$$\int_0^\infty e^{-m\lambda_1}\, \lambda_1^{mn+k-1}\, d\lambda_1 = \frac{\Gamma(mn+k)}{m^{mn+k}}.$$

If $\Sigma = \sigma^2 I$, then the corresponding results for Theorems 3 and 4 can be derived using a similar method. However, we provide an alternative approach as follows.

Theorem 5. Let $\Sigma = \sigma^2 I$. The joint density of $\lambda_1\ (= \lambda_{\max})$ and $\lambda_m\ (= \lambda_{\min})$ of a complex central Wishart matrix is given by

$$f(\lambda_1, \lambda_m) = \frac{\pi^{m(m-1)}\,(\sigma^2)^{-nm}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)}\, \lambda_1^{(m-1)(n-m-1)+m}\, \exp\left(-\frac{1}{\sigma^2}\left[(m-1)\lambda_1 + \lambda_m\right]\right) \lambda_m^{n-m}\,(\lambda_1 - \lambda_m)^{m^2-2}\, \psi(\varphi; m-2, 2, 0, 1), \qquad 0 < \lambda_m < \lambda_1 < \infty, \qquad (31)$$

where

$$\psi(\varphi; q, r, L, U) = \int_{D_3} \prod_{k=1}^q \left( x_k^r\, \varphi(x_k) \right) \prod_{k>l=1}^q (x_k - x_l)^2 \prod_{k=1}^q dx_k, \qquad (32)$$

with $D_3 = \{L \leq x_1 \leq \cdots \leq x_q \leq U\}$, and

$$\varphi(x) = (1 - x)^2 \left(1 - x + \frac{\lambda_m}{\lambda_1}\, x\right)^{n-m} \exp\left(\frac{1}{\sigma^2}\,(\lambda_1 - \lambda_m)\, x\right).$$

Proof. Consider equation (19). By making the transformations $\lambda_1 = \lambda_1$, $\mu_k = 1 - \lambda_k/\lambda_1$, $k = 2, \ldots, m$, we obtain the joint density of $\lambda_1, \mu_2, \ldots, \mu_m$ as

$$\frac{\pi^{m(m-1)}\,(\sigma^2)^{-nm}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)}\, \lambda_1^{mn-1}\, \exp\left(-\frac{m\lambda_1}{\sigma^2}\right) \prod_{k=2}^m \mu_k^2\,(1 - \mu_k)^{n-m} \exp\left(\frac{\lambda_1 \mu_k}{\sigma^2}\right) \prod_{k>l=2}^m (\mu_k - \mu_l)^2,$$

where $0 < \lambda_1 < \infty$ and $0 < \mu_2 < \cdots < \mu_m < 1$. Again, by making the transformations $\lambda_1 = \lambda_1$, $\nu_k = \mu_k/\mu_m$, $k = 2, \ldots, m-1$, and $\mu_m = \mu_m$, we obtain the joint density of $\lambda_1, \nu_2, \ldots, \nu_{m-1}$ and $\mu_m$ as

$$\frac{\pi^{m(m-1)}\,(\sigma^2)^{-nm}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)}\, \lambda_1^{mn-1}\, \exp\left(-\frac{\lambda_1 (m - \mu_m)}{\sigma^2}\right) \mu_m^{m^2-2}\,(1 - \mu_m)^{n-m} \times \prod_{k=2}^{m-1} \nu_k^2\,(1 - \nu_k)^2\,(1 - \mu_m \nu_k)^{n-m} \exp\left(\frac{\lambda_1 \mu_m \nu_k}{\sigma^2}\right) \prod_{k>l=2}^{m-1} (\nu_k - \nu_l)^2,$$

where $0 < \lambda_1 < \infty$, $0 < \nu_2 < \cdots < \nu_{m-1} < 1$, and $0 < \mu_m < 1$. Upon integration with respect to $\nu_2, \ldots, \nu_{m-1}$, the joint density $g(\lambda_1, \mu_m)$ of $\lambda_1$ and $\mu_m$ is given by

$$g(\lambda_1, \mu_m) = \frac{\pi^{m(m-1)}\,(\sigma^2)^{-nm}}{\mathcal{C}\Gamma_m(m)\, \mathcal{C}\Gamma_m(n)}\, \lambda_1^{mn-1}\, \exp\left(-\frac{\lambda_1 (m - \mu_m)}{\sigma^2}\right) \mu_m^{m^2-2}\,(1 - \mu_m)^{n-m}\, \psi(\varphi; m-2, 2, 0, 1), \qquad (33)$$

where $\varphi(x) = (1 - x)^2\,(1 - \mu_m x)^{n-m} \exp\left(\frac{\lambda_1 \mu_m x}{\sigma^2}\right)$, $0 < \lambda_1 < \infty$, and $0 < \mu_m < 1$. Now, the result follows by substituting $\mu_m = 1 - \lambda_m/\lambda_1$.

Theorem 6. Let $W = A^H A \sim CW_m(n, \sigma^2 I)$. Since $\operatorname{cond}(A)^2 = \lambda_1/\lambda_m$, the density of $y = 1 - 1/\operatorname{cond}(A)^2$ is given by

$$f(y) = \int_0^\infty g(\lambda_1, y)\, d\lambda_1, \qquad 0 < y < 1, \qquad (34)$$

where $g$ is the joint density (33) with $\mu_m = y$.

Proof. The proof is obvious from (33), since $y = \mu_m = 1 - \lambda_m/\lambda_1$.

It should be noted that the joint density of the extreme eigenvalues of a real central Wishart matrix is studied in [17], [20] and [21]. The density given in Theorem 6 may be used to test the sphericity hypothesis $H_0: \Sigma = \sigma^2 I$ against the alternative $H_1: \Sigma \neq \sigma^2 I$; see [17]. It may also be used to test the sphericity hypothesis against the alternative that any two eigenvalues of $\Sigma$ are unequal.

Consider the following numerical example for testing the sphericity hypothesis. A sequence of 23 complex signals is received at the output of a communication system with 3 outputs ($m = 3$). The sample covariance matrix is given by

$$S = \begin{pmatrix} 150.77 & 78.15 + 15.12i & 35.32 + 10.15i \\ 78.15 - 15.12i & 71.05 & 23.65 + 10.12i \\ 35.32 - 10.15i & 23.65 - 10.12i & 12.26 \end{pmatrix}.$$

Assume the signal is complex multivariate normal with population covariance matrix $\Sigma$. Then $W = 22S$ has the complex central Wishart distribution with 22 degrees of freedom, $W \sim CW_3(22, \Sigma)$. We wish to test the sphericity hypothesis $H_0: \Sigma = \sigma^2 I$ ($\sigma^2$ unknown) against $H_1$: any two eigenvalues of $\Sigma$ are unequal, at the significance level $\alpha = 0.05$. Let $\lambda_1 > \lambda_2 > \lambda_3$ be the eigenvalues of $W$. Then the critical region is given by

$$\frac{\lambda_1}{\lambda_3} = \operatorname{cond}(W) \geq c,$$

where the constant $c$ is chosen to make the significance level equal to 0.05. This critical region can be written equivalently as $y \geq d$, where $y = 1 - \lambda_3/\lambda_1$, the density $f(y)$ is given in (34), and $d$ is a constant chosen to make the significance level equal to 0.05. Thus $d$ is chosen such that

$$P_{H_0}(y \geq d) = \int_d^1 f(y)\, dy = 0.05.$$

A numerical evaluation of this probability shows that $d = 0.7$ for $m = 3$, $n = 22$ and $\sigma^2 = 1$. For the measured data we have $y = 1 - \lambda_3/\lambda_1 = 1 - 2/209.88 = 0.9905$, which is highly significant at the 5% level, and so we reject the sphericity hypothesis. If $W \sim CW_3(22, I_3)$, it also follows from this calculation that $P(\operatorname{cond}(W) > 1/(1-d)) = 0.05$. The computation of the test statistic $y$ from $S$ is sketched below.
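The following sketch (ours) forms the Hermitian matrix $S$, computes the ordered eigenvalues (the ratio $\lambda_3/\lambda_1$ is the same for $S$ and $W = 22S$, so the statistic is scale invariant), and compares $y$ with the critical value $d = 0.7$ quoted above.

```python
# Sketch: sphericity test statistic y = 1 - lambda_3/lambda_1 for the example.
import numpy as np

S = np.array([[150.77,         78.15 + 15.12j, 35.32 + 10.15j],
              [78.15 - 15.12j, 71.05,          23.65 + 10.12j],
              [35.32 - 10.15j, 23.65 - 10.12j, 12.26]])

lam = np.sort(np.linalg.eigvalsh(22 * S))[::-1]  # eigenvalues of W = 22 S
y = 1 - lam[-1] / lam[0]                         # scale-invariant statistic
print(lam)
print(y, y >= 0.7)                               # y ~ 0.99: reject sphericity
```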

5 Conclusion

In this paper, the distributions of the largest and smallest eigenvalues of a complex Wishart matrix were derived for an arbitrary covariance matrix $\Sigma$, and the joint distributions of the extreme eigenvalues were also derived. Using these distributions, we derived the condition number distributions of complex random matrices. These distributions play an important role in numerical analysis and statistical hypothesis testing.

References

[1] T. Baker and P. Forrester, The Calogero–Sutherland model and generalized classical polynomials, Commun. Math. Phys., 188 (1997), pp. 175–216.
[2] A. G. Constantine, Some noncentral distribution problems in multivariate analysis, Ann. Math. Statist., 34 (1963), pp. 1270–1285.
[3] A. Edelman, Eigenvalues and condition numbers of random matrices, SIAM J. Matrix Anal. Appl., 9 (1988), pp. 543–560.
[4] H. H. Goldstine and J. von Neumann, Numerical inverting of matrices of high order II, Proc. Amer. Math. Soc., 2 (1951), pp. 188–202.
[5] A. T. James, Distributions of matrix variates and latent roots derived from normal samples, Ann. Math. Statist., 35 (1964), pp. 475–501.
[6] I. M. Johnstone, On the distribution of the largest eigenvalue in principal components analysis, Ann. Statist., 29 (2001), pp. 295–327.
[7] C. G. Khatri, Distribution of the largest or the smallest characteristic root under null hypothesis concerning complex multivariate normal populations, Ann. Math. Statist., 35 (1964), pp. 1807–1810.
[8] C. G. Khatri, On certain distribution problems based on positive definite quadratic functions in normal vectors, Ann. Math. Statist., 37 (1966), pp. 468–479.
[9] C. G. Khatri and K. C. S. Pillai, On the non-central distributions of two test criteria in multivariate analysis of variance, Ann. Math. Statist., 39 (1968), pp. 215–226.
[10] I. G. Macdonald, Symmetric Functions and Hall Polynomials, 2nd ed., Oxford University Press, New York, 1995.
[11] M. L. Mehta, Random Matrices, 2nd ed., Academic Press, New York, 1991.
[12] R. J. Muirhead, Aspects of Multivariate Statistical Theory, Wiley, New York, 1982.
[13] J. Shen, On the singular values of Gaussian random matrices, Linear Algebra Appl., 326 (2001), pp. 1–14.
[14] S. Smale, On the efficiency of algorithms of analysis, Bull. Amer. Math. Soc., 13 (1985), pp. 87–121.
[15] R. P. Stanley, Some combinatorial properties of Jack symmetric functions, Adv. Math., 77 (1989), pp. 76–115.
[16] T. Sugiyama, On the distribution of the largest latent root of the covariance matrix, Ann. Math. Statist., 38 (1967), pp. 1148–1151.
[17] T. Sugiyama, Joint distribution of the extreme roots of a covariance matrix, Ann. Math. Statist., 41 (1970), pp. 655–657.
[18] T. Sugiyama, Distributions of the largest latent root of the multivariate complex Gaussian distribution, Ann. Inst. Statist. Math., 24 (1972), pp. 87–94.
[19] J. von Neumann and H. H. Goldstine, Numerical inverting of matrices of high order, Bull. Amer. Math. Soc., 53 (1947), pp. 1021–1099.
[20] V. B. Waikar and F. J. Schuurmann, Exact joint density of the largest and smallest roots of the Wishart and MANOVA matrices, Util. Math., 4 (1973), pp. 253–260.
[21] V. B. Waikar, On the joint distributions of the largest and smallest latent roots of two random matrices (noncentral case), South African Statist. J., 7 (1973), pp. 103–108.
[22] J. H. Wilkinson, Error analysis revisited, IMA Bulletin, 22 (1986), pp. 192–200.
