Performance Bounds in Parameter Estimation with Application to
Transcription
Performance Bounds in Parameter Estimation with Application to
Performance Bounds in Parameter Estimation with Application to Bearing Estimation A dissertation submitted in partial fulllment of the requirements for the degree of Doctor of Philosophy at George Mason University by Kristine LaCroix Bell B.S. Electrical Engineering, Rice University, 1985 M.S. Electrical Engineering, George Mason University, 1990 Dissertation Director: Prof. Yariv Ephraim Electrical and Computer Engineering Spring Semester 1995 George Mason University Fairfax, Virginia ii Acknowledgments I would like to express my gratitude to Professor Harry L. Van Trees for giving me the opportunity to pursue this work in the C3I Center at George Mason University, and for his generous support. I have benetted greatly from his direction and encouragement, and from his extensive knowledge of detection, estimation, and modulation theory, as well as array processing. I am deeply indebted to Professor Yariv Ephraim, under whose direct supervision this work was performed. Working with Professor Ephraim has been a very protable and enjoyable experience, and his willingness to share his knowledge and insight has been invaluable to me. I am also grateful to Dr. Yossef Steinberg, whose close collaboration I enjoyed during the year he visited here. Professors Edward Wegman and Ariela Sofer deserve thanks for serving on my committee, and for reviewing this thesis. Finally, I thank my husband Jamie, and our daughters, Julie and Lisa, for all their love, encouragement, and patience. This work was supported by Rome Laboratories contracts F30602-92-C-0053 and F30602-94-C-0051, Defense Information Systems Agency contract DCA-89-0001, the Virginia Center For Innovative Technology grant TDC-89-003, the School of Information Technology and Engineering at George Mason University, and two Armed Forces Communications and Electronics Association (AFCEA) Fellowships. iii Table of Contents Acknowledgments List of Figures Abstract ii iv v 1 Introduction 1 2 Bayesian Bounds 6 2.1 Ziv-Zakai Bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.2 Covariance Inequality Bounds . . . . . . . . . . . . . . . . . . . . . . 9 2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3 Extended Ziv-Zakai Bound 3.1 Scalar Parameter with Arbitrary Prior . 3.2 Equally Likely Hypothesis Bound . . . . 3.3 Single Test Point Bound . . . . . . . . . 3.4 Bound for a Function of a Parameter . . 3.5 M -Hypothesis Bound . . . . . . . . . . . 3.6 Arbitrary Distortion Measures . . . . . . 3.7 Vector Parameters with Arbitrary Prior . 3.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 15 19 21 23 26 29 30 36 4 Relationship of Weiss-Weinstein Bound to Extended Ziv-Zakai Bound 38 5 Probability of Error Bounds 42 6 Examples 47 6.1 Estimation of a Gaussian Parameter in Gaussian Noise . . . . . . . . 47 6.2 Bearing Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51 6.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 7 Concluding Remarks References A Proof of Vector M -Hypothesis Bound 78 81 89 iv List of Figures 1.1 2.1 3.1 3.2 3.3 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6.10 6.11 6.12 6.13 6.14 A.1 MSE Behavior in nonlinear estimation problems . . . . . . . . . . . Valley lling function. . . . . . . . . . . . . . . . . . . . . . . . . . Choosing g() for function of a parameter. . . . . . . . . . . . . . . Scalar parameter M -ary detection problem. . . . . . . . . . . . . . Vector parameter binary detection problem. . . . . . . . . . . . . . Geometry of the single source bearing estimation problem using a planar array. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Uniform linear array. . . . . . . . . . . . . . . . . . . . . . . . . . . EZZB evaluated with Pierce, SGB, and Bhattacharyya lower bounds, and Pierce and Cherno upper bounds for 8-element linear array and uniform distribution. . . . . . . . . . . . . . . . . . . . . Comparison of normalized bounds for 8-element linear array and uniform distribution. . . . . . . . . . . . . . . . . . . . . . . . . . . Comparison of normalized bounds for 8-element linear array and cosine squared distribution. . . . . . . . . . . . . . . . . . . . . . . Comparison of normalized bounds for bearing estimation with 8-element linear array and cosine distribution. . . . . . . . . . . . . Square array. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Beampattern of 16-element square array. . . . . . . . . . . . . . . . The function f () for 16-element square array for SNR=-14 dB. . . Impact of maximization and valley-lling for 16-element square array for SNR=-14 dB. . . . . . . . . . . . . . . . . . . . . . . . . . Comparison of normalized vector bounds for 16-element square array and uniform distribution. . . . . . . . . . . . . . . . . . . . . Beampattern of 16-element circular array. . . . . . . . . . . . . . . The function f () for 16-element circular array for SNR=-14 dB. . . Comparison of normalized vector bounds for 16-element circular array and uniform distribution. . . . . . . . . . . . . . . . . . . . . Vector parameter M -ary detection problem. . . . . . . . . . . . . . . . . . . 2 8 25 26 32 . 52 . 56 . 58 . 60 . 62 . . . . 65 67 68 69 . 71 . 72 . 74 . 75 . 76 . 91 Abstract PERFORMANCE BOUNDS IN PARAMETER ESTIMATION WITH APPLICATION TO BEARING ESTIMATION Kristine LaCroix Bell, Ph.D. George Mason University, 1995 Dissertation Director: Prof. Yariv Ephraim Bayesian lower bounds on the minimum mean square error (MSE) in estimating a set of parameters from noisy observations are studied. These include the Ziv-Zakai, Weiss-Weinstein, and Bayesian Cramer-Rao bounds. The focus of this dissertation is on the theory and application of the Ziv-Zakai bound. This bound relates the MSE in the estimation problem to the probability of error in a binary hypothesis testing problem, and was originally derived for problems involving a single uniformly distributed parameter. In this dissertation, the Ziv-Zakai bound is generalized to vectors of parameters with arbitrary prior distributions. In addition, several extensions of the bound and some computationally useful forms are derived. The extensions include a bound for estimating a function of a random parameter, a tighter bound in terms of the probability of error in a multiple hypothesis testing problem, and bounds for distortion measures other than MSE. A relationship between the extended Ziv-Zakai bound and the Weiss-Weinstein bound is also presented. The bounds developed here, as well as the Weiss-Weinstein and Bayesian Cramer-Rao bounds, are applied to a series of bearing estimation problems, in which vi the parameters of interest are the directions-of-arrival of signals received by an array of sensors. These are highly nonlinear problems for which evaluation of the exact performance is intractable. For this application, the extended Ziv-Zakai bound is shown to be tighter than the other bounds in the threshold and asymptotic regions. Chapter 1 Introduction Lower bounds on the minimum mean square error (MSE) in estimating a set of parameters from noisy observations are of considerable interest in many elds. Such bounds provide the unbeatable performance of any estimator in terms of the MSE. Furthermore, good lower bounds are often used in investigating fundamental limits of the parameter estimation problem at hand. Since evaluation of the exact minimum MSE is often dicult or even impossible, good computable bounds are sought. The focus of this dissertation is on the theory and application of such bounds. Parameter estimation problems arise in many elds such as signal processing, communications, statistics, system identication, control, and economics. An important example is the estimation of the bearings of point sources in array processing [1], which has applications in radar, sonar, seismic analysis, radio telemetry, tomography, and anti-jam communications [2, 3]. Other examples include the related problems of time delay estimation [4, 5], and frequency oset or Doppler shift estimation [4] used in the above applications as well as in communication systems [6], and estimation of the frequency and amplitude of a sinusoidal signal [7]. These are highly nonlinear problems for which evaluation of the exact performance is intractable. In non-linear estimation problems, several distinct regions of operation can be observed. Typical performance is shown in Figure 1.1. 1 2 MSE maximum MSE ambiguity asymptotic region region threshold SNR Figure 1.1 MSE Behavior in nonlinear estimation problems In the small error or asymptotic region, which is characterized by high signal-tonoise-ratio (SNR) and/or long observation time, estimation errors are small. In the ambiguity region, in which SNR and/or observation time is moderate, large errors occur. The transition between the two regions can be abrupt and the location of the transition is called the threshold. When SNR and/or observation time are very small, the observations provide very little information and the MSE is close to that obtained from the prior knowledge about the problem. We are interested in bounds which closely characterize performance in both the asymptotic and ambiguity regions, and accurately predict the location of the threshold. The most commonly used bounds are the Cramer-Rao [8]-[13], Barankin [14], Ziv-Zakai [15]-[17], and Weiss-Weinstein [18]-[22] bounds. The Cramer-Rao and Barankin bounds are local bounds which treat the parameter as an unknown de- 3 terministic quantity, and provide bounds on MSE for each possible value of the parameter. They are members of the family of local \covariance inequality" bounds [23, 24], which includes the Bhattacharyya [25], Hammersley-Chapman-Robbins [26, 27], Fraser-Guttman [28], Kiefer [29], and Abel [30] bounds. The CramerRao bound is generally the easiest to compute but is known to be useful only in the asymptotic region of high SNR and/or long observation time. The Barankin bound has been used for threshold and ambiguity region analysis but is harder to implement as it requires maximization over a number of free variables. Local bounds suer from the drawback that they are applicable to a restricted class of estimators, usually the class of unbiased estimators. Restrictions must be imposed in order to avoid the trivial bound of zero obtained when the estimator is chosen constant equal to a selected parameter value. Such restrictions, however, limit the applicability of the bound since biased estimators are often unavoidable. For example, unbiased estimators do not exist in the commonly encountered situation of a parameter whose support is nite [15]. Another drawback of local bounds is that they cannot incorporate prior information about the parameter, such as its support. Thus the lower bound may exceed the maximum MSE possible in a given problem. The Ziv-Zakai bound (ZZB) and Weiss-Weinstein bound (WWB) are Bayesian bounds which assume that the parameter is a random variable with known prior distribution. They provide bounds on the global MSE averaged over the prior distribution. There are no restrictions on the class of estimators to which they are applicable, but they can be strongly inuenced by the parameter values which produce the largest errors. The ZZB was originally derived in [15], and improved by Chazan, Zakai, and Ziv [16] and Bellini and Tartara [17]. It relates the MSE in the estimation problem to the 4 probability of error in a binary hypothesis testing problem. The WWB is a member of a family of Bayesian bounds derived from a \covariance inequality" principle. This family includes the Bayesian Cramer-Rao bound [31], Bayesian Bhattacharyya bound [31], Bobrovsky-Zakai bound [32] (when interpreted as a bound for parameter estimation), and the family of bounds of Bobrovsky, Mayer-Wolf, and Zakai [33]. The WWB is applicable to vectors of parameters with arbitrary prior distributions, while the ZZB was derived only for a single uniformly distributed random variable. The bounds have been applied to a variety of problems in [4], [15]-[18],[34][43], where they have proven to be some of the tightest available bounds for all regions of operation. The WWB and ZZB are derived using dierent techniques and no underlying theoretical relationship between the two bounds has been developed, therefore comparisons between the two bounds have been made only through computational examples. The WWB tends to be tighter in the very low SNR region, while the ZZB tends to be tighter in the asymptotic region and provides a better prediction of the threshold location [18, 43]. In this dissertation we focus on Bayesian bounds. The major contribution is an extension of the ZZB to vectors of parameters with arbitrary prior distributions. Such extension has long been of interest especially in the array processing area [18, 44], where the problem is inherently multidimensional and not all priors may be assumed uniform. The theory of the extended Ziv-Zakai bound (EZZB) is investigated, and several computationally useful forms of the bound are derived, as well as further extensions including a bound for estimating a function of a random variable, a tighter bound in terms of the probability of error in an M -ary hypothesis testing problem for M 2, and bounds on distortion measures other than MSE. The relationship between the Weiss-Weinstein family of bounds and the extended 5 Ziv-Zakai bound is also explored, and a new bound in the Weiss-Weinstein family is proposed which can also be derived in the Ziv-Zakai formulation. The bounds developed here are applied to a series of bearing estimation problems. Lower bounds on MSE for bearing estimation have attracted much attention in recent years (see e.g. [36]-[56]), and the emphasis has been mainly on the CramerRao and Barankin bounds. In the bearing estimation problem, the parameter space is limited to a nite interval and no estimators can be constructed which are unbiased over the entire interval, therefore these bounds may not adequately characterize performance of real estimators. The ZZB has not been widely used due to its limitation to a single uniformly distributed parameter. In the examples, we compute the EZZB for arbitrarily distributed vector parameters and compare it to the WWB and Bayesian Cramer-Rao bound (BCRB). The EZZB is shown to be the tightest bound in the threshold and asymptotic regions. The dissertation is organized as follows. In Chapter 2, the parameter estimation problem is formulated and existing Bayesian bounds are summarized. In Chapter 3, extension of the Ziv-Zakai bound to arbitrarily distributed vector parameters is derived. Properties of the bound are discussed and the extensions mentioned earlier are developed. In Chapter 4, the relationship of the EZZB to the WWB is explored. In Chapter 5, probability of error expressions and bounds which are needed for evaluation of the extended Ziv-Zakai bounds, are presented. In Chapter 6, the bounds are applied to several bearing estimation problems and compared with the WWB and BCRB. Concluding remarks and topics for further research are given in Chapter 7. Chapter 2 Bayesian Bounds Consider estimation of a K -dimensional vector random parameter based upon the noisy observation vector x. Let p() denote the prior probability density function (pdf) of and p(xj) the conditional pdf of x given . For any estimator, ^ (x), the estimation error is = ^ (x) , , and the error correlation matrix is dened as n o R = E T : (2.1) We are interested in lower bounding aT Ra, for any K -dimensional vector a. Of special interest is the case when a is the unit vector with a one in the ith position. This choice yields a bound on the MSE of the ith component of . The minimum MSE estimator is the conditional mean estimator [31, p. 75]: Z ^ (x) = E fjxg = p( jx)d (2.2) and its MSE is the greatest lower bound. The minimum MSE can be very dicult or even impossible to evaluate in many situations of interest, therefore good computable lower bounds are sought. Bayesian bounds fall into two families: the Ziv-Zakai family, which relate the MSE to the probability of error in a binary hypothesis testing problem, and the \covariance inequality" or Weiss-Weinstein family, which are derived using the Schwarz inequality. 6 7 2.1 Ziv-Zakai Bounds This family includes the original bound developed by Ziv and Zakai [15], and improvements by Seidman [4], Chazan, Zakai, and Ziv [16], Bellini and Tartara [17], and Weinstein [57]. Variations on these bounds may also be found in [58] and [59]. They all relate the MSE in the estimation problem to the probability of error in a binary hypothesis testing problem and were derived for the special case when is a scalar parameter uniformly distributed on [T0; T1]. The Bellini-Tartara bound is the tightest of these bounds. It is based upon the relation [60, p. 24]: Z1 (2.3) 2 = 2 Pr jj 2 d 0 and Kotelnikov's inequality [6]: (Z T1, ) 2 Pr jj 2 T , T V Pmin(; + )d ; T0 1 0 (2.4) where Pmin(; + ) is the minimum probability of error in the binary hypothesis testing problem H0 : ; Pr(H0 ) = 21 ; x p(xj) H1 : + ; Pr(H1 ) = 12 ; x p(xj + ); (2.5) and V fg is the \valley-lling" function illustrated in Figure 2.1. For any function f (), V ff ()g is a non-increasing function of obtained by lling in any valleys in f (), i.e. for every , V ff ()g = max f ( + ): 0 Combining Eqs. (2.3) and (2.4), the Bellini-Tartara bound on MSE is: (Z T1, ) Z T1,T0 2 0 T1 , T0 V T0 Pmin(; + )d d: (2.6) (2.7) If valley-lling is omitted, the weaker bound of Chazan-Zakai-Ziv [16] is obtained. 8 ν {f(h)} f(h) h Figure 2.1 Valley lling function. Weinstein [57] derived an approximation to (2.7) in which the integral over is replaced by a discrete sum: r 2 , 2 Z T1 ,i X i i,1 2 fmax Pmin(; + i)d (2.8) r 2( T , T i gi=1 1 0 ) T0 i=1 where 0 = 0 < 1 < < r T1 , T0. Although (2.8) is weaker than (2.7) for any choice of the test points figri=1, it can be tighter than the Chazan-Zakai-Ziv bound, and is easier to analyze. When a single test point is used (r = 1), the bound becomes 2 Z T1, P (; + )d; 2 max (2.9) min 2(T , T ) T 1 0 0 which is a factor of two better than the original Ziv-Zakai bound [15]. These bounds have been shown to be useful bounds for all regions of operation [4],[15]-[17], [34]-[38],[43], but their applicability is limited to problems involving a single, uniformly distributed parameter, and to problems in which the probability of error is known or can be tightly lower bounded. 9 2.2 Covariance Inequality Bounds Bounds in this family include the MSE of the conditional mean estimator (the minimum MSE), and the Bayesian Cramer-Rao [31], Bayesian Bhattacharyya [31], Bobrovsky-Zakai [32], Weiss-Weinstein [18]-[22], and Bobrovsky-Mayer-Wolf-Zakai [33] bounds. Using the Schwarz inequality, Weiss and Weinstein [18]-[22] showed that for any function (x; ) such that E f (x; )j xg = 0, the global MSE for a single, arbitrarily distributed random variable can be lower bounded by: 2 f (x; )g E 2 E f 2(x; )g : (2.10) The function (x; ) denes the bound. The minimum MSE is obtained from: (x; ) = E fjxg , : The Bayesian Cramer-Rao Bound is obtained by selecting p(x; ) ; (x; ) = @ ln @ which yields the bound [31, p. 72]: (2.11) (2.12) 2 J ,81 9 ( 2 ) < @ ln p(x; ) !2= @ ln p ( x ; ) J = E: : (2.13) ; = ,E @ @2 In order to evaluate this bound, the joint distribution must be twice dierentiable with respect to the parameter [31, 18]. This requirement can be quite restrictive as the required derivatives will not exist when the parameter space is a nite interval and the prior density is not suciently smooth at the endpoints, such as when the parameter is uniformly distributed over a nite interval. Bobrovsky, Mayer-Wolf, and Zakai generalized the BCRB using a weighting function q(x; ): ; )q(x; )] : (x; ) = q(x; ) @ ln [p(x@ (2.14) 10 Appropriate choice of q(x; ) allows for derivation of useful bounds even when the derivatives of the density function do not exist. The Bobrovsky-Zakai bound uses the nite dierence in place of the derivative: , p(x; ) : (x; ) = p(x1; ) p(x; + ) (2.15) It must be optimized over and converges to the BCRB as ! 0. This bound does not require the existence of derivatives of the joint pdf (except in the limit when it converges to the BCRB), but it does require that whenever p(x; ) = 0 then p(x; + ) = 0. This is also quite restrictive and is not satised, for example, when the prior distribution of the parameter is dened over a nite interval. The bound (2.10) can be generalized to include vector functions (x; ) with the property E f (x; )j xg = 0, as follows: 2 uT V,1u (2.16) ui = E f i(x; )g (2.17) Vij = E f i(x; ) j (x; )g: (2.18) where The Bayesian Bhattacharyya bound of order r is obtained by choosing: i(x; ) = @ i ln p(x; ) ; @i i = 1; : : : ; r (2.19) The bound becomes tighter as r increases and reduces to the BCRB when m = 1. It requires the existence of the higher order derivatives of the joint pdf [31, 18]. Weiss and Weinstein proposed the following r-dimensional (x; ): s 1,s i (x; ) = L i (x; +i; ) , L i (x; , i; ); 0 < si < 1; i = 1; : : : ; r (2.20) 11 where L(x; 1; 2) is the (joint) likelihood ratio: L(x; 1; 2) = pp((xx;; 1)) : (2.21) 2 The BCRB and Bobrovsky-Zakai bounds are special cases of this bound. The BCRB is obtained for r = 1 and any s, in the limit when ! 0, and the Bobrovsky-Zakai bound is obtained when r = 1 and s = 1. The WWB does not require the restrictive assumptions of the previous bounds, except in the limit when they converge, but must be optimized over the free variables fsigri=1 and figri=1. The variables figri=1 are usually called \test points". Increasing the number of test points always improves the bound. When a single test point is used (r = 1), the WWB has the form: 2e2(s;) 2 max s; e(2s;) + e(2,2s;,) , 2e(s;2) 0<s<1 (2.22) where (s; ) = ln E fLs(x; + ; )g Z Z s 1 , s s 1 , s = ln p( + ) p() p(xj + ) p(xj) dx d Z = ln p( + )sp()1,s e(s;+;) d: (2.23) The term (s; + ; ) is the semi-invariant moment generating function which is used in bounding the probability of error in binary hypothesis testing problems [31, 67]. It is important to note that in order to avoid singularities, the integration with respect to is performed over the region = f : p() > 0g. Although valid for any s 2 (0; 1), the WWB is generally computed using s = 12 , for which the single test point bound simplies to: 2e2( 21 ;) : 2 max 2 1 , e( 21 ;2) (2.24) 12 These scalar parameter bounds can all be generalized for vector parameters. The vector parameter bounds have the form: aT Ra aT UT V,1 Ua (2.25) where Uij = E fi j (x; )g (2.26) Vij = E f i(x; ) j (x; )g: (2.27) The multiple parameter BCRB is obtained by selecting @ ln p(x; ) i (x; ) = @i which yields [31]: (2.28) aT Ra aT(J,1a ) ( 2 ) @ ln p ( x ; ) @ ln p ( x ; ) @ ln p(x; ) = ,E : (2.29) Jij = E @ @ @ @ i j i j Again, in order to evaluate this bound, the joint distribution must be twice dierentiable with respect to the parameter [31, 18]. The vector WWB is obtained from the multiple parameter version of (2.20). In this general form, the WWB requires maximization over a large number of free variables. A practical form of the bound with si = 21 ; i = 1; : : : ; r has been used in several applications and is given by [22],[42]: aT Ra aT WT Q,1 Wa where WT = Qij = h 1 i r ( 1 ;i;j ) ( 1 ;i;,j ) 2 e 2 ,e 2 ( 21 ; i ;0) ( 21 ; j ;0) e e (2.30) (2.31) (2.32) 13 and 9 8v = <u u p ( x ; + ) p ( x ; + ) (2.33) ( 21 ; i; j ) = ln E :t p(x; ) i p(x; ) j ; Z q Z q = ln p( + i)p( + j ) p(xj + i)p(xj + j )dx d: Again, integration with respect to is over the region = f : p() > 0g. In implementing the bound, we must choose the test points fi gri=1. For a non-singular bound, there must be at least K linearly independent test points. The WWB has been shown to be a useful bound for all regions of operation [18],[39]-[43]. The only major diculty in implementing the bound is in choosing the test points and in inverting the matrix Q. 2.3 Summary The WWB and ZZB are derived using dierent techniques and no underlying theoretical relationship between the two bounds has been developed. They have both been used for threshold analysis and are useful over a wide range of operating conditions. Comparisons between the two bounds have been carried out through computational examples, in which the WWB tends to be tighter in the very low SNR region, while the ZZB tends to be tighter in the asymptotic region and provides a better prediction of the threshold location [18, 43]. The WWB can be applied to arbitrarily distributed vector parameters, but the ZZB was derived only for scalar, uniformly distributed parameters. In this dissertation, the ZZB is extended to arbitrarily distributed vector parameters. Properties of the bound are presented as well as some further variations and extensions. The relationship between the Weiss-Weinstein family of bounds and the extended Ziv-Zakai bound is also explored, and a new bound in the Weiss-Weinstein 14 family is proposed which can also be derived in the Ziv-Zakai formulation. In the examples, we compute the EZZB for arbitrarily distributed vector parameters and compare it to the WWB and BCRB. The EZZB is shown to be the tightest bound in the threshold and asymptotic regions. Chapter 3 Extended Ziv-Zakai Bound In this chapter, the Bellini-Tartara form of the Ziv-Zakai bound is extended for arbitrarily distributed vector parameters. We begin by generalizing the bound for a scalar parameter with arbitrary prior distribution. The derivation uses the elements of the original proofs in [15]-[17], but is presented in a more straightforward manner. This formulation provides the framework for the derivation of several variations and extensions to the general scalar bound. These include some bounds which are weaker but easier to evaluate, a bound on the MSE in estimating a function of a random variable, a tighter bound in terms of the probability of error in an M -ary detection problem, bounds for distortion functions other than MSE, and extension of all the above mentioned bounds to arbitrarily distributed vector parameters. 3.1 Scalar Parameter with Arbitrary Prior Theorem 3.1 The MSE in estimating the scalar random variable with prior pdf p() is lower bounded by: Z 1 Z 1 2 0 2 V ,1 (p(') + p(' + )) Pmin('; ' + )d' d (3.1) where V fg denotes the valley-lling function and Pmin('; ' + ) is the minimum probability of error in the binary detection problem: H0 : = '; Pr(H0) = p(') +p(p'(') + ) ; x p(xj = ') H1 : = ' + ; Pr(H1) = 1 , Pr(H0 ); x p(xj = ' + ): 15 (3.2) 16 Proof. We start from the relation [60, p. 24]: Z1 (3.3) 2 = 2 Pr jj 2 d: 0 Since both and Pr jj 2 are non-negative, a lower bound on 2 can be obtained from a lower bound on Pr jj 2 . Pr jj 2 = Pr > 2 + Pr , 2 (3.4) Z1 = p('0) Pr > 2 = '0 d'0 ,1 Z1 + p('1) Pr , 2 = '1 d'1: (3.5) ,1 Let '0 = ' and '1 = ' + . Expanding = ^(x) , gives: Z1 ^ p(') Pr (x) > ' + 2 = ' Pr jj 2 = ,1 ^ (3.6) +p(' + ) Pr (x) ' + 2 = ' + d' Z1 = (p(') + p(' + )) ,1 " p(') ^ p(') + p(' + ) Pr (x) > ' + 2 = ' # p ( ' + ) ^ + p(') + p(' + ) Pr (x) ' + 2 = ' + d': (3.7) Consider the detection problem dened in (3.2) and the suboptimal decision rule in which the parameter is rst estimated and a decision is made in favor of its \nearest-neighbor": Decide H0 : = ' if ^(x) ' + 2 (3.8) Decide H1 : = ' + if ^(x) > ' + 2 : The term in square brackets in (3.7) is the probability of error for the suboptimal decision scheme. If the suboptimal error probability is lower bounded by the minimum probability of error for the detection problem, we have Z1 Pr jj 2 (p(') + p(' + )) Pmin('; ' + )d': (3.9) ,1 17 Now since Pr jj 2 is a non-increasing function of , it can be more tightly bounded by applying the valley-lling function to the right hand side of (3.9). This produces a bound that is also non-increasing in : Z 1 (p(') + p(' + )) Pmin('; ' + )d' : (3.10) Pr jj 2 V ,1 Substituting (3.10) into (3.3) gives the desired bound on MSE. 2 Remarks. 1) For generality, the regions of integration over ' and have not been explicitly dened. However, since Pmin('; ' + ) is zero when one of the hypotheses has zero probability, integration with respect to ' is restricted to the region in which both p(') and p(' + ) are non-zero, and the upper limit for integration with respect to is the length of the interval over which p(') is non-zero. 2) When is uniformly distributed on [T0; T1], the bound (3.10) on Pr jj 2 reduces to Kotelnikov's inequality (2.4), and the MSE bound is equal to the BelliniTartara bound (2.7). If valley-lling is omitted, the weaker bound of Chazan-ZakaiZiv [16] is obtained. Note that when the a priori pdf is uniform, the hypotheses in the detection problem (3.2) are equally likely. 3) A bound similar to (3.1) for non-uniform discrete parameters can be found in [58], but which is worse by a factor of two. 4) The bound (3.1) coincides with the minimum MSE, achieved by conditional mean estimator, ^(x) = E fjxg mjx , when the conditional density p(jx) is symmetric and unimodal. Under these conditions, we have: Z 1 Z 1 V ( p ( ' ) + p ( ' + )) P ( '; ' + ) d' d min ,1 0 2 Z 1 Z 1 = V Ex ,1 min(p('jx); p(' + jx))d' d (3.11) 0 2 #) ( " Z1 Z mjx+ 2 Z1 = p(' + jx)d' + p('jx))d' d (3.12) 2 V Ex 0 ,1 mjx + 2 18 = = = = = = #) Z1 Z 1 ( "Z mjx, 2 V Ex ,1 p('jx)d' + m + p('jx)d' d 0 2 jx 2 Z1 x d V E x Pr j , mjx j 2 Z01 2 x d E x Pr j , mjx j 2 0 2 Z 1 Ex Pr j m jx , j x d 2 2 io n 0h 2 Ex E (mjx , ) x : n o E (mjx , )2 : (3.13) (3.14) (3.15) (3.16) (3.17) (3.18) In going from (3.11) to (3.12), we have used the fact that since p(jx) is symmetric about mjx and unimodal, p('jx) p(' +jx) when ' mjx + 2 and p(' +jx) < p('jx) when ' > mjx + 2 . There is equality in (3.15) because the term inside the brackets is non-increasing in and valley-lling is trivial. 5) If the a priori pdf is symmetric and unimodal, the bound (3.1) is equal to the prior variance in the region of very low SNR and/or observation time. In this region the observations are essentially useless, therefore p('); p(' + )) Pmin('; ' + ) = min( p(') + p(' + ) and the bound has the form: Z 1 Z 1 V min( p ( ' ) ; p ( ' + )) d' d: 0 2 ,1 (3.19) (3.20) If m and 2 denote the prior mean and variance of , and p() is symmetric about its mean and unimodal, then by the same arguments as in (3.11)-(3.18), (3.20) is equal to Z1 d = E n( , m )2o = 2: Pr j , m j (3.21) 2 0 2 19 3.2 Equally Likely Hypothesis Bound In the bound of Theorem 3.1, the dicult problem of computing the MSE has been transformed into the less dicult problem of computing and integrating the minimum probability of error. However, the bound is useful only when Pmin('; '+) can either be calculated or tightly lower bounded. In many problems, this is easier if the detection problem involves equally likely hypotheses. The bound (3.1) can be modied so that the detection problem has this property as follows: Theorem 3.2 2 Z 1 Z 1 el 0 2 V ,1 2 min (p('); p(' + )) Pmin('; ' + )d' d: (3.22) el ('; ' + ) is the minimum where V fg denotes the valley-lling function and Pmin probability of error in the binary detection problem: H0 : = '; Pr(H0 ) = 12 ; x p(xj = ') H1 : = ' + ; Pr(H1 ) = 12 ; x p(xj = ' + ); (3.23) The bound (3.22) for equally likely hypotheses equals the general bound of Theorem 3.1 when the prior pdf is uniform, otherwise it is weaker. Proof. We follow the proof of Theorem 3.1 through (3.6), which can be lower bounded by: 1 Z1 Pr jj 2 2 min(p('); p(' + )) 2 Pr ^ > ' + ,1 1 ^ + 2 Pr ' + 2 = ' + d': = ' 2 (3.24) The term in square brackets is the probability of error in the suboptimal nearestneighbor decision scheme when the two hypotheses are equally likely. It can be lower el ('; ' + ), from which (3.22) is bounded by the minimum probability of error, Pmin immediate. 20 By inspection, we see that the bound (3.22) and the bound of Theorem 3.1 coincide when the prior pdf is uniform. To show that the bound (3.22) is weaker than the general bound in all other cases, we show that el ('; ' +): (3.25) (p(') + p(' + )) Pmin('; ' +) 2 min (p('); p(' + )) Pmin Rewriting the left hand side, (p(') + p(' + )) Pmin('; ' + ) Z = min (p(')p(xj'); p(' + )p(xj' + )) dx: x (3.26) Now for any positive numbers a, b, c, and d, min(ab; cd) min(a; c) min(b; d): (3.27) Therefore (p(') + p(' + )) Pmin('; ' + ) 1 Z 1 2 min (p('); p(' + )) min 2 p(xj'); 2 p(xj' + ) dx (3.28) x el ('; ' + ):2 = 2 min (p('); p(' + )) Pmin (3.29) Even though the equally likely hypothesis bound in Theorem 3.2 is weaker than the general bound in Theorem 3.1, it can be quite valuable in a practical sense. In the examples, we consider a problem in which there is no closed form expression el ('; ' + ) and lower bounds on the probability of for either Pmin('; ' + ) or Pmin el ('; ' + ) is available, but error must be used. In this case a tight bound for Pmin no comparable expression for Pmin('; ' + ) is known. Probability of error bounds are discussed in more detail in Chapter 5. 21 3.3 Single Test Point Bound When a suitable expression or lower bound for the probability of error is available, the remaining computational diculty in implementing the bounds in Theorems 3.1 and 3.2 is in the integration over . Weinstein [57] proposed a weaker form of the Bellini-Tartara bound (2.8) in which the integral is replaced by a sum of terms evaluated at a set of arbitrarily chosen \test points". This approach can also be applied here. At one extreme, when a large number of test points are used with valley-lling, the sum becomes a method for numerical integration. At the other extreme, the single test point bound (2.9) is easy to evaluate but may not provide a close approximation to the integral. A stronger single test point bound is provided in the following theorem. Theorem 3.3 2 2Z1 (p(') + p(' + )) Pmin('; ' + )d' max 2 ,1 (3.30) where Pmin('; ' + ) is the minimum probability of error in the binary detection problem: H0 : = '; Pr(H0) = p(') +p(p'(') + ) ; x p(xj = ') H1 : = ' + ; Pr(H1) = 1 , Pr(H0); x p(xj = ' + ): (3.31) Proof. We start in the same manner as Theorem 3.1, 2 Z1 = 0 2 Pr jj 2 d Z 1 Z 1 = p('0) Pr > 2 = '0 d'0 0 2 ,1 Z1 + p('1) Pr , 2 = '1 d'1 d ,1 (3.32) (3.33) 22 In Theorem 3.1, we let '0 = ' and '1 = ' +. Here we let '0 = ' and '1 = ' + h, where h is a constant independent of . Expanding = ^(x) , gives: Z 1 Z 1 ^(x) > ' + = ' 2 = p ( ' ) Pr 2 0 2 ,1 ^ +p(' + h) Pr (x) ' + h , 2 = ' + h d' d (3.34) and by change of variables, Z 1 Z 1 2 = 2 p(') Pr ^(x) > ' + = ' 0 ,1 o +p(' + h) Pr ^(x) ' + h , = ' + h d' d: (3.35) The probability terms can only be compared to those associated with a decision rule in a binary detection problem if the estimate is evaluated against the same threshold in both terms. This can be accomplished if is restricted to the interval [0; h], and (3.35) is lower bounded as follows: Zh Z1 2 2 p(') Pr ^(x) > ' + = ' d'd 0 Z h ,1 Z1 +2 0 ,1 p(' + h) Pr ^(x) ' + h , = ' + h d'd (3.36) Zh Z1 = 2 p(') Pr ^(x) > ' + = ' d'd 0 Z h ,1 Z 1 +2 (h , ) p(' + h) Pr ^(x) ' + = ' + h d'd(3.37) ,1 Z h0 Z1 2 0 min(; h , ) ,1 (p(') + p(' + h)) " ^ p(') Pr ( x ) > ' + = ' p(') + p(' + h) ^ # p ( ' + h ) + p(') + p(' + h) Pr (x) ' + = ' + h d'd: (3.38) Now the term in brackets can be interpreted as the probability of error in the suboptimal decision rule in which the threshold varies with but the hypotheses remain xed: Decide H0 : = ' if ^(x) ' + (3.39) Decide H : = ' + h if ^(x) > ' + : 1 23 This can be lower bounded by the minimum probability of error for the detection problem, which is independent of : Zh Z1 2 2 min(; h , )d (p(') + p(' + h)) Pmin('; ' + h)d' (3.40) 0 ,1 Z 2 1 (p(') + p(' + h)) Pmin('; ' + h)d': (3.41) = h2 ,1 The bound is valid for any h and (3.30) is obtained by maximizing over h. 2 Remarks. 1) A weaker single test point bound in terms of equally likely hypotheses can be derived as in Theorem 3.2. 2) When is uniformly distributed on [T0; T1], the bound (3.30) becomes Z T1, 2 el ('; ' + )d'; 2 max P (3.42) min (T1 , T0) T0 which is a factor of two better than Weinstein's single test point bound (2.9) and a factor of four better than the original Ziv-Zakai bound [15]. 3) It will be shown in Chapter 4 that this bound can also be derived as a member of the Weiss-Weinstein family of bounds, thus it provides a link between the extended Ziv-Zakai bounds and the WWB. 3.4 Bound for a Function of a Parameter Consider estimation of a function of the parameter f (). For any estimate f^(x), the following theorem gives a bound on the mean square estimation error. Theorem 3.4 2 ^ E f (x) , f () Z 1 Z 1 0 2 V ,1 (p(') + p(' + g())) Pmin('; ' + g())d' d (3.43) 24 where V fg denotes the valley-lling function, Pmin('; ' + g ()) is the minimum probability of error in the binary detection problem: H0 : = '; Pr(H0) = p(') + pp(('')+ g()) ; x p(xj = ') H1 : = ' + g(); Pr(H1) = 1 , Pr(H0); x p(xj = ' + g()); (3.44) and g() satises f (' + g()) f (') + (3.45) for every ' and . Proof. Proceeding as in Theorem 3.1, 2 Z 1 ^ ^ E f (x) , f () = 2 Pr jf (x) , f ()j 2 d 0 (3.46) and Z1 ^ ^ Pr jf (x) , f ()j f ( x ) > f ( ' ) + = p ( ' ) Pr = ' 0 0 0 d'0 2 2 ,1 Z1 ^ + p('1) Pr f (x) f ('1) , 2 = '1 d'1: (3.47) ,1 Letting '0 = ' and '1 = ' + g(), where g() is some function of , Z1 ^ Pr jf (x) , f ()j = (p(') + p(' + g())) 2 ,1 " p(') ^ (3.48) p(') + p(' + g()) Pr f (x) > f (') + 2 = ' # p ( ' + g ()) + p(') + p(' + g()) Pr f^(x) f (' + g()) , 2 = ' + g() d': The term in square brackets in can be interpreted as the probability of error for a suboptimal decision scheme in the binary detection problem dened in (3.44) if the estimate f^(x) is compared to a common threshold, i.e. if f (' + g()) , 2 = f (') + 2 : (3.49) 25 f(w) f(W)+W f(W+g(W)) Figure 3.1 Choosing g() for function of a parameter. Furthermore, if f (' + g()) , 2 f (') + 2 ; (3.50) then the threshold on H1 is to the right of the threshold on H0, and the decision regions overlap. In this case, the probabilities in (3.48) can be lower bounded by shifting either or both thresholds so that they coincide. Therefore if a g() can be found which satises (3.50) for all ' and , then the term in brackets in (3.48) can be lower bounded by the minimum probability of error for the detection problem. This yields Z1 ^ Pr jf (x) , f ()j 2 (p(') + p(' + g())) Pmin('; ' + g())d'(3.51) ,1 from which (3.43) follows immediately. 2 Remarks. 1) A typical f (') is shown in Figure 3.1. When the curve is shifted up units, g() is the amount the curve must be shifted horizontally to the left so that it remains above the vertically shifted curve. If f (') is monotonically increasing in ', g() is positive, and if f (') is monotonically decreasing in ', g() is negative. 26 2) When f () = k, g() = k and the bound (3.43) reduces to a scaled version of the bound in Theorem 3.1. 3) Bounds in terms of equally likely hypotheses and single test point bounds for a function of a parameter can be derived in a similar manner to the bounds in Theorems 3.2 and 3.3. 3.5 M -Hypothesis Bound The bound of Theorem 3.1 can be further generalized and improved by relating the MSE to the probability of error in an M -ary detection problem as follows. Theorem 3.5 For any M 2, Z1 (3.52) 0( 2 ! ) Z 1 MX,1 1 (M ) p(' + n) Pmin ('; ' + ; : : :; ' + (M , 1)) d' d V M , 1 ,1 2 n=0 (M ) ('; ' + ; : : :; ' + (M , 1)) is where V fg is the valley-lling function and Pmin the minimum probability of error in the hypothesis testing problem with the M hypotheses Hi , i = 0; : : :; M , 1: Hi : = ' + i; Pr(Hi ) = PMp,(1'p(+'i+)n) ; x p(xj = ' + i); n=0 (3.53) which is illustrated in Figure 3.2. This bound coincides with the bound of Theorem 3.1 when M = 2 and is tighter when M > 2. H0 H1 H2 ' '+ ' + 2 HM ,1 ' + (M , 1) Figure 3.2 Scalar parameter M -ary detection problem. ^(x) 27 Proof. We start from (3.3) as in Theorem 3.1: Z1 2 (3.54) = 0 2 Pr jj 2 d: Focusing on Pr jj 2 , we can write it as the sum of M , 1 identical terms: MX ,1 Pr > + Pr , (3.55) Pr jj 2 = M 1, 1 2 2 i=1 MX ,1 Z 1 = M 1, 1 p ( ' ) Pr > = ' i,1 i,1 d'i,1 2 i=1 ,1 Z1 (3.56) + p('i) Pr , 2 = 'i d'i ,1 MX ,1 Z 1 1 ^ = M , 1 i=1 ,1 p('i,1 ) Pr (x) > 'i,1 + 2 = 'i,1 d'i,1 Z1 ^ + p ('i) Pr (x) 'i , 2 = 'i d'i : (3.57) ,1 Now let '0 = ' and 'i = ' + i for i = 1; : : :; M , 1. Taking the summation inside the integral gives: Pr jj 2 = 1 Z 1 MX,1 p(' + (i , 1)) Pr ^(x) > ' + i , 1 = ' + (i , 1) M , 1 ,1 i=1 2 + p(' + i) Pr ^(x) ' + i , 12 = ' + i d': (3.58) Multiplying and dividing by PMn=0,1 p(' + n) and combining terms, we get: ! Z 1 MX,1 1 Pr jj 2 = M , 1 p(' + n) ,1 n=0 2 ^ 4 PM ,1 p(') Pr (x) > ' + = ' (3.59) 2 n=0 p(' + n) 1 MX ,2 p ( ' + i ) ^ Pr (x) ' + i , = ' + i + PM ,1 2 i=1 n=0 p(' + n) 1 ^ + Pr (x) > ' + i + 2 = ' + i 3 p ( ' + ( M , 1)) 1 Pr ^(x) ' + M , = ' + (M , 1) 5 d': + PM ,1 2 n=0 p(' + n) 28 We can interpret the term in square brackets as the probability of error in a suboptimal \nearest-neighbor" decision rule for the detection problem dened in (3.53): Decide H0 if ^(x) ' + 2 ; Decide Hi;i=1;:::;M ,2 if ' + i , 12 < ^(x) ' + i + 12 ; (3.60) 1 Decide HM ,1 if ^(x) > ' + M , 2 : This is illustrated in Figure 3.2. Lower bounding the suboptimal error probability by the minimum probability of error yields: Pr jj 2 ! 1 Z 1 MX,1 p(' + n) P (M ) ('; ' + ; : : :; ' + (M , 1)) d': (3.61) min M ,1 ,1 n=0 Applying valley-lling, and substituting the result into (3.54) gives the bound (3.52). The proof that the bound (3.52) is tighter than the binary bound of Theorem 3.1 is given in Appendix A. 2 Remarks. 1) As before, the limits on the regions of integration have been left open, however, each integral is only evaluated over the region in which the probability of error (M ) ('; ' + ; : : :; ' + (M , 1)) is non-zero. Note that in regions in which one Pmin or more, say L, of the prior densities is equal to zero, the M -ary detection problem reduces to the corresponding (M , L)-ary detection problem. 2) A bound in terms of equally likely hypotheses can be derived similarly to the bound in Theorem 3.2, and single test point bounds can be obtained as in Theorem 3.3. An M -hypothesis bound for a function of a random variable may also be derived. It requires nding M , 1 functions fgi()gMi=1 which satisfy f (' + gi()) f (' + gi,1()) + for every ' and , with g0() = 0. (3.62) 29 3) Multiple hypothesis bounds were also derived in [15] and [59] for scalar, uniformly distributed parameters, and in [58] for non-uniform discrete parameters. In those bounds, the number of hypotheses was determined by the size of the parameter space and varied with . In the bound (3.52), M is xed and may be chosen arbitrarily. Furthermore, when M = 2, the bound of Theorem 3.1 is obtained. The other multiple hypothesis bounds do not reduce to the binary hypothesis bound. 4) In generalizing to M hypotheses, the complexity of the bound increases considerably since expressions or bounds on error probability for the M -ary problem are harder to nd. Thus, the bound may be mainly of theoretical value. 3.6 Arbitrary Distortion Measures Estimation performance can be assessed in terms of the expected value of distortion measures other than squared error. Some examples include the absolute error, D() = jj, higher moments of the error, D() = jjr; r 1, and the uniform distortion measure which assigns a constant cost for all values of error with magnitude larger than some value, i.e., ( 2 D() = k0 jjjj > (3.63) : 2 It was noted in [16] that the Chazan-Zakai-Ziv bound could be generalized for these distortion measures. In fact, the bounds can be extended to a larger class of distortion measures, those which are symmetric and non-decreasing, and composed of piecewise continuously dierentiable segments. First consider the uniform distortion measure. Its expected value is E fD(jj)g = k Pr jj 2 ; (3.64) which can be lower bounded by bounding Pr jj 2 as was done for MSE. 30 Next consider any symmetric, non-decreasing, dierentiable distortion measure D() with D(0) = 0 and derivative D_ (). We can write the average distortion as Z1 E fD()g = E fD(jj)g = D()d Pr (jj ) (3.65) 0 Integrating by parts, 1 Z 1 E fD()g = D() Pr(jj ) , 0 D_ ()(1 , Pr (jj ))d 0 Z1 _ = D() Pr (jj ) d: 0 (3.66) Substituting = 2 yields Z 1 1 _ (3.67) E fD()g = 2 D 2 Pr jj 2 d: 0 Since D() is non-decreasing, D_ 2 is non-negative, and we can lower bound (3.67) by bounding Pr jj 2 . Note that when D() = 2; D_ () = 2, and (3.67) reduces to (3.3) as expected. Since any symmetric, non-decreasing distortion measure composed of piecewise continuously dierentiable segments can be written as the sum of uniform distortion measures and symmetric, non-decreasing, dierentiable distortion measures, we can lower bound the average distortion by lower bounding Pr jj 2 , and all the results of the previous ve sections can be applied. 3.7 Vector Parameters with Arbitrary Prior We now present extension to vector random parameters with arbitrary prior distributions. The parameter of interest is a K -dimensional vector random variable with prior pdf p(). For any estimator ^ (x), the estimation error is = ^ (x) , , and R = E fT g is the error correlation matrix. We are interested in lower bounding aT Ra, for any K -dimensional vector a. The derivation of the vector bound is 31 based on derivation of the scalar bound of Theorem 3.1 and was jointly developed with Steinberg and Ephraim [61, 62]. Theorem 3.6 For any K -dimensional vector a, aT Ra ) Z Z1 ( V max (p(') + p(' + )) Pmin('; ' + )d' d; (3.68) 0 2 : aT = where V fg is the valley-lling function and Pmin('; ' + ) is the minimum probability of error in the binary detection problem: ) H0 : = '; Pr(H0) = p(') +p(p'(' + ) ; x p(xj = ') H1 : = ' + ; Pr(H1) = 1 , Pr(H0); x p(xj = ' + ): (3.69) Proof. Replacing jj with jaT j in (3.3) gives: n T 2o Z 1 T T (3.70) a Ra = E ja j = 0 2 Pr ja j 2 d: Focusing on Pr jaT j 2 , we can write: Pr jaT j = Pr aT > + Pr aT , (3.71) 2 2 2 Z T = p('0) Pr a > 2 = '0 d'0 Z T + p('1) Pr a , 2 = '1 d'1 (3.72) Z = p('0) Pr aT ^ (x) > aT '0 + 2 = '0 d'0 Z T T ^ + p('1) Pr a (x) a '1 , 2 = '1 d'1: (3.73) Let '0 = ' and '1 = ' + . Multiplying and dividing by p(') + p(' + ) gives Z Pr jaT j 2 = (p(') + p(' + )) " p(') T T ^ p(') + p(' + ) Pr a > a ' + 2 = ' # p ( ' + ) T T T + p(') + p(' + ) Pr a ^ a ' + a , 2 = ' + d': (3.74) 32 CC 1 = ' + CC H1 C H0 CC C C C C CC ^ C : CC a CC C C CC aT = aT ' + 0 = ' CC C aT = aT ' + 2 t t t Figure 3.3 Vector parameter binary detection problem. Now consider the detection problem dened in (3.69). If is chosen so that aT = ; (3.75) then the term in square brackets in (3.74) represents the probability of error in the suboptimal decision rule: Decide H0 : = ' if aT ^ (x) > aT ' + 2 (3.76) Decide H1 : = ' + if aT ^ (x) aT ' + 2 : The detection problem and suboptimal decision rule are illustrated in Figure 3.3. The decision regions are separated by the hyperplane aT = aT ' + 2 ; (3.77) which passes through the midpoint of the line connecting ' and ' + and is perpendicular to the a-axis. A decision is made in favor of the hypothesis which is on the same side of the separating hyperplane (3.77) as the estimate, ^ (x). 33 Replacing the suboptimal probability of error by the minimum probability of error gives: Z T Pr ja j 2 (p(') + p(' + )) Pmin('; ' + )d': (3.78) This is valid for any satisfying (3.75), and the tightest bound is obtained by maximizing over within this constraint. Applying valley-lling, we get Pr jaT j 2 ( ) Z V max (p(') + p(' + )) Pmin('; ' + )d' : (3.79) : aT = Substituting (3.79) into (3.70) gives the desired bound. 2 Remarks. 1) To obtain the tightest bound, we must maximize over the vector , subject to the constraint aT = . The vector is not uniquely determined by the constraint (3.75), and the position of the second hypothesis may lie anywhere in the hyperplane dened by: aT = aT ' + : (3.80) This is indicated by the dashed line in Figure 3.3. In order to satisfy the constraint, must be composed of a xed component along the a-axis, kak2 a, and an arbitrary component orthogonal to a. Thus has the form = kak2 a + b; (3.81) where b is an arbitrary vector orthogonal to a, i.e., aT b = 0; and we have K , 1 degrees of freedom in choosing via the vector b. (3.82) 34 In the maximization, we want to choose so that the two hypotheses are as indistinguishable as possible by the optimum detector, and therefore produce the largest probability of error. Choosing b = 0 results in the hypotheses being separated by the smallest Euclidean distance. This is often a good choice, but hypotheses separated by the smallest Euclidean distance do not necessarily have the largest probability of error, and maximizing over can improve the bound. 2) Multiple parameter extensions of the scalar bounds in Theorems 3.1,3.3, and 3.5 may be obtained in a straightforward manner. The vector generalization of the bound in Theorem 3.4 for a function of a parameter becomes quite complicated and was not pursued. Bounds for a wide class of distortion measures may also be derived. The vector extensions take the following forms: i. Equally likely hypotheses aT Ra (3.83) ) Z Z1 ( el ('; ' + )d' d: V max 2 min (p('); p(' + )) Pmin T 0 2 :a = ii. Single Test Point T k2 Z k a (p(') + p(' + )) Pmin('; ' + )d' a max 2 aT R (3.84) In this bound the maximization over subject to the constraint aT = , followed by the maximization over , have been combined. iii. M Hypotheses 8 9 > > > Z1 > < = T a Ra 0 2 V > max f (1; : : :; M ,1)> d > > :a1;T::i: =; iM,1 : ; (3.85) 35 where f (1; : : :; M ,1) = (3.86) ! Z MX ,1 1 (M ) p ( ' + n ) Pmin ('; ' + 1 ; : : :; ' + M ,1 ) d' M , 1 n=0 and 0 = 0. The optimized M -hypothesis bound is tighter than the optimized binary bound for M > 2. The derivation is given in Appendix A. 3) If a bound on the MSE of the ith parameter is desired, then we could also use the scalar bound (3.1) of Theorem 3.1 with the marginal pdfs p(i) and p(xji) obtained by averaging out the unwanted parameters. Alternatively, we could condition the scalar bound on the remaining K , 1 parameters, and take the expected value with respect to those parameters. If we use the vector bound with b = 0, the resulting bound is equivalent to the second alternative when valley-lling is omitted. Assume that the component of interest is 1. If we choose a = [1 0 : : : 0]T and = [ 0 : : : 0]T , then aT Ra = 21 Z1 Z 0 2 (p('1; '2; : : :; 'K ) + p('1 + ; '2; : : : ; 'K )) Pmin('1; '1 + j'2; : : :; 'K )d'd (3.87) Z1 Z = 0 2 p('2; : : :; 'K ) (p('1j'2; : : : ; 'K ) + p('1 + j'2; : : :; 'K )) Pmin('1; '1 + j'2; : : :; 'K )d'd (3.88) Z 1 Z Z = p('2; : : :; 'K ) 0 2 (p('1j'2; : : : ; 'K ) + p('1 + j'2; : : :; 'K )) Pmin('1; '1 + j'2; : : :; 'K )d'1d d'2 d'K (3.89) which is the expected value of the conditional scalar bound. Note that with any other choice of b, the vector bound does not reduce to the scalar bound. 36 4) Other multiple parameter bounds such as the Weiss-Weinstein, Cramer-Rao, and Barankin bounds can be expressed in terms of a matrix, B, which is a lower bound on the correlation matrix R in the sense that for an arbitrary vector a, aT Ra aT Ba. The extended Ziv-Zakai bound does not usually have this form, but conveys the same information. In both cases, bounds on the MSE of the individual parameters and any linear combination of the parameters can be obtained. The matrix form has the advantage that the bound matrix only needs to be calculated once, while the EZZB must be recomputed for each choice of a. But if only a small number of parameters or combinations of parameters is of interest, the advantage of the matrix form is not signicant. Furthermore, while the o-diagonal elements of the correlation matrix may contain important information about the interdependence of the estimation errors, the same is not true of the o-diagonal entries in the bound matrix. They do not provide bounds on cross-correlations, nor does a zero vs. non-zero entry in the bound imply the same property in the correlation matrix. 3.8 Summary In this chapter, the Ziv-Zakai bound was extended to arbitrarily distributed vector parameters. The extended bound relates the MSE to the probability of error in a detection problem in which a decision is made between two values of the parameter. The prior probabilities of the hypotheses are proportional to the prior pdf evaluated at the two parameter values. The derivation of the bound uses the elements of the original proofs in [15]-[17], but is organized dierently. In the original bounds, the derivations begin with the detection problem and evolve into the MSE, while the derivations presented here begin with an expression for the MSE, from which the detection problem is immediately recognized. This formulation allowed for the 37 derivation of additional bounds such as a weaker bound in terms of equally likely hypotheses and a bound which does not require integration over . Further generalizations included a bound for functions of a parameter, a tighter bound in terms of an M -ary detection problem, and bounds for a large class of distortion measures. Chapter 4 Relationship of Weiss-Weinstein Bound to Extended Ziv-Zakai Bound With the extensions presented in Chapter 3, both the EZZB and WWB are applicable to arbitrarily distributed vector parameters, and they are some of the tightest available bounds for analysis of MSE performance for all regions of operation. The WWB and EZZB are derived using dierent techniques and no underlying theoretical relationship between the two bounds has been developed, therefore comparisons between the two bounds have been made only through computational examples. The WWB tends to be tighter in the very low SNR region, while the EZZB tends to be tighter in the asymptotic region and provides a better prediction of the threshold location. Finding a relationship between the bounds at a theoretical level may explain these tendencies and may lead to improved bounds. In this chapter, a connection between the two bounds is presented. A new bound in the Weiss-Weinstein family is derived which is equivalent to the single test point extended Ziv-Zakai bound in Theorem 3.3. Recall that in the Weiss-Weinstein family of bounds, for any function (x; ) such that E f (x; )j xg = 0, the global MSE for a single, arbitrarily distributed random variable can be lower bounded by: 2 2 EE ff 2((xx;;))gg : (4.1) Consider the function ! ! p ( x ; , p ( x ; + (x; ) = min 1; p(x; ) , min 1; p(x; ) : 38 (4.2) 39 It is easily veried that the condition E f (x; )j xg = 0 is satised. ! Z1 p ( x ; + ) E f (x; )j xg = ,1 p(jx) min 1; p(x; ) d ! Z1 p ( x ; , ) , ,1 p(jx) min 1; p(x; ) d ! Z1 p ( + j x ) = p(jx) min 1; p(jx) d ,1 ! Z1 p ( , j x ) , ,1 p(jx) min 1; p(jx) d Z1 = min (p(jx); p( + jx)) d ,1 Z1 , ,1 min(p(jx); p( , jx)) d: Letting = , in the second integral, Z1 min (p(jx); p( + jx)) d E f (x; )j xg = ,1 Z1 , min (p( + jx); p(jx)) d = 0: ,1 Evaluating the numerator of the bound, ! Z Z1 p ( x ; + ) E f (x; )g = x ,1 p(x; ) min 1; p(x; ) ddx ! Z Z1 p ( x ; , ) , x ,1 p(x; ) min 1; p(x; ) ddx Z Z1 = min(p(x; ); p(x; + )) ddx x ,1 Z Z1 , min(p(x; ); p(x; , )) ddx x ,1 (4.3) (4.4) (4.5) (4.6) (4.7) (4.8) Letting = , in the second integral, Z Z1 E f (x; )g = min(p(x; ); p(x; + )) ddx x ,1 Z Z1 , x ,1 ( + ) min (p(x; + ); p(x; )) ddx (4.9) Z Z1 = , min (p(x; + ); p(x; )) ddx (4.10) x ,1 Z1 = , (p() + p( + )) Pmin(; + )d: (4.11) ,1 40 To evaluate the denominator, we have to compute ! ! ( Z Z1 p ( x ; , ) p ( x ; + ) 2 2 2 E f (x; )g = p(x; ) min 1; p(x; ) + min 1; p(x; ) x ,1 ! !) p ( x ; + ) p ( x ; , ) ,2 min 1; p(x; ) min 1; p(x; ) ddx: (4.12) In general, this is a dicult expression to evaluate. However, if an upper bound on the expression can be obtained, the inequality in (4.1) will be maintained. Note that p(x;,) the terms min 1; p(px(;x+) and min 1; p(x;) have values in the interval (0; 1]. ;) For any two numbers a and b, 0 a; b 1, a2 + b2 , 2ab a + b therefore Ef ! p ( x ; + ) x; )g x ,1 p(x; ) min 1; p(x; ) !) p ( x ; , ) + min 1; p(x; ) ddx Z Z1 = 2 min (p(x; ); p(x; + )) ddx Zx1,1 = 2 ,1 (p() + p( + )) Pmin(; + )d: 2( Z Z1 (4.13) ( (4.14) (4.15) (4.16) Substituting (4.11) and (4.16) into (4.1) and optimizing the free parameter yields the bound 2 Z 1 (p() + p( + )) P (; + )d 2 max (4.17) min 2 ,1 which is the same as the single test point EZZB derived in Theorem 3.3. This bound makes a connection between the extended Ziv-Zakai and WeissWeinstein families of bounds. In this form, the bound is weaker than the general EZZB, but can be tighter than the WWB (see Example 4 in Chapter 6). Theoretically, this bound can be extended for multiple test points and to vectors of 41 parameters, however, upper bounding the expression in the denominator makes this dicult. Further investigation of this bound may lead to a better understanding of the EZZB and WWB, and to improved bounds. Chapter 5 Probability of Error Bounds A critical factor in implementing the extended Ziv-Zakai bounds derived in Chapter 3 is in evaluating the probability of error in either a binary or M -ary detection problem. The bounds are useful only if the probability of error is known or can be tightly lower bounded. Probability of error expressions have been derived for many problems, as well as numerous approximations and bounds which vary in complexity and tightness (see e.g. [31], [63]-[72]). In the important class of problems in which the observations are Gaussian, H0 : Pr(H0) = q; p(xjH0) N (m0; K0) H1 : Pr(H1) = 1 , q; p(xjH1) N (m1; K1); (5.1) the probability of error has been well studied [31, 67]. When the covariance matrices are equal, K0 = K1 = K, the probability of error is given by [31, p. 37]: ! ! d d (5.2) Pmin = q d + 2 + (1 , q) , d + 2 where is the threshold in the optimum (log) likelihood ratio test = ln 1 ,q q ; (5.3) d is the normalized distance between the means on the two densities q d = (m1 , m0)T K,1(m0 , m1); (5.4) and Z 1 1 t2 (z) = z p e, 2 dt: 2 42 (5.5) 43 When the hypotheses are equally likely, q = 1 , q = 21 and = 0, and the probability of error has the simple form: ! d el Pmin = 2 : (5.6) When the covariance matrices are unequal, evaluation of the probability of error becomes intractable in all but a few special cases [31, 67], and we must turn to approximations and bounds. Some important quantities which appear frequently in both bounds and approximations are the semi-invariant moment generating function (s), and its rst two derivatives with respect to s, _ (s) and (s). The function (s) is dened as s ln p(xjH1 ) (s) ln E e p(xjH0 ) Z = ln p(xjH0)1,s p(xjH1)sdx: (5.7) When s = 21 , e( 21 ) is equal to the Bhattacharyya distance [63]. For the general Gaussian problem (s) is given by [67]: (s) = , s(1 2, s) (m1 , m0)T [sK0 + (1 , s)K1],1 (m1 , m0) (5.8) + 2s ln jK0j + 1 ,2 s ln jK1j , 12 ln jsK0 + (1 , s)K1j: When the covariance matrices are equal, K0 = K1 = K, (5.8) simplies to: (s) = , s(1 2, s) (m1 , m0)T K,1(m1 , m0): (5.9) When the mean vectors are equal, (5.8) becomes: (s) = 2s ln jK0j + 1 ,2 s ln jK1j , 21 ln jsK0 + (1 , s)K1j: (5.10) A simple upper bound on the minimum probability of error in terms of (s) is the Cherno bound [31, p. 123]: Pmin q ef(s ),s_ (s)g (5.11) 44 where s is the value of s for which _ (s) is equal to the threshold : _ (s) = = ln 1 ,q q : (5.12) It is well known that the exponent in the Cherno bound is asymptotically optimal [73, p. 313]. A simple lower bound is the Bhattacharyya lower bound [63]: Pmin q(1 , q)e2( 21 ) (5.13) which gets its name from the Bhattacharyya distance. This bound does not have the correct exponent and can be weak asymptotically. Shannon, Gallager, and Berlekamp [64] derived the following bounds on the individual error probabilities PF = Pr(errorjH0) and PM = Pr(errorjH1): n p o (s ),s _ (s ),ks (s ) Q0 o (5.14) PF en PM e where p (s )+(1,s )_ (s ),k(1,s ) (s ) Q1 (5.15) 1 Q0 + Q1 1 , k2 (5.16) and k may be chosen arbitrarily to optimize the bound. Shannon et. al. used p k = 2. From these inequalities, we can derive a lower bound on Pmin as follows Pmin = qPF + (1 , q)PM n o n o) (5.17) ( p p qef(s ),s_ (s)g Q0e ,ks (s) + Q1e ,k(1,s ) (s) (5.18) n p o (s ),s _ (s ),k max(s ;1,s ) (s ) qen (5.19) o (Q0 + Q1) p qe (s ),s_ (s),k max(s;1,s) (s) 1 , k12 : (5.20) This bound is similar to the Cherno upper bound and has the asymptotically optimal exponent. 45 The Cherno upper bound, the Bhattacharyya lower bound, and the ShannonGallager-Berlekamp (SGB) lower bounds are applicable to a wide class of binary detection problems. Evaluation of the bounds involves computing (s), _ (s), and (s), and nding the value s such that _ (s) = . In the extended Ziv-Zakai bounds derived in Chapter 3, the threshold varies with both ' and , and solving for s can be a computational burden. It is much easier to use the equally likely hypothesis bounds, where = 0 for every ' and and we can solve for s just once. In the remainder of this section, we will focus on bounds for equally likely hypotheses. In many cases, the optimal value of s when the hypotheses are equally likely is s = 21 , for which the bounds become: el 1 e( 21 ) (5.21) Pmin 2 el 1 e2( 21 ) Pmin (5.22) 4 p el 1 e( 21 ) e, k2 ( 21 ) 1 , 1 : (5.23) Pmin 2 k2 A good approximation to the probability of error was derived by Van Trees and Collins [31, p. 125],[66],[67, p. 40]. When the hypotheses are equally likely and s = 12 , their expression has the form: q el ef( 21 )+ 18 ( 12 )g 1 ( 1 ) : (5.24) Pmin 2 2 In the Gaussian detection problem (5.1) in which the covariance matrices are equal K0 = K1 = K, ( 21 ) = , 18 ( 12 ) = ,(m1 , m0)T K,1(m1 , m0); (5.25) and (5.24) is equal to the exact expression (5.6) for the probability of error. When the covariance matrices are not equal, (5.24) is known only to be a good approximation in general. However, Pierce [65] and Weiss and Weinstein [38] derived 46 the expression (5.24) as a lower bound to the probability of error for some specic problems in which the hypotheses were equally likely and the mean vectors were equal. A key element in both derivations was their expression for (s), which had the form (s) = ,c ln (1 + s(1 , s) ) : (5.26) An important example in which this form is obtained is the single source bearing estimation problem considered in the examples in Chapter 6. Pierce also derived the following upper and lower bounds for this problem: ( 12 ) ( 21 ) 1 )+ 1 ( 1 )g 1 q 1 e e ( f 2 8 2 r e 2 ( 2 ) Pmin r 2 1 + 8 12 2 1 + 18 21 (5.27) These bounds have not been shown to hold in general because (5.8) does not always reduce to (5.26), and extension of the bounds in (5.27) to a wider class of problems is still an open problem. Both the upper and lower bounds have the asymptotically optimal exponent, and are quite tight, as the upper and lower bounds dier at most p by a factor of . Asymptotically, we can expect the Pierce lower bound to be q tighter than the SGB bound. The \constant" term in the Pierce bound has ( 12 ) q in the denominator while the \constant" term in the SGB bound has ( 12 ) in the exponent. In evaluating the extended Ziv-Zakai bounds, we need bounds for the probability of error which are tight not only asymptotically, but for a wide range of operating conditions. The Pierce bound is applicable to the bearing estimation problems considered in the examples in Chapter 6. It will be demonstrated in the examples that the choice of probability of error bound has a signicant impact on the nal MSE bound, and that using the equally likely hypothesis bound with the Pierce bound on the probability of error yields the tightest bound on MSE. Chapter 6 Examples In this chapter, we demonstrate the application and properties of the extended ZivZakai bounds derived in Chapter 3 with some examples, and compare our results with those obtained using the BCRB and the WWB. We begin with some simple linear estimation problems involving a Gaussian parameter in Gaussian noise, for which the optimal estimators and their performance are known. In these examples, the EZZB is easy to calculate and is shown to be tight, i.e. it is equal to the performance of the optimal estimator. We then consider a series of bearing estimation problems, in which the parameter of interest is the direction-of-arrival of a planewave signal observed by an array of sensors. These are highly nonlinear problems for which evaluation of the exact performance is intractable. The EZZB is straightforward to evaluate and is shown to be tighter than the WWB and BCRB in the threshold and asymptotic regions. 6.1 Estimation of a Gaussian Parameter in Gaussian Noise Example 1. Let xi = + ni; i = 1; : : : ; N (6.1) where the ni are independent Gaussian random variables, N (0; n2 ), and is Gaussian N (0; 2), independent of the ni. The a posteriori pdf p(jx) is Gaussian, N (mjx; p2), 47 48 where [31, p. 59]: N ! 2 X 1 N (6.2) mjx = N2 + 2 N xi n i=1 2 2 p2 = N2 +n 2 : (6.3) n The minimum MSE estimator of is mjx and the minimum MSE is p2. In this example, p(jx) is symmetric and unimodal, therefore the scalar bound (3.1) of Theorem 3.1 should equal p2. To evaluate the bound, we use (5.2) for Pmin(; +) with (6.4) q = p() +pp(() + ) = p(p(+)) (6.5) p N: (6.6) n Straightforward evaluation of (3.1) with this expression yields ! Z1 2 2 d = p2: (6.7) 0 p Since p() is symmetric and unimodal, the bound converges to the prior variance 2 2 as N ! 0 and as n2 ! 0, as expected. In this problem, we can also evaluate the M -ary bound (3.52) of Theorem 3.5. Since the M -ary bound is always as least as good as the binary bound, it must equal p2. For any M 2, Pmin(; + ; : : :; + (M , 1)) has the form: Pmin(; + ; : : :; + (M , 1)) = PM ,1 1 n =0 p( + n) ! ! MX ,1 ln ln i d i+1 d p( + (i , 1)) d + 2 + p( + i) , d + 2 (6.8) i=1 where 1)) : i = p(p+((+i ,i) (6.9) d = 49 By inspection, we see that the key inequality (A.20) from Lemma 1, Appendix A, which guarantees that the M -ary bound is at least as tight as the binary bound, is an equality for this example. Therefore, when (6.8) is substituted into (3.52), the expression is equal to the binary bound, p2. In comparison, if we evaluate the WWB with one test point and s = 21 given in (2.24), we obtain: , 422 2 e p !: 2 max (6.10) , 222 2 1,e p The bound is maximized for = 0 and is equal to the minimum MSE p2. The single test point WWB is equal to the BCRB when ! 0 [18], therefore the BCRB is also equal to p2. This can be veried by direct evaluation of (2.13). Example 2. Suppose now that the average absolute error in Example 1 is lower bounded. The optimal estimator is the median of p(jx) [31, p. 57] which is equal to mjx, and s E fj , mjx jg = 2 p: (6.11) The EZZB for this problem is Z 1 1 Z 1 (p() + p( + )) Pmin(; + )d d: E fjjg 2 V 0 ,1 (6.12) Using the same expression for Pmin(; + ) as in Example 1 in (6.12) yields s Z1 ! 2 2 d = 2 p: (6.13) 0 p In fact, since p(jx) is symmetric and unimodal, the EZZB will be tight for any distortion measure which is symmetric, non-decreasing, composed of piecewise continuously dierentiable segments, and satises lim D()p(jx) = 0: !1 (6.14) 50 Under these conditions, the optimal estimator is mjx [31, p. 61], and equality in the bound can be shown similarly to (3.11)-(3.18). The WWB and BCRB are not applicable to this problem. Example 3. Now consider estimation of the vector parameter from the observation vector x: x i = + ni ; i = 1; : : : ; N (6.15) where the ni are independent Gaussian random vectors, N (0; Kn), and is a Gaussian random vector, N (0; K ), independent of the ni. The a posteriori pdf p(jx) is multivariate Gaussian, N (mjx; Kp), where N ! X 1 , 1 (6.16) mjx = N K (N K + Kn ) N xi Kp = K (N K + Kn ),1K n: i=1 (6.17) The minimum MSE estimator is mjx and the minimum MSE correlation matrix is Kp. The probability of error Pmin(; + ) is again given by (5.2) with q = p() +p(p() + ) ) = p(p(+ q T ) d = N K,n 1: Substituting this expression in the vector bound (3.68) yields 8 0 q T ,1 19 Z1 < = @ Kp A; d: aT Ra 0 V : max 2 : aT = (6.18) (6.19) (6.20) (6.21) The function () is a decreasing function of its argument, therefore we want to minimize T K,p 1 subject to the constraint aT = . The optimal is pa = aTK Ka p (6.22) 51 and (6.21) becomes 0 1 aT Ra 0 @ q T A d = aT Kpa; (6.23) 2 a Kpa which is the desired bound. In this example, the extended Ziv-Zakai bound is a matrix bound which is valid for any a. Evaluation of the multiple parameter BCRB (2.29) also yields the bound aT Kpa. Evaluating the the vector WWB (2.30)-(2.32) with multiple test points and then optimizing is a tedious procedure. However, the optimum WWB is guaranteed to be at least as tight as the BCRB [18], therefore we can conclude that the WWB is also equal to aT Kpa. Z1 6.2 Bearing Estimation An important class of parameter estimation problems is signal parameter estimation, where the vector of unknown parameters is embedded in a signal waveform s(t; ). The noisy measurements are a sample function of a vector random process (6.24) x(t) = s(t; ) + n(t) , T2 t T2 where n(t) is the noise waveform. In the simple (i. e. single sensor) time delay estimation problem, the signal waveform is known except for its delay, and the measurements have the form x(t) = s(t , ) + n(t) , T2 t T2 (6.25) Where = is the parameter of interest. In the array processing problem, measurements are taken at several spatially separated sensors. We consider the problem in which a planewave signal impinges on a planar array of M sensors as illustrated in Figure 6.1. The ith sensor is located at 52 Figure 6.1 Geometry of the single source bearing estimation problem using a planar array. " # d x;i di = dy;i (6.26) in the x , y plane. The bearing has two components which can be expressed in either angular or Cartesian (wavenumber) coordinates: " # 2 arcsin qu2 + u2 3 =4 x uy y 5 (6.27) = arctan , ux " # " # u cos sin x u = uy = , sin sin : (6.28) The observed waveform at ith sensor consists of a delayed version of the source signal and additive noise: xi(t) = s(t , i) + ni (t) , T2 t T2 (6.29) where T i = u cdi ; (6.30) 53 and c is the speed of propagation. In this problem, the spatial characteristics of the array allow for the estimation of the bearing under a variety of signal models. Estimation of the source bearing is a fundamental problem in radar, sonar, mobile communications, medical imaging, anti-jam communications, and seismic analysis. Many estimation schemes have been proposed and computation of bounds on achievable performance has attracted much attention (see e.g. [36]-[56]), but the ZZB has not been widely used due to its limitation to a single uniformly distributed parameter. In the following examples, the extended Ziv-Zakai bounds derived in Chapter 3 are used in a variety of bearing estimation problems. We assume that the source and noise waveforms are sample functions of independent, zero-mean, Gaussian random processes with known spectra. The source is narrowband with spectrum ( j! , !0j W2 (6.31) P (!) = 0P otherwise where !W0 1, and the noise is white with power spectral density N20 . We are interested in estimation of u. Under these assumptions, the observations are Gaussian with zero mean and covariance K(!) = P (!)E(!; u)E(!; u)y + N20 I (6.32) where 2 ,j !c uT d1 3 66 e .. (6.33) E(!; u) = 4 . 775 : ! T e,j c u dM The detection problem is a Gaussian problem with equal mean vectors and covariance matrices given by: H0 : = u; K0(!) = P (!)E(!; u)E(!; u)y + N20 I H1 : = u + ; K1(!) = P (!)E(!; u + )E(!; u + )y + N20 I: (6.34) 54 el (u; u + ) can be written in closed In this problem, neither Pmin(u; u + ) nor Pmin form, and one of the bounds from Chapter 5 must be used. These bounds all require the function (s; u; u + ). For reasonably large time-bandwidth product, WT 2 1, (s; u; u + ) has the following form [67, p. 67]: Z1 (s; u; u + ) = T s ln jK0(!)j + (1 , s) ln jK1(!)j 0 , ln jsK0(!) + (1 , s)K1(!)j d! (6.35) 2 : The determinants in (6.35) are straightforward to evaluate (see e.g. [36]-[40],[42]), 8 M > < N20 1 + 2MP j ! , !0j W2 N 0 jK0(!)j = jK1(!)j = >: N0 M (6.36) otherwise 2 and jsK0(!8) + (1 , s)K1(!)j 2MP 2 N0 M > 2 MP 2 < 1 + N0 + s(1 , s) N0 (1 , j()j ) = > N2 M : 20 where () M1 E(!0; u + )yE(!0; u): j! , !0j W2 otherwise Substitution of (6.36) and (6.37) into (6.35) yields Z !0+ W2 (s; ) = ,T ! , W ln [1 + s(1 , s) ()] d! 2 0 2 = , WT 2 ln [1 + s(1 , s) ()] (6.37) (6.38) (6.39) (6.40) where () 1 , j()j2 2MP 2 N0 : 1 + 2NMP0 (6.41) (6.42) 55 Note that (s; ), and hence the probability of error bounds, are only a function of and not a function of u. This simplies the computation of both the EZZB and WWB. The equally likely hypothesis bound of Theorem 3.2 is the most straightforward to implement, and the Pierce bound as well as the SGB and Bhattacharyya bounds can be applied. Solving _ (s) = 0 gives s = 21 and 1 WT 1 ( 2 ; ) = , 2 ln 1 + 4 () (6.43) _ ( 21 ; ) = 0 (6.44) 2 () : (6.45) ( 12 ; ) = WT 2 1 + 14 () Denoting the probability of error bound by Pbel, the nal EZZB has the form ( ) Z2 T el a Ra 0 V max Pb ()A() d: (6.46) : aT = where Z A() = min(p(u); p(u + ))du: (6.47) Evaluation of Pbel () depends on the geometry of the array and A() depends on the a priori distribution of u. For comparison, the BCRB for this problem is: ! !0 2 8 X Z @2 M WT Jij = , @u @u ln p(u)du + 2 c M di;mdj;m : (6.48) U i j m=1 In order to evaluate the BCRB, we must be able to dierentiate the prior distribution twice with respect to the parameters. The WWB is given by (2.30)-(2.32), where ( 21 ; i; j ) has the form: ( 12 ; i; j ) = ln C (i; j ) + ( 12 ; i , j ) Z q p(u + i)p(u + j )du; C (i; j ) = U and the region of integration is U = fu : p(u) > 0g. (6.49) (6.50) 56 Figure 6.2 Uniform linear array. Example 4. Consider estimation of u = sin() using a linear array of M sensors uniformly spaced at 20 on the y-axis as shown in Figure 6.2. We assume u has a h p pi uniform prior distribution on , 23 ; 23 : p p(u) = p1 ; juj 23 : 3 (6.51) In this problem the unknown parameter is a scalar with a uniform prior distribution, therefore the EZZB reduces to the Bellini-Tartara bound. The function A() is given by ! A() = 1 , p ; (6.52) 3 and the bound (6.46) becomes: ( !) Z p3 el 2 V Pb () 1 , p d: (6.53) 0 3 This expression must be evaluated numerically. 57 The bound (6.53) was evaluated using the Pierce (5.27), Bhattacharyya (5.22), and SGB (5.23) lower bounds, as well as the Pierce (5.27) and Cherno (5.21) upper bounds. The results, normalized with respect to the a priori variance, are plotted for an 8-element array and WT 2 = 100 in Figure 6.3. First note that the curves computed using upper bounds on the probability of error do not produce lower bounds on MSE. They are plotted to demonstrate the sensitivity of the EZZB to the probability of error expression. The curves computed using the Pierce upper and lower bounds are less than 1 dB apart, therefore we can conclude that using the Pierce lower bound in place of the actual probability of error does not signicantly impact the EZZB. However, the same is not true when the SGB and Bhattacharyya bounds are used. The SGB and Bhattacharyya bounds produce reasonable MSE bounds which predict a threshold in performance, but they are quite a bit weaker than the Pierce MSE bound. The Bhattacharyya curve is between 2-8 dB weaker than the Pierce curve and the SGB curve is between 1-10 dB weaker than the Pierce curve. The EZZB appears to be quite sensitive to the accuracy in the probability of error expression and requires a bound on the probability of error which is tight over a wide range of operating conditions. In the remaining examples only the Pierce lower bound will be used. Next, the EZZB (6.53) is compared to other bounds. The single test point (STP) EZZB of Theorem 3.3 has the form: ! 2 el 2 max Pb () 1 , p : (6.54) 3 For the WWB, (6.50) reduces to: ! max( j j ; j , j ) i j; j j i j p C (i; j ) = 1 , : (6.55) 3 Evaluation of the WWB involves choosing the test points, and computing and inverting the matrix Q. In the WWB, adding test points always improves the bound, 58 0 −5 Normalized MSE (dB) −10 −15 −20 −25 −30 Chernoff U.B. Pierce U.B. & L.B. −35 Bhattacharyya L.B. SGB L.B. −40 −45 −30 −25 −20 −15 −10 −5 SNR (dB) Figure 6.3 EZZB evaluated with Pierce, SGB, and Bhattacharyya lower bounds, and Pierce and Cherno upper bounds for 8-element linear array and uniform distribution. 59 and we have found that if a dense set of test points is chosen, optimization of their locations is not necessary. There is no expression for the BCRB because the uniform prior distribution is not twice dierentiable. The EZZB, WWB, and STP EZZB are shown in Figure 6.4 for WT 2 = 100. The bounds are normalized with respect to the a priori variance, and the WWB is p computed with 14 test points distributed over [0; 23 ]. At very low SNR, the EZZB and WWB converge to the prior variance, but the STP EZZB converges to a value about 0.5 dB lower. All three bounds indicate a threshold in performance, and the WWB and EZZB converge in the asymptotic region. The STP EZZB is 2 dB lower than the EZZB and WWB asymptotically. The WWB is tighter than both EZZBs for low values of SNR, and the regular EZZB is tighter than the WWB and STP EZZB in the threshold region. The STP EZZB is between 0.5 and 4 dB weaker than the regular EZZB, but is tighter than the WWB in the threshold region. The STP EZZB is the easiest to implement because it does not require numerical integration or matrix inversion. Example 5. In this example, bounds for a non-uniformly distributed scalar parameter are computed. Assume that u has a cosine squared distribution: 2 p(u) = cos 2 u ; juj 1: (6.56) For this problem ) ! sin( ; (6.57) A() = 1 , 2 , 2 and the bound (6.46) becomes: ( Z2 ) !) sin( el 2 0 V Pb () 1 , 2 , 2 d: (6.58) This expression must be evaluated numerically. The STP bound is given by: ) ! sin( 2 el : (6.59) 2 max Pb () 1 , 2 , 2 60 0 −5 Normalized MSE (dB) −10 −15 −20 −25 EZZB −30 EZZB−STP WWB −35 −40 −45 −30 −25 −20 −15 −10 SNR (dB) Figure 6.4 Comparison of normalized bounds for 8-element linear array and uniform distribution. −5 61 The BCRB for this problem exists and has the form: " 2(M 2 , 1) !#,1 WT 2 2 + 2 6 where was dened in (6.42). It can be computed easily. For the WWB, (6.50) reduces to: ! max( j , j i; j ) C (i; j ) = 1 , cos 2 2 i j + 21 sin 2 ji , j j + 21 sin 2 (i + j ) : (6.60) (6.61) Again, evaluation of the WWB involves choosing the test points, and computing and inverting the matrix Q. The four bounds are shown for an 8-element array and WT 2 = 100 in Figure 6.5. The bounds are normalized with respect to the a priori variance, and the WWB is computed with 14 test points distributed over [0; 1]. At very low SNR, the EZZB and WWB converge to the prior variance, while the STP EZZB and BCRB do not attain this value. In the asymptotic region, the EZZB, WWB, and BCRB converge, but the STP EZZB is about 2 dB weaker. The EZZB, WWB, and STP EZZB indicate a threshold in performance, with the WWB being tightest for low values of SNR, and the EZZB being tightest in the threshold region. In this example the STP EZZB is weaker than both the EZZB and WWB for all values of SNR. The computational savings in implementing the STP EZZB are not signicant enough to outweigh its weaker performance, and it will not be computed in the remaining examples. Example 6. In this example, estimation of the bearing angle rather than the wavenumber is considered to demonstrate the application of the EZZB for an arbitrarily distributed scalar parameter and the EZZB for a function of a parameter. Consider the same problem as in Example 4, with the wavenumber uniformly dis- 62 0 −5 Normalized MSE (dB) −10 −15 −20 −25 EZZB EZZB−STP −30 WWB BCRB −35 −40 −45 −30 −25 −20 −15 −10 SNR (dB) Figure 6.5 Comparison of normalized bounds for 8-element linear array and cosine squared distribution. −5 63 h p pi tributed on , 23 ; 23 . The bearing angle is related to the wavenumber u by = arcsin(u) (6.62) h i therefore has a cosine distribution on , 3 ; 3 : p() = p1 cos(); jj 3 : (6.63) 3 To implement the EZZB, the equally likely hypothesis bound of Theorem 3.2 is evaluated with the Pierce bound on the probability of error. The Pierce bound depends on the dierence in wavenumber, which is a function of both the dierence in the corresponding bearings, and the bearings themselves. The bound does not have the same form as in wavenumber estimation (6.46), but is given by: (Z , Z 23 V 3 p1 min(cos(); cos( + )) 2 0 ,3 3 o Pbel (sin( + ) , sin())d d: (Z , Z 23 3 2 2 p cos + 2 = V 0 0 3 el Pb sin + 2 , sin , 2 d d: (6.64) Evaluation of this bound requires numerical integration of a double integral. Considering to be a function of u, we can bound the MSE in estimating using the bound for a function of a parameter in Theorem 3.4. To evaluate (3.43), we must nd a function g() such that arcsin(u + g()) arcsin(u) + (6.65) h p pi h i for all u 2 , 23 ; 23 and all 2 0; 23 . Using straightforward manipulations, (6.65) is equivalent to p g() 2 sin 2 ,u sin 2 + 1 , u2 cos 2 (6.66) 64 The term in brackets is never larger than 1, therefore g() = 2 sin 2 satises (6.65) for all u and , and the bound is given by: 19 8 0 Z 23 < = 2 sin V :Pbel 2 sin 2 @1 , p 2 A; d: 2 (6.67) 0 3 This expression must also be evaluated numerically, but there is only one integral. The simplication results from the probability of error being a function only of the dierence in wavenumber, which allows for direct evaluation of the integral over u. The WWB for bearing angle estimation is most easily computed using the WWB for a function of a parameter [18]. It has the form of (2.30): 2 WT Q,1W with i 1 r 2 e( 21 ;i;j ) , e( 21 ;i;,j ) = ( 1 ;i ;0) e 2 q(i)e( 21 ;j ;0)q(j ) WT = Qij where h (6.68) (6.69) (6.70) v ! ! ! u u 2 4 2 t q() = 1 , p arcsin 1 , p , 2 + p 1 , p (6.71) 3 3 3 3 and ( 21 ; i; j ) is given by: ! max( j j ; j , j ) i j; j j i j 1 p ( 2 ; i; j ) = ln 1 , + ( 12 ; i , j ): (6.72) 3 Again, evaluation of the WWB involves choosing the test points, and computing and inverting the matrix Q. The three bounds are shown for an 8-element array and WT 2 = 100 in Figure 6.6. The bounds are normalized with respect to the a priori variance, and the WWB is p computed with 14 test points distributed over [0; 23 ]. At very low SNR, all three 65 0 −5 Normalized MSE (dB) −10 −15 −20 −25 EZZB EZZB−FCN −30 WWB −35 −40 −30 −25 −20 −15 −10 −5 SNR (dB) Figure 6.6 Comparison of normalized bounds for bearing estimation with 8-element linear array and cosine distribution. 66 bounds converge to the prior variance, with the WWB being tighter in the low SNR region, and EZZB being tighter in the threshold region. The EZZB for a function of a parameter is nearly identical to the regular EZZB for low and moderate SNR and begins to diverge in the threshold region. In the asymptotic region, the EZZB and WWB converge, but the EZZB for a function of a parameter is about 2 dB weaker. The regular EZZB is considerably more dicult to compute than either the WWB or EZZB for a function of a parameter, due to the numerical integration of a double integral. The EZZB for a function of a parameter is simpler and yields nearly the same bound except in the asymptotic region where it is a few dB weaker than the other bounds. Example 7. In the next two examples, the bounds for vector parameters are evaluated. Estimation of the two-dimensional wavenumber using planar arrays is considered. It is assumed that the wavenumber is uniformly distributed on the unit disc: (1 q2 2 ux + uy 1 (6.73) p(u) = 0 otherwise. Under this assumption, A() is ,1 times the area of intersection of two unit circles centered at the origin and : ! !! k k k k A() = 2 arccos 2 , sin 2 arccos 2 : (6.74) In the WWB, C (i; j ) is ,1 the area of intersection of three unit circles centered at the origin, i , and j . The formula is cumbersome and is omitted for brevity. The uniform prior is not twice dierentiable and there is no expression for the BCRB. In this example, we consider a square planar array of M = 16 elements with sensors evenly spaced 20 apart on a side, as shown in Figure 6.7. The beampattern of the array is plotted in Figure 6.8. For this array, there are signicant sidelobes along the diagonal axes. The points along these axes are points of ambiguity for the 67 y t t d = 20 , , , t t t t t t t t x t t t t t t Figure 6.7 Square array. estimation problem and the detection problem, and estimation and detection errors will tend to occur more often in these directions. The square array is symmetric in the x and y directions, therefore the MSE in estimating ux and uy will be the same. To evaluate the MSE in estimating ux, we choose " # a = 10 : (6.75) Evaluation of the EZZB requires maximization with respect to of the function f () = Pbel ()A() (6.76) which is plotted in Figure 6.9. This function is the product of the bound on probability of error for two hypotheses separated by , and the function A(), which is decreasing in kk. Because of the geometry of the array, f ( ) is large not only for small values of , but also for the ambiguity points which lie on the axes 2 = 1 and 2 = ,1. The maximization of f () with respect to is performed for each value 68 1 0.8 0.6 0.4 0.2 0 2 1 0 −1 −2 −2 −1.5 −1 −0.5 0 0.5 1 Figure 6.8 Beampattern of 16-element square array. 1.5 2 69 0 -20 -40 -60 2 1 2 0 1.5 1 -1 0.5 -2 0 Figure 6.9 The function f () for 16-element square array for SNR=-14 dB. 70 of , subject to the constraint aT = . In this problem, the constraint means that must have the form " # = (6.77) b() ; where b() can be chosen arbitrarily. When b = 0, f () is evaluated along the line 2 = 0, but the bound can be improved by choosing b() so that lies on one of the diagonal axes. In Figure 6.10, the results of maximizing f () over , valley-lling f () with b() = 0, and valley-lling the maximized function are plotted. The maximized and valley-lled function is signicantly larger than the other functions, and when it is integrated, a tighter MSE bound which captures the eects of the ambiguities is produced. The EZZB and WWB, normalized with respect to the prior variance of ux, are plotted in Figure 6.11. Several versions of the EZZB are plotted to illustrate the impact of valley-lling and maximizing over . All of the bounds approach the prior variance for small SNR, and the WWB is tighter than the EZZB in this region. In the threshold and high SNR regions, the use of valley-lling alone and maximization over without valley-lling produces bounds which are slightly better than the WWB. Combining maximization with valley-lling yields a bound which is signicantly better than the other bounds in this region. In summary, the geometry of the square array gives rise to larger estimation errors for some directions of arrival. The EZZB reects the presence of these large errors for high SNR, while the WWB does not. 71 0 max b,VF -10 max b b=0,VF -20 b=0 -30 -40 -50 -60 -70 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 Figure 6.10 Impact of maximization and valley-lling for 16-element square array for SNR=-14 dB. 2 72 0 −5 Normalized MSE (dB) −10 −15 −20 EZZB−max b, VF −25 EZZB−max b EZZB−b=0,VF −30 WWB −35 −40 −30 −25 −20 −15 −10 SNR (dB) Figure 6.11 Comparison of normalized vector bounds for 16-element square array and uniform distribution. −5 73 Example 8. Next consider the same problem as in Example 7 with a circular planar array of M = 16 elements with sensors evenly spaced 20 apart. The beampattern of this array is plotted in Figure 6.12. For this array, the beampattern is nearly the same along all axes. Since the circular array is symmetric in the x and y directions, the MSE in estimating ux and uy will be the same, and we will again evaluate the MSE in estimating ux. The EZZB and WWB have the same forms as in Example 7, and only dier in the ( 21 ; i ; j ) terms, which depend on the geometry of the array. The function f ( ) = Pbel ()A(), plotted in Figure 6.13, is smoother for the circular array and there are no signicant ambiguities as with the square array. The normalized EZZB and WWB are plotted vs. SNR for the circular array and WT = 100 in Figure 6.14. Once again, both the EZZB and WWB approach the 2 prior variance for small SNR, with the WWB being tighter than the EZZB in this region. The EZZB is tighter in the threshold region, and both bounds converge for high SNR. For this array, valley-lling and maximizing over do not improve the bound signicantly. The square and circular arrays have nearly the same performance in the low SNR and threshold regions, but the circular array performs better than the square array for large SNR. The square array does not possess as much symmetry as the circular array, resulting in larger estimation errors for some directions of arrival. 74 1 0.8 0.6 0.4 0.2 0 2 1 0 −1 −2 −2 −1.5 −1 −0.5 0 0.5 1 Figure 6.12 Beampattern of 16-element circular array. 1.5 2 75 0 -20 -40 -60 2 1 2 0 1.5 1 -1 0.5 -2 0 Figure 6.13 The function f () for 16-element circular array for SNR=-14 dB. 76 0 −5 Normalized MSE (dB) −10 −15 −20 EZZB−max b, VF −25 EZZB−max b EZZB−b=0,VF −30 WWB −35 −40 −30 −25 −20 −15 −10 SNR (dB) Figure 6.14 Comparison of normalized vector bounds for 16-element circular array and uniform distribution. −5 77 6.3 Summary The examples in this section demonstrated the application of the extended Ziv-Zakai bounds derived in Chapter 3. In the linear Gaussian parameter in Gaussian noise problems, the EZZB, WWB, and BCRB were equal to the minimum MSE, and the EZZB was also tight for other distortion measures. In the nonlinear bearing estimation problems, the equally likely hypothesis bound was used with the Pierce lower bound on the probability of error. In all of the examples, the EZZB was tighter than the WWB in the threshold and asymptotic regions. When other probability of error bounds were substituted in the EZZB, signicantly weaker bounds resulted, indicating that the EZZB is quite sensitive to the accuracy of the probability of error expression. The single test point EZZB, which was shown in Chapter 4 to also be a member of the Weiss-Weinstein family of bounds, was computed in two examples. It was weaker than the regular EZZB, but could be tighter than the WWB. Although the computational savings in implementing the STP EZZB are not signicant enough to outweigh its weaker performance, this bound provides a theoretical link between the WWB and the EZZB. The EZZB for a function of a parameter was computed for estimation of bearing angle rather than wavenumber. It was signicantly easier to implement that the regular EZZB, with a small loss in performance in the asymptotic region. Finally, the vector EZZB was computed for estimation of two dimensional wavenumber using square and circular arrays. As in the other examples, it was tighter in the threshold and asymptotic regions than the WWB. Chapter 7 Concluding Remarks The Ziv-Zakai lower bound on the MSE in estimating a random parameter from noisy observations has been extended to vectors of parameters with arbitrary prior distributions. As in the original bound, the extended bound relates the MSE to the probability of error in a binary detection problem. In the new bound, the hypotheses in the detection problem are not required to to be equally likely, and are related to the prior distribution of the parameter. The derivation of the bound was made possible by developing a more concise proof of the original bound. The new derivation allowed for the development of additional bounds such as a weaker bound in terms of equally likely hypotheses and a single test point bound which does not require integration. Further generalizations of the bound included a bound for functions of a parameter, a tighter bound in terms of an M -ary detection problem, and bounds for a large class of distortion measures. A new bound in the Weiss-Weinstein family was presented which is equivalent to the single test point extended Ziv-Zakai bound. This bound makes a theoretical connection between the extended Ziv-Zakai and Weiss-Weinstein families of bounds. Although weaker than the EZZB, it can be tighter than the WWB, and further investigation of this bound may lead to a better understanding of the EZZB and WWB, and to improved bounds. The new Ziv-Zakai bounds, as well as the Weiss-Weinstein and Bayesian CramerRao bounds, were applied to a series of bearing estimation problems, in which the 78 79 parameters of interest are the directions-of-arrival of signals received by an array of sensors. These are highly nonlinear problems for which evaluation of the exact performance is intractable. The EZZB was straightforward to evaluate and was shown to be tighter than the WWB and BCRB in the threshold and asymptotic regions in all of the examples. The EZZB was also analytically evaluated for some simple linear estimation problems involving a Gaussian parameter in Gaussian noise, for which the optimal estimators and their performance are known. In these examples, the EZZB was shown to be tight, i.e. it is equal to the performance of the optimal estimator. There are several topics related to the EZZB which require further research. In this dissertation, the EZZB was proven to equal the minimum MSE for all regions of operation when the posterior density is symmetric and unimodal, and to equal the prior variance for low SNR and/or observation times when the prior pdf is symmetric and unimodal. We were not able to determine conditions under which the bound is tight for large SNR and/or observation times, since normally this would be done by using asymptotic properties of the probability of error. Here, however, the hypotheses are not xed and we need a uniformly good asymptotic bound on the probability of error. In previous computational comparisons between the WWB and the ZZB, as well as in the examples considered here, the WWB tends to be tighter in the very low SNR region, while the EZZB tends to be tighter in the asymptotic region and provides a better prediction of the threshold location. In this dissertation a theoretical relationship between the EZZB and WWB was developed. Further exploration of this relationship may explain these tendencies and lead to improved bounds. The extended Ziv-Zakai bounds derived in this dissertation are useful when the probability of error in the detection problem is known or can be tightly lower 80 bounded for a wide range of operating conditions. The Pierce lower bound used in the bearing estimation examples is a good approximation to the probability of error in many problems [31, 67], but is known to be a lower bound only in a few special cases [65]-[67],[38]. Generalization of the Pierce bound to a wider class of detection problems, or development of other tight and easily computable lower bounds on the probability of error are open problems. Finally, an aspect of the bounds not treated here is the formulation of the extended Ziv-Zakai bounds within the generalized rate distortion theory, as was done by Zakai and Ziv [58]. This theory provides a general framework for generating lower bounds by specifying a convex function and a probability measure in the Data Processing Theorem [74]. This approach may be explored for improving the EZZB, for nding a relationship between the EZZB and WWB, and also for generating tight lower bounds on the probability of error. References 81 82 References [1] S. Haykin, ed., Array Signal Processing, Englewood Clis, NJ: Prentice-Hall, Inc., 1985. [2] B. Friedlander and B. Porat, \Performance Analysis of a Null-Steering Algorithm Based on Direction-of-Arrival Estimation", IEEE Trans. ASSP, vol. ASSP-37, pp. 461-466, April 1989. [3] K. L. Bell, J. Capetanakis, and J. Bugler, \Adaptive Nulling for Multiple Desired Signals Based on Signal Waveform Estimation", in Proceedings of IEEE Military Comm. Conf., (San Diego, CA), 1992. [4] L. P. Seidman, \Performance Limitations and Error Calculations for Parameter Estimation", Proceedings of the IEEE, vol. 58, pp. 644-652, May 1970. [5] Special Issue on Time Delay Estimation, IEEE Trans. ASSP, vol. ASSP-29, June 1981. [6] V. A. Kotel'nikov, The Theory of Optimum Noise Immunity. New York, NY: McGraw-Hill, 1959. [7] P. Stoica, \List of References on Spectral Line Analysis", Signal Processing, vol. 31, pp. 329-340, April 1993. [8] R. A. Fisher, \On the Mathematical Foundations of Theoretical Statistics", Phil. Trans. Royal Soc., vol. 222, p. 309, 1922. [9] D. Dugue, \Application des Proprietes de la Limite au Sens du Calcul des Probabilities a L'etude des Diverses Questions D'estimation", Ecol. Poly., vol. 3, pp. 305-372, 1937. [10] M. Frechet, \Sur l'extension de Certaines Evaluations Statistiques au cas de Petit Enchantillons", Rev. Inst. Int. Statist., vol. 11, pp. 182-205, 1943. [11] G. Darmois, \Sur les Lois Limites de la Dispersion de Certaines Estimations", Rev. Inst. Int. Statist., vol. 13, pp. 9-15, 1945. [12] C. R. Rao, \Information and Accuracy Attainable in the Estimation of Statistical Parameters", Bull. Calcutta Math. Soc., vol. 37, pp. 81-91, 1945. [13] H. Cramer, Mathematical Methods of Statistics. Princeton, NJ: Princeton University Press, 1946. 83 [14] E. W. Barankin, \Locally Best Unbiased Estimates", Ann. Math. Stat., vol. 20, pp. 477-501, 1949. [15] J. Ziv and M. Zakai, \Some Lower Bounds on Signal Parameter Estimation", IEEE Trans. Information Theory, vol. IT-15, pp. 386-391, May 1969. [16] D. Chazan, M. Zakai, and J. Ziv, \Improved Lower Bounds on Signal Parameter Estimation", IEEE Trans. Information Theory, vol. IT-21, pp. 90-93, January 1975. [17] S. Bellini and G. Tartara, \Bounds on Error in Signal Parameter Estimation", IEEE Trans. Comm., vol. 22, pp. 340-342, March 1974. [18] A. J. Weiss, Fundamental Bounds in Parameter Estimation. PhD thesis, TelAviv University, Tel-Aviv, Israel, 1985. [19] A. J. Weiss and E. Weinstein, \A Lower Bound on the Mean-Square Error in Random Parameter Estimation", IEEE Trans. Information Theory, vol. IT-31, pp. 680-682, September 1985. [20] E. Weinstein and A. J. Weiss, \Lower Bounds on the Mean Square Estimation Error", Proceedings of the IEEE, vol. 73, pp. 1433-1434, September 1985. [21] A. J. Weiss and E. Weinstein, \Lower Bounds in Parameter Estimation - Summary of Results", in Proceedings of IEEE ICASSP, Tokyo, Japan, 1986. [22] E. Weinstein and A. J. Weiss, \A General Class of Lower Bounds in Parameter Estimation", IEEE Trans. Information Theory, vol. 34, pp. 338-342, March 1988. [23] E. L. Lehmann, Theory of Point Estimation. New York, NY: John Wiley & Sons, 1983. [24] I. A. Ibragimov and R. Z. Hasminskii, Statistical Estimation - Asymptotic Theory., New York, NY: Springer-Verlag, 1981. [25] A. Bhattacharyya, \On Some Analogues of the Amount of Information and their Use in Statistical Estimation", Sankhya Indian J. of Stat., vol. 8, pp. 1-14, 201-218, 315-328, 1946. [26] J. M. Hammersley, \On Estimating Restricted Parameters", J. Royal Stat. Soc. (B), vol. 12, pp. 192-240, 1950. [27] D. G. Chapman and H. Robbins, \Minimum Variance Estimation without Regularity Assumptions", Ann. Math. Stat., vol. 22, pp. 581-586, 1951. [28] D. A. S. Fraser and I. Guttman, \Bhattacharyya Bounds without Regularity Assumptions", Ann. Math. Stat., vol. 23, pp. 629-632, 1952. 84 [29] J. Kiefer, \On Minimum Variance Estimators", Ann. Math. Stat., vol. 23, pp. 627-629, 1952. [30] J. S. Abel, \A Bound on Mean-Square-Estimate Error", IEEE Trans. Information Theory, vol. IT-39, pp. 1675-1680, September 1993. [31] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I. New York, NY: John Wiley & Sons, 1968. [32] B. Z. Bobrovsky and M. Zakai, \A Lower Bound on the Estimation Error for Certain Diusion Processes", IEEE Trans. Information Theory, vol. IT-22, pp. 45-52, January 1976. [33] B. Z. Bobrovsky, E. Mayer-Wolf, and M. Zakai, \Some Classes of Global Cramer-Rao Bounds", Ann. Statistics, vol. 15, pp. 1421-1438, 1987. [34] P. N. Misra and H. W. Sorenson, \Parameter Estimation in Poisson Processes", IEEE Trans. Information Theory, vol. IT-21, pp. 87-90, January 1975. [35] S. C. White and N. C. Beaulieu, \On the Application of the Cramer-Rao and Detection Theory Bounds to Mean Square Error of Symbol Timing Recovery", IEEE Trans. Comm., vol. 40, pp. 1635-1643, October 1992. [36] A. J. Weiss and E. Weinstein, \Composite Bound on the Attainable MeanSquare Error in Passive Time Delay Estimation from Ambiguity Prone Signals", IEEE Trans. Information Theory, vol. IT-28, pp. 977-979, November 1982. [37] A. J. Weiss and E. Weinstein, \Fundamental Limitations in Passive Time Delay Estimation - Part I: Narrow-Band Systems", IEEE Trans. ASSP, vol. ASSP-31, pp. 472-486, April 1983. [38] E. Weinstein and A. J. Weiss, \Fundamental Limitations in Passive Time Delay Estimation - Part II: Wide-Band Systems", IEEE Trans. ASSP, vol. ASSP-32, pp. 1064-1078, October 1984. [39] A. J. Weiss, \Composite Bound on Arrival Time Estimation Errors", IEEE Trans. Aero. Elec. Syst., vol. AES-22, pp. 751-756, November 1986. [40] A. J. Weiss, \Bounds on Time-Delay Estimation for Monochromatic Signals", IEEE Trans. Aero. Elec. Syst., vol. AES-23, pp. 798-808, November 1987. [41] T. J. Nohara and S. Haykin, \Application of the Weiss-Weinstein Bound to a Two-Dimensional Antenna Array", IEEE Trans. ASSP, vol. ASSP-36, pp. 1533-1534, September 1988. [42] D. F. DeLong, \Use of the Weiss-Weinstein Bound to Compare the DirectionFinding Performance of Sparse Arrays", MIT Lincoln Laboratory, Lexington, MA, Tech. Rep. 982, August 1993. 85 [43] K. L. Bell, Y. Ephraim, and H. L. Van Trees, \Comparison of the ChazanZakai-Ziv, Weiss-Weinstein, and Cramer-Rao Bounds for Bearing Estimation", in Proceedings of Conf. on Info. Sciences and Syst., (Baltimore, MD), 1993. [44] A. Zeira and P. M. Schultheiss, \Realizable Lower Bounds for Time Delay Estimation", IEEE Trans. Sig. Proc., vol. SP-41, pp. 3102-3113, November 1993. [45] A. B. Baggeroer, \Barankin Bounds on the Variance of Estimates of the Parameters of a Gaussian Random Process", MIT Res. Lab. Electron., Quart. Prog. Rep. 92, January 1969. [46] V. H. MacDonald and P. M. Schultheiss, \Optimum Passive Bearing Estimation in a Spatially Incoherent Environment", J. Acoust. Soc. Amer., vol. 46, pp. 3743, July 1969. [47] W. J. Bangs and P. M. Schultheiss, \Space-Time Processing for Optimal Parameter Estimation", Proc. Signal Processing NATO Adv. Study Inst., New York, NY: Academic Press, 1973. [48] S. K. Chow and P. M. Schultheiss, \Delay Estimation Using Narrow-Band Processes", IEEE Trans. ASSP, vol. ASSP-29, pp. 478-484, June 1981. [49] P. Stoica and A. Nehorai, \MUSIC, Maximum Likelihood and the Cramer-Rao Bound", IEEE Trans. ASSP, vol. ASSP-37, pp. 720-741, May 1989. [50] H. Clergeot, S. Tressens, and A. Ouamri, \Performance of High Resolution Frequencies Estimation Methods Compared to the Cramer-Rao Bounds", IEEE Trans. ASSP, vol. ASSP-37, pp. 1703-1720, November 1989. [51] P. Stoica and A. Nehorai, \Performance Study of Conditional and Unconditional Direction-of-Arrival Estimation", IEEE Trans. ASSP, vol. ASSP-38, pp. 1783-1795, October 1990. [52] P. Stoica and A. Nehorai, \MUSIC, Maximum Likelihood and the Cramer-Rao Bound: Further Results and Comparisons", IEEE Trans. ASSP, vol. ASSP-38, pp. 2140-2150, December 1990. [53] B. Ottersten, M. Viberg, and T. Kailath, \Analysis of Subspace Fitting and ML Techniques for Parameter Estimation from Sensor Array Data", IEEE Trans. Sig. Proc., vol. SP-40, pp. 590-599, March 1992. [54] H. Messer, \Source Localization Performance and the Array Beampattern", Signal Processing, vol. 28, pp. 163-181, August 1992. [55] A. Weiss and B. Friedlander, \On the Cramer-Rao Bound for Direction Finding of Correlated Signals", IEEE Trans. Sig. Proc., vol. SP-41, pp. 495-499, January 1993. 86 [56] A. Zeira and P. M. Schultheiss, \Realizable Lower Bounds for Time Delay Estimation: Part II: Threshold Phenomena", IEEE Trans. Sig. Proc., vol. 42, pp. 1001-1007, May 1994. [57] E. Weinstein, \Relations Between Belini-Tartara, Chazan-Zakai-Ziv, and WaxZiv Lower Bounds", IEEE Trans. Information Theory, vol. IT-34, pp. 342-343, March 1988. [58] M. Zakai and J. Ziv, \A Generalization of the Rate-Distortion Theory and Application", in Information Theory New Trends and Open Problems, edited by G. Longo, Springer-Verlag, 1975, pp. 87-123. [59] L. D. Brown and R. C. Liu, \Bounds on the Bayes and Minimax Risk for Signal Parameter Estimation", IEEE Trans. Information Theory, vol. IT-39, pp. 1386-1394, July 1993. [60] E. Cinlar, Introduction to Stochastic Processes. Englewood Clis, NJ: PrenticeHall, Inc., 1975. [61] K. L. Bell, Y. Ephraim, Y. Steinberg, and H. L. Van Trees, \Improved BelliniTartara Lower Bound for Parameter Estimation", in Proceedings of Intl. Symp. on Info. Theory, (Trondheim, Norway), June 1994. [62] K. L. Bell, Y. Steinberg, Y. Ephraim, and H. L. Van Trees, \Improved ZivZakai Lower Bound for Vector Parameter Estimation", in Proceedings of Info. Theory /Stat. Workshop, (Alexandria, Virginia), October 1994. [63] T. Kailath, \The Divergence and Bhattacharyya Distance Measures in Signal Selection", IEEE Trans. Comm. Tech., vol. COM-15, pp. 52-60, February 1967. [64] C. E. Shannon, R. G. Gallager, and E. R. Berlekamp, \Lower Bounds to Error Probability for Coding on Discrete Memoryless Channels. I.", Information and Control, vol. 10, pp. 65-103, February 1967. [65] J. N. Pierce, \Approximate Error Probabilities for Optimal Diversity Combining", IEEE Trans. Comm. Syst., vol. CS-11, pp. 352-354, September 1963. [66] L. D. Collins, Asymptotic Approximation to the Error Probability for Detecting Gaussian Signals. Sc.D thesis, Mass. Institute of Technology, Cambridge, MA, June 1968. [67] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part III. New York, NY: John Wiley & Sons, 1971. [68] M. Ben-Bassat and J. Raviv, \Renyi's Entropy and the Probability of Error", IEEE Trans. Information Theory, vol. IT-24, pp. 324-331, May 1978. [69] D. E. Boekee and J. C. A. van der Lubbe, \Some Aspects of Error Bounds in Feature Selection", Pattern Recognition, vol. 11, pp. 353-360, 1979. 87 [70] D. E. Boekee and J. C. Ruitenbeek, \A Class of Lower Bounds on the Bayesian Probability of Error", Information Sciences, vol. 25, pp. 21-35, 1981. [71] M. Basseville, \Distance Measures for Signal Processing and Pattern Recognition", Signal Processing, vol. 18, pp. 349-369, December 1989. [72] M. Feder and N. Merhav, \Relations Between Entropy and Error Probability", IEEE Trans. Information Theory, vol. IT-40, pp. 259-266, January 1994. [73] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York, NY: John Wiley & Sons, 1991. [74] J. Ziv and M. Zakai, \On Functionals Satisfying a Data-Processing Theorem", IEEE Trans. Information Theory, vol. IT-19, pp. 275-283, May 1973. Appendices 88 Appendix A Proof of Vector M -Hypothesis Bound The following is a proof of the vector M -hypothesis bound given in (3.85). It includes a proof that the vector M -hypothesis bound is tighter than the vector binary hypothesis bound, which reduces easily to the scalar case and completes the proof of Theorem 3.5. Proof. We start from (3.70): Z1 (A.1) aT Ra = 0 2 Pr jaT j 2 d: Focusing on Pr jaT j 2 , we can write it as the sum of M , 1 identical terms: MX ,1 Pr aT > 2 + Pr aT , 2 (A.2) Pr jaT j 2 = M 1, 1 i=1 MX ,1 Z = 1 p ('i,1) Pr aT > = 'i,1 d'i,1 M , 1 i=1 2 Z T + p('i ) Pr a , 2 = 'i d'i (A.3) MX ,1 Z 1 T T ^ = M ,1 p('i,1) Pr a (x) > a 'i,1 + 2 = 'i,1 d'i,1 i=1 Z T T ^ + p('i ) Pr a (x) a 'i , 2 = 'i d'i : (A.4) Now let '0 = ' and 'i = ' + i for i = 1; : : : ; M , 1. Dening 0 0 and taking the summation inside the integral gives: Pr jaT j 2 = 1 Z MX,1 p(' + ) Pr aT ^ (x) > aT ' + aT + = ' + i,1 i,1 i,1 M ,1 2 i=1 89 90 + p(' + i) Pr i , = ' + i d': (A.5) 2 Multiplying and dividing by PMn=0,1 p(' + n) and combining terms, we get: ! Z MX,1 1 T p(' + n ) (A.6) Pr ja j 2 = M , 1 n =0 2 T T ^ 4 PM ,1p(') Pr a (x) > a ' + = ' 2 n=0 p(' + n ) MX ,2 p ( ' + i) T T T ^ + PM ,1 p(' + ) Pr a (x) a ' + a i , 2 = ' + i i=1 n n=0 T T T ^ + Pr a (x) > a ' + a i + 2 = ' + i 3 p ( ' + M ,1 ) Pr aT ^ (x) aT ' + aT M ,1 , = ' + M ,1 5 d': + PM ,1 2 n=0 p(' + n ) We can interpret the term in square brackets as the probability of error in a suboptimal decision rule for the detection problem with M hypotheses Hi; i = 1; : : : ; M , 1: aT ^ (x) aT ' + aT Hi : = ' + i; Pr(Hi) = PMp,(1'p(+'+i) ) ; x p(xj = ' + i) n n=0 if 0 = 0 and the i are chosen so that aT i = i ; Thus i has the form i i = 1; : : : ; M , 1: = i kak2 a + bi; (A.7) (A.8) (A.9) where bi is an arbitrary vector orthogonal to a, i.e., aT bi = 0: (A.10) On each of the hypotheses Hi; i = 1; : : : ; M , 1, the parameter lies on one of the parallel hyperplanes perpendicular to the a-axis: aT = aT ' + i (A.11) 91 C CC H CC H3 C 2 CC C C C C C H1 C 2 C C CC CC CC CC CC 1 H0 C C C C C CC CC ^ C CCC: CC C CTaT = aT ' + 52 a C C C T CC CC aT = aCTa' +=3a ' + 2 CC 2 0 = ' C CC aT = aT ' + C aT = aT ' + 2 q q q t t Figure A.1 Vector parameter M -ary detection problem. This is illustrated in Figure A.1. The suboptimal decision rule makes a decision by comparing the estimate of the parameter to the M , 1 separating hyperplanes located midway between the hyperplanes on which the hypotheses lie, and a decision is made is favor of the appropriate hypothesis. Formally, Decide H0 if aT ^ (x) aT ' + 2 ; Decide Hi;i=1;:::;M ,2 if aT ' + i , 12 < aT ^ (x) aT ' + i + 12 ; (A.12) Decide HM ,1 if aT ^ (x) > aT ' + (M , 21 ): Lower bounding the suboptimal error probability by the minimum probability of error yields: ! Z MX,1 1 (M ) ('; ' + ; : : :; ' + T Pr ja j 2 M , 1 p(' + n) Pmin 1 M ,1 ) d': n=0 (A.13) Maximizing over 1; : : :; M ,1, applying valley-lling, and substituting the result into (A.1) gives the bound (3.85). 92 To show that the M -hypothesis bound is tighter than the binary hypothesis bound, let B (M )(1; : : :; M ,1) denote the bound in (A.13), i.e., ! Z MX,1 1 (M ) ('; ' + ; : : :; ' + ( M ) B (1; : : :; M ,1) M , 1 p('n) Pmin 1 M ,1 ) d': n=0 (A.14) If we let B (2)(0) denote the optimum binary bound, B (2)(0) = max B (2)() T :a = and let B (M )(1; : : : ; M ,1) denote the optimum M -ary bound, B (M )(1; : : :; M ,1) = (A.15) max B (M )(1; : : : ; M ,1); 1 ; : : :; M ,1 : aT i = i (A.16) then we need to show that B (M )(1; : : :; M ,1) B (2)(0): (A.17) We rst need the following Lemma. (M ) ' ; : : :; ' Lemma 1. Let Pmin 0 M ,1 denote the minimum probability of error in the M -ary detection problem: Hi : = 'i; Pr(Hi ) = qi; x p(xj = 'i); i = 0; : : : ; M , 1: (A.18) (2) and Pmin 'i ; 'i+1 denote the minimum probability of error in the binary detection problem: H0 : = 'i; Pr(H0) = qi +qiqi+1 ; x p(xj = 'i) (A.19) qi+1 ; x p(xj = ' ): H1 : = 'i+1 Pr(H1) = qi + i+1 qi+1 Then (M ) '0 ; : : : ; 'M ,1 Pmin MX ,2 i=0 (2) (qi + qi+1)Pmin 'i ; 'i+1 : (A.20) 93 Proof. (M ) Pmin '0 ; : : :; 'M ,1 Z = 1 , max [qip(xj'i)] dx i = Ex 1 , max ( p ( 'i jx)) 8 0i 19 > > = < BMX,1 C B@ p('njx)CA = Ex >min > ; : i n=0 (A.21) (A.22) (A.23) n6=i Since for any M positive numbers ai; i = 0; : : :; M , 1, MX ,1 MX ,2 min a min(ai; ai+1) n i =0 n6=i n i=0 (A.24) we have (MX,2 ) ( M ) Pmin '0; : : : ; 'M ,1 Ex min p('ijx); p('i+1jx) (A.25) i=0 ! ! Z MX ,2 q q i i+1 = (qi + qi+1) 1 , max qi + qi+1 p(xj'i); qi + qi+1 p(xj'i+1) dx (A.26) i=0 MX ,2 (2) = (qi + qi+1)Pmin 'i ; 'i+1 ; (A.27) i=0 and we have proven (A.20). We will now show that B (M )(1; : : :; M ,1) B (2)(0). Starting from B (2)(0), Z (2) B (2)(0) = (p(') + p(' + 0) Pmin ('; ' + 0) d': (A.28) Expanding into M , 1 identical terms, MX ,2 Z (2) ('; ' + 0 ) d': (A.29) (p(') + p(' + 0) Pmin B (2)(0) = M 1, 1 i=0 Now make the substitution ' = ' + i0, and take the summation inside the integral, B (2)(0) (A.30) Z M , 2 X (2) (' + i 0 ; ' + (i + 1) 0) d': (p(' + i0) + p(' + (i + 1) 0) Pmin = M 1, 1 i=0 94 From Lemma 1, B (2)(0) ! Z MX,1 1 (M ) M ,1 p(' + n0) Pmin ('; ' + 0; : : :; ' + (M , 1)0) d' (A.31) n=0 0 0 ( M ) = B ( ; 2 ; : : : ; (M , 1)0) (A.32) B (M )(1; : : :; M ,1): This completes the proof. (A.33) Vita Kristine LaCroix Bell was born on August 16, 1963 in Northampton, Massachusetts, and is an American citizen. She received a Bachelor of Science degree in Electrical Engineering from Rice University in 1985, and a Master of Science degree in Electrical Engineering from George Mason University in 1990. From 1985 to 1990, Ms. Bell was employed M/A-COM Linkabit and Science Applications International Corp. where she was involved in the development and analysis of military satellite communications systems. Since 1990, she has been a Research Instructor in the Center of Excellence in C3I at George Mason University. 95