MATH 382 Chebyshev's Inequality
Dr. Neal, WKU

Let X be an arbitrary random variable with mean µ and variance σ². What is the probability that X is within t of its average µ? If we knew the exact distribution and pdf of X, then we could compute this probability:

P(|X − µ| ≤ t) = P(µ − t ≤ X ≤ µ + t).

But there is another way to find a lower bound for this probability. For instance, we may obtain an expression like P(|X − µ| ≤ 2) ≥ 0.60. That is, there is at least a 60% chance for an obtained measurement of this X to be within 2 of its mean.

Theorem (Chebyshev's Inequality). Let X be a random variable with mean µ and variance σ². For all t > 0,

P(|X − µ| > t) ≤ σ²/t²  and  P(|X − µ| ≤ t) ≥ 1 − σ²/t².

Proof. Consider Y = t² if |X − µ| > t, and Y = 0 otherwise. Then Y ≤ |X − µ|². Then

t² × P(|X − µ| > t) = E[Y] ≤ E[|X − µ|²] = Var(X) = σ²;

thus, P(|X − µ| > t) ≤ σ²/t². Therefore, −P(|X − µ| > t) ≥ −σ²/t², which gives

P(|X − µ| ≤ t) = 1 − P(|X − µ| > t) ≥ 1 − σ²/t².

Chebyshev's Inequality is meaningless when t ≤ σ. For instance, when t = σ it is simply saying P(|X − µ| > t) ≤ 1 and P(|X − µ| ≤ t) ≥ 0, which are already obvious. So we must use t > σ to apply the inequalities. We illustrate next with some standard distributions.

Example. (a) Let X ~ Poi(9). Give a lower bound for P(|X − µ| ≤ 5).
(b) Let X ~ N(100, 15). Give a lower bound for P(|X − µ| ≤ 20).

Solution. (a) For X ~ Poi(9), µ = 9 = σ²; so σ = 3. Then

P(4 ≤ X ≤ 14) = P(|X − 9| ≤ 5) = P(|X − µ| ≤ 5) ≥ 1 − σ²/t² = 1 − 9/25 = 0.64.

Note: Using the pdf of X ~ Poi(9), we obtain P(4 ≤ X ≤ 14) ≈ 0.9373.

(b) For X ~ N(100, 15), we have

P(80 ≤ X ≤ 120) = P(|X − 100| ≤ 20) = P(|X − µ| ≤ 20) ≥ 1 − 15²/20² = 0.4375.

Note: Using a calculator, we obtain P(80 ≤ X ≤ 120) ≈ 0.817577.

From these examples, we see that the lower bound provided by Chebyshev's Inequality is not very accurate. However, the inequality is very useful when applied to the sample mean x̄ from a large random sample.
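The two examples above can be checked numerically. The following sketch (plain Python, not part of the original notes) computes the exact probabilities from the Poisson pmf and the normal cdf (via the error function) and compares them with the Chebyshev lower bounds:

```python
from math import exp, factorial, erf, sqrt

# (a) X ~ Poi(9): exact P(4 <= X <= 14) by summing the Poisson pmf
exact_poisson = sum(exp(-9) * 9**k / factorial(k) for k in range(4, 15))
chebyshev_poisson = 1 - 9 / 5**2          # 1 - sigma^2 / t^2 = 0.64

# (b) X ~ N(100, 15): exact P(80 <= X <= 120) via the normal cdf
def normal_cdf(x, mu, sigma):
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

exact_normal = normal_cdf(120, 100, 15) - normal_cdf(80, 100, 15)
chebyshev_normal = 1 - 15**2 / 20**2      # 0.4375

print(f"Poisson: exact {exact_poisson:.4f} >= bound {chebyshev_poisson:.2f}")
print(f"Normal:  exact {exact_normal:.6f} >= bound {chebyshev_normal:.4f}")
```

The exact values come out near 0.9373 and 0.817577, matching the notes and confirming that the Chebyshev bounds, while valid, are far from tight.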
Recall that if X is an arbitrary measurement with mean µ and variance σ², and x̄ is the sample mean from random samples of size n, then µ_x̄ = µ and σ²_x̄ = σ²/n. Applying Chebyshev's Inequality, we obtain a lower bound for the probability that x̄ is within t of µ:

P(|x̄ − µ| ≤ t) = P(|x̄ − µ_x̄| ≤ t) ≥ 1 − σ²_x̄/t² = 1 − σ²/(n t²).

Suppose X is an arbitrary measurement with unknown mean and variance but with known range such that c ≤ X ≤ d. Then σ ≤ (d − c)/2 and σ² ≤ (d − c)²/4. Thus,

P(|x̄ − µ| ≤ t) ≥ 1 − (d − c)²/(4 n t²).

A special case of x̄ is a sample proportion p̂ of a proportion p, for which µ_p̂ = p and σ²_p̂ = p(1 − p)/n ≤ 0.25/n. We then have

P(|p̂ − p| ≤ t) ≥ 1 − p(1 − p)/(n t²) ≥ 1 − 0.25/(n t²).

Example. Let X ~ N(100, 15). Let x̄ be the sample mean from random samples of size 400. Give a lower bound for P(|x̄ − µ| ≤ 2).

Solution. For random samples of size 400, we have

P(98 ≤ x̄ ≤ 102) = P(|x̄ − 100| ≤ 2) = P(|x̄ − µ| ≤ 2) ≥ 1 − 15²/(400 × 2²) = 0.859375.

Thus, for samples of size 400, there is a relatively high chance that x̄ will be within 2 of the average µ = 100.

Example. Let X be an arbitrary measurement with unknown distribution but with known range such that 10 ≤ X ≤ 30. For random samples of size 1000, give a lower bound for P(|x̄ − µ| ≤ 1).

Solution. Here µ and σ are unknown, but we do know that σ ≤ (30 − 10)/2 = 10, so that σ² ≤ 100. Then

P(|x̄ − µ| ≤ 1) ≥ 1 − σ²/(1000 × 1²) ≥ 1 − 100/(1000 × 1²) = 0.90.

So there is at least a 90% chance that a sample mean x̄ will be within 1 of the unknown mean µ.

Example. Let p be an unknown proportion that we are estimating with sample proportions p̂ from computer simulations with samples of size 4000. Give a lower bound for P(|p̂ − p| ≤ 0.02).

Solution. For the proportion p and trials of size 4000, we have

P(|p̂ − p| ≤ 0.02) ≥ 1 − 0.25/(n t²) = 1 − 0.25/(4000 × 0.02²) = 0.84375.

Law of Large Numbers (a.k.a. Law of Averages)

Let x̄ be the sample mean from random samples of size n for a measurement with mean µ, and let p̂ be the sample proportion for a proportion p. As the sample size n increases, the probability that x̄ is within t of µ increases to 1, and the probability that p̂ is within t of p increases to 1. So for very large n and small t, we can say that virtually all x̄ are good approximations of µ and virtually all p̂ are good approximations of p.

Exercises

1. Let X ~ exp(20). (a) Use Chebyshev's Inequality to give a lower bound for P(|X − µ| ≤ 25). (b) Use the cdf of X to give a precise value for P(|X − µ| ≤ 25).

2. Let X be a measurement with range 2 ≤ X ≤ 10. For random samples of size 400, give a lower bound for P(|x̄ − µ| ≤ 0.5).

3. With samples of size 1200, let p̂ be the sample proportion for an unknown proportion p. Give a lower bound for P(|p̂ − p| ≤ 0.03).
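The sample-proportion bound P(|p̂ − p| ≤ 0.02) ≥ 0.84375 for n = 4000 can be illustrated by Monte Carlo simulation. The sketch below (not part of the original notes) assumes a hypothetical true proportion p = 0.5, which is the worst case since p(1 − p) ≤ 0.25, and checks that the empirical frequency of |p̂ − p| ≤ 0.02 exceeds the Chebyshev bound:

```python
import random

random.seed(1)

# Hypothetical true proportion (worst case for the variance bound).
p, n, t = 0.5, 4000, 0.02
trials = 500

def sample_proportion(p, n):
    """Simulate n Bernoulli(p) trials and return the sample proportion."""
    return sum(random.random() < p for _ in range(n)) / n

# Fraction of simulated sample proportions landing within t of p.
hits = sum(abs(sample_proportion(p, n) - p) <= t for _ in range(trials))
frac_within = hits / trials

bound = 1 - 0.25 / (n * t**2)  # Chebyshev lower bound = 0.84375
print(f"Empirical P(|p_hat - p| <= {t}) ~ {frac_within:.3f} >= bound {bound}")
```

The empirical frequency is well above 0.84375, which again shows the bound is conservative, and repeating the experiment with larger n illustrates the Law of Large Numbers: the frequency climbs toward 1.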