Chapter 7: Confidence Interval and Sample Size Learning Objectives
Dr. Janet Winter, jmw11@psu.edu Stat 200

Upon successful completion of Chapter 7, you will be able to:
• Find the confidence interval for the mean, proportion, and variance.
• Determine the minimum sample size when determining a confidence interval for the mean and for a proportion.
• Explain how the level of confidence, the maximum error of estimate (E), and the sample size are interrelated.

I. Inference Includes:
1. Estimation of a population parameter (μ, p, or σ) using data from a sample.
2. Hypothesis testing, or using sample data to test a conjecture about the population mean (μ), population proportion (p), or population standard deviation (σ).

II. Two Kinds of Estimates for Parameters
1. A point estimate of the population parameter is the sample statistic: the point estimate for the population mean μ is the sample mean x̄, the point estimate for the population proportion p is the sample proportion p̂, and the point estimate for the population standard deviation σ is the sample standard deviation s.
2. An interval estimate of a parameter is a range of values determined from the point estimate.

III. Confidence Interval Estimates for Population Parameters
The confidence level is the probability that intervals determined by these methods will contain the parameter. A confidence interval is the range of values determined from a sample statistic and the specified confidence level. The common confidence levels are 90%, 95%, and 99%.

IV. Confidence Interval Estimates for the Population Mean μ

A. When to use the Normal Distribution (z) and when to use the t Distribution for Confidence Interval Estimates of the Population Mean

Start: Is σ known?
  Yes: Is the population normally distributed?
    Yes: Use the normal distribution (z).
    No: Is n > 30?
      Yes: Use the normal distribution (z).
      No: Use nonparametric or bootstrapping methods.
  No: Is the population normally distributed?
    Yes: Use the t distribution.
    No: Is n > 30?
      Yes: Use the t distribution.
      No: Use nonparametric or bootstrapping methods.

Source: “Elementary Statistics: Using the Graphing Calculator for the TI-83/84”, Triola, Mario F.

B. Rounding Rules for all Confidence Interval Estimates of the Mean
I. When using actual data:
a) find the mean and standard deviation to 2 more decimal places than the data.
b) round the answer to one more decimal place than the original data.
Note: This is very important! Answers not rounded correctly are marked wrong on Mathzone.
II. When using a given mean and standard deviation, work with one more decimal place than the data and round to the same number of decimal places given for the mean.

C. Meaning of ALL Confidence Interval Estimates
Be sure to reread P 353 (6th edition) or P 361 (7th edition) in the textbook to better understand the meaning of the confidence interval. For example, a 90% confidence interval estimate for the population mean means that 90% of the confidence interval estimates formed with this process include the value of the population mean.

D. z Interval Estimates for the Population Mean
I. Requirements
a) the population standard deviation (σ) is given;
b) the sample size n ≥ 30;
c) but, if the sample size n < 30, the variable must be selected from a normal distribution.

II. Confidence Coefficient
a) Meaning of the Confidence Coefficient
z is called the confidence coefficient, i.e., the number of multiples of the standard error for an interval estimate with a 1 − α level of confidence.
Complete the rest of the table using the confidence level (1 − α). The first 2 have been completed for you (answers at the end).

1 − α    α      α/2
.90      .10    .05
.95      .05    .025
.99
.80
.85

b) Method to find the Confidence Coefficient: Find the z value with area 1 − α/2 to its left, i.e.,
1. Locate 1 − α/2 inside the Normal Probability Table (Table E).
2. Starting at 1 − α/2, move your hand to the left along the row until you reach the z column. This is the integer and tenths digits. Go back to 1 − α/2, then move your hand to the top of its column. This is the hundredths digit.
3. Add the integer and tenths digits to the hundredths digit to find the value for z.
4. Affix a ± sign in front of the number.

Using the method described, complete the table below. The first 2 have been completed for you (answers at the end).

Confidence Level 1 − α    α      α/2     (1 − α) + α/2    Confidence Coefficient z(α/2)
.90                       .10    .05     .95              1.645
.95                       .05    .025    .975             1.96
.99
.80
.98
.96
.93

III. Development of the Confidence Interval Formula
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
Whenever the population standard deviation σ is known and either the population is normally distributed or n ≥ 30, the sample mean is normally distributed (exactly for a normal population, approximately by the Central Limit Theorem when n ≥ 30). With standard error \( \sigma_{\bar{x}} = \sigma/\sqrt{n} \), and with probability 1 − α:
\[ -z < \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < z \]
\[ -z\,\sigma_{\bar{x}} < \bar{x} - \mu < z\,\sigma_{\bar{x}} \]
\[ -\bar{x} - z\,\sigma_{\bar{x}} < -\mu < -\bar{x} + z\,\sigma_{\bar{x}} \]
\[ -(-\bar{x} - z\,\sigma_{\bar{x}}) > -(-\mu) > -(-\bar{x} + z\,\sigma_{\bar{x}}) \]
\[ \bar{x} + z\,\sigma_{\bar{x}} > \mu > \bar{x} - z\,\sigma_{\bar{x}} \]
\[ \bar{x} - z\,\sigma_{\bar{x}} < \mu < \bar{x} + z\,\sigma_{\bar{x}} \]
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
Note: If the population standard deviation is not known or stated, use
\[ \bar{x} - t\frac{s}{\sqrt{n}} < \mu < \bar{x} + t\frac{s}{\sqrt{n}} \]
(see section E).

IV. Review of Concepts and Maximum Error of Estimate
• x̄ is the point estimate and the center of the confidence interval.
• z is the confidence coefficient, the number of multiples of the standard error needed to construct an interval estimate of the correct width to have a level of confidence 1 − α.
• \( z\frac{\sigma}{\sqrt{n}} \) is called the maximum error of estimate.

V. Example: 35 fifth-graders have a mean reading score of 82. The standard deviation of the population is 15.
a) Find the 95% confidence interval estimate for the mean reading score of all fifth-graders.
Since we know the population standard deviation and n ≥ 30, use
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
Use Table E backwards with the area to the left of z equal to .975 (equivalently, .025 in each tail). The value of z, the confidence coefficient, is z = 1.9 + .06 = 1.96.
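As a cross-check on the Table E lookup, the confidence coefficient can also be computed directly. The sketch below is only an illustration, assuming Python 3.8 or later and its standard `statistics.NormalDist` class; the helper name `confidence_coefficient` is hypothetical and not part of the course materials.

```python
from statistics import NormalDist

def confidence_coefficient(confidence_level):
    """Return z with area 1 - alpha/2 to its left (hypothetical helper)."""
    alpha = 1 - confidence_level
    area_to_left = 1 - alpha / 2      # e.g. 0.975 for 95% confidence
    return NormalDist().inv_cdf(area_to_left)

# Three of the confidence levels used in this chapter
for level in (0.90, 0.95, 0.99):
    print(level, round(confidence_coefficient(level), 3))
```

Software carries more digits than a two-decimal table, so the 99% value comes out near 2.576, which the table reports as 2.58.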
This means approximately 95% of the sample means will fall within 1.96 standard errors of the population mean. Use z = 1.96 in the formula:
\[ 82 - 1.96\frac{15}{\sqrt{35}} < \mu < 82 + 1.96\frac{15}{\sqrt{35}} \]
\[ 82 - 4.97 < \mu < 82 + 4.97 \]
\[ 82 \pm 5 \]
\[ 77 < \mu < 87 \]
4.97, rounded to 5, is the maximum error of estimate. Be sure to list it for full credit in your answers.

b) Find the 99% confidence interval estimate of the mean reading score of all fifth-graders.
Since approximately 99% of the sample means will fall within 2.58 standard errors of the population mean, use z = 2.58.
\( \bar{x} = 82,\; n = 35,\; \sigma = 15 \)
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
\[ 82 - 2.58\frac{15}{\sqrt{35}} < \mu < 82 + 2.58\frac{15}{\sqrt{35}} \]
\[ 82 - 6.54 < \mu < 82 + 6.54 \]
\[ 82 \pm 7 \]
\[ 75 < \mu < 89 \]
6.54, rounded to 7, is the maximum error of estimate. Be sure to list it in the next to last step.

c) Is the 95% confidence interval or the 99% confidence interval larger? Explain why.
95% confidence level: 77 < μ < 87
99% confidence level: 75 < μ < 89
The 99% confidence interval is wider because it has a larger z value: capturing the population mean with higher confidence requires a wider interval.

Question 1
A study of 40 English composition professors showed that they spent, on average, 12.6 minutes correcting a student’s term paper. Find the 90% confidence interval of the mean time for all composition papers when σ = 2.5 minutes.
\( n = 40,\; \bar{x} = 12.6 \)
Since the population standard deviation is given and n = 40 is greater than 30, use the formula:
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
If a professor stated that he spent, on average, 30 minutes correcting a term paper, what would be your reaction?

VI. Maximum Error of Estimate for Confidence Interval Estimates of μ
a) Definition
The maximum error of estimate is the largest likely difference, at the stated confidence level, between the point estimate of a parameter and the actual value of the parameter. The maximum error of estimate is ½ the width of the confidence interval.
b) Maximum Error of Estimate for Confidence Interval Estimates of μ
It is the term
\[ E = z\frac{\sigma}{\sqrt{n}} \]

VII. Find the Sample Size Using E and the Confidence Level
a) Concept: E is like a tolerance or allowable error, where:
\[ E = z\frac{\sigma}{\sqrt{n}} \]
\[ \sqrt{n}\,E = z\sigma \]
\[ \sqrt{n} = \frac{z\sigma}{E} \]
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
b) Formula for the Minimum Sample Size for an Interval Estimate of the Population Mean
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
where E is the maximum error of estimate. If the answer is not a whole number, round up to the next larger whole number to find the sample size, n. If the population standard deviation is not available, use the sample standard deviation.
c) Example: An insurance company is trying to estimate the average number of sick days that full-time food service workers use per year. A pilot study found the standard deviation to be 2.5 days. How large a sample must be selected if the company wants to be 95% confident of getting an interval that contains the true mean with a maximum error of 1 day?
s = 2.5, confidence level = 95%, maximum error = 1 day
\[ n = \left(\frac{z\sigma}{E}\right)^2 = \left(\frac{1.96 \cdot 2.5}{1}\right)^2 = 4.9^2 = 24.01 \]
n = 25 workers

Question 2
Find the sample size necessary to estimate a population mean to within 0.5 with 95% confidence if the standard deviation is 6.2.
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
Note: When solving for sample size n, always round up to the next larger integer.

E. t Confidence Interval Estimates for the Population Mean
I. Requirements
a) σ is unknown;
b) n ≥ 30;
c) but, if n < 30, the variable must be normally distributed.

II. Characteristics of the t Distribution
Similarities with the normal distribution:
a) Bell shaped.
b) Symmetrical about the mean.
c) The mean, median, and mode are equal to 0 and are located at the center of the distribution.
d) The curve never touches the x axis.
Differences from the standard normal:
a) The variance is greater than 1.
b) The t distribution is actually a family of curves based on the degrees of freedom, which is related to sample size.
c) As the sample size increases, the t distribution approaches the standard normal distribution.
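The last point, that the t distribution approaches the standard normal as the sample size grows, can be seen by comparing tabled critical values. The following is a small sketch, not part of the course materials: the 95% values in `T_95` are assumed inputs copied from a standard t table (only the df = 27 entry, 2.052, is used in this chapter), and the names `T_95` and `excess_over_z` are hypothetical.

```python
from statistics import NormalDist

# 95% two-tailed t critical values from a standard t table (assumed inputs).
T_95 = {5: 2.571, 10: 2.228, 20: 2.086, 27: 2.052, 30: 2.042, 60: 2.000, 120: 1.980}

# z confidence coefficient for 95% confidence (about 1.96)
z_95 = NormalDist().inv_cdf(0.975)

def excess_over_z(df):
    """How much larger the tabled t value is than z for the given df."""
    return T_95[df] - z_95

for df in sorted(T_95):
    print(f"df = {df:3d}: t = {T_95[df]:.3f}, t - z = {excess_over_z(df):.3f}")
```

Every tabled t value exceeds z, which is why a t interval is wider than a z interval built from the same numbers, and the gap shrinks steadily as the degrees of freedom increase.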
Read textbook page 362 (6th Edition) or page 370 (7th Edition) for the comparison between Normal and t distributions. (Triola & Triola, 2006)

III. Tabled Values for t (Table F)
a) Location
• 6th Edition: Table F is located on the inside cover of the text, on the opposite side from Table E (standard normal).
• 7th Edition: Table F is located on the last page of the textbook or the pull-out card.
b) Method to find the confidence coefficient for t
• Use the column for the appropriate confidence level.
• Use the row for the appropriate degrees of freedom.
• The intersection of the appropriate column and appropriate row is the confidence coefficient.
Note: If the degrees of freedom needed are not listed in the table, always round down to the nearest table value. For example, if we need 44 degrees of freedom, use df = 40 since 44 is not listed in the table.

IV. Degrees of Freedom for Estimates of the Population Mean
Degrees of freedom are the number of values that are free to vary after a sample statistic has been computed. For the confidence interval for the mean, the degrees of freedom are the sample size minus 1, or d.f. = n − 1.

V. Example: 28 employees of XYZ Company travel an average (mean) of 14.3 miles to work. The standard deviation of their travel distance was 2 miles. Find the 95% confidence interval of the true mean or population mean.
Since the population standard deviation is not given, use the formula:
\[ \bar{x} - t\frac{s}{\sqrt{n}} < \mu < \bar{x} + t\frac{s}{\sqrt{n}} \]
\( n = 28,\; \bar{x} = 14.3,\; s = 2,\; df = 27,\; t = 2.052 \)
\[ 14.3 - 2.052\frac{2}{\sqrt{28}} < \mu < 14.3 + 2.052\frac{2}{\sqrt{28}} \]
\[ 14.3 - 0.776 < \mu < 14.3 + 0.776 \]
\[ 14.3 \pm 0.8 \]
\[ 13.5 < \mu < 15.1 \]

VI. Example: The average yearly income for 28 engineering graduates in 2008 is $56,718. The standard deviation was $650.
1. Find the 95% confidence interval estimate for the population mean.
Since the population standard deviation is not given, use the formula:
\[ \bar{x} - t\frac{s}{\sqrt{n}} < \mu < \bar{x} + t\frac{s}{\sqrt{n}} \]
\( n = 28,\; \bar{x} = 56718,\; s = 650,\; df = 27,\; t = 2.052 \)
\[ 56718 - 2.052\frac{650}{\sqrt{28}} < \mu < 56718 + 2.052\frac{650}{\sqrt{28}} \]
\[ 56718 - 252 < \mu < 56718 + 252 \]
\[ 56466 < \mu < 56970 \]
Note: Now that you are familiar with this problem, it is simpler to record the steps:
\[ 56718 \pm 2.052\frac{650}{\sqrt{28}} \]
\[ 56718 \pm 252.1 \]
\[ 56718 \pm 252 \quad \text{(rounded to the same number of places as the mean)} \]
\[ 56466 < \mu < 56970 \]
2. If an individual graduate wishes to see if he or she is being paid below average, what salary value should he or she use?
Use the lower bound of the confidence interval: $56,466.

Question 3
The prices (in dollars) for a particular model of 6.0 megapixel digital camera with 3x optical zoom are listed as: $225, $240, $215, $202, $206, $211, $210, $193, $250, $225. Estimate the true mean using this data with 90% confidence.
Since the population standard deviation is not given, use \( \bar{x} \pm t\frac{s}{\sqrt{n}} \). Do not use the σ from the calculator. This is a sample, so be sure to use s, work in 2 more places than the data, and round the answers to one more place than the data.
\( \bar{x} = 217.70,\; s = 17.49,\; t = 1.833,\; df = 9,\; n = 10 \)

Question 4
John wants to estimate the average value of the homes in his town with a 99% confidence interval. Use his random sample of 36 homes with an average value of $251,131.42 and standard deviation $1321.46 to find the confidence interval.
Since the population standard deviation is not given, use the formula \( \bar{x} \pm t\frac{s}{\sqrt{n}} \).
The degrees of freedom equal 35, but df = 35 is not available in the table. Use the next lower df, df = 34.

V. Confidence Interval Estimates for Population Proportions

A. Symbols Used to Estimate Proportion
• p = symbol for the population proportion
• p̂ = symbol for the sample proportion; read p “hat”
• q̂ = 1 − p̂ = symbol for the sample proportion of failures
• \( \hat{p} = \frac{x}{n} \), where x = number of sample units that possess the characteristic of interest and n = sample size.

B. Development of the Formula
For a binomial probability distribution with x = number of successes: when np ≥ 5 and n(1 − p) ≥ 5, x is approximately normally distributed with
\[ \mu = np, \qquad \sigma = \sqrt{np(1-p)} \]
Thus for proportions \( \hat{p} = \frac{x}{n} \):
\[ \mu_{\hat{p}} = \frac{\mu}{n} = \frac{np}{n} = p \]
1. The mean of p̂ is p.
2. The standard deviation for p̂ becomes:
\[ \sigma_{\hat{p}} = \frac{\sigma}{n} = \frac{\sqrt{np(1-p)}}{n} = \sqrt{\frac{np(1-p)}{n^2}} = \sqrt{\frac{p(1-p)}{n}} \]
Next we use the pattern for the confidence interval estimate of the population mean,
point estimate ± z ∙ standard deviation.
It becomes
\[ \hat{p} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

C. Formula for the confidence interval estimate of the population proportion
Note: a shorter version of the formula to estimate the population proportion is
\[ \hat{p} \pm z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
when np̂ and nq̂ are each greater than or equal to 5.

D. Rounding Rules for proportions
Always use 4 decimal places for the computation and round the answers to 3 decimal places.

E. Example:
1. 55 students in a random sample of 450 enrolled in summer classes. Estimate the population proportion of students taking classes this summer (the solution uses z = 1.96, i.e., 95% confidence).
\[ \hat{p} = \frac{x}{n} = \frac{55}{450} = 0.1222, \qquad \hat{q} = 1 - 0.1222 = 0.8778 \]
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ 0.1222 \pm 1.96\sqrt{\frac{0.1222 \cdot 0.8778}{450}} \]
\[ 0.122 \pm 0.030 \]
\[ 0.092 < p < 0.152 \]
\[ 9.2\% < p < 15.2\% \]
2. Is an estimate of 11% about right?
Yes, 11% is about right since it is contained within the confidence interval estimate.

Question 5
A survey found that out of 200 students, 168 said they needed loans or scholarships to pay their tuition and expenses. Find the 90% confidence interval for the population proportion of students needing loans or scholarships.
\( \hat{p} = 0.84,\; \hat{q} = 0.16 \)
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]

Question 6
A study by the University of Michigan found that one in five 13 and 14 year olds is a sometime smoker.
To see how the smoking rate of the students at a large school district compared to the national rate, the superintendent surveyed two hundred 13 and 14 year old students and found that 23% said they were sometime smokers. Find the 99% confidence interval of the true proportion and compare this with the University of Michigan’s study.
\( n = 200,\; \hat{p} = 0.23,\; \hat{q} = 1 - 0.23 = 0.77 \)
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]

F. Formula for the Minimum Sample Size to Estimate a Population Proportion
\[ E = z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ \frac{E}{z} = \sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ \left(\frac{E}{z}\right)^2 = \frac{\hat{p}\hat{q}}{n} \]
\[ \left(\frac{z}{E}\right)^2 = \frac{n}{\hat{p}\hat{q}} \]
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
Use p̂ from a pilot study or previous estimate if it is available. Otherwise, use p̂ = .5. n must be a whole number. If it is not a whole number, round up to the next larger whole number.

G. Example: A medical researcher wishes to determine the percentage of drivers using GPS systems in their cars. He wishes to be 99% confident that the estimate is within 2 percentage points of the true proportion. A recent study of 180 drivers showed that 25% used GPS systems.
a) How large should the sample size be?
Since a recent study showed 25% used GPS systems, p̂ = 0.25 and q̂ = 0.75.
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
\[ n = 0.25 \cdot 0.75\left(\frac{2.58}{0.02}\right)^2 \]
\[ n = 3120.1875 \]
Since the computed n is not a whole number, round up and use n = 3121.
b) If no estimate of the sample proportion is available, how large should the sample be?
Since there is no prior estimate of p, use p̂ = 0.5 and q̂ = 0.5.
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
\[ n = 0.5 \cdot 0.5\left(\frac{2.58}{0.02}\right)^2 \]
\[ n = 4160.25 \]
Since the computed n is not a whole number, round up and use n = 4161.
Note: the sample size needs to be larger when there is no prior estimate for p.

VI. Confidence Interval Estimate for the Population Variance and Population Standard Deviation

A. General Comments
To find confidence intervals for variances and standard deviations:
• Use the chi-square distribution.
• Samples must be selected from normally distributed populations.
• Assume the population variance is σ².
• The chi-square distribution is obtained from the values of \( \frac{(n-1)s^2}{\sigma^2} \), or
\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} \]

B. Chi-Square Distribution
Reference textbook page 378 (6th Edition) or page 386 (7th Edition).
I. Characteristics
• Chi-square is never negative.
• It is a family of distributions dependent on degrees of freedom (n − 1).
• The mode is always slightly to the left of the degrees of freedom.
• As n increases, the chi-square distribution shifts to the right.
• The chi-square distribution is skewed to the right.
II. Finding Chi-Square values on the Chi-Square table
Since χ² is not symmetrical, two different values are used in the confidence interval formula for the population variance.
For χ² right, use the column for α/2.
For χ² left, use the column in the table for 1 − α/2.
Process:
1. Use the confidence level to find α/2.
2. Use the α/2 column and the row for the appropriate degrees of freedom to find χ² right.
3. Find 1 − α/2.
4. Use the 1 − α/2 column and the row for the appropriate degrees of freedom to find χ² left.

Chi-Square Values for df: 18

Chi-Square left    Left (1 − α/2)    Confidence Level    Right (α/2)    Chi-Square right
9.390              .95               .90                 .05            28.869
8.231              .975              .95                 .025           31.526
10.865             .90               .80                 .10            25.989
7.015              .99               .98                 .01            34.805
6.265              .995              .99                 .005           37.156

C. Formulas
1. Confidence Interval Estimate for the Population Variance:
\[ \frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}} \qquad df: n - 1 \]
Note: χ² right is on the left side of the inequality but the right side of the graph, and χ² left is on the right side of the inequality but the left side of the graph.
2. Confidence Interval Estimate for the Population Standard Deviation
Since the population standard deviation is the square root of the population variance, the confidence interval estimate of the population standard deviation is:
\[ \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{left}}}} \]

D. Rounding Rules for Standard Deviation or Variance
I. When using actual data:
a) find the standard deviation to 2 more decimal places than the data.
b) round the answer to one more decimal place than the original data.
II. When using a given sample standard deviation or variance, work with one more decimal place than the statistic and round to the same number of places as the standard deviation or variance given.

E. Example: Find the confidence interval for the standard deviation in the time it takes to fill a car with gas. In a sample of 23 fill-ups, the standard deviation of the time it takes to fill the car is 3.8 minutes. Assume the variable is normally distributed.
Note: the answer has the same number of decimal places as the given sample standard deviation since the work is done with statistics instead of data.

Question 7
Find the 99% confidence interval for the variance and standard deviation of the weights of one-gallon containers of motor oil when the sample of 14 containers has a variance of 3.2. Assume the variable is normally distributed.

F. Example: The number of calories in a 1-ounce serving of various regular cheeses is shown. Estimate the population variance with 90% confidence.
110 130 45 100 100 80 95 105 110 105
110 90 100 110 110 70 95 125 120 108
\( n = 20,\; s = 19.19,\; df = 19,\; \chi^2_{\text{right}} = 30.144,\; \chi^2_{\text{left}} = 10.117 \)
\[ \frac{19 \cdot 19.19^2}{30.144} < \sigma^2 < \frac{19 \cdot 19.19^2}{10.117} \]
\[ 232.1 < \sigma^2 < 691.6 \]
is the 90% confidence interval estimate for the population variance.
Note:
• Use the tabled values for χ² and use s rounded to 2 more places than the data in the computation.
• Since data is used to compute the standard deviation, the answer has one more place than the original data.

Question 8
A service station advertises a wait of no more than 30 minutes for an oil change. A sample of 28 oil changes has a standard deviation of 5.2 minutes. Find the 95% confidence interval of the population standard deviation of the times spent waiting for an oil change.
\[ \frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}} \]

VII. Summaries

A. Estimates for Population Parameters
• Estimation is an important aspect of inferential statistics.
• A point estimate is a single value with no accuracy specified.
• An interval estimate is a range of values with its accuracy specified by the confidence level.
• Every question about a confidence interval will have the words “Find a confidence interval estimate for…”.
• Pay particular attention to the wording to determine whether the confidence interval is for the mean, proportion, or variance (or its square root, the standard deviation).

B. Minimum Sample Sizes to Estimate Population Parameters
• You always need to know both the confidence level and the maximum error of estimate. In addition, for the:
1. Mean: the population standard deviation (given or estimated) is also required.
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
2. Proportion: an estimate of the proportion from a pilot study is preferred (or use p̂ = .5).
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]

C. Rounding Rules
I. For Estimates of the Mean
a) When using actual data:
(a) find the mean and standard deviation to 2 more decimal places than the data.
(b) round the answer to one more decimal place than the original data.
Note: This is very important! Answers not rounded correctly are marked wrong on Mathzone.
b) When using a given mean and standard deviation, work with one more decimal place than the data and round to the same number of decimal places given for the mean.
II. For Estimates of the Standard Deviation or Variance
a) When using actual data:
(a) find the standard deviation to 2 more decimal places than the data.
(b) round the answer to one more decimal place than the original data.
b) When using a given sample standard deviation or variance, work with one more decimal place than the statistic and round to the same number of places as the standard deviation or variance given.
III. For Estimates of the Proportions
a) Always use 4 decimal places for the computation and round the answers to 3 decimal places.
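The proportion rounding rule (4 decimal places while computing, 3 in the answer) can be illustrated with a short sketch. This is an assumption-laden illustration rather than course material: the function name `proportion_interval` is hypothetical, and the confidence coefficient z is passed in as read from Table E.

```python
import math

def proportion_interval(x, n, z):
    """Confidence interval for a proportion using the chapter's rounding rule.

    x: number of successes, n: sample size, z: confidence coefficient.
    (Hypothetical helper; returns (lower, upper) rounded to 3 places.)
    """
    p_hat = round(x / n, 4)            # work with 4 decimal places
    q_hat = round(1 - p_hat, 4)
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation requires n*p_hat and n*q_hat >= 5")
    error = round(z * math.sqrt(p_hat * q_hat / n), 3)   # maximum error of estimate
    center = round(p_hat, 3)
    return round(center - error, 3), round(center + error, 3)

# Section V.E example: 55 of 450 students, z = 1.96
print(proportion_interval(55, 450, 1.96))
# Question 5: 168 of 200 students, z = 1.645
print(proportion_interval(168, 200, 1.645))
```

Rounding the point estimate and the maximum error separately before combining them mirrors the hand calculation in the worked examples, so the printed bounds can be compared line by line with the written solutions.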
Answer: Confidence Coefficient
z is called the confidence coefficient, i.e., the number of multiples of the standard error for an interval estimate with a 1 − α level of confidence.

1 − α    α      α/2
.90      .10    .05
.95      .05    .025
.99      .01    .005
.80      .20    .10
.85      .15    .075

Answer: Method to find the Confidence Coefficient

Confidence Level 1 − α    α      α/2     (1 − α) + α/2    Confidence Coefficient z(α/2)
.90                       .10    .05     .95              1.645
.95                       .05    .025    .975             1.96
.99                       .01    .005    .995             2.58
.80                       .20    .10     .90              1.28
.98                       .02    .01     .99              2.33
.96                       .04    .02     .98              2.05
.93                       .07    .035    .965             1.81

Process for finding chi-square values (section VI.B.II):
1. Use the confidence level (1 − α) to find α/2.
2. Use the α/2 column and the row for the appropriate degrees of freedom to find χ² right.
3. Find 1 − α/2.
4. Use the 1 − α/2 column and the row for the appropriate degrees of freedom to find χ² left.

Answer: Question 1
A study of 40 English composition professors showed that they spent, on average, 12.6 minutes correcting a student’s term paper.
a) Find the 90% confidence interval of the mean time for all composition papers when σ = 2.5 minutes.
\( n = 40,\; \bar{x} = 12.6 \)
Since the population standard deviation is given and n = 40 is greater than 30, use the formula:
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
\[ 12.6 - 1.645\frac{2.5}{\sqrt{40}} < \mu < 12.6 + 1.645\frac{2.5}{\sqrt{40}} \]
\[ 12.6 - 0.6502 < \mu < 12.6 + 0.6502 \]
\[ 12.6 - 0.7 < \mu < 12.6 + 0.7 \]
\[ 11.9 < \mu < 13.3 \]
b) If a professor stated that he spent, on average, 30 minutes correcting a term paper, what would be your reaction?
\[ 11.9 < \mu < 13.3 \]
It would be highly unlikely since 30 minutes is far longer than the upper bound of 13.3 minutes.

Answer: Question 2
Find the sample size necessary to estimate a population mean to within 0.5 with 95% confidence if the standard deviation is 6.2.
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
\[ n = \left(\frac{(1.96)(6.2)}{0.5}\right)^2 = [24.304]^2 = 590.684 \]
n = 591
Note: When solving for sample size n, always round up to the next larger integer (Why?)
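One way to explore the "Why?" in the note above is to compute the maximum error of estimate on either side of the rounded answer. The following is a minimal sketch with hypothetical helper names (`max_error`, `min_sample_size_mean`), assuming z is taken from Table E and σ is known or estimated from a pilot study.

```python
import math

def max_error(z, sigma, n):
    """Maximum error of estimate E = z * sigma / sqrt(n)."""
    return z * sigma / math.sqrt(n)

def min_sample_size_mean(z, sigma, allowed_error):
    """Smallest whole n with E no larger than allowed_error: (z*sigma/E)^2, rounded up."""
    return math.ceil((z * sigma / allowed_error) ** 2)

# Question 2: z = 1.96, sigma = 6.2, E = 0.5
n = min_sample_size_mean(1.96, 6.2, 0.5)
print(n)
print(max_error(1.96, 6.2, n - 1))   # one observation fewer: error slightly above 0.5
print(max_error(1.96, 6.2, n))       # rounded-up n: error within 0.5
```

Rounding 590.684 down to 590 would leave the maximum error just over the 0.5 tolerance, while 591 brings it just under, which is the reason for always rounding up.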
Answer: Question 3
The prices (in dollars) for a particular model of 6.0 megapixel digital camera with 3x optical zoom are listed as: $225, $240, $215, $202, $206, $211, $210, $193, $250, $225. Estimate the true mean using this data with 90% confidence.
Since the population standard deviation is not given, use \( \bar{x} \pm t\frac{s}{\sqrt{n}} \). Do not use the σ from the calculator. This is a sample, so be sure to use s, work in 2 more places than the data, and round the answers to one more place than the data.
\( \bar{x} = 217.70,\; s = 17.49,\; t = 1.833,\; df = 9,\; n = 10 \)
\[ 217.70 \pm 1.833\frac{17.49}{\sqrt{10}} \]
\[ 217.7 \pm 10.1 \]
\[ 207.6 < \mu < 227.8 \]
Note: x̄ and s are found to two decimal places more than the data, but the answer is rounded back to one more place than the data.

Answer: Question 4
John wants to estimate the average value of the homes in his town with a 99% confidence interval. Use his random sample of 36 homes with an average value of $251,131.42 and standard deviation $1321.46 to find the confidence interval.
Since the population standard deviation is not given, use the formula \( \bar{x} \pm t\frac{s}{\sqrt{n}} \).
The degrees of freedom equal 35, but df = 35 is not available in the table. Use the next lower df, df = 34.
\[ 251131.42 \pm 2.728\frac{1321.46}{\sqrt{36}} \]
\[ 251131.42 \pm 600.82 \]
\[ 250530.60 < \mu < 251732.24 \]
Note: Since statistics are given, work with one more place than the statistic but round the answer back to the same number of places as x̄.

Answer: Question 5
A survey found that out of 200 students, 168 said they needed loans or scholarships to pay their tuition and expenses. Find the 90% confidence interval for the population proportion of students needing loans or scholarships.
\( \hat{p} = 0.84,\; \hat{q} = 0.16 \)
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ 0.84 - 1.645\sqrt{\frac{0.84 \cdot 0.16}{200}} < p < 0.84 + 1.645\sqrt{\frac{0.84 \cdot 0.16}{200}} \]
\[ 0.84 - 0.043 < p < 0.84 + 0.043 \]
\[ 0.797 < p < 0.883 \]
\[ 79.7\% < p < 88.3\% \]

Answer: Question 6
A study by the University of Michigan found that one in five 13 and 14 year olds is a sometime smoker. To see how the smoking rate of the students at a large school district compared to the national rate, the superintendent surveyed two hundred 13 and 14 year old students and found that 23% said they were sometime smokers. Find the 99% confidence interval of the true proportion and compare this with the University of Michigan’s study.
\( n = 200,\; \hat{p} = 0.23,\; \hat{q} = 1 - 0.23 = 0.77 \)
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ 0.23 \pm 2.58\sqrt{\frac{0.23 \cdot 0.77}{200}} \]
\[ 0.23 \pm 0.077 \]
\[ 0.153 < p < 0.307 \]
\[ 15.3\% < p < 30.7\% \]
Since 1/5 = 0.20 falls within the confidence interval, the district's smoking rate is consistent with the rate found in the University of Michigan study.

Answer: Question 7
Find the 99% confidence interval for the variance and standard deviation of the weights of one-gallon containers of motor oil when the sample of 14 containers has a variance of 3.2. Assume the variable is normally distributed.
\( n = 14,\; s^2 = 3.2 \)
\[ \frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}} \]
\[ \frac{13 \cdot 3.2}{29.819} < \sigma^2 < \frac{13 \cdot 3.2}{3.565} \]
\[ 1.4 < \sigma^2 < 11.7 \quad \text{(variance)} \]
\[ 1.2 < \sigma < 3.4 \quad \text{(standard deviation)} \]
Note: The answer has the same number of decimal places as the given sample variance.

Answer: Question 8
A service station advertises a wait of no more than 30 minutes for an oil change. A sample of 28 oil changes has a standard deviation of 5.2 minutes. Find the 95% confidence interval of the population standard deviation of the times spent waiting for an oil change.
\[ \frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}} \]
\[ \frac{27 \cdot 5.2^2}{43.194} < \sigma^2 < \frac{27 \cdot 5.2^2}{14.573} \]
\[ 16.9 < \sigma^2 < 50.1 \quad \text{(variance in waiting time)} \]
\[ 4.1 < \sigma < 7.1 \quad \text{(standard deviation in waiting time)} \]

Works Cited
Triola, M.D., Marc M. and Mario F. Triola. Biostatistics for the Biological and Health Sciences. New York: Pearson Education, Inc., 2006.