Chapter 7: Confidence Interval and Sample Size
Learning Objectives
Upon successful completion of Chapter 7, you will be able to:
• Find the confidence interval for the mean, proportion, and variance.
• Determine the minimum sample size when determining a confidence interval for the mean
and for a proportion.
• Explain how the level of confidence, the maximum error of estimate (E), and the sample
size are interrelated.
I. Inference Includes:
1. Estimation of a population parameter (μ, p, or σ) using data from a sample.
2. Hypothesis Testing or using sample data to test a conjecture about the population mean (μ),
population proportion (p), or population standard deviation (σ).
II. Two Kinds of Estimate for Parameters
1. A point estimate of the population parameter is the sample statistic, i.e., the point estimate
for the population mean μ is the sample mean x̄, the point estimate for the population
proportion p is the sample proportion p̂, and the point estimate for the population standard
deviation σ is the sample standard deviation s.
2. An interval estimate of a parameter is a range of values determined from the point estimate.
Dr. Janet Winter, jmw11@psu.edu
Stat 200
Page 1
III. Confidence Interval Estimates for Population Parameters
The confidence level is the probability that intervals determined by these methods will
contain the parameter.
A confidence interval is the range of values determined from a sample statistic and the
specified confidence level.
The common confidence intervals use 90%, 95%, or 99% confidence levels.
IV. Confidence Interval Estimates for the Population Mean μ
A. When to use the Normal Distribution (z) and when to use the t Distribution
for Confidence Interval Estimates of the Population Mean
Start: Is σ known?
  Yes: Is the population normally distributed?
    Yes: Use the normal distribution (z).
    No: Is n > 30?
      Yes: Use the normal distribution (z).
      No: Use nonparametric or bootstrapping methods.
  No: Is the population normally distributed?
    Yes: Use the t distribution (t).
    No: Is n > 30?
      Yes: Use the t distribution (t).
      No: Use nonparametric or bootstrapping methods.
“Elementary Statistics: Using the Graphing Calculator for the TI-83/84”, Triola, Mario F.
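The decision flowchart above can be sketched as a small function. This is only an illustration of the branching; the function name `choose_method` and its string return values are assumptions made for this sketch, not part of the course materials.

```python
def choose_method(sigma_known: bool, population_normal: bool, n: int) -> str:
    """Follow the flowchart: pick z, t, or a nonparametric/bootstrap method."""
    # A non-normal population with a small sample (n not greater than 30)
    # rules out both the z and the t interval.
    if not population_normal and n <= 30:
        return "nonparametric or bootstrapping"
    # Otherwise the only remaining question is whether sigma is known.
    return "z" if sigma_known else "t"


if __name__ == "__main__":
    print(choose_method(sigma_known=True, population_normal=False, n=35))
    print(choose_method(sigma_known=False, population_normal=True, n=28))
    print(choose_method(sigma_known=False, population_normal=False, n=12))
```

The first two calls correspond to the fifth-grader and XYZ Company examples later in this chapter.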
B. Rounding Rules for all Confidence Intervals Estimates of the Mean
I. When using actual data:
a) find the mean and standard deviation to two more decimal places than the data.
b) round the answer to one more decimal place than the original data.
Note: This is very important! Answers not rounded correctly are marked wrong on
Mathzone.
II. When using a mean and standard deviation, work with one more decimal place than the
data and round to the same number of decimal places given for the mean.
C. Meaning of ALL Confidence Interval Estimates
Be sure to reread P 353 (6th edition) or P 361 (7th edition) in the textbook to better
understand the meaning of the confidence interval.
For example: a 90% confidence interval estimate for the population mean is interpreted as
90% of the confidence interval estimates formed with this process include the value of the
population mean.
D. z Interval Estimates for the Population Mean
I. Requirements
a) the population standard deviation (σ) is given
b) the sample size n ≥ 30;
c) But, if the sample size n < 30, the variable must be selected from a normal
distribution
II. Confidence Coefficient
a) Meaning of the Confidence Coefficient
z is called the confidence coefficient, i.e., the number of multiples of the standard
error for an interval estimate with a 1 − α level of confidence. Complete the rest of
the table using the confidence level (1 − α). The first 2 have been completed for you
(answers at the end).
1 − α    α      α/2
.90      .10    .05
.95      .05    .025
.99
.80
.85
b) Method to find the Confidence Coefficient:
Find the z value with area (1 − α) + α/2 to its left, i.e.,
1. Locate (1 − α) + α/2 inside the Normal Probability Table (Table E).
2. Starting at that area, move your hand to the left along the row until you reach the z
column. This gives the integer and tenths digits. Go back to the area; next move your
hand to the top of its column. This gives the hundredths digit.
3. Add the integer and tenths digits to the hundredths digit to find the value for z.
4. Affix a ± sign in front of the number.
Using the method described, complete the table below. The first 2 have been
completed for you (answer at the end).
Confidence Level 1 − α    α      (1 − α) + α/2    Confidence Coefficient z(α/2)
.90                       .10    .95              1.645
.95                       .05    .975             1.96
.99
.80
.98
.96
.93
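As a cross-check on the Table E lookup, the same confidence coefficients can be computed numerically. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `confidence_coefficient` is an assumption for illustration.

```python
from statistics import NormalDist


def confidence_coefficient(level: float) -> float:
    """Return z for a given confidence level 1 - alpha.

    Mirrors the table method: find the z value whose area to the left
    is (1 - alpha) + alpha/2.
    """
    alpha = 1.0 - level
    area_to_left = (1.0 - alpha) + alpha / 2.0
    return NormalDist().inv_cdf(area_to_left)


if __name__ == "__main__":
    for level in (0.90, 0.95):
        print(level, round(confidence_coefficient(level), 3))
```

Values computed this way can differ slightly in the last digit from values read off a printed table, since the table is rounded to two decimal places of z.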
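III. Development of the Confidence Interval Formula
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
Whenever the population standard deviation σ is known and either the population is
normally distributed or n ≥ 30, the sample mean is normally distributed (exactly for a
normal population, approximately by the Central Limit Theorem when n ≥ 30), so with
probability 1 − α:
\[ -z < \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < z \]
\[ -z \cdot \sigma_{\bar{x}} < \bar{x} - \mu < z \cdot \sigma_{\bar{x}} \]
\[ -\bar{x} - z \cdot \sigma_{\bar{x}} < -\mu < -\bar{x} + z \cdot \sigma_{\bar{x}} \]
Multiplying through by −1 reverses the inequalities:
\[ \bar{x} + z \cdot \sigma_{\bar{x}} > \mu > \bar{x} - z \cdot \sigma_{\bar{x}} \]
\[ \bar{x} - z \cdot \sigma_{\bar{x}} < \mu < \bar{x} + z \cdot \sigma_{\bar{x}} \]
Since \( \sigma_{\bar{x}} = \sigma / \sqrt{n} \):
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
Note: If the population standard deviation is not known or stated, use
\[ \bar{x} - t\frac{s}{\sqrt{n}} < \mu < \bar{x} + t\frac{s}{\sqrt{n}} \]
(see section E).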
IV. Review of Concepts and Maximum Error of Estimate
x̄ is the point estimate and the center of the confidence interval.
z is the confidence coefficient, the number of multiples of the standard error needed to
construct an interval estimate of the correct width to have a level of confidence 1 − α.
\( E = z\frac{\sigma}{\sqrt{n}} \) is called the maximum error of estimate.
V. Example:
35 fifth-graders have a mean reading score of 82. The standard deviation of the
population is 15.
a) Find the 95% confidence interval estimate for the mean reading scores of all fifth-graders. Since we know the population standard deviation and n ≥ 30, use
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
Use Table E backwards with the area to the left of z equal to .975 (equivalently, .025
for −z). The value of z or the confidence coefficient is z = 1.9 + .06 = 1.96.
This means approximately 95% of the sample means will fall within 1.96 standard
errors of the population mean. Use z = 1.96 in the formula.
x̄ = 82, n = 35, σ = 15
\[ 82 - 1.96\frac{15}{\sqrt{35}} < \mu < 82 + 1.96\frac{15}{\sqrt{35}} \]
82 − 4.97 < μ < 82 + 4.97
82 ± 5
77 < μ < 87
4.97, rounded to 5, is the maximum error of estimate. Be sure to list it for full
credit in your answers.
b) Find the 99% confidence interval estimate of the mean reading scores of all fifth-graders. Since approximately 99% of the sample means will fall within 2.58 standard
errors of the population mean, use z = 2.58.
x̄ = 82, n = 35, σ = 15
\[ \bar{x} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z\frac{\sigma}{\sqrt{n}} \]
\[ 82 - 2.58\frac{15}{\sqrt{35}} < \mu < 82 + 2.58\frac{15}{\sqrt{35}} \]
82 − 6.54 < μ < 82 + 6.54
82 ± 7
75 < μ < 89
6.54, rounded to 7, is the maximum error of estimate. Be sure to list it in the
next to last step.
c) Is the 95% confidence interval or the 99% confidence interval wider? Explain why.
95% confidence level: 77 < μ < 87
99% confidence level: 75 < μ < 89
The 99% confidence interval is wider because it uses a larger z value.
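The z interval from this example can be reproduced with a few lines of code. The sketch below is illustrative only: the function name `z_interval` is an assumption, and it computes z with `NormalDist` instead of reading Table E, so the maximum error for 99% comes out slightly smaller than with the rounded table value 2.58.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(mean: float, sigma: float, n: int, level: float):
    """Return (lower, upper, E) for a z confidence interval of the mean.

    Assumes sigma is the known population standard deviation and that
    the requirements for a z interval are met.
    """
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # confidence coefficient
    max_error = z * sigma / sqrt(n)              # E = z * sigma / sqrt(n)
    return mean - max_error, mean + max_error, max_error


if __name__ == "__main__":
    # Fifth-grader example: mean 82, sigma 15, n 35.
    for level in (0.95, 0.99):
        lower, upper, e = z_interval(82, 15, 35, level)
        print(level, round(lower, 2), round(upper, 2), round(e, 2))
```

Comparing the two printed rows shows the point of part c: a higher confidence level gives a larger E and therefore a wider interval.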
Question 1
A study of 40 English composition professors showed that they spent, on average, 12.6
minutes correcting a student’s term paper.
Find the 90% confidence interval of the mean time for all composition papers when 𝜎 = 2.5
minutes.
n = 40, X̄ = 12.6, σ = 2.5
Since the population standard deviation is given and n = 40 is greater than 30, use the formula:
\[ \bar{X} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z\frac{\sigma}{\sqrt{n}} \]
If a professor stated that he spent, on average, 30 minutes correcting a term paper, what
would be your reaction?
VI. Maximum Error of Estimate for Confidence Interval Estimates of μ
a) Definition: The maximum error of estimate is the largest likely difference between
the point estimate of a parameter and the actual value of the parameter.
The maximum error of estimate is ½ the width of the confidence interval.
b) Maximum Error of Estimate for Confidence Interval Estimates of μ
It is the term \( E = z\frac{\sigma}{\sqrt{n}} \).
VII. Find the Sample Size Using E and the Confidence Level
a) Concept: E is like tolerance or allowable error where:
\[ E = z\frac{\sigma}{\sqrt{n}} \]
\[ \sqrt{n}\,E = z\sigma \]
\[ \sqrt{n} = \frac{z\sigma}{E} \]
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
b) Formula for the Minimum Sample Size for an Interval estimate of the population
mean
\[ n = \left(\frac{z\sigma}{E}\right)^2 \]
where E is the maximum error of estimate. If the answer is not a whole number,
round up to the next larger whole number to find the sample size, n. If the population
standard deviation is not available, use the sample standard deviation.
c) Example:
An insurance company is trying to estimate the average number of sick days that full-time food service workers use per year. A pilot study found the standard deviation to
be 2.5 days. How large a sample must be selected if the company wants to be 95%
confident of getting an interval that contains the true mean with a maximum error of
1 day?
s = 2.5
confidence level = 95%
maximum error = 1 day
\[ n = \left(\frac{z\sigma}{E}\right)^2 = \left(\frac{1.96 \cdot 2.5}{1}\right)^2 = 4.9^2 \]
n = 24.01
n = 25 workers
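The minimum-sample-size formula for the mean, including the round-up rule, can be sketched as follows. The function name `min_sample_size_mean` is an assumption for illustration, and z is passed in as the table value used in these notes.

```python
from math import ceil


def min_sample_size_mean(z: float, sigma: float, max_error: float) -> int:
    """Return the minimum n so that z * sigma / sqrt(n) <= max_error.

    The raw value (z * sigma / E)^2 is rounded UP, because rounding down
    would give a maximum error slightly larger than the one requested.
    """
    return ceil((z * sigma / max_error) ** 2)


if __name__ == "__main__":
    # Sick-day example: z = 1.96, sigma estimated as 2.5, E = 1 day.
    print(min_sample_size_mean(1.96, 2.5, 1))  # 25
```

Halving the allowed error roughly quadruples the required sample size, since E appears squared in the denominator.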
Question 2
Find the sample size necessary to estimate a population mean to within 0.5 with 95%
confidence if the standard deviation is 6.2.
\[ n = \left(\frac{z \cdot \sigma}{E}\right)^2 \]
Note: When solving for sample size n, always round up to the next larger integer.
E. t Confidence Interval Estimates for the Population Mean
I. Requirements
a) σ is unknown
b) n ≥ 30
c) But, if n < 30, the variable is normally distributed.
II. Characteristics of the t Distribution
Similarities with the normal distribution:
a) Bell shaped.
b) Symmetrical about the mean.
c) The mean, median, and mode are equal to 0 at the center of the distribution.
d) The curve never touches the x axis.
Differences from the standard normal:
a) The variance is greater than 1.
b) The t distribution is actually a family of curves based on the degrees of freedom,
which is related to sample size.
c) As the sample size increases, the t distribution approaches the standard normal
distribution.
Read textbook page 362 (6th Edition) or page 370 (7th Edition) for the comparison
between Normal and t distributions.
(Triola & Triola, 2006)
III. Tabled Values for the t Table F:
a) Location
• 6th Edition: Table F – located on the inside cover of the text on the opposite side
from Table E (standard normal).
• 7th Edition: Table F – located on the last page of the textbook or the pull-out card.
b) Method to find the confidence coefficient for t
• Use the column for the appropriate confidence level
• Use the row for the appropriate degrees of freedom.
• The intersection of the appropriate column and appropriate row is the
confidence coefficient.
Note: If the degrees of freedom needed are not listed in the table, always round
down to the nearest table value. For example, if we need degrees of freedom 44,
use df=40 since 44 is not listed in the table.
IV. Degrees of Freedom for Estimates of the Population Mean
Degrees of freedom are the number of values that are free to vary after a sample statistic
has been computed. For the confidence interval for the mean the degrees of freedom are:
sample size minus 1, or d.f. = n − 1
V. Example:
28 employees of XYZ Company travel an average (mean) of 14.3 miles to work. The
standard deviation of their travel distance was 2 miles. Find the 95% confidence interval of
the true mean or population mean.
Since the population standard deviation is not given, use the formula:
\[ \bar{X} - t\frac{s}{\sqrt{n}} < \mu < \bar{X} + t\frac{s}{\sqrt{n}} \]
n = 28, X̄ = 14.3, s = 2, df: 27
\[ 14.3 - 2.052\frac{2}{\sqrt{28}} < \mu < 14.3 + 2.052\frac{2}{\sqrt{28}} \]
14.3 − 0.776 < μ < 14.3 + 0.776
14.3 ± .8
13.5 < μ < 15.1
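The t interval has the same shape as the z interval, with s in place of σ and a t value from Table F. The sketch below assumes the t value is looked up by hand and passed in (the Python standard library has no t distribution); the name `t_interval` is hypothetical.

```python
from math import sqrt


def t_interval(mean: float, s: float, n: int, t_value: float):
    """Return (lower, upper, E) for a t confidence interval of the mean.

    t_value is the confidence coefficient read from a t table using
    df = n - 1 and the desired confidence level.
    """
    max_error = t_value * s / sqrt(n)  # E = t * s / sqrt(n)
    return mean - max_error, mean + max_error, max_error


if __name__ == "__main__":
    # XYZ Company example: mean 14.3, s = 2, n = 28, t = 2.052 (df = 27, 95%).
    lower, upper, e = t_interval(14.3, 2, 28, 2.052)
    print(round(lower, 1), round(upper, 1), round(e, 3))
```

Keeping the table lookup outside the function mirrors how the notes work the problem: choose the column by confidence level, the row by degrees of freedom, then substitute.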
VI. Example:
The average yearly income for 28 engineering graduates in 2008 is $56,718. The
standard deviation was $650.
1. Find the 95% confidence interval estimate for the population mean.
Since the population standard deviation is not given, use the formula:
\[ \bar{X} - t\frac{s}{\sqrt{n}} < \mu < \bar{X} + t\frac{s}{\sqrt{n}} \]
n = 28, X̄ = 56718, s = 650, df: 27
\[ 56718 - 2.052\frac{650}{\sqrt{28}} < \mu < 56718 + 2.052\frac{650}{\sqrt{28}} \]
56718 − 252 < μ < 56718 + 252
$56,466 < μ < $56,970
Note: Now that you are familiar with this problem, it is simpler to record the steps:
\[ 56718 \pm 2.052\frac{650}{\sqrt{28}} \]
56718 ± 252.1
56718 ± 252 (rounded to the same number of places as the mean)
56466 < μ < 56970
2. If an individual graduate wishes to see if he or she is being paid below average, what
salary value should he or she use?
Use the lower bound of the confidence interval: $56,466.
Question 3
The prices (in dollars) for a particular model of 6.0 megapixels digital camera with 3x optical
zoom are listed as: $225, $240, $215, $202, $206, $211, $210, $193, $250, $225. Estimate
the true mean using this data with 90% confidence.
Since the population standard deviation is not given, use: \( \bar{X} \pm t\frac{s}{\sqrt{n}} \). Do not use the σ from
the calculator. This is a sample, so be sure to use s and work in 2 more places than the data
and round the answers to one more place than the data.
X̄ = 217.70, s = 17.49, t = 1.833, df: 9, n = 10
Question 4
John wants to estimate the average value of the homes in his town with a 99% confidence
interval. Use his random sample of 36 homes with an average value of $251,131.42 and
standard deviation $1321.46 to find the confidence interval.
Since the population standard deviation is not given, use the formula \( \bar{X} \pm t\frac{s}{\sqrt{n}} \).
The degrees of freedom equals 35, but df = 35 is not available in the table. Use the next
lower df or df = 34.
V. Confidence Interval Estimates for Population Proportions
A. Symbols Used to Estimate Proportion
• p = symbol for the population proportion
• p̂ = symbol for the sample proportion; read p “hat”
• q̂ = 1 − p̂ = symbol for the sample proportion of failures
• p̂ = x/n
where x = number of sample units that possess the characteristic of interest and
n = sample size.
B. Development of the Formula
For a Binomial Probability Distribution with x = number of successes:
With np ≥ 5 and n(1 − p) ≥ 5, x is approximately normally distributed with:
\[ \mu = np \qquad \sigma = \sqrt{np(1-p)} \]
Thus for proportions \( \hat{p} = x/n \):
\[ \mu_{\hat{p}} = \frac{\mu}{n} = \frac{np}{n} = p \]
1. The mean of p̂ is p
2. The standard deviation for p̂ becomes:
\[ \sigma_{\hat{p}} = \frac{\sigma}{n} = \frac{\sqrt{np(1-p)}}{n} \]
\[ \sigma_{\hat{p}} = \sqrt{\frac{np(1-p)}{n^2}} \]
\[ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]
Next we use the pattern for the confidence interval estimate of the population mean, point
estimate ± z · standard deviation.
It becomes
\[ \hat{p} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
C. Formula for the confidence interval estimate of the population proportion
Note: a shorter version of the formula to estimate the population proportion is:
\[ \hat{p} \pm z\sqrt{\frac{\hat{p} \cdot \hat{q}}{n}} \]
when np and nq are each greater than or equal to 5.
D. Rounding Rules for proportions
Always use 4 decimal places for the computation and round the answers to 3 decimal places.
E. Example:
1. 55 students in a random sample of 450 enrolled in summer classes. Estimate the
population proportion of students taking classes this summer.
\[ \hat{p} = \frac{X}{n} = \frac{55}{450} = 0.1222 \qquad \hat{q} = 1 - 0.1222 = 0.8778 \]
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
Using z = 1.96 (95% confidence):
\[ 0.1222 \pm 1.96\sqrt{\frac{0.1222 \cdot 0.8778}{450}} \]
0.122 ± 0.030
0.092 < p < 0.152
9.2% < p < 15.2%
2. Is an estimate of 11% about right?
Yes, 11% is about right since it is contained within the confidence interval estimate.
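The proportion interval can be sketched the same way as the intervals for the mean. The function name `proportion_interval` is an assumption for illustration, and the sketch keeps p̂ unrounded during the computation, so intermediate values may differ in the fourth decimal place from hand work that rounds p̂ to 0.1222 first.

```python
from math import sqrt


def proportion_interval(x: int, n: int, z: float):
    """Return (lower, upper, E) for a confidence interval of a proportion.

    x is the number of successes in the sample, n the sample size, and
    z the confidence coefficient read from the normal table.
    """
    p_hat = x / n
    q_hat = 1.0 - p_hat
    # The normal approximation requires at least 5 successes and 5 failures.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("n*p_hat and n*q_hat must each be at least 5")
    max_error = z * sqrt(p_hat * q_hat / n)
    return p_hat - max_error, p_hat + max_error, max_error


if __name__ == "__main__":
    # Summer-class example: 55 of 450 students, z = 1.96.
    lower, upper, e = proportion_interval(55, 450, 1.96)
    print(round(lower, 3), round(upper, 3), round(e, 3))
```

Checking whether a claimed value such as 11% is plausible then amounts to testing whether it lies between the two bounds.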
Question 5
A survey found that out of 200 students, 168 said they needed loans or scholarships to pay
their tuition and expenses. Find the 90% confidence interval for the population proportion
of students needing loans or scholarships.
p̂ = 0.84, q̂ = 0.16
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
Question 6
A study by the University of Michigan found that one in five 13 and 14 year olds is a
sometime smoker. To see how the smoking rate of the students at a large school district
compared to the national rate, the superintendent surveyed two hundred 13 and 14 year
old students and found that 23% said they were sometime smokers. Find the 99%
confidence interval of the true proportion and compare this with the University of
Michigan’s study.
n = 200, p̂ = 0.23, q̂ = 1 − 0.23 = 0.77
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
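F. Formula for the Minimum Sample Size to Estimate a Population Proportion
\[ E = z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ \frac{E}{z} = \sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ \left(\frac{E}{z}\right)^2 = \frac{\hat{p}\hat{q}}{n} \]
\[ \left(\frac{z}{E}\right)^2 = \frac{n}{\hat{p}\hat{q}} \]
\[ \frac{z^2}{E^2} = \frac{n}{\hat{p}\hat{q}} \]
\[ \hat{p}\hat{q} \cdot \frac{z^2}{E^2} = n \]
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
Use p̂ from a pilot study or previous estimate if it is available. Otherwise, use p̂ = .5. n must
be a whole number. If it’s not a whole number, round up to the next larger whole number.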
G. Example:
A medical researcher wishes to determine the percentage of drivers using GPS systems in
their car. He wishes to be 99% confident that the estimate is within 2 percentage points of
the true proportion. A recent study of 180 drivers showed that 25% used GPS systems.
a) How large should the sample size be? Since a recent study showed 25% used GPS
systems, p̂ = 0.25 and q̂ = 0.75.
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
\[ n = 0.25 \cdot 0.75\left(\frac{2.58}{0.02}\right)^2 \]
n = 3120.187
Since the computed n is not a whole number, round up and use n = 3121.
b) If no estimate of the sample proportion is available, how large should the sample be?
Since there is no prior estimate of p, use p̂ = 0.5 and q̂ = 0.5.
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
\[ n = 0.5 \cdot 0.5\left(\frac{2.58}{0.02}\right)^2 \]
n = 4160.25
Since the computed n is not a whole number, round up and use n = 4161.
Note: the sample size needs to be larger when there is no prior estimate for p.
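Both parts of the GPS example follow from one function in which the prior estimate is optional. This is a sketch under the assumption that z is the table value 2.58; the name `min_sample_size_proportion` and the default argument are illustrative choices.

```python
from math import ceil


def min_sample_size_proportion(z: float, max_error: float, p_hat: float = 0.5) -> int:
    """Return the minimum n for estimating a proportion to within max_error.

    p_hat defaults to 0.5, the most conservative choice, because
    p_hat * (1 - p_hat) is largest there.
    """
    q_hat = 1.0 - p_hat
    return ceil(p_hat * q_hat * (z / max_error) ** 2)


if __name__ == "__main__":
    print(min_sample_size_proportion(2.58, 0.02, p_hat=0.25))  # 3121
    print(min_sample_size_proportion(2.58, 0.02))              # 4161
```

The second call needs about a third more subjects than the first, which is the cost of having no pilot estimate.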
VI. Confidence Interval Estimate for the Population Variance and Population
Standard Deviation
A. General Comments
To find confidence intervals for variances and standard deviations,
• Use the chi-square distribution
• Samples must be selected from normally distributed populations.
• Assume the population variance is σ².
• The chi-square distribution is obtained from the values of
\[ \chi^2 = \frac{(n-1)s^2}{\sigma^2} \]
B. Chi-Square Distribution
Reference textbook page 378 (6th Edition) or page 386 (7th Edition).
I. Characteristics
• Chi-Square is always positive.
• It is a family of distributions dependent on degrees of freedom (n – 1).
• The mode is always slightly to the left of degrees of freedom.
• As n increases, Chi-Square walks off to the right.
• Chi-Square distribution is skewed to the right.
II. Finding Chi-Square values on the Chi-Square table
Since χ² is not symmetrical, two different χ² values are used in the confidence interval
formula for the population variance.
For χ² right, use the column in the table for α/2.
For χ² left, use the column in the table for 1 − α/2.
Process:
1. Use the confidence level (1 − α) to find α/2.
2. Use the α/2 column with the appropriate degrees of freedom to find χ² right.
3. Find 1 − α/2.
4. Use the 1 − α/2 column with the appropriate degrees of freedom to find χ² left.
Chi-Square Values for df: 18
Chi-Square left    Left (1 − α/2)    Confidence Level    Right (α/2)    Chi-Square right
9.390              .95               .90                 .05            28.869
8.231              .975              .95                 .025           31.526
10.865             .90               .80                 .10            25.989
7.015              .99               .98                 .01            34.805
6.265              .995              .99                 .005           37.156
C. Formulas
1. Confidence Interval Estimate for the Population Variance:
\[ \frac{(n-1)s^2}{\chi^2_{right}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{left}} \]
df: n − 1
Note: χ² right is on the left side of the equation but the right side of the graph and
χ² left is on the right side of the equation but the left side of the graph.
2. Confidence Interval Estimate for the Population Standard Deviation
Since the population standard deviation is the square root of the population variance,
the confidence interval estimate of the population standard deviation is:
\[ \sqrt{\frac{(n-1)s^2}{\chi^2_{right}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{left}}} \]
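The two chi-square formulas can be sketched together, since the interval for σ is just the square root of the interval for σ². The function name `variance_interval` is hypothetical, and the chi-square values are assumed to be read from the table by hand and passed in.

```python
from math import sqrt


def variance_interval(n: int, s_squared: float, chi2_right: float, chi2_left: float):
    """Return ((var_low, var_high), (sd_low, sd_high)).

    chi2_right comes from the alpha/2 column and chi2_left from the
    1 - alpha/2 column, both with df = n - 1.
    """
    numerator = (n - 1) * s_squared
    var_low = numerator / chi2_right   # the larger divisor gives the lower bound
    var_high = numerator / chi2_left   # the smaller divisor gives the upper bound
    return (var_low, var_high), (sqrt(var_low), sqrt(var_high))


if __name__ == "__main__":
    # Numbers from Question 8: n = 28, s = 5.2, 95% confidence, df = 27.
    (v_lo, v_hi), (s_lo, s_hi) = variance_interval(28, 5.2 ** 2, 43.194, 14.573)
    print(round(v_lo, 1), round(v_hi, 1), round(s_lo, 1), round(s_hi, 1))
```

Note that, unlike the intervals for the mean and proportion, this interval is not symmetric about the point estimate, because the chi-square distribution is skewed.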
D. Rounding Rules for Standard Deviation or Variance
I. When using actual data:
a) find the standard deviation to two more decimal places than the data.
b) round the answer to one more decimal place than the original data.
II. When using sample standard deviation or variance, work with one more decimal place
than the statistic and round to the same number of places as the standard deviation or
variance given.
E. Example:
Find the confidence interval for the standard deviation in the time it takes to fill a car with
gas. In a sample of 23 fill-ups, the standard deviation of the time it takes to fill the car is 3.8
minutes. Assume the variable is normally distributed.
Note: the answer has the same number of decimal places as the given sample standard
deviation since the work is done with statistics instead of data.
Question 7
Find the 99% confidence interval for the variance and standard deviation of the weights of
one-gallon containers of motor oil when the sample of 14 containers has a variance of 3.2.
Assume the variable is normally distributed.
F. Example:
The number of calories in a 1-ounce serving of various regular cheeses is shown. Estimate
the population variance with 90% confidence.
110   130    45   100   100
 80    95   105   110   105
110    90   100   110   110
 70    95   125   120   108
Is the 90% confidence interval estimate for the population variance.
Note:
• Use the tabled values for χ² and use s rounded to 2 more places than the data in the
computation.
• Since data is used to compute the standard deviation, the answer has one more place
than the original data.
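When raw data are given, the sample variance has to be computed before the chi-square formula can be applied. The sketch below does this for the cheese data with the standard `statistics` module. The two chi-square values (30.144 and 10.117 for df = 19 at 90% confidence) are assumptions taken from a standard chi-square table, not from these notes, and the variable names are illustrative.

```python
from statistics import variance

# Calories in a 1-ounce serving of various regular cheeses (from the example).
calories = [110, 130, 45, 100, 100, 80, 95, 105, 110, 105,
            110, 90, 100, 110, 110, 70, 95, 125, 120, 108]

n = len(calories)
s_squared = variance(calories)  # sample variance (divides by n - 1)

# Assumed table values for df = n - 1 = 19 and 90% confidence:
CHI2_RIGHT = 30.144  # alpha/2 = 0.05 column
CHI2_LEFT = 10.117   # 1 - alpha/2 = 0.95 column

var_low = (n - 1) * s_squared / CHI2_RIGHT
var_high = (n - 1) * s_squared / CHI2_LEFT

if __name__ == "__main__":
    print("n =", n, " s^2 =", round(s_squared, 2))
    print(round(var_low, 1), "< sigma^2 <", round(var_high, 1))
```

Hand computation that rounds s before squaring can differ from this result in the last reported digit.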
Question 8
A service station advertises a wait of no more than 30 minutes for an oil change. A sample
of 28 oil changes has a standard deviation of 5.2 minutes. Find the 95% confidence interval
of the population standard deviation of the times spent waiting for an oil change.
\[ \frac{(n-1)s^2}{\chi^2_{right}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{left}} \]
VII. Summaries
A. Estimates for Population Parameters
• Estimation is an important aspect of inferential statistics.
• A point estimate is a single value with no accuracy specified.
• An interval estimate is a range of values with its accuracy specified by the confidence level.
• Every question about a confidence interval will have the words “Find a confidence interval
estimate for…”.
• Pay particular attention to the words that determine whether the confidence interval is for the
mean, proportion, or variance (or its square root, the standard deviation).
B. Minimum Sample Sizes to Estimate Population Parameters
• You always need to know both the confidence level and the maximum error of estimate.
In addition, for the:
1. Mean – the population standard deviation (given or estimate) is also required.
\[ n = \left(\frac{z \cdot \sigma}{E}\right)^2 \]
2. Proportion – an estimate of the proportion from a pilot study is preferred
(or use p̂ = .5).
\[ n = \hat{p}\hat{q}\left(\frac{z}{E}\right)^2 \]
C. Rounding Rules
I. For Estimates of the Mean
a) When using actual data:
(a) find the mean and standard deviation to two more decimal places than the data.
(b) round the answer to one more decimal place than the original data.
Note: This is very important! Answers not rounded correctly are marked wrong on
Mathzone.
b) When using a mean and standard deviation, work with one more decimal place than
the data and round to the same number of decimal places given for the mean.
II. For Estimates of the Standard Deviation or Variance
a) When using actual data:
(a) find the standard deviation to two more decimal places than the data.
(b) round the answer to one more decimal place than the original data.
b) When using sample standard deviation or variance, work with one more decimal
place than the statistic and round to the same number of places as the standard
deviation or variance given.
III. For Estimates of the Proportions
a) Always use 4 decimal places for the computation and round the answers to 3
decimal places.
Answer: Confidence Coefficient
z is called the confidence coefficient, i.e., the number of multiples of the standard error for an
interval estimate with a 1 − α level of confidence.
1 − α    α      α/2
.90      .10    .05
.95      .05    .025
.99      .01    .005
.80      .20    .10
.85      .15    .075
Answer: Method to find the Confidence Coefficient
1. Use the confidence level (1 − α) to find α/2.
2. Use the α/2 column with the appropriate degrees of freedom to find χ² right.
3. Find 1 − α/2.
4. Use the 1 − α/2 column with the appropriate degrees of freedom to find χ² left.
Confidence Level 1 − α    α      (1 − α) + α/2    Confidence Coefficient z(α/2)
.90                       .10    .95              1.645
.95                       .05    .975             1.96
.99                       .01    .995             2.58
.80                       .20    .90              1.28
.98                       .02    .99              2.33
.96                       .04    .98              2.05
.93                       .07    .965             1.81
Answer: Question 1
A study of 40 English composition professors showed that they spent, on average, 12.6 minutes
correcting a student’s term paper.
a) Find the 90% confidence interval of the mean time for all composition papers when σ = 2.5
minutes.
n = 40, X̄ = 12.6
Since the population standard deviation is given and n = 40 is greater than 30, use the formula:
\[ \bar{X} - z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z\frac{\sigma}{\sqrt{n}} \]
\[ 12.6 - 1.645\frac{2.5}{\sqrt{40}} < \mu < 12.6 + 1.645\frac{2.5}{\sqrt{40}} \]
12.6 − 0.6502 < μ < 12.6 + 0.6502
12.6 − 0.7 < μ < 12.6 + 0.7
11.9 < μ < 13.3
b) If a professor stated that he spent, on average, 30 minutes correcting a term paper, what would
be your reaction?
11.9 < 𝜇 < 13.3
It would be highly unlikely since 30 minutes is far longer than the upper bound of 13.3 minutes.
Answer: Question 2
Find the sample size necessary to estimate a population mean to within 0.5 with 95% confidence if
the standard deviation is 6.2.
\[ n = \left(\frac{z \cdot \sigma}{E}\right)^2 \]
\[ n = \left(\frac{(1.96)(6.2)}{0.5}\right)^2 = [24.304]^2 = 590.684 \]
n = 591
Note: When solving for sample size n, always round up to the next larger integer (Why?)
Answer: Question 3
The prices (in dollars) for a particular model of 6.0 megapixels digital camera with 3x optical zoom
are listed as: $225, $240, $215, $202, $206, $211, $210, $193, $250, $225. Estimate the true mean
using this data with 90% confidence.
Since the population standard deviation is not given, use: \( \bar{X} \pm t\frac{s}{\sqrt{n}} \). Do not use the σ from the
calculator. This is a sample, so be sure to use s and work in 2 more places than the data and round
the answers to one more place than the data.
X̄ = 217.70, s = 17.49, t = 1.833, df: 9, n = 10
\[ 217.70 \pm 1.833\frac{17.49}{\sqrt{10}} \]
217.7 ± 10.1
207.6 < μ < 227.8
Note: 𝑋 and 𝑠 are found to two decimal places more than the data, but the answer is rounded back
to one more place than the data.
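The statistics used in this answer can be checked directly from the ten prices. This sketch uses `mean` and `stdev` from Python's standard `statistics` module (`stdev` is the sample standard deviation s, not the population σ); the t value 1.833 is the table value quoted above, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Camera prices in dollars (from Question 3).
prices = [225, 240, 215, 202, 206, 211, 210, 193, 250, 225]

n = len(prices)
x_bar = mean(prices)
s = stdev(prices)        # sample standard deviation (divides by n - 1)
T_VALUE = 1.833          # Table F: df = n - 1 = 9, 90% confidence

max_error = T_VALUE * s / sqrt(n)

if __name__ == "__main__":
    print("x_bar =", round(x_bar, 2), " s =", round(s, 2))
    print(round(x_bar - max_error, 1), "< mu <", round(x_bar + max_error, 1))
```

Because the code keeps full precision until the end, its bounds may differ by 0.1 from hand work that rounds the maximum error to 10.1 before adding and subtracting.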
Answer: Question 4
John wants to estimate the average value of the homes in his town with a 99% confidence interval.
Use his random sample of 36 homes with an average value of $251,131.42 and standard deviation
$1321.46 to find the confidence interval.
Since the population standard deviation is not given, use the formula \( \bar{X} \pm t\frac{s}{\sqrt{n}} \).
The degrees of freedom equals 35, but df = 35 is not available in the table. Use the next lower df or
df = 34.
\[ 251131.42 \pm 2.728\frac{1321.46}{\sqrt{36}} \]
251131.42 ± 600.82
250530.60 < μ < 251732.24
Note: Since statistics are given, work one more place than the statistic but round the answer back
to the same number of places as X̄.
Answer: Question 5
A survey found that out of 200 students, 168 said they needed loans or scholarships to pay their
tuition and expenses. Find the 90% confidence interval for the population proportion of students
needing loans or scholarships.
p̂ = 0.84, q̂ = 0.16
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ 0.84 - 1.645\sqrt{\frac{0.84 \cdot 0.16}{200}} < p < 0.84 + 1.645\sqrt{\frac{0.84 \cdot 0.16}{200}} \]
0.84 − 0.043 < p < 0.84 + 0.043
0.797 < p < 0.883
79.7% < p < 88.3%
Answer: Question 6
A study by the University of Michigan found that one in five 13 and 14 year olds is a sometime
smoker. To see how the smoking rate of the students at a large school district compared to the
national rate, the superintendent surveyed two hundred 13 and 14 year old students and found
that 23% said they were sometime smokers. Find the 99% confidence interval of the true
proportion and compare this with the University of Michigan’s study.
n = 200, p̂ = 0.23, q̂ = 1 − 0.23 = 0.77
\[ \hat{p} - z\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ 0.23 \pm 2.58\sqrt{\frac{0.23 \cdot 0.77}{200}} \]
0.23 ± 0.077
0.153 < p < 0.307
15.3% < p < 30.7%
Since 1/5 = 0.20 falls within the confidence interval, the district's smoking rate is consistent
with the University of Michigan study.
Answer: Question 7
Find the 99% confidence interval for the variance and standard deviation of the weights of one-gallon containers of motor oil when the sample of 14 containers has a variance of 3.2. Assume the
variable is normally distributed.
n = 14, s² = 3.2
\[ \frac{(n-1)s^2}{\chi^2_{right}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{left}} \]
\[ \frac{13 \cdot 3.2}{29.819} < \sigma^2 < \frac{13 \cdot 3.2}{3.565} \]
1.4 < σ² < 11.7 variance
1.2 < σ < 3.4 standard deviation
Note: The answer has the same number of decimal places as the given sample variance.
Answer: Question 8
A service station advertises a wait of no more than 30 minutes for an oil change. A sample of 28 oil
changes has a standard deviation of 5.2 minutes. Find the 95% confidence interval of the
population standard deviation of the times spent waiting for an oil change.
\[ \frac{(n-1)s^2}{\chi^2_{right}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{left}} \]
\[ \frac{27 \cdot 5.2^2}{43.194} < \sigma^2 < \frac{27 \cdot 5.2^2}{14.573} \]
16.9 < σ² < 50.1 variance in waiting time.
4.1 < σ < 7.1 standard deviation in waiting time.
Works Cited
Triola, M.D., Marc M. and Mario F. Triola. Biostatistics for the Biological and Health
Sciences. New York: Pearson Education, Inc., 2006.
