Choosing Sample Sizes 10/13/2011
Ira Johnson, Senior Quality Engineer, Moog Space and Defense Group

Why Is Sample Size Important?
• To minimize risks and make good, durable decisions
• To avoid oversampling (costs!)
• Consistent, justifiable procedures limit liability
• It is an expectation of quality engineers (Q.E.s) in most organizations

What Will Be Discussed?
• The rules of thumb: why they work (or not)
• Data types, statistical terminology
• Common problems in sampling
• Attribute sample sizes
• Variable averages and hypothesis testing
• Standard deviations
• Cpk sample size

Recommended Sample Size Method
• The easiest and most accurate method for selecting a sample size is…

Power and Sample Size in Minitab

Most Common Method for Sample Sizes
• But what if you don't have Minitab, or don't have the patience to memorize all those statistical terms?
• Quality professionals have long used methods for selecting sample sizes that do not require Minitab or any statistical calculations…

The Sample Size Rules of Thumb!
• Detect changes using ±3σ limits
• 10 samples to estimate an average
• 20 samples to estimate a standard deviation
• 30 samples for a capability study (Cpk)
• 50 samples to identify a distribution (shape)
• Use the Standard Normal Table from ANSI/ASQ Z1.4 to select sample sizes
• 59 samples of attribute data to achieve 95/95
• 1,000 samples needed for valid surveys
DO THESE WORK? Let's play Mythbusters!

Rules of Thumb: Convenience or Danger?
• Rules of thumb persist because they are easy to use and often work
• But will they work for you? Why or why not?

Background/Review
• Common assumptions
• Key definitions
• Data types
• Common sense practices

Typical Assumptions for Rules of Thumb (R.O.T.)
• Samples are randomly and independently selected from the population
• Measurement error is negligible
• Data are normally distributed
• The standard deviation is a known, constant value
• Looking for a moderately large change, such as a difference of 1 sigma or more

The "Common Sense" Checks
Questions that should ALWAYS be asked:
• What are you going to decide using this data?
• How was the data collected and measured?
• How big a change is important?
• Is there any prior knowledge (standard deviation, distribution, etc.)?
• What confidence is needed that a change will be detected?
• Will there be ongoing sampling, or is this a one-shot check?
• Is the process stable for the short term?

Key Definitions: Types of Risks
• Alpha (α), Type I error: false rejection; concluding there is a difference when there is not
• Beta (β), Type II error: false acceptance; concluding there is no difference when there actually is
• Power: correctly concluding there is a difference; the ability to detect a difference of δ. Power = 1 − β
• Delta (δ): the magnitude of difference or change that is important to detect

Some Common Types of Data
• Counts: integer values (0, 1, 2, 3, etc.)
• Binomial data: outcomes that take 1 of 2 conditions (pass/fail, yes/no, etc.)
• Rates: quantity per unit; can range from 0 to ∞, such as defects/hour, complaints/day, repairs/1,000 hours
• Proportions: values between 0 and 1, or 0-100%
• Means and/or standard deviations: variables data that can range from −∞ to +∞

Attribute Sample Sizes
• Derive binomial sample size equations "by hand"
• ANSI Z1.4 (MIL-STD-105) normal table

Coin and Dice Toss Examples
Probability of no "defects", where a coin defect is tails and a dice defect is rolling a "1":

Trials   Coin (no tails)    Dice (no "1")
1        1/2 = 50%          5/6 = 83.3%
2        1/4 = 25%          25/36 = 69.4%
3        1/8 = 12.5%        125/216 = 57.9%
N        (1/2)^N            (5/6)^N

The probability of no "defects" in a sample is $P(0) = (1-p)^N$, where p is the probability of a defect in a single sample and N is the sample size.
• Can that equation be used to calculate sample size?
• In lay terms, what does (1 − p) translate to in these examples?

Calculate Sample Size from $P(0) = (1-p)^N$
• Consider "p" the defect rate that you cannot accept
• Select the confidence level (C.L.) for detecting a defect rate of "p" or greater, and set P(0) = 1 − C.L., the allowed chance of seeing no defects
• Solve for N
• $N = \log(P(0)) / \log(1-p)$, then round up
• Reject the lot if one or more defects are found in the sample of size N

Binomial Sample Sizes Matrix (N before rounding up)

Defect level                      Confidence level
p          %        PPM           50%      95%
0.500000   50.0%    500,000       1.0      4.3
0.166667   16.7%    166,667       3.8      16.4
0.050000   5.0%     50,000        13.5     58.4
0.010000   1.0%     10,000        69.0     298.1

• Can you locate the 59 pc attribute sample for the 95/95 R.O.T.?

Use of Standard Normal Tables
• R.O.T.: "Use the Standard Normal Table from ANSI/ASQ Z1.4 to select sample sizes."
• Is this a good R.O.T.? Why or why not?
• ANS1: ANSI/ASQ Z1.4 REQUIRES a switch to the Tightened table after too many rejected lots
• ANS2: In 2000, DoD declared Z1.4 obsolete and recommended c=0 plans, which provide equal or greater consumer protection with less overall inspection than Z1.4
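The zero-defect equation from the "Calculate Sample Size" slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's tool: the function name `zero_defect_sample_size` is hypothetical, and it takes P(0) as 1 minus the confidence level, which is the reading that reproduces the matrix values (for example 58.4 for p = 5% at 95% confidence).

```python
import math


def zero_defect_sample_size(p: float, confidence: float) -> int:
    """Smallest sample size N for an accept-on-zero plan.

    p          : defect rate that cannot be accepted (0 < p < 1)
    confidence : chance of seeing at least one defect in the sample
                 when the true defect rate is p (0 < confidence < 1)

    Assumption: P(0) = (1 - p)**N is set equal to 1 - confidence.
    """
    if not (0.0 < p < 1.0) or not (0.0 < confidence < 1.0):
        raise ValueError("p and confidence must both lie strictly between 0 and 1")
    # Solve (1 - p)**N = 1 - confidence for N, then round up to a whole sample.
    n_exact = math.log(1.0 - confidence) / math.log(1.0 - p)
    return math.ceil(n_exact)


if __name__ == "__main__":
    # Defect levels taken from the Binomial Sample Sizes Matrix.
    for p in (0.5, 1.0 / 6.0, 0.05, 0.01):
        print(f"p = {p:.6f}: N = {zero_defect_sample_size(p, 0.95)} at 95% confidence")
```

For p = 0.05 and 95% confidence the function returns 59, the "95/95" rule-of-thumb sample size; the lot is rejected if any defect appears in those 59 pieces.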
Variable Sample Size Topics
• Central Limit Theorem: normality of averages
• Standard deviation of averages
• 3σ limits compared to α/β risk limits
• Hypothesis test review
• When standard deviation is not known
• Changes in standard deviations

Central Limit Theorem

Effect of N on σ

Hypothesis Testing Steps
• Start with a null hypothesis, H0
• Add an alternative hypothesis, HA, that applies if H0 is not true
• Select the allowable risk levels:
  - α for the risk of falsely rejecting the null, typically 0.05
  - β for the risk of failing to accept the alternative, typically 0.05
• Identify the appropriate statistical distribution
• Determine the sample size that satisfies those risk levels

Using 3σ Limits to Detect a 1σ Shift
• The original distribution (Ho) is the black curve on the right
• The red curve (Ha) shows the process shifted 1σ to the left
• Using 3σ limits, >97% of the new distribution overlaps the old distribution
• What is the power of a 3σ test, for δ = 1σ and sample size n = 1?
[Figure: Histogram of Ho, Ha; 1 Sigma Shift, 3 Sigma Limits. Vertical lines at −3 Sigma Ho and +3 Sigma Ho. Y axis: Density.]

Using α = 0.05 to Detect a 1σ Shift
• With the process shifted 1σ to the left and using α = 0.05, >66% of the Ha distribution would overlap the Ho distribution
• What is the power of a 1-sample test with α = 0.05?
[Figure: Histogram of Ho, Ha; 1 Sigma Shift, alpha = 0.05. Vertical lines at Ho − Z(Alpha/2) and Ho + Z(Alpha/2). Y axis: Density.]

Sample Size of 9
• A sample size of 9 reduces σ of the average (σ x-bar) by a factor of 3
• The gray area shows the likelihood of detecting the 1σ shift
[Figure: Histogram of Ho and Ha, n = 9; 1 Sigma Shift, Alpha = 0.05. Acceptance limits at 9.347 and 10.653. Axes: Density vs. Data.]

Sample Size vs. Power
[Figure: Detection Power for a 1 Sigma Change, Normal Distribution. Axes: Power vs. Sample Size (0 to 30), with n = 10 marked.]
• N = 10 provides power = 88% for δ = 1σ. Not the 95% typically desired, but close. Is the R.O.T. "busted"?

What δ Can a 10 pc Sample Detect?
[Figure: Power Curve, 10 pc Sample Z Test. Axes: Power vs. Difference. Assumptions: Alpha 0.05, StDev 1, Alternative Not =. With sample size 10, power 0.95 is reached at a difference of 1.14.]

Impact of δ on Sample Size
[Figure: Effect of Delta on Power and Sample Size. Axes: Power vs. Sample Size (0 to 30). Curves: 0.5 Sigma, 1 Sigma and 2 Sigma power, Normal; n = 13 and power = 0.95 marked.]
• N = 13 has 95% power to detect a 1σ change
• Changes other than 1σ dramatically affect power

Graphing the Elements of a Hypothesis Test
• Ho is the black distribution
• Ha is the red distribution
• δ = potential change to detect
• Yellow area signifies β risk
• Blue area signifies α risk
• Gray area signifies power
• Acceptance criteria are the dotted vertical lines
[Figure: Histogram of Ho, Ha; 1 Sigma Shift, n = 13. Vertical lines at Ho − Z(Alpha/2) and Ho + Z(Alpha/2), with δ marked. Y axis: Density.]

Change in Average, σ Unknown
• Z-tables are for the standard normal distribution, where the standard deviation is a known value. It could come from previous data or an SPC chart.
• What do we use if the standard deviation is not known?
• ANS: Use Student's t-distribution
• The t-test uses the sample data to estimate both an average and a standard deviation, but loses a little power

Power of t-Test vs. Normal Z
[Figure: Power vs. Sample Size, Z test compared to t-test with a 1 Sigma shift. Axes: Power vs. Sample Size (0 to 30).]

Standard Deviation Sample Sizes
• Use variance (the square of the standard deviation) and the F-statistic to test for differences
• The F statistic is the ratio of two variances
• As the sample sizes for both the numerator and denominator ⇒ ∞, the F-ratio ⇒ 1
• By selecting n2 = ∞, a single variance can be assessed against the population standard deviation
• Requires normal data!!

20 Samples for Standard Deviation R.O.T.
• The F ratio vs. sample size curve is very steep where N < 20
• The true standard deviation is within about ±25% of the sample value with N = 20
• R.O.T. confirmed?
[Figure: F-Ratio vs. Sample Size. Axes: F ratio (1.0 to 2.8) vs. sample size (0 to 80).]

Capability Index Confidence Intervals
• The following approximation is commonly used:
$$C_{pk} = \hat{C}_{pk} \pm Z_{1-\alpha/2}\sqrt{\frac{1}{9n} + \frac{\hat{C}_{pk}^{2}}{2(n-1)}}$$
• It is important to note that the sample size should be at least 25 before these approximations are valid*
*Obtained from ITL.NIST.gov

Are 30 Pieces for Cpk OK?
• Green indicates the lower confidence bound exceeded 1.33
• For a sample size of 30, the calculated Cpk needs to be >1.7 to be confident the actual Cpk is >1.33

Lower Confidence Bounds for Cpk (Required Cpk: 1.33; Alpha: 0.05)

        Calculated/Estimated Cpk
n       1.33       1.4        1.5        1.67       1.75
30      1.025804   1.081489   1.160917   1.29568    1.359004
40      1.067565   1.125226   1.207494   1.347119   1.412742
80      1.14548    1.206816   1.294364   1.443033   1.512937
120     1.179622   1.242564   1.332421   1.485044   1.556818
160     1.19989    1.263785   1.355011   1.509979   1.582864
700     1.267929   1.335018   1.430834   1.593667   1.670274

Which Rules Failed Mythbusters?
• Detect changes using ±3σ limits
• 10 samples to approximate an average
• 20 samples to estimate a standard deviation
• 30 samples for a capability study (Cpk)
• 50 samples to identify a distribution (shape)
• Use the Standard Normal Table from ANSI/ASQ Z1.4 to select sample sizes
• 59 samples of attribute data to achieve 95/95
• 1,000 samples needed for valid surveys

The End
QUESTIONS?

Useful References
• How to Choose the Proper Sample Size, by Gary G. Brush, ASQ Quality Press
• Zero Acceptance Number Sampling Plans, by Nicholas L. Squeglia, ASQ Press
• Online Engineering Statistics Handbook, NIST/SEMATECH, website: itl.nist.gov
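
For readers without Minitab, the variable-data figures quoted in the slides can be roughly checked with the Python standard library. The sketch below is an illustration under stated assumptions, not the presenter's method: it assumes a two-sided one-sample Z test with known σ and a shift expressed in σ units, and it computes the Cpk lower bound from the approximation on the Capability Index slide using a one-sided z value at alpha, which appears to match the tabulated bounds. All function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_power(n: int, delta_sigma: float, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample Z test (sigma assumed known).

    delta_sigma is the shift in the average, in units of sigma.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # Averaging n pieces shrinks the standard deviation by sqrt(n),
    # so the shift measured in standard errors grows by sqrt(n).
    shift = abs(delta_sigma) * math.sqrt(n)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def smallest_n_for_power(delta_sigma: float, power: float = 0.95,
                         alpha: float = 0.05, n_max: int = 100_000) -> int:
    """Smallest sample size whose Z-test power reaches the target."""
    for n in range(1, n_max + 1):
        if z_test_power(n, delta_sigma, alpha) >= power:
            return n
    raise ValueError("target power not reached within n_max samples")


def cpk_lower_bound(cpk_hat: float, n: int, alpha: float = 0.05) -> float:
    """Approximate lower confidence bound for Cpk.

    Assumption: one-sided bound, i.e. z at (1 - alpha) rather than
    (1 - alpha / 2); the approximation is stated as valid for n >= 25.
    """
    z = STD_NORMAL.inv_cdf(1.0 - alpha)
    std_err = math.sqrt(1.0 / (9.0 * n) + cpk_hat ** 2 / (2.0 * (n - 1)))
    return cpk_hat - z * std_err


if __name__ == "__main__":
    print("Power, n=10, 1 sigma shift:", round(z_test_power(10, 1.0), 3))
    print("n for 95% power, 1 sigma shift:", smallest_n_for_power(1.0))
    print("Cpk lower bound, Cpk=1.33, n=30:", round(cpk_lower_bound(1.33, 30), 4))
```

Under these assumptions the power at n = 10 comes out near 88%, the smallest n for 95% power on a 1σ shift is 13, and the lower bound for a calculated Cpk of 1.33 at n = 30 is about 1.026, in line with the slides. Passing `alpha=0.0027` with `n=1` approximates the ±3σ-limit case, where power is only about 2%.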