02 - Descriptive Statistics
5/8/2014
Descriptive Statistics
PSYC 381
Arlo Clark-Foos
• Descriptive Statistics
  – Organize
  – Summarize
  – Graphing
• New Terms:
  – Raw scores
  – Distribution
• Frequency Tables
  – Visual depiction of data that shows how often each value occurred.
    • Typically used for discrete variables.
  – Steps in Creating…
    • Find highest and lowest scores
    • Create 2 columns, one for the value and one for the frequency
    • List full range of values, including those with frequency of 0
    • Count the # of scores at each value, record that in frequency column.
  – Example: On the board, record the number of days each week that you spend watching 2+ hours of television
  [Table: Days with 2+ hours of TV (7 down to 0) | Frequency]
• Expanding Frequency Tables to Percentages
  – Cumulative Percentage
    • The percentage of individuals who have scores at a given value or lower.
    • Calculating: Count how many scores fall at or below a given value. Divide this number by the total number of scores (multiply by 100 to express it as a percentage).
  [Table: Days with 2+ hours of TV (7 down to 0) | Frequency | Percent | Cumulative Percent]
• Grouped Frequency Tables
  – Reports the frequencies within a given interval rather than the frequencies for a specific value.
  – When to use instead of frequency tables…
    • Continuous, interval/ratio (scale) variables
    • When data cover a huge range
  – Determining the # of intervals…(5-10?)
• Grouped Frequency Tables, example.
  [Table: Number of Siblings | Frequency]
• Histograms
  – Typically used to depict interval data with the values of the variable on the x-axis and the frequencies on the y-axis (similar to a bar graph).
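The frequency-table steps and the cumulative-percentage calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the `tv_days` scores are made up for the example (they are not class data), and the sketch assumes integer scores.

```python
from collections import Counter


def frequency_table(scores):
    """Return rows of (value, frequency, percent, cumulative percent).

    Rows run from the lowest to the highest score and include
    values that never occurred (frequency of 0).
    """
    counts = Counter(scores)
    total = len(scores)
    rows = []
    at_or_below = 0  # running count of scores at or below the current value
    for value in range(min(scores), max(scores) + 1):
        freq = counts[value]  # Counter gives 0 for values that never occurred
        at_or_below += freq
        percent = 100 * freq / total
        cumulative_percent = 100 * at_or_below / total
        rows.append((value, freq, percent, cumulative_percent))
    return rows


# Hypothetical scores: days per week with 2+ hours of TV for ten students.
tv_days = [0, 1, 1, 2, 2, 2, 3, 5, 7, 7]

print(f"{'Days':>5} {'Frequency':>9} {'Percent':>8} {'Cum. Pct':>9}")
# Print from the highest value down, matching the layout of the slide tables.
for value, freq, pct, cum_pct in reversed(frequency_table(tv_days)):
    print(f"{value:>5} {freq:>9} {pct:>8.1f} {cum_pct:>9.1f}")
```

The cumulative percentage is accumulated from the lowest value upward, so the row for the highest value always reaches 100.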
• Histogram
  – Example
  – Constructing from a Frequency Table
    • Draw x-axis and label with variable of interest and full range of values
    • Draw y-axis and label it “frequency”
    • Draw a bar for each value as high as the frequency for that value, as represented on the y-axis
• Frequency Polygons
  – Line graphs with the x-axis representing values (or midpoints of intervals) and the y-axis representing frequencies.
    • Start the same as a Histogram
    • Instead of drawing bars to represent frequency, place a dot at the appropriate frequency for each value or interval of values
    • Connect the dots
• Normal Distribution
  – A specific frequency distribution with a bell-shaped, symmetric, unimodal curve
    • Similar to a frequency polygon with infinite observations, thus a smooth curve instead of connected dots
• Skewness
  – How much one of the tails of the distribution is pulled away from the center
  – Floor effects & Ceiling effects
• Kurtosis
  – The degree to which a curve’s width and thickness of its tails deviate from the normal curve
    • Mesokurtic (normal), Leptokurtic (tall and thin), Platykurtic (short and fat)
  [Figure: Platykurtic and Leptokurtic curves]
• Central Tendency
  – Best represents the center of a data set, the particular value that all the other data seem to be gathering around. Usually at high point of histogram.
  – Mean, Median, Mode
    • There is no best measure for all data!
• Mean
  – Arithmetic average of a group of scores
  – $M = \frac{\sum X}{N}$
• Median (mdn)
  – The middle score of all the scores in a sample when the scores are arranged in ascending order.
  – Example (odd number of scores): 5 3 6 9 11 28 3 1 15
    • Sorted: 1 3 3 5 6 9 11 15 28
    • Median = 6
  – Example (even number of scores): 9 45 32 27 16 3 89 12
    • Sorted: 3 9 12 16 27 32 45 89
    • Median = 21.5
• Mode
  – The most common score of all the scores in a sample
    • Unimodal, Bimodal, Multimodal
  – Example: 5 3 6 9 11 28 3 1 15
    • Sorted: 1 3 3 5 6 9 11 15 28
    • Mode = 3
• Outliers
  – An extreme score that is either very high or very low in comparison with the rest of the scores in the distribution.
• Which is the best measure?
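One way to explore that question is to compute all three measures on the small score sets from the median and mode slides. The sketch below is a minimal illustration, not the course's own material: the function names are chosen for this example, and `modes` returns a list because a sample can be bimodal or multimodal.

```python
from collections import Counter


def mean(scores):
    """Arithmetic average: the sum of the scores divided by N."""
    return sum(scores) / len(scores)


def median(scores):
    """Middle score once the scores are arranged in ascending order."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd N: the single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: average the two middle scores


def modes(scores):
    """All scores tied for the highest frequency, in ascending order."""
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


odd_sample = [5, 3, 6, 9, 11, 28, 3, 1, 15]
even_sample = [9, 45, 32, 27, 16, 3, 89, 12]

print("mean:", mean(odd_sample))      # 9.0
print("median:", median(odd_sample))  # 6
print("mode(s):", modes(odd_sample))  # [3]
print("median:", median(even_sample)) # 21.5
```

In the first sample the single high score of 28 pulls the mean (9.0) above the median (6). In the second sample every score occurs once, so `modes` returns every value and the mode is not informative.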
• Example scores: 3 9 12 2 16 2 17 5 11 45 89 32 1 96
  – Sorted: 1 2 2 3 5 9 11 12 16 17 32 45 89 96
  – Mode = 2, Median = 11.5, Mean = 24.29
• Example from book (1e):
  – Tuition Increases & Outliers
• $\text{Range} = X_{\text{highest}} - X_{\text{lowest}}$
  – Does not tell us much, other than absolute spread
    • How close to the mean?
    • How far from the mean is the typical score?
• Variance & Standard Deviation
  – The typical amount that the scores in a sample vary, or deviate, from the mean.
  – Variance: $SD^2$ or $s^2$ or $\sigma^2$
    • $SD^2 = \frac{\sum (X - M)^2}{N}$
    • Sum of Squares (SS): The sum of squared deviations from the mean
  – Standard Deviation: $SD$ or $s$ or $\sigma$
    • $SD = \sqrt{\frac{\sum (X - M)^2}{N}}$
• Variance & Standard Deviation
  – Steps in Calculating
    1. Find the mean of the data
    2. Subtract the mean from each individual score
    3. Square each of these numbers
    4. Sum all of these squared numbers (Sum of Squares)
    5. Divide the resulting number by the number of scores (N)
    6. If calculating SD, take the square root of this number
• Interquartile Range (IQR)
  – The difference between the first and third quartiles of a data set.
  – 1st Quartile: 25th percentile
    • Median of lower half of distribution
  – 3rd Quartile: 75th percentile
    • Median of upper half of distribution
  – Steps in Calculating
    1. Calculate the median
    2. For the lower half, calculate another median (Q1).
    3. For the upper half, calculate another median (Q3)
  – $IQR = Q3 - Q1$
• Ages of everyone in class
• Decide the best way to portray this data graphically, then graph it neatly (use a ruler if you have to)
  – Describe the shape of this distribution
• Calculate Measures of Central Tendency
  – Which is the best for this data set?
• Calculate Measures of Variability (including IQR)
• Show all of your work!
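The range, variance, standard deviation, and IQR steps above can be sketched in Python using the fourteen example scores from the slides. Treat this as a rough illustration under two assumptions: the variance divides by N, as in the slide formula (not N − 1), and when the number of scores is odd the median itself is left out of both halves before Q1 and Q3 are found, which is one common convention among several. The function names are chosen for this example, and `quartiles` expects at least two scores.

```python
import math


def median(ordered):
    """Median of a list that is already sorted in ascending order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def variance(scores):
    """Steps 1-5: mean, deviations, squares, Sum of Squares, divide by N."""
    m = sum(scores) / len(scores)
    sum_of_squares = sum((x - m) ** 2 for x in scores)
    return sum_of_squares / len(scores)


def standard_deviation(scores):
    """Step 6: the square root of the variance."""
    return math.sqrt(variance(scores))


def quartiles(scores):
    """Q1 and Q3 as the medians of the lower and upper halves.

    Assumption: with an odd number of scores, the middle score is
    excluded from both halves.
    """
    ordered = sorted(scores)
    n = len(ordered)
    lower = ordered[: n // 2]
    upper = ordered[(n + 1) // 2 :]
    return median(lower), median(upper)


scores = [3, 9, 12, 2, 16, 2, 17, 5, 11, 45, 89, 32, 1, 96]

q1, q3 = quartiles(scores)
print("Range:", max(scores) - min(scores))  # 95
print("Variance:", round(variance(scores), 2))
print("SD:", round(standard_deviation(scores), 2))
print("Q1:", q1, "Q3:", q3, "IQR:", q3 - q1)  # Q1: 3 Q3: 32 IQR: 29
```

Because the two highest scores (89 and 96) sit far from the rest, the range and standard deviation are large, while the IQR depends only on the middle half of the scores and is much less affected.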