University of Jordan
Faculty of Agriculture
Dept. of Agri. Econ. & Agribusiness
Agricultural Statistic (605150)
Dr. Amer Salman
Part (03):
C – Measures of Central Tendency:
Descriptive statistics are used to characterize the center of a frequency distribution.
The statistics most commonly used for this purpose convey an impression of what the
typical measurement in the sample is like, and each can be interpreted directly; these
statistics are referred to as measures of central tendency.
Data can be classified as:
1. Ungrouped data
2. Grouped data
1. Ungrouped data
a. Mean (Arithmetic mean)
The mean is the sum of the sample measurements divided by the sample size:
Arithmetic mean = sum of the values / number of values
The sample size is represented by (n).
The observations are denoted by Y1, Y2, …, Yn or X1, X2, …, Xn.
The symbol Σ (sigma, an uppercase Greek letter) means "the sum of". The sample mean is
denoted by $\bar{Y}$ or $\bar{X}$.
Sample mean: $\bar{Y} = \frac{y_1 + y_2 + y_3 + \dots + y_n}{n}$
The symbol $\sum_{i=1}^{n} y_i$ represents the sum $y_1 + y_2 + \dots + y_n$.
For a sample of size n = 5:
$\sum_{i=1}^{5} y_i = y_1 + y_2 + y_3 + y_4 + y_5$
Mean for a population: $\mu = \frac{\sum X}{N}$, or for grouped data $\mu = \frac{\sum fX}{N}$
To shorten the expression for the mean of a sample of n measurements:
$\bar{Y} = \frac{\sum_{i=1}^{n} y_i}{n}$
Example (3 – 1):
A study recorded the length of stay (in days) in a treatment center for a sample of 10
individuals: 12, 7, 21, 10, 14, 5, 40, 14, 45, 8
The mean length of stay in this treatment center for the sample is:
$\bar{Y} = \frac{\sum_{i=1}^{10} y_i}{10} = \frac{12 + 7 + 21 + 10 + 14 + 5 + 40 + 14 + 45 + 8}{10} = \frac{176}{10} = 17.6$ days
Basic properties of the arithmetic mean:
1. The formula for the arithmetic mean assumes that the observations have well-defined
numerical values that can be told apart, and that the difference between any two values
is defined and comparable. It is therefore not meaningful to compute the mean of
observations recorded as excellent, very good, fair and poor; such grades must first be
represented by numbers.
2. The arithmetic mean can be interpreted as the balance point of a set of numbers placed
on a line, called the "center of gravity of observations".
[Figure: The mean as the center of gravity. Length of stay (days) on an axis from 0 to 50,
balancing at $\bar{Y}$ = 17.6 days.]
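The calculation in Example (3 – 1) can be checked with a few lines of Python. This is only a minimal sketch: the names `stays` and `sample_mean` are illustrative and do not come from the course material. It also shows the balance-point property, since the deviations from the mean sum to zero.

```python
# Length of stay (days) for the 10 individuals in Example (3 - 1).
stays = [12, 7, 21, 10, 14, 5, 40, 14, 45, 8]


def sample_mean(values):
    """Sum of the measurements divided by the sample size n."""
    return sum(values) / len(values)


mean_stay = sample_mean(stays)

# Deviations above and below the mean cancel out, which is why the mean
# can be read as the balance point (center of gravity) of the data.
deviation_sum = sum(y - mean_stay for y in stays)

print(f"mean length of stay = {mean_stay} days")
print(f"sum of deviations   = {deviation_sum:.10f}")
```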
Weighted Average:
Suppose two different sets of data have means $\bar{y}_1$ and $\bar{y}_2$, based on sample
sizes n1 and n2. The overall mean of the two sets combined (n1 + n2 observations) is then
the weighted average:
$\bar{Y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2}{n_1 + n_2} = \left(\frac{n_1}{n_1 + n_2}\right)\bar{y}_1 + \left(\frac{n_2}{n_1 + n_2}\right)\bar{y}_2$
Each separate sample mean receives weight proportional to the number of
observations on which it is based.
General formula:
Weighted Average formula: $\bar{Y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2 + \dots + n_k \bar{y}_k}{n_1 + n_2 + \dots + n_k}$
Example (3 – 2):
Mean of plant production by area and plant type
Table (3 – 1):
Type        North Ghor (N1)                   Middle Ghor (N2)
            Mean (Kg/du)   Sample size (n)    Mean (Kg/du)   Sample size (n)
Tomato      6000           4 farms            9000           6 farms
Eggplant    7000           8 farms            10000          2 farms
For Eggplant: n1 = 8, n2 = 2
$\bar{y}_1$ = 7000 Kg/du, $\bar{y}_2$ = 10000 Kg/du
Overall mean of plant production for the ten farms is
$\bar{Y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2}{n_1 + n_2} = \frac{8(7000) + 2(10000)}{8 + 2} = \frac{76{,}000}{10} = 7600$ Kg/du
The weighted average of 7600 Kg/du is four times closer to 7000 Kg/du (a distance of
600 Kg/du) than it is to 10000 Kg/du (a distance of 2400 Kg/du). This is because there are
four times as many observations with the mean of 7000 Kg/du (n1 = 8) as there are with
the mean of 10000 Kg/du (n2 = 2).
For Tomato: n1 = 4, n2 = 6
$\bar{y}_1$ = 6000 Kg/du, $\bar{y}_2$ = 9000 Kg/du
Overall mean of plant production for the ten farms is
$\bar{Y} = \frac{n_1 \bar{y}_1 + n_2 \bar{y}_2}{n_1 + n_2} = \frac{4(6000) + 6(9000)}{4 + 6} = \frac{78{,}000}{10} = 7800$ Kg/du
The weighted average of 7800 Kg/du is 1.5 times closer to 9000 Kg/du (a distance of
1200 Kg/du) than to 6000 Kg/du (a distance of 1800 Kg/du), because n2 = 6 is 1.5 times n1 = 4.
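The general weighted-average formula can be sketched in Python as follows, using the eggplant and tomato figures of Table (3 – 1). The function name `weighted_mean` is an illustrative choice, not part of the course material.

```python
def weighted_mean(means, sizes):
    """Combine group means, weighting each mean by its sample size."""
    if len(means) != len(sizes):
        raise ValueError("one sample size is needed per group mean")
    total = sum(n * m for m, n in zip(means, sizes))
    return total / sum(sizes)


# North Ghor and Middle Ghor means (Kg/du) with their numbers of farms.
eggplant = weighted_mean([7000, 10000], [8, 2])
tomato = weighted_mean([6000, 9000], [4, 6])

print("eggplant overall mean:", eggplant, "Kg/du")
print("tomato overall mean:  ", tomato, "Kg/du")
```

Passing lists rather than two fixed arguments keeps the sketch usable for k groups, as in the general formula.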
This is an example of a mean whose observations are roughly symmetric about it, giving a
shape called U-shaped. When one observation is much higher or much lower than the rest,
however, the result is as follows:
[Figure: relative frequency against income, skewed to the right (+ve skewness); the mean is
pulled toward the extreme income of 98,000.]
Example (3 – 3):
Incomes: 4200, 4400, 4700, 5200, 5400, 5450 and 98,000
Mean ≈ 18,200, which lies toward the longer tail of the distribution. The mean tends to be
drawn in the direction of the tail of a skewed distribution: the value 98,000 pulls the
mean to the right.
[Figure: relative frequency against exam score, skewed to the left (−ve skewness).]
In a left-skewed distribution the mean is likewise pulled toward the longer (lower) tail.
The presence of the large observation 98,000 results in extreme skewness to the
right of the income distribution; hence the mean is drawn above most of the measurements
(six of seven). In general, the more highly skewed a frequency distribution is, the less
representative of a typical observation the mean tends to be.
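The effect described in Example (3 – 3) can be reproduced with a short sketch. It assumes the largest income is 98,000, the value quoted in the discussion of this example, and it uses the standard-library `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# Incomes from Example (3 - 3), with 98,000 as the extreme observation.
incomes = [4200, 4400, 4700, 5200, 5400, 5450, 98000]

income_mean = mean(incomes)
income_median = median(incomes)

# Count how many observations lie below the mean.
below_mean = sum(1 for x in incomes if x < income_mean)

print(f"mean   = {income_mean:.0f}")
print(f"median = {income_median}")
print(f"{below_mean} of {len(incomes)} incomes lie below the mean")
```

The single extreme value moves the mean far above the median, which stays among the bulk of the observations.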
The estimated mean for grouped data is:
$\bar{Y} = \frac{\sum f y}{n}$
where $\sum f y$ refers to the sum, over all classes, of the frequency of each class (f)
times the class midpoint (y).
The advantages of the mean are:
1. It is familiar and understood by virtually everyone.
2. All the observations in the data are taken into account.
3. It is used in performing many other statistical procedures and tests.
The disadvantages of the mean are:
1. It is affected by extreme values.
2. It is time consuming to compute for a large body of ungrouped data.
3. It cannot be calculated when the last class of grouped data is open ended (i.e. it
has a lower limit but no upper limit).
b. The Median
The median of ungrouped data is the value of the middle item when all the items are
arranged in either ascending or descending order of value. If there is an odd
number of observations, then the median is uniquely defined.
For an odd number of observations:
3, 4, 4, 5, 6, 8, 8, 8, 10
The median = 6
If the sample size is even, then there are two middle measurements and the median is
usually taken to be the mean of the two.
Example (3 – 4):
5, 5, 7, 9, 11, 12, 15, 18
has a median of $\frac{9 + 11}{2} = 10$
Median position = $\frac{n + 1}{2} = \frac{8 + 1}{2} = 4.5$ (between the fourth and fifth numbers)
Example (3 – 5):
Find the median position in the following cases:
n = 8: median position = 4.5, so the median is the mean of the values of the 4th and 5th
items.
n = 20: median position = 10.5, so the median is the mean of the values of the 10th and
11th items.
n = 11: median position = 6, so the median is the value of the 6th item.
n = 25: median position = 13, so the median is the value of the 13th item.
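The median rule for odd and even sample sizes can be written as a short function. This is a sketch only; `median_position` and `median_of` are illustrative names, and Python's built-in `statistics.median` gives the same results.

```python
def median_position(n):
    """Position of the median in an ordered list of n items: (n + 1) / 2."""
    return (n + 1) / 2


def median_of(values):
    """Middle value of the ordered data, or the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: a single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the two


print(median_of([3, 4, 4, 5, 6, 8, 8, 8, 10]))     # odd-sized example
print(median_of([5, 5, 7, 9, 11, 12, 15, 18]))     # Example (3 - 4)
print([median_position(n) for n in (8, 20, 11, 25)])  # Example (3 - 5)
```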
The median for grouped data is given by the formula:
$\text{Median} = L + \left(\frac{(n/2) - F}{f_m}\right) c$
Where:
L is the lower limit of the median class (i.e. the class that contains the middle item of the
distribution).
n is the number of observations in the data set (total set).
F is the sum of the frequencies up to but not including the median class (all classes lower
than the median class).
$f_m$ is the frequency of the median class.
c is the width of the class interval.
The same formula can also be written in another form:
$\text{Median} = L_1 + \left(\frac{(N/2) - \sum f_i}{f_{median}}\right) c$
Where:
$L_1$ is the lower class boundary or limit of the median class (the class containing the
median).
N is the number of items in the data (total frequency).
$\sum f_i$ is the sum of the frequencies of all classes lower than the median class.
$f_{median}$ is the frequency of the median class.
c is the size or width of the median class interval.
The advantages of the median are:
1. It is not affected by extreme values.
2. It is easily understood (i.e. half the data are smaller than the median and half are
greater).
3. It can be calculated even when the last class is open ended and when the data are
qualitative (ordinal) rather than quantitative.
The disadvantages of the median are:
1. It does not use much of the information available.
Example (3 – 6):
8, 9, 10, 11, 12
0, 9, 10, 10, 10
8, 9, 10, 11, 100
8, 9, 10, 100, 100
All four samples have the same median (10), although their other values, and hence
their means, differ widely.
Thus for interval variables, the median does not utilize all of the information
available in the sample; partly because of this, it is not as valuable as the mean for
some inferential purposes.
2. It requires that observations be arranged into an array, which is time
consuming for a large body of ungrouped data.
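The four samples of Example (3 – 6) can be compared directly. The sketch below uses the standard-library `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# The four samples listed in Example (3 - 6).
samples = [
    [8, 9, 10, 11, 12],
    [0, 9, 10, 10, 10],
    [8, 9, 10, 11, 100],
    [8, 9, 10, 100, 100],
]

medians = [median(s) for s in samples]
means = [mean(s) for s in samples]

for s, med, avg in zip(samples, medians, means):
    print(f"{s}: median = {med}, mean = {avg}")
```

The median is identical for every sample, while the mean responds to each change in the data; this is the sense in which the median uses less of the available information.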
Some properties of the Median:
1. For a symmetric distribution (e.g. U-shaped) the median and mean are
identical. Example: Y1 = 4, Y2 = 5, Y3 = 7, Y4 = 9, Y5 = 10
Mean = 7 = median.
[Figure: normally distributed curve, Mean = Median.]
2. For a skewed distribution, the mean lies toward the direction of skew (the
longer tail) relative to the median. For an income distribution, the mean is larger
than the median. The distribution of grades on an exam tends to be skewed to the
left when some students do considerably poorer than others
(for such distributions, the mean is less than the median).
[Figure: rightward (+ve) skewness, with the median to the left of the mean.]
[Figure: leftward (−ve) skewness, with the mean to the left of the median.]
This is because the mean is affected by extreme values, whereas the median is not.
Measures derived from the median
The median is a special case of a more general set of measures of location called
percentiles.
The pth percentile is a number such that p % of the scores fall below it and (100 − p) %
fall above it.
The median in fact is the p = 50th percentile; that is, the median is larger than 50 % of
the measurements and smaller than the other 50 %. Two other percentiles that are often
listed in describing a frequency distribution are the lower and upper quartiles.
The p = 25th percentile is called the lower quartile.
The p = 75th percentile is called the upper quartile.
The quartiles can be used with the median to split the distribution into four parts, each
containing approximately one-fourth of the measurements. The difference between the upper
and lower quartiles is called the inter-quartile range; the middle half of the
observations lie within that range.
[Figure: a frequency distribution divided by the lower quartile (Q1, first quartile), the
median (Q2) and the upper quartile (Q3, third quartile) into four parts, each containing
25% of the measurements; the inter-quartile range spans Q1 to Q3.]
Q1 = first quartile
Q3 = third quartile
Q3 – Q1 = Inter-quartile range
Locations:
$Q_1 = \frac{n + 1}{4}$, $Q_2 = \frac{n + 1}{2}$, $Q_3 = \frac{3(n + 1)}{4}$
Example (3 – 7):
Public school expenditures per student are recorded for all the school districts in a
particular state. The distribution of expenditures is described by the lower quartile of
$1250, the median of $1400 and the upper quartile of $1770. This means that a quarter of
the school districts spent less than $1250 per student on public education. Similarly, a
quarter of the districts spent between $1250 and $1400, a quarter between $1400 and
$1770, and a quarter above $1770. The inter-quartile range is $1770 − $1250 =
$520, so the middle half of the public school expenditures fall within a range of $520.
Example (3 – 8):
Verify for n = 23 that the median and quartile positions split the data into four equal
parts.
Solution:
The median position is $\frac{23 + 1}{2} = 12$, so that there are eleven values to its
left and eleven values to its right. Thus the Q1 (quartile one) position is
$\frac{11 + 1}{2} = 6$ and, correspondingly, Q3 is the 6th value counting from the other
end; that is, the Q3 position is 18. It follows that there are five values to the left of
the Q1 position, five values between the Q1 and median positions (between 6 and 12), five
values between the median position and the Q3 position (between 12 and 18), and five
values to the right of the Q3 position.
The inter-quartile range covers the positions from Q1 to Q3: 18 − 6 = 12.
Example (3 – 9):
The following are the numbers of minutes that a person, on her way to work, had to
wait for the bus on fourteen working days:
10, 2, 17, 6, 8, 3, 10, 2, 9, 5, 9, 13, 1 and 10
Find the median, Q1 and Q3.
Solution:
For n = 14, the median position is $\frac{14 + 1}{2} = 7.5$, so that the Q1 position is
$\frac{7 + 1}{2} = 4$ and Q3 is fourth from the other end (position 11).
Since the data arranged according to size are
1 2 2 3 5 6 8 9 9 10 10 10 13 17
it can be seen that the median is $\frac{8 + 9}{2} = 8.5$, Q1 = 3 and Q3 = 10.
Inter-quartile range = Q3 – Q1 = 10 – 3 = 7
Using the location formulas instead:
Q1 position = $\frac{n + 1}{4} = \frac{14 + 1}{4} = 3.75$
Q3 position = $\frac{3(n + 1)}{4} = 11.25$
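The "median of each half" rule used in Examples (3 – 8) and (3 – 9) can be sketched as below. The function name `quartiles_by_halves` is illustrative. Leaving the middle value out of both halves when n is odd follows the n = 23 example, and other quartile conventions (including the (n + 1)/4 location formula) can give slightly different values.

```python
from statistics import median


def quartiles_by_halves(values):
    """Q1 and Q3 as the medians of the lower and upper halves of the data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2                      # for odd n the middle value is excluded
    q1 = median(ordered[:half])        # median of the lower half
    q2 = median(ordered)               # overall median
    q3 = median(ordered[n - half:])    # median of the upper half
    return q1, q2, q3


# Waiting times (minutes) from Example (3 - 9).
waits = [10, 2, 17, 6, 8, 3, 10, 2, 9, 5, 9, 13, 1, 10]
q1, q2, q3 = quartiles_by_halves(waits)
print("Q1 =", q1, " median =", q2, " Q3 =", q3, " IQR =", q3 - q1)
```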
c. The Mode
The mode of a set of numbers is the value which occurs with the greatest frequency,
i.e. it is the most common value. The mode may not exist, and even if it does exist it may
not be unique (a set can have more than one mode). It is most easily identified after the
numbers have been arranged in order.
Example (3 – 10):
The set 2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, 18 has mode = 9 (uni-modal).
Example (3 – 11):
The set 3, 5, 8, 10, 12, 15, and 16 has no mode.
Example (3 – 12):
The set 2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9 has two modes, 4 and 7, and is called bimodal
(a set with more than two modes is called multimodal).
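The three cases in Examples (3 – 10) to (3 – 12) can be covered by one small function. This is a sketch: the name `modes` is illustrative, and returning an empty list when no value repeats follows the convention of these notes rather than the behaviour of Python's `statistics.multimode`, which would return every value in that case.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), or [] when no value repeats."""
    counts = Counter(values)
    highest = max(counts.values())     # assumes a non-empty data set
    if highest == 1:
        return []                      # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, 18]))   # uni-modal
print(modes([3, 5, 8, 10, 12, 15, 16]))                   # no mode
print(modes([2, 3, 4, 4, 4, 5, 5, 7, 7, 7, 9]))           # bimodal
```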
In the case of grouped data where a frequency curve has been constructed to fit the
data, the mode will be the value or values of X corresponding to the maximum point (or
points) on the curve. This value of X is sometimes denoted by $\hat{X}$.
From the frequency distribution or histogram the mode can be obtained from the
formula:
$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$
Where:
$L_1$ = lower class boundary of the modal class (the class containing the mode).
$\Delta_1$ = excess of the modal frequency over the frequency of the next lower class
(frequency of the modal class minus the frequency of the previous class).
$\Delta_2$ = excess of the modal frequency over the frequency of the next higher class.
c = size or width of the modal class interval.
Properties of the mode:
1. The mean, median and the mode are identical for a uni-modal symmetric
distribution, such as a bell-shaped distribution.
2. A frequency distribution is called bimodal (tri-modal, multimodal) if there are
two (three, many) values that occur with the greatest frequency. In practice, a
distribution is usually referred to as bimodal if there are two distinct mounds in the
distribution, even if they are not of exactly the same height.
Example (3 – 13):
[Figure: relative frequency against number of years in military (axis from 5 to 40 years).
A hypothetical bimodal distribution (note early retirement with benefits after 20 years).]
Advantages of the Mode:
1. It is not affected by extreme values.
2. It is easily understood.
3. It can be calculated even when the last class is open-ended and when the data are
qualitative rather than quantitative.
Disadvantages of the Mode:
1. The mode does not use much of the information available (like the median).
2. Sometimes no value in the data occurs more than once, so that
there is no mode, while at other times there may be many modes. In some cases
the mode cannot be determined (Example 2) even though it exists. In general
the mean is the most frequently used measure of central tendency and the
mode is the least used.
Empirical Relation between Mean, Median and Mode
For uni-modal frequency curves which are moderately skewed (asymmetrical) we
have the empirical relation:
Mean − Mode = 3 (Mean − Median)
Mean − Mode = 3 Mean − 3 Median
− 2 Mean = Mode − 3 Median
$\text{Mean} = \frac{3\,\text{Median} - \text{Mode}}{2}$
The following figures show the relative positions of the mean, median and mode
for frequency curves which are skewed to the right and left respectively. For
symmetrical curves the mean, mode and median all coincide.
[Figure: curve skewed to the right (+ve), with the mode, median and mean in that order from
left to right.]
[Figure: curve skewed to the left (−ve), with the mean, median and mode in that order from
left to right.]
The cases:
1. Mean > Median > Mode: skewed to the right (+ve).
2. Mean < Median < Mode: skewed to the left (−ve).
3. Mean = Median = Mode: normal distribution.
If the mean and the median are equal, the empirical relation implies that the mode is
equal to them as well.
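The empirical relation Mean − Mode = 3 (Mean − Median) can be rearranged for whichever quantity is missing. The sketch below uses hypothetical values (mean 50, median 48) chosen only for illustration. The function names are also illustrative, and the relation is approximate for real data.

```python
def mode_from_mean_median(mean, median):
    """Mode estimated from Mean - Mode = 3 (Mean - Median)."""
    return 3 * median - 2 * mean


def mean_from_median_mode(median, mode):
    """Mean estimated from the same relation: (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2


# Hypothetical summary values, not taken from the course data.
est_mode = mode_from_mean_median(mean=50, median=48)
est_mean = mean_from_median_mode(median=48, mode=est_mode)

print("estimated mode:", est_mode)    # mean > median > mode: skewed to the right
print("recovered mean:", est_mean)
```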
2. Grouped data
Table (3 – 2):
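Mean
  Ungrouped data: sample $\bar{X} = \frac{\sum X}{n}$; population $\mu = \frac{\sum X}{N}$
  Grouped data: sample $\bar{X} = \frac{\sum fX}{n}$; population $\mu = \frac{\sum fX}{N}$
Median
  Ungrouped data: the middle value after arranging the data in ascending order
  Grouped data: $L_1 + \left(\frac{(n/2) - F_1}{f_m}\right) c$
Mode
  Ungrouped data: the most frequent value after arranging the data in ascending order
  Grouped data: $L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$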
Example (3 – 14):
Table (3 – 3): Grades on a quiz for a class of 40 students.
7   5   6   2   8   7   10  4   5   5
4   6   3   5   6   7   9   8   4   6
7   8   3   6   6   7   2   7   7   4
4   9   3   8   7   10  9   2   9   5
Min. value = 2, Max. value = 10
Data array of grades:
2   2   2   3   3   3   4   4   4   4
4   5   5   5   5   5   6   6   6   6
6   6   7   7   7   7   7   7   7   7
8   8   8   8   9   9   9   9   10  10
Range = Highest value – Lowest value (using the class boundaries)
= 10.5 – 1.5 = 9
No. of intervals = $\frac{\text{Range}}{\text{Interval width}} = \frac{9}{1} = 9$
(Interval width = $\frac{\text{Range}}{\text{No. of intervals}} = \frac{9}{9} = 1$)
Table (3 – 4): Frequency distribution of grades.
Grade         Freq. (f)   Class midpoint (x)   Cumm. freq.   fx
1.5 – 2.4     3           2                    3             6
2.5 – 3.4     3           3                    6             9
3.5 – 4.4     5           4                    11            20
4.5 – 5.4     5           5                    16            25
5.5 – 6.4     6           6                    22            36
6.5 – 7.4     8           7                    30            56
7.5 – 8.4     4           8                    34            32
8.5 – 9.4     4           9                    38            36
9.5 – 10.4    2           10                   40            20
              N = 40                                         ∑ fx = 240
Find the mean, median and mode:
A) for the ungrouped data in table (3 – 3), the grades on the quiz for the class of 40
students, and
B) for the grouped data of these grades given in table (3 – 4).
A) $\mu = \frac{\sum X}{N} = \frac{7 + 5 + 6 + \dots + 5}{40} = \frac{240}{40} = 6$ points
The median is the value at the median position:
Median position = $\frac{N + 1}{2} = \frac{40 + 1}{2} = \frac{41}{2} = 20.5$, i.e. the average of the 20th and 21st values
Median = $\frac{6 + 6}{2} = 6$
The mode is 7 (the value that occurs most frequently in the data set).
B) $\mu = \frac{\sum fX}{N} = \frac{240}{40} = 6$
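Both calculations of Example (3 – 14) can be checked from the raw grades of Table (3 – 3). The sketch below uses the standard-library `statistics` and `collections` modules, and its variable names are illustrative. Because each class of Table (3 – 4) has an integer grade as its midpoint, counting the grades reproduces the class frequencies.

```python
from collections import Counter
from statistics import mean, median, mode

# The 40 quiz grades of Table (3 - 3).
grades = [7, 5, 6, 2, 8, 7, 10, 4, 5, 5,
          4, 6, 3, 5, 6, 7, 9, 8, 4, 6,
          7, 8, 3, 6, 6, 7, 2, 7, 7, 4,
          4, 9, 3, 8, 7, 10, 9, 2, 9, 5]

# A) Ungrouped data.
ungrouped_mean = mean(grades)
ungrouped_median = median(grades)
ungrouped_mode = mode(grades)

# B) Grouped data: frequency f of each class midpoint x, then sum of f * x.
freq = Counter(grades)
sum_fx = sum(x * f for x, f in freq.items())
grouped_mean = sum_fx / sum(freq.values())

print("ungrouped:", ungrouped_mean, ungrouped_median, ungrouped_mode)
print("frequencies:", sorted(freq.items()))
print("sum of fx =", sum_fx, " grouped mean =", grouped_mean)
```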
Median position = 20.5
$\text{Median} = L_1 + \left(\frac{(n/2) - F_1}{f_m}\right) c$
Where:
$L_1$ = 5.5 = lower limit of the median class (the 5.5 – 6.4 class, which contains the 20th
and 21st observations).
n (or N) = 40 = number of observations (the sample size or the population size).
$F_1$ = 16 = sum of the frequencies up to but not including the median class.
$f_m$ = 6 = frequency of the median class (not the cumulative frequency).
c = 1 = the width of the class interval.
$\text{Median} = 5.5 + \left(\frac{(40/2) - 16}{6}\right)(1) = 6.17$
The median in the first case (6) does not equal the median in the second case (6.17)
because the average of the observations in each class does not coincide with the class
midpoint.
Q1 position = $\frac{n + 1}{4} = \frac{40 + 1}{4} = \frac{41}{4} = 10.25$
$Q_1 = L_1 + \left(\frac{(n/4) - F_1}{f_{Q_1}}\right) c = 3.5 + \left(\frac{(40/4) - 6}{5}\right)(1) = 4.3$
Q3 position = $\frac{3(n + 1)}{4} = \frac{3(40 + 1)}{4} = 30.75$
$Q_3 = L_1 + \left(\frac{\frac{3}{4}n - F_1}{f_{Q_3}}\right) c = 7.5 + \left(\frac{\frac{3}{4}(40) - 30}{4}\right)(1) = 7.5$
Q1 < Q2 (median) < Q3
4.3 < 6.17 < 7.5
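The grouped median and quartile calculations all share the same interpolation step, so they can be sketched as one function. The name `grouped_quantile` and the `(lower limit, frequency)` layout are illustrative choices. The last class is given frequency 2, as implied by the cumulative frequency of 40 in Table (3 – 4).

```python
# Classes of Table (3 - 4) as (lower limit, frequency); class width c = 1.
classes = [(1.5, 3), (2.5, 3), (3.5, 5), (4.5, 5), (5.5, 6),
           (6.5, 8), (7.5, 4), (8.5, 4), (9.5, 2)]


def grouped_quantile(classes, fraction, width):
    """L1 + ((fraction * n - F1) / f) * c, for a fraction strictly between 0 and 1."""
    n = sum(f for _, f in classes)
    target = fraction * n              # n/4, n/2 or 3n/4
    below = 0                          # F1: frequencies of all lower classes
    for lower, f in classes:
        if below + f > target:         # first class whose cumulative freq. exceeds target
            return lower + (target - below) / f * width
        below += f
    raise ValueError("fraction must be smaller than 1")


q1 = grouped_quantile(classes, 0.25, 1)
q2 = grouped_quantile(classes, 0.50, 1)
q3 = grouped_quantile(classes, 0.75, 1)
print(f"Q1 = {q1:.2f}, median = {q2:.2f}, Q3 = {q3:.2f}")
```

The strict comparison places a target that falls exactly on a class boundary in the upper class, which matches the choice of the 7.5 – 8.4 class for Q3 above.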
The mode for the grouped data in table (3 – 4):
$\text{Mode} = L_1 + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) c$
Where:
$L_1$ = 6.5 = lower limit of the modal class [the (6.5 – 7.4) class, which has the highest
frequency].
$\Delta_1$ = 2 = frequency of the modal class (8) minus the frequency of the previous class (6).
$\Delta_2$ = 4 = frequency of the modal class (8) minus the frequency of the following class (4).
c = 1 = width of the class interval.
$\text{Mode} = 6.5 + \left(\frac{8 - 6}{(8 - 6) + (8 - 4)}\right)(1) = 6.83$
$\Delta_1$ and $\Delta_2$ must both be positive; otherwise the modal class has been
identified incorrectly.
Grouped and ungrouped results are not the same because the grouped formulas use the class
midpoint, not the average of the observations in each class.
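The grouped-mode formula can be sketched in the same style, again with the classes of Table (3 – 4) written as `(lower limit, frequency)` pairs. The function name `grouped_mode` is illustrative. Treating a missing neighbouring class as having frequency 0 is an assumption made here for modal classes at either end of the table, and the sketch does not handle a completely flat distribution.

```python
# Classes of Table (3 - 4) as (lower limit, frequency); class width c = 1.
classes = [(1.5, 3), (2.5, 3), (3.5, 5), (4.5, 5), (5.5, 6),
           (6.5, 8), (7.5, 4), (8.5, 4), (9.5, 2)]


def grouped_mode(classes, width):
    """L1 + (d1 / (d1 + d2)) * c for the class with the highest frequency."""
    freqs = [f for _, f in classes]
    i = freqs.index(max(freqs))                       # index of the modal class
    prev_f = freqs[i - 1] if i > 0 else 0             # assumption: 0 if no lower class
    next_f = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1 = freqs[i] - prev_f                            # excess over the previous class
    d2 = freqs[i] - next_f                            # excess over the following class
    return classes[i][0] + d1 / (d1 + d2) * width


mode_estimate = grouped_mode(classes, 1)
print(f"grouped mode = {mode_estimate:.2f}")
```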