HSLC Guru

Class 10 Mathematics Chapter 14 Question Answer | Statistics | English Medium | ASSEB

Welcome to HSLC Guru

Hello dear student! Welcome to HSLC Guru. This lesson presents the complete English-medium solutions for ASSEB (Assam State School Education Board) Class 10 Mathematics, Chapter 14: Statistics. Statistics is the branch of mathematics that deals with the collection, organisation, presentation, analysis and interpretation of numerical data. In Class IX you learned to compute the three measures of central tendency (mean, median and mode) for ungrouped data. In this chapter you extend those ideas to grouped data: data that has been organised into class intervals with frequencies. You will study three methods of computing the mean (the direct method and its two short-cuts, the assumed-mean and step-deviation methods), the formulas for the mode and median of a frequency distribution, the empirical relation that ties the three measures together, and the graphical determination of the median through cumulative-frequency curves called ogives. Every problem from Exercises 14.1, 14.2, 14.3 and 14.4 of the textbook is solved step by step below, followed by additional HSLC-style questions and a short glossary.


Summary

For grouped frequency data, the mean is the average $\bar{x}=\dfrac{\sum f_i x_i}{\sum f_i}$ where $x_i$ is the class mark (mid-point) of the $i$-th class. Three equivalent methods compute it: the direct method (multiply class mark by frequency and divide by total frequency), the assumed-mean method (subtract a chosen value $a$ to make arithmetic easier), and the step-deviation method (further divide by the common class width $h$). The mode is located inside the modal class — the class with the highest frequency — by the formula $\text{Mode}=l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}\,h$. The median is located inside the median class — the class containing the $(n/2)$-th observation — by $\text{Median}=l+\dfrac{n/2-cf}{f}\,h$. The three measures satisfy the empirical relation $3\,\text{Median}=\text{Mode}+2\,\text{Mean}$. The cumulative frequency of a class is the running total of frequencies up to that class; plotting it gives the less-than ogive or the more-than ogive, and the abscissa of their point of intersection (or of the point where the ogive crosses the line $y=n/2$) is the median.


Key Formulas at a Glance

Quantity Formula
Class mark $x_i$ $x_i = \dfrac{\text{lower limit} + \text{upper limit}}{2}$
Mean (Direct) $\bar{x} = \dfrac{\sum f_i x_i}{\sum f_i}$
Mean (Assumed mean) $\bar{x} = a + \dfrac{\sum f_i d_i}{\sum f_i}$, where $d_i = x_i - a$
Mean (Step deviation) $\bar{x} = a + h\cdot\dfrac{\sum f_i u_i}{\sum f_i}$, where $u_i = \dfrac{x_i-a}{h}$
Mode $\text{Mode} = l + \left(\dfrac{f_1-f_0}{2f_1-f_0-f_2}\right) h$
Median $\text{Median} = l + \left(\dfrac{\frac{n}{2}-cf}{f}\right) h$
Empirical relation $3\,\text{Median} = \text{Mode} + 2\,\text{Mean}$

In the mode formula, $l$ is the lower limit of the modal class, $h$ is the class width, $f_1$ is the frequency of the modal class, $f_0$ the frequency of the class preceding it and $f_2$ the frequency of the class succeeding it. In the median formula, $l$ is the lower limit of the median class, $n=\sum f_i$, $cf$ is the cumulative frequency of the class preceding the median class and $f$ is the frequency of the median class.


Exercise 14.1 — Mean of Grouped Data

Question 1. A survey was conducted by a group of students as a part of their environment awareness programme, in which they collected the following data regarding the number of plants in 20 houses in a locality. Find the mean number of plants per house.

Number of plants 0–2 2–4 4–6 6–8 8–10 10–12 12–14
Number of houses 1 2 1 5 6 2 3

Which method did you use for finding the mean, and why?

Answer: Since the class marks $x_i$ and frequencies $f_i$ are small, we use the direct method.

Class $f_i$ $x_i$ $f_i x_i$
0–2 1 1 1
2–4 2 3 6
4–6 1 5 5
6–8 5 7 35
8–10 6 9 54
10–12 2 11 22
12–14 3 13 39
Total 20 162

$$\bar{x}=\frac{\sum f_i x_i}{\sum f_i}=\frac{162}{20}=8.1$$

Mean number of plants per house = 8.1.
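The direct method above can be checked with a few lines of Python. This is only a minimal sketch: the function names `class_marks` and `direct_mean` are made up for illustration, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def class_marks(classes):
    """Mid-point of each (lower limit, upper limit) class interval."""
    return [Fraction(lo + hi, 2) for lo, hi in classes]

def direct_mean(classes, freqs):
    """Mean of grouped data by the direct method: sum(f_i * x_i) / sum(f_i)."""
    marks = class_marks(classes)
    total_fx = sum(f * x for f, x in zip(freqs, marks))
    return total_fx / sum(freqs)

# Data of Exercise 14.1, Question 1
plant_classes = [(0, 2), (2, 4), (4, 6), (6, 8), (8, 10), (10, 12), (12, 14)]
house_counts = [1, 2, 1, 5, 6, 2, 3]
print(float(direct_mean(plant_classes, house_counts)))  # 8.1
```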

Question 2. Consider the following distribution of daily wages of 50 workers of a factory. Find the mean daily wages of the workers.

Daily wages (in Rs) 500–520 520–540 540–560 560–580 580–600
Number of workers 12 14 8 6 10

Answer: Take assumed mean $a=550$ and class width $h=20$, so $u_i=\dfrac{x_i-550}{20}$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
500–520 12 510 $-2$ $-24$
520–540 14 530 $-1$ $-14$
540–560 8 550 0 0
560–580 6 570 1 6
580–600 10 590 2 20
Total 50 $-12$

$$\bar{x}=a+h\cdot\frac{\sum f_i u_i}{\sum f_i}=550+20\cdot\frac{-12}{50}=550-4.8=545.20$$

Mean daily wages = Rs 545.20.
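The same step-deviation calculation can be sketched in Python. The function name `step_deviation_mean` is hypothetical, and the sketch assumes integer class marks, assumed mean and class width so that exact fractions can be used.

```python
from fractions import Fraction

def step_deviation_mean(marks, freqs, a, h):
    """Mean by the step-deviation method: a + h * sum(f_i * u_i) / sum(f_i)."""
    u = [Fraction(x - a, h) for x in marks]          # u_i = (x_i - a) / h
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))  # sum of f_i * u_i
    return a + h * sum_fu / sum(freqs)

# Data of Exercise 14.1, Question 2 (a = 550, h = 20)
wage_marks = [510, 530, 550, 570, 590]
workers = [12, 14, 8, 6, 10]
print(float(step_deviation_mean(wage_marks, workers, a=550, h=20)))  # 545.2
```

Any other choice of $a$ gives the same mean; only the size of the intermediate numbers changes.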

Question 3. The following distribution shows the daily pocket allowance of children of a locality. The mean pocket allowance is Rs 18. Find the missing frequency $f$.

Allowance (Rs) 11–13 13–15 15–17 17–19 19–21 21–23 23–25
Number of children 7 6 9 13 $f$ 5 4

Answer: Class marks: 12, 14, 16, 18, 20, 22, 24.

$\sum f_i = 7+6+9+13+f+5+4 = 44+f$.

$\sum f_i x_i = 7(12)+6(14)+9(16)+13(18)+f(20)+5(22)+4(24)$

$\quad = 84+84+144+234+20f+110+96 = 752+20f$.

Using $\bar{x}=18$:

$$18=\frac{752+20f}{44+f}\Longrightarrow 18(44+f)=752+20f\Longrightarrow 792+18f=752+20f$$

$\Longrightarrow 2f=40 \Longrightarrow f=\boxed{20}$.
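Questions with one missing frequency always lead to a linear equation like the one above. A small sketch, assuming a hypothetical helper `missing_frequency` and integer inputs, solves $\bar{x}\,(S+f)=T+x\,f$ for $f$, where $S$ and $T$ are the totals of the known $f_i$ and $f_i x_i$ and $x$ is the class mark of the class whose frequency is unknown.

```python
from fractions import Fraction

def missing_frequency(known_marks, known_freqs, missing_mark, target_mean):
    """Solve target_mean * (S + f) = T + missing_mark * f for the unknown f."""
    s = sum(known_freqs)                                       # S = sum of known f_i
    t = sum(f * x for f, x in zip(known_freqs, known_marks))   # T = sum of known f_i x_i
    return Fraction(t - target_mean * s, target_mean - missing_mark)

# Data of Exercise 14.1, Question 3 (the class 19-21 has the unknown frequency)
marks = [12, 14, 16, 18, 22, 24]
freqs = [7, 6, 9, 13, 5, 4]
print(missing_frequency(marks, freqs, missing_mark=20, target_mean=18))  # 20
```

The helper cannot be used when the target mean equals the class mark of the missing class, because the unknown then drops out of the equation.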

Question 4. Thirty women were examined in a hospital by a doctor and the number of heart beats per minute were recorded and summarised below. Find the mean heart beats per minute.

Beats / minute 65–68 68–71 71–74 74–77 77–80 80–83 83–86
Number of women 2 4 3 8 7 4 2

Answer: Take $a=75.5$, $h=3$, so $u_i=(x_i-75.5)/3$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
65–68 2 66.5 $-3$ $-6$
68–71 4 69.5 $-2$ $-8$
71–74 3 72.5 $-1$ $-3$
74–77 8 75.5 0 0
77–80 7 78.5 1 7
80–83 4 81.5 2 8
83–86 2 84.5 3 6
Total 30 4

$$\bar{x}=75.5+3\cdot\frac{4}{30}=75.5+0.4=75.9$$

Mean heart beats per minute = 75.9.

Question 5. In a retail market fruit vendors were selling mangoes kept in packing boxes. These boxes contained varying number of mangoes. The following was the distribution of mangoes according to the number of boxes.

Mangoes 50–52 53–55 56–58 59–61 62–64
Number of boxes 15 110 135 115 25

Find the mean number of mangoes kept in a packing box. Which method did you use, and why?

Answer: The classes are inclusive; convert to continuous classes by subtracting $0.5$ from each lower limit and adding $0.5$ to each upper limit. Class marks remain $51, 54, 57, 60, 63$. Take $a=57$, $h=3$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
49.5–52.5 15 51 $-2$ $-30$
52.5–55.5 110 54 $-1$ $-110$
55.5–58.5 135 57 0 0
58.5–61.5 115 60 1 115
61.5–64.5 25 63 2 50
Total 400 25

$$\bar{x}=57+3\cdot\frac{25}{400}=57+0.1875\approx 57.19$$

Mean = 57.19 mangoes per box. Step-deviation method was used because frequencies and class marks are large but $h$ is constant.
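The conversion from inclusive to continuous classes used above can be sketched as follows. The function name `to_continuous` is hypothetical, and the sketch assumes the gap between consecutive classes is the same throughout the table.

```python
def to_continuous(inclusive_classes):
    """Move every class limit by half the gap between consecutive classes."""
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]  # e.g. 53 - 52 = 1
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in inclusive_classes]

mango_classes = [(50, 52), (53, 55), (56, 58), (59, 61), (62, 64)]
print(to_continuous(mango_classes))
# [(49.5, 52.5), (52.5, 55.5), (55.5, 58.5), (58.5, 61.5), (61.5, 64.5)]
```

The class marks do not change, because each class is widened by the same amount on both sides.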

Question 6. The table below shows the daily expenditure on food of 25 households in a locality. Find the mean daily expenditure on food.

Expenditure (Rs) 100–150 150–200 200–250 250–300 300–350
Number of households 4 5 12 2 2

Answer: Take $a=225$, $h=50$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
100–150 4 125 $-2$ $-8$
150–200 5 175 $-1$ $-5$
200–250 12 225 0 0
250–300 2 275 1 2
300–350 2 325 2 4
Total 25 $-7$

$$\bar{x}=225+50\cdot\frac{-7}{25}=225-14=211$$

Mean daily expenditure on food = Rs 211.

Question 7. To find out the concentration of $\mathrm{SO_2}$ in the air (in parts per million, i.e., ppm), the data was collected for 30 localities in a certain city.

$\mathrm{SO_2}$ (ppm) 0.00–0.04 0.04–0.08 0.08–0.12 0.12–0.16 0.16–0.20 0.20–0.24
Frequency 4 9 9 2 4 2

Find the mean concentration of $\mathrm{SO_2}$ in the air.

Answer: Class marks $x_i$: $0.02, 0.06, 0.10, 0.14, 0.18, 0.22$.

Class $f_i$ $x_i$ $f_i x_i$
0.00–0.04 4 0.02 0.08
0.04–0.08 9 0.06 0.54
0.08–0.12 9 0.10 0.90
0.12–0.16 2 0.14 0.28
0.16–0.20 4 0.18 0.72
0.20–0.24 2 0.22 0.44
Total 30 2.96

$$\bar{x}=\frac{2.96}{30}\approx 0.099\text{ ppm}$$

Mean concentration of $\mathrm{SO_2}$ = 0.099 ppm.

Question 8. A class teacher has the following absentee record of 40 students of a class for the whole term. Find the mean number of days a student was absent.

Days 0–6 6–10 10–14 14–20 20–28 28–38 38–40
Number of students 11 10 7 4 4 3 1

Answer: Class widths are unequal, so we use the direct method.

Class $f_i$ $x_i$ $f_i x_i$
0–6 11 3 33
6–10 10 8 80
10–14 7 12 84
14–20 4 17 68
20–28 4 24 96
28–38 3 33 99
38–40 1 39 39
Total 40 499

$$\bar{x}=\frac{499}{40}=12.475\approx 12.48$$

Mean number of days absent per student $\approx$ 12.48 days.

Question 9. The following table gives the literacy rate (in percentage) of 35 cities. Find the mean literacy rate.

Literacy rate (%) 45–55 55–65 65–75 75–85 85–95
Number of cities 3 10 11 8 3

Answer: Take $a=70$, $h=10$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
45–55 3 50 $-2$ $-6$
55–65 10 60 $-1$ $-10$
65–75 11 70 0 0
75–85 8 80 1 8
85–95 3 90 2 6
Total 35 $-2$

$$\bar{x}=70+10\cdot\frac{-2}{35}=70-0.5714\approx 69.43\%$$

Mean literacy rate = 69.43%.


Exercise 14.2 — Mode of Grouped Data

Question 1. The following table shows the ages of patients admitted to a hospital during a year. Find the mode and the mean of the data given. Compare and interpret the two measures of central tendency.

Age (years) 5–15 15–25 25–35 35–45 45–55 55–65
Number of patients 6 11 21 23 14 5

Answer: Highest frequency = 23, so modal class is $35\text{–}45$. Hence $l=35,\;f_1=23,\;f_0=21,\;f_2=14,\;h=10$.

$$\text{Mode}=35+\frac{23-21}{2(23)-21-14}\times 10=35+\frac{2}{11}\times 10=35+1.82\approx 36.8\text{ years}$$

Mean: Take $a=40$, $h=10$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
5–15 6 10 $-3$ $-18$
15–25 11 20 $-2$ $-22$
25–35 21 30 $-1$ $-21$
35–45 23 40 0 0
45–55 14 50 1 14
55–65 5 60 2 10
Total 80 $-37$

$$\bar{x}=40+10\cdot\frac{-37}{80}=40-4.625=35.375\approx 35.38\text{ years}$$

Interpretation: Mode = 36.8 years, Mean = 35.38 years. The largest number of patients admitted to the hospital are aged about 36.8 years (the mode), while the average age of a patient admitted is about 35.38 years.

Question 2. The following data gives the information on the observed lifetimes (in hours) of 225 electrical components. Determine the modal lifetimes of the components.

Lifetimes (hours) 0–20 20–40 40–60 60–80 80–100 100–120
Frequency 10 35 52 61 38 29

Answer: Modal class is $60\text{–}80$ (frequency 61). $l=60,\;f_1=61,\;f_0=52,\;f_2=38,\;h=20$.

$$\text{Mode}=60+\frac{61-52}{2(61)-52-38}\times 20=60+\frac{9}{32}\times 20=60+5.625=65.625\text{ hours}$$

Modal lifetime = 65.625 hours.
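The mode formula is easy to turn into a short Python sketch. The function name `grouped_mode` is hypothetical. Two choices in it are assumptions of this sketch, not part of the textbook method: if several classes share the highest frequency the first one is used, and a missing neighbour of the modal class is treated as having frequency 0.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode = l + (f1 - f0) / (2*f1 - f0 - f2) * h for equal class width h."""
    i = max(range(len(freqs)), key=freqs.__getitem__)   # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                   # class before (assumed 0 if none)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0      # class after (assumed 0 if none)
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h

# Data of Exercise 14.2, Question 2
lifetime_lower = [0, 20, 40, 60, 80, 100]
components = [10, 35, 52, 61, 38, 29]
print(grouped_mode(lifetime_lower, components, h=20))  # 65.625
```

If the modal class and both its neighbours have equal frequencies, the denominator is zero and the formula cannot be applied.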

Question 3. The following data gives the distribution of total monthly household expenditure of 200 families of a village. Find the modal monthly expenditure of the families. Also find the mean monthly expenditure.

Expenditure (Rs) 1000–1500 1500–2000 2000–2500 2500–3000 3000–3500 3500–4000 4000–4500 4500–5000
Families 24 40 33 28 30 22 16 7

Answer: Modal class $1500\text{–}2000$, $l=1500,\;f_1=40,\;f_0=24,\;f_2=33,\;h=500$.

$$\text{Mode}=1500+\frac{40-24}{2(40)-24-33}\times 500=1500+\frac{16}{23}\times 500=1500+347.83\approx 1847.83$$

Modal expenditure $\approx$ Rs 1847.83.

Mean: Take $a=2750$, $h=500$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
1000–1500 24 1250 $-3$ $-72$
1500–2000 40 1750 $-2$ $-80$
2000–2500 33 2250 $-1$ $-33$
2500–3000 28 2750 0 0
3000–3500 30 3250 1 30
3500–4000 22 3750 2 44
4000–4500 16 4250 3 48
4500–5000 7 4750 4 28
Total 200 $-35$

$$\bar{x}=2750+500\cdot\frac{-35}{200}=2750-87.5=2662.50$$

Mean monthly expenditure = Rs 2662.50.

Question 4. The following distribution gives the state-wise teacher-student ratio in higher secondary schools of India. Find the mode and mean of this data. Interpret the two measures.

Students per teacher 15–20 20–25 25–30 30–35 35–40 40–45 45–50 50–55
Number of states 3 8 9 10 3 0 0 2

Answer: Modal class $30\text{–}35$, $l=30,\;f_1=10,\;f_0=9,\;f_2=3,\;h=5$.

$$\text{Mode}=30+\frac{10-9}{2(10)-9-3}\times 5=30+\frac{1}{8}\times 5=30+0.625=30.625$$

Modal ratio $\approx$ 30.6 students per teacher.

Mean: Take $a=32.5$, $h=5$.

Class $f_i$ $x_i$ $u_i$ $f_i u_i$
15–20 3 17.5 $-3$ $-9$
20–25 8 22.5 $-2$ $-16$
25–30 9 27.5 $-1$ $-9$
30–35 10 32.5 0 0
35–40 3 37.5 1 3
40–45 0 42.5 2 0
45–50 0 47.5 3 0
50–55 2 52.5 4 8
Total 35 $-23$

$$\bar{x}=32.5+5\cdot\frac{-23}{35}=32.5-3.286\approx 29.21$$

Mean ratio $\approx$ 29.21. Interpretation: most states (mode) have around 30.6 students per teacher, while on average the ratio is about 29.2.

Question 5. The given distribution shows the number of runs scored by some top batsmen of the world in one-day international cricket matches. Find the mode of the data.

Runs 3000–4000 4000–5000 5000–6000 6000–7000 7000–8000 8000–9000 9000–10000 10000–11000
Batsmen 4 18 9 7 6 3 1 1

Answer: Modal class $4000\text{–}5000$, $l=4000,\;f_1=18,\;f_0=4,\;f_2=9,\;h=1000$.

$$\text{Mode}=4000+\frac{18-4}{2(18)-4-9}\times 1000=4000+\frac{14}{23}\times 1000=4000+608.7\approx 4608.7$$

Mode $\approx$ 4608.7 runs.

Question 6. A student noted the number of cars passing through a spot on a road for 100 periods each of 3 minutes and summarised it in the table below. Find the mode of the data.

Cars 0–10 10–20 20–30 30–40 40–50 50–60 60–70 70–80
Frequency 7 14 13 12 20 11 15 8

Answer: Modal class $40\text{–}50$, $l=40,\;f_1=20,\;f_0=12,\;f_2=11,\;h=10$.

$$\text{Mode}=40+\frac{20-12}{2(20)-12-11}\times 10=40+\frac{8}{17}\times 10=40+4.71\approx 44.71$$

Mode $\approx$ 44.7 cars.

[Figure: histogram for Question 6 (cars per 3-minute period); x-axis: number of cars, y-axis: frequency; the modal class 40–50 is marked.]


Exercise 14.3 — Median of Grouped Data

Question 1. The following frequency distribution gives the monthly consumption of electricity of 68 consumers of a locality. Find the median, mean and mode of the data and compare them.

Consumption (units) 65–85 85–105 105–125 125–145 145–165 165–185 185–205
Consumers 4 5 13 20 14 8 4

Answer: $n=68$, so $n/2=34$. Cumulative frequencies: 4, 9, 22, 42, 56, 64, 68. The cumulative frequency just exceeding 34 is 42, hence median class $125\text{–}145$, $l=125,\;cf=22,\;f=20,\;h=20$.

$$\text{Median}=125+\frac{34-22}{20}\times 20=125+12=137\text{ units}$$

Mode: Modal class $125\text{–}145$, $l=125,\;f_1=20,\;f_0=13,\;f_2=14,\;h=20$.

$$\text{Mode}=125+\frac{20-13}{40-13-14}\times 20=125+\frac{7}{13}\times 20\approx 135.77$$

Mean: Take $a=135$, $h=20$. Step-deviation gives $\bar{x}=135+20\cdot\dfrac{7}{68}\approx 137.06$.

Median $\approx 137$, Mode $\approx 135.77$, Mean $\approx 137.06$. The three measures are approximately equal — the distribution is nearly symmetrical.
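The median formula can be sketched in Python in the same spirit. The function name `grouped_median` is hypothetical, and the sketch assumes equal class width $h$. The median class is taken as the first class whose cumulative frequency reaches $n/2$.

```python
from itertools import accumulate

def grouped_median(lower_limits, freqs, h):
    """Median = l + (n/2 - cf) / f * h for equal class width h."""
    n = sum(freqs)
    running = list(accumulate(freqs))                  # cumulative frequencies
    i = next(k for k, c in enumerate(running) if c >= n / 2)  # median class index
    cf_before = running[i - 1] if i > 0 else 0         # cf of the preceding class
    return lower_limits[i] + (n / 2 - cf_before) / freqs[i] * h

# Data of Exercise 14.3, Question 1
units_lower = [65, 85, 105, 125, 145, 165, 185]
consumers = [4, 5, 13, 20, 14, 8, 4]
print(round(grouped_median(units_lower, consumers, h=20), 2))  # 137.0
```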

Question 2. If the median of the distribution given below is 28.5, find the values of $x$ and $y$.

Class 0–10 10–20 20–30 30–40 40–50 50–60 Total
Frequency 5 $x$ 20 15 $y$ 5 60

Answer: $5+x+20+15+y+5=60\Rightarrow x+y=15$.

Median $=28.5$ lies in $20\text{–}30$, so median class is $20\text{–}30$. Here $l=20,\;n/2=30,\;cf=5+x,\;f=20,\;h=10$.

$$28.5=20+\frac{30-(5+x)}{20}\times 10\Longrightarrow 8.5=\frac{25-x}{2}$$

$\Rightarrow 25-x=17 \Rightarrow x=8$. Hence $y=15-8=7$.

$\boxed{x=8,\;y=7}$.

Question 3. A life insurance agent found the following data for distribution of ages of 100 policy holders. Calculate the median age, if policies are given only to persons whose age is 18 years onwards but less than 60 years.

Age (years) Below 20 Below 25 Below 30 Below 35 Below 40 Below 45 Below 50 Below 55 Below 60
Number of policy holders 2 6 24 45 78 89 92 98 100

Answer: Convert to ordinary frequency table.

Age Frequency $cf$
15–20 2 2
20–25 4 6
25–30 18 24
30–35 21 45
35–40 33 78
40–45 11 89
45–50 3 92
50–55 6 98
55–60 2 100

$n=100$, $n/2=50$. Cumulative frequency just $\ge 50$ is 78, so median class $35\text{–}40$. $l=35,\;cf=45,\;f=33,\;h=5$.

$$\text{Median}=35+\frac{50-45}{33}\times 5=35+\frac{25}{33}\approx 35.76\text{ years}$$

Median age = 35.76 years.
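The first step of this answer, turning a "below" (less-than) cumulative table into ordinary class frequencies, is just successive subtraction. A minimal sketch, with the hypothetical name `frequencies_from_less_than`:

```python
def frequencies_from_less_than(cumulative):
    """Each class frequency is its cumulative value minus the previous one."""
    previous = [0] + cumulative[:-1]
    return [c - p for p, c in zip(previous, cumulative)]

# "Below 20", "Below 25", ..., "Below 60" from Exercise 14.3, Question 3
policy_cf = [2, 6, 24, 45, 78, 89, 92, 98, 100]
print(frequencies_from_less_than(policy_cf))
# [2, 4, 18, 21, 33, 11, 3, 6, 2]
```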

Question 4. The lengths of 40 leaves of a plant are measured correct to the nearest millimetre, and the data obtained is represented in the following table. Find the median length of the leaves.

Length (mm) 118–126 127–135 136–144 145–153 154–162 163–171 172–180
Leaves 3 5 9 12 5 4 2

Answer: Convert to continuous classes.

Class $f$ $cf$
117.5–126.5 3 3
126.5–135.5 5 8
135.5–144.5 9 17
144.5–153.5 12 29
153.5–162.5 5 34
162.5–171.5 4 38
171.5–180.5 2 40

$n=40$, $n/2=20$. Median class $144.5\text{–}153.5$, $l=144.5,\;cf=17,\;f=12,\;h=9$.

$$\text{Median}=144.5+\frac{20-17}{12}\times 9=144.5+2.25=146.75\text{ mm}$$

Median length = 146.75 mm.

Question 5. The following table gives the distribution of the lifetime of 400 neon lamps. Find the median lifetime of a lamp.

Lifetime (hours) 1500–2000 2000–2500 2500–3000 3000–3500 3500–4000 4000–4500 4500–5000
Number of lamps 14 56 60 86 74 62 48

Answer: $cf$: 14, 70, 130, 216, 290, 352, 400. $n/2=200$. Median class $3000\text{–}3500$, $l=3000,\;cf=130,\;f=86,\;h=500$.

$$\text{Median}=3000+\frac{200-130}{86}\times 500=3000+\frac{70\times 500}{86}\approx 3406.98$$

Median lifetime $\approx$ 3406.98 hours.

Question 6. 100 surnames were randomly picked from a local telephone directory and the frequency distribution of the number of letters in the English alphabet in the surnames was obtained. Determine the median, mean and modal size of the surnames.

Letters 1–4 4–7 7–10 10–13 13–16 16–19
Surnames 6 30 40 16 4 4

Answer: $cf$: 6, 36, 76, 92, 96, 100. $n/2=50$. Median class $7\text{–}10$, $l=7,\;cf=36,\;f=40,\;h=3$.

$$\text{Median}=7+\frac{50-36}{40}\times 3=7+\frac{42}{40}=7+1.05=8.05$$

Mode: Modal class $7\text{–}10$, $l=7,\;f_1=40,\;f_0=30,\;f_2=16,\;h=3$.

$$\text{Mode}=7+\frac{40-30}{80-30-16}\times 3=7+\frac{30}{34}\approx 7.88$$

Mean: Take $a=8.5$, $h=3$ (or use direct method). Computing directly: $\sum f_i x_i = 6(2.5)+30(5.5)+40(8.5)+16(11.5)+4(14.5)+4(17.5)=15+165+340+184+58+70=832$. So $\bar{x}=832/100=8.32$.

Median $\approx 8.05$, Mode $\approx 7.88$, Mean $=8.32$.

Question 7. The distribution below gives the weights of 30 students of a class. Find the median weight of the students.

Weight (kg) 40–45 45–50 50–55 55–60 60–65 65–70 70–75
Students 2 3 8 6 6 3 2

Answer: $cf$: 2, 5, 13, 19, 25, 28, 30. $n/2=15$. Median class $55\text{–}60$, $l=55,\;cf=13,\;f=6,\;h=5$.

$$\text{Median}=55+\frac{15-13}{6}\times 5=55+\frac{10}{6}=55+1.67\approx 56.67\text{ kg}$$

Median weight $\approx$ 56.67 kg.


Exercise 14.4 — Cumulative Frequency Curves (Ogives)

Question 1. The following distribution gives the daily income of 50 workers of a factory. Convert the distribution to a less-than type cumulative frequency distribution, and draw its ogive.

Daily income (Rs) 100–120 120–140 140–160 160–180 180–200
Workers 12 14 8 6 10

Answer: Less-than cumulative frequency table:

Daily income (less than) 120 140 160 180 200
Cumulative workers 12 26 34 40 50

Plot the points $(120,12),(140,26),(160,34),(180,40),(200,50)$ and join them by a smooth free-hand curve.

[Figure: less-than ogive for Question 1; x-axis: daily income (Rs), upper class limits; y-axis: cumulative frequency.]

Question 2. During the medical check-up of 35 students of a class, their weights were recorded as follows. Draw a less than type ogive for the given data. Hence obtain the median weight from the graph and verify the result by using the formula.

Weight (kg) less than 38 40 42 44 46 48 50 52
Number of students 0 3 5 9 14 28 32 35

Answer: Plot $(38,0),(40,3),(42,5),(44,9),(46,14),(48,28),(50,32),(52,35)$. Locate $y=n/2=17.5$ on the cf axis; draw a horizontal line meeting the curve and drop a perpendicular to the $x$-axis. The foot of the perpendicular gives the median $\approx 46.5$ kg.

[Figure: less-than ogive for Question 2; x-axis: weight (kg), upper limits; y-axis: cumulative frequency; the line y = n/2 = 17.5 meets the curve at x = 46.5.]

Verification by formula: Frequencies of classes 38–40, 40–42, … give $f$: 3, 2, 4, 5, 14, 4, 3. $cf$: 3, 5, 9, 14, 28, 32, 35. $n/2=17.5$, median class $46\text{–}48$, $l=46,\;cf=14,\;f=14,\;h=2$.

$$\text{Median}=46+\frac{17.5-14}{14}\times 2=46+0.5=46.5\text{ kg}$$

Graphical median = formula median = 46.5 kg. ✓
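Reading a value off the ogive can be imitated numerically. The sketch below is an approximation of the graphical method: it joins the plotted points by straight segments instead of a free-hand curve, and the function name `ogive_x_at` is hypothetical.

```python
def ogive_x_at(points, target_cf):
    """x-coordinate where a less-than ogive (straight segments) reaches target_cf."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target_cf <= y1:
            if y1 == y0:                 # flat segment: take its left end
                return x0
            return x0 + (target_cf - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target cumulative frequency is outside the ogive")

# Points plotted in Exercise 14.4, Question 2
weight_points = [(38, 0), (40, 3), (42, 5), (44, 9), (46, 14), (48, 28), (50, 32), (52, 35)]
n = weight_points[-1][1]
print(ogive_x_at(weight_points, n / 2))  # 46.5
```

With straight segments the result agrees with the median formula, because both assume the observations are spread evenly inside each class.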

Question 3. The following table gives production yield per hectare of wheat of 100 farms of a village. Change the distribution to a more than type distribution, and draw its ogive.

Yield (kg/ha) 50–55 55–60 60–65 65–70 70–75 75–80
Number of farms 2 8 12 24 38 16

Answer: More-than cumulative frequency table:

Yield (more than) 50 55 60 65 70 75
Cumulative farms 100 98 90 78 54 16

Plot $(50,100),(55,98),(60,90),(65,78),(70,54),(75,16)$ and join with a smooth curve.

[Figure: more-than ogive for Question 3; x-axis: yield (kg/ha), lower class limits; y-axis: cumulative frequency.]

The more-than ogive starts at the maximum cf $(=n)$ at the lowest class boundary and decreases to zero at the highest class boundary.
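The more-than table can be built from the class frequencies in one pass. This is a small sketch with the hypothetical name `more_than_cumulative`: the value at each lower limit is $n$ minus the total frequency of all earlier classes.

```python
from itertools import accumulate

def more_than_cumulative(freqs):
    """Cumulative frequency 'more than or equal to' each lower class limit."""
    n = sum(freqs)
    before = [0] + list(accumulate(freqs))[:-1]   # observations in earlier classes
    return [n - b for b in before]

# Data of Exercise 14.4, Question 3
farms = [2, 8, 12, 24, 38, 16]
print(more_than_cumulative(farms))  # [100, 98, 90, 78, 54, 16]
```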


Combined Less-than and More-than Ogive (Question 2 revisited)

When both ogives of the same data are drawn on the same axes, they intersect at a point whose abscissa equals the median.

[Figure: less-than and more-than ogives on the same axes; x-axis: class boundary, y-axis: cumulative frequency; the curves cross at the point (Median, n/2).]
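The crossing point of the two ogives can also be located numerically. The following is a sketch under stated assumptions: both ogives are drawn with straight segments over the same class boundaries, and the function name `ogive_intersection` is hypothetical. It uses the Question 2 weights.

```python
from itertools import accumulate

def ogive_intersection(boundaries, freqs):
    """Point (x, y) where the less-than and more-than ogives cross."""
    n = sum(freqs)
    less = [0] + list(accumulate(freqs))      # cf below each boundary
    more = [n - c for c in less]              # cf at or above each boundary
    diff = [l - m for l, m in zip(less, more)]
    for k in range(len(diff) - 1):
        if diff[k] <= 0 <= diff[k + 1] and diff[k] != diff[k + 1]:
            t = -diff[k] / (diff[k + 1] - diff[k])   # fraction of the way across the class
            x = boundaries[k] + t * (boundaries[k + 1] - boundaries[k])
            y = less[k] + t * (less[k + 1] - less[k])
            return x, y
    raise ValueError("the ogives do not cross on the given boundaries")

weight_boundaries = [38, 40, 42, 44, 46, 48, 50, 52]
weight_freqs = [3, 2, 4, 5, 14, 4, 3]
print(ogive_intersection(weight_boundaries, weight_freqs))  # (46.5, 17.5)
```

The output is the point (Median, n/2), matching the median of 46.5 kg found in Question 2.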


Additional Practice Questions

Q1. If $\bar{x}$ is the mean of $n$ observations $x_1,x_2,\dots,x_n$, prove that $\sum_{i=1}^{n}(x_i-\bar{x})=0$.

Answer: By definition $\bar{x}=\dfrac{1}{n}\sum x_i$, so $\sum x_i = n\bar{x}$. Therefore

$$\sum_{i=1}^{n}(x_i-\bar{x})=\sum x_i - n\bar{x}=n\bar{x}-n\bar{x}=0.$$

Q2. The mean of the following data is 53. Find the missing frequencies $p$ and $q$ if the total frequency is 100.

Class 0–20 20–40 40–60 60–80 80–100 Total
Frequency 15 $p$ 21 $q$ 17 100

Answer: $15+p+21+q+17=100\Rightarrow p+q=47$.

Class marks 10, 30, 50, 70, 90. Then $\sum f_i x_i = 10(15)+30p+50(21)+70q+90(17) = 150+30p+1050+70q+1530 = 2730+30p+70q$.

$$53=\frac{2730+30p+70q}{100}\Longrightarrow 30p+70q=2570\Longrightarrow 3p+7q=257$$

Solving with $p+q=47$: $3p+7q=257$, $3p+3q=141$, so $4q=116\Rightarrow q=29,\;p=18$.
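The elimination above can be checked with a short sketch. The function name `two_missing_frequencies` is hypothetical, and the unknown frequencies are marked with `None`. It solves the pair $p+q=N-S$ and $x_p\,p+x_q\,q=\bar{x}N-T$, where $S$ and $T$ are the totals of the known $f_i$ and $f_i x_i$.

```python
from fractions import Fraction

def two_missing_frequencies(marks, freqs, total, mean):
    """Return (p, q) for the two classes whose frequency is given as None."""
    i, j = [k for k, f in enumerate(freqs) if f is None]
    known_sum = sum(f for f in freqs if f is not None)
    known_fx = sum(f * x for f, x in zip(freqs, marks) if f is not None)
    rest = total - known_sum              # p + q
    rest_fx = mean * total - known_fx     # x_p * p + x_q * q
    q = Fraction(rest_fx - marks[i] * rest, marks[j] - marks[i])
    p = rest - q
    return p, q

p, q = two_missing_frequencies([10, 30, 50, 70, 90], [15, None, 21, None, 17],
                               total=100, mean=53)
print(p, q)  # 18 29
```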

Q3. Find the mode of the following data:

Marks 0–10 10–20 20–30 30–40 40–50
Students 5 8 15 20 12

Answer: Modal class $30\text{–}40$, $l=30,\;f_1=20,\;f_0=15,\;f_2=12,\;h=10$.

$$\text{Mode}=30+\frac{20-15}{40-15-12}\times 10=30+\frac{50}{13}\approx 33.85$$

Q4. The mean of 7 numbers is 8. If a number is included, the mean becomes 8.6. Find the included number.

Answer: Sum of 7 numbers $=7\times 8=56$. Sum of 8 numbers $=8\times 8.6=68.8$. The included number $=68.8-56=12.8$.

Q5. Use the empirical relation to find the mode of a distribution whose mean is 67.5 and median is 69.

Answer: $\text{Mode}=3\,\text{Median}-2\,\text{Mean}=3(69)-2(67.5)=207-135=72.$
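The empirical relation can be rearranged to estimate any one measure from the other two. A minimal sketch with hypothetical function names follows. Remember that the relation is only approximate, so the results are estimates.

```python
def mode_from(mean, median):
    """Mode = 3 Median - 2 Mean."""
    return 3 * median - 2 * mean

def median_from(mean, mode):
    """Median = (Mode + 2 Mean) / 3."""
    return (mode + 2 * mean) / 3

def mean_from(median, mode):
    """Mean = (3 Median - Mode) / 2."""
    return (3 * median - mode) / 2

print(mode_from(mean=67.5, median=69))  # 72.0
```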

Q6. Find the median of the following data:

Class 0–10 10–20 20–30 30–40 40–50
Frequency 5 15 30 8 2

Answer: $cf$: 5, 20, 50, 58, 60. $n/2=30$. Median class $20\text{–}30$, $l=20,\;cf=20,\;f=30,\;h=10$.

$$\text{Median}=20+\frac{30-20}{30}\times 10=20+\frac{100}{30}\approx 23.33$$

Q7. The following table shows the marks scored by 140 students in an examination of a certain paper. Calculate the average marks by step-deviation method.

Marks 0–10 10–20 20–30 30–40 40–50
Students 20 24 40 36 20

Answer: $a=25$, $h=10$. $u_i$: $-2,-1,0,1,2$. $f_i u_i$: $-40,-24,0,36,40$. $\sum f_i u_i=12$.

$$\bar{x}=25+10\cdot\frac{12}{140}=25+0.857\approx 25.86\text{ marks}$$

Q8. The mean and median of a moderately asymmetrical distribution are 25 and 27 respectively. Find the value of mode.

Answer: $\text{Mode}=3(27)-2(25)=81-50=31.$

Q9. Construct the less-than cumulative frequency table for the data below and use it to estimate the median graphically.

Marks 0–10 10–20 20–30 30–40 40–50
Students 4 6 10 15 5

Answer: $cf$ (less-than 10, 20, 30, 40, 50): 4, 10, 20, 35, 40. $n/2=20$. The line $y=20$ meets the ogive at $x=30$, so median $=30$ marks. Verification: the cumulative frequency first reaches $n/2=20$ in the class $20\text{–}30$, so this is the median class, with $l=20,\;cf=10,\;f=10,\;h=10$, giving $\text{Median}=20+\dfrac{20-10}{10}\times 10=30$. ✓

Q10. If the mean of the following frequency distribution is 6.4, find the value of $p$.

$x_i$ 2 4 6 10 $p+5$
$f_i$ 3 2 3 1 2

Answer: $\sum f_i = 11$. $\sum f_i x_i = 6+8+18+10+2(p+5) = 42+2p+10 = 52+2p$.

$$6.4=\frac{52+2p}{11}\Longrightarrow 70.4=52+2p\Longrightarrow p=9.2$$

Q11. The arithmetic mean of the following frequency distribution is 50. Find the value of $f$.

Class 0–20 20–40 40–60 60–80 80–100
Frequency 17 $f$ 32 24 19

Answer: Class marks 10, 30, 50, 70, 90. $\sum f_i = 92+f$. $\sum f_i x_i = 170+30f+1600+1680+1710 = 5160+30f$.

$$50=\frac{5160+30f}{92+f}\Longrightarrow 50(92+f)=5160+30f\Longrightarrow 4600+50f=5160+30f$$

$\Rightarrow 20f=560\Rightarrow f=28$.

Q12. The lengths of 50 leaves are measured. Find the mode if the modal class is $145\text{–}155$ with frequency 18, preceding class frequency 12, succeeding class frequency 9, and class width 10.

Answer: $l=145,\;f_1=18,\;f_0=12,\;f_2=9,\;h=10$.

$$\text{Mode}=145+\frac{18-12}{36-12-9}\times 10=145+\frac{60}{15}=145+4=149\text{ mm}$$


Glossary

Term Meaning
Class interval The range of values between the lower and upper limits of a class, e.g. $10\text{–}20$.
Class mark ($x_i$) The mid-point of a class interval; $x_i=\frac{1}{2}(l_i+u_i)$.
Frequency ($f_i$) The number of observations falling in a particular class.
Cumulative frequency The running total of frequencies up to a given class.
Mean ($\bar{x}$) The arithmetic average of all observations.
Median The middle value when the data are arranged in order; for grouped data it is found by the formula $l+\dfrac{n/2-cf}{f}h$.
Mode The value that occurs most often; for grouped data it is computed inside the modal class.
Modal class The class with the highest frequency.
Median class The class containing the $(n/2)$-th observation.
Assumed mean ($a$) A convenient value chosen near the centre of the data to simplify calculations.
Step deviation ($u_i$) $u_i=(x_i-a)/h$, used when class widths are equal.
Ogive The graph of a cumulative frequency distribution; it is a smooth curve.
Less-than ogive Plot of cf against the upper class limits — an increasing curve.
More-than ogive Plot of cf against the lower class limits — a decreasing curve.
Empirical relation $3\,\text{Median}=\text{Mode}+2\,\text{Mean}$ — approximate link between the three measures.

Worked-Example Walkthrough — Three Methods Side-by-Side

To give you a clear feel for which method is most efficient, let us solve the same distribution by all three methods and compare. Consider the marks scored by 30 students in a 100-mark test:

Marks 10–25 25–40 40–55 55–70 70–85 85–100
Number of students ($f_i$) 2 3 7 6 6 6

Method 1 — Direct. Compute $x_i$ and $f_i x_i$ directly.

Class $f_i$ $x_i$ $f_i x_i$
10–25 2 17.5 35.0
25–40 3 32.5 97.5
40–55 7 47.5 332.5
55–70 6 62.5 375.0
70–85 6 77.5 465.0
85–100 6 92.5 555.0
Total 30 1860.0

$$\bar{x}=\frac{1860}{30}=62$$

Method 2 — Assumed mean. Take $a=62.5$. Then $d_i=x_i-62.5$.

$x_i$ $f_i$ $d_i$ $f_i d_i$
17.5 2 $-45$ $-90$
32.5 3 $-30$ $-90$
47.5 7 $-15$ $-105$
62.5 6 0 0
77.5 6 15 90
92.5 6 30 180
Total 30 $-15$

$$\bar{x}=62.5+\frac{-15}{30}=62.5-0.5=62$$

Method 3 — Step deviation. $a=62.5,\;h=15,\;u_i=(x_i-62.5)/15$.

$x_i$ $f_i$ $u_i$ $f_i u_i$
17.5 2 $-3$ $-6$
32.5 3 $-2$ $-6$
47.5 7 $-1$ $-7$
62.5 6 0 0
77.5 6 1 6
92.5 6 2 12
Total 30 $-1$

$$\bar{x}=62.5+15\cdot\frac{-1}{30}=62.5-0.5=62$$

All three methods give the same answer $\bar{x}=62$. The step-deviation method involved the smallest numbers and is therefore the fastest when class widths are equal. The direct method is preferable when class widths are unequal (the class marks are then not evenly spaced, so no single $h$ turns every $u_i$ into a small integer), and the assumed-mean method is a reasonable compromise when $h$ is awkward but the deviations $d_i$ are not too large.
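The side-by-side comparison can be reproduced in a few lines of Python. This is a sketch using exact fractions, and the variable names are chosen only for illustration.

```python
from fractions import Fraction

freqs = [2, 3, 7, 6, 6, 6]
marks = [Fraction(s) for s in ("17.5", "32.5", "47.5", "62.5", "77.5", "92.5")]
n = sum(freqs)
a, h = Fraction("62.5"), 15   # assumed mean and class width

# Method 1: direct
direct = sum(f * x for f, x in zip(freqs, marks)) / n
# Method 2: assumed mean, d_i = x_i - a
assumed = a + sum(f * (x - a) for f, x in zip(freqs, marks)) / n
# Method 3: step deviation, u_i = (x_i - a) / h
step = a + h * sum(f * (x - a) / h for f, x in zip(freqs, marks)) / n

print(direct, assumed, step)  # 62 62 62
```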


Why the Mode Formula Works — Brief Justification

Inside the modal class the actual mode lies somewhere between the lower limit $l$ and the upper limit $l+h$. Linear interpolation argues that the mode is shifted from $l$ by an amount proportional to how much the modal frequency $f_1$ exceeds the previous frequency $f_0$ relative to its excess over both neighbours. Specifically

$$\frac{\text{Mode}-l}{h}=\frac{f_1-f_0}{(f_1-f_0)+(f_1-f_2)}=\frac{f_1-f_0}{2f_1-f_0-f_2}$$

so $\text{Mode}=l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}\cdot h$. The denominator is positive whenever $f_1>f_0$ and $f_1>f_2$ (the defining condition of a modal class).


Why the Median Formula Works — Brief Justification

If $cf$ observations lie below the median class and the median class itself contains $f$ observations spread uniformly across width $h$, then the $(n/2)$-th observation lies $(n/2-cf)$ steps into the median class. Each step is worth $h/f$ on the $x$-axis, so the median is at

$$\text{Median}=l+(n/2-cf)\cdot\frac{h}{f}=l+\frac{n/2-cf}{f}\cdot h$$

The assumption of uniform spread within the median class is what makes this only an estimate, not the exact median.


Choosing Between Mean, Median and Mode

Situation Best measure Reason
Symmetric data, no outliers (e.g. heights of students) Mean Uses every data value; smallest variance.
Skewed data with extreme values (e.g. household incomes) Median Resistant to outliers — the middle is unaffected.
Categorical or repetitive data (e.g. shoe sizes sold) Mode The most common value is the practical “typical” choice.
Distribution is moderately asymmetric Empirical relation One can be estimated from the other two using $3M_d=M_o+2\bar{x}$.

HSLC Examination Tips

  • Always state the formula clearly before substituting numbers — markers award method marks even if the arithmetic slips.
  • Convert inclusive (discontinuous) classes such as $50\text{–}52,\,53\text{–}55$ to continuous classes by subtracting and adding 0.5 to the limits before identifying the modal or median class.
  • For the step-deviation method, double-check that the class width $h$ is the same for every class. If not, fall back to the direct method.
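  • When constructing a frequency distribution from a “less-than” cumulative table, each class frequency is its cumulative value minus the previous one (the first class keeps its cumulative value). From a “more-than” cumulative table, each class frequency is its cumulative value minus the next one (the last class keeps its cumulative value).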
  • For ogive questions, label both axes — $x$-axis for the variable, $y$-axis for cumulative frequency. Mark the scale clearly.
  • To read the median from a less-than ogive, draw a horizontal line at $y=n/2$, find the point of intersection with the curve, drop a perpendicular and read the $x$-coordinate.
  • To find the median as the intersection of less-than and more-than ogives, the two curves must be drawn on the same axes and at the same scale.
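The tip about recovering class frequencies from a “more-than” cumulative table can be sketched as follows. The function name `frequencies_from_more_than` is hypothetical. The data is the more-than table of Exercise 14.4, Question 3.

```python
def frequencies_from_more_than(cumulative):
    """Each class frequency is its cumulative value minus the next one."""
    following = cumulative[1:] + [0]     # the last class keeps its own value
    return [c - nxt for c, nxt in zip(cumulative, following)]

yield_cf = [100, 98, 90, 78, 54, 16]
print(frequencies_from_more_than(yield_cf))  # [2, 8, 12, 24, 38, 16]
```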

More Practice Problems

Q13. Find the mean marks of students from the following cumulative frequency distribution.

Marks (less than) 10 20 30 40 50 60 70 80 90 100
Number of students 5 9 17 29 45 60 70 78 83 85

Answer: Convert to ordinary frequencies by successive subtraction: 5, 4, 8, 12, 16, 15, 10, 8, 5, 2.

Class marks: 5, 15, 25, 35, 45, 55, 65, 75, 85, 95. Take $a=45,\;h=10$.

$x_i$ $f_i$ $u_i$ $f_i u_i$
5 5 $-4$ $-20$
15 4 $-3$ $-12$
25 8 $-2$ $-16$
35 12 $-1$ $-12$
45 16 0 0
55 15 1 15
65 10 2 20
75 8 3 24
85 5 4 20
95 2 5 10
Total 85 29

$$\bar{x}=45+10\cdot\frac{29}{85}=45+3.41\approx 48.41$$

Mean marks $\approx$ 48.41.

Q14. The following table gives the frequency distribution of the number of orders per day received by a small business. Find the mode.

Orders 0–5 5–10 10–15 15–20 20–25 25–30
Days 4 9 14 20 11 6

Answer: Modal class $15\text{–}20$, $l=15,\;f_1=20,\;f_0=14,\;f_2=11,\;h=5$.

$$\text{Mode}=15+\frac{20-14}{40-14-11}\times 5=15+\frac{30}{15}=15+2=17\text{ orders}$$

Q15. Compute the median of the following data using the formula and verify it graphically using a less-than ogive.

Class 0–8 8–16 16–24 24–32 32–40 40–48
Frequency 5 9 10 16 7 3

Answer: $cf$: 5, 14, 24, 40, 47, 50. $n/2=25$. Median class $24\text{–}32$, $l=24,\;cf=24,\;f=16,\;h=8$.

$$\text{Median}=24+\frac{25-24}{16}\times 8=24+0.5=24.5$$

[Figure: less-than ogive for Q15; x-axis: class boundary (upper limit), y-axis: cumulative frequency; the line y = n/2 = 25 meets the curve at x ≈ 24.5.]

The graph confirms median $\approx 24.5$.

Q16. Find the mean, median and mode of the following distribution and verify the empirical relation $3\,\text{Median}=\text{Mode}+2\,\text{Mean}$.

Class 0–10 10–20 20–30 30–40 40–50
Frequency 3 8 15 10 4

Answer: Class marks 5, 15, 25, 35, 45. Take $a=25,\;h=10$.

$x_i$ $f_i$ $u_i$ $f_i u_i$ $cf$
5 3 $-2$ $-6$ 3
15 8 $-1$ $-8$ 11
25 15 0 0 26
35 10 1 10 36
45 4 2 8 40
Total 40 4

Mean: $\bar{x}=25+10\cdot\dfrac{4}{40}=25+1=26$.

Median: $n/2=20$. Median class $20\text{–}30$, $l=20,\;cf=11,\;f=15,\;h=10$.

$$\text{Median}=20+\frac{20-11}{15}\times 10=20+6=26$$

Mode: Modal class $20\text{–}30$, $l=20,\;f_1=15,\;f_0=8,\;f_2=10,\;h=10$.

$$\text{Mode}=20+\frac{15-8}{30-8-10}\times 10=20+\frac{70}{12}\approx 25.83$$

Verification: $3\,\text{Median}=78$ and $\text{Mode}+2\,\text{Mean}=25.83+52=77.83\approx 78$. ✓
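As a rough numerical check of how closely the empirical relation holds for Q16, the sketch below compares the mode from the grouped-data formula with the mode predicted by $\text{Mode}=3\,\text{Median}-2\,\text{Mean}$. The variable names are illustrative assumptions.

```python
mean, median = 26, 26          # values obtained in Q16

# Mode from the grouped-data formula with l=20, f1=15, f0=8, f2=10, h=10.
mode_formula = 20 + (15 - 8) / (2 * 15 - 8 - 10) * 10

# Mode predicted by the empirical relation.
mode_empirical = 3 * median - 2 * mean

gap = abs(mode_formula - mode_empirical)
print(round(mode_formula, 2), mode_empirical, round(gap, 2))
```

A small gap is what "approximately satisfied" means here; the relation is not expected to hold exactly.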

Q17. An ogive of a less-than type passes through the points $(20,5),(30,12),(40,25),(50,42),(60,52),(70,60)$. Estimate (i) the median, (ii) the third quartile $Q_3$.

Answer: $n=60$, the last cumulative frequency on the ogive. (i) $n/2=30$: find $x$ such that the $y$-coordinate is 30. This lies between $(40,25)$ and $(50,42)$. By linear interpolation $x=40+\dfrac{30-25}{42-25}\cdot 10=40+\dfrac{50}{17}\approx 42.94$. So Median $\approx 42.94$.

(ii) $3n/4=45$ — between $(50,42)$ and $(60,52)$: $x=50+\dfrac{45-42}{52-42}\cdot 10=50+3=53$. So $Q_3\approx 53$.
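The interpolation used in Q17 can be sketched in Python as follows. The helper name `read_ogive` is an assumption made for illustration, and the ogive is treated as straight-line segments between the given points, which is the same assumption the worked answer makes.

```python
def read_ogive(points, target):
    """Return the x at which a piecewise-linear less-than ogive reaches `target`."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target <= y1 and y1 > y0:
            # Linear interpolation inside the segment that brackets the target.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("target lies outside the plotted cumulative frequencies")

points = [(20, 5), (30, 12), (40, 25), (50, 42), (60, 52), (70, 60)]
n = points[-1][1]                      # last cumulative frequency = total frequency

median = read_ogive(points, n / 2)     # y = n/2
q3 = read_ogive(points, 3 * n / 4)     # y = 3n/4
print(round(median, 2), round(q3, 2))
```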

Q18. A frequency distribution of the marks of 100 students has total $\sum f_i x_i = 5500$. Find the mean.

Answer: $\bar{x}=\dfrac{\sum f_i x_i}{\sum f_i}=\dfrac{5500}{100}=55.$

Q19. If for some data $\sum f_i u_i = -10,\;\sum f_i = 50,\;a=25,\;h=5$, find $\bar{x}$.

Answer: $\bar{x}=25+5\cdot\dfrac{-10}{50}=25-1=24.$

Q20. The marks obtained by 100 students of Class X in a Mathematics paper are given below.

Marks 0–5 5–10 10–15 15–20 20–25 25–30 30–35 35–40 40–45 45–50
Students 2 5 6 8 10 25 20 15 6 3

Find the mean and mode.

Answer: Class marks 2.5, 7.5, …, 47.5. Take $a=27.5,\;h=5$.

$x_i$ $f_i$ $u_i$ $f_i u_i$
2.5 2 $-5$ $-10$
7.5 5 $-4$ $-20$
12.5 6 $-3$ $-18$
17.5 8 $-2$ $-16$
22.5 10 $-1$ $-10$
27.5 25 0 0
32.5 20 1 20
37.5 15 2 30
42.5 6 3 18
47.5 3 4 12
Total 100 6

$$\bar{x}=27.5+5\cdot\frac{6}{100}=27.5+0.3=27.8\text{ marks}$$

Modal class $25\text{–}30$, $l=25,\;f_1=25,\;f_0=10,\;f_2=20,\;h=5$.

$$\text{Mode}=25+\frac{25-10}{50-10-20}\times 5=25+\frac{75}{20}=25+3.75=28.75\text{ marks}$$


Multiple-Choice Questions for Quick Revision

MCQ1. The class mark of the class $20\text{–}40$ is

(a) 20 (b) 30 (c) 40 (d) 60

Answer: (b) 30. The class mark is the mid-point: $(20+40)/2=30$.

MCQ2. If the mean of $x,\;x+3,\;x+5,\;x+7,\;x+10$ is 9, then $x$ equals

(a) 4 (b) 5 (c) 6 (d) 7

Answer: (a) 4. The sum is $5x+25$, so the mean is $x+5=9\Rightarrow x=4$. (Check: $5x+25=45\Rightarrow x=4$.)

MCQ3. For grouped data, the mode is given by

(a) $l+\dfrac{f_0}{f_1+f_2}h$ (b) $l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}h$ (c) $l+\dfrac{n/2-cf}{f}h$ (d) None.

Answer: (b).

MCQ4. The empirical relationship between mean, median and mode is

(a) Mean $=$ 3 Median $-$ 2 Mode (b) Mode $=$ 3 Median $-$ 2 Mean (c) 2 Mode $=$ 3 Mean $-$ Median (d) None.

Answer: (b).

MCQ5. The cumulative frequency curve for a frequency distribution is called

(a) Histogram (b) Frequency polygon (c) Ogive (d) Bar graph.

Answer: (c).

MCQ6. The two ogives (less-than and more-than) for the same data intersect at the point whose abscissa equals the

(a) Mean (b) Median (c) Mode (d) None of these.

Answer: (b).

MCQ7. If $\sum f_i d_i=20$ and $\sum f_i=10$ with assumed mean $a=50$, then the mean is

(a) 50 (b) 52 (c) 54 (d) 70.

Answer: (b). $\bar{x}=50+20/10=52$.

MCQ8. While computing the mean of grouped data, we assume that the frequencies are

(a) Centred at the upper limits (b) Centred at the lower limits (c) Centred at the class marks (d) Evenly distributed across all classes.

Answer: (c).

MCQ9. The median of a grouped data with classes $0\text{–}10,\;10\text{–}20,\;\dots$ is found to lie in the class $30\text{–}40$. Then the value of $l$ in the median formula is

(a) 30 (b) 35 (c) 40 (d) 30.5.

Answer: (a).

MCQ10. If the difference of the mode and the median of a data is 24, then the difference between the median and the mean is

(a) 12 (b) 24 (c) 8 (d) 36.

Answer: (a). From $3M_d=M_o+2\bar{x}$, we get $M_o-M_d=2(M_d-\bar{x})$, so $M_d-\bar{x}=24/2=12$.


Fill in the Blanks

  1. The class mark of the class $35\text{–}55$ is _______. Ans: 45.
  2. The most frequently occurring observation is called the _______. Ans: mode.
  3. The graph of a cumulative frequency distribution is called an _______. Ans: ogive.
  4. If $\bar{x}=45$ and Median $=46$, then Mode $=$ _______. Ans: $3(46)-2(45)=48$.
  5. The class with the highest frequency is called the _______ class. Ans: modal.
  6. For finding the median of grouped data, we look for the class containing the _______-th observation. Ans: $(n/2)$.
  7. If $a$ is the assumed mean and $d_i=x_i-a$, then $\bar{x}=a+\dfrac{\sum f_i d_i}{\sum f_i}$, called the _______ method. Ans: assumed-mean.
  8. If the mean of 4 observations is 6 and a fifth observation 16 is included, the new mean is _______. Ans: $40/5=8$.

True or False

  1. The mean of grouped data depends on the choice of assumed mean. False — the mean is invariant; only the working numbers change.
  2. The class mark is the average of the lower and upper limits of a class. True.
  3. For symmetric distributions, mean = median = mode. True.
  4. The mode can never be calculated for grouped data. False — there is a formula based on the modal class.
  5. An ogive can be used to find the median graphically. True.
  6. The empirical relation gives an exact, not approximate, link between the three measures. False — it is an approximate relation valid for moderately asymmetric data.
  7. Step deviation reduces to assumed mean when $h=1$. True.
  8. The less-than ogive is a decreasing curve. False — it is an increasing curve; the more-than ogive is decreasing.

Long-Answer Practice Problems

L1. The mean of the following frequency distribution is 62.8 and the sum of all frequencies is 50. Compute the missing frequencies $f_1$ and $f_2$.

Class 0–20 20–40 40–60 60–80 80–100 100–120
Frequency 5 $f_1$ 10 $f_2$ 7 8

Answer: Total frequency: $5+f_1+10+f_2+7+8=50\Rightarrow f_1+f_2=20$.

Class marks 10, 30, 50, 70, 90, 110. So

$\sum f_i x_i = 50+30f_1+500+70f_2+630+880 = 2060+30f_1+70f_2$.

$$62.8=\frac{2060+30f_1+70f_2}{50}\Longrightarrow 30f_1+70f_2=3140-2060=1080$$

$\Rightarrow 3f_1+7f_2=108$. Combined with $f_1+f_2=20$ ($\Rightarrow f_1=20-f_2$):

$3(20-f_2)+7f_2=108\Rightarrow 60+4f_2=108\Rightarrow f_2=12,\;f_1=8.$
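The pair of equations in L1 can also be solved mechanically. The sketch below uses Cramer's rule in place of the substitution shown above, and then recomputes the mean as a check. The function name `solve_2x2` is a hypothetical helper, not part of any library.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*p + b1*q = c1 and a2*p + b2*q = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("equations are not independent")
    p = Fraction(c1 * b2 - c2 * b1, det)
    q = Fraction(a1 * c2 - a2 * c1, det)
    return p, q

# f1 + f2 = 20 and 30*f1 + 70*f2 = 1080
f1, f2 = solve_2x2(1, 1, 20, 30, 70, 1080)

# Check: the completed table should reproduce the given mean of 62.8.
class_marks = [10, 30, 50, 70, 90, 110]
freqs = [5, f1, 10, f2, 7, 8]
mean = sum(f * x for f, x in zip(freqs, class_marks)) / sum(freqs)
print(f1, f2, float(mean))
```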

L2. The median of the following data is 525. Find the values of $x$ and $y$ if the total frequency is 100.

Class 0–100 100–200 200–300 300–400 400–500 500–600 600–700 700–800 800–900 900–1000
Frequency 2 5 $x$ 12 17 20 $y$ 9 7 4

Answer: $2+5+x+12+17+20+y+9+7+4=100\Rightarrow x+y=24$.

The median 525 lies in $500\text{–}600$, so median class is $500\text{–}600$, $l=500,\;f=20,\;h=100,\;n/2=50$.

$cf$ before median class $=2+5+x+12+17=36+x$.

$$525=500+\frac{50-(36+x)}{20}\times 100\Longrightarrow 25=\frac{(14-x)\cdot 100}{20}=5(14-x)$$

$\Rightarrow 14-x=5\Rightarrow x=9$. Therefore $y=24-9=15$.
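The same result follows from rearranging the median formula for $cf$ instead of solving for $x$ inside the fraction. A minimal Python sketch, with variable names chosen only for illustration:

```python
from fractions import Fraction

n, median = 100, 525
l, f, h = 500, 20, 100                  # median class 500-600
known_before = 2 + 5 + 12 + 17          # frequencies before the median class, without x
known_total = 2 + 5 + 12 + 17 + 20 + 9 + 7 + 4

# Rearranged median formula: cf = n/2 - (Median - l) * f / h
cf_needed = Fraction(n, 2) - Fraction((median - l) * f, h)

x = cf_needed - known_before            # since cf = 36 + x
y = n - known_total - x                 # since the frequencies total 100
print(x, y)
```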

L3. Draw a less-than ogive and a more-than ogive for the following data on the same axes and read the median.

Class 0–10 10–20 20–30 30–40 40–50
Frequency 5 15 20 23 17

Answer: Less-than $cf$: 5, 20, 40, 63, 80. More-than $cf$: 80, 75, 60, 40, 17.

Plot less-than points $(10,5),(20,20),(30,40),(40,63),(50,80)$ and more-than points $(0,80),(10,75),(20,60),(30,40),(40,17)$. The two curves meet at the point whose $x$-coordinate is approximately 30 — that is the median.

[Figure: less-than and more-than ogives for L3 on the same axes. $x$-axis: class boundary; $y$-axis: cumulative frequency. The two curves intersect at median $\approx 30$.]

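Verifying by formula: $n/2=40$. Less-than $cf$: 5, 20, 40, 63, 80, so the median class is $20\text{–}30$ with $l=20,\;cf=20,\;f=20,\;h=10$, and $\text{Median}=20+\dfrac{40-20}{20}\times 10=30$. The 40th observation falls exactly at the boundary $x=30$, so the median is exactly 30. ✓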
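The two cumulative columns and their crossing point in L3 can be reproduced with the Python sketch below. It assumes both ogives are drawn as straight-line segments between class boundaries and adds the end points $(0,0)$ and $(50,0)$, which the plotted lists above leave out. The function names are hypothetical.

```python
def ogive_columns(freqs):
    """Less-than and more-than cumulative frequencies at each class boundary."""
    n = sum(freqs)
    less = [0]
    for f in freqs:
        less.append(less[-1] + f)
    more = [n - c for c in less]
    return less, more

def ogive_intersection(boundaries, freqs):
    """x-coordinate where the piecewise-linear ogives cross (total frequency > 0)."""
    less, more = ogive_columns(freqs)
    diff = [a - b for a, b in zip(less, more)]   # rises from -n to +n
    for j in range(1, len(diff)):
        if diff[j] >= 0:
            # Interpolate for the point where less-than equals more-than.
            t = -diff[j - 1] / (diff[j] - diff[j - 1])
            return boundaries[j - 1] + t * (boundaries[j] - boundaries[j - 1])
    raise ValueError("ogives do not cross")

boundaries = [0, 10, 20, 30, 40, 50]
freqs = [5, 15, 20, 23, 17]
less, more = ogive_columns(freqs)
print(less, more)
print(ogive_intersection(boundaries, freqs))
```

Because the more-than value at any boundary is $n$ minus the less-than value, the curves cross exactly where the less-than ogive reaches $n/2$, which is why the intersection gives the median.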

L4. The annual rainfall record (in cm) of a certain district is given below. Compute the mean rainfall and the modal rainfall.

Rainfall (cm) 20–30 30–40 40–50 50–60 60–70 70–80 80–90
Number of years 2 5 12 16 9 4 2

Answer: Take $a=55,\;h=10$.

$x_i$ $f_i$ $u_i$ $f_i u_i$
25 2 $-3$ $-6$
35 5 $-2$ $-10$
45 12 $-1$ $-12$
55 16 0 0
65 9 1 9
75 4 2 8
85 2 3 6
Total 50 $-5$

$$\bar{x}=55+10\cdot\frac{-5}{50}=55-1=54\text{ cm}$$

Modal class $50\text{–}60$, $l=50,\;f_1=16,\;f_0=12,\;f_2=9,\;h=10$.

$$\text{Mode}=50+\frac{16-12}{32-12-9}\times 10=50+\frac{40}{11}\approx 53.64\text{ cm}$$

L5. The marks scored by 50 students in a test are given below. Find the median, mean and mode and check that the empirical relation is satisfied approximately.

Marks 0–10 10–20 20–30 30–40 40–50 50–60
Students 2 4 10 15 13 6

Answer: Class marks 5, 15, 25, 35, 45, 55. $a=35,\;h=10$.

$x_i$ $f_i$ $u_i$ $f_i u_i$ $cf$
5 2 $-3$ $-6$ 2
15 4 $-2$ $-8$ 6
25 10 $-1$ $-10$ 16
35 15 0 0 31
45 13 1 13 44
55 6 2 12 50
Total 50 1

Mean: $\bar{x}=35+10\cdot\dfrac{1}{50}=35.2$.

Median: $n/2=25$. Median class $30\text{–}40$, $l=30,\;cf=16,\;f=15,\;h=10$.

$$\text{Median}=30+\frac{25-16}{15}\times 10=30+6=36$$

Mode: Modal class $30\text{–}40$, $f_1=15,\;f_0=10,\;f_2=13,\;l=30,\;h=10$.

$$\text{Mode}=30+\frac{15-10}{30-10-13}\times 10=30+\frac{50}{7}\approx 37.14$$

Check: $3\,\text{Median}=108$ and $\text{Mode}+2\,\text{Mean}=37.14+70.4=107.54\approx 108$. ✓

L6. The mean of 100 items is found to be 30. If at the time of calculation two items were wrongly taken as 32 and 12 instead of 23 and 11 respectively, find the corrected mean.

Answer: Original sum = $100\times 30 = 3000$. Correction: subtract $32+12=44$ and add $23+11=34$. Corrected sum $=3000-44+34=2990$. Corrected mean $=2990/100=29.9$.

L7. The arithmetic mean of 5 numbers is 27. If one of the numbers is excluded, the mean of the remaining 4 is 25. What is the excluded number?

Answer: Sum of 5 numbers $=5\times 27=135$. Sum of 4 numbers $=4\times 25=100$. The excluded number $=135-100=35$.

L8. If $\bar{x}_1$ and $\bar{x}_2$ are the means of two distributions with $n_1$ and $n_2$ observations respectively, prove that the mean $\bar{x}$ of the combined distribution is

$$\bar{x}=\frac{n_1\bar{x}_1+n_2\bar{x}_2}{n_1+n_2}$$

Answer: Total of first set $=n_1\bar{x}_1$; total of second $=n_2\bar{x}_2$. Combined total $=n_1\bar{x}_1+n_2\bar{x}_2$ over $(n_1+n_2)$ observations. Hence the formula.

Apply: if $n_1=30,\;\bar{x}_1=50,\;n_2=20,\;\bar{x}_2=60$, then

$$\bar{x}=\frac{30(50)+20(60)}{50}=\frac{1500+1200}{50}=\frac{2700}{50}=54$$
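The combined-mean formula extends naturally to any number of groups. A small Python sketch, assuming integer (or `Fraction`) means so that the arithmetic stays exact; `combined_mean` is an illustrative name:

```python
from fractions import Fraction

def combined_mean(groups):
    """Mean of several groups given as (count, mean) pairs."""
    total = sum(n * m for n, m in groups)   # sum of n_k * mean_k
    count = sum(n for n, _ in groups)       # sum of n_k
    return Fraction(total, count)

print(combined_mean([(30, 50), (20, 60)]))
```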


Step-by-Step Method Summary

Mean by Direct Method

  1. For every class, compute the class mark $x_i=\dfrac{\text{lower limit}+\text{upper limit}}{2}$.
  2. Multiply each $x_i$ by its frequency $f_i$ to obtain $f_i x_i$.
  3. Sum to get $\sum f_i x_i$ and $\sum f_i$.
  4. Apply $\bar{x}=\dfrac{\sum f_i x_i}{\sum f_i}$.

Mean by Assumed-Mean Method

  1. Choose $a$ near the centre of the class marks (any value will work; centre minimises arithmetic).
  2. Compute $d_i=x_i-a$.
  3. Multiply $f_i$ by $d_i$, sum.
  4. Apply $\bar{x}=a+\dfrac{\sum f_i d_i}{\sum f_i}$.

Mean by Step-Deviation Method

  1. Choose $a$ near the centre and let $h$ = common class width.
  2. Compute $u_i=\dfrac{x_i-a}{h}$. When $a$ is one of the class marks, these are small integers $\dots,-3,-2,-1,0,1,2,3,\dots$.
  3. Multiply $f_i$ by $u_i$, sum.
  4. Apply $\bar{x}=a+h\cdot\dfrac{\sum f_i u_i}{\sum f_i}$.
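The three procedures above can be compared side by side in Python. This is a minimal sketch using the E2 data from this lesson; the function names are illustrative, and integer class marks are assumed so that `Fraction` keeps every result exact.

```python
from fractions import Fraction

def mean_direct(marks, freqs):
    return Fraction(sum(f * x for f, x in zip(freqs, marks)), sum(freqs))

def mean_assumed(marks, freqs, a):
    d = [x - a for x in marks]                   # deviations d_i = x_i - a
    return a + Fraction(sum(f * di for f, di in zip(freqs, d)), sum(freqs))

def mean_step(marks, freqs, a, h):
    u = [Fraction(x - a, h) for x in marks]      # step deviations u_i = (x_i - a)/h
    return a + h * sum(f * ui for f, ui in zip(freqs, u)) / sum(freqs)

marks = [15, 25, 35, 45, 55]     # class marks from E2
freqs = [5, 8, 10, 4, 3]

# A set collapses equal values, so one element means all methods agree.
results = {
    mean_direct(marks, freqs),
    mean_assumed(marks, freqs, a=35),
    mean_assumed(marks, freqs, a=15),            # a different assumed mean
    mean_step(marks, freqs, a=35, h=10),
}
print(results)
```

All four calls give the same value, which illustrates that the choice of $a$ changes only the working numbers and not the mean.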

Mode

  1. Identify the modal class — the one with the highest frequency $f_1$.
  2. Read off $l$ (lower limit), $f_0$ (frequency of preceding class), $f_2$ (frequency of succeeding class), $h$ (class width).
  3. Apply $\text{Mode}=l+\dfrac{f_1-f_0}{2f_1-f_0-f_2}\cdot h$.
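The mode steps above can be sketched as a short Python function. The name `grouped_mode` is hypothetical, equal class widths are assumed, and the sketch refuses to apply the formula when the modal class is the first or last class.

```python
def grouped_mode(lower_limits, freqs, h):
    """Mode of grouped data with equal class width h."""
    # Index of the first class that has the highest frequency.
    i = max(range(len(freqs)), key=freqs.__getitem__)
    if i == 0 or i == len(freqs) - 1:
        raise ValueError("modal class has no preceding or succeeding class")
    f0, f1, f2 = freqs[i - 1], freqs[i], freqs[i + 1]
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * h

# Data from Q14 and E4 of this lesson.
print(grouped_mode([0, 5, 10, 15, 20, 25], [4, 9, 14, 20, 11, 6], h=5))
print(grouped_mode([600, 800, 1000, 1200, 1400], [10, 14, 20, 10, 6], h=200))
```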

Median

  1. Construct the cumulative frequency column.
  2. Compute $n/2$.
  3. Identify the median class — the first one whose cumulative frequency $\ge n/2$.
  4. Read off $l,\;cf$ (cumulative frequency of class preceding median class), $f$ (frequency of median class), $h$.
  5. Apply $\text{Median}=l+\dfrac{n/2-cf}{f}\cdot h$.
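The median steps above translate into a single pass over the classes. The following Python sketch assumes equal class widths and a non-zero total frequency; `grouped_median` is an illustrative name.

```python
def grouped_median(lower_limits, freqs, h):
    """Median of grouped data with equal class width h."""
    half = sum(freqs) / 2
    cf = 0                                   # cumulative frequency before the current class
    for l, f in zip(lower_limits, freqs):
        if cf + f >= half:                   # first class whose cumulative frequency >= n/2
            return l + (half - cf) / f * h
        cf += f
    raise ValueError("no observations")

# Data from Q15 and E1 of this lesson.
print(grouped_median([0, 8, 16, 24, 32, 40], [5, 9, 10, 16, 7, 3], h=8))
print(round(grouped_median([5, 15, 25, 35, 45, 55], [6, 11, 21, 23, 14, 5], h=10), 2))
```

Note that `cf` is updated only after a class has been rejected, so when the median class is found it still holds the cumulative frequency of the preceding class, as the formula requires.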

Median from Ogive

  1. Plot the less-than cumulative frequency curve.
  2. Mark $y=n/2$ on the cf-axis. Draw a horizontal line meeting the curve.
  3. From this intersection, drop a perpendicular onto the $x$-axis. Its foot is the median.
  4. (Alternative): plot both ogives — their point of intersection has $x$-coordinate equal to the median.

Mixed Examination-Style Problems

E1. The following table gives the ages of patients admitted to a hospital. Find the median age of the patients.

Age (years) 5–15 15–25 25–35 35–45 45–55 55–65
Number of patients 6 11 21 23 14 5

Answer: $cf$: 6, 17, 38, 61, 75, 80. $n=80,\;n/2=40$. Median class $35\text{–}45$, $l=35,\;cf=38,\;f=23,\;h=10$.

$$\text{Median}=35+\frac{40-38}{23}\times 10=35+\frac{20}{23}\approx 35.87\text{ years}$$

E2. The marks obtained by 30 students in an examination are tabulated below. Find the mean using the assumed-mean method.

Marks 10–20 20–30 30–40 40–50 50–60
Students 5 8 10 4 3

Answer: Class marks 15, 25, 35, 45, 55. Take $a=35$.

$x_i$ $f_i$ $d_i$ $f_i d_i$
15 5 $-20$ $-100$
25 8 $-10$ $-80$
35 10 0 0
45 4 10 40
55 3 20 60
Total 30 $-80$

$$\bar{x}=35+\frac{-80}{30}=35-2.67=32.33$$

Mean marks $\approx$ 32.33.

E3. A survey regarding the heights (in cm) of 51 girls of Class X of a school was conducted and the following data was obtained. Find the median height.

Height (less than) 140 145 150 155 160 165
Number of girls 4 11 29 40 46 51

Answer: Convert to ordinary frequencies: 4, 7, 18, 11, 6, 5. Classes: below 140, $140\text{–}145,\;145\text{–}150,\;150\text{–}155,\;155\text{–}160,\;160\text{–}165$. $n=51,\;n/2=25.5$. The first cumulative frequency $\ge 25.5$ is 29, so the median class is $145\text{–}150$, with $l=145,\;cf=11,\;f=18,\;h=5$.

$$\text{Median}=145+\frac{25.5-11}{18}\times 5=145+\frac{14.5\times 5}{18}=145+4.03\approx 149.03\text{ cm}$$

Median height $\approx$ 149.03 cm.

E4. A frequency distribution of the number of words in 60 short essays is given below. Find the mode.

Words 600–800 800–1000 1000–1200 1200–1400 1400–1600
Essays 10 14 20 10 6

Answer: Modal class $1000\text{–}1200$, $l=1000,\;f_1=20,\;f_0=14,\;f_2=10,\;h=200$.

$$\text{Mode}=1000+\frac{20-14}{40-14-10}\times 200=1000+\frac{1200}{16}=1000+75=1075\text{ words}$$

E5. The following is the frequency distribution of the duration of phone calls (in seconds) at a call centre. Find the mean call duration.

Duration (s) 95–125 125–155 155–185 185–215 215–245
Calls 14 22 28 21 15

Answer: Take $a=170,\;h=30$.

$x_i$ $f_i$ $u_i$ $f_i u_i$
110 14 $-2$ $-28$
140 22 $-1$ $-22$
170 28 0 0
200 21 1 21
230 15 2 30
Total 100 1

$$\bar{x}=170+30\cdot\frac{1}{100}=170+0.3=170.3\text{ seconds}$$

E6. The mean of the following frequency table is 50. Find the missing frequency $f$.

Class 0–20 20–40 40–60 60–80 80–100
Frequency 17 $f$ 32 24 19

Answer: Class marks 10, 30, 50, 70, 90.

$\sum f_i=92+f$, $\sum f_i x_i = 170+30f+1600+1680+1710=5160+30f$.

$$50=\frac{5160+30f}{92+f}\Longrightarrow 4600+50f=5160+30f\Longrightarrow 20f=560\Longrightarrow f=28$$
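Problems with a single missing frequency, like E6, reduce to one linear equation. The sketch below solves it in closed form; the function name and argument layout are assumptions made for illustration, and the method fails if the missing class mark equals the mean (division by zero).

```python
from fractions import Fraction

def missing_frequency(mean, known, missing_mark):
    """known: (class mark, frequency) pairs for the classes whose frequency is given."""
    n_known = sum(f for _, f in known)
    s_known = sum(x * f for x, f in known)
    # mean * (n_known + f) = s_known + missing_mark * f, solved for f
    return Fraction(mean * n_known - s_known, missing_mark - mean)

f = missing_frequency(50, [(10, 17), (50, 32), (70, 24), (90, 19)], missing_mark=30)
print(f)
```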

E7. If the mean of the data $5,\;7,\;9,\;x,\;13,\;15$ is 10, find $x$.

Answer: Sum $=49+x$. Mean $=\dfrac{49+x}{6}=10\Rightarrow 49+x=60\Rightarrow x=11$.

E8. The marks obtained by 35 students in a test out of 50 marks are recorded below. Find the mean and median.

Marks 0–10 10–20 20–30 30–40 40–50
Students 3 5 10 12 5

Answer: $a=25,\;h=10$.

$x_i$ $f_i$ $u_i$ $f_i u_i$ $cf$
5 3 $-2$ $-6$ 3
15 5 $-1$ $-5$ 8
25 10 0 0 18
35 12 1 12 30
45 5 2 10 35
Total 35 11

$\bar{x}=25+10(11/35)=25+3.14\approx 28.14$.

$n/2=17.5$, median class $20\text{–}30$, $l=20,\;cf=8,\;f=10,\;h=10$.

$$\text{Median}=20+\frac{17.5-8}{10}\times 10=20+9.5=29.5$$

E9. A survey shows that the daily wages of workers (in Rs) in a small factory follow the distribution below. Determine the modal wage.

Wages (Rs) 200–250 250–300 300–350 350–400 400–450
Workers 12 18 27 20 17

Answer: Modal class $300\text{–}350$, $l=300,\;f_1=27,\;f_0=18,\;f_2=20,\;h=50$.

$$\text{Mode}=300+\frac{27-18}{54-18-20}\times 50=300+\frac{450}{16}=300+28.125=328.125$$

Modal wage $\approx$ Rs 328.13.

E10. A discrete frequency distribution has $\sum f_i=80,\;\sum f_i x_i^2=12000,\;\sum f_i x_i=800$. Find the mean.

Answer: $\bar{x}=\dfrac{\sum f_i x_i}{\sum f_i}=\dfrac{800}{80}=10$.


Common Mistakes to Avoid

  • Forgetting to make classes continuous. If the data table has gaps such as $50\text{–}52,\;53\text{–}55,\;56\text{–}58,\dots$, subtract half the gap (here 0.5) from every lower limit and add it to every upper limit before locating the modal or median class.
  • Reading the wrong column for the median formula’s $cf$. The $cf$ in the median formula is the cumulative frequency of the class preceding the median class, not of the median class itself.
  • Confusing $l$ with the class mark. In both the mode and median formulas, $l$ is the lower limit, not the mid-point.
  • Ignoring sign of $\sum f_i u_i$. The step-deviation sum can be negative — keep the sign in $\bar{x}=a+h\cdot(\sum f_i u_i)/(\sum f_i)$.
  • Arithmetic with class widths. The step-deviation method requires equal class widths. If widths differ, fall back to the direct method.
  • Plotting ogives against the wrong $x$-coordinate. Less-than ogive uses upper class limits; more-than ogive uses lower class limits.
  • Drawing histogram bars with gaps. Histogram bars must touch each other (continuous classes); a bar chart has gaps but a histogram does not.
  • Forgetting to verify the modal-class condition. The modal class must satisfy $f_1>f_0$ and $f_1>f_2$. If the highest frequency occurs in the first or last class, identify the modal class by inspection but be careful — the formula only applies when $f_0$ and $f_2$ both exist.
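The first point in the list above, making classes continuous, can be sketched in Python as follows. The helper `make_continuous` is hypothetical and assumes at least two classes with the same gap between every pair of consecutive classes.

```python
from fractions import Fraction

def make_continuous(classes):
    """Turn inclusive classes such as (50, 52), (53, 55) into continuous boundaries."""
    # Gap between one upper limit and the next lower limit (1 in the example).
    gap = classes[1][0] - classes[0][1]
    half = Fraction(gap, 2)
    return [(lo - half, hi + half) for lo, hi in classes]

adjusted = make_continuous([(50, 52), (53, 55), (56, 58)])
print([(float(lo), float(hi)) for lo, hi in adjusted])
```

After the adjustment each upper boundary coincides with the next lower boundary, so the class width $h$ becomes 3 rather than 2.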

Notation Used in This Chapter

Symbol Meaning
$x_i$ $i$-th class mark (mid-point of the $i$-th class)
$f_i$ $i$-th frequency
$\bar{x}$ arithmetic mean of grouped data
$a$ assumed mean
$d_i$ $x_i-a$, deviation from assumed mean
$h$ class width (assumed equal across all classes)
$u_i$ $(x_i-a)/h$, step deviation
$n$ $\sum f_i$, total frequency
$cf$ cumulative frequency of class preceding the median class
$f$ frequency of the median class
$l$ lower limit of the median class (or the modal class, depending on context)
$f_1,\;f_0,\;f_2$ frequencies of the modal class, the class preceding it, and the class succeeding it, respectively
$M_d$ median
$M_o$ mode

This completes Chapter 14 Statistics. Continue practising these problems and the additional questions to build mastery in the three measures of central tendency and graphical representation through ogives — common HSLC examination favourites.
