Class 12 Education Chapter 7 Question Answer | Statistics and Its Application in Education | English Medium | ASSEB

Class 12 Education Chapter 7 — Statistics and Its Application in Education

Welcome to HSLC Guru. This page provides complete English-medium question answers for ASSEB Class 12 Education Chapter 7 — Statistics and Its Application in Education. Detailed notes, short and long questions, MCQs, formulas table, and key terms are included to help students prepare for the AHSEC/ASSEB Higher Secondary Final Examination.

About the Chapter

This chapter introduces the meaning, nature and uses of statistics in the field of education. It covers the methods of data collection, organisation of raw data into a frequency distribution, graphical representation of data through histogram, frequency polygon and ogive, and the computation and interpretation of measures of central tendency (Mean, Median, Mode), measures of variability (Range, Mean Deviation, Standard Deviation), correlation, and the properties of the Normal Probability Curve. Together, these tools help teachers and educational researchers describe, analyse and interpret quantitative data scientifically.

Summary

Meaning of Statistics: The word “statistics” originates from the Latin word status, the Italian statista and the German statistik, all of which mean “political state”. In its modern sense, statistics is the science of collection, organisation, presentation, analysis and interpretation of numerical data. According to Horace Secrist, “Statistics are aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other.”

Uses of Statistics in Education: Statistics helps in measuring abilities and achievements of pupils, evaluating teaching methods, conducting educational research, constructing and standardising tests, classifying students, comparing performance of groups, predicting future performance, framing educational policies and forecasting trends in education.

Data Collection: Data are facts or figures collected for a specific purpose. Primary data are originally collected by the investigator (e.g. by observation, interview, questionnaire, schedule), while secondary data are obtained from already existing sources such as books, journals, reports, government publications and websites.

Frequency Distribution and Class Interval: A frequency distribution is a tabular arrangement of data showing the number of times (frequency) each score or class of scores occurs. A class interval is a group of scores with a definite lower and upper limit (e.g. 10–14, 15–19). The size of the class interval is the difference between its exact upper and lower limits (the class boundaries); for the class 10–14 this is 14.5 − 9.5 = 5. The mid-point is the average of the two limits. Tally marks are used to record frequencies while constructing the table.

Histogram and Frequency Polygon: A histogram is a graphical representation of a frequency distribution by means of adjacent rectangles whose widths represent class intervals and heights represent frequencies. A frequency polygon is a line graph obtained by joining the mid-points of the tops of the rectangles of the histogram, or by plotting frequencies against class mid-points and joining the points by straight lines.

Ogive (Cumulative Frequency Curve): An ogive is a graph of cumulative frequencies plotted against the upper (less than ogive) or lower (more than ogive) class boundaries. It is useful for finding median, quartiles, deciles and percentiles graphically.

Measures of Central Tendency: A measure of central tendency is a single value that represents the centre of a distribution. The three common measures are:

Mean (Arithmetic Average): The sum of all scores divided by the number of scores. It is the most reliable and most used measure.
Median: The middle-most score in a distribution arranged in ascending or descending order; it divides the distribution into two equal halves.
Mode: The score or class which occurs most frequently in a distribution.

Measures of Variability (Dispersion): These show the extent to which scores are scattered around the central value.

Range: The difference between the highest and the lowest score.
Mean Deviation (MD): The arithmetic mean of the absolute deviations of scores from the mean (or median).
Standard Deviation (SD or σ): The positive square root of the mean of the squared deviations of scores from the mean. It is the most stable and most widely used measure of variability.

Correlation: Correlation is the statistical relationship between two variables. The coefficient of correlation (r) ranges from −1 (perfect negative) through 0 (no correlation) to +1 (perfect positive). Karl Pearson’s product-moment method and Spearman’s rank-difference method are commonly used in educational research.

Normal Probability Curve (NPC): The Normal Curve is a bell-shaped, symmetrical curve in which the mean, median and mode coincide at the centre. About 68.26% of cases lie within ±1σ, 95.44% within ±2σ and 99.74% within ±3σ from the mean. Many physical, mental and educational traits are approximately normally distributed in a large population.

Educational Applications: Statistics is used in scoring and interpreting tests, grading students, comparing classes and schools, evaluating curriculum and teaching methods, conducting educational surveys and experiments, predicting success, identifying gifted and backward learners, and standardising psychological and achievement tests.

Summary (translated from Assamese)

Statistics is the science of the collection, organisation, presentation, analysis and interpretation of numerical data. In education, statistics plays an important role in evaluating the abilities and achievements of students, constructing and standardising tests, comparing teaching methods, classification, prediction and educational research. Data are mainly of two kinds: primary and secondary. Raw data are organised into a frequency distribution with the help of class intervals and frequencies, and the distribution is displayed through graphs such as the histogram, frequency polygon and ogive. Mean, Median and Mode are used as measures of central tendency; Range, Mean Deviation and Standard Deviation are used as measures of variability. Correlation is used to measure the relationship between two variables. The Normal Probability Curve is a bell-shaped, symmetrical curve in which the mean, median and mode coincide at one point. In education, the application of statistics is indispensable in evaluation, ranking, comparison, prediction and research.

Textbook Questions and Answers

1. What is the meaning of the word “Statistics”?

Answer: The word “Statistics” is derived from the Latin word status, the Italian statista and the German statistik, all of which originally meant “political state”. In modern usage, statistics refers to the science of collecting, organising, presenting, analysing and interpreting numerical data so as to draw valid conclusions and make reasonable decisions.

2. Define statistics. Give Horace Secrist’s definition.

Answer: Statistics is the science that deals with the methods of collecting, classifying, presenting, comparing and interpreting numerical data. Horace Secrist defines it as: “Statistics are aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other.”

3. Mention the main uses of statistics in education.

Answer: Statistics is used in education for (i) measuring achievement and ability of pupils, (ii) constructing and standardising tests, (iii) classifying and grading students, (iv) comparing the performance of two or more groups, (v) evaluating teaching methods and curriculum, (vi) predicting future performance, (vii) interpreting research data, and (viii) framing educational policies and planning.

4. What is meant by primary and secondary data?

Answer: Primary data are the data which are originally collected by the investigator for the first time for a specific purpose, e.g. through observation, interview, questionnaire or schedule. Secondary data are those which have already been collected, processed and published by some other agency, and the investigator uses them for his/her own purpose, e.g. data taken from government reports, books, journals, magazines and websites.

5. What is a frequency distribution? What is meant by class interval?

Answer: A frequency distribution is a tabular arrangement of raw scores into classes (or class intervals) showing how many times each class of scores occurs. A class interval is a group or range of scores having a definite lower limit and upper limit, e.g. 10–19, 20–29 etc. The size of a class interval is the difference between its exact upper and lower limits (class boundaries), e.g. 19.5 − 9.5 = 10 for the class 10–19, and the mid-point of an interval is the average of its two limits.

6. What is a histogram?

Answer: A histogram is a graphical representation of a frequency distribution by means of a series of adjacent rectangles erected on the class intervals (taken on the X-axis) with heights proportional to the corresponding frequencies (taken on the Y-axis). There is no gap between the rectangles because the variable is continuous.

7. What is a frequency polygon? How does it differ from a histogram?

Answer: A frequency polygon is a line graph constructed by plotting the frequencies of the classes against the mid-points of the class intervals and joining the successive points by straight lines. A histogram uses rectangles to show frequencies, whereas a frequency polygon uses a single line made up of connected straight segments. A frequency polygon may be drawn directly or by joining the mid-points of the tops of the rectangles of a histogram.

8. What is an ogive?

Answer: An ogive is a cumulative frequency curve obtained by plotting the cumulative frequencies of a distribution against the upper or lower class boundaries on graph paper and joining the points by a smooth curve. There are two types: the “less than” ogive and the “more than” ogive. An ogive is mainly used to determine the median, quartiles, deciles and percentiles graphically.

9. What is meant by measures of central tendency? Name them.

Answer: A measure of central tendency is a single representative value of a distribution that gives an idea about the centre or average of the data. The three principal measures of central tendency are (i) Mean (Arithmetic Mean), (ii) Median and (iii) Mode.

10. Define Mean, Median and Mode.

Answer: Mean is the arithmetic average of all the scores, obtained by dividing the sum of the scores by their number. Median is the middle-most score in a distribution arranged in order of magnitude; it divides the distribution into two equal halves. Mode is the score or class which occurs most frequently in the distribution.

11. Mention the merits and demerits of Mean.

Answer: Merits: (i) easy to calculate and understand, (ii) based on all observations, (iii) rigidly defined, (iv) capable of further algebraic treatment, (v) most stable measure of central tendency. Demerits: (i) greatly affected by extreme scores, (ii) cannot be calculated for open-ended classes, (iii) not suitable for qualitative data, (iv) may give a value which is not actually present in the data.

12. What is meant by measures of variability? Name them.

Answer: Measures of variability or dispersion show the extent to which the individual scores of a distribution are scattered around the central value. The common measures of variability are (i) Range, (ii) Mean Deviation (MD), and (iii) Standard Deviation (SD or σ).

13. Define Range, Mean Deviation and Standard Deviation.

Answer: Range is the difference between the highest and the lowest score in a distribution. Mean Deviation is the arithmetic mean of the absolute deviations of scores from the mean (or median). Standard Deviation is the positive square root of the arithmetic mean of the squared deviations of scores from their mean. It is the most reliable and widely used measure of variability.

14. What is correlation? What are its types?

Answer: Correlation is the statistical relationship between two variables which shows the extent to which they vary together. Its types are: (i) Positive correlation — both variables move in the same direction; (ii) Negative correlation — the variables move in opposite directions; (iii) Zero/No correlation — no systematic relationship; (iv) Perfect correlation (+1 or −1) and partial correlation; (v) Linear and non-linear correlation.

15. What is the Normal Probability Curve? State its properties.

Answer: The Normal Probability Curve (NPC) is a smooth, symmetrical, bell-shaped curve representing a theoretical normal distribution. Its main properties are: (i) it is bell-shaped and perfectly symmetrical about the mean; (ii) the mean, median and mode coincide at the centre; (iii) the curve is asymptotic to the X-axis on both sides; (iv) the total area under the curve is 1 (or 100%); (v) approximately 68.26% of cases lie within ±1σ, 95.44% within ±2σ and 99.74% within ±3σ from the mean; (vi) the curve has only one mode (unimodal).

16. What are the educational applications of the Normal Curve?

Answer: The Normal Curve is used in education to (i) determine the percentage of cases in different ability groups, (ii) convert raw scores into standard scores (z-scores) and percentile ranks, (iii) compare two distributions, (iv) standardise psychological and achievement tests, (v) classify students into ability groups, (vi) test the significance of statistical differences, and (vii) generalise about a population from a sample.

Short Answer Type Questions

1. Write any two characteristics of statistics.

Answer: (i) Statistics deals with aggregates of facts, not isolated figures. (ii) The data are numerically expressed and collected in a systematic manner for a predetermined purpose.

2. State two methods of collecting primary data.

Answer: Two common methods are (i) the questionnaire method and (ii) the interview method. Observation and the schedule are other methods of collecting primary data.

3. What is a tally mark?

Answer: A tally mark is a small vertical stroke (|) used while preparing a frequency table to count the number of items falling within each class interval. Every fifth mark is drawn diagonally across the previous four for ease of counting.

4. What is meant by class limit and class boundary?

Answer: The smallest and largest values that can be included in a class interval are its lower and upper class limits. The actual values that mark the end of one class and the beginning of the next without gap are its class boundaries (obtained by adjusting half a unit on each side).
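
The half-unit adjustment can be checked with a short Python sketch. The class 10–14 and the helper name class_boundaries are illustrative assumptions, and whole-number scores (unit = 1) are assumed.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    """Return (lower boundary, upper boundary) of an inclusive class such as 10-14."""
    half = unit / 2                      # half a unit is adjusted on each side
    return lower_limit - half, upper_limit + half

lower_b, upper_b = class_boundaries(10, 14)   # boundaries of the class 10-14
size = upper_b - lower_b                      # class size measured between boundaries
mid_point = (10 + 14) / 2                     # average of the two class limits

print(lower_b, upper_b, size, mid_point)      # 9.5 14.5 5.0 12.0
```

Measured between the boundaries, the class 10–14 has a size of 5, which matches the five scores (10, 11, 12, 13, 14) it contains.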

5. Differentiate between Mean and Median.

Answer: Mean is the arithmetic average of all the scores and is influenced by extreme values. Median is the middle-most value of the ordered data and is not affected by extreme scores.

6. What is the formula for finding the Mean of grouped data (Direct Method)?

Answer: Mean (M) = Σfx / N, where f = frequency of each class, x = mid-point of the class, and N = Σf = total number of cases.
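
Worked example (illustrative figures, not taken from the textbook): the sketch below applies M = Σfx / N to a made-up distribution of 20 scores grouped in classes of size 5.

```python
# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(10, 14, 2), (15, 19, 3), (20, 24, 4),
           (25, 29, 6), (30, 34, 2), (35, 39, 3)]

N = sum(f for _, _, f in classes)                                # total number of cases
sum_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)  # sum of f times mid-point
mean = sum_fx / N

print(N, sum_fx, mean)   # 20 500.0 25.0
```

Each frequency is multiplied by the mid-point of its class (12, 17, 22, 27, 32, 37), so Σfx = 500 and M = 500 / 20 = 25.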

7. Write the formula for Median of grouped data.

Answer: Median = L + [(N/2 − F) / fm] × i, where L = exact lower limit (lower boundary) of the median class, N = total frequency, F = cumulative frequency of the class preceding the median class, fm = frequency of the median class, and i = size of the class interval.
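
Worked example (illustrative figures): the sketch below locates the median class and applies the formula. The function name grouped_median and the data are assumptions made for illustration; whole-number scores are assumed, so the exact lower limit is taken as the lower class limit minus 0.5.

```python
def grouped_median(classes):
    """classes: list of (lower limit, upper limit, frequency), lowest class first."""
    n = sum(f for _, _, f in classes)
    cum = 0                                  # F: cumulative frequency below the current class
    for lower, upper, f in classes:
        if cum + f >= n / 2:                 # first class whose cumulative frequency reaches N/2
            L = lower - 0.5                  # exact lower limit of the median class
            i = upper - lower + 1            # size of the class interval
            return L + (n / 2 - cum) / f * i
        cum += f

classes = [(10, 14, 2), (15, 19, 3), (20, 24, 4),
           (25, 29, 6), (30, 34, 2), (35, 39, 3)]
median = grouped_median(classes)
print(round(median, 2))   # 25.33
```

Here N/2 = 10, the median class is 25–29, L = 24.5, F = 9, fm = 6 and i = 5, so Median = 24.5 + (1/6) × 5 ≈ 25.33.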

8. Write the formula for Mode of grouped data.

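Answer: For grouped data, Mode = L + [(f1 − f0) / (2f1 − f0 − f2)] × i, where L = exact lower limit of the modal class, f1 = frequency of the modal class, f0 = frequency of the class preceding it, f2 = frequency of the class succeeding it, and i = size of the class interval. When the mean and median are known, the mode can also be estimated from the empirical formula: Mode = 3 × Median − 2 × Mean.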
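
Worked example (illustrative figures): the sketch below computes the mode of a made-up grouped distribution and compares it with the empirical estimate. The function name grouped_mode is an assumption, whole-number scores are assumed, and the sketch does not handle the case where the denominator 2f1 − f0 − f2 is zero.

```python
def grouped_mode(classes):
    """classes: list of (lower limit, upper limit, frequency), lowest class first."""
    freqs = [f for _, _, f in classes]
    k = freqs.index(max(freqs))                        # position of the modal class
    f1 = freqs[k]                                      # frequency of the modal class
    f0 = freqs[k - 1] if k > 0 else 0                  # frequency of the preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0     # frequency of the succeeding class
    lower, upper, _ = classes[k]
    L = lower - 0.5                                    # exact lower limit of the modal class
    i = upper - lower + 1                              # size of the class interval
    return L + (f1 - f0) / (2 * f1 - f0 - f2) * i

classes = [(10, 14, 2), (15, 19, 3), (20, 24, 4),
           (25, 29, 6), (30, 34, 2), (35, 39, 3)]
mode = grouped_mode(classes)

# Empirical estimate, using the mean (25) and median (25 + 1/3) of the same data
empirical_mode = 3 * (25 + 1 / 3) - 2 * 25.0

print(round(mode, 2), round(empirical_mode, 2))   # 26.17 26.0
```

The modal class is 25–29 with f1 = 6, f0 = 4 and f2 = 2, so Mode = 24.5 + (2/6) × 5 ≈ 26.17. The empirical formula gives 26.0, which is close but not identical because it is only an approximation.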

9. What are the merits of Median?

Answer: Median is easy to compute, not affected by extreme scores, can be calculated for open-ended distributions, and can be located graphically from an ogive.

10. Write any two uses of Standard Deviation.

Answer: (i) It is used to measure the variability of a distribution accurately. (ii) It is used to compute standard scores (z-scores) and to test the significance of statistical results.
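
As an illustration of the second use, the sketch below converts raw marks into z-scores with z = (X − M) / σ. The marks, means and standard deviations are made-up figures.

```python
def z_score(x, mean, sd):
    """Standard score: distance of x from the mean in units of the standard deviation."""
    return (x - mean) / sd

# Hypothetical student: 70 in Mathematics (class mean 60, SD 10)
# and 65 in English (class mean 50, SD 5)
z_maths = z_score(70, mean=60, sd=10)
z_english = z_score(65, mean=50, sd=5)

print(z_maths, z_english)   # 1.0 3.0
```

Although the raw mark is higher in Mathematics, the z-scores suggest that the student's relative standing is better in English (3σ above the mean against 1σ).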

11. What is the value of correlation coefficient when there is no relationship between two variables?

Answer: When there is no relationship between two variables, the value of the correlation coefficient (r) is 0 (zero correlation).

12. What is the percentage of cases lying within ±1σ from the mean in a normal distribution?

Answer: Approximately 68.26% of cases lie within ±1σ from the mean in a normal distribution.
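
The areas under the normal curve can be checked numerically. The sketch below uses the error function from Python's math module; the function name area_within is an assumption made for illustration.

```python
import math

def area_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within ±{k}σ: {100 * area_within(k):.2f}%")
# within ±1σ: 68.27%
# within ±2σ: 95.45%
# within ±3σ: 99.73%
```

The textbook figures (68.26%, 95.44%, 99.74%) differ in the last digit because they are obtained by doubling normal-table entries that have already been rounded (34.13%, 47.72%, 49.87%). Either set of values is acceptable as an approximation.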

13. Mention any two characteristics of the Normal Probability Curve.

Answer: (i) It is bell-shaped and perfectly symmetrical about the mean. (ii) The mean, median and mode coincide at the centre of the curve.

14. State any two uses of graphical representation of data.

Answer: (i) Graphs make complex numerical data easy to understand at a glance. (ii) They help in comparing two or more sets of data and in identifying trends.

15. What is meant by skewness?

Answer: Skewness refers to the lack of symmetry in a distribution. If the longer tail is on the right, the distribution is positively skewed; if on the left, it is negatively skewed.

Long Answer Type Questions

1. Define statistics. Discuss its uses and importance in the field of education.

Answer: Statistics is the science of collecting, classifying, presenting, analysing and interpreting numerical data so as to draw valid conclusions and make rational decisions. Horace Secrist defines it as “aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other.”

In the field of education statistics is indispensable. Its main uses are: (i) Measurement of pupils’ achievement — examination marks are processed through statistics to evaluate progress; (ii) Construction and standardisation of tests — item analysis, reliability and validity of tests are determined statistically; (iii) Classification and grading — students are classified into ability groups using measures of central tendency and the normal curve; (iv) Comparison — performance of two classes, schools, methods or sexes can be compared; (v) Educational research — sampling, hypothesis testing and analysis of data depend wholly on statistics; (vi) Prediction and guidance — correlation and regression help predict future success; (vii) Educational planning — enrolment, drop-out, teacher–pupil ratio and budgeting use statistical estimates; (viii) Evaluation of methods and curriculum. Hence, modern educational practice without statistics is virtually impossible.

2. Explain the methods of collection of data and prepare a frequency distribution from raw scores.

Answer: Data may be collected as primary (collected first-hand by the investigator) or secondary (taken from existing sources). Primary data are gathered by observation, interview, questionnaire, schedule, rating scale, projective techniques, sociometric devices and tests. Secondary data are obtained from books, journals, government reports, census reports, websites and similar published sources.

Steps in preparing a frequency distribution from raw scores: (i) determine the range = highest − lowest score; (ii) decide on the number of class intervals and their size, denoted by i (usually 5–15 classes; size 3, 5 or 10); (iii) prepare classes in order beginning from the lowest convenient value; (iv) list each score against its class using tally marks; (v) count the tally marks and write the frequency (f); (vi) total the frequencies (Σf = N) and present the table neatly with class interval, tally and frequency columns. The grouped table thus formed is a frequency distribution.
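
The steps above can be sketched in Python. The raw scores are made up, the function name frequency_distribution is an assumption, and whole-number scores with inclusive classes (such as 10–14) are assumed.

```python
from collections import Counter

def frequency_distribution(scores, size):
    """Group whole-number raw scores into inclusive class intervals of the given size."""
    start = (min(scores) // size) * size               # lowest convenient value
    counts = Counter((s - start) // size for s in scores)   # class index of every score
    n_classes = (max(scores) - start) // size + 1
    table = []
    for k in range(n_classes):
        lower = start + k * size
        upper = lower + size - 1
        table.append((lower, upper, counts.get(k, 0)))
    return table

scores = [12, 27, 35, 18, 22, 31, 25, 29, 14, 38,
          21, 26, 33, 24, 19, 28, 23, 36, 16, 27]      # hypothetical marks of 20 pupils

table = frequency_distribution(scores, size=5)
for lower, upper, f in table:
    print(f"{lower}-{upper}\t{'|' * f}\t{f}")          # class interval, tally, frequency
print("N =", sum(f for _, _, f in table))
```

For these scores the range is 38 − 12 = 26, and a class size of 5 gives six classes (10–14 to 35–39) with frequencies 2, 3, 4, 6, 2 and 3, so that N = 20. The tally column here prints plain strokes and does not cross every fifth mark.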

3. Describe with examples the construction of a histogram, frequency polygon and ogive.

Answer: Histogram: Mark the class boundaries on the X-axis and the frequencies on the Y-axis using a suitable scale. On each class interval erect a rectangle whose base extends from the lower to the upper class boundary and whose height equals the frequency of that class. The rectangles are placed adjacent to one another without any gap because the variable is continuous.

Frequency Polygon: First find the mid-point of each class interval. Plot frequencies against the mid-points and join the successive points by straight lines. The line is extended to the X-axis at both ends (one class below and one class above) to close the polygon. It can also be drawn by joining the mid-points of the tops of the rectangles of a histogram.

Ogive: Calculate cumulative frequencies of the classes — “less than” cumulative frequencies are plotted against the upper class boundaries (less-than ogive) and “more than” cumulative frequencies against the lower class boundaries (more-than ogive). Joining the points by a smooth freehand curve gives the ogive. The point where the two ogives intersect, when read on the X-axis, gives the median.
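
The points needed for the two ogives can be prepared as in the sketch below. The grouped data are made up, and whole-number scores are assumed so that the boundaries lie half a unit beyond the class limits.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(10, 14, 2), (15, 19, 3), (20, 24, 4),
           (25, 29, 6), (30, 34, 2), (35, 39, 3)]
freqs = [f for _, _, f in classes]

less_than_cf = list(accumulate(freqs))               # running total from the lowest class
more_than_cf = list(accumulate(freqs[::-1]))[::-1]   # running total from the highest class

# Less-than cf is plotted against upper boundaries, more-than cf against lower boundaries
less_than_points = [(upper + 0.5, cf) for (_, upper, _), cf in zip(classes, less_than_cf)]
more_than_points = [(lower - 0.5, cf) for (lower, _, _), cf in zip(classes, more_than_cf)]

print(less_than_cf)   # [2, 5, 9, 15, 17, 20]
print(more_than_cf)   # [20, 18, 15, 11, 5, 3]
```

The less-than curve rises from 2 to N = 20 and the more-than curve falls from 20 to 3. The two curves cross at a height of N/2 = 10, and the X-value of that point is the median.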

4. Explain the meaning of measures of central tendency and discuss the merits and demerits of Mean, Median and Mode.

Answer: A measure of central tendency is a single value that represents the centre of a distribution and around which other scores tend to cluster. The three commonly used measures are Mean, Median and Mode.

Mean — Merits: easy to calculate, rigidly defined, based on all observations, capable of further algebraic treatment, the most stable measure. Demerits: greatly affected by extreme scores, cannot be computed for open-ended classes, not suitable for qualitative data, may give a value not actually present in the data.

Median — Merits: easy to understand and compute, not affected by extreme scores, can be obtained from open-ended classes and from an ogive, useful for skewed distributions. Demerits: not based on all observations, not suitable for further algebraic treatment, requires arrangement of data in order, less stable in small samples.

Mode — Merits: simplest measure, easily understood, not affected by extreme scores, useful for qualitative data such as preferences. Demerits: not rigidly defined, may be ill-defined when there are two or more modes, not based on all observations, not capable of further algebraic treatment, less stable.

5. What are measures of variability? Explain Range, Mean Deviation and Standard Deviation with their merits and demerits.

Answer: Measures of variability or dispersion show the extent to which scores deviate from a central value. They supplement measures of central tendency and reveal the homogeneity or heterogeneity of a distribution.

Range = Highest score − Lowest score. Merits: simplest, easy and quick. Demerits: based only on two extreme scores, highly unstable, not suitable for further analysis.

Mean Deviation (MD) = Σ|x − M| / N, the arithmetic mean of the absolute deviations of scores from the mean (or median). Merits: based on all scores, easy to interpret, less affected by extreme scores than SD. Demerits: ignores algebraic signs, not capable of further algebraic treatment, less reliable than SD.

Standard Deviation (σ) = √[Σ(x − M)² / N]. It is the most reliable and widely used measure of variability. Merits: based on all observations, rigidly defined, capable of further algebraic treatment, used in computing z-scores, correlation and tests of significance, the most stable measure. Demerits: difficult to compute compared with Range and MD, gives more weight to extreme scores due to squaring.
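
Worked example (illustrative figures): the sketch below computes the three measures for five made-up scores, following the formulas given above, with deviations taken from the mean.

```python
import math

scores = [4, 6, 8, 10, 12]                  # hypothetical ungrouped scores
N = len(scores)
M = sum(scores) / N                         # mean = 8.0

score_range = max(scores) - min(scores)                       # highest minus lowest
mean_deviation = sum(abs(x - M) for x in scores) / N          # mean of absolute deviations
std_deviation = math.sqrt(sum((x - M) ** 2 for x in scores) / N)  # root of mean squared deviation

print(score_range, mean_deviation, round(std_deviation, 2))   # 8 2.4 2.83
```

The deviations from the mean are −4, −2, 0, 2 and 4. Their absolute values average 12 / 5 = 2.4 (MD), while their squares average 40 / 5 = 8, giving σ = √8 ≈ 2.83. The SD is larger than the MD because squaring gives more weight to the extreme scores.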

6. What is correlation? Discuss its types and educational uses.

Answer: Correlation is a statistical measure that describes the degree and direction of relationship between two variables. The coefficient of correlation (r) ranges from −1 (perfect negative) through 0 (no correlation) to +1 (perfect positive).

Types: (i) Positive — both variables increase or decrease together (e.g. study hours and marks); (ii) Negative — when one increases, the other decreases (e.g. absence and marks); (iii) Zero/no correlation — variables are independent; (iv) Perfect correlation (+1 or −1); (v) Linear and non-linear; (vi) Simple, partial and multiple correlation. The chief methods of computation are Karl Pearson’s product-moment method and Spearman’s rank-difference method.

Educational uses: (a) finding the relationship between intelligence and achievement; (b) establishing reliability and validity of tests; (c) educational and vocational guidance; (d) predicting future performance through regression; (e) selecting items in test construction; (f) comparing the results of different methods of teaching.
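
The two methods of computation can be illustrated with a short sketch using r = Σxy / (N·σx·σy) and ρ = 1 − 6Σd² / (N(N² − 1)). The data (study hours and marks of five pupils) are made up, the function names are assumptions, and the simple ranking helper assumes there are no tied scores.

```python
import math

def pearson_r(xs, ys):
    """Product-moment coefficient: r = sum(xy) / (N * sd_x * sd_y), x and y being deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sum_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return sum_xy / (n * sd_x * sd_y)

def ranks(values):
    """Rank 1 for the smallest value; valid only when no two values are equal."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(xs, ys):
    """Rank-difference coefficient: rho = 1 - 6 * sum(d^2) / (N * (N^2 - 1))."""
    n = len(xs)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

hours = [2, 4, 6, 8, 10]          # hypothetical study hours
marks = [35, 50, 45, 60, 70]      # hypothetical marks of the same pupils

print(round(pearson_r(hours, marks), 2))     # 0.94
print(round(spearman_rho(hours, marks), 2))  # 0.9
```

Both coefficients are high and positive, which agrees with the example of study hours and marks given under positive correlation. The two values differ slightly because Spearman's method uses only the ranks, not the actual scores.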

7. Describe the Normal Probability Curve and its educational applications.

Answer: The Normal Probability Curve (NPC) is a smooth, bell-shaped, symmetrical curve representing a theoretical normal distribution. It was developed by De Moivre, Laplace and Gauss. Its main properties are: (i) bell-shaped and perfectly symmetrical about the mean; (ii) mean, median and mode coincide at the centre; (iii) asymptotic — extends infinitely on both sides without touching the X-axis; (iv) total area under the curve is unity (or 100%); (v) about 68.26% of cases fall within ±1σ, 95.44% within ±2σ and 99.74% within ±3σ from the mean; (vi) it is unimodal; (vii) the points of inflection lie at ±1σ.

Educational applications: (i) determining the relative position of an individual in a group through standard scores or percentile ranks; (ii) classifying students into homogeneous ability groups, including ability grouping within the classroom; (iii) standardising psychological and achievement tests; (iv) converting raw scores into comparable scales (T-scores, stanines); (v) testing the significance of statistical results; (vi) generalising about a population from a sample; (vii) curving of marks (“grading on the curve”).

Multiple Choice Questions (MCQs)

1. The word “Statistics” is derived from the Latin word —
(a) Statista (b) Status (c) Statistik (d) State

Answer: (b) Status.

2. Statistics is defined as a science of —
(a) collecting data only (b) analysing data only (c) collection, presentation, analysis and interpretation of data (d) only graphical representation

Answer: (c) Collection, presentation, analysis and interpretation of data.

3. Data collected first-hand for a specific purpose are called —
(a) secondary data (b) primary data (c) grouped data (d) qualitative data

Answer: (b) Primary data.

4. Which of the following is a method of collecting primary data?
(a) Census report (b) Government publication (c) Questionnaire (d) Magazine

Answer: (c) Questionnaire.

5. The difference between the upper and lower class limits is called —
(a) class boundary (b) class size / class interval (c) frequency (d) tally

Answer: (b) Class size / class interval.

6. A graphical representation of a frequency distribution by adjacent rectangles is called —
(a) bar diagram (b) pie chart (c) histogram (d) ogive

Answer: (c) Histogram.

7. A frequency polygon is obtained by joining the —
(a) corners of rectangles (b) mid-points of the tops of the rectangles of a histogram (c) class boundaries (d) cumulative frequencies

Answer: (b) Mid-points of the tops of the rectangles of a histogram.

8. The cumulative frequency curve is called —
(a) histogram (b) ogive (c) frequency polygon (d) bar diagram

Answer: (b) Ogive.

9. The most commonly used measure of central tendency is —
(a) Mode (b) Median (c) Mean (d) Range

Answer: (c) Mean.

10. The middle-most value of an ordered distribution is the —
(a) Mean (b) Median (c) Mode (d) Range

Answer: (b) Median.

11. The score that occurs most frequently is the —
(a) Mean (b) Median (c) Mode (d) Range

Answer: (c) Mode.

12. The empirical relation between Mean, Median and Mode is —
(a) Mode = 3 Median − 2 Mean (b) Mode = 2 Mean − Median (c) Mean = 3 Median − 2 Mode (d) Median = Mean + Mode

Answer: (a) Mode = 3 Median − 2 Mean.

13. The simplest measure of variability is —
(a) Mean Deviation (b) Standard Deviation (c) Range (d) Quartile Deviation

Answer: (c) Range.

14. The most reliable measure of variability is —
(a) Range (b) Mean Deviation (c) Standard Deviation (d) Quartile Deviation

Answer: (c) Standard Deviation.

15. The square of standard deviation is called —
(a) variance (b) range (c) mean deviation (d) coefficient

Answer: (a) Variance.

16. The value of correlation coefficient lies between —
(a) 0 to 1 (b) −1 to 0 (c) −1 to +1 (d) 0 to 100

Answer: (c) −1 to +1.

17. If r = +1, the correlation is —
(a) perfect positive (b) perfect negative (c) zero (d) partial

Answer: (a) Perfect positive.

18. The product-moment method of correlation was introduced by —
(a) Spearman (b) Karl Pearson (c) Fisher (d) Yule

Answer: (b) Karl Pearson.

19. The Normal Probability Curve is —
(a) bell-shaped and symmetrical (b) skewed (c) rectangular (d) U-shaped

Answer: (a) Bell-shaped and symmetrical.

20. In a normal distribution the percentage of cases lying within ±1σ from the mean is approximately —
(a) 50% (b) 68.26% (c) 95.44% (d) 99.74%

Answer: (b) 68.26%.

21. The percentage of cases lying within ±2σ from the mean is approximately —
(a) 68.26% (b) 95.44% (c) 99.74% (d) 100%

Answer: (b) 95.44%.

22. In a normal curve, Mean, Median and Mode —
(a) are all different (b) coincide at the centre (c) are at three different points (d) are equal to zero

Answer: (b) Coincide at the centre.

23. Skewness refers to —
(a) symmetry of distribution (b) lack of symmetry of distribution (c) variability (d) central tendency

Answer: (b) Lack of symmetry of distribution.

24. The total area under a normal curve is —
(a) 0 (b) 0.5 (c) 1 (d) 100

Answer: (c) 1 (or 100%).

25. Which of the following is NOT a measure of central tendency?
(a) Mean (b) Median (c) Mode (d) Standard Deviation

Answer: (d) Standard Deviation.

Important Statistical Formulas

Measure	Formula	Symbols
Mean (ungrouped)	M = ΣX / N	X = scores, N = number of scores
Mean (grouped, direct)	M = Σfx / N	f = frequency, x = mid-point
Mean (assumed mean)	M = A + (Σfd / N) × i	A = assumed mean, d = (x − A)/i, i = class size
Median (grouped)	Md = L + [(N/2 − F)/fm] × i	L = exact lower limit of median class, N = total frequency, F = cf of preceding class, fm = freq. of median class, i = class size
Mode (empirical)	Mo = 3 Md − 2 M	Md = Median, M = Mean
Mode (grouped)	Mo = L + [(f1 − f0)/(2f1 − f0 − f2)] × i	f1 = freq. of modal class, f0 = preceding, f2 = succeeding
Range	R = H − L	H = highest, L = lowest score
Mean Deviation	MD = Σ|x − M| / N	x = score, M = mean
Standard Deviation	σ = √[Σ(x − M)² / N]	For ungrouped data
SD (grouped)	σ = √[Σf(x − M)² / N]	f = frequency, x = mid-point
Variance	σ²	Square of SD
z-score	z = (X − M) / σ	Standard score
Pearson’s r	r = Σxy / (N·σx·σy)	x = X − Mx, y = Y − My
Spearman’s ρ	ρ = 1 − 6 Σd² / [N(N² − 1)]	d = difference of ranks, N = number of pairs
Quartile Deviation	QD = (Q3 − Q1) / 2	Q1, Q3 = quartiles

Key Terms

Term	Meaning
Statistics	Science of collecting, organising, analysing and interpreting numerical data.
Variable	A quantity which can take different values (e.g. age, marks).
Primary data	Data collected first-hand by the investigator.
Secondary data	Data taken from already published sources.
Frequency	The number of times a score occurs in a distribution.
Class interval	A range of scores grouped together (e.g. 10–19).
Class size (i)	Difference between the exact upper and lower limits (class boundaries) of a class.
Mid-point	Average of upper and lower limits of a class.
Cumulative frequency	Running total of frequencies up to a class.
Histogram	Adjacent-rectangle graph of a frequency distribution.
Frequency polygon	Line graph of frequencies versus class mid-points.
Ogive	Cumulative frequency curve.
Mean	Arithmetic average of scores.
Median	Middle-most value of an ordered distribution.
Mode	Most frequently occurring score.
Range	Difference between highest and lowest score.
Mean Deviation	Average of absolute deviations from the mean.
Standard Deviation (σ)	Square root of the average of squared deviations from the mean.
Variance	Square of standard deviation.
Correlation	Statistical relationship between two variables.
Coefficient of correlation (r)	Numerical measure of correlation, ranging from −1 to +1.
Normal Probability Curve	Bell-shaped, symmetrical theoretical distribution.
Skewness	Lack of symmetry of a distribution.
z-score	Standard score showing position in σ units from the mean.
Percentile	Value below which a given percentage of scores fall.