
Lesson 2.5: Data Analysis: Descriptive Statistics

 

  1. Lesson Objectives
  2. Frequency Distribution
  3. Graphic Displays of Distributions
  4. Measures of Central Tendency
  5. Indicators of Variability
  6. Suggested Reading

Lesson Objectives
  • Define, construct, and interpret visual descriptions of data: frequency distribution, histogram, and frequency polygon.
  • Define, calculate, and interpret descriptive statistics concepts: mean, median, mode, range, and standard deviation.



Introduction

By analyzing students’ performance, you will obtain information to improve both tests and instruction, to determine grades or the passing score on the test, to determine whether the test was too easy or too hard, and to evaluate the effectiveness of instruction. For example, if many scores are low, such scores may reflect students’ effort and learning abilities. However, they may also indicate that the instruction was ineffective or that the test was too difficult.

Now that you have access to multiple forms of data from standardized and teacher-made tests, you need to organize and summarize them so that they provide useful information. Data can be organized in tables, such as a frequency distribution, and presented visually in graphs and charts, such as a histogram, frequency polygon, or scatter plot.

Statistical methods are used to summarize and describe data. One way to summarize a set of scores is to look at measures of central tendency (mean, median, and mode). Another is to examine measures of variability (range and standard deviation), which describe how much the scores differ from one another.


Frequency Distribution

In statistics, the term “frequency” refers to the number of times each score or event occurs. We are usually interested in frequency as it relates to the number of students obtaining each score on a test. One way to record frequencies is in a frequency distribution table, where each score is listed in a column on the left side and the frequency with which it occurred is listed on the right. A frequency table helps to organize data, but on its own it does not provide a great deal of descriptive information about the scores. Instead, it provides the initial organization that is the starting point for many other statistical methods.

Example: Mr. Walker wants to examine the general performance trends among his eighth-grade language arts students in order to evaluate student learning and his own instruction. He gives a mid-term language arts exam and obtains the following scores for his students:

95 91 100 96 92 91 87 84 70 65 96 65 56 86 43 65 22 40 93

To evaluate those scores, he creates a frequency distribution table to organize the test data. The obtained scores are listed in order from highest to lowest.

Frequency Distribution

Scores   Tallies   Frequency
100      |         1
96       | |       2
95       |         1
93       |         1
92       |         1
91       | |       2
87       |         1
86       |         1
84       |         1
70       |         1
65       | | |     3
56       |         1
43       |         1
40       |         1
22       |         1
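The tallying that produces a table like the one above can also be sketched in a few lines of Python. This is only an illustration, not part of Mr. Walker's procedure: the variable names (`scores`, `frequency`) are assumptions, and the standard-library `Counter` class is used to do the counting.

```python
from collections import Counter

# Mr. Walker's 19 mid-term scores, as listed in the lesson
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

# Count how many times each score occurs
frequency = Counter(scores)

# Print the table with scores ordered from highest to lowest
print(f"{'Score':>5}  {'Tallies':<7}  {'Frequency':>9}")
for score in sorted(frequency, reverse=True):
    count = frequency[score]
    tallies = " ".join("|" * count)  # one tally mark per occurrence
    print(f"{score:>5}  {tallies:<7}  {count:>9}")
```

The frequencies always add up to the number of students tested, which is a quick way to check a hand-built table.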


Graphic Displays of Distributions

The data from a frequency table can be displayed graphically. A graph provides a visual display of the distribution, which gives us another view of the summarized data. For example, we have already seen scatter plots used to represent the relationship between two different test scores: by visually examining how the scores were arranged in the graph, we could describe in general terms the direction and strength of the relationship. Other graphs of this type include histograms and frequency polygons.

A histogram is a bar graph of scores from a frequency table. The horizontal x-axis represents the scores on the test, and the vertical y-axis represents the frequencies. The frequencies are plotted as bars.


Figure 9.2 Histogram of Mid-Term Language Arts Exam
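As a rough text-only illustration of how the bars of a histogram are built, the sketch below groups Mr. Walker's scores into intervals and prints one row of `#` marks per interval. The 10-point interval width, the decision to place the score of 100 in the top interval, and the names `BIN_WIDTH`, `TOP_BIN`, and `bin_start` are all assumptions made for this example; the original figure may have been drawn differently.

```python
from collections import Counter

# Mr. Walker's 19 mid-term scores, as listed in the lesson
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

BIN_WIDTH = 10  # assumed interval width
TOP_BIN = 90    # assumed: the top interval runs from 90 through 100


def bin_start(score):
    """Return the lower bound of the interval that contains the score."""
    return min(score // BIN_WIDTH * BIN_WIDTH, TOP_BIN)


# Frequency of scores in each interval
counts = Counter(bin_start(s) for s in scores)

# One text "bar" per interval, including intervals with no scores
for start in range(bin_start(min(scores)), TOP_BIN + 1, BIN_WIDTH):
    end = 100 if start == TOP_BIN else start + BIN_WIDTH - 1
    print(f"{start:>3}-{end:<3} {'#' * counts[start]} ({counts[start]})")
```

Each printed row plays the role of one bar: its length is the frequency for that interval.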

A frequency polygon is a line graph of a set of scores from a frequency table. The horizontal x-axis represents the scores on the scale, and the vertical y-axis represents the frequencies. Each frequency is plotted as a point, and the points are joined by straight lines.


Figure 9.3 Frequency Polygon of Mid-Term Language Arts Exam

A frequency polygon could also be used to compare two or more sets of data by representing each set of scores as a line graph with a different color or pattern. For example, you might be interested in looking at your students’ scores by gender, or comparing students’ performance on two tests (see Figure 9.4).

Figure 9.4 Frequency Polygon of Midterm by Gender


Measures of Central Tendency

One way to summarize your data is to look at the measures of central tendency: mean, median, and mode.

Mean (or arithmetic average)

The mean is the average performance level of a group of students. It is obtained by taking the sum of a set of scores and dividing by the total number of scores. The mean can be distorted by outliers, that is, a few scores that are extremely different from the majority of scores for the group. When a distribution contains outliers, the median is usually the more descriptive measure of central tendency.

Median

The median is the point in the distribution that splits the scores into two equal groups. It is also known as the midpoint of a distribution, or the 50th percentile. To find the median, organize the raw scores in rank order. When the number of scores is odd, the median is the middle score, with the same number of scores above it as below it. When the number of scores is even, the median is the average of the two middle scores.

Mode

The mode is the most frequently occurring score in a distribution. No mathematical calculations are needed to find the mode: once the data are organized in a frequency distribution, the mode can be read directly from the table. A distribution may have more than one mode if two or more scores share the highest frequency. A set of scores with two modes is called bimodal; one with more than two is called multimodal.

The following is a step-by-step demonstration of calculating the measures of central tendency. Mr. Walker lists the scores obtained by his students in rank order, from highest to lowest:

Calculating Mean, Median, and Mode

100 96 96 95 93 92 91 91 87 86 84 70 65 65 65 56 43 40 22

 

Calculating the Mean: Mr. Walker adds up all of the scores (total 1,437 points) and divides the result by the total number of scores (19): 1,437 ÷ 19 ≈ 75.6

Finding the Median: Note that with an odd number (19) of scores, the median is the number that divides the scores into two equal halves, nine scores above and nine scores below. In this distribution of scores, 86 is the median.

Finding the Mode: In this distribution of scores, 65 occurs most frequently.
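The three results above can be checked with Python's standard `statistics` module. This is a minimal sketch under the assumption that the scores are held in a plain list named `scores`; `multimode` (available in Python 3.8 and later) returns a list because a distribution can have more than one mode.

```python
import statistics

# Mr. Walker's 19 mid-term scores, as listed in the lesson
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

mean = statistics.mean(scores)        # sum of scores / number of scores
median = statistics.median(scores)    # middle score once ranked
modes = statistics.multimode(scores)  # every score tied for most frequent

print(f"Mean:   {mean:.1f}")
print(f"Median: {median}")
print(f"Mode:   {modes}")
```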

 

 

 

 

Representation of the Mean, Median, and Mode on a Curve

Scores on a standardized test are typically normally distributed, so their distribution is symmetrical. Notice that in such a distribution the mean, median, and mode are equal: the mean is also the value that divides the scores into two equal groups (the median) and the score that occurs most frequently (the mode).

The shape of the distribution of your test scores can provide useful clues about your test and your students’ performance. When students’ scores are represented on a graph, they will often be positively or negatively skewed. When the tail points to the right, the distribution is positively skewed, which implies that the most frequent score (the mode) and the median are below the mean. If your test is very difficult, there may be many low scores and few high ones. The distribution of scores would have a shape similar to the one depicted below that is positively skewed.

When the tail points to the left, the distribution is negatively skewed. In this distribution there are many high scores and relatively few low scores. Notice that the mean is influenced by the skewing: the few low scores pull it toward the tail, so it typically falls below the median and the mode.

 

Because outliers pull the mean toward the tail of a skewed distribution, the median is usually the more descriptive measure of central tendency when scores are skewed.
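As a rough check on the shape of a distribution, the mean can be compared with the median. The sketch below applies that rule of thumb to Mr. Walker's scores. The variable names and the three-way labeling are illustrative assumptions, and the comparison is only a heuristic, not a formal test of skewness.

```python
import statistics

# Mr. Walker's 19 mid-term scores, as listed in the lesson
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

mean = statistics.mean(scores)
median = statistics.median(scores)

# Rule of thumb: the mean is pulled toward the tail of the distribution
if mean < median:
    shape = "negatively skewed (tail toward the low scores)"
elif mean > median:
    shape = "positively skewed (tail toward the high scores)"
else:
    shape = "roughly symmetrical"

print(f"mean = {mean:.1f}, median = {median}: {shape}")
```

Here the mean falls below the median because a handful of low scores drag it down, which is the pattern described above for a negatively skewed distribution.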

 


Indicators of Variability

Variability is the dispersion of the scores within a distribution. In other words, how varied are the scores? On a given test, a group of students with a similar level of performance on a specific skill will tend to have scores close to the mean. Another group with varying levels of performance will have scores that are widely spread and further from the mean. Two common measures of variability are the range and the standard deviation.

Range

The range, R, is the difference between the lowest and the highest scores in a distribution. The range is easy to compute and interpret, but it only indicates the difference between the two extreme scores in a set.

If we use the scores from Mr. Walker’s class (above), we would calculate the range as: Range (R) = the highest score – the lowest score in the distribution.

95 91 100 96 92 91 87 84 70 65 96 65 56 86 43 65 22 40 93

R = 100 - 22 = 78, so the range is 78.
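The same subtraction can be written as a one-line Python check. The names `scores` and `score_range` are assumptions for this example.

```python
# Mr. Walker's 19 mid-term scores, as listed in the lesson
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

# Range = highest score - lowest score
score_range = max(scores) - min(scores)
print(f"R = {max(scores)} - {min(scores)} = {score_range}")
```

Because only the two extreme scores enter the calculation, changing any of the other 17 scores would leave the range unchanged.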

Standard Deviation

A more useful statistic than the range is one that shows how widely the scores are dispersed around the mean. The most common measure of variability is the standard deviation (SD). The standard deviation is defined as the numeric index that describes how far away from the mean the scores in the distribution are located. The formula for the standard deviation is:

$$SD = \sqrt{\frac{\sum (X - M)^2}{N}}$$

Where X = the test score, M = mean, and N = number of scores.

Figure 9.6 Standard Deviation

The higher the standard deviation, the wider the distribution of the scores is around the mean. This indicates a more heterogeneous or dissimilar spread of raw scores on a scale. A lower value of the standard deviation indicates a narrower distribution (more similar or homogeneous) of the raw scores around the mean.

Example:

Table 5.1.6

Scores   Step 1: Deviation Scores   Step 2: Square of Deviation Scores
100       24.4                        595.36
96        20.4                        416.16
96        20.4                        416.16
95        19.4                        376.36
93        17.4                        302.76
92        16.4                        268.96
91        15.4                        237.16
91        15.4                        237.16
87        11.4                        129.96
86        10.4                        108.16
84         8.4                         70.56
70        -5.6                         31.36
65       -10.6                        112.36
65       -10.6                        112.36
65       -10.6                        112.36
56       -19.6                        384.16
43       -32.6                       1062.76
40       -35.6                       1267.36
22       -53.6                       2872.96

Step 3: Sum of Squared Deviation Scores = 9,114.44

Calculating the Standard Deviation:

$$SD = \sqrt{\frac{\sum (X - M)^2}{N}}$$

X = the test score (raw score)

M = mean = 75.6

N = number of scores = 19

To calculate the Standard Deviation of this distribution of scores:

  1. Subtract the mean from each score in the distribution (X - 75.6). This will give you a mix of positive and negative values (called deviation scores).
  2. Square each of the deviation scores (X - 75.6)², which will make them all positive.
  3. Add the squared deviation scores together (9,114.44).
  4. Divide the resulting sum by the number of scores in the distribution (9,114.44 ÷ 19 = 479.71).
  5. Take the square root of the result of your division from Step 4. The square root that you arrive at is the standard deviation.

√479.71 = 21.90 = SD for this distribution of scores.
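The five steps can be traced in Python as a way of checking hand arithmetic. This is a sketch under stated assumptions: the variable names are illustrative, and the code uses the unrounded mean rather than 75.6, so intermediate values may differ slightly from a hand calculation. It divides by N, as the lesson's formula does, which corresponds to `statistics.pstdev` (population standard deviation) rather than `statistics.stdev` (which divides by N - 1).

```python
import math
import statistics

# Mr. Walker's 19 mid-term scores, as listed in the lesson
scores = [95, 91, 100, 96, 92, 91, 87, 84, 70, 65,
          96, 65, 56, 86, 43, 65, 22, 40, 93]

n = len(scores)
mean = sum(scores) / n

deviations = [x - mean for x in scores]   # Step 1: deviation scores
squared = [d ** 2 for d in deviations]    # Step 2: squared deviation scores
sum_of_squares = sum(squared)             # Step 3: sum of squared deviations
variance = sum_of_squares / n             # Step 4: divide by number of scores
sd = math.sqrt(variance)                  # Step 5: square root

print(f"Sum of squared deviations: {sum_of_squares:.2f}")
print(f"Variance:                  {variance:.2f}")
print(f"Standard deviation:        {sd:.2f}")

# Cross-check against the standard library's population SD
print(f"statistics.pstdev:         {statistics.pstdev(scores):.2f}")
```

Agreement between the step-by-step value and `statistics.pstdev` is a useful sanity check on both the formula and the arithmetic.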

 

Suggested Reading

More information about data analysis is available at:

http://www.cas.lancs.ac.uk/glossary_v1.1/presdata.html

http://www.gphillymath.org/MSMathLinks/

 



The Development of the training was funded through the Florida Department of Education,
and jointly developed by Florida State University and North East FL Educational Consortium.

Copyright © 2005 NEFEC except where noted. All rights reserved. Unauthorized use prohibited.