Response Rates (1)
The following class size and response ratio standards help ensure valid interpretation of student rating data:
Class Size (students) | Minimum Acceptable Response Ratio
5-10 | 80%
21-30 | 75%
31-50 | 66%
51-100 | 50% (a)
>100 | 50% (b)
a) 75% recommended
b) 66% recommended
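As a rough illustration, the table can be turned into a small lookup that checks whether a course's response ratio meets the listed minimum. This is only a sketch: the function names are hypothetical, the thresholds are the minimum values (not the recommended ones in notes a and b), and because the table lists no row for classes of 11-20 students or for fewer than 5, the sketch returns None for those sizes rather than guessing.

```python
# Thresholds transcribed from the table: (smallest size, largest size, minimum ratio).
# A largest size of None means "no upper limit" (the >100 row).
MIN_RESPONSE_RATIO = [
    (5, 10, 0.80),
    (21, 30, 0.75),
    (31, 50, 0.66),
    (51, 100, 0.50),
    (101, None, 0.50),
]


def minimum_ratio(class_size):
    """Return the minimum acceptable response ratio, or None if the table has no row."""
    for low, high, ratio in MIN_RESPONSE_RATIO:
        if class_size >= low and (high is None or class_size <= high):
            return ratio
    return None


def meets_minimum(class_size, responses):
    """Return True or False against the table, or None when no standard is listed."""
    required = minimum_ratio(class_size)
    if required is None:
        return None
    return responses / class_size >= required
```

For example, `meets_minimum(25, 19)` compares 19/25 = 76% against the 75% minimum for a class of 21-30 students.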
The Mean
The mean reported on U.Va.’s online evaluation system is an arithmetic mean, which is the sum of all the individual student ratings for a particular question divided by the number of students answering that question. By itself, this number has very little meaning.
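The calculation described above can be sketched in a few lines. The ratings below are hypothetical, and the mapping of 5 to Strongly Agree is an assumption made only for illustration.

```python
def arithmetic_mean(ratings):
    """Sum of the individual ratings divided by the number of students who answered."""
    if not ratings:
        raise ValueError("no responses to average")
    return sum(ratings) / len(ratings)


# Hypothetical responses to one question on a five-point scale (assumed: 5 = Strongly Agree).
ratings = [5, 4, 4, 3, 5, 4, 2, 5]
question_mean = arithmetic_mean(ratings)
print(question_mean)
```

Students who skip the question are simply absent from the list, so they do not count in the denominator.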
Important Point: Research has clearly shown that numerical student evaluation data are positively biased, meaning that responses are skewed toward the positive end. Thus, the average rating on a five-point scale tends to fall around 3.4 rather than at the scale midpoint of 3.0.
The Standard Deviation
The standard deviation (std dev) is simply the "average" or "expected" variation around the mean. A large standard deviation indicates that the data points are far from the mean and a small standard deviation indicates that they are clustered closely around the mean. The standard deviation is a useful measure when interpreting the mean for data which are well (and normally) distributed. When large clusters of data exist on both ends of the ratings spectrum, the standard deviation is virtually meaningless.
Important Point: On a five-point scale (e.g. Strongly Agree, Agree, Neutral, Disagree, Strongly Disagree), a standard deviation less than 1.2 indicates relatively good agreement.
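To see why the standard deviation says little when ratings cluster at both ends of the scale, the following sketch compares two hypothetical sets of responses. It assumes the sample standard deviation (Python's `statistics.stdev`); the source does not say whether the reported value is a sample or a population standard deviation.

```python
from statistics import mean, stdev

# Hypothetical ratings on a five-point scale.
clustered = [4, 4, 5, 4, 3, 4, 5, 4, 4, 4]   # most students roughly agree
polarized = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]   # large clusters at both ends

clustered_sd = stdev(clustered)  # sample standard deviation (an assumption)
polarized_sd = stdev(polarized)

print(f"clustered: mean={mean(clustered):.2f}, std dev={clustered_sd:.2f}")
print(f"polarized: mean={mean(polarized):.2f}, std dev={polarized_sd:.2f}")
```

The clustered set falls under the 1.2 guideline, while the polarized set has a mean of 3 that no student actually chose and a standard deviation well above 1.2; in that case the distribution of responses is more informative than either summary number.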
Theall, M., and Jennifer Franklin. “Using Student Ratings for Teaching Improvement.” New Directions for Teaching and Learning 48 (1991): 83-96.
Norm Referencing
To interpret and compare evaluation data accurately, context is necessary. A norm, or reference group, provides this context. Some typical norms include the following: all 1000-level anthropology courses; all upper-division, required biology courses; all ENWR discussion sections. The more specific the norm, the better the reference it provides.
Important Point: At U.Va., department or school administrators define the norms. Unfortunately, the norms in use are often too broad (e.g. all department courses or all chemistry labs) to be meaningful for instruction and course improvement purposes.
Important Point: If you teach multiple sections of a course and include your own, instructor-authored questions, the norm automatically includes all students in your course, regardless of section.
Beyond the Basics: Confidence Interval
Like other numerical data, student evaluation data have margins of error and confidence intervals within which the true values probably lie. Knowing something about the “true” value of a response is sometimes more meaningful than a general statistic like standard deviation. While a confidence interval is not reported for items on U.Va.’s online evaluations, it can easily be calculated from the mean, standard deviation, and number of students responding to an item.
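The first step is to calculate the margin of error:
\[ \text{margin of error} = 1.96 \times \frac{\text{standard deviation}}{\sqrt{n}} \]
where n is the number of students responding to the item.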
Note: The value 1.96 is a statistically-based constant for a 95 percent confidence interval.
Then, simply add the margin of error to the mean for the item under examination to obtain the highest probable limit to the true score and subtract the margin of error from the mean to obtain the lowest probable limit. The range from the highest probable limit to the lowest probable limit is the 95 percent confidence interval.
To illustrate, if the response to a particular question has a mean of 4.35, a standard deviation of 0.45 and a sample size of 25 students, then the margin of error in the mean is calculated as follows:
\[ \text{margin of error} = 1.96 \times \frac{0.45}{\sqrt{25}} = 1.96 \times 0.09 \approx 0.18 \]
Thus, the confidence interval runs from 4.35 - 0.18 = 4.17 to 4.35 + 0.18 = 4.53. In other words, the true value of the item, with 95 percent confidence, lies between 4.17 and 4.53.
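The same calculation can be sketched in code for any item, given its mean, standard deviation, and number of respondents. The function names are hypothetical, and the sketch assumes the margin of error is 1.96 times the standard deviation divided by the square root of the number of respondents, which is consistent with the 0.18 figure in the example.

```python
import math

Z_95 = 1.96  # constant for a 95 percent confidence interval


def margin_of_error(std_dev, n):
    """Margin of error of the mean for n respondents."""
    if n <= 0:
        raise ValueError("n must be a positive number of respondents")
    return Z_95 * std_dev / math.sqrt(n)


def confidence_interval(item_mean, std_dev, n):
    """Return the (lowest, highest) probable limits of the true score."""
    moe = margin_of_error(std_dev, n)
    return item_mean - moe, item_mean + moe


# Values from the worked example: mean 4.35, standard deviation 0.45, 25 students.
low, high = confidence_interval(4.35, 0.45, 25)
print(f"{low:.2f} to {high:.2f}")
```

Because the margin of error shrinks with the square root of the number of respondents, quadrupling the number of responses only halves the width of the interval.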