Standardized Test Results - Teacher Essentials

Mean, median and mode scores

A mean score is the average of the scores of the students who took the test. To compute the mean, add all the scores and divide the sum by the total number of scores. Means can be calculated for various score types (e.g., mean scale score, mean percentage correct). For example, for five students, the mean of the percentage correct scores 10, 20, 20, 40, and 60 is 150 / 5 = 30.
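The arithmetic in the example above can be sketched in a few lines of Python. The variable names are illustrative only.

```python
# Percentage correct scores for the five students in the example above.
scores = [10, 20, 20, 40, 60]

total = sum(scores)                # add all the scores: 150
mean_score = total / len(scores)   # divide by the number of scores: 150 / 5

print(mean_score)  # 30.0
```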

A median score is calculated by listing scores in ascending or descending order and finding the middle, or median, of the list. If the number of scores is an odd number, then the median score is the score in the middle of the list. If the number of scores is an even number, then the median score is the average of the two middle scores on the list. For example, for nine students, the median of the percentage correct scores 10, 10, 10, 20, 20, 40, 50, 50, and 60 is 20.
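A minimal sketch of the odd and even cases follows. The function name `median_score` and the four-score list are hypothetical; the nine-score list is the example above.

```python
def median_score(scores):
    """Return the middle score, or the average of the two middle scores."""
    ordered = sorted(scores)          # list the scores in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle score
        return ordered[mid]
    # even count: average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

nine_students = [10, 10, 10, 20, 20, 40, 50, 50, 60]
print(median_score(nine_students))    # 20

four_students = [10, 20, 30, 40]      # hypothetical even-sized list
print(median_score(four_students))    # 25.0
```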

A mode score is the score that occurs more frequently than any other in a list of scores. There can be more than one mode score if several scores share the highest frequency, and there is no mode if no score repeats. For example, for six students, the mode of the percentage correct scores 10, 20, 20, 40, 50, and 60 is 20 because this score occurs most frequently.
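One way to find the mode or modes is to count how often each score occurs. The sketch below assumes that a list in which no score repeats has no mode; the function name `mode_scores` and the second and third lists are hypothetical.

```python
from collections import Counter

def mode_scores(scores):
    """Return every score tied for the highest frequency (empty if none repeats)."""
    if not scores:
        return []
    counts = Counter(scores)
    top = max(counts.values())
    if top == 1:                      # assumption: no repeated score means no mode
        return []
    return sorted(s for s, c in counts.items() if c == top)

print(mode_scores([10, 20, 20, 40, 50, 60]))  # [20]
print(mode_scores([10, 10, 20, 20, 40]))      # [10, 20]  (two modes)
print(mode_scores([10, 20, 30]))              # []        (no mode)
```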

Normal curve and standard error

Normal curve equivalents (NCE) are equal-interval scores, ranging from 1 to 99, used to show where a student falls along the normal curve or to compare a student's results across two or more years of testing. Because the intervals between NCE scores are equal, NCE scores can be averaged, which is important in studying overall school performance and student learning gains.
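The text above does not give a conversion formula. The sketch below assumes the conventional definition, in which NCEs are normally distributed with a mean of 50 and a standard deviation of about 21.06, so that NCEs of 1, 50, and 99 line up with percentile ranks of 1, 50, and 99. The function name and the three sample ranks are hypothetical.

```python
from statistics import NormalDist, mean

NCE_SD = 21.06  # assumed conventional NCE standard deviation (not from the text)

def percentile_to_nce(percentile_rank):
    """Convert a percentile rank (1-99) to a normal curve equivalent."""
    z = NormalDist().inv_cdf(percentile_rank / 100)  # standard normal z-score
    return 50 + NCE_SD * z

ranks = [10, 50, 90]                        # hypothetical percentile ranks
nces = [percentile_to_nce(p) for p in ranks]
print([round(n, 1) for n in nces])

# Unlike percentile ranks, NCEs are equal-interval, so a mean is meaningful.
school_mean_nce = mean(nces)
print(round(school_mean_nce, 1))
```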

Standard error of measurement (SEM) describes the imprecision that all test and quiz scores are subject to. It estimates how much a student's scores would vary if the student were to take the same test repeatedly, with no change in his or her level of knowledge and preparation. Most of those hypothetical scores would be expected to fall within one SEM of the student's true level of achievement, so a reported score is best read as the center of a band extending one SEM above and below it rather than as an exact value.
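Test publishers normally report the SEM directly. Where only a standard deviation and a reliability coefficient are available, one common estimate from classical test theory (an assumption here, not something stated in the text above) is SEM = SD x sqrt(1 - reliability). All numbers in this sketch are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

sd = 10.0            # hypothetical standard deviation of test scores
reliability = 0.91   # hypothetical reliability coefficient (0 to 1)
sem = standard_error_of_measurement(sd, reliability)

observed = 75        # hypothetical student score
low, high = observed - sem, observed + sem
print(f"SEM is about {sem:.1f}; score band is about {low:.0f} to {high:.0f}")
```

A smaller SEM means a narrower band and a more precise score.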

Scale score and raw score

A scale score may be reported on both norm-referenced tests (NRTs) and criterion-referenced tests (CRTs). Scale scores are mathematical conversions from raw scores to a new, arbitrarily chosen scale that represents student achievement. Taken alone, a scale score has no inherent or readily apparent meaning, but the better a student performs on a test, the higher the scale score reported. Each test publisher determines its own scale range, so practitioners must know the scale range of the test in order to interpret a student's achievement. For example, typical scale score ranges of some better-known standardized tests are:

SAT: 200-800 on each section
GRE: 130-170 on each of the verbal and quantitative sections (200-800 before August 2011)
ACT: 1-36

Some tests have continuous score scales that depict the range of performance across grade levels.

A raw score is the number of test items that a student answers correctly. A raw score has no meaning in isolation; it needs to be converted to another score type in order to be interpreted. For example, if there are 60 items on a test and a student gets 36 correct, the raw score is 36. Whether 36 is a strong or a weak result depends on context: it means something quite different if it is the best score in the class than if it is the worst.

Stanine, percentage and percentile

A stanine ("standard nine") score is a standard score that ranges from 1 to 9 with an average of 5, often used as a broad representation of achievement. Most students (about 54% in a normal distribution) are expected to receive a stanine score of 4, 5, or 6.
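The text above does not say how stanines are assigned. The sketch below assumes the conventional split of a normal distribution into bands holding 4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, and 4% of students, and maps a percentile rank to a stanine on that basis. The function name is hypothetical, and a given publisher's cut points may differ.

```python
from bisect import bisect_right

# Assumed conventional lower bounds (percentile ranks) of stanines 2 through 9.
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def percentile_to_stanine(percentile_rank):
    """Map a percentile rank (1-99) to a stanine (1-9)."""
    return bisect_right(STANINE_CUTS, percentile_rank) + 1

for pr in (3, 25, 50, 69, 97):
    print(pr, percentile_to_stanine(pr))
# 3 -> 1, 25 -> 4, 50 -> 5, 69 -> 6, 97 -> 9
```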

A percentage score indicates what percentage of questions the student got correct. For example, a student whose raw score was 36 correct out of 60 questions has a percentage correct score of 60% (as calculated by dividing 36 by 60 and multiplying by 100).
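The same calculation, written as a small function (the name `percentage_correct` is illustrative):

```python
def percentage_correct(raw_score, total_items):
    """Percentage of test items answered correctly."""
    return 100 * raw_score / total_items

print(percentage_correct(36, 60))  # 60.0
```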

A percentile is useful for judging how a student performed in comparison to other students. National Percentile Rank (NPR) represents the rank of an individual student as compared to the students in the norm group. For example, if a student's NPR is 69, this student scored as well as or better than 69% of the students in the norm group. Conversely, 31% of the students in the norm group scored higher than this student. Percentile ranks should not be averaged, because they are rankings rather than equal-interval scores.
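Following the "as well as or better than" reading used above, a percentile rank can be sketched as the percentage of norm-group scores at or below the student's score. Publishers may handle tied scores differently, so treat this as one possible definition; the function name and the ten-student norm group are hypothetical.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that are at or below the given score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100 * at_or_below / len(norm_scores)

# Hypothetical norm group of ten raw scores.
norm_group = [12, 18, 22, 25, 27, 30, 33, 36, 41, 48]

print(percentile_rank(33, norm_group))  # 70.0
```

Here a raw score of 33 is as good as or better than 7 of the 10 norm-group scores, giving a percentile rank of 70.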