Midterm Test in Language Testing
Duration: 45 minutes
Read the following article on Norm-Referenced Tests. Then do the tasks below.
A. Match the titles (numbered 1 to 12) to the appropriate paragraphs (lettered A to J).
B. Decide whether the author of this article is for or against NR testing, and write a short paragraph to explain why you think so.
Human beings make tests. They decide what topics to include on the test, what kinds of questions to ask, and what the correct answers are, as well as how to use test scores. Tests can be made to compare students to each other (norm-referenced tests) or to see whether students have mastered a body of knowledge (criterion- or standards-referenced tests). This fact sheet explains what NRTs are, their limitations and flaws, and how they affect schools.
1. In making an NRT, it is often more important to choose questions that sort people along the curve than it is to make sure that the content covered by the test is adequate.
2. Most achievement NRTs are multiple-choice tests.
3. Norm-referenced tests (NRTs) compare a person's score against the scores of a group of people who have already taken the same exam, called the "norming group."
4. NRTs are designed to "rank-order" test takers -- that is, to compare students' scores.
5. NRTs are a quick snapshot of some of the things most people expect students to learn.
6. NRTs usually have to be completed within a time limit.
7. One more question right or wrong can cause a big change in the student's score.
8. Scores are usually reported as percentile ranks.
9. Tests can be biased.
10. The items on the test are only a sample of the whole subject area.
11. Teaching to the test explains why scores usually go down when a new test is used.
12. To make comparing easier, testmakers create exams in which the results end up looking at least somewhat like a bell-shaped curve.
When you see scores in the paper that report a school's results as a percentile -- "the Lincoln school ranked at the 49th percentile" -- or when you see your child's score reported that way -- "Jamal scored at the 63rd percentile" -- the test is usually an NRT.
Some also include open-ended, short-answer questions. The questions on these tests mainly reflect the content of nationally used textbooks, not the local curriculum. This means that students may be tested on things your local schools or state education department decided were not so important and therefore were not taught.
A commercial norm-referenced test does not compare all the students who take the test in a given year. Instead, testmakers select a sample from the target student population (say, ninth graders). The test is "normed" on this sample, which is supposed to fairly represent the entire target population (all ninth graders in the nation). Students' scores are then reported in relation to the scores of this "norming" group.
Testmakers make the test so that most students will score near the middle, and only a few will score low (the left side of the curve) or high (the right side of the curve).
The scores range from the 1st percentile to the 99th percentile, with the average student score set at the 50th percentile. If Jamal scored at the 63rd percentile, it means he scored higher than 63% of the test takers in the norming group. Scores also can be reported as "grade equivalents," "stanines," and "normal curve equivalents."
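The percentile idea can be sketched in a few lines of Python. This is only an illustration: the norming-group scores are invented, and the rule used here (percent of the norming group scoring strictly below the student, clamped to the range 1 to 99) is one simple reading of "scored higher than 63% of the test takers". Actual testmakers may handle tied scores differently.

```python
from bisect import bisect_left


def percentile_rank(score, norming_scores):
    """Percent of the norming group scoring strictly below `score`, clamped to 1..99."""
    ordered = sorted(norming_scores)
    below = bisect_left(ordered, score)  # number of norming scores lower than `score`
    pct = round(100 * below / len(ordered))
    return min(99, max(1, pct))


# Hypothetical norming group of 20 raw scores (invented for illustration only).
norming_group = [12, 15, 17, 18, 19, 20, 21, 21, 22, 23,
                 24, 24, 25, 26, 27, 28, 30, 31, 33, 35]

print(percentile_rank(25, norming_group))  # prints 60
```

In this made-up group, 12 of the 20 students scored below 25, so a raw score of 25 is reported as the 60th percentile.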
In some cases, having one more correct answer can cause a student's reported percentile score to jump more than ten points. It is very important to know how much difference in the percentile rank would be caused by getting one or two more questions right.
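A small worked example shows why this can happen. The sketch below assumes a hypothetical norming group of 100 students whose raw scores bunch up in the middle, as they do under a bell-shaped curve; the counts are invented, and the percentile rule (percent of the group scoring strictly below, clamped to 1 to 99) is an assumption rather than any testmaker's published method.

```python
# Hypothetical norming group: raw score -> number of students (100 students in total).
norming_counts = {18: 5, 19: 10, 20: 15, 21: 20, 22: 20, 23: 15, 24: 10, 25: 5}


def percentile_table(counts):
    """Map each raw score to the percent of the group scoring strictly below it (1..99)."""
    total = sum(counts.values())
    table = {}
    below = 0  # running count of students with a lower raw score
    for raw in sorted(counts):
        table[raw] = min(99, max(1, round(100 * below / total)))
        below += counts[raw]
    return table


table = percentile_table(norming_counts)
for raw in sorted(table):
    print(f"raw score {raw}: percentile {table[raw]}")
```

With these invented counts, moving from 21 to 22 correct answers moves the reported score from the 30th to the 50th percentile, while moving from 18 to 19 changes it by only four points. The jump is largest where the most students are clustered.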
The tests sometimes emphasize small and meaningless differences among test takers. Since the tests are made to sort students, most of the things everyone knows are not tested. Questions may be obscure or tricky, in order to help rank-order the test takers.
Some questions may favor one kind of student or another for reasons that have nothing to do with the subject area being tested. Non-school knowledge that is more commonly learned by middle- or upper-class children is often included in tests. To help make the bell curve, testmakers usually eliminate questions that students with low overall scores might get right but those with high overall scores get wrong. Thus, most questions which favor minority groups are eliminated.
Some students do not finish, even if they know the material. This can be particularly unfair to students whose first language is not English or who have learning disabilities. This "speededness" is one way testmakers sort people out.
There are often thousands of questions that could be asked, but tests may have just a few dozen questions. A test score is therefore an estimate of how well the student would do if she could be asked all the possible questions.
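How rough such an estimate is can be sketched with a standard formula. The example below assumes, purely for illustration, that the test questions are a simple random sample from a very large pool of possible questions, which real tests are not. Under that assumption the standard error of the proportion correct is sqrt(p(1 - p) / n). The 40-question test and the score of 28 are invented.

```python
import math


def standard_error(correct, n_items):
    """Standard error of the proportion correct, assuming randomly sampled questions."""
    p = correct / n_items
    return math.sqrt(p * (1 - p) / n_items)


# Hypothetical student: 28 correct answers on a 40-question test.
correct, n_items = 28, 40
p = correct / n_items
se = standard_error(correct, n_items)

print(f"proportion correct: {p:.2f}")
print(f"standard error:     {se:.3f}")
print(f"rough range:        {p - 2 * se:.2f} to {p + 2 * se:.2f}")
```

Under these assumptions the standard error is roughly 7 percentage points, so a score of 70% on a few dozen questions is consistent with a fairly wide range of performance on the whole subject area.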