Higher pass rates could be due to tougher tests, expert says

The number of correct answers needed to pass state exams is falling — but the head of the state’s testing oversight board says that’s because the tests are actually getting more difficult.

Critics charge that the tests have become so easy that students can guess their way through them. But there might be a good reason for the shift, said Howard Everson, chair of the state body that oversees the testing process: As the individual exam questions have gotten harder, students need to answer fewer of them correctly to earn the same score.

“The idea you have to remove from your head is that a test has a certain number of questions and all of those questions have the same weight every year,” Everson said.

Instead, he said, the state has asked CTB/McGraw-Hill, the company that publishes the exams, to make test questions slightly harder every year. The publisher then adjusts the conversion scale, the chart that translates the number of correct answers into a final score, to reflect the difficulty of that year’s questions.

The adjustments ensure that the test is scored fairly from year to year, Everson said, so that a student who correctly answers seven relatively easy multiple-choice questions one year does not receive the same final score as a student who correctly answers seven harder questions a different year.

But a side effect is that students have to answer fewer questions correctly each year to pass the tests.
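To make the mechanics concrete, the sketch below (in Python) mimics the kind of raw-score-to-scaled-score conversion chart Everson describes. Only the seventh-grade raw cut points cited below, 28 of 50 in 2006 and 22 of 50 in 2009, come from the reporting; the 650 scaled-score cut, the point values, and the straight-line shape of the chart are invented for illustration and are not the state's actual tables.

```python
# Toy illustration of a raw-score-to-scaled-score conversion chart.
# Only the raw cut points (28 and 22 of 50) come from the article;
# everything else here is an assumption made up for this sketch.

LEVEL_3_CUT = 650  # assumed scaled-score threshold for "meets standards"

def make_chart(raw_at_cut: int, max_raw: int = 50) -> dict[int, int]:
    """Build a toy chart mapping each raw score to a scaled score, pinned so
    that `raw_at_cut` is the first raw score reaching the Level 3 cut.
    Real charts come from statistical equating, not a straight line."""
    return {raw: LEVEL_3_CUT + (raw - raw_at_cut) * 5 for raw in range(max_raw + 1)}

chart_2006 = make_chart(raw_at_cut=28)  # easier questions: a raw score converts to a lower scaled score
chart_2009 = make_chart(raw_at_cut=22)  # harder questions: the same raw score converts to a higher scaled score

def raw_needed(chart: dict[int, int], cut: int = LEVEL_3_CUT) -> int:
    """Smallest number of correct answers whose scaled score meets the cut."""
    return min(raw for raw, scaled in chart.items() if scaled >= cut)

print(raw_needed(chart_2006))  # 28 correct answers needed in 2006
print(raw_needed(chart_2009))  # 22 correct answers needed in 2009

# The same seven correct answers convert to different scaled scores in
# different years, which is Everson's point about fairness across years.
print(chart_2006[7], chart_2009[7])  # 545 vs. 575
```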

In 2006, for example, a seventh-grade student needed to earn 28 out of 50 possible correct answers on a combination of multiple-choice and open-ended questions to score a Level 3 on the math exam, indicating that the student met state learning standards. In 2009, a seventh-grader needed only 22 of the 50 correct, a drop of 12 percentage points, from 56 percent of the possible answers to 44 percent.

Similarly, fifth-graders in 2009 needed only half of the possible correct answers on their math exam to reach Level 3, down more than 8 percent from 2006.
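A quick check of the seventh-grade arithmetic above (the 2006 fifth-grade requirement is not spelled out, so only the seventh-grade drop is computed here):

```python
# Seventh-grade math exam, Level 3 cut points cited above.
needed_2006, needed_2009, total = 28, 22, 50

print(needed_2006 / total)                  # 0.56 -> 56% of the possible answers in 2006
print(needed_2009 / total)                  # 0.44 -> 44% of the possible answers in 2009
print((needed_2006 - needed_2009) / total)  # 0.12 -> a 12 percentage-point drop
```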

If the state wanted to head off this trend, it could modify the scale and raise the “cut scores” that separate proficiency levels, Everson said. (The cut scores are set through a complicated point-to-score conversion process.) But the scales and cut scores have not been reviewed since 2004, he said.

The trend is especially relevant in New York City, where students in grades 3-8 must now score at least a Level 2 on the exams to be promoted to the next grade.

The ease with which students can hit the Level 2 mark may account for the dramatic reduction in the number of students failing in New York City. Only a tiny proportion of city students now score at the lowest level on the state tests.

Some critics dispute Everson’s assertion that test questions are getting harder. They point to a recent study that revealed that some test questions are reused year after year in virtually identical form.

Everson said his committee determined that the tests were technically sound. But if student performance is truly improving, he said, the way the tests are graded should change.

“It’s certainly time for another assessment of the assessments,” he said. “We do want to make adjustments if we’re testing a higher-ability population in 2010 than we were in 2000.”

Everson emphasized that the elements that influence test score results are complex. Without an updated review of whether higher scores truly reflect greater learning, he said, it is difficult to know how to interpret exam results. Everson has been calling for a review but so far has not persuaded the state to undertake one.

“That’s the question that goes begging at the moment,” he said. “Are the abilities of the children really improving? And if they are, what does that imply for the testing program?”