Reports ranking teachers on how much they were able to increase students’ test scores from one year to the next arrived in principals’ inboxes this week, and this time Department of Education officials say the reports are simpler and fairer than in years past.
First released in 2008, teacher data reports have rankled teachers who object to being judged solely on test scores and confused principals, some of whom found the reports too complicated to use. The reports released this week cover 12,000 teachers and address some of those concerns. They contain less information, are easier to read, and use a new formula to calculate teachers’ value-added scores.
This year, Chancellor Joel Klein has made it clear what should be done with the data: one in ten teachers who are up for tenure will have their reports used as a criterion in their tenure evaluations.
On Tuesday morning, principals with students in grades 3-8 — the state gives yearly math and English tests to these students — were given school summary reports. Teachers won’t receive their individual data reports until next week. The vast majority work in traditional public schools, as less than a dozen charter schools chose to participate, according to the Department of Education’s chief talent officer, Amy McIntosh.
This year’s reports offer much of the same data as last year’s, with a few tweaks. Teachers’ value-added scores — how much they’ve helped students improve over the course of a year — are still compared to the scores of teachers who have similar kinds of students in their classrooms and a similar experience level. They’re also broken down to show how well someone teaches special education students versus students who arrive scoring in the top tier.
But this year, the reports don’t include comparisons to teachers citywide, nor do they give a breakdown of teachers’ scores for each year they’ve been teaching. Instead, the reports show data from the 2008-2009 school year as well as teachers’ value-added scores averaged over as many as four years.
“We did find that that [showing each year’s data] tended to focus people more on an annual trend and, given the properties of value-added, year to year changes are not necessarily the thing we would want people to take a lesson from,” McIntosh said.
The new reports also use a different formula to calculate a teacher’s value-added score. In previous years, the formula only noted whether a teacher taught special education students, but didn’t distinguish higher-functioning students from those facing greater challenges. Now, the formula includes these distinctions, and also takes into account the margin of error on the state’s tests.
Sean Corcoran, a professor at New York University’s Institute for Education and Social Policy, said that while the new reports are much easier to understand, they still suffer from the problem that plagues any value-added system: too much certainty.
“What the performance categories ignore — and this is true for any value-added system — is that teachers’ value-added measures are estimates, and as such are subject to error,” Corcoran wrote in an email.
A teacher reading her report will see a single value-added score, but she’ll also see that, accounting for errors, her performance could fall anywhere across a wide range of scores.
“The five performance categories are already pretty crude, but when you consider that a teacher with a 65th–95th percentile confidence interval could be an Average, Above Average, or Excellent teacher, it doesn’t tell us very much,” Corcoran wrote.
“We know statistically that the most likely result is the point we’ve highlighted,” McIntosh said. “But this is not a surgical scalpel and people should use caution and judgment and other metrics when they take into account this data.”
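Corcoran’s point can be made concrete with a small sketch. This is not the Department of Education’s actual methodology; the category labels and percentile cutoffs below are assumed purely for illustration. It simply shows how a wide confidence interval around a single point estimate can overlap several performance categories at once.

```python
# Illustrative sketch only: category names and percentile cutoffs are
# assumed, not taken from the actual teacher data reports.
CATEGORIES = [
    ("Low", 0, 5),
    ("Below Average", 5, 25),
    ("Average", 25, 75),
    ("Above Average", 75, 95),
    ("Excellent", 95, 100),
]

def overlapping_categories(lo, hi):
    """Return every category that a confidence interval [lo, hi] touches."""
    return [name for name, start, end in CATEGORIES if lo <= end and hi >= start]

# A teacher whose interval runs from the 65th to the 95th percentile
# overlaps three categories, as in Corcoran's example:
print(overlapping_categories(65, 95))
# → ['Average', 'Above Average', 'Excellent']

# A narrower interval pins the teacher to a single category:
print(overlapping_categories(50, 60))
# → ['Average']
```

The sketch illustrates why a wide interval carries so little information: the point estimate is a single percentile, but any category whose range the interval touches remains a plausible placement for that teacher.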
Another common concern for those studying and using value-added data is how stable the results are. From year to year, teachers sometimes move in rank from the very bottom to the top, and though more data smooths out those inconsistencies, it doesn’t completely erase them.
In the New York City data reports, most teachers who ranked in a category for two years still placed there two years later. For example, 70 percent of math teachers who ranked in the bottom category for two straight years were still there two years later, showing that their effects on students had been consistent.
But some teachers have been shown to make great changes. Six percent of math teachers jumped from the very worst ranking to the very best over four years.
“This could be because the teachers improved, or it could simply be noise,” Corcoran said. “With these statistical models we can’t say.”
Consistency data for Math:
Consistency data for ELA:
Teacher Data Report sample
School Summary Report sample