data-driven decisionmaking

Why we won't publish individual teachers' value-added scores

Tomorrow’s planned release of 12,000 New York City teacher ratings raises questions for the courts, parents, principals, bureaucrats, teachers — and one other party: news organizations. The journalists who requested the release of the data in the first place now must decide what to do with it all.

At GothamSchools, we joined other reporters in requesting to see the Teacher Data Reports back in 2010. But you will not see the database here, tomorrow or ever, as long as it is attached to individual teachers’ names.

The fact is that we feel a strong responsibility to report on the quality of the work the 80,000 New York City public school teachers do every day. This is a core part of our job and our mission.

But before we publish any piece of information, we always have to ask a question. Does the information we have do a fair job of describing the subject we want to write about? If it doesn’t, is there any additional information — context, anecdotes, quantitative data — that we can provide to paint a fuller picture?

In the case of the Teacher Data Reports, “value-added” assessments of teachers’ effectiveness that were produced in 2009 and 2010 for reading and math teachers in grades 3 to 8, the answer to both those questions was no.

We determined that the data were flawed, that the public might easily be misled by the ratings, and that no amount of context could justify attaching teachers’ names to the statistics. When the city released the reports, we decided, we would write about them, and maybe even release Excel files with names wiped out. But we would not enable our readers to generate lists of the city’s “best” and “worst” teachers or to search for individual teachers at all.

It’s true that the ratings the city is releasing might turn out to be powerful measures of a teacher’s success at helping students learn. The problem lies in that word: might.

Value-added measures do, by many readings, appear to do the job that no measure of a teacher’s quality has done before: They estimate the amount of learning by students for which a teacher, and no one else, is responsible, and they do this with impressive reliability. That is, a teacher judged to be more effective one year by value-added is likely to continue to be judged effective the next year, and the year after that.

But this is not true for every teacher — hardly. Many teachers will be mislabeled; no one disputes this. Value-added scores may be more reliable than existing alternatives, but they are still far from perfectly reliable. It’s completely possible, for instance, that a teacher judged as less effective one year will be judged as very effective the next, and vice versa.

As we reported two years ago, when the NYU economist Sean Corcoran looked at New York City’s value-added data, he found that 31 percent of English teachers who ranked in the bottom quintile of teachers in 2007 had jumped to one of the top two quintiles by 2008. About 23 percent of math teachers made the same jump.

The fluctuation is acknowledged by even the strongest supporters of using value-added measures to evaluate teachers. One of the creators of the city’s original value-added model, the Columbia economist Jonah Rockoff, compares value-added scores to baseball players’ batting averages. One of his reasons: In each case, the year-to-year fluctuations of an individual’s score are about the same.

“If someone hit, you know, .280 last year, that doesn’t guarantee they’re going to hit .280 next year,” Rockoff said today. “However, if you hit .210 last year and I hit .300, there’s a very high likelihood I’m going to hit more than you next year, too. Whereas if you hit .280 and I hit .278, we’re basically the same.”
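
The logic behind that comparison can be sketched with a toy simulation. This is a minimal illustration, not the city's model or Corcoran's analysis: every number in it (the count of teachers, the spread of "true" effectiveness, the size of the yearly noise) is a hypothetical assumption, and the function name `share_jumping_from_bottom` is our own. It gives each simulated teacher a fixed true effect, adds independent noise to each of two yearly scores, and reports the share of first-year bottom-quintile teachers who land in the top two quintiles the next year.

```python
import random


def quintiles(scores):
    """Return each teacher's quintile, 1 (bottom) through 5 (top)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    result = [0] * n
    for rank, teacher in enumerate(order):
        result[teacher] = rank * 5 // n + 1
    return result


def share_jumping_from_bottom(n_teachers=5000, noise_sd=1.0, seed=0):
    """Share of year-one bottom-quintile teachers in the top two quintiles in year two.

    All parameters are hypothetical; nothing here is taken from the city's data.
    """
    rng = random.Random(seed)
    # Assumed: true effectiveness does not change between the two years.
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    # Each year's measured score is the true effect plus independent noise.
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]

    q1, q2 = quintiles(year1), quintiles(year2)
    bottom = [i for i in range(n_teachers) if q1[i] == 1]
    jumped = sum(1 for i in bottom if q2[i] >= 4)
    return jumped / len(bottom)


if __name__ == "__main__":
    for sd in (0.0, 0.5, 1.0, 2.0):
        print(f"noise_sd={sd}: {share_jumping_from_bottom(noise_sd=sd):.1%}")
```

With no noise, no simulated teacher changes quintile. As the assumed noise grows, more bottom-quintile teachers jump toward the top even though their true effectiveness never changed, which is the kind of fluctuation described above.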

Another challenge is that many researchers still aren’t convinced that value-added scores are measuring the right sort of teacher impact. The problem lies in the flaws of the measures on which value-added scores depend: standardized state test scores.

Tests are supposed to measure what a student has learned about a subject, but they can also reflect other things, like how well her teacher prepared her for the test, or how well she mastered the narrow band of the subject the test assessed.

The test-prep concern is magnified by findings that a single teacher can generate two different value-added scores if evaluators use two different student tests to determine them. The Gates Foundation’s Measures of Effective Teaching study calculated value-added scores for teachers based on both state tests and more conceptual tests. They found substantial differences between the two, according to an analysis by the economist Jesse Rothstein of the University of California at Berkeley.

“If it’s right that some teachers are good at raising the state test scores and other teachers are good at raising other test scores, then we have to decide which tests we care about,” Rothstein said today. “If we’re not sure that this is the test that captures what good teaching is, then we might be getting our estimates of teaching quality very wrong.”

Flags about exactly what high value-added ratings reward are also raised by studies that ask how the ratings match up with measures of what teachers actually say and do in the classroom. Heather Hill, a professor at Harvard’s Graduate School of Education, rated math teachers’ teaching quality based on an observation rubric called the Mathematical Quality of Instruction, which looks at factors like whether the teacher made mathematical errors and the quality of her explanations. Then Hill compared the math teaching rating to value-added measures.

Two individual cases stood out: One teacher had made a slew of math errors in her teaching, and the other had failed to connect a class activity to math concepts. But both teachers’ value-added scores put them at the top of their cohort.

There is some reason to think that value-added measures reflect more than test prep. Rockoff points out that while different tests can produce different value-added scores for the same teacher, the two measures are still correlated. Using different tests, he said, is akin to looking at slugging percentage rather than batting average. “I’m sure those two things are positively correlated, but probably not one for one,” he said.

More persuasively, a recent study by Rockoff and two colleagues concluded that value-added measures can actually predict long-term outcomes in students’ lives, including higher cumulative lifelong income, reduced chance of teen pregnancy, and living in a high-quality neighborhood as an adult. The study examined a very large, unnamed urban school district that bears several similarities to New York City.

That study targeted another concern about value-added measures: that teachers score consistently well year after year not because of something they are doing, but because they consistently teach students with certain advantages.

Rothstein has used value-added models to conclude that fifth-grade teachers have strong effects on their students’ performance in third grade, something they could not possibly influence, unless value-added scores reflect not just teachers’ influence but also advantages brought by students.

Rockoff and his colleagues evaluated the possibility by testing a question. If high value-added teachers do well because they get the “better” students of those in their grade, then their students’ high test score growth would be linked with mediocre performance in other classrooms. That would mean that, when researchers looked at growth for the entire grade, the “better” students’ growth would be canceled out by that of their less lucky peers. But the scores were not canceled out, suggesting that effective teachers did more than just have unusually good students.

None of this means that we won’t write about what the data dump includes, or that we won’t publish an adapted database that strips out information linking the city’s data to individual teachers. With more than 90 columns in the Excel sheet the city has developed and more than 17,000 rows, one for each report issued over the reports’ two-year lifespan, the release might well enable us to examine the city’s value-added experiment in new ways.

Value-added measures certainly aren’t going away. City officials stopped producing Teacher Data Reports only because they knew the State Education Department was preparing its own. The state’s measures, which are expected to come out in 2013, will make up 25 percent of the evaluation for teachers of math and English in tested grades.

that was weird

The D.C. school system had a pitch-perfect response after John Oliver made #DCPublicSchools trend on Twitter

Public education got some unexpected attention Sunday night when John Oliver asked viewers watching the Emmys to make #DCPublicSchools trend on Twitter.

Oliver had been inspired by comedian Dave Chappelle, who shouted out the school system he attended before he announced an award winner. Within a minute of Oliver’s request, the hashtag was officially trending.

Most of the tweets had nothing to do with schools in Washington, D.C.

Here are a few that did, starting with this pitch-perfect one from the official D.C. Public Schools account:

Oliver’s surreal challenge was far from the first time that the late-night host has made education a centerpiece of his comedy. Over time, he has pilloried standardized testing, school segregation, and charter schools.

Nor was it the first education hashtag to take center stage at an awards show: #PublicSchoolProud, which emerged as a response to new U.S. Education Secretary Betsy DeVos, got a shoutout during the Oscars in February.

And it also is not the first time this year that D.C. schools have gotten a surprise burst of attention. The Oscars were just a week after DeVos drew fire for criticizing the teachers she met during her first school visit as secretary — to a D.C. public school.

Startup Support

Diverse charter schools in New York City to get boost from Walton money

PHOTO: John Bartelstone
Students at Brooklyn Prospect Charter School in 2012. The school is one of several New York City charters that aim to enroll diverse student bodies.

The Walton Family Foundation, the philanthropy governed by the family behind Walmart, pledged Tuesday to invest $2.2 million over the next two years in new charter schools in New York City that aim to be socioeconomically diverse.

Officials from the foundation expect the initiative to support the start of about seven mixed-income charter schools, which will be able to use the money to pay for anything from building space to teachers to technology.

The effort reflects a growing interest in New York and beyond in establishing charter schools that enroll students from a mix of backgrounds, which research suggests can benefit students and is considered one remedy to school segregation.

“We are excited to help educators and leaders on the front lines of solving one of today’s most pressing education challenges,” Marc Sternberg, the foundation’s K-12 education director and a former New York City education department official, said in a statement.

Walton has been a major charter school backer, pouring more than $407 million into hundreds of those schools over the past two decades. In New York, the foundation has helped fund more than 100 new charter schools. (Walton also supports Chalkbeat; read about our funding here.)

Some studies have found that black and Hispanic students in charter schools are more likely to attend predominantly nonwhite schools than their peers in traditional schools, partly because charter schools tend to be located in urban areas and are often established specifically to serve low-income students of color. In New York City, one report found that 90 percent of charter schools in 2010 were “intensely segregated,” meaning fewer than 10 percent of their students were white.

More recently, however, a small but rising number of charter schools have started to take steps to recruit and enroll a more diverse student body. Often, they do this by drawing in applicants from larger geographic areas than traditional schools can and by adjusting their admissions lotteries to reserve seats for particular groups, such as low-income students or residents of nearby housing projects.

Founded in 2014, the national Diverse Charter Schools Coalition now includes more than 100 schools in more than a dozen states. Nine New York City charter groups are part of the coalition, ranging from individual schools like Community Roots Charter School in Brooklyn to larger networks, including six Success Academy schools.

“There’s been a real shift in the charter school movement to think about how they address the issue of segregation,” said Halley Potter, a senior fellow at the Century Foundation, a think tank that promotes socioeconomic diversity.

The Century Foundation and researchers at Teachers College at Columbia University and Temple University will receive additional funding from Walton to study diverse charter schools, with the universities’ researchers conducting what Walton says is the first peer-reviewed study of those schools’ impact on student learning.