Putting Ohio's teacher evaluation reforms in a national context

The State Board of Education has just eight weeks left to develop a model framework for teacher evaluations that will be used or adapted by over 1,000 local education agencies (LEAs) by July of 2013. (Ohio’s biennial budget – HB 153 – stipulated that the Board come up with a model by December 31 of this year.) Skeletal requirements are spelled out in state law. Evaluations must: include measures of student growth (50 percent); be based on multiple measures; rate teachers according to four tiers of effectiveness (accomplished, proficient, developing, and ineffective); and inform other personnel decisions, particularly layoffs (strict seniority-based layoffs were struck from state law).

But what else will the model framework include, especially for that remaining – and some would argue more important – 50 percent of a teacher’s rating? To what degree will districts and charter schools need to enact a replica of the state’s forthcoming model, or something closely resembling it, instead of merely repackaging their current systems? And how will teacher evaluations impact other key personnel decisions, if at all? Although the legislation clearly spells out a handful of requirements for Ohio’s new teacher evaluations, the answers to these questions aren’t as straightforward as one might think.

In Fordham’s analysis of Ohio’s education legislation from the first half of 2011 (primarily the biennial budget, HB 153), we observed that when it comes to teacher evaluations, “the budget leaves many decisions to local districts.” Depending on whom you ask, this can be a recipe for watering down evaluations or a fact worth celebrating, in that it allows districts themselves to take ownership and drive meaningful reforms to teacher effectiveness.

Recent analyses from two well-respected national organizations put statewide teacher evaluations in a national context. In State of the States: Trends and Early Lessons on Teacher Evaluation and Effectiveness Policies, the National Council on Teacher Quality summarized the state of teacher effectiveness policies nationwide, zooming in on 17 states and the District of Columbia (including Ohio) that require student growth to be a predominant factor in teacher ratings. (Note: NCTQ also publishes a comprehensive annual yearbook on state teacher policies broadly that is worth checking out.)

In State of the States, Ohio was among just a handful of states in which teacher evaluation ratings aren’t directly tied to dismissal, certification, or tenure. That is, among the dozen and a half states under study, the vast majority have prescribed not only what shall comprise a rigorous teacher evaluation but also how it will factor into high-stakes decisions – whether rewards or sanctions.

Democrats for Education Reform’s Built to Succeed? Ranking New Statewide Teacher Evaluation Practices took it a step further and ranked states, attempting to measure “which of those [19 states plus DC] states’ laws are tough enough to withstand the challenges ahead and are most likely to succeed in increasing teacher quality.” DFER depicted Ohio as one of just a few states with “clear potential for weakening the evaluation process at the ground level,” ranking Ohio in the bottom third of states overall and lowest among states in the Midwest. When it comes to attaching real implications to poor ratings (dismissal, layoffs, placement, tenure, and compensation), Ohio earned only three points out of a possible 21. (In each strand, a state can earn zero to three points; there are 20 strands in total, in areas such as “strength of evaluation plan,” “employment implications,” etc.)

Fordham unapologetically supports accountability for educators, an end to LIFO (last in, first out) layoffs, and more performance-based decision making in schools generally. But it’s hard to escape the truth that no state – not even those heralded by DFER and NCTQ – has had teacher evaluations in place long enough to discern what impact they’ve had on student achievement, or whether they’ve changed teachers’ behavior or ensured better teachers for the kids who need them most. Lest this sound like backtracking on our beliefs around teacher effectiveness, let’s reiterate: Ohio must craft and implement more meaningful evaluations for teachers that differentiate for quality.

But perhaps designing that system and collecting data on teacher effectiveness (and sharing that data in a transparent but low-stakes way) should be priority number one. Once the state has several years of data, a list of robust and meaningful assessments from which to draw that data (which the State Board of Education has been tasked with devising, but which doesn’t exist yet), and enough time for educators to observe for themselves that the data make intuitive sense (namely, that they accurately gauge their own abilities and those of their colleagues) – then the state can worry about tying meaningful rewards and sanctions to those evaluation ratings. Or possibly by that point, districts themselves will be motivated to do so on their own.

It’s understandable to want to improve teacher quality by pulling all levers at once. DC Public Schools and Harrison School District 2 have done phenomenal work in developing rigorous teacher evaluations that, while not perfect, are worlds better than previous systems and that educators on the whole seem to buy into. But we should keep in mind that these reforms happened at the local level; DCPS has 44,900 students and HSD 2 has just over 10,000, a drop in the bucket compared to Ohio’s 1.8 million students. Creating such a system for an entire state, let alone a state as large and diverse as Ohio, requires a lot more work (and a lot more cooperation and buy-in from schools and the people working in them).

The danger in forging full-speed ahead and tying all personnel decisions – layoffs, transfers, pay, certification, tenure, etc. – to evaluation systems before we’ve seen the accuracy of those systems is obvious: it threatens credibility and could foster hostility from districts and schools. Worse, by tying high-stakes rewards and consequences to an evaluation system that doesn’t exist yet, we risk creating powerful incentives for principals to inflate scores instead of honestly evaluating teachers and providing them with meaningful feedback to improve.

This isn’t to say that high-stakes decisions shouldn’t eventually be directly connected to teachers’ ratings. But while Ohio ranks low on DFER’s likelihood-of-watering-down scale, it also ranks low in terms of the risk of badly botching implementation, unfairly dismissing teachers, or creating hostility. That may sound unnecessarily risk-averse, but caution in this realm may end up producing a better – and, more importantly, more sustainable – outcome for Ohio’s teachers, schools, and students.

Jamie Davies O'Leary
Jamie Davies O'Leary is a Senior Ohio Policy Analyst at the Thomas B. Fordham Institute. She works with a coalition of high-performing Ohio charter school networks, facilitating their advocacy efforts and providing research and technical assistance.