Standards, Testing & Accountability

If you care about state education policy and/or the new federal education law, you ought to spend some time doing three things. First, consider how the performance of schools (and networks of schools) needs to be assessed. Second, read the short Fordham report At the Helm, Not the Oar. Third, encourage your favorite state’s department of education to undertake an organizational strategic planning process.

All three are part of a single, important exercise: figuring out what role the state department of education must play in public schooling.

By now, everyone knows that ESSA returns to states the authority to create K–12 accountability systems. So it’s worth giving some thought to what, exactly, schools and districts should be held accountable for. What do we want them to actually accomplish?

But even if we get clear on the “what,” the “who” and “how” remain. Which entity or entities should be tasked with this work, and how should they go about it?

In At the Helm, which I co-wrote in 2014 with Juliet Squire, we argue that there are lots and lots of tasks handed to state departments of education (also known as state education agencies, or “SEAs”) that could be better handled elsewhere....

Joanne Weiss

On February 2, I had the privilege of being a judge for the Fordham Institute’s ESSA Accountability Design Competition. It’s widely known that I’m a fan of using competition to drive policy innovation, and this competition did not disappoint. Fordham received a stunning array of proposals from teachers, students, state leaders, and policy makers.

But before we turn to the insights buried in these pages, I want to praise the competition’s conception, which modeled the process that states should follow as they design their own accountability systems. Contestants explained how their proposed accountability systems would support a larger vision of educational success and spur desired actions. They laid out their design principles: attributes like simplicity, precision, fairness, and clarity. They defined the indicators that should therefore be tracked, and they explained how those indicators would roll up into ratings of school quality. Finally, they laid out how each rating would be used to inform or determine consequences for schools. All decisions were explained in terms of how they would advance the larger vision.
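To make that roll-up step concrete, here’s a minimal sketch in Python. The indicators, weights, and letter-grade cut scores are hypothetical, invented for illustration rather than drawn from any contestant’s entry:

```python
# Hypothetical roll-up of accountability indicators into a school rating.
# Indicator names, weights, and cut scores are illustrative only.

INDICATOR_WEIGHTS = {
    "proficiency_rate": 0.40,   # share of students meeting the standard (0-100)
    "growth_percentile": 0.40,  # median student growth percentile (0-100)
    "graduation_rate": 0.20,    # four-year cohort graduation rate (0-100)
}

RATING_CUTS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]  # below 60 -> "F"


def composite_score(indicators: dict) -> float:
    """Weighted average of the 0-100 indicator scores."""
    return sum(INDICATOR_WEIGHTS[name] * value for name, value in indicators.items())


def school_rating(indicators: dict) -> str:
    """Map the composite score onto a letter grade."""
    score = composite_score(indicators)
    for cut, letter in RATING_CUTS:
        if score >= cut:
            return letter
    return "F"


# Example: 0.4*72 + 0.4*85 + 0.2*90 = 80.8, which earns a "B".
print(school_rating({"proficiency_rate": 72, "growth_percentile": 85, "graduation_rate": 90}))
```

The design questions the contestants wrestled with live in exactly these choices: which indicators to include, how heavily to weight each one, and where to set the cuts.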

Together, these proposals represent a variety of both practical and philosophical approaches to accountability system design. Here are the five major themes I found most noteworthy.

1. The...

Michael Hansen

I walked away from Fordham’s School Accountability Design Competition last Tuesday pleasantly surprised—not only at the variety of fresh thinking on accountability, but also at how few submissions actually triggered the “I think that’s illegal” response. I left encouraged at the possibilities for the future.

The problem of one system for multiple users

Having done some prior work on school accountability and turnaround, I took great interest in the designs that came out of this competition and how they solved what I’m going to call the “one-system-multiple-user” problem. Though the old generation of systems had many drawbacks, I see this particular problem as their greatest flaw and the area where states will most likely repeat the mistakes of the past.

Basically, the one-system-multiple-user problem is this: The accountability design is built with a specific objective in mind (school accountability to monitor performance for targeted interventions) for a single user (the state education office); but the introduction of public accountability ratings induces other users (parents, teachers, district leaders, homebuyers, etc.) to use the same common rating system. The trouble is that not all user groups have the same objective; indeed, we expect them to have different purposes in...

The Thomas B. Fordham Institute has been evaluating the quality of state academic standards for nearly twenty years. Our very first study, published in the summer of 1997, was an appraisal of state English standards by Sandra Stotsky. Over the last two decades, we’ve regularly reviewed and reported on the quality of state K–12 standards for mathematics, science, U.S. history, world history, English language arts, and geography, as well as the Common Core, International Baccalaureate, Advanced Placement, and other influential standards and frameworks (such as those used by PISA, TIMSS, and NAEP). In fact, evaluating academic standards is probably what we’re best known for.

For most of the last two decades, we’ve also dreamed of evaluating the tests linked to those standards—mindful, of course, that in most places, the tests are the real standards. They’re what schools (and sometimes teachers and students) are held accountable for, and they tend to drive curricula and instruction. (That’s probably the reason why we and other analysts have never been able to demonstrate a close relationship between the quality of standards per se and changes in student achievement.) We wanted to know how well matched the assessments were to the standards, whether they were of high...

New York State education officials raised a ruckus two weeks ago when they announced that annual statewide reading and math tests, administered in grades 3–8, would no longer be timed. The New York Post quickly blasted the move as “lunacy” in an editorial. “Nowhere in the world do standardized exams come without time limits,” the paper thundered. “Without time limits, they’re a far less accurate measure.” Eva S. Moskowitz, founder of the Success Academy charter schools, had a similar reaction. “I don’t even know how you administer a test like that,” she told the New York Times.

I’ll confess that my initial reaction was not very different. Intuitively, testing conditions would seem to have a direct impact on validity. If you test Usain Bolt and me on our ability to run one hundred meters, I might finish faster if I’m on flat ground and the world record holder is forced to run up a very steep incline. But that doesn’t make me Usain Bolt’s equal. By abolishing time limits, it seemed New York was seeking to game the results, giving every student a “special education accommodation” with extended time for testing. 

But after reading the research and talking to leading psychometricians, I’ve concluded that both...

The assessments edition

In this week's podcast, Mike Petrilli and Robert Pondiscio preview Fordham’s long-awaited assessments evaluation, analyze low-income families’ education-related tech purchases, and wave the red flag about TFA’s lurch to the Left. In the Research Minute, David Griffith examines how well the nation’s largest school districts promote parent choice and competition between schools.

Research Minute

Grover (Russ) J. Whitehurst, "Education Choice and Competition Index 2015," Brookings (February 2016).

Last week, we cautioned that Ohio’s opt-out bill (HB 420) offers a perverse incentive for districts and schools to game the accountability system. The bill has since been amended, but it is no closer to addressing the larger issues Ohio faces as it determines how best to maintain accountability in response to the opt-out movement. 

Current law dings schools and districts when a student skips the exam by assigning a zero for that student when calculating the school’s overall score (opting out directly impacts two of ten report card measures). The original version of HB 420 removed those penalties entirely. Instead of earning a zero, absent students would simply not count against the school. Realizing the potential unintended consequences under such a scenario, including the possible counseling out of low-achieving students and larger numbers of opt-outs overall, the drafters of the substitute bill incorporated two changes.

First, the amended version requires the Ohio Department of Education to assign two separate Performance Index (PI) grades for schools and districts for the 2014–15 school year—one reflecting the scores of all students required to take exams (including those who opt out) and another excluding students who didn’t participate. Second, in...
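The difference between the two calculations is easy to see in miniature. Here’s a sketch in Python with invented scores; Ohio’s real Performance Index assigns weighted points across several achievement levels, which this simplifies to one 0–100 score per student:

```python
# Simplified illustration of the two Performance Index calculations.
# Each tested student gets a 0-100 score; None marks a student who opted out.
scores = [82, 75, None, 91, None, 68, 88]  # hypothetical class

# Current law: an opt-out counts as a zero but stays in the denominator.
pi_with_zeros = sum(s if s is not None else 0 for s in scores) / len(scores)

# Amended approach (the second PI grade): opt-outs are excluded entirely.
tested = [s for s in scores if s is not None]
pi_excluding = sum(tested) / len(tested)

print(f"PI counting opt-outs as zeros: {pi_with_zeros:.1f}")  # 57.7
print(f"PI excluding opt-outs:         {pi_excluding:.1f}")   # 80.8
```

With just two opt-outs in a class of seven, the gap between the two numbers is large, which is exactly why the choice of method creates such different incentives for schools.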

Following in the footsteps of a previous study, CAP researchers have examined the effects of a state’s commitment to standards-based reform (as measured by clear standards, tests aligned to those standards, and whether a state sanctions low-performing schools) on low-income students’ test scores (reading and math achievement on the NAEP from 2003 to 2013). The results indicate that jurisdictions ranked highest in commitment to standards-based reform (e.g., Massachusetts, Florida, Tennessee, the District of Columbia) show stronger gains on NAEP scores for their low-income students. The relationship holds at the other end of the scale as well: low-income students in Iowa, Kansas, Idaho, Montana, North Dakota, and South Dakota, the states ranked lowest in commitment, saw weaker gains.
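The comparison CAP is making is straightforward to sketch. Here’s a toy version in Python; the state labels and gain figures are invented for illustration and are not CAP’s data:

```python
# Toy comparison of 2003-2013 NAEP gains for low-income students,
# grouped by a state's (hypothetical) commitment-to-reform tier.
gains_by_state = {
    "State A": ("high", 9.0), "State B": ("high", 7.5), "State C": ("high", 8.2),
    "State D": ("low", 1.5),  "State E": ("low", -0.5), "State F": ("low", 2.0),
}


def mean_gain(tier: str) -> float:
    """Average NAEP gain among states in the given commitment tier."""
    gains = [g for t, g in gains_by_state.values() if t == tier]
    return sum(gains) / len(gains)


print(f"High-commitment mean gain: {mean_gain('high'):.1f}")  # 8.2
print(f"Low-commitment mean gain:  {mean_gain('low'):.1f}")   # 1.0
```

Of course, a two-group comparison like this says nothing about causation, which is where the caveats below come in.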

As you can imagine, a lot of caveats go with the measure of commitment to standards-based reform. Checking the box for “implemented high standards” alone is likely to raise more questions than it answers. Beyond that, implementation, teaching, and assessment of standards are all difficult, if not impossible, to quantify. The authors acknowledge that some of their evidence is “anecdotal and impressionistic,” though that concession applies to the “commitment to standards” piece. They are four-square behind NAEP scores as a touchstone of academic success or lack...

  • If you ask a thoughtful question, you may be pleased to receive a smart and germane answer. If you post that question in your widely read newspaper column on education, you’ll sometimes be greeted with such a torrent of spontaneous engagement that you have to write a second column. That’s what happened to the Washington Post’s Jay Mathews, who asked his readers in December to email him their impressions of Common Core and its new approach to math: Was it baffling them, or their kids, when they sat down to tackle an assignment together? He revealed some of the responses last week, and the thrust was decidedly in support of the new standards. “My first reaction to a Common Core worksheet was repulsion,” one mother wrote of her first grader’s homework. “I set that aside and learned how to do what [my son] was doing. And something magical happened: I started doing math better in my head.” The testimonials are an illuminating contribution to what has become a sticky subject over the last few months. Common Core advocates would be well advised to let parents know that their kids’ wonky-looking problem sets can be conquered after all.
  • Homework
  • ...

The eyes of the nation are fixed on a tournament of champions this week. Snacks have been prepared, eager spectators huddle around their screen of preference, and social media is primed to blow up. Veteran commentators have gathered at the scene to observe and pontificate. For the competitors, the event represents the culmination of months of dedicated effort, and sometimes entire careers; everything they’ve worked for, both at the college and professional level, has led up to this moment. The national scrutiny can be as daunting for grizzled journeymen as it is for fresh-faced greenhorns. You know what I’m talking about:

The Fordham Institute’s ESSA Accountability Design Competition.

Okay, you probably know what I’m talking about. If you inhabit the world of education policy, you took notice of Fordham’s January call for accountability system frameworks that would comply with the newly passed Every Student Succeeds Act, and take advantage of the new authority the law grants to states. With federal influence on local classrooms scaled back so suddenly, it will be up to education agencies in Wisconsin and Mississippi and Alaska to adopt their own methods of setting an agenda for schools and rating schools’ performance against it.

The purpose of...
