Standards, Testing & Accountability

The ESEA reauthorization conferees delivered some good news for America’s high-achieving students last week. Absent further amendment, the Every Student Succeeds Act (ESSA) will include a necessary and long-overdue provision that allows states to use computer-adaptive tests to assess students on content above their current grade level. That’s truly excellent news for kids who are “above grade level”—and for their parents, teachers, and schools.


The quality of state assessments matters enormously to children of all ability levels, but today’s tests do a grave disservice to high achievers. Most current assessments do a lousy job of measuring academic growth among pupils who are well above grade level because they don’t contain enough “hard” questions to measure achievement reliably at the high end.

Including enough such questions on paper-and-pencil tests would mean really long testing periods. But a major culprit is an NCLB provision requiring all students to take the “same tests” and (at least as interpreted during both the George W. Bush and Barack Obama presidencies) barring from those tests material that’s significantly above or below the students’ formal grade levels. Though the intentions behind this decision were honorable—to...
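The measurement argument above can be made concrete with a toy sketch. This is an illustration only, not any real testing program's algorithm: the function names are invented, and a deliberately simplistic deterministic response model ("a student answers correctly if and only if the item is at or below their ability") stands in for the probabilistic item-response models actual adaptive tests use.

```python
# Toy illustration only: not any real testing program's algorithm.
# Difficulty and ability are measured in rough grade-level units.

def adaptive_estimate(ability, lo=1.0, hi=12.0, items=7):
    """Binary-search item selection: each item sits at the midpoint of
    the remaining difficulty range. A correct answer raises the floor;
    a miss lowers the ceiling. Returns the final ability estimate."""
    for _ in range(items):
        difficulty = (lo + hi) / 2
        if ability >= difficulty:  # deterministic toy response model
            lo = difficulty
        else:
            hi = difficulty
    return (lo + hi) / 2

def fixed_form_estimate(ability, grade_cap=4.0):
    """A fixed form capped at grade-level difficulty can only report
    'at or above the cap' for stronger students."""
    return min(ability, grade_cap)

# A fourth grader working at a ninth-grade level:
print(adaptive_estimate(9.0))    # lands near 9 after seven items
print(fixed_form_estimate(9.0))  # ceilinged at 4.0
```

Seven adaptive items pin the estimate to within about half a grade level, while a fixed fourth-grade form cannot distinguish this student from one working exactly at grade level: the "not enough hard questions" problem in miniature.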

Gary Kaplan

The Massachusetts Board of Elementary and Secondary Education wisely decided this week to tack between the Scylla of MCAS and the Charybdis of PARCC. Following Commissioner Mitchell Chester’s recommendation, it chose to adopt MCAS 2.0, a yet-to-be-developed hybrid of the two options. This adroit navigation calms the troubled waters for the time being. But choosing a test is only the beginning of the voyage. Strong and sustained tailwinds will be needed to swell the sails of student achievement.

A test is a measuring instrument. It shows where a student needs to improve, but it doesn’t provide instructional strategies and tools to achieve that improvement. Even without a new test, current state, local, and national assessments already generate more data than anyone can digest.   

Assessment data should directly drive instruction, and the instruction should be individualized to the student. This is the intent. But data-driven, individualized instruction can only take place online. Teachers can’t cut and paste textbooks—but software can be customized with a keystroke. Still, very few schools have the computers and software to support individualized online instruction.

MCAS 2.0 can be an effective driver of instruction if the state invests in a computer for every student (along with the...

Any baseball team finding itself down 3-0 in a seven-game series points to the 2004 Boston Red Sox. Despite the longest of odds—they hadn’t won a World Series in eighty-six years! Their Bronx nemeses had them down!—they staged a miraculous comeback, winning four games straight.

Now, any on-the-brink team getting peppered by reporters’ questions can point to the Sox. “Yes, we’re down big,” they can say. “Sure, things haven’t gone as we wanted. But it can be done! Just give it time! The Red Sox did it!”

Of course, what these teams fail to mention is that the thirty-two other times an MLB team went down 3-0, that team lost the series. Worse, in the 110 instances in which an NBA team went down 3-0, that team always lost the series.

In other words, past poor performance predicts prospective performance.

But Secretary of Education Arne Duncan is undaunted. True to the administration’s messianic approach to policymaking, he sought yesterday to defy history. Presumably wearing a Johnny Damon jersey under his suit, the secretary traveled to the home of the Red Sox to rally-cap the legacy of his signature initiatives.

I tip my own cap to his PR team. The choice of Boston for this...

When is a test not a test? Sure, there’s an easy answer—“When it doesn’t send opt-out parents running for their torches and pitchforks”—but that’s not what we’re looking for. Give up? It’s when the test is a “locally driven performance assessment.” An article in Education Week explains the rise of these specially designed student tasks in eight New Hampshire school districts, which have been granted authority by the Education Department to employ them as alternatives to standardized tests. The districts will work with the state and one another to develop Performance Assessment of Competency Education (PACE), a series of individual and group tasks that allow students to exhibit mastery of a subject without filling in bubbles. The challenges (which include the design of a forty-five-thousand-cubic-foot water tower to show proficiency in geometry) sound stimulating, and the Granite State’s record in competency-based education is extensive. It’s not hard to see why such an option would be attractive to state and local officials, especially when testing has become roughly as popular there as a leaf-peeping tax. What remains to be seen is whether this approach to assessment captures the same vital data as traditional measures.

Of course, some folks...

A small storm has blown up around the fact that certain math items on the 2015 National Assessment of Educational Progress (NAEP) do not align with what fourth and eighth graders are actually being taught in a few states—mainly places attempting to implement the Common Core State Standards within their schools’ curricula.

NAEP is only administered in grades four, eight, and twelve. So the specific issue is whether the fourth graders who sat for NAEP this spring had a reasonable opportunity to learn the skills, algorithms, techniques—broadly speaking, “the content”—on that test. If their state standards had moved some portion of what used to be fourth-grade math to the fifth or sixth grade, or replaced it with something else entirely, their state’s NAEP scores would likely be lower.

This kind of misalignment is blamed for some of the math declines that NAEP recently reported. Department officials in Maryland, for example, examined the NAEP math sub-scores and determined that many Maryland fourth graders are no longer being taught some of those things before they take the test.

We are left to wonder: Should NAEP frameworks and assessments be updated to reflect what’s in...

I spent a few hours digging into the recently released 2015 NAEP Trial Urban District Assessment (TUDA) data. The results didn’t get much media coverage. That’s a shame because these are the best assessments for understanding student performance in (and comparing the results of) America’s biggest urban districts.

It’s a treasure trove of information, and it tells hundreds of stories. I encourage you to get into the numbers and see what pops for you.

I tried to condense my big takeaways into six headlines and images.

1. We’ve been trying to improve urban districts for half a century. These are the results. No district is able to get even one in five black kids up to proficiency in eighth-grade math or reading.

2. Across the participating districts, there has been meager progress in both subjects and both grades for more than a decade.

3. A few districts, however, have made gains over time, most notably Atlanta, Chicago, D.C., and Los Angeles. They deserve credit.

4. Instances of progress deserve attention because progress is not guaranteed. For example,...

Last week, in the wake of President Obama’s pledge to reduce the amount of time students spend taking tests, my colleagues Robert Pondiscio and Michael Petrilli weighed in with dueling stances on the current state of testing and accountability in America’s schools. Both made valid points, but neither got it exactly right, so let me add a few points to the conversation.

Like Robert, I don’t see how we can improve our schools if we don’t know how they’re doing, which means we need the data we get from standardized tests. But I also believe that—because we’re obligated to intervene when kids aren’t getting the education they deserve—some tests must inevitably be “high-stakes.” The only real alternative to this is an unregulated market, which experience suggests is a bad idea.

Must this logic condemn our children to eternal test-preparation purgatory? I hope not, but I confess to some degree of doubt. The challenge is creating an accountability system that doesn’t inadvertently encourage gaming or bad teaching. Yet some recent policy shifts seem to have moved us further away from that kind of system.

As Mike noted, the problem of over-testing has been exacerbated in recent years by the...

A new study by the NAEP Validity Studies Panel analyzes the alignment of the assessment’s 2015 math items (the actual test questions) for grades four and eight to the math Common Core State Standards (CCSS).

To do so, the panel enlisted as reviewers eighteen mathematicians, teachers, math educators, and supervisors who have familiarity with Common Core. This group classified all 150 items in the 2015 NAEP math pool for each grade as either matching a CCSS standard or not.

The reviewers determined that the Common Core and NAEP were reasonably aligned at both grade levels—not surprising, since CCSS writers had the NAEP frameworks at their disposal. Further, NAEP is by design broader than the CCSS and is supposed to maintain a degree of independence from “current fashions in instruction and curriculum.”

Panelists found that 79 percent of NAEP items matched content that appears in the CCSS at or below grade four. The overall alignment of NAEP to CCSS standards at or below grade eight is even closer: 87 percent.

There is, however, variation in matches across content areas. In fourth grade, the least aligned content area was data analysis, statistics,...

Late last night, results were released from the National Assessment of Educational Progress (NAEP)—an exam that is widely considered the best domestic gauge of student achievement. NAEP is administered in each state, every two years, to a representative sample of fourth- and eighth-grade students in reading and math. With its rigorous content and stringent standards for proficiency, NAEP provides a clear and honest view of student achievement in Ohio and across the nation.

The bottom line from these test results is that too many Buckeye children are struggling to meet rigorous academic goals. The NAEP results for 2015 show just 45 and 37 percent of fourth graders are proficient in math and reading, respectively. In eighth grade, only 36 percent of youngsters are proficient on each of the assessments. Relative to national averages, Ohio students achieve at somewhat higher levels—though some of that is due to the state’s favorable demographics vis-à-vis poorer states. Yet their performance still trails well behind the top-performing states in the nation, such as Massachusetts and New Jersey. Compared to 2013—the last round of NAEP testing in these grades and subjects—student proficiency in Ohio was slightly lower (as were the national averages).

The following...

OK, everyone, back away from the ledge. With the release of NAEP data this week, the predictable deluge of commentary is well underway—mainly of the gnashing-of-teeth, rending-of-garments variety. NAEP may be the nation’s report card, but it is also the nation’s Rorschach test. Perception is in the eye of the beholder, and many see darkness and misery: “A Decade of Academic Progress Halts,” says the Los Angeles Times. “Student Scores in Reading and Math Drop,” says U.S. News & World Report.

One of the frequent criticisms of NAEP punditry is “misNAEPery”—the sin of attributing score fluctuations to particular policies, for example. One particularly virulent form of this fallacy—failure to account for demographic changes in states over time—has become slightly less tenable this week, courtesy of this illuminating analysis by Matthew Chingos of the Urban Institute.

Not every state is the same. States with higher concentrations of black and Hispanic children, low-income families, and English language learners (ELLs) have a harder time rising to the top because they have more students mired at the bottom. But when you adjust for these demographic realities, a different NAEP emerges. There’s Massachusetts, still sitting pretty atop the tables. But Texas and...