Let's tell the truth: High-stakes tests damage reading instruction

Without a doubt, and in the main, testing has done more good than harm in America’s schools. My Fordham colleague Andy Smarick is absolutely correct to argue that annual testing “makes clear that every student matters.” The sunshine created by testing every child, every year has been a splendid disinfectant. There can be no reasonable doubt that testing has created momentum for positive change—particularly in schools that serve our neediest and most neglected children.

But it’s long past time to acknowledge that reading tests—especially tests with stakes for individual teachers attached to them—do more harm than good. A good test or accountability scheme encourages good instructional practice. Reading tests do the opposite. They encourage poor practice, waste instructional time, and materially damage reading achievement, especially for our most vulnerable children. Here’s why:

A test can tell you whether a student has learned to add unlike fractions, can determine the hypotenuse of a triangle, or understands the causes of the Civil War—and, by reasonable extension, whether I did a good or poor job as a teacher imparting those skills and content. But reading comprehension is not a skill or a body of content that can be taught. The annual reading tests we administer to children through eighth grade are de facto tests of background knowledge and vocabulary. Moreover, they are not “instructionally sensitive.” Success or failure can have little to do with what is taught.

A substantial body of research has consistently shown that reading comprehension relies on the reader knowing at least something about the topic he or she is reading about (and sometimes quite a lot). The effects of prior knowledge can be profound: Students who are ostensibly “poor” readers can suddenly comprehend quite well when reading about a subject they know a lot about—even outperforming “good” readers who lack background knowledge the “poor” readers possess. 

Reading tests, however, treat reading comprehension as a broad, generalized skill. To be clear: Decoding, the knowledge of letter-sound relationships that enables you to correctly pronounce written words, is a skill. This is why early instruction in phonics is important. But reading comprehension, the ability to make meaning from decoded words, is far more complex. It’s not a skill at all, yet we test it like one, and in doing so we compel teachers to teach it like one. Doing so means students lose.

Even our best schools serving low-income children (public, parochial, and charter alike) have a much harder time raising ELA (English language arts) scores than math scores. This is unsurprising. Math is school-based and hierarchical (there’s a logical progression of content to be taught). But reading comprehension is cumulative. The sum of your experiences, interests, and knowledge, both in and out of school, contributes to your ability to read with understanding. This is why affluent children who enjoy the benefit of educated parents, language-rich homes, and ample opportunities for growth and enrichment come to school primed to do well on reading tests, and why reading scores are hard to move.

Teacher quality plays a role, but note how fourth-grade NAEP math scores have risen over the years while reading has remained flat, even though the same teacher usually handles both subjects. This suggests that our teachers, when they know what to teach, are stronger than we think. In math, standards, curriculum and assessments are closely aligned (there’s no surprising content on math tests). By treating reading as a collection of content-neutral skills, we make reading tests a minefield for both kids and teachers.

The text passages on reading-comprehension tests are randomly chosen, usually divorced from any particular body of knowledge taught in school. New York State’s Common Core-aligned fifth-grade reading test earlier this year, for example, featured passages about BMX bike racing and sailing. The sixth-grade test featured a poem about “pit ponies,” horses and donkeys used in mines to pull carts of ore. Another passage described how loggerhead sea turtles navigate based on Earth’s magnetic field. That sounds more “school-based,” but in the absence of a common curriculum, there’s no guarantee that New York sixth-graders learned about sea turtles or Earth’s magnetic field from their sixth-grade teacher, from watching Magic School Bus, or (alas) at all. Students who had prior knowledge, whether from home, school, a weekend museum trip with their parents, or personal interest, had an advantage. This means the test was not “instructionally sensitive”: teacher input mattered little.

Certainly, test questions are “standards-based.” One question on the sea turtles passage measured students’ ability to determine the “central idea” of the text; another focused on their ability to “cite textual evidence to support analysis of what the text says explicitly as well as inferences drawn from the text” (Standard RI.6.1). Should students fail at this task, here’s the guidance the New York State Education Department offers teachers:

To help students succeed with questions measuring RI.6.1, instruction can focus on building students’ ability to comprehend grade-level complex texts and identifying specific, relevant evidence that supports an analysis of what the text says explicitly as well as inferences drawn from the text.

This is not bad advice, per se, but it’s unlikely to build reading ability. There’s simply no guarantee that practice in identifying specific, relevant evidence that supports inferences drawn from one text or topic will be helpful in another setting. Testing, especially with value-added measures attached, functionally requires teachers to waste precious time on low-yield activities (practicing inferring, finding the main idea, etc.), time that would be better spent building knowledge across subjects. We then hold teachers accountable when they follow that advice and fail, as they inevitably must. This is Kafkaesque.

Students who score well on reading tests are those who have a lot of prior knowledge about a wide range of subjects. This is precisely why Common Core calls for (but cannot impose) a curriculum that builds knowledge coherently and sequentially within and across grades. That’s the wellspring of mature reading comprehension—not “skills” like making inferences and finding the main idea that do not transfer from one knowledge domain to another.

As a practical matter, standards don’t drive classroom practice. Tests do. The first (and perhaps only) litmus test for any accountability scheme is, “Does this encourage the classroom practices we seek?” In the case of annual reading tests, with high stakes for kids and teachers, the answer is clearly “no.” Nothing in reading tests, either as currently conceived or as anticipated under Common Core, encourages schools or teachers to make urgently needed, long-term investments in coherent knowledge building from grade to grade that will drive language proficiency.

What could replace them? Options might include testing reading annually, but eliminating stakes; testing decoding up to grade four; or substituting subject-matter tests to encourage teaching across content areas. The best and most obvious solution would be curriculum-based tests with reading passages based on topics taught in school. But that would require a common curriculum—surely a nonstarter when mere standards in language arts are politically upsetting.

Annual testing “makes clear that the standards associated with every tested grade and subject matter,” Andy writes. Again, I agree wholeheartedly. But reading is not a subject. It’s a verb. It’s long past time to recognize that reading tests don’t measure what we think they do. 

Accountability is essential and non-negotiable, and testing works. Just not in reading.

Robert Pondiscio
Robert Pondiscio is a Senior Fellow and the Vice President for External Affairs at the Thomas B. Fordham Institute.