Building a test worth teaching to

“Believing we can improve schooling with more tests,” Robert
Schaeffer of FairTest once argued, “is like believing you can make
yourself grow taller by measuring your height.”

It’s a great line. Such statements are the seductive battle cries of
the anti-standards and anti-assessment crowd. But is there any reason
behind this kind of rhetoric?

Parents rarely complain that their young babies are being weighed
and measured too much—even though it can create an extra burden in an
often stressful time in their lives. That’s not because parents naively
believe these basic tests will make their babies grow faster or
taller, but rather because they trust that their doctor will use the
data from these and other tests to flag early problems and develop
individualized plans to help their children thrive.

Of course, education assessments—particularly end-of-year summative
assessments—are far more complicated than scales. But the purpose of
tests in school is no different: to flag problems early and often so
that they can be addressed before they become lifelong issues.

In education, as in medicine, there are unintended consequences to
relying on a limited number of tests in a narrow range of subjects.
According to a report released by Common Core last week, 76 percent of
teachers feel that
critical subjects like science, history, and art are being “crowded out
by extra attention being paid to math and language arts,” and 93
percent of those teachers believe that this crowding is a direct result
of the state testing regimes that focus almost exclusively on reading
and math.

But, too frequently, people see these unintended consequences and
seek to throw the baby out with the bathwater—they argue that we should
abandon standards- and assessment-driven reform because our current
experiment has so far fallen short. That is a mistake. In the end, our
biggest problem isn’t that we test students too often, but rather that
the quality and scope of the tests we administer year in and year out are
inadequate.

A quick scan of the battery of released reading tests on state
websites reveals a distressing array of inane reading passages and
low-quality questions that promote exactly the kind of instruction we
want to avoid. In reading, for instance, rather than selecting passages
for their word length and asking students to make rather empty “text to
self” connections, why not select passages based on their literary
merit and ask them to analyze the author’s actual words? Or to defend a
text-dependent thesis statement? And why not focus informational
passages on important and grade-appropriate history and science
content—content that our education standards already ask students to
master and that, if we held students accountable for knowing, teachers
might spend more time teaching?

The reason is simple: too many states have low-quality assessments because
too few states (if any) make getting assessment right a top priority.
States spend a comparatively minuscule amount of their budgets on
assessment. In Ohio, for instance, a back-of-the-envelope calculation
reveals that assessment accounts for a mere 0.7 percent of the state’s
total education spending. (In other states, I’m sure the figure is
similar.) We pay for a household scale, but we want the diagnostic
functionality of an MRI.

And yet we all know that, in order for standards to gain traction in
the classroom and drive the kind of educational change that reformers
on all sides of the debate want to see, teachers must have access to
useful and reliable achievement data gathered through sophisticated
assessments. They must be able to diagnose where individual students are
struggling so that they can target extra help, and they need to be
able to identify where the class is struggling so they know when to
move on and what to focus on if the group isn’t yet ready.

And so our challenge is not to abandon testing and hope for the
best. Encouraging teachers to stop “teaching to the test” makes about
as much sense as encouraging doctors to stop “treating to the
diagnostic.” The two are—and should be—linked. Instead, our challenge
is to develop a test on which only students with deep content mastery
can succeed. In short, we simply must develop a test worth teaching to.

