Avoid fruit salad

What will the Education Department’s current $350 million competition to develop new multi-state assessments actually yield? One new “national test”? Two? A bunch? Will they be any good? Will they yield the information that America needs and that many educators and parents crave?

Answers to many such questions will not be known for years. But one key question will get answered, for better or worse, by September 30: How many different test-development initiatives will Secretary Duncan agree to pay for?

And here’s the answer to dread: just one. The number of potential applicants has been falling faster than volcanic ash over Europe and a scary rumor says those that remain may yet wrap themselves into a single project before the end-of-June application deadline.

Education Department spokespersons have recently talked of funding “at least two” such consortia. Not many weeks ago, however, the field was buzzing with talk that six or more consortia were forming for this purpose. At a big ETS-sponsored conference last month, four new assessment “models” were displayed before an audience of state and district leaders. It was, in effect, an audition, after which states were expected to sign up for the version they liked best, or maybe more than one.

Today, however, nine weeks before proposals are due at ED, there are clear signs that, as in the airline industry, consolidation is underway and perhaps just two consortia will even apply (not counting a separate high-school-only assessment development that is widely believed to be a de facto “set-aside” for Marc Tucker’s National Center on Education and the Economy). The National Governors Association and Council of Chief State School Officers, which are developing the new “common core” standards with which all the new assessments are supposed to be aligned, recently issued a paper describing the two groups. One, the “Partnership for Assessment of Readiness for College and Careers,” is led by Florida, Massachusetts, and Louisiana, and will be managed by Achieve. The other, the “Smarter Balanced Consortium,” is led by West Virginia, Nebraska, and Oregon, among others, with technical assistance from Stanford’s Linda Darling-Hammond. A few days ago, however, an astute Education Week follower of this topic blogged that perhaps the two would wind up joining forces and only one single solitary hegemonic consortium would seek Duncan’s money (plus Tucker, of course).

That is an outcome devoutly to be shunned. Also to be avoided is a grant competition that yields but a single winner. For while the United States will be well served in 2010 by a single set of “common” academic standards—provided, of course, that these are voluntary, independent of federal control, and possessed of solid, rigorous content—it would be an enormous mistake to slice and dice and recombine the various assessment schemes into a single fruit salad. What the country needs, and what states need, at least for now, is a clear choice between apples and oranges. Indeed, we would favor kumquats, too, and lament the apparent disappearance of an innovative plan sketched (at that ETS conference) by the University of Pittsburgh’s Lauren Resnick and Wireless Generation’s Larry Berger.

As much as the Education Department wishes this weren’t so and has tried in its eighty-plus page grant competition to describe a delectable fruit salad, the fact is that not all assessments can serve all worthwhile ends. In particular, the testing world has long distinguished between assessments meant to inform teachers, improve instruction, and provide useful mid-course feedback to schools and educators—these are often called “formative”—and the kind that are used primarily for accountability purposes, i.e., to gauge what has and hasn’t been learned, usually at year’s end, usually at the individual, classroom, school, district, and state levels. (The latter are typically termed “summative.”)

Oversimplifying, those who emphasize the former—formative—approach tend also to be much taken with “performance” tasks, student portfolios and such, preferably designed and evaluated by teachers, while those more interested in accountability are more apt to craft relatively traditional tests, speedily and inexpensively administered, quickly and objectively scored. Again oversimplifying, the nascent Florida-Achieve “partnership” appears to be squatting in the latter camp while the fledgling West Virginia-Darling-Hammond group is pitching its tent in the former.

Both approaches have value and each of the two consortia has appealing elements in its preliminary design. But they simply aren’t the same thing and neither will—or can, or should be expected to—serve every worthy purpose in the realm of assessment. Although Secretary Duncan’s team understandably dreams of a single, tidy, new national assessment system that serves both formative and summative purposes, any new system also needs to be valid, reliable, affordable, manageable, and a bunch of other things. It’s folly to suppose that dreaming about a single combined approach will make it real. And everything we know about assessments gives us pause regarding the feasibility of such a hybrid.

Far better to let multiple teams—preferably more than two—develop different designs, each with its own integrity and viability. Then let states select the approach that suits them. Let Congress decide how to revise NCLB/ESEA to take account of these new assessments. Let NAEP continue to serve as external auditor—and comparer—of states’ performance, regardless of what they use for their own tests. Let’s see what works. Let’s encourage some experimentation. Let’s allow pears to be separate from mangoes. Let’s avoid fruit salad.

Then there is the awkward, sensitive matter of Professor Darling-Hammond and her earnest, deep, and well-reasoned view that assessment is mainly for the purpose of improving instruction, not for tallying, ranking, evaluating, judging, or scoring. Indeed, she seems none too interested in external standards or in aligning assessments with them. Were there to be but one single new assessment, and were she to play a leadership role in that project (as we must assume would happen), the assessment that results would not likely foster either stronger student achievement or better results-based accountability.

One clue: The Palo Alto charter school most prominently identified with Ms. Darling-Hammond (and, indeed, with the Stanford ed school) has produced abysmal results on the current California state test. So abysmal that its charter is not being renewed. We suspect that this school is a cheerful place full of eager teachers, plenty of portfolios, performances and formative feedback, maybe even contented, fulfilled students. The problem is that they’re not learning what the state of California, in its wisdom, has said they ought to learn.

In a matter of months, the “Common Core” state standards initiative is going to finalize its version of what young Americans ought to learn. We don’t know what the final edition will look like or whether we’ll like it. But the March draft was good, better than I for one expected, and I foresee many states concluding—round two Race to the Top applicants are supposed to conclude this by August—that the “common core” standards are superior to what they’ve been using.

That’s a promising but risky move for American education, one far too important to compromise with a single, flawed new assessment system that tries to serve too many different purposes, that confuses peaches with pineapples, and that is designed in large measure by a respected academic who has many virtues but who also has a lackluster track record when it comes to kids actually learning.

Chester E. Finn, Jr.
Chester E. Finn, Jr. is a Distinguished Senior Fellow and President Emeritus at the Thomas B. Fordham Institute.