The evidence on test scores and long-term outcomes: Limited but encouraging

5.8.2018

For weeks now, I’ve been debating Patrick Wolf, Michael McShane, and Collin Hitt about the relationship between short-term test score changes and long-term student outcomes, like college enrollment and graduation. Most recently I proposed three hypotheses that those of us who support test-based accountability—for schools of choice and beyond—would embrace. Now let’s see how the evidence stacks up against them.

To be clear, this is a slightly different exercise from asking whether test-based accountability policies lead to stronger outcomes in terms of student achievement. That’s an important endeavor too, and studies like Thomas Dee and Brian Jacob’s evaluation of accountability systems under No Child Left Behind indicate that the answer is yes.

But that’s not quite what we’re after, because those studies show that holding schools accountable for raising test scores…results in higher test scores. What we want to know is whether higher test scores—or, more accurately, stronger test score growth—relates to better outcomes for students in the real world.

So let’s take it one hypothesis at a time.

1. Students who learn dramatically more at school, as measured by valid and reliable assessments, will go on to graduate from high school, enroll in and complete postsecondary education, and earn more as adults than similar peers who learn less.

You would think that there would be lots of studies looking at students’ learning gains in elementary or middle school and how that impacts their high school graduation or college enrollment rates. Yet to my knowledge none exist. (Academics: Let’s change that please!)

What we do have is the famous Raj Chetty et al. study examining teacher value-added, which found that students who learn more in elementary school earn more as adults. It’s just one study, but it’s a remarkable finding, one that might be hard to replicate unless more scholars can gain access to the tax data Chetty and his colleagues have.

2. Elementary and middle schools that dramatically boost the achievement of their students should also boost their long-term outcomes, including high school graduation, postsecondary enrollment, performance, and completion, as well as later earnings.

Here we have a bit more to go on, at least if we look at studies that examine both individual schools and programs that are focused at least in part on elementary or middle schools. Remember that we’re interested in schools or programs that make a significant impact on achievement, for good or ill. According to Hitt, McShane, and Wolf’s review, there are four of those. I will use their words to describe the results:

Harlem Promise Academies, which had “a positive and significant impact on math scores, a positive but an insignificant impact on high school graduation, and a positive but insignificant impact on college attendance.” (Also: Admitted females were 12.1 percentage points less likely to be pregnant in their teens, and males were 4.3 percentage points less likely to be incarcerated.)
“No Excuses” Charter Schools in Texas, which “produced significant gains in ELA and math scores and in high school graduation rates.” They also had “a small and statistically insignificant impact on earning,” according to the study itself.
Boston Charter Schools, which had “positive and significant effects on language arts, positive and significant impacts on math scores, negative but significant impacts on high school graduation rates, and positive but insignificant impacts on college attendance rates.” Here we have our first hint of a mismatch. However, the negative finding for graduation disappears if we look at five-year graduation rates, lending credence to the theory that the city’s no-excuses, high-expectations charter schools are making their students take more time to graduate, while boosting their achievement. Students were also more likely to enroll in four-year universities, where low-income students tend to earn credentials at higher rates.
Other Charter Schools in Texas, which “produced significant gains in high school graduation rates, despite having negative but significant impacts on ELA and math scores.”* ~~Here we have our first true mismatch~~. However, according to the study, attendance in these schools was also related to a decline in college enrollment rates and lower earnings as adults. In other words, this study actually bolsters the case for test-based accountability, while undermining the case for high school graduation rates.

So what to make of these studies? Positive and significant impacts on student achievement in the Harlem Promise Academies, Boston charter schools, and “no excuses” schools in Texas were related to positive but statistically insignificant impacts on high school graduation and/or college enrollment rates. The Harlem Promise Academies also reduced teen pregnancy (for girls) and incarceration rates (for boys); Boston charter schools had a positive impact on enrollment in four-year versus two-year colleges; and “no excuses” charters in Texas had a positive but insignificant impact on earnings. Meanwhile the other charter schools in Texas saw negative impacts on test scores and negative impacts on college enrollment and earnings.

Any fair reading of this research would acknowledge a strong relationship between test score impacts and long-term outcomes. If there is anything to worry about here, it is the disconnect between high school graduation and college enrollment and earnings that manifests itself in the Texas study.

3. High schools that dramatically boost the achievement of their students should also boost their long-term outcomes, including postsecondary enrollment, performance, and completion, and earnings.

Here the research base is a tad larger. We can start with a 2016 study of Texas’s accountability system by all-stars David J. Deming, Sarah Cohodes, Jennifer Jennings, and Christopher Jencks, published in The Review of Economics and Statistics, and repackaged for a lay audience in Education Next, which found, in the authors’ words, that:

Pressure on schools to avoid a low performance rating led low-scoring students to score significantly higher on a high-stakes math exam in 10th grade….Later in life, they were more likely to attend and graduate from a four-year college, and they had higher earnings at age 25.

The second, a working paper by the University of Michigan’s Daniel Hubbard, finds, in his words, that:

Students who attend high schools with higher value added perform better in college, both in tested and untested subjects; a student who attends a high school one standard deviation above the mean level of value added will have first-year grades about 0.09 grade points higher than the grades of an identical student in an average high school. The effect remains positive and highly significant after a variety of adjustments to deal with selection into college and into high school. This result implies that schools with high value added are not earning those scores by teaching to the test or by reallocating resources toward tested subjects, but instead by preparing students effectively to perform well in the standardized test and beyond.

And what about the studies reviewed by Hitt, McShane, and Wolf? I count just two that fit with my hypothesis, in that they include significant findings for achievement; have data on either college enrollment or graduation; and aren’t looking at idiosyncratic models like early college, selective enrollment, or CTE. Let’s again use the AEI authors’ own words to describe the results:

Chicago Charter High Schools, which “had a positive and significant impact on math scores…and a positive and significant impact on college attendance.”
New York City’s Small Schools of Choice, which “had positive and significant impacts on ELA scores, and positive and significant impacts on high school graduation and college attendance.”

So what to make of the results for high school programs? All four show a clear match between achievement and college enrollment and/or performance. The Texas study found higher achievement led to higher earnings as well.

***

Where do I end up after rummaging through all of these studies?

The research base is very thin—too thin for a serious meta-analysis. With only nine relevant studies, this is clearly a field still in its infancy.
Almost all of the evidence we do have indicates that changes in test scores and in long-term outcomes match. In each of the nine cases, the student achievement impacts and the longest long-term outcomes point in the same direction. Not all of the long-term impact outcomes are statistically significant. But this is still a promising finding.

No doubt this debate will continue; we plainly need a lot more empirical evidence to inform it. In the meantime, the best studies we have indicate that test-based accountability is a smart approach, imperfect as it is, because students who learn more go on to do better in the real world. And yes, that’s what really counts.

* Chalkbeat's Matt Barnum pointed out that the gains in high school graduation rates were actually small and statistically insignificant.