Does This Analysis of Test Scores Make Any Sense?—A Commentary by Ian Ayres ’86
Here’s the latest guest post from Yale economist and law professor Ian Ayres. Here are Ayres’s past posts and here is a recent discussion of standardized tests.
A recent article in the Times trumpeted the results of a report that had just been released by the Educational Testing Service (E.T.S.).
The E.T.S. researchers used four variables that are beyond the control of schools: the percentage of children living with one parent; the percentage of eighth graders absent from school at least three times a month; the percentage of children age 5 or younger whose parents read to them daily; and the percentage of eighth graders who watch five or more hours of TV a day. Using just those four variables, the researchers were able to predict each state’s results on the federal eighth-grade reading test with impressive accuracy.
“Together, these four factors account for about two-thirds of the large differences among states,” the report said. In other words, the states that had the lowest test scores tended to be those that had the highest percentages of children from single-parent families, eighth graders watching lots of TV and eighth graders absent a lot, and the lowest percentages of young children being read to regularly, regardless of what was going on in their schools.
The article fairly portrays the text of the study, which concludes:
In statistical terms, these four factors account for two-thirds of the differences in the actual scores (r squared = .66). That is a very strong association. (emphasis added).
The last sentence is odd. Normally, I’d look at the statistical significance of the individual factors if I were going to judge the strength of the association. The report’s phrasing suggests a strong association between the reading score outcome and all four of the underlying factors. But what you would not learn unless you dug into the appendix is that only 3 of the 4 factors were statistically significant.
It turns out that the impact of the “percentage of children under age 18 in a state who live with one parent” (labeled in the table as “onepar”) is neither large nor statistically different from zero. A one-standard-deviation increase in the percentage of single-parent kids reduces the predicted reading score by only about half a point (while a one-standard-deviation increase in heavy TV watchers reduces the predicted reading score by 3.3 points).
Moreover, by traditional standards this marginal effect is not statistically significant. The estimated negative impact of single-parent families may simply be a byproduct of chance (the t value indicates that the estimated negative coefficient of -0.0656 is only about four-tenths of a standard error away from zero, so we can’t reject in this data the possibility that the true impact of one-parent families on reading test scores is positive).
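To make the arithmetic concrete, here is a minimal Python sketch of what a t value of about 0.4 implies. The coefficient comes from the text; the exact t value of -0.4 is an assumption (the text says only "about four-tenths"), and the normal approximation stands in for the exact Student's t calculation, so the numbers are illustrative rather than a reproduction of the report's appendix.

```python
from statistics import NormalDist

coef = -0.0656   # estimated "onepar" coefficient, from the text
t_value = -0.4   # ASSUMPTION: the text says only "about four-tenths"

# A t value is the coefficient divided by its standard error,
# so the standard error is implied by the two numbers above.
std_err = coef / t_value

# ASSUMPTION: normal approximation instead of the exact Student's t.
z = NormalDist()
p_two_sided = 2 * (1 - z.cdf(abs(t_value)))

# Approximate 95% interval for the true coefficient.
crit = z.inv_cdf(0.975)
ci_low = coef - crit * std_err
ci_high = coef + crit * std_err

print(f"implied standard error: {std_err:.3f}")
print(f"approximate two-sided p-value: {p_two_sided:.2f}")
print(f"approximate 95% interval: ({ci_low:.3f}, {ci_high:.3f})")
```

Under these assumptions the interval comfortably straddles zero, which is the sense in which a positive true effect cannot be ruled out.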
When I reran the same regression but dropped the “onepar” variable, the adjusted r-squared increased slightly. (You can download an Excel file with the full results and data here.) That’s right: after adjusting for the number of factors used, a three-factor regression does an even better job of explaining the reading score data.
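Why can the adjusted r-squared rise when a variable is dropped? The adjustment penalizes each extra factor, so removing one that adds almost nothing is a net gain. The sketch below illustrates this using the rounded r-squared of .66 and 50 state observations from the text. The t value magnitude of 0.4 is an assumption, and the three-factor r-squared is derived from it using the standard least-squares identity for dropping a single regressor, not taken from the actual Excel results.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n observations and k predictors plus an intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 50           # one data point per state, from the text
r2_four = 0.66   # reported four-factor R-squared (rounded)
t_onepar = 0.4   # ASSUMPTION: magnitude of the "onepar" t value

# Dropping one regressor from a k-factor model lowers R-squared by
# t^2 * (1 - R^2) / (n - k - 1); here k = 4.
r2_three = r2_four - t_onepar**2 * (1 - r2_four) / (n - 4 - 1)

adj_four = adjusted_r2(r2_four, n, 4)
adj_three = adjusted_r2(r2_three, n, 3)

print(f"four factors:  R2 = {r2_four:.4f}, adjusted R2 = {adj_four:.4f}")
print(f"three factors: R2 = {r2_three:.4f}, adjusted R2 = {adj_three:.4f}")
```

The raw r-squared falls a little, as it must when a variable is removed, but the adjusted figure goes up. This happens whenever the dropped variable's t value is smaller than 1 in absolute value.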
We shouldn’t put very much weight on this regression. Instead of analyzing data on individual students, the report focused on aggregate state data, and averaging suppresses a great deal of the real variation of interest. The four-factor regression rests on only 50 state data points. There may be evidence in other studies that children of one-parent families have poorer educational outcomes, but there is not a strong association between the two variables in this particular regression data.