Part 2—Why Maine seemed so special: Friend, do you want to use test scores in reading and math to judge a school or school district? To judge an entire state’s schools?
To compare school performance in one state to school performance in another?
If you have reliable test scores to use, that’s a sensible thing to do. But if you want to evaluate schools that way, you have to “disaggregate” test scores.
You can’t compare the overall scores from some state to the overall scores from another. You have to break the test scores down to see how different groups of students in the two states did.
If one state has a high proportion of low-income kids, that state may have low overall scores, even though those low-income kids are outscoring their peers around the nation. And uh-oh! Because white kids currently outscore black and Hispanic kids, a state with a high proportion of white students has a built-in advantage.
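The arithmetic behind that built-in advantage can be sketched in a few lines. This is only an illustration: the two states, the group labels, the enrollment shares and the scores below are all hypothetical numbers invented for the example, not NAEP data. State A outscores State B within every group, yet its overall average is lower, because the overall figure is weighted by each group's share of enrollment.

```python
# Hypothetical numbers, invented purely to illustrate the arithmetic.
# Each state maps a student group to (share of enrollment, average score).
state_a = {"group_1": (0.50, 230.0), "group_2": (0.50, 210.0)}
state_b = {"group_1": (0.95, 225.0), "group_2": (0.05, 200.0)}


def overall_average(state):
    """Enrollment-weighted average score across all groups."""
    return sum(share * score for share, score in state.values())


def group_gaps(a, b):
    """Score difference (a minus b) within each student group."""
    return {group: a[group][1] - b[group][1] for group in a}


overall_a = overall_average(state_a)
overall_b = overall_average(state_b)
gaps = group_gaps(state_a, state_b)

print(f"Overall: A={overall_a:.2f}, B={overall_b:.2f}")
for group, gap in gaps.items():
    print(f"{group}: A minus B = {gap:+.1f}")
```

In this made-up case, a ranking by overall score puts State B ahead, while a group-by-group comparison puts State A ahead in every group.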
If you want to rate the true performance of a state’s schools, you have to “disaggregate” its scores. In her unfortunate book, As Texas Goes, Gail Collins sings the praises of disaggregation—then fails to practice the technique as she rolls her eyes at the state of Texas for its low overall scores. (See THE DAILY HOWLER, 9/26/12.)
That’s par for the course with a writer like Collins. But uh-oh! Twelve years ago, the late Molly Ivins made the same mistake in one of her syndicated columns.
Collins quotes that Ivins column in her massively bungled new book. For that reason, it may be worth taking a trip back in time to see what happens when journalists fail to practice “disaggregation.”
Twelve years ago, Molly Ivins made a substantial mistake.
Given the time frame, her mistake may have been understandable. It was a mistake all the same.
The episode started when the Rand Corporation released a major report on the performance of the various states’ public schools on the National Assessment of Educational Progress (NAEP). The report appeared in July 2000. Rand heaped praise on the Texas schools for the high performance of its various students—after disaggregation.
Rand stressed the progress Texas schools had made on the NAEP between 1990 and 1996. Beyond that, it said that black and Hispanic kids in Texas were scoring right at the top of the nation as compared to black and Hispanic peers in the other states.
That sounded like good news for Texas. But at the time, a presidential campaign was under way, featuring the Republican governor of Texas. And sure enough! As Ivins described the Rand report, she stressed the idea that Governor Bush had nothing to do with the progress displayed by kids in the Texas schools.
That may or may not have been true. But in the process, Ivins made grossly misleading remarks about her state’s public schools.
The Texas schools were still “slightly below average,” the liberal columnist wrote. At one point, she used a statistic which made it sound like things might be somewhat worse:
IVINS (7/29/00): The study shows that Texas is improving fast. Our scores are still slightly below the national average (27th of the 44 states that use the national tests); but we're moving up—second in improvement on math scores, and our minority kids are outperforming others around the country.
So the governor stood up and took a bow. Excuse me.
The report was based on tests between 1990 and 1996. One thing we know about education reform is that it takes 10 to 20 years before we can see any results, before we can tell whether what we've tried is working.
The real story on how our schools rocketed from abysmal to only slightly below average in a mere 30 years starts in 1968, with a lawsuit...
Do we really know that education reform “takes 10 to 20 years before we can see any results?” Ivins may have been stretching a bit, denying credit to Bush.
But according to a wisecrack by Ivins, the Texas schools had “rocketed from abysmal to only slightly below average in a mere 30 years.” And she included a statistic which sounded gloomier still: She said Texas was still “27th of the 44 states that use the national tests.”
That statistic can be defended as technically accurate. But it was grossly misleading.
Let’s forget about credit and blame. Instead, let’s consider Ivins’ claims about the Texas schools.
Were Texas schools “below average” in 2000, when Ivins’ column appeared? Were they “below average” in 1996, the last year considered in the Rand report? In fact, once you “disaggregated” their scores, Texas students were outscoring their peers around the nation by very significant margins as of 1996.
Once again, here are some of the 1996 scores which led Rand to praise the Texas schools, followed by the corresponding scores from the year 2000, when Ivins wrote her column:
Texas students, fourth-grade math, 1996 NAEP
White kids: First in the nation (of 43 participating states)
Black kids: First in the nation (of 35 states)
Hispanic kids: Second in the nation (of 25 states)
Texas students, fourth-grade math, 2000 NAEP
White kids: Second in the nation (of 40 participating states)
Black kids: First in the nation (of 32 states)
Hispanic kids: Second in the nation (of 21 states)
On the basis of those 1996 scores, the Rand study heaped praise on the Texas schools. But Rand had disaggregated the data.
Misleadingly, Ivins did not.
To those who read the Rand report, the strong performance of Texas students wasn't a mystery. Rand stressed the high performance of Texas kids. Other journalists were able to see this.
Example: A few days before Ivins’ column appeared, Melanie Markley did a detailed report on the Rand study in the Houston Chronicle. With perfect accuracy, Markley described Rand’s findings about the high performance of the Texas schools.
Texas students “outperformed all other states when variations in demographics were taken into account,” Markley correctly reported:
MARKLEY (7/26/00): Texas a qualified No. 1 in U.S. education study
Texas students, building on reforms launched in the 1980s, outperformed peers with similar backgrounds on national tests measuring education progress, the California-based Rand research group reported Tuesday.
Smaller class sizes, well-funded preschool programs and adequate resources for teachers are key factors that separate top-ranked Texas from the likes of California, which came in last among 44 states participating in the three-year private study, Rand found.
Based on an analysis of the National Assessment of Educational Progress tests given from 1990 to 1996, the study ranks states by raw scores, scores that compare students with similar backgrounds and score improvements.
When comparing raw scores, Texas ranks below most other states. Maine, North Dakota, Iowa, New Hampshire and Montana are at the top, while Mississippi, Louisiana, California, Alabama and South Carolina are at the bottom.
Researchers said the states with high raw scores tend to have fewer minorities, higher family incomes and better-educated parents.
But Texas ranked second only to North Carolina in improved scores—about twice as great as the national average—and outperformed all other states when variations in demographics were taken into account.
Texas “outperformed all other states when variations in demographics were taken into account,” Markley wrote. That is to say, when test scores were disaggregated.
After disaggregation, Texas kids were outperforming their peers in all other states! This point was made a bit more clearly in Anjetta McQueen’s report for the Associated Press:
MCQUEEN (7/25/00): Researchers used specific categories to see how well each state is educating children regardless of background. For example, on the 1996 math test of fourth-graders, black students in Texas ranked first when compared with blacks in other states; Hispanic students in Texas ranked fifth. Meanwhile, California's black students ranked last; California's Hispanic students ranked fourth from the bottom.
McQueen seems to have erred on one point. According to official NAEP data, Hispanic kids in Texas scored second in the nation, behind only Maryland, in fourth-grade math in 1996. To give you a rough idea of their relative success, they outscored the national average for their peers by more than twelve points—and by a very rough rule of thumb, ten points on the NAEP scale is often said to equal one academic year.
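That rule of thumb lends itself to a quick back-of-the-envelope calculation. The sketch below applies it to the 1996 fourth-grade math margins cited in this post (10.1, 12.8 and 12.4 points). The ten-points-per-year conversion is only the very rough rule of thumb mentioned above, treated here as an assumption, not an official NAEP equivalence. The function and variable names are made up for the example.

```python
# Margins over the national average for the same group, 1996 NAEP
# fourth-grade math, as cited in this post.
margins = {"White": 10.1, "Black": 12.8, "Hispanic": 12.4}

# Assumption: the very rough rule of thumb that ten NAEP points
# correspond to about one academic year. Not an official conversion.
POINTS_PER_YEAR = 10.0


def approx_years(points, points_per_year=POINTS_PER_YEAR):
    """Convert a score margin into a rough number of academic years."""
    return points / points_per_year


years_ahead = {group: round(approx_years(pts), 2) for group, pts in margins.items()}

for group, years in years_ahead.items():
    print(f"{group} kids in Texas: roughly {years} academic years ahead of peers")
```

Under that assumption, each of the three groups comes out roughly one academic year or more ahead of its nationwide peers.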
After disaggregation, how well were Texas students doing in 1996? Here’s where the three major demographic groups stood in fourth-grade math by the final year Rand studied:
Texas students, fourth-grade math, 1996 NAEP
White kids: Outscored white kids nationwide by 10.1 points
Black kids: Outscored black kids nationwide by 12.8 points
Hispanic kids: Outscored Hispanic kids nationwide by 12.4 points
On fourth-grade math, Texas kids were kicking the nation’s backside. How did Ivins end up saying that the state was still 27th, out of just 44 states?
Where did she get that gloomy statistic? Therein lie several tales.
Ivins’ statement can be defended as technically accurate. One lone graph in the lengthy Rand report compared the average scores the various states had achieved on all NAEP tests from 1990 through 1996. (The graph appears on page 14 of the Rand report.)
In this one solitary graph, there was no attempt to “disaggregate” scores—to adjust for income, race or ethnicity. And sure enough! On this measure, Texas did finish 27th, out of 44 states. And surprise! These were the five top-scoring states, along with the percentage of their students who were white:
Top five states on Ivins’ preferred measure, with percentage of students who were white:
1. Maine (98)
2. North Dakota (96)
3. Iowa (93)
4. New Hampshire (97)
5. Montana (unavailable)
27. Texas (50)
Maine finished first on this ill-conceived measure—but then, 98 percent of Maine’s students were white! North Dakota finished second on this measure; its kids were 96 percent white. (The data for race come from 1992.)
Texas did finish 27th on this particular measure. The main reasons: The state had large numbers of low-income students. And its student population was only about 50 percent white.
Duh! As anybody could have guessed, success on this measure correlated strongly with the percentage of white kids in a given state’s schools. Presumably because it ranked Texas so low, Ivins pulled this statistic from a mammoth report which lavishly praised the Texas schools for outscoring all other states—after disaggregation.
Can we talk? In her column, Ivins dogged Texas for its rank on this single, ill-conceived measure. In essence, she was complaining that Texas had too many minority kids and too many kids with low incomes.
We’ll assume that Ivins didn’t fully understand that fact. But that was the nature of the statistic she pulled from that lengthy report—a lengthy report which explicitly stressed the high achievement of Texas kids if you compared such kids with their nationwide peers.
Final point: Was Maine really first in the nation, while Texas languished in 27th? Only because the state of Maine had so many white students! For more proof of Texas’ high performance, just consider this:
Was Maine really best in the nation? In fact, if you consider white kids only, Texas was generally outscoring Maine by the period under review! Here’s how white students from the two states ranked on a string of NAEP tests:
White students only, fourth-grade math, 1996 NAEP
Texas: First in the nation
White students only, fourth-grade reading, 1998 NAEP
Texas: Second in the nation
White students only, eighth-grade math, 2000 NAEP
Texas: Sixth in the nation
White students only, fourth-grade math, 2000 NAEP
Texas: Second in the nation
By 1996, Texas tended to outscore Maine, often by fairly large margins, even if you looked at white students only. Maine came in first on that overall ranking only because the state had almost no minority kids.
Texas finished 27th! In such ways, we get misled when we don’t “disaggregate” test scores. Ivins could have told the world about the relative success of all the kids in Texas. Instead, she cherry-picked a single statistic—a statistic which gave a grossly misleading picture of the performance of the Texas schools.
We will assume that Ivins may not have fully understood her topic that day. Twelve years later, in her unfortunate book, Collins praises the wonders of disaggregation—after which, she utterly fails to employ the praised technique.
Ivins’ column was greatly misleading, but she may not have understood. Twelve years later, what’s Collins’ excuse?
Why does this nonsense continue?
Tomorrow: Quoting Barbara Bush!