Quite correctly, Drum disaggregates!


A commenter asks a good question:

Almost surely, you will never see a serious discussion of our public school test scores.

The truth is, no one really cares about the truth of that matter. On the other hand, a lot of people in the press corps care about billionaires.

In this country, many well-known billionaires are committed to the standard gloomy stories in which our public schools just keep getting worse. Our schools were so much better back then! As James Atlas just said!

Almost surely, you'll never see a serious discussion of this topic. But just in case a young revolutionary is lurking somewhere, we want to record appropriate points about the way test scores should be discussed. This brings us to an excellent move Kevin Drum recently made.

And to a very good question.

Drum made his excellent move July 2, while showering us with the kind of praise we always find so embarrassing. His self-referential headline said this:

“Kevin Drum Smackdown Watch: I Forgot About Disaggregation!”

In his post, Drum explains why you have to “disaggregate” test scores if you want “to really understand what's going on.”

We agree with that overall judgment. For most purposes, you have to look at how different parts of the student population are doing. You can’t just look at the “aggregate” score. You have to see how Hispanic students did. You have to take a separate look at the scores of black students if you want to get a full picture of what is going on.

In his post, Drum explains why that’s so important. We were glad to see his post, because Drum has always tended to present aggregate scores when he reviews the NAEP.

Drum is one of the only liberals who seems to care about such topics. The analysts were glad to see him singing the praises of disaggregation.

Having said that, let us also say this. One of Drum's commenters posed a very good question that day. First, he quoted something Drum said, then he let her rip:

COMMENTER (7/2/13): “The rising share of blacks and Hispanics has pushed down the average when you lump everyone together.”

OK, but why is it assumed to be obviously improper to "lump everyone together?" I should think it would depend on whether the question is, “Are scores failing to improve,” or the question is, “Why scores are failing to improve.” This explanation only goes to why, it doesn't undo the fact that scores are failing to improve.

In fact, overall test scores are improving, even before disaggregation. But the commenter asked a very good question. Here’s the answer:

It isn’t “obviously improper” to lump everyone together. Depending on the question you’re trying to answer, that may be the right thing to do.

But most of the time when we talk about test scores, we’re trying to evaluate the work of our schools, or the work of some particular school, or the value of some instructional program. In all such discussions, you really need to disaggregate scores, so that you end up comparing roughly similar groups of students.

If School A is full of kids from high-income professional homes, and School B is full of the children of low-income recent immigrants from low-literacy backgrounds, it doesn’t make sense, for most purposes, to compare their test scores. Given the way the world works today, School A will have much higher scores.

That doesn’t necessarily mean that School A is doing a better job. If the student populations are hugely different, the difference in test scores doesn't mean much of anything at all.

Generally, we talk about test scores in the search for new ways to beat up on teachers. If you’re going to use test scores to examine the work of our teachers, you really do need to “disaggregate” scores.

Aggregate scores are not unimportant, but they only tell part of a much larger story. When you break scores down in various ways, you give yourself a much richer idea of all that is going on.

You also skip past the obvious blunders the billionaires like to make.
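For anyone who wants to see the arithmetic behind that point, here is a minimal sketch. The numbers are made up for illustration and are not NAEP results; the group labels and the helper name `weighted_average` are our own assumptions. It shows how every subgroup can score higher in a later year while the lumped-together average doesn't move, simply because the mix of students has changed.

```python
# Hypothetical numbers only; these are NOT actual NAEP results.

def weighted_average(groups):
    """Return the overall mean for a list of (student_count, avg_score) pairs."""
    total_students = sum(count for count, _ in groups)
    total_points = sum(count * score for count, score in groups)
    return total_points / total_students

# Each entry is (number of students, average score) for one subgroup.
# Group A is listed first, group B second, in both years.
earlier_year = [(80, 300), (20, 260)]
later_year = [(60, 304), (40, 274)]

overall_earlier = weighted_average(earlier_year)
overall_later = weighted_average(later_year)

# Score change within each subgroup, earlier year to later year.
subgroup_gains = [
    later_score - earlier_score
    for (_, earlier_score), (_, later_score) in zip(earlier_year, later_year)
]

print("Subgroup gains:", subgroup_gains)
print("Overall average:", overall_earlier, "->", overall_later)
```

In this made-up case, group A gains 4 points and group B gains 14, yet the aggregate average is 292 in both years. The aggregate number isn't wrong. It just answers a different question than "how are roughly similar students doing now compared to then?"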

Tomorrow: The final, shocking installment in our NAEP-watch reports


  1. Disaggregation, while analytically necessary in most cases, does come with a price: accepting the reality that different groups perform differently. People of good faith understand that phenomenon as the product of history and economics and not innate abilities. There is a level of discomfort, however, in the fear of giving potential cover for racist judgments. It's a variant of the squirming that the tracking issue generated.

  2. Ack.

    Overall test scores at age 17 were NOT improving before disaggregation: they were unchanged, according to Drum's post.

    It was a case of Simpson's paradox, in which the overall average can stay the same, or even go down, for a group, but can rise for all component subgroups (because the relative sizes of the subgroups can change).

    Point is, you can't even detect the rise, which is real, without the disaggregation.

    Of course one could live in a la-la land in which all subgroups are supposed to score the same, and so the changing composition of the overall group shouldn't matter. But that is on a different planet from the one we live on.

  3. Oh, Urban, get real. When we disaggregate, we compare green apples to green apples, we don't say green apples are superior to red apples. Trying to see if a group's performance is improving isn't racist. In fact, it is the opposite, trying to see if the improvements and benefits are equal all other things being equal. If not, then we say we need to do something to improve the outcomes. If we shy away from doing serious analysis just because some sleazy capitalist school privatizer or some racist nitwit might use it in their argument we are playing into their hands. And tracking ain't the same thing, kiddo. Believe me, I was tracked and treated as mentally retarded because I stuttered, so refused to read out loud in 1st grade. Tracking is the arbitrary assignment to some subgroup based on some arbitrary criteria, but analyzing performance is not tracking.

    1. OK, thanks for correcting me, I guess. That's assuming you actually read what I said. But you seem not to know much about the history of racial stereotyping outside your own personal experience.

  4. Drum is not the only one who did this!

    I was reading comments to what was an ACCURATE piece about the scores and readers seized on the issue of 17 year olds doing poorly, not knowing about drop-out rates.

    The one and only reason I knew about drop-out rates was reading this blog.

    You and Drum should correspond more often.

  5. Most school policy is still set at the local level in response (in addition to local concerns and goals) to each state's mandates and policies (these, in turn, being affected by federal mandates and policies). Which means that disaggregation (of whatever measures are being used) is vital for determining local policies. Yet parents, school administrators, and even teachers are often unduly influenced by reports of aggregate results for the whole country or whole state or whole city, or by mere theories of what disaggregated results would look like.

  6. Somerby says: "Almost surely, you'll never see a serious discussion of this topic." A serious discussion of this topic goes on every day, all day at Diane Ravitch's blog site. Oh wait, I forgot, Bob hates Diane Ravitch.

    1. That's because she's a "piece of work," or so he says.

  7. I don't believe Bob reads the comments, but this is only tangentially related anyway. Even here in Singapore we have to suffer comparisons to Finland. Although, in the article below, they at least mention the small size of both nations and how every nation faces its own unique educational challenges: