INSTANT ASSESSMENTS: Ravitch blasted Kristof's claims!

MONDAY, JUNE 5, 2023

Her work was astoundingly bad: All last week, we outlined some of the background to our latest search. 

(Note: We'll be AWOL for several days later this week. For that reason, we're covering a lot of ground in this morning's report.)

As you may recall, the current search concerns a set of claims by Nicholas Kristof—a set of claims which followed, and echoed, this May 17 analysis piece by the Associated Press.

We love Kristof's values, but how good was his analytical work? The headline on his essay in the New York Times went exactly like this:

Mississippi Is Offering Lessons for America on Education

"Thank God for Mississippi," Kristof said, at the end of his lengthy June 1 essay. The following excerpts give the flavor of what he said he had seen in the public schools of that state:

KRISTOF (6/1/23): [I]t’s extraordinary to travel across this state today and find something dazzling: It is lifting education outcomes and soaring in the national rankings. With an all-out effort over the past decade to get all children to read by the end of third grade and by extensive reliance on research and metrics, Mississippi has shown that it is possible to raise standards even in a state ranked dead last in the country in child poverty and hunger and second highest in teen births.

In the National Assessment of Educational Progress, a series of nationwide tests better known as NAEP, Mississippi has moved from near the bottom to the middle for most of the exams—and near the top when adjusted for demographics. Among just children in poverty, Mississippi fourth graders now are tied for best performers in the nation in NAEP reading tests and rank second in math.


Mississippi has achieved its gains despite ranking 46th in spending per pupil in grades K-12. Its low price tag is one reason Mississippi’s strategy might be replicable in other states...

“This is something I’m proud of,” said Erica Jones, a second-grade teacher who is the president of the Mississippi affiliate of the National Education Association, the teachers’ union. “We definitely have something to teach the rest of the country.”

Readers, how about it? Does Mississippi "definitely have something to teach the rest of the country?"

After reading the original AP report, we initiated a search for an answer to that question. At this point, three weeks later, we remain unsure of the answer.

Back to Kristof's essay:

"Thank God for Mississippi," Kristof said at the end of his lengthy piece. How had this low-income state achieved such success? Kristof said it was largely due to three policies:

He said it was due to the teaching of phonics. He said it was due to early intervention with kids who are struggling. 

He also said the success was due to a third-grade retention policy, in which kids have to repeat the third grade if they can't pass an end-of-year reading test.

We have learned, through long experience, to question claims of surprising test score gains. For that reason, we set out on a search.

So did education writer Diane Ravitch—and she drew an instant conclusion. After a search of roughly one day, Ravitch's headline said this:

Nicholas Kristof Does Not Know How to Fix Education

So went Ravitch's instant assessment. In our view, her work was astoundingly poor and her instant assessment was worthless.

Dear reader, don't get us wrong! As she conducted her overnight search, Ravitch gave voice to several points of concern we'd be inclined to share.

As she started, Ravitch said that Kristof "is terrific on many issues but consistently wrong when he writes about education." As we noted last week, we've had the same impression in the past.

As we noted, it has seemed to us that he's too inclined to accept the pronouncements of the educational experts—for example, of those who said that "nothing was working" even as Naep scores kept rising for all demographic groups over an extended period of time.

We tend to agree with Ravitch's assessment of Kristof's past work.  Beyond that, Ravitch quickly challenged the third-grade retention policy. 

As with Ravitch, so too here. This is an issue which jumped out at us in Kristof's essay, and in the earlier report by the AP.

Concerning the retention policy, here's the start of what Ravitch said:

RAVITCH (6/2/23): The 2013 legislation [in Mississippi] also enacted third-grade retention. Any child who didn’t pass the third-grade reading test was retained. Most researchers think retention is a terrible, humiliating policy. But Kristof assures readers that failing students get a second chance to pass. 9% of students in third grade flunked. He considers this policy to be a great success, inspiring third graders to try harder, citing a study funded by Jeb Bush’s foundation (Florida also practices third grade retention, which lifts its fourth grade reading scores on NAEP).


It seems fairly obvious that the big gains in NAEP in fourth grade were fueled by the policy of holding back third graders. Jeb Bush boasted of the “Florida Miracle,” which was based on the same strategy: juice up fourth grade scores by holding back the lowest performing third graders.

Long story short! Ravitch was saying that those Grade 4 score gains have been caused by the retention policy in a way which is artificial. Briefly, let's consider why Ravitch would say that:

One thing seems fairly obvious. A state which makes a lot of third graders repeat third grade will almost surely show an improved average score on the next year's Grade 4 tests.

In Mississippi's case, here's how that would work. Like Ravitch, we'll use the nine percent retention figure which Kristof cites in his piece.

As the Grade 3 retention policy went into effect, the lowest-performing nine percent of Mississippi's third graders didn't move on to fourth grade, as they otherwise would have. In this way, they were eliminated from participation in the next year's Grade 4 reading test. 

Presumably, this gave a rocket boost to the average score on that Grade 4 reading test. Here's why:

All those kids who had to repeat third grade were good, decent kids. That said, the lowest scorers in any grade group exert a large downward pull on a state's average score. 

How large might that downward pull be? Consider these two data points from last year's Naep:

U.S. public schools
Grade 4 reading, 2022 Naep
Average score nationwide: 216.11 
10th percentile score: 160.47

Nationwide, the average score was 216.11—but good grief! According to official Naep data, the lowest performing ten percent of Grade 4 test takers scored 160.47 or lower—and many scored quite a bit lower than that! 

Stating the obvious, last year's nationwide average score would have been much higher if that lowest performing ten percent had been eliminated from the exercise. Presumably, Mississippi benefitted from a large score boost on Grade 4 tests in the year its Grade 3 retention policy went into effect.
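For readers who like to see the arithmetic, here's a rough sketch of how large that boost might be. It rests on assumptions we can't verify: that scores follow a bell curve, that the two national data points above describe that curve, and that the kids removed are exactly the lowest scorers. The function names are our own, and the national figures merely stand in for Mississippi's. Treat it as an illustration, not as a measurement.

```python
from statistics import NormalDist

# The two data points quoted above (2022 Naep, Grade 4 reading, nationwide).
NATIONAL_MEAN = 216.11
NATIONAL_P10 = 160.47


def implied_sigma(mean, p10):
    """Standard deviation implied by the mean and the 10th percentile.

    ASSUMPTION: scores are normally distributed. Real score
    distributions are only roughly bell-shaped.
    """
    z10 = NormalDist().inv_cdf(0.10)  # z-score of the 10th percentile
    return (p10 - mean) / z10


def mean_after_removing_bottom(mean, sigma, fraction):
    """Average score of the group that remains once the lowest-scoring
    `fraction` of test takers is removed (mean of a normal distribution
    truncated from below).

    ASSUMPTION: the students removed are exactly the lowest scorers.
    """
    std = NormalDist()
    cutoff_z = std.inv_cdf(fraction)
    return mean + sigma * std.pdf(cutoff_z) / (1.0 - fraction)


sigma = implied_sigma(NATIONAL_MEAN, NATIONAL_P10)

# 9 percent is the retention figure cited by Kristof; 10 percent matches
# the percentile quoted above.
for fraction in (0.09, 0.10):
    boosted = mean_after_removing_bottom(NATIONAL_MEAN, sigma, fraction)
    print(
        f"Remove lowest {fraction:.0%}: average goes from "
        f"{NATIONAL_MEAN:.2f} to {boosted:.2f} "
        f"(a gain of {boosted - NATIONAL_MEAN:.2f} points)"
    )
```

A caution: retention in Mississippi is based on a state reading test, not on the Naep, so the kids held back won't be precisely the lowest Naep scorers. A sketch like this likely overstates the one-year effect.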

Key point! Presumably, the benefit would have been smaller in subsequent years. Here's why:

In the second year of this policy's operation, the kids who had to repeat third grade in Year 1 would now be in the fourth grade, after spending two years in third grade. It's likely that they would have remained among the lower scoring fourth graders—but after two years in third grade, plus the subsequent year in fourth grade, their scores would presumably have been better than they would have been the year before.

(We'll show you Naep data to that effect when we convene tomorrow.)

However you slice it, the basic point seems obvious. A state which retains a lot of third graders likely has a statistical advantage on the Grade 4 Naep when compared to other states which don't retain lots of kids. 

This doesn't tell us whether the retention policy is a good or a bad idea. This does mean that it's hard to make a valid statistical comparison between the different states with different retention policies.

In our view, Ravitch should probably have spent a bit more time on this particular point. Instead, she focused on a second criticism of Kristof's analysis. That critique went like this:

The so-called "Main Naep" tests reading and math in Grades 4, 8 and 12. Echoing a statement by Kristof himself, Ravitch says that Mississippi's Grade 8 gains haven't kept pace with its Grade 4 gains, score gains which (in her view) were basically phony anyway.

Ravitch hammers Kristof very hard on this point. Based on an overnight search, she offered these (bizarre) assertions about Mississippi's performance in Grade 8 reading and in Grade 8 math:

RAVITCH: Eighth grade reading scores in Mississippi have gone up over the past two decades, but scores went up everywhere. In the latest national assessment (NAEP), 37 states had scores higher than those of Mississippi on the NAEP eighth grade reading test. Only one state (New Mexico) was lower. The other 13 were tied. In Mississippi, 25% of the state’s students in 2019 (pre-pandemic) were at or above proficient, compared to 20% in 2003. Nationally, in 2019, 29% of students were at or above proficient*.

In 2019, 42 states and jurisdictions outperformed Mississippi in percentage of students at or above proficient in eighth grade math, eight were tied, and only two scored below Mississippi. 24% were at or above proficient in 2019, a big increase over 2009 when it was 15%. But Mississippi still lags the national average, because scores were rising in other states.

Has Mississippi made progress in the past decade? Yes. Is it a model for the nation? No. When impressive fourth grade scores are followed by not-so-impressive scores in eighth grade, it suggests that the fourth grade scores were anti Oakley boosted by holding back the 9% who were the least successful readers. A neat trick but not an upfront way to measure progress.

Bizarrely, Ravitch seems to think that the most recent Naep tests were conducted in 2019. 

She offers data from that year's Naep testing only. In the first of those two paragraphs, she even offers this link to an overview of Mississippi's performance on the Naep—to an overview of Mississippi's performance on the Naep in 2019!

She then proceeds to hammer Kristof on the basis of Mississippi's allegedly poor performance at the Grade 8 level in 2019. Bizarrely though, she offers only "aggregate" data, failing to make the statistical "adjustments for demographics" which let us perform a valid assessment of Mississippi's performance as compared to more affluent states.

Ravitch remains influential, especially in blue tribe circles. The link to her attack on Kristof was sent to us by a long-time local education activist, a person of justified high standing.

That said, Ravitch's account of Mississippi's Grade 8 performance is astoundingly incompetent. So it goes, again and again, when a person conducts a search of the high-end American discourse concerning the public schools.

Tomorrow, we'll show you how Mississippi's eighth graders actually performed on the most recent Naep—on the Naep which was conducted last year, in the spring of 2022.

Partisan furies to the side, we're forced to tell you this:

After making the obvious statistical adjustments, Mississippi's eighth graders outpaced almost all other states on the 2022 Naep. After making the obvious statistical adjustments, the state's eighth graders performed above the national average in almost all demographic categories, often by substantial margins.

Some of that may still reflect a misleading statistical advantage derived from that third-grade retention policy. But Ravitch—who doesn't even seem to know when the latest Naep testing occurred—failed to make the obvious statistical adjustments as she hammered a series of partisan points against Kristof.

The woods are lovely, dark and deep. But many searches, over many years, have convinced us of a surprising fact:

This is the way the discourse tends to work in this badly floundering, vastly self-impressed nation, especially when people pretend to talk about the public schools and the good, decent people within them.

Tomorrow: Mississippi's eighth grade scores


  1. The puzzle pieces are coming together.

    1. Sure, if you saw off the edges and hammer them into place.

  2. "it suggests that the fourth grade scores were anti Oakley boosted by holding back the 9% who were the least successful readers"

    What does anti Oakley mean? Is this a typo or something that Somerby should have explained?

    1. "anti- Oakley" or "Annie Oakley" - it makes a big difference.

    2. I don't understand "anti-Oakley" either. But Annie Oakley was a sharpshooter.

    3. Best I can guess: it’s a typo for Annie Oakley, who was so known for shooting holes in cards that a sort of meme developed whereby punched tickets were referenced as “Annie Oakleys”, particularly complimentary tickets, so the phrase came to mean “free ticket” and “free ride”.

      So with that phrase, Ravitch is likely indicating the 4th grade scores got a “free ride” increase, boosted by removing low scoring students, instead of the hard work of actually increasing reading proficiency.

      Ravitch is 84, so it is unsurprising for her to use an outdated phrase.

    4. Your interpretation is plausible.

    5. Maybe it was Auntie Oakley.

  3. Whatever the data reveals about the policy of holding kids back (and it isn’t as clear cut as Somerby indicates, see my post above), it isn’t clear how the policy that Mississippi enacted is any kind of subterfuge used merely to inflate naep scores. If you actually read up on everything that Mississippi is doing to help third graders read, it’s more than just holding them back, and its purpose is to … help kids read. Studies show that learning to read by third grade is crucial to future success.

  4. Somerby agrees with much of what Ravitch says and then concludes:

    "That said, Ravitch's account of Mississippi's Grade 8 performance is astoundingly incompetent."

    Somerby claims this is because Ravitch failed to make "the obvious statistical adjustments" but Somerby himself leaves those to tomorrow's essay. We are supposed to accept his evaluation of Ravitch but he won't say why. We are obviously just supposed to take his judgment on faith, no evidence needed, just some vague reference to the obvious statistical adjustments, whatever he means by that.

    Somerby often teases future discussion that he never supplies. Today he wants to call Ravitch wrong, but he provides no support, just this vague tease. Ravitch is famous as an education expert and activist. Somerby is an asshole with a blog. Who shall I believe? Hmmmmm, well obviously not Somerby who likes to call experts names but conceals his own info while making mistakes about the subjects he himself raises, mistakes never corrected because he won't read his comment section. I'll take Ravitch any day -- she isn't afraid to face her critics and has done more for education than Somerby ever could do with his coy slyness and misleading presentations.

    When you make a comparison between the 4th grade results and the 8th grade results, you are not comparing the same group of kids. The schools do not test all available students, but identify a subset to take the test. That means you cannot draw a longitudinal conclusion by comparing the size of the increases in the two cohorts. And both Ravitch and Somerby ignore the impact of the policy of early intervention with kids who are struggling, a more important contributor to improvement than simply holding children back. Somerby assumes that kids who are retained and then go on to 4th grade will be worse performers in 4th grade. That may be true if no extra attention is provided, but since the core of the MS program was the extra attention given early to poor readers, why assume they remain poor readers after retention? This needs to be verified through testing, not assumed as Somerby does to refute Ravitch. Somerby implicitly calls the improved teacher training and the extra specialist attention ineffective, attributing the higher scores to an artifact, but he has no evidence to support his opinion and contradicts the explanations given by the MS administrators. You cannot do that by assuming your own suspicions -- evidence is needed and Somerby provides none and has none.

  5. Looks like maybe we'll be spared the promised Morning Joe hate watching post. That’s about the best we can say about today’s nothingburger.

  6. Somerby’s argument is solely about statistics, so he’d better have the goods if he wants to criticize the media for calling what happened in Mississippi a “miracle.”

    He doesn’t have the goods.

    And it’s indicative of his whole approach that he centers every discussion of education around numbers, and not the needs and concerns of children, which include being able to read.

  7. “she offers only "aggregate" data, failing to make the statistical "adjustments for demographics" which let us perform a valid assessment of Mississippi's performance as compared to more affluent states.”

    You mean, it’s possible that the effort that Mississippi puts into teaching third graders to read may actually be helping eighth graders, that is, if you adjust for demographics?

  8. “We love Kristof's values”

    You mean, these values:

    “Our Tribe’s Own Moral Scold: We aren’t huge fans of Nicholas Kristof’s twice-weekly New York Times columns.

    In our view, Kristof is becoming a relentless moral scold—the type of scold who convinces people to stay away from liberals and progressives.”

    Did not see “moral scold” on the list of values to admire.

    1. Kristof is a neoliberal elitist who went carpetbagging to Oregon and tried to gaslight the state into making him governor. The attempt failed, yet Kristof can consider it a success as he kept the $3 million his campaign raised.

      Somerby, like most on the right, does not suffer from having an ideology, or even integrity. One day you’re Somerby’s hero, the next you’re a target for derision.

      With people like Kristof and Somerby, you’re scraping the bottom of the barrel.

  9. Bob raises several salient points today.

    1. The were points, but they weren't salient.

    2. They were points, but they weren't salient.

    3. The major point is Bob is afraid of talking about Trump.

    4. But Bob didn’t raise that point.

    5. Somerby makes an implied point, which is that 2022 NAEP scores aren’t particularly instructive due to the impact of Covid.

    6. That’s not salient.

    7. Seems like a really dumb way to look at it.

    8. Tests obscure the likely fact that we are all, roughly, equally smart/dumb.

    9. Yes, we are all the same height and weight too.

    10. We’re bigger than mice and smaller than elephants.

    11. The variance in human height and weight is greater than the variance in smartness or dumbness, which is nearly impossible to quantify in any meaningful way, and whatever variance there is, is largely due to environment and experiences, unlike body size, which is more related to genetics.

      Interestingly, humans have relatively limited sexual dimorphism (human males are only 15% larger than females), which apparently was a contribution to the egalitarian nature of human society up until we transitioned to surplus based societies about 10-12k years ago.

      Having said that, I often feel more closely aligned with an elephant than a mouse, largely due to my sugar addiction and my hysterical reactions to mice.

    12. Why is there so little variance in mental ability?

    13. Actually the measurements of most human traits form a normal distribution (bell curve), which means the variability is the same, not different as you suggest. This includes mental ability. There is no difference in this by race. There are many ways of measuring mental ability, encompassed by the study of intelligence and cognitive science (psychology). Genetics, not just environment, affects ability. Every child has the right to develop and maximize his or her ability. That is part of our nation’s guarantee of equality of opportunity.

      When you disaggregate measurements of human traits, the within group differences are much greater than between group differences and the distributions of scores substantially overlap. There are women who are taller than the average man, men shorter than the average woman, black people smarter than the average white or Asian person. This is why individual people must be evaluated based on their individual abilities, not group norms. It is a travesty whenever Somerby posts NAEP means and suggests they show that black kids cannot do science. History shows that is untrue, as does NAEP if you focus on the tails of the distribution not the means.

      Somerby is motivated to denigrate progress among black kids because he is a bigot. He seeks to blind his readers with NAEP intricacies but he doesn’t know what he is talking about.

      All kids deserve a chance to pursue their interests, even if it infuriates racists and inconveniences school administrators. The IQ needed to become a lawyer, doctor, accountant, is not in the gifted range but around 115-120 (1 SD above the mean). Motivation is a stronger determinant of success than ability. A lot of progress is made by people in that range, so the task of schools is not to find the very smartest kids but to nurture those whose interest will drive them to contribute to humanity.

    14. Some basic traits do follow a bell curve; however, this does not indicate similar variability among different traits, since scales, parameters, axes, deviations, etc. vary, this indicates little more than average is about average.

      Mental ability can be measured, but not in a particularly meaningful way, in part because genetics play a limited role in this ill-defined trait.

      There is no guarantee of equality of opportunity, which is also not the end all be all ideal.

      Individual abilities are neither innate nor set in stone.

      “Success” is mostly a function of privilege and happenstance. Notions like motivation and ability are so nebulous as to have little useful meaning.

      Somerby does seem to deny racism, but positing a meritocratic society is no better.

    15. @8:09 -- Your opinions are inconsistent with a lot of the literature on this stuff, which has been studied empirically.

      1. The normal curve is a specific distribution with a mean of 0 and a standard deviation of 1. Traits that produce this distribution when measured, as most human traits do, have the same variability. The x-axis is z-scores (measurement minus the mean divided by standard deviation), so the axes are not varying as you suggest. This permits comparison across groups and traits measured.

      2. Mental ability can be and is measured in a "meaningful way". Genetics does not affect measurement of traits as you suggest. Genetics is the major determinant of human traits and abilities and thus the major source of variability.

      3. Our constitution guarantees that all men (people) are created equal and are entitled to life, liberty and the pursuit of happiness. Pursuit of opportunity is part of this.

      4. The study of intelligence is a subfield of personality theory, which is the study of human traits that are stable across time and place. By definition. That implies that such abilities are set in stone, although malleable to experience and environment. Intelligence is used for problem solving across many situations in one's life and it is a stable and fixed set of traits and abilities with which an individual approaches life's challenges. Again, by definition and by observation. For example, IQ test scores remain stable after early childhood, and only vary early on because the types of measurement tasks change. That stability lasts until old age, when cognitive decline sets in due to health problems.

      You don't know what you are talking about. Much of what you have written is contradicted by actual research. You are entitled to your own opinions, but not your own set of facts.

    16. Ok Charles Murray.

      What you claim is inaccurate, not backed by serious academic literature.

      Some basic human traits can be measured in a way that you could generate a bell curve, but not most.

      However, a bell curve does not indicate that different traits vary to the same degree, which is a non sequitur of a concept. It’s not just the scale of axes that vary, it’s also values, parameters, and deviations. Your claims are silly enough to warrant the troll moniker.

      It’s impossible to measure mental ability in a way that would meaningfully indicate what proportion is related to genetics versus environment, but honest attempts at such in fact indicate that the environment vastly outweighs the impact of genetics.

      Your paraphrase about equality is from the Declaration of Independence, not the Constitution.

      The US does not in any way, not de jure nor de facto, guarantee equality of opportunity.

      When you claim something is set in stone, yet malleable, you’ve lost the argument. You’ve exposed yourself as a troll operating in bad faith. Your mumblings about IQ are largely nonsensical and inaccurate.

  10. "Individual abilities are neither innate nor set in stone."

    Does everybody have the same ability? If so, why? What makes everybody's ability the same?

    1. Probably not in the sense you mean, and nobody has suggested that.

      I always say there are dumb questions, but go ahead and ask them, you never know what might lead to edification.

      Having said that, a society based solely on a hierarchy of ability sounds like a nightmare, but if it floats your boat, go for it, more power to you.

    2. I’ve never worked hard, or been motivated or ambitious, yet by normal metrics I’m more successful than 99% of people, more successful than anyone here.

      I had always assumed this is because I am a superior human being; I deserved to be more successful.

      But as I learned about certain things such as how free will doesn’t exist, about the arbitrary nature of social constructs, about the natural state of how humans coexist, I realized my assumptions were wrong, and indeed made an ass of me.

    3. No, everyone does not have the same ability. That is what is meant by ability being distributed according to a normal distribution (bell curve). People differ in their abilities -- that is what is meant by variability, people vary.

      This looks like a ChatGPT style troll spouting incorrect statements about psychology (which includes the measurement of human abilities and traits).

      We have a society based on a hierarchy of wealth and access to power now. Ancient China devised a society based on ability to compose a certain type of poem in order to gain access to civil service positions and rise in a hierarchy of power, coupled with a hereditary monarchy. It ensured equality of opportunity even though people varied in their ability to gain literacy and write such poetry. I'm not sure that was worse than the elections we hold today.

      We obviously do not hold politicians to any particular ability standard, since both Lauren Boebert and George Santos were elected without credentials and are both uneducated con artists. That doesn't float my boat. I think people should be qualified for the jobs they hold, so that they will perform them effectively, including government leadership.

      With AI, it is going to be easier than ever to flood blogs with nonsensical garbage like this troll is doing. If you are interested in human mental abilities, please read some psychology textbooks or take a course online or in community college. This is an actual science, not something any troll can make up to confuse people on blogs. Don't fall for pseudoscience about what people are like -- something that goes back hundreds of years, if not to the Greeks (who were largely wrong in their philosophizing about human abilities).

    4. That attack on the Greeks is racist. They were, and they still are, good decent people.

    5. Free will doesn't exist. I couldn't stop trolling if I wanted to, and I don't want to, because, lacking free will, I can't want to.

    6. If psychology textbooks led you to ramble on about strawman arguments, imply that the main human endeavor is to hold a job and perform that job in a manner you deem as effective, then you missed the point of those textbooks.

    7. Ability is normally distributed. So some kids can be good at science, and others just can't. (They're still good, decent kids, and they still deserve our support.) So we test them to see who will benefit from science education.

      Now our challenge is to test them in a way that doesn't exclude bright black and hispanic kids.

    8. I’m confused. Is the troll at 9:42 the same as 9:25, or is that one AI?

      The trolls, AI, and the plain morons are all commingling.

      9:52 mentions trolls and AI, so they must be the plain moron.

      Either way, pretty sure free will is not about desire or decisions. Wait is that the troll?

    9. I’m actually a scientist, with a phd, even though it took me 7 yrs of grad school.

      When I was a kid, I was terrible at science, I got C’s all through high school in all my subjects.

      Later in college, I watched a documentary that sparked my interest in my field, not because of the subject itself, but because it was very well made.

      Now I do research in my field, and sadly, am finding that most of my fellow scientists aren’t terribly good at what they do, even though they are educated and interested. Worse, little of value comes from my work, as there are powerful political forces that easily override my expertise in pursuit of their own interests, namely corporations.

    10. 12:20, what is your field? What research have you done? If you want to preserve anonymity, describe your work in very general terms.

  11. I am not a bot. I am a troll.