Raising Minnesota: Ripley responds!

THURSDAY, OCTOBER 24, 2013

Why did she dump the TIMSS: Amanda Ripley is a good “human interest” writer.

In her widely praised book, The Smartest Kids in the World, she creates interesting portraits of three American exchange students. Her portrait of South Korea’s “pressure cooker” educational culture is especially striking.

Ripley is good at human interest. But she’s terrible at the nuts and bolts of analyzing test scores. When she compares the scores of various states, she doesn’t even “disaggregate”—that is, she doesn’t compare different groups of kids in one state to comparable groups in the others.

For that reason, she ends up thinking that states which have the most white kids are the states which have the best schools. That is a very basic, fundamental error.

We’ll look at that problem in our next post, when we compare Minnesota’s performance in math to that of the other states. Next week, we’ll examine Ripley’s failure to consider the particular needs and circumstances of many low-income kids.

Today, let’s examine a more basic question. Why didn’t Ripley use data from the TIMSS and the PIRLS in her book?

The world’s developed nations take part in three international testing programs—the TIMSS, the PIRLS and the PISA. Presumably, all three programs are considered valuable.

But Ripley completely ignores the TIMSS and the PIRLS in her international comparisons. Because American kids have scored least well on the PISA, this tends to tilt her story in a gloomy direction.

On Tuesday, we told you what a cynic would say about this choice. A cynic would say that Ripley dumped the TIMSS and the PIRLS to produce the gloomy tale about U.S. schools that ranking “reformers” love.

Is that why Ripley ignored the TIMSS? We can’t answer that. Ripley doesn’t denounce American teachers, students and schools in the way the most aggressive “reformers” do. On the other hand, by citing the PISA alone, she produces a gloomier narrative, as preferred by current elites.

(She also overstates routinely. This starts with the largest howler in the history of books. It sits right on page 2.)

Why did Ripley deep-six the TIMSS and the PIRLS? On Tuesday, Kevin Drum linked to our post; he noted the way this choice tilts Ripley’s book toward the gloomy. In response, Ripley sent an e-mail to Drum, explaining her total reliance on the PISA.

Ripley explains this decision on pages 19-23 of her book. In our view, her explanation is embarrassingly unsophisticated. To simplify matters, let’s look at her e-mail to Drum, the text of which can be seen by all.

This is the way she begins. Warning! Gorilla dust!
RIPLEY EMAIL (10/23/13): First of all, I am excited whenever anyone is interested in a debate about TIMSS versus PISA! I geeked out on this debate for many months while working on the book, and it was a pretty lonely existence—so I am happy to have an excuse to talk about it.

I agree it is critical to be skeptical of PISA—or any test or metric. All have their flaws. Which is why I spent a long time learning about all the various international tests—studying sample questions, reading about their strengths and weaknesses and analyzing their results over time.

In fact, I used many different data points to decide which countries to feature in the book, including high school graduation rates, college attainment rates, spending per pupil, rankings of national competitiveness and other economic indicators, as well as test data from TIMSS, PIRLS and NAEP.

As it turns out, international test data is strongly correlated from one test to another. (The correlation between TIMSS 2007 and PISA 2007 was 0.93.) But as you note, there are some differences between PISA and TIMSS findings. In the end, I made a very conscious decision to prioritize PISA findings for two main reasons:
Try to ignore the pleasant asides and self-deprecation with which elites will sometimes endear themselves to other elites. Also ignore that third paragraph, which is completely irrelevant.

Ignore the wonky statement about correlation, which is completely useless in this context. As you ignore all that dust, please note one key point:

In paragraph 2, Ripley says that all tests of this type “have their flaws.” She says we should “be skeptical of PISA—or any test or metric.”

We would tend to agree. There is no such thing as a perfect test, especially where sampling is involved.

But if all testing programs have their flaws, it seems even stranger that Ripley builds her entire book around the PISA alone, while ignoring the TIMSS and the PIRLS. This is especially true based on what we noted on Tuesday—at one point in her book, Ripley uses TIMSS results to establish Minnesota’s elite status in math.

Ripley never names the TIMSS, not even in her endnotes. But she does put TIMSS results to that use, and she refers to the unnamed TIMSS as “a major international test.”

If TIMSS results can be used that way in the case of Minnesota, why are TIMSS results AWOL when she compares the United States to Finland? Why would you disregard two-thirds of the available data?

She had three sets of data to consider. Why did she only consider one? As Ripley’s email continues, she offers these two reasons:
RIPLEY: In the end, I made a very conscious decision to prioritize PISA findings for two main reasons:

1. PISA is a test administered to 15-year-olds, which means it catches kids closer to the end of their compulsory schooling. TIMSS is given to 4th and 8th graders, which is useful, too, but I was most interested in the cumulative effects of countries' education systems, rather than the midpoint.

2. Unlike TIMSS, PISA was designed to test students' abilities to apply knowledge to solve real-world problems and think for themselves. (TIMSS is a test of school curriculum.) I was most interested in those higher-order thinking skills, since they are increasingly valuable in the modern economy. To see if the hype on PISA was true, I took the test myself, and I found it to be a remarkably sophisticated test.
We’re slicing the lunch meat amazingly thin in that first explanation. Eighth-graders and 15-year-olds are rather close in age. By the way:

In her book, Ripley uses fourth grade TIMSS scores to establish Minnesota’s elite status. If you recall, we told you to read the highlighted passage with care:
RIPLEY (page 73): In 1995, Minnesota fourth graders placed below average for the United States on an international math test. Despite being a mostly white, middle-class state, Minnesota was not doing well in math. When Eric started kindergarten two years later, however, the state had smarter and more focused math standards. When he was eleven, Minnesota updated those standards again, with an eye toward international benchmarks. By the time he went to high school, his peers were scoring well above average for the United States and much of the world. In 2007, Minnesota elementary students rocked a major international math test, performing at about the same level as kids in Japan.
Minnesota’s elementary students rocked the world on that major international test! She is referring to the state's fourth grade scores on the 2007 TIMSS.

As we noted, Minnesota’s eighth graders didn’t do as well on the TIMSS as the state’s fourth graders. That’s why Ripley qualified her claim, in a way most readers wouldn’t have noticed.

But please note: In that passage from her book, Ripley is using fourth grade scores from the TIMSS to establish Minnesota’s status. This week, in her e-mail to Drum, she says she didn’t use eighth grade scores because eighth graders are too young!

Such contradictions abound in Ripley’s work. It seems to us that Ripley cuts corners in pursuit of more thrilling stories. She may also be cutting corners, and embellishing, so that she doesn’t have to break with the preferred elite story line, in which miraculous Finland is off on a cloud and the U.S. is fumbling along.

Beyond that, Ripley simply doesn’t seem to be sophisticated with test scores; she doesn’t seem to understand the basic blocking and tackling involved. This brings us to her second point to Drum, where she says she took the PISA and “found it to be a remarkably sophisticated test.”

In our view, that’s a semi-embarrassing statement. The corresponding passage in her book is even worse. Here’s why:

As Ripley tells Drum, the TIMSS is designed as “a test of school curriculum.” It tests how much math a student knows. (Presumably, knowledge of math is important if we want kids to be able to learn more math as they continue in school.)

By way of contrast, the PISA advertises itself as a test of critical thinking. Its questions are more “arty” than those on the TIMSS.

We assume both tests have been judged to be valid and useful, or all those nations wouldn’t be taking them. The question remains: why did Ripley focus entirely on the PISA while throwing away the TIMSS?

In her book, she describes the process by which she came to accept the cult of the PISA. At one point, she decides to take the test for herself so she can better understand it.

Pouring on the human interest, she says she gave herself a pop quiz on the day of the test. Do you believe this story?
RIPLEY (page 20): I got there early, probably the only person in history excited to take a standardized test. The researchers who administered PISA in the United States had an office on K Street in downtown D.C., near the White House, wedged between the law firms and the lobbyists.

In the elevator, it occurred to me that I hadn’t actually taken a test in fifteen years. This could be embarrassing. I gave myself a quick pop quiz. What was the quadratic formula? What was the value of pi? Nothing came to mind. The elevator doors opened.
Here’s your test: Do you believe a single word of that second paragraph?

Ripley is a highly successful career achiever. Do you believe that she didn’t think about these matters until she found herself in the elevator that day? Do you believe she stood there trying to remember the value of pi?

Everything is possible! That said, we don’t believe that, any more than we believe that Ripley decided to write her book because her mind was blown by the chart that appears on page 3. (She misdescribes the chart on page 2, grossly overstating what it shows about Finland’s miraculous rise.)

Ripley is good with human interest. At times, this may produce results which aren't entirely truthful. Back to her adventure with the PISA:

For the next several pages in her book, she describes her experiences taking and self-scoring the test. She describes some items on the test. She describes the claims made by the test’s developers.

Finally, she draws a conclusion. The highlighted statement seems deeply clueless to us:
RIPLEY (page 23): After I left the building, my sense of relief faded. My score, I realized, did not bode well for teenagers in my own country. This test wasn’t easy, but it wasn’t that hard either. On one question that I’d gotten right, only 18 percent of American fifteen-year-olds were with me. There were other questions like that, which many or most of the Finns and the Koreans were getting right, just as I was, but most young Americans were getting wrong.

PISA demanded fluency in problem solving and the ability to communicate; in other words, the basic skills I needed to do my job and take care of my family in a world choked with information and subject to sudden economic change. What did it mean for a country if most of its teenagers did not do well on this test? Not all of our kids had to be engineers or lawyers, but didn’t all of them know how to think?

I still didn’t believe PISA measured everything, but I was now convinced that it measured critical thinking. The American Association of University Professors had called critical thinking “the hallmark of American education—an education designed to create thinking citizens for a free society.” If critical thinking was the hallmark, why didn’t it show itself by age fifteen?
Good God! Can a person determine, in that manner, if a test “measures critical thinking”? That would require a difficult statistical determination, even after you came to some agreement on what “critical thinking” is.

Does Ripley really believe that a non-professional can assess a major testing program in the manner she describes? That a person can sensibly become “convinced that it measures critical thinking” in this slapdash manner?

That passage seems very foolish to us. It helps stamp Ripley as a rank amateur in this basic part of her book.

We assume the TIMSS, the PIRLS and the PISA are all useful tests. Even if Ripley thinks the PISA is great, that doesn’t explain why she wouldn’t want to consider the TIMSS as well.

In her account, the TIMSS can tell us how well our eighth grade students know “the school curriculum” in math. In a book of 230 pages, why wouldn’t she want to include the fact that American eighth graders matched Finland in math on the 2011 TIMSS?

Why doesn’t she mention that fact in her book events?

Minnesota’s TIMSS scores from 2007 showed the state’s elite status. But American scores from 2011 aren’t worth mentioning. And remember—Minnesota’s scores were for the fourth grade. American eighth grade scores are no good. Eighth graders are too young!

Here’s the next chunk of Ripley’s email. We will make a quick point:
RIPLEY (continuing directly): You are right that American kids do better on TIMSS, especially in reading. And you are right that many people exaggerate our failings relative to other countries. It drives me nuts. Which is why I went to great lengths throughout the book to avoid such hyperbole.

I feel weird quoting myself, but just in case you don't believe me, this is from p. 4: "The vast majority of countries did not manage to educate all their kids to high levels, not even all of their better-off kids. Compared to most countries, the United States was typical, not much better nor much worse…Our elementary students did fine on international tests, thank you very much, especially in reading. The problems arose in math and science, and they became most obvious when our kids grew into teenagers...."
It’s true. Ripley doesn’t bash American students and teachers the way some reformers do. We don’t always agree with her judgments in this general area. But she doesn’t lambaste our ratty teachers with their infernal unions.

She does quote herself a bit selectively. On page 4 of her book, she also says this:
RIPLEY (page 4): The vast majority of countries did not manage to educate all their kids to high levels, not even all of their better-off kids. Compared to most countries, the United States was typical, not much better nor much worse. But, in a small number of countries, really just a handful of eclectic nations, something incredible was happening. Virtually all kids were learning critical thinking skills in math, science and reading. They weren’t just memorizing facts; they were learning how to solve problems and adapt. That is to say, they were training to survive in the modern economy.

How to explain it? American kids were better off, on average, than the typical child in Japan, New Zealand or South Korea, yet they knew far less math than those children.
Do American kids know far less math than children in New Zealand? If you look at the PISA, yes! But if you look at the TIMSS, you get a different result.

How did those countries score on the most recent TIMSS? In Grade 4 math, the United States outscored New Zealand by a substantial margin:
Grade 4 math, 2011 TIMSS, selected countries:
Finland 545
United States 541
New Zealand 486
These are the eighth-grade scores:
Grade 8 math, 2011 TIMSS, selected countries:
Finland 514
United States 509
New Zealand 487
The U.S. also outscored New Zealand at both grade levels in science. Those correlations between PISA and TIMSS don’t look real strong here!
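For readers who want to check the arithmetic, here is a minimal sketch that computes the gaps from the six math scores quoted above. The dictionary layout and the `gap` helper are our own illustrative choices, not anything published by the TIMSS. Raw point gaps say nothing by themselves about statistical significance.

```python
# 2011 TIMSS math scores, exactly as quoted in the two lists above.
scores = {
    "Grade 4": {"Finland": 545, "United States": 541, "New Zealand": 486},
    "Grade 8": {"Finland": 514, "United States": 509, "New Zealand": 487},
}


def gap(grade, a, b):
    """Return country a's score minus country b's score for one grade.

    Hypothetical helper for illustration only.
    """
    return scores[grade][a] - scores[grade][b]


for grade in scores:
    us_vs_nz = gap(grade, "United States", "New Zealand")
    fin_vs_us = gap(grade, "Finland", "United States")
    print(f"{grade}: U.S. minus New Zealand = {us_vs_nz}, "
          f"Finland minus U.S. = {fin_vs_us}")
```

On these figures, the U.S. leads New Zealand by 55 points in Grade 4 and 22 points in Grade 8, while trailing Finland by 4 and 5 points.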

The United States should have better schools. We should teach more math, and we should teach the joy of exploration. We should push our middle-class students more, and everyone else who can keep up. We should examine the needs of American kids from low-literacy backgrounds from their earliest preschool years on.

That said, Ripley is a Time magazine alumna. She is good at telling, selling and embellishing human interest stories.

Her book is often strong in its human interest aspects. But in basic ways we’ll continue to explore, it’s also an ungodly mess. American students deserve much better than this kind of scattershot work.

Next post: Misconstruing Minnesota

17 comments:

  1. Zzzzzzzzzzzzzz . . . .

  2. How much do fourth grade math tests really matter? Because at a certain point in learning about mathematics, beyond mere mechanical calculation, doesn't mathematics ability converge with reading ability? And isn't verbal competence the foundation for intermediate and higher mathematics ability, as well as all the other disciplines? - E

    Replies
    1. Not exactly. Verbal competence is a predictor of math ability but that doesn't mean math requires verbal skills. Verbal ability is the strongest correlate with IQ generally and it predicts the common factor that produces success across a variety of tests (Spearman's g). It doesn't mean that learning to read teaches you math. It means that the same ability that allows you to learn to read allows you to learn math. They are still separate domains with distinct symbol systems and content. Someone with an 800 verbal SAT will not be able to do any math if they haven't been taught any math. Also, don't pooh-pooh calculation. You need basic math facts to learn higher math and fluency is really important to progress in higher level math. You forgot to mention the canard that girls are good at calculation but cannot do higher level math (boys excel there). Why don't girls' superior verbal skills carry them at that point? If you can't do fourth grade math, you won't do eighth grade or later math.

  3. One little thing I pulled out of Ripley's email to Drum (I tried to comment there) is that she says:

    "In fact, I used many different data points to decide which countries to feature in the book, including high school graduation rates, college attainment rates, spending per pupil, rankings of national competitiveness and other economic indicators, as well as test data from TIMSS, PIRLS and NAEP."

    How exactly would she use the NAEP to choose which countries to focus on? Are there any other countries taking the NAEP? It's a little detail, but just one more to suggest she doesn't really care that much about getting it right.

  4. Another detail is her use of charts from conservative economist Eric Hanushek, who is constantly citing his own un-replicated 1992 research as an example of what science knows: http://garyrubinstein.teachforus.org/2012/06/09/do-effective-teachers-teach-three-times-as-much-as-ineffective-teachers/

    Hanushek is the source of Ripley's alarming chart in the first pages of her book that puts Norway way at the bottom of PISA scorers. But Norway's scores have gone down and then up from year to year.

    See: http://www.pisa.no/english/index.html:
    "In 2000 Norwegian students performed on average in an OECD context; in 2003 there was a slight decline, in 2006 they performed significantly below the OECD average and in 2009 the Norwegian results are very close to the level they were in 2000. The average figures, which accordingly are almost the same as they were in 2000, nonetheless conceal an interesting change which appears in all three subject areas in 2009: the portion of students at the lowest levels is reduced compared with 2000, and correspondingly, the portion of students at the highest levels is also reduced in reading and mathematics literacy." -E

  5. Are all American kids taking these three international tests? I am in my 40s and I had never heard of any of them until I started reading Daily Howler. I don't remember ever taking any international standardized test, either in my public education or my private high school education. Are ghetto kids (sorry) really sitting down and taking the PISA?

    Replies
    1. These tests are not assessing kids. They are assessing schools and school systems and national education systems. They are supposed to sample comparably and report scores for different strata so good comparisons can be made but don't test all kids. You can tell when individual kids are being assessed because you get a test score back with that kid's name attached to it. Scores are sent home to parents. In that case, all the kids are supposed to be tested.

      You probably took several tests when you were in school. You wouldn't be told whether it was one of the international ones, a district mandated achievement test, an IQ test, or something being administered for research purposes. Most kids become aware of the purpose of tests when they take the PSAT, SAT or ACT for college admission and scholarship purposes.

  6. The chart on page 3 of Ripley's book which shows Finland "rocketing" and Norway and Italy plummeting, is from an unpublished book by Hanushek and Wossmann (forthcoming supposedly in 2013). There is no explanation of how the two economists arrived at these measures other than that they "projected kids' performance onto a common measuring stick". Performance on what, exactly? PISA tests? Other tests? And what exactly is that "common measuring stick" and what is it supposed to measure? There is no way to tell.

    The chart is given as evidence that poverty does not influence test results, since Norway's children are presumably not poor and yet score badly (or really only average, though you would never be able to tell that from the way the chart is drawn). It seems a very strange piece of evidence indeed for such a conclusion, one decisively belied by all other, more straightforward types of evidence. - E

  7. It's interesting how poverty and money are both the root of all evil. The gods must be crazy.

    I took the Iowa Basic Skills test in elementary school and always scored in the 99th percentile, but my school record was uneven and I never amounted to anything as an adult.

  8. Why do you think US adults are so devoted to the idea that they were and are both smarter and better educated than the current crop of kids and young people?
    Why is this belief so widely shared among "elites" (as Mr. Somerby correctly notes)?
    It isn't true.
    So why are they proving to be so difficult to persuade?
    My own sense is this lie caught on so well because it makes "us" (adults) feel superior.
    My 5th grader does both more complex and MORE math than I did when I was 11, and I'm 51 and I went to a well-regarded public high school where nearly everyone went immediately to college after high school.
    I know this because I look at his work. I'm not alone, either! A LOT of parents know this. So why are we all going along with this huge lie?
    Is it FAIR to continue to tell US kids they're dumber and less well educated than their parents? Shouldn't they get credit for the work that they do?

    Replies
    1. It's because of the younger generation's music. And it's always been that way.

  9. I wonder if Ms. Ripley, while riding the elevator, noticed whether there was elevator music. Ezra Klein would have. And its absence would have been noted in a tweet.
