Thursday, January 14, 2010

Obesity is NOT leveling off in the US

If anything, we're getting fatter faster.

Over the past few days, I've been seeing and listening to reports in credible news sources, the New York Times and National Public Radio, repeating a claim by CDC researchers that the obesity epidemic in the U.S. is leveling off.

I was suspicious. I had heard essentially the same story two years ago and found that claim to be less than credible, so I decided to investigate more closely.

So here's the data as they present it - the red line that they want you to focus on represents the proportion of the adult population who are overweight, but not obese - you can see that in recent years it has turned down a bit. But the blue and green lines, which represent obese (but not extremely obese), and extremely obese have been increasing, with no sign of slowing down whatsoever.

I used exactly the same data, from the same table in their report to generate this graph.
You can see a little more clearly here (at least I think), that the proportion of people who are overweight, obese, or extremely obese has been increasing, not leveling off.

Taking a closer look at the last few years, the proportion of the population that was some degree of overweight in 1999-2000 was 69.5%. It increased 1.6 percentage points to 71.1% in 2001-2002, 0.3 points to 71.4% in 2003-2004, and 2.1 points to 73.5% in 2005-2006. So the latest two years of data show the largest increase in overweight in the U.S. population so far documented. (The steepness of the little lines between the bars on the graph tells the same story.)
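The arithmetic is easy to check for yourself. Here's a small Python sketch that uses only the four figures quoted above; the variable names are mine, not anything from the CDC report.

```python
# Share of U.S. adults who are overweight, obese, or extremely obese,
# as quoted in this post (percent of the adult population).
prevalence = {
    "1999-2000": 69.5,
    "2001-2002": 71.1,
    "2003-2004": 71.4,
    "2005-2006": 73.5,
}

periods = list(prevalence)

# Change from each survey period to the next, in percentage points.
changes = {
    later: round(prevalence[later] - prevalence[earlier], 1)
    for earlier, later in zip(periods, periods[1:])
}

for period, change in changes.items():
    print(f"{period}: +{change} percentage points")

# The survey period that ends the biggest jump.
largest = max(changes, key=changes.get)
print("Largest increase ends in:", largest)
```

The three period-to-period changes come out as 1.6, 0.3, and 2.1 points, with the biggest one ending in 2005-2006.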

So how could the esteemed news agencies I mentioned at the top of the blog have gotten the story completely wrong? Not only that, but they were able to find a plethora of obesity experts to proffer explanations and predictions based on the erroneous conclusion that the obesity epidemic in the U.S. is leveling off.

There were two fundamental misunderstandings, as far as I can excavate: first, almost everyone involved seems to have mis-understood the term "statistical significance", and second, there seems to have been a widespread (excuse the pun) misunderstanding of what it means to look at changes in only the middle of the weight distribution.

As to mis-understanding the term "statistical significance", here is the sentence from the original CDC report:
The NHANES 2005-2006 data for persons age 20 years and over suggest an increase, between the late 1980s and today, in obesity in the United States, with the estimated age-adjusted prevalence moving upward from a previous level of 23 percent in NHANES III (1988-94) to approximately 34 percent. The change between 2003-2004 and 2005-2006, however, was not statistically significant. {emphasis added}

This little phrase "not statistically significant" was mis-interpreted to mean "no change", that is, a leveling off, between the data collected in 2003/2004 and that collected in 2005/2006.
Interestingly enough, that exact same mis-interpretation was made when comparing the 2001/2002 data to the 2003/2004 data in this CDC report. Apparently, the last time they got it right was in comparing the 1988-1994 data to the 1999-2002 data, when they didn't invoke the problematic concept of statistical significance at all, in this CDC report.

But obesity rates didn't level off in their data. As I showed above, that period actually brought the largest increase seen yet in this data set.
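To see why "not statistically significant" is not the same thing as "no change", here's a rough sketch of a plain two-proportion z-test in Python. Big caveats: I'm plugging in the combined overweight figures from this post (71.4% and 73.5%), not the obesity figures the CDC actually tested, and the sample sizes are numbers I made up for illustration. NHANES is a complex survey and a real analysis would need its weights and design, so treat this as a toy, not a re-analysis.

```python
import math

def compare_proportions(p1, p2, n1, n2):
    """Two-sided z-test and 95% interval for p2 - p1.

    Treats both groups as simple random samples, which is a simplification.
    """
    diff = p2 - p1
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = diff / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci = (diff - 1.96 * se, diff + 1.96 * se)
    return diff, p_value, ci

# ASSUMPTION: 2,000 people per survey period is a hypothetical sample size.
diff, p_value, ci = compare_proportions(0.714, 0.735, 2000, 2000)
print(f"n=2,000:  change={diff:+.3f}  p={p_value:.3f}  "
      f"95% interval=({ci[0]:+.3f}, {ci[1]:+.3f})")

# ASSUMPTION: same two proportions, ten times as many (hypothetical) people.
diff_big, p_big, ci_big = compare_proportions(0.714, 0.735, 20000, 20000)
print(f"n=20,000: change={diff_big:+.3f}  p={p_big:.6f}  "
      f"95% interval=({ci_big[0]:+.3f}, {ci_big[1]:+.3f})")
```

With the smaller made-up sample the p-value lands above 0.05, so the change is "not statistically significant", yet the best estimate is still an increase of 2.1 points and the interval stretches well into larger increases. With the bigger made-up sample the very same 2.1 points becomes "significant". Nothing about the change itself is different between the two runs; only the amount of data is.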


The other mis-interpretation seems to have arisen from how this report discussed being overweight as something distinct from being obese, and being obese as something distinct from being "extremely" obese. That is, when they broke up the overweight population into three sub-groups, the proportion of people who are "just" overweight decreased in the last few years (the red line in the top graph).
Here's the offending sentence from the original report cited above:
Although the prevalence of obesity has more than doubled since 1980, the prevalence of overweight has remained stable over the same time period.

That only makes sense if you completely ignore the fact that lots more people are becoming more than "just" overweight (i.e. obese), and many fewer people have been at a "normal" weight (the tan bars in my re-graphing). That is, they have pulled the middle of a rapidly shifting distribution out and claimed that it isn't changing, but that is misleading because the size of the weight distribution on either side of it is changing dramatically.
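Here's a toy Python illustration of how the middle slice of a distribution can sit still while the whole thing slides to the right. Everything in it is an assumption for the sake of the example: I pretend BMI follows a bell curve with a standard deviation of 5 (real BMI distributions are skewed), I use the usual cutoffs of 25 and 30 for overweight and obese, and the two means (26 and 29) are made up.

```python
import math

def normal_cdf(x, mean, sd):
    """Probability that a normal(mean, sd) value falls below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

def bmi_shares(mean, sd=5.0):
    """Share of a hypothetical population in each BMI band."""
    below_25 = normal_cdf(25, mean, sd)
    below_30 = normal_cdf(30, mean, sd)
    return {
        "under 25": below_25,
        "25 to 30": below_30 - below_25,
        "30 and over": 1 - below_30,
    }

# ASSUMPTION: both means are invented to illustrate the point.
before = bmi_shares(mean=26.0)
after = bmi_shares(mean=29.0)

for band in before:
    print(f"{band:>12}: {before[band]:.1%} -> {after[band]:.1%}")
```

In this made-up population the "25 to 30" band holds the same share before and after, even though the "under 25" share roughly halves and the "30 and over" share roughly doubles. Calling the middle band "stable" would be true and would also miss the whole story.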


I'm afraid I may sound like some punctuation-correcting nag, nipping at the heels of our most esteemed news outlets for some petty transgression all the while making it sound like the English language is on the verge of collapse into a ruin of uninterpretable gibberish.

But this isn't a punctuation or spelling error. It's getting the story completely backwards.
And I don't really blame the NYT or NPR for screwing it up. They were, after all, following the lead of our most esteemed public health agency. They were able to find plenty of esteemed experts willing to line up and go on at length about the implications of this erroneous interpretation of the findings.

I trace it back to the deeply ingrained mis-training that scientists get about what the phrase "statistical significance" means. When the British eugenicists who were followers of Darwin coined the phrase "statistical significance" in the late 1800's, they intended it to mean (at least as far as I can interpret their meaning over 100 years later) something like "significant in a statistical sense only, and quite possibly not in any other sense". So to them the phrase "not statistically significant" wouldn't have had any particular meaning at all. Perhaps a study that was "not statistically significant" wasn't looking at a big difference. Perhaps there was some sort of bias hiding a real difference. Perhaps there just wasn't a large enough sample to reveal the real difference in a statistically striking manner.
At any rate, over the years, in order to simplify the concept of "statistical significance", it has been taught as "a low probability of a finding arising from chance alone", and the phrase "not statistically significant" has increasingly been mis-interpreted as meaning "no difference".

And that, my friends, is why you'll find that I seldom, if ever, use the phrase "statistical significance" - because in the end it doesn't really mean anything - it's only about the statistical interpretation of a dataset, which is only one small window for understanding and interpreting numbers.
