Monday, May 10, 2010

How to lie with statistics

In graduate school, we used to call this "practicing math without a license", which is perhaps a little more generous than "how to lie with statistics" (a reference, of course, to Darrell Huff's classic book). A nice example occurs in the May 9th column by Ross Douthat in the New York Times.

Douthat draws on teen pregnancy and abortion data from the Guttmacher Institute to argue that "Conservative states may have more teen births and more divorces, but liberal states have many more abortions", which leads him to conclude that "the “red family” model" represents "an attempt, however compromised, to navigate post-sexual revolution America without relying on abortion." To make this argument, he points out that Connecticut has a much higher abortion rate than Montana, that New York has many more abortions than Texas, and that the rate in Massachusetts easily exceeds that in Utah.

However, this is a pretty egregious form of cherry-picking. It is also true that Vermont has a much lower abortion rate than Virginia, and that New Hampshire's abortion rate is less than half of Florida's.

If one charts teen pregnancy rates against teen abortion rates, it is true that the trend line is positive (higher pregnancy rates are associated with higher abortion rates) and that the two states with the highest abortion rates are "blue" (New York and New Jersey) while the three with the lowest rates are "red" (Kentucky, South Dakota, Utah). But the trend is not nearly as strong as Douthat suggests, as the following plot shows.



Moreover, the much more striking finding that emerges is how much variation there is. Douthat wants very much to suggest that these factors must be inextricably related: that it is nearly impossible "to navigate post-sexual revolution America [successfully] without relying on abortion." But the data do not come even close to supporting this conclusion.
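One way to put a number on "positive trend, but lots of variation" is to fit a least-squares line and look at R², the share of the variation in abortion rates that the line accounts for. The sketch below is only an illustration: the function name `trend_summary` is my own, and the five data points are made-up placeholders, not Guttmacher figures. To run it on the real data, substitute the actual state-level rates.

```python
from statistics import mean


def trend_summary(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line y = a + b*x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)                      # spread in x
    syy = sum((y - my) ** 2 for y in ys)                      # spread in y
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))    # co-variation
    if sxx == 0 or syy == 0:
        raise ValueError("one of the variables does not vary")
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)   # squared Pearson correlation
    return slope, intercept, r_squared


# PLACEHOLDER values for illustration only (NOT real state data).
pregnancy_rates = [40, 50, 60, 70, 80]
abortion_rates = [12, 8, 20, 10, 25]

slope, intercept, r_squared = trend_summary(pregnancy_rates, abortion_rates)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.2f}")
```

A positive slope together with a low R² is exactly the situation described above: the trend line points upward, but most of the state-to-state variation is left unexplained by it.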

For example, Vermont and New Hampshire seem to do very well on both measures, thank you very much. So do Wisconsin and Minnesota. If Douthat had compared Texas to one of those states rather than to New York, his argument would have fallen apart. In fact, New Hampshire's pregnancy rate is about one third as high as that in Texas, and it has a lower abortion rate.

None of this is to suggest that it is not worth asking why both pregnancy rates and abortion rates show such variation across states. But Douthat's simplistic conclusion is either innocently (practicing math without a license) or disingenuously (lying with statistics) tendentious.