A study purporting to find a connection between stimulus spending and the partisanship of a district suffers from an obvious flaw. But in so doing, it provides an example of why it's important to retain some common sense -- and some sense of context -- when conducting a statistical analysis.
The study, by Veronique de Rugy of George Mason University and the National Review, claims that congressional districts that elected a Democrat to the Congress received a larger amount of stimulus funds, by a margin that is statistically significant even after controlling for certain other effects like the unemployment rate. However, the study does not control for at least one other variable that is overwhelmingly important in determining the dispensation of stimulus funds.
The variable in question is in fact pretty obvious if you simply look at the districts that have received the largest amount of stimulus money, according to de Rugy's dataset.
The district that received the largest amount of stimulus funding in the fourth quarter of 2009, according to de Rugy's tally, is California's 5th Congressional District. Is there anything notable about the 5th Congressional? Well, it is home to the state capital, Sacramento. Let's keep that in mind.
Next on the list is New York's 21st Congressional District. The largest city in the 21st is the state capital of New York, Albany.
Third is the 21st Congressional District of Texas. It contains parts of Texas' state capital, the wonderful city of Austin. (Another district that contains parts of Austin -- the 25th -- ranks 14th on de Rugy's list.)
At this point, it ought to be pretty obvious what is going on. The three districts receiving the largest amount of stimulus funds are home to the capitals of the three largest states -- New York, California, and Texas. Let's pause for a moment and make a bold prediction. I'll bet you that the district that ranks 4th on the list will contain the capital of the 4th largest state, Florida.
Bingo. Up 4th on the list is Florida's 2nd Congressional, home to Tallahassee.
Fifth is Pennsylvania's 17th, which hosts the state capital, Harrisburg.
The sixth through tenth districts contain the capital cities of other large states: Ohio, Georgia, Michigan, Illinois and New Jersey, respectively. They are followed by districts that include the state capitals of Indiana, Tennessee, Virginia -- then another part of Austin, Texas -- then Arizona, Missouri, North Carolina and Wisconsin. Finally, in 19th place is South Carolina's 3rd Congressional District, which does not host a state capital. (Ironically, it has elected a Republican -- J. Gresham Barrett -- to the Congress.)
This, of course, makes perfect sense. A lot of stimulus funds are distributed to state agencies, which are then responsible for allocating and administering the funds to the presumed benefit of citizens throughout the state. These state agencies, of course, are usually located in or near the state capital.
In fact, the differences are pretty overwhelming. There are 78 congressional districts that contain all, or part, of a state's capital city. Collectively, they received $118 billion in the fourth quarter. The 357 districts that are not home to a state capital, by contrast, received only $48 billion. On a per-district basis, the districts with state capitals received about 11 times as much funding: roughly $1.5 billion each, on average, versus about $134 million. The ratio would be higher still if we excluded districts that included only outlying areas of state capital cities that do not host any state governmental institutions.
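The per-district comparison is easy to check from the totals quoted above. Here is a short Python sketch; the variable names are mine and are not taken from de Rugy's dataset.

```python
# Totals quoted in the text, in billions of dollars (fourth quarter of 2009).
capital_total, capital_districts = 118.0, 78
other_total, other_districts = 48.0, 357

# Average funding per district in each group.
capital_avg = capital_total / capital_districts
other_avg = other_total / other_districts

# How many times larger the average state-capital district's haul is.
ratio = capital_avg / other_avg

print(f"districts with a state capital: ${capital_avg:.2f}B each")
print(f"districts without one:          ${other_avg:.2f}B each")
print(f"ratio: {ratio:.1f}x")
```

Note that the two groups add up to all 435 House districts, so nothing is being left out of the comparison.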
The other piece of the puzzle, of course, is that state capitals are much more likely to elect Democrats to Congress for a variety of reasons. They are, by definition, urban (although some smaller state capitals like Montpelier stretch the definition). They are, by definition, home to large numbers of governmental employees, who may be more sympathetic to bigger government. They tend to be highly educated and often are home to large state universities.
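Those two facts together (capitals get far more money, and capitals lean Democratic) are enough to manufacture a partisan gap out of nothing. The Python sketch below uses four made-up districts, rigged so that funding depends only on whether a district contains a capital; the field names and dollar values are hypothetical and are there to show the mechanism, not to stand in for the real data.

```python
from statistics import mean

def party_gap(rows):
    """Mean funding of Democratic districts divided by that of Republican ones."""
    dem = [r["funds"] for r in rows if r["party"] == "D"]
    rep = [r["funds"] for r in rows if r["party"] == "R"]
    return mean(dem) / mean(rep)

# Made-up rows: the one capital district gets 10x the funding and happens
# to be Democratic. Party itself has no effect on funding here.
rows = [
    {"party": "D", "capital": True,  "funds": 1000.0},
    {"party": "D", "capital": False, "funds": 100.0},
    {"party": "R", "capital": False, "funds": 100.0},
    {"party": "R", "capital": False, "funds": 100.0},
]

raw_gap = party_gap(rows)                        # all districts
no_capitals = [r for r in rows if not r["capital"]]
adjusted_gap = party_gap(no_capitals)            # capitals excluded

print(f"raw D/R ratio:               {raw_gap:.1f}")
print(f"D/R ratio without capitals:  {adjusted_gap:.1f}")
```

In this toy case the raw comparison shows Democratic districts far ahead, and the gap disappears entirely once the capital is set aside. Real data will not be this tidy, but the direction of the distortion is the same.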
That de Rugy has testified before Congress on the basis of her evidence, and never paused to consider why the top five congressional districts on her list overlap with Sacramento, Albany, Austin, Tallahassee and Harrisburg, is mind-boggling. The presence of a state capital is the overwhelmingly dominant factor in predicting the dispensation of stimulus funds. This could have been discerned in literally five minutes if she had bothered to look at the apparent outliers in her dataset and considered whether they had anything in common -- a practice that should be among the first things that any researcher does when evaluating any dataset.
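The five-minute check amounts to sorting the data and reading the top of the list. A minimal Python sketch follows, assuming each district is recorded with a name, a funding total and a flag for whether it contains a state capital. The `District` type, the helper names and the sample rows are all hypothetical placeholders, not de Rugy's figures.

```python
from typing import NamedTuple

class District(NamedTuple):
    name: str
    funds: float        # stimulus dollars received
    has_capital: bool   # contains all or part of the state capital

def top_recipients(districts, n=20):
    """Return the n districts with the most funding, largest first."""
    return sorted(districts, key=lambda d: d.funds, reverse=True)[:n]

def capital_share(districts):
    """Fraction of the given districts that contain a state capital."""
    if not districts:
        return 0.0
    return sum(d.has_capital for d in districts) / len(districts)

# Placeholder rows only, to show the call pattern.
sample = [
    District("District A", 900.0, True),
    District("District C", 120.0, False),
    District("District B", 850.0, True),
    District("District D", 95.0, False),
]

top = top_recipients(sample, n=2)
print([d.name for d in top])
print(f"share with a capital, top of list: {capital_share(top):.0%}")
print(f"share with a capital, all rows:    {capital_share(sample):.0%}")
```

If the share at the top of the list is wildly out of line with the share overall, the outliers have something in common, and that something belongs in the model.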
By the way -- if you throw out the districts that are home to state capitals, those which elected Democratic members to Congress still rank higher, receiving 31 percent more stimulus funds, on average, than those which elected Republicans. So, perhaps there is hope for her analysis yet. At that point, it would become important to consider other variables such as the economic conditions within each district. I'm not going to do her work for her, but I would suggest to de Rugy that she consider the following recommendations to correct other flaws with her research design:
First, she should look at unemployment rates at a district-by-district level, which are available through the American Community Survey, rather than at a metropolitan area level as she has done. Unemployment rates are often much higher in poor, downtrodden, inner cores of cities (which are also much more likely to elect Democrats to Congress), as opposed to suburbs and outlying areas.
Next, I would suggest that she look at a more robust array of demographic variables, such as the urban-rural distribution, the poverty rate (as opposed to just average income), the population, the number of seniors and children, and perhaps the racial composition. Had she considered these variables initially (particularly the urban/rural distribution), they might have nullified her conclusion, even without accounting for the presence of state capitals.
But my bet is that this is all a bunch of noise resulting from an incomplete -- and possibly deliberately biased -- research design. If de Rugy follows my recommendations -- excludes state capitals, accounts for a broader array of demographic variables, and evaluates unemployment rates at the district level -- and still finds a statistically significant positive relationship between the distribution of stimulus funds and whether the district elected a Democrat to Congress, I will buy her and three of her colleagues lunch anywhere in Washington or New York City. And if such a study is published in a credible, peer-reviewed journal, I will buy them dinner as well.