Can you cook the books by using more accurate statistics?

That’s the question hanging over the Obama administration, now that the Census Bureau has decided to change the way it assesses the number of Americans without insurance in the middle of the Obamacare rollout.

The basic problem the Census has been struggling with is how, exactly, to define “Americans without insurance.” If you ask your survey respondents “Do you currently have health insurance?” the share who answer “No” will be a lot lower than the share who would answer “Yes” to “Have you been uninsured at any point in the last year?” If you change your question to “Were you uninsured for all of last year?” the number of people counted as uninsured will plunge accordingly.

The Census’s Current Population Survey (CPS) has struggled for years with the phrasing of this question and, when compared to other surveys of insurance coverage, has persistently overestimated the number of Americans without insurance. However, its numbers have still been commonly used, since the CPS is the only survey that produces state-by-state insurance numbers across the nation.

The Census Bureau did the right thing and has been investigating how to improve the accuracy of its numbers. Yuval Levin describes one of the error checks the CPS ran, and the surprising results.

In 2000, for instance, the CPS supplement introduced a simple verification question: If people had answered “no” when presented with a list of possible options for different kinds of insurance coverage on the questionnaire, then the interviewer, rather than just note them as uninsured, would say “So does this mean I should record you as uninsured?” They found that an amazing 8 percent of respondents answered “no,” and only in the wake of this verification question (which, for those who answered in the negative, was followed again by a list of insurance options) reported that they were in fact insured.

The CPS has finally found a new question that it trusts to produce reliable data. But since the switch comes just as Obamacare goes into effect, the methodological change may obscure the effects of Obama’s signature legislation. As reported in the New York Times:

In the test last year, the percentage of people without health insurance was 10.6 percent when interviewers used the new questionnaire, compared with 12.5 percent using the old version. Researchers said that they had found a similar pattern in the data for different age, race and ethnic groups.
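To see why a change in method can muddy a year-over-year comparison, here is a minimal Python sketch using only the two figures quoted above. The scenario in which real coverage does not change at all is a hypothetical chosen for illustration, and the function names are invented for this example.

```python
# Figures quoted above from the Census test of the two questionnaires.
OLD_RATE = 12.5  # percent uninsured, old questionnaire
NEW_RATE = 10.6  # percent uninsured, new questionnaire


def method_gap(old_rate: float, new_rate: float) -> float:
    """Percentage-point drop produced by the questionnaire change alone."""
    return round(old_rate - new_rate, 1)


def naive_change(prev_old_method: float, curr_new_method: float) -> float:
    """Year-over-year change if the break in method is ignored."""
    return round(curr_new_method - prev_old_method, 1)


gap = method_gap(OLD_RATE, NEW_RATE)

# Hypothetical scenario: real coverage is identical in both years, but the
# earlier year was measured with the old form and the later year with the new.
apparent = naive_change(OLD_RATE, NEW_RATE)

print(f"gap from method alone: {gap} points")
print(f"apparent change with no real change: {apparent} points")
```

Anyone comparing an old-questionnaire year directly with a new-questionnaire year would have to net out a gap of this kind before crediting or blaming any policy.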

But Ezra Klein of Vox isn’t worried that the changes in the survey will make it impossible to measure the impact of the Affordable Care Act. According to Klein, the CPS changed its methodology just in time.

Politics aside, there’s a technocratic logic to this timing. The Census Bureau’s change begins with data for 2013 — meaning it starts before Obamacare does. By making the switch in 2013, there’ll be a baseline to compare obamacare to, and that baseline won’t fall apart in year two or three or four.

Unfortunately, a baseline data point is a lot less valuable than a baseline trend. The test for Obamacare isn’t just whether it brings the number of uninsured down, but whether the new policies cause people to sign up faster than historical data would predict. The 2013 data point may be a baseline measurement of coverage, but it can’t serve as a baseline for the changing trend of coverage.

The ideal solution might have been to run both questions, the old and the new, in parallel on the CPS for a period of five to 10 years. Instead of posing the improved question to all respondents, the Census employees could randomize assignments, so that a third to a half of all those surveyed answered the old, biased question, while the rest answered the new, improved question.

By asking both questions, we would be able to continue to compare the post-Obamacare data to the historical record, since we would still have access to survey data that was wrong in the same way. Meanwhile, we would have the more accurate data as a snapshot of the actual numbers of the uninsured.
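The split-sample idea can be sketched as a small simulation. This is an illustration under stated assumptions, not the Census Bureau's procedure: the response model (the old question overstating uninsurance by a fixed amount), the sample size, the one-third share, and the function name simulate_split_sample are all hypothetical, and the rates are borrowed from the test figures quoted earlier purely as a calibration.

```python
import random


def simulate_split_sample(n, true_rate, old_bias, old_share, seed=0):
    """Randomly assign each respondent the old or new question and
    return the estimated percent uninsured under each form.

    Assumed response model (hypothetical): the new question reports the
    true rate, and the old question overstates it by `old_bias`.
    """
    rng = random.Random(seed)
    # For each form: [respondents reporting no insurance, total respondents]
    counts = {"old": [0, 0], "new": [0, 0]}
    for _ in range(n):
        # Randomized assignment: a fixed share of respondents get the old form.
        form = "old" if rng.random() < old_share else "new"
        p_uninsured = true_rate + old_bias if form == "old" else true_rate
        counts[form][0] += rng.random() < p_uninsured
        counts[form][1] += 1
    return {form: 100 * hits / total for form, (hits, total) in counts.items()}


# Illustrative inputs: one third of respondents answer the old question.
estimates = simulate_split_sample(
    n=200_000, true_rate=0.106, old_bias=0.019, old_share=1 / 3
)
for form, pct in estimates.items():
    print(f"{form} question: {pct:.1f}% uninsured")
```

Because assignment is random, the two groups differ only in the question they were asked. The old-question series can then be compared with the historical record, while the new-question series supplies the more accurate level.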

It’s not uncommon to run questions in parallel, whether for different subsets of a sample or just by asking all respondents both questions. In a Health and Human Services study that tried to disentangle the effects of different methods, the researchers noted that one of the rival measures of insurance, SIPP (the Survey of Income and Program Participation), altered its methods in 1996 to cross-validate its questions, when it “started asking about coverage during the month of the interview, in addition to asking about coverage during the months prior to the interview.” This kind of continuity could have mitigated the fear that the CPS overhaul was a ploy to cook the books.

Ultimately, the timing of the CPS revisions is unfortunate, and it is probably attributable to Obamacare. But it’s the increased importance of accurate numbers on insurance, rather than any corrupt maneuvering, that caused the change to happen at one of the worst possible times. It would have been a mistake for Obama to delay the change, and sentence us to another decade of bad statistics.