- The American Conservative - https://www.theamericanconservative.com -

A Guide to Lies, Significant Lies, and Statistics

In today’s empirical age, the magic words in a disagreement aren’t “Simon Says” but “Studies Show.” Unfortunately, studies now show that the conventional test to demonstrate that a finding is significant turns out to be easily gameable.

In a paper in Psychological Science [1], Joseph P. Simmons, Leif D. Nelson, and Uri Simonsohn walk readers through four ways to create the illusion of a significant finding, even when the data doesn’t back you up.

For example, if you wanted to be able to publish a paper saying that something had a significant link to future income, you could just examine an enormous number of possible factors. You might collect data on race, gender, parents’ income, size of school, SAT scores, astrological sign, etc. The more comparisons you made, the greater the chance that at least one of them would register as significant, by chance alone.
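The multiple-comparisons effect described above is easy to make concrete. A quick simulation (my sketch, not from the paper): each of k unrelated comparisons has a 5% chance of clearing p < .05 by luck alone, so the chance that at least one does grows as 1 − 0.95^k.

```python
import random

# Sketch: k independent null tests. Under the null hypothesis, p-values are
# uniformly distributed, so each test clears p < .05 exactly 5% of the time
# by chance alone.
random.seed(0)
ALPHA = 0.05
TRIALS = 100_000

for k in (1, 5, 10, 20):
    hits = sum(
        any(random.random() < ALPHA for _ in range(k))  # did any test "hit"?
        for _ in range(TRIALS)
    )
    print(f"{k:2d} comparisons: {hits / TRIALS:.3f}  "
          f"(theory: {1 - (1 - ALPHA) ** k:.3f})")
```

With twenty candidate factors, the chance of at least one spurious “significant” link is already about 64%, even though nothing real is being measured.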

Simmons, Nelson, and Simonsohn crunch the numbers and determine that, if you used all four of their tricks in one study, you’d be able to get meaningless data to register as “significant” over 60% of the time. They even run a fake study of their own (“proving” that listening to music about aging causes subjects to become younger) to demonstrate their methods in action.

The researchers conclude their paper with advice for authors and journal reviewers, but none for laypeople. Scientific research shapes the policies we recommend and the choices we make in our day-to-day lives, but how much credence should we give new findings, when we know that they may be statistical flukes (or worse, designed to deceive)?

Ultimately, the peer-reviewed journal system is, to paraphrase Churchill, the worst approach to understanding the world, except for all the others that have been tried. When we make an idol of empiricism, any flawed result or pervasive bias leaves us feeling betrayed and defiant.

Instead of thinking and talking about science as the purest form of inquiry, we might be better off thinking of it as a somewhat finicky old car. It usually gets us where we need to go, but it’s a good idea to check out the engine and be prepared to swap out or repair parts. The reforms proposed by Simmons, Nelson, and Simonsohn will keep the kludge running well enough until the next element breaks, and it’s time to work out another fix.

In our day-to-day lives, that means instead of accepting scientific results on faith, or gleefully nitpicking the methodology of inconvenient results, we should look for opportunities to replicate or test the latest results.

One of the simplest solutions is trying out new ideas with pilot programs. On a governmental level, that might mean setting up randomized controlled trials (e.g., if the Obamacare mandate applied only to people whose social security numbers were odd) and, on a personal level, designing quick and dirty experiments (e.g., seeing if you can successfully guess which days your spouse gave you decaffeinated coffee instead of caffeinated, and how much of a difference it really makes to your productivity).
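The coffee experiment can even be scored properly. Under the null hypothesis that you can’t tell the difference, each day’s guess is a coin flip, so a one-sided binomial (sign) test gives the chance of doing at least that well by luck. A minimal sketch, with made-up numbers (12 correct out of 14 days) purely for illustration:

```python
from math import comb

def sign_test_p(correct, days):
    """One-sided p-value: probability of at least `correct` right guesses
    out of `days` if each guess is really a 50/50 coin flip."""
    return sum(comb(days, k) for k in range(correct, days + 1)) / 2 ** days

# Hypothetical result: 12 correct calls out of 14 coffee days.
print(f"p = {sign_test_p(12, 14):.4f}")  # ≈ 0.0065 — hard to chalk up to chance
```

The same few lines work for any yes/no kitchen experiment, as long as the sampling plan (how many days) is fixed before you start guessing.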

When we’re active auditors of scientific results, instead of spectators, we force ourselves to think about the implications of our assumptions, because we have to make our models of the world specific enough to test. Ideally, we can make statistical research not just resilient in the face of problems, but antifragile [2]: actively strengthened by perturbations. We won’t be afraid of discovering errors, because the adjustments we’ll have to make to avoid them will make the whole edifice stronger.

Follow @leahlibresco [3]


Comments on “A Guide to Lies, Significant Lies, and Statistics”

#1 Comment By vandelay On January 13, 2014 @ 9:23 am

“Instead of thinking and talking about science as the purest form of inquiry, we might be better off thinking of it as a somewhat finicky old car. It usually gets us where we need to go, but it’s a good idea to check out the engine and be prepared to swap out or repair parts.”

Yes! What first turned me off from the atheist and skeptic communities was the unhealthy fetishization of science you often find there. Science is praised and idolized to the point where any findings that can claim peer review are immediately promoted to exalted status, regardless of the integrity of the original experiment(s) or their replication (assuming they were replicated at all, as they often aren’t).

Science is nothing but a tool, and though it’s sharper than the rest, it’s still a fairly blunt instrument. That doesn’t mean it needs to be regarded with hostility or suspicion – I think its idolization is in large part a reaction to Young Earth Creationists and such – but it’s very important to keep the proper perspective on what science is and what it’s actually capable of.

#2 Comment By balconesfault On January 13, 2014 @ 10:06 am

Perhaps tangential to this point is that often I hear cited that some study appeared in a “peer reviewed journal” as evidence that the scientific community has completely embraced its methodology and conclusions.

This isn’t really true. What it means is that the peer review panel of that particular journal reviewed the work and determined that it had merit and deserved publication – but that publication should be viewed as a portal for presenting it to the greater audience of those in the profession, many of whom may review it with different experiential backgrounds or presuppositions than those held by the review panel, or may try (and fail) to replicate the study, or may challenge some of the underlying assumptions or models.

In short – the peer review process is not a validation that the greater scientific community agrees with a study or approach, but rather that the reviewers for that journal believe the study or approach has sufficient merit to put it out there for the greater scientific community to look at. The list of peer reviewed conclusions that are later found flawed is pretty long, and is not an indictment of the process – but often of the way that outsiders understand the process.

#3 Comment By Johann On January 13, 2014 @ 10:37 am

Outstanding article. Scientific studies should include theoretical statisticians to set up the study and review the results. Unfortunately, this is not normally the case. Many scientists believe they can use plug-and-play statistical tools. The problem is that it’s not always, or perhaps even usually, obvious what distribution should be used. And the most serious “mistake,” whether it’s statistical ignorance or purposeful deception, is to depart from the original statistical sampling plan, continuing to seek out more samples until one gets some desired result. The article alludes to this practice. This practice erases the original, statistically based calculation of the probability of a false positive.

This part of science is most definitely broken today.
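The “keep sampling until it works” practice Johann describes is easy to demonstrate. In this sketch (made-up parameters, not the paper’s exact setup), the data have no real effect at all, but we run a z-test after every batch of ten observations and stop the moment anything crosses the usual 1.96 threshold:

```python
import math
import random

random.seed(1)

def peeking_study(max_n=100, step=10, z_crit=1.96):
    """Null data (mean 0, sd 1), tested after every batch; returns True
    if the study ever looks 'significant' — a pure false positive."""
    xs = []
    while len(xs) < max_n:
        xs.extend(random.gauss(0, 1) for _ in range(step))
        z = (sum(xs) / len(xs)) * math.sqrt(len(xs))  # one-sample z, known sd
        if abs(z) > z_crit:
            return True  # stop early and "publish"
    return False

TRIALS = 20_000
rate = sum(peeking_study() for _ in range(TRIALS)) / TRIALS
print(f"false-positive rate with peeking: {rate:.3f}  (nominal alpha: 0.050)")
```

With ten looks at the data, the error rate lands well above the nominal 5% – exactly the inflation Johann describes, since the 5% guarantee only holds if the sample size is fixed in advance.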

#4 Comment By EliteCommInc. On January 13, 2014 @ 1:08 pm

The age of empiricism . . . kidding, right?

The age in which fundamental principles that govern empirical findings have been jettisoned for the “Don’t hurt my feelings lest I accuse you of a hate crime” school of fact.

As for the APA itself, 1973 leaves serious doubt as to their research – data conclusions. I like the APA and their community, but they are hardly the ones to be crying foul these days.

But I do agree that statistical interpretation can certainly be misused.

I think no greater abuse occurs than in the statistical interpretation of criminal behavior.

#5 Comment By Cascade Joe On January 13, 2014 @ 2:02 pm

To vandelay from the Richard Feynman lectures: “…everything we know is only some kind of approximation, because we know that we do not know all the laws as yet. Therefore, things must be learned only to be unlearned again or, more likely, to be corrected.” (credit Wikiquotes)

I like this description better than “finicky old car” but can’t disagree with the comparison otherwise.

#6 Comment By Forester On January 13, 2014 @ 6:23 pm

Part of the problem is there is no longer any way to obtain a random sample of respondents. All the statistics in the world are useless against a sample of “friends and neighbors” or a convenience sample of whoever decides to respond to your online survey. Back in the day we used to be able to get a pretty good representation of a given population in a phone sample. Yes, there was a non-response bias, but it was relatively small. Now, with Caller ID and prohibitions against calling cell phones, phone surveys are impossible. Online surveys are better than they used to be but still far from representative for a lot of populations.

#7 Comment By Tzimiskes On January 13, 2014 @ 7:45 pm

One problem that needs to be mentioned more is that looking at individual studies usually has little value. Most everyone I’ve ever talked to in the actual university system talks about the preponderance of evidence. There is a lot of research being published, and if findings are tending strongly in one direction, that is pretty convincing. An individual study, or a handful of them, that goes against the majority is far more likely to be wrong.

Not that there aren’t exceptions where an individual study turned a field on its head, but this is very rare.

#8 Comment By Jonny On January 13, 2014 @ 8:15 pm

“In our day to day lives, that means instead of accepting scientific results on faith, or gleefully nitpicking the methodology of inconvenient results, we should look for opportunities to replicate or test the latest results.”

I think this largely describes the scientific process as it currently exists. I’ve always thought that one of the fundamental goals of science was admitting what we don’t know. After all, lab tests aren’t done in the real world. They can only establish *likely* causal relationships. In a best-case scenario, science tells us what is vastly more likely than anything else and permits us to reasonably agree upon truth. But it never gives us certainty.

It’s true that making decisions based on “best available data” can be discomfiting, but the only thing worse is anything else.

#9 Comment By MrsKrishan On January 14, 2014 @ 2:52 pm

Chuckle… relatedly just tripped up on this in my Twitter feed

On a more serious note, what does this mean about our propensity for favoring deliberative projects? That is, isn’t a healthy dose of scepticism needed in the exercise of our democratic acts and our jurisprudence? And indeed, re “if you wanted to be able to publish a paper saying that something had a significant link to future income”: for our (and increasingly the world’s) unemployed youth drowning in student debt right now, and considering the hot mess our economy has landed itself in through trusting the monetary fiat of Keynesian math-gurus at the Federal Reserve, what should our fiduciary acts look like? This?


#10 Comment By Charol Shakeshaft On January 15, 2014 @ 10:07 pm

While I appreciate the recent post from Ms Libresco (whose intellect I greatly admire) on the limitations of relying upon statistical significance, I would add that no serious social scientist relies upon statistical significance as a measure of anything other than the probability of the reliability of the finding. All statistical significance means is that, if you have a p of .05, and if you replicate the study 100 times, then – if sample conditions are met – you will get the same result 95 of those times. Statistical significance says nothing about the importance, the breadth, depth, accuracy, or meaning of the finding. To address those questions we turn to statistical tools that help us understand effect size or variance accounted for. We also require that the assumptions for which inferential statistics are appropriate are in place.

I agree with Feynman that all we can hope for are approximations. However, I’m willing to stick with that as science unlearns, corrects, verifies, and moves forward in our understanding of the physical and social worlds.

I do not agree with one of the posters that we are no longer able to draw random samples. I read random-sample studies all the time. However, the conservative agenda has made it much more difficult to do good science, particularly in education, by erecting barriers to obtaining samples and to making comparisons between control and experimental groups. Nor do I understand how the gratuitous comment from elitecomminc – don’t hurt my feelings lest I accuse you of a hate crime – has anything to do with this discussion.
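The distinction this comment draws between reliability and importance can be illustrated in a few lines (my sketch, not from the comment): a one-sample z-statistic grows with the square root of the sample size, while Cohen’s d, the effect size, does not – so a trivially small effect becomes “highly significant” once the sample is big enough.

```python
import math

def z_and_d(mean_diff, sd, n):
    """z answers 'is the difference reliably nonzero?';
    Cohen's d answers 'is it big enough to matter?'."""
    d = mean_diff / sd       # effect size: independent of sample size
    z = d * math.sqrt(n)     # one-sample z (known sd): grows with sqrt(n)
    return z, d

# The same tiny difference (0.01 standard deviations) at three sample sizes:
for n in (100, 10_000, 1_000_000):
    z, d = z_and_d(0.01, 1.0, n)
    print(f"n={n:>9,}: z = {z:5.2f}, d = {d}")
```

At a million observations the z-statistic reaches 10 (significant by any threshold), yet the effect size is still a negligible 0.01 – which is exactly why effect size and variance-accounted-for have to be reported alongside p-values.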