Psychologist; Assistant Professor of Marketing, Stern School of Business, NYU; Author, Drunk Tank Pink: And Other Unexpected Forces that Shape How We Think, Feel, and Behave.
Replication As a Safety Net

In 1984, New York became the first state to introduce mandatory seat belt laws. Most of the remaining states applauded the new legislation and followed suit in the 1980s and 1990s, but a small collection of researchers worried that seat belts might paradoxically license people to drive more carelessly. They believed that people drove carefully because they worried they might be seriously injured in an accident; if seat belts diminished the risk of serious injury, they would also diminish the incentive to drive carefully.

There's a danger that social scientists will rely too heavily on the concept of replication, in the same way that potentially careless drivers rely too heavily on seat belts. When we examine new hypotheses, we tolerate the possibility that approximately one in every twenty results is a fluke. If we run the experiment two or three times, and the result replicates, it's safer to assume that the original result was reliable. Students are taught that untruths will be revealed in time through replication: flimsy results will wither under empirical scrutiny, so the enduring scientific record will reflect only those results that are robust and replicable. Unfortunately, this appealing theory crumbles in practice; just as some drivers rely too heavily on the protection of seat belts, so psychological scientists rely too heavily on the protection of replication.
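To put rough numbers on that intuition, here is a minimal sketch. It assumes that the one-in-twenty tolerance corresponds to a 0.05 threshold, that the effect being tested does not actually exist, and that each run of the experiment is independent. The function name and parameters are illustrative only.

```python
def chance_all_runs_are_flukes(alpha: float, runs: int) -> float:
    """Probability that a nonexistent effect passes `runs` independent tests.

    Assumption: each test has the same false-positive rate `alpha`
    and the tests are independent of one another.
    """
    return alpha ** runs


# One run, then two and three runs that all "succeed" by chance alone.
for runs in (1, 2, 3):
    print(runs, chance_all_runs_are_flukes(0.05, runs))
```

Under these assumptions the chance of a fluke falls from one in twenty for a single run to one in four hundred for two runs and one in eight thousand for three. The arithmetic holds only when every run is counted, including the ones that fail.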

As the seat belt illustration suggests, the problem begins when researchers behave carelessly because they rely too heavily on the theory of replication. Each experiment becomes less valuable and less definitive, so the incentives favor running many unpolished experiments over striving to craft the cleanest, most informative one. Journals are similarly more inclined to publish marginally questionable research on the basis that other researchers will test the reliability of the effect in future research.

In fact, only a limited sample of high-profile findings is replicated, because generally there's less scientific glory in overturning an old finding than in proposing a new one. With limited time and resources, researchers tend to focus on testing new ideas rather than on questioning old ones. The scientific record features thousands of preliminary findings, but relatively few thorough replications, rejoinders, and reconsiderations of those early results.

Without a graveyard of failed effects, it's very difficult to distinguish robust results from brittle flukes. The gravest consequence of our over-reliance on the theory of replication, the notion that researchers will unmask empirical untruths, is that we overestimate the reliability of the many effects that have yet to be re-examined. Replication is a critical component of the scientific process, but the illusion of replication as an antidote to flimsy effects deserves to be shattered.