Assistant Professor, Cognitive, Linguistic and Psychological Sciences, Brown University
Big Effects Have Big Explanations

Many scientists are seduced by a two-step path to success: First identify a big effect and then find the explanation for it. Although not often discussed, there is an implicit theory behind this approach. The theory is that big effects have big explanations. This is critical because scientists are interested in the explanations, not in the effects—Newton is famous not for showing that apples fall, but for explaining why. So, if the implicit theory is wrong, then a lot of people are barking up the wrong trees.

There is, of course, an alternative and very plausible source of big effects: Many small explanations interacting. As it happens, this alternative is worse than the wrong tree—it's a near-hopeless tree. The wrong tree would simply yield a disappointingly small explanation. But the hopeless tree has so many explanations tangled in knotted branches that extraordinary effort is required to obtain any fruit at all.

So, do big effects tend to have big explanations, or many explanations? There is probably no single, simple and uniformly correct answer to this question. (It's a hopeless tree!) But, we can use a simple model to help make an educated guess.

Suppose that the world is composed of three kinds of things. There are levers we can pull. Pulling these levers causes observable effects: Lights flash, bells ring, and apples fall. Finally, there is a hidden layer of causal forces, the explanations, that connect the levers to their effects.

In order to explore this toy world I simulated it on my laptop. First, I created one thousand levers. Each lever activated between one and five hidden mechanisms (200 levers activated just one mechanism each, another 200 activated two, etc.). In my simulation, each mechanism was simply a number drawn from a normal distribution with a mean of zero. Then, the hidden mechanisms activated by each lever were summed to produce an observable effect. So, 200 of the levers produced effects equal to a single number drawn from a normal distribution, another 200 levers produced effects equal to the sum of two such numbers, and so forth.
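A minimal sketch of a simulation along these lines, for readers who want to try it. The essay specifies only that each mechanism is drawn from a normal distribution with mean zero; the unit standard deviation, the independence of draws across levers, the fixed seed, and the name `simulate_levers` are all assumptions made here for illustration.

```python
import random

def simulate_levers(levers_per_group=200, max_mechanisms=5, sigma=1.0, seed=0):
    """Return a list of (effect, n_mechanisms) pairs, one per lever.

    sigma and seed are illustrative assumptions, not values from the essay.
    """
    rng = random.Random(seed)
    levers = []
    for n_mechanisms in range(1, max_mechanisms + 1):
        for _ in range(levers_per_group):
            # Each hidden mechanism is one draw from a zero-mean normal distribution.
            mechanisms = [rng.gauss(0.0, sigma) for _ in range(n_mechanisms)]
            # The observable effect is the sum of the mechanisms the lever activates.
            levers.append((sum(mechanisms), n_mechanisms))
    return levers

levers = simulate_levers()
print(len(levers))  # 5 groups of 200 levers each
```

Each pair keeps the number of mechanisms next to the effect it produced, so the effects can later be ranked by size and inspected for how many explanations lie behind them.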

After this was done, I had a list of 1,000 effects of varying size. Some were large (very negative, or very positive), while others were small (close to zero). First I looked at the 50 smallest effects, curious to see how many of them resulted from a single, isolated mechanism: 11 out of 50. Then I checked how many of them were the result of five mechanisms, summed together: 6 out of 50. On the whole, the very smallest effects tended to have fewer explanations.
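The ranking step can be sketched as follows. Effect size is taken here to mean absolute value, which matches the description of large effects as "very negative, or very positive". The helper name `tally_by_mechanism_count`, the unit standard deviation, and the seed are assumptions, so the counts it prints will generally differ from the ones reported in the essay.

```python
import random
from collections import Counter

def tally_by_mechanism_count(levers, k=50, largest=False):
    """Tally the k smallest (or largest) effects by how many mechanisms produced each.

    levers is a list of (effect, n_mechanisms) pairs; size means absolute value.
    """
    ranked = sorted(levers, key=lambda pair: abs(pair[0]), reverse=largest)
    return Counter(n for _, n in ranked[:k])

# Toy data: 200 levers for each mechanism count from 1 to 5,
# each mechanism drawn from N(0, 1) (the unit variance is an assumption).
rng = random.Random(0)
levers = [(sum(rng.gauss(0.0, 1.0) for _ in range(n)), n)
          for n in range(1, 6) for _ in range(200)]

smallest = tally_by_mechanism_count(levers, k=50)
largest = tally_by_mechanism_count(levers, k=50, largest=True)
print("smallest 50:", dict(sorted(smallest.items())))
print("largest 50: ", dict(sorted(largest.items())))
```

Passing `largest=True` ranks from the other end, so the same helper serves for both the smallest and the largest effects.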

Next I looked at the 50 largest effects. These effects were much larger, about 100 times larger on average. But they also tended to have many more explanations. Among those 50 largest effects, 25 had five explanations, but not even one had a single explanation. The first such single-explanation effect was ranked 103 in size. (These examples help to make my point tangible, but its essence can be captured more succinctly: The standard deviation of the sum of two uncorrelated random variables is greater than the standard deviation of either individually.)
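The parenthetical point can be written out in a few lines. The notation is introduced here for illustration and is not the essay's: X and Y are two mechanisms, S_k is the effect of a lever that sums k independent mechanisms, and σ is their common standard deviation.

```latex
\begin{align*}
\operatorname{Var}(X + Y)
  &= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y) \\
  &= \operatorname{Var}(X) + \operatorname{Var}(Y)
  && \text{if } \operatorname{Cov}(X, Y) = 0, \\[4pt]
\operatorname{sd}(X + Y)
  &= \sqrt{\operatorname{Var}(X) + \operatorname{Var}(Y)}
  \;>\; \max\{\operatorname{sd}(X), \operatorname{sd}(Y)\}
  && \text{if both variances are nonzero}, \\[4pt]
\operatorname{sd}(S_k)
  &= \sqrt{k}\,\sigma
  && \text{for } S_k = X_1 + \dots + X_k .
\end{align*}
```

Under these assumptions a five-mechanism effect has a standard deviation of √5 ≈ 2.24 times that of a single-mechanism effect, which is why the extremes of the ranked list fill up with many-mechanism levers.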

So, if a scientist's exclusive goal were simplicity, then in my toy world she ought to avoid the very biggest effects and instead pursue the smallest ones. Yet, she might feel cheated because this method would only identify explanations of tremendously little influence. As a crude method of balancing simplicity (few explanations) against influence (big explanations), I computed a sort of "expected value" of experimentation for different effect sizes: The probability of finding a one-cause-effect, multiplied by the size of the effect in question. As you might guess, the highest expected values tend to fall towards the middle of the range of effect sizes. Balance, it seems, finds a soul mate in modesty.
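One way this "expected value" could be computed is sketched below. The essay does not say how effect sizes were grouped, so the equal-count bins, the use of mean absolute effect as "the size of the effect", the unit standard deviation, the seed, and the name `expected_value_by_bin` are all assumptions made for illustration.

```python
import random

def expected_value_by_bin(levers, n_bins=20):
    """Split levers into equal-count bins by |effect| and score each bin.

    Returns one (mean |effect|, P(single mechanism), product) tuple per bin,
    ordered from the smallest effects to the largest. Any remainder left over
    when len(levers) is not divisible by n_bins is dropped.
    """
    ranked = sorted(levers, key=lambda pair: abs(pair[0]))
    bin_size = len(ranked) // n_bins
    if bin_size == 0:
        raise ValueError("need at least one lever per bin")
    rows = []
    for b in range(n_bins):
        chunk = ranked[b * bin_size:(b + 1) * bin_size]
        mean_size = sum(abs(effect) for effect, _ in chunk) / len(chunk)
        p_single = sum(1 for _, n in chunk if n == 1) / len(chunk)
        rows.append((mean_size, p_single, p_single * mean_size))
    return rows

# Toy data: 200 levers for each mechanism count from 1 to 5,
# each mechanism drawn from N(0, 1) (the unit variance is an assumption).
rng = random.Random(0)
levers = [(sum(rng.gauss(0.0, 1.0) for _ in range(n)), n)
          for n in range(1, 6) for _ in range(200)]

for mean_size, p_single, value in expected_value_by_bin(levers):
    print(f"{mean_size:6.3f}  {p_single:5.2f}  {value:6.3f}")
```

The last column is the product described in the text. Where it peaks depends on the bin width and the random seed, so a single run is best read as a rough illustration rather than a measurement.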

Now, there are some caveats to my back-of-the-envelope calculations. Most scientists are capable of working out causal mechanisms that have more than one dimension. (Some can even handle five!) Also, the actual causal mechanisms that scientists investigate are far more complicated than my model allows for. One explanation may be related to many effects, multiple explanations may combine with each other nonlinearly, explanations may be correlated, and so forth.

Still, there is value in retiring the implicit theory that we should pursue the largest effects most doggedly. I suspect that every scientist has her own favorite example of the perils of this theory. In my field, lakes of ink have been spilled attempting to find "the" explanation for why people consider it acceptable to redirect a speeding trolley away from five people and towards one, but not acceptable to hurl one person in front of a trolley in order to stop it from hitting five. This case is alluring because the effect is huge and its explanation is not at all obvious. With the benefit of hindsight, however, there is considerable agreement that it does not have just one explanation. In fact, we have tended to learn more from studying much smaller effects with a key benefit: a sole cause.

It is natural to praise research that delivers large effects and the theories that purport to explain them. And this praise is often justified—not least because the world has large problems that demand ambitious scientific solutions. Yet science can advance only at the rate of its best explanations. Often, the most elegant ones are clothed around effects of modest proportions.