Smart people often manage to avoid the cognitive errors that bedevil less well-endowed minds. But there are some kinds of foolishness that seem only to afflict the very intelligent. Worrying about the dangers of unfriendly AI is a prime example. A preoccupation with the risks of superintelligent machines is the smart person’s Kool-Aid.
This is not to say that superintelligent machines pose no danger to humanity. It is simply that there are many other more pressing and more probable risks facing us this century. People who worry about unfriendly AI tend to argue that the other risks are already the subject of much discussion, and that even if the probability of being wiped out by superintelligent machines is very low, it is surely wise to allocate some brainpower to preventing such an event, given the existential nature of the threat.
Not coincidentally, the problem with this argument was first identified by some of its most vocal proponents. It involves a fallacy that has been termed "Pascal’s mugging," by analogy with Pascal’s famous wager. A mugger approaches Pascal and proposes a deal: in exchange for the philosopher’s wallet, the mugger will give him back double the amount of money the following day. Pascal demurs. The mugger then offers progressively greater rewards, pointing out that for any low probability of being able to pay back a large amount of money (or pure utility) there exists a finite amount that makes it rational to take the bet—and a rational person must surely admit there is at least some small chance that such a deal is possible. Finally convinced, Pascal gives the mugger his wallet.
This thought experiment exposes a weakness in classical decision theory. If we simply calculate utilities in the classical manner, it seems there is no way round the problem; a rational Pascal must hand over his wallet. By analogy, even if there is only a small chance of unfriendly AI, or a small chance of preventing it, it can be rational to invest at least some resources in tackling this threat.
It is easy to make the sums come out right, especially if you invent billions of imaginary future people (perhaps existing only in software—a minor detail) who live for billions of years, and are capable of far greater levels of happiness than the pathetic flesh and blood humans alive today. When such vast amounts of utility are at stake, who could begrudge spending a few million dollars to safeguard it, even when the chances of success are tiny?
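The arithmetic behind this move is easy to sketch. The following toy calculation is purely illustrative; every probability and payoff in it is invented for the example, not drawn from any actual estimate:

```python
# Toy illustration of the "Pascal's mugging" arithmetic: a vanishingly
# small probability can be swamped by a sufficiently large invented payoff.
# All numbers below are made up for the sake of the example.

def expected_utility(probability, payoff, cost):
    """Classical expected utility of taking a bet: p * payoff - cost."""
    return probability * payoff - cost

# A mundane wager: a 50% chance of winning 200 units for a 100-unit stake.
print(expected_utility(0.5, 200, 100))  # prints 0.0 (break-even)

# The mugging: a one-in-a-trillion chance of securing 10^21 units of
# utility (billions of imaginary future people, each blissfully happy),
# at a cost of a few million. The bet still comes out "rational".
print(expected_utility(1e-12, 1e21, 4e6) > 0)  # prints True
```

The point of the sketch is that under naive expected-utility maximization, the payoff term can always be inflated faster than the probability shrinks, so the sum can be made to favor the mugger at will.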
Why do some otherwise very smart people fall for this sleight of hand? I think it is because it panders to their narcissism. To regard oneself as one of a select few far-sighted thinkers who might turn out to be the saviors of mankind must be very rewarding. But the argument also has a very material benefit: it provides some of those who advance it with a lucrative income stream. For in the past few years they have managed to convince some very wealthy benefactors not only that the risk of unfriendly AI is real, but also that they are the people best placed to mitigate it. The result is a clutch of new organizations that divert philanthropy away from more deserving causes. It is worth noting, for example, that GiveWell—a non-profit that evaluates the cost-effectiveness of organizations that rely on donations—refuses to endorse any of these self-proclaimed guardians of the galaxy.
But whenever an argument becomes fashionable, it is always worth asking the vital question—Cui bono? Who benefits, materially speaking, from the growing credence in this line of thinking? One need not be particularly skeptical to discern the economic interests at stake. In other words, beware not so much of machines that think, but of their self-appointed masters.