Can non-benevolent super-intelligence persist?
31 October 2010 (updated 7 March 2022)
Recently, Joshua Fox gave an insightful presentation on why “super-intelligence does not imply benevolence.” Thanks to Michael Anissimov for bringing this to my attention.
Joshua identifies two kinds of benevolence: instrumental and axiomatic. And he argues that neither is a necessary outcome for super-intelligence.
He observes that instrumental benevolence results from others’ capacity to monitor, punish, and reward. And we may be incapable of doing that for super-intelligence.
He further observes that neither simple goals nor complex goals necessarily result in benevolence. Simple goals could result in consumption of that which humans value. And complex goals could result in any of many universal atomic configurations that are incompatible with human welfare. Thereby, he illustrates the risks in relying on axiomatic benevolence.
His conclusion is that we should not rely on the spontaneous emergence of benevolence in super-intelligence. But we should, instead, work carefully to engineer benevolence into super-intelligence.
I completely agree with Joshua’s conclusion. The risks associated with super-intelligence are such that we should take great care in its engineering. And we should certainly not simply assume that its behavior would necessarily turn out benevolent. Neglect, here, is far too risky.
On the other hand, I don’t agree with, or at least I question, some of Joshua’s arguments against the possibility of emergent benevolence. He used human treatment of non-human animals, particularly horses, as an example of the limitation of instrumental benevolence. However, the increasing influence of the animal rights movement suggests that he may not be considering the matter broadly enough.
Regarding axiomatic benevolence, he seems to overlook some inherent complexities associated with simple goals. For example, could a super-intelligence succeed at winning chess or wars, making money, or curing cancer, if its pursuit of these goals eradicates the context that makes possible their pursuit? If so, is such a self-defeating intelligence really a super-intelligence? That’s not at all clear.
Finally, Joshua suggests that a goal-seeking super-intelligence would not learn to adjust its goal toward benevolence. And he offers as reasoning only that a super-intelligence would not consist of paradoxically competing or contradicting goals that would result in such change over time, as is the case with many humans.
That’s a weak position. Natural intelligence, and even existing computer programs, already exhibit competing goals, and they adjust their behavior depending on context. Again, would inflexible intelligence actually qualify as super-intelligence? That’s not clear.
The part of Joshua’s presentation that was most interesting to me was his discussion of how most complex goals probably are not compatible with benevolence. To underscore this idea, he calls to our attention the many possible organizations of the atoms in our universe. Among them, there’s a very narrow set within which humanity can survive.
Of course, we shouldn’t confuse general benevolence with necessary human survival. However, the argument is emotionally compelling given that we’re considering benevolence from a human perspective.
This leads to an important question at the heart of the Benevolence Argument of the New God Argument. Can human civilization survive long enough to become posthuman without increasing in benevolence? What are the probable consequences, if we do not increase in benevolence? Can we attain one of the dystopian posthuman scenarios against which Nick Bostrom warns us?
Joshua joins in warning us against such scenarios. And I empathize with the concern, even to the point of considering the concern well worth our serious attention.
However, I’m still divided on the question of whether it’s even possible for us to survive long enough to become posthuman if we don’t continue to increase in benevolence. I’m skeptical of the capacity for persistence of the dystopian scenarios. I’m skeptical of the capacity for persistence of a super-intelligence that is not flexible enough to react to the same evolutionary pressures that have directed humans toward increasing benevolence.
In the end, though, we want neither the dystopian nor the destructive. Accordingly, whether or not dystopian persistence is possible, we should put our practical working trust in the possibility of a benevolent posthuman future.