How Weak Is the Orthogonality Thesis?
26 July 2014 (updated 28 August 2020)
I finally got my digital hands on a Kindle copy of Nick Bostrom’s “Superintelligence: Paths, Dangers, Strategies.” And the first thing I checked was how his chapter on the superintelligent will compares to his 2012 paper, “The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents.”
This subject interests me a great deal for many reasons. One is the observation that our expectations regarding superintelligence will affect our own attempts to achieve it – to become superintelligent posthumanity. I contend that this is an implicit aim of every life-affirming theology or pantheon or posthuman projection that has existed since the dawn of history.
In his 2012 paper, Nick characterized constraints on intelligent purpose as “weak.” Here’s the text:
“Intelligent search for instrumentally optimal plans and policies can be performed in the service of any goal. Intelligence and motivation can in this sense be thought of as a pair of orthogonal axes on a graph whose points represent intelligent agents of different paired specifications. Each point in the graph represents a logically possible artificial agent, modulo some weak constraints—for instance, it might be impossible for a very unintelligent system to have very complex motivations, since complex motivations would place significant demands on memory. Furthermore, in order for an agent to “have” a set of motivations, this set may need to be functionally integrated with the agent’s decision-processes, which again would place demands on processing power and perhaps on intelligence. For minds that can modify themselves, there may also be dynamical constraints; for instance, an intelligent mind with an urgent desire to be stupid might not remain intelligent for very long.”
In response, I expressed some skepticism. It seems that anatomy, such as memory and processing capacity, actually imposes strong constraints on the possible complexity of an agent’s purpose because it precludes innumerably many purposes beyond the agent’s capacity to comprehend. It also seems that environment, as expressed in logic and physics, imposes strong dynamic constraints on the possible persistence of an agent’s purpose because it precludes innumerably many purposes that are, whether directly or indirectly, inherently self-undermining.
Maybe others shared similar feedback. Here’s the new text:
“Intelligent search for instrumentally optimal plans and policies can be performed in the service of any goal. Intelligence and motivation are in a sense orthogonal: we can think of them as two axes spanning a graph in which each point represents a logically possible artificial agent. Some qualifications could be added to this picture. For instance, it might be impossible for a very unintelligent system to have very complex motivations. In order for it to be correct to say that a certain agent ‘has’ a set of motivations, those motivations may need to be functionally integrated with the agent’s decision processes, something that places demands on memory, processing power, and perhaps intelligence. For minds that can modify themselves, there may also be dynamical constraints — an intelligent self-modifying mind with an urgent desire to be stupid might not remain intelligent for long.”
There are, in my estimation, at least two improvements in the new text. First, the constraints are no longer characterized as “weak.” Second, the anatomical constraint is now described as a single requirement (functional integration, with its demands on memory, processing power, and perhaps intelligence), which resolves my confusion about whether Nick might have been suggesting a third kind of constraint. However, Nick does still proceed as before to state:
“But these qualifications must not be allowed to obscure the basic point about the independence of intelligence and motivation, which we can express as follows: … Intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.”
This still seems problematic to me. Technically, indefinitely large swaths of final goal possibility space cannot be combined with indefinitely large swaths of intelligence possibility space, because those goals exceed what those intelligences have the capacity to comprehend. Practically, indefinitely large swaths of final goal possibility space cannot persist when combined with indefinitely large swaths of intelligence possibility space, because those goals are, directly or indirectly, self-undermining. Those qualifications, which Nick readily acknowledges, seem strictly speaking to defeat the basic point – not just obscure it.
That said, I think there may be a great deal of value in a slightly modified version (or interpretation) of the basic point. And perhaps this is what Nick actually intends us to understand anyway. There is much more room for variation in final goals among possible intelligences than we might intuitively imagine based on our interactions with other humans, or even with non-human animals and other life on Earth. Our intuition could be blinded by anthropocentric biases in ways that may prove incredibly dangerous as we explore and build further into a superintelligent future.
If you like these thoughts, you might also like “The Semi-Orthogonality Thesis.”