The ambition of this essay is to assert, define, provide examples of, and above all else, promote debate around the following scientific hypothesis:
The objective of maximizing fitness in a given context is sufficient to drive all of what we mean by intelligent behavior.
The above hypothesis stands in stark contrast to the driving philosophies of “good old fashioned AI,” which typically rely on instilling expert knowledge and precise problem formulations into rules that could be followed by machine agents. The fitness hypothesis, on the other hand, is a natural progression of what we have learned in the past few decades in the era of modern artificial intelligence. Experience teaches us time and time again that carefully formulated programs and problem specifications fail to match the performance of more general learning rules and simplified objectives (e.g. see Sutton’s “Bitter Lesson”), given sufficient computational resources.
Within the framework of reinforcement learning, learning agents at a sufficiently grand scale, seeking only to maximize simple scalar rewards, consistently outperform hand-coded expert programs as well as humans. Often enough that it has become almost a hallmark of the superiority of generalized learning at scale, deep reinforcement learning agents find solutions that shock, bewilder, and offend their creators, even discovering strategies so creative that they would have been rejected had they been developed in any other way. But accumulated rewards don’t lie. A strategy that yields significantly greater rewards, no matter how ugly or dangerous, is by definition a better solution, and therefore the product of superior problem-solving intelligence, as applied to the problem of maximizing reward.
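The reward-maximization framing above can be made concrete with a toy example. The sketch below is illustrative only: the bandit arms, their payout probabilities, and the epsilon-greedy rule are assumptions chosen for brevity, not a real benchmark or any particular system's design. The point is that the agent is given nothing but a scalar reward, yet reliably identifies the best strategy.

```python
import random

def run_bandit(payouts, steps=5000, epsilon=0.1, seed=0):
    """Learn which arm pays best purely from accumulated scalar reward."""
    rng = random.Random(seed)
    counts = [0] * len(payouts)
    values = [0.0] * len(payouts)  # running mean reward observed per arm
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:  # occasionally explore a random arm
            arm = rng.randrange(len(payouts))
        else:                       # otherwise exploit the best estimate so far
            arm = max(range(len(payouts)), key=lambda a: values[a])
        reward = 1.0 if rng.random() < payouts[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
        total += reward
    return values, total

# The agent is never told that the last arm pays best (payout probabilities
# here are made up); the reward signal alone is enough to reveal it.
values, total = run_bandit([0.2, 0.5, 0.8])
```

Nothing in the loop encodes expert knowledge about the problem; the "intelligence" of preferring the best arm emerges entirely from the pressure of the reward signal.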
Hiding just below the surface of the generality of reinforcement learning, however, is an even more powerful idea, and a natural progression in the quest for artificial general intelligence. This is the fitness hypothesis, a proven and promising route to general intelligence. It’s a simpler, and thus likely better, alternative to both modern and good old-fashioned approaches to AI, and it even has certain advantages over the reward hypothesis. Under the fitness hypothesis, we can do away with objective functions and reduce all problems to a single directive, simple to state, which can be thought of as the zeroth law of intelligence: don’t stop existing.
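The contrast with reward maximization can also be sketched in code. In the toy simulation below there is no objective function at all: agents are bit-strings, the environment simply kills mismatched ones with some probability, and survivors reproduce with copying errors. The target pattern, survival rule, population size, and mutation rate are all illustrative assumptions, not a claim about how such a system would actually be built.

```python
import random

TARGET = [1, 0, 1, 1, 0, 1, 0, 0]  # whatever this toy environment happens to favor

def survives(agent, rng, harshness=1.5):
    """An agent persists with probability that rises with its fit to the environment."""
    match = sum(a == t for a, t in zip(agent, TARGET)) / len(TARGET)
    return rng.random() < match ** harshness

def evolve(generations=200, pop_size=100, mutation=0.05, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        # The only directive: don't stop existing.
        alive = [a for a in pop if survives(a, rng)]
        if not alive:  # edge case: if everyone dies, keep one agent at random
            alive = [pop[rng.randrange(pop_size)]]
        # Survivors reproduce, with copying errors, back up to pop_size.
        pop = [
            [bit ^ (rng.random() < mutation) for bit in rng.choice(alive)]
            for _ in range(pop_size)
        ]
    return pop

pop = evolve()
mean_match = sum(
    sum(a == t for a, t in zip(agent, TARGET)) for agent in pop
) / (len(pop) * len(TARGET))
```

No agent ever observes a reward, yet the population comes to embody the structure of its environment, because the only strategies we ever get to look at are the ones that kept existing.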
We also suggest that agents and systems that survive through trial and error can eventually come to exhibit most, if not all, facets of intelligence, including social intelligence, cunning, creativity, language, and a sense of humor. Therefore, super-fit evolutionary agents and systems of agents could represent a powerful solution to artificial general intelligence.
We won’t need rewards where we’re going.
To fully understand the fitness hypothesis and its ramifications, we’ll need to clarify exactly what we mean by “intelligent behavior,” “fitness,” and, of course, “a given context.”
- Intelligence can be described as the ability of individuals and groups to take actions that best solve the problem of maximizing their survival.
- Fitness is defined as the measure of the ability of individuals and groups to survive in an environment.
It is important to realize that fitness itself, and thus the definition of intelligence, can change drastically across different environments. The pinnacle of intelligence was once occupied by a variety of survival strategies employed by the dinosaurs. These strategies exemplified extraordinary fitness right up until the point where they didn’t, when the large and specialized body plans of non-avian dinosaurs proved to be not so smart after all in the context of the massive environmental disruption of the K–T impact. A new standard for intelligence arose in the environmental context that followed, as small mammals became big mammals and big-brained mammals discovered how to use fire.
Now, in a world where evolutionary selection is determined by the ability to co-exist with those big-brained mammals (humans), yet another new type of intelligence has emerged. This new, experimental version of intelligence being selected for is that of machine agents, under a highly variable selective pressure subject to the cultural whimsy of human research and engineering. While they may not always seem that smart, they are sure to be intelligent so long as their behavior is favored by the selective pressures of their environment.