A new technique to make A.I. evolve
In a new paper published in the scientific journal Nature Communications, A.I. researchers at Stanford University present a new approach to making A.I. evolve. The technique uses a virtual environment and reinforcement learning to create virtual agents that can evolve both their physical structure and their learning capacities. These findings could be significant for the future of A.I. and robotics.
In nature, body and brain evolve together. Across many generations, every animal species has gone through countless cycles of mutation to maintain the functions it needs in its environment.
All these species descend from the first lifeforms that appeared on Earth; the selection pressures exerted by the environment drove their descendants to evolve in many different directions.
However, replicating evolution is extremely difficult, especially for an A.I. system, because it would have to search a very large space of possible morphologies, a computationally very expensive operation.
A.I. researchers use several shortcuts and predesigned features to overcome some of these challenges. For example, the Lamarckian shortcut borrows the notion that an organism can pass on to its offspring characteristics it acquired through use or disuse during its lifetime: A.I. agents pass their learned parameters directly to their descendants, unlike in Darwinian evolution, where learned parameters are not inherited. Another approach is to train different A.I. subsystems separately (vision, locomotion, language, etc.) and then assemble them into a final A.I. or robotic system. Although these approaches speed up the process and reduce the cost of training and evolving A.I. agents, they also limit the flexibility and variety of the results that can be achieved.
In their new work, the researchers aim to take a step closer to the real evolutionary process while keeping costs as low as possible. “Our goal is to elucidate some principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control”, they write in the paper.
Their new framework is called Deep Evolutionary Reinforcement Learning (DERL). Within its lifetime, each agent uses deep reinforcement learning to acquire the skills required to maximize its goals. Across lifetimes, DERL uses Darwinian evolution to search the morphological space for optimal solutions: when a new generation of A.I. agents is created, the agents inherit only the physical and architectural traits of their parents (along with slight mutations), not their learned parameters.
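This Darwinian outer loop can be sketched in a few lines of Python. This is only a minimal illustration of the idea: the agent structure, the stand-in fitness function, and the selection scheme are placeholders of my own, not the paper's actual implementation.

```python
import random

def train_lifetime(agent, env_difficulty=1.0):
    """Stand-in for deep RL training: fills in the agent's learned
    parameters and returns a toy fitness score based on its morphology."""
    agent["learned_params"] = [random.random() for _ in range(4)]
    return sum(agent["genome"]) * env_difficulty

def mutate(genome):
    """Slightly perturb the parent's morphology genome."""
    return [g + random.gauss(0, 0.1) for g in genome]

def reproduce(parent):
    """Darwinian inheritance: the child gets a mutated copy of the
    morphology genome but starts life with NO learned parameters."""
    return {"genome": mutate(parent["genome"]), "learned_params": None}

def evolve(pop_size=8, generations=5, seed=0):
    random.seed(seed)
    population = [{"genome": [random.random() for _ in range(3)],
                   "learned_params": None} for _ in range(pop_size)]
    for _ in range(generations):
        # Each agent learns during its lifetime, then is ranked by fitness.
        scored = sorted(population, key=train_lifetime, reverse=True)
        survivors = scored[: pop_size // 2]          # selection
        population = survivors + [reproduce(p) for p in survivors]
    return population
```

The key detail is in `reproduce`: learned parameters are deliberately discarded, so every new agent must relearn its skills from scratch with whatever body it inherited.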
To test the framework, the researchers used MuJoCo, a virtual environment that provides highly accurate rigid-body physics simulation, together with a design space called UNIversal aniMAL (UNIMAL). The aim is to create morphologies that learn locomotion and object-manipulation tasks in a variety of terrains.
Each agent is defined by a genotype that encodes its limbs and joints. The direct descendant of an agent inherits the parent’s genotype and undergoes mutations that can create new limbs, remove existing limbs, or make small modifications to their properties, such as a limb’s size or degrees of freedom.
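A toy version of these limb-level mutations might look like the following. The genome layout and operator names here are illustrative assumptions, not UNIMAL's actual encoding:

```python
import random

def random_limb():
    """An illustrative limb: a size and a number of degrees of freedom."""
    return {"size": round(random.uniform(0.1, 1.0), 2),
            "dof": random.choice([1, 2, 3])}

def mutate_genome(genome):
    """Apply one of three mutation operators to a copy of the genome:
    grow a new limb, remove an existing limb, or tweak a limb's size."""
    child = [dict(limb) for limb in genome]           # copy, keep parent intact
    op = random.choice(["grow", "shrink", "tweak"])
    if op == "grow":
        child.append(random_limb())                   # create a new limb
    elif op == "shrink" and len(child) > 1:
        child.pop(random.randrange(len(child)))       # remove a limb
    else:                                             # modify a property
        limb = random.choice(child)
        limb["size"] = round(limb["size"] * random.uniform(0.8, 1.2), 2)
    return child
```

Because mutation operates on a copy, the parent's genotype is untouched; only the descendant carries the change into the next generation.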
Each agent is trained with reinforcement learning to maximize rewards in various environments. Agents whose physical structures are better suited for traversing terrain learn faster to use their limbs for moving around.
To test the system’s results, agents are evaluated in three types of terrain: flat terrain (FT), variable terrain (VT), and variable terrain with modifiable objects (MVT). The flat terrain puts the least selection pressure on the agents’ morphology. The variable terrains, on the other hand, force the agents to develop a more versatile physical structure that can climb slopes and move around obstacles. The MVT variant adds the challenge of requiring the agents to manipulate objects to achieve their goals.
Other approaches to evolutionary A.I. tend to converge on a single solution because new agents directly inherit both the physique and the learned parameters of their parents. In DERL, only morphological data is passed on to descendants, so the system ends up creating a diverse set of successful morphologies, including bipeds, tripeds, and quadrupeds with and without arms.
At the same time, the system shows traits of the Baldwin effect, which suggests that agents that learn faster are more likely to reproduce and pass on their genes to the next generation.
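A hypothetical toy model can show why this happens: if learning speed itself is heritable, and fitness is the performance an agent reaches within a fixed lifetime training budget, then faster learners win selection and the trait spreads. Everything below (the "learnability" gene, the numbers, the selection scheme) is my own illustration, not the paper's model.

```python
import random

def lifetime_performance(learnability, budget=10):
    """Performance after a fixed training budget: it approaches 1.0
    faster when the heritable learnability is higher."""
    return 1.0 - (1.0 - learnability) ** budget

def baldwin_selection(pop_size=20, generations=10, seed=0):
    """Select on performance-within-a-lifetime; watch learnability rise."""
    random.seed(seed)
    genes = [random.uniform(0.01, 0.2) for _ in range(pop_size)]
    for _ in range(generations):
        # Fast learners reach higher performance in the same budget...
        genes.sort(key=lifetime_performance, reverse=True)
        winners = genes[: pop_size // 2]
        # ...so their (slightly mutated) learnability gene is inherited.
        genes = winners + [max(0.0, min(1.0, g + random.gauss(0, 0.01)))
                           for g in winners]
    return genes
```

Running this, the average learnability of the evolved population ends up higher than that of the initial one: selection acting only on lifetime performance has indirectly improved the population's ability to learn.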
In conclusion, the DERL framework supports the hypothesis that more complex environments give rise to more intelligent agents. Evolved agents were tested across eight different tasks, including patrolling, escaping, manipulating objects, and exploration. In general, agents that evolved in variable terrains learned faster and performed better than A.I. agents that had only experienced flat terrain.
These findings seem to be in line with another hypothesis, put forward by DeepMind researchers, that a complex environment, a suitable reward structure, and reinforcement learning can eventually lead to the emergence of all kinds of intelligent behaviors.
The DERL environment depicts only a fraction of the complexities of the real world. In the future, the researchers plan to expand the range of evaluation tasks to better assess how agents can enhance their ability to learn human-relevant behaviors, which will have important implications for the future of A.I. and robotics.
It is not hard to imagine A.I. evolving to overtake human abilities; it may be just a matter of computing power. Will A.I. become so evolved as to be unintelligible to us? Will we then be easily deceived?