TY - GEN
T1 - Evolving inborn knowledge for fast adaptation in dynamic POMDP problems
AU - Ben-Iwhiwhu, Eseoghene
AU - Ladosz, Pawel
AU - Dick, Jeffery
AU - Chen, Wen Hua
AU - Pilly, Praveen
AU - Soltoggio, Andrea
N1 - Publisher Copyright:
© 2020 ACM.
PY - 2020/6/25
Y1 - 2020/6/25
N2 - Rapid online adaptation to changing tasks is an important problem in machine learning and, recently, a focus of meta-reinforcement learning. However, reinforcement learning (RL) algorithms struggle in POMDP environments because the state of the system, essential in an RL framework, is not always visible. Additionally, hand-designed meta-RL architectures may not include suitable computational structures for specific learning problems. By contrast, the evolution of online learning mechanisms can incorporate learning strategies into an agent that can (i) evolve memory when required and (ii) optimize adaptation speed for specific online learning problems. In this paper, we exploit the highly adaptive nature of neuromodulated neural networks to evolve a controller that uses the latent space of an autoencoder in a POMDP. Analysis of the evolved networks reveals the ability of the proposed algorithm to acquire inborn knowledge in a variety of aspects, such as the detection of cues that reveal implicit rewards and the evolution of location neurons that help with navigation. The integration of inborn knowledge and online plasticity enabled fast adaptation and better performance in comparison to some non-evolutionary meta-reinforcement learning algorithms. The algorithm also proved successful in the 3D gaming environment Malmo Minecraft.
KW - Adaptive agent
KW - Few-shot learning
KW - Hebbian learning
KW - Lifelong learning
KW - Neuroevolution
KW - Neuromodulation
KW - Self-modifying network
UR - http://www.scopus.com/inward/record.url?scp=85091742879&partnerID=8YFLogxK
U2 - 10.1145/3377930.3390214
DO - 10.1145/3377930.3390214
M3 - Conference article published in proceeding or book
AN - SCOPUS:85091742879
T3 - GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference
SP - 280
EP - 288
BT - GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference
PB - Association for Computing Machinery
T2 - 2020 Genetic and Evolutionary Computation Conference, GECCO 2020
Y2 - 8 July 2020 through 12 July 2020
ER -