From Reward to Histone: Combining Temporal-Difference Learning and Epigenetic Inheritance for Swarm's Coevolving Decision Making

2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (2020)

Abstract
Applying intelligence to a group of simple robots, known as swarm robots, has become an exciting technology for assisting or replacing humans in complex, dangerous and harsh missions. However, building a strategy for a swarm to thrive in a dynamic environment is challenging because control is decentralised and agents interact with one another. Decision making in a robotic task commonly takes place in sequential stages. By understanding the subsequent action-reaction process, a strategy for making optimal decisions in a given environment can be learnt. Hence, this paper discusses novel evolutionary-learning mechanisms for a swarm based on the concept of epigenetic inheritance, and proposes reinforcement evolutionary learning using epigenetic inheritance (RELEpi). The method uses reward, temporal difference and epigenetic inheritance to approximate optimal action and behaviour policies. It opens the possibility of combining reward-based learning and evolutionary methods as a stacked process in which a histone value is used rather than a fitness function. The formulation consists of methylation and epigenetic mechanisms, inspired by epigenome studies. The methylation process accumulates reward into the histone value of a gene. The epigenetic mechanisms allow genetic information to be mated along with its histone value.
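The two mechanisms named in the abstract can be illustrated with a toy sketch. This is not the RELEpi formulation, which the abstract does not spell out; it only assumes that "methylation" behaves like a TD(0)-style update of a per-gene histone value, and that "mating" is a per-locus crossover that copies the gene with the higher histone value together with that value. The names Gene, methylate and mate, and the parameters alpha and gamma, are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Gene:
    action: int           # the decision this gene encodes (hypothetical encoding)
    histone: float = 0.0  # accumulated value, used here in place of a fitness score


def methylate(gene, reward, next_histone, alpha=0.1, gamma=0.9):
    """Assumed TD(0)-style update: move the histone value toward
    reward + gamma * next_histone and return the TD error."""
    td_error = reward + gamma * next_histone - gene.histone
    gene.histone += alpha * td_error
    return td_error


def mate(parent_a, parent_b, rng=random):
    """Assumed per-locus crossover: the child inherits, at each locus, a copy of
    the parent gene with the higher histone value, histone value included."""
    child = []
    for a, b in zip(parent_a, parent_b):
        if a.histone > b.histone:
            src = a
        elif b.histone > a.histone:
            src = b
        else:
            src = rng.choice((a, b))  # break ties at random
        child.append(Gene(src.action, src.histone))
    return child
```

In this sketch the histone value plays the role a fitness score would play in a conventional evolutionary algorithm, but it is attached to individual genes and updated step by step from reward rather than computed once per individual.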
Keywords
Epigenetic,Swarm,Coevolving,Learning,Multi-Agent,Decision-Making