《计算机应用研究》|Application Research of Computers

基于GAED-MADDPG多智能体强化学习的协作策略研究

Research on collaborative strategy based on GAED-MADDPG multi-agent reinforcement learning

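Authors 邹长杰, 郑皎凌, 张中雷
Affiliation 成都信息工程大学 软件工程学院 (School of Software Engineering, Chengdu University of Information Technology), Chengdu 610225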
Article ID 1001-3695(2020)12-027-3656-06
DOI 10.19734/j.issn.1001-3695.2019.09.0546
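Abstract Most current multi-agent reinforcement learning algorithms adopt a framework of centralized learning and decentralized action. This framework suffers from excessively long convergence times and may fail to converge at all. To shorten the collective learning time of multiple agents, this paper proposes a group learning strategy for multi-agent systems. A recurrent neural network predicts the grouping matrix of the agents, and a mechanism for sharing experience among the agents within each group improves the team's learning efficiency. To compensate for the agents' inability to share information across groups, the paper introduces the concept of an information trace, which passes part of the global information among all agents. To strengthen the retention of good experience within a group, it proposes a birth-death process that delays the death time of the best-performing agents in the group. In the maze experiment, training time is 12% lower than with MADDPG; in the capture-the-flag experiment, it is 17% lower.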
Keywords reinforcement learning; group collaboration; deep learning; group wisdom
Funding National Natural Science Foundation of China (61772091, 61802035)
Article URL http://www.arocmag.com/article/01-2020-12-027.html
English title Research on collaborative strategy based on GAED-MADDPG multi-agent reinforcement learning
Authors (English) Zou Changjie, Zheng Jiaoling, Zhang Zhonglei
Affiliation (English) Software College, Chengdu University of Information Technology, Chengdu 610225, China
English abstract At present, most multi-agent reinforcement learning algorithms adopt a framework that is centralized in learning and decentralized in action. Such frameworks may take too long to converge or may not converge at all. To shorten the collective learning time of multiple agents, this paper proposes a novel multi-agent group learning strategy. It uses a recurrent neural network (RNN) to predict the grouping matrix of the agents, and the agents within a group share experience with one another, which improves learning efficiency within the group. Meanwhile, this paper proposes the concept of an information trace to remedy the loss of information sharing among agents that grouping causes. To strengthen the retention of excellent experience within the group, this paper proposes delaying the death time of excellent agents in the group. Finally, the results show that, compared with MADDPG, training time is reduced by 12% in the labyrinth experiment and by 17% in the capture-the-flag experiment.
English keywords reinforcement learning; group collaboration; deep learning; group wisdom
Received 2019/9/30
Revised 2019/11/23
Pages 3656-3661
CLC number TP
Document code A
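
The abstract describes sharing experience among the agents inside each group defined by a grouping matrix. The sketch below illustrates only that one idea and is not the authors' implementation: the grouping matrix is written by hand rather than predicted by an RNN, groups are assumed to be the connected components of a symmetric 0/1 matrix, and the names `groups_from_matrix` and `GroupedReplay` are hypothetical. The information trace and the birth-death process are not modeled.

```python
import random
from collections import deque


def groups_from_matrix(grouping):
    """Return agent groups as connected components of a symmetric 0/1 matrix.

    Assumption: grouping[i][j] == 1 means agents i and j belong together.
    """
    n = len(grouping)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, members = [start], []
        seen.add(start)
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(n):
                if grouping[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        groups.append(sorted(members))
    return groups


class GroupedReplay:
    """One replay buffer per group; agents in a group read each other's data."""

    def __init__(self, grouping, capacity=10000):
        self.groups = groups_from_matrix(grouping)
        # Map each agent index to the index of its group.
        self.group_of = {
            agent: g for g, members in enumerate(self.groups) for agent in members
        }
        self.buffers = [deque(maxlen=capacity) for _ in self.groups]

    def add(self, agent, transition):
        """Store a transition in the buffer shared by the agent's group."""
        self.buffers[self.group_of[agent]].append((agent, transition))

    def sample(self, agent, batch_size, rng=random):
        """Sample from the whole group's experience, not just the agent's own."""
        buf = list(self.buffers[self.group_of[agent]])
        return rng.sample(buf, min(batch_size, len(buf)))


if __name__ == "__main__":
    # Hand-written example: agents 0 and 1 form one group, agent 2 is alone.
    grouping = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
    replay = GroupedReplay(grouping)
    replay.add(0, "t0")
    replay.add(1, "t1")
    replay.add(2, "t2")
    print(replay.groups)
    print(sorted(replay.sample(0, 10)))
```

In this sketch agent 0 can draw on agent 1's transitions but never on agent 2's, which is the cross-group information gap that the abstract's information trace is meant to offset.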