《计算机应用研究》|Application Research of Computers

多agent分层强化学习在数据定位中的应用研究

Application research of multi-agent layered reinforcement learning in data location

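Authors Hong Zhuangzhuang (洪壮壮), Wan Zhongbao (万仲保), Zhang Wei (张薇), Huang Zhaohua (黄兆华)
Affiliation School of Software, East China Jiaotong University, Nanchang 330013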
Article ID 1001-3695(2020)12-023-3635-05
DOI 10.19734/j.issn.1001-3695.2019.09.0527
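Abstract To locate data in domain-specific text, the text is treated as the environment. To address the dynamics and uncertainty of the text environment, a data location method based on multi-agent hierarchical reinforcement learning is proposed. The method exploits the hierarchical structure to decompose the system task into several subtasks, each learned by an individual agent, which confines policy updates to a small local space. It also exploits the fact that, in a multi-agent system, each agent shares the system's long-term goal: a policy coordination mechanism is introduced in which agents exchange information to discover trend information, and a shaping technique uses this dynamically acquired online knowledge to give each agent trend-based heuristic guidance, which speeds up convergence. The method is applied to judgment documents in the judicial domain. The experimental results show that it locates target data efficiently and accurately in large-scale, complex, unknown text environments, with an average accuracy of 96.6% and an F value of 98.2%, and that it converges quickly. The method therefore performs data location in domain text well and has considerable theoretical and practical significance.
Keywords data location; text environment; hierarchical reinforcement learning; multi-agent system; policy coordination; shaping technique
Funding National Key R&D Program of China (2018YFC0831106)
Natural Science Foundation of Jiangxi Province (20122BAB201040)
Article URL http://www.arocmag.com/article/01-2020-12-023.html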
English title Application research of multi-agent layered reinforcement learning in data location
Author names (English) Hong Zhuangzhuang, Wan Zhongbao, Zhang Wei, Huang Zhaohua
Affiliation (English) Dept. of Software Engineering, East China Jiaotong University, Nanchang 330013, China
English abstract To achieve data location in domain text, this paper regards the text as the environment. Addressing the dynamics and uncertainty of the text environment, it proposes a data location method based on multi-agent hierarchical reinforcement learning. The method uses the hierarchical structure to decompose the system task into multiple subtasks, and each agent learns its corresponding subtask, which limits policy updates to a smaller local space. At the same time, because a single agent in a multi-agent system shares the system's long-term goal, the method introduces a policy coordination mechanism: agents exchange information to discover trend information, and a shaping technique applies the dynamic knowledge acquired online as trend-based heuristics for each agent, which speeds up convergence. The method is applied to judgment documents in the judicial field. The experimental results show that it can locate target data efficiently and accurately in a large-scale, complex, and unknown text environment, that the average accuracy and F value reach 96.6% and 98.2%, and that it converges quickly. The method can therefore realize data location in domain text well, and it has considerable theoretical and practical significance.
English keywords data location; text environment; hierarchical reinforcement learning; multi-agent system; policy coordination; shaping technique
Received 2019/9/2
Revised 2019/10/25
Pages 3635-3639
CLC number TP391
Document code A
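
The abstract describes using a shaping technique to give agents trend-based guidance and speed up convergence. The following is a minimal single-agent sketch of that general idea, not the authors' implementation: it uses standard potential-based reward shaping in tabular Q-learning on a toy "text" environment where a cursor walks over token positions to reach a target position. The environment, the function names (`potential`, `shaping_bonus`, `train`, `greedy_path`), the hyperparameters, and the distance-based potential are all assumptions made for illustration. In particular, the potential here stands in for the trend information that the paper's agents obtain by exchanging information, and the hierarchical decomposition and policy coordination are omitted.

```python
import random


def potential(pos, target):
    # Assumed heuristic: positions nearer the target get a higher potential.
    return -abs(target - pos)


def shaping_bonus(pos, next_pos, target, gamma):
    # Potential-based shaping term F(s, s') = gamma * phi(s') - phi(s).
    return gamma * potential(next_pos, target) - potential(pos, target)


def train(n_tokens=12, target=9, episodes=200, use_shaping=True, seed=0):
    """Tabular Q-learning for a cursor that starts at token 0 and seeks `target`."""
    rng = random.Random(seed)
    alpha, gamma, epsilon, max_steps = 0.5, 1.0, 0.1, 200
    moves = (-1, +1)  # action 0: cursor left, action 1: cursor right
    q = [[0.0, 0.0] for _ in range(n_tokens)]
    for _ in range(episodes):
        pos = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection; ties go to action 0.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[pos][0] >= q[pos][1] else 1
            # Clamp the cursor to the token range.
            next_pos = min(max(pos + moves[action], 0), n_tokens - 1)
            done = next_pos == target
            reward = -1.0  # every step costs one unit
            if use_shaping:
                reward += shaping_bonus(pos, next_pos, target, gamma)
            future = 0.0 if done else max(q[next_pos])
            q[pos][action] += alpha * (reward + gamma * future - q[pos][action])
            pos = next_pos
            if done:
                break
    return q


def greedy_path(q, target, start=0):
    """Follow the greedy policy from `start`, with a cap on the path length."""
    path, pos = [start], start
    while pos != target and len(path) <= 2 * len(q):
        step = 1 if q[pos][1] > q[pos][0] else -1
        pos = min(max(pos + step, 0), len(q) - 1)
        path.append(pos)
    return path
```

Calling `greedy_path(train(), target=9)` returns the positions the learned policy visits from token 0. Passing `use_shaping=False` to `train` gives an unshaped baseline, so the number of steps taken during training can be compared between the two settings. The undiscounted setting (`gamma = 1.0`) is chosen only so that the shaping term is exactly +1 for a move toward the target and -1 for a move away from it.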