Application Research of Computers (《计算机应用研究》)

基于上下文融合的文档级事件抽取方法

Document-level event extraction method based on context fusion

Authors 葛君伟, 乔蒙蒙, 方义秋
Affiliation 重庆邮电大学 计算机科学与技术学院, Chongqing 400065
Article ID 1001-3695(2022)01-008-0048-06
DOI 10.19734/j.issn.1001-3695.2021.06.0212
Abstract Sentence-level extraction methods cannot adequately handle event elements that are scattered across a Chinese document. To address this problem, this paper proposes a document-level event extraction method based on context fusion. First, the document is split into paragraphs, and a bidirectional long short-term memory (BiLSTM) network extracts sequence features from each paragraph. Second, a self-attention mechanism captures the interaction information among paragraph contexts. Third, this interaction information is fused with the document sequence features to update the semantic representation. Finally, event elements are extracted by sequence labeling and matched to event types. In comparisons with other event extraction methods on the same Chinese dataset, the experimental results show that the method effectively extracts event elements scattered across a document and improves the extraction performance of the model.
Keywords event extraction; sequence labeling; feature extraction; event element; context fusion
Funding General Program of the National Natural Science Foundation of China (62072066)
URL http://www.arocmag.com/article/01-2022-01-008.html
Authors (English) Ge Junwei, Qiao Mengmeng, Fang Yiqiu
Affiliation (English) College of Computer Science & Technology, Chongqing University of Posts & Telecommunications, Chongqing 400065, China
Received 2021/6/6
Revised 2021/7/26
Pages 48-53
CLC number TP391
Document code A
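
The abstract describes a pipeline in which paragraph features interact through self-attention, are fused with document-level sequence features, and are then labeled token by token. The sketch below illustrates only that data flow, in plain Python. It is not the authors' implementation, and several parts are assumptions: the paragraph and token vectors are hand-written stand-ins for BiLSTM outputs, the attention uses identity projections rather than learned ones, fusion is done by simple concatenation, and the tag scorer uses hand-set weights instead of a trained model. All function names and the `B-ARG`/`O` tag set are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(paragraph_vecs):
    """Scaled dot-product self-attention with identity projections (assumption).

    Each paragraph vector attends to every paragraph in the document and
    yields a context vector of the same dimension.
    """
    dim = len(paragraph_vecs[0])
    contexts = []
    for query in paragraph_vecs:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in paragraph_vecs
        ]
        weights = softmax(scores)
        contexts.append(
            [sum(w * vec[i] for w, vec in zip(weights, paragraph_vecs))
             for i in range(dim)]
        )
    return contexts


def fuse(token_vecs, paragraph_ids, contexts):
    """Concatenate each token vector with its paragraph's context vector.

    Concatenation is an assumed fusion operator; the abstract does not
    specify how the two representations are combined.
    """
    return [tok + contexts[pid] for tok, pid in zip(token_vecs, paragraph_ids)]


def label_tokens(fused_vecs, tag_weights, tags):
    """Pick the highest-scoring tag per token with a linear scorer."""
    labels = []
    for vec in fused_vecs:
        scores = [sum(w * x for w, x in zip(row, vec)) for row in tag_weights]
        labels.append(tags[scores.index(max(scores))])
    return labels


# Toy document: two paragraphs, three tokens (two in paragraph 0, one in 1).
# These vectors stand in for BiLSTM outputs and are purely illustrative.
paragraph_vecs = [[1.0, 0.0], [0.0, 1.0]]
token_vecs = [[0.5, 0.1], [0.2, 0.9], [0.3, 0.3]]
paragraph_ids = [0, 0, 1]

contexts = self_attention(paragraph_vecs)
fused = fuse(token_vecs, paragraph_ids, contexts)

# Hand-set, untrained weights: one row per tag, one column per fused feature.
tags = ["O", "B-ARG"]
tag_weights = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, -1.0, 0.5, -0.5],
]
labels = label_tokens(fused, tag_weights, tags)
print(labels)
```

Because every token's fused vector carries its paragraph's attention context, the scorer can draw on information from other paragraphs when tagging a token, which is the property the abstract relies on for collecting scattered event elements. A trained system would replace the stand-in vectors and hand-set weights with learned parameters.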