《计算机应用研究》|Application Research of Computers

Joint extraction method of Chinese entity relationship based on mixture of characters and words (基于字词混合的中文实体关系联合抽取方法)

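Authors Ge Junwei (葛君伟), Li Shuailing (李帅领), Fang Yiqiu (方义秋)
Affiliation College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065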
Article ID 1001-3695(2021)09-010-2619-05
DOI 10.19734/j.issn.1001-3695.2021.01.0006
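Abstract Chinese relation extraction suffers from two problems: word segmentation can cut entity boundaries incorrectly and so introduce ambiguity, and when entity pairs overlap, the multiple relations they take part in cannot all be extracted. This paper proposes a joint extraction method based on character-word mixing to address both. First, to handle segmentation boundary errors, the embedding layer combines character vectors with word vectors and adds position information to preserve the correct order of the characters. Second, the model introduces a hybrid dilated convolutional network to extract features at different granularities and over longer distances. Finally, a hierarchical tagging scheme uses the extracted subject entities to tag the corresponding relations and object entities, so that each subject entity can correspond to multiple relations and object entities. In experiments against other relation extraction methods on the same Chinese dataset, the proposed method achieves the best extraction performance and also shows better stability.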
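The character-word mixing step in the abstract can be illustrated with a minimal sketch. The abstract only states that character vectors are combined with word vectors and that position information is added; the element-wise sum, the equal vector dimensions, the function name `mix_char_word`, and the toy vectors below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List

Vector = List[float]


def mix_char_word(
    words: List[str],
    char_vecs: Dict[str, Vector],
    word_vecs: Dict[str, Vector],
    pos_vecs: List[Vector],
) -> List[Vector]:
    """Build one vector per character of a segmented sentence.

    Each character vector is summed with the vector of the word that
    contains it and with a position vector (assumed combination rule).
    """
    mixed: List[Vector] = []
    pos = 0  # running character index across the whole sentence
    for word in words:
        w = word_vecs[word]
        for ch in word:
            c = char_vecs[ch]
            p = pos_vecs[pos]
            mixed.append([ci + wi + pi for ci, wi, pi in zip(c, w, p)])
            pos += 1
    return mixed


# Toy example: a sentence already segmented into two words (hypothetical vectors).
words = ["重庆", "大学"]
char_vecs = {"重": [1.0, 0.0], "庆": [0.0, 1.0], "大": [1.0, 1.0], "学": [2.0, 0.0]}
word_vecs = {"重庆": [10.0, 10.0], "大学": [20.0, 20.0]}
pos_vecs = [[float(i), 0.0] for i in range(4)]

mixed = mix_char_word(words, char_vecs, word_vecs, pos_vecs)
for ch, vec in zip("".join(words), mixed):
    print(ch, vec)
```

Because the output sequence has one entry per character, a wrong word boundary changes only the word component of the affected characters, while the character and position components are unaffected by segmentation.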
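Keywords relation extraction; word segmentation; character-word mixing; boundary segmentation; hybrid dilated convolution
Funding General Program of the National Natural Science Foundation of China (62072066)
Article URL http://www.arocmag.com/article/01-2021-09-010.html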
English title Joint extraction method of Chinese entity relationship based on mixture of characters and words
Authors (English) Ge Junwei, Li Shuailing, Fang Yiqiu
Affiliation (English) College of Computer Science & Technology, Chongqing University of Posts & Telecommunications, Chongqing 400065, China
English abstract To address the ambiguity caused by boundary errors in word segmentation in Chinese relation extraction, and the failure to extract all of the relations involved when entity pairs overlap, this paper proposes a joint extraction method based on character-word mixing. Firstly, for the segmentation boundary problem, the embedding layer combines character vectors with word vectors and adds position information to ensure the correct order of the characters. Secondly, the model introduces a hybrid dilated convolutional network to extract features at different granularities and over longer distances. Finally, it uses a hierarchical tagging method to mark the corresponding relations and object entities from the extracted subject entities, so that each subject entity can correspond to multiple relations and object entities. Compared with other relation extraction methods on the same Chinese dataset, the experimental results show that this method achieves the best extraction performance and also shows better stability.
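As a rough illustration of why stacking convolutions with different dilation rates reaches "longer distance" context, the sketch below implements a plain 1-D dilated convolution and measures the receptive field of a small stack. The kernel size of 3, the dilation rates (1, 2, 5), and the function names are assumptions chosen for the example; the abstract does not state the configuration used in the paper.

```python
from typing import List


def dilated_conv1d(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Zero-padded, same-length 1-D dilated convolution (odd kernel size assumed)."""
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: List[int]) -> int:
    """Receptive field of stacked stride-1 dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical configuration: three layers with dilation rates 1, 2 and 5.
dilations = [1, 2, 5]
kernel = [1.0, 1.0, 1.0]

# Push a single impulse through the stack and count the positions it reaches.
y = [0.0] * 21
y[10] = 1.0
for d in dilations:
    y = dilated_conv1d(y, kernel, d)

touched = sum(1 for v in y if v != 0.0)
print(receptive_field(3, dilations), touched)
```

With these rates the impulse spreads to a contiguous window, so every position inside the receptive field contributes; repeating a single large dilation rate would instead leave gaps between the sampled positions.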
English keywords relation extraction; word segmentation; character-word mixing; boundary segmentation; hybrid dilated convolution
Received 2021/1/7
Revised 2021/3/4
Pages 2619-2623
CLC number TP391
Document code A