《计算机应用研究》|Application Research of Computers

融合语言特征的抽象式中文摘要模型

Abstractive Chinese summarization model with linguistic features

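Authors 胡德敏, 王荣荣 (Hu Demin, Wang Rongrong)
Affiliation School of Optical-Electrical & Computer Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China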
Article ID 1001-3695(2020)02-007-0351-04
DOI 10.19734/j.issn.1001-3695.2018.07.0531
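Abstract Chinese summaries generated by traditional abstractive summarization models often fail to preserve the semantic information of the source text. To address this, the paper proposes an abstractive Chinese summarization model that incorporates linguistic features. A concatenation layer is added to the model to append part-of-speech, named-entity, word-position, and TF-IDF features to each word vector, so that the word vectors fed into the model carry more dimensions of semantic information for identifying key entities. A pointer mechanism then selectively copies keywords from the source text into the summary, which improves the semantic relevance of the generated summary. Experiments on the LCSTS news dataset yield ROUGE scores higher than those of the baseline model, and the analysis indicates that the model can generate Chinese summaries with relatively high semantic relevance.
Keywords abstractive summarization model; linguistic features; key entities; word vector
Funding National Natural Science Foundation of China (61170227, 61472256)
Key Scientific Research Innovation Project of the Shanghai Municipal Education Commission (12zz17)
Shanghai First-Class Discipline Construction Project (S1201YLXK)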
Article URL http://www.arocmag.com/article/01-2020-02-007.html
English title Abstractive Chinese summarization model with linguistic features
Author names (English) Hu Demin, Wang Rongrong
Affiliation (English) School of Optical-Electrical & Computer Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China
English abstract To address the problem that Chinese summaries generated by traditional abstractive models can hardly preserve the semantic information of the original text, this paper proposes an abstractive Chinese summarization model with linguistic features. The model adds a concatenation layer that splices features such as part of speech, named entity, word position, and TF-IDF onto the word vector, so that the word vectors input to the model contain more semantic information for determining key entities. The pointer mechanism allows the model to selectively copy keywords from the source text into the summary, improving the semantic relevance between the source text and the summary. The model is evaluated on the LCSTS dataset and obtains a higher ROUGE score than the baseline model. The analysis shows that the model can generate Chinese summaries with higher semantic relevance.
English keywords abstractive summarization model; linguistic features; key entities; word vector
Received 2018/7/28
Revised 2018/9/12
Pages 351-354,369
CLC number TP391.1
Document code A
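
The abstract describes two mechanisms: concatenating linguistic features onto each word vector, and a pointer mechanism that copies source words into the summary. The following is a minimal sketch of both ideas, not the authors' implementation. The tag inventories, the relative-position encoding, the smoothed TF-IDF formula, and the function names `enrich` and `pointer_mix` are all assumptions made for illustration. The copy step assumes a pointer-generator style mixture, which the abstract does not specify.

```python
import math
from collections import Counter

# Hypothetical tag inventories; the abstract does not list the actual tag sets.
POS_TAGS = ["n", "v", "adj", "other"]
NER_TAGS = ["O", "PER", "LOC", "ORG"]


def one_hot(tag, inventory):
    """One-hot encode tag over inventory; unknown tags map to all zeros."""
    return [1.0 if tag == known else 0.0 for known in inventory]


def tf_idf(word, doc_tokens, corpus):
    """Term frequency in the document times a smoothed inverse document frequency."""
    tf = Counter(doc_tokens)[word] / len(doc_tokens)
    df = sum(1 for doc in corpus if word in doc)
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1.0  # assumed smoothing
    return tf * idf


def enrich(tokens, pos, ner, embeddings, corpus):
    """Build [embedding ; POS one-hot ; NER one-hot ; position ; TF-IDF] per token."""
    rows = []
    for i, word in enumerate(tokens):
        position = i / max(len(tokens) - 1, 1)  # relative position in [0, 1]
        rows.append(
            embeddings[word]
            + one_hot(pos[i], POS_TAGS)
            + one_hot(ner[i], NER_TAGS)
            + [position, tf_idf(word, tokens, corpus)]
        )
    return rows


def pointer_mix(p_gen, vocab_dist, attention, source_tokens):
    """Assumed mixture: p_gen * P_vocab(w) + (1 - p_gen) * attention mass on w."""
    final = {word: p_gen * prob for word, prob in vocab_dist.items()}
    for weight, word in zip(attention, source_tokens):
        final[word] = final.get(word, 0.0) + (1.0 - p_gen) * weight
    return final
```

In this sketch a 2-dimensional embedding becomes a 12-dimensional input vector (2 + 4 + 4 + 2). A source word that is missing from the output vocabulary can still receive probability through the copy term, which is how a pointer mechanism can carry key entities into the summary.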