《计算机应用研究》|Application Research of Computers

Chinese news text opinion summarization based on integrating sentences opinion and topic similarity

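Authors Wang Wei, Ouyang Chunping, Yang Xiaohua, Luo Lingyun, Liu Zhiming
Affiliation School of Computer Science & Technology, University of South China, Hengyang, Hunan 421001, China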
Article ID 1001-3695(2017)12-3543-04
DOI 10.3969/j.issn.1001-3695.2017.12.005
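Abstract Opinion summarization of news text refines and condenses a text into a summary that expresses its overall sentiment, with the aim of helping readers grasp the text's sentiment orientation quickly. Existing text summarization methods consider only factors such as topic and sentence features, so they cannot produce summaries that carry sentiment. To address this problem, this paper proposes an opinion summarization method for Chinese news text that integrates sentence sentiment and topic similarity. First, the sentences in the text are annotated with sentiment. Then, sentiment information is added to the LexRank algorithm when computing sentence similarity. Finally, given the special role of the news title, the similarity between each sentence and the title is computed, and the results of the preceding steps are combined to produce the final opinion summary. Experimental results show that the method improves on the traditional LexRank algorithm on all three metrics (ROUGE-1, ROUGE-2, and ROUGE-W), demonstrating that considering sentiment and topic information together generates opinion summaries that reflect the main viewpoints and sentiment of a text more effectively.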
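Keywords opinion summarization; sentence emotion; LexRank; sentence features; thematic similarity
Funding National Natural Science Foundation of China (61402220, 61502221)
Article URL http://www.arocmag.com/article/01-2017-12-005.html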
Received 2016/9/6
Revised 2016/11/3
Pages 3543-3546
CLC number TP391.1
Document code A
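
The three steps described in the abstract (sentence sentiment annotation, LexRank with sentiment-aware similarity, and sentence-to-title similarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the formulas, so the lexicon-based `sentiment_strength`, the edge-weight boost controlled by `beta`, the mixing weight `lam`, and the use of plain term-count cosine similarity are all assumptions chosen for illustration. Sentences are assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token lists using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


def sentiment_strength(tokens, lexicon):
    """Assumed stand-in for sentiment annotation: share of tokens in a lexicon."""
    return sum(t in lexicon for t in tokens) / len(tokens) if tokens else 0.0


def lexrank(weights, damping=0.85, iters=100, tol=1e-8):
    """Power iteration over a row-normalised weight matrix (continuous LexRank)."""
    n = len(weights)
    rows = []
    for row in weights:
        total = sum(row)
        # A sentence with no neighbours spreads its mass uniformly.
        rows.append([w / total for w in row] if total else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - damping) / n
               + damping * sum(scores[j] * rows[j][i] for j in range(n))
               for i in range(n)]
        converged = max(abs(x - y) for x, y in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


def summarize(title, sentences, lexicon, k=2, beta=1.0, lam=0.7):
    """Return (indices of the k selected sentences in document order, all scores)."""
    n = len(sentences)
    if n == 0:
        return [], []
    senti = [sentiment_strength(s, lexicon) for s in sentences]
    # Assumed edge weight: lexical similarity, boosted when both sentences carry sentiment.
    weights = [[0.0 if i == j else
                cosine(sentences[i], sentences[j]) * (1 + beta * senti[i] * senti[j])
                for j in range(n)] for i in range(n)]
    rank = lexrank(weights)
    top = max(rank)
    # Assumed combination: normalised LexRank score mixed with title similarity.
    final = [lam * rank[i] / top + (1 - lam) * cosine(sentences[i], title)
             for i in range(n)]
    best = sorted(range(n), key=lambda i: final[i], reverse=True)[:k]
    return sorted(best), final


if __name__ == "__main__":
    title = ["factory", "pollution", "anger"]
    sentences = [
        ["residents", "anger", "factory", "pollution"],
        ["factory", "opened", "in", "1990"],
        ["weather", "was", "sunny"],
        ["residents", "condemn", "pollution", "angrily"],
    ]
    chosen, scores = summarize(title, sentences, {"anger", "condemn"})
    print(chosen, [round(s, 3) for s in scores])
```

Multiplying the edge weight rather than adding a separate sentiment term keeps lexically unrelated sentences disconnected, so sentiment only strengthens links that topic overlap already supports. Other combinations are equally plausible given what the abstract states.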