《计算机应用研究》|Application Research of Computers

Chinese online comments sentiment analysis based on weighted char-word mixture word representation (基于加权融合字词向量的中文在线评论情感分析)

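Authors 张小艳 (Zhang Xiaoyan), 白瑜 (Bai Yu)
Affiliation College of Computer Science and Technology, Xi'an University of Science and Technology (西安科技大学 计算机科学与技术学院), Xi'an 710600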
Article ID 1001-3695(2022)01-005-0031-06
DOI 10.19734/j.issn.1001-3695.2021.06.0253
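Abstract With the widespread use of social networking platforms, large volumes of online comment text rich in sentiment information have emerged, and analyzing the sentiment expressed in these comments is of great value to enterprises and platforms. Existing sentiment analysis of short online comments suffers from weak feature extraction and ignores the sentiment information carried by the short text itself. To address these problems, this paper proposes SVW-BERT, a model based on a text-sentiment-value-weighted fusion of character and word vector representations. First, the text vector is represented by fusing character-level and word-level vectors so as to capture as much semantic information as possible. At the same time, the influence of adverbs, negation words, exclamatory sentences, and interrogative sentences on text sentiment is taken into account, and a sentiment value for the text is computed through weighting. These are combined to build a sentiment analysis model for Chinese short text in which the fused character-word vectors are weighted by the sentiment value. The feasibility and superiority of the model are verified on a dataset of online comments from network platforms. Experimental results show that the fused character-word features extract semantics more effectively, and that the sentiment-value-weighted sentence vector accounts for the sentiment information contained in the text itself, thereby improving sentiment classification.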
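Keywords online comments; sentiment analysis; character and word vectors; BERT; sentiment value; support vector machine
Funding Young Scientists Fund of the National Natural Science Foundation of China (61902311)
Article URL http://www.arocmag.com/article/01-2022-01-005.html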
English title Chinese online comments sentiment analysis based on weighted char-word mixture word representation
Authors (English names) Zhang Xiaoyan, Bai Yu
Affiliation (English name) College of Computer Science & Technology, Xi'an University of Science & Technology, Xi'an 710600, China
English abstract The widespread use of social networking platforms has led to the emergence of emotionally rich online comment texts, and analyzing the emotions expressed in these comments is of great significance to companies and platforms. To address weak feature extraction and the neglect of the emotional information carried by the short text itself in sentiment analysis of short online comments, this paper proposes SVW-BERT, a model based on text sentiment value weighted char-word mixture word representation. First, it represents text vectors by fusing character-level and word-level vectors to maximize semantic representation. At the same time, considering the influence of adverbs, negative words, exclamatory sentences, and interrogative sentences on the sentiment of the text, it uses weights to calculate the sentiment value of the text and constructs a sentiment analysis model for Chinese short text based on text sentiment value weighted char-word mixture word representation. The feasibility and advantages of the model are validated on a dataset of online reviews from network platforms. The experimental results show that the char-word mixture word representation is stronger at semantic extraction, and that the sentiment value weighted sentence vector takes into account the sentiment information contained in the text itself, which improves sentiment classification.
English keywords online comments; sentiment analysis; char-word representation; BERT; sentiment value; SVM
Received 2021/6/22
Revised 2021/8/17
Pages 31-36
CLC number TP391
Document code A
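
The abstract describes the pipeline only at a high level: fuse character-level and word-level vectors, compute a sentiment value from adverbs, negation words, and sentence type, and use that value to weight the sentence vector. The following Python sketch illustrates one way those steps could fit together. It is not the authors' implementation. The toy lexicons, the multiplier values, the averaging-based fusion with a mixing factor `alpha`, and the `1 + |s|` weighting rule are all assumptions introduced here for illustration, and the input arrays stand in for embeddings that a model such as BERT would produce.

```python
import numpy as np

# Hypothetical toy lexicons and weights; the abstract does not give the real ones.
SENTIMENT_WORDS = {"好": 1.0, "喜欢": 1.0, "差": -1.0, "失望": -1.0}
DEGREE_ADVERBS = {"很": 1.5, "非常": 2.0, "有点": 0.5}
NEGATION_WORDS = {"不", "没有"}
EXCLAMATION_FACTOR = 1.5  # assumed boost for exclamatory sentences
QUESTION_FACTOR = 0.5     # assumed damping for interrogative sentences


def sentiment_value(tokens, text):
    """Rule-based sentiment score for one segmented comment."""
    score = 0.0
    modifier = 1.0  # accumulates negations and adverbs before a sentiment word
    for tok in tokens:
        if tok in NEGATION_WORDS:
            modifier *= -1.0
        elif tok in DEGREE_ADVERBS:
            modifier *= DEGREE_ADVERBS[tok]
        elif tok in SENTIMENT_WORDS:
            score += modifier * SENTIMENT_WORDS[tok]
            modifier = 1.0  # reset once the sentiment word is consumed
    if text.endswith(("!", "!")):
        score *= EXCLAMATION_FACTOR
    elif text.endswith(("?", "?")):
        score *= QUESTION_FACTOR
    return score


def fuse_char_word(char_vecs, word_vecs, alpha=0.5):
    """Fuse mean-pooled character and word vectors (assumed fusion rule)."""
    char_mean = np.asarray(char_vecs, dtype=float).mean(axis=0)
    word_mean = np.asarray(word_vecs, dtype=float).mean(axis=0)
    return alpha * char_mean + (1.0 - alpha) * word_mean


def weighted_sentence_vector(char_vecs, word_vecs, tokens, text, alpha=0.5):
    """Scale the fused vector by 1 + |sentiment value| (assumed weighting rule)."""
    fused = fuse_char_word(char_vecs, word_vecs, alpha)
    weight = 1.0 + abs(sentiment_value(tokens, text))
    return weight * fused


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    chars = rng.normal(size=(3, 8))  # placeholder character embeddings
    words = rng.normal(size=(2, 8))  # placeholder word embeddings
    vec = weighted_sentence_vector(chars, words, ["非常", "好"], "非常好")
    print(vec.shape)
```

In a full system the resulting sentence vectors would be passed to a classifier; the keyword list names a support vector machine, but the abstract does not say how it is configured.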