《计算机应用研究》|Application Research of Computers

Text sentiment analysis based on hybrid mutual information algorithm

Authors: Wang Yi (王义), Dai Yueming (戴月明)
Affiliation: School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
Article ID: 1001-3695(2020)02-004-0337-05
DOI: 10.19734/j.issn.1001-3695.2018.08.0537
Abstract: To address the positive and negative correlation phenomenon in the mutual information (MI) feature selection method, and its failure to consider the term frequency of a feature within different categories, this paper proposes a hybrid mutual information (HMI) feature selection algorithm. An inverse document frequency coefficient and an inter-class term frequency coefficient are introduced so that the term frequency information in the whole document and the term frequency information between classes are both used effectively. A positive/negative correlation coefficient is introduced to distinguish positive correlation from negative correlation and to make effective use of both. Comparative experiments show that the hybrid mutual information algorithm effectively improves the quality of feature selection and, in turn, the performance of text sentiment analysis.
Keywords: mutual information (MI); feature selection; positive and negative correlation; term frequency information; sentiment analysis
Funding: National Natural Science Foundation of China (61572237)
Article URL: http://www.arocmag.com/article/01-2020-02-004.html
Received: 2018/8/2
Revised: 2018/9/29
Pages: 337-341
CLC number: TP391
Document code: A
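
The abstract's starting point, the baseline MI score and its positive/negative correlation behaviour, can be illustrated with a minimal sketch. This is the standard document-count form of term-class mutual information, not the authors' HMI: this record does not give the HMI coefficients, so none are reproduced here. The function name `mi_scores` and the toy corpus are assumptions made up for illustration.

```python
import math
from collections import Counter


def mi_scores(docs, labels):
    """Baseline pointwise MI between each term and each class.

    docs: list of token lists; labels: class labels of the same length.
    Returns {(term, label): score}. Pairs that never co-occur are skipped
    because log(0) is undefined.
    """
    n_docs = len(docs)
    class_count = Counter(labels)   # number of documents per class
    term_count = Counter()          # number of documents containing the term
    joint_count = Counter()         # documents of a class containing the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):    # document presence, not term frequency
            term_count[term] += 1
            joint_count[(term, label)] += 1
    scores = {}
    for (term, label), joint in joint_count.items():
        # MI(t, c) = log( P(t, c) / (P(t) * P(c)) )
        scores[(term, label)] = math.log(
            joint * n_docs / (term_count[term] * class_count[label])
        )
    return scores


# Hypothetical toy corpus, for illustration only.
docs = [
    ["good", "great", "plot"],
    ["good", "acting"],
    ["good", "fun"],
    ["bad", "plot"],
    ["bad", "good"],
    ["boring", "bad"],
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
scores = mi_scores(docs, labels)

for pair in [("good", "pos"), ("good", "neg"), ("great", "pos"), ("bad", "neg")]:
    print(pair, round(scores[pair], 3))
```

On this toy data, "good" scores about 0.41 for the positive class and about -0.69 for the negative class, so the sign separates positive from negative correlation. "great", which occurs in a single document, scores the same as "bad", which occurs in every negative document. Because the baseline uses only document presence, term frequency plays no role, which is the gap the abstract's frequency coefficients are meant to fill.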