《计算机应用研究》|Application Research of Computers

维吾尔文情感分类特征建设研究

Research on feature construction of Uyghur text sentiment classification

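Authors Raxida Turhuntay (热西旦木·吐尔洪太), Wushour Slamu (吾守尔·斯拉木)
Affiliations 1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046, China; 2. College of Electronic and Information Engineering, Yili Normal University, Yining, Xinjiang 835000, China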
Article ID 1001-3695(2019)12-006-3548-05
DOI 10.19734/j.issn.1001-3695.2018.04.0378
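Abstract Systematic research on feature representation for Uyghur sentiment classification is currently lacking. Taking traditional <i>n</i>-gram features as a basis, this paper extracts new features and their combinations at different scales from a sentiment-annotated Uyghur corpus, and uses a support vector machine (SVM) classifier to classify the corpus into positive and negative sentiment. Experimental results show that, among the basic features extracted, unigram features give the best classification performance; combining unigram features with phrase features improves performance further, with the best result 1.78% higher than that of unigram features alone. This is the first comprehensive evaluation of the classification performance of different features on a uniformly annotated dataset, and the findings can guide future research on Uyghur sentiment classification.
Keywords sentiment classification; feature construction; combined features; Uyghur
Funding National "973" Program of China (2014CB340506)
National Natural Science Foundation of China (61363063)
Article URL http://www.arocmag.com/article/01-2019-12-006.html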
English title Research on feature construction of Uyghur text sentiment classification
Authors (English) Raxida Turhuntay, Wushour Slamu
Affiliations (English) 1. College of Information Science & Engineering, Xinjiang University, Urumqi 830046, China; 2. College of Electronic & Information Engineering, Yili Normal University, Yili, Xinjiang 835000, China
English abstract Due to the lack of systematic research on feature representation for Uyghur text sentiment classification, this paper used traditional <i>n</i>-gram features as the basis to extract new features and combined features from Uyghur sentiment corpora at different scales, and used a support vector machine (SVM) classifier to classify the corpora as positive or negative. Results indicated that, in Uyghur text sentiment classification, unigram features perform best among the basic features. Combining unigram features with phrase features further improves performance: the best classification accuracy of the combined features is 1.78% higher than that of unigram features alone. This paper is the first to comprehensively evaluate the classification performance of different features on a unified data set. The results can serve as a reference for future Uyghur sentiment classification research.
English keywords sentiment classification; feature construction; combined features; Uyghur
Received 2018/4/26
Revised 2018/6/26
Pages 3548-3552
CLC number TP391
Document code A
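
The abstract describes combining unigram features with other features before SVM classification. A minimal sketch of that kind of feature combination is shown below. It is not the authors' implementation: whitespace tokenization, the use of bigrams as a stand-in for the paper's phrase features, and the helper names `ngram_counts` and `combined_features` are all assumptions made for illustration, and the input uses placeholder tokens rather than corpus data.

```python
from collections import Counter
from typing import Dict, List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count contiguous n-grams in a token list."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def combined_features(text: str) -> Dict[str, int]:
    """Merge unigram and bigram counts into one prefixed feature dict.

    Assumptions: tokens are separated by whitespace, and bigrams stand in
    for the phrase features mentioned in the abstract.
    """
    tokens = text.split()
    features: Dict[str, int] = {}
    for name, n in (("uni", 1), ("bi", 2)):
        for gram, count in ngram_counts(tokens, n).items():
            # The prefix keeps unigram and bigram features in separate namespaces.
            features[f"{name}={gram}"] = count
    return features


# Placeholder tokens only; not taken from the corpus.
print(combined_features("a b a c"))
```

Feature dictionaries of this shape could then be vectorized and passed to any SVM implementation for positive/negative classification; which features help most is an empirical question of the kind the paper evaluates.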