《计算机应用研究》|Application Research of Computers

Hybrid CNN-ELM model for short text classification

Authors: Han Zhonghe (韩众和), Xia Zhanguo (夏战国), Yang Ting (杨婷)
Affiliations: 1. College of Computer Science & Technology, China University of Mining & Technology, Xuzhou, Jiangsu 221000, China; 2. Suzhou Research Institute, Institute of Electronics, Chinese Academy of Sciences, Suzhou, Jiangsu 215000, China
Article ID: 1001-3695(2019)03-004-0663-05
DOI 10.19734/j.issn.1001-3695.2017.09.0930
Abstract: In current natural language processing research, short text classification with convolutional neural networks (CNN) can be improved by combining different network structures and classification algorithms. This paper therefore proposes a hybrid CNN-ELM model for short text classification. The model first builds a text matrix from trained word vectors as input, then extracts features with a convolutional neural network and refines them with a Highway network, and finally uses an error-minimized extreme learning machine (EM-ELM) as the classifier to complete the short text classification task. Compared with other models, the hybrid model extracts more representative features and outputs classification results quickly and accurately. Experimental results on several English datasets show that the proposed hybrid CNN-ELM model is better suited to short text classification than traditional machine learning and deep learning models.
Keywords: text classification; convolutional neural network; extreme learning machine
Foundation item: National Natural Science Foundation of China (61572506)
Article URL: http://www.arocmag.com/article/01-2019-03-004.html
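The two less common stages of the pipeline — Highway feature refinement and the ELM classifier — can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: the CNN feature vectors are replaced by a toy matrix, the Highway weights are randomly initialized rather than trained, and only the basic closed-form ELM solve is shown, not EM-ELM's incremental growth of hidden nodes.

```python
import numpy as np

rng = np.random.default_rng(0)

def highway(x, Wh, bh, Wt, bt):
    """Highway layer: a sigmoid 'transform gate' t mixes a nonlinear
    transform h with the untouched input: t*h + (1-t)*x."""
    h = np.tanh(x @ Wh + bh)
    t = 1.0 / (1.0 + np.exp(-(x @ Wt + bt)))
    return t * h + (1.0 - t) * x

def elm_train(X, Y, n_hidden=64):
    """Basic ELM: random (never-trained) hidden-layer weights, then
    closed-form least-squares output weights via the Moore-Penrose
    pseudoinverse. EM-ELM additionally grows hidden nodes to drive
    the training error down; only the core ELM step is shown."""
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)          # hidden-layer activations
    beta = np.linalg.pinv(H) @ Y    # analytic output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy stand-in for CNN feature vectors: two separable clusters,
# with one-hot labels for a binary classification task.
d = 8
X = np.vstack([rng.normal(+2.0, 1.0, (20, d)),
               rng.normal(-2.0, 1.0, (20, d))])
Y = np.vstack([np.tile([1.0, 0.0], (20, 1)),
               np.tile([0.0, 1.0], (20, 1))])

# Refine features with a (randomly initialized) highway layer,
# then classify the refined features with the ELM.
Wh, Wt = rng.standard_normal((d, d)), rng.standard_normal((d, d))
bh, bt = rng.standard_normal(d), rng.standard_normal(d)
Xr = highway(X, Wh, bh, Wt, bt)
W, b, beta = elm_train(Xr, Y)
pred = elm_predict(Xr, W, b, beta).argmax(axis=1)
```

Because the output weights are obtained analytically instead of by gradient descent, the classifier stage trains in a single linear-algebra step — this is the speed advantage the abstract attributes to the ELM component.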
References
[1] Cover T M, Thomas J A. Elements of information theory[M]. 2nd ed. New Jersey: Wiley, 2012.
[2] Cai Lijuan, Hofmann T. Text categorization by boosting automatically extracted concepts[C]//Proc of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM Press, 2003: 182-189.
[3] Hingmire S, Chougule S, Palshikar G K, et al. Document classification by topic labeling[C]//Proc of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM Press, 2013: 877-880.
[4] Bengio Y, Ducharme R, Vincent P, et al. A neural probabilistic language model[J]. Journal of Machine Learning Research, 2003, 3(2): 1137-1155.
[5] Mikolov T, Sutskever I, Chen Kai, et al. Distributed representations of words and phrases and their compositionality[C]//Proc of the 26th International Conference on Neural Information Processing Systems. [S.l.]: Curran Associates Inc., 2013: 3111-3119.
[6] Kim Y. Convolutional neural networks for sentence classification[C]//Proc of Conference on Empirical Methods in Natural Language Processing. 2014: 1746-1751.
[7] Mandelbaum A, Shalev A. Word embeddings and their use in sentence classification tasks[EB/OL]. [2017-11-30]. https://arxiv.org/abs/1610.08229.
[8] Chen Zhao, Xu Ruifeng, Gui Lin, et al. Combining convolutional neural networks and word sentiment sequence features for Chinese text sentiment classification[J]. Journal of Chinese Information Processing, 2015, 29(6): 172-178. (in Chinese)
[9] Liu Longfei, Yang Liang, Zhang Shaowu, et al. Convolutional neural networks for Chinese micro-blog emotional tendency identification[J]. Journal of Chinese Information Processing, 2015, 29(6): 159-165. (in Chinese)
[10] Rong Bohui, Fu Kun, Huang Yu, et al. Relation extraction based on multi-channel convolutional neural network[J]. Application Research of Computers, 2017, 34(3): 689-692. (in Chinese)
[11] Ebert S, Vu N T, Schütze H. CIS-positive: combining convolutional neural networks and SVMs for sentiment analysis in Twitter[C]//Proc of the 9th International Workshop on Semantic Evaluation. 2015: 527-532.
[12] Matsugu M, Mori K, Suzuki T. Face recognition using SVM combined with CNN for face detection[C]//Proc of International Conference on Neural Information Processing. Berlin: Springer, 2004: 356-361.
[13] Huang Guangbin, Zhou Hongming, Ding Xiaojian, et al. Extreme learning machine for regression and multiclass classification[J]. IEEE Trans on Systems, Man, and Cybernetics, Part B: Cybernetics, 2012, 42(2): 513-529.
[14] Huang Guangbin, Zhu Qinyu, Siew C K. Extreme learning machine: a new learning scheme of feedforward neural networks[C]//Proc of IEEE International Joint Conference on Neural Networks. Piscataway, NJ: IEEE Press, 2004: 985-990.
[15] Yu Jiasheng, Chen Jin, Xiang Z Q, et al. A hybrid convolutional neural networks with extreme learning machine for WCE image classification[C]//Proc of IEEE Conference on Robotics and Biomimetics. Piscataway, NJ: IEEE Press, 2015: 1822-1827.
[16] Feng Guorui, Huang Guangbin, Lin Qingping, et al. Error minimized extreme learning machine with growth of hidden nodes and incremental learning[J]. IEEE Trans on Neural Networks, 2009, 20(8): 1352-1357.
[17] Srivastava R K, Greff K, Schmidhuber J. Highway networks[EB/OL]. [2017-11-30]. https://arxiv.org/abs/1505.00387.
[18] Srivastava R K, Greff K, Schmidhuber J. Training very deep networks[C]//Advances in Neural Information Processing Systems. 2015: 2377-2385.
[19] Kim Y, Jernite Y, Sontag D, et al. Character-aware neural language models[C]//Proc of the 30th AAAI Conference on Artificial Intelligence. 2016: 2741-2749.
[20] Hsu W N, Zhang Yu, Lee A, et al. Exploiting depth and highway connections in convolutional recurrent deep neural networks for speech recognition[C]//Proc of InterSPEECH. 2016: 395-399.
[21] Collobert R, Weston J, Bottou L, et al. Natural language processing (almost) from scratch[J]. Journal of Machine Learning Research, 2011, 12: 2493-2537.
[22] Pang Bo, Lee L. A sentimental education: sentiment analysis using subjectivity summarization based on minimum cuts[C]//Proc of the 42nd Annual Meeting on Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2004: 271.
[23] Wallace B C, Choe D K, Kertz L, et al. Humans require context to infer ironic intent (so computers probably do, too)[C]//Proc of the 52nd Annual Meeting of the Association for Computational Linguistics. 2014: 512-516.
[24] Danescu-Niculescu-Mizil C, Sudhof M, Jurafsky D, et al. A computational approach to politeness with application to social factors[EB/OL]. [2017-11-30]. https://arxiv.org/abs/1306.6078.
Received: 2017/9/27
Revised: 2017/12/1
Pages: 663-667, 672
CLC number: TP391.1
Document code: A