Application Research of Computers (《计算机应用研究》)

Convolutional neural network based on word sense disambiguation for text classification

Authors: Xue Tao (薛涛), Wang Yaling (王雅玲), Mu Nan (穆楠)
Affiliation: College of Computer Science, Xi'an Polytechnic University, Xi'an 710048, China
Article number: 1001-3695(2018)10-2898-06
DOI: 10.3969/j.issn.1001-3695.2018.10.004
Abstract: Traditional text classification uses word embeddings directly as the document representation, ignoring the meaning each word takes in its current context and implicitly assuming that the same word has the same meaning in different texts. To address this problem, this paper proposes WSDCNN (word sense disambiguation convolutional neural network), a convolutional neural network text classification model with word sense disambiguation. A bidirectional long short-term memory network (BLSTM) models the context, yielding a document feature map in which word senses have been disambiguated; a convolutional neural network (CNN) then extracts the features most important for text classification. Comparative experiments on four data sets show that the proposed method outperforms the previous best methods on two data sets, notably the document-level ones, and matches the previous best results on the other two.
Keywords: text classification; convolutional neural network; long short-term memory network; feature extraction; natural language processing
Article URL: http://www.arocmag.com/article/01-2018-10-004.html
English author names: Xue Tao, Wang Yaling, Mu Nan
English affiliation: College of Computer Science, Xi'an Polytechnic University, Xi'an 710048, China
English abstract: Traditional text classification usually uses word embeddings directly as the document representation, ignoring the meaning of each word in its current context and implicitly assuming that the same word has the same meaning in different texts. This paper proposed a text classification model called WSDCNN (word sense disambiguation convolutional neural network) to address this problem. The model used a bidirectional long short-term memory network (BLSTM) to model the context, obtaining a document feature map after word sense disambiguation; it then used a convolutional neural network (CNN) to extract the local features of the document that are most important for text classification. Compared with state-of-the-art models, the proposed model achieved the best performance on two of four data sets, especially the document-level ones, and matched the previous best methods on the other two.
English keywords: text classification; convolutional neural network; long short-term memory network; feature extraction; natural language processing (NLP)
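The pipeline described in the abstract (a BLSTM producing a per-word, context-disambiguated feature map, followed by a CNN with max-over-time pooling for classification) can be sketched as follows. This is a minimal NumPy sketch under assumed dimensions and randomly initialized weights, not the authors' implementation; all function names and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def lstm_step(x, h, c, W, U, b):
    # One LSTM step; gate pre-activations stacked as [input, forget, output, candidate].
    z = W @ x + U @ h + b
    H = h.size
    i = 1 / (1 + np.exp(-z[:H]))
    f = 1 / (1 + np.exp(-z[H:2*H]))
    o = 1 / (1 + np.exp(-z[2*H:3*H]))
    g = np.tanh(z[3*H:])
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def make_params(D, H):
    return (rng.normal(0, 0.1, (4*H, D)),
            rng.normal(0, 0.1, (4*H, H)),
            np.zeros(4*H))

def bilstm(X, params_f, params_b):
    # X: (T, D) word embeddings; returns (T, 2H) feature map where each row
    # encodes a word together with its left and right context.
    T, _ = X.shape
    H = params_f[2].size // 4
    out = np.zeros((T, 2*H))
    h, c = np.zeros(H), np.zeros(H)
    for t in range(T):                      # forward pass
        h, c = lstm_step(X[t], h, c, *params_f)
        out[t, :H] = h
    h, c = np.zeros(H), np.zeros(H)
    for t in reversed(range(T)):            # backward pass
        h, c = lstm_step(X[t], h, c, *params_b)
        out[t, H:] = h
    return out

def conv_max_pool(F, filters, width=3):
    # 1-D convolution over the feature map, ReLU, then max-over-time pooling,
    # producing one fixed-length document vector regardless of T.
    T, _ = F.shape
    pooled = np.empty(len(filters))
    for k, Wf in enumerate(filters):
        acts = [np.maximum(0.0, np.sum(Wf * F[t:t+width]))
                for t in range(T - width + 1)]
        pooled[k] = max(acts)
    return pooled

T, D, H, n_filters, n_classes = 12, 50, 32, 100, 4
X = rng.normal(size=(T, D))                      # embeddings for one document
feature_map = bilstm(X, make_params(D, H), make_params(D, H))
filters = [rng.normal(0, 0.1, (3, 2*H)) for _ in range(n_filters)]
doc_vec = conv_max_pool(feature_map, filters)    # fixed-length document vector
logits = rng.normal(0, 0.1, (n_classes, n_filters)) @ doc_vec
probs = np.exp(logits - logits.max())
probs /= probs.sum()                             # softmax over class scores
print(feature_map.shape, doc_vec.shape)
```

The key design point is the ordering: the BLSTM runs first so that each word's representation already reflects its sense in context, and only then does the CNN select the most discriminative local features.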
References
[1] Li Juntao, Cao Yimin, Wang Yadi, et al. Online learning algorithms for double-weighted least squares twin bounded support vector machines [J]. Neural Processing Letters, 2017, 45(1): 319-339.
[2] Kim Y. Convolutional neural networks for sentence classification [C]//Proc of Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2014: 1746-1751.
[3] Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
[4] Mandelbaum A, Shalev A. Word embeddings and their use in sentence classification tasks [J/OL]. (2016-10-26). https://arxiv.org/pdf/1610.08229.pdf.
[5] Turian J, Ratinov L, Bengio Y. Word representations: a simple and general method for semi-supervised learning [C]//Proc of the 48th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2010: 384-394.
[6] Pang Bo, Lee L, Vaithyanathan S. Thumbs up? Sentiment classification using machine learning techniques [C]//Proc of ACL Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2002: 79-86.
[7] Mikolov T, Karafiát M, Burget L, et al. Recurrent neural network based language model [C]//Proc of the 11th Annual Conference of the International Speech Communication Association. 2010: 1045-1048.
[8] Son L H, Allauzen A, Yvon F. Measuring the influence of long range dependencies with neural network language models [C]//Proc of the NAACL-HLT Workshop: Will We Ever Really Replace the N-gram Model? On the Future of Language Modeling for HLT. Stroudsburg, PA: Association for Computational Linguistics, 2012: 1-10.
[9] Oualil Y, Singh M, Greenberg C, et al. Long-short range context neural networks for language modeling [C]//Proc of Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2016: 1473-1481.
[10] Chen Long, Guan Ziyu, He Jinhong, et al. Research advances in sentiment analysis [J]. Journal of Computer Research and Development, 2017, 54(6): 1150-1170. (in Chinese)
[11] Graves A, Fernández S, Schmidhuber J. Bidirectional LSTM networks for improved phoneme classification and recognition [C]//Proc of International Conference on Artificial Neural Networks. Berlin: Springer-Verlag, 2005: 753-753.
[12] Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures [J]. Neural Networks, 2005, 18(5/6): 602-610.
[13] Zhou Peng, Shi Wei, Tian Jun, et al. Attention-based bidirectional long short-term memory networks for relation classification [C]//Proc of the 54th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2016: 207-212.
[14] Yang Zichao, Yang Diyi, Dyer C, et al. Hierarchical attention networks for document classification [C]//Proc of Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016: 1480-1489.
[15] Vu N T, Adel H, Gupta P, et al. Combining recurrent and convolutional neural networks for relation classification [C]//Proc of Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016: 534-539.
[16] He Yanxiang, Sun Songtao, Niu Feifei, et al. A deep learning model enhanced with emotion semantics for microblog sentiment analysis [J]. Chinese Journal of Computers, 2017, 40(4): 773-790. (in Chinese)
[17] Cai Huiping, Wang Lidan, Duan Shukai. Sentiment classification model based on word embedding and CNN [J]. Application Research of Computers, 2016, 33(10): 2902-2905, 2909. (in Chinese)
[18] Xia Congling, Qian Tao, Ji Donghong. News text classification based on event convolution features [J]. Application Research of Computers, 2017, 34(4): 991-994. (in Chinese)
[19] Zhou Peng, Qi Zhenyu, Zheng Suncong, et al. Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling [C]//Proc of the 26th International Conference on Computational Linguistics. 2016: 3485-3495.
[20] Lai Siwei, Xu Liheng, Liu Kang, et al. Recurrent convolutional neural networks for text classification [C]//Proc of National Conference of the American Association for Artificial Intelligence. Palo Alto, CA: AAAI Press, 2015: 2267-2273.
[21] Zeiler M D. ADADELTA: an adaptive learning rate method [J/OL]. (2012-12-22). http://arxiv.org/abs/1212.5701.
[22] Blunsom P, Grefenstette E, Kalchbrenner N. A convolutional neural network for modelling sentences [C]//Proc of the 52nd Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA: Association for Computational Linguistics, 2014: 655-665.
[23] Tai K S, Socher R, Manning C D. Improved semantic representations from tree-structured long short-term memory networks [C]//Proc of the 53rd Annual Meeting of the Association for Computational Linguistics & the 7th International Joint Conference on Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2015: 1556-1566.
Received: 2017-05-26
Revised: 2017-07-12
Pages: 2898-2903
CLC number: TP391
Document code: A