《计算机应用研究》|Application Research of Computers

基于自学习近邻图策略的短文本匹配方法

Self-adaptive affinity graph learning for short text matching

Authors 付聪, 李六武, 杨振国, 刘文印
Affiliation 广东工业大学 计算机学院, 广州 510006
Article ID 1001-3695(2020)06-019-1697-05
DOI 10.19734/j.issn.1001-3695.2018.12.0877
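Abstract To address text matching in natural language processing, this paper proposes a deep learning model built on a self-learning text affinity graph framework for short text matching. A text affinity graph is obtained by converting texts into vectors with word embeddings and then constructing a text similarity matrix; it expresses the neighbor relationships among text samples. Existing methods usually construct static affinity graphs, which depend on prior knowledge and make it difficult to obtain optimal representations of sentence pairs. The paper therefore uses a Siamese convolutional neural network to learn a better, dynamically updated affinity graph. On the Quora dataset the model reaches an accuracy of 84.15% and an <i>F</i><sub>1</sub> score of 79.88%; on the MSRP dataset it reaches 74.55% and 81.63%, respectively. Experiments show that the proposed model effectively improves the accuracy of text recognition and matching.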
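The static affinity graph construction that the abstract describes as the baseline (word embeddings, then sentence vectors, then a similarity matrix) might look roughly like the sketch below. It is not the paper's model: the toy embedding table `EMB`, mean pooling, cosine similarity, and the k-nearest-neighbor sparsification are all assumptions made for illustration, and the learned, dynamically updated graph produced by the Siamese CNN is not shown.

```python
import numpy as np

# Hypothetical toy embeddings; a real system would load pretrained word vectors.
EMB = {
    "how": np.array([1.0, 0.0, 0.0]),
    "learn": np.array([0.0, 1.0, 0.0]),
    "python": np.array([0.0, 0.0, 1.0]),
    "study": np.array([0.0, 0.9, 0.1]),
    "cook": np.array([0.7, 0.0, -0.7]),
    "rice": np.array([0.5, -0.5, 0.0]),
}


def sentence_vector(tokens, emb=EMB):
    """Mean-pool word embeddings into one sentence vector (assumed pooling)."""
    dim = len(next(iter(emb.values())))
    vecs = [emb[t] for t in tokens if t in emb]
    if not vecs:
        return np.zeros(dim)
    return np.mean(vecs, axis=0)


def affinity_matrix(vectors):
    """Pairwise cosine similarity between sentence vectors."""
    x = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return x @ x.T


def knn_graph(sim, k):
    """Keep each node's k most similar neighbors, then symmetrize."""
    n = sim.shape[0]
    masked = sim.copy()
    np.fill_diagonal(masked, -np.inf)  # a node is not its own neighbor
    idx = np.argsort(-masked, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    graph = np.zeros_like(sim)
    graph[rows, idx.ravel()] = sim[rows, idx.ravel()]
    return np.maximum(graph, graph.T)


sentences = [
    ["how", "learn", "python"],
    ["how", "study", "python"],
    ["how", "cook", "rice"],
]
sim = affinity_matrix([sentence_vector(s) for s in sentences])
graph = knn_graph(sim, k=1)
```

A graph built this way is fixed once the embeddings and `k` are chosen, which is the dependence on prior knowledge that the abstract attributes to static methods.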
Keywords text matching; self-learning affinity graph; word embedding; Siamese convolutional neural network
Funding National Natural Science Foundation of China (61703109, 91748107)
China Postdoctoral Science Foundation (2018M643024)
Guangdong Innovative Research Team Program (2014ZT05G157)
Article URL http://www.arocmag.com/article/01-2020-06-019.html
Author names (English) Fu Cong, Li Liuwu, Yang Zhenguo, Liu Wenyin
Affiliation (English) School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
Abstract (English) For text matching problems in natural language processing, this paper proposes a deep learning model based on a self-adaptive affinity graph learning framework for short text matching. Texts are converted into vectors using word embeddings, and the affinity graph is then obtained by constructing a text similarity matrix, which expresses the neighbor relationships among text samples. Current methods usually construct static affinity graphs, which rely on prior knowledge and struggle to obtain the optimal representation of sentence pairs. Therefore, this paper proposes using a Siamese CNN to learn a better, dynamically updated affinity graph. The accuracy and <i>F</i><sub>1</sub> values of the model are 84.15% and 79.88% on the Quora dataset, and 74.55% and 81.63% on the MSRP dataset. Experiments show that the proposed model can effectively improve the accuracy of text recognition and matching.
Keywords (English) text matching; self-adaptive affinity graph learning; word embedding; Siamese CNN
Received 2018/12/8
Revised 2019/2/12
Pages 1697-1701
CLC number TP391.1
Document code A