《计算机应用研究》|Application Research of Computers

基于HNC句类的社区问答系统问句检索模型构建

Construction of question retrieval model in community question answering system based on HNC sentence-category

Authors 王宇, 王芳
Affiliation School of Economics and Management, Dalian University of Technology, Dalian, Liaoning 116024, China
Article ID 1001-3695(2020)06-033-1769-05
DOI 10.19734/j.issn.1001-3695.2018.11.0871
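Abstract Community question answering systems are full of noise, which makes it hard for users to retrieve information, and most earlier question retrieval models work at the word level. To address these problems, this paper builds a question retrieval model at the sentence level. Drawing on sentence-category knowledge from the hierarchical network of concepts (HNC) theory, the new model computes the similarity between questions at three levels of the sentence: pragmatic, grammatical, and semantic. A question classification algorithm determines the categories of the query question and the candidate question, which yields the pragmatic similarity between them; the structure of the sentence-category expression and the composition of its semantic blocks are used to compute the grammatical and semantic similarity, respectively. Experiments on a real-world data set show that the new model based on HNC sentence categories improves the accuracy of question retrieval results.
Keywords community question answering system; question retrieval; HNC theory; sentence-category analysis; similarity calculation
Funding
Article URL http://www.arocmag.com/article/01-2020-06-033.html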
English title Construction of question retrieval model in community question answering system based on HNC sentence-category
Authors (English) Wang Yu, Wang Fang
Affiliation (English) School of Economics & Management, Dalian University of Technology, Dalian, Liaoning 116024, China
English abstract Community question answering systems contain a large amount of useless information, which makes it hard for users to retrieve what they need. Most previous question retrieval models focus on the word level. To address these problems, this paper proposes a question retrieval model at the sentence level. Based on the sentence categories of HNC theory, the new model calculates similarities between questions at the pragmatic, grammatical and semantic levels of the sentence. The model uses a question classification algorithm to determine the categories of the query question and the candidate question, and thus obtains the pragmatic similarity between them. It uses the structure of the sentence-category expression and the sentence's semantic blocks to calculate the grammatical and semantic similarities. Experiments on real data sets show that the new model based on HNC sentence categories improves the accuracy of question retrieval results.
English keywords community question answering system; question retrieval; HNC theory; sentence category analysis; similarity calculation
Received 2018/11/24
Revised 2019/1/16
Pages 1769-1773
CLC number TP391
Document code A
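
The abstract describes scoring a candidate question against a query at three levels (pragmatic, grammatical, semantic) but gives no formulas. The following is a minimal sketch of how such a three-level score might be combined, not the authors' method. Everything in it is an assumption made for illustration: the `Question` fields, the exact-match pragmatic score, the use of `difflib.SequenceMatcher` for structural similarity, the Jaccard overlap for semantic blocks, the linear weighting, the weight values, and the placeholder block labels.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Question:
    """Hypothetical pre-analysed question; none of these fields come from the paper."""
    category: str         # question class assigned by some classifier
    structure: tuple      # ordered block labels of the sentence-category expression
    blocks: frozenset     # terms or concepts that fill the semantic blocks


def pragmatic_sim(q: Question, c: Question) -> float:
    # Assumption: two questions are pragmatically similar iff they share a category.
    return 1.0 if q.category == c.category else 0.0


def grammatical_sim(q: Question, c: Question) -> float:
    # Assumption: compare expression structures as sequences of block labels.
    return SequenceMatcher(None, q.structure, c.structure).ratio()


def semantic_sim(q: Question, c: Question) -> float:
    # Assumption: Jaccard overlap of the semantic block contents.
    union = q.blocks | c.blocks
    return len(q.blocks & c.blocks) / len(union) if union else 0.0


def question_sim(q: Question, c: Question, weights=(0.2, 0.3, 0.5)) -> float:
    # Assumption: a weighted sum; the weights here are arbitrary placeholders.
    w_p, w_g, w_s = weights
    return (w_p * pragmatic_sim(q, c)
            + w_g * grammatical_sim(q, c)
            + w_s * semantic_sim(q, c))


def rank(query: Question, candidates: list) -> list:
    # Return candidates ordered from most to least similar to the query.
    return sorted(candidates, key=lambda c: question_sim(query, c), reverse=True)
```

Calling `rank(query, candidates)` with pre-analysed `Question` objects returns the candidates in descending order of the combined score. Keeping the three components as separate functions makes it easy to swap any one of them for a measure closer to the paper's definitions once those are known.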