English title | Research on automatic text summarization combining topic feature |
Author names (English) | Luo Fang, Wang Jinghang, He Daosen, Pu Qiumei |
Affiliations (English) | 1. School of Computer Science & Technology, Wuhan University of Technology, Wuhan 430063, China; 2. Dept. of Supply Chain & Information Management, Hang Seng University of Hong Kong, Hong Kong 999077, China; 3. School of Information Engineering, Minzu University of China, Beijing 100081, China |
English abstract | Traditional graph models for text summarization focus only on statistical features or shallow semantic features and neglect the mining and use of deep topic-level semantic features. To address this, this paper proposes MDSR (multi-dimension summarization rank), an automatic text summarization method that incorporates topic features. Specifically, the method uses the LDA model to mine the semantic information of text topics and measures the influence of topic features on a sentence by defining topic importance. It then improves the construction of the probability transition matrix over the graph model's nodes by combining topic features with statistical features and inter-sentence similarity. Finally, it extracts the summary according to the weights of the sentence nodes and evaluates it. The results show that MDSR achieves its best ROUGE scores when the weight ratio of topic features, statistical features, and inter-sentence similarity is 3:4:3. The ROUGE-1, ROUGE-2, and ROUGE-SU4 scores are 53.35%, 35.18%, and 33.86%, respectively, outperforming the compared methods. This indicates that combining topic features in text summarization can effectively improve the accuracy of summary extraction. |
English keywords | TextRank; text summarization; semantic features; LDA; probability transition matrix |
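
The abstract describes blending topic features, statistical features, and inter-sentence similarity into the transition matrix of a TextRank-style graph at a 3:4:3 ratio, but it does not give the formula. The sketch below is one possible reading, not the authors' MDSR implementation: it assumes a simple linear blend of normalized per-sentence scores with row-normalized similarity, followed by a damped power iteration. The function names, the damping factor of 0.85, and the input scores are all hypothetical; in practice the topic scores would come from an LDA model and the statistical scores from features such as sentence position or term frequency.

```python
def normalize(values):
    """Scale non-negative values so they sum to 1 (uniform if all are zero)."""
    total = sum(values)
    n = len(values)
    if total == 0:
        return [1.0 / n] * n
    return [v / total for v in values]


def build_transition_matrix(similarity, topic_scores, stat_scores,
                            weights=(0.3, 0.4, 0.3)):
    """Blend three signals into a row-stochastic matrix.

    Assumption: a linear combination with the 3:4:3 ratio from the abstract,
    ordered as (topic, statistical, similarity). The paper's exact
    construction may differ.
    """
    w_topic, w_stat, w_sim = weights
    topic = normalize(topic_scores)
    stat = normalize(stat_scores)
    matrix = []
    for i, row in enumerate(similarity):
        # Zero out self-similarity, then normalize the row to sum to 1.
        sim = normalize([0.0 if i == j else v for j, v in enumerate(row)])
        matrix.append([w_topic * topic[j] + w_stat * stat[j] + w_sim * sim[j]
                       for j in range(len(row))])
    return matrix


def rank_sentences(matrix, damping=0.85, iterations=100, tol=1e-9):
    """Damped power iteration over the transition matrix (PageRank-style)."""
    n = len(matrix)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [(1 - damping) / n
               + damping * sum(scores[i] * matrix[i][j] for i in range(n))
               for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, scores)) < tol
        scores = new
        if converged:
            break
    return scores


# Hypothetical inputs for three sentences.
similarity = [[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.4],
              [0.2, 0.4, 1.0]]
topic_scores = [0.6, 0.3, 0.1]
stat_scores = [0.5, 0.2, 0.3]

P = build_transition_matrix(similarity, topic_scores, stat_scores)
scores = rank_sentences(P)
# Sentence indices ordered from highest to lowest weight.
ranking = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
```

An extractive summary would then take the top-ranked sentences from `ranking` up to a length budget. Because the topic and statistical scores are normalized to sum to 1 and each similarity row is normalized, every row of `P` sums to 1 whenever the three weights do.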