《计算机应用研究》|Application Research of Computers

Survey on posterior regularization (后验正则化综述)

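Authors Han Yanan (韩亚楠), Liu Jianwei (刘建伟), Luo Xionglin (罗雄麟)
Affiliation Department of Automation, China University of Petroleum (Beijing), Beijing 102249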
Article ID 1001-3695(2021)10-001-2881-07
DOI 10.19734/j.issn.1001-3695.2020.12.0568
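Abstract During model training, the training corpus usually contains a great deal of problem-specific side information that the model often cannot exploit directly. Because its framework is flexible and simple, posterior regularization (PR) has been widely applied in classification tasks, natural language processing, and distant supervision systems. This paper first gives a systematic description of the posterior regularization problem. It then introduces three general classes of posterior regularization frameworks in detail, explaining for each why it was proposed, the specific form of its model, its advantages and disadvantages, and the problems it is suited to solving. It next reviews typical recent applications of several posterior regularization frameworks and points out possible future directions for the framework. Finally, it summarizes the content of the paper.
Keywords posterior regularization; side information; posterior distribution; natural language processing
Funding Research Fund of China University of Petroleum (Beijing) (2462020YXZZ023)
Article URL http://www.arocmag.com/article/01-2021-10-001.html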
English title Survey on posterior regularization
Author names (English) Han Yanan, Liu Jianwei, Luo Xionglin
Affiliation (English) Dept. of Automation, China University of Petroleum, Beijing 102249, China
English abstract During model training, training corpora usually contain a great deal of external, problem-specific information that the model cannot use directly. Owing to the flexibility and simplicity of its framework, posterior regularization (PR) has been widely applied in natural language processing and in sample selection for classification tasks. The aim of the posterior regularization framework is to constrain the model posterior on unlabeled data, thereby guiding the model to learn and achieve the desired performance. First, this paper systematically describes the PR problem. It then introduces three general PR frameworks, giving for each its motivation, the specific form of its model, its advantages, the problems it can solve, and the typical application scenarios in which it can be used. Finally, the paper points out future research directions for posterior regularization models and summarizes the whole paper.
English keywords posterior regularization; side information; posterior distribution; natural language processing
Received 2020/12/21
Revised 2021/2/12
Pages 2881-2887,2903
CLC number TP391
Document code A