《计算机应用研究》|Application Research of Computers

基于多核系统的并行线性RankSVM算法

Efficient parallel algorithm for linear RankSVM on multi-core systems

免费全文下载 (已被下载 次)  
获取PDF全文
作者 聂慧,彭娇,金晶,李康顺
机构 1.广东科技学院 计算机系,广东 东莞 523000;2.中山大学 数据科学与计算机学院,广州 510006;3.华南农业大学 数学与信息学院/软件学院,广州 510006
统计 摘要被查看 次,已被下载
文章编号 1001-3695(2017)01-0046-06
DOI 10.3969/j.issn.1001-3695.2017.01.009
摘要 现有的线性RankSVM已得到较有效的研究,但在训练大规模的线性RankSVM时,过长的训练时间依然难以让人接受。通过对当前最先进算法Tree-TRON的分析可知,利用信任区域的牛顿迭代(trust region Newton method,TRON)去训练线性RankSVM模型涉及大量的Hessian-vector内积(Hessian-vector product)计算,同时完成Hessian-vector内积计算又需计算大量的辅助变量和矩阵运算。为了有效地加速与Hessian-vector内积有关的计算,在多核系统下提出了一种高效的并行算法(命名为PRankSVM)用于提高大规模线性RankSVM的训练速度。PRankSVM的特征主要体现为两个方面:训练数据按不同的查询划分为不同的子问题;在多核系统下,利用多核加速辅助变量和相关矩阵的计算。通过实验分析可知,相较于现有的算法(如Tree-TRON),PRankSVM不仅可以有效地提高训练速度,而且可以有效地确保预测的准确率。
关键词 排序学习;线性RankSVM模型;并行计算;多核系统
基金项目 国家自然科学基金资助项目(61673157)
广东省自然科学基金资助项目(2014A030313454)
本文URL http://www.arocmag.com/article/01-2017-01-009.html
英文标题 Efficient parallel algorithm for linear RankSVM on multi-core systems
作者英文名 Nie Hui, Peng Jiao, Jin Jing, Li Kangshun
机构英文名 1.Dept.ofComputerScience,GuangdongUniversityofScience&Technology,DongguanGuangdong523000,China;2.SchoolofInformationScience&Technology,SunYatsenUniversity,Guangzhou510006,China;3.CollegeofMathematics&Informatics,CollegeofSoftwareEngineering,SouthChinaAgriculturalUniversity,Guangzhou510006,China
英文摘要 Many effective linear RankSVM algorithms have been studied extensively.However, if making use of any one of them to deal with the large-scale linear RankSVM, then it must be taken extremely lengthy training time.According to the analysis of the existing state-of-the-art algorithm Tree-TRON, if used trust region Newton method (TRON) to train the linear RankSVM, massive Hessian-vector products and the computation of the auxiliary variables could affect the training speed significantly.To efficiently accelerate these computations, this paper proposed an efficient parallel algorithm (named PRankSVM) on multi-core systems.All in all, two important issues should be well handled when designing PRankSVM on multi-core systems.First, it divided the training set into several subsets in terms of different queries.Second, it efficiently utilized the great computational power of the multi-core system to improve the Hessian-vector products and the computation of the auxiliary variables.The experimental results show that PRankSVM not only can obtain the excellent convergence speed, but also can ensure the accuracy in prediction, while comparing with the existing methods.
英文关键词 learning to rank; linear RankSVM; parallel computing; multi-core system
参考文献 查看稿件参考文献
  [1] Andrew T. Learning to rank[J] . Information Retrieval, 2005, 8(3):359-381.
[2] Cao Z, Qin T, Liu T Y, et al. Learning to rank:from pairwise 540 approach to listwise approach[C] //Proc of the 24th International Conference on Machine Learning. New York:ACM Press, 2007:129-136.
[3] Fen Xia, Liu Tieyan, Wang Jue, et al. Listwise approach to learning to rank-Theory and Algorithm[C] //Proc of the 25th International Conference on Machine Learning. New York:ACM Press, 2008:1-22.
[4] Burges C, Shaked T, Renshaw E, et al. Learning to rank using gradient descent[C] //Proc of the 22th International Conference on Machine Learning. New York:ACM Press, 2005:89-96.
[5] Zheng Z, Zha H, Zhang T, et al. A general boosting method and its application to learning ranking functions for Web search[C] //Neural Information Processing Systems. 2007:1697-1704.
[6] Jin J, Lin X. Efficient parallel algorithms for linear ranksvm on gpu[C] //Proc of Network and parallel computing. Berlin:Springer, 2014:181-194.
[7] Herbrich R, Graepel T, Obermayer K. Large margin rank boundaries for ordinal regression[C] //Proc of Neural Information Processing Systems. 2000:115-132.
[8] Cortes C, Vapnik V. Support-vector networks[J] . Machine Learning, 1995, 20(3):273-297.
[9] Chapelle S, Keerthi S. Efficient algorithms for ranking with SVMs[J] . Information Retrieval, 2010, 13(3):201-215.
[10] Airola T, Pahikkala T S. Training linear ranking SVMs in linearithmic time using red-black trees[J] . Pattern Recognition Letters, 2011, 32(9):1328-1336.
[11] Kuo T M, Lee C P, Lin C J. Large-scale kernel RankSVM[J] . MIT Press Journals, 2014, 26(14):781-817.
[12] Dembo R S, Steihaug T. Truncated-Newton algorithms for large-scale unconstrained optimization[J] . Mathematical Programming, 1983, 26(2):190-212.
[13] Lin C J, Mor′e J J. Newton’s method for large bound-constrained optimization problems[J] . SIAM Journal on Optimization, 1999, 9(4):1100-1127.
[14] Joachims T. Training linear svms in linear time[C] //Proc of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York:ACM Press, 2006:217-226.
[15] Lee C P, Lin C J. Large-scale linear rankSVM[J] . Neural Computation, 2014, 26(4):781-817.
[16] Cossock D, Zhang T. Statistical analysis of bayes optimal subset ranking[J] . IEEE Trans on Information Theory, 2008, 54(11):5140-5154.
[17] Yu Hwanjo, Kim Jinha, Kim Youngdae, et al. An efficient method for learning nonlinear ranking SVM functions[J] . Information Sciences, 2012, 209(22):37-48.
收稿日期 2015/10/7
修回日期 2015/12/14
页码 46-51,57
中图分类号 TP181;TP301.6
文献标志码 A