Application Research of Computers (《计算机应用研究》)

Online learning algorithm based on multi-task and multi-kernel for stream data (面向数据流的多任务多核在线学习算法)

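Authors Pei Le (裴乐), Liu Qun (刘群)
Affiliation Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, China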
Article ID 1001-3695(2019)03-005-0668-05
DOI 10.19734/j.issn.1001-3695.2017.09.0921
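Abstract Existing online learning algorithms still fall short in accuracy when processing data streams, so this paper proposes a new multi-task multi-kernel online learning model to improve the accuracy of data stream prediction. The model retains multi-task multi-kernel learning and extends it to the online setting, yielding a new online learning algorithm; it also maintains a fixed-size data window over the input, trading a small amount of storage for data completeness. The experiments analyze the choice of kernel function and the size of the training sample set in some detail. Results on UCI data and real airport passenger flow data indicate that the approach maintains the accuracy and real-time performance of stream data processing and has practical application value.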
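The abstract names two ingredients: a combination of several kernels learned online, and a fixed-size window over the incoming samples. The block below is a minimal single-task sketch of those two ideas only, not the authors' algorithm, whose update rules and multi-task coupling are not given on this page. Everything in it is an assumption made for illustration: the class name `WindowedMultiKernelRegressor`, the RBF kernels, the per-kernel gradient step with rate `eta`, and the Hedge-style (exponential weights) update of the kernel weights with rate `beta`.

```python
import math
from collections import deque


def rbf_kernel(gamma):
    """Return an RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def k(a, b):
        return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return k


class WindowedMultiKernelRegressor:
    """Illustrative online multi-kernel regressor over a sliding window.

    Hypothetical sketch: one kernel expansion per kernel, combined with
    weights that are updated multiplicatively from each kernel's loss.
    """

    def __init__(self, kernels, window_size=50, eta=0.5, beta=1.0):
        self.kernels = kernels
        self.eta = eta    # step size of each per-kernel learner (assumed)
        self.beta = beta  # learning rate of the kernel weights (assumed)
        # deque(maxlen=...) discards the oldest sample once the window is full,
        # so memory stays bounded however long the stream runs.
        self.window = deque(maxlen=window_size)  # items: (x, [alpha per kernel])
        self.weights = [1.0 / len(kernels)] * len(kernels)

    def _per_kernel(self, x):
        # Prediction of each kernel expansion using only windowed samples.
        return [
            sum(alphas[j] * k(xi, x) for xi, alphas in self.window)
            for j, k in enumerate(self.kernels)
        ]

    def predict(self, x):
        return sum(w * f for w, f in zip(self.weights, self._per_kernel(x)))

    def update(self, x, y):
        """Predict for x, then learn from the revealed target y."""
        preds = self._per_kernel(x)
        y_hat = sum(w * f for w, f in zip(self.weights, preds))
        errors = [y - f for f in preds]
        # Exponential-weights step: kernels with larger squared error
        # (clipped to [0, 1]) lose weight; then renormalise to sum to 1.
        scaled = [
            w * math.exp(-self.beta * min(e * e, 1.0))
            for w, e in zip(self.weights, errors)
        ]
        total = sum(scaled)
        self.weights = [s / total for s in scaled]
        # Functional gradient step: the new sample enters every expansion
        # with coefficient eta * error for that kernel.
        self.window.append((x, [self.eta * e for e in errors]))
        return y_hat
```

A typical loop would call `model.update(x, y)` once per arriving sample, with `x` a tuple of features, and read `model.weights` to see which kernel bandwidth the stream currently favours. Extending the sketch to several tasks would require a shared structure across tasks, which is left out here.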
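Keywords multi-task multi-kernel learning; online learning; streaming data; support vector machine
Funding National Key Research and Development Program of China (2016QY01W0200)
Article URL http://www.arocmag.com/article/01-2019-03-005.html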
English title Online learning algorithm based on multi-task and multi-kernel for stream data
Authors (English) Pei Le, Liu Qun
Affiliation (English) Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts & Telecommunications, Chongqing 400065, China
English abstract For data stream prediction, existing online learning algorithms still have shortcomings in accuracy. This paper therefore proposes a new multi-task multi-kernel online learning model to improve the accuracy of data stream prediction. Building on multi-task multiple-kernel learning, it extends the model to online learning to obtain a new online learning algorithm, while maintaining a fixed-size window over the input data to preserve data integrity at a small cost in space. The experimental part analyzes the selection of the kernel function and the size of the training sample set in detail. Analysis of UCI data and actual airport passenger flow data shows that the proposed algorithm can ensure the accuracy and real-time performance of stream data processing and has practical value.
English keywords multi-task and multi-kernel learning; online learning; streaming data; SVM
Received 2017/9/17
Revised 2017/10/31
Pages 668-672
CLC number TP181; TP301.6
Document code A