《计算机应用研究》|Application Research of Computers

稀疏条件下的重叠子空间聚类算法

Novel algorithm of overlapping subspace clustering under sparse condition

免费全文下载 (已被下载 次)  
获取PDF全文
作者 邱云飞,费博雯,刘大千,刘兴
机构 辽宁工程技术大学 a.软件学院;b.工商管理学院;c.电子与信息工程学院,辽宁 葫芦岛 125105
统计 摘要被查看 次,已被下载
文章编号 1001-3695(2019)03-003-0657-06
DOI 10.19734/j.issn.1001-3695.2017.08.0904
摘要 现有子空间聚类算法不能很好地平衡子空间数据的稠密性和不同子空间数据稀疏性的关系,且无法处理数据的重叠问题。针对上述问题,提出一种稀疏条件下的重叠子空间聚类(OSCSC)算法。算法利用l1范数和Frobenius范数的混合范数表示方法建立子空间表示模型,并对l1范数正则项进行加权处理,提高不同子空间的稀疏性和同一子空间的稠密性;然后对划分好的子空间使用一种服从指数族分布的重叠概率模型进行二次校验,判断不同子空间数据的重叠情况,进一步提高聚类的准确率。在人造数据集和真实数据集上分别进行测试,实验结果表明,OSCSC算法能够获得良好的聚类结果。
关键词 重叠子空间聚类;混合范数;重叠概率模型;指数族分布
基金项目 国家自然科学基金青年科学基金资助项目(61401185)
本文URL http://www.arocmag.com/article/01-2019-03-003.html
英文标题 Novel algorithm of overlapping subspace clustering under sparse condition
作者英文名 Qiu Yunfei, Fei Bowen, Liu Daqian, Liu Xing
机构英文名 a.SchoolofSoftware,b.SchoolofBusiness&Management,c.SchoolofElectronicsInformationEngineering,LiaoningTechnicalUniversity,HuludaoLiaoning125105,China
英文摘要 The existing subspace clustering algorithms cannot balance the density of the data in the same subspace and the sparsity of the data between different subspaces and most algorithms cannot solve the overlap of data. To solve the above problems, this paper proposed a novel algorithm of overlapping subspace clustering algorithm under sparse condition (OSCSC) . The algorithm used the mixed norm representation method of l1 norm and Frobenius norm to establish the subspace representation model, and the weighted l1 norm regular term could improve the sparsity of different subspaces and the density of the same subspace. Then, the algorithm performed rechecks on the partitioned subspaces by using an overlapping probability model subject to exponential family distribution to determine whether exist overlapping in different subspaces, which could further improve the accuracy of clustering. The results of the experiment on both artificial datasets and real-world datasets show that the algorithm has better clustering performance by being compared to other contrast algorithms.
英文关键词 overlapping subspace clustering; mixed norm; overlapping probability model; exponential family distribution
参考文献 查看稿件参考文献
  [1] Elhamifar E, Vidal R. Sparsity in unions of subspaces for classification and clustering of high-dimensional data[C] // Proc of the 49th Annual Allerton Conference on Communication, Control, and Computing. Piscataway, NJ:IEEE Press, 2011:1085-1089.
[2] Peng Xi, Tang Huajin, Zhang Lei, et al. A unified framework for representation-based subspace clustering of out-of-sample and large-scale data[J] . IEEE Trans on Neural Networks and Learning Systems, 2016, 27(12):2499-2512.
[3] 陈爱国, 王士同. 基于多代表点的大规模数据模糊聚类算法[J] . 控制与决策, 2016, 31(12):2122-2130. (Chen Aiguo, Wang Shitong. Fuzzy clustering algorithm based on multiple medoids for large-scale data[J] . Control and Decision, 2016, 31(12):2122-2130. )
[4] Hu Han, Lin Zhouchen, Feng Jianjiang, et al. Smooth representation clustering[C] //Proc of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ:IEEE Press, 2014:3834-3841.
[5] Agrawal R, Geherke J, Gunopulos D, et al. Automatic subspace clustering of high dimensional data[J] . Data Mining and Know-ledge Discovering, 2005, 11(1):5-33.
[6] 王卫卫, 李小平, 冯象初, 等. 稀疏子空间聚类综述[J] . 自动化学报, 2015, 41(8):1373-1384. (Wang Weiwei, Li Xiaoping, Feng Xiangchu, et al. A survey on sparse subspace clustering[J] . Acta Automatica Sinica, 2015, 41(8):1373-1384. )
[7] Elhamifar E, Vidal R. Sparse subspace clustering:algorithm, theory, and applications[J] . IEEE Trans on Pattern Analysis and Machine Intelligence, 2013, 35(11):2765-2781.
[8] Lu Canyi, Min Hai, Zhao Zhongqiu, et al. Robust and efficient subspace segmentation via least squares regression[C] //Proc of European Conference on Computer Vision. Berlin:Springer, 2012:347-360.
[9] Liu Guangcan, Lin Zhouchen, Yu Yong. Robust subspace segmentation by low-rank representation[C] //Proc of the 27th International Conference on Machine Learning. 2010:663-670.
[10] Xu Jun, Xu Kui, Chen Ke, et al. Reweighted sparse subspace clustering[J] . Computer Vision and Image Understanding, 2015, 138(9):25-37.
[11] 张涛, 唐振民, 吕建勇. 一种基于低秩表示的子空间聚类改进算法[J] . 电子与信息学报, 2016, 38(11):2811-2818. (Zhang Tao, Tang Zhenmin, Lyu Jianyong. Improved algorithm based on low rank representation[J] . Journal of Electronics & Information Technology, 2016, 38(11):2811-2818. )
[12] Baadel S, Thabtah F, Lu J. Overlapping clustering:a review[C] //Proc of SAI Computing Conference. Piscataway, NJ:IEEE Press, 2016:233-237.
[13] Banerjee A, Krumpelman C, Ghosh J, et al. Model-based overlapping clustering[C] //Proc of the 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York:ACM Press, 2005:532-537.
[14] Fu Qiang, Banerjee A. Bayesian overlapping subspace clustering[C] //Proc of the 9th IEEE International Conference on Data Mining. Piscataway, NJ:IEEE Press, 2009:776-781.
[15] Candès E J, Wakin M B, Boyd S P. Enhancing sparsity by reweigh-ted 1 minimization[J] . Journal of Fourier Analysis and Applications, 2008, 14(5-6):877-905.
[16] Panagakis Y, Kotropoulos C. Elastic net subspace clustering applied to pop/rock music structure analysis[J] . Pattern Recognition Letters, 2014, 38(3):46-53.
[17] Shi Jianbo, Malik J. Normalized cuts and image segmentation[J] . IEEE Trans on Pattern Analysis and Machine Intelligence, 2000, 22(8):888-905.
[18] Fu Qiang, Banerjee A. Multiplication mixture models for overlapping clustering[C] //Proc of IEEE International Conference on Data Mining. Piscataway, NJ:IEEE Press, 2008:791-796.
[19] 刘展杰, 陈晓云. 局部子空间聚类[J] . 自动化学报, 2016, 42(8):1238-1247. (Liu Zhanjie, Chen Xiaoyun. Local subspace clustering[J] . Acta Automatica Sinica, 2016, 42(8):1238-1247. )
[20] Cai Deng, He Xiaofei, Wu Xiaoyun, et al. Non-negative matrix factorization on manifold[C] //Proc of the 8th IEEE International Conference on Data Mining. Piscataway, NJ:IEEE Press, 2008:63-72.
[21] 邓赵红, 张丹丹, 蒋亦樟. 基于划分自适应融合的多视角模糊聚类算法[J] . 控制与决策, 2016, 31(4):593-600. (Deng Zhaohong, Zhang Dandan, Jiang Yizhang. Multi-view fuzzy clustering algorithm based on partition adaptive-fusion[J] . Control and Decision, 2016, 31(4):593-600. )
[22] Aggarwal C C, Wolf J L, Yu P S, et al. Fast algorithms for projected clustering[C] //Proc of ACM SIGKDD International Conference on Management of Data. New York:ACM Press, 1999:61-72.
收稿日期 2017/8/30
修回日期 2017/10/11
页码 657-662
中图分类号 TP301.6
文献标志码 A