Application Research of Computers (计算机应用研究)

Survey of community detection in heterogeneous networks (异质网络社区发现研究进展)

Authors: Yang Yu (阳雨), Guo Yong (郭勇), Li Hailong (李海龙), Deng Bo (邓波)
Affiliation: Beijing Institute of System Engineering, Beijing 100101
Article ID: 1001-3695(2018)10-2881-07
DOI: 10.3969/j.issn.1001-3695.2018.10.001
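Abstract: Heterogeneous networks abstract the information in complex systems into nodes and links of different types. Compared with community detection on homogeneous networks, community detection on heterogeneous networks can uncover more accurate community structures. It models, extracts, and analyzes the multi-dimensional structure, multi-mode information, semantic information, and link relations of a heterogeneous network in order to find relatively cohesive and stable community structures, and it has significant research value for acquiring and mining network information, for information recommendation, and for predicting network evolution. This paper first briefly introduces community detection and heterogeneous networks. It then presents, with examples, the existing approaches to community detection in heterogeneous networks, including methods based on topic models, on combined ranking and clustering, on data reconstruction, and on dimensionality reduction, and points out the characteristics and limitations of each class. Finally, it discusses the challenges the field currently faces in structural complexity, modeling complexity, and data scale. Looking ahead, research on parallel, scalable, and dynamic incremental methods is better suited to today's changing environment.
Keywords: heterogeneous networks; community detection; network structure
Funding: National Natural Science Foundation of China (61402486)
Article URL: http://www.arocmag.com/article/01-2018-10-001.html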
English title: Survey of community detection in heterogeneous networks
Author names (English): Yang Yu, Guo Yong, Li Hailong, Deng Bo
Affiliation (English): Beijing Institute of System Engineering, Beijing 100101, China
English abstract: Most real systems consist of a large number of interacting, multi-typed components, yet most contemporary research models them as homogeneous networks without distinguishing the different types of objects and links. Compared with homogeneous networks, community detection based on heterogeneous networks can obtain more accurate community structures. By modeling and analyzing information such as the multi-dimensional structure, multi-mode information, and semantic meaning of heterogeneous networks, community detection finds relatively stable communities and is valuable for network information collection and mining, information recommendation, and predicting the evolution of networks. This paper first introduces community detection and heterogeneous networks. The current mainstream methods for community detection in heterogeneous networks include topic models, ranking-based clustering, data reconstruction, and dimensionality reduction, among others. The paper summarizes these types of methods and analyzes their performance with practical applications. It also discusses the development trend of community detection in heterogeneous networks. In the future, research on parallel, scalable, and incremental dynamic heterogeneous networks will receive more attention.
Keywords (English): heterogeneous networks; community detection; network structure
Received: 2017/7/19
Revised: 2017/9/14
Pages: 2881-2887
CLC number: TP393
Document code: A