《计算机应用研究》|Application Research of Computers

基于随机游走的大规模图中节点对采样算法

Random walk based node pair sampling algorithm in large graph

免费全文下载 (已被下载 次)  
获取PDF全文
作者 吴春琼,叶东毅
机构 1.福州大学阳光学院,福州 350015;2.福州大学 数学与计算机科学学院,福州 350116
统计 摘要被查看 次,已被下载
文章编号 1001-3695(2015)04-1052-04
DOI 10.3969/j.issn.1001-3695.2015.04.022
摘要 社会网络中的节点对采样可用于大规模社会网络的好友预测和用户兴趣识别。当整个网络的拓扑结构不完全或者随机选择用户的代价很高时,传统的均匀顶点采样方法的性能迅速下降。为此,提出了一种基于随机游走的大规模图中节点对采样算法。首先对社会网络的节点对采样进行了系统分析,对不同跳数下的节点对进行了定义;然后将社会网络转换成等价的网络图。新图中的顶点是原图中的边,新图中边的两个顶点是原图中含有相同顶点的两条边。最后,在新图上应用随机游走模型对节点对进行采样。实验结果表明,提出的方法统计误差小、执行效率高,性能明显优于均匀节点采样的相关算法。
关键词 图;随机游走;均匀顶点采样;社会网络
基金项目 福建省自然科学基金资助项目(2010J01329)
本文URL http://www.arocmag.com/article/01-2015-04-022.html
英文标题 Random walk based node pair sampling algorithm in large graph
作者英文名 WU Chun-qiong, YE Dong-yi
机构英文名 1. Sunshine College of Fuzhou University, Fuzhou 350015, China; 2. Institute of Mathematics & Computer Science, Fuzhou University, Fuzhou 350116, China
英文摘要 Node pair sampling in social networks is useful for friend recommendation and interest targeting. While the topology of the whole network is incomplete, or the cost of random generation of a user is very large, the performances of traditional uniform vertex sampling methods decrease quickly. So, this paper proposed a random walk based node pair sampling algorithm in large graph. Firstly, it analyzed systematically the problem of node pair sampling in social networks, and gave definitions of node pair in different hops. Secondly, it transformed the social network into an equivalent graph, whose nodes were edges in original graph, and whose edges contained two node having the same vertex. Finally, it applied random walk on the new graph and proposed a neighbor random walk sampling algorithm. The experiments show that, the proposed algorithm has less error, better performance, and is obviously better than uniform vertex sampling related methods.
英文关键词 graph; random walk; uniform vertex sampling; social network
参考文献 查看稿件参考文献
  [1] 马帅, 曹洋, 沃天宇, 等. 社会网络与图匹配查询[J] . 中国计算机学会通讯, 2012, 8(4):20-24.
[2] SINGLA P, RICHARDSON M. Yes, there is a correlation:from social networks to personal behavior on the Web[C] //Proc of the 17th International Conference on World Wide Web. New York:ACM Press, 2008:655-664.
[3] 余学军. 六度分割理论成就 SNS[J] . 信息网络, 2009(11):37-37.
[4] XUESONG L, STPHANE B. Sampling connected induced subgraphs uniformly at random[C] //Proc of the 24th International Conference on Scientific and Statistical Database Management. 2013:781-792.
[5] 蔡君, 余顺争. 基于随机聚类采样算法的复杂网络社团探测[J] . 计算机应用研究, 2013, 30(12):3560-3563.
[6] LESKOVEC J, HORVITZ E. Planetary-scale views on a large instant-messaging network[C] //Proc of WWW. 2008:915-924.
[7] ZHOU Rong, HANSEN E A. Breadth-first heuristic search[J] . Artificial Intelligence, 2006, 170(4):385-408.
[8] GJOKA M, KURANT M, BUTTS C T, et al. Walking in facebook:a case study of unbiased sampling of OSNs[C] //Proc of IEEE INFOCOM. 2010:2498-2506.
[9] 夏放怀, 沈振康, 唐朝京, 等. 一种用于实时体绘制系统的自适应采样算法[J] . 电子学报, 2002, 30(3):367-371.
[10] MOHAISEN A, YUN A, KIM Y. Measuring the mixing time of social graphs[C] //Proc of ACM SIGCOMM Internet Measurement Confe-rence. 2010:390-403.
[11] HECKATHORN D D. Respondent-driven sampling Ⅱ:deriving valid population estimates from chain-referral samples of hidden populations[J] . Social Problems, 2002, 49(1):11-34.
[12] SALGANIK J, HECKATHORN D D. Sampling and estimation in hidden populations using respondent-driven sampling[J] . Sociological Methodology, 2004, 34(1):193-239.
[13] WANG Ping-hui, ZHAO Jun-zhou, LIU J C S, et al. Sampling node pairs over graphs[EB/OL] . (2012). http://www. cse. cuhk. edu. hk/%7ephwang/samplingpairreport. pdf.
[14] LESKOVEC J, HUTTENLOCHERD, KLEINBERG J. Predicting positive and negative links in online social networks[C] //Proc of WWW. 2010:641-650.
[15] RIPEANU M, FOSTER I T, IAMNITCHI A. Mapping the Gnutella network:properties of large-scale peer-to-peer systems and implications for system design[J] . IEEE Internet Computing Journal, 2002, 6(1):50-57.
收稿日期 2014/3/20
修回日期 2014/5/15
页码 1052-1055
中图分类号 TP301.5
文献标志码 A