《计算机应用研究》|Application Research of Computers

基于CPU与GPU协作的马尔可夫聚类的并行优化实现

Optimal parallel implementation of Markov clustering based on coordination of CPU and GPU

免费全文下载 (已被下载 次)  
获取PDF全文
作者 陆璐,何芦微
机构 华南理工大学 计算机科学与工程学院,广州 510006
统计 摘要被查看 次,已被下载
文章编号 1001-3695(2018)08-2367-04
DOI 10.3969/j.issn.1001-3695.2018.08.031
摘要 马尔可夫聚类算法(MCL)为网络聚类问题提供了一个有效的方法,尤其是在社区问题和生物信息学方面。然而在MCL中矩阵的expansion是非常耗时的,因为两个大规模矩阵相乘的时间复杂度是n3,每个元素值的计算是独立的,所以expansion和inflation可以并行执行于多核GPU上。一个基本的马尔可夫聚类的并行实现需要使用全邻接矩阵来提高性能,该邻接矩阵通常是稀疏甚至是极大稀疏的。因此,为了优化马尔可夫聚类的并行实现,采用CSR×CSC格式去存储矩阵,大大减少了空间的浪费,并在一定程度上提升了expansion的性能。实验结果表明,在处理大规模网络问题上,Sparse-MCL比CPU-MCL和P-MCL更有效。
关键词 MCL;GPU;稀疏矩阵;OpenCL;并行优化
基金项目 国家自然科学基金资助项目(61370103)
广东省科技计划资助项目(2016B010107004)
广州市产学研重大项目(201604010022)
本文URL http://www.arocmag.com/article/01-2018-08-031.html
英文标题 Optimal parallel implementation of Markov clustering based on coordination of CPU and GPU
作者英文名 Lu Lu, He Luwei
机构英文名 SchoolofComputerScience&Engineering,SouthChinaUniversityofTechnology,Guangzhou510006,China
英文摘要 Markov clustering algorithm provides an effective method for network clustering problem, especially including community problem and bioinformatics.However, the expansion operation is the most time-consuming procedure, since the multiplication of two large-scale phalanxes can cause the time complexity of O(n3).Considering that each element value calculation is independent from others, expansion and inflation can be parallel-executed on the multi-core GPU.First, this paper proposed a basic parallel implementation of Markov clustering which needs the whole adjacent matrix as a traditional method to improve the performance.In addition, the adjacent matrix is usually sparse and sometimes ultra-sparse.Hence, it developed an optimal parallel implementation working with the CSR×CSC multiplication, which significantly decreased the space needed to store the matrix and promotes the performance of the expansion to the extent.The experimental results show that Sparse-MCL is more effective than CPU-MCL and P-MCL when dealing with the big scale network.
英文关键词 MCL; GPU; sparse matrix; OpenCL; parallel optimization
参考文献 查看稿件参考文献
  [1] Van Dongen S. The introduction of MCL:a cluster algorithm for graphs[EB/OL] . (2016-05-06)[2016-09-25] . http://www. micans. org/mcl/.
[2] Schaeffer S E. Graph clustering by flow simulation[D] . Netherlands:University of Utrecht, 2010.
[3] Brohée S, Helden J V. Evaluation of clustering algorithms for protein-protein interaction networks[J] . BMC Bioinformatics, 2006, 7(1):488.
[4] Chen Guocai, Zhao Jieyi, Cohen T, et al. Using ontology fingerprints to disambiguate gene name entities in the biomedical literature[J] . Database:the Journal of BioLogical Databases and Curation, 2015, 2015:bav034.
[5] Bustamam A, Burrage K, Hamilton N A. Fast parallel Markov clustering in bioinformatics using massively parallel graphics parallel and distributed methods in verification[C] //Proc of the 9th International Workshop on Parallel and Distributed Methods in Verification, and 2nd International Workshop on High Performance Computational Systems Biology. Washington DC:IEEE Computer Society, 2010:116-125.
[6] Cuzzocrea A, Mumolo E, Timeus N, et al. GPU-aware genetic estimation of hidden Markov models for workload classification problems[C] //Proc of the 40th Annual Computer Software and Applications Conference. Washington DC:IEEE Computer Society, 2016:674-682.
[7] Li Chuan, Wand M. Combining Markov random fields and convolutional neural networks for image synthesis[C] //Proc of IEEE Conference on Computer Vision and Pattern Recognition. Piscataway, NJ:IEEE Press, 2016:2479-2486.
[8] Zhai Yanlong, Guo Ying, Chen Qiurui, et al. Design and optimization of a big data computing framework based on CPU/GPU cluster[C] //Proc of the 10th International Conference on High Performance Computing and Communications & IEEE International Conference on Embedded and Ubiquitous Computing. Piscataway, NJ:IEEE Press, 2013:1039-1046.
[9] Yung L S, Yang Can, Wan Xiang, et al. GBOOST:a GPU-based tool for detecting gene-gene interactions in genome-wide case control studies[J] . Bioinformatics, 2011, 27(9):1309-1310.
[10] Vouzis P D, Sahinidis N V. GPU-BLAST:using graphics processors to accelerate protein sequence alignment[J] . Bioinformatics, 2011, 27(2):182-188.
[11] Kohchoff J K, Sosnick M H, Hsu W T, et al. CAMPAIGN:an open-source library of GPU-accelerated data clustering algorithms[J] . Bioinformatics, 2011, 27(16):2322-2323.
[12] Bell N, Garland M. Efficient sparse matrix-vector multiplication on CUDA, NVIDIA Technical Report, 2008-004[R] . City of Santa Clara:NVIDIA, 2008.
[13] Gaster B R, Howes L, Kaeli D R, et al. Heterogeneous computing with OpenCL[M] . Beijing:Tsinghua University Press, 2012.
[14] Yao Ping, An Hong, Xu Mu, et al. CuHMMer:a load-balanced CPU-GPU cooperative bioinformatics application[C] //Proc of International Conference on High Performance Computing and Simulation. Piscataway, NJ:IEEE Press, 2010:24-30.
[15] Lancichinetti A, Fortunato S. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities[J] . Physical Review E, Statistical, Nonlinear & Soft Matter Physics, 2009, 80(1 Pt 2):016118.
[16] Newman M. Network data of email, netscience, power and hep[EB/OL] . (2013-07-19)[2016-09-27] . http://www-personal. umich. edu/~mejn/netdata/.
[17] ubelj L. Network data of facebook and slovenia scientists[EB/OL] . (2016-05-23)[2016-09- 27] . http://lovro. lpt. fri. uni-lj. si/.
收稿日期 2017/4/9
修回日期 2017/4/30
页码 2367-2370,2378
中图分类号 TP301.6
文献标志码 A