《计算机应用研究》|Application Research of Computers

基于差异性采样的流数据聚类算法

Stream data clustering algorithm based on differential sampling

Authors 邱云飞, 孙梦冉
Affiliation 辽宁工程技术大学 软件学院, 辽宁 葫芦岛 125105
Article ID 1001-3695(2019)06-010-1646-06
DOI 10.19734/j.issn.1001-3695.2017.12.0808
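Abstract Traditional clustering algorithms face high time complexity, large storage requirements, and relatively low accuracy when clustering stream data. To address these problems, a stream data clustering algorithm based on differential sampling is proposed. First, the stream data is sampled with the differential sampling method and the sample points are used to construct a kernel matrix. Next, the kernel fuzzy C-means clustering algorithm clusters the points in the kernel matrix, which yields a labeled sample kernel matrix. Finally, the labeled sample kernel matrix is used to partition the points in the stream. In addition, a fading cluster mechanism updates the sample kernel matrix in real time. Experimental results show that, compared with traditional clustering algorithms, the proposed algorithm achieves lower time complexity while clustering in real time and obtaining fairly satisfactory clustering results.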
Keywords differential sampling; fading cluster mechanism; kernel fuzzy C-means; stream data; time complexity
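Funding National Natural Science Foundation of China (61404069)
Scientific Research Project of the Education Department of Liaoning Province (LJYL048)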
Article URL http://www.arocmag.com/article/01-2019-06-010.html
English title Stream data clustering algorithm based on differential sampling
Authors (English) Qiu Yunfei, Sun Mengran
Affiliation (English) College of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
English abstract Traditional clustering algorithms suffer from high time complexity, large storage requirements, and low accuracy when clustering stream data. To address these problems, this paper proposes a stream data clustering algorithm based on differential sampling. First, the algorithm samples the stream data with the differential sampling method and uses the sample points to construct a kernel matrix. Then it applies the kernel fuzzy C-means clustering algorithm to the data points in the kernel matrix to obtain a labeled sample kernel matrix. Finally, it uses the labeled kernel matrix to partition the stream data. Meanwhile, the paper adopts a fading cluster mechanism to update the kernel matrix in real time. Experimental results show that, compared with traditional clustering algorithms, the proposed algorithm achieves lower time complexity, clusters in real time, and obtains satisfactory clustering results.
English keywords differential sampling; fading cluster mechanism; kernel fuzzy C-means; stream data; time complexity
Received 2017/12/18
Revised 2018/1/29
Pages 1646-1651
CLC number TP391.9
Document code A
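
The pipeline summarized in the abstract (sample the stream, build a kernel matrix from the samples, cluster it with kernel fuzzy C-means, label incoming points from the labeled samples, and fade old information) can be illustrated with a minimal Python sketch. The abstract does not give the sampling rule, the kernel, or the decay function, so everything below is an assumption rather than the authors' method: `dissimilarity_sample` is a distance-threshold stand-in for differential sampling, the kernel is a Gaussian RBF with a hypothetical `gamma`, and `fade` uses an exponential decay of the form 2^(-lam * dt). All function and parameter names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

def dissimilarity_sample(stream, threshold):
    """Assumed stand-in sampler: keep a point only if it lies farther
    than `threshold` from every sample kept so far."""
    samples = []
    for x in stream:
        if all(np.linalg.norm(x - s) > threshold for s in samples):
            samples.append(x)
    return np.array(samples)

def cluster_weights(U, m):
    """Per-cluster prototype weights u^m, normalized so each row sums to 1."""
    W = U ** m
    return W / W.sum(axis=1, keepdims=True)

def kernel_fcm(K, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Kernel fuzzy C-means on a precomputed n x n kernel matrix.
    Returns the c x n membership matrix U (columns sum to 1)."""
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    U = rng.random((c, n))
    U /= U.sum(axis=0)
    for _ in range(iters):
        W = cluster_weights(U, m)
        # Squared feature-space distance from each sample to each prototype:
        # K_jj - 2 * sum_i W_ki K_ij + sum_{i,l} W_ki W_kl K_il
        proto_norm = np.einsum('ki,il,kl->k', W, K, W)
        d2 = np.diag(K)[None, :] - 2.0 * W @ K + proto_norm[:, None]
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        U_new = d2 ** (-1.0 / (m - 1.0))
        U_new /= U_new.sum(axis=0)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U

def assign(X_new, samples, U, gamma, m=2.0):
    """Label new stream points by their nearest prototype in feature space."""
    W = cluster_weights(U, m)
    K_ss = rbf_kernel(samples, samples, gamma)
    K_xs = rbf_kernel(X_new, samples, gamma)
    proto_norm = np.einsum('ki,il,kl->k', W, K_ss, W)
    # For the RBF kernel, k(x, x) = 1 for every point x.
    d2 = 1.0 - 2.0 * K_xs @ W.T + proto_norm[None, :]
    return d2.argmin(axis=1)

def fade(weights, lam, dt, min_weight):
    """Assumed fading rule: decay weights by 2^(-lam * dt) and flag
    which samples remain at or above `min_weight`."""
    w = weights * 2.0 ** (-lam * dt)
    return w, w >= min_weight
```

In this sketch only the sampled points enter the kernel matrix, so the clustering cost depends on the number of samples rather than on the length of the stream, which is the intuition behind the lower time complexity reported in the abstract. A caller would run `dissimilarity_sample`, build `K = rbf_kernel(samples, samples, gamma)`, call `kernel_fcm(K, c)`, and then pass incoming points to `assign`, dropping samples whose mask from `fade` is false before rebuilding the kernel matrix.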