Application Research of Computers (《计算机应用研究》)

Feature selection method based on EK-medoids cluster and neighborhood distance

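Authors Sun Yinjie (孙印杰), Zhang Xinle (张新乐), Sun Lin (孙林)
Affiliation a. College of Computer & Information Engineering; b. Engineering Technology Research Center for Computing Intelligence & Data Mining of Henan Province, Henan Normal University, Xinxiang, Henan 453007, China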
Article ID 1001-3695(2019)08-008-2279-05
DOI 10.19734/j.issn.1001-3695.2018.02.0093
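Abstract Traditional clustering algorithms consider only the pairwise distances between data points and ignore the global distribution structure of the data. To address this problem, this paper proposes a feature selection method based on EK-medoids clustering and neighborhood distance. First, the effective distances between data samples are computed by sparse reconstruction, and a similarity matrix based on these effective distances is constructed. Then, the similarity matrix is applied in the K-medoids clustering algorithm to obtain new cluster centers, yielding the EK-medoids clustering algorithm, which can effectively cluster the original data set. Finally, an attribute importance measure is defined from the neighborhood distances of the clusters formed by the partition, and a heuristic search is used to design a feature selection algorithm based on EK-medoids clustering and neighborhood distance, which reduces the time complexity of the clustering algorithm. Experimental results show that the algorithm not only effectively improves the accuracy of the clustering results but also selects feature subsets with high classification accuracy.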
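The clustering step described in the abstract, K-medoids driven by a precomputed distance matrix, can be illustrated with a minimal sketch. This is not the authors' EK-medoids implementation: the abstract does not give the sparse-reconstruction formula for the effective distance, so plain Euclidean distance is used here as a stand-in, and the function names (`pairwise_euclidean`, `k_medoids`), the farthest-first initialization, and the toy data are all assumptions made for illustration. Any symmetric dissimilarity matrix, including one built from effective distances, could be passed in place of `D`.

```python
import numpy as np


def pairwise_euclidean(X):
    """Stand-in dissimilarity matrix (assumption: NOT the paper's effective distance)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def k_medoids(dist, k, max_iter=100):
    """Alternating K-medoids on a precomputed (n, n) dissimilarity matrix.

    Assumes k does not exceed the number of distinct samples.
    """
    # Deterministic farthest-first initialization (an illustrative choice):
    # start from the most central sample, then repeatedly add the sample
    # farthest from the medoids chosen so far.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        nearest = dist[:, medoids].min(axis=1)
        medoids.append(int(np.argmax(nearest)))
    medoids = np.array(medoids)

    for _ in range(max_iter):
        # Assignment step: each sample joins its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the previous medoid if a cluster empties
            # Update step: the new medoid is the member with the smallest
            # total distance to all other members of the same cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids

    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


# Toy data: two well-separated groups of three points each.
X = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [5.0, 5.0], [5.3, 5.1], [5.1, 5.4],
])
D = pairwise_euclidean(X)
medoids, labels = k_medoids(D, k=2)
print("medoid indices:", medoids)
print("cluster labels:", labels)
```

Because the routine only reads the matrix `dist`, swapping the distance definition changes the clustering without touching the K-medoids loop, which is the structural idea the abstract attributes to EK-medoids.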
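Keywords feature selection; effective distance; K-medoids clustering; neighborhood distance
Funding National Natural Science Foundation of China (61772176, U1604156, 11702087)
China Postdoctoral Science Foundation (2016M602247)
Science and Technology Innovation Talents Project of Henan Province (184100510003)
Key Scientific and Technological Project of Henan Province (182102210362, 162102210261, 182102210078)
Young Backbone Teacher Training Program of Henan Province Universities (2017GGJS041)
Natural Science Foundation of Henan Province (182300410130, 182300410368)
Key Scientific Research Program of Higher Education Institutions of Henan Province (14A520069)
Science and Technology Research Program of Xinxiang City (CXGG17002)
Doctoral Research Start-up Fund of Henan Normal University (qd15132, qd15129, qd15131)
Youth Science Foundation of Henan Normal University (2015QK23, 2015QK24)
Article URL http://www.arocmag.com/article/01-2019-08-008.html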
Received 2018/2/27
Revised 2018/4/10
Pages 2279-2283
CLC number TP301.6
Document code A