《计算机应用研究》|Application Research of Computers

A feature selection algorithm for pulmonary nodules based on feature vectorization (基于特征矢量化的肺结节特征选择算法)

Feature selection based on feature vectorization on computer tomography scan of pulmonary nodules

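Authors He Xingyi (贺兴怡), Gong Jing (龚敬), Wang Lijia (王丽嘉), Nie Shengdong (聂生东)
Affiliation Institute of Medical Imaging Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China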
Article ID 1001-3695(2018)08-2544-05
DOI 10.3969/j.issn.1001-3695.2018.08.076
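Abstract Feature selection in benign/malignant classification models for pulmonary nodules tends to destroy feature diversity. To address this problem, a feature selection method that vectorizes pulmonary nodule features is proposed. Each pulmonary nodule feature is assumed to be a vector composed of its data and its type. Features are added to the corresponding feature subset according to their type, and the Relief algorithm is used to evaluate the classification importance of both individual features and feature subsets. An optimized feature subset is then obtained by screening with a dynamic threshold. In a classification experiment on 150 pulmonary nodule samples, the proposed algorithm achieved a sensitivity of 94.7%, a specificity of 93.7%, a false alarm rate of 5.2%, and an area under the receiver operating characteristic curve of 97.3%. The analysis shows that the proposed algorithm leaves the diversity of pulmonary nodule features almost intact and significantly improves the accuracy of benign/malignant classification of pulmonary nodules.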
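The pipeline summarized in the abstract (tag each feature with a type, group features by type, score them with Relief, filter with a threshold) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a feature subset is scored or how the dynamic threshold is set, so the sketch assumes the subset score is the mean Relief weight of its members and uses that mean as the per-type threshold. The names `relief_weights`, `select_by_type`, and `types` are hypothetical, and each class is assumed to contain at least two samples.

```python
import numpy as np

def relief_weights(X, y):
    """Basic two-class Relief weights, visiting every sample once."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Scale each feature to [0, 1] so differences are comparable across features.
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    n, d = Xs.shape
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(Xs - Xs[i]).sum(axis=1)
        dist[i] = np.inf  # never pick the sample itself as its neighbour
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))    # nearest same-class sample
        miss = np.argmin(np.where(~same, dist, np.inf))  # nearest other-class sample
        # Reward features that separate the classes, penalize those that split a class.
        w += (Xs[i] - Xs[miss]) ** 2 - (Xs[i] - Xs[hit]) ** 2
    return w / n

def select_by_type(X, y, types):
    """Keep, within each feature type, the features at or above the subset mean.

    ASSUMPTION: the subset score (mean weight) and the per-type threshold are
    illustrative choices; the abstract does not define them.
    """
    w = relief_weights(X, y)
    types = np.asarray(types)
    subset_score = {}
    keep = []
    for t in np.unique(types):
        idx = np.flatnonzero(types == t)
        subset_score[str(t)] = float(w[idx].mean())
        # The maximum is never below the mean, so every type keeps at least one feature.
        keep.extend(int(j) for j in idx[w[idx] >= w[idx].mean()])
    return sorted(keep), subset_score
```

Because the threshold is computed separately for each type, no type can be filtered out completely, which is one simple way to keep the selected set diverse.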
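Keywords feature selection; pulmonary nodules; Relief algorithm; computed tomography
Funding National Natural Science Foundation of China (60972122)
Natural Science Foundation of Shanghai (14ZR1427900)
Article URL http://www.arocmag.com/article/01-2018-08-076.html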
English title Feature selection based on feature vectorization on computer tomography scan of pulmonary nodules
Authors (English) He Xingyi, Gong Jing, Wang Lijia, Nie Shengdong
Affiliation (English) Institute of Medical Imaging Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China
English abstract To address the loss of feature diversity caused by feature selection when distinguishing malignant from benign pulmonary nodules, this paper develops a new feature selection method based on feature vectorization (FSBFV). First, each feature is assumed to be a vector composed of its data and its type. Second, the feature space is divided into several feature subsets according to feature type; Relief is applied to evaluate the quality of both features and feature subsets, and a dynamic threshold is applied to select the high-quality features and feature subsets. Finally, the selected feature subsets are combined into one. In a classification experiment on 150 nodules, the proposed method achieved a sensitivity of 94.7%, a specificity of 93.7%, a false alarm rate of 5.2%, and an area under the receiver operating characteristic curve of 97.3%. Further analysis indicates that the proposed method preserves the diversity of features in the optimized set and helps improve the classification accuracy of benign and malignant pulmonary nodules.
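For readers unfamiliar with the reported metrics, the following sketch shows the standard definitions of sensitivity, specificity, and the area under the ROC curve for a binary classifier. The function name `binary_metrics`, the `threshold` parameter, and the convention that label 1 means malignant are assumptions made for illustration. The false alarm rate is left out on purpose: the abstract does not define it, and the reported 5.2% does not equal 1 minus the reported specificity, so the paper presumably uses a different definition.

```python
import numpy as np

def binary_metrics(y_true, scores, threshold=0.5):
    """Sensitivity, specificity and AUC from labels (1 = positive) and scores."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    pred = scores >= threshold
    tp = np.sum(pred & y_true)     # positives correctly flagged
    fn = np.sum(~pred & y_true)    # positives missed
    tn = np.sum(~pred & ~y_true)   # negatives correctly cleared
    fp = np.sum(pred & ~y_true)    # negatives wrongly flagged
    # AUC as the probability that a random positive outscores a random
    # negative, counting ties as one half.
    diff = scores[y_true][:, None] - scores[~y_true][None, :]
    auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size
    return {
        "sensitivity": float(tp / (tp + fn)),
        "specificity": float(tn / (tn + fp)),
        "auc": float(auc),
    }
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds.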
English keywords feature selection; pulmonary nodules; Relief algorithm; computer tomography scan
Received 2017/3/21
Revised 2017/4/27
Pages 2544-2548
CLC number TP391.41;TP301.6
Document code A