Application Research of Computers (《计算机应用研究》)


Survey of malicious PDF documents detection

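Authors Lin Yangdong (林杨东), Du Xuehui (杜学绘), Sun Yi (孙奕)
Affiliation Henan Provincial Key Laboratory of Information Security, Information Engineering University, Zhengzhou 450004, China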
Article ID 1001-3695(2018)08-2251-05
DOI 10.3969/j.issn.1001-3695.2018.08.003
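Abstract Vulnerabilities in PDF and the attacks that exploit them evolve rapidly, and traditional malicious PDF document detection techniques struggle to cope with new threats. Research on malicious PDF document detection has made some progress. To address the remaining shortcomings of this technology, this paper uses a literature analysis method. It first discusses why such detection is necessary and outlines the related concepts and the basic detection framework. It then classifies existing schemes by the analysis technique they use and compares them in terms of scope of application, detection effectiveness, and detection efficiency. Finally, it summarizes the current hot topics and development prospects of the field.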
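Keywords PDF; document detection; static analysis; dynamic analysis
Funding National High Technology Research and Development Program of China ("863" Program) (2015AA016006)
Article URL http://www.arocmag.com/article/01-2018-08-003.html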
English title Survey of malicious PDF documents detection
Author names (English) Lin Yangdong, Du Xuehui, Sun Yi
Affiliation (English) Henan Provincial Key Laboratory of Information Security, Information Engineering University, Zhengzhou 450004, China
English abstract Vulnerabilities in PDF and targeted attacks using malicious PDF documents pose a great threat to the networked office environments of governments, enterprises, and other important organizations, so malicious PDF document detection has become a hot topic in network security research in recent years. Although malicious PDF document detection technology has made some progress, this paper aims to identify the deficiencies of existing schemes. First, it discusses the necessity of detection and briefly introduces the related concepts and the basic detection framework. Second, it divides existing schemes into several categories according to their analysis techniques and compares them in terms of application scope, detection effectiveness, and detection efficiency. Finally, it points out the existing problems and development prospects to provide a reference for further research.
English keywords PDF; document detection; static analysis; dynamic analysis
Received 2017/6/23
Revised 2017/8/11
Pages 2251-2255
CLC number TP309.2
Document code A