《计算机应用研究》|Application Research of Computers

Survey of data compression techniques for GPU pipeline

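Authors Han Limin (韩立敏), Tian Ze (田泽), Zhang Jun (张骏), Zheng Xinjian (郑新建), Ren Xianglong (任向隆)
Affiliation Xi'an Aeronautics Computing Technique Research Institute of China Aeronautical Industry, Xi'an 710068, China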
Article ID 1001-3695(2018)03-0648-06
DOI 10.3969/j.issn.1001-3695.2018.03.002
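Abstract Improving power efficiency is one of the key design goals for high-end GPUs. Applying data compression at multiple stages of the 3D graphics rendering pipeline can significantly reduce accesses to the GPU's off-chip memory, which improves rendering performance and lowers power consumption. To summarize and analyze how data compression is currently applied in the GPU pipeline, this paper starts from the structural characteristics of the GPU rendering pipeline and memory system and outlines the key features of the compression algorithms dedicated to the various buffer objects and to texture data. It then analyzes the state of research on graphics pipeline data compression, its shortcomings, and its open challenges, and, based on application requirements, identifies directions for further research on GPU pipeline data compression.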
Keywords graphics processing unit; data compression; 3D rendering pipeline; power efficiency
Article URL http://www.arocmag.com/article/01-2018-03-002.html
English title Survey of data compression techniques for GPU pipeline
Authors (English) Han Limin, Tian Ze, Zhang Jun, Zheng Xinjian, Ren Xianglong
Affiliation (English) Xi'an Aeronautics Computing Technique Research Institute of China Aeronautical Industry, Xi'an 710068, China
English abstract One of the most important design goals for a high-end GPU is improving its power efficiency. Data compression can be used in various stages of the GPU 3D rendering pipeline to reduce off-chip memory traffic, yielding higher performance and lower power consumption. To summarize the research progress and application status of data compression techniques for the GPU pipeline, this paper outlines the key features of the compression algorithms specialized for buffer objects and textures, based on the structural characteristics of the GPU 3D rendering pipeline and memory system. It also discusses the research status, shortcomings, and challenges of data compression in the GPU pipeline. Finally, it indicates directions for further research on GPU data compression according to application demands.
English keywords graphics processing unit (GPU); data compression; 3D graphics rendering pipeline; power efficiency
Received 2017/2/14
Revised 2017/3/27
Pages 648-653
CLC number TP303
Document code A