design of memory efficient vlsi architecture for real time multimedia
S. Allwin Devaraj et al., International Journal of Computer & IT [ISSN No. (Print): 2320-8074]

DESIGN OF MEMORY EFFICIENT VLSI ARCHITECTURE FOR REAL TIME MULTIMEDIA APPLICATION

1 S. Allwin Devaraj, babu.allwin@gmail.com, Assistant Professor, Department of ECE
2 R. Helen Vedanayagi Anita, rhelenanita@gmail.com, UG Student, Department of ECE
Francis Xavier College of Engineering, Tirunelveli

ABSTRACT: The on-chip memory hierarchy for video processing contains the data memory and context memory organizations. Compressing the memory space is an important aspect of VLSI design because it reduces power consumption, power dissipation and area. The HCC and LDO techniques are used to compress the on-chip memory space and thereby reduce the reconfiguration time and the data-reference time. In HCC, the contexts are constructed in a hierarchical fashion in order to eliminate the repetitive portions of the contexts. LDO increases the reuse ratio of the memory space automatically, including for data that are referenced several times. Hierarchical storage and reuse save the required memory space without affecting performance. In future work, the macroblock size will be increased from 8x8 to 64x64 in the H.265 CODEC to achieve a better compression rate, and pipelining will be used to increase the throughput.

Keywords: HCC, LDO, CODEC, On-chip memory, Macro blocks

I. INTRODUCTION
Multimedia technology has improved the quality of human life, moving from large devices to portable devices, from the home to the outdoors, and from specific groups of people to everybody. Digital video technology is considered the most important part of multimedia, providing applications such as high-definition TV, 3D graphics, digital cinema, cameras, and so on. The multimedia system used to store and/or transmit video data has therefore become an essential concern. The key role of a video system is to reduce the amount of video data without sacrificing its quality by means of a video CODEC (encoder and decoder), a process also known as video coding.
© 2015, IJCIT All Rights Reserved

Video accounts for 66% of total Internet data traffic, and that share continues to increase rapidly. Users want to watch high-quality video, but for online video providers the cost of network bandwidth and storage devices grows every year [9]. Compressing a video reduces its file size, which matters because smaller files upload faster, save bandwidth and storage costs, and load more quickly on playback.

A video is composed of a series of still frames, normally 24-60 frames per second, and only part of the image changes from frame to frame. Instead of storing two nearly identical frames in a video file, only the parts of the image that change are recorded. For example, if a friend is waving to the camera and the friend's arm is the only thing moving in the shot, then the image information for the arm is the only thing recorded. The method a computer uses to determine the amount of change between frames is called the codec. Every few frames, the codec picks a key frame that serves as a reference for the frames that follow it.

H.261 and H.263 are widely used for real-time video communication over networks. H.264 is now a widely adopted international standard for video compression, and it provides significant improvements in coding efficiency, latency, complexity and robustness. As the H.264 format becomes more broadly available in network cameras, video encoders and video management software, system designers and integrators need to make sure that the products and vendors they choose support this new open standard. H.264 is currently one of the most commonly used formats for the recording, compression and distribution of high-definition video.
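The idea of recording only the parts of the image that change between frames can be illustrated with a small sketch. The Python below is illustrative only and is not the algorithm of any particular codec: the function names `changed_blocks` and `rebuild`, the 8x8 block size and the 16x16 test frames are all assumptions made for this example.

```python
def changed_blocks(prev, curr, block=8):
    """Return {(row, col): pixels} for every block of `curr` that differs from `prev`."""
    changed = {}
    for r in range(0, len(curr), block):
        for c in range(0, len(curr[0]), block):
            old = [row[c:c + block] for row in prev[r:r + block]]
            new = [row[c:c + block] for row in curr[r:r + block]]
            if new != old:  # store the block only when something in it moved
                changed[(r, c)] = new
    return changed


def rebuild(prev, changed):
    """Reconstruct the current frame from the previous frame plus the stored blocks."""
    frame = [row[:] for row in prev]
    for (r, c), pixels in changed.items():
        for dr, row in enumerate(pixels):
            frame[r + dr][c:c + len(row)] = row
    return frame


# Hypothetical 16x16 frames: only a small region (the "waving arm") changes.
prev = [[0] * 16 for _ in range(16)]
curr = [row[:] for row in prev]
for r in range(2, 5):
    for c in range(9, 12):
        curr[r][c] = 200

delta = changed_blocks(prev, curr)
print(sorted(delta))  # positions of the blocks that had to be stored
```

In this toy case one 8x8 block out of four is stored, and `rebuild(prev, delta)` returns a frame identical to `curr`. A real codec replaces the exact-equality test with prediction and residual coding.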
In this work, a reuse scheme and hierarchical storage are used to save the required memory space without affecting performance [1]. In LDO, the on-chip data are classified into two types based on the lifetime of the data. Short-lifetime data are stored in a FIFO, which increases the reuse ratio of the memory space automatically and can be used to pass values from one stage to the next; long-lifetime data are stored in RAM. In order to perform inter and intra prediction, the previous frames and blocks need to be stored for reference. An H.264 video encoder carries out prediction, DCT transform and encoding processes to produce a compressed H.264 bit stream; an H.264 video decoder carries out the complementary processes of decoding, inverse transform and reconstruction to produce a decoded video sequence.

The H.264 CODEC with the HCC and LDO techniques uses 6869 logic elements, runs at 153.12 MHz and dissipates 514.47 mW. The proposed design uses HEVC (H.265), which is said to double the data compression ratio compared with H.264 at the same level of video quality. Alternatively, it can be used to provide improved video quality at the same bit rate. It supports 8K UHD, with resolutions up to 8192x4320.

I. SYSTEM DESIGN
Operation
The video is composed of a series of still frames. The DCT is applied to the first frame of the video, and the transformed frame is stored in RAM as the reference. The next frame is then processed by the DCT, and the difference between the frames is predicted by motion estimation; that is, the references between the different types of frame are realized by a process called motion estimation. Motion estimation estimates the residual value obtained from the DCT, and motion compensation determines which prediction is performed on the frames.
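A minimal sketch of motion estimation by block matching follows, assuming a full search over a small window with the sum of absolute differences (SAD) as the matching cost. This is a common textbook formulation operating on pixel blocks and is not necessarily the search used in the proposed architecture; the names `sad` and `best_motion_vector`, the 4x4 block size and the search range of 2 pixels are hypothetical.

```python
def sad(ref, cur, ry, rx, cy, cx, n):
    """Sum of absolute differences between the n x n block of `cur` at (cy, cx)
    and the n x n block of `ref` at (ry, rx)."""
    return sum(abs(cur[cy + i][cx + j] - ref[ry + i][rx + j])
               for i in range(n) for j in range(n))


def best_motion_vector(ref, cur, cy, cx, n=4, search=2):
    """Full search: try every offset within +/-search pixels and keep the lowest SAD.
    Returns ((dy, dx), cost), where (dy, dx) points from the current block
    to its best match in the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            if 0 <= ry <= height - n and 0 <= rx <= width - n:
                cost = sad(ref, cur, ry, rx, cy, cx, n)
                if best is None or cost < best[1]:
                    best = ((dy, dx), cost)
    return best


# Hypothetical 8x8 frames: a 4x4 pattern moves down by 1 and right by 2 pixels.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for i in range(4):
    for j in range(4):
        value = 10 * (i + 1) + j
        ref[2 + i][2 + j] = value
        cur[3 + i][4 + j] = value

mv, cost = best_motion_vector(ref, cur, cy=3, cx=4)
print(mv, cost)
```

Here the search returns the vector (-1, -2) with a cost of 0, meaning the block is found unchanged one row up and two columns left in the reference frame, so only the vector needs to be coded for it.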
The deblocking filter improves visual quality and prediction performance by smoothing the sharp edges between macroblocks. The IDCT is performed to recover the original frame. Here too, the HCC and LDO techniques are used in the H.265 CODEC, thereby reducing area and power dissipation and increasing speed compared with H.264.

[Fig 1: Flow diagram of proposed. Blocks: input video; video-to-frame converter; frame 1; frame 2; apply DCT (for each frame); reference frame; motion estimation; compare; perform inter or intra prediction; IDCT; deblocking filter; recovered frame; RAM (on-chip memory).]

II. DETAILED SYSTEM DESIGN
Context memory: The context memories store the contexts and are accessed during configuration. The context memory organization is crucial in a reconfigurable system because it affects the size of the contexts and the configuration process [1]. The context size determines both the reconfiguration overhead and the silicon area. The smaller the contexts, the smaller the memory space required for them. Smaller contexts also help to reduce the delay of transferring contexts from off-chip memory to on-chip memory. The communication overhead depends heavily on the context memory organization.

The residual of the current macroblock or block is compressed and transmitted to the decoder, together with the information required for the decoder to repeat the prediction process. The decoder creates an identical prediction and adds this to the decoded residual or block.

[Block diagram labels: context memory; on chip; DCT transform; current frame; reference frame; intra and inter prediction; deblocking filter; FIFO; off chip; motion compensation; motion estimation.]

Inter prediction: It aims to remove temporal redundancies in a video sequence.
Inter-predicted macroblocks must reside in P-slices and require a history of previously encoded frames to be kept in memory. The availability of multiple reference frames for motion compensation is a new feature of the H.264/AVC standard [9]. For inter prediction, a 16x16 macroblock can be partitioned into blocks whose dimensions are multiples of 4x4.

Hierarchical configuration context: In a nonhierarchical organization, the contexts are stored repeatedly and are included directly in the task context. The main advantage of HCC is that it compresses the memory space required for the contexts. The contexts are constructed in a hierarchical fashion in order to eliminate the repetitive portions of the contexts completely, while remaining convenient to locate and access.

Fig 2: Block diagram

III. RESULTS & DISCUSSION
On-chip memory: On-chip memory is the simplest type of memory to use in an FPGA-based embedded system, and it provides the highest-throughput, lowest-latency memory possible in such a system. On-chip memory requires no additional board space or circuit-board wiring because it is implemented directly on the FPGA. It can also save development time and cost.

Motion estimation: The references between the different types of frames are realized by a process called motion compensation or motion estimation. The correlation in terms of motion between two frames is represented by a motion vector. The frame correlation, and hence the pixel arithmetic difference, depends strongly on how good the estimation algorithm is. Good estimation results in a higher-quality coded video sequence and a higher compression ratio.

Intra prediction: When a block or macroblock is coded in intra mode, a prediction block is formed based on previously encoded and reconstructed blocks in the same frame.
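As a rough illustration of intra prediction, the sketch below forms a DC-style prediction (the mean of the pixels directly above and to the left of the block) and computes the residual that an encoder would go on to transform and code. It is a simplified assumption for illustration, not the paper's hardware design: the helper names, the 4x4 block size and the fallback value of 128 when no neighbours exist are chosen for this example only.

```python
def dc_predict(frame, y, x, n=4):
    """Predict an n x n block at (y, x) as the mean of the already available
    pixels in the row above and the column to the left of the block."""
    neighbours = []
    if y > 0:
        neighbours += frame[y - 1][x:x + n]
    if x > 0:
        neighbours += [frame[y + i][x - 1] for i in range(n)]
    dc = round(sum(neighbours) / len(neighbours)) if neighbours else 128
    return [[dc] * n for _ in range(n)]


def residual(block, prediction):
    """Element-wise difference: what the encoder actually has to code."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, prediction)]


def reconstruct(prediction, res):
    """Decoder side: the same prediction plus the decoded residual."""
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(prediction, res)]


# Hypothetical 8x8 frame: flat background of 100 with a gentle vertical ramp
# inside the 4x4 block at (4, 4).
frame = [[100] * 8 for _ in range(8)]
for i in range(4):
    for j in range(4):
        frame[4 + i][4 + j] = 100 + i

block = [row[4:8] for row in frame[4:8]]
pred = dc_predict(frame, 4, 4)
res = residual(block, pred)
print(res)
```

The residual values are small compared with the original pixel values, which is what makes them cheaper to code. Because this sketch applies no quantization, `reconstruct(pred, res)` returns the block exactly.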
This prediction is subtracted from the current macroblock or block.

MODELSIM OUTPUT: Fig 3 shows the simulation output waveform obtained after processing the frames with the DCT and prediction processes.

AREA UTILIZATION REPORT: The flow summary shows successful compilation and execution. The report gives the register usage and memory storage of the chip design. As shown in Fig 4, the area can be calculated from the total logic elements, registers, memory bits and pins.

PERFORMANCE REPORT: Fig 5 shows the speed of the codec obtained using the Quartus II software; the speed value obtained is 147.73 MHz.

POWER ANALYZER: Fig 6 shows the PowerPlay analysis tool, which allows static and dynamic power consumption to be estimated throughout the design cycle using the Quartus software.

IV. COMPARISON RESULTS

TYPES USED        AREA (logic elements)   SPEED        POWER
Existing CODEC    6869                    153.12 MHz   514.4 mW
Proposed CODEC    6037                    147.73 MHz   477.8 mW

The table shows the trade-off analysis of the video codec with the LDO and HCC methods, using Quartus II hardware synthesis for the Stratix III family.

V. CONCLUSION AND FUTURE WORK
The proposed HCC and LDO techniques are used to compress the memory space and reduce the reconfiguration time using the H.265 CODEC. The DCT block, FIFO and RAM for data references are designed so as to reduce the on-chip data memory size without affecting system performance, thereby achieving better area, power and performance than with H.264. Future enhancements include modifying the DCT blocks and prediction blocks and carrying out pipelining for better throughput and performance-metric analysis in H.265 and its FPGA implementations.

VI. REFERENCES
[1] Yansheng Wang, Leibo Liu, Shouyi Yin, “On-Chip Memory Hierarchy in One Coarse-Grained Reconfigurable Architecture to Compress Memory Space and to Reduce Reconfiguration Time and Data-Reference Time,” IEEE Transactions on Very Large Scale Integration Systems, vol. 22, no. 5, p. 983, May 2014.
[2] T. Geng, L. Liu, S. Yin, M. Zhu, and S. Wei, “Parallelization of computing-intensive tasks of the H.264 high profile decoding algorithm on a reconfigurable multimedia system,” IEICE Trans. Inf. Syst., vol. E93-D, no. 12, pp. 3223–3231, Jan. 2010.
[3] B. Mei, B. De Sutter, T. V. Aa, M. Wouters, and S. Dupont, “Implementation of a coarse-grained reconfigurable media processor for AVC decoder,” J. Signal Process. Syst., vol. 51, no. 3, pp. 225–243, 2008.
[4] J. Shield, P. Sutton, and P. Machanick, “Dynamic cache switching in reconfigurable embedded systems,” in Proc. Int. Conf. Field Program. Logic Appl., 2007, pp. 111–116.
[5] M. K. A. Ganesan, S. Singh, F. May, and J. Becker, “H.264 decoder at HD resolution on a coarse grain dynamically reconfigurable architecture,” in Proc. Int. Conf. Field Program. Logic Appl., Aug. 2007, pp. 467–471.
[6] M. Suzuki, Y. Hasegawa, V. M. Tuan, S. Abe, and H. Amano, “A cost-effective context memory structure for dynamically reconfigurable processors,” in Proc. 20th Int. Parallel Distrib. Process. Symp., Apr. 2006, p. 188.
[7] White Paper of Reconfiguration on XPP-III Processor, PACT Inc., Lisle, IL, USA, 2006.
[8] B. Mei, F-J. Veredas, and B. Masschelein, “Mapping an H.264/AVC decoder onto the ADRES reconfigurable architecture,” in Proc. Int. Conf. Field Program. Logic Appl., Aug. 2005, pp. 622–625.
[9] Suneetha Kosaraju, “Novel VLSI Architecture for Quantization and Variable Length Coding for H264/AVC Video Compression Standard,” thesis, 2005.