JPEG and MPEG Why compress? Lossy and lossless compression
Esa Nuutinen, Kimmo Pajunen, Leo Rela, Pekka Repo, Juha Suikki

Why compress?
● Limited storage space
● Transferring images over a network

Lossy and lossless compression
● Lossless:
  – all data will be kept
● Lossy compression:
  – some data will be lost in compression
  – better compression ratio
● Heavy compression causes distortion

What data will be lost?
● Human vision
  – not good at seeing small changes in color
  – good at seeing changes in brightness
● Details
● Color information
● Edges

JPEG
● Joint Photographic Experts Group
● Defines lossy and lossless compression
● For photographs or artwork
  – not good for line drawings or cartoons
[Figure: example images compressed to 2.7% and 1% of the original size]

RGB to YUV
● RGB images are compressed in JPEG by first transforming them into YUV; the three color components are then compressed separately.
● The chrominance components are often sub-sampled so that a 2x2 block of the original pixels forms one pixel in the sub-sampled image.
● The human eye is weak at distinguishing color differences at the same luminance level.

RGB => YUV
\[ Y = 0.299\,R + 0.587\,G + 0.114\,B, \qquad U = B - Y, \qquad V = R - Y \]
Y is the luminance value; U and V are the chrominance values.

Lossless JPEG
● A lossless JPEG image is processed pixel by pixel in row-major order.
● The value of the current pixel is predicted from neighboring pixels that have already been coded.
● When predicting pixel P(i,j), the neighbors are W = P(i,j-1), NW = P(i-1,j-1), N = P(i-1,j) and NE = P(i-1,j+1).
● The prediction functions are shown in the following table:

Mode  Predictor    Mode  Predictor
0     Null         4     N + W - NW
1     W            5     W + (N - NW)/2
2     N            6     N + (W - NW)/2
3     NW           7     (N + W)/2

Lossless JPEG Prediction Error
● Huffman coding of the prediction errors.

Category  Codeword  Difference                    Extra bits
0         00        0                             -
1         010       -1, 1                         0, 1
2         011       -3, -2, 2, 3                  00, 01, 10, 11
3         100       -7,...,-4, 4,...,7            000,...,011, 100,...,111
4         101       -15,...,-8, 8,...,15          0000,...,0111, 1000,...,1111
5         110       -31,...,-16, 16,...,31        (5 extra bits, same pattern)
6         1110      -63,...,-32, 32,...,63        (6 extra bits, same pattern)
7         11110     -127,...,-64, 64,...,127      (7 extra bits, same pattern)
8         111110    -255,...,-128, 128,...,255    (8 extra bits, same pattern)

Lossless JPEG Example
● In the following example, prediction mode 1 (W) is used on the pixel sequence (10, 12, 10, 7, 8, 8, 12). The first pixel is predicted from 0.

Pixel:              10    12    10    7     8     8     12
Prediction error:   +10   +2    -2    -3    +1    0     +4
Category:           4     2     2     2     1     0     3
Category codeword:  101   011   011   011   010   00    100
Extra bits:         1010  10    01    00    1     -     100

JPEG - Huffman Coding
● The idea in Huffman coding is that some characters are more common than others.
● Common characters are coded using fewer bits than rare characters.
● This way the data fits in a smaller space.

Basic lossy JPEG encoder
[Encoder block diagram: image data -> 8x8 blocks -> DCT -> quantizer -> Huffman encoder -> header + compressed data. The quantizer uses the quantization tables and the Huffman encoder uses the Huffman tables.]

JPEG - DCT
● DCT: "Discrete Cosine Transform"
● Each 8x8 block is transformed into the frequency domain.
\[ F(u,v) = \frac{C_u C_v}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\left[\frac{(2i+1)\,u\pi}{16}\right] \cos\left[\frac{(2j+1)\,v\pi}{16}\right] f(i,j) \]
\[ C_u, C_v = \begin{cases} \dfrac{1}{\sqrt{2}} & u, v = 0 \\ 1 & \text{otherwise} \end{cases} \]
● The result is an 8x8 matrix. Lower frequencies are in the top left and higher ones in the bottom right.
● If there is only one shade of gray in an 8x8 picture block, the only non-zero value will be in the top left corner.

Basic lossy JPEG decoder
[Decoder block diagram: header + compressed data -> Huffman decoder -> dequantizer -> IDCT -> reconstructed image data in 8x8 blocks. The Huffman decoder uses the Huffman tables and the dequantizer uses the quantization tables.]

IDCT
\[ f(i,j) = \sum_{u=0}^{7} \sum_{v=0}^{7} \frac{C_u C_v}{4} \cos\left[\frac{(2i+1)\,u\pi}{16}\right] \cos\left[\frac{(2j+1)\,v\pi}{16}\right] F(u,v) \]
with C_u and C_v defined as for the DCT.

JPEG Quantization
\[ F'[u,v] = \mathrm{round}\left(\frac{F[u,v]}{q[u,v]}\right) \]
● The idea is to reduce the number of bits per sample.
● Example: 45 = 101101 (6 bits); with q[u,v] = 4, round(45/4) = 11 = 1011 (4 bits)
● This is the main reason for data loss in lossy JPEG.
● In JPEG the quantization factor is not uniform within the block.
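The DCT, IDCT and quantization formulas above can be tried out directly. The following is a minimal sketch written straight from the formulas, not an optimized or standard-conformant codec. The helper names (`basis`, `round_half_up`) and the uniform table `q16` are assumptions made for this illustration; real JPEG tables are not uniform.

```python
import math

N = 8  # JPEG works on 8x8 blocks


def c(k):
    """Scale factor C_u / C_v: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def basis(x, k):
    """Cosine term shared by the forward and inverse transforms."""
    return math.cos((2 * x + 1) * k * math.pi / 16)


def dct_8x8(f):
    """Forward DCT of an 8x8 block f[i][j], written directly from the formula."""
    return [[c(u) * c(v) / 4 * sum(basis(i, u) * basis(j, v) * f[i][j]
                                   for i in range(N) for j in range(N))
             for v in range(N)] for u in range(N)]


def idct_8x8(F):
    """Inverse DCT: rebuilds f[i][j] from the coefficients F[u][v]."""
    return [[sum(c(u) * c(v) / 4 * basis(i, u) * basis(j, v) * F[u][v]
                 for u in range(N) for v in range(N))
             for j in range(N)] for i in range(N)]


def round_half_up(x):
    """Round to the nearest integer (Python's round() rounds halves to even)."""
    return math.floor(x + 0.5)


def quantize(F, q):
    """F'[u][v] = round(F[u][v] / q[u][v])."""
    return [[round_half_up(F[u][v] / q[u][v]) for v in range(N)] for u in range(N)]


def dequantize(Fq, q):
    """Multiply each quantized coefficient back by its table entry."""
    return [[Fq[u][v] * q[u][v] for v in range(N)] for u in range(N)]


# Demo: a flat gray block has a single non-zero coefficient, F[0][0].
flat = [[100] * N for _ in range(N)]
q16 = [[16] * N for _ in range(N)]  # hypothetical uniform table, illustration only
F = dct_8x8(flat)
Fq = quantize(F, q16)
restored = idct_8x8(dequantize(Fq, q16))
```

For the flat block, all of the energy ends up in the top left coefficient, which matches the remark about a block with a single shade of gray. The direct four-loop form is slow; practical encoders use fast factorizations instead.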
JPEG – Quantization Tables
● The idea in these tables is that more bits are allocated to the low-frequency components than to the high-frequency components.
● Quantization tables are stored in the JPEG file header.
● The luminance quantization table is used for grayscale images and for the Y color component (in YUV color space).
● The chrominance quantization table is used for the U and V color components.
● The bit rate can be adjusted by scaling the basic quantization tables either up (for a lower bit rate) or down (for a higher bit rate).

The Luminance Quantization Table
16  11  10  16   24   40   51   61
12  12  14  19   26   58   60   55
14  13  16  24   40   57   69   56
14  17  22  29   51   87   80   62
18  22  37  56   68  109  103   77
24  35  55  64   81  104  113   92
49  64  78  87  103  121  120  101
72  92  95  98  112  100  103   99

The Chrominance Quantization Table
17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99

JPEG – Zig-zag scan
● The idea is to group the low-frequency coefficients at the start of the vector.
● Maps the 8x8 block to a 1x64 vector.
● After the zig-zag scan, long sequences of zero-valued coefficients are coded by their count (the length of the run).
● Huffman coding is then applied to the non-zero coefficients.

JPEG Example
[Figure: an 8x8 block is transformed with the DCT, quantized, dequantized and inverse transformed]
● Quantization: round(235.6 / 16) = 15, round(-1 / 11) = 0
● There are a lot of zeros in the bottom right corner of the matrix, so the image data can easily be compressed even further using the methods described earlier.
● Dequantization: 16 * 15 = 240
● IDCT: if the original and the decompressed block are compared, some small differences can be noticed.
● Some high-frequency components are discarded.
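The zig-zag scan and the run-length step described above can be sketched as follows. This is a simplified illustration: the `(zero_run, value)` pair format and the `"EOB"` marker are assumptions chosen for readability and do not reproduce the exact symbol format of the JPEG standard.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even diagonals are walked from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def zigzag_scan(block):
    """Map an n x n block to a flat vector, low frequencies first."""
    return [block[r][col] for r, col in zigzag_order(len(block))]


def run_length(vec):
    """Encode a vector as (zero_run, value) pairs plus an end-of-block marker."""
    pairs, run = [], 0
    for x in vec:
        if x == 0:
            run += 1
        else:
            pairs.append((run, x))
            run = 0
    pairs.append("EOB")  # the trailing zeros are implied by this marker
    return pairs


# Demo: a quantized block with only three non-zero coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[1][0], block[0][2] = 15, -1, 3
vec = zigzag_scan(block)
encoded = run_length(vec)
```

Because quantization leaves most high-frequency positions at zero, the zig-zag order pushes those zeros to the end of the vector, where a single end marker can stand in for all of them.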
References
● http://www.it.lut.fi/opetus/9899/1970/seminars/05/node11.html
● http://www.it.lut.fi/opetus/9798/1588/references/Franti.ps
● http://rnvs.informatik.tu-chemnitz.de/~jan/MPEG/HTML/mpeg_tech.html
● Esa Kerttula, lecture notes (Luentomoniste), spring 2003, Telematiikka (1630), Part I, 27.4.2003

What happens (in practice)?

Exaggerated compression: how 8x8 blocks are approximated
[Figure: an 8x8 block reconstructed from 64, 15 and 6 coefficients]
● Original picture with all 64 coefficients (left)
● 15 coefficients: the error produced is small (middle)
● 6 coefficients: the error is noticeable (right)
● So fewer than ¼ of the 64 values are needed to achieve a good approximation of the original image [1]

Exaggerated compression
[Figure: the same picture before and after compression]
● In the left picture the 8x8 blocks are visible.
[Figure: compressed picture and the traces left by the DCT]

The eye example
● Original uncompressed BMP file, 640x480: ~1 MB
● JPEG: ~20 kB
● JPEG: ~9 kB
● The difference (error) between the pictures can hardly be seen.

[Figure: DCT 8x8 basis vectors]

What kind of pictures can be compressed efficiently?
● JPEG: 2 colors used, line drawing, hard edges, single-color textures
● JPEG: 24-bit colors, photo, smooth edges and textures
● File sizes shown on the slide, in order: 6 435 bytes and 14 568 bytes

Present JPEG extensions
● Progressive mode:
  – Supports real-time transmission of images
  – Coefficients are sent in multiple scans of the original image
  – A low-quality preview can be sent first, followed by the remaining "incremental" images.
● Variable quantization
  – Allows the quantization table to be altered for different parts of the image.
  – Some parts of the image can be compressed with higher quality.
● Hierarchical mode:
  – Same image at multiple resolutions
  – Higher-resolution images are represented as differences from the next smaller resolution image.

Lossless JPEG (LJPEG)
– 1995
– Does not use DCT
– Codes the difference between each pixel and the predicted value for the pixel.
– Eight different predictor functions are used
– Huffman coding
– Exact losslessness is not guaranteed (depends on encoder and decoder implementations)
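The lossless JPEG idea in this section (predict each pixel, then code the difference) can be sketched as below, using the pixel row 10, 12, 10, 7, 8, 8, 12 from the earlier lossless JPEG example. Several points are assumptions of this sketch: the first pixel is predicted from 0 as in that example, divisions are floor divisions, predictor 7 is taken as (N + W)/2, and the function names are hypothetical.

```python
# Category codewords from the Huffman table of prediction errors.
CATEGORY_CODE = {0: "00", 1: "010", 2: "011", 3: "100", 4: "101",
                 5: "110", 6: "1110", 7: "11110", 8: "111110"}


def predict(mode, w, n, nw):
    """Predictors 0-7 for a pixel with neighbors W, N and NW."""
    return {
        0: 0,
        1: w,
        2: n,
        3: nw,
        4: n + w - nw,
        5: w + (n - nw) // 2,
        6: n + (w - nw) // 2,
        7: (n + w) // 2,
    }[mode]


def category(e):
    """Category of a prediction error = number of bits needed for |e|."""
    return abs(e).bit_length()


def extra_bits(e):
    """Bits that pick the exact difference inside its category."""
    cat = category(e)
    if cat == 0:
        return ""
    value = e if e > 0 else e + (1 << cat) - 1  # negatives map to the low half
    return format(value, "0{}b".format(cat))


def encode_row(pixels, mode=1):
    """Code one row using only the W neighbor (N and NW are taken as 0 here)."""
    w, out = 0, []
    for p in pixels:
        e = p - predict(mode, w, 0, 0)
        out.append((e, category(e), CATEGORY_CODE[category(e)] + extra_bits(e)))
        w = p
    return out


row = [10, 12, 10, 7, 8, 8, 12]
coded = encode_row(row)
errors = [e for e, _, _ in coded]
bitstream = "".join(bits for _, _, bits in coded)
```

Each output entry holds the prediction error, its category, and the category codeword followed by the extra bits. Small errors, which are the common case in smooth image regions, get the shortest codes.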
(Slide sources for the JPEG extension and lossless JPEG slides: [4], [2, 4])

JPEG 2000
– Wavelet compression
– About 20% better compression than present JPEG
– Uses progressive mode
– Not widely supported as of Feb 2004 [3]

References
1 Steven W. Smith: The Scientist and Engineer's Guide to Digital Signal Processing. California Technical Publishing, 1997. ISBN 0-9660176-3-3.
2 Seppo Virtanen: An introduction to JPEG. Course material, Multimedia algorithms, spring 2000. Laboratory of Electronics and Information Technology, University of Turku.
3 JPEG website, http://www.jpeg.org/ [25th Feb 2004]
4 The JPEG tutorial, http://dynamo.ecn.purdue.edu/~ace/jpeg-tut/jpegtut1.html [25th Feb 2004]

MPEG
● Moving Picture Experts Group
● Basic ideas:
  – sub-sampling
  – removing redundancy inside frames
  – removing redundancy between frames
● An MPEG video feed consists of three types of frames
  – I-frames
  – P-frames
  – B-frames

I-frames
● All information needed to construct the frame is present
● Compression similar to JPEG
● Can be used to construct other frames

P-frames
● A P-frame is predicted from the previous I- or P-frame
  – The frame is divided and processed in macroblocks (16x16 pixels)
  – Macroblocks in the current frame are compared to the content of the previous frame
  – The goal is to find similarities
● If a similar block is found:
  – The difference in position is saved in a movement vector (mv)
  – The block from the previous frame is subtracted from the block in the current frame to get a difference block
  – The mv and the difference block are then sent to the receiver
● If no similar block is found, the whole block is sent (as in an I-frame)
● If the blocks are exactly the same, nothing is sent

Motion Prediction
[Figure: frame 1 (I-frame) and frame 2 (P-frame), with the movement vector (mv) and the difference block]

B-frames
● Processed like P-frames, but information from both the previous and the next I- or P-frame can be used (bi-directional prediction)
● Usually offers the best compression ratio
● B-frames are never used as a source of information

GOP
● Video is processed in Groups Of Pictures (GOP)
● A GOP begins with an I-frame (length usually 10-15 frames)
● The usual sequence is IBBPBBPBB…

MPEG-1 and MPEG-2
● MPEG-1
  – first generation
  – VHS quality, 1.5 Mbit/s
  – made digital video possible
● MPEG-2
  – generic coding of video and audio
  – greater quality, up to 4 Mbit/s
  – compression ratio between 50:1 and 25:1

MPEG-3 and MPEG-4
● MPEG-3
  – HDTV (High Definition TV)
  – obsolete
● MPEG-4
  – encoding/decoding of audio-visual objects
  – body animation, games, high quality virtual environments
  – speech and video synthesis, fractal geometry, artificial intelligence

MPEG-7 and MPEG-21
● MPEG-7
  – DDL (Description Definition Language)
  – enables searchable content in video clips
● MPEG-21
  – framework for content creation/transfer
  – based on digital items
  – multiple standards may be used within the framework

MPEG References
● http://www.mpeg.org
● http://www.disctronics.co.uk/technology/video/video_mpeg.htm
● Sikora, T.: MPEG digital video-coding standards. IEEE Signal Processing Magazine, Volume 14, Issue 5, Sept. 1997, pages 82-100.
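As a closing illustration of the P-frame procedure described in the MPEG slides (search the previous frame for a similar block, save the movement vector, send the difference block), here is a minimal block-matching sketch. The exhaustive search, the sum-of-absolute-differences cost, the small 4x4 demo block and all function names are assumptions made for this example. The slides do not specify how the similarity search is done, and real encoders use 16x16 macroblocks and faster search strategies.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a current block and a shifted reference block."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(size) for x in range(size))


def find_motion_vector(cur, ref, bx, by, size=16, search=4):
    """Try every shift within +/- search pixels; return (dx, dy, cost) of the best match."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= height and
                      0 <= bx + dx and bx + dx + size <= width)
            if not inside:
                continue  # the shifted block would leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2], best[0]


def difference_block(cur, ref, bx, by, mv, size):
    """Current block minus the matched block from the previous frame."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(size)] for y in range(size)]


def make_frame(x0, y0, width=8, height=8):
    """Test frame: gray background with a bright 2x2 square at (x0, y0)."""
    return [[200 if x0 <= x < x0 + 2 and y0 <= y < y0 + 2 else 50
             for x in range(width)] for y in range(height)]


# Demo: the square moves one pixel to the right between the two frames.
previous = make_frame(2, 2)
current = make_frame(3, 2)
dx, dy, cost = find_motion_vector(current, previous, bx=2, by=2, size=4, search=2)
diff = difference_block(current, previous, 2, 2, (dx, dy), 4)
```

In the demo the best match has zero cost, so the difference block is all zeros and only the movement vector would need to be sent for this block. When no shift gives a low cost, an encoder would fall back to coding the whole block as in an I-frame.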