Direct Video Broadcast (DVB) Systems
Transcription
Direct Video Broadcast (DVB) Systems (Slide: Courtesy, Hung Nguyen)

Processing of the Streams in the Set-Top Box (STB) (Slide: Courtesy, Hung Nguyen)

Multimedia Communications Standards and Applications (Slide: Courtesy, Hung Nguyen)

Video Coding Standards
• ITU H.261 for Video Teleconference (VTC)
• ITU H.263 for VTC over POTS
• ITU H.262 for VTC over ATM/broadband and digital TV networks
• ISO MPEG-1 for movies on CD-ROM (VCD)
  – 1.2 Mbps for video coding and 256 Kbps for audio coding
• ISO MPEG-2 for broadcast-quality video on DVD
  – 2-15 Mbps allocated for audio and video coding
• Low-bit-rate telephony over POTS
  – 10 Kbps for video and 5.3 Kbps for audio
• Internet and mobile communication: MPEG-4
  – Very Low Bit Rate (VLBR) coding, to be compatible with H.263
• Multimedia content description interface: MPEG-7
  – Description schemes and a description definition language for integrated multimedia search engines

History
• H.261:
  – First video coding standard, targeted at video conferencing over ISDN. Uses a block-based hybrid coding framework with integer-pixel MC.
• H.263:
  – Improved quality at lower bit rates, to enable video conferencing/telephony below 54 kbps (modems, desktop conferencing)
  – Half-pixel MC and other improvements
• MPEG-1 video:
  – Video on CD and video on the Internet (good quality at 1.5 Mbps)
  – Half-pixel MC and bidirectional MC
• MPEG-2 video:
  – SDTV/HDTV/DVD (4-15 Mbps)
  – Extended from MPEG-1, considering interlaced video

Video Compression Principles
• Video consists of moving pictures; the terms "frame" and "picture" are used interchangeably.
• One approach to compressing a video source is to apply the JPEG algorithm to each frame independently. This approach is known as moving JPEG, or MJPEG. It yields compression ratios of between 10:1 and 20:1, which is not large enough on its own to produce the compression ratios needed.
• Redundancy is often present between a set of frames.
• Examples:
  – movement of a person's lips or eyes in a video telephony application
  – a person or vehicle moving across the screen in a movie (only a small portion of each frame is involved in any motion that takes place)
• Hence, only information relating to those segments of each frame that have movement associated with them is sent. Considerable additional savings in bandwidth can be made by exploiting the temporal differences that exist between many of the frames.
• Just a selection of frames is sent in individually compressed form; for the remaining frames, only the differences between the actual frame contents and the predicted frame contents are sent.

Sub-sampling of Chrominance Information
• Transforming (R,G,B) -> (Y,Cb,Cr) provides two advantages:
  1) The human visual system (HVS) is more sensitive to the Y component than to the Cb or Cr components.
  2) Cb and Cr are far less correlated with Y than R is with G, R with B, and B with G, thus reducing TV transmission bandwidths.
• Cb and Cr both require far less bandwidth and can be sampled more coarsely (Shannon).
• By doing so we can reduce the data without perceptibly affecting visual quality.

Color Space Conversion
• In general, each pixel in a picture consists of three components: R (red), G (green), B (blue).
• (R,G,B) must be converted to (Y,Cb,Cr) in MPEG-1 before processing.
• We can view the color value of each pixel in RGB color space or in YCbCr color space.
• Because (Y,Cb,Cr) is less correlated than (R,G,B), coding using the (Y,Cb,Cr) components is more efficient.
• (Y,U,V) is also sometimes used to denote (Y,Cb,Cr); however, it most appropriately refers to the analog TV equivalent.
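The slides state the advantages of the (R,G,B) -> (Y,Cb,Cr) conversion but not the conversion itself. The following is a minimal sketch assuming the common JPEG/JFIF-style BT.601 coefficients and 4:2:0 sub-sampling (each chrominance value averaged over a 2 x 2 block of pixels); the function names and the random test image are illustrative, not from the slides.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) uint8 RGB image to full-range Y, Cb, Cr planes.

    JPEG/JFIF-style BT.601 coefficients are assumed; the slides do not give
    the exact matrix, so treat these constants as placeholders.
    """
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =         0.299    * r + 0.587    * g + 0.114    * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block of a chrominance plane (4:2:0 sub-sampling)."""
    h, w = plane.shape
    return plane[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Example: the luminance plane keeps full resolution, Cb and Cr are quartered.
rgb = np.random.randint(0, 256, size=(480, 720, 3), dtype=np.uint8)
y, cb, cr = rgb_to_ycbcr(rgb)
cb_s, cr_s = subsample_420(cb), subsample_420(cr)
print(y.shape, cb_s.shape, cr_s.shape)   # (480, 720) (240, 360) (240, 360)
```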
Macroblock Structure
• The basic coding unit is an 8 x 8 block.
• A macroblock consists of six blocks: four blocks of luminance (Y), one block of Cb chrominance, and one block of Cr chrominance.

Macroblocks and Color Sub-sampling Schemes (Slide: Courtesy, Hung Nguyen)
• A macroblock covers four 8 x 8 pixel blocks of luminance samples (16 x 16 pixels).

Picture Frames - Overview
Three frame types:
• I-Picture (intra-frame picture)
• P-Picture (inter-frame predicted picture)
• B-Picture (bidirectionally predicted/interpolated picture)

I-frames
• Are encoded without reference to any other frames.
• Each frame is treated as a separate (digitized) picture, and the Y, Cb, and Cr matrices are encoded independently using the JPEG algorithm (DCT, quantization, entropy encoding), except that the quantization threshold values used are the same for all DCT coefficients.
• Hence the level of compression obtained with I-frames is relatively small.

P-frames
• The encoding of a P-frame is relative to the contents of either a preceding I-frame or a preceding P-frame.
• P-frames are encoded using a combination of motion estimation and motion compensation.

B-frames
• Their contents are predicted using search regions in both past and future frames.
• Allowing for occasional moving objects, this also provides better motion estimation.

Group of Pictures (GOP)
• The number of frames/pictures between successive I-frames.
• It is given the symbol N, and typical values for N are from 3 through 12.

Example Frame Sequences
• (a) I and P frames only; (b) I, P, and B frames.
• The number of frames between a P-frame and the immediately preceding I- or P-frame is called the prediction span. It is given the symbol M (1 and 3 in the two examples).
• A typical sequence of frames involving just I- and P-frames is shown in Figure 4.11(a), and a sequence involving all three frame types is shown in part (b) of the figure.
• For P-frames, the contents are encoded by considering the contents of the current (uncoded) frame relative to the contents of the immediately preceding (uncoded) frame.
• For B-frames, however, three (uncoded) frame contents are involved: the immediately preceding I- or P-frame, the current frame being encoded, and the immediately succeeding I- or P-frame.
• This results in an increase in the encoding (and decoding) delay, equal to the time spent waiting for the next I- or P-frame in the sequence.

Decoding P-frames
• With P-frames, the received information is first decoded, and the resulting information is then used, together with the decoded contents of the preceding I- or P-frame, to derive the decoded frame contents.

Decoding B-frames
• In the case of B-frames, the received information is first decoded, and the resulting information is then used, together with both the immediately preceding I- or P-frame contents and the immediately succeeding P- or I-frame contents, to derive the decoded frame contents.
• Hence, in order to minimize the time required to decode each B-frame, the order of encoding (and transmission) of the (encoded) frames is changed so that both the preceding and succeeding I- or P-frames are available when the B-frame is received.
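As a concrete illustration of that reordering, here is a minimal sketch that turns a display-order GOP into the transmission order in which every B-frame follows both of its reference frames. The frame labels and the choice of N = 12 with M = 3 are illustrative, not values fixed by the slides.

```python
def transmission_order(display_order):
    """Reorder display-order frame labels (e.g. 'I0', 'B1', ...) so that every
    B-frame is sent after the I- or P-frame it references in the future."""
    coded, pending_b = [], []
    for frame in display_order:
        if frame[0] in ('I', 'P'):       # reference frame: send it, then release
            coded.append(frame)          # the B-frames that were waiting for it
            coded.extend(pending_b)
            pending_b = []
        else:                            # B-frame: hold until the next reference
            pending_b.append(frame)
    # Trailing B-frames would really wait for the next GOP's I-frame.
    return coded + pending_b

# One GOP with N = 12 and prediction span M = 3 (illustrative values).
display = ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6', 'B7', 'B8', 'P9', 'B10', 'B11']
print(transmission_order(display))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5', 'P9', 'B7', 'B8', 'B10', 'B11']
```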
Frame Types
• There are two basic types of compressed frame:
  – those that are encoded independently
  – those that are predicted
• Intracoded frames -> I-frames
  – The level of compression is relatively small: 10:1 to 20:1
  – Present at regular intervals to limit the extent of errors
  – The number of frames between I-frames is known as the group of pictures (GOP)
• Intercoded frames (interpolation frames)
  – Predicted frames -> P-frames
    • Significant compression level achieved here
    • Errors are propagated
    • 20:1 to 30:1 compression ratio
  – Bidirectional frames -> B-frames
    • Highest levels of compression achieved
    • B-frames are not used for prediction, thus errors are not propagated
    • 30:1 to 50:1 compression ratio

Motion Compensation (MC) and Motion Estimation (ME) (Slide: Courtesy, Hung Nguyen)
• Motion estimation predicts the value of a block of pixels in the next picture using a block in the current picture. The location difference between these blocks is called the motion vector, and the difference between the two blocks is called the prediction error.
• In MPEG-1, the encoder must calculate the motion vector and the prediction error. When the decoder obtains this information, it can use it, together with the current picture, to reconstruct the next picture.
• We usually call this process motion compensation. In general, motion compensation is the inverse process of motion estimation. (A block-matching sketch is given after the DCT notes below.)

Motion Compensation
• Try to match each block in the actual picture to content in the previous picture.
• Matching is made by shifting each of the 8 x 8 blocks of the two successive pictures pixel by pixel in each direction -> motion vector.
• Subtract the two blocks -> difference block.
• Transmit the motion vector and the difference block.

Motion Estimation
• Estimation of any movement between successive frames. (What is the accuracy of the prediction operation?)

Motion Compensation
• Additional information must also be sent to indicate any small differences between the predicted and actual positions of the moving segments involved.

Motion Estimation (ME) (Slide: Courtesy, Hung Nguyen)

Motion Compensation (MC) (Slide: Courtesy, Hung Nguyen)

P-Frame Encoding: Macroblock Structure

P-Frame Encoding: Encoding Procedure

DCT (Discrete Cosine Transform)
• The DCT is used to convert data from the spatial domain to the frequency domain. The higher-frequency coefficients can be quantized more coarsely without a perceived loss of image quality, because the HVS is less sensitive to the higher frequencies and they contain less energy.
• The DCT coefficient at location (0,0) is called the DC coefficient, and the other values are called AC coefficients. In general, a larger quantization step is used for the higher AC coefficients. Higher precision is required for the DC term in order to avoid blocking artifacts in the reconstructed image.
• In MPEG-1, an 8 x 8 DCT is used. This transform converts an 8 x 8 pixel block into another 8 x 8 block in which, in general, most of the energy (value) is concentrated in the top-left corner.
• After quantizing the transformed matrix, most of the data in the matrix may be zero; zig-zag scanning followed by run-length coding can then achieve a high compression ratio.
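The motion-compensation notes above describe matching by shifting a block over the previous picture and transmitting the motion vector plus the difference block. Here is a minimal exhaustive-search sketch of that idea: the 8 x 8 block size follows the slide text, while the +/-8-pixel search range, the SAD cost, and the function names are illustrative assumptions.

```python
import numpy as np

def block_match(prev, curr, top, left, block=8, search=8):
    """Exhaustive block matching, as sketched on the motion-compensation slides.

    For the block of `curr` whose top-left corner is (top, left), shift a
    same-sized window over `prev` within +/- `search` pixels, pick the shift
    with the smallest sum of absolute differences (SAD), and return the
    motion vector together with the difference (residual) block.
    """
    target = curr[top:top + block, left:left + block].astype(np.int32)
    best_mv, best_sad, best_ref = (0, 0), None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > prev.shape[0] or x + block > prev.shape[1]:
                continue                       # candidate window falls outside the picture
            ref = prev[y:y + block, x:x + block].astype(np.int32)
            sad = np.abs(target - ref).sum()
            if best_sad is None or sad < best_sad:
                best_mv, best_sad, best_ref = (dy, dx), sad, ref
    return best_mv, target - best_ref          # motion vector + difference block

# Toy example: the "previous" picture shifted by (2, 3) pixels plays the current picture.
prev = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
curr = np.roll(prev, shift=(2, 3), axis=(0, 1))
mv, residual = block_match(prev, curr, top=24, left=24)
print(mv)   # expected (-2, -3): the best match lies 2 rows up and 3 columns left in the previous picture
```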
Transform Coding (TC) (Slide: Courtesy, Hung Nguyen)
• Pack the signal energy into as few transform coefficients as possible.
• The DCT yields nearly optimal energy concentration.
• A 2-dimensional DCT with a block size of 8 x 8 pixels is commonly used in today's image coders.
• The transform is followed by quantization and entropy coding.

2D DCT and IDCT (Slide: Courtesy, Hung Nguyen)
• u, v, x, y = 0, 1, 2, ..., 7

DCT Scan Modes (Slide: Courtesy, Hung Nguyen)
• The zig-zag scan used in MPEG-1 is suitable for progressive images, where frequency components have equal importance in the horizontal and vertical directions. (Frame pictures only.)
• In MPEG-2, an alternate scan is introduced because interlaced images tend to have higher-frequency components in the vertical direction. The scanning order therefore weights the higher vertical frequencies more than the same horizontal frequencies. Selection between these two scan orders can be made on a picture basis. (Frame and field pictures allowed.)

Quantization
• In MPEG-1, a matrix called the quantizer, Q[i,j], defines the quantization step. If X[i,j] is the DCT matrix with the same size as Q[i,j], X[i,j] is divided by Q[i,j]*QSF to obtain the quantized value matrix Xq[i,j]. QSF is the quantization scale factor.
  – Quantization equation: Xq[i,j] = Round(X[i,j] / (Q[i,j] * QSF))
• Inverse quantization (dequantization) is used to reconstruct the (approximate) original value.
  – Inverse quantization equation: X'[i,j] = QSF * Xq[i,j] * Q[i,j]
• The difference between the actual value and the value reconstructed from the quantized value is called the quantization error. In general, if Q[i,j] is carefully designed, visual quality will not be affected.

Quantization (cont'd) (Slide: Courtesy, Hung Nguyen)

Intra-frame Encoding Process
• Decompose the image into three components in RGB space.
• Convert RGB to YCbCr.
• Divide the image into macroblocks (each macroblock has 6 blocks: 4 for Y, 1 for Cb, 1 for Cr).
• Apply the DCT to each block.
• After the DCT, quantize each coefficient.
• Then use a zig-zag scan to gather the AC values.
• Use DPCM to encode the DC value, then use VLC to encode it.
• Use RLE to encode the AC values, then use VLC to encode them.
(A sketch of this block-level pipeline follows the P-picture notes below.)

Coding of P Pictures (Slide: Courtesy, Hung Nguyen)
• As with I pictures, the encoder needs to store the decoded P pictures, since these may be used as the starting point for motion compensation. Therefore, the encoder reconstructs the image from the quantized coefficients.
• In coding P pictures, the encoder has more decisions to make than in the case of I pictures:
  – Selection of macroblock type: there are 8 types of macroblock in P pictures.
  – Motion compensation decision: the encoder has an option on whether or not to transmit motion vectors for predictive-coded macroblocks.
  – Intra/non-intra coding decision.
  – Coded/not coded decision: after quantization, if all the coefficients in a block are zero, the block is not coded.
  – Quantizer/no quantizer decision: the quantizer scale can be altered, which will affect the picture quality.
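To make the quantization equations and the intra-frame steps above concrete, here is a minimal sketch of the per-block pipeline: an 8 x 8 DCT, the slide's Xq = Round(X/(Q*QSF)) quantizer, a zig-zag scan, and a simple run-length pass. The flat Q matrix, the QSF value, and the helper names are illustrative assumptions, and scipy's DCT-II with orthonormal scaling stands in for the transform the slide shows only as an image.

```python
import numpy as np
from scipy.fftpack import dct

# Zig-zag scan order for an 8x8 block: walk the anti-diagonals, alternating direction.
ZIGZAG = sorted(((i, j) for i in range(8) for j in range(8)),
                key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else -p[0]))

def dct2(block):
    """2-D 8x8 DCT (type II, orthonormal), applied along columns then rows."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def quantize(X, Q, qsf):
    """Slide's quantization equation: Xq[i,j] = Round(X[i,j] / (Q[i,j] * QSF))."""
    return np.round(X / (Q * qsf)).astype(np.int32)

def dequantize(Xq, Q, qsf):
    """Slide's inverse quantization: X'[i,j] = QSF * Xq[i,j] * Q[i,j]."""
    return qsf * Xq * Q

def run_length(ac_values):
    """(zero-run, level) pairs for the zig-zag-ordered AC coefficients."""
    pairs, run = [], 0
    for v in ac_values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, int(v)))
            run = 0
    return pairs                   # a real coder would also append an end-of-block symbol

# Illustrative 8x8 luminance block, flat quantizer matrix, and scale factor.
block = np.arange(64, dtype=np.float64).reshape(8, 8)   # smooth ramp: energy concentrates top-left
Q = np.full((8, 8), 16.0)                               # placeholder matrix, not the MPEG-1 default
QSF = 4

X = dct2(block)
Xq = quantize(X, Q, QSF)
zz = [Xq[i, j] for i, j in ZIGZAG]
print("nonzero coefficients:", np.count_nonzero(Xq), "of 64")
print("run-length pairs for the AC terms:", run_length(zz[1:]))
```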
The Inter-frame Encoding Flow Chart (Slide: Courtesy, Hung Nguyen)

MPEG (Moving Picture Experts Group) (Slide: Courtesy, Hung Nguyen)
• Established in January 1988.
• Operates in the framework of the Joint ISO/IEC Technical Committee.
• ISO: International Organization for Standardization.
• IEC: International Electrotechnical Commission.
• The first meeting was in May 1988, with 25 experts participating.
• Has grown to 350 experts from 200 companies in some 20 countries.
• As a rule, MPEG meets in March, July, and November, and can meet more often as needed.

[Figure panels: RGB Image; Compressed Image (QSF = 24); Luminance Plane (Y); Blue Chrominance Plane (Cb); Red Chrominance Plane (Cr); Red RGB Plane; Green RGB Plane; Blue RGB Plane]
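The figure panels above show a test image, its reconstruction at QSF = 24, and the individual color planes; only the panel titles survive the transcription. As a rough, hedged sketch of the kind of round trip behind the "Compressed Image (QSF = 24)" panel, the following composes the per-block steps sketched earlier over a whole luminance plane. The synthetic gradient plane, the flat quantizer matrix, and the omission of DC prediction and entropy coding are all simplifying assumptions.

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(b):  return dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
def idct2(b): return idct(idct(b, axis=0, norm='ortho'), axis=1, norm='ortho')

def compress_plane(plane, Q, qsf):
    """Blockwise DCT -> quantize -> dequantize -> inverse DCT on an 8x8-tiled plane.

    Follows the slide's equations Xq = Round(X/(Q*QSF)) and X' = QSF*Xq*Q;
    this only approximates what a real MPEG-1 intra coder does.
    """
    h, w = plane.shape
    out = np.empty_like(plane, dtype=np.float64)
    zeros = 0
    total = 0
    for y in range(0, h, 8):
        for x in range(0, w, 8):
            X = dct2(plane[y:y+8, x:x+8].astype(np.float64))
            Xq = np.round(X / (Q * qsf))
            zeros += np.count_nonzero(Xq == 0)
            total += 64
            out[y:y+8, x:x+8] = idct2(qsf * Xq * Q)
    return np.clip(out, 0, 255), zeros / total

# Synthetic 64x64 luminance plane (smooth gradient) stands in for the slide's test image.
yy, xx = np.mgrid[0:64, 0:64]
y_plane = (2.0 * xx + 1.5 * yy).astype(np.float64)

Q = np.full((8, 8), 8.0)                      # flat placeholder matrix, not the MPEG-1 default
recon, zero_fraction = compress_plane(y_plane, Q, qsf=24)
print("fraction of zero coefficients:", round(zero_fraction, 3))
print("mean absolute error:", round(float(np.abs(recon - y_plane).mean()), 2))
```

At a coarse scale factor such as QSF = 24, most quantized coefficients are zero, which is what makes the zig-zag/run-length stage effective, at the cost of visible blockiness in the reconstructed panel.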