Einleitung - Institut für Informatik
MPEG (Motion Picture Expert Group)
An Introduction to MPEG Video and Audio Compression for Internet, CD-ROM and DVD

Universität Osnabrück, Rechenzentrum
Dipl.-Math. Frank Elsner
12.12.1999, Version 1.0
http://www.rz.uni-osnabrueck.de/Multimedia

Contents

Introduction
A Crash Course
Introductory Examples
MPEG Video Compression
MPEG-1 Layer 3 Audio Compression
Profiles
MPEG Encoder
MPEG Player
Problems, Tips and Tricks
DVD Basics and Authoring
Further Documentation
Appendix

Introduction

MPEG is a buzzword that has found its way even into non-computer literature. The decisive factor is the hysteria that MP3 (MPEG-1 Layer 3) has triggered on the Internet. Other important applications of MPEG (more precisely, MPEG-2) are DVD-Video and digital television. For both applications, MPEG-2 is the underlying compression method.

This script gives you an introduction to video and audio compression using the MPEG method. The chapter “A Crash Course” covers the history of MPEG and its basic ideas. The chapter “Introductory Examples” uses simple examples to demonstrate how videos in AVI format can be converted to MPEG and edited. The chapter “MPEG Video Compression” explains in detail how MPEG can achieve video compression ratios of up to 1:100. The chapter “MPEG-1 Layer 3 Audio Compression” covers the new MP3 audio format. The chapter “Profiles” discusses the differences between MPEG-1 and MPEG-2 and presents various profiles. The chapters “MPEG Encoder” and “MPEG Player” present encoders for creating MPEG files, tools for analyzing, cutting and joining them, and several players. The chapter “Problems, Tips and Tricks” points out problems that can arise in the production process, together with solutions. The chapter “DVD Basics and Authoring” touches on the subject of DVD. Finally, the chapters “Further Documentation” and “Appendix” provide links to vendors and reference documents as well as excerpts from some of the documents mentioned.
A personal note: since (almost) all documents on MPEG are available in English and are of excellent quality, I decided to adopt the texts in their original wording (rather than degrade them through translation :-)). I am, however, happy to translate parts of the script, or the whole script, if there are compelling reasons (?) to do so. As a supplement to this script, the RZ offers a CD-ROM entitled „MPEG und DVD – Software, Dokumentation und Clips“.

A Crash Course

This chapter covers the development and standardization of MPEG.

Video Parameters

Video is simply an electronic sequence of still images displayed or projected in quick succession. As a result, the human mind is fooled into believing that people or objects in the presented sequence move. In terms of computers, there are three important characteristics of video:

1. How fast each picture is displayed (frame rate).
2. How many elements make up each picture in both the horizontal and vertical dimensions (frame size). This is normally given in terms of pixels (or pels).
3. How many different colors the picture/pixel is made from (color depth).

Frame rate

Frame rate is the number of frames that are displayed to a viewer each second. For example, in motion picture film in the United States it is common to display 24 frames each second. In color television for the US home (called NTSC) 29.97 frames a second are displayed. [German PAL is based on a frame rate of 25 frames each second. – FE]. Even though computers are not normally thought of in terms of frame rates, most computers “refresh” the screen by repainting every element of the screen as often as 72 times a second.

Frame size or number of picture elements

Frame size or number of picture elements is the next component of video. This is measured horizontally and vertically in pixels.
“Pixels” are picture elements: the small dots which make up the displayed picture. Some common dimensions, or resolutions, in the computer world include 640 horizontal pixels x 480 vertical pixels, 800 horizontal x 600 vertical, and 1024 horizontal x 768 vertical pixels.

Number of colors

The number of colors which make up each picture or frame is a third component of video. As is the case with a painter’s palette, a color can be described in terms of several “primary” colors. For instance, when playing with paints as a child, mixing equal parts of red, yellow and blue created black. By mixing these primary colors in different combinations, it is possible to produce any other color. Color mixing works a bit differently with light than with paints, but we can still make any color from three primaries. In the video world, however, we substitute green for yellow in our “primary” color palette.

Color Spaces

In mixing colors of light, we vary the amount of red, green and blue light that makes up the color of a pixel. To make video practical, it is necessary to limit the number of differing shades of red, green, or blue that can be generated. This puts an upper limit on the total number of colors that video can recreate.

Here is an example of a common digital color scheme. Each primary color (red, green, or blue) may have 256 different levels or shades. Since a color may be composed of the three primaries, this means we can generate 16.8 million different colors, or 256 levels of red times 256 levels of green times 256 levels of blue (256 x 256 x 256 = 16,777,216, roughly 16.8 million). The color for a pixel is normally written as follows:

pixel_color = (red_level, green_level, blue_level)

The previous example just described 24 bit video, without calling it that. The term 24 bit comes from the fact that 256 shades of the primaries may be represented as an 8 bit value.
Since it takes three primaries to represent a single color, it takes 8+8+8 or 24 bits to represent the color of a single pixel:

8 bits red   8 bits green   8 bits blue
RRRRRRRR     GGGGGGGG       BBBBBBBB      = 24 bits

As a final note about color and video, it is possible to choose different primaries or entirely different colorspaces/colorsystems. The way we described colors above is not the only way to identify colors. Different “colorspaces,” or methods of describing colors, have different uses. For example, the common colorspace for printing is CMYK, or cyan, magenta, yellow, and black. Another colorspace is YCrCb, or luminance (shade intensity) and chrominance-red and chrominance-blue (chrominance components define the hue and value of the color). This last colorspace is commonly used in video, primarily because it more closely resembles the colorspace of human eyes, where rods detect luminance components and cones detect the chrominance components of color.

The international standard CCIR 601-1 specifies eight-bit digital coding for component video. For Rec. 601-1 coding in eight bits per component:

Y_8b  = 16 + 219 * Y
Cb_8b = 128 + 224 * (0.5/0.886) * (Bgamma - Y)
Cr_8b = 128 + 224 * (0.5/0.701) * (Rgamma - Y)

Here Y, Bgamma and Rgamma are gamma-corrected values in the range 0 to 1, so that Y_8b spans 16 to 235 and Cb_8b and Cr_8b span 128 +/- 112.

Rec. 601-1 calls for two-to-one horizontal subsampling of Cb and Cr, to achieve 2/3 the data rate of RGB with virtually no perceptible penalty. This is denoted 4:2:2. JPEG and MPEG normally subsample Cb and Cr two-to-one horizontally and also two-to-one vertically, to get 1/2 the data rate of RGB. This is denoted 4:2:0. To get good results using subsampling you should not just drop and replicate pixels, but implement proper decimation and interpolation filters.

For the purposes of this discussion, let's assume our video source is a typical professional digital video format called ITU-R 601 (formerly known as CCIR 601). In this format, the video is represented in the following fashion:

1. frame rate of 30 frames a second
2. picture size of one frame: 720x480 (NTSC)
3. color and colorspace: YCrCb 4:2:2

Luminance (Y) is sampled at full resolution; each chrominance component (Cr and Cb) is sampled at half the horizontal resolution. On average, then, it takes 16 bits to represent each pel. Using these values, it is easy to calculate the total disk space required to hold one second of uncompressed video in this format:

720 horizontal pixels x 480 vertical pixels = 345,600 pixels per frame
345,600 pixels per frame x 30 frames per second = 10,368,000 pixels per second
10,368,000 pixels per second x 2 bytes per pixel = 20,736,000 bytes per second

This means that a 20 Gigabyte hard drive could hold about 1000 seconds of uncompressed video. Clearly this is not practical for most applications.

MPEG ISO Standard

The Moving Picture Experts Group (MPEG) was founded in the late 1980s to define a digital standard for the representation of moving pictures. By the time the MPEG-1 standard was adopted, various methods were already available. The most important of these to date were Motion-JPEG (M-JPEG) and Recommendation H.261 of the CCITT.

MPEG is an acronym for Moving Picture Experts Group and commonly refers to the international standard for digital video and audio compression. The official name of the MPEG-1 standard is: “Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to about 1.5 Megabits per second.” It is sometimes referred to by its ISO/IEC project number, 11172 parts 1 through 5. However, this video standard is usually just called “MPEG”.

MPEG-1 and MPEG-2 are motion video compression standards created by the Moving Picture Experts Group. This group is a joint committee of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC).
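To put the target of about 1.5 Megabits per second from the MPEG-1 title into perspective, the uncompressed data rate worked out above for 720x480 video at 30 frames per second and 16 bits per pel can be recomputed in a few lines. This is only a sketch: the function name `uncompressed_rate` is made up for illustration, and the resulting ratio is a rough indication, since MPEG-1 typically codes a smaller and more heavily subsampled picture than the 601 source.

```python
def uncompressed_rate(width, height, fps, bits_per_pixel):
    """Return (bytes_per_second, bits_per_second) for uncompressed video."""
    pixels_per_second = width * height * fps
    bits_per_second = pixels_per_second * bits_per_pixel
    return bits_per_second // 8, bits_per_second


# Values from the text: 720x480, 30 frames/s, 4:2:2 sampling (16 bits per pel).
bytes_ps, bits_ps = uncompressed_rate(720, 480, 30, 16)

# Ratio against the nominal MPEG-1 rate of 1.5 Mbit/s.
ratio = bits_ps / 1.5e6

print(bytes_ps)          # 20736000 bytes per second
print(bits_ps)           # 165888000 bits per second
print(round(ratio, 1))   # 110.6
```

The ratio of roughly 110:1 is in the same range as the compression ratios of up to 1:100 mentioned in the introduction.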
The MPEG-1 Standard, completed as a draft in 1992, defines a bit stream of compressed audio and video data with a data rate of 1.5 Mbits/sec, which is suitable for CD-ROM and VideoCD applications. It is possible to generate MPEG-1 streams with other data rates. The MPEG-1 Standard is formally described in ISO/IEC 11172.

The MPEG-2 Standard was designed later for digital transmission of broadcast quality video with data rates from 2 to 10 Mbits/sec. It was written to be more “generic”, that is, to address a broader range of applications, and is the compression standard for DVD and various digital television systems. The MPEG-2 Standard is described in the ISO/IEC 13818 documents.

MPEG Compression Overview

(This is a short introduction. Please refer to the following chapters for detailed information!)

The basic idea behind MPEG video compression is to remove spatial redundancy within a video frame and temporal redundancy between video frames. As in JPEG, the standard for still image compression, DCT-based (Discrete Cosine Transform) compression is used to reduce spatial redundancy. Motion compensation is used to exploit temporal redundancy. The images in a video stream usually do not change much within small time intervals. The idea of motion compensation is to encode a video frame based on other video frames temporally close to it.

Intra-frame compression (compression within a picture)

1. Discrete Cosine Transform (DCT)
2. Quantization
3. Huffman / Arithmetic Encoding

The following figures outline the process:

Inter-frame compression (temporal redundancy; compression relating to nearby pictures)

1. Motion Compensation

An MPEG stream can have three types of frames:

• I-frames
• P-frames
• B-frames

Intra (I) frames are coded without any references to any other frames. Predicted (P) frames reference previously encoded P or I frames, and encode only the changes.
Predicted frames provide significantly better compression than Intra coded (I) frames. Bi-directional interpolated (B) frames contain references to both previous P or I frames and the next P or I frame. Bi-directional frames provide the best compression.

The primary difference among these frames is how motion vectors are used in them. Intra frames (I frames) do not use any form of motion vectors. Predictive frames (P frames) make use of predictive type motion vectors. Bi-directional frames (B frames) make use of both predictive and interpolative motion vectors. The three frame types and their sequence (for instance, I BB P BB P BB P BB) represent a Group of Pictures (GOP).

Take, for example, an AVI file of 30 frames per second. Each frame is a standalone picture, and does not reference any other frame. If we use the letter “I” to represent a frame, a single second could then be represented by 30 I-frames:

(I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I)

An MPEG encoder can gather most of the information for a group of similar pictures from the first frame in a scene and encode that as an I frame. The encoder could continue to encode each of these frames as an I frame, but it is much more efficient to record only data about the pixels that have changed. The encoder might then look ahead to the fourth frame, note only the things that changed between #1 and #4, and record those few changes as a P frame. Finally, the encoder can analyze the I and the P frame, and from that create very small B frames for frames #2 and #3. Because video generally has 24-30 fps, GOP structures can get quite complex. In addition, the encoded order of frames is not always the same as the display order, and many GOPs will be required to accurately portray just a few seconds of video. This encoder analysis is a process known as motion estimation.
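The remark that the encoded order differs from the display order can be illustrated with a small sketch. Because a B frame references the next I or P frame, that frame has to reach the decoder first. The function name `display_to_coded_order` and the frame labels are made up for this example, and the reordering shown is one plausible scheme under that assumption, not a statement about any particular encoder.

```python
def display_to_coded_order(gop):
    """Reorder frame types from display order to a possible coded order.

    gop is a string such as "IBBPBBP". Each B frame needs the following
    I or P frame as a reference, so that frame is emitted before the
    B frames that precede it in display order.
    """
    coded = []
    pending_b = []
    for index, frame_type in enumerate(gop, start=1):
        label = f"{frame_type}{index}"
        if frame_type == "B":
            pending_b.append(label)   # wait for the next reference frame
        else:
            coded.append(label)       # reference frame goes out first
            coded.extend(pending_b)   # then the B frames that depend on it
            pending_b = []
    coded.extend(pending_b)  # trailing B frames would reference the next GOP
    return coded


print(display_to_coded_order("IBBPBBP"))
# ['I1', 'P4', 'B2', 'B3', 'P7', 'B5', 'B6']
```

Frame #4 is coded directly after frame #1, which matches the description above of the encoder looking ahead to the fourth frame before creating the B frames #2 and #3.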
Using the superior compression of GOP structures, therefore, the same second in MPEG might be represented like this:

([I BB P BB P BB P BB P BB] [I BB P BB P BB P BB P BB])

It consists of two GOPs. Both of the examples contain 30 frames, but the MPEG file will be significantly smaller due to the use of P and B frames.

Introductory Examples

This chapter covers some typical application examples:

1. Converting an AVI file to MPEG-1 (Encode and Multiplex)
2. Separating the video and audio streams (De-Multiplex)
3. Joining two MPEG files (Join)
4. Editing MPEG files (Snapshot, ...)

Example 1 - Ligos Quick Start Tutorial

To get started immediately with the LSX-MPEG Encoder, we’ve provided this Quick Start Tutorial. Detailed information is provided throughout this Help file for additional features such as Variable Bitrate control, skipping frames for lower bitrate encoding, and Batch Processing.

The Balloon clip is a YUV compressed AVI (3 seconds, 352x240, 2960 KB/sec video, 44.1 kHz 16 bit stereo audio), named balloon_yuv.avi once it is unzipped. It will be used for the tutorial below.

Encoding with the Profile Manager

When using the LSX-MPEG Encoder, the parameters used to encode an MPEG file are kept in a Profile. You can use the default Profile, use a predefined Profile selected from the Profile Manager, or create your own custom Profile. In this Tutorial we’ll show you how to quickly encode a file using the Profile Manager, and then take you through the steps of creating a custom Profile.

Start LSX-MPEG Encoder by choosing the program from the Start menu or clicking on the icon for the application in the program group.

The Encoder will automatically load a set of default parameters that are displayed in the Main Window, such as Frame Rate, Video Stream data rate, Frame Size, etc. This is the information that will control the encoding process and determine the characteristics of the output file we’ll generate.
Together, this information is known as a Profile. The top portion of the interface will remain blank until we actually start encoding some video.

Balloon_yuv.avi is a 3 second, 8.67 MB AVI file, and we want to encode it to MPEG to reduce the file size, but retain the quality. The easiest way to do this when first starting out with the LSX-MPEG Encoder is to use the Profile Manager. The Profile Manager includes predefined recommended Profiles, and allows you to create and store Profiles for later use. Open it from either the pulldown File menu item, or the Open Profile Manager button on the toolbar.

When the Profile Manager is opened, you’ll see a two-part interface. The left side is a list box that displays predefined and custom MPEG-1 and MPEG-2 Profiles with descriptive names. When a Profile is selected, the right side displays the characteristics that make up that Profile (Format, Frame Rate, Data Rate, etc.).

You can select a Profile based either on the recommendation or on the goal you are trying to achieve. For instance, if you want something that is a good choice for a SIF resolution (352x240) AVI source file, you should get good results using the Profile “MPEG-1 (Recommended for SIF, 352x240 NTSC)”. If, however, you have a specific goal of taking a large AVI file and making it small enough for quick download over the Internet, you’d probably want to select “MPEG-1 (low bitrate, 10 fps simulated)” due to its optimized low data and frame rates. The choice is yours, and you can always choose one that’s close and customize it later.

We have a goal for Balloon_yuv.avi (reduce file size, keep the quality), so it is best to choose the recommended Profile. Select “MPEG-1 (recommended for SIF, 352x240 NTSC)”, and click Load Profile.

Now we need to select our Input File to convert and encode to MPEG. Near the Input File box, click on the Browse button. A standard Windows file dialog is displayed, prompting you to choose an AVI file for encoding.
Select the file “Balloon_yuv.avi” from the “Media” sub-directory, and click the Open button. An Information popup appears with an analysis of the Balloon_yuv.avi file. Click OK to continue. In addition to the Input File box being filled in, the Output File is now automatically named "Balloon_yuv.mpg", saved in the same folder. This can be changed, but we’ll leave it like this for now.

Let’s Preview the file: click on the Preview button to the left of the Input File box. A window appears and plays back the clip. Close the movie window.

That’s all there is to it! As you can see, the parameters in the Main Window have been adjusted to match the Input File and the Profile we chose. If we don’t have any other changes to make, we can encode the file. Click on the Start MPEG Encoding button on the toolbar.

The application will display and begin encoding the video portion of the stream. A new section of the interface, MPEG Video Encoding in Progress, will appear. A meter will display a frame by frame accounting as each frame is encoded, and show information on image quality, current frame quality and elapsed encoding time. The application will then show a meter for the encoding progress of audio, followed by a meter for the multiplexing of audio and video. When finished, the Multiplexing Completed box is displayed, presenting a summary of information regarding the process. Close the dialog.

Let’s Play the finished file: click on the Play button to the left of the Output File box. A media player should open to play the clip. Click play. As you can see, it is the same quality as the AVI file we input. A quick check on the file size (using Windows Explorer) shows that the resulting file is only about 500 kilobytes, 6% the size of the original! Close the movie window.

Example 2 - Darim DVMPEG Multiplexer/Demultiplexer

The MPEG file balloon.mpg will be separated into a video track (*.mpv) and an audio track (*.mp2).
Show video and audio stream parameters:

Example 3 - Camel MPEGJoin

Join the files balloon_yuv.mpg and dolphin_yuv.mpg into one single file joined.mpg.

Example 4 - Womble MPEG-VCR 3.02

The Womble MPEG-VCR is application software for editing compressed digital movies that are compliant with the MPEG international standards. The current release is for all 32-bit Windows platforms. It has been tested for Windows 95, Windows 98, Windows NT Workstation and Windows NT Server.

The main features include:

1. Frame-accurate editing for cut, copy, paste, and record.
2. Insertion of simple transitions with video special effects.
3. Still image overlay for text and logo insertion.
4. An audio editor component for separate MPEG audio editing.

The following is a list of the supported MPEG formats:

1. MPEG Video: all MPEG-1 and MPEG-2 video formats, including VBR (variable bit rate).
2. MPEG Audio: MPEG layer-I, layer-II, and layer-III (layer-III is input only).
3. MPEG Systems: all MPEG-1 and MPEG-2 formats, including Program and Transport.

Other video data that the editor will read as input:

1. Windows AVI DIB RGB video sequence.
2. Windows bitmap still images.
3. JPEG still images.

The following figure shows a snapshot of the user interface:

MPEG Video Compression

This chapter describes the individual steps of video compression in detail.

MPEG Goals (from C-Cube)

This chapter presents an overview of the Moving Picture Experts Group (MPEG) standard that is implemented by the CL480. The standard is officially known as ISO/IEC Standard, Coded Representation of Picture, Audio and Multimedia/hypermedia Information, ISO 11172. It is more commonly referred to as the MPEG-1 standard.

MPEG addresses the compression, decompression and synchronization of video and audio signals. The MPEG video algorithm can compress video signals to an average of about 1/2 to 1 bit per coded pixel.
At a compressed data rate of 1.2 Mbits per second, a coded resolution of 352 x 240 at 30 Hz is often used, and the resulting video quality is comparable to VHS. Image quality can be significantly improved by using a higher data rate (for example, 2 Mbits per second) without changing the coded resolution.

MPEG System Stream Structure

In its most general form, an MPEG system stream is made up of two layers:

• The system layer contains timing and other information needed to demultiplex the audio and video streams and to synchronize audio and video during playback.
• The compression layer includes the audio and video streams.

The system decoder extracts the timing information from the MPEG system stream and sends it to the other system components. The system decoder also demultiplexes the video and audio streams from the system stream, then sends each to the appropriate decoder. The video decoder decompresses the video stream as specified in Part 2 of the MPEG standard. The audio decoder decompresses the audio stream as specified in Part 3 of the MPEG standard. Figure 2-1 shows a generalized decoding system for the audio and video streams.

Figure 2-1: General MPEG Decoding System

Video Stream Data Hierarchy

The MPEG standard defines a hierarchy of data structures in the video stream as shown schematically in Figure 2-2:

1. Video Sequence
2. Group of Pictures (GOP)
3. Picture
4. Slice
5. Macroblock
6. Block

Figure 2-2: MPEG Data Hierarchy

Video Sequence
Begins with a sequence header (may contain additional sequence headers), includes one or more groups of pictures, and ends with an end-of-sequence code.

Group of Pictures (GOP)
A header and a series of one or more pictures intended to allow random access into the sequence.

Picture
The primary coding unit of a video sequence. A picture consists of three rectangular matrices, one representing luminance (Y) values and two representing chrominance (Cb and Cr) values. The Y matrix has an even number of rows and columns.
The Cb and Cr matrices are one-half the size of the Y matrix in each direction (horizontal and vertical). Figure 2-3 shows the relative x-y locations of the luminance and chrominance components. Note that for every four luminance values, there are two associated chrominance values: one Cb value and one Cr value. (The location of the Cb and Cr values is the same, so only one circle is shown in the figure.)

Slice
One or more “contiguous” macroblocks. The order of the macroblocks within a slice is from left-to-right and top-to-bottom. Slices are important in the handling of errors. If the bitstream contains an error, the decoder can skip to the start of the next slice. Having more slices in the bitstream allows better error concealment, but uses bits that could otherwise be used to improve picture quality.

Macroblock
A 16-pixel by 16-line section of luminance components and the corresponding 8-pixel by 8-line section of the two chrominance components. See Figure 2-3 for the spatial location of luminance and chrominance components. A macroblock contains four Y blocks, one Cb block and one Cr block as shown in Figure 2-4. The numbers correspond to the ordering of the blocks in the data stream, with block 1 first.

Figure 2-4: Macroblock Composition

Block
A block is an 8-pixel by 8-line set of values of a luminance or a chrominance component. Note that a luminance block corresponds to one-fourth as large a portion of the displayed image as does a chrominance block.

YCbCr Coding

The MPEG-1 algorithm operates on images represented in YUV color space (Y Cr Cb). If an image is stored in RGB format, it must first be converted to YUV format. In YUV format, images are also represented in 24 bits per pixel (8 bits for the luminance information (Y) and 8 bits each for the two chrominance components (U and V)). The YUV format is subsampled. All luminance information is retained. However, chrominance information is subsampled 2:1 in both the horizontal and vertical directions.
Thus, there are 2 bits each per pixel of U and V information. This subsampling does not drastically affect quality because the eye is more sensitive to luminance than to chrominance information. Subsampling is a lossy step. The 24 bits of RGB information are reduced to 12 bits of YUV information, which automatically gives 2:1 compression. Technically speaking, MPEG-1 is 4:2:0 YCrCb.

The following describes how a digitized video picture must be preprocessed for MPEG-1 in order to arrive at manageable amounts of data (1.5 Mbit/s), which picture types are used, and how encoding and decoding are carried out while maintaining acceptable picture quality.

The usual resolutions of digitized video are 720x480 pixels at 60 Hz (NTSC) or 720x576 pixels at 50 Hz (PAL), with the material in 4:2:2 form. This resolution is halved in both the horizontal and the vertical direction. The horizontal resolution is generally not reduced by simply dropping luminance or chrominance values. Instead, it is common to take a weighted average of a pixel and those neighboring pixels that are subsequently discarded. In the vertical direction, similar filtering can be applied, or every second line is simply dropped, that is, only one field is used. The chrominance resolution is halved once more in the vertical direction by filtering, which yields a 4:2:0 sampling pattern (see figure). Four luminance samples then correspond to one sample of each chrominance component. As in JPEG, the individual samples are grouped into 8x8 matrices; four of the 8x8 luminance blocks form a macroblock. The color information is represented by one 8x8 matrix each for Cb and Cr, so that a total of six 8x8 matrices are used per macroblock.
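The 4:2:0 sampling pattern and the resulting 12 bits per pixel can be sketched in a few lines. The example is a simplification under stated assumptions: it starts from a full-resolution chroma plane with even dimensions, uses a plain 2x2 mean as the decimation filter (real encoders typically use longer weighted filters), and the function names are made up for illustration.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 groups.

    plane is a list of rows with an even number of rows and columns.
    """
    rows, cols = len(plane), len(plane[0])
    return [
        [
            (plane[r][c] + plane[r][c + 1]
             + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def bits_per_pixel_420(bits_per_sample=8):
    """Average bits per pixel: 4 Y samples + 1 Cb + 1 Cr per 2x2 pixel group."""
    return (4 + 1 + 1) * bits_per_sample / 4


# A tiny 2x4 Cb plane; each 2x2 group collapses to one chroma sample.
cb = [[100, 102, 50, 54],
      [104, 106, 58, 62]]

print(subsample_420(cb))      # [[103.0, 56.0]]
print(bits_per_pixel_420())   # 12.0
```

Applied to a 16x16 macroblock, the same counting gives four 8x8 luminance blocks plus one 8x8 block each for Cb and Cr, that is, six blocks in total.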
The resolution reduction is performed by so-called decimation filters, either in the MPEG processor chip, such as the VRP from C-Cube described further below, or in video digitizer chips. Conversely, when compressed video material is displayed, the original resolution must be restored. To do this, zero values are inserted between the luminance or chrominance values, and a weighted average is then computed. The weights are the filter coefficients of a so-called interpolation filter. The effect is that, for example, the sequence 10, 11, 12 becomes the sequence 10, 10.5, 11, 11.5, 12. Simpler interpolation methods work by repeating values.

Audio Stream Data Hierarchy

The MPEG standard defines a hierarchy of data structures that accept, decode and produce digital audio output. The MPEG audio stream, like the MPEG video stream, consists of a series of packets. Each audio packet contains an audio packet header and one or more audio frames as shown in Figure 2-5.

Figure 2-5: Audio Stream Structure

Each audio packet header contains the following information:

• Packet start code - Identifies the packet as being an audio packet.
• Packet length - Indicates the number of bytes in the audio packet.

An audio frame contains the following information:

• Audio frame header - Contains synchronization, ID, bit rate, and sampling frequency information.
• Error-checking code - Contains error-checking information.
• Audio data - Contains information used to reconstruct the sampled audio data.
• Ancillary data - Contains user-defined data.

Step 1: Intra-picture (Transform) Coding

The MPEG transform coding algorithm includes these steps:

• Discrete cosine transform (DCT)
• Quantization
• Run-length encoding

Both image blocks and prediction-error blocks have high spatial redundancy.
To reduce this redundancy, the MPEG algorithm transforms 8 x 8 blocks of pixels or 8 x 8 blocks of error terms from the spatial domain to the frequency domain with the Discrete Cosine Transform (DCT).

Next, the algorithm quantizes the frequency coefficients. Quantization is the process of approximating each frequency coefficient as one of a limited number of allowed values. The encoder chooses a quantization matrix that determines how each frequency coefficient in the 8 x 8 block is quantized. Human perception of quantization error is lower for high spatial frequencies, so high frequencies are typically quantized more coarsely (i.e., with fewer allowed values) than low frequencies.

The combination of DCT and quantization results in many of the frequency coefficients being zero, especially the coefficients for high spatial frequencies. To take maximum advantage of this, the coefficients are organized in a zigzag order to produce long runs of zeros (see Figure 2-10). The coefficients are then converted to a series of run-amplitude pairs, each pair indicating a number of zero coefficients and the amplitude of a non-zero coefficient. These run-amplitude pairs are then coded with a variable-length code, which uses shorter codes for commonly occurring pairs and longer codes for less common pairs.

Huffman Coding

For a given character distribution, by assigning short codes to frequently occurring characters and longer codes to infrequently occurring characters, Huffman's minimum redundancy encoding minimizes the average number of bits required to represent the characters in a text. Static Huffman encoding uses a fixed set of codes, based on a representative sample of data, for processing texts. Although encoding is achieved in a single pass, the data on which the compression is based may bear little resemblance to the actual text being compressed.
Dynamic Huffman encoding, on the other hand, reads each text twice: once to determine the frequency distribution of the characters in the text and once to encode the data. The codes used for compression are computed on the basis of the statistics gathered during the first pass, with compressed texts being prefixed by a copy of the Huffman encoding table for use with the decoding process.

Some blocks of pixels need to be coded more accurately than others. For example, blocks with smooth intensity gradients need accurate coding to avoid visible block boundaries. To deal with this inequality between blocks, the MPEG algorithm allows the amount of quantization to be modified for each macroblock of pixels. This mechanism can also be used to provide smooth adaptation to a particular bit rate.

Figure 2-10 Transform Coding Operations

The encoding scheme used is similar to JPEG compression. Each 8x8 block is encoded independently with one exception explained below. The block is first transformed from the spatial domain into a frequency domain using the DCT (Discrete Cosine Transform), which separates the signal into independent frequency bands. Most frequency information is in the upper left corner of the resulting 8x8 block. After this, the data is quantized. Quantization can be thought of as ignoring lower-order bits (though this process is slightly more complicated). Quantization is the only lossy part of the whole compression process other than subsampling. The resulting data is then run-length encoded in a zig-zag ordering to optimize compression. This zig-zag ordering produces longer runs of 0's by taking advantage of the fact that there should be little high-frequency information (more 0's as one zig-zags from the upper left corner towards the lower right corner of the 8x8 block).
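The zig-zag scan and the run-amplitude pairs described above can be illustrated with a minimal Python sketch. It is not the normative MPEG procedure: the function names `zigzag_order` and `run_amplitude_pairs` and the `"EOB"` end marker are assumptions made for this illustration, the input is taken to be an already quantized 8x8 block, and the first (upper left) coefficient is scanned like any other, which is a simplification.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):              # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)           # even diagonals run from bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order


def run_amplitude_pairs(block):
    """Turn a quantized block into (zero_run, amplitude) pairs plus an end marker.

    Each pair says how many zero coefficients precede a non-zero coefficient.
    Trailing zeros are not listed; the hypothetical "EOB" marker stands for them.
    """
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")
    return pairs


# Example: a block whose only non-zero values sit near the upper left corner.
example = [[0] * 8 for _ in range(8)]
example[0][0], example[0][1], example[1][0], example[1][1] = 12, 5, -3, 2
example_pairs = run_amplitude_pairs(example)
```

In this example the 64 coefficients collapse into four pairs and the end marker, which is the effect the zig-zag ordering is meant to produce. A real encoder would then map each pair to a variable-length code.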
The aforementioned exception to independence is that the coefficient in the upper left corner of the block, called the DC coefficient, is encoded relative to the DC coefficient of the previous block (DPCM coding).

Since MPEG is targeted for a set of specific applications, there is only one color space (4:2:0 YCbCr), one sample precision (8 bits), and one scanning mode (sequential). Luminance and chrominance share quantization tables. The range of sampling dimensions is more limited as well. MPEG adds adaptive quantization at the macroblock (16 x 16 pixel area) layer. This permits both smoother bit rate control and more perceptually uniform quantization throughout the picture and image sequence. Adaptive quantization is part of the JPEG-2 charter. MPEG variable length coding tables are non-downloadable, and are therefore optimized for a limited range of compression ratios appropriate for the target applications.

The local spatial decorrelation methods in MPEG and JPEG are very similar. Picture data is block transform coded with the two-dimensional orthonormal 8x8 DCT. The resulting 63 AC transform coefficients are mapped in a zig-zag pattern to statistically increase the runs of zeros. Coefficients of the vector are then uniformly scalar quantized, run-length coded, and finally the run-length symbols are variable length coded using a canonical (JPEG) or modified Huffman (MPEG) scheme. Global frame redundancy is reduced by 1-D DPCM of the block DC coefficients, followed by quantization and variable length entropy coding.

Frame -(MCP)-> 8x8 spatial block -(DCT)-> 8x8 frequency block -(ZZ)-> zig-zag scan -(Q)-> quantization -(RLC)-> run-length coding -(VLC)-> variable length coding

Step 2: Inter-Picture Coding

Much of the information in a picture within a video sequence is similar to information in a previous or subsequent picture.
The MPEG standard takes advantage of this temporal redundancy by representing some pictures in terms of their differences from other (reference) pictures, or what is known as inter-picture coding. This section describes the types of coded pictures and explains the techniques used in this process.

The MPEG standard specifically defines three types of pictures: intra, predicted, and bidirectional.

Intra Pictures

Intra pictures, or I-pictures, are coded using only information present in the picture itself. I-pictures provide potential random access points into the compressed video data. I-pictures use only transform coding and provide moderate compression. I-pictures typically use about two bits per coded pixel.

Predicted Pictures

Predicted pictures, or P-pictures, are coded with respect to the nearest previous I- or P-picture. This technique is called forward prediction and is illustrated in Figure 2-6. Like I-pictures, P-pictures serve as a prediction reference for B-pictures and future P-pictures. However, P-pictures use motion compensation to provide more compression than is possible with I-pictures. Unlike I-pictures, P-pictures can propagate coding errors because P-pictures are predicted from previous reference (I- or P-) pictures.

Figure 2-6 Forward Prediction

Bidirectional Pictures

Bidirectional pictures, or B-pictures, are pictures that use both a past and future picture as a reference. This technique is called bidirectional prediction and is illustrated in Figure 2-7. B-pictures provide the most compression and do not propagate errors because they are never used as a reference. Bidirectional prediction also decreases the effect of noise by averaging two pictures.

Figure 2-7 Bidirectional Prediction

Video Stream Composition

The MPEG algorithm allows the encoder to choose the frequency and location of I-pictures. This choice is based on the application's need for random accessibility and the location of scene cuts in the video sequence.
In applications where random access is important, I-pictures are typically used two times a second.

The encoder also chooses the number of B-pictures between any pair of reference (I- or P-) pictures. This choice is based on factors such as the amount of memory in the encoder and the characteristics of the material being coded. For example, a large class of scenes have two bidirectional pictures separating successive reference pictures. A typical arrangement of I-, P-, and B-pictures is shown in Figure 2-8 in the order in which they are displayed.

Figure 2-8 Typical Display Order of Picture Types

The MPEG encoder reorders pictures in the video stream to present the pictures to the decoder in the most efficient sequence. In particular, the reference pictures needed to reconstruct B-pictures are sent before the associated B-pictures. Figure 2-9 demonstrates this ordering for the first section of the example shown above.

Figure 2-9 Video Stream versus Display Ordering

Motion Compensation

Motion compensation is a technique for enhancing the compression of P- and B-pictures by eliminating temporal redundancy. Motion compensation typically improves compression by about a factor of three compared to intra-picture coding. Motion compensation algorithms work at the macroblock level.

Motion compensation means that redundant picture information that results from coordinate shifts within a picture sequence is coded only as a vector that refers to an original block. When motion compensation is computed, however, a picture detail will not always continue identically over a sequence of several pictures. In real video, a pixel block will always differ more or less from the preceding one because of background noise. For a person moving through the picture, for example, the fit or the shading of the clothing changes.
If the differences between the pictures are significant, an error image must be coded in addition to the motion vector. The decision about where a piece of picture content has moved can only be made on the basis of objective criteria. A video encoder will therefore search the neighborhood of the earlier source block for a pixel block with the greatest possible similarity (see figure). One conceivable decision criterion is the mean squared distance between the values of the two 16x16 pixel blocks. This means that the squares of the differences of all luminance and chrominance values of the original block and of the candidate block within the search area are computed and summed. This yields a measure of the similarity of two blocks. If a block has carried over to the next picture without changing, the difference is zero. A very compute-intensive method would be to form the sum of the squared differences for every conceivable displacement within the search area. The encoder then selects as the best motion vector the one with the smallest squared distance to the original.

The search for the best motion vector can be carried out with a resolution of one pixel or half a pixel. The vectors used for coding have a resolution of up to half a pixel. For the search, values can be interpolated linearly between neighboring pixels. Because the computational effort is very considerable, different search strategies are used. For example, the grid of 48x48 integer displacements can be searched first, and the 8 neighboring positions at a distance of half a pixel examined afterwards. Another method first searches on a coarse grid with a spacing of several pixels and then refines it step by step around the best position. This method needs even fewer steps.
However, the probability of finding the optimal motion vector decreases.

JPEG can usually compress and decompress images of good quality at a compression ratio of 20 to 25. Through motion compensation, MPEG achieves three times that value. If one takes into account that MPEG-1 video pictures are scaled down to CIF before the actual compression, the overall compression factor is about 240. (CIF (Common Intermediate Format) corresponds to a resolution of 352*288 pixels (352*240 pixels for NTSC), which allows a whole-number division into 16x16 blocks.) Figuratively speaking, this means that ten pixels, each with eight bits for the red, green and blue values, are represented by just one bit.

When a macroblock is compressed by motion compensation, the compressed file contains this information:
• The spatial vector between the reference macroblock(s) and the macroblock being coded (motion vectors)
• The content differences between the reference macroblock(s) and the macroblock being coded (error terms)

Not all information in a picture can be predicted from a previous picture. Consider a scene in which a door opens: The visual details of the room behind the door cannot be predicted from a previous frame in which the door was closed. When a case such as this arises, i.e., a macroblock in a P-picture cannot be efficiently represented by motion compensation, it is coded in the same way as a macroblock in an I-picture using transform coding techniques.

The difference between B- and P-picture motion compensation is that macroblocks in a P-picture use the previous reference (I- or P-picture) only, while macroblocks in a B-picture are coded using any combination of a previous and a future reference picture.
Four codings are therefore possible for each macroblock in a B-picture:
• Intra coding: no motion compensation
• Forward prediction: the previous reference picture is used as a reference
• Backward prediction: the next reference picture is used as a reference
• Bidirectional prediction: two reference pictures are used, the previous reference picture and the next reference picture

Backward prediction can be used to predict uncovered areas that do not appear in previous pictures.

The MPEG method exploits the fact that successive pictures in a moving sequence are very similar. Except at abrupt scene changes, picture details continue smoothly from one picture to the next, for example a vehicle moving from left to right or a white cloud drifting across a blue sky. A central component of MPEG is so-called motion compensation: the movement of the vehicle is simply described by a vector, for example by stating that the vehicle has moved 12 pixels to the right and 10 pixels upwards from one picture to the next. Recognizing a coherent object would, however, be far too costly in practice. Instead, so-called macroblocks with a size of 16x16 pixels are examined. Each macroblock corresponds to 4 blocks as they are coded in JPEG. In the next step, the difference between the real macroblock in film picture 1 and the shifted macroblock from film picture 2 is formed. This error image must be coded and stored alongside the displacement vector in order to keep track of error propagation. The storage cost is lowest, of course, when the difference between the shifted macroblocks and the blocks actually displayed is so small that the difference need not be coded at all.
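The block search described above, summing the squared differences for every displacement in the search area and keeping the smallest, can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the function names `ssd` and `best_motion_vector` are hypothetical, frames are plain lists of luminance rows (chrominance is ignored), only whole-pixel displacements are tried, and the vector is taken to point from the block in the current picture to its match in the reference picture. Real encoders typically use faster search strategies.

```python
def ssd(ref, cur, rx, ry, cx, cy, size):
    """Sum of squared differences between the block at (cx, cy) in cur
    and the candidate block at (rx, ry) in ref."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            d = cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx]
            total += d * d
    return total


def best_motion_vector(ref, cur, cx, cy, size=16, search=7):
    """Exhaustive search: try every whole-pixel displacement within
    +/- search pixels and return ((vx, vy), cost) of the best match."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = ssd(ref, cur, cx, cy, cx, cy, size)   # start with "no motion"
    for vy in range(-search, search + 1):
        for vx in range(-search, search + 1):
            rx, ry = cx + vx, cy + vy
            # Skip candidates that would reach outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = ssd(ref, cur, rx, ry, cx, cy, size)
            if cost < best_cost:
                best, best_cost = (vx, vy), cost
    return best, best_cost


# Synthetic test pictures: cur is ref shifted 3 pixels right and 2 pixels down.
W = H = 32
ref_frame = [[(7 * x + 13 * y + x * y) % 251 for x in range(W)] for y in range(H)]
cur_frame = [[ref_frame[y - 2][x - 3] if x >= 3 and y >= 2 else 0
              for x in range(W)] for y in range(H)]
vector, cost = best_motion_vector(ref_frame, cur_frame, 8, 8, size=8, search=4)
```

For the synthetic pictures, the block at (8, 8) in the current picture is found 3 pixels to the left and 2 pixels up in the reference picture with a cost of zero, so no error image would need to be coded. With real video the minimum cost is rarely zero, and the remaining difference becomes the error image.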
MPEG governs the presentation of compressed video by defining a syntax. The rules for determining motion compensation, by contrast, leave a great deal of freedom, so that the quality of the final MPEG product also depends substantially on the quality of the encoding algorithm used.

Synchronization

The MPEG standard provides a timing mechanism that ensures synchronization of audio and video. The standard includes two parameters: the system clock reference (SCR) and the presentation timestamp (PTS).

The MPEG-specified "system clock" runs at 90 kHz. System clock reference and presentation timestamp values are coded in MPEG bitstreams using 33 bits, which can represent any clock cycle in a 24-hour period.

An SCR is a snapshot of the encoder system clock which is placed into the system layer of the bitstream, as shown in Figure 2-11. During decoding, these values are used to update the system clock counter in the CL480.

Figure 2-11 SCR Flow in MPEG System

Presentation timestamps are samples of the encoder system clock that are associated with video or audio presentation units. A presentation unit is a decoded video picture or a decoded audio time sequence. The PTS represents the time at which the video picture is to be displayed or the starting playback time for the audio time sequence.

The decoder either skips or repeats picture displays to ensure that the PTS is within one picture's worth of 90 kHz clock ticks of the SCR when a picture is displayed. If the PTS is earlier (has a smaller value) than the current SCR, the decoder discards the picture. If the PTS is later (has a larger value) than the current SCR, the decoder repeats the display of the picture.

MPEG-1 Layer 3 Audio Compression

This chapter gives a short introduction to MP3 and pointers to encoders and players.

Overview

The ISO/MPEG Audio Coding Standard describes the compression of audio signals using high performance perceptual coding schemes.
It specifies a family of three audio coding schemes, simply called Layer 1, Layer 2 and Layer 3. Compression gain (sound quality per bit) and encoder complexity increase from Layer 1 to Layer 3.

All Layers use the same basic structure. The coding scheme can be described as perceptual noise shaping or perceptual subband/transform coding. The encoder analyses the spectral components of the audio signal by calculating a filterbank or transform and applies a psychoacoustic model to estimate the just noticeable noise level. In its quantization and coding stage, the encoder tries to allocate the available number of data bits in a way that meets both the bitrate and masking requirements. The decoder is much less complex. Its task is to synthesize an audio signal out of the encoded spectral components.

Compression rates: You can achieve a compression rate of
• 1:4 with Layer 1 (or 192 kbps per audio channel),
• 1:6..8 with Layer 2 (or 128..96 kbps per audio channel), and
• 1:10..12 with Layer 3 (or 64..56 kbps per audio channel),
and the reconstructed audio signal will maintain a CD-like sound quality.

There is a lot of confusion surrounding the terms audio compression, audio encoding, and audio decoding. This section gives you an overview of what audio coding (another one of these terms...) is all about.

The purpose of audio compression

Up to the advent of audio compression, high-quality digital audio data took a lot of hard disk space to store. Let us go through a short example. You want to, say, sample your favorite 1-minute song and store it on your hard disk. Because you want CD quality, you sample at 44.1 kHz, stereo, with 16 bits per sample. 44100 Hz means that you have 44100 values per second coming in from your sound card (or input file). Multiply that by two because you have two channels. Multiply by another factor of two because you have two bytes per value (that's what 16 bit means).
The song will take up

44 100 samples/sec * 2 channels * 2 bytes/sample * 60 sec/min ≈ 10 Mbyte/min

that is, about 10 MB of storage space on your hard disk per minute. If you wanted to download that over the Internet, given an average 28.8 modem, it would take you (at least)

(10 000 000 bytes * 8 bits/byte) / (28 800 bits/sec * 60 sec/min) ≈ 46 min ≈ ¾ h

just to download one minute of music!

Digital audio coding, which in this context is synonymously called digital audio compression as well, is the art of minimizing storage space (or channel bandwidth) requirements for audio data. Modern perceptual audio coding techniques (like MPEG Layer-3) exploit the properties of the human ear (the perception of sound) to achieve a size reduction by a factor of 12 with little or no perceptible loss of quality. Therefore, such schemes are the key technology for high quality low bit-rate applications, like soundtracks for CD-ROM games, solid-state sound memories, Internet audio, digital audio broadcasting systems, and the like.

The two parts of audio compression

Audio compression really consists of two parts. The first part, called encoding, transforms the digital audio data that resides, say, in a WAVE file, into a highly compressed form called bitstream. To play the bitstream on your soundcard, you need the second part, called decoding. Decoding takes the bitstream and re-expands it to a WAVE file.

The program that effects the first part is called an audio encoder. MP3Enc is such an encoder; there are others, see http://www.fhg.iis.de/audio/. The program that does the second part is called an audio decoder. One well-known MPEG Layer-3 decoder is WinPlay3, another is l3dec. Both can be found on http://www.fhg.iis.de/audio/.

Compression ratios, bitrate and quality

It has not been explicitly mentioned up to now: What you end up with after encoding and decoding is not the same sound file anymore: All superfluous information has been squeezed out, so to say.
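As a quick check of the storage and download arithmetic in the one-minute example earlier in this chapter, the figures can be recomputed with a few lines of Python. This is only a sketch: the constant and function names are made up for this illustration, and the modem is assumed to deliver its full 28 800 bits per second without any protocol overhead.

```python
SAMPLE_RATE_HZ = 44_100       # CD-quality sampling rate
CHANNELS = 2                  # stereo
BYTES_PER_SAMPLE = 2          # 16 bits per sample
MODEM_BITS_PER_SEC = 28_800   # nominal rate of a 28.8 modem


def pcm_bytes(seconds):
    """Size in bytes of uncompressed audio of the given duration."""
    return SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE * seconds


def download_minutes(num_bytes, bits_per_sec=MODEM_BITS_PER_SEC):
    """Minutes needed to transfer num_bytes at the given line rate."""
    return num_bytes * 8 / bits_per_sec / 60


one_minute = pcm_bytes(60)                # exact size of one minute of audio
exact_time = download_minutes(one_minute)
rounded_time = download_minutes(10_000_000)
```

Here `one_minute` is 10 584 000 bytes, which the text rounds to 10 MB. With the rounded size the transfer takes about 46 minutes, and with the exact size it takes 49 minutes, so the "about ¾ h" in the text is a slight underestimate rather than an exaggeration.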
It is not the same file, but it will sound the same, more or less, depending on how much compression has been performed on it.

Generally speaking, the lower the compression ratio achieved, the better the sound quality will be in the end, and vice versa. Table 1.1 gives you an overview of the quality achievable.

Because compression ratio is a somewhat unwieldy measure, experts use the term bitrate when speaking of the strength of compression. Bitrate denotes the average number of bits that one second of audio data will take up in your compressed bitstream. Usually the units used will be kbps, which is kbits/s, or 1000 bits/s. To calculate the number of bytes per second of audio data, simply divide the number of bits per second by eight.

Fraunhofer Homepage: http://www.iis.fhg.de/amm/
MP3 FAQ: http://www.iis.fhg.de/amm/techinf/layer3/layer3faq/index.html

Profiles

This chapter presents some of the profiles that are used when creating MPEG files.

What do MPEG-1 and MPEG-2 have in common, and how do they differ?

MPEG-1 is suitable for low and medium data rate applications producing image quality comparable to VHS tape. Such applications include computer multimedia CD-ROM titles, computer games, video training materials, video databases, and networked video applications. It is generally used for video of resolutions up to 352x288 and data rates up to 2 Mbits/sec.

MPEG-2 is designed for higher quality video applications like DVD, video on demand, and digital broadcasting. It is generally used for resolutions and data rates greater than those listed above for MPEG-1.

MPEG-1 and MPEG-2 are international standards. The MPEG files are operating system independent, unlike formats such as AVI and QuickTime. MPEG-1 playback is now standard on most systems sold today. With the proliferation of DVD players on PCs, the same is happening with MPEG-2. In most cases, MPEG provides better quality video in a smaller file size than AVI or QuickTime codecs.
In the past few years, it has become very easy to capture analog input video as an AVI file with an inexpensive video capturing card on a PC, optionally edit the AVI video file, and then convert the AVI file to MPEG.

Profiles and Layers

Although it is possible to create MPEG-1 files with frame sizes of up to 4095x4095 and data rates greater than 1812 kbits/sec, not all decoders can play such MPEG-1 streams. There is a minimum required set of parameters of MPEG-1 streams that many low-end hardware and software MPEG-1 players usually support. The minimum set of parameters is specified below as "constrained parameters".

A constrained parameter bitstream (CPB) is defined in the MPEG-1 Standard as follows:
1. Horizontal frame size less than or equal to 768 pixels.
2. Vertical frame size less than or equal to 576 pixels.
3. Picture area less than or equal to 396 macroblocks (101376 pixels).
4. Frame rate less than or equal to 30 frames/sec.
5. Motion vector range less than or equal to 64 pixels.
6. Bitrate less than or equal to 1.86 Mbits/sec.
7. VBV buffer size less than or equal to 40 Kbytes. (40 Kbytes for constrained files and 224 Kbytes for non-constrained files.)

Constrained Parameters do not apply to MPEG-2 streams, as they have broader applications than CD-ROM or VideoCD. MPEG-2 was designed to be a very generic standard in that it is to be used for a variety of applications, everything from DVD and computer video to digital satellite and HDTV systems.

MPEG-2 does, however, have defined Profiles and Levels of compatibility. Profiles specify syntax (i.e. algorithms), and Levels specify coding parameters (sample rates, frame dimensions, coded bitrates, etc.). Defined together, Profiles and Levels specify interchange standards for specific applications of MPEG-2. As shown in the table below, there are 5 Profiles (Simple, Main, SNR, Spatial and High), each with a maximum of 4 possible Levels (Low, Main, High-1440, and High).
Not all combinations have been defined in the MPEG-2 specification. MPEG-2 Main Profile at Main Level (MP@ML) can be considered similar to MPEG-1's constrained parameters, and supports up to 720 pixels x 480 lines x 30 frames/sec, at a total sampling rate up to 10.4 Msamples/second (i.e., consistent with the CCIR-601 video format standard).

If compatibility with a specific MPEG-2 application or decoder is important in your work, be sure to specify the correct parameters, and the correct Profile/Level combination.

The following table shows the MPEG-2 Profile (horizontal) / Level (vertical) cross-reference structure, along with the values that define the upper limits of each combination. For a more in-depth explanation of this system, check the MPEG-2 specification or a reference on the subject.

Level LOW
• Simple: undefined
• Main (MP@LL): 352 pels/line, 288 lines/frame, 30 frames/sec, 3.04 Msamples/s, 4 Mbits/s
• SNR: 352 pels/line, 288 lines/frame, 30 frames/sec, 3.04 Msamples/s, 4 Mbits/s both layers, 3 Mbits/s base layer
• Spatial: undefined
• High: undefined

Level MAIN
• Simple: 720 pels/line, 576 lines/frame, 30 frames/sec, 10.4 Msamples/s, 15 Mbits/s
• Main (MP@ML): 720 pels/line, 576 lines/frame, 30 frames/sec, 10.4 Msamples/s, 15 Mbits/s
• SNR: (not legible in the source)
• Spatial: (not legible in the source)
• High: 720 pels/line, 576 lines/frame, 30 frames/sec, 11.06 Msamples/s or 14.75 Msamples/s, 20 Mbits/s 3 layers, 15 Mbits/s base + middle, 4 Mbits/s base layer

Level HIGH 1440
• Simple: undefined
• Main: 1440 pels/line, 1152 lines/frame, 60 frames/sec, 47 Msamples/s, 60 Mbits/s
• SNR: undefined
• Spatial: 1440 pels/line, 1152 lines/frame, 60 frames/sec, 47 Msamples/s, 60 Mbits/s 3 layers, 40 Mbits/s base + middle, 15 Mbits/s base layer
• High: 1440 pels/line, 1152 lines/frame, 60 frames/sec, 47 Msamples/s or 62.7 Msamples/s, 80 Mbits/s 3 layers, 60 Mbits/s base + middle, 20 Mbits/s base layer

Level HIGH
• Simple: undefined
• Main: 1920 pels/line, 1152 lines/frame, 60 frames/sec, 62.7 Msamples/s, 80 Mbits/s
• SNR: undefined
• Spatial: undefined
• High: 1920 pels/line, 1152 lines/frame, 60 frames/sec, 62.7 Msamples/s or 83.5 Msamples/s, 100 Mbits/s 3 layers, 80 Mbits/s base + middle, 25 Mbits/s base layer

The following sections describe which of these combinations are supported by selected encoders.

Predefined Profiles (Ligos)

Note: In the following, the term Profile covers Profile/Layer (see above).

This first set of four are MPEG-1 Profiles very suitable for Internet use. Using a combination of low data rates and our special “Skip B frames” mode, these profiles produce very compact files suitable for e-mail or web pages.

1. MPEG-1 (Low bitrate, 10 fps simulated)
   MPEG-1, 30 fps (with special “Skip B frames” mode enabled to simulate 10 fps), 190 kbits/sec video data rate, 96 kbits/sec data rate MPEG-1 Layer II audio
2. MPEG-1 (Low bitrate, 15 fps simulated)
   MPEG-1, 29.97 fps (with special “Skip B frames” mode enabled to simulate 15 fps), 498 kbits/sec video data rate, 96 kbits/sec data rate MPEG-1 Layer II audio
3. MPEG-1 (Low bitrate, 15 fps simulated, variable bitrate)
   MPEG-1, 23.976 fps (with special “Skip B frames” mode enabled to simulate 15 fps), variable bitrate mode with a peak maximum of 600 kbits/sec video data rate, average bitrate of 350 kbits/sec, and 96 kbits/sec data rate MPEG-1 Layer II audio
4. MPEG-1 (Low bitrate, 5 fps simulated, variable bitrate)
   MPEG-1, 30 fps (with special “Skip B frames” mode enabled to simulate 5 fps), variable bitrate mode with a peak maximum of 125 kbits/sec video data rate, average maximum bitrate of 75 kbits/sec, average minimum bitrate of 20 kbits/sec, and 32 kbits/sec data rate MPEG-1 Layer II audio

This next set of MPEG-1 Profiles are much more general and popular Profiles based on “good” data and frame rates for different input videos (based on frame size and frame rate).

1. MPEG-1 (Recommended for QSIF, 176x120)
   MPEG-1, 24 fps, 600 kbits/sec video data rate, 96 kbits/sec data rate MPEG-1 Layer II audio
2.
MPEG-1 (Recommended for SIF, 352x240 NTSC)
   MPEG-1, 29.97 fps, 1198 kbits/sec video data rate, 128 kbits/sec data rate MPEG-1 Layer II audio
3. MPEG-1 (Recommended for SIF, 352x288 PAL)
   MPEG-1, 25 fps, 1198 kbits/sec video data rate, 128 kbits/sec data rate MPEG-1 Layer II audio
4. MPEG-1 NTSC (General)
   MPEG-1, 30 fps, 1098 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio
5. MPEG-1 PAL (General)
   MPEG-1, 25 fps, 1098 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio

This next set of files give you “one-click” access to the settings for VideoCD, both NTSC and PAL standard.

1. MPEG-1 VideoCD NTSC
   MPEG-1 VideoCD, 29.97 fps, 1123 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio
2. MPEG-1 VideoCD PAL
   MPEG-1 VideoCD, 25 fps, 1123 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio

The remaining seven Profiles are best used for video larger than 352x288, and for higher data rates than listed above. They default to MPEG-2, and cover resolutions from Half-Horizontal Resolution (HHR, or Half D1) up to fullscreen. Also included is an MPEG-2 Profile using the new Variable Bitrate mode that produces excellent fullscreen quality in a compact file size.

1. MPEG-2 (Recommended for HHR, 352x480 NTSC)
   MPEG-2, 30 fps, 2048 kbits/sec video data rate, 192 kbits/sec data rate MPEG-1 Layer II audio
2. MPEG-2 (Recommended for HHR, 352x576 PAL)
   MPEG-2, 25 fps, 2048 kbits/sec video data rate, 192 kbits/sec data rate MPEG-1 Layer II audio
3. MPEG-2 (Recommended for NTSC Full Screen)
   MPEG-2, 29.97 fps, 4096 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio
4. MPEG-2 (Recommended for PAL Full Screen)
   MPEG-2, 25 fps, 4096 kbits/sec video data rate, 224 kbits/sec data rate MPEG-1 Layer II audio
5.
MPEG-2 (Sample VBR profile for Full Screen)
   MPEG-2, 29.97 fps, Variable Bitrate with maximum peak 6000 kbits/sec video data rate, maximum average of 3000 kbits/sec, minimum average of 1500 kbits/sec, 224 kbits/sec data rate MPEG-1 Layer II audio
6. MPEG-2 (Main Profile @ Low Level)
   MPEG-2, 30 fps, 3906 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio
7. MPEG-2 (Main Profile @ Main Level)
   MPEG-2, 30 fps, 14648 kbits/sec video data rate, 112 kbits/sec data rate MPEG-1 Layer II audio

Predefined Profiles (Xing)

The XingMPEG Encoder comes with a number of pre-defined Stream Profiles.

Audio Only Layer 3 Folder contains Stream Profiles for making audio only MPEG files for a wide range of target networks.

1. 128K Stream Profile creates CD quality layer 3 audio only MPEG-1 layer 3 files for broadband target networks. Great for music!
2. 112K Stream Profile creates CD quality layer 3 audio only MPEG-1 layer 3 files for broadband target networks. Great for music!
3. 64K Stream Profile creates radio quality audio only MPEG-2 layer 3 files for narrowband target networks.
4. 28.8 Modem Stereo Stream Profile creates telephone quality stereo audio only MPEG-2 layer 3 files for narrowband target networks.
5. 28.8 Modem Mono Stream Profile creates telephone quality mono audio only MPEG-2 layer 3 files for narrowband target networks.

MPEG 1 Folder contains Stream Profiles for making various kinds of MPEG files, from full size down to smaller Internet based sizes.

1. Match Source Stream Profile creates an MPEG file that best matches the source file (near CD quality audio). Use the Match Source Stream Profile when you are unsure which Stream Profile to use or to avoid unnecessary re-encoding.

When you provide an MPEG system source file (.mpg), MPEG video source file (.mpv) or an MPEG audio source file (.mpa), the XingMPEG Encoder tries to use that source file without re-encoding it.
However, if your Stream Profile does not match the source file's properties exactly (Data Rate, Resolution, Frame Rate, etc.), the Encoder assumes you want the file re-encoded with the new properties. Use the Match Source Stream Profile to prevent unnecessary re-encoding.

If you want to use an MPEG-2 or LBR Algorithm, you need to create a custom Match Source Stream Profile. Use the MPEG 1 Match Source Stream Profile and change the Algorithm field to MPEG-2 or LBR.

1. NTSC Stream Profile creates a full-screen MPEG file following the US (NTSC) standard for color television broadcast signals (near CD quality audio).
2. PAL Stream Profile creates a full-screen MPEG file following the European standard (PAL) for color television broadcast signals (near CD quality audio).
3. FILM Stream Profile creates a full-screen MPEG file following the 35mm motion picture film standard (near CD quality audio).
4. 600K Stream Profile creates a full-screen MPEG file with high radio quality audio suitable for broadband target networks.
5. 384K Stream Profile creates a full-screen MPEG file with radio quality audio suitable for broadband target networks.
6. 128K Stream Profile creates a quarter-screen MPEG file with radio quality audio suitable for narrowband target networks.

VideoCD Folder contains Stream Profiles for making VideoCD MPEG files for the Whitebook Standard.

1. NTSC Stream Profile creates a VideoCD MPEG file following the US (NTSC) standard for color television broadcast signals (CD quality audio).
2. PAL Stream Profile creates a VideoCD MPEG file following the European (PAL) standard for color television broadcast signals (CD quality audio).
3. FILM Stream Profile creates a VideoCD MPEG file following the 35mm motion picture film standard (CD quality audio).

Audio/Video Layer 2 folder contains Stream Profiles for making audio and video MPEG files for a wide range of target networks.

1.
1.5Mb Stream Profile creates a full-screen TV quality MPEG-1 file with near CD quality layer 2 audio for broadband target networks. 2. 600K Stream Profile creates a full-screen TV quality MPEG-1 video file with high radio quality layer 2 audio for broadband target networks. 3. 384K Stream Profile creates a full-screen TV quality MPEG-1 video file with radio quality layer 2 audio for broadband and narrowband target networks. 4. 128K ISDN to 28.8K Modem Stream Profile creates a quarter-screen MPEG-1 video file with radio quality layer 2 audio for 128K ISDN down to 28.8K Modem target networks. 5. 128K ISDN to 14.4K Modem Stream Profile creates a quarter-screen MPEG-1 video file with radio quality layer 2 audio for 128K ISDN down to 14.4K Modem target networks. Audio/Video Layer 3 folder contains Stream Profiles for making audio and video MPEG files for a wide range of target networks. 30 1. 1.5Mb Stream Profile creates a full-screen TV quality MPEG-1 video file with CD quality layer 3 audio for broadband target networks. 2. 600K Stream Profile creates a full-screen TV quality MPEG-1 video file with near CD quality layer 3 audio for broadband target networks. 3. 384K Stream Profile creates a full-screen TV quality MPEG-1 video file with radio quality layer 3 audio for broadband and narrowband target networks. 4. 128K ISDN to 28.8K Modem Stream Profile creates a quarter-screen MPEG-1 video file with radio quality layer 3 audio for 128K ISDN down to 28.8K Modem target networks. Audio Only Layer 2 folder contains Stream Profiles for making audio only MPEG files for a wide range of target networks. 1. 384K Stream Profile creates a CD quality layer 2 audio only MPEG-1 file for broadband and narrowband target networks. 2. 128K ISDN Stream Profile creates a near CD quality layer 2 audio only MPEG-1 file for broadband and narrowband target networks. 3. 64K ISDN Stream Profile creates a high radio quality layer 2 audio only MPEG-2 file for narrowband target networks. 4. 
14.4 Modem Stream Profile creates a radio quality LBR audio only file for narrowband target networks. 5. 28.8 Modem Stream Profile creates a radio quality layer 2 audio only MPEG-2 file for narrowband target networks. 6. 9600 Modem Stream Profile creates a radio quality LBR audio only file for narrowband target networks. Audio Only Layer 3 folder contains Stream Profiles for making audio only MPEG files for a wide range of target networks. 1. 192K Stream Profile creates the highest quality layer 3 audio only MPEG-1 file for broadband target networks. While Xing supports custom Stream Profiles up to 384K audio only layer 3 files, the quality gained above 192K is negligible. 2. 128K ISDN Stream Profile creates a CD quality layer 3 audio only MPEG-1 file for broadband target networks. Great for music! 3. 64K ISDN Stream Profile creates a high radio quality layer 3 audio only MPEG-2 file for narrowband target networks. 4. 28.8 Modem Stereo Stream Profile creates a radio quality layer 3 audio only MPEG-2 file for narrowband target networks. 5. 28.8 Modem Mono Stream Profile creates a radio quality layer 3 audio only MPEG-2 file for narrowband target networks. Profiles (Heuris) There are several base templates in MPEG Power Professional. These are built in templates that optimize MPEG encoding parameters for various types of applications. The base templates include: 1. CD 1X Optimized for playback from single speed CD-ROM 2. CD 2X Optimized for playback from double speed CD-ROM 3. CD-I Optimized for playback from a CD-I disc. This template provides MPEG files that are properly “pinked” for insertion into CD-I applications. 4. Internet Low bit-rate encoding for transmission over the Internet. 5. Video CD Optimized for Video CD – a format based on Phillips’ White Book Standard. These files will work in Video CD 1.1 or 2.0 applications. 31 MPEG Encoder Im folgenden finden Sie Beschreibungen einiger kommerzieller und freier MPEG Encoder. 
The descriptions are taken largely from the online documentation supplied with each product.

Overview

The following MPEG video and audio encoders are described below:

1. Darim DVMPEG Encoder (http://www.darvision.com)
2. Ligos LSX MPEG Encoder (http://www.ligos.com)
3. Heuris MPEG Professional Encoder 2.0 (http://www.heuris.com)
4. Xing MPEG Encoder (http://www.xingtech.com/)
5. Panasonic MPEG-1 Encoder Plugin (http://www.pwi.co.jp/products/mpeg/)
6. AVI2MPG and BBMPEG (http://members.home.net/beyeler/bbmpeg.html) [Freeware!]

Further MPEG video and audio encoders are available from, among others:

1. Digami MEGAPEG MPEG Encoder (http://www.digami.com)
2. PixelTools MPEG Encoder (http://www.pixeltools.com/)
3. Berkeley MPEG_Encode (only sequences of still images, *.ppm!) [Freeware]

Further MPEG audio (MP3) encoders are available from, among others:

1. Fraunhofer MP3 Producer (max. 36 Kbit/sec, hm)
2. Fraunhofer MP3 Encoder (command line)
3. BladeEncoder incl. tools (Feurio, ...)
4. Gogo MP3 Encoder
5. Real Jukebox
6. Real Producer G2 7.0 Beta

Video Encoding Parameters

To encode an MPEG file, many other parameters of the MPEG video stream must be specified.
Among these Standard and Advanced parameters are (see definitions above):

1) the frame rate of the video
2) the resolution (frame size) of the video
3) the video and audio data rate, which is the average amount of data transferred in an MPEG stream per unit of time (usually kilobits or megabits per second)
4) the number of P frames to be stored between every pair of I frames
5) the number of B frames to be encoded between every pair of P frames
6) the maximum vertical and horizontal motion vector values for P and B frames, which are necessary to limit the area covered by the motion estimation process

Darim DVMPEG Encoder 5.0

The Darim Vision MPEG compression software for Windows® 95, Windows® 98 and Windows NT™ (DVMPEG) is a versatile software-only tool that lets you create highly compressed MPEG video, audio and combined video/audio streams from existing movies or animation. Because DVMPEG is compatible with the Video For Windows™ industry standard for video and audio compression, the DVMPEG plug-in drivers can be used together with virtually any video editing or animation creation software. Examples include Adobe Premiere, Ulead Media Studio, Kinetix 3D Studio MAX, Asymetrix DVP, DPS VideoAction and many more. In general, any application that can output AVI files compressed using the Microsoft Video for Windows™ interface will be able to produce MPEG files directly, thus saving a lot of storage space and time.

We believe that you will be impressed with DVMPEG's unparalleled features, quality, performance and ease of use. The new DVMPEG plug-in drivers and applications for Windows 95/98/NT are native 32-bit software, specially designed for these platforms. This allows DVMPEG to take full advantage of modern CPU capabilities (such as MMX™ extensions) and operating system architecture.

The following are the most important features of DVMPEG plug-in drivers:

1. Easy to use add-on to your favorite video editing or computer graphics animation software
2. Creates MPEG1 and MPEG2 files 'on the fly', eliminating the need for any intermediate files; MPEG files do not have any limitations on their size other than the amount of available disk space
3. Any source files supported by the host video editing or CG animation software can be compressed; this includes AVI and QuickTime movies, and image sequences in various formats
4. High performance 32-bit video and audio encoding engine, optimized for Pentium™ or better CPUs with MMX™ extensions
5. Flexible video resolution settings from 32x32 pixels (for MPEG1) to 768x576 (for MPEG2)
6. Provides control over many advanced MPEG encoding parameters (video aspect ratio, GOP structure, relative size of I, P and B frames, etc.) (new!)
7. Several parameter presets are supplied for all kinds of input data and MPEG output
8. Can output interlaced or progressive MPEG2 video depending on source video and target playback platform (new!)
9. Batch MPEG compression, compatible with the batch processing commands of many video editing programs (new!)
10. User may see a preview of the resulting MPEG clip during encoding (new!)

Produces the following types of MPEG format:

1. MPEG1 layer II elementary audio (ISO/IEC 11172-3)
2. MPEG1 system stream (ISO/IEC 11172-1)
3. VideoCD (White Book) compatible [1]
4. MPEG2 elementary video Main Profile @ Main Level (ISO/IEC 13818-2)
5. MPEG2 program stream (ISO/IEC 13818-1)
6. DVD compatible video track (Constant Bitrate only) [2]

DVMPEG 5.0 also features the new AVI2MPEG front-end application, which can be extremely helpful for novice users and anyone who needs to convert existing video and audio files in AVI, TGA, BMP, JPEG and WAV formats into MPEG1 or MPEG2. The AVI2MPEG program helps the user choose optimal compression parameters depending on the type of the source video. See section 4.2 for more information on AVI2MPEG.

Finally, DVMPEG can be easily integrated into custom MPEG encoding applications.
Naturally, it can be used via the standard Video for Windows™ API for generic video and audio compression. Alternatively, a set of custom COM objects and interfaces exported by the DVMPEG software can be used. Please contact Darim Vision for the description of these interfaces and to obtain the preliminary SDK.

Ligos LSX MPEG Encoder 3.0

LSX-MPEG Encoder is an application for transcoding AVI files into MPEG files. Specifically, the LSX-MPEG Encoder can create multiplexed MPEG-1 and MPEG-2 video and audio streams. The LSX-MPEG Encoder is optimized to achieve very fast encoding for most standard frame sizes and frame rates of video required in multimedia applications.

Advantages:

LSX-MPEG Encoder utilizes Intel MMX™ technology for maximum MPEG encoding speeds. The LSX-MPEG Encoder automatically detects if it is running on MMX compatible processors and will utilize those instructions for faster encoding, if available.

LSX-MPEG Encoder utilizes our revolutionary motion estimation algorithm for the fastest MPEG encoding available in software. Due to our LightSpeed algorithm, the LSX-MPEG Encoder works several times faster than other software encoding solutions, while providing virtually the best possible compression and quality. Our super fast motion estimation algorithm is available for licensing for software and hardware implementations of image compression and recognition.

The LSX-MPEG Encoder is flexible. Our simple and intuitive interface is easy for new users, but still allows video professionals to control a number of MPEG encoding parameters. A special mode for creating very low data rate MPEG video files makes the program great for creating low-bandwidth MPEG files for the Internet. New features such as Variable Bitrate control allow the user to produce better MPEG files than ever before.

If you are new to MPEG encoding and the LSX-MPEG Encoder, you can get started immediately with our newest Quick Start Tutorial.
Using an AVI clip available from our website at http://www.ligos.com/products/sample_clips.shtm, you'll be able to quickly familiarize yourself with the powerful but simple interface of the LSX-MPEG Encoder, and immediately see the advantages MPEG has over AVI codecs for producing smaller, better looking video.

LSX-MPEG Encoder Features

1) Single pass Variable Bitrate control with three modes of control and operation
2) Support for 48 kHz audio input and output
3) Custom controls for adding MPEG Sequence Headers
4) Ability to specify "Closed GOPs" for increased compatibility with MPEG editing applications
5) Full implementation of Video Buffer Verifier (VBV), including protection from overflow and underflow errors
6) New "Edges" filter allows users to cover up edge garbage and noise that often accompanies source files captured from VHS
7) 18 new and revised encoding Profiles that are more flexible and easier to use. Encode anything from exceptionally low variable bitrate MPEG video for the Internet, to MPEG for VideoCD, to full-blown MP@ML MPEG-2, all with just a few clicks
8) Improved deinterlace function that creates higher quality progressive video from originally interlaced source
9) LSX-MPEG Encoder now includes the LSX-MPEG Player, a software filter that provides quality playback of MPEG-2 video directly through Microsoft's Windows Media Player
10) Improved, more efficient interface
11) New Quick Start Tutorial

Supported MPEG Output Formats:

1) MPEG-2 program streams (ISO/IEC 13818-1)
2) MPEG-2 video elementary streams (ISO/IEC 13818-2)
3) MPEG-1 system streams (ISO/IEC 11172-1), including White Book VideoCD
4) MPEG-1 video elementary streams (ISO/IEC 11172-2)
5) MPEG-1 audio layer II elementary streams (ISO/IEC 11172-3)

Specifications for MPEG-1 Encoding Support:

1) Supports user-defined frame sizes per the "constrained parameter bit stream" section of the MPEG-1 specification:
   a) Horizontal frame sizes less than or equal to 768 pixels
   b) Vertical frame sizes less than or equal to 576 pixels
   c) Picture area less than or equal to 396 macroblocks/picture
   d) Supports user-defined frame rates less than or equal to 30 frames/sec
   e) Supports user-defined data rates less than or equal to 1812 Kbits/sec
   f) Motion vectors less than or equal to 64
2) Supports creation of audio only streams (via *.mp2 temporary files)
3) Supports encoding of MPEG-1 files at either constant bitrate or variable bitrate
4) Supports creation of multiplexed MPEG-1 streams with audio stream (Layer 2 only) compression. The bit rate range follows the ISO/IEC 11172-3 standard. It also supports mono, stereo, dual channel, and joint stereo mode
5) Supports creation of "White Book" compliant MPEG-1 streams for VideoCD

Specifications for MPEG-2 Encoding Support:

1) Support for the following MPEG-2 Profiles and Levels:
   a) Main Profile and Low Level
   b) Main Profile and Main Level
   c) High Profile and High-1440 Level
   d) High Profile and High Level
2) Supports user-defined aspect ratios of 1:1, 4:3, 16:9, 2.21:1
3) Supports encoding of MPEG-2 files at either constant bitrate or variable bitrate
4) Supports creation of multiplexed MPEG-2 streams with audio stream (MPEG-1 Layer II) compression. The bit rate range follows the ISO/IEC 11172-3 standard. It also supports mono, stereo, dual channel, and joint stereo mode

Specifications for MPEG-2 Decoding (LSX-MPEG Player):

- Works directly with the Windows Media Player as a DirectShow filter, giving you a fully powered media player (DirectShow and ActiveMovie compatible)
- Support for real time decoding of MPEG-2 Program Streams (up to Main Profile@Main Level and 10 Mbps)
- Support for full screen playback of up to Full D-1 resolution files, NTSC (720x480, 30 fps) or PAL (720x576, 25 fps), on a Pentium II 450 MHz system with a DirectX compatible video card supporting YUV overlay mode, and Half D-1 on a 300 MHz Pentium II
- Optimized for most efficient use of processor
- Support for re-routing MPEG-1 System Stream video decode through our filter

Heuris MPEG Power Professional

MPEG Power Professional is the most widely used professional MPEG encoder. It provides the features and image quality demanded by professional video editors, without the expensive hardware. It is also the top-rated professional MPEG encoder, having won New Media magazine's prestigious Hyper Award in 1997 and 1998. MPEG Power Professional is available for Windows 95/98/NT, Compaq Alpha, and Power Macintosh systems.

This demo version of MPEG Power Professional is limited in the amount of video you can convert to MPEG1 or MPEG2. It will expire on December 1, 1999.

The ECL (Event Control List) stores information about all of the actions to be taken by the encoder. The information found in the ECL includes: filters (what type used, and when they are turned on and off), automatic scene detection, I-frame injection points, search parameters, and telecine information. The ECL generated by the analysis only feature is based on the best guess MPEG Power Professional can make based on the information it has.

The Analysis Only Task Type allows you to manually review the suggested encoding events and accept or reject any of them prior to encoding.
This type of analysis produces a framework for reviewing suggested encoding events. This is possible because MPEG Power Professional takes the information it gathers during an Analysis Only pass over your video source and pumps it out into an editable file format called the Encoding Control List or .ECL file. Any encoding events which will be used when it actually encodes your video source can be viewed with the corresponding time index or frame number.

Whenever you run an Analysis Only pass of your video source, MPEG Power Professional generates a corresponding .ECL file. Once the Analysis is complete, you can view the contents of the .ECL file, accept or reject any of the encoding events it suggests and add or substitute your own encoding controls.

The ECL Editor by Timecode list shows you encoding events sequentially as they're scheduled to occur. You can scroll through all of the scheduled encoding events, adding and deleting events throughout the timeline.

You can also view the encoding events by frame. Highlight a timecode based encoding event from the list and then click on the "By Frame" button. The ECL Editor by Frame dialog box appears, displaying by frame the same encoding event which you just viewed based on timecode. If you loaded your source video file, you also see a thumbnail bitmap of the currently selected frame where the encoding event will occur. If you did not load your video source file, the frame bitmap and the slider bar are disabled.

Xing MPEG Encoder 2.2

XingMPEG Encoder is a high performance software program that converts (encodes) new or existing audio and/or video files into MPEG files. For example you can:

1. Convert an existing .avi file into an MPEG video or audio file.
2. Convert an existing .wav file into an MPEG audio file.
3. Create fully compliant MPEG-1 system streams (video and audio streams combined).
4. Create MPEG-2 audio streams, including MPEG-2 layer 3 audio.
5. Create StreamWorks System (video and audio) streams for delivery using Xing's StreamWorks products.
6. Create audio and video files for VideoCDs, KaraokeCDs, and CD-i Movies.
7. Create VideoCD files that support Single Speed CD-ROM, Whitebook, and other popular MPEG formats; even still file formats.
8. Create MPEG files for quick downloading over the Internet or an intranet.
9. Create MPEG files from Apple's QuickTime .mov files.
10. Re-encode MPEG files: .mpa/.mp3 (audio), .mpv (video), and .mpg (system) files.

Some of the new features introduced in 2.20:

1. MPEG-2 layer 3 audio: Create MPEG-2 layer 3 files for the best audio quality at low bit rates (less than 112kbps).
2. MPEG-1 layer 3 audio: Create MPEG-1 layer 3 files for the best audio quality at moderate to high bit rates (112-320kbps).
3. Support for Apple's QuickTime .mov files: Create MPEG files from Apple's QuickTime .mov files.
4. Full system MPEG re-encoding: Re-encode MPEG files: .mpa/.mp3 (audio), .mpv (video), and .mpg (system) files.
5. Batch processing: Encode the same file types in an entire directory using the Make Batch button. A wildcard is added preceding the file's extension, and the similar file types are encoded as a batch job.

Panasonic MPEG-1 Encoder Plugin 2.0

This software is a plug-in for Adobe Premiere 5.x that covers the full specification of MPEG1 encoding, built on the original encoding engine developed by Matsushita Electric Industrial Co., Ltd.

[Features]

1. Encode high resolution video data (up to 1024x1024, but dimensions must be multiples of 16).
2. Three choices of Quantizer Matrix for image types (Natural Image, CG/Cartoon, MPEG1 Standard).
3. A variety of filters to enhance image quality: Noise Reduction (improves the input image quality) and Smoothing Video Filter (improves the MPEG image quality).
4. Wide data rate range for encoding, from 600k to 15Mbps.
5. Forced Intra Frame function can change any frame to an Intra frame.
6. Uses the same user interface as Premiere 5.x for parameter settings. The plug-in installs completely into the "Export Movie Settings" panel in Premiere 5.x.
7. Gamma correction for MPEG data for PC and TV color characteristics.
8. Create a low frame rate MPEG1 movie that does not comply with the MPEG1 standard but runs on most MPEG1 players.
9. Create an MPEG1 System Stream for the VideoCD V2.0 Specification.

[Specification]

Version of MPEG Video: MPEG1 (ISO/IEC 11172-2 compliance)
Output Stream Data: System/Video/Audio (Layer2)
Hardware Configuration: Pentium/Pentium-II based IBM-PC or compatibles, minimum system memory 64MB
Operating System: Windows95, Windows98 and WindowsNT4.0 or higher
Frame Resolution (WxH): 64x64 to 1024x1024 [pixel] (multiples of 16)
MPEG Video Stream Data Rate: 600K to 15M [bit/sec]
Video Frame Rate: 10/15/23.976/24/25/29.97/30 [frames/sec]
Audio Stream Data Rate: 64/96/128/192/224/384 [Kbit/sec]
Audio Mode: 32/44.1/48 [KHz]
Other Functions:
- Noise Reduction Filter for Input Data
- Smoothing Filter for Output MPEG Data
- VBV Buffer Size Selectable
- GOP Sequence Selectable
- Forced Intra Frame function
- 3 choices of Quantizer Matrix
- Gamma correction for MPEG Data (for PC and TV)
- Motion compensation (half pel/full pel)

[How to install...]

1. Confirm that Premiere 5.x is installed correctly before installing this software.
2. Run "dmpegpie.exe".
3. The installer starts automatically; then follow the dialogs.
4. After starting Premiere 5.x, check the list box of "File Type"/"General Settings" in the "Export Movie Settings" panel. If "Panasonic MPEG/Trial" appears in the list, the installation finished correctly.

[Limitations of this trial version]

1. A movie can be at most 30 seconds long for encoding.
2. A time expiry function is implemented (1 month).
3. "Panasonic MPEG1 Encoder" will be printed on an MPEG encoded movie.
4. Encoding failure may occur when encoding a high resolution video with a low data rate setting.
Workaround for this problem: increase the video data rate.
- 320x240/30fps: data rate more than 800kbps
- 640x480/30fps: data rate more than 2000kbps

The plug-in can only be used together with a video editing system such as Adobe Premiere.

bbMPEG 1.1 and AVI2MPG2 1.8

bbMPEG and AVI2MPG2 are Windows programs that convert AVI files to MPEG-2 or MPEG-1 (including VideoCD) files. They are freeware. The file bbMPEG.DLL is also a compiler/export plug-in for ADOBE Premiere 5.0 or higher (it will not work with version 4.2). The file AVI2MPG2.EXE is a front-end for bbMPEG.DLL so it can be used without ADOBE Premiere.

This software was written with the goal of creating MPEG-2 program streams from AVI files captured by a MotionJPEG video capture board that could be played on the Creative Labs PC-DVD Encore Dxr2 hardware (which has since died, may have to upgrade to a Dxr3 or a Hollywood+ card). All testing was done on this software with AVI files that had the following specs:

1. MPEG-2 - Video: 640x480 @ 29.97 (or 30) fps, Audio: 16-bit 44.1kHz stereo.
2. MPEG-1 (and VideoCD) - Video: 352x240 @ 29.97 fps, Audio: 16-bit 44.1kHz stereo.

If you do encode other types of AVI files and run into problems, let me know and I will try to fix or help you fix the problem.

The software generates MPEG-2 (ISO/IEC 13818-2) or MPEG-1 (ISO/IEC 11172-2) video streams, MPEG-1 (ISO/IEC 11172-3, layer 1 and 2 only) audio streams and MPEG-2 (ISO/IEC 13818-1) or MPEG-1 (ISO/IEC 11172-1) program streams (including VideoCD compliant streams) or almost any combination of the above. You can just do multiplexing if you want to; you don't have to encode video or audio. It can also multiplex AC3 audio streams into an MPEG-2 program stream.

The video encoding was derived from the MSSG (MPEG Software Simulation Group) MPEG-2 video codec, version 1.2. The audio encoding was derived from the MPEG/Audio Software Simulation Group's audio codec, version 4.0.
The multiplexing was derived from Christoph Moar's MPLEX, version 1.1. Visit www.mpeg.org for links to all of the above software.

bbMPEG requires either Win95, Win98 or WinNT. It also requires a Pentium processor.

MPEG Player

This chapter gives brief pointers to available players.

Overview

The following programs are recommended for playing MPEG-1 video and audio files:

1. Microsoft Media Player 6.4
2. Ligos LSX MPEG Player
3. Xing MPEG Player

The following programs are recommended for playing MPEG-2 video and audio files (preferably with a hardware MPEG-2 decoder or at least a "helpful" graphics card, see the following section):

1. Creative DVD Player
2. WinDVD
3. PowerDVD

Helpful graphics cards are those that perform the decoding steps Inverse DCT (iDCT) and/or Motion Compensation independently of the CPU. The minimum requirement is overlay capability, which unfortunately is not available on all cards either, especially on notebooks!

The following programs are recommended for playing MPEG-1 audio files (MP3):

1. WinAmp
2. Fraunhofer MP3 Player
3. Real Player G2 7.0
4. Sonique

What is the best video card to play DVDs?

First of all, don't be fooled by adverts you may have read! Voodoo3, TNT2, as well as G400 cards are NOT DVD accelerated at all!!! All these amazing cards are just overlay compatible. They just support the minimum specifications needed to play DVDs!

To have smooth playback, you need a fast CPU and a video card that has fast colorspace conversion (YUV to RGB). There's NO need to have a DVD accelerated card if your CPU is fast enough (PentiumII 400+MHz). If you have a mid-range CPU (K6-2 300MHz to PentiumII 350MHz) you may need some hardware assistance (Motion Compensation or iDCT), if not a specific MPEG-2 card.

We won't list all true DVD accelerated video cards, but keep in mind that there are several Motion Compensation or iDCT specifications (ATi MC, S3 MC, ATi iDCT etc.).
And not every software DVD decoder knows them all (PowerDVD v1.50+ knows S3 MC and ATi MC but cannot use Rage128 iDCT. Cinemaster knows ATi's MC and iDCT but cannot use S3 MC). Once again, the video card you have must match the DVD software specifications if you want to use the hardware acceleration your video card supports.

Finally, to answer the question, it's better to invert its terms! What video cards shouldn't you have to play DVDs? Older cards such as the Matrox Mystique/Mystique 220/Millennium/Millennium II, as well as the S3 Trio and ALL OLDER cards are NOT overlay/DVD compatible. (If you use an MPEG-2 card, these old cards will be OK! Overlay is done by the MPEG-2 card!) All other new video cards are overlay/DVD compatible and can be used to play DVDs.

You still want a brand and a model? Well, if you need a cheap DVD system, go for the well known ATi Rage Pro card! It's definitely a bad choice if you're a hard core 3D gamer, but the Rage Pro has really super fast colorspace conversion, and its Motion Compensation is known by any recent and good software DVD decoder! The Rage128 is far from the best 3D card, but it's got a powerful iDCT that may help mid-range CPUs to render DVD playback the right way.

Whatever brand you have got, be sure to always install the latest drivers available!

Microsoft Media Player 6.4

Microsoft Windows Media Player is a universal media player you can use to receive audio, video, and mixed-media files in most popular formats. Use Windows Media Player to listen to or view live news updates or broadcasts of your favorite sports team, to review a music video on a Web site, to "attend" a concert or seminar, or to preview clips from a new movie.

Media formats supported by Windows Media Player

The following types of media files can be played by Microsoft Windows Media Player.
When you open a stored file that has one of the extensions listed below, either by double-clicking a file icon or a link in a Web page, Windows Media Player starts.

Microsoft Windows Media formats
File name extensions: .avi, .asf, .asx, .rmi, .wav, .wma, .wax

Moving Pictures Experts Group (MPEG)
File name extensions: .mpg, .mpeg, .m1v, .mp2, .mp3, .mpa, .mpe

Musical Instrument Digital Interface (MIDI)
File name extensions: .mid, .rmi

Apple QuickTime®, Macintosh® AIFF Resource
File name extensions: .qt, .aif, .aifc, .aiff, .mov

UNIX formats
File name extensions: .au, .snd

Problems, Tips and Tricks

This chapter gives advice on how best to record and digitize videos and convert them to MPEG.

Trixter's Desktop MPEG-1 Authoring FAQ

This FAQ can always be found at: http://www.oldskool.org/mpeg/. The HTML version has embedded hypertext anchors to all of the software packages mentioned in this document.

This FAQ attempts to answer some of the more common questions about authoring MPEG files (including Video CDs) that crop up on rec.video.desktop. While the questions and answers listed here are Windows/Premiere-centric, there are many concepts presented that apply to all OS platforms and editing packages.

Disclaimer: I am not a video editing professional; I don't do this for a living. But I have worked with digital video on the desktop for almost a decade and MPEG-1 for half a decade, and have come to several conclusions about creating MPEGs that make sense. Maybe you agree with me; maybe not. Write me at trixter@oldskool.org and let me know if you find a glaring error in my conclusions (or if I'm leaving something major out).

Disclaimer #2: To make this document easier to understand, I assume that you're using NTSC. To wit:

• Captured video is at 30 frames (60 fields) per second.
• A full capture has 480 lines of resolution (720x480, for example).
• A "half" capture has 240 lines of resolution (352x240, for example).
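
The relationship between these numbers (two fields per frame, so a 480-line capture holds two 240-line fields) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the helper names split_fields and field_rate are hypothetical, a frame is modelled as a plain list of scanlines, and the first field is assumed to hold the odd-numbered scanlines, which is the convention this FAQ uses under Core Idea #2.

```python
# Illustrative sketch only: helper names are hypothetical, and a frame is
# modelled as a plain list of scanlines rather than real video data.

def split_fields(frame):
    """Split an interlaced frame into its two fields.

    Scanlines are counted from 1, so the first field holds lines 1, 3, 5, ...
    (list indices 0, 2, 4, ...) and the second field holds lines 2, 4, 6, ...
    Assumption: the odd-numbered scanlines form the first field.
    """
    first_field = frame[0::2]
    second_field = frame[1::2]
    return first_field, second_field


def field_rate(frame_rate):
    """Two fields per frame, so the field rate is twice the frame rate."""
    return 2 * frame_rate


# A stand-in 720x480 capture: every pixel stores its own 1-based scanline
# number, which makes it easy to see which lines end up in which field.
WIDTH, HEIGHT = 720, 480
frame = [[line_no] * WIDTH for line_no in range(1, HEIGHT + 1)]

first, second = split_fields(frame)
print(len(first), len(second))    # 240 240
print(first[0][0], second[0][0])  # 1 2
print(field_rate(30))             # 60
```

A "half" capture of 240 lines corresponds to the line count of a single field, which is why the two capture sizes listed above differ by exactly a factor of two vertically.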
If your country's broadcast standard isn't NTSC, you'll have to substitute your country's numbers for what's listed in this document. For example, PAL is 576 full lines of res, 288 "half" lines of res, and a framerate of 25 ("fieldrate" of 50).

What core ideas should I know about before I begin reading this FAQ?

Core Idea #1: The quality of most MPEG encoders is directly tied to the quality of the input you give them. Remember the old adage, "Garbage in, Garbage out?" It's most evident when encoding MPEGs. If you give an encoder a noisy signal with lots of weak broadcasting artifacts, the encoder will try to include all of that in the output, which makes for a noisy bitstream. If your source is extremely clean (or live, like the live output of a video camera), your end result will be clean. Some encoders are much better than others, but the primary factor affecting the output is the quality of your input.

Core Idea #2: Frames vs. Fields. Video is 30 frames a second, right? Wrong. Video has a framerate of 30, but each frame consists of two interlaced fields. A field is a completely new picture. Here's another way to understand it: Each NTSC "image" is made up of 240 lines. A 480-line capture, therefore, has two "images" in it--the odd scanlines (1, 3, 5, etc.) make up the first image, and the even scanlines (2, 4, 6, etc.) make up the second image. The second image is displayed 1/60th of a second after the first image, then you move on to the next frame. If you still have trouble understanding this, try playing a video with high motion in it in your VCR and then hit "pause". Notice how the freeze-frame tends to "flicker" or "jitter" quickly between two different images? That's because only one frame is being displayed, and the display is quickly alternating between its two fields 60 times a second.

Core Idea #3: Software MPEG encoding takes a really, really long time unless you have a 500MHz (or faster) machine.
Hardware encoders are either real-time (they encode the video as fast as it comes in) or faster than real-time (they encode off of .AVI files at about 3:1 or faster--a minute of video gets encoded in 20 seconds). The above information seems useless right now, but you may find it useful later. What process should I follow to create the best possible MPEGs? It depends on what your needs are, but what most people in in rec.video.desktop want to do is create Video CDs (MPEG-1, about 170Kbytes/second, up to 70 minutes of video+audio on a CDROM) that are as close to the original video source as possible. Here's a generic overview of what to do: 1. Capture full-frame video (480 lines). 2. De-interlace the video frames. This properly combines the two captured fields into a single frame. 3. Smoothly resize the de-interlaced frames down to your output size, typically 352x240. (The "smooth resize" process is sometimes called "resampling".) 42 4. Encode the resized frames with a software encoder. This will get you the best possible output quality. For a specific process using Premiere 4.2, here's what I do to create my Video CDs: 1. Capture at 720x480 and bring the clip (or clips) into Premiere 4.2 and arrange them on the timeline in the construction window. 2. Right-click each clip in the timeline, select "Field Options" from the menu that pops up, and then select "Always deinterlace" from the available options. 3. Once that's done, right-click each clip again, select "Filters" from the menu that pops up, and then choose the filters you want to apply for general processing. (I usually apply the Crop filter to get rid of a few noisy lines outside the frame that accidentally get captured by my capture device.) When you're done, apply the Resize filter. Make sure it's listed last in the filter list. 4. Go to the Make menu, choose Output Options, and make sure that your output size is the size of your final MPEG output. (For Video CD, I type in 352x240.) 
This setting, combined with the Resize filter, ensures that your video will be resized properly before it gets to the encoder. Don't trust an encoder to resize your input properly--most won't resize at all, or do it poorly.
5. At this point, I make a choice: If I am working with a short clip, I will do a "Make Movie" to a completely new .AVI file and then encode it with Ligos' LSX-MPEG encoder. If I have a particularly large project, I will use Xing's MPEG encoder via their plug-in for Premiere. (It shows up under the Make menu as "XingMPEG Movie".)

There are probably some time-saving shortcuts you could apply to the above, like making a virtual clip and applying all of the operations to that one clip, but I wanted to keep it simple for people who want to duplicate the process with other editing packages.

Does a hardware MPEG encoder produce better output than a software MPEG encoder?

It depends on the price, but the general answer is no. Consumer hardware encoders only encode the first field of a video frame and completely ignore the second field, so you lose motion quality. And because they have to encode in real time, there usually isn't enough processing time left over to do noise filtering, so the output can be noisier than a software encoder's if your input is noisy. Of course, software encoding takes forever and a day, so there is still a valid reason to buy hardware encoders. If you have very clean source material, the output of a hardware encoder matches (and sometimes exceeds, in special cases) the output of a software encoder.

Darim sells a product called the M-Filter, which pre-processes video and assists MPEG compression with any encoder. However, like the other professional-grade MPEG products they manufacture, it has a price that is beyond most consumers' budgets.

What's the best hardware MPEG encoder in a consumer price range?
General consensus points to the Broadway being the best, with all others trailing slightly in terms of output quality. It's a bit pricey at $800, but it can deal with marginal source material much better than the others, and can also output back to TV (the Dazzle DVC can also output to TV). I am unsure if it captures and/or takes into consideration both video fields, however. If I had to rank them, I'd rank the Adaptec VideOh (which is a repackaged, OEM'd Futuretel Video Sphynx) second after the Broadway, the Videonics Python third, and the Dazzle DVC after that. But they're all acceptable if you have clean source material. I own a Python myself, and use it to encode live feeds from my video camera and images generated from PCs, with results comparable to software encoders.

What's the best software MPEG encoder in a consumer price range?

General consensus points to Ligos' LSX-MPEG encoder. It's one of the fastest of the bunch, has a ton of options, and even has support for MPEG-2 if you want to experiment with DVD bitstream creation. Xing's encoder is just as fast, but doesn't handle low bitrate or high-motion clips quite as well as Ligos' encoder does. On the other hand, Xing's encoder comes with a free Premiere plug-in, which is an enormous time saver. You can do the same with Darim's DVMPEG because it installs as a VFW CODEC, but I believe it costs the most out of all three encoders listed. For short projects, I render to an .AVI file and use LSX-MPEG. For long projects, I export via Xing's Premiere plug-in. YMMV. I strongly suggest you download the trial versions of all three encoders listed and test them out for yourself.

Can I encode MPEGs for free? I can't spend money on an encoder.

There are several encoders for multiple platforms at www.mpeg.org, but the easiest one to use for Windows that can also output Video CD bitstreams is AVI2MPG1, located at http://www.mnsi.net/~jschlic1/. (Be sure to grab the GUI front-end.)
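To make Core Idea #2 and the de-interlacing step of the generic process more concrete, here is a minimal Python sketch of what a 480-line capture contains and of one simple way to de-interlace it. It is an illustration under stated assumptions, not what Premiere or any encoder named in this FAQ actually does: a frame is modelled as a plain list of scanlines, the function names split_fields and blend_fields are made up for this example, and field blending (averaging each line of the first field with its neighbour in the second) is only one of several possible de-interlacing methods.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of scanlines) into its two fields.

    Scanlines 1, 3, 5, ... (list indices 0, 2, 4, ...) form the first field,
    scanlines 2, 4, 6, ... (indices 1, 3, 5, ...) form the second field.
    """
    return frame[0::2], frame[1::2]


def blend_fields(frame):
    """De-interlace by blending: average each first-field line with the
    second-field line directly below it.

    Assumption for this sketch: pixels are integer luminance values.
    The result has half as many lines as the input (480 -> 240).
    """
    first, second = split_fields(frame)
    return [
        [(a + b) // 2 for a, b in zip(line_a, line_b)]
        for line_a, line_b in zip(first, second)
    ]
```

With a 480-line frame, split_fields returns two 240-line "images" taken 1/60th of a second apart, and blend_fields returns a single 240-line frame that contains information from both. The horizontal resize to 352 pixels is not shown here.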
Why should I capture at 480 lines when the MPEG output is only 240 lines? Wouldn't a 352x240 capture be more efficient?

See Core Idea #2 listed at the beginning of this document. If your source material was captured at 240 lines (352x240, for example), you're missing half of the images. Capturing at 480 lines and then deinterlacing ensures that you are encoding as much of the original video signal as possible.

I've deinterlaced and resized my 480-line video, but when I look at a single frame, it looks "blurry". My 240-line video looks fine. What gives?

A single deinterlaced frame will indeed look more blurry than the same image captured at 240 lines. But if you play the two captures side by side, you'll notice that the 240-line capture doesn't look as "smooth" during playback as the 480-line capture that was deinterlaced and resized down to 240 lines.

I have an "all-in-one" video card with embedded capture, but it can only capture at 240 lines. Will my MPEG output suffer?

It won't be as good as a 480-line capture, but there's nothing preventing you from doing it. :-) 240-line captures aren't bad--they're just not as good as 480-line captures that have been deinterlaced and resized properly.

My MPEG output has horizontal "lines" all over the place whenever there is heavy motion in the content. What gives?

It sounds like you captured at 480 lines, but either forgot to deinterlace and resize cleanly or you're letting the encoder resize for you. Review the process listed above in the question "What process should I follow to create the best possible MPEGs?".

How can I avoid the Windows 2gig .AVI file size limitation when encoding MPEGs?

Two ways: You can either generate many MPEG files from different clips and later join the MPEGs together, or you can generate the entire thing from your editing program.
Joining clips together is the cheap method; you can find several programs to do this at www.mpeg.org, but one popular program that does this under Windows is Camel's MPEG Joiner. Note: If you are creating a Video CD, you might not have to join video clips together at all. Most VCD authoring programs allow you to create a "simple video sequence" that plays the MPEGs one right after the other. There are a couple of ways to do a long, unbroken sequence. The method I use is to put together my entire project in Premiere, then use Xing's Premiere plug-in to export the entire timeline to a single MPEG file. You can also use Darim's DVMPEG to output an entire timeline to a single MPEG. How can I avoid the Windows 2gig .AVI file size limitation when outputting to tape? If you have a "prosumer" package, such as the Miro DC30+, you probably already have a special version of Premiere that can either work with files larger than 2gig, or can play multiple files from the timeline seamlessly after rendering transitions. In the Miro product, this appears as a plug-in called "Miro InstantVideo". For those of us without the budget for such a product, there is an excellent shareware program that, in addition to being a powerful real-time NLE program, can string multiple pre-rendered clips together on a timeline and play them in sequence without dropping a single frame. This product is called DDClip, and is well worth the registration money. I've used it to string together multiple Iomega BUZ-captured clips with the same resolution and audio parameters, and it played them one right after the other without any dropped frames. I was able to output 10 2gig clips to tape (about 24 minutes of video) using DDClip without having to touch the VCR. How can I avoid the Windows 2gig .AVI file size limitation when capturing? AVI_IO was written by Markus Zingg expressly for this purpose: It is a better VidCap32 than VidCap32. 
You can capture to multiple files--even on multiple drives--and it won't drop a single frame. I have used it myself and can verify its effectiveness; I routinely use it to put together 30-minute and 60-minute VideoCDs.

Is it possible to create MPEGs with a low framerate, like 15 or 10 fps? My low-bitrate MPEG has many artifacts.

You can simulate a low framerate by encoding blank B-frames; this leaves more bits for the encoding of I- and P-frames. Ligos' encoder can be configured to do this; check the help file for exact configuration options. The Xing encoder can do this as well, but it does so automatically under low-bitrate conditions and its exact behavior cannot be specifically controlled. (I've found the results to be perfectly fine--I just like to tweak options ;-)

Is it possible to specify key frames manually? My MPEG has many artifacts because of swiftly-changing scenes in the source material.

Unless you have professional hardware, no. Consumer encoding hardware and software usually don't allow you to arbitrarily specify where I (key) frames go. (If you're willing to pay for professional hardware, then they will do this automatically. Jason Livingston had this to contribute: "The professional MPEG encoders (Heuris, Philips/Sun, the high end C-Cube chips) will automatically insert I frames when there is a significant change in the scene (called auto-scene change detection), and will even choose whether a I, B, or P frame would be more appropriate based on the current frame content. It wouldn't be unusual to see a professionally encoded MPEG stream look like IBBPBIBBBBBBBPP...")

The best way to avoid the "blocky scene-change" effect described in the question is to either use a better/different encoder (Xing and Ligos are best, IMHO), or to encode at a higher bitrate. If you're already using VideoCD bitrates, then try a different encoder.
Another thing to try is to apply a low-pass filter, median filter, or a very soft Gaussian blur (no more than a 1-pixel radius) to the entire video as the last filter in any filter sequences you have. (Apply this as a filter if you're doing the Make Movie-Xing MPEG Export function, since it will *not* be applied if you specify it in the Make Movie special/advanced options.) This removes random noise and softens the entire image, which aids compression. This may not eliminate the "blocky sudden scene-change" effect, but it may help reduce it to the point that only trained eyes can see it.

WHAT YOU SHOULD KNOW BEFORE YOU SHOOT YOUR VIDEO (Heuris, Big Squeeze)

DO opt for a component video format if available.
AVOID converting your video to or from a composite format at any time during production or post-production...
BECAUSE you will suffer an irreversible quality loss and potentially introduce artifacts that will stick with you all the way to the finished product.

DO use high quality, first generation video.
AVOID using second or third generation video...
BECAUSE the higher the quality you start with, the higher the quality of the end result. High quality video is often less “noisy”. Since MPEG cannot distinguish between moving video and “noise”, it will attempt to encode the noise, taking bits and quality away from your moving video.

DO use nice big fonts.
AVOID MPEG encoding text over moving video...
BECAUSE text is high frequency video data. The moving video in the background will cause your foreground text to fade in and out. In addition, the text uses lots of bits that could be allocated to making your video look better.

DO use animation with medium amounts of detail and lines which are several pixels thick.
AVOID using computer-rendered animation with extremely fine lines (less than 3 pixels) or extremely fine detail...
BECAUSE extremely thin lines and fine details tend to “disappear” due to MPEG’s lower resolutions.
DO use fast moving video with tightly focused close-ups.
AVOID using fast moving video where background and foreground are both highly detailed and in focus...
BECAUSE when background and foreground are both in focus, they vie with each other for bit allocation; both will require a lot of bits. This can lead to “blockiness” or “pixelization.”

DO use talking heads in video; preferably a tightly focused close-up on the face.
AVOID using talking heads that are too small...
BECAUSE when characters on the screen talk, the viewer’s focus is drawn to the mouth. If the mouth is too small, it will not be clear and will be distracting.

DO use computer or hand-drawn animation.
AVOID using computer or hand-drawn animation with very sharp diagonal or vertical lines...
BECAUSE this can lead to “aliasing” which makes smooth lines look like stairsteps.

DO use scene changes and relatively quick cuts.
AVOID using extremely fast scene changes that comprise less than 2 frames, or blinking or flashing screens...
BECAUSE very rapidly blinking screens and rapid scene changes are difficult for MPEG encoders to handle. Encoders need to work over multiple frames in order to achieve optimal compression.

DO use video with contrast.
AVOID using high contrasts in luminance, i.e. flames, explosions, fireworks, etc.
BECAUSE high contrasts lead to blockiness.

DO use video with lots of colors.
AVOID using monochrome scenes...
BECAUSE while the resolution levels MPEG can handle are lower than those of computers, the number of colors MPEG is capable of is very high. So, if you can “say it” with color rather than “cross-hatching,” by all means do so.

These guidelines are not meant to suggest that there are “hard and fast” rules for MPEG encoding.

HOW TO JUDGE MPEG QUALITY (Heuris, Big Squeeze)

Image quality is subjective at best. What looks good to one person may not look good to another. This is especially frustrating when you are trying to decide which MPEG encoding house to go with.
However, you can educate yourself as to what to look for in MPEG encoded material.

First of all, compare apples to apples. MPEG has a difficult time handling lots of fast motion with detailed backgrounds, areas of highly contrasting light intensity (explosions, fireworks, lightning, etc.), and (believe it or not) simple 2-D animation sequences. Try to compare demos which display some of these difficult scenes. Just about anybody can make flowers blowing gently in the breeze or a duck gliding over the water look good.

Next - get close. All MPEG encoding looks the same from 20 feet away. Optimal viewing distance for MPEG on a standard size computer monitor is 5 feet away, at about eye level.

Finally, turn down the sound. The sound can have a strong effect on your perception of the video quality. If you’re really trying to level the playing field, turn off the sound.

MPEG encoding has a host of potential quality problems all its own. Special things to look out for include:

Blockiness: When your picture breaks up into little squares. Especially noticeable in fast moving, highly detailed sequences, and sequences with high contrasts in light intensity like explosions and fireworks.
Aliasing: When lines that are supposed to be straight (especially diagonal ones) look like little “stairsteps”. Not necessarily indicative of bad encoding, but aliasing may be reduced by good encoding or extra image processing.
Fuzz and snow: Images that look as though your monitor is dirty or you lost a contact lens. Little gray or white flecks that intrude randomly throughout the picture.
Worms: Crawling dots and squirming lines. Probably the result of low quality video or bad digitizing.
Halos: Small areas of distortion surrounding the outline of moving objects.

Balancing Quality and Performance (Ligos)

This section provides some recommendations on how to encode MPEG files with an optimum balance between picture quality and performance requirements.
Maximum motion vectors

P frame maximum horizontal motion vector: 16, 24, 32
P frame maximum vertical motion vector: 16, 24
B frame maximum horizontal motion vector: 8, 16, 24
B frame maximum vertical motion vector: 8, 16

Frames

In most cases the default sequence of 3 P frames between I frames and 2 B frames between P frames gives fast encoding. If you want faster encoding, reduce the number of B frames, because encoding B frames takes more time than encoding P frames. For example, set 1 B frame between P frames.

DVD Basics and Authoring

This chapter gives you an introduction to the topic of DVD.

DVD In Short

DVD, which stands for Digital Video Disc or Digital Versatile Disc, is the next generation of optical disc storage technology. It's essentially a bigger, faster CD that can hold video as well as audio and computer data. DVD aims to encompass home entertainment, computers, and business information with a single digital format, eventually replacing audio CD, videotape, laserdisc, CD-ROM, and perhaps even video game cartridges. DVD has widespread support from all major electronics companies, all major computer hardware companies, and about half of the major movie and music studios, which is unprecedented and says much for its chances of success.

It's important to understand the difference between DVD-Video and DVD-ROM. DVD-Video (often simply called DVD) holds video programs and is played in a DVD player hooked up to a TV. DVD-ROM holds computer data and is read by a DVD-ROM drive hooked up to a computer. The difference is similar to that between Audio CD and CD-ROM. DVD-ROM also includes future variations that are recordable one time (DVD-R) or many times (DVD-RAM). Most people expect DVD-ROM to be initially much more successful than DVD-Video. Most new computers with DVD-ROM drives can also play DVD-Videos.

DVD discs can hold up to 5.2 GB of data; the first discs could hold 2.6 GB, so the technology is still not finished...
These are the advantages of DVD compared to tapes and/or normal CDs:

Superior picture quality - digital video technology offers more than twice the resolution of a VHS video picture and eliminates static and snow. DVD Video offers pictures that are twice as sharp and clear as VHS. DVD video has up to 500 lines of horizontal resolution, compared to only 240 lines of horizontal resolution for VHS.

Superior sound quality - DVDs provide CD quality digital audio. Movie DVDs released in the United States and Canada use Dolby Digital™ (AC-3) multi-channel Surround Sound to bring a true theater audio experience to the home. Dolby Digital Surround Sound provides five completely separate channels plus a bass channel (i.e. 5.1): Left, Center, Right, Left-Rear and Right-Rear, plus a subwoofer channel for special bass effects. As a true digital system, Dolby Digital (AC-3) audio encoding offers CD quality sound, with outstanding dynamic range, low distortion, wide frequency response and wow & flutter beneath the threshold of measurement.

High viewer enjoyment - most DVDs support additional camera angles, wide screen formats, and "behind the scenes commentary" that is not available on VHS movies.

Multiple Aspect Ratios - Most DVD titles feature both the traditional full-screen television format and also the widescreen or letterbox format, which presents movies in the same aspect ratio as shown in theaters.

Informational Features - Many DVD titles include additional information on the DVD, such as biographies of the performers in the movie or music video, notes on the production of the movie, and behind-the-scenes commentary from the director or actors.

Scene Access - Because DVDs are not tape-based, you can instantly access any specific scene in the movie. You no longer have to rewind or fast-forward through an entire movie to find your favorite scene.
Camera Angles - Some DVD titles were filmed with multiple camera angles; easy-to-use menus allow you to choose these alternative angles.

More compact - DVDs are easier to store than VHS tapes.

High durability - a DVD can be played repeatedly without wear and tear and without any degradation to image quality.

Backward compatibility with audio CDs - you can play music CDs on your DVD player.

THE DVD FAQ
http://perso.libertysurf.fr/dvdutils/start_dvdfaq.htm

What is DVD-ROM?
What are the main features of the DVD?
How does DVD-ROM differ from DVD-Video?
Why is an MPEG-2 card required to use a DVD-ROM drive?
Can DVD-ROM drives play/read standard CD-ROM discs?
What about compatibility with CD-R discs?
Can DVD-Video discs be played on a DVD-ROM drive?
Why should I purchase a DVD-ROM drive instead of a CD-ROM drive?
What is the capacity of DVD-ROM discs?
What kind of DVD-ROM titles are available now?
Can I get the Dolby® AC-3 digital surround sound from a DVD-ROM drive?
How much can I expect to pay for a complete DVD-ROM kit?
What sort of system do I need to run a DVD-ROM drive?
Will DVD-ROM drives be compatible with the upcoming DVD-R write once recordable discs?
What about compatibility with rewritable DVD?
What advantage does the proposed DVD+RW format have over DVD-RAM?
Will DVD+RW discs require a caddy-like cartridge?
Are larger capacities possible for DVD+RW?
Are larger capacities possible for DVD+RW and is dual layer recording possible?

What is DVD-ROM?

DVD-ROM stands for Digital Versatile Disc Read Only Memory. Like CD-ROM discs, DVD-ROM discs are intended for computer use, and are molded with the information pressed right into the disc. However, unlike CD-ROM with its 650 MB capacity, DVD-ROM discs can hold up to 4.7 GB of information. Even higher capacities are possible with additional information layers and double-sided DVD-ROM discs.

What are the main features of the DVD?
Over 2 hours of very high-quality (better than laser disc) video on a single disc.
Over 8 hours on a double-sided dual layer disc.
Support for wide screen movies. Some DVD movies allow you to select wide screen or standard screen.
Up to 8 tracks of digital audio for multiple language support.
Up to 32 subtitle/karaoke tracks.
Up to 9 different viewing angles (DVD disc must be encoded with the different angles).
Automatic "seamless" branching of video for multiple story lines or different ratings of one movie.
Menus and interactive features.
Title, chapter, and track search.
Durability.
Compact size.
Language choices.
Parental lock.
Random accessibility.
Dolby Digital AC-3 audio.
...

Making a DVD-Video disc

You need three things to create a DVD:

1. digital content creation system
2. professional MPEG encoder
3. DVD authoring system

The first thing that you need to create your DVD is a digital content creation system. These are generally nonlinear video editors from companies like Avid and Media 100. These systems are used to create the clips or movies that will be put on the DVD.

The second thing that you need is a professional MPEG encoder. This is a tool that will create the MPEG files that will be put on the DVD. Generally you want to put Variable Bit Rate MPEG-2 on your DVD. Many companies offer MPEG-2 realtime encoders: Zapex, Minerva, Optibase, Optivision, etc.

The final thing that you need is a DVD authoring system from a company like Daikin or Spruce. A DVD authoring system is used to ensure that the DVD that is created complies with the DVD-VIDEO specification.

A DVD authoring system does two basic things. First, it ensures that the disc image that you create conforms to the DVD-VIDEO specification. The DVD authoring system understands the DVD-VIDEO specification and ensures that what you produce will be playable on any DVD-VIDEO player. It does this by creating the correct files and folders.
Second, a DVD authoring system allows you to visually hook up the elements that make up your DVD-VIDEO. Good DVD authoring systems work much like non-linear video editing systems from Avid or Media 100. The advantage of this is that they are easy to use and are familiar to existing users of non-linear video editors.

Requirements for MPEG files for DVD-Video:

NTSC
  Compression format: MPEG-1 and MPEG-2
  Picture resolution: 720 x 480, 704 x 480, 352 x 480, 352 x 240
  Pictures in Group of Pictures (GOP): fewer than 36 fields
  Aspect ratio: 4:3 or 16:9
  Bit rate (maximum): 9.8 Mbps*

PAL
  Compression format: MPEG-1 and MPEG-2
  Picture resolution: 720 x 576, 704 x 576, 352 x 576, 352 x 288
  Pictures in Group of Pictures (GOP): fewer than 30 fields
  Aspect ratio: 4:3 or 16:9
  Bit rate (maximum): 9.8 Mbps*

Table 1: Requirements

THE CHALLENGE OF DVD AUTHORING
Panos Nasiopoulos, Rabab K. Ward and Masato Otsuka

Abstract: The evolution of DVD promises telecomputers will find their way into our living rooms, linked to a large flat screen which will be able to display HDTV, Standard Definition TV, video games, interactive movies from Digital Versatile Discs (DVD), Internet, video telephony and computer graphics. For the first time, Hollywood studios and consumer electronics companies have formed an alliance to support the DVD technology, which is critical for both sides. In this paper we address the complex process of DVD authoring.

INTRODUCTION

One of the most significant technological achievements in the consumer and entertainment industries is the development of the Digital Versatile Disc (DVD). DVD is the first union of emerging technologies, bringing together computers, consumer electronics and entertainment. As a result, we are witnessing the generation of an entirely new infrastructure that is reshaping the world of entertainment. DVD is a lot more than just a storage medium. It is a new multi-purpose technology that will affect both the entertainment and computer worlds.
For consumers, this is the first digital medium that offers studio video and audio quality combined with unprecedented interactivity at a very low cost. For the PC multimedia side, DVD is the first video distribution medium designed for very high data rates. Its interactivity and distribution format have the potential to revolutionize the entertainment software industry. This paper describes the complex process of DVD authoring.

DVD: AN OVERVIEW

Storage

As a storage medium, DVD can hold from 4.7 GB of digital data in the single-sided single-layer format to 17 GB in the double-sided dual-layer format [1,2]. This increase in capacity (up to 25 times that of CDs) is achieved by introducing a shorter wavelength laser beam, a dual focusing mechanism that allows the use of two layers per side, smaller pit size and tighter spirals. Furthermore, DVD discs offer ten times the speed of the CD rate, opening the way to numerous new real-time applications. While storage capacity is very important, it is DVD's other capabilities that make this technology so attractive.

Hollywood

As an entertainment product, DVD satisfies the goals established by the Hollywood Digital Video Disc Advisory Committee, delivering extraordinary picture and sound quality. DVD takes advantage of a two-pass variable bit rate MPEG-2 video encoding process to offer a superb picture quality comparable to D-1, the studio production standard. To make it more exciting, this is the first medium to introduce a number of viewing formats such as the 4:3 TV screen format, the 16:9 HDTV screen format and the 20:9 letterbox format [1,2]. Combine this picture experience with Dolby's AC-3 5.1 channel surround sound (or MPEG-2 7.1 for Europe) and you have reproduced video and audio quality that rivals that of a theater. In addition, this technology allows the use of 8 different languages, 32 subtitles, different camera angles and video-clip paths including interviews with producers and actors.
The viewer can choose the camera angles and language, switching seamlessly from one to another, scan forwards and backwards and play slow motion. Parents are given the option to lock out versions of the movie which range from the director's cut, to R-rated, to PG-13.

Hybrid DVD-Internet

But it is the PC world where DVD will have its biggest impact. The first read-only DVD drives are expected to offer over 7 times the storage capacity of the current CD, and will also be able to play DVD movie titles and existing music CDs. DVD brings the added capability of supporting the implementation of interactive adjuncts to traditional PC content. Embedded navigation such as web browsing can be added to video, enhancing the user's experience with hybrid content.

A DVD-based PC application that combines DVD's interactive performance and rich video and audio capabilities with the Internet would offer a wealth of opportunities. For example, it is possible to produce a DVD-based department store catalogue that offers an interactive showcase of all the departments' merchandise, complete with audio and video. Playing the disc automatically connects the user with the store via the Internet, allowing the consumer to get current prices, order merchandise, communicate with a personal shopper or pay bills. Such a service is not viable today because of the Internet's low bandwidth. A similar hybrid application allows DVD-based courses and encyclopedias (which devour huge amounts of space with text, pictures, video, sound and animations) to remain up-to-date by cross-referencing a constantly updated web site.

DVD AUTHORING

Authoring is generally defined as the process of preparing content, encoding video and audio, and creating the final DVD image.
In the case of DVD, authoring is a complex process since it involves the laying out of multiple audio tracks and a video track, generation of sub-titles, menu pages, parental lock-out features, interactive functions such as program search, time search, seamless play, and pause, and finally editing of video and audio. Since authoring is always performed along with encoding and disc formatting, it is, in many cases, referred to as the entire DVD pre-mastering process.

Preparation of Materials

The first step in authoring is the collection of materials. These materials include video, audio, still images, and sub-pictures. DVD's video source format is the CCIR-601 studio format compressed to MPEG-2 format. The frame rate is 29.97 f/s for NTSC sources (North America) and 25 f/s for PAL/SECAM sources (Europe). The maximum allowable bit rate is 9.8 Mbps.

Audio includes the surround track and up to 8 different language tracks for each title. All language tracks must be compared for level, mix, and equalization so that seamless switching between languages can be achieved.

Still images are used to provide break points in the title, so that search functions and other interactive functions can be implemented. The preparation of still images includes identification of the breakpoints in the video and definition of the time duration for each image.

Sub-pictures are bitmaps that are overlaid on top of the video. They include menus, sub-titles, graphics, and simple animation. Once created, their start and stop times must be defined in order to be synchronized with the associated video and audio elements. Up to a maximum of 32 sub-picture bit-streams are allowed in a title.

Techniques and Parameters

A good understanding of how the various elements will be used in constructing the title is the key to intelligent parameter determination, which includes tradeoffs between picture quality, length of program, number and quality of audio channels, number of subtitles, and level of interactivity.
The following is a list of some of the basic parameters that need to be determined for a DVD title [3]:

1. the number of audio channels
2. the number of language versions
3. the number of sub-picture elements
4. the number of breakpoints in the video
5. the number and the levels of rated versions of the title
6. the number of still images used at each breakpoint
7. the type of parental lock-outs
8. the type of director's cuts
9. the audio encoding techniques
10. the format used for still images

A single-layer single-sided DVD disc can store 2 hours and 13 minutes of video compressed at a nominal average bit rate of 3.5 Mbps combined with 3 languages encoded using AC-3 5.1 channels and 4 additional languages encoded as sub-titles [4]. The maximum program rate (i.e., video + audio + sub-pictures) is specified to be 10.08 Mbps. Given the disc capacity, the overall quality depends on determining the different tradeoffs between several parameters. For example, Table 1 shows the average storage requirements for a DVD title with the following parameters:

1. Audio tracks encoded using Dolby AC-3 5.1
2. 4 unique languages supported
3. 4 sub-picture streams supported
4. "G" rated version has a total run length of 100 minutes
5. "PG" rated version has an additional 4 minute run length
6. 2 previews; each has a run length of 3 minutes
7. 4 trailers; each has a run length of 2.5 minutes

Note that 4% of the total disc capacity is always reserved for backup of the program control data and for additional information that is added after editing. The total run length is 120 minutes, resulting in an average bit rate of 3.43 Mbps.

Video Encoding

DVD takes advantage of the MPEG-2 compression technology to achieve picture quality comparable to that of D-1, the CCIR-601 TV studio production standard. MPEG-2 is a flexible and scalable compression scheme which can produce bit rates that range from 1 to 40 Mbps. As implemented for DVD, MPEG-2 encoding is a two-pass process.
During the first pass, the encoder scans the video source, detects scene changes and determines the optimal bit rates for each frame. During the second pass, higher bit rates are assigned to complex frames and sequences with more activity and lower bit rates to "simple" frames. The two-pass process guarantees the best possible picture quality for the given video clip and disc storage capacity. For video material originated from film, inverse telecine may be used to improve the compression performance. The reason is that film uses 24 f/s, a rate that is converted to the 30 f/s required by the NTSC standard. This conversion process is known as telecine and involves duplication of frames at regular intervals. Inverse telecine removes the duplicated frames, thus allowing more bandwidth to be allocated to the video. Audio Encoding Movies released in North America and Japan can carry Dolbyís AC-3 stereo or 5.1 audio which offers 5 surround channels plus a low frequency (sub-woofer) channel. For movies released in Europe, AC-3 is replaced by MPEG-2 stereo or 7.1 surround sound. In addition, as an option to AC-3 and MPEG-2 audio, DVD enables producers to choose uncompressed 16-bit linear PCM stereo sound with Dolby Pro Logic encoding. Table 2 shows audio encoding options as well as the specified sampling frequency rates, bit and transfer rates and number of channels supported by each option. Sub-Picture Encoding Sub-pictures are run-length compressed bit-maps using 2 bits/pixel and 4 colors out of a 16 color palette. The sub-picture size is 62KB per GOP/cell with 32 KB allocated for control data. Applications may vary from simple text (sub-titles) to menus to still images used for presentation effects. Pixels are categorized as foreground, background, emphasis-1 and emphasis-2. The still picture format must be a standard image format such as TIFF, GIF, or BMP. MPEG is used to encode still images which are then incorporated into the video stream. 
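The storage tradeoff in the Table 1 example (a 4.7-Gbyte disc, a 4% reserve, four AC-3 tracks at 0.384 Mbps each, four sub-picture streams at 0.01 Mbps each) can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the function and constant names are made up for this example, 1 Gbyte is taken as 1000 MB, and every stream is assumed to run at a constant bit rate for the full run length.

```python
# Illustrative bit-budget calculation for a single-layer DVD title.
# All names are hypothetical; the figures come from the Table 1 example.

DISC_CAPACITY_MB = 4700      # 4.7 Gbytes, assuming 1 Gbyte = 1000 MB
RESERVED_FRACTION = 0.04     # share reserved for control-data backup
SECONDS_PER_MINUTE = 60


def stream_storage_mb(count, minutes, mbps_per_stream):
    """Storage in MB for `count` parallel streams at a constant bit rate."""
    return count * minutes * SECONDS_PER_MINUTE * mbps_per_stream / 8


def video_budget(run_minutes, audio_tracks, audio_mbps, subpic_streams, subpic_mbps):
    """Return (video_mb, video_mbps) left after audio, sub-pictures and reserve."""
    audio_mb = stream_storage_mb(audio_tracks, run_minutes, audio_mbps)
    subpic_mb = stream_storage_mb(subpic_streams, run_minutes, subpic_mbps)
    reserved_mb = DISC_CAPACITY_MB * RESERVED_FRACTION
    video_mb = DISC_CAPACITY_MB - audio_mb - subpic_mb - reserved_mb
    video_mbps = video_mb * 8 / (run_minutes * SECONDS_PER_MINUTE)
    return video_mb, video_mbps


if __name__ == "__main__":
    # 100 min feature + 4 min extra for the PG version + 2 previews + 4 trailers
    run_minutes = 100 + 4 + 2 * 3 + 4 * 2.5
    video_mb, video_mbps = video_budget(run_minutes, 4, 0.384, 4, 0.01)
    print(f"run length: {run_minutes:.0f} min")
    print(f"video budget: {video_mb:.0f} MB, average {video_mbps:.2f} Mbps")
```

With the figures of the example this leaves roughly 3094 MB for video, i.e. an average video bit rate of a little over 3.4 Mbps, in line with Table 1. Raising the number of audio tracks or the run length in the call shows how quickly the video budget shrinks.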
Putting it Together

After the different "segments" of a DVD title have been prepared, a multiplexing process links everything together and defines the program flow of the DVD title. This final step specifies how each of the media elements will be presented to the user and how the user can interact with the program. Program flow specifications are translated to navigation commands that are, in turn, incorporated into program cells and program chains.

A cell consists of a navigation command and all the video and audio data associated with a GOP. The navigation command (button) defines the playback behavior of the corresponding cell. It consists of one instruction, or a combination of at most three, from the following list [4]:

GoTo        branch between commands
Link        transfer within the same domain
Jump        transfer between domains
Compare     recognition of parameter value
SetSystem   player system setting
Set         calculate GPRM values

A sequence of cells and cell commands (navigation commands) forms a program (PG). A program usually corresponds to one scene. Programs and video objects (nominally a GOP) form a program chain (PGC). A program chain is separated into the control information (PGCI) and the video object (VOB). PGCI acts as an address table pointing to cells, thus defining the playback order of programs. The Part of Title (PTT) helps to construct multiple versions of the same title. A DVD title can have one or several program chains. Interactive functions such as PTT searches, director's cuts, and parental lock-outs can be achieved by creating the title as a multi-PGC title, with different director's cuts and different rated versions on different program chains.

Simulation and Verification

After all the media elements and control information are multiplexed into one stream, simulation testing is performed. The stream must guarantee that audio, video, and sub-pictures are synchronized; otherwise, the content must be re-edited or re-encoded. Besides synchronization, interactive functions may also be simulated and verified.

References

[1] DVD Format, TOSHIBA, DVD Forum, April 1996.
[2] DVD Presentation Data Specifications, VICTOR Company of Japan Ltd., DVD Forum, April 1996.
[3] C. Fogg, DVD Technical Notes, July 1996.
[4] Interactive Functions, HITACHI Ltd., DVD Forum, April 1996.

Table 1. Storage Requirements for Each Media Element in Average-Bit-Rate Calculation Example

Media Element          Total Length  Average Bit Rate             Total Storage Requirements
4 Language Tracks      120 minutes   0.384 Mbps per language      4*120*60*0.384Mbps/8 = 1382 MB
4 Sub-picture streams  120 minutes   0.01 Mbps per stream         4*120*60*0.01Mbps/8 = 36 MB
Reserved                             4% of 4.7 Gbytes             188 MB
SUBTOTAL                                                          1606 MB
Video                  120 minutes   3094/(120*60)*8 = 3.43 Mbps  3094 MB

Table 2. Audio Data Specifications

                     Linear PCM       Dolby AC-3     MPEG Audio
Sampling Frequency   48K, 96K         48K            48K
Number of Bits       16/20/24 bits    compressed     compressed
Transfer Rate        max. 6.144 Mbps  max. 448 kbps  max. 640 kbps
Number of Channels   max. 8           max. 5.1       max. 7.1

Creating VOB Files (StreamWeaver 5.4)

Authoring DVD-Video titles [using CDMotion] is a multi-step process. In the first step, the motion video and audio files are captured to digital format as part of MPEG encoding. This is done using an encoder suitable for the task, that is, an encoder that meets at least the minimum requirements for DVD-Video content. This first step may also include the capture or rendering of raster image bitmap files that are to be used in the DVD-Video title. In the second step of DVD-Video authoring, the stream content files are created using the Track StreamWeaver tool, the subject of this help file section.

When motion video and audio files are captured by the MPEG encoder, they are stored on the development station as "elementary" stream files. In this format the files are not suitable for use in DVD-Video.
They must be combined into the multiplex file format, which in DVD-Video is referred to as the VOB file. VOB files are created using StreamWeaver.

StreamWeaver accepts a number of different types of files as input and combines them into the DVD-Video VOB file format. StreamWeaver makes certain assumptions regarding the content of a file based on the file's name extension. It is very important that the file name extension correctly identifies the type of content within the file. The extension types supported by StreamWeaver and the assumed file content for each type are:

*.M2V   MPEG2 Video
*.MPV   MPEG1 Video
*.AC3   Dolby AC3 Audio
*.MPA   MPEG1 or MPEG2 Audio
*.WAV   PCM Audio
*.BMP   Windows BMP Raster Image
*.SUP   DVD-Video Sub Picture
*.PLT   DVD-Video Sub Picture Palette

Failing to comply with these file naming conventions will result in errors during the multiplexing process.

Further Documentation

This chapter lists the documentation that was used as well as further reading, organized by topic: MPEG reference, MPEG in practice, MPEG clips, video technology, DVD.

MPEG Reference Documentation

1. C-Cube: Compression Technology: An MPEG Overview, http://www.c-cube.com/technology/mpeg.html
2. MPEG Group FAQ, http://www.crs4.it/~luigi/MPEG/; MPEG FAQs and standards
3. MPEG.org, http://www.mpeg.org/; MPEG Pointers and Resources
4. Berkeley Multimedia Research Center
   MPEG-1 FAQ, http://bmrc.berkeley.edu/frame/research/mpeg/faq/mpeg1.html
   MPEG-2 FAQ, http://bmrc.berkeley.edu/frame/research/mpeg/faq/mpeg2.html
5. Haskell, B.; Puri, A.; Netravali, A.; Digital Video: An Introduction to MPEG-2; Chapman & Hall, 1997.
6. Orzessek, M.; Sommer, P.; ATM & MPEG-2: Integrating Digital Video Into Broadband Networks; Hewlett-Packard Professional Books, 1998.
7. Symes, P.; Video Compression; McGraw-Hill, 1998.
8. Fraunhofer Institut: MPEG 1 Layer 3 (part of the Fraunhofer MP3 Encoder)

MPEG Encoders, Players, Tips and Tricks

1. Ligos: Guide to MPEG Encoding, http://www.ligos.com/support/guide2MPEG.pdf
2. Trixter's Desktop MPEG Authoring FAQ, http://www.oldskool.org/mpeg/mpegfaq.html
3. Markus Zingg's AVI_IO, http://www.nct.ch/multimedia/avi_io/; excellent shareware application that works around the AVI 2 GB file size problem
4. Camel's MPEGJoin, http://extra.newsguy.com/~theprof/Readme.html; useful utility for joining MPEG streams together

MPEG Clips

1. Darim MPEG Clips, ftp://ftp.darvision.com/pub/mpegs/
2. Ligos MPEG Clips, http://www.ligos.com/products/sample_clips.shtm

Video Technology in General

1. John McGowan's AVI Overview, http://www.rahul.net/jfm/avi.html; probably the best resource anywhere on details regarding the AVI format
2. Interactive Technology Primer, http://tlc.nlm.nih.gov/resources/publications/primer/primer.html; the most complete guide to the "big picture" of multimedia
3. Color FAQ, http://www.inforamp.net/~poynton/notes/colour_and_gamma/ColorFAQ.html; explains all color spaces and gives transformation matrices from RGB to YCrCb
4. AV Video Multimedia Producer, http://www.avvideo.com/
5. Camcorder & Computer Video, Miller Magazines, 4880 Market St., Ventura, CA 93003
6. Computer Videomaker, http://www.videomaker.com/
7. DV (Digital Video), http://www.dv.com/
8. Videography, http://www.vidy.
9. Multimedia data formats
   http://i31www.ira.uka.de/~semin94/Seminar.html
   http://i31www.ira.uka.de/~semin94/02_JPEG/ (JPEG)
   http://i31www.ira.uka.de/~semin94/06_MPEG/main_html.html (MPEG)

DVD

1. DVD Authoring Tool Scenarist, http://www.scenarist.com/products/index/snt_fam.html, http://www.mtc2000.com/main.html
2. DVD FAQ at Videodiscovery, http://www.videodiscovery.com/vdyweb/dvd/dvdfaq.html; the basics of DVD
3. A Day at the DVD Forum, http://reality.sgi.com/nemec/dvd.html; technical notes on the requirements of MPEG-1 and MPEG-2 for DVD applications, highly recommended
4. Disctronics and Freehand DVD Video website, http://www.dvd-video.co.uk/; technical documents on DVD Video specifications and requirements
5. The Challenge of DVD Authoring, http://www.scenarist.com/white_papers/wp_challenge.html
6. DVD Authoring Tool Sonic DVD, http://www.dvdit.com/
7. DVD Authoring Tool Minerva, http://www.minervasys.com/dvd_solutions/dvd_default.html
8. Publishing in the Age of DVD, http://www.dvdcreator.com/pdf/dvd_primer.pdf; excellent introduction to all aspects of DVD-Video production
9. DVD FAQ, http://perso.libertysurf.fr/dvdutils/start_dvdfaq.htm
10. DVD Utils Frequently Asked Annoying ;-) Questions, http://perso.libertysurf.fr/dvdutils/start_f2aq.htm; covers DVD players and drives from the point of view of "dedicated" users (region-free, Macrovision, ...)
11. Digital Video Disc: The Coming Revolution in Consumer Electronics, http://www.c-cube.com/technology/dvd.html; excellent introduction to all aspects of DVD-Video
12. DVD Utils, http://www.dvdutils.com/; the site for technically minded DVD enthusiasts who like to experiment
13. DVD Tools, http://perso.libertysurf.fr/dvdrip/

Appendix

This appendix contains excerpts from supplementary documentation (see above).

An Interactive Technology Primer (Excerpt: Compression)

This document is accessible from The Learning Center home page at http://tlc.nlm.nih.gov under "Resources", "Publications".

Role of Compression in Digital Multimedia

Compression defined

Since digital multimedia files take up so much space and take so much time to transfer and present, they are often compressed. Compression involves reducing the size of a file for storage and transmission and reconstituting (decompressing) the information for presentation. There are different compression-decompression algorithms (CODECs).
Information can be stored in compressed or uncompressed form on both digital optical and magnetic media. Special hardware may be required to decompress and display compressed information in some cases; in other cases only software may be needed. In the latter, display rate will depend more on the speed of the computer's microprocessor and the speed at which information can be transferred from a compact disc or hard disk or sent over a network. Compression can be done for still images, motion video, and audio.

7.2 Still compression

One compression method, run length encoding, identifies adjacent pixels or lines on the screen having the same luminance (brightness) and chrominance (color) and records this value along with how many times it should be repeated. For example, instead of using 10 bytes to denote 5 red pixels followed by 5 blue ones by encoding the information as RRRRRBBBBB, only 4 bytes are needed if the information is encoded as 5R5B.

Another compression method, differential pulse code modulation, records only the differences between adjacent pixels.

A third method, discrete cosine transformation, samples screen pixels at different intervals and uses the luminance and chrominance values from these pixels to estimate the values of the intervening ones. Light values outside the range that the eye can detect are discarded, and more luminance information is sampled than chrominance, since the eye is more sensitive to brightness.

A fourth method, fractal compression, does not represent pixels, but uses mathematical formulas called fractals. It is based on the assumption that image objects are made up of smaller objects that are just like them. For example, the entire sky is made up of patches of sky that look like it. Fractal compression finds these image relationships, generates formulas representing them, and discards pixel data. The result is high compression that allows scaling images to any size without distortion.
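The run length encoding example given for still compression (RRRRRBBBBB stored as 5R5B) can be reproduced in a few lines. The Python sketch below is a minimal illustration, not the scheme of any particular image format: the function names are invented for this example, and it assumes that each pixel is a single non-digit character.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of identical symbols into '<count><symbol>'."""
    return "".join(f"{len(list(run))}{symbol}" for symbol, run in groupby(pixels))


def rle_decode(encoded):
    """Invert rle_encode; assumes symbols are single non-digit characters."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch          # collect multi-digit run lengths
        else:
            out.append(ch * int(count))
            count = ""
    return "".join(out)


if __name__ == "__main__":
    packed = rle_encode("RRRRRBBBBB")
    print(packed, rle_decode(packed))
```

Because decoding restores the input exactly, run length encoding is lossless. It only pays off when runs are long: an input without repeats, such as RB, grows to 1R1B.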
Because fractal-compressed pictures can be scaled beyond their original size, the technique can also be used for image enhancement.

Scanners are used to digitally capture slides and photos.

7.3 Motion compression

Full motion video involves recording images at a rate of 30 per second. Many images must be recorded, but the information within each image is mostly redundant with prior and subsequent ones. Usually, the only things that differ from one image to the next are the objects that have moved or changed position. Motion compression therefore samples full frames at specified intervals, compresses them, and then compresses only those parts of the intervening frames that have changed from one frame to the next. Video capture boards are often needed to digitally capture motion episodes, and they may be needed to display or enhance the display of the recorded files.

There are some software-only CODECs for presenting compressed motion video files, including Apple's Quicktime (for Macintosh and Windows), Microsoft's Video for Windows, and MPEG (a compression standard from the Moving Picture Experts Group of the ISO for a variety of platforms). Indeo is compression software from Intel for capturing motion and creating compressed files in Video for Windows audio video interleave (.avi) or Quicktime (.mov) formats. It is a derivative of the older Digital Video Interactive (DVI) technology.

There are two MPEG video compression formats, MPEG-1 (often simply called MPEG) and MPEG-2, both of which create .mpg files. MPEG-2 is more recent and is used for digital video disc and digitally broadcast video. It is a scalable compression standard offering several levels of audio quality and a variety of frame sizes and transfer rates:

Low level    352 x 240 pixel displays at 30 frames per second (fps), maximum bit rate 4 Mbps
Main level   720 x 480 pixel displays at 30 fps, maximum bit rate 15 Mbps
High 1440    1440 x 1152 pixel displays at 30 fps, maximum bit rate 60 Mbps
High 1920    1920 x 1080 pixel displays at 30 fps, maximum bit rate 80 Mbps

7.4 Audio compression

There are several types of audio compression. Digital audio compression is a function of the sampling rate, usually in kilohertz (kHz), at which sound is originally captured, the number of bits used to store the captured sound, and how the bits are allocated. Higher sampling rates and the use of more bits result in higher quality. Monaural sound is produced when all bits are allocated to a single channel, and stereo sound is produced when they are divided between two channels. Sometimes the quality of a compression level is equated with "telephone" quality or "CD" quality. MP3, that is, MPEG-1 Audio Layer 3 (sometimes loosely called "MPEG 3"), is a standard for compressing audio in a way that approaches the quality of CD audio.

7.5 Types of compression

Compression can be either symmetrical or asymmetrical, lossless or lossy. In symmetrical compression, compression and decompression take the same time. In asymmetrical compression, compression takes longer than decompression. Lossless compression means there is no deterioration of the image; lossy means there is. Low compression ratios of, say, 2 to 1 can be lossless, but the higher ones needed for digital multimedia, usually 50 or 60 to 1, are lossy. This means that any one of the following may change:

1) the size of the image,
2) the resolution of the image,
3) the amount of color in the image, and
4) if motion is used, the rate at which individual pictures or frames can be displayed.

When motion compression is done, there can be intraframe and interframe compression. The former is compression within a given frame or picture; the latter is compression between frames or pictures (e.g., of the changes between one picture and the next). Scalable motion compression sacrifices the quality of the image and sound data to maintain a specified rate of motion or frame rate (e.g., 30 frames per second or 15 frames per second).
In scalable timing motion compression, audio quality is preserved and frames are dropped to ensure an uninterrupted soundtrack.

AVI Overview (Excerpt: MPEG)

by John F. McGowan, Ph.D.
(c) 1996-1999, John F. McGowan
http://www.rahul.net/jfm/

How to convert AVI to MPEG?

AVI to MPEG Conversion at a Glance

Company/Author(s)         Product            Price         URL
Corel                     PhotoPaint         $500?         http://www.corel.com/
Ulead                     MPEG Converter     $249          http://www.ulead.com/
Xing Technologies         XingMPEG Encoder   $89           http://www.xingtech.com/
                          XingMPEG Encoder 2 (May 6, 1997 release)
CeQuadrat                 PixelShrink        $199          http://www.cequadrat.com/
Vitec                     MPEG Maker         $125          http://vitechts.com/
MainConcept               MainActor          shareware     http://www.mainconcept.de/
Unknown                   avi2mpg1           freeware      http://www.mnsi.net/~jschlic1/
Stefan Eckart and others  CONVMPG3           freeware kit  http://www.powerweb.de/mpeg/msdos.html
Ligos Technology          LSX-MPEG Encoder   $179.95       http://www.ligos.com/

Further information, reviews, and live links follow:

The following posting from the comp.graphics.animation USENET newsgroup provides a good answer to this question. I have retained the header to ensure proper credit to the author. Note: LW refers to the Lightwave 3D animation software package.

Hi, I use LW to do animation, and basically I am not happy with any of the compression engines available for avi. Those codecs suck. So what I want to do is make an UNCOMPRESSED AVI and then translate it to MPEG. Anyone know of any good converters to MPEG, or how about a plug-in for LW to be able to do MPEG files from the start. Thank You

Hi, you're right, every single AVI compression codec is lame. 5 years of the AVI format's existence and zero progress so far. If you're talking about freeware or budget-priced MPEG codecs, it's a tough task to find the damn thing. I've been busy in this area for quite a while already, and here are my findings:

1. XING's MPEG encoder is a classical name on the scene. Had compatibility problems before, not anymore, I believe. Can cost you $150 or more, not sure. Scan for 'XING' on the Net, you'll definitely find some tracks (www.xing.com doesn't show up).

2. Stefan Eckart's CMPEG (DOS) encoder is FREE and GOOD, and has stayed so for a couple of years already. Can have troubles converting some particular streams, but generally not worse than many commercial programs. (You need to make a TGA sequence first out of your AVI, though.) Again, scan for CMPEG, or use my bookmarks found on the site I'm introducing below.

3. To my surprise, Corel Photopaint 6 has got a very decent built-in MPEG compression option. Open an AVI, Save As an MPEG, and see what happens (get some coffee, as it'll take a while ;) I checked it out on a stream where CMPEG gave up, and Corel's conversion did work wonders. (If you'd like to see the result, download my 'Liquid Beatles' morph clip, 1 Mb: http://www.proteon.nl/synth_art/movies/cross.mpg).

4. Ulead's MPEG converter (www.ulead.com) seems to be the major player (priced below $250) in the Windows arena. I've heard good references about their MPEG quality, but I feel that their biggest advantage is good integration with Windows and the AVI format. If I'm not mistaken, a very slow codec.

5. Don't mess with DARIM Vision's codec (Korea). I've tried their demo, it produces low-quality crap. Though fast and cheap (you bet :-).

See my MPEG clips, fractals, morphs, and in general lots of advanced graphics at http://www.proteon.nl/synth_art/

Hope this helps,
Valery
http://www.proteon.nl/synth_art/movies.html

In addition to the above, there is MPEG Maker from VITEC-HTS (formerly Vitec Multimedia). Vitec is:

Vitec
4366 Independence Court, Suite C
Sarasota, FL 34234
Voice: (941) 351-9344
FAX: (941) 351-9423
http://vitechts.com

CeQuadrat makes a software-only AVI to MPEG converter called PixelShrink. CeQuadrat is:

CeQuadrat
1804 Embarcadero Road, Suite 101
Palo Alto, CA 94303
Voice: (415) 843-3780
FAX: (415) 843-3799
http://www.cequadrat.com/

There is also the freeware kit CONVMPG3, a collection of MS-DOS utilities that can be used to convert AVI to MPEG-1 or MPEG-1 to AVI. CONVMPG3 includes Stefan Eckart's CMPEG MPEG-1 encoder mentioned above, and also includes utilities to generate the sequence of Targa files required by CMPEG. The URL for CONVMPG3 is:

http://www.powerweb.de/mpeg/msdos.html

avi2mpg1 is a freeware command line application for Windows 95/NT that can convert AVI to MPEG-1. It supports audio, video, and interleaved audio/video.

http://www.mnsi.net/~jschlic1/

MainConcept's MainActor product now (March 1997) includes add-on modules to output MPEG-1 and MPEG-2. With these add-on modules, MainActor can convert AVI to MPEG-1 or MPEG-2. Marcus Moenig at MainConcept provided an evaluation copy of the MPEG-1/2 modules. In tests, these modules could convert AVI files to MPEG-1 that could be played using the ActiveMovie software MPEG player shipping with Microsoft's Windows 95 OSR2. MainConcept is:

MainConcept
http://www.mainconcept.de

The URL for Ulead is:

Ulead MPEG Converter
http://www.ulead.com/

On May 6, 1997, Xing announced a new product, the Xing MPEG Encoder 2, which accelerates MPEG encoding using Intel MMX instructions on PCs. The original Xing MPEG Encoder did not use MMX instructions. The Xing MPEG Encoder 2 can convert AVI and WAV files to MPEG-1. The URL for Xing is:

Xing Technology Corporation
http://www.xingtech.com/

Ligos Technology markets an LSX-MPEG Encoder to convert AVI to MPEG-1 and MPEG-2.

Ligos Technology
1475 Folsom St., Suite 200
San Francisco, CA 94103
+1-415-437-6137
+1-415-437-6139 FAX
info@ligos.com
http://www.ligos.com/

For further information on the MPEG digital audio and video format, see Tristan Savatier's comprehensive MPEG site:

http://www.mpeg.org/

and The MPEG Home Page:

http://drogo.cselt.it/mpeg/