Image Display Algorithms for High and Low Dynamic
Transcription
Image Display Algorithms for High and Low Dynamic
Image Display Algorithms for High and Low Dynamic Range Display Devices Erik Reinhard1,2,3 Timo Kunkel1 Yoann Marion2 Kadi Bouatouch2 Jonathan Brouillat2 Rémi Cozot2 Abstract With interest in high dynamic range imaging mounting, techniques for displaying such images on conventional display devices are gaining in importance. Conversely, high dynamic range display hardware is creating the need for display algorithms that prepare images for such displays. In this paper, the current state-of-the-art in dynamic range reduction and expansion is reviewed, and in particular we assess the theoretical and practical need to structure tone reproduction as a combination of a forward and a reverse pass. 1 Introduction Real-world environments typically contain a range of illumination much larger than can be represented by conventional 8-bit images. For instance, sunlight at noon may be as much as 100 million times brighter than starlight [Spillmann and Werner 1990; Ferwerda 2001]. The human visual system is able to detect 4 or 5 log units of illumination simultaneously, and can adapt to a range of around 10 orders of magnitude over time [Ferwerda 2001]. On the other hand, conventional 8-bit images with values between 0 and 255, have a useful dynamic range of around 2 orders of magnitude. Such images are represented typically by one byte per pixel for each of the red, green and blue channels. The limited dynamic range afforded by 8-bit images is well-matched to the display capabilities of CRTs. Their range, while being larger than 2 orders of magnitude, lies partially in the dark end where human vision has trouble discerning very small differences under normal viewing circumstances. Hence, CRTs have a useful dynamic range of 2 log units of magnitude. Currently, very few display devices have a dynamic range that significantly exceeds this range. The notable exception are LCD displays with an LED back-panel where each of the LEDs is separately addressable [Seetzen et al. 2003; Seetzen et al. 2004]. With the pioneering start-up BrightSide being taken over by Dolby, and both Philips and Samsung demonstrating their own displays with spatially varying back-lighting, hardware developments are undeniably moving towards higher dynamic ranges. It is therefore reasonable to anticipate that the variety in display capabilities will increase. Some displays will have a much higher dynamic range than others, whereas differences in mean luminance will also increase due to a greater variety in back-lighting technology. As a result, the burden on general purpose display algorithms will change. High dynamic range (HDR) image acquisition has already created the need for tone reproduction operators, which reduce the dynamic range of images prior to display [Reinhard et al. 2005; Reinhard 2007]. The advent of HDR display devices will add to this the need to sometimes expand the dynamic range of images. In particular the enormous number of conventional eight-bit 1 University of 2 Institut Bristol, Bristol, e-mail: reinhard@cs.bris.ac.uk de Recherche en Informatique et Systèmes Aléatoires (IRISA), Rennes 3 University of Central Florida, USA Figure 1: With conventional photography, some parts of the scene may be under- or over-exposed (left). Capture of this scene with 9 exposures, and assemblage of these into one high dynamic range image followed by tone reproduction, affords the result shown on the right. Figure 2: Linear scaling of high dynamic range images to fit a given display device may cause significant detail to be lost (left). For comparison, the right image is tone-mapped, allowing details in both bright and dark regions to be visible. images may have to be expanded in range prior to display on such devices. Algorithms for dynamic range expansion are commonly called inverse tone reproduction operators. In this paper we survey the state-of-the-art in tone reproduction as well as inverse tone reproduction. In addition, we will summarize desirable features of tone reproduction and inverse tone reproduction algorithms. 2 Dynamic Range Reduction Capturing the full dynamic range of a scene implies that in many instances the resulting high dynamic range (HDR) image cannot be directly displayed, as its range is likely to exceed the 2 orders of magnitude range afforded by conventional display devices. Figure 1 (left) shows a scene with a dynamic range far exceeding the capabilities a conventional display. By capturing the full dynamic range of this scene, followed by tone mapping the image, an acceptable rendition of this scene may be obtained (Figure 1, right). A simple compressive function would be to normalize an image (see Figure 2 (left)). This constitutes a linear scaling which is sufficient only if the dynamic range of the image is slightly higher than the dynamic range of the display. For images with a significantly higher dynamic range, small intensity differences will be quantized to the same display value such that visible details are lost. For comparison, the right image in Figure 2 is tone-mapped non-linearly showing detail in both the light and dark regions. In general linear scaling will not be appropriate for tone reproduction. The key issue in tone reproduction is then to compress an image while at the same time preserving one or more attributes of the image. Different tone reproduction algorithms focus on different attributes such as contrast, visible detail, brightness, or appearance. Ideally, displaying a tone-mapped image on a low dynamic range display device would recreate the same visual response in the observer as the original scene. Given the limitations of display devices, this is in general not achievable, although we may approximate this goal as closely as possible. 3 Spatial operators In the following sections we discuss tone reproduction operators which apply compression directly on pixels. Often global and local operators are distinguished. Tone reproduction operators in the former class change each pixel’s luminance values according to a compressive function which is the same for each pixel [Miller et al. 1984; Tumblin and Rushmeier 1993; Ward 1994; Ferwerda et al. 1996; Drago et al. 2003; Ward et al. 1997; Schlick 1994]. The term global stems from the fact that many such functions need to be anchored to some values that are computed by analyzing the full image. In practice most operators use the geometric average L̄v to steer the compression: ! 1 X log(δ + Lv (x, y) (1) L̄v = exp N x,y The small constant δ is introduced to prevent the average to become zero in the presence of black pixels. The luminance of each pixel is indicated with Lv , which can be computed from RGB values if the color space is known. If the color space in which the image is specified is unknown, then the second best alternative would be to assume that the image uses sRGB primaries and white point, so that the luminance of a pixel is given by: Lv (x, y) = 0.2125 R(x, y) + 0.7154 G(x, y) + 0.0721 B(x, y) (2) The geometric average L̄v is normally mapped to a predefined display value. The main challenge faced in the design of a global operator lies in the choice of compressive function. Many functions are possible, which are for instance based on the image’s histogram (Section 4.2) [Ward et al. 1997] or on data gathered from psychophysics (Section 4.3). On the other hand, local operators compress each pixel according to a specific compression function which is modulated by information derived from a selection of neighboring pixels, rather than the full image [Chiu et al. 1993; Jobson et al. 1995; Rahman et al. 1996; Rahman et al. 1997; Pattanaik et al. 1998; Fairchild and Johnson 2002; Ashikhmin 2002; Reinhard et al. 2002; Pattanaik and Yee 2002; Oppenheim et al. 1968; Durand and Dorsey 2002; Choudhury and Tumblin 2003]. The rationale is that the brightness of a pixel in a light neighborhood is different than the brightness of a pixel in a dark neighborhood. Design challenges for local operators involve choosing the compressive function, the size of the local neighborhood for each pixel, and the manner in which local pixel values are used. In general, local operators are able to achieve better compression than global operators (Figure 3), albeit at a higher computational cost. Both global and local operators are often inspired by the human visual system. Most operators employ one of two distinct compressive functions, which is orthogonal to the distinction between Figure 3: A local tone reproduction operator (left) and a global tone reproduction operator (right) [Reinhard et al. 2002]. The local operator shows more detail, as for instance seen in the insets. local and global operators. Display values Ld (x, y) are most commonly derived from image luminances Lv (x, y) by the following two functional forms: Lv (x, y) f (x, y) Ln v (x, y) Ld (x, y) = n Lv (x, y) + gn (x, y) Ld (x, y) = (3a) (3b) In these equations, f (x, y) and g(x, y) may either be constant, or a function which varies per pixel. In the former case, we have a global operator, whereas a spatially varying function results in a local operator. The exponent n is a constant which is either fixed, or set differently per image. Equation 3a divides each pixel’s luminance by a value derived from either the full image or a local neighborhood. As an example, the substitution f (x, y) = Lmax /255 in (3a) yields a linear scaling such that values may be directly quantized into a byte, and can therefore be displayed. A different approach would be to substitute f (x, y) = Lblur (x, y), i.e. divide each pixel by a weighted local average, perhaps obtained by applying a Gaussian filter to the image [Chiu et al. 1993]. While this local operator yields a displayable image, it highlights a classical problem whereby areas near bright spots are reproduced too dark. This is often seen as halos, as demonstrated in Figure 4. The cause of halos stems from the fact that Gaussian filters blur across sharp contrast edges in the same way that they blur small and low contrast details. If there is a high contrast gradient in the neighborhood of the pixel under consideration, this causes the Gaussian blurred pixel to be significantly different from the pixel itself. By using a very large filter kernel in a division-based approach such large contrasts are averaged out, and the occurrence of halos can be minimized. However, very large filter kernels tend to compute a local average that is not substantially different from the global average. In the limit that the size of the filter kernel tends to infinity, the local average becomes identical to the global average and therefore limits the compressive power of the operator to be no better than a global operator. Thus, the size of the filter kernel in divisionbased operators presents a trade-off between the ability to reduce the dynamic range, and the visibility of artifacts. Ld 300 250 200 150 100 50 0 0 1 2 3 4 5 6 Log (Lw) Figure 5: Tone-mapping function created by reshaping the cumulative histogram of the image shown in Figure 6. Figure 4: Halos are artifacts commonly associated with local tone reproduction operators. Chiu’s operator is used here without smoothing iterations to demonstrate the effect of division (left). 4 Global Tone Reproduction Operators While linear scaling by itself would be sufficient to bring the dynamic range within the display’s limits, this does not typically lead to a visually pleasant rendition. In addition to compressing the range of values, it is therefore necessary to preserve one or more image attributes. In the design of a tone reproduction operator, one is free to choose which image attribute should be preserved. Some of the more common examples are discussed in the following. 4.1 Brightness Matching Tumblin and Rushmeier have argued that brightness, a subjective visual sensation, should be preserved [Tumblin and Rushmeier 1993]. Given the luminance L of a pixel, its brightness B can be approximated using: «γ „ Lv (4) B = 0.3698 LA where LA is the adapting luminance, γ models the visual system’s non-linearity and the numeric constant is due to fitting the function to measured data. To preserve brightness, the brightness reproduced on the display Bd should be matched to the scene brightnesses Bv . This leads to an expression relating scene Lv and display luminances Ld : „ «γ(Lv )/γ(Ld ) Lv Ld = Ld,A (5) Lv,A where Ld,A and Lv,A are the adapting luminances for the display and the scene respectively. To account for the fact that it may be undesirable to always map mid-range scene values to midrange display values, this expression is multiplied with a correction term [Tumblin and Rushmeier 1993]. The γ() function is essentially the logarithm of its parameter: ( 2.655 if L > 100 cd/m2 γ(L) = −5 1.855 + 0.4 log10 (L + 2.3 · 10 ) otherwise (6) The key observation to make is that matching brightnesses between scene and display leads to a power function, with the exponent determined by the adapting luminances of the scene and the display. This approach therefore constitutes a form of gamma correction. It is known to work well for medium dynamic range scene, but may cause burn-out if used to compress high dynamic range images. In the following, we will argue that allowing burned-out areas can be beneficial for the overall appearance of the image. However, it is important to retain control over the number of pixels that become over-exposed. This issue is discussed further in Section 7. Furthermore, this operator is one of the few which essentially comprise a forward and a backward pass. The forward pass consists of computing brightness values derived from image luminances. The backwards pass then computes luminance values from these brightness values which can subsequently be displayed. This is theoretically correct, as any tone reproduction operator should formally take luminance values as input, and produce luminance values as output. If the reverse step is omitted, this operator would produce brightness values as output. If the human visual system is presented with such values, it will interpret these brightness values as luminance values, and therefore process values in the brain that have been perceived twice. To avoid this, it is theoretically correct to apply a forward pass to achieve range compression, followed by a reverse pass to reconstitute luminance values, and adjust the values for the chosen display. It should not matter whether this display is a conventional monitor or a high dynamic range display. Furthermore, this operator exhibits the desirable property that tone-mapping an image that was tone-mapped previously, results in an unaltered image [DiCarlo and Wandell 2000]. These issues are discussed further in Section 8. 4.2 Histogram Adjustment A simple but effective approach to tone reproduction is to derive a mapping from input luminances to display luminances using the histogram of the input image [Ward et al. 1997; Duan et al. 2005]. Histogram equalization would simply adjust the luminances so that the probability that each display value occurs in the output image is equal. Such a mapping is created by computing the image’s histogram, and then integrating this histogram to produce a cumulative histogram. This function can be used directly to map input luminances to display values. However, dependent on the shape of the histogram, it is possible that some contrasts in the image are exaggerated, rather than attenuated. This produces unnatural results, which may be overcome by restricting the cumulative histogram to never attain a slope that is too large. The threshold is determined at each luminance level by a model of human contrast sensitivity. The method is then called histogram adjustment, rather than histogram equalization [Ward et al. 1997]. An example of a display mapping generated by this method Ld 1.0 Ld = Lw / (Lw + 1) Ld = 0.25 log (Lw) + 0.5 Ld = Lw / (Lw + 10) Ld = Lw / (Lw + 0.1) 0.5 0.0 -2 -1 0 1 2 log (Lw) Figure 7: Over the middle range values, sigmoidal compression is approximately logarithmic. The choice of semi-saturation constant determines how input values are mapped to display values. Ld 1.0 Ld = Lnw / (Lnw + gn) n = 0.5 n = 1.0 n = 2.0 Figure 6: Image tone-mapped using histogram adjustment. 0.5 is shown in Figure 5. This mapping is derived from the image shown in Figure 6. While this method can be extended to include simulations of veiling luminance, illumination dependent color sensitivity, visual acuity, and although knowledge of human contrast sensitivity is incorporated into the basic operator [Ward et al. 1997], we may view this operator as a refined form of histogram equalization. The importance of this observation is that histogram equalization is essentially rooted in engineering principles, rather than a simulation of the human visual system. It is therefore natural to regard this operator as “unit-less” — it transforms luminance values to different luminance values, as opposed to Tumblin and Rushmeier’s operator which transforms luminance values to brightness values if applied in forward mode only. As a consequence there is no theoretical need to apply this model in reverse. This argument can be extended to other tone reproduction operators. In general, when an algorithm simulates aspects of the human visual system, this by itself creates the theoretical need to apply both the forward and inverse versions of the algorithm to ensure that the output is measured in radiometric units. This requirement does not necessarily exist for approaches based on engineering principles. 4.3 Sigmoidal Compression Equation 3b has an S-shaped curve on a log-linear plot, and is called a sigmoid for that reason. This functional form fits data obtained from measuring the electrical response of photo-receptors to flashes of light in various species [Naka and Rushton 1966]. It has also provided a good fit to other electro-physiological and psychophysical measurements of human visual function [Kleinschmidt and Dowling 1975; Hood et al. 1979; Hood and Finkelstein 1979]. Sigmoids have several desirable properties. For very small luminance values the mapping is approximately linear, so that contrast is preserved in dark areas of the image. The function has an asymptote at one, which means that the output mapping is always bounded between 0 and 1. A further advantage of this function is that for intermediate values, the function affords an approximately logarithmic compression. This can be seen for instance in Figure 7, where the middle section of the curve is approximately linear on a log-linear plot. To illustrate, both Ld = Lw /(Lw + 1) and Ld = 0.25 log(Lw ) + 0.5 are plotted in this figure 4 , showing that 4 The constants 0.25 and 0.5 were determined by equating the values as g=1 0.0 -2 -1 0 1 2 log (Lw) Figure 8: The exponent n determines the steepness of the curve. Steeper slopes map a smaller range of input values to the display range. these functions are very similar over a range centered around 1. In Equation 3b, the function g(x, y) may be computed as a global constant, or as a spatially varying function. Following common practice in electro-physiology, we call g(x, y) the semi-saturation constant. Its value determines which values in the input image are optimally visible after tone mapping, as shown in Figure 7. The effect of choosing different semi-saturation constants is also shown in this figure. The name semi-saturation constant derives from the fact that when the input Lv reaches the same value as g(), the output becomes 0.5. In its simplest form, g(x, y) is set to L̄v /k, so that the geometric average is mapped to user parameter k (which corresponds to the key of the scene) [Reinhard et al. 2002]. In this case, a good initial value for k is 0.18, which conforms to the photographic equivalent of middle gray (although some would argue that 0.13 rather than 0.18 is neutral gray). For particularly light or dark scenes this value may be raised or lowered. Alternatively, its value may be estimated from the image itself [Reinhard 2003]. A variation of this global operator computes the semi-saturation constant by linearly interpolating between the geometric average and each pixel’s luminance [Reinhard and Devlin 2005]. g(x, y) = a Lv (x, y) + (1 − a) L̄v (7) The interpolation is governed by user parameter a ∈ [0, 1] which has the effect of varying the amount of contrast in the displayable image. The exponent n in Equation 3b determines how pronounced the S-shape of the sigmoid is. Steeper curves map a smaller useful range of scene values to the display range, whereas shallower curves map a larger range of input values to the display range. well as the derivatives of both functions for x = 1. Studies in electro-physiology report values between n = 0.2 and n = 0.9 [Hood et al. 1979]. 5 Local Tone Reproduction Operators The several different variants of sigmoidal compression shown above are all global in nature. This has the advantage that they are fast to compute, and they are very suitable for medium to high dynamic range images. Their simplicity makes these operators suitable for implementation on graphics hardware as well. For very high dynamic range images, however, it may be necessary to resort to a local operator since this may give somewhat better compression. 5.1 Local Sigmoidal Operators A straightforward method to extend sigmoidal compression replaces the global semi-saturation constant by a spatially varying function, which once more can be computed in several different ways. Thus, g(x, y) then becomes a function of a spatially localized average. Perhaps the simplest way to accomplish this is to once more use a Gaussian blurred image. Each pixel in a blurred image represents a locally averaged value which may be viewed as a suitable choice for the semi-saturation constant5. As with division-based operators discussed in the previous section, we have to consider haloing artifacts. If sigmoids are used with a spatially varying semi-saturation constant, the Gaussian filter kernel is typically chosen to be very small to minimize artifacts. In practice filter kernels of only a few pixels wide are sufficient to suppress significant artifacts while at the same time producing more local contrast in the tone-mapped images. Such small filter kernels can be conveniently computed in the spatial domain without losing too much performance. There are, however, several different approaches to compute a local average, which are discussed in the following section. Figure 9: Scale selection mechanism. The left image shows the tone-mapped result. The image on the right encodes the selected scale for each pixel as a gray value. The darker the pixel, the smaller the scale. A total of eight different scales were used to compute this image. It is now possible to find the largest neighborhood around a pixel that does not contain sharp edges by examining differences of Gaussians at different kernel sizes i [Reinhard et al. 2002]: ˛ ˛ ˛ LDoG (x, y) ˛˛ i ˛ i = 1...8 (9) ˛ Ri σ (x, y) + α ˛ > t Here, the DoG filter is divided by one of the Gaussians to normalize the result, and thus enable comparison against the constant threshold t which determines if the neighborhood at scale i is considered to have significant detail. The constant α is added to avoid division by zero. For the image shown in Figure 9 (left), the scale selected for each pixel is shown in Figure 9 (right). Such a scale selection mechanism is employed by the photographic tone reproduction operator [Reinhard et al. 2002] as well as in Ashikhmin’s operator [Ashikhmin 2002]. Once for each pixel the appropriate neighborhood is known, the Gaussian blurred average Lblur for this neighborhood may be used to steer the semi-saturation constant, such as for instance employed by the photographic tone reproduction operator: Ld = 5.2 Local Neighborhoods In local operators, halo artifacts occur when the local average is computed over a region that contains sharp contrasts with respect to the pixel under consideration. It is therefore important that the local average is computed over pixel values that are not significantly different from the pixel that is being filtered. This suggests a strategy whereby an image is filtered such that no blurring over such edges occurs. A simple, but computationally expensive way is to compute a stack of Gaussian blurred images with different kernel sizes, i.e. an image pyramid. For each pixel, we may choose the largest Gaussian which does not overlap with a significant gradient. The scale at which this happens can be computed as follows. In a relatively uniform neighborhood, the value of a Gaussian blurred pixel should be the same regardless of the filter kernel size. Thus, in this case the difference between a pixel filtered with two different Gaussians should be around zero. This difference will only change significantly if the wider filter kernel overlaps with a neighborhood containing a sharp contrast step, whereas the smaller filter kernel does not. A difference of Gaussians (DoG) signal LDoG (x, y) at scale i can be computed as follows: i LDoG (x, y) = Ri σ (x, y) − R2 i σ (x, y) i 5 Although (8) g(x, y) is now no longer a constant, we continue to refer to it as the semi-saturation constant. Lw 1 + Lblur (10) It is instructive to compare the result of this operator with its global equivalent, which is defined as: Ld = Lw 1 + Lw (11) Images tone-mapped with both forms are shown in Figure 10. The CIE94 color difference metric shown in this figure shows that the main differences occur near (but not precisely at) high-frequency high-contrast edges, predominantly seen in the clouds. These are the regions where more detail is produced by the local operator. An alternative approach includes the use of edge preserving smoothing operators, which are designed specifically for removing small details while keeping sharp contrasts in tact. Such filters have the advantage that sharp discontinuities in the filtered result coincide with the same sharp discontinuities in the input image, and may therefore help to prevent halos [DiCarlo and Wandell 2000]. Several such filters, such as the bilateral filter, trilateral filter, Susan filter, the LCIS algorithm and the mean shift algorithm are suitable [Durand and Dorsey 2002; Choudhury and Tumblin 2003; Pattanaik and Yee 2002; Tumblin and Turk 1999; Comaniciu and Meer 2002], although some of them are expensive to compute. Edge preserving smoothing operators are discussed Section 5.4. 5.3 Sub-band Systems Image pyramids can be used directly for the purpose of tone reproduction, provided the filter bank is designed carefully [Li et al. Figure 10: This image was tone-mapped with both global and local versions of the photographic tone reproduction operator (top left and right). Below, the CIE94 color difference is shown. 2005]. Here, a signal is decomposed into a set of signals that can be summed to reconstruct the original signal. Such algorithms are known as sub-band systems, or alternatively as wavelet techniques, multi-scale techniques or image pyramids. An image consisting of luminance signals Lv (x, y) is split into a stack of band-pass signals LDoG (x, y) using (8) whereby the n i scales i increase by factors of two. The original signal Lv can then be reconstructed by simply summing the sub-bands: Lv (x, y) = n X LDoG (x, y) i (12) i=1 Assuming that Lv is a high dynamic range image, a tone-mapped image can be created by first applying a non-linearity to the bandpass signals. In its simplest form, the non-linearity applied to each LDoG (x, y) is a sigmoid [DiCarlo and Wandell 2000]. However, i as argued by Li et al, summing the filtered sub-bands then leads to distortions in the reconstructed signal [Li et al. 2005]. To limit such distortions, either the filter bank may be modified, or the nonlinearity may be redesigned. Although sigmoids are smooth functions, their application to a (sub-band) signal with arbitrarily sharp discontinuities will yield signals with potentially high frequencies. The high-frequency content of sub-band signals will cause distortions in the reconstruction of the tone-mapped signal Lv [Li et al. 2005]. The effective gain G(x, y) applied to each sub-band as a result of applying a sigmoid can be expressed as a pixel-wise multiplier: ′ LDoG (x, y) = LDoG (x, y) G(x, y) i i (13) To avoid distortions in the reconstructed image, the effective gain should have frequencies no higher than the frequencies present in the sub-band signal. This can be achieved by blurring the effective gain map G(x, y) before applying it to the sub-band signal. This approach leads to a significant reduction in artifacts, and is an important tool in the prevention of halos. The filter bank itself may also be adjusted to limit distortions in the reconstructed signal. In particular, to remove undesired frequencies in each of the sub-bands caused by applying a non-linear function, a second bank of filters may be applied before summing the sub-bands to yield the reconstructed signal. If the first filter bank which splits the signal into sub-bands is called the analysis filter bank, then the second bank is called the synthesis filter bank. The non-linearity described above can then be applied in-between Figure 11: Tone reproduction using a sub-band architecture, computed here using a Haar filter [Li et al. 2005]. the two filter banks. Each of the synthesis filters should be tuned to the same frequencies as the corresponding analysis filters. An efficient implementation of this approach, which produces excellent artifact-free results, is described by Li et al [Li et al. 2005]; an example image is shown in Figure 11. The image benefits from clamping the bottom 2% and the top 1% of the pixels, which is discussed further in Section 6.2. Although this method produces excellent artifact-free images, it has the tendency to over-saturate the image. This effect was ameliorated in Figure 11 by desaturating the image using the technique described in Section 6.1 (with a value of 0.7, which is the default value used for the sub-band approach). However, even after desaturating the image, its color fidelity remained a little too saturated. Further research would be required to determine the exact cause of this effect, which is shared with gradient domain compression (Section 5.5). Finally, it should be noted that the scene depicted in Figure 11 is a particularly challenging image for tone reproduction. The effects described here would be less pronounced for many high dynamic range photographs. 5.4 Edge-Preserving Smoothing Operators An edge-preserving smoothing operator attempts to remove details from the image without removing high-contrast edges. An example is the bilateral filter [Tomasi and Manduchi 1998; Paris and Durand 2006; Weiss 2006], which is a spatial Gaussian filter multiplied with a second Gaussian operating in the intensity domain. With Lv (x, y) the luminance at pixel (x, y), the bilateral filter LB (x, y) is defined as: w(x, y, u, v) Lv (u, v) (14) w(x, y, u, v) XX w(x, y, u, v) = Rσ1 (x − u, y − v) Rσ2 (Lv (x, y), Lv (u, v)) LB (x, y) = u v Here, σ1 is the kernel size used for the Gaussian operating in the spatial domain, and σ2 is kernel size of the intensity domain Gaussian filter. The bilateral filter can be used to separate an image into ’base’ and ’detail’ layers [Durand and Dorsey 2002]. Applying the bilateral filter to an image results in a blurred image in which sharp edges remain present (Figure 12 left). Such an image is normally called a ’base’ layer. This layer has a dynamic range similar to the Figure 12: Bilateral filtering removes small details, but preserves sharp gradients (left). The associated detail layer is shown on the right. Figure 14: The image on the left is tone-mapped using gradient domain compression. The magnitude of the gradients k∇Lk is mapped to a grey scale in the right image (white is a gradient of 0; black is the maximum gradient in the image). represented with image gradients (in log space): ∇L = (L(x + 1, y) − L(x, y), L(x, y + 1) − L(y)) Figure 13: An image tone-mapped using bilateral filtering. The base and detail layers shown in Figure 12 are recombined after compressing the base layer. input image. The high dynamic range image may be divided pixelwise by the base layer, obtaining a ’detail’ layer LD (x, y) which contains all the high frequency detail, but typically does not have a very high dynamic range (Figure 12 right): LD (x, y) = Lv (x, y)/LB (x, y) (16) Here, ∇L is a vector-valued gradient field. By attenuating large gradients more than small gradients, a tone reproduction operator may be constructed [Fattal et al. 2002]. Afterwards, an image can be reconstructed by integrating the gradient field to form a tonemapped image. Such integration must be approximated by numerical techniques, which is achieved by solving a Poisson equation using the Full Multi-grid Method [Press et al. 1992]. The resulting image then needs to be linearly scaled to fit the range of the target display device. An example of gradient domain compression is shown in Figure 14. (15) By compressing the base layer before recombining into a compressed image, a displayable low dynamic range image may be created (Figure 13). Compression of the base layer may be achieved by linear scaling. Tone reproduction on the basis of bilateral filtering is executed in the logarithmic domain. Edge-preserving smoothing operators may be used to compute a local adaptation level for each pixel, to be applied in a spatially varying or local tone reproduction operator. A local operator based on sigmoidal compression can for instance be created by substituting Lblur (x, y) = LB (x, y) in (10). Alternatively, the semi-saturation constant g(x, y) in (3b) may be seen as a local adaptation constant, and can therefore be locally approximated with LB (x, y) [Ledda et al. 2004], or with any of the other filters mentioned above. As shown in Figure 7, the choice of semi-saturation constant shifts the curve horizontally such that its middle portion lies over a desirable range of values. In a local operator, this shift is determined by the values of a local neighborhood of pixels, and is thus different for each pixel. This leads to a potentially better compression mechanism than a constant value could afford. 5.5 Gradient-Domain Operators Local adaptation provides a measure of how different a pixel is from its immediate neighborhood. If a pixel is very different from its neighborhood, it typically needs to be attenuated more. Such a difference may also be expressed in terms of contrast, which could be 5.6 Lightness Perception The theory of lightness perception, provides a model for the perception of surface reflectances6. To cast this into a computational model, the image needs to be automatically decomposed into frameworks, i.e. regions of common illumination. As an example, the window in the right-most image of Figure 17 would constitute a separate framework from the remainder of the interior. The influence of each framework on the total lightness needs to be estimated, and the anchors within each framework must be computed [Krawczyk et al. 2004; Krawczyk et al. 2005; Krawczyk et al. 2006]. It is desirable to assign a probability to each pixel of belonging to a particular framework. This leaves the possibility of a pixel having non-zero participation in multiple frameworks, which is somewhat different from standard segmentation algorithms that assign a pixel to at most one segment. To compute frameworks and probabilities for each pixel, a standard K-means clustering algorithm may be applied. For each framework, the highest luminance rule may now be applied to find an anchor. This means that within a framework, the pixel with the highest luminance would determine how all the pixels in this framework are likely to be perceived. However, direct application of this rule may result in the selection of a luminance value of a patch that is perceived as self-luminous. As the anchor should be the highest luminance value that is not perceived as self-luminous, 6 Lightness is defined as relative perceived surface reflectance. Figure 15: Per-channel gamma correction may desaturate the image. The images were desaturated with values of s = 0.2, s = 0.5, and s = 0.8. selection of the highest luminance value should be preceded by filtering the area of the local framework with a large Gaussian filter. The anchors for each framework are used to compute the net lightness of the full image. This then constitutes a computational model of lightness perception, which can be extended for the purpose of tone reproduction. One of the strengths of using a computational model of lightness perception for the purpose of tone reproduction, is that traditionally difficult phenomena such as the Gelb effect can be handled correctly. The Gelb effect manifests itself when the brightest part of a scene is placed next to an object that is even brighter. Whereas the formerly brightest object was perceived as white, after the change this object no longer appears white, but light gray. 6 Post-processing After tone reproduction, it is possible to apply several postprocessing steps to either improve the appearance of the image, adjust its saturation, or correct for the display device’s gamma. Here, we discuss two frequently applied techniques which have a relatively large impact on the overall appearance of the tone-mapped results. These are a technique to desaturate the results, and a technique to clamp a percentage of the lightest and darkest pixels. 6.1 Color in Tone Reproduction Tone reproduction operators normally compress luminance values, rather than work directly on the red, green and blue components of a color image. After these luminance values have been compressed into display values Ld (x, y), a color image may be reconstructed by keeping the ratios between color channels the same as before compression (using s = 1) [Schlick 1994]: „ «s Ir (x, y) Ir,d (x, y) = Ld (x, y) (17a) Lv (x, y) „ «s Ig (x, y) Ig,d (x, y) = Ld (x, y) (17b) Lv (x, y) «s „ Ib (x, y) Ib,d (x, y) = Ld (x, y) (17c) Lv (x, y) Alternatively, the saturation constant s may be chosen smaller than one. Such per-channel gamma correction may desaturate the results to an appropriate level, as shown in Figure 15 [Fattal et al. 2002]. The results of tone reproduction may sometimes appear unnatural, because human color perception is non-linear with respect to overall luminance level. If we view an image of a bright outdoors scene on a monitor in a dim environment, we are adapted to the dim environment rather than the outdoors lighting. By keeping color ratios constant, we do not take this effect into account. In addition, other phenomena such as for example the Stevens, Hunt, and Bezold-Brücke effects are not accounted for. The above approach should therefore be seen as a limited control to account for a complex phenomenon. A more comprehensive solution is to incorporate ideas from the field of color appearance modeling into tone reproduction operators [Pattanaik et al. 1998; Fairchild and Johnson 2004; Reinhard and Devlin 2005]. The iCAM image appearance model is the first color appearance model operating on images [Fairchild and Johnson 2002; Fairchild and Johnson 2004; Moroney and Tastl 2004]. It can also be used as a tone reproduction operator. It therefore constitutes an important trend towards the incorporation of color appearance modeling in dynamic range reduction, and vice-versa. A rudimentary form of color appearance modeling within a tone reproduction operator is afforded by the sigmoidal compression scheme outlined in Section 4.3 [Reinhard and Devlin 2005]. Nonetheless, we believe that further integration of tone reproduction and color appearance modeling is desirable, for the purpose of properly accounting for the differences in adaptation between scene and viewing environments. 6.2 Clamping A common post-process to tone reproduction is clamping. It is for instance part of the iCAM model, as well as the sub-band encoding scheme. Clamping is normally applied to both very dark as well as very light pixels. Rather than specify a hard threshold beyond which pixels are clamped, a better way is to specify a percentile of pixels which will be clamped. This gives better control over the final appearance of the image. By selecting a percentile of pixels to be clamped, inevitably detail will be lost in the dark and light areas of the image. However, the remainder of the luminance values is spread over a larger range, and this creates better detail visibility for large parts of the image. The percentage of pixels clamped varies usually between 1% and 5%, dependent on the image. The effect of clamping is shown in Figure 16. The image on the left shows the results without clamping, and therefore all pixels are within the display range. In the image on the right, the darkest 7% of the pixels were clamped, as well as the lightest 2% of the pixels. This has resulted in an image that has reduced visible detail in the steps of the amphi theater, as well as in the wall and the bushes in the background. However, the overall appearance of the image has improved, and the clamped image conveys the atmosphere of the environment better than the directly tone-mapped image. To preserve the appearance of the environment, the disadvantage of losing detail in the lightest and darkest areas is frequently outweighed by a better overall impression of brightness. The photograph shown in Figure 16 was taken during a very bright day, and this is not conveyed well in the unclamped images. Finally, the effect of clamping has a relatively large effect on the results. For typical applications it is an attractive proposition to add this technique to any tone reproduction operator. However, as only a few tone reproduction operators incorporate this feature as standard, it also clouds the ability to assess the quality of tone reproduction operators. The difference between operators appears to be of similar magnitude as the effect of clamping. 7 Mappings for HDR Displays The light emitted by an HDR display is, by definition, spread over a much larger range than we see on most current display devices. As a result, many physical scenes could be captured in HDR, and then displayed directly. As such, the need for tone reproduction will be removed for some images. However, this is not the case for all environments. If the image has a range much higher than Figure 16: Example of clamping. Both images were tone-mapped using photographic tone reproduction. The left image is not clamped, whereas 7% of the darkest pixels and 2% of the lightest pixels were clamped in the right image. the HDR display can handle, then non-linear compression schemes will continue to be a necessary pre-display step. As opposed to conventional displays, HDR display devices emit enough light to be sufficiently different from the average room lighting conditions. An important, yet poorly understood issue is that the human visual system (HVS) will adapt in part to the device and in part to the room environment. Such partial (or mixed) adaptation is notoriously difficult to model, and is certainly not a feature of current tone reproduction algorithms. A good tone reproduction algorithm for HDR display devices would probably have to account for the partial adaptation of the viewer. This would include all forms of adaptation, including for instance, chromatic adaptation However, other than this fundamental issue, the distinction between HDR and LDR displays is arbitrary. As such, we would argue that a good tone reproduction algorithm needs to be adaptable to any kind of display. Similarly, in the context of dynamic range management we see little difference between tone reproduction operators and inverse tone reproduction operators, even though the former is used for reducing the dynamic range of an image to match a display with a lower range, and the latter is used for expanding the range of an image to match the dynamic range of a display with a higher dynamic range. A good tone reproduction operator would be able to both compress the dynamic range of an image, as well as expand it. Nonetheless, there are other issues related to dynamic range expansion, which will have to be taken into account. These do not relate to the expansion of values per se, but are related to artifacts in the source material that may be amplified to become more visible. For instance, by expanding the luminance range of an image, the lossy compression applied to JPEG images may become visible. Second, the non-linear encoding of pixel values may, after expansion, lead to visible quantization artifacts. Finally, under- and over-exposed areas may require separate processing to salvage a lack of data in these regions [Wang et al. 2007]. Some solutions for these problems have been proposed. For instance, Banterle et al invert the photographic operator for dynamic range expansion [Banterle et al. 2006]. As this effectively results in an inverse sigmoid, this makes the implicit, but erroneous assumption that the input image is given in units which correspond to either photo-receptor output or some perceptual quantity. Blocky ar- tifacts, for instance those arising from JPEG encoding, are avoided by determining pixels belonging to light sources in the input image, and applying a different interpolation scheme for those pixels. Rempel et al have found that linear up-scaling can typically be performed up until a contrast of 5000 : 1 before the image takes on an unnatural appearance [Rempel et al. 2007]. Their solution is therefore to rely predominantly on linear up-scaling, which is consistent with the finding that, at least for relatively short exposure times, humans prefer to view linear up-scaled image over nonlinearly scaled images [Akyüz et al. 2007]. The appearance of artifacts is minimized by application of noise filtering and quantization reduction, through the use of a bilateral filter. Pixel encodings of 235 or higher in video formats are normally assumed to indicate light sources or highlights. Those pixels can be enhanced separately. Alternatively, highlights could be detected with a dedicated algorithm, before being scaled separately from the remainder of the image [Meylan et al. 2007]. In summary, the main problems associated with displaying conventional images on high dynamic range display devices, revolve around avoiding the visibility of artifacts in the capture and encoding of legacy content. So far, simple functions have proved adequate for dynamic range expansion, although we would not draw the conclusion that this would be the case for all images and all display conditions. 8 Tone Reproduction and Inverse Tone Reproduction Many tone reproduction operators are modeled after some aspects of human vision. The computed display values therefore essentially represent perceived quantities, for instance brightness if the tone reproduction operator is based on a model of brightness perception. If we assume that the model is an accurate representation of some aspect of the HVS, then displaying the image and observing it will cause the HVS to interpret these perceived values as luminance values. The HVS thus applies a second perceptual transform on top of the one applied by the algorithm. This is formally incorrect. A good tone reproduction operator should follow the same common practice as employed in color appearance modeling, and apply both a forward and a reverse transform [Fairchild 1998]. The forward transform can be any algorithm thought to be effective at compressing luminance values. The reverse transform will then apply the algorithm in reverse, but with display parameters inserted. This approach compresses luminance values into perceived values, while the reverse algorithm will convert the perceived values back into luminance values. Tumblin and Rushmeier’s approach correctly takes this approach, as does the Multi-Scale Observer model [Pattanaik et al. 1998], all color appearance models and gradient-domain operators, and the aforementioned sub-band system. However, several perceptually-based operators are applied only in forward mode, including the photographic operator7 and the sigmoidal operator inspired by photo-receptor physiology [Reinhard and Devlin 2005]. While these operators are known to produce visually plausible results, we note that they are effectively not producing display luminances, but brightnesses or other equivalent perceptual attributes. 7 Although this operator can be explained as based on photographic principles, it’s underlying model is a perceptual model of brightness perception [Blommaert and Martens 1990]. 8.1 Sigmoids Revisited Here we discuss the implications of adding a reverse step to sigmoidal compression. Recall that equation (3b) can be rewritten as: V (x, y) = Ln v (x, y) n Lv (x, y) + gn (x, y) (18) where V is a perceived value (for instance a voltage, if this equation is thought of as a simple model of photoreceptor physiology). The function g continues to return either a globally or locally computed adaptation value, which is based on the image values. To convert these perceived values back to luminance values, this equation then needs to be inverted, whereby g is replaced with a display adaptation value (see also Section 4.1). For instance, we could try to replace g with the mean display luminance Ld,mean . The other user parameter in this model is the exponent n, which for the reverse model we will replace with a display related exponent m. By making these substitutions, we have replaced all the image-related user parameters (n and g) with their display-related equivalents (m and Ld,mean ). The resulting inverse equation, computing display values Ld from previously computed perceived values V is then: Ld (x, y) = „ V (x, y) Lm d,mean 1 − V (x, y) «1/m (19) Figure 17: Forward and reverse model with n = 0.7 and k = 0.3 for an HDR image with a relatively modest dynamic range 2.8 log units (left) and an image with a much higher dynamic range (right; n = 0.7, k = 0.08, 8.3 log units dynamic range. Figure 18: Forward model only with n = 0.7 and k = 0.3 (left) and n = 0.7, k = 0.08 (right). For a conventional display, we would set Ld,mean to 128. The exponent m is also a display related parameter and determines how display values are spread around the mean display luminance. For low dynamic range display devices, this value can be set to 1, thereby simplifying the above equation to: Ld (x, y) = V (x, y) Ld,mean 1 − V (x, y) (20) The computation of display values is now driven entirely by the mean luminance of the image (through the computation of g), the mean display luminance Ld,mean , as well as the exponent n which specifies how large a range of values around the mean image luminance will be visualized. As a result, the inverse transform may create display values that are outside the display range. These will have to be clamped. As for most display devices the peak luminance Ld,max as well as the black level Ld,min are known, it is attractive to use these natural boundaries to clamp the display values against. This is arguably a more natural choice than clamping against a percentile, as discussed in Section 6.2. Given that we will gamma correct the image afterwards, we may assume that the display range of Ld is linear. As such, we can now compute the mean display luminance as the average of the display’s black level and peak luminance: Ld,mean = 1 (Ld,min + Ld,max ) 2 (21) As there is no good theoretical ground for choosing any specific value for the exponent m, we will set this parameter to 1 for now. As a result, all display related parameters are fixed, leaving only the image-dependent parameters n as well as the choice of semisaturation constant g(). For our example, we will follow Reinhard et al, and set g() = L̄v /k, where the user parameter k determines how overall light or dark the image should be reproduced (see Section 4.3) [Reinhard et al. 2002]. The exponent n can be thought of as a measure of how much contrast there is in the image. Two results using visually determined optimal parameter settings are shown in Figure 17. The display settings are those for an average display with an assumed black level of 1cd/m2 , and a peak luminance of 300cd/m2 . As a consequence Ld,mean was set to 150.5. Figure 19: Photographic operator with key k = 0.3 (left) and k = 0.08 (right). By using the photographic operator, we have effectively changed the exponent to n = 1. The left image is reproduced in a satisfactory manner. However, the amount of burn-out that has occurred in the window of the right image is too much. However, it is difficult to find a good trade-off between having less burn-out, and introducing other artifacts with this method. For comparison, we show the forward only transform with otherwise identical parameter settings in Figure 18. Note that the left image looks more flat now, largely because the exponent n is now no longer optimal. The window in the right image now appears more correct, as the brown glass panels are now clearly visible. We also show the output of the photographic operator for the same pair of images in Figure 19. The exponent n is effectively set to 1, but the key value is the same as in the previous figures. Although these images are computed with a forward transform only, their visual appearance remains closer to the real environment than the images in Figure 17. Finally, it is desirable that a tone reproduction operator does not alter an image that is already within the display range [DiCarlo and Wandell 2000]. In the model proposed here this is implicitly achieved, as for n = m and g = Ld,mean , the reverse transform is the true inverse of the forward transform. This is borne out in the CIE94 color difference metric, which is uniformly 0 for all pixels after running the algorithm twice. 8.2 Combined Forward/Reverse Sigmoids Although applying both a forward and a reverse transform is formally the correct approach to tone reproduction, there is thus a problem for images with a very high dynamic range. For such images it is difficult, if not impossible, to find parameter settings that lead to an acceptable compression. To see why this is, we can plug the forward transform of (3b) into the inverse (19): 11/m Ln ` ´n Lm d,mean B C + L̄/k C Ld = B @ A Ln ´n 1− n ` L + L̄/k 0 Ln Ln/m Ld,mean = ` ´n/m L̄/k = c Ln/m where c= ` Ld,mean ´n/m L̄/k (22) (23) (24) (25) is a constant. Of course, this is essentially the same result as was obtained by matching image and display brightnesses in Tumblin and Rushmeier’s brightness matching operator (see Section 4.1). Hence, applying a sigmoid in forward and reverse mode amounts to applying a power function. In our experience, this approach works very well in cases where a medium amount of compression is required. For instance, a medium dynamic range image can be effectively tone-mapped for display on a low dynamic range display device. Alternatively, it should be possible to tone-map most high dynamic range images for display on high dynamic range display devices using this technique. However, for high compression ratios, a different approach would be required. A direct consequence is that we predict that color appearance models such as CIECAM02 cannot be extended to transform data over large ranges. It is well-known that CIECAM02 was never intended for transforming between significantly different display conditions. However, this can be attributed to the fact that the psychophysical data on which this model is based, was gathered over a limited dynamic range. The above findings suggest that in addition, extension of CIECAM02 to accommodate large compression ratios would require a different functional form. Whether the inclusion of spatial processing, such as a spatially varying semi-saturation constant, yields more satisfactory results remains to be seen. As can be understood from (23), replacing g() = L¯v /k with a spatially varying function means that each pixel is divided by a spatially determined denominator. Such an approach was pioneered in Chiu et al’s early work [Chiu et al. 1993], and has been shown to be prone to haloing artifacts (see Section 3 and Figure 4). To minimize the occurrence of halos in such a scheme, the size of the averaging kernel used to compute g() must be chosen to be very large; typically a substantial fraction of the whole image. But in the limit that the filter kernel becomes the whole image, this means that each pixel is divided by the same value, resulting in a spatially invariant operator. 9 Discussion Tone reproduction for low dynamic range display devices is nowadays a reasonably well understood problem. The majority of images can be compressed well enough for applications in photography and entertainment, and any other applications that do not critically depend on accuracy. Recent validation studies show that some algorithms perform well over a range of different tasks and displayed material [Drago et al. 2002; Kuang et al. 2004; Ledda et al. 2005; Yoshida et al. 2005; Yoshida et al. 2006; Ashikhmin and Goral 2007]. When dealing with different displays, each having their own dynamic range, it becomes more important to consider tone reproduction operators that can be parameterized for both different types of images and different types of display. Following common practice in color appearance modeling, we have argued that both a forward and a reverse transform are necessary. However, we have identified a disconnect between theory and practice. In particular, a reverse operation should follow a forward transform to convert perceived values back to values amenable for interpretation as luminances. If we apply this to the class of sigmoidal functions, of which color appearance models form a part, then we effectively reduce the compressive function to a form of gamma correction. It is more difficult to produce visually plausible results this way, as less control can be exercised over the trade-off between burn-out, contrast, and visual appearance. To solve this problem, we would either have to find an alternative reasoning whereby the inverse model does not have to be applied, or instead develop new tone reproduction operators which both include a forward and reverse model and produce visually plausible and controllable results. The backward step would have the additional benefit of being adaptable to any type of display, including high dynamic range display devices. We anticipate that this implies that such operators would obviate the need for dedicated inverse tone reproduction operators, although image processing to counter the effects of quantization, spatial compression, as well as possible exposure artifacts will remain necessary. References A KY ÜZ, A. O., R EINHARD , E., F LEM ING , R., R IEC KE, B., AND B ÜLTHOFF , H. 2007. Do HDR displays support LDR content? A psychophysical evaluation. ACM Transactions on Graphics 26, 3. A SHIKHMIN , M., AND G OR AL , J. 2007. A reality check for tone mapping operators. ACM Transactions on Applied Perception 4, 1. A SHIKHMIN , M. 2002. A tone mapping algorithm for high contrast images. In Proceedings of 13th Eurographics Workshop on Rendering, 145–155. BANTERLE, F., L EDDA , P., D EBATTISTA , K., AND C HALM ERS , A. 2006. Inverse tone mapping. In GRAPHITE ’06: Proceedings of the 4th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia, 349–356. B LOM MAERT, F. J. J., AND M ARTENS , J.-B. 1990. An object-oriented model for brightness perception. Spatial Vision 5, 1, 15–41. C HIU , K., H ER F, M., S HIR LEY, P., S WAM Y, S., WANG , C., AND Z IM MERM AN , K. 1993. Spatially nonuniform scaling functions for high contrast images. In Proceedings of Graphics Interface ’93, 245–253. C HOUDHURY, P., AND T UM BLIN , J. 2003. The trilateral filter for high contrast images and meshes. In Proceedings of the Eurographics Symposium on Rendering, 186–196. C OM ANICIU , D., AND M EER , P. 2002. Mean shift: a robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24, 5, 603–619. D I C ARLO , J. M., AND WANDELL , B. A. 2000. Rendering high dynamic range images. In Proceedings of the SPIE Electronic Imaging 2000 conference, vol. 3965, 392–401. D R AGO , F., M ARTENS , W. L., M YSZKOWSKI , K., AND S EIDEL , H.-P. 2002. Perceptual evaluation of tone mapping operators with regard to similarity and preference. Tech. Rep. MPI-I-2002-4-002, Max Plank Institut für Informatik. D R AGO , F., M YSZKOWSKI , K., A NNEN , T., AND C HIBA , N. 2003. Adaptive logarithmic mapping for displaying high contrast scenes. Computer Graphics Forum 22, 3. D UAN , J., Q IU , G., AND C HEN , M. 2005. Comprehensive fast tone mapping for high dynamic range image visualization. In Proceedings of Pacific Graphics. D UR AND , F., AND D OR SEY, J. 2002. Fast bilateral filtering for the display of highdynamic-range images. ACM Transactions on Graphics 21, 3, 257–266. FAIR CHILD , M. D., AND J OHNSON , G. M. 2002. Meet iCAM: an image color appearance model. In IS&T/SID 10th Color Imaging Conference, 33–38. FAIR CHILD , M. D., AND J OHNSON , G. M. 2004. The iCAM framework for image appearance, image differences, and image quality. Journal of Electronic Imaging. FAIR CHILD , M. D. 1998. Color appearance models. Addison-Wesley, Reading, MA. FATTAL , R., L ISC HINSKI , D., AND W ER MAN , M. 2002. Gradient domain high dynamic range compression. ACM Transactions on Graphics 21, 3, 249–256. F ERWERDA , J. A., PATTANAIK , S., S HIR LEY, P., AND G R EENBERG , D. P. 1996. A model of visual adaptation for realistic image synthesis. In SIGGRAPH 96 Conference Proceedings, 249–258. F ERWERDA , J. A. 2001. Elements of early vision for computer graphics. IEEE Computer Graphics and Applications 21, 5, 22–33. H OOD , D. C., AND F INKELSTEIN , M. A. 1979. Comparison of changes in sensitivity and sensation: implications for the response-intensity function of the human photopic system. Journal of Experimental Psychology: Human Perceptual Performance 5, 3, 391–405. H OOD , D. C., F INKELSTEIN , M. A., AND B UC KINGHAM , E. 1979. Psychophysical tests of models of the response function. Vision Research 19, 4, 401–406. J OB SON , D. J., R AHM AN , Z., AND W OODELL, G. A. 1995. Retinex image processing: Improved fidelity to direct visual observation. In Proceedings of the IS&T Fourth Color Imaging Conference: Color Science, Systems, and Applications, vol. 4, 124–125. K LEINSCHMIDT, J., AND D OWLING , J. E. 1975. Intracellular recordings from gecko photoreceptors during light and dark adaptation. Journal of General Physiology 66, 5, 617–648. K R AWCZYK , G., M ANTIUK , R., M YSZKOWSKI , K., AND S EIDEL , H.-P. 2004. Lightness perception inspired tone mapping. In Proceedings of the 1st ACM Symposium on Applied Perception in Graphics and Visualization, 172. PATTANAIK , S. N., AND Y EE , H. 2002. Adaptive gain control for high dynamic range image display. In Proceedings of Spring Conference in Computer Graphics (SCCG2002), 24–27. PATTANAIK , S. N., F ERWERDA , J. A., FAIR CHILD , M. D., AND G R EENBERG , D. P. 1998. A multiscale model of adaptation and spatial vision for realistic image display. In SIGGRAPH 98 Conference Proceedings, 287–298. P R ESS , W. H., T EUKOLSKY, S. A., V ETTER LING , W. T., AND F LANNERY, B. P. 1992. Numerical Recipes in C: The Art of Scientific Computing, 2nd ed. Cambridge University Press. R AHM AN , Z., J OB SON , D. J., AND W OODELL , G. A. 1996. A multiscale retinex for color rendition and dynamic range compression. In SPIE Proceedings: Applications of Digital Image Processing XIX, vol. 2847. R AHM AN , Z., W OODELL , G. A., AND J OB SON , D. J. 1997. A comparison of the multiscale retinex with other image enhancement techniques. In IS&T’s 50th Annual Conference: A Celebration of All Imaging, vol. 50, 426–431. R EINHARD , E., AND D EVLIN , K. 2005. Dynamic range reduction inspired by photoreceptor physiology. IEEE Transactions on Visualization and Computer Graphics 11, 1, 13–24. R EINHARD , E., S TAR K , M., S HIR LEY, P., AND F ERWERDA , J. 2002. Photographic tone reproduction for digital images. ACM Transactions on Graphics 21, 3, 267– 276. R EINHARD , E., WAR D , G., PATTANAIK , S., AND D EB EVEC , P. 2005. High Dynamic Range Imaging: Acquisition, Display and Image-Based Lighting. Morgan Kaufmann Publishers, San Francisco. R EINHARD , E. 2003. Parameter estimation for photographic tone reproduction. Journal of Graphics Tools 7, 1, 45–51. R EINHARD , E. 2007. Overview of dynamic range reduction. In The International Symposium of the Society for Information Display (SID 2007). R EM PEL , A., T R ENTACOSTE, M., S EETZEN , H., Y OUNG , D., H EIDRICH , W., W HITEHEAD , L., AND WAR D , G. 2007. On-the-fly reverse tone mapping of legacy video and photographics. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2007) 26, 3. S C HLICK , C. 1994. Quantization techniques for the visualization of high dynamic range pictures. In Photorealistic Rendering Techniques, Springer-Verlag, Berlin, P. Shirley, G. Sakas, and S. Müller, Eds., 7–20. K R AWCZYK , G., M YSZKOWSKI , K., AND S EIDEL , H.-P. 2005. Lightness perception in tone reproduction for high dynamic range images. Computer Graphics Forum 24, 3, 635–645. S EETZEN , H., W HITEHEAD , L. A., AND WAR D , G. 2003. A high dynamic range display using low and high resolution modulators. In The Society for Information Display International Symposium. K R AWCZYK , G., M YSZKOWSKI , K., AND S EIDEL , H.-P. 2006. Computational model of lightness perception in high dynamic range imaging. In Proceedings of IS&T/SPIE Human Vision and Electronic Imaging, vol. 6057. S EETZEN , H., H EIDRICH , W., S TUER ZLINGER , W., WAR D , G., W HITEHEAD , L., T R ENTACOSTE , M., G HOSH , A., AND V OROZCOVS , A. 2004. High dynamic range display systems. ACM Transactions on Graphics 23, 3, 760–768. K UANG , J., YAM AGUCHI , H., J OHNSON , G. M., AND FAIR CHILD , M. D. 2004. Testing HDR image rendering algorithms. In Proceedings of IS&T/SID 12th Color Imaging Conference, 315–320. S PILLMANN , L., AND W ER NER , J. S., Eds. 1990. Visual perception: the neurological foundations. Academic Press, San Diego. L EDDA , P., S ANTOS , L.-P., AND C HALMERS , A. 2004. A local model of eye adaptation for high dynamic range images. In Proceedings of ACM Afrigraph, 151–160. L EDDA , P., C HALMERS , A., T ROSCIANKO , T., AND S EETZEN , H. 2005. Evaluation of tone mapping operators using a high dynamic range display. ACM Transactions on Graphics 24, 3, 640–648. L I , Y., S HAR AN , L., AND A DELSON , E. H. 2005. Compressing and companding high dynamic range images with subband architectures. ACM Transactions on Graphics 24, 3, 836–844. M EYLAN , L., D ALY, S., AND S ÜSSTRUNK , S. 2007. Tone mapping for high dynamic range displays. In Proceedings of IS&T/SPIE Electronic Imaging: Human Vision and Electronic Imaging XII, vol. 6492. M ILLER , N. J., N GAI , P. Y., AND M ILLER , D. D. 1984. The application of computer graphics in lighting design. Journal of the IES 14, 6–26. T OM ASI , C., AND M ANDUCHI , R. 1998. Bilateral filtering for gray and color images. In Proceedings of the IEEE International Conference on Computer Vision, 836– 846. T UM BLIN , J., AND RUSHMEIER , H. 1993. Tone reproduction for computer generated images. IEEE Computer Graphics and Applications 13, 6, 42–48. T UM BLIN , J., AND T UR K , G. 1999. LCIS: A boundary hierarchy for detailpreserving contrast reduction. In SIGGRAPH 1999, Computer Graphics Proceedings, A. Rockwood, Ed., 83–90. WANG , L., W EI , L.-Y., Z HOU , K., G UO , B., AND S HUM , H.-Y. 2007. High dynamic range image hallucination. In Eurographics Symposium on Rendering. WAR D , G., RUSHMEIER , H., AND P IATKO , C. 1997. A visibility matching tone reproduction operator for high dynamic range scenes. IEEE Transactions on Visualization and Computer Graphics 3, 4. WAR D , G. 1994. A contrast-based scalefactor for luminance display. In Graphics Gems IV, P. Heckbert, Ed. Academic Press, Boston, 415–421. M ORONEY, N., AND TASTL, I. 2004. A comparison of retinex and iCAM for scene rendering. Journal of Electronic Imaging 13, 1. W EISS , B. 2006. Fast median and bilateral filtering. ACM Transactions on Graphics 25, 3. N AKA , K. I., AND RUSHTON , W. A. H. 1966. S-potentials from luminosity units in the retina of fish (cyprinidae). Journal of Physiology (London) 185, 3, 587–599. Y OSHIDA , A., B LANZ , V., M YSZKOWSKI , K., AND S EIDEL, H.-P. 2005. Perceptual evaluation of tone mapping operators with real-world scenes. In Proceedings of SPIE Human Vision and Electronic Imaging X, vol. 5666, 192–203. O PPENHEIM , A. V., S C HAFER , R., AND S TOC KHAM , T. 1968. Nonlinear filtering of multiplied and convolved signals. Proceedings of the IEEE 56, 8, 1264–1291. PAR IS , S., AND D UR AND , F. 2006. A fast approximation of the bilateral filter using a signal processing approach. In European Conference on Computer Vision. Y OSHIDA , A., M ANTIUK , R., M YSZKOWSKI , K., AND S EIDEL , H.-P. 2006. Analysis of reproducing real-world appearance on displays of varying dynamic range. Computer Graphics Forum 25, 3, XXX.