JULY/AUGUST 1998 Volume 42 • Number 4 The Journal of IMAGING SCIENCE and TECHNOLOGY IS&T The Society for Imaging Science and Technology EDITORIAL STAFF M.R.V. Sahyun, Editor IS&T 7003 Kilworth Lane, Springfield, VA 22151 715-836-4175 E-mail: sahyunm@uwec.edu FAX: 703-642-9094 Pamela Forness, Managing Editor The Society for Imaging Science and Technology 7003 Kilworth Lane, Springfield, VA 22151 703-642-9090; FAX: 703-642-9094 E-mail: pam@imaging.org Vivian Walworth, Editor Emeritus Martin Idelson, Associate Editor Eric Hanson, Associate Editor David R. Whitcomb, Associate Editor Michael M. Shahin, Associate Editor Mark Spitler, Associate Editor David S. Weiss, Associate Editor This publication is available in microform. Papers published in this journal are covered in BECITM, INSPEC, Chemical Abstracts, and Science Citation Index. Address remittances, orders for subscriptions and single copies, claims for missing numbers, and notices of change of address to IS&T at 7003 Kilworth Lane, Springfield, VA 22151. 703-642-9090; FAX: 703-6429094; E-mail: info@imaging.org. The Society is not responsible for the accuracy of statements made by authors and does not necessarily subscribe to their views. Copyright © 1998, The Society for Imaging Science and Technology. Copying of materials in this journal for internal or personal use, or the internal or personal use of specific clients, beyond the fair use provisions granted by the U.S. Copyright Law is authorized by IS&T subject to payment of copying fees. The Transactional Reporting Service base fee for this journal should be paid directly to the Copyright Clearance Center (CCC), Customer Service, (508) 750-8400, 222 Rosewood Drive, Danvers, MA 01923 or check CCC Online at http://www.copyright.com. Other copying for republication, resale, advertising or promotion, or any form of systematic or multiple reproduction of any material in this journal is prohibited except with permission of the publisher. Library of Congress Catalog Card No. 59-52172 Printed in the U.S.A. Guide for Authors Scope. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY is dedicated to the advancement of knowledge in the imaging sciences, in practical applications of such knowledge, and in related fields of study. The pages of this journal are open to reports of new theoretical or experimental results and to comprehensive reviews. Submission of Manuscripts. Send manuscripts to Pamela Forness, Managing Editor, IS&T, 7003 Kilworth Lane, Springfield, VA 22151. Submit only original manuscripts not previously published and not currently submitted for publication elsewhere. Prior publication does not refer to conference abstracts, paper summaries or non-reviewed proceedings. Editorial Process. All manuscripts submitted are subject to peer review. If a manuscript appears better suited to publication in the JOURNAL OF ELECTRONIC IMAGING, published jointly by IS&T and SPIE, the editor will make that recommendation to both the editor of that journal and the author. The author will receive confirmation, reviewers' reports, notification of acceptance (or rejection), and tentative date of publication from the Editor. The author will receive page proofs and further instructions directly from IS&T. All subsequent correspondence about the paper should be addressed to the Managing Editor. Manuscript Preparation. Manuscript should be typed or printed double-spaced, with all pages numbered. The original manuscript and three duplicates are required, with a set of illustrations to accompany each copy. 
The illustrations included with the original manuscript must be of quality suitable for reproduction or available in digital form. Legible copies of illustrations may be submitted with the duplicate manuscripts. Title and Abstract Pages. Include on the title page, page one, the address and affiliation of each author. Include on page two an abstract of no more than 200 words stating briefly the objectives, methods, results, and conclusions. Style. The journal will generally follow the style specified in the AIP Style Manual, published by the American Institute of Physics. Equations. Number equations consecutively, with Arabic numbers in parentheses at the right margin. Illustrations. Number all figures consecutively and type captions double-spaced on a separate page or pages. Figures should be presented in such form that they will remain legible when reduced, usually to single column width (3.3 in., 8.4 cm). Recommended font for figure labels is Helvetica, sized to appear as 8-point type after reduction. Recommended size for original art is 1-2 times final size. Lines must be at least 1 point. Color. Authors may either submit color separations or be billed by the publisher for the cost of their preparation. Digital submission of color figures should be in CMYK EPS or TIFF format if possible. Additional costs associated with reproduction of color illustrations will be charged to the author or the author’s supporting institution. References. Number references sequentially as the citations appear in superscript form in the text. Type references on pages separate from the text pages, using the following format: Journal papers: Author(s) (first and middle initials, last name), title of article (optional), journal name (in italics), volume (bold), first page number, year (in parentheses). Books: Author(s) (first and middle initials, last name), title (in italics), publisher, city, year, page reference. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY (ISSN:1062-3701) is published bimonthly by The Society for Imaging Science and Technology, 7003 Kilworth Lane, Springfield, VA 22151. Periodicals postage paid at Springfield, VA and at additional mailing offices. Printed at Imperial Printing Company, Saint Joseph, Michigan. Society members may receive this journal as part of their membership. Thirty dollars ($30.00) of the membership dues is allocated to this subscription. IS&T members may refuse this subscription by written request. Subscriptions to domestic non-members of the Society, $120.00 per year; single copies, $25.00 each. Foreign subscription rate, US $135.00 per year. POSTMASTER: Send address changes to JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY, 7003 Kilworth Lane, Springfield, VA 22151. Examples: 1. G. K. Starkweather, Printing technologies for images, gray scale, and color, Proc. SPIE 1458, 120 (1991). 2. E. M. Williams, The Physics and Technology of Xerographic Processes, John Wiley and Sons, New York, 1984, p. 30. Page charges. To help support the expenses of publication the author’s institution will be billed at $80 per printed page. Payment is expected from the sponsoring institution, not from the author. Such payment is not a condition for publication, and in appropriate circumstances page charges are waived. Requests for waivers must be made in writing to the managing editor. Additional charges. The author will be held responsible for the cost of extensive alterations after typesetting and for the costs of materials if the paper is withdrawn after typesetting. Manuscripts in Digital Form. 
Following acceptance of a manuscript, the author will be encouraged to submit the final version of the text and illustrations both as hardcopy and as digital files on diskette. The managing editor will provide specifications for preparing and submitting digital files. Vol. 42, No. 4, July/Aug. 1998 i Calendar IS&T Meetings September 7–11, 1998—International Congress on Imaging Science; Univ. of Antwerp, Belgium. Organized by: Int’l. Committee on the Science of Photography (ICSP) and Royal Flemish Chemical Society (KVCV); IS&T cooperating society. Contact: Jan De Roeck, c/o AgfaGevaert N.V. +32-3444 30 42; Fax: 32-3444 76 97; Web: http://www. ICPS98.be; or E-mail: dekeyzer@ twi.agfa.be October 18–23, 1998—NIP14: The l4th International Conference on Digital Printing Technologies; General Chair: David Dreyfuss, Westin Harbour Square - Toronto; Toronto, Ontario, Canada November 17–20, 1998—IS&T/ SID’s 6th Color Imaging Conference–Color Science, Systems & Applications, General Co-chairs: Sabine Susstrunk (IS&T) and Andras Lakatos (SID); The SunBurst Resort Hotel, Scottsdale, Arizona January 23–29, 1999—IS&T/SPIE Electronic Imaging: Science and Technology, General Co-chairs: Jan P. Allebach (IS&T) and Richard N. Ellson (SPIE); San Jose Convention Center, San Jose, CA April 25–28, 1999—The PIC Conference (IS&T’s 52nd Annual Spring Conference), General Chair: Shin Ohno; Hyatt Regency Hotel, Savannah, Georgia October 17–22, 1999—NIP15: The l5th International Congress on Digital Printing Technologies, The Caribe Royal Resort Suites, Lake Buena Vista, Florida November 16–19, 1999—7th Color Imaging Conference - Color Science, Systems & Applications, cosponsored by the Society for Information Display; The SunBurst Resort Hotel, Scottsdale, Arizona For more details, contact IS&T at 703-642-9090; FAX: 703-642-9094; E-mail: info@imaging.org; or visit us at http://www.imaging.org ii Journal of Imaging Science and Technology CODEN: JIMTEG 42(4) 295–380 (1998) ISSN: 1062-3701 July/August 1998 Volume 42, Number 4 Journal of IMAGING SCIENCE and TECHNOLOGY Official publication of IS&T—The Society for Imaging Science and Technology Contents Special Section: 3-D Imaging vi From the Guest Editor Vivian K. Walworth 295 Photography in the Service of Stereoscopy Samuel Kitrosser 300 Advancements in 3-D Stereoscopic Display Technologies: Micropolarizers, Improved LC Shutters, Spectral Multiplexing, and CLC Inks Leonard Cardillo, David Swift, and John Merritt 307 Full-color 3-D Prints and Transparencies J. J. Scarpetti, P. M. DuBois, R. M. Friedhoff, and V. K. Walworth 311 Stereo Matching by using a Weighted Minimum Description of Length Method Based on the Summation of Squared Differences Method Nobuhito Matsushiro and Kazuyo Kurabayashi 319 Diffuse Illumination as a Default Assumption for Shape-From-Shading in the Absence of Shadows Christopher W. Tyler 325 3-D Shape Recovery from Color Information for a Non-Lambertian Surface Wen Biao Jiang, Hai Yuan Wu and Tadayoshi Shioyama General Papers 331 Optical Effects of Ink Spread and Penetration on Halftones Printed by Thermal Ink Jet J. S. Arney and Michael L. Alber Contents continued iv Journal of Imaging Science and Technology Contents continued 335 Modeling the Yule–Nielsen Effect on Color Halftones J. S. Arney, Tuo Wu and Christine Blehm 341 Optical Dot Gain: Lateral Scattering Probabilities Geoffrey L. 
Rogers 346 Diffuse Transmittance Spectroscopy Study of Reduction-sensitized and Hydrogen-hypersensitized AgBr Emulsion Coatings Yoshiaki Oku and Mitsuo Kawasaki 349 Silver Clusters of Photographic Interest III. Formation of Reduction-Sensitization Centers in Emulsion Layers on Storage and Mechanism for Stabilization by TAI Tadaaki Tani, Naotsugu Muro and Atsushi Matsunaga 355 A New Crystal Nucleation Theory for Continuous Precipitation of Silver Halides Ingo H. Leubner 364 Carrier Transport Properties in Polysilanes with Various Molecular Weights Tomomi Nakamura, Kunio Oka, Fuminobu Hori, Ryuichiro Oshima, Hiroyoshi Naito, and Takaaki Dohmaru 370 Edge Estimation and Restoration of Gaussian Degraded Images Ziya Telatar and Önder Tüzünalp DEPARTMENTS ii Calendar 375 Business Directory 376 IS&T Honors and Awards—Call for Nominations Vol. 42, No. 4, July/Aug. 1998 v From the Guest Editor This Special Section is devoted to topics that were addressed in the 3-D Session of the IS&T Annual Conference held in May of last year (1997), as well as in the Imaging Technologies part of the Celebration of Imaging held during the same Annual Conference. The Section includes a broad range of topics, from perception of depth in normal vision of natural scenes and objects to methods for creating the illusion of depth in the course of recording and viewing two-dimensional representations of images in depth. The 3-D Session was co-chaired by John Merritt and myself. It was a privilege to share this responsibility with John, who is well known as co-chairman of the annual Conference on Stereoscopic Displays and Applications held within the Electronic Imaging Symposium in San Jose under the cosponsorship of IS&T and SPIE. This year’s Conference was the ninth of the series. Binocular vision is a significant human trait that enables observers to perceive depth in the natural world. The brain processes the disparate information received by our two eyes to produce the perception of depth. We are assisted in this perception by many external clues, including perspective, motion, illumination, shading, and color information. Stereoscopic, or 3-D, imaging depends largely on translating the real-world depth of objects or scenes to twodimensional representations to be viewed with or without the aid of various viewing devices. The photographic representation of depth is as old as photography itself, and perception of depth was a subject of investigation well before the birth of photography. Sir Charles Wheatstone constructed his first mirror stereoscope in 1832, and it was he who is credited with coining the word stereoscope (Greek stereos, solid, skopein, to view). Both Wheatstone and his contemporary, Sir David Brewster, devised lenticular stereoscopic viewers. The introduction of the daguerreotype led to a surge of popular enthusiasm for stereoscopic daguerreotypes. By the late 1800s no Victorian parlor was complete without its Brewster–Holmes stereoscope and a selection of stereo vi Journal of Imaging Science and Technology cards comprising side-by-side left and right-eye images. Today such stereoscopes and stereo cards are popular collectors’ items. The twentieth century has brought us an enormous variety of stereoscopic imaging technologies, from 3-D motion pictures, made practical by Edwin Land’s sheet polarizers, to books of computer-generated autostereoscopic images. Amateur interest in stereophotography has been sparked by the introduction of a variety of stereo attachments and twin-lens stereoscopic cameras. 
Stereoscopic image pairs have been encoded by polarization, by color, by spatial separation, and by temporal separation. In addition to the sporadic waves of interest in one or another of these stereoscopic imaging techniques for entertainment purposes, there has been a steady increase in the applications of stereoscopy to technical and scientific imaging. Aerial reconnaissance, molecular modeling, and medical imaging are familiar examples. With the rapid growth of 3-D capability in workstations and desktop computers, as well as the development of sophisticated instrumentation, we are seeing a surge of stereoscopic imaging in new fields. Design engineers, seismographers, microscopists, medical researchers, and oceanographers are finding 3-D image information indispensable. There are new technologies in both 3-D hardcopy and 3-D field-sequential LCD displays. Paralleling these activities is contemporary psychophysical research on just how the eye and brain cooperate to accomplish depth perception, both in the real world and in viewing two-dimensional representations. We offer in this Special Section a glimpse of both theoretical and practical aspects of 3-D depth perception, stereoscopic image capture, and stereoscopic image rendition. Each of the authors has presented new insights into this ever-intriguing branch of imaging science and technology. Vivian Walworth Guest Editor JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Photography in the Service of Stereoscopy* Samuel Kitrosser† Consultant, 23 Oakland St., Lexington, MA 02173 Our gift of binocular vision, the principles of stereoscopic viewing methods, and the creation of stereoscopic images are all closely related. Here we recapitulate the geometry connecting these fields and provide information that can assist in generating effective stereoscopic image pairs. Journal of Imaging Science and Technology 42: 295–300 (1998) Introduction Ever since its invention, photography has been a major source of image pairs to be used in a variety of stereoscopic viewing systems. The classical parlor stereograms, recent 3-D films at Disney theme parks, Imax 3-D presentations, and stereograms by scientists and engineers as well as by a vast number of amateur and professional stereographers rely largely on photographically produced images. The aim of the present work is to enable one to determine in advance the necessary conditions for creating stereoscopic image pairs having proper parallactic shift for comfortable viewing and at the same time having correct framing that does not require custom cropping. Those familiar with stereoscopic photography will recognize that we refer to the lens interaxial distance and to the interaxial spacing between the respective image apertures. General Considerations Principles of Stereoscopic Viewing Systems. The origins of present-day stereoscopy go back to the work of Sir Charles Wheatstone1 and Sir David Brewster2 in mid-nineteenth century England. Devices constructed by both of these inventors share the same basic concept. These devices introduce in front of the observer’s eyes a means of channeling the view seen by each eye along a separate path. At the far end of this path are located image pairs consisting of specially prepared art work or specially prepared photographs. The distinguishing characteristic of such an image pair is that they represent scenes or objects as seen from two different vantage points. 
In the process of stereoscopic viewing the two images are fused in the mind and merged to form a single 3-D illusion. Stereoscopic viewing devices not only display the images selectively to our eyes but also assist in superimpos- Original manuscript received April 1, 1997 * Presented in part at the IS&T 50th Annual Conference, Cambridge, MA, May 18–23, 1997. † IS&T Fellow © 1998, IS&T—The Society for Imaging Science and Technology ing the images, using optical means such as lenses, mirrors, and prisms. Some systems physically overlay the images and rely for selective channeling on other optical techniques, such as encoding with polarizers, with complementary color filters, or with multiplexing rasters. Some observers can dispense with viewing devices by “commanding” their eyes to merge a pair of side-by-side images; this technique is known as free vision. This type of viewing is very simple; however, its success still requires a correct pair of stereoscopic images. The Significance of Convergence and Accommodation in Binocular Vision and Stereoscopic Viewing. Normally we are accustomed to judging distances by subtle mental cues received from the muscular eye controls of convergence and accommodation. These two functions are coupled, and they operate in synchronism as we scan each detail of the scene observed. [Fig. 1(a)]. However, in stereoscopic viewing the tie between the accommodation and convergence functions of the observer’s eyes must be uncoupled. The viewing distance to the right- and left-eye images remains constant, whereas the merged scene details of the 3-D image appear at the intersections of the individual lines of sight, as shown in Fig. 1(b). Convergence is also important in the original capture of the stereoscopic image pair because it determines the framing of the subject matter within the apertures. One of the practical methods of achieving convergence is to allow the distance between the image apertures to be greater than the lens interaxial distance. The geometry and calculation of these distances are discussed in a later section. Awareness of the Stereoscopic Window. The stereoscopic window is a concept that is represented in practically every stereoscopic viewing system. If we look into an empty stereoscopic slide viewer we see that the image apertures appear as the boundaries of a single opening, the stereoscopic window. A similar effect is observed by looking at two blank cards through a Brewster–Holmes parlor stereoscope. Whereas the edges of the window do not participate in the creation of the 3-D image, they provide an enclosure for the 3-D image. Similarly, the edges of a stereo print form a border for the 3-D image perceived inside. 295 (a) (b) Figure 1. (a) Natural binocular vision, showing convergence at each of the three points A, B, and C; (b) viewing of a stereoscopic image pair, showing convergence at far (F), near (N), and very near (VN) points. The stereoscopic window also forms an interface between the domains of coupling and uncoupling mentioned in the previous section. Within the domain bounded by the window the accommodation of the eyes remains unchanged, locked to the distance between the observer’s eyes and the stereogram. In the case of lenticular stereoscopes the accommodation and focus are considerably relaxed, the focusing and convergence now being assisted by the lenses of the stereoscope. What is even more important is that the stereoscopic window establishes a plane of reference for the illusionary stereo image. 
The image may appear in front of the window, in back of the window, or both in front and in back. The placement depends on the creative imagination of the photographer in establishing the position of the 3-D image in relation to the stereoscopic window.

Experimentation

The studies that follow originated with work by Rule,3 who served as a consultant to Polaroid Corporation during the 1940s. Further experimental work was conducted by the author in the Polaroid Research Laboratories.4

The Geometry of Stereo Image Recording. Figure 2 is a perspective drawing of a typical camera setup, showing two lenses side-by-side, their image apertures, and the far and near points of the subject. To simplify the geometry, the far and near points, the center of the lens, and the center of the aperture for the left-eye camera are shown as collinear. Following is the key to the linear variables indicated:

D = distance from lens to far point of subject or scene
d = distance from lens to near point of subject or scene
f = distance from lens to film plane
p = parallax, the shift between furthest homologous points at the film plane
T1 = lens interaxial distance
T2 = center-to-center distance between image apertures
w = image aperture width.

Figure 2. A typical stereoscopic camera setup, showing two side-by-side lenses of focal length f, separated by the distance T1, and the two image apertures. The value T2 is the center-to-center distance between the camera apertures, N is the near point of the scene, and F the far point. The value D is the distance from the lens plane to the far point, d is the distance from the lens plane to the near point, and p is parallax.

The parallax, p, should be known to the stereographer from the specifications of the particular imaging format and the intended viewing conditions. The focal length of the lens can be used as the lens-to-film distance except in the case of close-ups, which require the actual lens-to-film distance. The values sought are T1 and T2.

Establishment of Parallax, p. Parallax as a linear dimension is established on the basis of limitations imposed by the uncoupling of the convergence and accommodation functions of our eyes during stereoscopic viewing. Because of the tendency to favor a level orientation during image capture, we often refer to parallax as the “horizontal displacement” or “horizontal parallax.” We find that the linear size of the horizontal displacement is proportional to and limited by the viewing distance to the stereogram. Conversely, the viewing distance is a function of the size or width of the stereo image. Here we come to a point where an exact mathematical model perhaps does not exist. However, a ratio of parallax to image width of 1:24 emerges from the experience and actual practice of professional stereographers. As an example, in the Stereo Realist format the image width is 24 mm and the recommended parallax is 1 to 1.2 mm. For the 6 × 13 cm format the recommended parallax is approximately 2.5 mm.5a,6 These specifications are stated as approximations because differences exist among observers and there is great forgiveness in the human visual system.

Calculation of T1. For convenience we construct the line “L” in Fig. 2. Then

p/L = f/d and L = pd/f.

Further,

T1/D = L/(D − d),

so that

T1 = DL/(D − d) = (p/f) × Dd/(D − d).    (1)

If D = ∞,

T1 = (p/f) × d.    (2)

Calculation of T2. The value of T2 can be determined from the same diagram:

T2/(d + f) = T1/d,

so that

T2 = T1 × (d + f)/d.    (3)

Note that T2 is only slightly larger than T1. In some cases the construction of the stereo camera prevents adjusting the T2 dimension—for example, on side-by-side 70-mm motion picture cameras—but still allows the adjustment of the interlens distance, T1. This adjustment is usually small enough to achieve the desired T1 and sufficient to provide the needed convergence between the camera units. An alternative is to use auxiliary optical devices.7a A few custom cameras built for close-up work provide laterally adjustable lens mounts that allow for independent adjustment of T2. A patent of Land, Batchelder, and Wolff describes a camera with fixed interaxial distance but with coupled convergence and focusing adjustments.8 This feature made it possible to maintain the stereo window at the near distance.

Calculation of D and d. To calculate D and d we rewrite Eq. 1 as follows:

p/(T1 f) = 1/d − 1/D.    (4)

Then, for a given value of D,

1/d = p/(T1 f) + 1/D and d = T1 f D/(pD + T1 f).    (5)

Similarly, for a given value of d,

1/D = 1/d − p/(T1 f) and D = T1 f d/(T1 f − pd).    (6)

The Design of a Lens Interaxial Calculator. To simplify and speed up the calculation of T1 we designed the Polaroid Interocular Calculator.9 The reciprocal form of Eq. 1, as shown in Eq. 4, enabled us to design this calculator in the form of a circular slide rule. An original calculator is shown in Fig. 4. A convenience of such a calculator is the rapid evaluation of any of the variables in cases where the equipment does not allow full control of needed adjustments. We provided several hundred of these calculators to stereo photographers here and abroad. The calculator has been cited in several publications on stereoscopic photography.7b,10

Figure 3. The Polaroid Interocular Calculator, a circular slide rule for determining appropriate lens interaxial distance, given the intended final width of the image to be viewed, the lens-film distance, the width of the negative image, and the near and far distances. (Reprinted with permission of Polaroid Corporation.)

Figure 4. The experimental 5˝ × 7˝ studio stereo camera.

Before going further in our discussion it is appropriate to note the compatibility of the above calculations with traditional rules and recommendations.

The 1/30 Rule. Ferwerda5b and other writers on stereoscopy11 often recommend an interaxial value of 1/30 the near distance. The origin of this rule lies in the assumptions that the subject extends to infinity and that the focal length of the lens is the conventional 1.25 times the image width. The rule is valid under these conditions and it is useful, but it does not apply to medium shots and close-ups. There it does not make full use of the available depth effect. Following is the rationale for the 1/30 rule. If we solve Eq. 2 for the interaxial distance, T1, given f = 1.25w and p = w/24,

T1 = (1/24) × (w/1.25w) × d = (1/30) × d.

In the case of cloud pictures from an airliner window, here is some arithmetic. At 800 km/h, the plane travels 220 m/s. A rapid sequence stereo pair a half-second apart would make T1 = 110 m. Multiplication by 30 gives a distance of about 3.3 km, a fair distance for pictures of cloud formations. Oblique shots may also be practical. At an altitude of 10,000 m, 1/30 is 333 m of travel, equivalent to about 1.3 s interval between shots.

Sidestep Landscapes and Aerial Photographs. Both Wing12 and Weiler13 provide guidance for producing stereo image pairs with a single handheld camera.
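Before turning to those single-camera techniques, the relations above are compact enough to collect into a few lines of code. The following Python sketch only illustrates Eqs. 1–6 and the 1/30 rule; it is not a reproduction of the Polaroid Interocular Calculator, and the function names and the numerical example (a Stereo Realist-like format with w = 24 mm, f = 1.25w = 30 mm, p = 1 mm) are assumptions chosen for illustration.

```python
# Stereo-base calculator: a minimal sketch of Eqs. 1-6 above.
# Units are arbitrary but must be consistent (mm throughout here).
import math

def lens_interaxial(p, f, D, d):
    """Eq. 1: T1 = (p/f) * D*d/(D - d); Eq. 2 is the limit D -> infinity."""
    if math.isinf(D):
        return p * d / f
    return (p / f) * (D * d) / (D - d)

def aperture_interaxial(T1, d, f):
    """Eq. 3: T2 = T1 * (d + f)/d."""
    return T1 * (d + f) / d

def near_distance(p, f, T1, D):
    """Eq. 5: d = T1*f*D / (p*D + T1*f), for a camera with fixed T1."""
    if math.isinf(D):
        return T1 * f / p
    return T1 * f * D / (p * D + T1 * f)

def far_distance(p, f, T1, d):
    """Eq. 6: D = T1*f*d / (T1*f - p*d); unbounded when T1*f <= p*d."""
    denom = T1 * f - p * d
    return math.inf if denom <= 0 else T1 * f * d / denom

# Example: near point at 2 m, far point at infinity.
T1 = lens_interaxial(p=1.0, f=30.0, D=math.inf, d=2000.0)   # ~66.7 mm
T2 = aperture_interaxial(T1, d=2000.0, f=30.0)              # ~67.7 mm
print(round(T1, 1), round(T2, 1), round(2000.0 / 30, 1))    # 1/30 rule also gives ~66.7 mm
```

For this example the sketch returns T1 of about 67 mm and T2 of about 68 mm, matching the d/30 value that the 1/30 rule predicts for the conventional focal length; with a fixed-base camera, Eqs. 5 and 6 can be run instead to find the usable near and far distances.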
In these cases we again assume a camera with a conventional focal length lens. For the sidestep method described by Wing, it may be useful to assume that a 1-ft sidestep is good for a near distance of 30 ft, 2 ft for 60 ft, and so on. 298 Journal of Imaging Science and Technology Stereoscopic Cameras. Stereoscopic cameras follow the pattern of our natural binocular vision by recording a pair of images from two different vantage points. The traditional amateur stereoscopic camera is a side-by-side twolens unit with adjustable focusing. The fixed lens interaxial distance is smaller than the aperture interaxial distance, resulting in a convergence that establishes the stereoscopic window at about 2 m from the camera. Using a camera with fixed lens interaxial we can still obtain correct parallax if we balance the subject composition within proper limits of far and near distances, D and d, as indicated by Eqs. 5 and 6. A number of stereo photographers use single or double cameras on a mechanical slide bar or have two cameras modified by permanent coupling. The greatest progress in stereo camera construction has been made in the field of professional motion picture equipment. Here we find single track, side-by-side format on 70-mm film and double-track 35-mm cameras, in both cases with elaborate optical and mechanical refinements.7c For much of our research at Polaroid we used stereoscopic image pairs made with cameras such as the splitfield 5˝ × 7˝ Speed Graphic and aerial reconnaissance photographs from government sources. A 5” x 7” Studio Stereo Camera. We also constructed an experimental studio-type stereo camera (Fig. 5). We coupled two 5˝ × 7˝ Burke James view cameras on a slidebar-type bed at 90° to one another, separated by a 45° semitransparent mirror. The large negatives and color transparencies were easy to evaluate, and both enlargements and reductions were made as needed. The lens interaxial distance could be adjusted from zero to about 20 cm, and the T2 distance could be adjusted by the “sliding back” mechanism. Two 30-cm Wollensak Velostigmats were operated simultaneously by twin cable releases, and the whole assembly balanced well on an Ansco studio tripod. Experimental Stereoscopic Images. We used the 5˝ × 7˝ camera and the Polaroid Interocular Calculator to generate Kitrosser a series of studio photographs that included long shots, medium shots, and close-ups. We also made photographs of tabletop subjects. We obtained very gratifying results confirming the validity and usefulness of a reference parallax-to-image width ratio of 1:24. We also found that a ratio of 1:50 was of limited practical use because it provided insufficient depth effect. A ratio of 1:100 gave still less parallax and approached the limits of stereo acuity. We concluded that the maximum parallax afforded by the ratio 1:24 was more satisfactory, even when it resulted in Lilliputism or other distortions. Our observations during recent years have been consistent with these findings. Discussion Geometric Characteristics of a Stereoscopic Image Pair. We have described the principles of stereoscopic viewing devices and the geometry of stereoscopic photography, and we have briefly discussed stereoscopic cameras and their use to achieve optimal stereoscopic effects. Now we can delineate the most important characteristics of a stereo image pair—first, the parallax, which will determine the visualized depth of the 3-D image, and second, the framing, which will locate the image within the 3-D space. 
Methods of Measurement and Evaluation. Original stereoscopic negatives or transparencies are usually too small for superimposition over a light table and must be compared under magnification. Parallax is measurable if we superimpose in register the homologous points of the nearest detail and measure the shift of the homologous points of the farthest detail. Assuming that the images are presented in correct orientation, the shift will be displayed in the right-eye image. We can also observe and evaluate the coincidence of the image borders, which will give a clue about the correct framing. Spatial Presentation. We can separate the presentation of the subject matter within the 3-D space into three groups: Group A, all of the image to appear beyond the stereo window; Group B, part of the image beyond the stereo window and part extending forward from the stereo window; and Group C, all of the image in front of the window . We could also add a subgroup to C for images that “float” in front of the window. Group A. This is the most conventional case. A majority of stereo cameras are constructed with this type of presentation in mind. To have all of the image appear beyond the stereo window and be comfortable to view, the recommended maximum parallax is approximately 1/24 of the width of the image, as discussed earlier. With stereo prints to be viewed directly and transparencies projected onto screens, the 1:24 ratio will allow a maximum width of 1.5 m, at which point the lateral shift on the screen will reach its limit of 65 mm. For very large screens viewed at appropriate distances, the eyes will tolerate greater separation. Correct framing is achieved when the near homologous points are in register and the right and left borders of the stereo image coincide. Group B. To obtain the effect of a partial image forward of the stereo window, we will need a total parallax greater than 1/24 of the image width. If we double this amount, we can use 1/24 for the portion behind the window and another 1/24 for the portion of the stereo image in front of the window. The borders of the image must coincide when the zero parallax homologous points are su- Photography in the Service of Stereoscopy perimposed in register. For viewing comfort it is also desirable that the subject matter be composed fully within the borders of the stereo window. When viewed on a 1.5-m screen, such an arrangement should bring the near part of the image halfway between the observer and the screen. In other presentations the image’s placement will vary according to the viewing distance. Group C. When the entire image is created in front of the stereoscopic window, the observer’s eyes will be in a convergent position. This situation seems to be more easily tolerated than looking at an image beyond the window, and the uncoupling of accommodation and convergence has a wider tolerance. In observing drawings of geometric solids that appear placed on a tabletop, a lateral shift of 1/10 of the image width is easy to accept. If the entire image is to appear in front of the stereo window, then the edges of the image pair should be in coincidence when the far homologous points are in register. Summary Properly applied, the tools we have described facilitate the generation of effective stereo images and permit the full enjoyment of stereoscopic presentations. The system may be summarized as follows: 1. 
Select the value of the parallax, p, according to the format of the camera and the requirements of the viewing method, including the choice of location of the stereo window.
2. Establish the near and far distances from lens to the subject or scene to be photographed.
3. Note the focal length of the taking lens or, for extreme close-ups, the lens-to-film distance.
4. Calculate T1, using Eq. 1 or using an interocular calculator such as the one shown in Fig. 4.
5. According to the location of the stereo window in relation to the reconstructed image, calculate T2, using Eq. 3.
6. Given a camera with fixed interaxial distance, calculate D and/or d to determine suitable stereo composition.

Alternative Methods for Achieving Parallax. Stereoscopic pairs generated by means other than paired camera exposures and intended for viewing by conventional stereo methods must follow similar guidelines for parallax and framing in the image output. The geometry of recording the stereo pairs will be specific to the technology of the image capture source.

Conclusion

Stereoscopic imaging provides an interface between the real-life scene and the creative presentation possibilities offered by various stereoscopic viewing methods. Parallax and convergence have been discussed as major contributing factors in the achievement of satisfactory stereoscopic results. We recognize the lack of suitable stereoscopic equipment for the full implementation of the material presented here. However, the information is still of value to any stereoscopist, whether a photographer with a conventional twin-lens stereo camera, a computer user generating stereoscopic images, an SEM microscopist, or a painter creating stereoscopic artwork.

Acknowledgments. The author extends deep appreciation to many friends and colleagues for helpful discussion of the subject and especially to Vivian Walworth for her encouragement and cooperation. I also greatly appreciate the assistance of Jay Scarpetti and his associates at the Rowland Institute for Science.

References
1. C. Wheatstone, On some remarkable, and hitherto unobserved, phenomena of binocular vision, Phil. Trans. 1838, 371–394 (1838).
2. D. Brewster, The Stereoscope. Its History, Theory, and Construction, London, 1856; Facsimile Edition, Morgan & Morgan, 1971.
3. J. T. Rule, The geometry of stereoscopic projection, J. Opt. Soc. Am. 31, 325 (1941).
4. S. Kitrosser, Stereoscopic photography as applied to Vectographs, in Polaroid Organic Chemical Research Seminars, Vol. 1, Polaroid Corp., Cambridge, MA, 1945.
5. J. G. Ferwerda, The World of 3-D: A Practical Guide to Stereo Photography, Netherlands Society for Stereo Photography, 1982; (a) p. 238; (b) pp. 103–104.
6. F. G. Waack, Stereo Photography: An Introduction to Stereo Photo Technology and Practical Suggestions for Stereo Photography, translation by L. Huelsbergen, Reel 3-D Enterprises, Culver City, CA, 1985, p. 13.
7. L. Lipton, Foundations of the Stereoscopic Cinema: A Study in Depth, Van Nostrand Reinhold, New York, 1982; (a) p. 169; (b) pp. 147–148; (c) pp. 49–50.
8. E. H. Land, A. J. Bachelder and O. E. Wolff, U.S. Patent 2,453,075 (Nov. 2, 1948).
9. S. Kitrosser, Polaroid Interocular Calculator, PSA J. 19B, 74 (1953).
10. W. H. Ryan, Photogr. J. 125, 473 (1985).
11. D. Burder and P. Whitehouse, Photographing in 3-D, 3rd ed., The Stereoscopic Society, London, 1992, p. 6.
12. P. Wing, Hypers by walk, water, wire, and wing, Stereo World 16, 20 (1989).
13.
J. Weiler, Tips for hypers from airliners, Stereo World 16, 33 (1989). Advancements in 3-D Stereoscopic Display Technologies: Micropolarizers, Improved LC Shutters, Spectral Multiplexing, and CLC Inks Leonard Cardillo and David Swift VRex, Inc., Elmsford, New York John Merritt Interactive Technologies, Princeton, New Jersey An overview of four new technologies for stereoscopic imaging developed by VRex, Inc., of Elmsford, New York, is presented. First, the invention of µPol micropolarizers has made possible the spatial multiplexing of left and right images in a single display, such as an LCD flat panel or photographic medium for stereoscopic viewing with cross-polarized optically passive eyewear. The µPol applications include practical, commercially available stereoscopic panels and projectors. Second, improvements in fabrication of twisted nematic (TN) liquid crystals and efficient synchronization circuits have increased the switching speed and decreased the power requirements of LC shutters that temporally demultiplex left and right images presented field-sequentially by CRT devices. Practical low-power wireless stereoscopic eyewear has resulted. Third, a new technique called spectral multiplexing generates flicker-free field-sequential stereoscopic displays at the standard NTSC video rate of 30 Hz per frame by separating the color components of images into both fields, eliminating the dark field interval that causes flicker. Fourth, new manufacturing techniques have improved cholesteric liquid crystal (CLC) inks that polarize in orthogonal states by wavelength to encode left and right images for stereoscopic printers, artwork, and other 3-D hardcopy. Journal of Imaging Science and Technology 42: 300–306 (1998) Introduction Since Wheatstone (1838) first reported that binocular disparity is the cue for stereopsis, or what he called “seeing in solid,” many new techniques and devices for producing stereoscopic views from left and right perspective flat images have evolved.1 Beginning with the Helioth–Wheatstone stereoscope, every new technique or device has in common some advancement in one or more of the three necessary conditions to simulate depth: (1) a means to capture left and right perspective views; (2) a means to combine, or multiplex, those views; or (3) a means to deliver each view to the correct eye, or demultiplex. The Wheatstone stereoscope had (1) two perspective views captured by artists, or, early in the history of photography, captured by twin pho- Original manuscript received September 15, 1997 © 1998, IS&T—The Society for Imaging Science and Technology 300 tographs; (2) as a means of multiplexing, simply placing the views side-by-side on the same stereogram viewing card; and (3) as a means of demultiplexing, providing a viewing aperture and convergence optics for each eye in front of the stereogram and a septum between the viewing apertures. A breakthrough in stereoscopy was the invention of practical polarizers by Edwin Land2 in 1932, with Land devising 3 the cross-polarized multiplexing/demultiplexing technique for stereoscopic films in 1935. 
Improvements in the 3 necessary conditions to simulate depth came from (1) the rapid development of dual motion picture cameras by Zeiss-Ikon and Ciné-Kodak; (2) multiplexing by superimposing on a metallized screen a projection of the left image through a P1 state polarizer and the right image through a P2 state polarizer; (3) demultiplexing with polarized glasses, P1 state at the left eye and P2 state at the right eye to pass polarized light of the same phase and extinguish cross-polarized light. Land’s technique has been noted because many innovations in stereoscopy, including the four to be outlined here, Figure 1. One- and two-dimensional µPol arrays. The one-dimensional pattern is used for TFT-LCD displays, with a half-period resolution of 201 µm used for commercially available 1280 × 1024 panels. Now µPols with half-period resolutions less than 20 µm are possible with present manufacturing techniques. are based in some way upon cross-polarization. We will outline the following: 1. µPol micropolarizers for spatial multiplexing. 2. Improved LC shutters and electronics for temporal multiplexing. 3. Spectral multiplexing for flicker free video at standard video rates. 4. Cholesteric liquid crystal (CLC) inks. The µPol The µPol, invented by Faris in 1991, provided the necessary multiplexing and demultiplexing functions for stereoscopic LCD displays and stereoscopic hardcopy printing.4 The µPol is a passive optical element that transforms incident unpolarized light into periodic, spatially varying (square wave form) polarized light with polarization alternating between two orthogonal states P1 and P2 (linear or circular). The most common fabrication method for µPol is photo-lithography to form a specific micropattern. An additive method prints a pattern on the PVA surface with a high-precision gravure cylinder and iodine-based dichroic ink, producing P1 and P2 micropolarizers; a subtractive method prints the desired pattern on the PVA with photoresist, then bleaches away exposed parts, producing a λ/2 waveplate in a pattern to be optically coupled with a polarized source.2 When the image source is a Thin-Film-Transistor (TFT)-LCD panel, all transmitted light is polarized in a P1 state since a P1 “analyzer” is incorporated over the electrically controllable birefringent (ECB) “light valve” that turns each pixel on or off. With the patterned µPol placed over the panel, active portions of the λ/2 waveplate rotate the phase of light polarization from P1 to P2, while ablated portions leave P1 unchanged. As illustrated in Fig. 1, the µPol can be either a onedimensional or two-dimensional array with half-periods as small as 20 µm possible with current manufacturing processes. For TFT-LCD displays, the one-dimensional pattern is used, the finest resolution to date having a halfperiod of 201 µm on a 15˝ diagonal 1280 × 1024 panel. The first step in creating the stereoscopic image is by spatially multiplexing the left and right perspective views of a 3-D scene, as illustrated in Fig. 2. The left and right images, which are represented by pixel arrays, are spatially modulated with the modulators MOD and MOD, producing the spatial patterns that are then combined into a spatially multiplexed image (SMI). The multiplexing algorithm can be implemented in software, hardware, or by optical means; the µPol itself can perform the multiplexing function when placed in front of the CCD array of a camera or a photographic medium. 
By placing a µPol in contact with an SMI having the same spatial period, the demultiplexing step is carried out as shown in Fig. 3. The µPol codes each pixel of the right Advancements in 3-D Stereoscopic Display Technologies: ... Figure 2. Spatial multiplexing. Images from digital sources are multiplexed with software, images from video sources are multiplexed with field-switching hardware, and photographic images can be multiplexed using the µPol array self-aligned to the film for both multiplexing and demultiplexing. Two-dimensional multiplexing is shown. Figure 3. Demultiplexing a spatially multiplexed image (SMI). Right image pixels are aligned with the P1 elements of the µPol array, left image pixels with P2 elements. Through cross-polarization, only the right image pixels are transmitted through the polarized lenses to the right eye and left image pixels to the left eye. Vol. 42, No. 4, July/Aug. 1998 301 image with a polarization state P1 and each pixel of the left image with state P2, thus encoding the two images. The viewer, wearing a pair of polarized glasses (or looking through a polarized visor), is able to view the right image only with the right eye and the left image only with the left eye, fusing the two views into a stereoscopic image. Because the left and right image information is simultaneously present in a single frame, the technique is general purpose. The SMI could be displayed by conventional devices, printed by conventional printers, recorded by photographic cameras, and video cameras, or projected by a single slide or a single movie projector. In all cases, color 3-D stereo images can be produced. In contrast, techniques that produce stereo images by means of two separate left and right frames (sequential or in parallel) do not have the µPol’s range of application and are also incapable of producing 3-D hardcopy. µPol Applications and Products. VRex has incorporated µPol technology in commercially available 3-D stereoscopic LCD panels and projectors ranging from 640 × 480 pixel resolution to 1280 × 1024. Other µPol-based devices in production include a 3-D LCD notebook computer, an interactive 3-D information center utilizing a touch-screen LCD and polarized visor, and an immersive environment consisting of wrap-around rear-screen 3-D projection. Hardcopy has been produced using photographic medium with a self-aligned µPol for both multiplexing during exposure and demultiplexing during viewing. Improved Liquid Crystal Shutters A drawback to µPol display applications is their unsuitability for CRT devices. The thickness of a CRT display would position a µPol 10 to 20 mm in front of the image plane scanned on the phosphor screen, introducing parallax between horizontal image raster lines and horizontal µPol lines when viewed above or below the plane orthogonal to the CRT screen. This parallax results in a limited viewing zone, with cross-talk or pseudoscopic images perceived outside this zone. A solution lies in finding polarized material that can be coated inside the CRT in front of the phosphor screen and can withstand the intense heat generated by the cathode heater filament; until then, shutter devices are the preferred technique to demultiplex stereoscopic left and right perspective images time-multiplexed on the CRT. The theory of operation for stereoscopic viewing of CRTs through shuttered eyewear is simple. 
Images are timemultiplexed so a left perspective image is displayed on the CRT device when the left eye shutter is open and a right perspective image displayed when the right eye shutter is open. At suitable repetition rates, the viewer perceives a continuously present 3-D image. In video applications, the two interleaved fields in each frame of an NTSC display provide a convenient multiplexing method: the right image encoded in Field 1, the left 16 ms later in Field 2. PC monitors driven in page-flipped or interlaced mode provide even faster repetition rates. Early time-multiplexed implementations used mechanical shutters,6,7 but these shutters were cumbersome and obtrusive. PLZT ceramics were an interim solution for shutters,8 but now most devices use variations of liquid crystal (LC) shutters.9 In general, an LC shutter consists of an electrically controllable birefringent (ECB) plate in which molecules are in a liquid-crystalline state. When an electric field is applied across the plate, the molecules align in parallel, producing a 90° phase shift between the horizontal and vertical components of linearly polarized 302 Journal of Imaging Science and Technology light passing through the plate; when the field is removed, the molecules return to random alignment, passing the components of the polarized light in phase. Using a second polarizer, or analyzer, on the exit side of the plate, light is passed when a voltage is applied and the ECB plate is phase shifted, and light is extinguished when the voltage is removed and the plate is in ordinary phase. The reader is referred to Bos10 for an excellent review of LC shutter material and operation. Two basic liquid crystal types are the Π-cell and the twisted nematic (TN) cell; to date, most shutter glass systems have used Π-cells.11 Although the Π phase shifting material used in these cells allows extremely fast switching times (<3 ms), they require very high excitation voltages, 20 Vp-p minimum, which makes wireless battery-powered operation difficult and costly to achieve. A second characteristic of Π-cells is that the cell is not transparent when the excitation voltage is removed; with no power applied the cell will retain a semi-opaque color hue. A low-level excitation voltage, 8 Vp-p approximately, needs to be applied to the cell to achieve transition from the full transparent to the full opaque state. A practical time-multiplexed stereoscopic shutter system using TN liquid crystals has recently been developed.12 A major advantage of TN cells is that, unlike Π-cells, no background excitation voltage is needed to keep the shutters in the transmissive state. However, TN cells have had the disadvantage of slow transition time (>10 ms) from the transmissive to the opaque state and back to transmissive as excitation voltage is applied and disconnected.13 In addition, the transmissive to opaque (turn on) time may differ from the opaque to transmissive (turn off) time. However, the performance of the TN cell has been optimized both in the manufacturing process of the cell itself and in the timing of the applied excitation voltage to overcome these limitations. A major improvement was reducing cell thickness to a minimum for faster switching and lower excitation voltages. This was key to obtaining long battery life in wireless battery shutter glasses because no high-voltage dc–dc converters would be required as with Π-cell shutters. 
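The field-to-shutter mapping just described reduces to a few lines of logic. The sketch below is illustrative only; it is not VRex's driver firmware and the names are invented. It follows the convention stated above of the right image in Field 1 and the left image in Field 2, with each NTSC field lasting roughly 16.7 ms.

```python
# Field-sequential shutter logic: a minimal sketch, not the actual driver.
# NTSC runs at ~59.94 fields/s, so each field (and each eye's dark interval)
# lasts roughly 16.7 ms -- the interval the spectral-multiplexing scheme
# described later is designed to eliminate.
FIELD_PERIOD_MS = 1000.0 / 59.94          # ~16.68 ms

def shutter_states(field):
    """Field 1 carries the right image, Field 2 the left (the convention above),
    so only the matching eye's shutter is transmissive during that field."""
    if field == 1:
        return {"left": "opaque", "right": "transmissive"}
    if field == 2:
        return {"left": "transmissive", "right": "opaque"}
    raise ValueError("an NTSC frame contains exactly two fields")

for field in (1, 2, 1, 2):
    print(f"Field {field} ({FIELD_PERIOD_MS:.1f} ms):", shutter_states(field))
```

Page-flipped or interlaced PC modes simply shorten the field period while keeping the same alternation.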
During normal operation, the shutter drivers draw 130 µA when shutters are transmissive with a DC signal, 150 µA with a 60-Hz signal and 200 µA with a 120-Hz signal. Each shutter can switch at frequencies in excess of 120 Hz with no interfield cross talk. Higher frequency switching was accomplished by synchronizing shutter transitions with video fields. Previous devices synchronized the shutter transition to the beginning of each video field: once a vertical reset pulse or similar signal was detected, pulse coded information was sent to toggle the optical state of the shutters. For the shutters to change state before the first line of displayed video, the pulse codes had to be very short, requiring high speed circuitry in the receiver that consumed much power. To reduce the power requirements of the present system further, the field identification information is sent prior to the vertical blanking interval so the pulse information may be transmitted at a much slower rate. The detection circuitry functions at a slower frequency and battery life is greatly increased. A further improvement implemented in the stereoscopic shutter system was the ability to synchronize the shutters to all popular display formats used by TVs and PCs: the IR transmitter that sends driving codes to the shutters in the eyewear can detect synchronization signals, polarities, and frequencies present in all VGA, SVGA, and XGA computer formats, as well as NTSC and PAL video sources. The image field rate and the mode of operation, i.e., 2-D, 3-D interlaced, or 3-D page flipped, is determined Cardillo , et al. Figure 4. Stages of spectral multiplexing. In Stage 1, r,g,b image pixels are captured and in Stage 2, pixels are separated into a magenta buffer (r + b) and a green buffer (g) for the left and right images. In Stage 3, filler pixels are added so pure red, green, blue, or magenta areas of the image will not be dark during the alternate field. Stage 4 shows the field-sequential presentation of left and right images. from these signals. Detecting carrier synchronization signals is a benefit because earlier attempts at field coding required tagging the video content itself with black and white markers on a horizontal scan line.14 Shutter Applications and Products. The largest commercial application of the improved LC shutters is in stereoscopic eyewear called VR Surfer. System software enables the user to optimize the performance of the shutter system to a particular PC monitor: a basic mode of operation is provided that encodes image identification information in the display sync signals during the vertical reset interval. This mode will enable the display of stereoscopic images in DOS applications but does possess some degree of perceived image flicker because the image switching rate is in the 60 to 72-Hz range. An advanced mode detects the video card chip set driving the PC monitor and will automatically implement the best stereoscopic display mode at rates up to 120 Hz. In this mode, Microsoft Windows applications are supported. The system is also compatible with field-sequential stereoscopic video in NTSC and PAL standards. While the improved LC shutters have been used chiefly for demultiplexing in eyewear, a device is under development that uses the shutters to multiplex stereoscopic perspective views to a single CCD recording device. 
Specifically, if a beam splitter is placed in the primary viewing path of a video camera and mirrors relay a second perspective view offset from the primary, the left and right perspectives necessary for a stereoscopic view will be imaged on the CCD. By placing a pair of LC optical shutters in each of the viewing paths, each viewing perspective can be alternately imaged by the camera if the shutters are opened and closed in synchronization with the video field output. By convention, the right perspective will be encoded in Field 1 of the video frame and the left perspective in Field 2. A promising commercial application for the device is as an attachment for the home video camcorder. Spectral Multiplexing Because of the predominance of television as a display medium, it would be beneficial if field-sequential multi- Advancements in 3-D Stereoscopic Display Technologies: ... plexing could operate at NTSC television standard 60-Hz field rates with no flicker at all. In the 60-Hz standard field-sequential LC shutter system just described, some residual flicker is perceived because left and right images are coded in one field of video so there is alternately in each eye the full brightness of the image and a 16.6-ms dark interval. The human visual system will integrate light energy over time, reaching a critical fusion frequency (CFF) beyond which these alternating light and dark intervals are perceived as steady, but at normal viewing luminance, the 60 Hz field rate is15,16 below the CFF. Flicker is more apparent with brighter displays, the fusion threshold increasing with the logarithm of luminance according to the Ferry–Porter law,17 and the contours defining images increase the threshold even more because contrast increases with fast visual “off” transients generated at image offset.18 While some evidence exists that visual persistence of stereoscopic stimuli is longer than mono-planar stimuli,19,20 thus decreasing the fusion threshold, the effect is not long-lasting enough to bridge the 16-ms dark interval. To prevent flicker, a new field-sequential system is under development that eliminates the dark interval within the video frame, maintaining light energy at both eyes at all times, decreasing or eliminating luminance modulation between video fields.21 Figure 4 shows the image capture and multiplexing functions of the spectral multiplexing technique. In Stage 1, left and right perspective views from cameras or computer graphics are captured and analyzed on a pixel-by-pixel basis. In Stage 2, each is separated into two spectral buffers: one for red and blue (r + b) and one for green (g). (Note that “pixel” here refers to an individual color component, not one of three points comprising a color.) The luminance at each eye is of different wavelength components, but these will still summate luminance to decrease the CFF relative to light/dark stimulation.22 At this stage, the pixel data in the buffers could be field-sequentially multiplexed, as shown in the final stage, so each eye receives light energy during both Field 1 and Field 2 presentations, a technique similar to that described by Street.23 However, pixels that represent a pure primary color (r or b or g) or magenta (r + b) will not have spectral components in one of the buffers and will still be dark Vol. 42, No. 4, July/Aug. 1998 303 Figure 5. Eyewear for spectral demultiplexing. 
Left and right eye optics are identical: shown in order from the image back to the eye are green cholesteric liquid crystal filters in a P1 state, magenta cholesteric liquid crystal filters in a P2 state, active TN cells, and a broadband polarizer (analyzer) in a P1 state. The TN cells are shown in a state to transmit the Field 1 image; to transmit the Field 2 image, the voltage polarities are reversed. during one field. The advancement over Street’s technique is in maintaining some spectral luminance at the eye even when a primary- or magenta-colored object has no spectral components in the subsequent field. Therefore, at Stage 3 each pixel is analyzed, zero values identified, and a “filler pixel” inserted. A pixel of a suitable minimum luminance value replaces dark r + b pixels in one buffer or g pixels in the other buffer for each eye’s view, maintaining energy at all pixels in both eyes during both Field 1 and Field 2. The luminance of filler pixels from the alternate buffer is chosen to shift the chromaticity or saturation of the r,g,b, or magenta color a minimum perceived amount in color space24 yet maintain enough energy during the otherwise dark field to prevent flicker. This is possible because the visual system does not discriminate colors perfectly, with observers perceiving similar colors with substantial shifts in wavelength. Attempted isomeric or metameric matches to a given wavelength show large just noticeable differences (JND) in chromaticity space as well as saturation space.25 Suitably chosen color pixels can be added to the otherwise dark pixel space in the alternate field to summate with the luminance of the primary r,g,b or magenta pixels, resulting in minimum shifts in perceived color yet decreasing luminance modulation, so flicker is not perceived. After the frame buffers are updated with “filler pixel” data, the buffers are shifted into the NTSC field format in Stage 4: r + b pixels from the left view and g pixels from the right view into Field 1; g pixels from the left view and r + b pixels from the right view into Field 2. An ordinary CRT monitor with NTSC video input displays the field sequential information directly or from standard recorded video tape. Figure 5 shows the implementation of the spectral demultiplexing function at the eyes, with eyewear consisting of passive green and magenta CLC filters, active electronic TN cells, and passive broadband polarizers. The CLC filters pass their respective colors circularly polarized, right-hand polarized greens (P1) and left-hand polarized magentas (P2). The TN cells are EBC devices synchronized with Field 1 and Field 2 of the NTSC signal using the same circuit techniques described previously. During Field 1, the left eye’s TN cell is activated (V-), reversing the polarized state passed by the filters while the right eye’s TN cell is inactive (V+), maintaining the polarization. Figure 6 shows the Field 1 state of the TN cells; during Field 2 the right eye’s TN cell switches to V- and 304 Journal of Imaging Science and Technology the left to V+. The broadband polarizer, or analyzer, passes P1 light and blocks P2. Referring to Fig. 4, Stage 4, the CRT monitor displays r + b (magenta) information from the left eye image and g information from the right eye image simultaneously during Field 1. During Field 2, g information from the left eye and r + b information from the right eye is displayed simultaneously. 
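Purely as an illustration of Stages 2 through 4 described above, the sketch below separates a left and a right RGB frame into magenta and green buffers, inserts low-luminance filler pixels where a buffer would otherwise be dark, and assembles the two fields. The filler value, the per-channel maximum used to composite the two buffers onto one field, and all names are this sketch's assumptions, not VRex's implementation.

```python
import numpy as np

FILLER = 20  # assumed minimum luminance for filler pixels; the paper only says it is
             # chosen to minimize the perceived chromaticity shift (Stage 3)

def spectral_fields(left, right):
    """Sketch of Stages 2-4 for 8-bit RGB frames of equal shape (H, W, 3).

    Field 1 carries the left eye's magenta (r + b) components and the right eye's
    green component; Field 2 carries the complementary pair. Where a buffer would
    be completely dark at a pixel (pure primary or pure magenta input), a filler
    is inserted so that neither eye sees a dark interval at that pixel."""

    def magenta(img):                      # Stage 2: r + b buffer
        buf = img.copy()
        buf[..., 1] = 0
        return buf

    def green(img):                        # Stage 2: g buffer
        buf = np.zeros_like(img)
        buf[..., 1] = img[..., 1]
        return buf

    def with_filler(buf, channels):        # Stage 3: keep some energy everywhere
        dark = buf.sum(axis=-1) == 0
        out = buf.copy()
        for c in channels:
            out[dark, c] = FILLER
        return out

    # Stage 4: field-sequential assembly (channel-wise maximum is just one way
    # to composite the two eyes' buffers onto a single displayed RGB field).
    field1 = np.maximum(with_filler(magenta(left), (0, 2)),
                        with_filler(green(right), (1,)))
    field2 = np.maximum(with_filler(green(left), (1,)),
                        with_filler(magenta(right), (0, 2)))
    return field1, field2
```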
By alternately activating and de-activating the TN cells in synchrony with Fields 1 and 2, it can be seen that the eyewear performs the appropriate spectral demultiplexing function, passing left and right image information to the correct eyes for field-sequential stereopsis without flicker.

3-D Printing Based on CLC Inks
A new method of printing 3-D images has been made possible through several patented processes for manufacturing inks based on CLC materials.26,27 This CLC ink can be made right circularly polarized (RCP) as well as left circularly polarized (LCP) to enable polarization multiplexing of the left and right images, with circularly polarized 3-D glasses demultiplexing the images. Applications for these new inks include 3-D hardcopy from inkjet printing, offset printing, gravure, and silk-screen processes that can be printed on any paper, fabric, or other medium.

CLC Properties. CLC is a nematic liquid crystal with chiral additives or polysiloxane side-chain polymers that cause the cigar-shaped molecules to be spontaneously aligned in an optically active structure that can be either LCP or RCP. Chiral additives give the CLC molecules a degree of twist and a helical structure with pitch "p." (Pitch can be thought of as the length of a bolt that a nut must travel to make a 360° turn, tighter pitch resulting in less travel.) The higher the concentration of the chiral additive, the tighter the molecules are arranged in the helix, resulting in a shorter pitch. CLCs can be either left-handed (LH) or right-handed (RH), each having a unique property known as selective wavelength reflection. When light is incident upon the CLC surface, the selective reflection is described by the following equation:

λ = λ_o = n_a p,  (1)

where λ is the reflected wavelength, n_a is the mean index of refraction (approx. 1.6) of the CLC material, and p is the pitch of the helical structure. Figure 6 illustrates the selective reflection of LH CLCs.

Figure 6. Reflective properties of left-handed CLC. The wavelength of light reflected is determined by the pitch of the CLC, and the direction of polarization, RCP or LCP, is determined by the direction of helical twist of the CLC.

If the CLC is RH, then it reflects 50% of the incoming light at the selected wavelength in RCP light and transmits 50% of that wavelength in LCP light. All other wavelengths are transmitted through the material. Similarly, LH CLC material will reflect LCP light and transmit RCP light. The reflected wavelength, or color, can be tuned by changing the length of the pitch, which is dependent on the chiral additive concentration. The polarization of the reflected light, RCP or LCP, can be altered by using RH or LH CLC material.

New CLC Ink Fabrication Processes. A major breakthrough in CLC ink fabrication was a process to make the inks usable at room temperature. Formerly, room-temperature use required CLC to be in the liquid phase, encapsulated or confined to cells; in the solid phase, the CLC inks had to be applied at very high temperature. In the new process,26 molten CLC material above the glass temperature (for polysiloxane-based CLC polymers, about 120°C to 150°C) is deposited onto a rotating belt and aligned using a knife edge. After a cooling stage, the CLC film is transferred to another rotating belt coated with an adhesive. The second belt, after receiving the CLC film, goes through an air jet stage where an ultrasonic air jet or an air jet with fine powder abrasives removes the ultrabrittle CLC film from the adhesive.
The result is tiny CLC flakes that retain the helical structure normal to the CLC flake surface. The CLC flakes range in thickness from 1 to 20 µm and in size from 5 to 75 µm with an average size of 25 µm. The geometry of the flakes can be regular or irregular. To produce the CLC inks, these CLC flakes are mixed with a host fluid or host matrix, the carrier. The carrier must be chosen for suitable tackiness, drying speed, adhesion to surfaces, friendliness to environment, etc., depending on the application: offset printing, ink-jet printing, painting, drawing, xerography, or other imaging methods. When the CLC flakes are mixed with a suitable host matrix such as wax or other sticky material that is a solid at Advancements in 3-D Stereoscopic Display Technologies: ... room temperature, crayons, pencils, or other drawing devices can be made. Printing 3-D Images using CLC Inks. Unlike conventional pigments and dyes, CLCs work on a reflective mechanism, with six types of crystal necessary to render the visible spectrum in two polarization states: red, blue, and green pitched crystals in LCP and RCP states. (In practice, it has been found necessary to use two additional types, namely, white pitched crystals in each polarization state). Instead of printing or plotting on a white piece of paper, CLCs are applied onto light absorbing or nonreflective surfaces. When viewed by itself, the ink appears almost transparent. When the ink is applied to a black paper or other medium, the color corresponding to the wavelength of the CLC material can be seen. All of the other colors are absorbed by the black medium. Moreover, if viewed with an RCP or LCP polarizer, the CLC material will be seen only through the same-phase polarizer. It is the circular polarization property of the CLC inks that is used to multiplex left and right images for 3-D stereoscopic printing, the left image put on the black substrate using LCP ink and the right image using RCP ink. The 3-D images are demultiplexed at the eyes by cross-polarization through ordinary circularly polarized glasses. Conclusion Four new advancements in 3-D stereoscopic display technology developed by VRex, Inc., of Elmsford, New York, have been outlined. µPol optics have been applied to 3-D LCD displays with benefits of reduced cost, self-alignment of images, compatibility with video and computer standards, and single projector implementation of 3-D, leading to commercial desirability over other cross-polarization displays for multiple viewers. Improvements in LC shutter materials and electronics have led to commercially desirable 3-D eyewear for personal viewing of stereoscopic 3-D images from TVs and PCs, while the technique of spectral multiplexing, now under development, promises Vol. 42, No. 4, July/Aug. 1998 305 flicker-free time-multiplexed stereoscopic content from popular, low-cost 60-Hz TV and video displays. Finally, advancements in CLC inks have led to stereoscopic hardcopy printable on any medium. The first practical CLC applications are for posters and clothing using a silk screen process, with other printing techniques, including ink jet, under development. 12. 13. References 16. 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 306 C. Wheatstone, Phil. Trans. Roy. Soc. London 128, 371–394 (1838). M. McCann, ed., Edwin H. Land’s Essays. Volume I: Polarizers and Instant Photography, Society for Imaging Science and Technology, Springfield, VA, 1993. R. M. Hayes, 3-D Movies: A History and Filmography of Stereoscopic Cinema, McFarland, Jefferson, NC, 1989. S. M. 
Faris, Micro-polarizer arrays applied to a new class of stereoscopic imaging, SID Dig. 38, 840–843 (1991). S. M. Faris, U.S. Patent 5,327,285 (1994). R. J. Beaton, R. J. DeHoff and S. T. Knox, Revisiting the display flicker problem: refresh rate requirements for field-sequential stereoscopic display systems, Dig. Tech. Pap. SID Int. Symp. 17, 150 (1986). J. Lipscomb, Experience with stereoscopic display devices and output algorithms, Proc. SPIE Non-Holographic True 3-D Display Techniques 1083, 28–34 (1989). J. A. Roese and A. S. Khallafalla, Stereoscopic viewing with PLZT ceramics, Ferroelec. 10, 47 (1976). J. A. Roese, Liquid crystal stereoscopic viewer, U.S. Patent 4,021,846 (1977). P. J. Bos, Liquid crystal shutter systems for time-multiplexed stereoscopic displays, in Stereo Computer Graphics and Other True 3-D Technologies, Princeton University Press, Princeton, NJ, 1993. P. J. Bos and K. R. Koehler-Beran, The pi-cell: a fast liquid crystal optical device, Mol. Cryst. Liq. Cryst. 113, 329 (1984). Journal of Imaging Science and Technology 14. 15. 17. 18. 19. 20. 21. 22. 23. 24. 25. 26. 27. S. M. Faris, U.S. Patent pending. M. R. Harris, A. J. Geddes and A. C. T. North, Frame-sequential stereoscopic system for use in television and computer graphics, Disp. 7(1), 12. L. Lipton and J. Halnon, Universal electronic stereoscopic display, in Stereoscopic Displays and Virtual Reality Systems III, M. T. Bolas, S. S. Fisher and J. O. Merritt, Eds., Proc. SPIE 2653, 219–223 (1996). L. Ganz, Temporal factors in visual perception, in Handbook of Perception Vol. 5, E. C. Carterette and M. P. Friedman, Eds., Academic Press, NY (1975). A. B. Watson, Temporal sensitivity, in Handbook of Perception and Human Performance, K. R. Boff, L. Kaufman and J. P. Thomas, Eds., Vol. 3, Wiley, NY, 1986. H. de Lange, Research into the dynamic nature of the human foveacortex systems with intermittent and modulated light: I. Attenuation characteristics with white and colored light, J. Opt. Soc. Am. 48, 777–784 (1958). R. W. Bowen, Isolation and interaction of ON and OFF pathways in human vision: contrast discrimination at pattern offset, Vision Res. 37(2), 185–198 (1997). G. R. Engel, An investigation of visual responses to brief stereoscopic stimuli, Q. J. Exp. Psychol. 21, 148–166 (1970). W. Skrandies, Visual persistence of stereoscopic stimuli: electrical brain activity without perceptual correlate, Vision Res. 27(12), 2109–2118 (1987). S. M. Faris, U.S. Patent pending. K. Uchikawa and M. Ikeda, Temporal integration of chromatic double pulses for detection of equal-luminance wavelength changes, J. Opt. Soc. Am. A 3, 2109–2115 (1986). G. S. B. Street, U.S. Patent 4,641,178 (1987). G. Wyszecki and W. S. Stiles, Color Science , Wiley & Sons, NY (1982). W. R. J. Brown and D. L. MacAdam, Visual sensitivities to combined chromaticity and luminance differences, J. Opt. Soc. Am. 39, 808–818 (1949). S. M. Faris, U.S. Patent 5,364,557 (1994). S. M. Faris, U.S. Patent 5,457,554 (1995). Cardillo , et al. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Full-color 3-D Prints and Transparencies* J. J. Scarpetti,† P. M. DuBois,† R. M. Friedhoff,† and V. K. Walworth‡ The Rowland Institute for Science, Cambridge, MA 02142 We have put into practical form one of Edwin Land’s early inventions, the stereoscopic Vectograph image. We have reinterpreted the concept to produce digital 3-D hardcopy conveniently from both photographic and digital 3-D records. 
Digital 3-D images may be produced directly by digital cameras, by computers and workstations, and by various instrumental outputs or they may be acquired by scanning and digitizing photographic image pairs. Digital 3-D polarizing images are printed conveniently with an ink-jet printer. To produce full-color 3-D hardcopy on standard ink-jet printers we have formulated special inks and substrates. Our technique unites two very significant growing technologies: ink-jet printing and 3-D imaging. Journal of Imaging Science and Technology 42: 307–310 (1998) Introduction The most commonly used method of 3-D presentation comprises encoding left- and right-eye images in terms of polarization by mounting oppositely oriented polarizing filters over the lenses of paired projectors and superimposing the left- and right-eye images on the screen, as shown in Fig. 1. To preserve the polarization, the screen used must have a metallic, nondepolarizing surface. Suitable aluminum screens are commercially available. Observers view the composite image through 3-D polarizing glasses. Under these circumstances each eye sees only the assigned image and the observer perceives the composite as a single three-dimensional image. This method is effective, but it is difficult to achieve correct and consistent stereoscopic registration and alignment without specialized precision equipment. In 1940 Edwin Land introduced the Vectograph concept.1 Instead of using polarizing filters to encode paired images, he used images that were themselves polarizers. He printed the respective left- and right-eye images on opposite sides of a single transparent support, as indicated in Fig. 2, so that once the two images had been properly registered and printed, they could not be misaligned. The earliest Vectograph images were in black and white, and they were formed by staining preregistered paired gelatin relief images with an iodine ink and transferring that ink to oppositely oriented polyvinyl alcohol (PVA) layers laminated to opposite surfaces of a transparent film base.§ In each image area the effective density is directly related to the amount of iodine transferred to the PVA and thus to the degree of polarization. Original manuscript received November 10, 1997 * Presented in part at the IS&T 50th Annual Conference, Cambridge, MA, 18–23, 1997. † IS&T Member ‡ IS&T Fellow © 1998, IS&T—The Society for Imaging Science and Technology The chemistry of this Vectograph imaging process resembles that used in the production of sheet polarizers such as the Polaroid H-sheet. In each case the substrate is a PVA sheet that has been heated and stretched to orient the polymeric molecules, then laminated to a support and stained with an iodine ink. Such polarizers are defined as dichroic polarizers. Iodine may be described as a dichroic stain, and dyes that form spectrally selective dichroic polarizers are described as dichroic dyes.2 In 1953 Land showed three-color Vectograph images that had been formed by successively transferring cyan, Figure 1. 3-D projection display with two projectors. § The black-and-white Vectograph process was used extensively for both military and industrial imaging during World War II. For many years Stereo Optical Co., Chicago, IL, has been producing black-and-white Vectograph prints for use by ophthalmologists in binocular vision testing and training. 307 Figure 3. Color Vectograph image printing by dye-transfer process. Figure 2. The Vectograph concept, using paired, oppositely oriented polarizing images. 
magenta, and yellow dichroic dye images from paired gelatin relief images.¶ The color Vectograph printing process required for each 3-D image a set of six flawless, perfectly registered matrices, as indicated in Fig. 3. The process produced excellent stereoscopic color prints and transparencies and even experimental 3-D motion pictures. However, preparation was difficult, time-consuming, and costly, and these factors precluded widespread utilization of the process. In recent years 3-D computer technology has dramatically influenced scientific, engineering, and medical imaging. For example, molecules, airplanes, oil rigs, and buildings are now designed or investigated using 3-D computers, often with real-time stereoscopic renditions displayed on monitors. Until now, however, there has been no convenient form of hardcopy for these 3-D display images. The purpose of our investigation has been to develop a greatly simplified process for generating full-color 3-D prints and transparencies, using contemporary ink-jet printing technology with dichroic inks and specially constructed substrates. New materials and techniques make it practical to use ink-jet technology for printing full-color 3-D hardcopy images.3 Materials and Methods Preparation of 3-D Digital Image Pairs. Figure 4 details the various paths from initial image acquisition to the finished stereo image. If the initial stereo images are conventional photographic negatives or positive transparencies, the images are first scanned and digitized. Image pairs from digital cameras, CAD images, and digital images based on various instrumental data are used directly. th ¶ Presentation by E. H. Land at the 38 Annual Meeting of the Optical Society of America, Rochester, NY, 1953. 308 Journal of Imaging Science and Technology Figure 4. Flow diagram illustrating the preparation of digitized stereoscopic polarizing images. The principal steps are: (1) the preparation of a digitized left–right image pair, (2) storage of the two images as a left–right image file, (3) paste-layering, using Adobe Photoshop. Following adjustment of contrast and color, suitable cropping, and stereo registration, the two images are printed sequentially, one on each surface of the two-sided sheet. In addition to using an array of external sources, we can import digital 3-D images electronically from remote instruments and computers. The paired left- and right-eye digital images are transferred to Adobe Photoshop or a comparable program. We adjust the contrast and color balance of each image, then register the pair stereoscopically, as illustrated by the two upper images in Fig. 5. Each of the two images is “pastelayered” into a new canvas, in which the dimensions match the dimensions of the printing medium, commonly 8.5 × 11˝ or 8.5 × 14˝. We then select the right-eye image, reduce its transparency to 50%, and move the image into stereoscopic alignment with the left-eye image, superimposing precisely the points that are to lie in the plane of the stereoscopic “window,” i.e., the plane of the screen or frame, as shown in the lower left panel of Fig. 5. Finally, we restore full density and reverse the right-eye image right to left (Fig. 5, lower right panel), so that when the two images are printed face to face both images will again have the same left–right orientation. Image Dyes. Figure 6 shows the structure of a typical dichroic azo dye suitable for forming polarizing images Scarpetti, et al. Figure 5. 
Monitor screen views showing the steps in stereoscopic registration of image pair. Figure 6. A typical dichroic image dye, Direct Green 27. upon imbibition into an oriented PVA layer. The dye shown is Direct Green 27, which we use in the cyan ink.4 The polyazo dye molecule is sufficiently flexible to align readily with oriented molecules of PVA to form an efficient dichroic polarizer. The sulfonic acid groups confer high solubility in aqueous solution, providing mobility of the dye within water-permeable polymeric layers. For our application the dyes are carefully purified and formulated into dichroic inks that perform well in standard ink-jet cartridges. Sheet Material. We print the paired images on the two surfaces of a multilayer sheet, as represented in Fig. 7. The film base is a nondepolarizing transparent support of cellulose triacetate or cellulose acetate butyrate. An image-receiving layer of stretched PVA is laminated to each surface of the film base, with the stretch axes of the two Full-color 3-D Prints and Transparencies PVA layers oriented at 90° to one another and at 45° to the edge of the sheet (Polaroid Corporation, Cambridge, MA). A thin metering layer of a nonoriented ink-permeable polymer, such as carboxymethyl cellulose, overlies the surface of each of the PVA layers, as indicated in Fig. 7. Printing Equipment. Desktop ink-jet printers are characterized as drop-on-demand printers. The printhead comprises a bank of ink cartridges, one for each of three subtractive color inks and in most cases one for black ink as well. As the printhead moves rapidly across a sheet of paper or transparent base, microscopic nozzles eject ink onto the sheet. In certain drop-on-demand printers, such as those provided by Epson, electronic signals actuate piezoelectric diaphragms within the head to force the imagewise ejection of droplets. In bubble-jet systems, such as the Hewlett-Packard (HP) and Canon printers, heat- Vol. 42, No. 4, July/Aug. 1998 309 Figure 7. Schematic drawing of sheet structure. ing elements create bubbles that expand to force out the ink droplets. Most of our images to date have been produced on HP printers, including Models 500C, 550C, 850C, and 820. We are also printing images on an Epson Stylus 800 printer. In each case the image resolution is determined by the resolution of the printer. The Hewlett-Packard printers are rated at 300 dpi and the Epson at 1440 dpi. To print the 3-D images, we insert standard ink-jet cartridges filled with dichroic inks into otherwise unmodified ink-jet printers. To make a transparency with the HP printer we select the media setting “transparency” and we set the intensity at “normal” to “darkest.” To make a reflection print we select “glossy” as the media setting and choose “lighter” to “normal” ink intensity. We print the left-eye image on one surface of the sheet and the righteye image on the opposite surface. Finishing Stage. After transfer of the images to the oriented PVA layers has taken place, the metering layers are removed. Transparencies are mounted for viewing by overhead projection or for direct viewing on a light box. Images prepared for viewing as reflection prints are laminated to reflective aluminized backing sheets. Results We have used a variety of stereoscopic images in the course of our development work. These images represent many applications, including molecular modeling, microscopy, data visualization, entertainment, and pictorial photography. The 3-D image pair used in Fig. 5 was taken from a Photo CD file. 
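Purely as an illustration of the registration and mirroring steps described under "Preparation of 3-D Digital Image Pairs" above, the sketch below builds the two print layers with Pillow. The file names, canvas size, paste offsets, and library calls are this sketch's assumptions, not part of the authors' Photoshop workflow.

```python
from PIL import Image, ImageOps

CANVAS = (2550, 3300)        # assumed 8.5 x 11 inch sheet rendered at 300 dpi

left = Image.open("left_eye.tif").convert("RGB")     # hypothetical file names
right = Image.open("right_eye.tif").convert("RGB")

# Paste each view into its own canvas-sized layer.
left_layer = Image.new("RGB", CANVAS, "white")
left_layer.paste(left, (300, 400))                   # offsets found during registration

right_layer = Image.new("RGB", CANVAS, "white")
right_layer.paste(right, (318, 400))                 # shifted so the "window" points coincide

# A 50%-opacity overlay is only a visual aid for choosing the offsets above.
preview = Image.blend(left_layer, right_layer, alpha=0.5)
preview.save("registration_preview.tif")

# After registration, mirror the right-eye image left to right so that the two
# images, printed face to face on opposite surfaces of the sheet, end up with
# the same left-right orientation.
right_print = ImageOps.mirror(right_layer)
left_layer.save("print_side_A.tif")
right_print.save("print_side_B.tif")
```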
Several of the images shown during the presentation of this paper originated as computergenerated or instrument-generated stereoscopic data. For example, a model of a complex protein molecule was produced from its molecular coordinates, using the program MOLMOL in a Silicon Graphics workstation.5,6 The image information was saved as a TIFF file and transferred to a PowerMac in our laboratory via FTP, using NCSA TelNet. Fig. 8 illustrates the image as a side-by-side stereo pair. The 3-D transparencies produced in desktop equipment are convenient hardcopy of size and quality suitable for projection in a standard overhead projector onto an aluminum screen or for direct viewing by transmitted light. In most situations the observers wear conventional lin- 310 Journal of Imaging Science and Technology Figure 8. Molecular structure of α-t-α, a 35-residue peptide with a helical hairpin conformation in solution.5 Left, left-eye image; right, right-eye image. early polarized viewers. The same images may be projected directly through an overlaid quarter-wave retarder for observation with circularly polarized viewers. The transparencies may also be viewed without glasses, using tabletop autostereoscopic display apparatus. Although all of the prints and transparencies produced so far have been printed on desktop equipment, we are exploring the applicability of larger format drop-on-demand printers and continuous-flow ink-jet printers. Summary and Conclusions We have developed a 3-D imaging system that should be compatible with many of the color ink-jet printers now in use. All indicators suggest that ink-jet printing will continue to be a leading technology for producing digital hardcopy of high quality. Advances in resolution, image quality, speed, and convenience are occurring rapidly in the ink-jet industry, and these advances will contribute to the utility of our process. The increasing use of 3-D information in many fields makes it desirable to have ready access to high-quality 3-D hardcopy. We believe that our technology offers new opportunities for modern stereoscopic imaging. Acknowledgments. We thank David Burder for the use of several of his 3-D images, including the pair used in Fig. 5. We also thank John Osterhout for the image of the molecule shown in Fig. 8. References 1. 2. 3. 4. 5. 6. E. H. Land, J. Opt. Soc. Am. 30, 230–238 (1940). (a) W. A. Shurcliff, Polarized Light: Production and Use, Chap. 4, Harvard University Press, Cambridge, MA, 1966; (b) E. H .Land and C. D. West, “Dichroism and dichroic polarizers” in Colloid Chemistry, J. Alexander, Ed., Vol. 6, Reinhold Publishing Corp., New York, 1946, pp. 160–190. J. Scarpetti, International Patent WO 96/23663, 1996, assigned to The Rowland Institute for Science. R. Bernhard, U.S. Patent 1,829,673, asssigned to J. R. Geigy, S.A. Y. Fezoui, P. J. Connolly and J. J. Osterhout, Solution structure of α-t-α, a helical hairpin peptide of de novo design, Prot. Sci. 1869–1877 (1997). R. Koradi, M. Billeter and K. Wuthrich, MOLMOL: A program for display and analysis of macromolecular structures, J. Mol. Graph. 14, 51–55 (1996). Scarpetti, et al. 
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Stereo Matching by Using a Weighted Minimum Description of Length Method Based on the Summation of Squared Differences Method Nobuhito Matsushiro† and Kazuyo Kurabayashi First Research Laboratory, OKI Data Corporation, 3-1, Futaba Town, Takasaki City, Gunma Prefecture, 370-0843, Japan A stereo matching method for 3-D scenes is described. The problems addressed are a definition of a search space for scanlines of stereo images and a formulation of the determination of the optimal path in the search space. The SSD (Summation of Squared Differences) method is a measure of similarity between a right image region and a left image region and is used as a cost function of the optimal problem. There are some problems regarding the SSD modeling. To resolve the problems we propose a stereo matching method based on the WMDL (weighted minimum description length) criterion in which parameters of the problem are optimized. The WMDL criterion is an information criterion proposed by us previously. The efficiency of the method is shown by using both synthetic and real images. As the results show, 3.1% to about 18.9% of the error values decreased for the synthetic images and 8.07% to about 12.19% of the error value decreased for the real image in comparison with the SSD method. Journal of Imaging Science and Technology 42: 311–318 (1998) Introduction This article describes a stereo matching method for 3-D scenes. To synthesize stereoscopic view images, correspondence of one image (left or right eye image) to another (right or left eye image) is necessary. In addition, from the correspondence, depth information for each point can be obtained by the principle of triangulation measurement. The problems addressed are a definition of a search space for each scan-line of stereo images and a formulation of the determination of the optimal path in the search space.1–4 The SSD (summation of squared differences) method3,4 is a measure of similarity between a right image region and a left image region and is used as a cost function in the optimization problem. There are some problems regarding the SSD modeling. To resolve the problems we propose a stereo matching method based on the WMDL (weighted minimum description length), in which parameters are optimized by the WMDL criterion. Originally, the MDL criterion5–8 was derived for model parameter estimation of coding models. The criterion was applied to model parameter estimation in various statistical problems, such as prediction and estimation. We have generalized the MDL criterion for the observation system and have derived the WMDL criterion. Stereo matching can be formulated as a prediction problem in which the right image is predicted by using the left image (and vice versa), and the parameters of the prediction problem are evaluated by the WMDL criterion. The proposed algorithm was tested on both synthetic and real images. Experimental results show the efficiency of the proposed method. SSD Method The SSD is a measure of similarity between a right image region and a left image region and is used as a cost function of the optimal problem. The SSD value is calculated as a summation of squared differences between two pixel intensity values of corresponding positions in the left image window and the right image window, as illustrated in Fig. 1. 
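As a concrete illustration of the cost just defined (a minimal sketch, not code from the paper; the window layout and names are assumptions), the SSD between a left-image window and a candidate right-image window can be written directly:

```python
import numpy as np

def ssd(left, right, y, x_left, x_right, half):
    """Sum of squared differences between two square windows of size
    (2*half + 1) centered at (y, x_left) in the left image and at
    (y, x_right) in the right image."""
    lw = left[y - half:y + half + 1, x_left - half:x_left + half + 1].astype(float)
    rw = right[y - half:y + half + 1, x_right - half:x_right + half + 1].astype(float)
    return np.sum((lw - rw) ** 2)
```

Scanning this cost over the admissible right-image positions for a given window center and keeping the minimum is the basic SSD matching step that the rest of the section builds on.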
The window size must be large enough to include enough intensity variation for reliable matching but small enough to avoid projection distortion. The first problem of the SSD modeling is that the SSD values of variable windows cannot be compared with each other because of model differences. The optimal window size depends on local variations of image characteristics, and an appropriate criterion for the comparison of different models is necessary. The second problem of the SSD modeling is the projection distortion. The effect of the projection distortion can be explained by a simple example. Figure 2 shows the left and the right images of a cube. In the figure, the centers of the windows correspond. Figure 3 shows the windows observed from the vertical direction against the surface a–b–c–d. In Fig. 3, the centers indicated by the symbol O correspond, but the larger the distance from the center of the windows, the greater the projection distortion, from which a disparity distortion in the windows is generated. The symbol × indicates intensity comparison points on the windows that lie at different positions on the real object. The larger the distance from the center, the greater the uncertainty of the correspondence.

Original manuscript received December 3, 1997. † e-mail: matusiro@okidata.co.jp; tel: 027-328-6172; fax: 027-328-6390. © 1998, IS&T—The Society for Imaging Science and Technology.
Figure 1. Correspondence between the left image and the right image.
Figure 2. The left image and the right image of a cube.
Figure 3. Explanation of the projection distortion.

WMDL Method. We have applied the MDL criterion5–8 framework to the stereo matching problem9 to resolve the first problem of the SSD modeling. Different models can be compared using the MDL criterion framework, which is based on information theory. Originally, the MDL criterion was derived for model parameter estimation of coding models. It has been applied to model parameter estimation in various statistical problems, such as prediction and estimation. We have generalized the MDL criterion regarding the observation system and have derived the WMDL criterion.10 Stereo matching can be formulated as a prediction problem in which the right image is predicted by using the left image (and vice versa), and the parameters of the prediction problem are evaluated by the WMDL criterion. By the generalized observation system of pixel information, the uncertainty of the correspondence caused by the projection distortion (the second problem of the SSD modeling) can be absorbed.

The MDL Criterion. Before describing the WMDL criterion, the general MDL criterion is described. The MDL criterion is formulated as follows:

MDL = −∑_{i=1}^{n} log Q_{k,θ}(i) + (k/2) log n,  (1)

where log is the natural logarithm, n is the number of observed data, θ = (θ1, θ2, …, θk) is a model parameter vector, k is the number of its elements, and Q_{k,θ}(i) is a probability distribution. The first term of Eq. 1 is the description length of the data by a model, and the second term is the description length of the model itself. The larger the number of observed data, the more precise the estimated model parameters; consequently, the description length of the second term increases because of the decrease of the description length of the first term. But the smaller the number of observed data, the less precise the estimated model parameters; consequently, the description length of the second term decreases because of the increase of the description length of the first term. The first and the second term are in such a trade-off relationship. The effect of the parameter k is explained in the same way. The MDL model fitting requires a shorter description by a model without redundancy of the model.
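As a purely illustrative aside (not from the paper), the trade-off between the two terms of Eq. 1 can be seen numerically by fitting models of increasing order k to a fixed data set: the data-description term falls while the model-description term grows. The polynomial-regression stand-in, the Gaussian error model, and the parameter count below are assumptions chosen only to make the behavior visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(-1.0, 1.0, n)
y = np.sin(2.0 * x) + rng.normal(scale=0.15, size=n)   # synthetic observed data

def mdl_for_degree(deg):
    coeffs = np.polyfit(x, y, deg)                      # fit the model parameters
    resid = y - np.polyval(coeffs, x)
    sigma2 = np.mean(resid ** 2)                        # ML estimate of the noise variance
    k = deg + 2                                         # polynomial coefficients plus sigma (a choice)
    # First term of Eq. 1 for a Gaussian model: -sum log Q = n/2 * log(2*pi*sigma2) + n/2
    data_term = 0.5 * n * np.log(2.0 * np.pi * sigma2) + 0.5 * n
    model_term = 0.5 * k * np.log(n)                    # second term of Eq. 1
    return data_term, model_term, data_term + model_term

for deg in range(0, 9):
    d, m, total = mdl_for_degree(deg)
    print(f"degree {deg}: data term {d:8.2f}  model term {m:6.2f}  MDL {total:8.2f}")
```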
The WMDL Criterion (Appendix A). The WMDL criterion is formulated as follows:

WMDL = −∑_{i=1}^{n} w_i log Q_{k,θ}(i) + (k/2) log(∑_{i=1}^{n} w_i),  (2)

where the w_i (i = 1, 2, …, n) are generalized observation coefficients. By the correspondence between the second term of Eq. 1 and the second term of Eq. 2, it is known that ∑_{i=1}^{n} w_i is an apparent number of observed data. In Appendix B, it is shown that the disparity distortion can be modeled as a noise term added to the WMDL value of the true probability distribution. The effect of the additive noise term, which takes a positive value, can be absorbed by the generalized observation system. The generalized observation system of weighted coefficients acts on the WMDL of the true probability distribution and the additive noise term, and requires a shorter description by a model without redundancy of the model.

Probability Modeling for the Prediction Problem. Let I_L(i), I_R(i) denote intensity values in the left window and the right window indexed by i, respectively. It is assumed that prediction errors are subject to a Gaussian distribution of zero mean and that each error value is independent. It is also assumed that disparities are uniform in each window. The probability modeling for the prediction problem is as follows, with a squared difference included in the equation:

Q_{k,θ}(i) = (1/(√(2π) σ)) exp(−ε²(i)/(2σ²)),  (3)

where θ = σ, ε(i) = I_L(i) − I_R(i), and σ² is the mean value of ε²(i).

Calculation of the WMDL Criterion. By applying Eq. 3 to Eq. 2, the following equation is derived:

WMDL = −∑_{i=1}^{n} w_i log Q_{k,θ}(i) + (k/2) log(∑_{i=1}^{n} w_i)
     = ∑_{i=1}^{n} w_i log(√(2π) σ) + (1/(2σ²)) ∑_{i=1}^{n} w_i ε²(i) + (k/2) log(∑_{i=1}^{n} w_i).  (4)

The second term of Eq. 4 corresponds to the SSD value. The WMDL value is normalized as follows:

WMDL_nrm = WMDL / ∑_{i=1}^{n} w_i.  (5)

Determination of the Optimal Path in the Search Space. In this article, it is assumed that the x–y axes of the two cameras are parallel to each other and that the centers of the lenses are both on the x axis. Based on this assumption, epipolar lines that restrict the range of the search space are parallel to the scanlines, and each scanline is treated independently. In the search space illustrated in Fig. 4, the DP (dynamic programming) is applied to search for the optimal path that minimizes the total WMDL_nrm value on a scanline. At each point in the search space, the window size and the generalized observation coefficients that minimize WMDL_nrm are selected from predetermined values. In Eq. 5, k = 3: the model parameters are σ and the two coordinates of the window position.

Figure 4. The search space.
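The sketch below shows how Eqs. 3 through 5 might be evaluated for one candidate correspondence. It is not code from the paper: the ring (Chebyshev) distance used to lay out the attenuated weights, the toy test images, and all names are assumptions made only for illustration.

```python
import numpy as np

def wmdl_cost(left_win, right_win, xi=0.95, k=3):
    """Hedged sketch of Eqs. 3-5: normalized WMDL for one pair of matched windows.

    left_win, right_win: equal-sized square windows (2-D arrays).
    xi: attenuation per pixel of distance from the window center (the paper
        does not spell out the exact weighting layout; Chebyshev rings are
        assumed here)."""
    eps = left_win.astype(float) - right_win.astype(float)   # prediction errors
    sigma2 = np.mean(eps ** 2) + 1e-12                        # ML variance (Eq. 3)

    h, w = eps.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.maximum(np.abs(yy - cy), np.abs(xx - cx))       # distance from the center
    wgt = xi ** dist                                          # generalized observation coefficients

    sum_w = wgt.sum()
    # Eq. 4: weighted description length of the data plus description length of the model.
    wmdl = (wgt * 0.5 * np.log(2.0 * np.pi * sigma2)).sum() \
           + (wgt * eps ** 2).sum() / (2.0 * sigma2) \
           + 0.5 * k * np.log(sum_w)
    return wmdl / sum_w                                       # Eq. 5: normalized value

# Toy usage: compare a few candidate disparities for one 7 x 7 window center.
left = np.random.default_rng(1).integers(0, 256, (100, 100))
right = np.roll(left, -3, axis=1)                             # toy right image, true disparity 3
y0, x0, half = 50, 60, 3
costs = {d: wmdl_cost(left[y0-half:y0+half+1, x0-half:x0+half+1],
                      right[y0-half:y0+half+1, x0-half-d:x0+half+1-d])
         for d in range(0, 6)}                                # minimum falls at d = 3
```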
Experiments. As described above, different models can be compared by using the WMDL criterion. The WMDL criterion is applied to model parameter estimation and to the determination of the optimal path in the search space. The predetermined window sizes are 3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, 13 × 13, 15 × 15, and 17 × 17. The generalized observation system is designed so that the larger the distance from the center of a window, the less the effect of the data including additive noise of disparity distortion. The generalized observation coefficients satisfy ξw_i = w_{i−1} (i = 1, 2, …, n), w_n = 1, where ξ indicates an attenuation parameter for one pixel of distance from the center of a window. The predetermined attenuation parameters are 1.0, 0.95, and 0.90. In the WMDL method the optimal window size and the optimal attenuation parameter ξ that minimize the WMDL value are selected for each window. The stereo matching algorithm was tested on both synthetic and real images. In the synthetic images, the true disparity was known in advance.

Experiment I. In the experiments, the synthetic left and right images shown in Fig. 5 are used. Each of the image sizes is 160 × 160 (pixel). The light source of the synthetic images is a point source. The synthetic image consists of a square plate against a background. The comparison between the SSD method and the WMDL method is performed by using the disparity errors. Table I shows the disparity errors of the SSD method and the WMDL method.

Figure 5. (a) The left image; (b) the right image.

TABLE I. The SSD and the WMDL Comparison for Synthetic Images of a Square Plate Against a Background (disparity error, %):
SSD 3 × 3: 32.3 (best)
SSD 5 × 5: 33.6
SSD 7 × 7: 35.5
SSD 9 × 9: 37.5
SSD 11 × 11: 40.0
SSD 13 × 13: 42.6
SSD 15 × 15: 45.3
SSD 17 × 17: 47.5 (worst)
WMDL (ξ = 1.00, 0.95, 0.90): 29.2

The disparity error of the WMDL method decreased by 3.1% to 18.3% in comparison with the SSD method. Figures 6(a) through 6(g) are the true disparity, the disparity estimated by the SSD method with a 3 × 3 window, the disparity error (absolute value) by the SSD method with a 3 × 3 window, the disparity (SSD, 17 × 17), the disparity error (SSD, 17 × 17), the disparity (WMDL), and the disparity error (WMDL), respectively. By comparing Fig. 6(c) with Fig. 6(e), it can be observed that the errors near the edges are decreased for the small window and are increased for the large window. In addition, it can be observed that the errors increase for the small window on the flat plane of few intensity variations and decrease for the large window. With the WMDL method, the window size and the observing system are optimized with respect to local variations of image characteristics, and the uncertainty of the assumption of disparity uniformity in a window is absorbed. Hence, the disparity error is decreased as a whole in the image in comparison with the SSD method.

Experiment II. In this experiment, the synthetic left and right images shown in Fig. 7 are used. Each of the image sizes is 160 × 160 (pixel). The light source of the synthetic images is a point source. The synthetic image consists of two square plates against a background. Table II shows the disparity errors of the SSD method and the WMDL method. The disparity error of the WMDL method decreased by 13.5% to about 18.9% in comparison with the SSD method. Figures 8(a) through 8(g) are the true disparity; the disparity estimated by the SSD with a 7 × 7 window; the disparity error (absolute value) by the SSD with a 7 × 7 window; the disparity (SSD, 17 × 17); the disparity error (SSD, 17 × 17); the disparity (WMDL); and the disparity error (WMDL), respectively. The same behavior can be observed near the edges and the flat planes as in the previous experiment.

Experiment III. In this experiment the real images shown in Fig. 9 are used. Each of the image sizes is 832 × 624 (pixel). The comparison between the SSD method and the WMDL method is performed by using the WMDL value, with the SSD conditions converted to WMDL values. The results are shown in Table III.
In the WMDL method, 8.07% to about 12.19% of the WMDL value decreased in comparison with the SSD method. The experimental results (I, II, and III) show the efficiency of the proposed method. Conclusions A stereo matching method for 3-D scenes has been described. Stereo matching can be formulated as a prediction problem in which the right image is predicted by using the left image (and vice versa). The problems addressed have been a definition of a search space for each scanline of stereo matching and a formulation of the determination of the optimal path in the search space. We have proposed the WMDL criterion based SSD method in which parameters of the problem are optimized by the WMDL criterion to resolve the problems of the SSD modeling. The stereo matching algorithm has been tested on both synthetic and real images. In the experiment using a synthetic image, the disparity error of the WMDL method decreased by 3.1% to about 18.3% in comparison with the SSD method. In the experiments using another synthetic image, the disparity error of the WMDL method decreased 13.5% to about 18.9% in comparison with the SSD method. In the experiments using a real image, the WMDL value 314 Journal of Imaging Science and Technology TABLE II. The SSD and the WMDL Comparison for Synthetic Images of Two Square Plates Against a Background Methods SSD WMDL 3× 5× 7× 9× 11 × 13 × 15 × 17 × Disparity error (%) 3 5 7 9 11 13 15 17 ξ = 1.00 0.95 0.90 43.6 42.7 42.6 (best) 42.9 43.3 44.7 46.4 48.0 (worst) 29.1 TABLE III. The SSD and the WMDL Comparison of Real Images Comparison Methods ➀ ➁ WMDL SSD 3 × 3 SSD 5 × 5 SSD 7 × 7 SSD 9 × 9 SSD11 × 11 SSD13 × 13 SSD15 × 15 SSD17 × 17 Decreased (➀-➁) WMDL value (%)) 8.07 8.07 8.84 9.65 10.38 11.05 11.65 12.19 decreased 8.07% to about 12.19% in comparison with the SSD method. Experimental results have shown the efficiency of the proposed method. In this article, only gray scale images are used in the experiments. In the future we will apply the proposed method to color images. Matsushiro, et al. (a) (b) (c) (d) (e) (f) (g) Stereo Matching by Using a Weighted Minimum Description of Length Method... Figure 6. (a) The true disparity; (b) the SSD (3 × 3) disparity; (c) the SSD (3 × 3) disparity error; (d) the SSD (17 × 17) disparity; (e) the SSD (17 × 17) disparity error; (f) the WMDL disparity; (g) the WMDL disparity error. Vol. 42, No. 4, July/Aug. 1998 315 (a) (b) Figure 7. (a) The left image; (b) the right image. n k ( ) Appendix A: Derivation of the WMDL Criterion10 The WMDL criterion is derived from a weighted log lw = ∑ wi log Qk , θˆ (i ) + ∑ wi log 1 e j . likelihood ∑ wi log Qk ,θ (i ) of a probability distribution By applying Eq. A-2 to Eq. A-3 and by the Taylor expansion of Eq. A-3 around θ = θˆ , the following equation is derived: [] i =1 j =1 (A-3) n i =1 n n ∏ Qk ,θ (i ) instead of the log likelihood ∑ log Qk ,θ (i ) that is i =1 i =1 the basis of the minimum description length (MDL) criterion, where log indicates the natural logarithm and wi(i = 1,2, . . . ,n) indicates weighted coefficients that satisfy wi–1 < wi, wn = 1. The best parameter θ̂ is the most likelihood estimation of about the weighted log likelihood, as follows: n θˆ = arg min − ∑ wi log Qk ,θ (i ). i =1 θ (A-1) ( ) k n n lw = ∑ wi log Qk ,θ (i ) + (1 2)e l ∑ wi M e + ∑ log 1 e j + Rn , (A-4) j =1 i =1 i =1 where Rn = M= the rest term a k × k matrix of Mk1,k2(k1,k2 = 1,2,…,k) elements Mk1,k 2 = n ∂2 n − ∑ wi log Qk ,θ (i ) ∑ wi . 
i =1 θ =θˆ ∂θ k1∂θ k 2 i =1 The description of the probability of a model is equivalent to the description of θ̂ in a finite precision. The description of θ̂ in a finite precision can be formulated as follows: In Eq. A-4, the first and second differential terms are considered. And the first differential term equals zero, because [θˆ] = θˆ + e, − ∑ wi log Qk ,θ (i ) (A-2) where θ̂ = the most likelihood estimation of θ concerning the weighted log likelihood e = e1 e2 M an error vector of finite precision e k [ θ̂ ] = a finite precision description of θ̂ . The probability distribution of a model is Qk,[θ̂ ] , and it is necessary to discriminate reciprocal numbers of ej to describe the j’th [ θ̂ ] element. The log description length of the discrimination is log(1/ej) and the total description length is as follows: 316 Journal of Imaging Science and Technology n i =1 takes a pole value (the minimum value) at θ = θˆ by of the definition. The n ∑ wi log Qk ,θ (i ) i =1 n ∑ wi i =1 value in Mk1,k2 is a weighted averaged log likelihood that corresponds to the averaged log likelihood n ∑ wi log Qk ,θ (i ) n i =1 in the MDL derivation procedure. The ej value that minimizes lw value can be derived by differentiating the summation of the second and the third terms in Eq. A-4 about ej as follows: ej = dj n ∑ wi , (A-5) i =1 where dj is the constant value depending on M. Matsushiro, et al. (a) (b) (c) (d) (e) (f) (g) Stereo Matching by Using a Weighted Minimum Description of Length Method... Figure 8. (a) The true disparity; (b) the SSD (3 × 3) disparity; (c) the SSD (3 × 3) disparity error; (d) the SSD (17 × 17) disparity; (e) the SSD (17 × 17) disparity error; (f) the WMDL disparity; (g) the WMDL disparity error. Vol. 42, No. 4, July/Aug. 1998 317 ( ( ) ) n n D Q* Qk ,θ = 1 ∑ wi ÷∑ wi log Q* (i ) Q k ,θ (i ) . i =1 i =1 ( (A-9) ) Theorem 2. The weighted divergence D Q* Qk,θ takes a positive finite value. Proof. Under the assumption that a weighted average is equivalent to the expectation value E{•}, the following relation is derived: 1 ∑ wi ÷ ∑ wi log (Q* (i) n i= 1 n i= 1 { Qk,θ (i) ) )} (i) Q* (i))} = − log ∑ Q* (i)(Q (i) Q* (i)) ( ≥ − log E{(Q = E − log Qk,θ (i) Q* (i) k ,θ (A-10) k ,θ (a) Q = − log 1 = 0. ( ) So the weighted divergence D Q* Qk ,θ takes a positive value. The Q*(i)/Qk,θ(i) (i = 1,2,…,n) takes a finite value that let U denote the maximum value of log(Q*(i)/Qk,θ(i)), and the following relation is derived: 0< ∑ wi log(Q* (i) n i= 1 ) n Qk,θ (i) < U ∑ wi . (A-11) i= 1 By using Theorem 1, the right side of Eq. A-11 takes a finite value and the weighted divergence D Q* Qk ,θ takes a positive finite value. ( Theorem 3. Let’s assume that the disparity distortion can be included in θ. Disparity distortion can be modeled as a noise term added to the WMDL value of the true probability distribution. Proof. Let’s assume q* to be the true model parameter. The first-order Taylor expansion of Eq. A-7 is as follows: (b) Figure 9. (a) The left image; (b) the right image. By applying Eq. A-5 to Eq. A-4, ignoring the second term, and ignoring the constant term separated from the third term and Rn, the following equation is derived: n n lw = ∑ wi log Qk ,θˆ ( Si ) + (1 2)k log ∑ wi ÷. i =1 i =1 (A-6) By minimizing Eq. A-6 about k and indicating the optimal k as k̂ , the WMDL is derived as follows: n n WMDL = ∑ wi log Qkˆ , ˆ (i ) + (1 2)kˆ log ∑ wi ÷. θ i =1 i =1 (A-7) Appendix B Theorem 1. Under the assumption that wi–1 < wi (i = 1,2,…,n), wn = 1, n ∑ wi takes finite value even if n → ∞. i =1 Proof. 
There exists a value γ that satisfies the relation wi ≤ γ n–i (i = 1,2,…,n). n −i < 1 (1 − γ ) . ∑ wi ≤ ∑ γ n n i =1 i =1 (A-8) The right side of Eq. A-8 takes a finite value and the left side value takes a finite value even if n→∞. Definition. Let Q* denote the true probability distribution. A weighted divergence D between Q* and Qk,θ is defined as follows: 318 Journal of Imaging Science and Technology ) n n WMDL = ∑ wi log Qkˆ ,θˆ (i) + (1 2)kˆ log ∑ wi ÷ i= 1 i= 1 (A-12) (( n n = ∑ wi log Qkˆ ,θ (i) + (1 2)kˆ log ∑ wi ÷ + O D Q* Qk,θ * i= 1 i= 1 )) + R . n The third term can be seen to be noise additive to the WMDL value of the true probability distribution. References 1. M. Tadenuma and I. Yuyama, Optimization of matching-point-detection in stereoscopic images, in Proc. of the Institute of Television Engineers of Japan Annual Convention, Japan, 1993. 2. J. L. Barron, D. J. Fleet and T. A. Burkitt, Performance of optical flow techniques, in IEEE Proc. CVPR, IEEE, Piscataway, NJ, 1992, p. 236. 3. T. Kanade and M. Okutomi, A stereo matching algorithm with an adaptive window: theory and experiment, Technical Report CMU-CS-90, School of Computer Science, C. M. U., Pittburgh, PA, 15213 (1990). 4. T. Azuma, K. Uomori and A. Morimura, Motion estimation from different size block correlation, in Proc. of the Institute of Television Engineers of Japan 3-D Image Conference, Japan, 1994, p. 33. 5. J. Rissanen, Stochastic complexity and modeling, Ann. Statist. 14, 1080 (1986). 6. J. Rissanen, Modeling by shortest data description, Automatica, 14, 465 (1978). 7. J. Rissanen, A universal prior for integers and estimation by minimum description length, Ann. Statist. 11, 416 (1983). 8. J. Rissanen, Universal coding, information, prediction and estimation, in IEEE Trans. Inf. Theory, Vol. IT-30, 629 (1984). 9. N. Matsushiro and K. Kurabayashi, Stereo matching based on mdl based ssd method, in Proc. of the Society for Imaging Science and Technology 50th Annual Conference, Boston US, 645 (1997). 10. N. Matsushiro, Considerations on the weighted minimum description length criterion, J. Inst. Tele. Eng. Japan, 50 (4), 483 (1996). Matsushiro, et al. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Diffuse Illumination as a Default Assumption for Shape-From-Shading in the Absence of Shadows Christopher W. Tyler Smith-Kettlewell Eye Research Institute, San Francisco, California 94115 USA Sinusoidal luminance patterns appear dramatically saturated toward the brighter regions. The saturation is not perceptually logarithmic but exhibits a hyperbolic (Naka–Rushton) compression behavior at normal indoor luminance levels. The object interpretation of the spoke patterns is not consistent with the default assumption of any unidirectional light source but implies a diffuse illumination source (as if the object were looming out of a fog). The depth interpretation is, however, consistent with the hypothesis that the compressed brightness profile provided the neural signal for perceived shape, as an approximation to computing the diffuse Lambertian illumination function for this surface. The surface material of the images is perceived as non-Lambertian to varying degrees, ranging from a chalky matte to a lustrous metallic. 
Journal of Imaging Science and Technology 42: 319–325 (1998) Introduction It is common to assume that the perception of the shape of an object from its shading image follows a few simple principles based on default assumptions about the light source and surface properties. For example, much of the computer vision literature makes the assumption of a spatially limited (or approximately point) source of light and surfaces of Lambertian (or uniform matte) reflectance properties. Such assumptions are commonly supposed to provide reasonable approximations to the typical interpretations of the human perceptual system (at least in the absence of explicit highlight features). In fact, however, the present analysis will show that there is wide variation in the interpreted surface quality depending on minor variations in the luminance profile of the shading image. Human observers do not seem to make a default assumption about reflectance properties, but to impute them for the particular shading image. Moreover, their interpretation of simple shading images is not consistent with the point-source assumption. The theoretical expectations for a variety of illuminant assumptions is examined in an attempt to determine what default assumption is made by human observers. Prior work in diffuse illumination includes extensive analysis of the properties of diffuse illumination by Langer and Zucker1 (considered in detail below) and a study of fogging of non-Lambertian objects by Barun.2 Although these studies consider the inverse problem of estimating the shape of the surface from the resultant luminance in- Original manuscript received November 10, 1997 † Email: cwt@skivs.ski.org; Web: www.ski.org/cwt; Fax:415-561-1610 © 1998, IS&T—The Society for Imaging Science and Technology formation, they do not address the specific ambiguities and distortions that are the topic of the present work. It is well known that the surface depth corresponding to a particular (submaximal) luminance value is indeterminate for point-source illumination, because its luminance is controlled by its angle to the surface normal. For the onedimensional surface, this ambiguity reduces to two possible values. It is less clear from the cited studies that the same ambiguity pertains to the diffuse illumination case, where surfaces become darker as they lie deeper in “holes.” This kind of issue and the processes that the human brain may use to decode the surface shape are the topic of the present analysis. The focus will be on shading images based on sinusoidal and related luminance functions. As an initial demonstration of the shapes perceived from sinusoidal shading images, Fig. 1 depicts three spoke patterns in which there is repetitive modulation as a function of radial angle. The first pattern has a linear sinusoidal profile, the second is predistorted so as to have an approximately sinusoidal appearance to most observers, and the third is further distorted so as to appear as an accelerating function with wider dark bars than light bars. Note that, in this radial format, there is a strong tendency to perceive these luminance profiles as deriving from three-dimensional surfaces. What are the properties of the perceived surfaces? Although the generator function is one-dimensional, we are able to estimate simultaneously the surface shape, the reflectance properties, and something about the illuminant distribution. We thus parse the one-dimensional luminance function at a particular radius in the image into three distinct functions. 
Such parsing can occur only if the visual system makes default assumptions about two of the functions. The question to be addressed is what default assumptions are made? Because the patterns are radially symmetric, the illuminant distribution must itself be symmetric (or the 319 (a) (b) (c) Figure 1. Depictions of sinusoidal spoke patterns with various levels of brightness distortion; (a) linear sinusoid; (b) perceptually sinusoidal compensation, accelerating hyperbolic distortion to provide sinusoidal appearance; (c) overcompensation for perceptual inversion, extreme hyperbolic distortion to appear as an accelerating distortion. Best approximation to intended appearance will be obtained if viewed from a distance so that pixellation is not visible. Sequence of Operations in Shape Perception. SURFACE SHAPE INCIDENT ➤ ILLUMINATION OF SURFACE ➤ REFLECTANCE FUNCTION ➤ BRIGHTNESS COMPRESSION ➤ SHAPE RECONSTRUCTION Figure 2. The sequence of operations involved in the perception of the shape of a viewed object from the luminance shading information. Does the visual system reconstruct the full sequence or use the simplifying assumption that the output approximates the input? shading on different spokes would vary with the orientation of the spokes relative to the direction of the illuminant). Thus the only possible variation of illuminant properties is the degree of diffusion of the illuminant from a point source (positioned above the center of the surface). To most observers, the surface appears to be of matte (or Lambertian) material in Fig. 1(a) and to become progressively more lustrous in Figs. 1(b) and 1(c). Somehow, the human visual system partitions the single function in each image into separate shape, reflectance and illumination functions. This study is an initial attempt to explore the rules by which such partitioning takes place. Compressive Brightness Distortion. Before proceeding with the analysis of surface properties, first consider the simple compressive distortion of the brightness image. If the surface properties are ignored for the moment, the direct brightness profile of Fig. 1 does not appear to be sinusoidal: the dark bars look much narrower than the bright bars (based on the perceived transition through midgray). This narrowing effect is far more pronounced in 320 Journal of Imaging Science and Technology high-contrast images on a linearized CRT screen than in this printed example, which has a contrast of about 95%. For several reasons, it is probable that the perceived distortion arises at the first layer of visual processing, the output of the retinal cone receptors (Macleod et al.,3 Hamer and Tyler.4 However, the focus here is on the distortion’s perceptual characteristics, not its neural origin. It is reasonable to be skeptical of the linearity of the reproduction of Fig. 1. A simple test of the accuracy of its linearity is to view the figure in (very) low illumination, after dark-adapting the eyes for a few minutes. In such conditions, the visual system defaults to an approximately linear range, and it can be seen that Fig. 1(a) now appears to have roughly equal widths of the bright and dark bars. In terms of shape-from-shading issues, the question arises whether the depth interpretation mechanism of the visual system “knows” that it is being fed a distorted input. 
The most adaptive strategy, for either genetic specification or developmental interaction with the environment, would be for the brightness distortion to be compensated in the depth interpretation process so that the perceived brightness distortion does not distort the depth interpretation (Fig. 2). However, the observed depth interpretation from these patterns seems to follow closely the waveform of the perceived brightness profile; when the brightness is perceived as sinusoidal [Fig. 1(b)], the surface is perceived as a roughly sinusoidal "rosette." When the brightness pattern is perceived as having narrow dark bars [Fig. 1(a)], the surface is perceived more like a ring of cones with narrow valleys between them. Fig. 1(c) continues this trend, although a second principle of change in surface properties now appears. The question to be addressed is: what principles is the visual system using in deriving its surface interpretation from the luminance profile?
The direct relationship between perceived brightness and surface depth that is the typical perception of the patterns of Fig. 1 is surprising in relation to the luminance profiles that should be expected from geometric reflectance considerations. For example, in Fig. 1(b) the surface appears approximately sinusoidal and peaks in phase with the peaks of the luminance image. As the following illumination analysis will show, this interpretation is completely incompatible with point-source illumination in any position. This incompatibility is surprising in view of the widespread use of the point-source assumption in the field of computer vision. Development of a diffuse illumination analysis then provides an explanatory basis for the observed perceptual interpretations. An additional benefit of the diffuse illumination analysis is that it shows how the direct relationship between perceived brightness and surface depth perception is compatible with the operation of a compensation for early brightness compression in the perceived brightness function.
Properties of Diffuse Illumination. Although much of the computer vision literature has concentrated on illumination by point sources, Langer and Zucker1a,b have laid the groundwork for the analysis of the luminance properties arising from diffuse illumination. The basic assumption is that the illuminance of any point on a surface is the integral of the incident light at that point. This amounts to the cross-section of the generalized cone of rays reaching that point through the aperture formed by the rest of the surface. Its properties are described by the sum of a direct illumination term, a (first-order) self-illumination term of reflections from other surface points, and a residual ε encompassing higher order self-illumination terms:

R(x) = (ρ/π) ∫ν(x) Rsrc [N(x) · u] dΩ + (ρ/π) ∫η(x)\ν(x) R(Π(x,u)) [N(x) · u] dΩ + ε,   (1)

where x is a surface point, N(x) is the surface normal, η(x) = {u: N(x) · u > 0} is the hemisphere of outgoing unit vectors, ν(x) ⊆ η(x) is the set of directions in which the diffuse source is visible from x, dΩ is an infinitesimal solid angle, and Π(x,u) is the self-projection to the surface from point x in direction u.
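To make the role of the occlusion set ν(x) concrete, the direct term of Eq. 1 can be evaluated numerically for a one-dimensional profile along the following lines. This is a minimal sketch under stated assumptions (a sinusoidal surface, self-illumination and the residual term ignored, schematic normalization); the variable names are illustrative and not from the original analysis.

import numpy as np

# Illustrative sketch of the direct term of Eq. 1 for a 1-D sinusoidal height
# profile under uniform diffuse illumination. Self-illumination and the
# residual term are ignored (cf. Eq. 2 with k = 0); normalization is schematic.
rho, R_src = 1.0, 1.0                       # albedo and source radiance
x = np.linspace(0.0, 4.0 * np.pi, 200)      # sample positions
z = 0.8 * np.sin(x)                         # surface height profile
dz = np.gradient(z, x)                      # surface slope

def visible(i, theta, t_max=20.0, n_steps=200):
    """True if the ray leaving point i at elevation angle theta clears the surface."""
    t = np.linspace(0.05, t_max, n_steps)
    xr = x[i] + t * np.cos(theta)
    zr = z[i] + t * np.sin(theta)
    return np.all(zr >= np.interp(xr, x, z, left=-np.inf, right=-np.inf))

thetas = np.linspace(0.01, np.pi - 0.01, 90)    # upward (sky) directions only
R_direct = np.zeros_like(x)
for i in range(len(x)):
    n = np.array([-dz[i], 1.0]) / np.hypot(dz[i], 1.0)   # unit surface normal
    for th in thetas:
        u = np.array([np.cos(th), np.sin(th)])
        if n @ u > 0 and visible(i, th):
            # Lambertian cosine weighting of each unoccluded direction in nu(x)
            R_direct[i] += (rho / np.pi) * R_src * (n @ u) * (thetas[1] - thetas[0])
# R_direct peaks at the surface crests (which see the whole sky) and dips in
# the troughs, qualitatively matching the diffuse profile discussed below.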
The properties of self-illumination by reflection to a point from nearby surfaces have been treated for diffuse illumination by Stewart and Langer.5 Although some special cases deviate in detail, they show that for complex surfaces the self-illumination component tends to operate as a multiplicative copy of the direct term, so that the whole equation for R(x) may be approximated by the first term multiplied by a constant close to 1:

R(x) ≈ (1 + k) (ρ/π) ∫ν(x) Rsrc [N(x) · u] dΩ.   (2)

Figure 3. Lambertian reflectance profiles for a sinusoidal surface (a) under three illumination conditions; (b) point-source illumination from infinity at a grazing angle to the left-hand slopes; (c) point-source illumination from infinity directly above the surface; and (d) diffuse illumination from all directions.

Intuitively, this simplification occurs because the maximum self-illumination generally arises from surfaces of similar luminance to the point under consideration. This result is particularly clear for surfaces that are symmetric with respect to the average surface normal (such as a V-shaped valley), where the closest points across the valley are those at the same height as a chosen point. Stewart and Langer show that even extreme departures from this symmetry (such as an overhanging cliff) introduce only relatively mild distortions into the net diffuse illumination function.
Illumination Analysis. The general principles of luminance profiles based on Lambertian objects are well known, but it is instructive to consider the variety of luminance patterns that may arise from a simple object such as a sinusoidal surface under different illumination conditions, for comparison with human perceptual performance in the reconstruction of shape from shading when the light source is unknown. For point sources at infinity, the angle of incidence is a critical variable. For the alternative assumption of diffuse illumination, the principal factor is the acceptance angle outside which the diffuse illumination is blocked from reaching a particular point on the surface. The assumptions for the following analysis are:
1. The surface has constant albedo (inherent reflectance).
2. The surface has Lambertian reflectance properties.
3. Secondary reflections from one part of the surface are negligible for point-source illumination and as described in Eq. 2 for diffuse illumination.
The Lambertian reflectance assumption is that the surface illumination is proportional to the sine of the angle of incidence at the surface (equivalently, to the cosine of the angle between the incident ray and the surface normal) and that the reflectance is uniform at all angles. Hence, the reflected light is assumed to follow the cosine rule of proportionality to the cosine of the angle of incidence relative to the surface normal.
Figure 3 shows (top) the profile of a sinusoidal surface, below which are three luminance profiles for selected illumination conditions designed to illustrate the variety of outputs. Because the surface is assumed Lambertian, the reflected luminance is proportional to the incident illumination and hence proportional to the cosine of the angle of the surface to the viewer. The first luminance profile is derived from a point source at infinity whose angle grazes (is tangential to) the left-hand descending slopes of the sinusoidal surface. Hence, the reflected luminance is lowest at the position of the grazing slope and highest along the opposite slope, as shown by Fig. 3(b).
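For illustration, the two point-source profiles of Figs. 3(b) and 3(c) can be generated along the following lines. This is a hedged sketch, not the original computation: it assumes unit surface amplitude, neglects cast shadows (acceptable for the tangential grazing case), and uses illustrative names such as lambertian_profile.

import numpy as np

# Lambertian luminance of a sinusoidal surface z(x) = a*sin(x) under a point
# source at infinity with unit direction s, following the cosine rule.
def lambertian_profile(x, a, source_dir):
    dz = a * np.cos(x)                              # slope of z = a*sin(x)
    normals = np.stack([-dz, np.ones_like(x)], axis=1)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    s = np.asarray(source_dir, dtype=float)
    s /= np.linalg.norm(s)
    return np.clip(normals @ s, 0.0, None)          # shadowed orientations -> 0

x = np.linspace(0.0, 4.0 * np.pi, 500)
a = 1.0
grazing = lambertian_profile(x, a, source_dir=(-1.0, 1.0))   # oblique source, tangential to one set of slopes when a = 1
overhead = lambertian_profile(x, a, source_dir=(0.0, 1.0))   # source directly above the surface

# 'grazing' keeps one luminance cycle per surface cycle, as in Fig. 3(b);
# 'overhead' is identical on rising and falling slopes, so its period halves,
# i.e., the luminance profile is frequency doubled, as in Fig. 3(c).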
Note that, in this position, the luminance profile has the same number of cycles as the original surface (though distorted rather than being a strict derivative). The second luminance profile is derived from a point source at infinity directly above (normal to) the surface [Fig. 3(c)]. Because the peaks and troughs of the surface waveform are at the same angle, they have the same Lambertian reflectance and hence produce a frequencydoubled luminance profile. For 100% luminance modulation, this profile is close to sinusoidal as described in the following section. Here the point is that a quantitative shift in the angle of incidence of the point source produces a qualitative change in the resulting luminance profile of the same object. The third luminance profile [Fig. 3(d)] is derived from the assumption of a diffuse illumination source rather than point source. The resulting luminance profile is again very different from the other two based on point sources. These examples are chosen to illustrate the complexity of the interpretation of shape from shading, because a given shape can give rise to qualitatively different shading profiles depending on the assumed source of illumination. When confronted with a luminance profile that is actually sinusoidal, does the human observer assume that it is a frequency-doubled reflection of a underlying surface of half that frequency, the diffusely illuminated profile of a nonsinusoidal surface, or a non-Lambertian surface, etc.? Geometric Derivation. To develop the theoretical reflectance functions of Fig. 3 required two stages: computation of the angle-of-incidence functions for the selected illumination conditions according to Eq. 2 and conversion to reflectance functions through the Lambertian reflectance assumption. The sinusoidal surface profile is shown again for reference in Fig. 4, below which are plots of the angle of incidence for three different illumination conditions. The first angle-of-incidence function [Fig. 4(b)] is derived from a point source at infinity whose angle grazes (is tangential to) the left-hand descending slopes of the sinusoidal surface. Hence, the angle of incidence is zero at the position of the grazing slope and highest along the opposite slope, as shown by Fig. 4(b). This curve will itself be sinusoidal if (and only if) the amplitude of the surface sinusoid (top curve) is such that opposite flanks are at a 90º angle to each other. Note that, in this position, the angleof-incidence function has the same number of cycles as the original surface function (though shifted in phase in the direction of the angle of the incident light). The second angle-of-incidence function [Fig. 4(c)] is derived from a point source directly above (normal to the mean orientation of) the surface. Because the peaks and troughs of the surface waveform are at the same angle, they produce a frequency-doubled luminance profile that is asymmetric with respect to its peaks and troughs. The third angle-ofincidence function [Fig. 4(d)] is derived from the assumption of a diffuse illumination source rather than a point 322 Journal of Imaging Science and Technology Figure 4. Net angle-of-incidence profiles for a sinusoidal surface (a) under three illumination conditions; (b) point-source illumination at a grazing angle to the left-hand slopes; (c) point-source illumination directly above the surface; and (d) diffuse illumination from all directions. source. 
The light is assumed to be coming equally from all directions but to be occluded if any part of the surface lies in its path according to Eq. 2. The resulting luminance profile is again very different from the other two based on point sources. The derivation of the diffuse illumination profile of Fig. 4(d) is depicted in Fig. 5. For a particular point p on the upper trace of the surface being viewed, the acceptance angle is the angle between the line passing through point p that is tangent to the surface on the left [Fig. 5(b)] and the one that is tangent to the surface on the right [Fig. 5(c)]. The sum of the two angles φL and φR defines the acceptance angle for each point on the surface. Within this acceptance angle, the light from all directions has to be integrated according to the Lambertian cosine rule for each direction of the diffuse illumination relative to the orientation of the surface, as specified in Eq. 2. The net result of the diffuse illumination analysis is shown for the sinusoidal surface by the lowest curve of Figs. 3 and 4. Note that this curve peaks at a value of π at each peak of the waveform but drops to some lower (nonzero) value depending on the absolute depth of the sinusoidal modulation of the surface. Interestingly, the acceptance angle is not a well-known function such as a catenary but has marked shoulders between relatively straight regions. Note that the flatness of the lower portion implies that the trough of the sinusoid approximates the shape of a circle, which has a constant acceptance angle relative to a gap in its surface (as was demonstrated by Euclid).

Figure 5. Derivation of the diffuse illumination profile for the sinusoidal surface (a); (b) surface tangent to the left of each point along the surface; (c) surface tangent to the right of each point along the surface; and (d) net acceptance angle at each point.

Discussion
The conclusion from the analysis of the three paradigm cases in Fig. 3 is that, contrary to the appearance of the images in Fig. 1, there is no point-source illumination assumption of a sinusoidal Lambertian surface form that would give rise to a periodic luminance profile matching the frequency and phase of the surface waveform (as is perceived by the human observer). The only luminance function that has the observed frequency and phase relative to the peaks of the surface is the diffuse one, and even it is much more cuspy than a sinusoid. It therefore seems clear that the human observer is defaulting to a diffuse illumination assumption, in contrast to the point source typically assumed for computer graphic displays.
Perception of Sinusoidal Patterns. With the analysis in hand, we may now analyze the perception of the patterns of Fig. 1. The most important result is that these patterns do give pronounced depth perceptions, even though they are qualitatively incompatible with any position of point-source illumination. These reports correspond most closely to the diffuse reflectance profile of Fig. 3 (bottom curve), as looking like a surface with peaks at the positions of the luminance peaks. However, the case where the brightness profile [Fig. 1(b)] looks most sinusoidal corresponds to the case where the perceived surface has the most sinusoidal shape. This seems odd because a sinusoidal surface is predicted to have a much more peaked luminance distribution according to the diffuse illumination assumption [Figs. 3(d) and 4(d)]. Note that typical deviations from the Lambertian and the diffuse assumptions will both enhance the discrepancy. If the surface had a reflectance function that is more focused than the Lambertian, it would tend to increase the luminance in the direction of the observer and hence make the peaks of the assumed surface brighter relative to the rest. Similarly, if the illumination source were more focused than a pure diffuse source, it would introduce a second-harmonic component into the reflectance function similar to Fig. 3(c), which would again enhance the peaks and also introduce a bright band in the center of the dark strips. Hence, the diffuse illumination function at the bottom of Fig. 3 is the least peaked function to be expected from any single illumination source.
Role of Perceptual Response Compression. Human vision is, of course, not linear as a function of image luminance L but shows a saturating compression of the internal response R that seems to be most closely approximated by a hyperbolic function (like the Naka–Rushton equation for receptor response saturation), as described in Chan et al.6,7 and Tyler and Liu.8 The optimal equation was of the form

R = aL/(L + σ).   (3)

Figure 6 illustrates how such a brightness compression behavior can result in an output that approximates the original surface shape. For a sinusoidal surface [Fig. 6(a)] the diffuse reflectance function under Lambertian assumptions is the peaky function of Fig. 6(b). The effect of a hyperbolic compression on this waveform is shown in Fig. 6(c) to result in an approximately sinusoidal output waveform. For comparison, the effect of the same hyperbolic compression on a straightforward sinusoidal waveform is shown in Fig. 6(d), appearing strongly asymmetric in terms of the peak versus trough shapes. It is thus plausible that the shape-processing system could use the compressed brightness signal as a simple means of deriving the original surface shape from the diffuse reflectance profile. If the visual system does indeed use its inbuilt brightness compression as a surrogate for a more elaborate reconstitution algorithm of the shape from shading under diffuse illumination assumptions, the approximation should work for other typical surface waveforms. One example to test this hypothesis is a cylindrical waveform corresponding to a one-dimensional version of the sphere that is used widely in computational vision (and which corresponds to the most-simplified form of an isolated object in the world). A cylindrical waveform is depicted in one-dimensional cross-section in Fig. 7(a), although the vertical axis is extended relative to a purely circular cross-section. The subsequent panels, in the same format as Fig. 6, show the diffuse reflectance profile, the effect of brightness saturation on this profile, and a simple sinusoid with the same degree of compression. Notice that the saturated diffuse profile again looks similar to the surface waveform, supporting the idea that the brightness-compressed signal can generally act as a surrogate for the back-computation of the surface waveform. In this case, the compressed sinusoid looks somewhat similar to the surface waveform also, which may explain why the linear sinusoid of Fig. 1(a) resembles a ring of conical "dunce caps" (because a cone is a version of a cylinder with a converging diameter). If the visual system treats the brightness-compressed signal as an approximation to the depth profile of the object under diffuse illumination, any object that generates a similar signal after brightness compression should appear to have a similar shape.

Figure 6. Role of response compression in the interpretation of depth from shading. (a) Sinusoidal surface shape; (b) net reflectance profile assuming diffuse illumination and Lambertian reflectance function; (c) perceived brightness signal after hyperbolic saturation (note similarity to the original surface waveform); (d) the same degree of hyperbolic saturation applied to a sinusoidal signal, to illustrate how much brightness distortion is perceived in Fig. 1(a) under high illumination.

Figure 7. A second example of response compression in the interpretation of depth from shading. (a) Cyclic surface shape. (b) Net reflectance profile assuming diffuse illumination and Lambertian reflectance function. (c) Perceived brightness signal after hyperbolic saturation (note similarity to the original surface waveform). (d) The same degree of hyperbolic saturation applied to a sinusoidal signal, to illustrate the similarity of the result to (a) and (c).
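The compression argument of Eq. 3 and Fig. 6 can be illustrated with a short numerical sketch. The cusped stand-in used below for the diffuse reflectance profile, and the value of σ, are assumptions made here for illustration only; they are not the computed function of Fig. 3(d).

import numpy as np

# Hedged sketch of the role of response compression (Eq. 3, Fig. 6).
def naka_rushton(L, a=1.0, sigma=0.3):
    return a * L / (L + sigma)                 # hyperbolic (saturating) compression

x = np.linspace(0.0, 4.0 * np.pi, 1000)
surface = 0.5 * (1.0 + np.sin(x))              # sinusoidal surface shape, 0..1

# Peaky luminance stand-in: bright at the crests, pinched in the troughs,
# qualitatively like the diffuse Lambertian profile of Figs. 3(d) and 6(b).
diffuse_like = surface ** 3

compressed_diffuse = naka_rushton(diffuse_like)    # cf. Fig. 6(c)
compressed_sine = naka_rushton(surface)            # cf. Fig. 6(d)

def normalize(u):
    return (u - u.min()) / (u.max() - u.min())

rms = lambda u, v: np.sqrt(np.mean((normalize(u) - normalize(v)) ** 2))
print("compressed diffuse-like vs. surface rms:", round(rms(compressed_diffuse, surface), 3))
print("compressed sinusoid vs. surface rms:", round(rms(compressed_sine, surface), 3))
# For this choice of stand-in profile and sigma, the compressed diffuse-style
# signal tracks the surface waveform more closely than the compressed sinusoid,
# illustrating why the compressed brightness signal can serve as a surrogate
# depth profile under the diffuse-illumination default.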
Finally, some brief thoughts on the different qualities of surface material perceived in Fig. 1. Given that the image that appears Lambertian is the one that resembles the ring of dunce caps with circular cross-section, it may be that the visual system has a Bayesian constraint to prefer a solution that corresponds to such discrete objects rather than a continuously deformed surface. If so, shape reconstructions that deviated from such a circular cross-section (in the absence of explicit contour cues) may tend to be interpreted as deviations from the Lambertian assumption rather than deviations from the assumption of circular cross-section. It is not intended for the present work to provide an empirical analysis of this question but merely to frame the hypothesis.

Conclusion
The object interpretation of the spoke patterns of Fig. 1 is not consistent with the default assumption of any unidirectional light source but implies a diffuse illumination (as if the object were looming out of a fog). The existence of such a default for human vision of shape from shading has not been previously described to our knowledge. Note that similar percepts are obtained for linear sinusoids of high contrast (such as a "stack of cigarettes"), although the sense of shape-from-shading is weaker initially. No one ever seems to see a linear sinusoid in a rectangular aperture according to the predictions of Fig. 3(b) for a local illumination source, even though there is now no orientational symmetry to force a symmetric source illumination. Thus, the default to diffuse illumination appears to be general unless specific cues imply an oriented source (e.g., Ramachandran9). Given default diffusion, the depth interpretation is consistent with the hypothesis that the visual system uses the compressed brightness profile directly as the neural signal for perceived shape. It is shown that this equivalence is a reasonable approximation to computing the diffuse Lambertian illumination function for this surface.
This match provides the visual system with a rough-and-ready algorithm for shape reconstruction without requiring elaborate back-calculation of the brightness compression and integral angle-of-acceptance functions through which the diffuse illumination image was built. Acknowledgment. Supported by NEI grant No. 7890. Tyler References 1. (a) M. S. Langer and S. W. Zucker, Casting light on illumination: a computational model and dimensional analysis of sources, Comp. Vis. Image Understand. 65, 322–335 (1997); (b) M. S. Langer and S. W. Zucker, Shape from shading on a cloudy day, J. Opt. Soc. Am. 11, 467–478 (1994). 2. V. V. Barun, Imaging simulation for non-Lambertian objects observed through a light-scattering medium. J. Imaging Sci. Technol. 41, 143– 149 (1997). 3. D. I. A. Macleod, D. R. Williams and W. Makous, A visual nonlinearity fed by single cones, Vis. Res. 32, 347–363 (1992). 4. R. D. Hamer and C. W. Tyler, Phototransduction: Modeling the primate cone flash response, Vis. Neurosci. 12, 1063–1082 (1995). 5. A. J. Stewart and M. S. Langer, Towards accurate recovery of shape from shading under diffuse lighting, IEEE Trans. Patt. Anal. Mach. Intell. 19, 1020–1025 (1997). 6. H. Chan and C. W. Tyler, Increment and decrement asymmetries: Implications for pattern detection and appearance. Soc. Inf. Displ. Tech. Dig. 23, 251–254 (1991). 7. H. Chan, C. W. Tyler, P. Wenderoth, and L. Liu, Appearance of bright and dark areas: An investigation into the nature of brightness saturation, Investigative Ophthalmology and Visual Science, Suppl. B, 1273 (1991). 8. C. W. Tyler and L. Liu, Saturation revealed by clamping the gain of the retinal light response, Vis. Res. 36, 2553–2562 (1996). 9. V. S. Ramachandran, The perception of depth from shading, Sci. Am. 269, 76–83 (1988). 3-D Shape Recovery from Color Information for a Non-Lambertian Surface Vol. 42, No. 4, July/Aug. 1998 325 JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 3-D Shape Recovery from Color Information for a Non-Lambertian Surface Wen Biao Jiang, Hai Yuan Wu and Tadayoshi Shioyama† Department of Mechanical and System Engineering, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606, Japan This article presents a method for shape recovery from color image in the case of a non-Lambertian surface illuminated by only a single light source. In the first step, we use the dichromatic reflection model and obtain the directions of spectral power distributions of the light due to surface reflection and body reflection by using eigenvectors of the moment matrix of the color signals. In the second step, the parameters determining the reflectance map are identified by using the dichromatic reflection model and information of the maximum intensity point. Subsequently, the surface normal of the object is estimated and the 3-D shape is recovered. Journal of Imaging Science and Technology 42: 325–330 (1998) Introduction The approach of the shape from shading was proposed by Horn.1 The intensity of an object within an image depends on the light source position, the surface normal of the object, and the viewing direction.1,2 When the object surface is made of a material that acts as a Lambertian reflector, the intensity varies with the cosine of the angle between the incident ray and the surface normal. We see that it is simple to determine the reflectance map of a Lambertian surface, and shape recovery from image intensity (shading) is easy. 
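As a minimal illustration of this reflectance map (an assumed implementation written for this discussion, not the authors' code), the Lambertian intensity is simply the clipped scalar product of the unit light direction and the unit surface normal:

import numpy as np

# Lambertian reflectance: intensity proportional to the cosine of the angle
# between the incident-light direction i and the surface normal n.
def lambertian_intensity(i, n, albedo=1.0):
    i = np.asarray(i, dtype=float)
    n = np.asarray(n, dtype=float)
    i = i / np.linalg.norm(i)
    n = n / np.linalg.norm(n)
    return albedo * max(0.0, float(i @ n))     # zero where the surface faces away

# Example: light 5 degrees off the viewing (z) axis, the direction used in the
# experiments reported later in this article.
theta_i = np.deg2rad(5.0)
i = (np.sin(theta_i), 0.0, np.cos(theta_i))
print(lambertian_intensity(i, n=(0.0, 0.0, 1.0)))   # approximately cos(5 deg) = 0.996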
For the Lambertian case, methods of classical shape recovery from shading have been developed for the single image by Ikeuchi and Horn,2 and for two images by Onn and Bruckstein.3 But in non-Lambertian surface cases, it is difficult to recover the 3-D shape of an object only from image intensity because the reflectance map varies depending on the type of surface material of the object. For shape recovery in the non-Lambertian case, photometric stereo (PMS) methods have been developed by many researchers.4–8 The previous PMS procedures use multiple images of an object, taken under different illumination conditions, to estimate parameters determining the reflectance map and the surface normal. In these works, however, color information was not used. Schlüns9 suggested using color information to obtain the direction of spectral power distributions of the light due to surface reflection and body reflection and applied PMS to the derived matte image. Because PMS methods are based on multiple images of an object sequentially illuminated by multiple light sources, it is considered impossible to apply the methods to natural scene understanding. Furthermore, when the observed object is moving, the PMS methods give rise to a correspondence problem of points among multiple images because of sequential illumination. Original manuscript received June 17, 1997 † E-mail: shioyama@ipc.kit.ac.jp; FAX: 075-724-7300} © 1998, IS&T—The Society for Imaging Science and Technology For the purpose of natural scene understanding like that from the retina of a human being, in our previous article10 we presented a method for shape recovery from shading in the case of a non-Lambertian surface illuminated only by a single light source. We estimated the parameters determining the reflectance map by using the intensity information at the occluding boundary. Because it is difficult to observe the intensity at the occluding boundary precisely, in this article we propose another method for shape recovery from a color image of an object with a nonLambertian surface illuminated by a single light source. The advantage of using color information is that we can recover shape from shading without intensity information at the occluding boundary. In this method, the vectors in the (RGB) color space corresponding to the color signals due to surface reflection and body reflection are estimated by using eigenvectors of the moment matrix of the color signals on the basis of the dichromatic reflection model11 for the color signals. The normal of the object surface is estimated by an iterative method using the image-irradiance equation, and 3-D shape is recovered. Furthermore, to evaluate the algorithm, experimental results on several real objects are shown. Reflection Model The method of this article deals with materials that are optically inhomogeneous, meaning that light interacts both with the surface matter and with particles of a colorant that produce scattering and coloration under the surface. Many common materials can be described this way, including plastics, most paints, varnishes, paper, etc. Metals and crystals are not included in this discussion. We also limit the discussion to opaque surfaces that transmit no light from one side to the other. When light strikes a surface, some of the light is reflected at the interface producing interface reflection (or surface reflection). The direction of such reflection is in the “perfect specular direction” relative to the local surface normal. 
Most materials are optically "rough," with local surface normals that differ from the macroscopic perfect specular direction, so that the interface reflection is somewhat scattered at the macroscopic level. The light that penetrates through the interface undergoes scattering from the particles of a colorant and is either transmitted through the material (if it is not opaque), absorbed, or reemitted through the same interface by which it entered, producing "body reflection."

Figure 1. Imaging geometry.

The above inhomogeneous materials are described well by the dichromatic reflection model that was shown by Shafer to be a useful approximation. The dichromatic reflection model is stated as11

L(λ,i,n,r) = L1(λ,i,n,r) + L2(λ,i,n,r),   (1)

where λ is the wavelength of light, i is the unit vector aligned with the incident light direction, n is the surface normal, and r is the viewing direction, as illustrated in Fig. 1. Equation 1 says that the total radiance L of reflected light is a sum of two parts: the radiance L1 of light reflected at the interface and the radiance L2 of the light reflected from the body. The dichromatic reflection model makes several assumptions. The model assumes the surface is an opaque, inhomogeneous medium with one significant interface, not optically active (i.e., has no fluorescence), and uniformly colored (the colorant is uniformly distributed). The model assumes independence of spectral and geometrical properties, and the following simplifying separation may be used:

Lt(λ,i,n,r) = ct(λ)mt(i,n,r), with mt(i,n,r) ≥ 0 and t = 1,2,   (2)

where ct (t = 1,2) is the spectral power distribution of the reflected light, which is the product of the spectral power distribution of the incident light and the spectral reflectance of the surface, and mt is a geometrical scaling factor. The factor m2 is modeled by the Lambertian cosine law. Several possible models for m1 are known, for example, the Torrance–Sparrow model. In a color camera, the red color value sr is a summation of the radiance L(λ,i,n,r) at each wavelength, weighted by the responsivity of the camera combined with the red filter r̄(λ) [and so on for the green filter ḡ(λ) and the blue filter b̄(λ)]. Then the color values of the radiance L(λ,i,n,r) are given by

s ≡ (sr, sg, sb)T = (∫ L(λ,i,n,r) r̄(λ) dλ, ∫ L(λ,i,n,r) ḡ(λ) dλ, ∫ L(λ,i,n,r) b̄(λ) dλ)T.   (3)

The interval of summation is determined by the responsivity, which is non-zero over a bounded interval of λ. Substituting Eqs. 1 and 2 into Eq. 3, it follows that

sr = ∫ [c1(λ)m1(i,n,r) + c2(λ)m2(i,n,r)] r̄(λ) dλ ≡ c1,r m1(i,n,r) + c2,r m2(i,n,r).   (4)

Assuming the three filters are sensitive in the spectral channels characterized by the color names red, green, and blue, the color values in vector notation are

s = (sr, sg, sb)T = (c1,r, c1,g, c1,b)T m1(i,n,r) + (c2,r, c2,g, c2,b)T m2(i,n,r) ≡ c1 m1(i,n,r) + c2 m2(i,n,r),   (5)

c1 ≡ (c1,r, c1,g, c1,b)T,  c2 ≡ (c2,r, c2,g, c2,b)T,   (6)

where c1,g, c2,g, c1,b, and c2,b are defined in the same manner as c1,r and c2,r. This equation defines the dichromatic plane (DCP) in a coordinate system spanned by the primary colors called the RGB space (see Fig. 2). In this work, we assume the surface material is uniformly colored. Then c1 and c2 become constant vectors for all image points.
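A brief numerical sketch may clarify how color signals generated by Eqs. 5 and 6 are confined to the DCP, and how the moment-matrix eigenvectors introduced in the next section (Eq. 9) recover it. The vectors c1 and c2 and the sampled geometry factors below are arbitrary illustrative choices, not values from the article.

import numpy as np

# Synthetic color signals from the dichromatic model s = c1*m1 + c2*m2,
# followed by the moment-matrix eigen-analysis used to recover the DCP.
rng = np.random.default_rng(0)
c1 = np.array([0.9, 0.8, 0.7])              # interface (surface-reflection) color
c2 = np.array([0.2, 0.7, 0.3])              # body-reflection color
m1 = rng.uniform(0.0, 0.2, size=500)        # specular geometry factor samples
m2 = rng.uniform(0.2, 1.0, size=500)        # Lambertian geometry factor samples
S = np.outer(m1, c1) + np.outer(m2, c2)     # one color signal per row (Eq. 5)

M = S.T @ S / len(S)                        # moment matrix, cf. Eq. 9
eigvals, eigvecs = np.linalg.eigh(M)        # eigenvalues in ascending order
u3 = eigvecs[:, 0]                          # minimal-dispersion direction

# u3 is numerically orthogonal to both c1 and c2: it is the DCP normal.
print(np.round([u3 @ c1, u3 @ c2], 6))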
As models for m 1 and m 2, we use the Torrance-Sparrow model6,12 and the Lambertian law: m1(i,n,r) = exp{–c2[cos–1(ns • n)]2} (7) m2(i,n,r) = (i • n), (8) where the symbol • denotes a scalar product, ns ≡ (i + r)/ i + r is the macroscopic specular direction, • denotes a norm of the vector, and c is a constant that depends on the surface roughness. In this article, the value of c is set as 2.578. For a very rough surface it is shown that typical values for c are around 2.5. We adopt the value 2.578, which was experimentally determined by Tagare and deFigueiredo.7 For a vector, value can be expressed as a product of its unit vector and its norm. We define ĉ 1 as the unit vector aligned with c1, that is cˆ 1 = c1 / c1 , and ĉ 2 is the unit vector aligned with c2. Then, the unknown c1 and c2 can be expressed by c1 = c1 cˆ 1 , c 2 = c 2 cˆ 2 , respectively. Algorithm for 3-D Shape Inference We assume orthographic image projection for the imaging geometry being shown as Fig. 1 and let the viewing direction r be parallel to the z axis. Then, the 3-D shape of an object can be described by its height z at coordinate (x,y) in the image plane. We explain the procedure of 3-D shape recovery based on the dichromatic reflection model. We estimate ĉ 1 and ĉ 2 in the next subsection. Jiang, et al. Figure 2. (RGB) space. The Method for Estimating ĉ 1 and cˆ 2 . We assume the color signals s are normalized so that the maximum value of their intensity E is equal to 1. When si,j denotes a column vector, which represents the color signals of the observed object, at a point with coordinate (i,j) in the image plane, the moment matrix M is given as follows: Figure 3. The eigenvectors of the moment matrix. where e ∫ (0.299, 0.586, 0.115). From Eqs. 5, 6, and 10 through 12, we have 1 T M= ∑ s i, j s i, j , N ( i, j )∈Ω (9) where T denotes a transposition, Ω is defined as the set of orthographic image projections of points on the observed surface of the object, and N is the total number of points in the set Ω. Because M is a real symmetric matrix, all of the eigenvalues are nonnegative and three eigenvectors are orthogonal. Note that an eigenvector expresses a statistical characteristic of the color signals of an observed object. As illustrated in Fig. 3, the eigenvector u1 corresponding to the greatest eigenvalue expresses the direction of the centroid of all color signals si,j, (i,j) ∈ Ω, and eigenvectors u2 and u3 corresponding to the second and the third eigenvalue represent the maximal and minimal dispersive directions of all the vectors in the set Ω, respectively. Then we can obtain the DCP normal by the minimal dispersive direction u3 and obtain the vectors ĉ 1 and ĉ 2 by means of the maximal dispersive direction u2. Because the number of color signals is very small near the specular point, the vector ĉ 1 (near the specular point) and vector ĉ 2 can be easily distinguished. When the object surface material exhibits nonLambertian reflection, the reflectance map R(i,n,r) is given by6,12 E = (e • s), ˆ 1 ), ρ1 = (e • c1 ) = c1 (e • c ˆ 2 ). ρ 2 = (e • c 2 ) = c 2 (e • c (13) If c 1 and c 2 are known, the 3-D shape is recovered by the iterative method mentioned in the section Surface Normal Inference, using the image-irradiance equation Eq. 11. Next we estimate parameters c 1 and c 2 . Estimating c 1 and c 2 . In this section, we use the spherical coordinate, i.e., the zenith angle θ and azimuth angle φ, to represent the unit vector. 
The convention we adopt with respect to these is as follows: the zenith angle of any unit vector is measured positively down from the z axis while the azimuth angle is measured positively counterclockwise from the x axis. The θ and φ usually are subscribed to indicate the vectors to which they belong. Thus θn and φn are zenith and azimuth angles of the vector n, while θi and φi are the angles of i, respectively. Then we have the following relations: n = (sinθncosφn, sinθnsinφn, cosθn), (14) i = (sinθicosφi, sinθisinφi, cosθi), (15) R(i, n, r) = { ρ 1 exp − c 2 [cos −1 (n s • n)]2 }+ ρ 2 (i• n), 0, if (i • n) > 0, (10) otherwise, where ρ1 and ρ2 are parameters determining the reflectance property. Denoting the intensity by E, we obtain the image-irradiance equation E = R(i,n,r). (11) (i • n) = cosθi,cosθn + sinθisinθncos(φi – φn). (16) Because viewing direction r is along the z axis and ns lies in the “principal plane” spanned by i and r, the zenith and azimuth angles of ns are θs = θi/2 and φs =φi. Then (ns • n) is given by From the Commission Internationale de l’Éclairage (CIE), the intensity E of an image can be obtained by the following equation: E = 0.299sr + 0.586sg + 0.115sb = (e • s), (12) 3-D Shape Recovery from Color Information for a Non-Lambertian Surface (ns • n) = cos θi θ cos θ n + sin i sin θ n cos(φ i − φ n ). 2 2 (17) Vol. 42, No. 4, July/Aug. 1998 327 The mapping from unit vector to zenith and azimuth angles is one to one; thus Eq. 11 can be written as E = R(θ n ,φ n ) θ θ = ρ 1 exp− c 2 cos −1 cos i cos θ n + sin i sin θ n cos(φ i − 2 2 + ρ 2 [cos θ i cos θn + sin iθsin cos( i nθ −φ n) φn )÷ ]φ. 2 (18) It is considered that the right hand side of the above equation takes the maximum value Rmax at a vector in the principal plane, i.e., φn = φi. Hence, at the point θ n* , φ n* of maximum value of intensity, ρ1, ρ2, θ n* , and φ n* should satisfy ( ( ) ) Rmax = R θ n* , φ n* = φi = 1. (19) Because the direction of the light is known, i.e. (θi, φi) is known with 0 < θI < π/2, substituting φ n* = φi into Eq. 18, R θ n* , φ n* becomes ( ) ( R θ n* , φ n* ) = φi = ( ) (23) g = 2q 1 + p2 + q 2 − 1 /( p2 + q 2 ), (24) (20) Because Eq. 13 implies that finding unknown ρ1, ρ2 and θ* can be replaced by finding unknown c 1 , c 2 , and θ*, from Eq. 5 through Eq. 8, we have the relation of smax defined as s corresponding to R θ n* , φ n* =Rmax, ( ) s max = c 1m1 (i, n * , r) + c 2 m2 (i, n * , r), i.e., 2 θ = c 1 cˆ 1 exp− c 2 θ n* − i ÷ 2 p≡ ∂z ∂z , q≡ ∂x ∂y (25) and are related to f and g as follows: p = 4f/(4 – f 2 – g2), q = 4g/(4 – f2 - g2). ( ) * + c 2 cˆ 2 cos θ n − θ i . ( (21) ) n 2 θ θ 2 c 1 (e ⋅ cˆ 1 ) c 2 θ n* − i ÷ exp− c 2 θ n* − i ÷ + 2 2 (22) + c 2 (e ×cˆ 2 ) sin(θ n* − θ i )= 0. Hence, c 1 , c 2 , and θ* should satisfy the above equation. Using Eq. 22 as a constraint equation associated with vector Eq. 21, we can obtain the values of c 1 , c 2 , and θ n* . The iterative algorithm can be used to solve the nonlinear simultaneous equations, where we will use the Marquardt method.13 The Marquardt method is a combination of the Newton method and the method of the steepest descent. The above procedure obtaining the parameters c 1 , c 2 , and θ n* does not use the intensity of the occluding boundary, because it is not easy to observe the intensity of the occluding boundary precisely. After obtaining c 1 , c 2 , and θ n* by the method mentioned above, we estimate surface normal using Eq. 11 and the tangent property of the occluding boundary. Surface Normal Inference. 
The unit vectors r, n, and i are described by points on the unit sphere called the Gaussian sphere (see Fig. 4). In the stereographic projection, a point on the Gaussian sphere is projected by a ray through the point from the south pole onto the tangent plane at the north pole, which is called the stereographic plane. The coordinate (f,g) in the stereographic plane is given as Journal of Imaging Science and Technology (26) The unit vector r and the surface normal n are given by r = (0, 0, 1), n = (− p, − q, 1) / 1 + p2 + q 2 , From Eq. 19, finding the maximum value of R θ n* , φ n* = φi , ∂ i.e., finding the solution θ n* of ∂θ R(θn, φn = φi) = 0, we have 328 f = 2 p 1 + p2 + q 2 − 1 /( p2 + q 2 ), where p and q are defined as 2 θ ρ 1 exp− c 2 θ n* − i ÷ + ρ 2 cos θ n* − θi . 2 s max Figure 4. Gaussian sphere. (27) The vectors n and i are described in terms of f and g as n = [–4f, –4g, 4 – f2 – g2]/(4 + f2 + g2), (28) i = [ −4 fi , −4 gi , 4 − fi2 − gi2 ] /(4 + fi2 + gi2 ), (29) where (fi, gi) denotes the stereographic coordinate corresponding to the direction of the light. We assume that the viewing direction coincides with the north pole of the Gaussian sphere and only considered points are on the northern hemisphere of the Gaussian sphere. Therefore, the considered points (f, g) and (fi, gi) in the stereographic plane are constrained to the following regions: f2 + g2 ≤ 4, fi2 + gi2 ≤ 4. Then for each considered point in the image plane, Eq. 18 is rewritten as Eij = R(fij, gij), (30) where (fij, gij) denotes the stereographic coordinate corresponding to the surface normal at image plane coordinate (i, j) and Eij the intensity at (i, j). We define the constraint h(f, g) as h(u) ≡ E – R(u) = 0, (31) u ≡ (u1, u2)T ≡ (fij, gij)T, (32) which is imposed on the image intensity. We again use the following Marquardt method to estimate surface normal. At each considered point in the image, the estimate u(n) at the n’th iteration is improved with Du in the following steps. 1. Solve the following equation with unknown vector ∆u ≡ (∆u1, ∆u2)T ≡ (∆fij, ∆gij)T, [G(u ) + γ I]∆u = − ∂W∂(uu (ν ) (ν ) ) , (33) Jiang, et al. where I denotes a 2 × 2 unit matrix and W(u), G, and J are defined as 1 2 h , G ≡ J T J, 2 W (u ) ≡ (34) ∂W (u) ∂h = hJ , J ≡ : Jacobian. ∂ (u ) ∂u (35) 2. Improve the estimate u(ν) satisfying u(n + 1) = u(n) + ∆u. (36) Solving Eq. 33, ∆u is given by ∆u1(ν ) ≡ ∆fij(ν ) ( Eij − R = Rf ( ) 2 fij(ν ) , gij(ν ) gij(ν ) ( Rg fij(ν ) , + ∆u2(ν ) ≡ ∆gij(ν ) = fij(ν ) , ( ) ( ) 2 ( ) 2 gij(ν ) Eij − R fij(ν ) , gij(ν ) Rf fij(ν ) , gij(ν ) (37) ) + Rg fij(ν ) , gij(ν ) +γ ( ) Rf fij(ν ) , gij(ν ) , (38) ) 2 +γ ( Rg fij(ν ) , gij(ν ) ), where Rf ≡ ∂R ∂R , Rg ≡ . ∂f ∂g (39) In the above-mentioned algorithm, the value of u at a point on the occluding boundary is known.2 The initial value of u at a point on the region except the occluding boundary is set as u = 0. For convenience, the region where u = 0 and u is not yet improved is called the unknown region. The estimate u(n) at a point on the unknown region can be obtained from the following steps: 1. When there is at least one point called the known point, which does not belong to the unknown region, in the eight neighboring points, we have fij(ν ) = afij(ν ) + bfˆij( ν) , gij( )ν = agij( ) ν+ bgˆ ij( ) ,ν where fij ≡ fi + 1, j + fi, j + 1 + fi − 1, j + fi, j − 1 gij ≡ gi + 1, j + gi, j + 1 + gi − 1, j + gi, j − 1 fˆij ≡ fi − 1, j − 1 + fi + 1, j − 1 + fi − 1, j + 1 + +fi gˆ ij ≡ gi − 1, j − 1 + g+i 1, j − 1 + gi − 1, +j 1 1+, j 1 + g+i 1, +j 1 . 
If there exists a known point in the four immediate neighbors (i ± 1, j) and (i, j ± 1), (a,b) = (1/ξ,0), where ξ is the number of known points in the four immediate neighbors, else (a,b) = (0,1/η) where η is the number of known points in the eight neighboring points. 2. In the case rather than step 1, fij(ν ) = gij(ν ) = 0. Because the unknown region will vanish as the iteration goes ahead, the above two steps are used only as a transient process. After obtaining (f,g) by the above algorithm, we can get (p,q) by Eq. 26 and obtain the height z by integrating p and q from Eq. 25. Thus, we can reconstruct the 3-D shape. When the above method is applied to a real image, one difficult problem is obtaining the differentiable curve of the occluding boundary. In such a case, we use the B-spline curve to fit the occluding boundary of the image and compute the initial value of u at a point on the occluding boundary by the normal of the fitting curve. Experimental Results To evaluate the proposed algorithm, we show results of several experiments. The materials used in experiments are plastics, which are amenable to the analysis based on the dichromatic model. Two balls and one bowling pin are used as real objects. Object A is a yellow table tennis ball made of plastic with an almost diffuse reflector surface. Object B is a green can cap also made of polished plastic with a hemisphere shape. Object C is a white bowling pin made of plastic with a diffuse reflector surface. The direction of the light source is given by θi = 5° and φi = 0°. The observed image intensities of objects A, B, and C are preprocessed by the method in the following steps and shown in Figs. 5(a), 6(a) and 7(a), respectively. The practical experimental steps are as follows: • Preprocess. Noise is removed by using the median filter with 9 × 9 pixels, and contrast stretching transformation is performed14 so that the value of image intensity varies from 0 to 255. • Estimating Parameters. After forming the moment matrix of color signals, we determine DCP by eigenvectors of the moment matrix and estimate ĉ 1 and ĉ 2 in DCP. By Eqs. 21 and 22, the parameters c 1 , c 2 , and θ n* are determined. • Extracting Occluding Boundary. Because a normal to the silhouette in the image plane is parallel to the normal to the surface at the corresponding point on the occluding boundary, we can regard the normal of the silhouette in the image plane as the normal of the surface on the occluding boundary. In this article, we use the Laplacian–Gaussian filter method to detect the boundary of the image and use the B-spline curve to fit discrete edge data. Then the initial values fi(, 0j ) and gi(,0j) on the occluding boundary are obtained from the fitted curve. • Surface Normal Inference. Using the imageirradiance equation Eq. 11 and initial values fi(, 0j ) and gi(,0j) , we obtain the 3-D shape by the Marquardt iterative algorithm. Table I shows the results of estimated parameters ρ1, ρ2, and θ n* of the real three objects. Figures 5(b), 6(b), and 7(b) illustrate the 3-D shapes reconstructed by the proposed algorithm from the image intensities shown in Figs. 5(a), 6(a), and 7(a), respectively. The limitation of the reflection model is that in a particular case where the model is described by only the Lambertian component and the incident light direction coincides with the viewing direction, the model cannot distinguish between concave and convex shapes because of similar brightness variations. 
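For reference, the surface-normal inference described above can be summarized in simplified form as follows. This is a hedged sketch rather than the article's algorithm: the reflectance map is reduced to its Lambertian term, a single pixel with a made-up intensity is treated, and the damping and step logic of Eqs. 33 through 38 are condensed.

import numpy as np

# Simplified sketch of the normal-inference iteration: the normal is
# parameterized by stereographic coordinates u = (f, g) and a damped,
# Marquardt-style update reduces the residual h(u) = E - R(u) at one pixel.
def normal_from_fg(f, g):
    d = 4.0 + f * f + g * g
    return np.array([-4.0 * f, -4.0 * g, 4.0 - f * f - g * g]) / d   # cf. Eq. 28

def R(u, i_dir, rho2=1.0):
    n = normal_from_fg(*u)
    return rho2 * max(0.0, float(n @ i_dir))     # Lambertian part of the reflectance map

def marquardt_step(u, E, i_dir, gamma=0.1, eps=1e-4):
    h = E - R(u, i_dir)                           # image-irradiance residual
    # numerical gradient of R with respect to (f, g)
    J = np.array([(R(u + dv, i_dir) - R(u - dv, i_dir)) / (2 * eps)
                  for dv in (np.array([eps, 0.0]), np.array([0.0, eps]))])
    # damped normal equations: du = h*J / (J.J + gamma)
    return u + h * J / (J @ J + gamma)

theta_i = np.deg2rad(5.0)
i_dir = np.array([np.sin(theta_i), 0.0, np.cos(theta_i)])
E_obs = 0.9                                       # observed (normalized) intensity
u = np.zeros(2)                                   # start from f = g = 0
for _ in range(50):
    u = marquardt_step(u, E_obs, i_dir)
print(u, R(u, i_dir))                             # R(u) approaches E_obs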
Conclusion We have proposed a method for shape recovery from the color signals for a non-Lambertian surface illuminated by TABLE I. Object ρ1 ρ2 θn * A B C 0.226 0.371 0.000 0.778 0.602 1.000 3.585 3.236 4.699 3-D Shape Recovery from Color Information for a Non-Lambertian Surface Vol. 42, No. 4, July/Aug. 1998 329 (a) (b) Figure 5. (a) Image intensity and (b) reconstructed shape of object A. (a) (b) Figure 6. (a) Image intensity and (b) reconstructed shape of object B. (a) (b) Figure 7. (a) Image intensity and (b) reconstructed shape of object C. only a single light source. In the method, parameters determining the reflectance map are identified by using color information and subsequently the normal of the object surface is estimated and 3-D shape recovered. Our method has been shown to produce reasonable results on several real objects. References 1. B. K. P. Horn, Understanding image intensities, Art. Intell. 8, 201–231 (1977). 2. K. Ikeuchi and B. K. P. Horn, Numerical shape from shading and occluding Boundaries, Art. Intell. 17, 141–184 (1981). 3. R. Onn and A. Bruckstein, Integrability disambiguates surface recovery in two-image photometric stereo, Int. J. Computer Vision, 5, 105– 113 (1990). 4. K. Ikeuchi, Determining the surface orientations of specular surfaces by using the photometric stereo method, IEEE Trans. PAMI–3, 661669 (1981). 5. E. N. Coleman and R. Jain, Obtaining 3-D shape of textured and specular surfaces using four-source photometry, CVGIP, 18, 309–328 (1982). 330 Journal of Imaging Science and Technology 6. H. D. Tagare and R. J. P. deFigueiredo, A theory of photometric stereo for a class of diffuse non-Lambertian surface, IEEE Trans. PAMI–13, 133–152 (1991). 7. H. D. Tagare and R. J. P. deFigueiredo, Simultaneous estimation of shape and reflectance map from photometric stereo, CVGIP: Image Understanding 55, 275–286 (1992). 8. F. Solomon and K. Ikeuchi, Extracting the shape and roughness of specular lobe objects using four light photometric stereo, IEEE Trans. PAMI-18, 449–454 (1996). 9. K. Schlüns, Photometric stereo for non-lambertian surfaces using color information, in Proc. 5th Int. Conf. Computer Analysis of Images and Patterns, 444–451(1993). 10. W. B. Jiang, H. Y. Wu and T. Shioyama, 3-D Shape Recovery from Image Brightness for non-Lambertian Surface, J. Imag. Sci. Technol. 41 (4), 429–437 (1997). 11. S. A. Shafer, Using color to separate reflection components, Col. Res. Appl. 10(4), 210–218 (1985). 12. K. E. Torrance and E. M. Sparrow, Theory for off-specular reflection from roughened surfaces, J. Opt. Soc. Am. 56, 1105–1114 (1967). 13. J. M. Ortega and W. C. Rheiboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, 1970, p. 281. 14. A. K. Jain, Fundamentals of Digital Image Processing, Prentice-Hall, Inc, Englewood Cliffs, NJ, 1989. Jiang, et al. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Optical Effects of Ink Spread and Penetration on Halftones Printed by Thermal Ink Jet J. S. Arney* and Michael L. Alber* Rochester Institute of Technology, Rochester, New York 14623 A probability-based model of halftone imaging, which was developed in previous work to describe the Yule–Nielsen effect, is shown in the current work to be easily modified to account for additional physical and optical effects in halftone imaging. In particular, the effects of ink spread and ink penetration on the optics of halftone imaging with an ink-jet printer is modeled. 
The modified probability model was found to fit the experimental data quite well. However, the model appears to overcompensate for the scattering associated with ink penetration into paper. Journal of Imaging Science and Technology 42: 331–334 (1998) Original manuscript received August 21, 1997; * IS&T Member; © 1998, IS&T—The Society for Imaging Science and Technology

Introduction
Recent work in this laboratory has been directed at the development of a probability model of the Yule–Nielsen effect to relate fundamental optical properties of papers and inks to tone reproduction in halftone printing. However, practical halftone models also need to account for physical effects such as the lateral spread of ink on the paper, called physical dot gain, and the penetration of ink into the paper. The most fundamental description of the Yule–Nielsen effect involves modeling the optical point spread function, PSF, of light in the paper and convolving the PSF with a geometric description of the halftone dots. Although such models have been shown to be quite accurate in describing the Yule–Nielsen effect, they are computationally quite intensive. Moreover, they are difficult to combine with models of physical dot spread and especially of physical penetration of ink into the paper. But the probability-based model is much less computationally intensive, can be written in a closed analytical form, and is only slightly less rigorous than the convolution approach. Moreover, the probability approach will also be shown to be easily modified to account for ink spread and penetration.
The Probability Model
The probability model has been described elsewhere,1,2 and here we present only the recipe for its application. The model begins with an empirical description of the mean probability Pp that a photon of light that enters the paper between halftone dots will emerge under a dot,

Pp = w[1 − (1 − F)^B],   (1)

where F is the dot area fraction and w is the magnitude of the Yule–Nielsen effect and is related quantitatively to the optical point spread function of the paper.1,2 Both F and w can have values from 0 to 1. The B factor is a constant characteristic of the chosen halftone pattern and the geometric characteristics of the printer. For the printer used in the current work, an HP 1600C thermal ink-jet, a B factor of 2.0 was found to provide the best correlation between the model and the experimental measurements described below. A second function needed to model tone reproduction is the probability Pi that a photon that enters the paper under a halftone dot (having first passed through the dot) then reemerges from the paper under a dot. The two probabilities have been shown to relate as follows:1

Pi = 1 − Pp(1 − F)/F.   (2)

We assume initially an ink that is transparent, with no significant scattering. Then, as shown previously, the reflectance of the paper between the dots and of the dots is given by Eqs. 3 and 4, with Rg the reflectance of the paper on which the halftone pattern is printed:

Rp = Rg[1 − Pp(1 − Ti)],   (3)

Ri = Rg Ti[1 − Pi(1 − Ti)].   (4)

Note that the reflectance of the ink and of the paper between the dots are not constant but depend on the dot area fraction F through Eqs. 1 and 2. With the reflectance of the ink dots and the paper between the dots, the overall reflectance of the halftone image is calculated with the Murray–Davies equation,

R(F) = FRi + (1 − F)Rp.   (5)
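The recipe of Eqs. 1 through 5 can be written directly as a short function. This is an illustrative sketch, not the authors' code; apart from B = 2.0, the parameter values are placeholders.

import numpy as np

# Probability model of Eqs. 1-5 for a transparent, non-scattering ink.
def halftone_reflectance(F, Ti, Rg, w, B=2.0):
    """Mean image reflectance R(F) from the probability model."""
    F = np.clip(np.asarray(F, dtype=float), 1e-6, 1.0 - 1e-6)   # avoid 0/0 in Eq. 2
    Pp = w * (1.0 - (1.0 - F) ** B)                 # Eq. 1
    Pi = 1.0 - Pp * (1.0 - F) / F                   # Eq. 2
    Rp = Rg * (1.0 - Pp * (1.0 - Ti))               # Eq. 3, paper between dots
    Ri = Rg * Ti * (1.0 - Pi * (1.0 - Ti))          # Eq. 4, ink dots
    return F * Ri + (1.0 - F) * Rp                  # Eq. 5, Murray-Davies

F = np.linspace(0.0, 1.0, 11)
print(np.round(halftone_reflectance(F, Ti=0.2, Rg=0.85, w=0.75), 3))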
Figure 1. Reflectance versus dot area fraction for the paper between the dot (+), the mean image (o), and the ink (x) for the pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a commercial gloss paper. The lines are drawn from the model with ε = 0.060 m2/g and w = 0.75 with no physical dot gain and no penetration.

The Yule–Nielsen "n" factor is not used in Eq. 5 because the Yule–Nielsen effect is described by the scattering probability Pp. Thus, to model tone reproduction R versus F, one needs (1) the transmittance of the ink Ti, (2) the reflectance of the paper Rg, (3) the scattering power of the paper w, and (4) the geometry factor B. The value of Ti can be determined with the Beer–Lambert equation using the coverage of the ink within the dot c in g/m2 and the extinction coefficient ε in m2/g,

Ti = 10^(−εc).   (6)

The pigment-based ink was delivered by the printer at c = 7.31 g/m2. This was determined by weighing the ink cartridge before and after commanding the printer to print a known number of ink drops at a selected area coverage of 0.50. As a test of the model, a dispersed-dot halftone at 300 dpi addressability was printed using an HP 1600C thermal ink-jet. Figure 1 shows the measured reflectance of the halftone image R, the ink dots Ri, and the paper between the dots Rp versus the dot area fraction F measured by microdensitometry as described previously.1,2 The reflectance values are integral values characteristic of the instrument spectral sensitivity. The solid lines in Fig. 1 are the model calculated as follows: The values of Rg and c were measured independently. The values of ε and w were used as independent variables to provide the best fit between the model and the data. For selected ε and w, Eq. 6 was applied, then Eqs. 2 through 5. The values of ε and w were adjusted to provide a minimum rms deviation between the model and experimental values of Rp. Figure 1 shows that the model describes the paper reflectance Rp quite well, but the measured values of Ri are significantly higher than expected from the model. Clearly, modification of the model to account for nonideal behavior of the thermal ink-jet system is needed.

Figure 2. Measured ink area fraction F versus the nominal gray fraction F0 commanded by the printer. Fmax is the ink area fraction at a nominal gray fraction of F0 = 1.00.

Dot Spread and Overlap
A deficiency of the above model is the way in which Ti is estimated with Eq. 6. The value of c = 7.31 g/m2 was estimated from an accurate measure of ink mass, but the area coverage was estimated as the value commanded by the printer. However, inks can spread out and/or overlap, and this makes the actual ink coverage differ from the commanded ink coverage. This, in turn, changes the transmittance of the ink layer on the paper. To improve the estimate of Ti in the model, the ideal value of c0 = 7.31 g/m2 was modified to estimate the actual ink coverage c. This was done by measuring the actual area coverage F determined by microdensitometry and comparing it with the value F0 sent to the printer. The correct value of c was calculated from Eq. 7,

c = c0 F/F0.   (7)

To use Eq. 7 in the model, a relationship between F and F0 is needed.
However, this is a characteristics of a given printer, and rather than model it a priori the effect was characterized experimentally by measuring the printed ink area fraction F as a function of the value commanded by the printer F0. Values of F were measured by histogram segmentation of images captured by the microdensitometer, as described previously.1,2 Figure 2 is an example, and the data were fit empirically to Eq. 8 with Fmax = 0.79 and m = 1.05. F = Fmax F0m . (8) The model was then run by ranging F from 0 to Fmax. At each F the ratio F/F0 was calculated using Eq. 8. Equation 7 was then applied to determine c, which was used in Eqs. 6 and 2 through 5. The values of Rg, m, Fmax, and c0 were measured independently, and the values of ε and w were adjusted to provide a minimum rms deviation between the model and experimental values of Rp, as shown in Fig. 3. Again fit to Rp is good, but Ri is still Arney and Alber and the product Sx will be used as an independent variable in the tone reproduction model. Second, some light penetrates the dot and enters the paper. The transmittance of the dot, according to Kubelka– Munk, is given as follows: Reflectance 1 Ti = 0 0 Dot Fraction, F 1 Figure 3. Reflectance versus dot area fraction for the paper between the dot (+), the mean image (o), and the ink (x) for the pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a commercial gloss paper. The lines are drawn from the model with ε = 0.052 m2/g , w = 0.70, and Fmax = 0.79. modeled with a reflectance that is lower than observed experimentally. Indeed, the fit appears worse than in Fig. 1 suggesting that ink spread and overlap, while clearly present in Fig. 2, is not the major perturbation in tone reproduction characteristics of the system. It was anticipated that ink penetration into the paper may have a significant effect. Ink Penetration into the Paper The effect of ink penetration into the substrate could be quite complex. In an a priori model in which the paper PSF is convolved with the halftone pattern, vertical penetration of the dot would require a 3-D convolution and a detailed knowledge of the 3-D geometry of the ink. Such halftone modeling has been described but is quite complex.3–5 For the current probability model, ink penetration was approximated in a much simpler way. The major optical effect of ink penetration was assumed to be in the increased scattering of light in the ink by the paper. To model the effect we assume the ink behaves as if it does not actually penetrate the sheet but only increases in scattering coefficient S. In other words, the model is identical to the case of a nonpenetrating ink with a significant scattering coefficient. Thus an increase in S is used as an index of the degree of ink penetration into the paper substrate. This scattering effect was added to the probability model as follows: First, the ink scattering coefficient causes some light to reflect from the ink dot without penetrating through the dot. The Kubelka–Munk model gives this reflectance contribution as follows:6 RiK = 1 , a + b ×Coth( bSx ) (9) where a = (Sx + Kx)/Sx and b = (a2 – 1)1/2. The value of the product Kx is linearly related to the product εc, Kx = 2.303 εc, (10) b a ×Sinh( bSx ) +b ×Cosh( bSx ) (11) Equation 11 replaces Eq. 6 in the model. Light that enters the paper between the halftone dots is scattered and may emerge with probability Pp under the dot. Equation 1 has been used to model this probability for the disperse dot halftone. 
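The ink-spread and Kubelka–Munk ingredients of Eqs. 7 through 11 can likewise be collected in a short sketch. This is an illustrative reading of the printed equations, not the authors' code; the parameter values echo those quoted in the text (m = 1.05, Fmax = 0.79, c0 = 7.31 g/m2, ε near 0.05 m2/g), and the chosen Sx is one of the fitted values reported below.

import numpy as np

# Dot-gain correction (Eqs. 7 and 8) feeding the Kubelka-Munk ink terms
# (Eqs. 9-11) used by the penetration-modified model.
def ink_terms(F0, c0=7.31, eps=0.052, Fmax=0.79, m=1.05, Sx=0.5):
    F = Fmax * F0 ** m                       # Eq. 8: printed vs. commanded area fraction
    c = c0 * F / F0                          # Eq. 7: actual ink coverage, g/m^2
    Kx = 2.303 * eps * c                     # Eq. 10: absorption of the ink layer
    a = (Sx + Kx) / Sx                       # Kubelka-Munk auxiliary terms
    b = np.sqrt(a * a - 1.0)
    RiK = 1.0 / (a + b / np.tanh(b * Sx))    # Eq. 9: reflectance of the scattering ink film
    Ti = b / (a * np.sinh(b * Sx) + b * np.cosh(b * Sx))   # Eq. 11: its transmittance
    return F, c, RiK, Ti

for F0 in (0.25, 0.5, 1.0):
    F, c, RiK, Ti = ink_terms(F0)
    print(F0, round(F, 3), round(c, 2), round(RiK, 4), round(Ti, 4))
# RiK and Ti then replace the transparent-ink assumptions: Ti substitutes for
# Eq. 6, and Pp of Eq. 1 is attenuated by the factor (1 - RiK), as in Eq. 12.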
However, light that encounters a dot with a significant scattering coefficient Sx may be reflected back into the paper. A detailed description of this effect might include multiple scattered reflections between the substrate and the dot, but a simpler approximation will be used in the current model. One approach might be to assume the effect results in a decrease in the effective value of Ti of the dot. However, light that fails to transmit through the dot is returned to the paper, where it can scatter and emerge between the dots. This would not be accounted for by simply approximating a decrease in the effective value of Ti. Alternatively, the effect can be described as a decrease in the probability factor Pp. In other words, the effect of scattering in the dot can be modeled as a decrease in the probability that light entering the paper between the dots will emerge from the system after passing through the dot. The effect is approximated by modifying Eq. 1 with the reflectance factor from Eq. 9:

P_p = w\left[1 - (1 - F)^{B}\right]\left[1 - R_{iK}\right].    (12)

The value of Pp from Eq. 12 is used to determine Pi from Eq. 2 and Ri from a modified form of Eq. 4, in which the reflectance from the bulk is added to the Kubelka–Munk reflectance RiK to produce the overall ink reflectance,

R_i = R_g T_i\left[1 - P_i(1 - T_i)\right] + R_{iK}.    (13)

The reflectance of the paper is determined from Eq. 3 as before, and the overall reflectance is determined with Eq. 5. If the Kubelka–Munk reflectance RiK is zero (no scattering), the model reduces exactly to the model used in Fig. 3. If, however, the scattering Sx is adjusted as a third independent variable, the result shown in Fig. 4 can be achieved.

Figure 4. Reflectance versus dot area fraction for the paper between the dots (+), the mean image (o), and the ink (x) for the pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a commercial gloss paper. The lines are drawn from the model with ε = 0.051 m²/g and w = 0.73, measured dot gain parameters of m = 1.05 and Fmax = 0.79, and ink penetration modeled with Sx = 0.5.

Modifying Ink Spread and Penetration

Achieving the fit of all three nonlinear sets of data in Fig. 4 with only the three independent variables ε, w, and Sx suggests the model is at least a reasonable approximation of the optical and physical behavior of the ink-jet system. To examine the physical impact of spread and penetration further, the ink and halftone pattern of Fig. 4 were printed on a recycled plain paper. The experimental data and the fit of the model are shown in Fig. 5. Evident from this experiment are the following. First, the model is able to fit the data quite well. Moreover, the fit is achieved with a significantly higher value of Sx than was needed for the coated paper, as one would expect for the plain-paper system: the ink penetrates farther into the plain paper and thus has a higher effective scattering coefficient. However, the model may overcompensate for this scattering effect in the ink layer and thus requires a slightly higher value of ε to achieve a good fit with the data. Moreover, the value of w that fits the data is lower for the plain paper than for the gloss-coated paper, which is the reverse of expectation.7 The value of w is related to the mean distance light travels between scattering events, and this distance is expected to be larger in plain papers than in coated papers. Perhaps this effect also has been overcompensated by the simplifying assumptions used in modeling ink penetration.

Figure 5. Reflectance versus dot area fraction for the paper between the dots (+), the mean image (o), and the ink (x) for the pigmented magenta ink printed at 300 dpi with a disperse halftone pattern on a recycled plain paper. The lines are drawn from the model with ε = 0.06 m²/g and w = 0.55, measured dot gain parameters of m = 1.05 and Fmax = 0.79, and ink penetration modeled with Sx = 1.3.

Halftone patterns were also printed for a dye-based ink on both the plain paper and the coated paper. The parameters used to fit the model to the data for all experiments, along with the observed values of Fmax, are summarized in Table I. In most cases the trends in the parameters are as expected. For example, the measured values of Fmax indicate the amount of lateral spread of ink on the paper, and the lateral spread is greater for the dye-based ink on the plain paper than on the coated paper. However, the amount of lateral spread is not significantly different for the pigmented ink on the two types of paper. But the effective increase in light scattering within the ink dot, Sx, in going from the coated paper to the plain paper is evident for both the pigmented and the dye-based inks. In addition, the value of ε is higher for the dye-based ink, as is typically observed, but the values should not change when the paper is changed. That they do in both cases suggests the simple model of ink penetration overestimates the optical effect of scattering, requiring a compensating adjustment of ε.

TABLE I. Summary of Modeling Parameters. Parameters adjusted to achieve the minimum rms deviation between model and data for all three sets of data (R, Ri, and Rp versus F). Also shown is the value of Fmax, the dot area fraction at a nominal print gray scale of 100%.

Ink base   Paper            ε (m²/g)   w      Sx     Fmax
pigment    coated glossy    0.052      0.70   0.50   0.79
dye        coated glossy    0.099      0.75   0.88   0.84
pigment    recycled plain   0.060      0.55   1.3    0.77
dye        recycled plain   0.13       0.55   1.5    1.017

Conclusion

The success of the model described in this report indicates the advantage of the probability model for exploring and modeling the mechanism of halftone imaging. Because the probability model can be written in closed analytical form, it is easily modified to account for additional mechanistic effects such as ink spread. Such modifications are much more difficult to make with an a priori model involving the convolution of the ink geometry with the paper point spread function. The probability model does, nevertheless, maintain a reasonable connection with the fundamental parameters of the point spread function through the empirical w parameter1,2 and through the fundamental theory described by Rogers.8 Caution should be used, however, in applying the simplifying assumptions for ink penetration, because the model appears to overcompensate the optics of the penetration effect and to decrease the reliability of the w parameter as an index of the paper point spread function.

Acknowledgments. Support for this project was provided by DuPont Corporation and is gratefully acknowledged. Special thanks to Paul Oertel for many challenging discussions and helpful suggestions.

References
1. J. S. Arney, A probability description of the Yule–Nielsen effect, J.
Imaging Sci. Technol. 41(6), 633–636 (1997).
2. J. S. Arney and M. Katsube, A probability description of the Yule–Nielsen effect II: The impact of halftone geometry, J. Imaging Sci. Technol. 41(6), 637 (1998).
3. F. Ruckdeschel and O. G. Hauser, Appl. Opt. 17, 3376 (1978).
4. S. Gustavson, Color gamut of halftone reproduction, J. Imaging Sci. Technol. 41, 283 (1997).
5. S. Gustavson, Dot gain in color halftones, Ph.D. Dissertation, Linköping University, Department of Electrical Engineering, Linköping, Sweden, Fall 1997.
6. G. Wyszecki and W. S. Stiles, Color Science, 2nd ed., John Wiley & Sons, NY, 1982, p. 785.
7. J. S. Arney, C. D. Arney, and M. Katsube, An MTF analysis of paper, J. Imaging Sci. Technol. 40, 19 (1996).
8. G. L. Rogers, Optical dot gain in a halftone print, J. Imaging Sci. Technol. 41(6), 643–656 (1997); Optical dot gain: Lateral scattering probabilities, J. Imaging Sci. Technol. 42(4), 341 (1998).

JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998

Modeling the Yule–Nielsen Effect on Color Halftones

J. S. Arney,* Tuo Wu and Christine Blehm*
Rochester Institute of Technology, Center for Imaging Science, Rochester, NY 14623-0887

Original manuscript received September 8, 1997
* IS&T Member
© 1998, IS&T—The Society for Imaging Science and Technology

The Neugebauer approach to modeling color cmy halftones generally has to be modified to correct for the Yule–Nielsen light scattering effect. The most common modification involves the Yule–Nielsen n factor. A less common, but more fundamentally correct, modification of the Neugebauer model involves a convolution of the halftone geometry with the point spread function, PSF, of the paper. The probability model described in the current report is less complex than the PSF convolution approach but is still much less empirical than the Yule–Nielsen n model. The probability model assumes the Neugebauer equations are correct and that the Yule–Nielsen effect manifests itself in a variation in the XYZ tristimulus values of the eight Neugebauer primary colors as a function of the amounts of c, m, and y printed. The model describes these color shifts as a function of physical parameters of the ink and paper that can be measured independently. The model is based on the assumptions that scattering and absorption probabilities are independent, that the inks obey Beer–Lambert optics, and that ink dots are printed randomly with perfect hold-out. Experimentally, the model is most easily tested by measuring the shift in the color of the paper between the halftone dots, and experimental microcolorimetry is presented to verify the model.

Journal of Imaging Science and Technology 42: 335–340 (1998)

Background

One of the conceptual advantages of halftoning is the linearity between the fractional area coverage of the ink dots, Fk, and the overall reflectance R of the image, as expressed in the Murray–Davies equation, R = Fk Rk + (1 – Fk)Rp, where Rk and Rp are the reflectance factors of the ink and paper, respectively. In color halftoning this also implies a linearity between the c, m, and y dot areas and the CIE XYZ tristimulus values of the image. However, experimental measurements typically show a nonlinearity between R and Fk, with R being less than predicted by the Murray–Davies equation.
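As a simple numerical illustration of this nonlinearity (a sketch of ours, not from the paper; the reflectance values are assumed), the Murray–Davies prediction can be compared with its Yule–Nielsen n modification, which is introduced as Eq. 1 below.

```python
def murray_davies(F, Rk, Rp):
    """Murray-Davies reflectance: linear mixture of ink and paper reflectances."""
    return F * Rk + (1.0 - F) * Rp

def yule_nielsen(F, Rk, Rp, n):
    """Yule-Nielsen modification (Eq. 1 below): the 1/n power accounts empirically
    for lateral light scatter in the paper; n = 1 recovers Murray-Davies."""
    return (F * Rk ** (1.0 / n) + (1.0 - F) * Rp ** (1.0 / n)) ** n

# Assumed values: Rk = 0.05, Rp = 0.85, F = 0.5
print(murray_davies(0.5, 0.05, 0.85))    # 0.45
print(yule_nielsen(0.5, 0.05, 0.85, 2))  # ~0.33, below the Murray-Davies value
```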
The nonlinearity between Fk and R is caused by two phenomena: (1) physical dot gain, in which the actual dot fraction is larger than the dot fraction commanded in the printing process, and (2) the Yule–Nielsen effect, in which the lateral scatter of light within the paper leads to an increase in the probability of the ink dots absorbing the light.1 Thus, to describe tone and color reproduction in halftone images, modifications of the Murray–Davies equation are used. In this report, F will refer to the actual, measured dot area fraction rather than the fraction commanded by the printer, and a modification of the Murray–Davies equation that models the Yule–Nielsen effect on color halftones will be described.

The earliest and still most commonly used modification of the Murray–Davies equation is the Yule–Nielsen equation, with an empirical n factor:1

R^{1/n} = F_k R_k^{1/n} + F_p R_p^{1/n}.    (1)

In this expression, Fk and Fp are the area fractions of the ink dots and the paper between the dots, respectively, and Fp = 1 – Fk. The Murray–Davies and Yule–Nielsen equations are often extended to describe spectral reflectance in cmy color halftoning:

R(\lambda) = \sum_{i=1}^{8} f_i R_i(\lambda),    (2)

R(\lambda)^{1/n} = \sum_{i=1}^{8} f_i R_i(\lambda)^{1/n}.    (3)

The fi are the area fractions of the eight possible colors (white, cyan, magenta, yellow, red, green, blue, and black) formed by overlap between the printed ink area fractions c, m, and y. The Ri(λ) are the reflection spectra of the eight colors. By knowing or assuming the geometry of overlap between ink dots, the color fractions may be determined from the ink fractions, fi = f(c,m,y). The most common assumption regarding dot overlap is that dots are printed randomly, which leads to the so-called Demichel equations [f1 = (1 – c)(1 – m)(1 – y), through f8 = c·m·y].2 Models for deterministic dot placement have also been published.3 By integration, the CIE tristimulus values may be determined:

XYZ = \int R(\lambda)\,\overline{xyz}(\lambda)\,P(\lambda)\,d\lambda,    (4)

where XYZ represents the X, Y, or Z tristimulus value and \overline{xyz} represents the corresponding x, y, or z color matching function. The value P(λ) is the spectral power distribution of the light used to view the image. By applying Eq. 4 to Eq. 2 we have what is often called the Neugebauer equation for tristimulus values,2

XYZ = \sum_{i=1}^{8} f_i \, XYZ_i,    (5)

where XYZi represents a tristimulus value for the color region i of the halftone. The empirical n modification for R(λ) may also be used to calculate tristimulus values. However, Eq. 6 does not follow from application of Eq. 4 to Eq. 3:

XYZ^{1/n} = \sum_{i=1}^{8} f_i \, XYZ_i^{1/n}.    (6)

Nevertheless, Eq. 6 is occasionally used as an empirical modification of Eq. 5. Because n is only an empirical factor, no reason exists not to use Eq. 6 if it provides a useful description of a given halftone system.4

Work in this laboratory has explored an alternative modification to the Murray–Davies equation in which Rk and Rp are not constants but are described as functions of the dot area fraction, Rk(Fk) and Rp(Fp):5

R = F_k R_k(F_k) + F_p R_p(F_p).    (7)

Experimentally, it has been well shown that both Rk and Rp decrease as Fk increases.5 Both empirical and theoretical models have been reported for describing Rk and Rp versus Fk for monochrome halftones.5–7

The Probability Functions

Equation 7 may be expanded to describe cmy halftone color. Equation 2 is the appropriate expansion of Eq. 7 if we consider the eight Ri(λ) spectra to be functions of the color fractions fi as well as functions of wavelength λ. Then, integration gives the tristimulus values of the color image. The problem is to describe the way in which the eight Ri(λ) of Eq. 2 depend on the eight fi. The approach taken in this report is to describe probability functions for the lateral scattering of light in the paper and then to describe the way in which the eight Ri spectra depend on these probability functions.

We begin by defining the probability function Pji. This is the probability that if a photon enters the paper in region j, of area fraction fj, it will reemerge after scattering in region i, of area fraction fi. In other words, if N photons enter the paper in region j, then Pji is the fraction of these N photons that scatters and emerges in region i, provided no light is absorbed by the paper. To account for light absorption, we assume absorption and scattering probabilities are independent, so that the final number of photons from region j that emerges in region i is the product N·Rg·Pji.

Consider the monochrome case with region j = 1 defined as the region between the dots and region i = 2 the region of the paper containing dots. The probability P11 is the probability of light emerging from the region between the dots after entering between the dots. For conventional clustered dot halftones this probability was experimentally shown to be well described by the following function:7

P_{11} = 1 - (1 - f_1)\left[1 - (1 - f_1)^{w} + (1 - f_1^{\,w})\right],    (8)

where f1 is the same thing as Fp in Eq. 7 and w is a factor related to the scattering of light in paper. The w factor has been shown to be related to the scattering optics of the paper,

w = 1 - e^{-A k_p \nu},    (9)

where A is a constant characteristic of the halftone geometry, kp is a constant proportional to the mean distance light travels in paper before reemerging as reflected light, and ν is the halftone dot frequency. A thorough discussion of the terms in Eq. 9 was reported elsewhere.7

Figure 1. For a two-color (cyan, magenta) halftone, the light reflected back from the blue region has four origins as a result of the Yule–Nielsen effect.

Equation 8 may be generalized to describe the probability Pjj that light that enters an area of the paper marked j, with fj as the area fraction, will emerge from the paper at area j:

P_{jj} = 1 - (1 - f_j)\left[1 - (1 - f_j)^{w} + (1 - f_j^{\,w})\right].    (10)

Similarly, an extension of previous work on monochrome FM halftones leads8 to a somewhat different expression for Pjj,

P_{jj} = 1 - w\left(1 - f_j^{\,B}\right),    (11)

where w is again given by Eq. 9 but with ν defined as the inverse of the dot diameter. The B factor is an empirical factor related to the particular geometry of the FM halftone.8

In addition to the Pjj probabilities, there are also all of the Pji probabilities, as illustrated for the c,m two-ink case in Fig. 1. If we have functions to describe all of the Pji, then for a three-color cmy halftone we would have an 8 × 8 matrix of probability functions Pji, with the Pjj functions on the diagonal of the matrix. Similarly, a monochrome halftone would be described with a 2 × 2 matrix of probability functions. As will be shown below, the Pji can be related to the Pjj. First, however, we examine how these probabilities can be used to calculate color reproduction in the halftone.

From Probability to Reflectance
The two-ink case illustrated in Fig. 1 shows how the incident irradiance, I0 (watts/area), is divided among the areas fi of the halftone image. Photons I0·fi strike the image in region number i. The light that then enters the paper in this region is I0·fi·Ti, where Ti is the Beer–Lambert transmittance of the ink layer over region i. Note that Ti = 1 for i = 1 (the paper between the dots) and that T2 = Tcyan, T3 = Tcyan·Tmagenta, etc. Then, the number of photons that enter the paper in region j and emerge under region i is I0·fj·Tj·Rg·Pji. For example, as illustrated in Fig. 1, the number of photons that strike Region 4 (the cyan-colored region) and eventually emerge under Region 3 is given by the product (I0·f4·T4·Rg·P43). The total amount of scattered light that reaches Dot 3 is the sum of these expressions for Regions 1, 2, and 4. Then the light passes through Dot 3 and is attenuated by T3. Similarly, the general expression for the photon irradiance emerging from any dot i is as follows:

I_i = I_0 R_g T_i \sum_{j}\left(T_j P_{ji} f_j\right).    (12)

The reflectance of the dot is the ratio of the light emerging from the dot, Ii, to the light entering the dot, I0·fi. Thus, dividing Eq. 12 by I0·fi gives the following expression for the reflectance of dot i:

R_i = R_g T_i \sum_{j} T_j P_{ji}\left(\frac{f_j}{f_i}\right).    (13)

The spectral designation (λ) has been dropped to simplify the notation, but Ri, Rg, and all the Tj are functions of wavelength. The spectral Ri may then be used in Eq. 2 to determine the overall spectral reflectance of the halftone image, and then Eq. 5 can be used to calculate the tristimulus values of the image. The only unknown in the model is a description of the off-diagonal probabilities Pji.

The Off-Diagonal Probabilities

Everything required to model halftone color is now known except the off-diagonal probabilities Pji. Intuitively, the Pji must relate to the Pjj and to the color fractions fj and fi. We can derive this relationship by assuming the independence of the scattering probabilities Pji and the absorption probabilities Ti and Rg. We also assume the paper is sufficiently thick that loss of light through the back of the paper is negligible. Under these conditions, the photons that enter Region 3 of Fig. 1 must eventually emerge in one of the four regions:

P_{13} + P_{23} + P_{33} + P_{43} = 1.    (14)

This expression is a special case of Eq. 13 for a two-ink halftone and for Rg = Ti = Tj = 1. We further assume that the dots are randomly placed on the paper, so that the probability of light from Region k emerging in some other Region i ≠ k is proportional to the area fraction fi. Thus, for any two regions i ≠ k and j ≠ k, we may write the following:

P_{ki}/P_{kj} = f_i/f_j.    (15)

For example, P31 = P31(f1/f1), P32 = P31(f2/f1), and P34 = P31(f4/f1). Combining these with Eq. 14 gives the following:

P_{31}(f_1/f_1) + P_{31}(f_2/f_1) + P_{33} + P_{31}(f_4/f_1) = 1.    (16)

Note that Eq. 15 does not apply to P33 but only to i, j ≠ k. Then, recognizing that f1 + f2 + f4 = 1 – f3, we solve Eq. 16 for the off-diagonal P31:

P_{31} = \left(1 - P_{33}\right)\left(\frac{f_1}{1 - f_3}\right).    (17)

We may generalize this expression for any off-diagonal term, i ≠ j:

P_{ji} = \left(1 - P_{jj}\right)\left(\frac{f_i}{1 - f_j}\right).    (18)

We now have a sufficient set of functions to model color halftones.

The Model Recipe

To apply these probability functions to calculate the XYZ tristimulus values of a color halftone, given the c, m, and y ink fractions delivered by a printer, the following steps are taken. (Note that c, m, and y are the actual, measured areas, not the areas commanded by the printer; physical dot gain is not considered here.) Step 1.
Measure the transmittance spectra of the individual inks, Tcyan, Tmagenta, and Tyellow. Assume the Beer–Lambert law and determine the transmittance spectra Ti of the eight colors. Also measure the reflection spectrum of the paper, Rg. Step 2. Begin with the ink combination (c,m,y) and calculate the eight color fractions, f1 through f8. For randomly placed ink dots, the Demichel equations may be used. Otherwise dot geometry must be modeled, as illustrated for dot-on-dot halftones described subsequently. Step 3. Use Eq.n 10 to calculate the eight diagonal probabilities, Pjj, for a traditional clustered dot halftone. Eq. 11 may be used with an FM, stochastic type of halftone. The parameters w and B may be taken as arbitrary constants to fit the model to data. Alternatively, w and B may be measured independently as described previously.8 Step 4. Use Eq. 18 to calculate the off-diagonal probabilities. Step 5. Use Eq. 13 to calculate the reflection spectra of the eight colors. Step 6. Use Eq. 2 to calculate the reflection spectrum of the overall halftone image. Step 7. Use Eq. 5 and the power spectrum of the illumination light, P(λ), to calculate the XYZ tristimulus values of the halftone image. Testing the Model Color halftones were generated with an HP 1600C inkjet printer on a high-quality coated sheet to minimize ink penetration and dot gain. Halftones were printed with an error diffusion algorithm, and dot-on-dot was not used. The dots from the different colors were at 300 dpi and were randomly placed with respect to each other. A fixed amount of magenta (dot fraction m = 0.45) was printed at different cyan dot fractions (0 < c < 1). No yellow was printed in this experiment. A microscopic image of the dot pattern was captured with a 2-mm field of view using a 3-chip color CCD camera and video frame grabber. The sample was illuminated with an incandescent light source through fiber optics. The resulting light on the sample was measured and found to have the power distribution P(λ) of CIE Illuminant A. The camera and optical system had been calibrated to the ink-jet dye set so the rgb images could be translated into XYZ space. In addition, gray-level segmentation in the original rgb images provided independent measures of the c, m, and y dot area fractions. Using the color microdensitometer, measurements were made of the XYZ tristimulus values of not only the overall image but of the space between the ink dots. The results were plotted as a function of the cyan dot area fraction c and are shown in Figs. 2, 3, and 4. Figure 5 shows the corresponding x,y chromaticity values. The data do not go all the way to the gamut limit because the printer, at a command of Vol. 42, No. 4, July/Aug. 1998 337 100 X50 Y50 Y X 100 Paper Paper Mean 0 0 Mean 0.5 0 1 0 C Figure 2. The X tristimulus value of the paper between the dots () and of the overall, mean value of the halftone image (O) versus the ink area fraction of cyan at magenta = 0.45. Error diffusion halftone at 300 dpi. 0.5 Figure 3. The Y tristimulus value of the paper between the dots () and of the overall, mean value of the halftone image (O) versus the ink area fraction of cyan at magenta = 0.45. Error diffusion halftone at 300 dpi. 0.6 100 g paper w Z Z y y 50 mean 0.1 0.5 1 C Figure 4. The Z tristimulus value of the paper between the dots () and of the overall, mean value of the halftone image (O) versus the ink area fraction of cyan at magenta = 0.45. Error diffusion halftone at 300 dpi. 
100% ink, formed dots with very little dot gain and occupied only 90% of the paper area. The model was run over the range 0 < c < 0.9 to agree with the experiment. These experiments demonstrate that the color between the dots is, indeed, not the color of the unprinted paper but mimics the mean value color of the overall image. The solid lines in Figs. 2 through 5 were calculated with the model recipe described above. The transmittance spectrum of the cyan dye was determined from the reflection spectrum, Rcyan, of a 100% cyan region (m = y = 0) and the function, Tcyan = (Rcyan/Rg)1/2. Spectra for the magenta and yellow were similarly determined. The Demichel equations 338 Journal of Imaging Science and Technology m r b Mean 0 sp ec tru m loc y us c Paper 0 1 C 0.2 xx 0.7 Figure 5. The xy chromaticity trajectory with Illuminant A for the paper between the dots and for the overall image for the variable cyan, fixed magenta, error diffusion halftone. The gamut of the printer at maximum c, m, and y inks is shown. The paper chromaticity (w ) and the spectrum locus are shown. were used to determine the color area fractions fi, and Eq. 11 for FM halftones was used for the on-diagonal probabilities. The model was fit to the data by adjusting w and B. Rather than search for a statistical fit criteria, the authors simply adjusted w and B to achieve a visually acceptable agreement between the model and all of the data in Figs. 2 through 5. Values of w = 0.82 and B = 1.2 were used in this calculation and are consistent with independent estimates from earlier work.8,9 A second experiment was performed using the same inks and a traditional clustered dot halftone. However, the clustered dot halftone was printed dot-on-dot rather Arney, et al. 100 50 50 Paper Y X 100 Paper Mean 0 0 Mean 0.5 0 1 M 0 0.5 M Figure 6. The X tristimulus value of the paper between the dots (O) and of the overall, mean value of the halftone image (X) versus the ink area fraction of cyan at magenta = 0.4. Clustered doton-dot halftone at 53 dpi. 100 Figure 7. The Y tristimulus value of the paper between the dots (O) and of the overall, mean value of the halftone image (X) versus the ink area fraction of cyan at magenta = 0.4. Clustered doton-dot halftone at 53 dpi. , 0.6 g paper 50 y Z w y r m mean Mean 0 y c Paper 0 1 b 0.5 1 M Figure 8. The Z tristimulus value of the paper between the dots (O) and of the overall, mean value of the halftone image (X) versus the ink area fraction of cyan at magenta = 0.4. Clustered doton-dot halftone at 53 dpi. than randomly. Figures 6 through 9 show the results. Again the color of the paper between the dots mimics the color of the halftone image. The solid lines in these figures were modeled by the recipe above with the following changes. First, Eq. 10 was used for the diagonal probabilities. Second, the Demichel equations were replaced with a geometric calculation for dot-on-dot halftones. For the fixed magenta at different levels of cyan, the functions in Table I were used. The value of w = 0.80 was found to provide an overall fit, judged visually, to the data in Figs. 6 through 9. Discussion As shown by Engeldrum, the Yule–Nielsen effect manifests itself in color halftones as a change in the color of the Modeling the Yule–Nielsen Effect on Color Halftones 0.1 0.2 0. xx Figure 9. The xy chromaticity trajectory with Illuminant A for the paper between the dots and for the overall image for the variable cyan, fixed magenta, clustered dot-on-dot halftone. 
The gamut of the printer at maximum c, m, and y inks is shown. The paper chromaticity (w ) and the spectrum locus are shown. TABLE I. Color Fractions. Dot-On-Dot Geometric Calculation of Color Area Fractions from Ink Area Fractions For Cyan, Magenta Two-Ink Halftones Color If c < m If c ≥ m white cyan magenta blue 1–m 0 m–c c 1–c c–m 0 m paper between the halftone dots as the dot area fractions change.10,11 The probability model appears to provide a mechanistic rational for this phenomenon. Moreover, the model rationalizes the overall color of the halftone image. The printer Vol. 42, No. 4, July/Aug. 1998 339 used in the project employed a default algorithm for gray color removal, and this prevented experimental testing with more than two of the three cmy inks. However, the fit with the two color cases strongly supports the model. This, in turn, indicates that the Neugebauer Eqs. 2 and 5 are correct descriptions of halftone color reproduction provided the eight reflectance spectra Ri and the eight sets of tristimulus values XYZi are treated as continuous functions of ink fractions cmy and not as the reflectance spectra and tristimulus values of the eight Neugebauer colors printed at 100% coverage. This point is emphasized by integrating Eq. 13 directly to find the eight sets of Neugebauer tristimulus values to use in Eq. 5. Integration leads to the following: 8 fj XYZ = ∑ Pij XYZij . fi j =1 (19) Note that integration leads to a matrix of 64 tristimulus values XYZij. The eight values on the diagonal XYZjj are the traditional Neugebauer values for the eight Neugebauer colors printed at 100% coverage.2,11 These values may be measured independently. However, the off-diagonal values XYZij are tristimulus values for light that passes Dot j, scatters in the paper, and then passes through Dot i. The XYZij tristimulus values can not be measured independently. Unlike the Yule–Nielsen modification to the Neugebauer equation, the probability model has a direct link with the fundamental optical and geometric characteristics of the halftone system via Eq. 9. While the probability model is significantly more complex than the traditional n modified Yule–Nielsen model, it is significantly less complex than a convolution model involving the fundamental probability function PSF of light in the paper. Gustavson12 has demonstrated such a model, and it is fundamentally correct theoretically. However, the current probability model is expressed with closed analytical functions and is much more amenable to modifications for nonideal systems, as demonstrated in previous work.9 Moreover, one should be able to derive the mean level probabilities Pjj from the fundamental probability PSF and a knowledge of the geometry of the halftone system. Because the PSF of paper is quite difficult to measure experimentally, it is typically modeled empirically. In the current model, we begin by modeling Pjj empirically. In addition, as demonstrated previously,8 it may be easier to measure the Pjj than the PSF. Thus, one experimental approach to measuring PSF may be to measure Pjj with several known dot geometries and then to calculate PSF. 340 Journal of Imaging Science and Technology Appendix A reviewer of this manuscript correctly pointed out that Eq. 15 implies an assumption. The assumption is that the off-diagonal probabilities Pik are proportional to the area fractions fi so that Pik = ak fi, for i ≠ k. If a nonlinear proportionality actually applies so that Pik = akG(fi) for some function G, then Eqs. 
15 through 18 become more complex. While this may certainly be the case, it is not revealed in the experimental data and the data are not sufficiently noise-free to provide a guide to a more advanced estimate of the functional form of Eq. 15. For a more rigorous analysis of this probability, the reader is directed to recent theoretical work by Rogers.13–15 Acknowledgments. The authors express their appreciation to the Hewlett-Packard Company for support of this project. Thanks to the reviewers of the paper who offered extremely helpful criticism. Special thanks to the students in the 1997/98 course in Color Reproduction at RIT for finding all the typos. References 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. J. A. Yule and W. J. Nielsen, TAGA Proc. p. 65 (1951). J. A. Yule, Principles of Color Reproduction, Chap. 10, John Wiley & Sons, NY, 1967, p. 255. T. N. Pappas, Proc. IS&T 47, 468 (1994). J. A. S. Viggiano, TAGA Proc. 37, 647 (1985). J. S. Arney, P. G. Engeldrum and H. Zeng, An expanded Murray–Davies model of tone reproduction in halftone imaging , J. Imaging Sci. Technol. 39, 502 (1995). J. S. Arney, C.D. Arney and P. Engeldrum, Modeling the Yule–Nielsen halftone effect, J. Imaging Sci. Technol. 40, 233 (1996). J. S. Arney, A probability description of the Yule–Nielsen effect, J. Imaging Sci. Technol. 41, 633 (1997). J. S. Arney and M. Katsube, A probability description of the Yule–Nielsen effect II: The impact of halftone geometry, J. Imaging Sci. Technol. 41, 637 (1998). M. Alber, Modeling the effect of ink spread and penetration on tone reproduction, M.S. Dissertation, Rochester Inst. of Technol., Rochester, NY, 1997. P. G. Engeldrum and B.Pridham, Application of turbid medium theory to paper spread function measurements, TAGA Proc . 47, 353 (1995). P. G. Engeldrum, The color gamut limits of halftone printing with and without the paper spread function, J. Imaging Sci. Technol. 40, 2229 (1996). S. Gustavson, Color gamut of halftone reproduction J. Imaging Sci. Technol. 41, 283 (1997). G. L. Rogers, Optical Dot gain in a halftone print, J. Imaging Sci. Technol. 41, 643 (1997). G. L. Rogers, Optical dot gain: Lateral scattering probabilities, J. Imaging Sci. Technol. 42(4), 336-339 (1998). G. L. Rogers, The effect of light scatter on halftone color, J. Imaging Sci. Technol. 42, in press (1998). Arney, et al. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Optical Dot Gain: Lateral Scattering Probabilities Geoffrey L. Rogers* Matrix Color, 26 E 33 Street, New York, New York 10016 In the development of the technology of halftone imaging there has been significant interest in physically modeling the halftone microstructure. An important aspect of the microstructure is the scattering of light within the paper upon which the halftone image is printed. Because of light scatter, a photon may exit the paper at a point different from the point at which it entered the paper. The effect that this light scatter has on the perceived color of the printed image is called optical dot gain. Optical dot gain can be characterized by lateral scattering probabilities, which is the probability that a photon entering the paper through a particularly inked region exits the paper through a similar or different type inked region. In this article we explicitly calculate these lateral scattering probabilities for the case of AM and FM halftone screening. 
We express these probabilities in terms of the fractional ink coverage and the lateral scattering length, a quantity that characterizes the distance a photon travels within the paper before exiting. Journal of Imaging Science and Technologies 42: 341–345 (1998) Introduction Halftone imaging is a widely used technique for producing printed images. Recently there has been significant interest in physically modeling the halftone microstructure to control the tone characteristics of the halftone image better.1–4 An important aspect of this microstructure is the scattering of light within the paper upon which the image is printed. This effect of scattering is called optical dot gain, because, for achromatic images, the ink dots are effectively larger as a result of the scattering.5 Several authors have expressed optical dot gain in terms of lateral scattering probabilities1,2 which is the probability that a photon having entered the paper through a particular type inked region exits the paper through a similar or different type inked region. In this article we explicitly calculate these probabilities; in particular we calculate the ink–ink probability, which is the probability that if a photon enters the paper through an inked region it also exits the paper through an inked region—a conditional probability we label Pii. Knowledge of Pii allows one to calculate all the other lateral scattering probabilities.1 Although the calculation done here involves a single array of dots, our results are applicable to a chromatic halftone image.9 We make the calculation for the case of both AM and FM halftone screening.2 In Ref. 1 it is shown that the ink–ink probability can be expressed in terms of an infinite series—the Z-series— involving the Fourier transforms of the dot shape and the paper’s point spread function. Here, we explicity calculate Pii and obtain a closed-form expression. The model we construct to determine Pii is as follows: a uniform stream of photons is incident on the paper within an area of one dot, and we calculate the fraction of the Original manuscript received September 15, 1997 * IS&T Member © 1998, IS&T—The Society for Imaging Science and Technology photons that exit the paper through this dot and through all the other dots. This fraction is Pii. We assume that the dots are circular with radius d and that they are arranged in a square grid (screen) with screen period r. The origin of the coordinate system is at the center of the dot through which the photons enter the paper (see Fig. 2). We define 2πR(ρ)ρ,dρ as the probability that a photon, having entered the paper through the dot centered on the origin, exits the paper through an annulus, also centered on the origin, with radius ρ and thickness dρ. R(ρ) is the radial reflectance per unit area, and R(ρ) integrated over the entire surface is the paper’s reflectance, Rp: ∞ R p = 2π ∫0 R(ρ ) ρdρ. (1) We define the radial covering distribution A(ρ), as the probability that an arbitrary point at a distance ρ from the origin is covered by ink. Then the ink–ink probability is: ∞ Pii = 2π ∫0 R(ρ ) A(ρ ) ρdρ. (2) In the section “Reflectance” we calculate the reflectance per unit area, R(x,y), for photons that have entered the paper through the area of a single dot. In the section “Radial Covering Distribution” we calculate the covering distribution A(ρ). In the section “Ink-Ink Probability” we carry out the integration of Eq. 2, making two approximations to obtain a closed-form expression for Pii. 
The calculations carried out in these sections are for AM halftone screening in which the number of dots within a region is constant and the size of the dots is varied. In the section “FM Halftone Screen” we calculate Pii for FM halftone screening: the dots are of constant size and the number of dots is varied. In the section “Ink-Ink Probability for Diffusion PSF” we give the ink-ink probability as calculated with the diffusion point spread function. Reflectance The reflectance per unit area R(x,y) is the probability that a photon exits the paper at the point x,y after having 341 Figure 1. Radial reflectance per unit area R(ρ) with d = 0.4r and (a) ρ = 0.1r, (b) ρ = 0.6r, (c) ρ = 1.5r, and (d) ρ = 4.0r. entered the paper through the area of a dot of radius d centered on the origin and is given by: Rp R( x, y) = ∫∫ H ( x − x' , y − y' ) I ( x' , y' ) dx' dy' , (3) S0 where S0 is the number of photons incident on the paper per unit time, H(x,y) is the paper’s normalized point spread function, and I(x,y) is the incident photon distribution. The value H(x – x′,y – y′) is the probability that a reflected photon having entered the paper at x′,y′ exits the paper at x,y. The value I(x,y) is the number of photons per unit area per unit time entering the paper at the point x,y and is given by: x 2 + y2 S , I ( x, y) = 02 circ d πd x 2 + y2 = πd 2 J1 (2πkd) , F circ d πkd where J1 is a Bessel function. Due to the circular symmetry, the inverse Fourier transform can be expressed as a Hankel transform, and one writes Eq. 3 as: (4) where ρ is the polar radial coordinate. To evaluate Eq. 4, one must choose an appropriate point spread function. A widely used PSF is6: Journal of Imaging Science and Technology ρ2 K 0 (2πρ / ρ ), where K0 is a modified Bessel function of the second kind. The parameter ρ is ρ = 4 < ρ > and < ρ > is the first moment of H called the lateral scattering length. It is the average lateral distance a photon travels within the paper, and its inverse, < ρ >–1, is the approximate bandwidth of the paper. The MTF is: 1 1 + (ρ k) 2 . (5) 1 − (2πd / ρ ) K 1 (2πd / ρ ) I0 (2πρ / ρ ), 0 ≤ ρ ≤ d πd 2 R(ρ ) = , (6) d<ρ Rp (2πd / ρ ) I1 (2πd / ρ ) K 0 (2πρ / ρ ), The integral Eq. 3 is a convolution and can be evaluated as the inverse Fourier transform of the product of the transforms of H(x,y) and I(x,y). The transform of H(x,y) is the paper’s modulation transfer function (MTF) labeled H˜ (k), with k the spatial frequency (lines per unit length). Owing to the assumed isotropy of the point spread function, the MTF has circular symmetry. The transform of the circ[ ] function is: 342 2π Integrating Eq. 4 using Eq. 5, one finds: 1, 0 ≤ ρ ≤ d . 0, ρ > d ∞ πd 2 R(ρ ) = 2πd ∫0 H˜ (k) J1 (2πkd) J0 (2πkρ) dk, Rp H (ρ) = H˜ (k) = where d is the radius of the dots and circ [ρ/d] is: ρ circ = d Figure 2. The small circles are dots, and the large circle has radius ρ. Light is incident through the central dot. The value A(ρ) is the sum of the bold arc-lengths of the large circle divided by its radius. The value θ is the angle subtended by the bold arclengths. where I0 and I1 are modified Bessel functions of the first kind. Figure 1 shows the radial reflectance Eq. 6, with d = 0.4r, for several different ρ . Radial Covering Distribution The radial covering distribution, A(ρ), is the probability that an arbitrary point at a distance ρ from the origin is covered by ink. The value A(ρ) is the fraction of the circumference of a circle, centered on the origin with radius ρ, that lies on a dot. 
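For readers who want to evaluate the radial reflectance numerically, the sketch below implements Eq. 6 as transcribed above (our own illustration; the function name and parameter values are ours, and SciPy is assumed to be available for the modified Bessel functions). The covering distribution A(ρ) just defined is illustrated graphically next.

```python
import numpy as np
from scipy.special import i0, i1, k0, k1

def radial_reflectance(rho, d, rho_bar, Rp=1.0):
    """Radial reflectance per unit area R(rho) of Eq. 6 (as transcribed here) for
    photons entering through a dot of radius d centered on the origin.
    rho_bar is the PSF parameter (four times the lateral scattering length)."""
    rho = np.asarray(rho, dtype=float)
    x = 2.0 * np.pi * d / rho_bar
    inside = 1.0 - x * k1(x) * i0(2.0 * np.pi * rho / rho_bar)          # 0 <= rho <= d
    outside = x * i1(x) * k0(2.0 * np.pi * np.maximum(rho, 1e-12) / rho_bar)  # rho > d
    piece = np.where(rho <= d, inside, outside)
    return Rp * piece / (np.pi * d ** 2)

# Example: d = 0.4*r and rho_bar = 1.5*r, with the screen period r set to 1
print(radial_reflectance([0.1, 0.5, 1.0], d=0.4, rho_bar=1.5))
```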
This is shown graphically in Fig. 2. The small circles are the dots, with radius d, and the large circle has a radius ρ. The variable A(ρ) is the sum of the bold arclengths of the large circle divided by its circumference. If the dots overlap (d > r/2), then for some values of ρ the sum of the arc-lengths is larger than the circumference—in this case, all points of the large circle lie on a dot and A(ρ) = 1. The value A(ρ) is calculated as follows: We define the neighbor distribution N(s) as the number of dots whose centers lie at a distance s from the origin. We define θ(s,ρ) as the angle subtended by the arc-length covering a dot whose center lies at a distance s, as shown in Fig. 2. Then, the radial covering distribution is: A( ρ ) = 1 ∞ ∫ N ( s)θ ( s, ρ )ds. 2π 0 (7) Rogers Integrating the first term and dividing by πd2 one obtains: 1 − 2 K 1 (2πd / ρ ) I1 (2πd / ρ ). (12) This expression is the probability that a reflected photon exits the paper through the same dot as that through which it entered the paper. Integrating the second term, one obtains a sum of integrals of the form: xk + d ∫x k −d x2 + ρ 2 − d2 K 0 (2πρ / ρ ) arccos k ρdρ. 2 xk ρ (13) These integrals can be evaluated numerically with littletrouble, however it is possible to get a very accurate closedform expression by making two approximations. The first is an approximation to the arccos [ ]: x2 + ρ 2 − d2 1 d 2 − ( xk − ρ ) 2 . arccos k → ρ 2 xk ρ Figure 3. The value A(ρ) with d = 0.4r. The second approximation is: Both N(s) and θ(s, ρ) are derived in Ref. 2 and are given by: K 0 (2πρ / ρ ) → K 0 (2πxk / ρ ) exp[ −2π (−ρ for xk - d ≤ ρ ≤ xk + d. The errors in these approximations tend to cancel each other for all d and ρ so that the expression θ ( s, ρ ) = 2π , 0 ≤ ρ ≤ d − s 2 2 2 2 arccos s + ρ − d /(2 sρ ) , s − d ≤ ρ ≤ s + d 0, ρ ≤ s − d or ρ ≥ s + d [( ] ) K 0 (2πxk / ρ ) ∫− d exp[ −2πu / ρ ] d 2 − u 2 du (8) d ∞ ∑ pkδ ( xk − s), ρd I1 (2πd / ρ ) K 0 (2πxk / ρ ). 2 (9) k=0 where δ(x) is a Dirac delta function and x k is r k with k a natural number, and pk is the number of combinations of integers n and m such that k = n2 + m2. The quantity xk is the distance to the kth “set” of dots, and pk is the number of dots in the “set”; i.e., the number of dots at a distance xk. The first few xk/r with nonzero pk are 0, 1, 2, 2, 7 5 , 8 ; and the corresponding pk are 1, 4, 4, 4, 8, 4. Carrying out the integration in Eq. 7 and defining: Rp−1 Pii = 1 − 2 K 1 (2πd / ρ ) I1 (2πd / ρ )+ 2[ I1 (2πd / ρ )]2 S(ρ ), (16) where we define: S( ρ ) = ) ] d , A( ρ ) = ∑ k=0 Ak (ρ ). Ink–Ink Probability Inserting the expressions for R(ρ), Eq. 6, and A(ρ), Eq. 10, into Eq. 2, one obtains: d 2πd πd 2 Pii = 2π ∫0 1 − K 1 (2πd / ρ ) I0 (2πρ / ρ ρdρ Rp ρ + (2π ) 2 ∞ ∞ d I1 (2πd / ρ ) ∑ ∫d K 0 (2πρ / ρ ) Ak (ρ ) ρdρ. ρ k= 1 Optical Dot Gain: Lateral Scattering Probabilities π (d / r )2 , µ= (θ + cosθ ) /(1 + sin θ ), (10) Figure 3 shows A(ρ) for dot radius d = 0.4r. Equation 10 is correct for d ≤ r/2. If d > r/2, the right side of Eq. 10 is greater than 1 for some values of ρ, in which case one sets A(ρ) = 1. (11) (17) The second term in Eq. 16, 2[I 1 (2πd/ ρ )] 2 S( ρ ), is the probability that reflected photons exit the paper through dots other than the one through which they entered the paper. It is convenient to express Pii in terms of the fractional ink coverage rather than the dot radius. The percent area covered by ink, µ, is: one obtains: ∞ ∞ ∑ pk K 0 (2πxk / ρ ). 
k= 1 and for k ≥ 1: ( p / π )arccos x k2 + ρ 2 − d 2 /(2 x k ρ ) , x k − d ≤ ρ ≤ x+k Ak ( ρ ) = k 0, ρ < x k − d or ρ> x k + d (15) Inserting Eqs. 12 and 15 into Eq. 11, one obtains: 1, 0 ≤ ρ ≤ d A0 (ρ ) = 0, ρ > d [( (14) is a very accurate approximation to Eq. 13. The integral is easily evaluated, and one obtains for Eq. 14: and N (s) = xk ) / ρ ], 0 ≤ d ≤ r/2 r/2 ≤ d ≤ r/ 2 (18) where θ= π r − 2 arccos ÷. 2d 2 The expression Eq. 16 is correct for 0 ≤ µ ≤ π/4. Numerical integration of Eq. 2 indicates that linear extrapolation of Eq. 16 for π/4 ≤ µ ≤ 1 is an excellent approximation. One then obtains for the ink-ink probability: 1 − ξ (µ ), R p−1 Pii (µ ) = 1 − (1 − µ ) / (1 − µ 0 ) ξ (µ 0 ), [ ] 0≤µ ≤π /4 π / 4 ≤ µ ≤ 1 (19) where Vol. 42, No. 4, July/Aug. 1998 343 Figure 4. The value Pii as a function of µ for various ρ . (a) ρ = 0.2r, (b) ρ = 1.0r, (c) ρ = 2.0r, and (d) ρ = 6.0r. Figure 6. Comparision of the first and second terms of Eq. 16 with ρ = 1.5r. (a) Probability that photon exits incident dot. (b) Probability that photon exits any of the other dots. (c) Total probability that photon exits a dot, sum of (a) and (b). that our final result depends only on the average number of dots within a given region; we assume that within a region of constant tone, the dots are uniformly distributed. For ease in notation we assume the paper reflectance Rp is unity; for Rp < 1, the final expression for Pii is multiplied by Rp. We assume the dots are potentially located on a square grid array with period r. The dots are labeled by their coordinates n, m, with the photons entering the paper through the n = 0, m = 0 dot. We define Pnm as the probability that a photon having entered the paper through the dot 0, 0 exits the paper through the dot n, m. We also define the stochastic variable pnm as: 1, if there isa dot at m, n pnm = 0, if there is nodot at m, n Figure 5. The value Pii as a function of ρ for (a) µ = 0.1, (b) µ = 0.4, (c) µ = 0.6, and (d) µ = 0.9. ( ξ (µ) = 2 I1 2r µπ ρ / )[ K (2r 1 µ π ρ/ ) − I (2r πµρ / ρ )S( )] 1 and µ0 = π/4. Note that ξ(µ) is the probability that a reflected photon exits the paper through a nonink region after entering through an inked region. Figure 4 shows Pii versus µ for several ρ and Fig. 5 shows Pii as a function of ρ for several µ. In the figures, ρ is in units of r. As indicated by the curves in Fig. 5 and as can be shown by Eq. 16, if ρ >> r, then Pii ≈ µ. This corresponds to the case of “complete scattering”.1 Figure 6 shows the first and second terms in Pii separately (as a function of µ) for ρ = 1.5. Curve (a) is the probability that the light exits through the incident dot, (b) is the probability it exits through the other dots, and (c) is the sum of (a) and (b). For convenience, we have set the paper reflectance equal to unity in all the figures. FM Halftone Screen In this section we calculate the ink-ink probability for an FM halftone screen. In such a method, all the dots have the same size and are square with dot area equal to a cell area and the number (or frequency) of dots is varied. There are a number of techniques for determining the exact placement of the dots.8 The calculation done here is general in 344 Journal of Imaging Science and Technology (20) subject to the constraint: lim N →∞ 1 N2 N /2 ∑' n,m =− N / 2 pnm = µ, (21) where the ′ on Σ indicates that the n = m = 0 term is excluded from the sum (p00 ≡ 1) and µ is the fractional ink coverage. The left side of Eq. 
21 is the average pnm so that: <pnm> = µ (22) (excepting the n = m = 0 term). The ink–ink probability is obtained by first summing the probability that a photon exits the paper through the n, m cell, Pnm , over all cells that contain a dot (pnm = 1), then averaging over all realizations of the pnm consistent with Eq. 21: Pii = ∑ pnm Pnm nm = ∑ pnm Pnm . (23) nm For a uniform distribution, the average over all possible realizations of the pnm is equivalent to the average defined by the left side of Eq. 21, so one can write: Pii = µ ∑ Pnm − P00 + P00 . nm As we assume Rp = 1, the sum is unity: ∑ Pnm = 1, nm (24) (25) Rogers Figure 7. The value FM Pii as a function of µ for (a) ρ = 0.2r, (b) ρ = 1.0r, (c) ρ = 2.0r, and (d) ρ = 6.0r. Figure 8. The value FM Pii as a function of ρ for (a) µ = 0.1, (b) µ = 0.4, (c) µ = 0.6, and (d) µ = 0.9. which simply states that the number of photons is conserved. The probability that the photons exit the same dot as that through which they entered the paper, P00 , is given by Eq. 12 (where we approximate the square dot with a circular dot with area equal to cell area) with d = r/√π, so the ink–ink probability is: ξ (µ) = Pii = 1 – (1 – µ)χ, ) ( ) qn / σ n2 n=1 1 + (2πkt / σ n ) 2 ∑ , (28) qn / σ n2 , n and the lateral scattering length is: ρ = tπ 2 Rp ∑ ∞ ∑ k= 1 pk K 0 (σ n xk / t). qn / σ n3 . n The diffusion ink–ink probability for AM screening has the same form as Eq.19 with ξ(µ) given by: Optical Dot Gain: Lateral Scattering Probabilities χ= 2 Rp ∞ ∑ n=1 ( ) ( ) qn I1 r / π σ n / t K 1 r / π σ n / t . σ n2 (30) Conclusion In this article, we explicitly calculate the probability that a photon exits the paper through an inked region after originally entering the paper through an inked region, and we obtain a simple closed-form expression. This conditional probability completely contains the effects of optical dot gain; i.e., knowledge of this probability allows one to account for the effects of optical dot gain in a halftone print completely. We calculate the probability for both AM and FM halftone screening. The results reported here also allow a simple calculation of the Z that appear in the theory of the multi-ink halftone image.9 G. L. Rogers, Optical dot gain in a halftone print, J. Imaging Sci. Technol. 41, 643 (1997). 2. J. S. Arney, Probability description of the Yule–Nielsen effect: Part I and II, J. Imaging Sci. Technol. 41, 633 (1997). 3. G. L. Rogers, Neugebauer revisited: Random dots in halftone screening, Col. Res. Appl. 23, 104, (1998). 4. (a) J. S. Arney, C. D. Arney, and P. G. Engeldrum, J. Imaging Sci. Technol. 40, 233 (1996); (b) J. S. Arney, C. D. Arney, and Miako Katsube, J. Imaging Sci. Technol. 40, 19 (1996); (c) J. S. Arney, P. G. Engeldrum, and H. Zeng, J. Imaging Sci. Technol. 39, 502 (1995). 5. J. A. C. Yule and W. J. Nielsen, TAGA Proc. 3, 65 (1957). 6. J. C. Dainty and R. Shaw, Image Science, Academic Press, New York, 1974. 7. G. L. Rogers and R. Bell, Bessel function identity: dots in a circle, to be published. 8. See, for example, A. Zakhor, S. Lin, and F. Eskafi, A new class of B/W halftoning algorithms, IEEE Trans. Image Process. 2, 499 (1993); D. E. Knuth, Digital halftones by dot diffusion, ACM Trans. Graph. 6, 245 (1987); R. W. Floyd and L. Steinberg, An adaptive algorithm for spatial gray scale, SID ’75 Dig., Society for Information Display, 36 (1975). 9. G. L. Rogers, The effect of light scatter on halftone color, J. Opt. Soc. Am. 15, 1813 (1998). 1. where qn and σn are defined in Ref. 1 and t is the paper’s thickness. 
The paper’s reflectance is: Rp = Sn = References Ink–Ink Probability for Diffusion PSF The MTF of the diffusion point spread function is:1 ∞ ) ] ) ( were Sn is given by: (27) Unlike with the AM halftone screen, the probability here is linear with µ for all ρ . The Pii is shown as a function of µ for several different ρ in Fig. 7, and as a function of ρ for several different µ in Fig. 8. For ρ >> r, the AM Pii(µ) is equal to the FM Pii(µ). Note that χ is the probability that a photon having entered the paper through a dot exits the paper outside the dot. The different terms of Pii can be interpreted by writing Eq. 26 as Pii = 1 – χ + µχ. In other words: [the probability that the photon exits through a dot (Pii)] = [the probability it exits within the dot through which it entered the paper (1 – χ)] + [the probability there is a dot located at an arbitrary point (µ)] × [the probability the photon exits the paper outside the dot through which it entered (χ)]. ∑ )[ ( ( qn I1 r µ / π σ n / t K 1 r µ / π σ n / t − I1 r µ / π σ n / t Sn , σ n2 For FM halftone screening, Pii has the same form as Eq. 26 with χ given by: χ = 2 K 1 2 π r / ρ I1 2 π r / ρ . 1 H˜ (k) = Rp ∑ n= 1 (29) (26) with: ( 2 Rp ∞ Vol. 42, No. 4, July/Aug. 1998 345 JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Diffuse Transmittance Spectroscopy Study of Reduction-sensitized and Hydrogen-hypersensitized AgBr Emulsion Coatings Yoshiaki Oku and Mitsuo Kawasaki*† Department of Molecular Engineering, Graduate School of Engineering, Kyoto University, Yoshida, Kyoto 606-8501, Japan A diffuse transmittance spectroscopy method that utilizes a large-area photodetector in contact with the sample allowed the absorption spectra of reduction-sensitization centers and hydrogen-hypersensitization centers in AgBr emulsion to be measured using standard emulsion coatings. Both reduction-sensitization centers produced by dimethylamine borane and hydrogen-hypersensitization centers exhibited the common absorption band centered at 455 nm, shorter by ~20 nm than that previously ascribed by other groups to similar reduction-sensitization centers in thick liquid layers of AgBr emulsions by means of diffuse reflectance spectroscopy. Journal of Imaging Science and Technology 42: 346–348 (1998) Introduction The formation and properties of small silver clusters in silver halide emulsions continue to be a critical issue in photographic science. Direct experimental characterizations of these clusters are requisite to gain proper understanding of their properties and photochemical behaviors, but are not necessarily easy to achieve because of the extremely small size and small concentration at which such clusters generally function in photographic systems. In this context, Tani and Murofushi first demonstrated that a diffuse reflectance spectroscopy method could be a promising technique to obtain the absorption spectra of reduction-sensitization centers in photographic emulsions.1 Usefulness of the same method was confirmed by Hailstone and coworkers,2,3 though questions still remain about the identity of the centers that give rise to the relevant absorption band observed at ~475 nm. Powerful as it is, diffuse reflectance spectroscopy requires measured reflectance data be analyzed by the so-called Kubelka−Munk equation4 to obtain a spectrum that approximately scales linearly to the real absorption coefficient of the absorbing species. 
Furthermore, to meet the condition for the use of the Kubelka−Munk transform, derived for an ideal scattering layer with infinite thickness, the previous measurements invariably involved thick layers of liquid emulsions. Thus the corresponding spectroscopic data may not necessarily be correlated directly with a variety of other experimental data, most of which were obtained by using standard emulsion coatings. From this standpoint, we briefly introduce here a diffuse transmittance spectroscopy method, which is an alternative technique to obtain reliable spectroscopic data for small silver Original manuscript received July 27, 1997 * IS&T Member; Corresponding Author † e-mail: kawasaki@ap6.kuic.kyoto-u.ac.jp FAX: (+81)75-753-5526 © 1998, IS&T—The Society for Imaging Science and Technology 346 clusters that are present in arbitrary emulsion coatings on a clear support. Experimental Figure 1 shows the components and simple optical geometry of the model spectrometer constructed for the present purpose. Automatic wavelength scan and data acquisition are not presently available in this system so the measurement was done point-by-point at 5 to 10-nm intervals by manual adjustment of the monochromator. The data were then subject to interpolation in a personal computer to obtain a smoothed spectrum. In Fig. 1, a stack of sample films is mounted in close contact with an end-on photomultiplier tube (Hamamatsu Co., Hamamatsu, Japan, type R375) with a large aperture size ~50 mm in diameter; a geometry that ensures capturing of the majority of the diffuse transmitted light by the photomultiplier. The test emulsion coatings consist of monodisperse 0.45-µm octahedral AgBr grains, coated on a clear support at silver and gelatin coverages of 1.0 g/m2 and 2.0 g/m2, respectively. It is this comparatively low silver coverage, giving the maximum developable density of ~0.9, that necessitated the use of a stacked film sample to improve the signal-to-noise ratio. We limited, however, each stack to a maximum of 10 sample films, resulting in the total sample thickness of ~2.5 mm including the film base. For the principle of the measurement, it should be noted first that the light scattering characteristic of the sample is controlled by the large number of AgBr emulsion grains regardless of the presence of a small amount of extra absorbing species. In addition, one may expect the effective path lengths across such a highly scattering layer to be approximately equal for all the diffuse transmitted photons at the given thickness of the sample. In this condition, when the diffuse transmittance of a control sample with no silver clusters is measured as T0 and that of a sample with a small concentration of silver clusters as T, their absorptivity may be expressed simply by the diffuse absorbance, as defined by −log T/T0. This could also be supported experimentally, as the measured absorbance proved to scale linearly to the total number of coatings stacked together. Monochromator Figure 1. Experimental setup for diffuse transmittance measurement, with a stack of emulsion coatings mounted in close contact with an end-on photomultiplier. Figure 3. A series of diffuse absorbance spectra of reduction-sensitization centers produced by DMAB, measured for a stack of 10 sample coatings. The number attached to each spectrum refers to the initial DMAB concentration in mg/mol-Ag. The inset shows the relationship between the peak absorbance at 455 nm and the initial DMAB concentration. 
The present method has been favorably tested for a series of reduction-sensitized samples by dimethylamine borane (DMAB), as well as for hydrogen-hypersensitized5 coatings. Note that no spectroscopic information about the products of hydrogen hypersensitization has been made available in the previous diffuse reflectance works on liquid emulsions.1–3 The reduction sensitization (before coating) was carried out at 70°C for 40 min at selected DMAB concentrations ranging from 0.1 to 1.0 mg/mol-Ag. The hydrogen hypersensitization involved a 1-atm pure H2 atmosphere maintained at ~50°C, in which an unsensitized coating, evacuated by a turbo-molecular pump for ~14 h in advance, was kept for 1 to 5 h.

Figure 2. Sensitometric data (relative speed and fog density) for a series of (a) DMAB-sensitized and (b) hydrogen-hypersensitized samples as a function of DMAB concentration or time of hydrogen treatment. The sensitivity refers to the density 0.2 above the fog level, obtained by 1-s blue exposure followed by 10-min development in an M-AA-1 surface developer at 20°C.

Results and Discussion
Figure 2 shows the sensitometric data for the series of DMAB-sensitized and hydrogen-hypersensitized coatings. It can be seen that the hydrogen hypersensitization allowed noticeably higher sensitivity to be reached at lower fog density as compared with the DMAB sensitization. Figure 3 shows a series of diffuse absorbance spectra taken for the DMAB-sensitized samples. Another series of spectra obtained for the hydrogen-hypersensitized coatings is presented in Fig. 4. In both cases well-defined absorption bands are clearly resolved, the peak position being invariably located at 455 nm. This is substantially (~20 nm) shorter than that previously ascribed to similar reduction-sensitization centers in liquid-phase emulsions, which we tentatively attribute to different spectroscopic environments of the relevant silver clusters in liquid emulsions and in emulsion coatings. Figure 3 (see inset) suggests that the peak absorbance at 455 nm is an approximately linear function of the initial DMAB concentration. Because the diffuse absorbance is expected to be proportional to the concentration of the extra absorbing species, as noted already, the result indicates that the number of reduction-sensitization centers produced by the reaction involving DMAB increases linearly with the initial DMAB concentration. Regardless of the exact kinetics and stoichiometry involved, this would be a reasonable relationship, simply obeying the law of mass action, if the sensitizing reactions involving DMAB are identical and complete at all sensitizer levels. According to Fig. 4, the 455-nm absorption band also grows linearly with the time of hydrogen hypersensitization. This is another reasonable relationship, suggesting that the corresponding reaction rate was maintained approximately constant over the range of reaction times examined here. Note that the intensity of the ~455-nm absorption is considerably smaller as a whole for the hydrogen-hypersensitized coatings than for the DMAB-sensitized ones, as opposed to the trend in the photographic sensitivity (cf. Fig. 2); this is an interesting correlation requiring further pursuit.

Figure 4. As in Fig. 3 but for hydrogen-hypersensitization centers. Numbers on the spectra show the time/h of hydrogen treatment.
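The approximately linear dependence of the 455-nm peak absorbance on the initial DMAB concentration (inset of Fig. 3) is the kind of relationship that an ordinary least-squares fit makes quantitative. A minimal sketch follows; the concentration and absorbance arrays are purely illustrative stand-ins, not the measured values.

import numpy as np

# Placeholder data: initial DMAB concentration (mg/mol-Ag) and peak diffuse
# absorbance at 455 nm. These numbers are illustrative only.
conc = np.array([0.1, 0.3, 0.5, 1.0])
peak_a = np.array([0.005, 0.014, 0.026, 0.049])

slope, intercept = np.polyfit(conc, peak_a, 1)   # straight line A = m*c + b
pred = slope * conc + intercept
r2 = 1.0 - np.sum((peak_a - pred)**2) / np.sum((peak_a - peak_a.mean())**2)
print(slope, intercept, r2)
# A near-zero intercept and R^2 close to 1 would support the linear
# relationship discussed in the text.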
Of course, the identity of the centers that give rise to the 455-nm absorption band (or 475 nm as observed for liquid emulsions) may still be a controversial issue. Tani and Murofushi have associated this absorption band exclusively with what they referred to as P centers,1 which they claimed 348 Journal of Imaging Science and Technology to be capable of trapping photoelectrons. In contrast, Hailstone and coworkers have suggested that the same absorption band may be a more general feature associated with reduction-sensitization centers as a whole, of which the predominant function is hole removal.2,3 In our opinion, some silver centers with an electron-trapping property that somehow resembles that of the photolytic subimage center do form at high enough levels of DMAB sensitization, where the photographic behaviors of the sensitized film rather resemble those of sulfur-sensitized emulsions,6 and by prolonged development the fog density gradually but steadily increases up to the maximum developable density. (At the highest DMAB concentration of 1.0 mg/mol-Ag, the fog density increased to ~0.4 to 0.8 at 40 to 80 min development in the M-AA-1 developer.) Even so, such P-like centers do not seem to be totally responsible for the 455-nm absorption bands, which clearly showed up even at the lowest level of DMAB sensitization and also by hydrogen hypersensitization. In this aspect our results may be in closer agreement with the work of Hailstone and co-workers. Conclusion In summary, a diffuse transmittance spectroscopy method that utilizes a large-area photodetector in contact with the sample has proved a simple and reliable method to obtain direct spectroscopic data for reduction-sensitization centers that are present in the standard emulsion coatings. By this method, we were able to show that both reductionsensitization centers produced by dimethylamine borane and hydrogen-hypersensitization centers have the common absorption band centered at 455 nm. Acknowledgment. We thank Dr. Judith M. Harbison (retired) and Dr. Marian Henry of the Eastman Kodak Company for supplying the series of DMAB-sensitized sample coatings. References 1. 2. 3. 4. 5. 6. T. Tani and M. Murofushi, J. Imaging. Sci. Technol. 38, 1 (1994). S. Guo and R. K. Hailstone, J. Imaging. Sci. Technol. 40, 210 (1996). A. G. DiFrancesco, M. Tyne, C. Pryor, and R. K. Hailstone, J. Imaging. Sci. Technol. 40, 576 (1996). P. Kubelka and F. Munk, Z. Tech. Phys., 12, 593 (1931). T. A. Babcock, P. M. Ferguson, W. C. Lewis, and T. H. James, Photogr. Sci. Eng. 19, 49 (1975). M. Kawasaki and H. Hada, J. Imaging. Sci. 33, 21 (1989). Oku and Kawasaki JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Silver Clusters of Photographic Interest III. Formation of ReductionSensitization Centers in Emulsion Layers on Storage and Mechanism for Stabilization by TAI Tadaaki Tani,* Naotsugu Muro and Atsushi Matsunaga Ashigara Research Laboratories, Fuji Photo Film Co., Ltd., 210 Minami-Ashigara, Kanagawa, 250-0193, Japan Comparison of the increase in sensitivity of AgBr emulsion layers caused by their storage in Ar gas at 70°C for 72 h with that caused by reduction-sensitization of the same emulsions revealed that R centers of reduction-sensitization were formed on the emulsion grains during the storage. 
The driving force for the electron transfer from gelatin to AgBr grains on storage for formation of R centers was confirmed by ultraviolet photoelectron spectroscopy (UPS) measurement, whereby the Fermi level of gelatin was higher than that of AgBr. It was however proposed for the formation of R centers on storage that the electron transfer took place by a two-electron process without creating any free electron or any single silver atom in the AgBr grain. Fog centers as well as reduction-sensitization centers were formed on sulfur-plus-gold-sensitized grains during storage. A stabilizer 4-hydroxy-6-methyl-1,3,3a-7-tetraazaindene (TAI) depressed the sensitivity increase and fog formation in layers of sulfur-plus-gold-sensitized AgBr emulsions on storage. The observation provided evidence for the mechanism of stabilization of emulsion layers by TAI, according to which TAI depresses sensitometric change on storage by preventing formation of reduction-sensitization centers. Journal of Imaging Science and Technology 42: 349–354 (1998) Introduction The formation of reduction-sensitization centers owing to the reduction of silver ions on silver halide grains by gelatin was observed in several phenomena including reduction-sensitization by digesting emulsions at low pAg (i.e., silver digestion) as reported by Wood,1 formation of silver clusters during precipitation of silver halide emulsion grains as suggested by Pouradier2 and confirmed by Tani and Suzumoto3 and by Nakatsugawa and Tani,4 and reduction-sensitization by digesting emulsions at high pH as reported by DeFrancisco and coworkers.5 The formation of reduction-sensitization centers owing to the reduction of silver ions on silver halide grains by gelatin is important for understanding various photographic phenomena, because photographic emulsions are composed of suspensions of silver halide grains in gelatin or in aqueous gelatin solutions and reduction-sensitization centers have significant influence on various photographic phenomena.6 However, little analysis has been undertaken on the formation of reduction-sensitization centers as a result of the reduction of silver ions on silver halide grains by gelatin. In this series of investigations,7–9 reduction-sensitization of fine grain AgBr emulsions was studied by digesting them in the presence of DMAB and other sensitizers and by characterizing reduction-sensitization centers on Original manuscript received November 12, 1997. * IS&T Fellow © 1998, IS&T—The Society for Imaging Science and Technology the basis of measurements of sensitometry of emulsion layers, photoconductivity and ionic conductivity of emulsion grains, and diffuse reflectance spectra of emulsions. As a result of the above experiments, reduction-sensitization centers acting as positive hole traps and electron traps could be separately prepared and characterized according to our proposal, as dimers of silver atoms formed at electrically neutral sites and at positively charged kink sites (i.e., R centers and P centers), respectively. The present investigation was undertaken to confirm formation of reduction-sensitization centers in emulsion layers on storage by observing the phenomenon under such a simplified condition that nothing but the interaction of AgBr grains with gelatin could be the cause for the phenomenon taking place. The result was also compared with the phenomena caused by normal reduction-sensitization, which were well established in this series of investigations. 
The materials and experimental conditions employed in this study were the same as those in the previous investigations so that the observed phenomena could be analyzed on the basis of established results.7–9 We believe that the formation of reduction-sensitization centers on storage causes the photographic instability of emulsion layers, because it should increase their sensitivity and might cause the formation of fog centers, especially in sulfur-plus-gold-sensitized emulsions. We also believe a compound that depresses the formation of reduction-sensitization centers in an emulsion layer on storage photographically, stabilizes the emulsion. Birr discovered 4-hydroxy-6-methyl-1,3,3a,7-tetraazaindene (TAI) as a stabilizer, which depresses sensitometric change of an emulsion on storage.10 It was noted that sulfur-plus-gold sensitization was realized in practice, when Koslowski succeeded in stabilizing gold-enhanced fog us- 349 ing TAI.11 The stabilizing effect of TAI has been studied mainly in relation to the change in the condition of sulfursensitization centers on storage.6 Although an idea for the stabilizing effect of TAI in relation to the change in the condition of reduction-sensitization on storage was noted,11a any evidence for the change in the condition of reduction of reduction-sensitization in emulsion layers on storage was not described. The results obtained in this investigation could verify formation of reduction-sensitization centers on AgBr grains in emulsion layers on storage and provide the grounds for proposal of a mechanism for stabilization of emulsion layers, according to which stabilizers depress sensitometric change on storage by limiting the formation of reductionsensitization centers. Experimental The emulsions used were the same as those used in the previous investigation,7,8 composed of octahedral or cubic AgBr grains with equivalent circular diameter of 0.2 µm suspended in aqueous solutions of an inert and deionized gelatin provided by Nitta Gelatin Co., Ltd., (Yao, Osaka) and prepared at pH 2 at 75°C for 60 min by a controlled double-jet method12,13 to minimize formation of reductionsensitization centers during precipitation.3,4 These emulsions were reduction-sensitized by digesting them at 60°C for 60 min in the presence of various amounts of dimethylamine borane (DMAB). They were sulfur-plus-goldsensitized by digesting them at 60°C for 60 min in the presence of various amounts of sodium thiosulfate as a sulfur sensitizer, potassium chloroaurate as a gold sensitizer, and potassium thiocyanate as a stabilizer for a gold sensitizer. The above-stated emulsions were coated on triacetate cellulose (TAC) film base with 1.74 g of AgBr/m2 and 1.27 g of gelatin/m2 and used as film samples for the measurements of sensitometry and microwave photoconductivity. The film samples were exposed at room temperature for 10 s to a tungsten lamp (color temperature: 2856 K) through a continuous wedge. Exposed films were developed14 at 20°C for 10 min by use of a surface developer MAA-1, fixed, washed, dried, and subjected to the measurement of optical density. Photographic sensitivity of a film sample was given by the reciprocal of the exposure to give the optical density of 0.1 above fog density. To observe the formation of reduction-sensitization centers on AgBr grains in emulsion layers on storage, film samples were evacuated, stored at a fixed temperature for a fixed period in Ar, and subjected to the above-stated sensitometry. 
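As a worked illustration of the speed criterion just described (the reciprocal of the exposure that gives an optical density 0.1 above fog), the sensitivity can be read off a characteristic curve by simple interpolation. The sketch below is ours, with a hypothetical D−logE curve; it is not the authors' evaluation procedure.

import numpy as np

def speed_from_curve(log_exposure, density, fog, criterion=0.1):
    """Sensitivity as the reciprocal of the exposure giving a density
    `criterion` above fog, by linear interpolation on the D-logE curve.
    `density` must be monotonically increasing for np.interp."""
    log_e_at_target = np.interp(fog + criterion, density, log_exposure)
    return 1.0 / (10.0 ** log_e_at_target)

# Hypothetical characteristic curve: log10(exposure) vs. optical density.
log_e = np.array([-2.0, -1.5, -1.0, -0.5, 0.0])
dens = np.array([0.05, 0.08, 0.20, 0.60, 1.10])
print(speed_from_curve(log_e, dens, fog=0.05))   # relative speed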
The photoconductivity of AgBr grains in the emulsion layers was measured at -100°C by means of a 9-GHz microwave photoconductivity apparatus.15,16 Ultraviolet photoelectron spectroscopy (UPS) was also applied to an evaporated thin AgBr layer and a thin gelatin layer to measure the Fermi levels of AgBr and gelatin. A UPS apparatus was used with the retardation potential technique designed under the guidance of Seki.17 Results and Discussions Figure 1 shows the photographic sensitivity, fog density, and microwave photoconductivity of octahedral AgBr grains with equivalent circular diameter of 0.2 µm in emulsions, which were reduction-sensitized by digesting them at 60°C for 60 min in the presence of DMAB in the amount indicated on the abscissa. Sensitivity increased through two steps with increasing amounts of DMAB. The photoconductivity of the emulsion grains was hardly influenced by the sensitization centers bringing about the sensitivity increase 350 Journal of Imaging Science and Technology Figure 1. Photographic sensitivity (S, ), fog density (×), and photoconductivity (σ, ) of octahedral AgBr emulsions, unsensitized and reduction-sensitized at 60°C for 60 min in the presence of DMAB with the amount indicated in the abscissa. in the first step and was significantly decreased by the sensitization centers bringing about the sensitivity increase in the second step. This result provided the evidence for the idea proposed in the series of these investigations8,9 that the sensitivity increases in the first and second steps were ascribed to the effects caused by hole-trapping R centers and electron-trapping P centers, respectively. Figure 2 shows photographic sensitivity and fog density of unsensitized and reduction-sensitized octahedral AgBr emulsion layers stored at 70°C for 0, 72, and 96 h in dry Ar gas. The reduction-sensitization was carried out by digesting the emulsions at 60°C for 60 min in the presence of DMAB in the amount shown on the abscissa In the latter case, sensitivity increased through two steps with increasing amounts of DMAB in accordance with the previous papers,8 and as seen in Fig. 1. The sensitivity increases in the first and second steps were thus ascribed to the effects caused by R and P centers of reduction-sensitization, respectively. As seen in Fig. 2, storage in dry Ar gas at 70°C for 72 and 96 h significantly increased the sensitivity of the unsensitized emulsion, and the sensitivity increase was detected and saturated on storage at 4 and 72 h, respectively. After storage in dry Ar gas, the sensitivity increase caused by R centers of reduction-sensitization could hardly be observed, whereas the sensitivity increase caused by P centers of reduction-sensitization remained. The sensitivity achieved by storage in dry Ar gas was nearly the same as that achieved by R centers of reduction-sensitization. We propose from these results that R centers of reduction- Tani, et al. Figure 2. Photographic sensitivity (S) and fog density of unsensitized and reduction- sensitized octahedral AgBr emulsion layers stored in dry Ar gas at 70°C for 0(), 72(), and 96 h (×). The reduction sensitization was carried out by digesting the above-stated emulsions in the presence of DMAB in the amounts indicated in the abscissa. Figure 3. Photographic sensitivity (S) and fog density of unsensitized and reduction-sensitized cubic AgBr emulsion layers, stored in dry Ar gas at 70°C for 0(), 72(), and 96 h (×). 
The reduction-sensitization was carried out by digesting the abovestated emulsions in the presence of DMAB in the amounts indicated in the abscissa. sensitization were formed on the unsensitized emulsion grains on their storage in dry Ar gas at 70°C for 72 and 96 h owing to reduction of silver ions on the grains by gelatin, because the emulsion grains were surrounded only by gelatin in an inactive atmosphere during storage. Following the results shown in Fig. 2, Fig. 3 shows photographic sensitivity and fog density of unsensitized and reduction-sensitized cubic AgBr emulsion layers, which were stored in dry Ar at 70°C for 72 and 96 h. Sensitivity increased through two steps by reduction-sensitization with increasing amounts of DMAB in accordance with the results in the previous article.8 The sensitivity increases in the first and second steps were ascribed to the effects caused by R and P centers, respectively. In a similar fashion to the results with octahedral AgBr emulsions, the sensitivity of unsensitized cubic AgBr emulsions significantly increased by storage in dry Ar gas. After storage in dry Ar gas, the sensitivity increase caused by R centers of reduction-sensitization was hardly observed, whereas the sensitivity increase caused by P centers remained. The sensitivity achieved by the image in dry Ar gas was nearly the same as that achieved by R centers of reduction-sensitization. We likewise propose that R centers of reduction-sensitization were also formed on cubic AgBr grains in emulsion layers owing to reduction of silver ions on the grains by gelatin on storage in dry Ar gas. Following the results shown in Fig. 2, Fig. 4 shows photoconductivity along with photographic sensitivity of unsensitized and reduction-sensitized octahedral AgBr grains in emulsion layers stored in dry Ar gas at 70°C for 72 h. As seen here, storage in dry Ar gas hardly influenced the photoconductivity of unsensitized grains in contrast to the fact that the image significantly increased the grain sensitivity. This result also supports the idea that the storage of the emulsion layers in dry Ar gas caused the formation of R centers of reduction-sensitization on the grains. Figure 5 shows photographic sensitivity and fog density of sulfur-plus-gold-sensitized emulsion layers stored in dry Ar gas at 70°C for 0 and 18 h. The sulfur-plus-gold sensitization was carried out by digesting the emulsions at 60°C for 60 min in the presence of sulfur and gold sensitizers with the amounts indicated in the abscissa. As shown, storage increased the sensitivities and fog densities of the sulfur-plus-gold-sensitized emulsions. This result suggests that storage caused the formation of reduction-sensitization centers on the emulsion grains. Most of the reduction-sensitization centers contributed to the sensitivity increase, but some were converted by gold ions into fog centers composed of gold atoms. Following the results shown in Fig. 5, Fig. 6 shows photographic sensitivity and fog density of sulfur-plus-gold-sensitized octahedral AgBr emulsion layers without and with TAI stored in dry Ar gas at 70°C for 18 h. The increases in sensitivity and fog density of the sulfur-plus-gold-sensitized emulsions caused by the storage in dry Ar gas were less owing to the addition of TAI to the emulsions. 
To obtain knowledge of the driving force for the reduction of silver ions on AgBr by gelatin in the dry and inactive condition, UPS was applied to a thin layer of gelatin and an evaporated AgBr layer to obtain their electronic states according to the procedure reported in the literature17 using the apparatus described in Experimental section. Although a dried gelatin layer in the ground state Silver Clusters of Photographic Interest III... Vol. 42, No. 4, July/Aug. 1998 351 Figure 4. Photographic sensitivity (S) and photoconductivity of unsensitized and reduction-sensitized octahedral AgBr grains in emulsion layers, stored in dry Ar gas at 70°C for 0() and 72 h (). The reduction-sensitization was carried out by digesting the above-stated emulsions in the presence of DMAB in the amounts indicated in the abscissa. The photoconductivity of the emulsion grains was measured at –100°C by a microwave photoconductivity apparatus. is an insulator, it has its own Fermi level that can be determined by means of UPS because the gelatin is electronically excited and can therefore exchange electrons with AgBr to equalize the Fermi levels during the measurement of their UPS.17 In gelatin in an emulsion layer, there is some reducing substituents and/or impurities that can transfer electrons to silver halide to form silver clusters during storage. It is not clear at present how those substituents and/or impurities in a gelatin layer are related to the gelatin’s Fermi level. The resulting electronic energy level diagram of gelatin and AgBr is shown in Fig. 7. As indicated, it was found that the Fermi level of gelatin was higher than that of AgBr, indicating the presence of a driving force for electron transfer from gelatin to AgBr when they are in contact with each other. Based on this result, we propose that the reduction of silver ions on AgBr grains by gelatin occurs because of the electron transfer from gelatin to AgBr grains in emulsion layers during storage. Discussion Formation of Silver Clusters. As described in the previous section, reduction-sensitization centers were formed on AgBr emulsion grains during storage of emulsion layers owing to reduction of silver ions on the grains by gelatin. This phenomenon should be very important for 352 Journal of Imaging Science and Technology Figure 5. Photographic sensitivity (S) and fog density of sulfurplus-gold-sensitized octahedral silver bromide emulsion layers, stored in dry Ar gas at 70°C for 0() and 18 h (). The sulfurplus-gold sensitized was carried out by digesting the emulsions at 60°C for 60 min in the presence of sulfur and gold sensitizers with the amounts indicated in the abscissa. sensitivity and stability of silver halide photographic materials, because silver halide grains are always surrounded by gelatin and reduction-sensitization centers have significant influence on the sensitivity and stability of the photographic materials. We suggest that this phenomenon involved the formation of silver clusters brought about by reduction of silver ions on AgBr grains as a result of electron transfer from gelatin to the grains. It seems meaningful to compare electron transfer to AgBr by this reduction reaction to the formation of silver clusters with light-induced electron transfer from photoexcited sensitizing dyes to AgBr. Note that formation of reduction-sensitization centers during storage was very slow. 
The sensitivity increase by the reduction-sensitization centers was detected and saturated on storage in dry Ar gas at 70°C at 4 and 72 h, respectively. But electron transfer from sensitizing dyes in the excited state to AgBr emulsion grains takes place in several picoseconds when the energy level of the excited electrons in a dye is above the bottom of the conduction band of AgBr.18 We therefore propose that the energy level of the electrons transferred from gelatin to AgBr emulsion grains should be much lower than the bottom of the conduction band of AgBr.

It has been shown that Lowe's hypothesis7 really takes place according to the following steps: namely, an R center is oxidized by a positive hole to give Ag2+ (step 1), which undergoes ionic relaxation to give a silver atom and an interstitial silver ion (step 2); a silver atom then dissociates to give a free electron and an interstitial silver ion (step 3):
Ag2 (R center) + h+ → Ag2+, (1)
Ag2+ → Ag + Ag+, (2)
Ag → Ag+ + e–. (3)
Step 2 was originally proposed by Mitchell,18 and step 3 was proposed according to the analysis of low-intensity reciprocity failure.19 Accordingly, the appearance of free electrons leads to the formation of a silver cluster acting as an electron trap (i.e., a P center) on a grain, following the mechanism of the formation of a latent-image center.7 Namely, only one P center is formed and grown on such a fine AgBr grain according to the concentration principle7 when free electrons take part in the formation of the silver cluster.

It was obvious in this investigation that the electrons transferred from gelatin to an AgBr grain during storage were disposed to form many R centers on the grain. We therefore suggest that the electron transfer from gelatin to an AgBr grain takes place by creating neither free electrons nor free silver atoms in the grain, which is contrary to the electron transfer in spectral sensitization that creates free electrons in an AgBr grain.20 Namely, there is an essential difference between the electron transfer process by the reduction reaction and the light-induced electron transfer (i.e., spectral sensitization) for the formation of silver clusters. To analyze the proposed difference, it is meaningful to compare the following two electron transfer processes:
R + Ag+ → R+ + Ag, (4)
R + 2Ag+ → R2+ + Ag2. (5)
The electron transfer for spectral sensitization corresponds to process 4. It is proposed in this article that the electron transfer for the formation of a reduction-sensitization center by the reduction reaction corresponds to process 5.

Figure 7. The UPS energy level diagram of thin layers of AgBr and gelatin, where broken lines indicate their Fermi levels. The UPS measurement also gave the top of the valence band of AgBr and the highest occupied electronic energy level of gelatin as –6.30 and –6.84 eV, respectively. By taking21 the band gap of AgBr as 2.60 eV, the bottom of the conduction band of AgBr was evaluated to be –3.70 eV.

Figure 6. Photographic sensitivity (S) and fog density of sulfur-plus-gold-sensitized octahedral silver bromide emulsion layers without (,) and with (,) TAI, stored in dry Ar gas at 70°C for 0 (,) and 18 h (,). The sulfur-plus-gold sensitization was carried out by digesting the emulsions at 60°C for 60 min in the presence of sulfur and gold sensitizers with the amounts indicated in the abscissa. The amount of TAI added to the above-stated emulsions was 2 × 10–2 mole/mole AgBr.
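The band-edge bookkeeping quoted in the Figure 7 caption can be checked with one line of arithmetic. The sketch below simply restates those numbers; the closing comment is a hedged reading of the proposal in the text, not an additional result.

# Energy levels from the Figure 7 caption (eV; same reference level as Fig. 7).
E_valence_AgBr = -6.30    # top of the AgBr valence band
E_gap_AgBr = 2.60         # band gap of AgBr
E_homo_gelatin = -6.84    # highest occupied electronic level of gelatin

# Bottom of the AgBr conduction band: Ec = Ev + Eg.
E_conduction_AgBr = E_valence_AgBr + E_gap_AgBr
print(E_conduction_AgBr)                      # -3.70 eV, as quoted
print(E_conduction_AgBr - E_homo_gelatin)     # ~3.1 eV

# If the electrons donated by gelatin come from levels near its highest
# occupied level (-6.84 eV), they lie roughly 3 eV below the conduction-band
# edge, consistent with the proposal that storage creates no free electrons.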
In the case of spectral sensitization, R in Eq. 4 corresponds to a sensitizing dye molecule in the excited state in which only one electron is excited to the higher molecular orbital and available for the electron transfer within the lifetime of the excited dye molecule. Therefore, one electron is transferred from the excited dye molecule to the conduction band of silver halide in spectral sensitization contributing to the formation and growth of one silver cluster acting as an electron trap on a grain. Vol. 42, No. 4, July/Aug. 1998 353 But formation of a silver cluster by thermal reduction proceeds not through process 4, but through process 5 owing to the following reason: In process 5, R is in the ground state and has two electrons in the highest occupied molecular orbital both of which are available for electron transfer. In addition, a dimer of two silver atoms is much more stable than two isolated silver atoms owing to the large binding energy of the dimer.7–9,21 It is therefore expected that the formation energy of a silver cluster through process 5 is smaller than that through process 4 leading directly to the formation of a silver cluster by the reduction reaction. No analysis could be made in this investigation on the chemical composition of the gelatin that transferred electrons to AgBr emulsion grains during storage. However, the difference in Fermi level between gelatin and AgBr provides the driving force for electron transfer from gelatin to the grains. Namely, the Fermi level of AgBr is situated nearly at the middle of its forbidden band, lower than the Fermi level of gelatin. Electron transfer from gelatin to AgBr emulsion grains and formation of reduction-sensitization centers there could accordingly take place, raising the Fermi level of the grains until it became equal to that of gelatin. The detailed elementary processes for achieving process 5 could not however be clarified in this study. Mechanism of Stabilization by TAI. As described in the previous section, reduction-sensitization centers were formed on AgBr emulsion grains during storage of emulsion layers owing to two-electron reduction of silver ions on the grains by gelatin. This phenomenon should be very important for stability of silver halide photographic materials because formation of reduction-sensitization centers during storage should have significant influence on both sensitivity and fog density of photographic materials. As Yoshida, Mifune, and Tani reported,22 it is known that application of gold sensitization to reduction-sensitized emulsions causes formation of fog centers by converting reduction-sensitization centers into clusters of gold atoms that act as fog centers owing to the development-enhancing effect of gold atoms.6 It is therefore expected that formation of reduction-sensitization centers on silver halide grains in sulfur-plus-gold-sensitized emulsion layers during storage might cause an increase in fog density in addition to the increase in sensitivity. As shown in Fig. 5, storage in Ar gas actually increased the sensitivity and fog density of sulfur-plus-gold-sensitized emulsions. This result indicates that storage caused the formation of reduction-sensitization centers on the emulsion grains, most of which contributed to the sensitivity increase, and some of which were converted into fog centers composed of gold atoms. 354 Journal of Imaging Science and Technology As shown in Fig. 
6, increases in the sensitivity and fog density of the sulfur-plus-gold-sensitized emulsions by storage in dry Ar gas was less because of the addition of TAI to the emulsions. Namely, TAI could stabilize sulfur-plus-goldsensitized emulsion layers by inhibiting the formation of reduction-sensitization centers during storage. As indicated by Eq. 5, the formation of reduction-sensitization centers during storage depends upon the activity of silver ions. It is well known that TAI decreases the activity of silver ions by forming its sliver ions on the grain surface.6,11a Thus TAI could depress the formation of reduction-sensitization centers on emulsion grains during storage. Reference 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 22. 23. H. W. Wood, J. Photogr. Sci. 6, 33 (1958). J. Pouradier, J. Soc. Photogr. Sci. Technol. Jpn. 54, 464 (1991). T. Tani and T. Suzumoto, J. Imaging Sci. Technol. 40, 56 (1996). H. Nakatsugawa and T. Tani, Formation of silver clusters due to reduction of silver ions by gelatin during precipitation and digestion processes of silver halide emulsion grains in the preprint book of Autumn meeting of Soc. Photogr. Sci. Technol. Jpn. Nov., 1996, pp. 14–16. A. G. DeFrancisco, M. Tyne, C. Pryor, and R. Hailstone, J. Imaging Sci. Technol. 40, 576 (1996). T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 6. T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 4. T. Tani and M. Murofushi, J. Imaging Sci. Technol. 38, 1 (1994). T. Tani, J. Imaging Sci. Technol. 41, 577 (1997). E J. Birr, Z. Wissensch. Photogr. 47, 2 (1952). (a) M. H. Van Doorselaer, J. Imaging Sci. Technol. 37, 524 (1993); (b) F. W. H. Mueller, J. Opt. Soc. Am. 31, 499 (1949); (c) R. Koslowski, Z. Wissensch Photogr. 46, 65 (1951); (d) K. Meyer, Z. Wissensch Photogr. 47, 1 (1952). C. R. Berry and D. C. Skillman, Photogr. Sci. Eng. 6, 159 (1962). (a) E. Klein and E. Moisar, Photogr. Wiss. 11, 3(1962); (b) E. Klein and E. Moisar, Ber. Bunsenges. Phys. Chem. 67, 349 (1963); (c) E. Moisar and E. Klein, Ber. Bunsenges. Phys. Chem. 67, 949 (1963); (d) E. Moisar, J. Soc. Photogr. Sci. Technol. Jpn. 54, 273 (1991). T. H. James, W. Vanselow and R. F. Quirk, Photogr. Sci. Technol. 19B, 170 (1953). (a) L. M. Kellogg, N. B Libert and T. H. James, Photogr. Sci. Eng. 16, 115 (1972); (b) L. M. Kellogg, Photogr. Sci. Eng. 18, 378 (1974). T. Kaneda, J. Imaging Sci. 33, 115 (1989). K. Seki, H. Yanagi, Y. Kobayashi, T. Ohta, and T. Tani, Phys. Rev. B49, 2760 (1994). (a) J. W. Mitchell, Recent Progr. Phys. 20, 433 (1957); (b) J. W. Mitchell, J. Phys. Chem. 66, 2359 (1962); (c) J. W. Mitchell, Photogr. Sci. Eng. 25, 170 (1981). (a) P. C. Burton and W. F. Berg, Photogr. J. 86B, 2 (1946); (b) P. C. Burton, Photogr. J. 86B, 62 (1946); (c) W. F. Berg and P. C. Burton, Photogr. J. 88B, 84 (1948); (d) P. C. Burton, Photogr. J. 88B, 13 (1948); (e) P. C. Burton, Photogr. J. 88B, 123 (1948); (f) W. F. Berg, Rep. Prog. Phys. 11, 248 (1948). T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 5. (a) J. W. Mitchell, Photogr. Sci. Eng. 22, 1 (1978); (b) J. W. Mitchell, Imaging Sci. J. 45, 2 (1997). Y. Yoshida, H. Mifune and T. Tani, J. Soc. Photogr. Sci. Technol. Jpn. 59, 541 (1996). T. Tani, Photographic Sensitivity: Theory and Mechanisms, Oxford University Press, New York, 1995, Chap. 3. Tani, et al. 
JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998

A New Crystal Nucleation Theory for Continuous Precipitation of Silver Halides*

Ingo H. Leubner†
Imaging Research and Advanced Development, Eastman Kodak Company, Rochester, New York 14650-1707

A new steady-state theory of crystallization in the continuous stirred tank reactor (CSTR), or mixed-suspension mixed-product-removal (MSMPR), system was developed based on a dynamic balance between growth and nucleation. The present model was developed for (but is not restricted to) nonseeded systems with homogeneous nucleation, diffusion-controlled growth, and a nucleation model previously confirmed for such systems in controlled double-jet batch precipitations. No assumptions of size-dependent growth were needed. The model predicts the correlation of the average crystal size with residence time, solubility, and temperature of the system and enables calculation of the supersaturation ratio, the maximum growth rate, the ratio of nucleation to growth, the ratio of average to critical crystal size, and the size of the nascent nuclei. The model predicts that the average crystal size is independent of reactant addition rate and suspension density. The average crystal size is a nonlinear function of the residence time, where the crystal size has a positive value at zero residence time (plug–flow condition). Results of continuous precipitations of silver chloride confirm the predictions of the model. The ratio of the fractions of the input stream used for nucleation and for crystal growth was calculated from the experimental results to decrease from 4.79 to 0.12, and the size of the nascent crystals to increase from 0.194 to 0.221 µm, between 0.5- and 5.0-min residence time. The ratio of average to critical crystal size was determined to be 5.73 × 103 (1.02 to 1.09), the supersaturation ratio to be 12.2 (0.54, average crystal size L = 0.5 µm), the supersaturation to be 8.2 × 10–8 (12.7 × 10–9 mol/l, L = 0.5 µm), and the maximum growth rate to be 4.68 Å/s (1.20 to 4.25). The data in the brackets are for equivalent batch precipitations. The experiments indicate that the width of the crystal size distribution increased with suspension density and was independent of reactant addition rate. While the present model was developed for homogeneous nucleation, diffusion-limited growth, and unseeded systems, it may be modified to model seeded systems, systems containing ripening agents or growth restrainers, and systems where growth and nucleation are kinetically, heterogeneously, or otherwise controlled. Journal of Imaging Science and Technology 42: 355–364 (1998)

Introduction
It is the object of this work to extend the previously reported model of crystal nucleation for precipitation of highly insoluble compounds in controlled double-jet precipitation batch processes1–5 to precipitations in continuous suspension crystallizers. The present proposed new theory of crystallization in continuous mixed-suspension mixed-product-removal (MSMPR) crystallizers [here referred to as continuous stirred tank reactor (CSTR)] differs from the previous theory for continuous precipitation by Randolph and Larson6,7 in that it correlates the average crystal size of the crystal population with reaction conditions like residence time, temperature, and crystal solubility. It is thus intended to complement the Randolph–Larson theory, which is concerned with the crystal population size distribution.
In addition, the new theory consists of three distinct parts that distinguish it from the Randolph–Larson theory: Original manuscript received October 20, 1997 * This work was presented at the AIChE Annual Conference, Chicago, IL, November 11–15, 1996, and at the IS&T 50th Annual Conference, Boston, MA, May 18–23, 1997. † IS&T Fellow © 1998, IS&T—The Society for Imaging Science and Technology 1. The new model is based on a dynamic balance between crystal nucleation and crystal growth at steady state. 2. The new mathematical treatment of the model is based on mass and nucleation balance. It shares certain concepts of mass and nucleation balance with the Randolph–Larson theory. In addition, it introduces the concepts and equations of the crystal nucleation theory previously proposed and experimentally supported by the author and his coworkers. This leads to new mathematical equations that have no arbitrary adjustable parameters. These new equations are significantly different from those of the Randolph– Larson theory. Furthermore, the new theory and the equations make distinctly new predictions. Some of these were experimentally supported in the present work. 3. The new theory is supported by experimental results. As in any new endeavor, only few predictions could be tested. The theory and the equations give ample suggestions for further experimental and theoretical research. The use of continuous precipitation systems for the precipitation of silver halide dispersions has been investigated previously.8–13 Continuous streams of reactants (silver nitrate and halide solutions) and a gelatin solution are fed to a well-stirred precipitation vessel and product is simultaneously removed while maintaining a constant reaction volume and reaction conditions (Fig. 355 Figure 1. Schematic diagram of the continuous suspension crystallizer for silver chloride precipitations. 1). Initially, the precipitation vessel contains an aqueous solution of gelatin and halide salt. It may include silver halide seed crystals that would affect the transient period but would not affect the present results, which are based on spontaneous homogeneous nucleation. After a transient time period a steady state is reached after which the size distribution and morphology of silver halide crystals removed from the precipitation vessel remain unchanged. In Gutoff,8,9 Wey and Terwilliger,10 and Wey et al.11,12 the crystal size distribution was investigated using the formalisms derived by Randolph and Larson for the mixed-suspension, mixed-product-removal (MSMPR) system [Randolph and Larson6,7]. Wey et al.12 examined the crystal size distribution of AgBr using the population balance technique and included both McCabe’s ∆L law (size-independent growth rate) and a size-dependent growth model. The crystal population distribution could not be satisfactorily modeled. Using the large grain population, nucleation and maximum growth rates were determined using the Randolph–Larson model. The rates were empirically related to temperature and supersaturation. Their data indicate that the nucleation of AgBr is homogeneous and that no significant secondary nucleation mechanism is involved in the precipitation of AgBr crystals. 
Wey, et al.13 studied the transient behavior of unseeded silver bromide precipitations and determined that the steady state of crystal population distribution was achieved only at about 6 to 9 residence times (τ) after the start of the precipitations, much later than the steady state of suspension density (at about 4 residence times). Their results also showed that at steady state the crystal population is rather narrowly distributed around an average crystal size. In Fig. 2, electron micrographs of AgCl crystals (carbon replica) are shown that were obtained at steady state for CSTR precipitations at a residence time of 5.0 min. In Fig. 3, the crystal size distribution of AgCl at steady state is shown for precipitations varying in residence time from 0.5 to 5.0 min. It is apparent that the distribution is rather narrow and can be described by an average crystal size L, and a somewhat symmetrical crystal size distribution. In a log numbered size plot (R/L theory), the curves are not symmetrical in agreement with AgBr precipitations (Wey et al.12). Clearly, the steady state crystal size distribution does not follow the linear semi-log correlation predicted by the Randolph–Larson model (Ref. 7, p. 87). 356 Journal of Imaging Science and Technology Figure 2. Electron micrographs of silver chloride crystals (carbon replica) obtained at steady state. Residence time 5.0 min., 60°C, pAg 6.45, 2.4% gelatin, suspension density 0.1 mol/l. An objective of this work is to derive a model that predicts the average crystal size of these precipitations and to support some of the model’s predictions experimentally. This model is based on the previously derived nucleation model for batch double-jet precipitations, the maximum growth rate of the system, and the dynamic mass balance between nucleation and growth. The newly developed model explicitly describes the average crystal size as a function of experimental variables like solubility of the product, residence time, and temperature without the assumption of arbitrarily adjustable parameters. The fractions of the incoming reaction stream that are used for nucleation and growth can be determined. The average size of the nascent (newly formed) crystals at steady state also can be calculated. The new model also leads to the determination of the critical crystal size, which allows calculation of the supersaturation ratio in the precipitation system. Theory Review. Since its first publication, the model by Randolph and Larson6,7 has found wide application to describe the crystal size population in continuous MSMPR crystallizers (Eq. 1): n = n0 exp(–Lx/Gmτ), (1) where n = population density at size Lx , n0 = nuclei population density, number/(volume-length), Gm = maximum growth rate, and τ = residence time. Equation 1 describes the expected number distribution of the crystal product at steady state. This equation is applicable where a straight line of a plot of the logarithm of population density n versus crystal size Lx describes the crystal size population. It was observed in precipitations of silver bromide12 and AgCl (Fig. 3) that the crystal size distribution is not described by Eq. 1 but is given by13 a somewhat symmetrical distribution around an average crystal size L where only a small, large-sized part of the crystal population obeys Eq. 1. The Randolph–Larson theory (Eq. 
1) also does not explicitly include other factors that affect the precipitation population Leubner Mass-Balance: R0 = Rn + Ri R0 = Reaction Addition Rate = Product Removal Rate Rn = Fraction of R0 consumed for Crystal Nucleation Ri = Fraction of R0 consumed for Crystal Growth Figure 4. Dynamic mass balance model for the CSTR/MSMPR system. Figure 3. AgCl crystal size distribution for residence times varying from 0.5 to 5.0 min. (60°C, pAg 6.45, 2.4% gelatin). such as reactant addition rate, solubility, or temperature effects. A new theory is presented that directly correlates the average crystal size L (cm) to the residence time τ (s), the solubility Cs (mol/cm3), and temperature T (K), and makes predictions about the effect on reactant addition rate and suspension density. This new model is derived from the nucleation theory for batch precipitation, which was developed by this author and coworkers.1–3 New Model. In continuous MSMPR crystallizers, reactants, solvents, and other addenda are continuously added while the product is continuously removed (Fig. 1). In the following, this precipitation scheme will be referred to as a continuously stirred tank reactor (CSTR) system. For the present derivation, only the reaction-controlling reactant that leads to the crystal population will be considered. For instance, silver halides are generally precipitated with excess halide in the reactor, which is used to control the solubility. Thus the soluble silver salt, e.g., AgNO3, is the reaction-controlling compound. The solubility of the resultant silver halides (chloride, bromide, iodide, and mixtures) is so low that their concentration in the reactor will be neglected for the mass balance. For other precipitations where the solubility of the reaction product is significant, the solubility will need to be retained as an explicit rate term on the right side of Eq. 2. Similarly, unreacted material needs to be added as a rate term on the right side of Eq. 2. Flow rates of other necessary addenda for the precipitation, such as halide salt solutions, water, gelatin, etc., are included into the calculation of the residence time τ and suspension density Mt and are important to control the silver halide solubility. The present model is based on several premises. It is apparent that modifications of the present model can be obtained by changing some of these premises. • The reactants react stoichiometrically to form the crystal population, and the solubility of the resultant product (in the present experiment, silver halide) is not significant with regard to the mass balance. It is straightforward to expand the model to include the effect of significant solubility of the reaction product. • Homogeneous nucleation is assumed. For nonhomogeneous and other nucleation processes, the proper nucleation model must be substituted for the presently used nucleation model. • The input reactant stream at steady state is consumed in a constant ratio for crystal nucleation and crystal growth (Fig. 4). • Crystal nucleation in the CSTR system follows the same mechanism as in double-jet precipitation. • Crystal growth is given by the maximum growth rate of the crystal population. In the present derivation, a single maximum growth rate Gm is assumed, which can be derived from the experimental results (see below). However, an analytical equation for Gm as a function of crystal size and reaction conditions may be substituted if it is known. 
This reduces the number of unknowns in the final equation to the ratio of average to critical crystal size, L/Lc.
• For the present derivation it is assumed that the nucleation is by a diffusion-controlled mechanism.3
The following approach was successfully used previously to describe renucleation in batch precipitations quantitatively.14 When a stream of reactant R0 [addition rate of reactant, e.g., silver nitrate (mol/s)] is added to the reaction vessel at steady state, a fraction of the stream will be used to renucleate new crystals to replace crystals leaving in the product stream (Rn, addition rate fraction used for nucleation) and the remainder to sustain the maximum growth rate of the crystal population (Ri, addition rate fraction used for growth = crystal size increase) (Fig. 4). These assumptions are well met by the precipitation of highly insoluble materials such as the silver halides, which perform at practically 100% conversion. The mass balance requires that
R0 = Rn + Ri. (2)
The addition rate R0 is given by the concentration (mol/l) and flow rate (l/s) of the reactants. The product removal rate at steady state is by definition equal to the reactant addition rate. In the following, Rn and Ri will be derived and finally inserted in Eq. 2 to provide the new model for the crystal population at steady state.
Crystal Nucleation. The number of crystals nucleated Zn must be equal to the number of crystals leaving in the reaction stream and is given by the mass balance:
Zn = R0Vm/(kvL3), (3)
where Vm is the molar volume of the reaction product (to convert the molar addition rate into volume, cm3/mol) and kv is the volume factor that converts from the characteristic average crystal size L to the crystal volume (see glossary). This definition of the nucleation rate is different from B0 used in the Randolph–Larson theory, which is defined as the rate of formation of nuclei per unit volume in the crystallizer.7 As will be shown in the new theory and in the experimental section, the nucleation rate is independent of reactor volume for homogeneous nucleation conditions.
For homogeneous crystal nucleation under diffusion-controlled growth conditions, Eq. 4 was derived for double-jet precipitations1–3:
Zn = RnRgT/(2ksγDVmCsΨ), (4)
where
Ψ = L/Lc – 1.0, (5)
Rg is the gas constant, T is the temperature (K), ks is the crystal surface factor, γ is the surface energy, D is the diffusion coefficient of the reaction-controlling reactant, and Cs is the sum of the solubility with regard to the reaction-controlling reactant. The value Lc is the critical crystal size at which a crystal has equal probability to grow or to dissolve by Ostwald ripening. In previous papers,1–3,14 Eq. 4 was quoted for the specific case of spherical crystal morphology (ks = 4π). For batch processes Zn equals the total number of stable crystals formed during nucleation. Here Zn is equal to the nucleation rate (dZ/dt) at steady state. This extrapolation is in agreement1,3 with the underlying derivation of Eq. 4.
For the remainder of the derivation of the equations, the intermediate variable K is introduced:
K = RgT/(2ksγDVmCsΨ). (6)
This leads to a simplified equation for Zn:
Zn = KRn, (7)
which is solved for Rn:
Rn = Zn/K. (8)
Substitution of Zn from Eq. 3 into the equation for Rn (Eq. 8) leads to
Rn = R0Vm/(KkvL3). (9)
Thus, an analytical equation for Rn has been found, which below will be entered into Eq. 2. It remains now to develop an analytical equation for Ri, which is derived from the maximum growth rate of the crystal population.
Crystal Growth. The growth rate G is defined as the change in crystal diameter with time, dL/dt. The maximum growth rate Gm is given by the mass balance between the maximum growth of the system and the fraction of the reactant addition rate consumed for this growth, Ri. The maximum growth rate is a function of crystal size, and thus the individual size classes will grow at different absolute maximum growth rates. The maximum growth rate of the whole crystal population is then given by an average maximum growth rate Gm, which is defined to be related to the average crystal size L. If the crystal size is difficult to determine because of a complicated crystal structure, for instance dendritic crystals, the specific surface area Sm (e.g., surface area/mol of crystals) may be used to derive the maximum growth rate. This was not necessary under the present conditions. The use of Sm was discussed in Leubner,14 and may be transferred to the present model as desired. Solving this mass balance for Gm results in Eq. 10:
Gm = VmRi/(3.0kvL2Zt). (10)
Here, Zt is the total number of crystals present in the reaction vessel during steady state. The value Zt can be calculated from the average crystal size and the suspension density, which, in turn, is a function of reactant addition rate and residence time:
Zt = R0Vmτ/(kvL3). (11)
Inserting Eq. 11 into Eq. 10 and solving for Ri leads to the desired analytical equation for Ri:
Ri = 3GmR0τ/L. (12)
At this point, both Rn and Ri have been expressed by analytical expressions that contain only fully defined parameters and variables. It is now possible to continue to the formulation of the new theory.
Crystal Growth and Nucleation in the Continuous Crystallizer. The foundation has now been laid to combine nucleation and growth at steady state. For this purpose the expressions of Rn (Eq. 9) and Ri (Eq. 12) are inserted into Eq. 2. After simplifying, Eq. 13 is obtained:
Vm/(KkvL3) + 3Gmτ/L = 1.0. (13)
Reinserting K into this equation and solving for zero leads to
kvL3RgT – 2ksγDVm2CsΨ – 3kvGmRgTL2τ = 0. (14)
Solving for L results in a very complicated solution as a function of residence time, which is of little immediate use. But solving for the residence time τ is relatively straightforward:
τ = L/(3Gm) – 2ksγDVm2CsΨ/(3kvGmRgTL2). (15)
These equations can be used to determine Gm and Ψ from the experimental values of the residence time τ and the average crystal size L obtained at crystal population steady state, and by entering the other parameters that are either known or can be experimentally determined. If Gm is known from other experiments (or an analytical equation of Gm is available that relates it to the crystal size and reaction conditions), this may be substituted in Eq. 15. This would reduce the number of unknowns to Ψ. From Ψ and the average crystal size L, the critical crystal size Lc can be calculated (Eq. 5), which is related to the supersaturation ratio by
S* = 1.0 + 2γVm/(RgTLc). (16)
The supersaturation ratio S* is defined by Eq. 17:
S* = (Css – Cs)/Cs, (17)
where Css is the actual (supersaturation) concentration and Cs is the equilibrium solubility as defined above.
Special Limiting Conditions for Continuous Crystallization. Two limiting cases of Eq. 15 are of special interest. For this purpose Eq. 15 is rewritten:
τ = L(1.0 – 2ksγDVm2CsΨ/(kvL3RgT))/(3Gm). (18)
Very Large Average Crystal Size L. If the average crystal size is very large, as given by the definition in Eq. 19,
L3 >> 2ksγDVm2CsΨ/(kvRgT), (19)
then the term inside the parenthesis of Eq. 18 can be set equal to one and simplified to
τ = L/(3Gm) (20)
or
L = 3Gmτ. (21)
This indicates that τ and L are linearly related, and Gm, the maximum growth rate, can be determined from the linear part of the correlation at large crystal sizes. The growth rate Gm can then be resubstituted into the original Eq. 14, 15, or 18, and Ψ can be determined.
Continuous Precipitation in a Plug–Flow Reactor, τ → Zero. The second limiting case of the above equation is when τ, the residence time, approaches zero. The above equation predicts that under this limiting condition the average crystal size will not become zero but reach a limiting value. Plug–flow reactors are characterized by a short residence time in the nucleation zone, τ → 0, followed by crystal growth and ripening processes. The present derivation allows us to predict the minimum crystal size for the condition where τ → 0:
L3 = 2ksγDVm2CsΨ/(kvRgT). (22)
With
Vg = kvL3, (23)
where Vg is the average crystal volume, this leads to
Vg = 2ksγDVm2CsΨ/(RgT). (24)
Thus, for residence times approaching zero, the average crystal volume Vg is only a function of the solubility Cs and the temperature T. The value Ψ, which can be calculated from Eqs. 22 or 24, may be a function of Cs and T. Note that without the 1/L2 term, Eq. 15 would predict zero crystal size for zero residence time.
Nucleation versus Growth, Rn/Ri. In the model used for these derivations, the reactant addition stream R0 is separated in the reactor into a nucleation stream Rn and a growth stream Ri (Fig. 4). It is now possible to derive the ratio of the reactant streams Rn/Ri by dividing Eqs. 9 and 12. After back-substituting for K (Eq. 6), Eq. 25 is obtained:
Rn/Ri = 2ksγDVm2CsΨ/(3kvGmRgTτL2). (25)
Except for τ in the divisor, this is equal to the second part on the right side of Eq. 15. If we simplify to Eq. 26, it becomes evident that the ratio Rn/Ri decreases with increasing residence time τ and with the square of the average crystal size at steady state:
Rn/Ri ~ 1/(τL2). (26)
It appears intuitive that the nucleation should decrease relative to growth as the residence time and the surface area of the crystal population (proportional to L2) increase. The value of the ratio Rn/Ri was calculated from the experimental data. From the ratio Rn/Ri and the mass balance (Eq. 2), Rn and Ri were calculated as a fraction of the reactant addition rate R0.
Nascent Nuclei Size. For the present work, the term "nascent nuclei" is defined as the stable crystals that are newly formed during steady state, which continue to grow in the reactor and which are removed from the reactor. The size of these crystals is larger than that of the critical nuclei, which have equal probability to grow or dissolve in the reaction mixture. The nascent nuclei will have a crystal size distribution that is related to the critical crystal size. The nascent nuclei population may be represented by an average nascent nuclei size Ln. With the model that has been developed up to this point, it is possible to calculate Ln for the different precipitations. For this purpose we define Ln (Eq. 27) by
Ln3 = RnVm/(kvZn).
Here, Zn can be calculated from Eq. 3 and Rn can be calculated from Eqs. 2 and 25. Back-substitution into Eq. 27 leads to Eq. 28:
Ln3 = (Rn/R0)L3.
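The working equations collected above (Eqs. 15, 16, and 25) are straightforward to evaluate numerically once the constants of Table III are inserted. The following Python sketch is ours; the function names and the trial values of L, Gm, and Ψ at the end are illustrative assumptions, not fitted results from the paper.

# Constants for the AgCl precipitations (Table III), CGS units.
KV = 1.0         # volume factor, cubic crystals
KS = 6.0         # surface factor, cubic crystals
GAMMA = 52.2     # surface energy, erg/cm^2
DIFF = 1.60e-5   # diffusion coefficient, cm^2/s
VM = 25.9        # molar volume of AgCl, cm^3/mol
CS = 6.2e-9      # solubility, mol/cm^3
RG = 8.3e7       # gas constant, erg/(deg mol)
T = 333.0        # temperature, K

def residence_time(L, Gm, psi):
    """Eq. 15: tau (s) from average size L (cm), maximum growth rate Gm (cm/s),
    and psi = L/Lc - 1."""
    return (L / (3.0 * Gm)
            - 2.0 * KS * GAMMA * DIFF * VM**2 * CS * psi
              / (3.0 * KV * Gm * RG * T * L**2))

def supersaturation_ratio(L, psi):
    """Eqs. 5 and 16: S* from the critical size Lc = L/(psi + 1)."""
    Lc = L / (psi + 1.0)
    return 1.0 + 2.0 * GAMMA * VM / (RG * T * Lc)

def nucleation_to_growth(L, Gm, psi, tau):
    """Eq. 25: ratio Rn/Ri of the reactant stream used for nucleation vs. growth."""
    return (2.0 * KS * GAMMA * DIFF * VM**2 * CS * psi
            / (3.0 * KV * Gm * RG * T * tau * L**2))

# Trial numbers of the right order of magnitude (assumptions, not fits):
# L = 0.4 um, Gm = 3 Angstrom/s, psi of order 5e3.
L, Gm, psi = 0.4e-4, 3.0e-8, 5.0e3
tau = residence_time(L, Gm, psi)
print(tau / 60.0)                       # residence time in minutes
print(supersaturation_ratio(L, psi))    # supersaturation ratio S*
print(nucleation_to_growth(L, Gm, psi, tau))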
It is now possible to back-substitute further from Eq. 25, and Eq. 29 is obtained, which relates the nascent nuclei size to the precipitation conditions and the average crystal size at steady state:

Ln³ = 2ksγDVm²CsΨL³/(3kvGmRgTτL² + 2ksγDVm²CsΨ). (29)

The terms in this equation have been defined above and are listed in the Nomenclature section. It is apparent that Ln is a complicated function of the reaction conditions and of the average crystal size, which in itself is a complicated function of the same reaction variables (Eq. 14).

Experimental

The present experiments were done before the present theory was developed. If the theory had been known at the time of the experiments, a wider range of experiments would have been performed to determine the present results to a greater degree. Unfortunately, the author is no longer in a position to provide additional experiments. However, the present experiments support several important predictions of the theory and might be the starting point for more extended work in the future. It should be remembered that the original paper by Randolph and Larson6a did not supply any supporting experimental results to support their model.

Silver chloride, AgCl, was precipitated in a single-stage continuous stirred tank reactor (CSTR) system (Fig. 1). The residence time was varied from 0.5 to 5.0 min (Table I). For the residence time of 3.0 min the suspension density was varied from 0.05 to 0.40 mol/l (Table II). The temperature was held constant at 60°C. The reactor volume, flow rates, and reactant concentrations are given in Tables I and II. Bone gelatin was used as the peptizing agent. The free silver ion activity {Ag+} was controlled in the reactor at pAg 6.45, where pAg = –log {Ag+}. This corresponds16 to a solubility of 6.2 × 10⁻⁶ mol Ag/l. The exit stream had the same free Ag+ concentration as the reactor mixture. This solubility consists of the sum of concentrations of free silver ion plus complexes of silver ions with halide ions, AgCln(1–n) (n equals 1 to 4). The silver chloride precipitated in cubic morphology. A crystal growth restrainer was added to the output material to avoid Ostwald ripening and to preserve the crystal size distribution.

TABLE I. Effect of Residence Time on the Average Crystal Size of AgCl Precipitated in the CSTR System*

Experiment No. | Residence time τ (min) | Edge length L (µm) | Size distribution d.r. | Crystal number Zt × 10¹² | Reactor volume V0 (ml) | Flow rate F (ml/min) | Addition rate R0 (mol/min) | Suspension density Mt (mol/l)
1 | 0.5 | 0.207 | 2.69 | 73 | 300 | 50 | 0.050 | 0.083
2 | 1.0 | 0.263 | 2.64 | 36 | 300 | 25 | 0.025 | 0.083
3 | 3.0 | 0.337 | 2.58 | 41 | 600 | 20 | 0.020 | 0.100
4 | 5.0 | 0.413 | 2.22 | 37 | 1000 | 20 | 0.020 | 0.100

* Reaction conditions: 60°C, AgNO3, NaCl 1.0 mol/l, 2.4% bone gelatin. τ residence time (min), V0 reactor volume (ml), F flow rate (ml/min, AgNO3 and NaCl), R0 molar addition rate (mol/min), Mt suspension density (mol AgCl/l), L cubic edge length (µm), d.r. decade ratio (measure of size distribution), Zt total crystal number in reactor.
TABLE II. Effect of Suspension Density on Average Crystal Size for Precipitations of AgCl in the CSTR System†

Experiment | Suspension density Mt (mol/l) | Edge length L (µm) | Size distribution d.r. | Reactant concentration C (mol/l) | Addition rate R0 (mol/min) | Crystal number Zt × 10¹²
1 | 0.05 | 0.327 | 2.45 | 0.5 | 0.005 | 11
2 | 0.10 | 0.350 | 2.59 | 1.0 | 0.010 | 18
3 | 0.10 | 0.337 | 2.58 | 1.0 | 0.020 | 41
4 | 0.20 | 0.331 | 2.79 | 2.0 | 0.020 | 43
5 | 0.30 | 0.333 | (3.51)a | 3.0 | 0.030 | 63
6 | 0.40 | 0.344 | 3.25 | 4.0 | 0.040 | 81

† Reaction conditions: 60°C, pAg 6.45, residence time τ 3.0 min; all experiments except No. 3: AgNO3, NaCl 10 ml/min, gelatin (2.4%) 80 ml/min, V0 300 ml. Experiment No. 3: AgNO3, NaCl 20 ml/min, gelatin (2.4%) 160 ml/min, V0 600 ml. C is the concentration of reactants (AgNO3 and NaCl), R0 molar addition rate (mol/min), Mt suspension density (mol AgCl/l), L average crystal size (cubic edge length, µm), d.r. decade ratio (measure of crystal size distribution), Zt total crystal number in reactor; (a) data not reliable.

The crystal size distribution of the crystal suspensions was determined using the Joyce–Loebl disk centrifuge.14,17 This analytical method determines the size distribution of the crystals by their sedimentation time using Stokes' law and the relative frequency of the crystal sizes by light scattering. This method determines the crystal size distribution of the material that represents the final product of the precipitation process. In situ determinations of the crystal size population were not done because the high concentrations did not allow precise light-scattering experiments. The original data were determined as equivalent circular diameter (ecd) and were converted to cubic edge length (cel), where

cel = 0.86 ∗ ecd. (30)

The crystal size distribution curves are shown in the ecd scale. The crystal size distribution could be fitted to the sum of two Gaussian distributions, and thus the distribution cannot be described by a single standard deviation. Thus, the crystal size distribution is given by an empirical measure, the decade ratio (d.r.), which is defined by the ratio of the sizes at 90 to 10% of the experimentally determined crystal size population.

Electron micrographs (carbon replica) of AgCl crystals precipitated in the CSTR system (5-min residence time) are shown in Fig. 2. The results for the variation of crystal size with residence time are shown in Fig. 3 and Table I. The dependence of crystal size and crystal size distribution on suspension density was studied for the 3.0-min residence time (Table II). The constants used for the calculations are listed in Table III.

TABLE III. AgCl Precipitations in the CSTR System. Constants Used in the Calculations

Constant | Value | Comment
kv | 1.0 | cubic
ks | 6.0 | cubic
γ | 52.2 | erg/cm² (Ref. 16)
D | 1.60 × 10⁻⁵ | cm²/s (Ref. 16)
Vm | 25.9 | cm³/mol AgCl
Cs | 6.2 × 10⁻⁹ | mol/cm³ (Ref. 16)
Rg | 8.3 × 10⁷ | erg/deg mol
T | 333 | K/60°C

Results and Discussion

The new model makes a number of predictions that can be tested with the experimental results.
• The average crystal size is independent of addition rate and suspension density.
• The size dependence of the average crystal size on residence time can be modeled using the equations given.
• The average crystal size has a final value at zero residence time.
• The maximum growth rate, the critical crystal size, the supersaturation ratio and the supersaturation, and the ratio of nucleation to growth of the system can be determined at steady state.

Crystal Size, Addition Rate, and Suspension Density. The model predicts that the average crystal size is independent of the reactant addition rate and by implication of the suspension density.
This is supported by the results in Table II, which show that over a range of addition rates from 0.005 to 0.04 mol/min and a suspension density of 0.05 to 0.40 mol/l the average crystal size did not significantly vary. At the same time, the total crystal number in the reactor, Zt, increased proportionally to the molar addition rate. It is significant that for Experiments 2 and 3, which have the same suspension density but vary by a factor or two in molar addition rate, the average crystal size and the decade ratio are not significantly different. The doubling of the addition rate leads to a doubling in the total Leubner crystal number. Similarly, Experiments 3 and 4 have the same molar addition rate but vary in the suspension density by a factor of 2, while producing the same average crystal size and total crystal number. This supports the prediction of the theory that the crystal number is independent of suspension density and, by implication, independent of the population density. The results in Table II show that the width of the crystal size distribution as measured by d.r. increased with increasing suspension density. This is indicated by Experiments 2 and 3 where for Experiment 3 the molar addition rate (and reactor volume V0) was doubled while the suspension density was held constant. The value of d.r. is the same, indicating that the suspension density, not the molar addition rate affects the width of the crystal size distribution. Experiments 3 and 4 have the same addition rate, but Experiment 4 has half the reactor volume and flow rate, so that it has twice the suspension density of Experiment 3. The experiment with the higher suspension density (Experiment 4) has the wider crystal size distribution which reinforces the direct relationship between suspension density and width of crystal size distribution. The unusually wide size distribution of Experiment 5 is probably due to some undetermined experimental deviation. The crystal size distribution is governed by two different reactions: For crystals larger than the stable crystal size, growth is dominated by the maximum growth rate. This part of the crystal population can probably be described by the Randolph–Larson model, which is based on the maximum growth rate of the crystals. The crystals smaller than the stable crystal size also grow at maximum growth rate, but also disappear at some rate by Ostwald ripening. Ostwald ripening is the process by which larger crystals increase in size (ripen) at the expense of the dissolution of smaller ones. The critical crystal size at which a crystal has equal probability to grow or dissolve by Ostwald ripening can be determined by the present model and experiments. In addition, it was determined that in controlled doublejet precipitations the maximum growth rate nonlinearly decreases with increasing crystal size.16 It was shown that the maximum growth rate increases under crowded conditions when the crystal population density is very high and where the diffusion layers of the crystals overlap.14,18,19 This effect may depend on the crystal size and the state of supersaturation in the reactor and thus may contribute to the increase in crystal size distribution with increasing suspension density. Because two different reaction mechanisms, growth and the Ostwald ripening effect, are differently effective for the different fractions of the crystal size populations, it can be anticipated that the shape of the resultant crystal population might not be symmetrical. 
This idea is supported by the crystal size distributions shown in Fig. 3. The result that the average crystal size is independent of molar addition rate and suspension density allows us to add the results from Table II to those of Table I for the correlation of crystal size with residence time.

Crystal Size and Residence Time. The results from Tables I and II were combined and plotted in Fig. 5. A linear least-squares evaluation using L and 1.0/L², based on the size/τ correlation predicted in Eq. 18, results in Eq. 31:

τ = 11.88 ∗ L – 0.1026/L², (31)

where τ is in minutes and L in µm. The standard error of estimate is 0.77 and 0.0214 for the first and second constant, respectively. The correlation coefficient is 0.9844. In Fig. 5, the solid line is given by Eq. 31. The linear correlation (dashed line) represents the L/τ correlation for large crystal sizes as defined by Eq. 21.

Figure 5. Crystal size of silver chloride crystals (µm cubic edge length) as a function of residence time (min), 60°C, pAg 6.45, 2.4% gelatin.

Maximum Growth Rate, Gm. From Eqs. 31 and 18 the maximum growth rate was calculated to be Gm = 28.1 × 10⁻³ µm/min, or 4.68 A/s. This is in good agreement with the results by Strong and Wey,16 who determined the maximum growth rate of AgCl in controlled double-jet precipitations to be between 4.25 and 1.20 A/s for grain sizes between 0.209 and 0.700 µm. This suggests that the growth rate Gm may be obtained independently of the continuous precipitations to be used for the calculations in the present theory. It was determined in the experiments by Strong and Wey16 that the maximum growth rate decreased with increasing crystal size. In the double-jet precipitations, Gm can be determined as a function of crystal size because the crystal size distribution is very narrow. In the present precipitations, Gm is the average of the maximum growth of a relatively wide crystal size distribution.

Minimum Crystal Size. From the experimental results the crystal size for zero residence time was estimated to be 0.205 µm. This value is essentially the same as that determined for τ = 0.5 min, 0.207 µm.

Ψ, L/Lc, S*, and Css. The parameters in Table III were used to calculate Ψ (5.73 × 10³) and L/Lc (≈5.73 × 10³, Eq. 5), the ratio of average to critical crystal size. The accuracy of the calculated results is affected by the constants in Table III, especially the values of surface energy γ, diffusion coefficient D, and solubility Cs. The calculated results thus should be considered estimates and must be reevaluated when more reliable data become available for the constants used.

TABLE IV. Critical Crystal Size, Supersaturation, and Supersaturation Ratio as a Function of Residence Time

Residence time (min) | Cubic edge length (µm) | Critical crystal size Lc (µm × 10⁻⁵) | Supersaturation ratio S* | Supersaturation Css (mol/cm³ × 10⁻⁸)
0.5 | 0.207 | 3.61 | 28.1 | 18.0
1.0 | 0.263 | 4.59 | 22.3 | 14.5
3.0 | 0.337 | 5.88 | 17.6 | 11.6
5.0 | 0.413 | 7.21 | 14.6 | 9.65

Ψ = 5.73 × 10³. For experimental conditions see Table I.

TABLE V. Calculated Experimental Constants and Variables, CSTR and Batch Precipitations of AgCl

Variable | CSTR | Batch | Reference
Maximum growth rate Gm (A/s) | 4.68 | 1.20–4.25 | (b)
Average/critical size ratio L/Lc | 5.73 × 10³ | 1.85 | (a)
Supersaturation ratio S* | 12.2 | 1.02–1.09 | (a,b)
Supersaturation Css (mol/cm³) | 8.2 × 10⁻⁸ | 12.7 × 10⁻⁹ | —

Ψ = 5.73 × 10³; for L = 0.5 µm. References for batch precipitations: (a) I. H. Leubner,2 (b) R. W. Strong and J. S. Wey.16
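The fit of Eq. 31 and the quantities in Tables IV and VI can be regenerated directly from the equations and constants above. The sketch below is our own illustration, not part of the original paper: the data are taken from Tables I and II, the constants from Table III, and the unit conversions to CGS are ours. The fitted coefficients depend on exactly which points are included and so may differ slightly from the published 11.88 and –0.1026; the remaining quantities follow Eqs. 5, 16, 17, 20, and 25 and should come out close to the tabulated values.

```python
# Sketch regenerating the Eq. 31 fit and the Table IV / Table VI quantities.
import numpy as np

# (L in um, tau in min): Table I plus the 3.0-min points of Table II
L_um  = np.array([0.207, 0.263, 0.337, 0.413, 0.327, 0.350, 0.337, 0.331, 0.333, 0.344])
tau_m = np.array([0.5, 1.0, 3.0, 5.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0])

# Least-squares fit of tau = a*L + b/L^2 (the functional form of Eqs. 15/18/31)
A = np.column_stack([L_um, 1.0/L_um**2])
(a, b), *_ = np.linalg.lstsq(A, tau_m, rcond=None)
Gm_um_min = 1.0/(3.0*a)                   # Eqs. 20/21: tau ~ L/(3 Gm) at large L
print(f"a = {a:.2f}, b = {b:.4f}, Gm = {Gm_um_min*1e3:.1f}e-3 um/min")

# Table III constants (CGS) and the fitted Psi
kv, ks, gamma, D = 1.0, 6.0, 52.2, 1.60e-5
Vm, Cs, Rg, T, Psi = 25.9, 6.2e-9, 8.3e7, 333.0, 5.73e3
Gm_cgs = Gm_um_min * 1e-4 / 60.0          # um/min -> cm/s

for L, tau in [(0.207, 0.5), (0.263, 1.0), (0.337, 3.0), (0.413, 5.0)]:
    L_cm, tau_s = L*1e-4, tau*60.0
    Lc = L_cm/(Psi + 1.0)                                # Eq. 5
    S = 1.0 + 2.0*gamma*Vm/(Rg*T*Lc)                     # Eq. 16
    Css = Cs*(1.0 + S)                                   # Eq. 17 solved for Css
    RnRi = 2*ks*gamma*D*Vm**2*Cs*Psi/(3*kv*Gm_cgs*Rg*T*tau_s*L_cm**2)   # Eq. 25
    Rn_pct = 100.0*RnRi/(1.0 + RnRi)                     # via the mass balance, Eq. 2
    print(f"tau={tau} min: Lc={Lc*1e4:.2e} um, S*={S:.1f}, "
          f"Css={Css:.2e} mol/cm3, Rn/Ri={RnRi:.2f}, Rn={Rn_pct:.1f}% of R0")
```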
TABLE VI. AgCl Crystal Nucleation and Crystal Growth Reactant Rates and Nascent Nuclei Sizes

Residence time (min) | Cubic edge length L (µm) | Rn/Ri | Rn (% of R0) | Ri (% of R0) | Ln (µm) | %Ln (% of L)
0.5 | 0.207 | 4.79 | 82.7 | 17.3 | 0.194 | 93.9
1.0 | 0.263 | 1.48 | 59.7 | 40.3 | 0.221 | 84.2
3.0 | 0.337 | 0.30 | 23.1 | 76.9 | 0.207 | 61.4
5.0 | 0.413 | 0.12 | 10.7 | 89.3 | 0.196 | 47.5

Rn = fraction of reactant input rate consumed for crystal nucleation, Ri = fraction consumed for crystal growth at steady state. The value Ln is the size of the nascent nuclei; %Ln = nascent nuclei size as a percentage of the average size. Reaction conditions: AgCl, 60°C, pAg 6.45, 2.4% gelatin.

In Table IV, the critical crystal size Lc, the supersaturation ratio S*, and the supersaturation Css during steady state were calculated as a function of residence time τ. The data indicate that the critical crystal sizes increase with residence time, while the supersaturation and supersaturation ratio decrease. A theoretical supersaturation ratio of 1.60 × 10⁵ is obtained if one adds 1.0 mol/l silver nitrate to a solution where the solubility is 6.2 × 10⁻⁶ mol/l, as in the present experiments.3 In stop–flow experiments, Tanaka and Iwasaki20 estimated that the size of the primary nuclei formed in AgCl precipitations was about (AgCl)8. Thus, the difference between the theoretical (1.60 × 10⁵) and actual supersaturation ratio (14.6 to 28.1) in this system indicates that Ostwald ripening involving metastable nanoclusters may play a significant role in the nucleation/growth mechanism.

The values for the present CSTR system are substantially higher than those reported for controlled batch double-jet precipitations under the same conditions (Table V) as reported by Leubner2 and Strong and Wey,16 who reported values of S* of approximately 1.02 and 1.09, and values for L/Lc of 1.85 and 4.2. For comparison of the systems, an average crystal size of 0.5 µm was used. The higher values of L/Lc and S* for the CSTR versus the batch system indicate that during steady state the balance between maximum growth and renucleation stabilizes much smaller critical crystal sizes than the batch double-jet precipitation. This is also supported by the wider crystal size distribution in the CSTR system. However, the absence of very small crystals in the product suggests that upon removal of the high supersaturation in the product stream, the small initial crystals rapidly dissolve by Ostwald ripening to produce the observed crystal size distributions. This is in agreement with simulations of the effect of Ostwald ripening on the crystal size distribution in batch precipitations by Tavare.21 Unfortunately, Tavare did not provide experimental evidence for his simulations. The effect of Ostwald ripening and growth restraining agents on the crystal nucleation in batch precipitations was modeled and experimentally supported by this author.22,23

Nucleation versus Growth, Rn/Ri. Using Eqs. 2 and 25, Rn/Ri and Rn and Ri (as a percent of R0) were calculated as a function of residence time τ and average crystal size L. The results are listed in Table VI. The data show that at short residence times, nucleation (high Rn) is dominant, while at long residence times, growth (high Ri) dominates the reaction.

Nascent Nuclei Size. The size of the nascent nuclei Ln was calculated using Eq. 28. The results are shown in Table VI.
Note that the size of the nascent nuclei is relatively independent of residence time and of the final crystal size, ranging between 0.194 and 0.221 µm. Interestingly, this size range is of the same order of magnitude as calculated for the limiting crystal size for the plug–flow reactor, 0.205 µm. As a consequence of the relatively stable size of the nascent nuclei, their size relative to the steady state average crystal sizes decreases with increasing residence time. Leubner Thus, at the 0.5-min residence time, the size of the nascent nuclei is about 93.9% of the final crystal size, while for 5.0-min residence time it is only about 47.5% of the final crystal size. Conclusions A new theory of crystallization is proposed for the CSTR or MSMPR system. The model is based on nonseeded systems with homogeneous nucleation, diffusion controlled growth, and the nucleation model previously derived for such systems in controlled double-jet batch precipitations. It does not need any assumptions about size-dependent growth (McCabe’s law). The model predicts the correlation between the average crystal size and the residence time, solubility, and temperature of the reaction system and allows determination of useful factors that are experimentally hard to determine such as L/Lc the ratio of average to critical crystal size; the supersaturation ratio S*; the supersaturation Css, the maximum growth rate Gm, and the ratio of nucleation to growth Rn/Ri. Results of continuous precipitations of silver chloride were chosen to support the predictions of the model. Silver chloride precipitations in batch double-jet precipitations indicate that the crystal growth is mainly determined by a diffusion-controlled mechanism.2,16 The model predicts that the average crystal size is independent of reactant addition rate and suspension density, which was supported with experiments where the molar addition rate was varied from 0.005 to 0.04 mol/min and the suspension density from 0.05 to 0.40 mol/l. The width of the crystal size distribution (given by the decade ratio) increased with suspension density from 2.45 to 3.25 and was independent of reactant addition rate. The crystal size distribution is a factor of maximum growth rate (small and large crystals), Ostwald ripening (small crystals), and possibly of crowded growth conditions. These insights may be applied to modify the Randolph–Larson model to describe the size distribution of the crystal population. The model also predicts that the average crystal size and residence time are linearly related when the average crystal size is significantly larger than a certain limiting size is, which can be derived from the experimental correlations. At smaller crystal sizes, the average crystal size is larger than predicted by the linear part of the correlation. These predictions were confirmed by the experimental results. The model further predicts that when the residence time approaches zero, the average crystal size approaches a limiting value larger than zero. For the present AgCl precipitations the limiting value was calculated to about 0.205 µm, while the average crystal size varied from 0.207 to 0.413 µm between residence times from 0.5 to 5.0 min. The condition where the residence time approaches zero is similar to that obtained during nucleation in plug–flow reactors and thus predicts a lower limit of average crystal size for the CSTR and plug–flow systems of about 0.205 µm. 
The model also allowed us to calculate the part of the input reactant stream R0 is used for nucleation Rn and for growth Ri. The ratio Rn/Ri decreased with increasing residence time from 4.79 to 0.12. The average crystal size of the nascent (newly formed) crystals Ln was determined for the different residence times to vary between 0.194 to 0.221 µm. This size range is in the range calculated for the plug–flow condition (τ → 0), 0.205 µm. Experiments are needed to confirm the model as a function of solubility and temperature where care has to be taken to consider that Ψ (= L/Lc – 1.0) may be a function of solubility and temperature as shown for AgBr, AgCl1–4 and addition rate.5 The present model may be expanded to include the effect of Ostwald ripening agents and growth restrainers using the formalism previously applied to precipitations in batch precipitations.22,23 While the present model was developed for homogeneous nucleation under diffusion-limited growth conditions and unseeded systems, it may be easily modified to model seeded systems and systems where growth and nucleation are kinetically, hetrogeneous, or otherwise controlled. The present work also suggests many additional experiments and new approaches to the evaluation of the results of continuous precipitations. Acknowledgments. I am indebted to J. A. Budz, J. Z. Mydlarz, J. P. Terwilliger, and J. S. Wey for reading the manuscript and for helpful discussions and suggestions, and to K. Marsh for editing the manuscript. Nomenclature: cel = cubic edge length C = concentration (mol/l) Css = supersaturation (mol/cm3) Cs = solubility (mol/cm3) ecd = equivalent circular diameter Ψ = L/Lc –1.0 D = diffusion constant (cm2/s) F = flow rate (ml/min) Ft = total flow rate (l/min) γ = surface energy (erg/cm2) Gm = maximum growth rate of crystal population (cm/ s, A/s, µm/s) L = average crystal size (µm, cm) Lc = critical crystal size (µm, cm) Ln = nascent (newly formed) crystal size (µm, cm) Lx = crystal size (cm) kv = volume shape factor, (if L is the edge length of a cubic crystal, kv = 1.0; if L is the radius of a spherical crystal, kv = 4π/3) ks = surface shape factor, (if L is the edge length of a cubic crystal, ks = 6.0; if L is the radius of a spherical crystal, ks = 4π) N = population density, number/(volume-length) n0 = nuclei population density, number/(volume-length) Mt = suspension density (mol/l) Rg = gas constant (8.3 × 107 erg/deg mol) R0 = addition rate of reactants (mol/s) Rn = addition rate fraction used for nucleation Ri = addition rate fraction used for growth (crystal size increase) Sm = characteristic surface (e.g., area/mol silver halide) τ = residence time, (s, min) T = temperature (K) V0 = reaction volume Vg = average crystal volume (cm3) Vm = molar volume, (cm3/mol, crystals) Zn = number of crystals nucleated at steady state Zr = number of crystals/ml in the reactor during steady state Zt = total number of crystals in reactor during steady state References 1. I. H Leubner, R. Jagannathan, and J. S. Wey, Photogr. Sci. Eng. 24, 103 (1980). 2. I. H. Leubner, J. Imaging Sci. 29, 219 (1985). 3. I. H. Leubner, J. Phys. Chem. 91, 6069 (1987). A New Crystal Nucleation Theory for Continuous Precipitation of Silver Halides Vol. 42, No. 4, July/Aug. 1998 363 4. I. H. Leubner, in Final Program and Advance Printing of Paper Summaries, 45th Annual Conference of the Society for Imaging Science and Technology, East Rutherford, NJ, p. 13 (1992). 5. I. H. Leubner, J. Imaging Sci. Technol. 37, 68 (1993). 6. A. D. Randolph and M. A. 
Larson, AIChE J. 8, 639 (1962); H. S. Bransom, W.J. Dunning, and B. Millard, Disc. Faraday Soc. 5, 83 (1949). 7. A. D. Randolph and M. A. Larson, Theory of Particulate Processes, Analysis, and Techniques of Continuous Crystallization, 2nd ed., Academic Press, San Diego, CA, 1991. 8. E. B. Gutoff, Photogr. Sci. Eng. 14, 248 (1970). 9. E. B. Gutoff, Photogr. Sci. Eng. 15, 189 (1971). 10. J. S. Wey, J. S. and J. P. Terwilliger, AIChE J. 20, 1219 (1974). 11. J. S. Wey, J. P. Terwilliger, and A. D. Gingello, Res. Discl. 14987 (1976). 12. J. S. Wey, J. P. Terwilliger, and A. D. Gingello, AIChE Symposium Series No. 193, 76, 34 (1980). 364 Journal of Imaging Science and Technology 13. J. S. Wey, I. H. Leubner, and J. P. Terwilliger, Photogr. Sci. Eng. 27, 35 (1983). 14. I. H. Leubner, J. Imaging Sci. Technol. 37, 510 (1993). 15. J. P. Terwilliger, Eastman Kodak Company, suggested this approach to determine the limiting conditions for the continuous precipitation model, private communication, 1996. 16. R. W. Strong and J. S. Wey, Photogr. Sci. Eng. 23, 344 (1979). 17. T. W. King, S. M. Shor, and D. A. Pitt, Photogr. Sci. Eng. 25, 70 (1981). 18. J. S. Wey. and R. Jagannathan, AIChE J. 28, 697 (1982). 19. R. Jagannathan, J. Imaging Sci. 32, 100 (1988). 20. T. Tanaka and M. Iwasaki, J. Imaging Sci. 29, 20 (1985). 21. N. S. Tavare , AIChE J. 33, 152 (1987). 22. I. H. Leubner, J. Imaging Sci. 31, 145 (1987). 23. I. H. Leubner, J. Cryst. Growth 84, 496 (1987). Nakamura, et al. Carrier Transport Properties in Polysilanes with Various Molecular Weights Tomomi Nakamura, Kunio Oka, Fuminobu Hori, Ryuichiro Oshima, Hiroyoshi Naito, and Takaaki Dohmaru*† College of Engineering, Osaka Prefecture University, 1-1 Gakuencho, Sakai, Osaka 599-8531, Japan † Research Institute for Advanced Science and Technology, Osaka Prefecture University, 1-2 Gakuencho, Sakai, Osaka 599-8570, Japan Carrier transport properties in poly(methylphenylsilane) films were studied with interest focusing on the variation of Bässler’s disorder parameters with changing molecular weights. There was no apparent correlation between the charge transport properties and the molecular density and free volume of the samples, the latter being measured by a positron annihilation technique. This result led us to a model explaining the variation in Σ with changing molecular weights, a model which is based on a hypothesis that ca. 10 silylene units at the end of a polymer chain do not participate in forming a σ-conjugated domain. The validity of this model was demonstrated by UV absorption measurements. The variation of µ0 was interpreted in terms of the partial orientation of the main chains induced by the mechanical force applied by a bar-coater. It is discussed that µ0 is sensitive to even the slightest orientation of the main chains while the other disorder parameters are not. 
Journal of Imaging Science and Technology 42: 364–369 (1998) Introduction Organic polysilanes with σ-conjugated Si backbones have been extensively investigated because of their unique physical and chemical properties.1-8 High hole drift mobilities of ~10–4 cm2/Vs at room temperature are one of their most remarkable properties.2 In addition, it has now been generally accepted that the carrier transport in organic polysilanes occurs via hopping through the σ-conjugated domains developed along Si main chains.3 Recently, we investigated carrier transport properties in various organic polysilanes.9 Then we happened to obtain a result that may suggest the molecular weight dependence of hole drift mobilities. This result considerably inspired our interest in the molecular-weight dependences of hole drift mobilities of polysilanes because it has long been considered that drift mobilities are independent of the molecular weights of polysilanes when they are comparatively high.4 Our first detailed study was performed by using poly(methylphenylsilane)s with narrow molecular weight distribution. Analysis of the results according to the disorder formalism proposed by Bässler Original manuscript received September 8, 1997 * IS&T Member © 1998, IS&T—The Society for Imaging Science and Technology 364 et al.10–12 indicated that the increase in the molecular weight mainly caused the increase in the positional disorder parameters ( Σ ). 13,14 Later we prepared poly(methylphenylsilane)s with much narrower molecular-weight distribution than before, carefully avoiding a tailing of low molecular weight components.15,16 More detailed re-investigation using these samples showed that Σ values decreased with increasing molecular weights,15 being quite opposite the tendency of our first study. The reliability of the latter result is much higher than our first study in every point such as the purity of the polysilane samples and the uniformity of the polysilane films. We tentatively proposed two models to interpret this unexpected result on the basis of the difference in the average distance between σ-conjugated domains and the partial orientation of the main chains.15,16 In this article, the two models were checked in the light of an additional experimental result obtained from a positron annihilation technique that gives information on the microscopic inner structure of materials. Experimental Procedure Samples used in this experiment were six poly(methylphenylsilane)s with various molecular weights and with small dispersity. Their weight average molecular weights were changed by two orders from ca. 10,000 to ca. 1,000,000. TABLE I. The Disorder Parameters Weight and Related Characteristics for Poly(methylphenylsilane) Samples with Various Molecular Weights Sample PS1 PS10 PS50 PS100 MW (×103) MW/MN 14 108 526 1144 1.27 1.28 1.41 1.67 viscosity* (mPa•s) σ(meV) 4.17 — 221.02 425.34 90.24 92.06 89.09 88.60 Σ µ0 (cm2/Vs) 3.00 2.86 2.33 2.20 3.52 × 102 4.47 × 102 1.70 × 102 1.68 × 102 * Measured on the respective toluene solutions; — not measured . Hole drift mobilities for the polysilane samples were measured by means of the conventional TOF (Time-of-Flight) technique. Samples for the TOF measurements were of sandwich type of Au/polysilane (2.33 µm to about 6.22 µm)/ bisazo compound (ca. 1 µm)/Al/PET film. 
Details concerning preparation of polysilane samples and the TOF measurements were described previously.15,16 Measurements of the free volume space in polysilane films were performed by means of the positron–electron annihilation lifetime spectroscopy using the conventional fast– fast coincidence system. Two BaF2 scintillators coupled with photomultipliers (HF3378-01, Hamamatsu Photonics, Shizuka, Japan) were employed to detect 1.28 MeV (birth) and 511 keV (annihilation) γ-rays. We employed 22NaCl as a radioactive source and a Kapton film (thickness: 25 µm) as a cover foil to envelope the radioactive source. Poly(methyl-phenylsilane) was casted from its toluene solution on a pure iron substrate (size: 8 × 8 × 0.2 mm3, Residual Resistance Ratio (RRR): ≥6000) and dried for 1 h at room temperature and 2 h at 80°C. A pair of polysilane samples was prepared and were closely put on both sides of the radioactive source sandwiched by two Kapton foils. Positron lifetime measurements were performed at room temperature; 1,500,000 counts were accumulated for each lifetime spectrum. All lifetime spectra were computer-fitted by using “POSITRONFIT” in the PATFIT-88 program of Kirkegaard et al.17 In this calculation, source correction terms were inputted as fixed values and subtracted from the lifetime spectrum for each polysilane/Fe sample. The viscosity of toluene solutions of poly(methylphenylsilane) was measured by a viscometer (LVT CP-40, Brookfield, Eng. Lab., Stoughton, MA USA). The density of each polysilane film was measured by a pycnometric measurement of a NaBr aqueous solution (density range: approximately 1.00 g/cm3 to 1.41 g/cm3) with density equal to a particular poly(methylphenylsilane) sample. UV absorption measurements were performed on five polysilane solutions of an equal concentration prepared weightwise with precision of concentration of <0.03%. Toluene was used as a solvent to decrease the evaporation of a solvent during the preparation of the polysilane solutions. Results and Discussion In the present study, we attempted to analyze temperature—and the electric field—dependences of the hole drift mobilities for each polysilane sample according to the disorder formalism of Bässler et al.12 that is expressed as Eq. 1. µ = µ0 exp [–(2σ/3kT)2]exp[C{(σ/kT)2 –Σ 2}F1/2] (Σ > 1.5), (1) where µ0 is the drift mobility of a hypothetical crystalline structure (with no disorder), σ is the energy width of Gaussian distribution of hopping sites (σ/kT is the degree of the energetic disorder of hopping sites), Σ is the degree of the positional disorder of hopping sites, and C is a constant. Table I shows the disorder parameters obtained for four polysilane samples with various molecular weights, together with the viscosity for each toluene solution. While the values of σ stay almost constant, Σ values obviously decrease from 3.00 to 2.20 with increasing molecular weights. In the previous communication,15 we tentatively proposed a model that this variation in Σ values was ascribed to the change in the average distance between the neighboring sites caused by the strain force arising from the entanglement of Si main chains. But in a recent report,16 we also attempted to interpret this result by proposing the other model that the variation in Σ may be caused by the partial orientation of the higher molecular weight polysilanes due to the mechanical force applied by a bar-coater. 
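The field- and temperature-dependent part of Eq. 1 is straightforward to evaluate numerically. The sketch below is our own illustration, not the authors' analysis: it computes the reduction factor µ/µ0 for the lowest and highest molecular weight samples using σ and Σ from Table I. The constant C is not specified in this excerpt, so a representative value often quoted for the disorder formalism (2.9 × 10⁻⁴ (cm/V)^1/2) is assumed, and the chosen temperature and field are illustrative only.

```python
# Evaluation of the mobility reduction factor mu/mu0 from Eq. 1 (a sketch).
# sigma and Sigma are taken from Table I; C, T and F are assumed illustrative
# values, since C is not given in this excerpt. Numbers are illustrative only.
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def bassler_reduction(sigma_eV, Sigma, T, F, C=2.9e-4):
    """mu/mu0 = exp[-(2 sigma/3kT)^2] * exp[C((sigma/kT)^2 - Sigma^2) sqrt(F)]."""
    s = sigma_eV / (K_B * T)
    return math.exp(-(2.0*s/3.0)**2) * math.exp(C*(s**2 - Sigma**2)*math.sqrt(F))

# PS1 vs PS100 (Table I), at T = 295 K and F = 4e5 V/cm
for name, sigma, Sigma in [("PS1", 0.09024, 3.00), ("PS100", 0.08860, 2.20)]:
    print(name, bassler_reduction(sigma, Sigma, T=295.0, F=4.0e5))
```

Because the smaller Σ of the high molecular weight sample reduces the negative field-dependent term, the reduction factor of PS100 comes out larger than that of PS1 at the same field, consistent with the trend discussed in the text.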
First, we discuss the plausibility of the former model.15 The average distance between the neighboring sites mentioned above is considered to correspond to free space called “free volume,” which is a direct measure of the inter-main chain distances. We proceeded to measure the sizes of the free volumes in the polysilane films by means of a positron annihilation technique18,19 and also to measure the densities of the polysilane films that supplement the information necessary to describe the inner structure of the polysilane films. Figure 1 shows positron lifetime spectra for poly(methylphenylsilane) samples with various molecular weights; the lifetime spectrum for the substrate with no polysilane film is shown in Fig. 1(D). Comparison of the two lifetime spectra in Fig. 1(D) clearly shows that new slopes appear that are ascribed to the annihilations of positrons and positroniums in the polysilane film. Analysis of the lifetime spectra in this study shows that five-component deconvolution gives the best fit for all samples. Two components of the five are the source correction terms arising from the annihilation of positrons and positroniums in Kapton foils. Thus, three components are left for polysilane films: τ1, τ2, and τ3 in the order of lifetime value. The shortest lifetime component (τ1) is a mixture of two components: one from the annihilation of positrons in the substrates and the other from the self-annihilation of para-positroniums in the polysilane films. The middle lifetime component (τ2) is considered to correspond to the annihilation of positrons in microscopic space such as vacancy and chain defects in the polysilane films and/or the annihilation of orthopositronium in the bulk of the polysilane films.20,21 The longest lifetime component (τ3) is ascribed to the pick-off annihilation of ortho-positroniums in free volume in the polysilane films. We are most interested in these sizes of free volumes (i.e., τ3 values), which are equal to the average distance between the σ-conjugated domains dwelling on Si main chains. We estimated the free volume size as the average free volume radii (FVR) calculated from τ3 values according to Eqs. 2 and 3, which are semiempirically given on the basis of the assumption that a free volume space is spherical.22 τ = 0.5 {1 – R /R 0 + sin(2πR/R0)/2π}–1, (2) R = R0 – ∆R (∆R = 0.166 nm), (3) where τ is the ortho-positronium lifetime (i.e., τ3 in the present study) and R is the radius of a spherical free volume. Lifetime values (τ2 and τ3) and FVR for each polysilane sample are summarized in Table II, together with the polysilane densities. The values of τ2 give information on the bulk of the polysilane film, but their detailed discussion is beyond our scope in this report. The values of τ3 Carrier Transport Properties in Polysilanes with Various Molecular Weights Vol. 42, No. 4, July/Aug. 1998 365 TABLE II. Positron Lifetimes, Free Volume Radii (FVR) , and Densities Relating to Poly(methylphenylsilane)s with Various Molecular Weights Sample τ2 (ns) τ3 (ns) FVR (Å) density (g/cm3) PS1 PS10 PS50 PS100 0.64 0.65 0.71 0.75 2.27 2.19 2.21 2.26 3.09 3.02 3.04 3.09 1.10 1.10 1.10 1.10 (a) (b) Figure 2. Schematic illustration of the hypothesis that the relative densities of the σ-conjugated domains and the void silylene units differ between (a) high and (b) low molecular weight polysilane films. Figure 1. Lifetime spectra for poly(methylphenylsilane)s with various molecular weights. 
Lifetime spectrum for the Kapton source/pure iron substrate system is shown in (D). and, thus, FVR stay almost constant against the change in the molecular weight by more than 2 orders. This result shows that the average free volume sizes in the polysilane films, i.e., the average inter-chain distance, is constant against the change in the molecular weight from ca. 10,000 to ca. 1,000,000. Table II also shows that the densities of the four polysilane samples are precisely equal. Combining these two supplementary results, we can depict the inner structure of the poly(methylphenylsilane) films; the densities of both occupied and unoccupied volumes in the polysilane films do not change against the change in the molecular weight, meaning the invariance of the density of silylene units in the polysilane films of various molecular weights. 366 Journal of Imaging Science and Technology We now propose a model to elucidate the variation in Σ values against the change in the molecular weight shown in Table I. According to the disorder formalism by Bässler et al.,12 the parameter “Σ ” tells the degree of the positional disorder of the hopping sites, which in the case of polysilanes corresponds to the σ-conjugated domains that in turn are composed of 15 to 30 silylene units.3 The present results show that the number of silylene units per unit volume in the polysilane films is extremely similar despite the difference in the molecular weight by 2 orders. At first glance, the present results contradict the former model, but this discrepancy would be easily avoided by assuming that there are some void silylene units that do not participate in forming a σ-conjugated domain, as pointed out by Hayashi et al.,23 who introduced the concept that there are two types of silylene units in a polysilane film: one in a σ-conjugated domain and the other between the σ-conjugated domains. We propose a hypothesis that such void silylene units are located dominantly near the ends of polysilane chains, on the basis of a INDO/S calculation for (H2Si)20 by Klingensmith et al.,24 who showed that the transition density near the end of the polysilane molecule was much lower than that in the middle. Schematic illustration of the model on the basis of this hypothesis is shown in Fig. 2. The key features in this model are two-fold. One is the spatial disposition of the σ-conjugated domains, which we have finally figured out from the results in Table II, that suggests similar free volume spaces among the polysilane films of different molecular weights. The other is the number of the σ-conjugated domains per unit volume of the polysilane films. Because the σ-conjugated domains are generally accepted to be a chromophore for UV absorption of polysilanes, we attempted to substantiate the latter feature by an absorbance measurement Nakamura, et al. Figure 3. UV absorption spectra of toluene solutions of poly(methylphenylsilane)s with various molecular weights. Measured wavelength range is from 280 to 450 nm. Figure 4. Molecular-weight dependence of absorbance in poly(methylphenylsilane)s. of five polysilane solutions with precisely equal concentration. Although the absolute value of absorbance in solution is sometimes different from that in film, we may assume that their relative values do not change significantly between solution and film when absorbance of some polysilanes are measured. 
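Before turning to the absorption results, note that the free volume radii in Table II follow directly from Eqs. 2 and 3, which have no closed-form inverse; R must be obtained numerically. The sketch below is our own illustration of that inversion (the bisection bracket is our choice) and returns radii near 3.0 to 3.1 Å for the τ3 values of Table II.

```python
# Free-volume radius from the ortho-positronium pick-off lifetime, Eqs. 2 and 3.
# A sketch: Eq. 2 is inverted numerically (bisection) to obtain R from tau3.
import math

DELTA_R = 0.166  # nm (Eq. 3)

def tau_from_R(R_nm):
    """Eq. 2: tau = 0.5 / (1 - R/R0 + sin(2*pi*R/R0)/(2*pi)), with R0 = R + DELTA_R."""
    R0 = R_nm + DELTA_R
    x = R_nm / R0
    return 0.5 / (1.0 - x + math.sin(2.0*math.pi*x)/(2.0*math.pi))

def radius_from_tau(tau_ns, lo=0.05, hi=1.0, tol=1e-6):
    # tau increases monotonically with R over this range, so bisection suffices
    while hi - lo > tol:
        mid = 0.5*(lo + hi)
        if tau_from_R(mid) < tau_ns:
            lo = mid
        else:
            hi = mid
    return 0.5*(lo + hi)

for tau3 in (2.27, 2.19, 2.21, 2.26):                   # Table II values, ns
    R = radius_from_tau(tau3)
    print(f"tau3 = {tau3} ns  ->  FVR = {10.0*R:.2f} A")  # nm -> Angstrom
```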
Figure 3 shows the absorption spectra, which is one of the three repeated measurements that gave precisely equal results in which each spectrum is due to the typical σ–σ* transition for polysilanes peaking at ca. 340 nm. Figure 3 reveals two important features. One is a slight red shift of ca. 2 nm in the absorption maximum wavelength (λmax) with increasing molecular weights. Trefonas et al.25 reported that λmax in polysilanes abruptly increases with increasing chain lengths at the shorter chain length region and approaches a limiting value at the chain length of 40 to 50. But close examination of their original experimental figure suggests a slight increase in λmax even in the chain length region from 50 to 1350. Our experiment manifests that this slight red shift still continues even in the higher chain length degree up to ca. 10,000 for PS100. This slight red shift suggests that the size of the σ-conjugated domain continues to develop very slightly with increasing chain-lengths even in the chain length of ca. 10,000. As shown in Table I, the variation in σ values with changing molecular weights is very small (by approximately 1 to 3 meV). We reported that the values of σ were almost constant against the change in the molecular weight in the previous reports,15,16 but it may better be interpreted that σ values slightly decrease with increasing molecular weights reflecting the slight variation in the distribution of the sizes of the σ-conjugated domains with changing molecular weights. The other feature in Fig. 3 is the obvious increase in absorbance at λmax for each polysilane sample with increasing molecular weights, indicating that the number of the σ-conjugated domains per unit volume become larger with increasing molecular weight. This trend is more clearly seen in Fig. 4, which depicts the variation of absorbance against the molecular weight. This variation of the absorbance in Fig. 4 was simulated according to Lambert–Beer’s law on the basis of the following considerations: (1) each chromophore (σ-conjugated domain) consists of 20 silylene units, (2) there are 10 void silylene units that do not contribute to a chromophore at each end of a polysilane chain, (3) the value of the absorptivity per Si–Si bond is determined as 11,100 1/Mcm. In this simulation, the number of void silylene units was taken as a fitting parameter and determined to be 10 for the best fit. Consideration 1 may be generally accepted because it was reported that a σ-conjugated domain consists of 15 to 30 silylene units3 and the number of silylene units of 20 is within this range. Consideration 3 is calculated on the basis of the experimental absorbance of PS100 by assuming it has no void units, which is in fair agreement with the value of 12,000 1/Mcm obtained by Harrah and Zeigler.26 The best-fit simulation curve is illustrated by the dotted line in Fig. 4; the fitting with the experimental plots is satisfactory, which suggests the validity of this model that is based on the hypothesis that 10 silylene units at each end of a polysilane chain are void in forming a chromophore. In the INDO/S calculation by Klingensmith et al. (vide supra),24 it is reported that a single gauche link somewhere in the all-trans backbone of (H2Si)20 separates the chromophore into two segments and the lowest energy excitation was localized on the longer segment. 
Applying their model to the present case of poly(methylphenylsilane), a conformational break predominantly occurs near the end of a polymer chain making the end segment (the shorter one) void in forming a σ-conjugated domain. Thus, higher molecular weight polysilanes have smaller numbers of void segments per unit volume, i.e., larger numbers of σ-conjugated domains per unit volume (larger absorption) leading to smaller Σ values. Next, we check the validity of the latter model,16 which proposes that the variation in Σ with changing molecular weights may be caused by the partial orientation of the higher molecular weight polysilane chains due to the mechanical force applied by a bar-coater. The key factors for this model are the length of the main chains (i.e., the molecular weight) and the viscosity of the polysilane solutions used in the sample preparation for the TOF measurements. As shown in Table I, the viscosity of the solutions obviously increases with increasing molecular weight, suggesting that the larger mechanical force applied by a bar-coater induces the Carrier Transport Properties in Polysilanes with Various Molecular Weights Vol. 42, No. 4, July/Aug. 1998 367 Figure 5. Concentration dependences of the positional disorder parameter. (• and solid line) 1,1-bis(di-4-tolylaminophenyl)cyclohexane doped bisphenol-A-polycarbonate system27; () tri-ptolylamine doped bisphenol-A-polycarbonate system28; () our results. The values for tri-p-tolylamine doped bisphenol-A-polycarbonate system were calculated from three hole transport parameters: T0, θ, and γ. higher degree of partial orientation for the higher molecular weight polysilane chains leading to the smaller Σ values. We attempted to detect the main-chain orientation by polarizing-microscopic observation, but the observation for each bar-coated sample showed no detectable orientation of Si main chains. However, we can not abandon the possibility of the partial orientation completely because µ0 may reflect the very slight orientation of the main chains. The values of µ 0 show a tendency to decrease with increasing molecular weights although their variation is not always systematic. In the definition of disorder formalism, the shape of the hopping sites is regarded as isotropic and therefore µ 0 values reflect only the inter-site distance. But the shape of the hopping sites (i.e., σ-conjugated domains) for polysilanes is quite anisotropic, and therefore not only the inter-site distance but also the shape of the hopping sites are sometimes reflected in the hypothetical crystalline state, e.g., of oriented polysilane films. The present tendency of µ0 may demand that not only the void silylene units but also the effects of the partial orientation of Si main chains are to be taken into consideration. Introducing this additional effect, the number of the main chains parallel to the direction of the carrier transport becomes smaller with increasing molecular weights, leading to the decrease in µ 0. But the inter-site distance becomes shorter with increasing molecular weights because of the higher density of the σ-conjugated domain, leading to the obvious increase in µ0. Table I is interpreted to show that the decrease in µ0 arising from the decreased number of the main chains parallel to the direction of carrier transport is slightly more predominant than the increase in µ0 due to the shorter inter-site distance. 
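The Lambert–Beer simulation described above can be written down compactly. The sketch below is our own illustration of that model, not the authors' fitting code: the 10 void silylene units per chain end and the absorptivity of 11,100 1/(M cm) are the values reported in the text, while the repeat-unit molar mass used to convert Mw into a chain length (~120 g/mol for methylphenylsilane) is our own assumption. With these inputs the fraction of silylene units inside σ-conjugated domains comes out close to the concentrations quoted below (roughly 83% for PS1 rising to about 99.8% for PS100).

```python
# Sketch of the void-end-segment absorbance model described above.
# Assumptions: ~120 g/mol per methylphenylsilane unit (our estimate), 10 void
# silylene units per chain end (the paper's fitted value), and an absorptivity
# of 11,100 1/(M cm) per Si-Si bond (the PS100-based calibration in the text).
MONOMER_MASS = 120.2      # g/mol, Si(CH3)(C6H5) repeat unit (assumed)
VOID_PER_END = 10         # silylene units per chain end not in a chromophore
EPS_PER_BOND = 11100.0    # 1/(M cm)

def chromophore_fraction(Mw):
    """Fraction of silylene units that contribute to sigma-conjugated domains."""
    n_units = Mw / MONOMER_MASS
    return max(0.0, (n_units - 2*VOID_PER_END) / n_units)

def relative_absorbance(Mw, conc_si_molar):
    """Lambert-Beer estimate (per cm path) for a solution of given Si-unit molarity."""
    return EPS_PER_BOND * conc_si_molar * chromophore_fraction(Mw)

for name, Mw in [("PS1", 14e3), ("PS10", 108e3), ("PS50", 526e3), ("PS100", 1144e3)]:
    frac = chromophore_fraction(Mw)
    print(f"{name}: {100*frac:.1f}% of units in chromophores, "
          f"relative absorbance {relative_absorbance(Mw, 1e-3):.1f}")
```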
The variation in the carrier transport properties with changing molecular weights shown in Table I is considered to appear as a result of the effects of both the void silylene units and the slight partial orientation of the main chains. 368 Journal of Imaging Science and Technology It is of interest to compare the σ-conjugated hopping sites of polysilanes with the molecular hopping sites, the original model for deriving disorder formalism,10–12 with interest focusing on how Σ values are affected by the concentration of the respective kinds of hopping sites. Although it was found that the variation in the carrier transport seen in Table I was caused by the void silylene units and the slight partial orientation of the main chains, we believe the variation in Σ is almost caused by the former effect and not by the small orientation that is not detected by the polarizing-microscopic observation. The latter effect is seen only in µ0 very sensitive to the shape and direction of the hopping sites. It is assumed that the σ-conjugated domain concentration is 100% when all silylene units in a polysilane film contribute to the formation of the σ-conjugated domains. If a void segment at each end of a polysilane molecule consists of 10 silylene units according to our simulation, the σ-conjugated domain concentrations calculated for PS100, PS50, PS10, and PS1 are 99.79%, 99.57%, 97.77%, and 82.52%, respectively. Borsenberger27 studied the concentration dependences of the disorder parameters for the 1,1-bis(di-4-tolylaminophenyl)cyclohexane doped bisphenol-A-polycarbonate system and reported that the values of Σ increased with decreasing dopant concentrations, drastically increasing from 2 to 3 against the decrease in the concentration from 100% to 80%. Figure 5 shows the dependence of Σ values on the concentration of the hopping sites in polysilanes and those in two kinds of molecularly doped polymers by Borsenberger.27,28 Although the concentration range in the present experiment is much narrower than that adopted in the Borsenberger’s experiment,27 our results agree with his result very well in the concentration range from 100% to 80%. This fact seems to be obvious evidence suggesting that the disorder formalism originally proposed for carrier transport in molecularly doped polymers can be applied to polysilanes that are typical mainchain polymers. Summary Carrier transport properties in poly(methylphenylsilane)s with various molecular weights were studied in the framework of disorder formalism of Bässler et al.10–12 Conclusions in this study are summarized as follows: 1. The increase in the molecular weight of poly(methylphenylsilane)s from ca. 10,000 to ca. 1,000,000 mainly causes the lowering of the values of Σ. 2. Both the average size of free volume in the polysilane films and the polysilane densities are invariant against the change in molecular weight by 2 orders. 3. A model to explain the variation in Σ against the molecular weight was proposed on the basis of the void silylene units that do not participate in forming a σ-conjugated domain. 4. The absorbance at λmax of five polysilane solutions with precisely equal concentration clearly increases with increasing molecular weights, and a slight red shift of ca. 2 nm in λmax is observed with increasing molecular weights. 5. The partial orientation not detected by polarizingmicroscopic observation is reflected as the variance in µ0 that is sensitive to the shape and the direction of the hopping sites. References 1. 2. R. D. Miller and J. 
Michl, Polysilane high polymers, Chem. Rev. 89, 1359 (1987). R. G. Kepler, J. M. Zeigler, L. A. Harrah, and S. R. Kurtz, Photocarrier generation and transport in ff -conjugated polysilanes, Phys. Rev. B35, 2818 (1987). Nakamura, et al. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 15. M. A. Abkowitz, M. J. Rice, and M. Stolka, Electric transport in silicon backbone polymers, Phil. Mag. B 61, 25 (1990). M. Stolka, H.-J. Yuh, K. McGrane, and D. M. Pai, Hole transport in organic polymers with silicon backbone (polysilylenes), J. Polym. Sci. Polym. Chem. Ed. 25, 823 (1987). M. Stolka and M. A. Abkowitz, Electric transport in glassy Si-backbone polymers, J. Non-Cryst. Solids 97/98, 1111 (1987). M. A. Abkowitz and M. Stolka, Common features in the electronic transport behavior of diverse glassy solids, Phil. Mag. Lett. 58, 239 (1988). M. Abkowitz, F.E. Knier, H.-J. Yuh, R. J. Weagley, and M. Stolka, Electronic transport in amorphous silicon backbone polymerts, Solid State Commun. 62, 547 (1987). K. Yokoyama and M. Yokoyama, Trap-controlled charge carrier transport in organopolysilanes doped with trapforming organic compounds, Solid State Commun. 73, 199 (1990). T. Dohmaru, K. Oka, T. Yajima, M. Miyamoto, Y. Nakayama, T. Kawamura, and R. West, Hole transport in polysilanes with diverse side-chain substituents, Phil. Mag. B 71, 1069 (1995). H. Bässler, Localized states and electronic transport in single component organic solids with diagonal disorder, Phys. Stat. Sol. B107, 9 (1981). H. Bässler, Charge transport in molecularly doped polymers, Phil. Mag. B 50, 347 (1984). P. M. Borsenberger, L. Pautmeier and H. Bässler, Charge transport in disordered molecular solids, J. Chem. Phys. 94, 5447 (1991). M. Miyamoto, T. Dohmaru, R. West, and Y. Nakayama, Substituent and molecular-weight effects on hole transport in polysilanes (in Japanese), J. Soc. Electrophotography Jpn. 33, 209 (1994). M. Miyamoto, Y. Nakayama, L. Han, K. Oka, R. West and T. Dohmaru, Effect of molecular-weight on hole drift mobility in polysilanes, Proc. of Polymer for Microelectronics, Kawasaki, Japan, 1993, p. 176. T. Nakamura, K. Oka, H. Naito, M. Okuda, Y. Nakayama, and T. Dohmaru, Effect of molecular weight on hole transport in polysilanes, Solid State Commun. 101, 503 (1997). 16. T. Dohmaru, T. Nakamura, K. Oka, F. Hori, R. Oshima, Y. Nakayama, H. Naito, and M. Okuda, Effect of molecular weight on hole transport in polysilanes, in Proc. of IS&T’s 12th Int. Congress on Advances in NIP Tech., IS&T, Springfield, VA, 1996, p. 471. 17. P. Kirkegaard, M. Eldrup, O. E. Mogensen, and N. J. Pedersen, Program system for analysing positron life time spectra and angular correlation curves, Comp. Phys. Commun. 23, 307 (1981). 18. Y. C. Jean, Positron annihilation spectroscopy for chemical analysis: a novel probe for microstructural analysis of polymers, Microchem. J. 42, 72 (1990). 19. Richard A. Pethrick, Positron annihilation -a probe for nanoscale voids and free volume?, Prog. Polym. Sci. 22, 1 (1997). 20. Y. Ohko, A. Uedono and Y. Ujihira, Thermal variation of free volumes size distribution in polypropylenes. Probed by positron annihilation life time technique, J. Polym. Sci. B, Polym. Phys. 33, 1183 (1995). 21. Y. Ujihira and H. Nakanishi, Nondestructive analysis by positron measurement (in Japanese), Radioisotop. 30, 511 (1981). 22. H. Nakanishi and Y. C. Jean, Positron and Positronium Chemistry, D. M. Schrader and Y. C. Jean, Eds., Elsevier, Amsterdam, 1988, p. 95. 23. H. Hayashi, T. Kurando, and Y. 
Nakayama, Prephotobleaching process in polysilane films, Jpn. J. Appl. Phys. 36, 1250 (1997). 24. K. A. Klingensmith, J. W. Downing, R. D. Miller, and J. Michl, Electronic excitation in poly(di-n-hexylsilane) , J. Am. Chem. Soc. 108, 7438 (1986). 25. P. Trefonas, R. West, R. D. Miller, and D. Hofer, Organosilane high polymers: electronic spectra and photodegradation, J. Polym. Sci., Polym. Lett. Ed. 21, 823 (1983). 26. L. A. Harrah and J. M. Zeigler, Electronic spectra of polysilanes, Macromol. 20, 601 (1987). 27. P. M. Borsenberger, The concentration dependence of the hole mobility of 1,1-bis(di-4tolylaminophenyl)cyclohexane doped bisphenol-Apolyearbonate, J. Appl. Phys. 72, 5283 (1992). 28. P. M. Borsenberger, Hole transport in tri-p-tolylamine doped bisphenolA-polyearbonate, J. Appl. Phys. 68, 6263 (1990). Carrier Transport Properties in Polysilanes with Various Molecular Weights Vol. 42, No. 4, July/Aug. 1998 369 JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY • Volume 42, Number 4, July/August 1998 Edge Estimation and Restoration of Gaussian Degraded Images Ziya Telatar*† and Önder Tüzünalp‡ Ankara University Faculty of Science, Department of Electronic Engineering, 06100-Besevler, Ankara, Turkey The blur function of a degraded image is often unknown a priority. The blur function must first be estimated from the degraded image data before restoring the image. We propose an algorithm to address the blur estimation problem. The present algorithm based on the estimation of restoration filter parameters by using edge information of the degraded image is presented to solve the restoration problem of the degraded image. The information that relates the variance of the Gaussian blur kernel on degraded image is considered. Simulation results of image restoration illustrate the performance of the proposed estimation method. Journal of Imaging Science and Technology 42: 370–374 (1998) Introduction The restoration of images degraded by blur is still a central problem in image processing. Blur can be introduced by atmospheric turbulance, improperly focused lenses, relative motion, or other environmental factors between an object being photographed and an image scanner. The restoration of degraded images differs in each case. The problem of deblurring of images with known blur function has been dealt with extensively in the literature. The restoration algorithms include Fourier domain methods (inverse filtering, 1–3 blind deconvolution, 3–5 Cepstrum,6–8 etc.) or spatial domain methods.9 In many applications, however, the blur function is unknown. Therefore, the estimation or identification of blur function directly from the blurred image has been a focus of great deal of interest. A number of techniques have been proposed to address this problem. Chang, Tekalp, and Erdem 10 have proposed a blur identification algorithm in which an observed image has been segmented into N segments by using a method for blur identification. Reeves and Mersereau11 have used a generalized cross validation method for blur identification. Kayargadde and Martens 12 have used polynomial transformations to estimate the edge parameters and image blur. Although several methods exist to restore degraded images, there is still room for improvement.3 In this work we propose a new algorithm for restoration of unknown Gaussian blurred images using edge estimation. 
The proposed algorithm follows an iterative scheme to converge to the blur function and then restores the degraded image after a certain number of iterations. The following sections present the blur model, our restoration algorithm, experimental results, and conclusions.

Problem Identification and the Proposed Method

In this work we address the problem of deconvolving an unknown Gaussian degradation with the aid of edge estimation. To describe our technique, we begin in this section with a brief description of a blurred image and state some important properties relating to it. Generally, a blurred image can be modeled as

y(n1, n2) = g(n1, n2)*h(n1, n2) + v(n1, n2), (1)

where the original image g(n1, n2) has been blurred by the function h(n1, n2) with additive noise v(n1, n2). Additive noise may come from the imaging system, independently of the original image, and the parameters of the additive-noise degradation model of an imaging system are generally known. Thus, additive noise can be removed from the degraded image by standard image processing techniques, such as Wiener filtering, before the restoration. The additive-noise problem is therefore left out here; we refer the reader to Refs. 1 through 5 and 9 for further details. Neglecting the additive noise, Eq. 1 can be rewritten as

y(n1, n2) = g(n1, n2)*h(n1, n2). (2)

Equation 2 states that the degradation is the result of the convolution between the original scene and the blur function. The blur function h(n1, n2) in Eq. 2 could have statistically different distributions, and a different model could be identified for each distribution, so the restoration problem may become very complex. Insofar as the Gaussian distribution includes the other distributions in general, we assume that all blurring effects have a Gaussian, or normal, distribution; for example, atmospheric turbulence, unfocused imaging systems, motion, or evaporation effects can cause the original image to be blurred with a Gaussian distribution. A Gaussian blur kernel can be modeled as

h1(n1, n2) = (1/(2πσ²)) exp(−(n1² + n2²)/(2σ²)). (3)

Equation 3 represents the filter model, in which the variance and the matrix size affect the filter performance. This model will be used for the restoration of the blurred image.
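To make Eqs. 2 and 3 concrete, the following minimal NumPy sketch builds a sampled Gaussian kernel and blurs a test image with it. The function names, the unit-sum normalization, and the FFT-based circular convolution are our own illustrative choices, not specifications from the paper.

```python
# Minimal sketch of the blur model of Eqs. 2 and 3 (NumPy only).
# Function names and the unit-sum normalization are illustrative, not from the paper.
import numpy as np

def gaussian_kernel(size, sigma):
    """Sampled Gaussian h1(n1, n2) of Eq. 3 on a size x size grid, scaled to unit sum."""
    n = np.arange(size) - (size - 1) / 2.0
    n1, n2 = np.meshgrid(n, n, indexing="ij")
    h = np.exp(-(n1**2 + n2**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return h / h.sum()

def blur(image, kernel):
    """Circular convolution y = g * h of Eq. 2, computed via the FFT."""
    padded = np.zeros_like(image, dtype=float)
    k0, k1 = kernel.shape
    padded[:k0, :k1] = kernel
    padded = np.roll(padded, (-(k0 // 2), -(k1 // 2)), axis=(0, 1))  # move kernel centre to (0, 0)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

# Example: a 200 x 200 synthetic scene blurred with variance 13.75 on a 13 x 13 grid,
# matching the kernel size quoted later for the simulated child image.
g = np.zeros((200, 200))
g[60:140, 60:140] = 1.0
y = blur(g, gaussian_kernel(13, np.sqrt(13.75)))
```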
As long as the variance and the matrix size can be appropriately chosen, a good approximation of the original image should be obtained. In this study, we use the gradient edge-detection method3 to estimate the filter model parameters. In a blurred image, regions of high frequency, called edge pixels, are spread over the neighboring pixels, causing the loss of image details. Thus, the edge map of a blurred image does not contain more edge lines or points than that of the original image. Using this property, we can say that the edge map of an image contains important information about the degradation.

Figure 1. Edge estimation and restoration algorithm block diagram.

Figure 1 shows a block diagram of the algorithm. The process of converging to the blur function of the degraded image is as follows:
Step 1. Find the edge map of the actual image.
Step 2. Choose the filter model parameters, variance and matrix size, from Step 1.
Step 3. Construct a filter using the parameters in Step 2.
Step 4. Restore the degraded image.
Step 5. Find the edge map of the image from Step 4.
Step 6. Compare the edge map from Step 5 with the previous edge map. If the Step 5 edge map contains more edge points, the current filter parameters become the actual ones; otherwise the previous parameters are retained.
Step 7. Choose the next value of the filter parameters.
Step 8. Construct a new filter and repeat Steps 5 through 8.

After a certain number of iterations, the best edge map gives the best filter model parameters, which are used for designing the restoration filter. (Note that different variances are used for a fixed matrix size of the filter model in the algorithm; in other words, the variance is searched for a fixed blur matrix size, and then the matrix size is changed.) Therefore, for each iteration step,

bi+1 = bi + Δbi[n1(i + 1); n2(i + 1)], (4)

where bi+1 is the convergence vector of the model coefficients and Δbi is the correction term that depends on the measurements over a period. The choice of the amplitude level of the edge points is particularly important, because the edge algorithm may detect low-level noise as an edge point. To suppress false edges, we first introduce a fixed threshold k and accept as edge points only the amplitudes that exceed this threshold. If k is low, noise may be detected as edge points; if k is high, some edge points may not be detected. Thus, the threshold level k has been chosen as 35% of the amplitude of the maximum edge pixel level. As a result, the relation between the variance and the edge algorithm is defined as

σ² = f(∇) = {Σn1 Σn2 ∇[y(n1, n2) ≥ k]}max, (5)

where σ² is the variance, ∇ is the gradient operator, and y is the blurred image. Equation 5 is used to compute an appropriate matrix size and variance for the restoration filter model.

Having estimated the filter model parameters, we use the Fourier-domain Cepstrum transform for the filtering. The Cepstrum algorithm has been used extensively in image processing applications, and its features are well documented in the literature.3,6–8 The Fourier-domain Cepstrum transform of Eq. 2 is

y′(ω1, ω2) = g′(ω1, ω2) + h′(ω1, ω2). (6)

Equation 6 shows that, in the Cepstrum domain, the blurred image decomposes into the sum of an original-scene component and a blur component. Let the filter designed after the iterations be h1(n1, n2); the filtering process in the Cepstrum domain is then

y′(ω1, ω2) = g′(ω1, ω2) + h′(ω1, ω2) − h′1(ω1, ω2). (7)

Minimizing the error between the blur-function model parameters and the constructed filter model parameters improves the quality of the image. The mean squared prediction error is computed to obtain the restoration error,

E = Σn1 Σn2 [g(n1, n2) − ynew(n1, n2)]², (8)

where g(n1, n2) and ynew(n1, n2) are the original and restored images, respectively. The energy of the original signal, E1, is defined by

E1 = Σn1 Σn2 [g(n1, n2)]². (9)

To evaluate the improvement in the restored image, we combine Eqs. 8 and 9:

I = 20 log10(E1/E). (10)
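As a complement to Steps 1 through 8 and Eqs. 5 and 8 through 10, the sketch below (which reuses gaussian_kernel from the previous sketch) searches candidate variances for a fixed matrix size, scoring each candidate by the number of edge points in the restored image's thresholded gradient map. For brevity, the restore step substitutes a small-constant regularized inverse filter for the paper's Cepstrum-domain subtraction of Eq. 7; all names, the regularization constant, and the candidate-variance grid are illustrative assumptions.

```python
# Sketch of the iterative parameter search (Steps 1-8) with the 35% edge
# threshold of Eq. 5 and the improvement measure of Eqs. 8-10. NumPy only.
# restore() uses a regularized inverse filter as a stand-in for the paper's
# Cepstrum-domain subtraction; all names here are illustrative.
import numpy as np

def edge_map(img, frac=0.35):
    """Thresholded gradient-magnitude edge map; k is 35% of the maximum edge amplitude."""
    d0, d1 = np.gradient(img.astype(float))
    mag = np.hypot(d0, d1)
    return mag >= frac * mag.max()

def restore(y, kernel, eps=1e-2):
    """Stand-in restoration: regularized inverse filtering with a candidate Gaussian kernel."""
    padded = np.zeros_like(y, dtype=float)
    k0, k1 = kernel.shape
    padded[:k0, :k1] = kernel
    padded = np.roll(padded, (-(k0 // 2), -(k1 // 2)), axis=(0, 1))
    H = np.fft.fft2(padded)
    Y = np.fft.fft2(y)
    return np.real(np.fft.ifft2(Y * np.conj(H) / (np.abs(H) ** 2 + eps)))

def estimate_and_restore(y, size=13, variances=np.arange(1.0, 20.0, 0.25)):
    """Search candidate variances for a fixed matrix size: keep the candidate
    whose restored image yields the richest edge map (Steps 5-7)."""
    best_count, best_var, best_img = edge_map(y).sum(), None, y
    for var in variances:
        h1 = gaussian_kernel(size, np.sqrt(var))   # from the previous sketch
        candidate = restore(y, h1)
        count = edge_map(candidate).sum()
        if count > best_count:                     # Step 6: richer edge map wins
            best_count, best_var, best_img = count, var, candidate
    return best_var, best_img

def improvement_db(g, y_new):
    """Improvement I of Eq. 10 from the restoration error (Eq. 8) and signal energy (Eq. 9)."""
    E = np.sum((g - y_new) ** 2)
    E1 = np.sum(g ** 2)
    return 20.0 * np.log10(E1 / E)

# Example (reuses g and y from the previous sketch):
# var_hat, y_new = estimate_and_restore(y)
# print(var_hat, improvement_db(g, y_new))
```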
Experimental Results

The performance of the proposed algorithm has been investigated with three different types of blurred images. The restoration results are presented in this section for a 200 × 200 pixel simulated child image, a real-world degraded photographic image, and real-world satellite images. The proposed algorithm estimates the size and variance of the blur function from the blurred image and restores it.

Figure 2. (a) Blurred child image with variance 13.75 (left above); (b) edge map of (a) (right above); (c) restored image with a filter not estimated correctly from the degraded image (middle left); (d) edge map of (c) (middle right); (e) restored image after the iterations (left below); (f) edge map of the result of the iterations (right below).

TABLE I. Mean Square Error Measurements for the Blurred and Restored Child Image (Fig. 2)
Estimated variance     MSE in blurred image   MSE in restored image   Improvement (dB)
5.5 (7 × 7 pix.)       1145.3                 4.811                   75.8
10.25 (13 × 13 pix.)   2400.0                 4.988                   75.5
13.75 (13 × 13 pix.)   3154.1                 6.463                   73.3

Figures 2(a) and 2(b) show the Gaussian blurred child image and its edge map, in which image details have been lost; not enough edge pixels appear on the edge map. Figures 2(c) and 2(d) show the restoration result for an incorrectly estimated filter parameter, so the image is still not very clear. Figures 2(e) and 2(f) present the restored image and its edge map. The threshold operation prevents small noise from being detected as edge points. The performance of the restoration is shown by the improvement in image quality in Table I.

The algorithm has also been applied to real-life degraded images, and considerable improvement has been observed in the resulting images. Figures 3(a) and 3(b) show an original photographic image and its edge map. The image was blurred by an out-of-focus lens; the degradation again has a Gaussian distribution. The restoration result is depicted in Fig. 3(c) and its edge map in Fig. 3(d). Table II also presents the improvement in image quality. Figures 4(a), 5(a), and 6(a) show real-world satellite images taken by the Hubble Space Telescope. These images have been degraded by atmospheric turbulence, so some details in the images have been lost. The restored images are shown in Figs. 4(c), 5(c), and 6(c), respectively. The improvements in all of the satellite images are considerable, as given in Table II.

TABLE II. Restoration Results for the Real-World Images
Image    Estimated variance   Improvement (dB)   Image type
Fig. 3   5                    26.6               Photographic
Fig. 4   3                    28.7               Satellite
Fig. 5   7                    26.4               Satellite
Fig. 6   4                    30.15              Satellite

Conclusion

This paper develops a new restoration algorithm for images degraded by an unknown Gaussian blur. The variance and matrix size of the convolutional Gaussian effect are estimated, and the blurred image is restored accordingly. Our experimental results show that the proposed method performs effective restoration of degraded images. If the original scene has been degraded only by the blur function, as in the case of the simulated child image, the restoration result is satisfactory. In real-world images, however, there are unmeasured observation effects that cannot be controlled. Our filter model compensates for some, but not all, of these effects, so the restoration results and improvements for the real-world images given in Table II are not as high as for the simulated image.

Figure 3. (a) A real-world photographic image (left above); (b) edge map of (a) (right above); (c) restored image from (a) (left below); (d) edge map of (c) (right below).
Figure 4. (a) A real-world satellite image (left above); (b) edge map of (a) (right above); (c) restored image from (a) (left below); (d) edge map of (c) (right below).

Figure 5. (a) A real-world satellite image (left above); (b) edge map of (a) (right above); (c) restored image from (a) (left below); (d) edge map of (c) (right below).

Figure 6. (a) A real-world satellite image (left); (b) restored image from (a) (right).

The algorithm needs 125 iteration steps, requiring 43 min on a 90-MHz Pentium computer. If the matrix size is chosen properly, the number of iteration steps can be reduced to 20 with a very low computing time, which can be important for real-time applications; under these conditions, however, the restoration quality may decrease.

References
1. B. L. McGlamery, Restoration of turbulence-degraded images, J. Opt. Soc. Amer. 57, 293 (1967).
2. G. M. Robbins and T. S. Huang, Inverse filtering for linear shift-variant imaging systems, Proc. IEEE 60, 862 (1972).
3. J. S. Lim, Two Dimensional Signal and Image Processing, Prentice Hall, 1990.
4. J. W. Goodman, Introduction to Fourier Optics, McGraw Hill, 1968.
5. S. C. Pohlig, New techniques for blind deconvolution, Opt. Eng. 20, 281 (1981).
6. D. E. Dudgeon, The computation of two-dimensional Cepstra, IEEE Trans. Acoust. Speech Signal Proc. ASSP-25, 476 (1977).
7. J. K. Lee, M. Kabrisky, M. E. Oxley, S. K. Rogers, and D. W. Ruck, The complex Cepstrum applied to two-dimensional images, Patt. Recog. 26, 1579 (1993).
8. P. A. Petropulu and C. L. Nikias, The complex Cepstrum and Bicepstrum: Analytic performance evaluation in the presence of Gaussian noise, IEEE Trans. Acoust. Speech Signal Proc. 38, 1246 (1990).
9. O. Shunichiro, Restoration of images degraded by motion blur using matrix operators, Int. J. Systems Sci. 22, 937 (1991).
10. M. M. Chang, A. M. Tekalp, and A. T. Erdem, Blur identification using bispectrum, IEEE Trans. Signal Proc. 39, 2323 (1991).
11. S. J. Reeves and R. M. Mersereau, Blur identification by the method of generalized cross-validation, IEEE Trans. Image Proc. 1, 301 (1992).
12. V. Kayargadde and J. B. Martens, Estimation of edge parameters and image blur using polynomial transforms, CVGIP 56, 442 (1994).
13. G. Demoment, Image reconstruction and restoration: Overview of common estimation structures and problems, IEEE Trans. Acoust. Speech Signal Process. 37, 2024 (1989).
14. Z. Telatar, Adaptive restoration of blurred satellite images, in ISI 51st Session of the International Statistical Institute, Book 2, 499 (1997).
15. Y. Ando, A. Hansuebsai, and K. Khantong, Digital restoration of faded color images by subjective method, J. Imag. Sci. Tech. 41, 259 (1997).

The Business Directory
Electronic Imaging and Color Reproduction

Business Directory Ad, Journal of Imaging Science and Technology
Burt Saunders, Consultant, 8384 Short Tract Rd., Nunda, NY 14517, 716-468-5013
IS&T Mbrs. U.S. $100; Non-Mbrs. U.S. $150; Contact: Pam Forness
TO THE NON-IMPACT PRINTING INDUSTRY, Six Issues
IS&T, 7003 Kilworth Lane, Springfield, VA 22151; 703-642-9090; FAX: 703-642-9094; E-mail: pam@imaging.org

AUTOMATE COLOR & DENSITY MEASUREMENTS of Proofs and Calibration Sheets
X-Y scanning stages and software for measuring sheets up to 30′′ × 40′′ with the handheld instruments that you presently use to make measurements manually
David L. Spooner, PE, rhoMetric Associates, Ltd.
Business: 978-448-5485 Home: 978-448-5583
Edgar B. Gutoff, Sc.D., P.E.
PHOTOTHERMOGRAPHY
Consulting Chemical Engineer
194 Clark Road, Brookline, MA 02146
Phone/Fax: 617-734-7081
John Winslow, Consultant
• 13 years experience with Fortune 500 Company
• Specializing in Silver Organic/Silver Halide Systems
• Coating Seminars
Co-authored Coating and Drying Defects (1995), Modern Coating and Drying Technology (1992), and The Application of SPC to Roll Products (1994).

WARREN SOLODAR, Consulting Chemist
consulting in: • electrostatics • static problems • electrographics • xerographics • dielectric materials
556 Lowell Rd., Groton, MA 01450
• Consulting in slot, slide, curtain, and roll coating; in coating die design, and in drying technology.
• PATENT SEARCHES • EXPERT WITNESS • MATERIALS SOURCING
Please direct your inquiries in confidence to:
ELECTROSTATIC CONSULTING ASSOCIATES
Telephones (302) 754-9045 • FAX (302) 764-5808
• INK JET INKS • DYES • PIGMENTS
PAUL J. MALINARIC, consulting engineer
Call to Inquire for Fax number
2918 N. Franklin Street, Wilmington, DE 19802-2933
• Drying Software
CONSULTING CHEMIST
480 Montgomery Avenue, Merion Station, PA 19066-1213
Phone or Fax: 610-664-5321

EPPING GmbH—PES Laboratorium
R & D & E for Electrostatic Powder Physics
Measurement and Sale of Devices for Charge, Conductivity and Magnetic Parameters for Powders.
Contact: Andreas J. Kuettner, Carl-Orff-Weg 7, 85375 Neufahrn bei Freising, Germany; TEL +49-8165-960 35; FAX +49-8165-960 36; E-mail rhepping@t-online.de

Send inquiries to: John Winslow, Phototherm Consulting Ltd., 407 Orchard Lane, So. St. Paul, MN 55075, 612-450-1650

Positions Available and Positions Wanted
Can now be found on the IS&T homepage
Please visit us at http://www.imaging.org and click on EMPLOYMENT OPPORTUNITIES
IS&T

1999 IS&T HONORS AND AWARDS NOMINATION
(It is important that the Committee have complete and accurate information when making selections. Please complete the questionnaire thoroughly and accurately.)
Name ______________________________________
Employer ______________________________________ Current Position ______________________________________
Home Address ______________________________________ Phone # ______________________________________
Business Address ______________________________________ Phone # ______________________________________
Recommended Honor/Award ______________________________________
How does this person qualify for the recommended honor/award? ______________________________________
______________________________________
Place and date of birth ______________________________________
Education (include year and school for degrees) ______________________________________
______________________________________
Brief employment history ______________________________________
______________________________________
IS&T member? ______________ How long? ______________
Current or previous positions ______________________________________
Previous IS&T honors/awards ______________________________________
Other organization memberships/honors ______________________________________
______________________________________
Please attach a bibliography. (Any additional information about the candidate may be submitted on an additional sheet.)
Send this form to: IS&T, 7003 Kilworth Lane, Springfield, VA 22151; 703/642-9090; Fax: 703/642-9094; E-mail: info@imaging.org
Nominated by: Name ______________________________________ Date ______________________________________
Address ______________________________________
______________________________________
Voice ______________ FAX ______________ E-MAIL ______________
Deadline: January 1, 1999

IS&T Honors and Awards—Call for Nominations
Each year the IS&T Honors and Awards Committee selects scientists, engineers, educators and students who have made outstanding contributions to the field of imaging. Your nominations are needed. A list of prior recipients may be found on the Society’s web site—www.imaging.org.

Honorary Member
Honorary membership, the highest award bestowed by the Society, recognizes outstanding contributions to the advancement of imaging science or engineering.

Edwin H. Land Medal
The Edwin H. Land Medal is endowed by Polaroid Corporation and awarded in alternate years by IS&T and the Optical Society of America. The award recognizes an individual who has demonstrated, from a base of scientific knowledge, pioneering entrepreneurial creativity that has had major public impact.

Chester F. Carlson Award
The Chester F. Carlson Award, sponsored by Xerox Corporation, Webster Research Center, was awarded for the first time in 1985.
The award has been established to recognize outstanding work in the science or technology of electrophotography.

Lieven Gevaert Medal
The Lieven-Gevaert Award, sponsored by Bayer Corporation/Agfa Division, recognizes outstanding contributions in the field of silver halide photography.

Kosar Memorial Award
The Kosar Memorial Award, sponsored by the Tri-State Chapter, recognizes contributions in the area of unconventional imaging.

Raymond C. Bowman Award
The Raymond C. Bowman Award is sponsored by the Tri-State Chapter. The award is given in recognition of an individual who has been instrumental in fostering, encouraging, helping, and otherwise facilitating individuals, either young or adult, in the pursuit of a career, beginning with an appropriate education, in the technical-scientific aspects of photography or imaging science.

Fellowship
Fellowship is awarded to a Regular Member for outstanding achievement in imaging science or engineering.

Senior Membership
Senior Membership is awarded for long-term service to the Society at the national level.

Journal Award (Science)
The Journal Award recognizes an outstanding contribution in the area of basic science, published in the Journal of Imaging Science and Technology during the preceding year.

Charles E. Ives Award (Engineering)
The Charles E. Ives Award, sponsored by IS&T’s Rochester Chapter, is given in recognition of an outstanding contribution published originally in the Journal of Imaging Science and Technology during the preceding calendar year. The publication should be in the general area of applied science or engineering, concerned with the successful application of scientific and engineering principles to an imaging problem or with a technical problem solved with imaging technology.

Itek Award
The Itek Award is for an outstanding original student publication in the field of imaging science and engineering.

Service Award
The Service Award is given in recognition of service to a Chapter, or to the Society.

IS&T Recent Progress Series
Keeping up with the latest technical information is a task that becomes increasingly difficult. This is not only caused by the large amount of information, but also by its dispersed distribution at a variety of conferences. The “Recent Progress” series collects, through the eyes of the Society for Imaging Science and Technology, technical information from several conferences and publications into a concise treatise of a subject. This series allows the professional to stay up-to-date and to find the relevant data in the covered field quickly and efficiently.

Now Available
• Recent Progress in Color Management and Communications 1998 (Mbr. $65; Non-Mbr. $75)
• Recent Progress in Color Science 1997 (Mbr. $65; Non-Mbr. $75)
• Recent Progress in Toner Technology 1997 (Mbr. $65; Non-Mbr. $75)
• Recent Progress in Ink-Jet Technologies 1996 (Mbr. $55; Non-Mbr. $65)
• Recent Progress in Digital Halftoning 1995 (Mbr. $55; Non-Mbr. $65)
Plus shipping and handling: $4.50 U.S.; $8.50 outside the U.S.A.
Contact IS&T to order Today! Phone: 703-642-9090; Fax: 703-642-9094; E-mail: info@imaging.org; www.imaging.org

IS&T’s NIP14: International Conference on Digital Printing Technologies
October 18–23, 1998, The Westin Harbour Castle Hotel, Toronto, Ontario, Canada
General Chair: Dr. David Dreyfuss, Lexmark International, Inc.
Come and meet with us in Toronto for IS&T's NIP14: International Conference on Digital Printing Technologies.
Over the years, the NIP Conferences have emerged as the preeminent forum for discussion of advances and directions in the field of non-impact and digital printing technologies. A comprehensive program of more than 170 contributed papers from leading scientists and engineers is planned, along with daily keynote addresses, an extensive program of tutorials, a print gallery, and an exhibition of digital printing products, components, materials and equipment. Following the presentations each day, the authors will be available for one-on-one discussions.

The proposed program topics are:
• Electrostatic Marking Processes
• Electrostatic Marking Materials
• Photoreceptors
• Ink-Jet Processes
• Ink-Jet Materials
• Media for Digital Printing
• Print and Image Quality
• Color/Science/Image Processing
• Advanced and Novel Printing
• Desktop, Commercial and Industrial Ink-Jet Printing
• Thermal Printing
• Liquid Toner Processes and Materials
• Electrography and Magnetography
• Textile and Fabric Printing
• Production Digital Printing
• Quality Control Instrumentation
• Wide and Grand Format Printing
• Enabling Technologies behind Recent Major Product Announcements

The Society for Imaging Science and Technology, 7003 Kilworth Ln., Springfield, VA 22151; 703-642-9090; FAX: 703-642-9094; E-Mail: info@imaging.org

Preliminary Program now available
September 7-11, 1998 • University of Antwerp (UIA), Belgium
Secretariat: Jan De Roeck, c/o Agfa-Gevaert N.V., Septestraat 27, B-2640 Mortsel, Belgium. Tel: +32(1)3 444.88.78; Fax: +32(1)3 444.88.71; e-mail: icps.be; http://www.icps.be

Conference sessions in three tracks:
I. Nanostructured Materials for Imaging
Symposium on Advanced Characterization Techniques for Nanostructured Materials
Symposium on Environmental Issues for Imaging Systems
II. Printing and Non-AgX Imaging Systems
Symposium on Textile Printing and Related Industrial Applications
III. Electronic Imaging
Symposium on Information Technology: A New Century for Medical Imaging

Short Courses:
Digital Image Preservation
Ink Jet Printing: The Basics of Photorealistic Quality
Colour in Multimedia
Comparing the Imaging Physics of Film and CCD Sensors
Electronic Imaging Systems Fundamentals
Modern Display Technologies
Questions and Answers to Colour Image Processing in Computed Radiography

Franziska Frey, Annette Jaffe, Lindsay MacDonald, Michael Kriss, Nitin Sampat, Patrick Vandenberghe, Jean-Pierre Van de Capelle, Piet Vuylsteke

IS&T—The Society for Imaging Science and Technology, 7003 Kilworth Lane, Springfield, VA 22151
President ROBERT GRUBER, Xerox Corporation, 800 Phillips Road, W114-40D, Webster, NY 14580 Voice: 716-422-5611 FAX: 716-422-6039 e-mail: robert_gruber@wb.xerox.com Executive Vice President JOHN D. MEYER, Hewlett Packard Laboratories, 1501 Page Mill Rd., 2U-19, P.O. Box 10490, Palo Alto, CA 94304 Voice: 650-857-2580 FAX: 650-857-4320 e-mail: meyer@hpl.hp.com Publications Vice President REINER ESCHBACH, Xerox Corporation, 800 Phillips Road, 0128-27E, Webster, New York 14580 Voice: 716-422-3261 FAX: 716-422-6117 e-mail: eschbach@wrc.xerox.com Conference Vice President WAYNE JAEGER, Tektronix, M/S 61-IRD, 26600 S. W. Parkway, Wilsonville, OR 97070-1000 Voice: 503-685-3281 FAX: 503-685-4366 e-mail: wayne.jaeger@tek.com Vice Presidents JAMES KING, Adobe Systems Inc., 345 Park Ave., MS: W14, San Jose, CA 95110-2704 Voice: 408-536-4944 FAX: 408-536-6000 e-mail: jking@adobe.com JAMES R.
MILCH, Eastman Kodak Company Research Labs, Bldg. 65, 1700 Dewey Ave., Rochester, New York 14650 Voice: 716-588-9400 FAX: 716-588-3269 e-mail: jrmilch@kodak.com W. E. NELSON, Texas Instruments, P. O. Box 655474, MS 63, Dallas, TX 75265 Voice: 972-575-0270 FAX: 972-575-0090 e-mail: wen@msg.ti.com SHIN OHNO, Sony Corporation, Business & Professional Systems Co., 4-14-1 Okata, Atsugi 243, Kanagawa 243-0021, Japan Voice: 81-462-27-2373 FAX: 81-462-27-2374 e-mail: shin@avctl.cpg.sony.co.jp MELVILLE R. V. SAHYUN, Department of Chemistry, University of Wisconsin, Eau Claire, WI 54702 Voice: 715-836-4175 FAX: 715-836-4979 e-mail: sahyunm@uwec.edu DEREK WILSON, Coates Electrographics, Ltd., Norton Hill, Midsomer Norton, Bath, BA3 4RT, England Voice: 44-1761-408545 FAX: 44-1761-418544 e-mail: derek.wilson@msm.coates.co.uk Secretary BERNICE ROGOWITZ, IBM Corp., T. J. Watson Research, P. O. Box 704, M/S H2-B62, Yorktown Heights, NY 10598-0218 Voice: 914-784-7954 FAX: 914-784-6245 e-mail: rogowtz@us.ibm.com Treasurer GEORGE MARSHALL, Lexmark International, Inc., 6555 Monarch Rd., Dept. 57R/031A, Boulder, CO 80301 Voice: 303-581-5052 FAX: 303-581-5097 e-mail: toner@lexmark.com Immediate Past President JAMES OWENS, Eastman Kodak Company, Research Labs., M.C. 01822, Rochester, NY 14650 Voice: 716-477-7603 FAX: 716-477-0736 e-mail: jcowens@kodak.com Executive Director CALVA LEONARD, IS&T, 7003 Kilworth Lane, Springfield, VA 22151 Voice: 703-642-9090 FAX: 703-642-9094 e-mail: calva@imaging.org Chapter Officers BINGHAMTON, NEW YORK (BI) Bruce Resnick, Director Albert Levit, President Christopher Turock, Secretary ROCHESTER, NEW YORK (RO) Joanne Weber, Director Dennis Abramsohn, President Joanne Weber, Secretary TRI-STATE James Chung, Director Frederic Grevin, President Robert Uzenoff, Secretary BOSTON, MASSACHUSETTS (BO) Lynne Champion, Director Jeffrey Seideman, President Jim Boyack, Secretary ROCHESTER INSTITUTE OF TECHNOLOGY (RT) Faculty Advisors Zoran Ninkov and Jonathan Arney TWIN CITIES, MINNESOTA (TC) Stan Busman, Director Jeanne Haubrich, President Susan K. Yarmey, Secretary EUROPE (EU) Hans Jörg Metz, Director/President Open, Secretary RUSSIA (RU) Michael V. Alfimov, Director/President T. Slavnova, Secretary WASHINGTON, DC (WA) Joseph Kitrosser, Director Open, President Open, Secretary KOREA (KO) J.-H. Kim, Director Young S. You, President TOKYO, JAPAN (JA) Yoichi Miyake, Director Tadaaki Tani, President Takashi Kitamura, Secretary IS&T Corporate Members The Corporate Members of your Society provide a significant amount of financial support that assists IS&T in disseminating information and providing professional services to imaging scientists and engineers. In turn, the Society provides a number of material benefits to its Corporate Members. For complete information on the Corporate Membership program, contact IS&T, 7003 Kilworth Lane, Springfield, VA 22151. Sustaining Corporate Members Applied Science Fiction 8920 Business Park Drive Austin, TX 78759 Imation Corporation 1 Imation Place Oakdale, MN 55128-3414 Tektronix, Inc. P.O. Box 4675 Beaverton, OR 97076-4675 Eastman Kodak Company 343 State Street Rochester, NY 14650 Lexmark International, Inc. 740 New Circle Road NW Lexington, KY 40511 Xerox Corporation Webster Research Center Webster, NY 14580 Hewlett Packard Labs. 1501 Page Mill Road Palo Alto, CA 94304 Polaroid Corporation P.O. Box 150 Cambridge, MA 02139 Supporting Corporate Members Konica Corporation No. 
1 Sakura-machi Hino-shi, Tokyo 191 Japan Kodak Polychrome Graphics 401 Merritt 7 Norwalk, CT 06851 Xeikon N.V. Vredebaan 71 2640 Mortsel, Belgium Donor Corporate Members Agfa Division Bayer Corp. 100 Challenger Road Ridgefield Park, NJ 07760 BASF Corporation 100 Cherry Hill Road Parsippany, NJ 07054 Canon , Inc. Shimomaruko 3-30-2 Ohta-ku, Tokyo 146 Japan Clariant GmbH Division Pigments & Additives 65926 Frankfurt am Main Germany Delphax Systems Canton Technology Center 5 Campanelli Circle Canton, MA 02021 Felix Schoeller Jr. GmbH & Co. KG Postfach 3667 49026 Osnabruck, Germany Fuji Photo Film USA, Inc. 555 Taxter Road Elmsford, NY 10523 Fuji Xerox Company Ltd. 3-5 Akasaka, 3-chome Minato-ku, Tokyo 107 Japan Hallmark Cards, Inc. Chemistry R & D 2501 McGee, #359 Kansas City, MO 64141-6580 Hitachi Koki Co., Ltd. 1060 Takeda, Hitachinaka-City Ibaraki- Pref 312 Japan KDY Inc. 9 Townsend West Nashua, NH 03063 Ilford Photo Corporation West 70 Century Road Paramus, NJ 07653 Kind & Knox Gelatin, Inc. P.O. Box 927 Sioux City, IA 51102 Minolta Co., Ltd. 1-2, Sakuramachi Takatsaki, Osaka 569 Japan Mitsubishi Electric 5-1-1 Ofuna, Kamakura Kanagawa 247 Japan Nitta Gelatin NA Inc. 201 W. Passaic Street Rochelle Park, NJ 07662 Questra Consulting 300 Linden Oaks Rochester, NY 14625 Research Laboratories of Australia 7, Valetta Road, Kidman Park S. Australia, 5025, Australia Ricoh Company Ltd. 15-5, Minami-Aoyama 1-chome, Minato-ku, Tokyo 107 Japan SKW Biosystems, Inc. 2021 Cabot Boulevard West Langhorne, PA 19047 Sharp Corporation 492 Minosho-cho Yamatokoriyama 639-11 Japan Sony Corporation 6-7-35 Kita-shinagawa Shinagawa, Tokyo 141 Japan Sony Electronic Photography & Printing 3 Paragon Drive Montvale, NJ 07645 Trebla Chemical Company 8417 Chapin Ind. Drive St. Louis, MO 63114 XMX Corporation 46 Manning Road Billerica, MA 01821-3944