RAY OPTICS NOTES - RIT Center for Imaging Science
Ray Optics for Imaging Systems
Course Notes for IMGS-321
11 December 2013

Roger Easton
Chester F. Carlson Center for Imaging Science
Rochester Institute of Technology
54 Lomb Memorial Drive
Rochester, NY 14623
1-585-475-5969
easton@cis.rit.edu

Contents

Preface
0.1 References

1 Introduction
1.1 Models of Light and Propagation
1.1.1 Ray model of light (“geometrical optics”)
1.1.2 Wave model of light (“physical optics”)
1.1.3 Photon model of light (“quantum optics”)

2 Ray (Geometric) Optics
2.1 What is an imaging system?
2.1.1 Simplest Imaging System: Pinhole in Absorber
2.2 First-Order Optics
2.3 Third-Order Optics
2.3.1 Higher-Order Approximations
2.4 Notations and Sign Conventions
2.4.1 Nature of Objects and Images
2.5 Human Eye
2.6 Principle of Least Time
2.7 Fermat’s Principle for Reflection
2.7.1 Plane Mirrors
2.8 Fermat’s Principle for Refraction
2.8.1 Dispersion
2.8.2 Refractive Constants for Glasses
2.9 Image Formation in the Ray Model
2.9.1 Refraction at a Spherical Surface
2.9.2 Imaging with Spherical Mirrors
2.10 First-Order Imaging with Thin Lenses
2.10.1 Examples of Thin Lenses
2.10.2 Spherical Mirror
2.11 Image Magnifications
2.11.1 Transverse Magnification
2.11.2 Longitudinal Magnification
2.11.3 Angular Magnification
2.12 Single Thin Lenses
2.12.1 Positive Lens
2.12.2 Negative Lens
2.12.3 Meniscus Lenses
2.12.4 Simple Microscope (magnifier, “magnifying glass,” “loupe”)
2.13 Systems of Thin Lenses
2.13.1 Two-Lens System
2.13.2 Effective (Equivalent) Focal Length
2.13.3 Summary of Distances for Two-Lens System
2.13.4 “Effective Power” of Two-Lens System
2.13.5 Lenses in Contact: t = 0
2.13.6 Positive Lenses Separated by t < f1 + f2
2.13.7 Cardinal Points
2.13.8 Lenses Separated by t = f1 + f2: Afocal System (Telescope)
2.13.9 Positive Lenses Separated by t = f1 or t = f2
2.13.10 Positive Lenses Separated by t > f1 + f2
2.13.11 Compound Microscopes
2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations
2.13.13 Systems of One Positive and One Negative Lens
2.13.14 Newtonian Form of Imaging Equation
2.13.15 Example (1) of Two-Lens System
2.13.16 Example (2) of Two-Lens System: Telephoto Lens
2.13.17 Images from Telephoto System
2.13.18 Example (3) of Two-Lens System: Two Negative Lenses
2.14 Plane and Spherical Mirrors
2.14.1 Comparison of Thin Lens and Concave Mirror
2.15 Stops and Pupils
2.15.1 Focal Ratio: f-number
2.15.2 Example: Focal Ratio of Lens-Aperture Systems
2.15.3 Example: Exit Pupils of Telescopic Systems
2.15.4 Pupils and Diffraction
2.15.5 Field Stop
2.16 Marginal and Chief Rays
2.16.1 Telecentricity
2.16.2 Marginal and Chief Rays for Telescopes

3 Tracing Rays Through Optical Systems
3.1 Paraxial Ray Tracing Equations
3.1.1 Paraxial Refraction
3.1.2 Paraxial Transfer
3.1.3 Linearity of the Paraxial Refraction and Transfer Equations
3.1.4 Paraxial Ray Tracing
3.2 Matrix Formulation of Paraxial Ray Tracing
3.2.1 Refraction Matrix
3.2.2 Ray Transfer Matrix
3.2.3 “Vertex-to-Vertex Matrix” for System
3.2.4 Example 1: System of Two Positive Thin Lenses
3.2.5 Example 2: Telephoto Lens
3.2.6 M_VV′ Derived From Two Rays
3.3 Object-to-Image (Conjugate) Matrix
3.3.1 Matrix of the “Relaxed” Eye (focused at ∞)
3.4 Vertex-Vertex Matrices of Simple Imaging Systems
3.4.1 Magnifier (“magnifying glass,” “loupe”)
3.4.2 Galilean Telescope of Thin Lenses
3.4.3 Keplerian Telescope of Thin Lenses
3.4.4 Thick Lenses
3.4.5 Microscope
3.5 Image Location and Magnification
3.6 Marginal and Chief Rays for the System
3.6.1 Examples of Marginal and Chief Rays for Systems

4 Depth of Field and Depth of Focus
4.0.2 Examples of Depth of Field from Video and Film
4.1 Criterion for “Acceptable Blur”
4.2 Depth of Field via Rayleigh’s Quarter-Wave Rule
4.3 Hyperfocal Distance
4.4 Methods for Increasing Depth of Field
4.5 Sidebar: Transverse Magnification vs. Focal Length

5 Aberrations
5.1 Chromatic Aberration
5.2 Third-Order Optics, Monochromatic Aberrations
5.2.1 Names of Aberrations
5.2.2 Aberration Coefficients
5.2.3 Fourth-Order (Third-Order Ray) Aberrations
5.2.4 Zernike Polynomials
5.3 Structural Aberration Coefficients
5.4 Optical Imaging Systems and Sampling
5.5 Optical System “Rules of Thumb”
Preface

This book is intended to introduce the mathematical tools that can be applied to model and predict the action of optical imaging systems.

0.1 References:

Many references exist for the subject of wave optics, some from the point of view of physics and many others from the subdiscipline of optics. Unfortunately, relatively few from either camp concentrate on the aspects that are most relevant to imaging.

Useful Optics Texts:
[P3] (the three) Pedrottis, Introduction to Optics, Pearson Prentice-Hall, 2007.
[G] Jack D. Gaskill, Linear Systems, Fourier Transforms, and Optics, John Wiley, 1978.
[JG] Joseph Goodman, Introduction to Fourier Optics, Third Edition, Roberts & Company, 2005.
[H] Eugene Hecht, Optics, 4th Edition, Addison-Wesley, 2002.
[PON] Reynolds, DeVelis, Parrent, Thompson, The New Physical Optics Notebook, SPIE, 1989.
[BW] Max Born and Emil Wolf, Principles of Optics, 7th Expanded Edition, Cambridge University Press, 2005.
[GF] Grant R. Fowles, Introduction to Modern Optics (Second Edition), Dover Publications, 1975.
[RHW] Robert H. Webb, Elementary Wave Optics, Dover Publications, 1997.
[FLS] R. Feynman, R. Leighton, M. Sands, The Feynman Lectures on Physics, Addison-Wesley, 1964.
[KF] M.V. Klein and T.E. Furtak, Optics, Second Edition, Wiley, 1986.
[JW] F. Jenkins and H. White, Fundamentals of Optics, 4th Edition, McGraw-Hill, 1976.
[NP] A. Nussbaum and R. Phillips, Contemporary Optics for Scientists and Engineers, Prentice-Hall, 1976.
[I] K. Iizuka, Engineering Optics, Springer-Verlag, 1985.
[FBS] D. Falk, D. Brill, and D. Stork, Seeing the Light, Harper and Row, 1986.
Lawrence Mertz, Transformations in Optics, John Wiley & Sons, 1965.

Physics Texts with useful discussions:
[HR] D. Halliday and R. Resnick, Physics, 3rd Edition, Wiley, 1978.
[C] F. Crawford, Waves, Berkeley Physics Series Vol. III, McGraw-Hill, 1968.
John D. Jackson, Classical Electrodynamics, Third Edition, Wiley, 1998, §6.
Feynman, Leighton, and Sands, Lectures on Physics, particularly Volume I §25-§33 and Volume II §32-§33.

Curriculum: Geometrical Optics and Imaging

1. Models for light propagation
   (a) ray model (“geometric optics”)
   (b) wave model (“physical optics”)
   (c) photon model (“quantum optics”)
2. First-order optics
   (a) third-order optics, aberrations
   (b) higher-order approximations
3. Sign conventions for distances and angles
   (a) Nature of objects and images (real and virtual)
4. Human eye
5. Refractive index
   (a) Optical path length
   (b) Fermat’s principle of least time (P3 §2.2, H §4.5, BW §3.3)
   (c) Snell’s law for reflection: θ2 = −θ1
       i. plane mirrors
   (d) Snell’s law for refraction: n1 sin[θ1] = n2 sin[θ2]
       i. plane interface between two media
   (e) Dispersion (variation in n with λ)
       i. relationship between mean refractive index and dispersion
       ii. crown and flint glasses
   (f) Dispersing prisms
6. Refraction at a Spherical Surface
   (a) Paraxial approximation, imaging equation
   (b) Reflection at a spherical surface
7. Imaging with thin lenses
   (a) Imaging equation in terms of object and image distances and focal length
   (b) system “power”
   (c) spherical mirrors
   (d) object/image conjugates
   (e) Image magnifications
       i. Transverse magnification
       ii. Longitudinal magnification
       iii. Angular magnification
   (f) Single thin lenses
       i. positive lens
       ii. negative lens
       iii. meniscus lens
       iv. simple microscope
   (g) Systems of thin lenses
       i. lenses in contact
       ii. effective focal length and power of two-lens system
       iii. focal and principal points
       iv. afocal systems (telescopes)
       v. eyeglasses
       vi. compound microscopes
       vii. Newtonian form of imaging equation
       viii. telephoto lens
       ix. Stops and pupils
           A. aperture stop
           B. entrance and exit pupils
           C. field stop
   (h) Marginal and chief (principal) rays
       i. telecentricity
8. Tracing rays through optical systems
   (a) paraxial ray tracing equations
       i. paraxial refraction
       ii. paraxial transfer
       iii. linearity of equations
   (b) matrix formulation of paraxial ray tracing
       i. refraction matrix
       ii. transfer matrix
       iii. Lagrangian invariant
       iv. vertex-to-vertex matrix for imaging system
       v. object-to-image (conjugate) matrix
       vi. matrix for eye model
   (c) Examples of imaging system matrices
       i. magnifier
       ii. Galilean telescope
       iii. Keplerian telescope
       iv. thick lens
       v. microscope
   (d) image location and magnification
   (e) Depth of field and depth of focus
       i. examples from film and video
       ii. criterion for “acceptable blur”
       iii. depth of field via Rayleigh’s quarter-wave rule
       iv. hyperfocal distance
       v. methods for increasing depth of field
       vi. transverse magnification vs. focal length
   (f) Aberrations
       i. Chromatic aberration
           A. achromatic doublet
           B. apochromatic triplet
       ii. Third-Order (Seidel) Aberrations
           A. spherical aberration (relation to defocus)
           B. coma
           C. astigmatism
           D. distortion
           E. curvature of field
           F. piston error
9. Computed Ray Tracing, OSLO™

Chapter 1
Introduction

The obvious first question to consider is “what is optics” (or perhaps “what are optics?” heh, heh). One reasonable definition of optics is the application of physical principles and observed phenomena to manipulate “light” in useful ways. This presupposes the definition of “light,” which I specify as electromagnetic radiation of any “color,” temporal frequency, and wavelength. This is more general than the definition put forth by humanocentrics (e.g., color scientists), but is much more reasonable in our field, where we want to take advantage of all measurable radiation to learn information about objects that emit, reflect, refract, or otherwise modify radiation.
The definition in imaging is somewhat narrower: the application of the properties of materials and of light to form “images,” which are “recognizable (though approximate) replicas of the spatial and spectral distribution of light reflected, transmitted, and/or emitted by an object.” To design optical image-forming systems, we must model the propagation of light from the object (source) to the optic, the action of the optic on the incident light distribution, and finally propagation from the optic to the sensor. The last step, in which the sensor converts the spatial (and possibly spectral) distribution of incident light into measurable physical and/or chemical changes in some medium, is outside the scope of this discussion.

We hope to find a mathematical model of optical imaging as a “system,” where an output distribution g is created from an input object distribution f by the action of an imaging system O, e.g.,

$$g[x, y, \lambda] = \mathcal{O}\{f[x, y, z, \lambda]\}$$

We generally use this model to (try to) solve the inverse imaging problem by inferring the input object from the output image and knowledge of the system. The task may be difficult or even impossible; one difficulty is easy to see: most sensors measure only a 2-D distribution of monochromatic light and therefore cannot possibly recover the three spatial dimensions of a realistic object from a single image.

[Figure: Schematic of an optical system that acts on an input with three spatial dimensions, time, and wavelength, f[x, y, z, t, λ], to produce a 2-D monochrome (gray scale) image g[x′, y′].]

1.1 Models of Light and Propagation

To be able even to write down, let alone solve, the imaging equation(s) for optical systems, we need to specify the mathematical model of light that will describe its behavior as it propagates and interacts with input objects, optical systems, and output sensors.
To simplify the descriptions in the different contexts, three physical models for light and its interactions are used that are (loosely speaking) distinguished by the physical scale of the phenomena:

1.1.1 Ray model of light (“geometrical optics”)

1. macroscopic-scale phenomena (e.g., reflection, refraction)
   (a) light propagates as RAYS that travel in straight lines until encountering a change in the properties of a medium or an interface between media. Except to differentiate the color of light, the wavelength λ and temporal frequency ν of the light are assumed to be zero and infinity, respectively (λ → 0, ν → ∞), which means that there are no effects due to diffraction;
   (b) uses Fermat’s principle of least time to derive Snell’s law, which describes the phenomena of reflection and refraction;
   (c) useful for designing imaging systems (to locate the images and determine their magnifications);
   (d) calculations for modeling the behavior of optical systems (lenses and/or mirrors) are (relatively) simple and may be easily implemented in software;
   (e) the quality of images from the system is assessed in terms of aberrations of the optical system, which describe deviations of the image from ideal behavior.

1.1.2 Wave model of light (“physical optics”):

1. microscopic-scale phenomena (diffraction/interference, reflection, refraction, refractive index, ...)
   (a) considers light (electromagnetic radiation) to propagate as WAVES;
   (b) propagation and interaction of light are described by Maxwell’s equations;
   (c) light propagates with velocity c in vacuum ($c \lesssim 3 \times 10^{8}\ \mathrm{m\,s^{-1}}$) and velocity v < c in transparent materials;
   (d) light is described by its wavelength in vacuum λ0 and oscillation frequency ν0, whose values affect any interactions with matter;
   (e) the oscillation frequency ν0 of waves emitted by a particular light source is constant regardless of medium and is related to the vacuum wavelength λ0 via:
       $$\lambda_0 \cdot \nu_0 = c$$
   (f) the ratio of the propagation velocities in vacuum and in a medium is the index of refraction of the medium:
       $$n \equiv \frac{c}{v}$$
   (g) the wavelength of the wave in a medium is shorter than the “vacuum wavelength” λ0 via:
       $$\lambda_{\mathrm{medium}} = \frac{\lambda_0}{n}$$
   (h) wave optics explains the image-forming phenomena of reflection, refraction, diffraction (and interference, which is really just another name for diffraction) and the phenomena of polarization and dispersion that affect the quality of images;
   (i) mathematical calculations in wave optics are more “complicated” than those in ray optics and often not easy to implement in computers. For example, it is difficult to evaluate the exact form of light after propagating a short distance from the source;
   (j) uses the Huygens-Fresnel principle to derive the mathematical model for propagation of light, which is often divided into three regions:
       i. linear, shift-invariant model in the Rayleigh-Sommerfeld diffraction region (valid everywhere)
       ii. linear, shift-invariant approximation in the near field for propagation by a “sufficiently large” distance from the source (Fresnel diffraction)
       iii. linear, shift-variant approximation in the far field for propagation to “very large” distances from the source (Fraunhofer diffraction);
   (k) wave/physical optics is useful for assessing the quality of the images produced by systems.
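The relations in items (e) through (g) of the wave model can be checked with a few lines of arithmetic. The following is a minimal sketch only: the function name `wave_in_medium`, the 500 nm wavelength, and the index n = 1.5 are illustrative assumptions that do not come from the notes.

```python
# Speed of light in vacuum, m/s (rounded value, assumed for illustration).
C_VACUUM = 2.998e8


def wave_in_medium(lambda0_m, n):
    """Return (frequency in Hz, speed in m/s, wavelength in m) inside a medium.

    lambda0_m : vacuum wavelength in meters
    n         : refractive index of the medium (hypothetical value supplied by the caller)
    """
    nu0 = C_VACUUM / lambda0_m       # lambda0 * nu0 = c; frequency is the same in every medium
    v = C_VACUUM / n                 # n = c / v
    lambda_medium = lambda0_m / n    # wavelength is shortened by the factor n
    return nu0, v, lambda_medium


# Example: green light (500 nm in vacuum) in a glass with an assumed index of 1.5.
nu0, v, lam = wave_in_medium(500e-9, 1.5)
print(f"frequency  = {nu0:.4e} Hz")
print(f"speed      = {v:.4e} m/s")
print(f"wavelength = {lam * 1e9:.2f} nm")
```

Note that the product of the wavelength in the medium and the (unchanged) frequency returns the reduced speed v, which is a quick consistency check on the three relations.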
1.1.3 Photon model of light (“quantum optics”):

1. atomic-scale phenomena (emission and absorption of radiation)
   (a) light is composed of PHOTONS with both wave and particle characteristics;
   (b) used to explain/analyze the physical interaction of light and matter, such as emission by sources (e.g., lasers), and the photoelectric effect in sensors;
   (c) Fundamental relationships:
       $$E_0 = h\nu_0 = \frac{hc}{\lambda_0} \quad \text{and momentum} \quad p = \frac{E}{c} = \frac{h}{\lambda_0},$$
       where h is Planck’s constant:
       $$h \cong 4.136 \times 10^{-15}\ \mathrm{eV\,s} \cong 6.626 \times 10^{-34}\ \mathrm{J\,s}$$

Phenomena described by the ray and wave models are most relevant to imaging, though the quantum model is vital for understanding the properties and artifacts of light sensing.

You probably have seen some consideration of ray optics in undergraduate physics, and any such experience will be useful in this course. The most common treatments of optics consider rays first because the mathematical models and calculations are simpler. However, the preparation in linear systems that you just had makes it possible and even desirable to consider the wave model first by applying the concepts of the impulse response and transfer function; these may significantly simplify the concepts and calculations.
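As a numerical aside on the fundamental relationships of Section 1.1.3, the sketch below evaluates the photon energy and momentum for one wavelength. It is only an illustration: the function names and the 500 nm example are assumptions, and the constants are the rounded values quoted above together with a rounded speed of light.

```python
import math

H_EV_S = 4.136e-15   # Planck's constant in eV s (rounded value from the notes)
H_J_S = 6.626e-34    # Planck's constant in J s (rounded value from the notes)
C_VACUUM = 2.998e8   # speed of light in vacuum, m/s (rounded, assumed)


def photon_energy_ev(lambda0_m):
    """Photon energy E = h c / lambda0, in electron volts."""
    return H_EV_S * C_VACUUM / lambda0_m


def photon_energy_joule(lambda0_m):
    """Photon energy E = h c / lambda0, in joules."""
    return H_J_S * C_VACUUM / lambda0_m


def photon_momentum(lambda0_m):
    """Photon momentum p = h / lambda0, in kg m/s."""
    return H_J_S / lambda0_m


lambda0 = 500e-9  # hypothetical example: green light, 500 nm in vacuum
print(f"E = {photon_energy_ev(lambda0):.3f} eV")
print(f"p = {photon_momentum(lambda0):.4e} kg m/s")
# p = E / c should agree with p = h / lambda0
print(math.isclose(photon_energy_joule(lambda0) / C_VACUUM, photon_momentum(lambda0)))
```

Visible-light photons therefore carry energies of a few electron volts, which is the scale relevant to the photoelectric effect in sensors mentioned in item (b).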
By the conclusion of this discussion, we want to be able to:
• locate the image(s) of an object generated by the lens, mirror, or system of lenses and/or mirrors;
• determine the “character” (real or virtual) and the size(s) (i.e., the transverse magnification) of the image(s);
• determine the “field of view” of the imaging system, i.e., the angular subtense of the object that is imaged;
• determine the range of distances in the scene from the optical system that appears to be “in focus” (the depth of field);
• determine the capability of the optics to distinguish closely spaced objects; this is the “spatial resolution” of the system (often specified in terms of measurements from the “point spread function” or the “modulation transfer function” = “MTF,” which are optical analogues of the “impulse response” and “transfer function” that are considered in the course on Fourier methods);
• understand the constraints on system performance due to the properties of materials used in the imaging system, such as the variation in refractive index of glass with wavelength (dispersion).

Much of this discussion (especially about depth of field and spatial resolution) will benefit from concepts derived in the course on Fourier methods, but we must also be aware of the limitations in these concepts due to nonlinearities and/or shift-variant properties of the optical system.

Chapter 2
Ray (Geometric) Optics

Ray optics (commonly, though unfortunately, called “geometric optics”) uses the model of light as a ray to evaluate the locations and properties of images created by systems of lenses and/or mirrors. It does not consider any effects due to the wave model of light, such as interference or diffraction (which are actually just different words for the same phenomenon: “interference” considers few light sources and “diffraction” considers an infinite number, or just “many”).
Ray optics also cannot explain other wave-propagation phenomena, such as total internal reflection. The subject of ray optics may be subdivided into categories of “first-order,” “third-order,” and even higher-order optical computations.

2.1 What is an imaging system?

As a simple definition, we may consider an imaging system to map the distribution of the input “object” to a “similar” distribution at the output “image” (where the meaning of “similar” is to be determined). Often the input and output amplitudes are represented in different units. For example, the input often is electromagnetic radiation with units of, say, watts per unit area, while the output may be a transparent negative emulsion measured in dimensionless units of “density” or “transmittance.” In other words, the system often changes the form of the energy; it is a “transducer.”

In the ray model, we can think of the imaging system as “selecting” and/or “redirecting” rays of light to map the energy onto the image sensor. The “selection” or “redirection” process uses some type of physical interaction between light and matter to remap the energy emitted or modified by the object onto the sensor. Among the more obvious physical interactions in our experience are refraction and reflection, but these are not the only, nor even the simplest, possible mechanisms. The very simplest interaction between light and matter is absorption, where the light energy is transferred to matter and “disappears” (of course, it does not really “vanish”; most often it is converted into heat in the matter, but it is no longer available to create an image, so it may as well have “disappeared”). We can use an absorber to create the simplest imaging system: the pinhole camera.

2.1.1 Simplest Imaging System: Pinhole in Absorber

Consider a 3-D volume of space that contains the object. Occasionally, a ray of light emitted (or reflected) from a location in the volume is selected by the pinhole and reaches the sensor.
• every point in space is “in focus” on the sensor
• the transverse magnification M_T is determined by the relative distances z1 and z2:
  $$M_T = -\frac{z_2}{z_1}$$
• the negative sign means that the image is inverted

The number of rays from the object that actually reach the image is small. The interaction with the sensor requires the quantum model of discrete energy packets, so the number of packets is small if the hole diameter is small. If the object is a uniformly emitting planar source, the numbers of packets measured from different locations in the field are different (Poisson statistics); these numerical variations in what should be identical measurements appear as “noise.” The metric of noise is determined by the mean value μ of the signal and the variation about that mean, which is described by the standard deviation σ. The signal-to-noise ratio is a dimensionless quantity that may be defined many ways, but we’ll use a simple definition that will suit this purpose:

$$\mathrm{SNR} \equiv \frac{\mu}{\sigma} = \frac{\mu}{\sqrt{\mu}} = \sqrt{\mu}$$

More photons lead to larger signals (μ ↑) and larger standard deviation (σ ↑), but the mean increases faster than the standard deviation σ = √μ, so the SNR improves: better statistics and less relative noise.

The “quality” of the image depends on the diameter d0 of the pinhole. The statistics improve if the number of photons is increased, by a larger dose or a larger pinhole. The “blur” quality of the image is better for a smaller pinhole because there is less uncertainty in the ray path.

How to improve?
• Longer exposure time
• Multiple pinholes
• Depth of field
• Redirect rays: reflective pinholes
• Reflection
• Refraction
• Diffraction (wave property), e.g., holography

2.2 First-Order Optics

Of most concern to us will be “first-order,” “paraxial,” or “Gaussian” optics, where the angles of light rays measured relative to the optical axis are assumed to be small, so that the ray heights remain small as the rays propagate down the optical axis. This is the source of the common term “paraxial optics,” meaning that the rays remain near the optical axis.
If the ray angle θ ≅ 0, then we can approximate trigonometric functions by the first terms in their power-series expansions (the “Taylor series”):

$$
\begin{aligned}
f[x] &= \frac{(x-x_0)^0}{0!}\, f[x_0] + \frac{(x-x_0)^1}{1!}\left(\left.\frac{df}{dx}\right|_{x=x_0}\right) + \frac{(x-x_0)^2}{2!}\left(\left.\frac{d^2 f}{dx^2}\right|_{x=x_0}\right) + \cdots + \frac{1}{n!}\left(\left.\frac{d^n f}{dx^n}\right|_{x=x_0}\right)(x-x_0)^n + \cdots \\
&= \sum_{n=0}^{\infty} \frac{(x-x_0)^n}{n!}\, f^{(n)}[x_0]
\end{aligned}
$$

If the base value and the derivatives are evaluated at the origin, we have a “Maclaurin series:”

$$f[x] = \sum_{n=0}^{\infty} \frac{1}{n!}\, f^{(n)}[0]\, x^n$$

The Maclaurin series for the sine is:

$$
\begin{aligned}
\sin[\theta] &= \sum_{n=0}^{\infty} \frac{1}{n!}\left(\left.\frac{d^n}{d\theta^n}\sin[\theta]\right|_{\theta=0}\right)\theta^n \\
&= \frac{1}{0!}\sin[0]\,\theta^0 + \frac{1}{1!}(+\cos[0])\,\theta^1 + \frac{1}{2!}(-\sin[0])\,\theta^2 + \frac{1}{3!}(-\cos[0])\,\theta^3 + \frac{1}{4!}(+\sin[0])\,\theta^4 + \cdots \\
&= 0 + \theta + 0 - \frac{\theta^3}{3!} + 0 + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} - \cdots
\end{aligned}
$$

Note that only odd powers of θ are present in the series for sin[θ], because the sine is an odd (antisymmetric) function that satisfies the condition sin[−θ] = −sin[+θ]. The corresponding series for the even (or symmetric) cosine includes only even powers of θ:

$$\cos[\theta] = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n}}{(2n)!} \implies \lim_{\theta \cong 0}\{\cos[\theta]\} = 1 \implies \cos[\theta] \cong 1 - \frac{\theta^2}{2}$$

So the approximation of the cosine with two terms is the difference of a constant and a parabola. The series for the (odd, antisymmetric) tangent is less commonly known and includes only the odd powers of θ:

$$\tan[\theta] = \theta + \frac{\theta^3}{3} + \frac{2}{15}\theta^5 + \cdots = \sum_{n=1}^{\infty} (-1)^{n-1}\, \frac{2^{2n}\left(2^{2n}-1\right)}{(2n)!}\, B_{2n}\, \theta^{2n-1} \implies \lim_{\theta \cong 0}\{\tan[\theta]\} = \theta$$

where $B_n$ is the n-th Bernoulli number ($B_2 = \tfrac{1}{6}$, $B_4 = -\tfrac{1}{30}$, $B_6 = \tfrac{1}{42}$). The first-, third-, and fifth-order series approximations for the tangent are:

$$
\begin{aligned}
\tan[\theta] &\cong \theta \quad \text{for } \frac{\pi}{2} > |\theta| \gtrsim 0 \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} + \frac{2}{15}\theta^5
\end{aligned}
$$

The validity of these approximations is perhaps more obvious from the graphs, where we can see that sin[θ] ≲ θ and tan[θ] ≳ θ for small positive values of θ.
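The size of the error in these truncated series can also be tabulated numerically. The sketch below is an illustration only (the function names `sin_approx` and `tan_approx` and the sample angles are assumptions, not part of the notes); it compares the first-, third-, and fifth-order approximations written above with the library values of the sine and tangent.

```python
import math


def sin_approx(theta, order=1):
    """Truncated Maclaurin series for sin(theta), up to first, third, or fifth order."""
    value = theta
    if order >= 3:
        value -= theta**3 / 6
    if order >= 5:
        value += theta**5 / 120
    return value


def tan_approx(theta, order=1):
    """Truncated Maclaurin series for tan(theta), up to first, third, or fifth order."""
    value = theta
    if order >= 3:
        value += theta**3 / 3
    if order >= 5:
        value += 2 * theta**5 / 15
    return value


# Relative error of each truncation for a few sample angles (radians).
for theta in (0.1, 0.3, 0.5):
    for order in (1, 3, 5):
        err_sin = (sin_approx(theta, order) - math.sin(theta)) / math.sin(theta)
        err_tan = (tan_approx(theta, order) - math.tan(theta)) / math.tan(theta)
        print(f"theta={theta:.1f} order={order}: "
              f"sin error={err_sin:+.2e}, tan error={err_tan:+.2e}")
```

For positive angles the first-order sine error is positive and the first-order tangent error is negative, which restates numerically that sin[θ] lies below θ and tan[θ] lies above it.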
[Figure: Comparison of θ (black), sin[θ] (red), and tan[θ] (blue) for 0 ≤ θ ≤ +0.5 radians, showing that sin[θ] ≲ θ and tan[θ] ≳ θ over this domain. Horizontal axis: theta.]

The corresponding first-order approximation to the cosine is the unit constant:

$$\lim_{\theta \to 0}\{\cos[\theta]\} = 1$$

[Figure: cos[θ] (red) compared to its first-order approximation, the unit constant (black), for 0 ≤ θ ≤ 0.2 radians, showing that the two are very similar for small values of θ. Horizontal axis: theta.]

The advantage of the first-order approximation is that evaluation of the ray heights and angles becomes simple because the sine and tangent are then proportional to the angle.

2.3 Third-Order Optics

It likely is obvious from the definition of first-order optics that “third-order” optics includes the second term in the expansions:

$$
\begin{aligned}
\sin[\theta] &\cong \theta - \frac{\theta^3}{3!} = \theta - \frac{\theta^3}{6} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\cos[\theta] &\cong 1 - \frac{\theta^2}{2!} = 1 - \frac{\theta^2}{2}
\end{aligned}
$$

[Figure: Comparison of the third-order approximations of sin[θ] (red) and tan[θ] (blue) to the linear term θ (black) for 0 ≤ θ ≤ 0.5 radians. Horizontal axis: theta.]

Note that the third-order approximation for the cosine is a biased parabola:

[Figure: cos[θ] (black) and its third-order approximation 1 − θ²/2 (red) for 0 ≤ θ ≤ 0.5 radians. Horizontal axis: theta.]

The results for ray angles using third-order optics will differ from those of first-order optics; these differences lead to image aberrations.

2.3.1 Higher-Order Approximations

We clearly can add additional terms to the power series that will increase the accuracy of any calculations at the cost of significantly more complexity.

2.4 Notations and Sign Conventions

One of the simplest and most difficult aspects of ray optics is the set of conventions to be adopted for all of the quantities to be measured.
As in many aspects of optics, there are competing choices for conventions that have their own distinct advantages, but that lead to different equations for image locations, etc. We are going to use the directed distance convention, where distances are positive if measured from left to right. The problem becomes remembering which are the points measured “from” and “to,” respectively. The figure shows sign conventions for the different quantities. Note that in all cases, light travels from left to right in all media with positive refractive index (n > 0), so the distances are positive if measured in the direction of light travel and negative if measured in the other direction.

[Figure: Sign conventions for distances, heights, angles, and curvatures. The distance is positive if measured from left to right; the height is positive if the endpoint is above the axis; the angle from the axis or from a normal is positive if measured in the counterclockwise direction (positive θ); and the curvature is positive if its center is to the right of the vertex (intersection of the surface and the optical axis).]

Now consider the example in the figure where an optical system acts on a red “object” (the upright red arrow) located at the object point labeled by O to produce an “image” at O′. The horizontal black line is the line of symmetry of the optical system and is called the “optical axis.”

[Figure: Sign conventions for a specific case: the object height at O is positive, while the image height at O′ is negative. The angle θ of the (blue) ray from the base of the object to the (green) first surface is positive. The radius of curvature R of the first surface is positive.]

The front and rear surfaces of the optical system are shown in green; their intersections with the optical axis are the vertices of the system.
The object space includes all features to the left of the vertex V that is closer to the object, so V is the object-space vertex of the imaging system. Similarly, the image space includes all features to the right of the vertex V′ that is closer to the image O′, so V′ is the image-space vertex. The ray shown in blue from the object O to the green optical surface makes an angle θ measured from the optical axis to the ray; since this angle is measured counterclockwise, it is a positive angle θ > 0. The image-space ray from V′ to O′ measured from the axis is a clockwise angle, so θ′ < 0.

The front surface of the optical system has a radius of curvature R that is measured from the vertex to the center of curvature, i.e., $R = \overline{VC}$, where the overscored pair of letters denotes the distance from the first feature to the second. In this case, the distance from V to C is measured from left to right, so $\overline{VC} \equiv R > 0$. In the same manner, the distance from the rear vertex V′ to its center of curvature C′ is measured from right to left, so $R' \equiv \overline{V'C'} < 0$; R′ is negative in this example.

Two other features are shown in the figure that we have not yet described, one each in object and image space. F and F′ are object-space and image-space focal points, respectively. They are endpoints of the object-space and image-space focal lengths; the other endpoints are either the vertices (if the lenses are “thin”) or the principal points (which we shall label as H and H′, respectively). That discussion will have to wait until later.

We will often have the need to propagate a light ray through an optical system consisting of a set of different thin lenses or a set of surfaces separated by different media. The cascade of calculations requires distances measured from the object to the lens or front surface and from lens or back surface to the image.
The need to express multiple distances will be addressed by both subscripts and “primed” notation, depending on context, where the “unprimed” notation will refer to the distance before the lens or surface and the “primed” notation to that after. When multiple surfaces are needed, the first will be denoted by the subscript “1,” the next by “2,” etc.

Notation can also be a problem. The two different lower-case Greek letters for “phi” (straight φ and cursive ϕ) will be used in different ways: φ represents the “power” of a lens or surface and is measured in reciprocal length, most commonly reciprocal meters (m⁻¹), which is named the diopter. The cursive phi (ϕ) will be used to represent an angle, and therefore is dimensionless. The cursive letter f is used to represent a function, e.g., f [x, y, t], whereas the “straight” letter f will be used to denote the focal length with dimensions of length. This means that:

$$\phi = \frac{1}{f}$$

2.4.1 Nature of Objects and Images:

1. Real Object: Rays incident on the lens are diverging from the source; the object distance is positive
2. Virtual Object: Rays into the lens are converging toward the “source” located “behind” the lens; object distance is negative
3. Real Image: Rays emerging from the lens are converging toward the image; image distance is positive
4. Virtual Image: Rays emerging from the lens are diverging, so that the “image” is behind the lens and the image distance is negative

2.5 Human Eye

Since this course considers optics of imaging systems, and since the images generated by many optical systems are viewed by human eyes, we need to at least introduce the optics of the eye; we will consider it in more detail when we trace rays through the “standard” eye model later. The optics of the human eye include the curved surface (the “cornea,” which exhibits most of the power of the system) and a deformable lens.
The system is intended to form an image on the retina, which is a fixed distance from the cornea. The lens is deformed by action of ciliary muscles to change the plane that is viewed “in focus.” When the muscles are relaxed, the lens is “flatter,” i.e., the radii of curvature of the surfaces are larger. To view an object “close up,” the focal length of the eye lens must be shortened by making the lens shape more spherical. This is accomplished by tightening the ciliary muscles (which is the reason why your eyes get tired after an extended time of viewing objects up close).

If the retina is located “too far” from the cornea, so that the image is “in front” of the retina when the muscles are relaxed, then the eye sees a “blurry” image of distant objects, but nearby objects may be well focused. This is the condition of “nearsightedness” or “myopia.” If the retina is “too close” to the cornea, the image is focused behind it and the eye sees distant objects more sharply (“hyperopia” or “farsightedness”).

2.6 Principle of Least Time

The mathematical model of ray optics is based on a principle stated by Fermat. Long before that, Hero of Alexandria hypothesized a model of light propagation that could be called the principle of least distance:

A ray of light traveling between two arbitrary points traverses the shortest possible path in space.
(Hero of Alexandria)

This statement applies to reflection and transmission through homogeneous media (i.e., the medium is characterized by a single index of refraction). However, Hero’s principle is not valid if the object and observation points are located in different media (as is the normal situation for refraction) or if multiple media are present between the points. In 1657, Pierre Fermat modified Hero’s statement to formulate the principle of least time (which actually works):

A light ray travels the path that requires the least time to traverse.
(Fermat)

The laws of reflection and refraction may be easily derived from Fermat’s principle. A moving ray (or car, bullet, or baseball) traveling a distance s at a velocity v requires t seconds:

$$t = \frac{s}{v}$$

If the ray travels at different velocities for different increments of distance, the total travel time is the summation over the different distances and different velocities:

$$t = \sum_{m=1}^{M} \frac{s_m}{v_m}$$

If we define the velocity of a light ray in a medium of index n to be v = c/n, then:

$$t = \sum_{m=1}^{M} \frac{s_m}{\left(c / n_m\right)} = \frac{1}{c} \sum_{m=1}^{M} \left(n_m s_m\right) \equiv \frac{\ell}{c}$$

where the optical path length ℓ is defined:

$$\ell \equiv \sum_{m=1}^{M} \left(n_m s_m\right)$$

For a single medium, the optical path length is:

$$\ell \equiv n \cdot s$$

Note that the optical path length is longer than the physical path length; it is the distance that a ray would travel in vacuum in the same time that it would take to travel the physical distance s; the optical path is longer than the physical path because light travels more slowly in the medium (n_m ≥ 1). The principle of least time may be restated as: a light ray requires the least time to traverse the path with the shortest optical path length, or:

A ray traverses the route with the shortest optical path length.

This suggests a philosophical question, “How does the light ray know which path to take before it leaves the source?” I leave it to you to ponder this question, but will say that the difficulty in formulating an answer suggests the limitation of the (simple) ray model for light propagation.

2.7 Fermat’s Principle for Reflection

Now consider the path traveled upon reflection that minimizes an easily evaluated optical path length:

Figure: Schematic for determining the angle of reflection using Fermat’s principle. As drawn, the angle θ1 is positive (measured from the normal to the ray) and θ2 is negative (from the normal to the ray). The ray travels in the same medium of index n both before and after reflection.
The components of the optical path length are:

$$\overline{so} = \sqrt{h^2 + x^2}$$

$$\overline{op} = \sqrt{b^2 + (a - x)^2}$$

And the expression for the total optical path length is:

$$\ell = n \cdot \left(\overline{so} + \overline{op}\right) = n \left(\sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2}\right) = \ell[x] \quad \text{(a function of } x\text{)}$$

By Fermat’s principle, the path traveled is the one that minimizes the optical path length ℓ, so the position of o along the x-axis is found by setting the derivative of ℓ with respect to x to zero:

$$\frac{d\ell}{dx} = n \, \frac{d}{dx}\left(\sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2}\right) = 0$$

$$= n \cdot \left(\frac{2x}{2\sqrt{h^2 + x^2}} + \frac{-2(a - x)}{2\sqrt{b^2 + (a - x)^2}}\right)$$

$$\Longrightarrow \frac{x}{\sqrt{h^2 + x^2}} - \frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0$$

$$\Longrightarrow \frac{x}{\sqrt{h^2 + x^2}} = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

From the drawing, note that:

$$\sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}}$$

$$\sin[-\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

$$\Longrightarrow \sin[\theta_1] = \sin[-\theta_2] \Longrightarrow \theta_2 = -\theta_1$$

In words, the magnitudes of the angles of incidence and reflection are equal (as already derived by evaluating Maxwell’s equations at the boundary). The negative sign is necessary because of the sign convention for the angle; the angle is measured from the normal and increases in the counterclockwise direction, but the reversal of the propagation direction of the ray means that it also may be “explained” by assuming that the index of refraction for the image space is the negative of that for the object space.

Figure: Snell’s law for reflection at an interface.

Note that Snell’s law for reflection does not include either refractive index n, which means that the outgoing ray angle is not affected by the different refractive indices of the two media, so the image location and quality are not influenced by the indices. The “amount” of the ray that is reflected IS affected by the two refractive indices via the Fresnel equations, which require the principles of wave optics for explanation. At this point, we will just introduce the relationship without proof.
If light is incident normally to the interface between two media (θ = 0) with refractive indices n1 and n2, the reflectivity of the surface obeys:

$$R = \left(\frac{n_1 - n_2}{n_1 + n_2}\right)^2 \quad \text{if } \theta = 0$$

If the first medium is air with n ≈ 1 and the second is glass with n ≅ 1.5, the reflectivity is:

$$R = \left(\frac{1 - 1.5}{1 + 1.5}\right)^2 = 0.04$$

Note that the reflectivity is the same if the first medium is glass and the second is air:

$$R = \left(\frac{1.5 - 1}{1.5 + 1}\right)^2 = 0.04$$

The reflectivity at different incident angles obeys more complicated expressions, in part because the light must be decomposed into different polarizations depending on the direction of oscillation of the electric field.

2.7.1 Plane Mirrors

Other than perhaps the pinhole, the simplest image forming system is the plane mirror, which is so familiar that it may seem hardly worth mentioning. Clearly its action obeys Snell’s reflection law that θ2 = −θ1, which means that the appearance of an image is “reversed” relative to the object, i.e., the parity of the image is inverted. It also allows introduction of the concepts of object space and image space, which will be used thenceforth and forevermore. The object space is the locus of points where objects may exist, which is all points “in front of” the mirror (real objects) and “behind” the mirror (virtual objects). A real object forms a virtual image “behind” the mirror, and a virtual object forms a real image “in front of” the mirror. In other words, the object and image spaces for reflection by a plane mirror both include the entire 3-D space.

Figure: Object and image space for a plane mirror. Rays diverging from a real object form a virtual image “behind” the mirror, but rays converging to a virtual object “behind” the mirror form a real image “in front of” the mirror.

2.8 Fermat’s Principle for Refraction:

Figure: Schematic for refraction using Fermat’s principle.
In this drawing, both θ1 and θ2 are positive (measured from the normal to the interface in the counterclockwise direction).

The optical path length is:

$$\ell = n_1 \cdot \overline{so} + n_2 \cdot \overline{op} = n_1 \sqrt{h^2 + x^2} + n_2 \sqrt{b^2 + (a - x)^2}$$

By Fermat’s principle, the path traveled is the one for which ℓ is minimized, so we again set the derivative of ℓ with respect to x to zero and identify trigonometric functions for the resulting ratios:

$$\frac{d\ell}{dx} = n_1 \frac{2x}{2\sqrt{h^2 + x^2}} + n_2 \frac{-2(a - x)}{2\sqrt{b^2 + (a - x)^2}} = 0$$

$$\Longrightarrow n_1 \frac{x}{\sqrt{h^2 + x^2}} = n_2 \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

$$\sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}}, \qquad \sin[\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$

$$\Longrightarrow n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \quad \text{(Snell's Law for refraction)}$$

Note that with this sign convention, Snell’s law may be applied to reflection by setting the refractive index of the second medium to be the negative of the first:

$$n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \Longrightarrow n_1 \sin[\theta_1] = -n_1 \sin[\theta_2] \Longrightarrow -\sin[\theta_1] = \sin[\theta_2] \Longrightarrow \theta_2 = -\theta_1$$

The expression of Snell’s law for refraction is general, but we can easily apply the first-order paraxial approximation that sin [θ] ≅ θ if the ray angles are small (θ_n ≅ 0):

$$n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \Longrightarrow n_1 \cdot \theta_1 = n_2 \cdot \theta_2 \quad \text{in paraxial approximation}$$

$$\Longrightarrow \theta_2 = \frac{n_1}{n_2} \cdot \theta_1 \quad \text{in paraxial approximation}$$

2.8.1 Dispersion

Unlike the reflection law, Snell’s law for refraction DOES include the refractive indices. This means that the angle of refraction will change as the indices change, as with wavelength. All (or perhaps I should say ALL) transparent materials exhibit a variation in refractive index with wavelength, which is called dispersion. Note that the features of dispersion depend on the material (e.g., glass). The full explanation of dispersion is beyond the scope of this course, so we will just describe its effects. In a transparent material over the range of visible wavelengths, the refractive index n DECREASES with increasing λ.
In the study of wave optics, this ensures that the phase velocity v_φ = ω/k for the “average” wave is larger than the group or modulation velocity dω/dk. Among other things, this ensures that a signal transmitted as a modulation of a light wave cannot travel at a speed faster than the velocity of light. A schematic dispersion for a hypothetical glass is shown in the figure; note that the magnitude of the slope of the dispersion curve decreases with increasing λ; the curve “flattens out” as λ increases in the visible range.

Figure: Typical dispersion curve for glass at visible wavelengths, showing the decrease in n with increasing λ and the three spectral wavelengths specified by Fraunhofer and used to specify the “refractivity”, “mean dispersion”, and “partial dispersion” of a material.

The refractive indices for several real glasses show an additional feature of dispersion curves: the relationship between the “amount” of dispersion and the refractive index. Glasses with lower refractive index (n ≅ 1.5, the so-called crown glasses) have a “flatter” graph and therefore less dispersion. In other words, n_blue is larger than n_red, but not much larger, so that the smaller the refractive index, the smaller the dispersion. Flint glasses have larger values of the refractive index (n ≅ 1.7) and larger variations across the visible spectrum:

$$\left(n_{blue} - n_{red}\right)_{flint} > \left(n_{blue} - n_{red}\right)_{crown}$$

Figure: Dispersion curves for various optical glasses as a function of wavelength λ in the visible region of the spectrum (measured in Angstroms, where 1 Å = 0.1 nm = 10⁻¹⁰ m, so 4000 Å = 400 nm). The rapid rise in the index at wavelengths in the ultraviolet region is due to the atomic resonances there.
If we use the paraxial approximation for rays in air entering a glass with refractive index n2, the outgoing ray angle θ2 is:

$$\theta_2 = \frac{1}{n_2} \cdot \theta_1 \quad \text{in paraxial approximation}$$

Dispersion ensures that (n2)_blue > (n2)_red, which means that (θ2)_blue < (θ2)_red and the deviation angle δ_blue > δ_red. Since the outgoing ray angles are different for different colors, images will be formed at different distances in different colors. This is the source of chromatic aberration in imaging systems.

Figure: Effect of dispersion on refraction: since the refractive index for red light is smaller, the angle of refraction measured from the normal is larger. Put another way, this means that the deviation angle due to refraction is smaller for red light than for blue light.

In imaging, we often think of dispersion in refractive elements as an unfortunate “bug” in the system, but you probably also know that it can be a very useful feature; it provides a tool for spreading white light into its constituent spectrum in a dispersing prism.

Figure: Dispersing prism with the two refractions, showing that the angle of deviation from the original path is larger for blue light than for red light.

From the figure, note that the angle of deviation of the ray from the original path is larger for blue light due to the dispersion of light:

$$\delta_{blue} > \delta_{red} \quad \text{for prism}$$

The relationship between the wavelength and the deviation angle is complicated for refraction. As a side comment, note that light may also be dispersed into its spectrum by the phenomenon of diffraction in gratings. However, the relationship between the wavelength and the deviation angle for diffraction is very simple: the angle of deviation is proportional to the wavelength (for small angles):

$$\delta \propto \lambda \Longrightarrow \delta_{blue} < \delta_{red} \quad \text{for grating}$$

This means that it is easier to construct an accurate spectrometer based on diffraction than based on refractive dispersion.
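As a rough numerical illustration of the effect of dispersion on refraction, the sketch below applies Snell’s law to a ray passing from air into glass for red and blue light. The indices are the crown-glass values at the Fraunhofer C and F lines tabulated in Section 2.8.2; the function names and the 10° incidence angle are assumptions made only for this example.

```python
import math


def refraction_angle(n1, n2, theta1):
    """Exact Snell's law, n1 sin(theta1) = n2 sin(theta2); angles in radians."""
    return math.asin(n1 * math.sin(theta1) / n2)


def paraxial_refraction_angle(n1, n2, theta1):
    """First-order (paraxial) form, n1 * theta1 = n2 * theta2."""
    return n1 * theta1 / n2


N_AIR = 1.0
N_CROWN_RED = 1.51418   # crown glass at the C line (656.28 nm)
N_CROWN_BLUE = 1.52225  # crown glass at the F line (486.13 nm)

theta1 = math.radians(10.0)  # hypothetical incidence angle

theta2_red = refraction_angle(N_AIR, N_CROWN_RED, theta1)
theta2_blue = refraction_angle(N_AIR, N_CROWN_BLUE, theta1)

# Deviation = change in ray direction caused by the refraction.
delta_red = theta1 - theta2_red
delta_blue = theta1 - theta2_blue

print("refraction angle, red  [deg]:", math.degrees(theta2_red))
print("refraction angle, blue [deg]:", math.degrees(theta2_blue))
print("deviation, red  [deg]:", math.degrees(delta_red))
print("deviation, blue [deg]:", math.degrees(delta_blue))
print("paraxial estimate, red [deg]:",
      math.degrees(paraxial_refraction_angle(N_AIR, N_CROWN_RED, theta1)))
```

The blue ray is refracted to the smaller angle from the normal and therefore deviates more than the red ray, in agreement with the inequalities above; at this modest incidence angle the paraxial estimate stays close to the exact value.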
2.8.2 Refractive Constants for Glasses

The refractive properties of glass are approximately specified by the refractivity and the measured differences in refractive index at the three Fraunhofer wavelengths F, D, and C:

Refractivity: n_D − 1, where 1.5 ≤ n_D ≤ 1.75
Mean Dispersion: n_F − n_C > 0 (difference between blue and red indices)
Partial Dispersion: n_D − n_C > 0 (difference between yellow and red indices)
Abbé Number: ν ≡ (n_D − 1) / (n_F − n_C), the ratio of refractivity and mean dispersion, 25 ≤ ν ≤ 65 (note that larger dispersions result in smaller Abbé numbers)

Glasses are specified by six-digit numbers abcdef, where n_D = 1.abc, to three decimal places, and the Abbé number ν = de.f. Note that larger values of the refractivity mean that the refractive index is larger and thus so is the deviation angle in Snell’s law. A larger Abbé number means that the mean dispersion is smaller and thus there will be a smaller difference in the angles of refraction. Glasses with larger Abbé numbers, smaller indices, and less dispersion are crown glasses, while glasses with smaller Abbé numbers are flint glasses, which are “denser”.

Examples of glass specifications include borosilicate crown glass (BSC), which has a specification number of 517645, so its refractive index in the D line is 1.517 and its Abbé number is ν = 64.5. The specification number for a common flint glass is 619364, so n_D = 1.619 (relatively large) and ν = 36.4 (smallish).
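The glass-number convention described above can be sketched in a few lines of code. This is only an illustration of the arithmetic; the function names are hypothetical, and the indices passed to the encoder are the crown-glass values from the table in this section.

```python
def abbe_number(n_d, n_f, n_c):
    """Abbe number: ratio of refractivity (n_D - 1) to mean dispersion (n_F - n_C)."""
    return (n_d - 1.0) / (n_f - n_c)


def decode_glass_number(code):
    """Split a six-digit glass number 'abcdef' into (n_D, Abbe number)."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError("glass number must be a six-digit string")
    n_d = 1.0 + int(code[:3]) / 1000.0   # n_D = 1.abc
    nu = int(code[3:]) / 10.0            # nu = de.f
    return n_d, nu


def encode_glass_number(n_d, nu):
    """Build the six-digit glass number from n_D and the Abbe number."""
    return f"{round((n_d - 1.0) * 1000):03d}{round(nu * 10):03d}"


# Borosilicate crown example quoted in the text.
n_d, nu = decode_glass_number("517645")
print("n_D =", n_d, " Abbe number =", nu)

# Encode from measured indices (crown-glass values at the D, F, and C lines).
nu_crown = abbe_number(1.51666, 1.52225, 1.51418)
print("glass number:", encode_glass_number(1.51666, nu_crown))
```

Rounding to three digits in each half of the code is an assumption consistent with the worked crown and flint examples in this section.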
Now consider the refractive indices in the three lines for two different glasses: “crown” (with a smaller n) and “flint”:

Line   λ [nm]    n for Crown   n for Flint
C      656.28    1.51418       1.69427
D      589.59    1.51666       1.70100
F      486.13    1.52225       1.71748

The glass specification numbers for the two glasses are evaluated to be:

For the crown glass:

$$\text{refractivity: } n_D - 1 = 0.51666 \cong 0.517$$

$$\text{Abbé number: } \nu = \frac{1.51666 - 1}{1.52225 - 1.51418} \cong 64.0$$

Glass number = 517640

For the flint glass:

$$\text{refractivity: } n_D - 1 = 0.70100 \cong 0.701$$

$$\text{Abbé number: } \nu = \frac{1.70100 - 1}{1.71748 - 1.69427} \cong 30.2$$

Glass number = 701302

Figure: Dispersion curve of a material from very short to very long wavelengths. The index increases with increasing λ as additional resonances are passed, but the index of refraction decreases with increasing wavelength in the visible wavelengths (bold face).

The dispersion curves for optically transparent materials, such as glass and air, exhibit some very similar features, though the details may be significantly different. Starting at very short wavelengths (λ ≈ 0), the refractive index n is approximately unity. In words, the wavelength is so short (and the oscillation frequency so large) that the energy per photon is very large, so that photons pass through the material without interacting with the atoms; the material appears to be vacuum. For longer (but still very short) wavelengths (“hard” X rays), the refractive index actually is slightly less than unity, which means that X rays incident on a prism are refracted away from the prism’s base, rather than towards the base in the manner of visible light. This is the reason why X rays can be totally reflected at grazing incidence, which is the focusing mechanism used in X-ray telescopes (such as Chandra).
As the wavelength of the incident light increases further, though still within the X-ray region, the radiation incident on the material is heavily absorbed; this is the “K-absorption edge” where the energy of the incident X rays is just sufficient to ionize an electron in the innermost atomic “shell” — the “K shell.” For example, the wavelength of this absorption is λK ∼ = 0.67 nm for silicon. Other absorptions occur at yet longer wavelengths (smaller incident photon energies), where electrons in the L and M shells, etc., of the atom are ionized. The spectrum of a material with a large atomic number (and thus several filled electron shells) will exhibit several such resonant absorptions. Ionization of a K-shell electron by an incoming X ray of sufficient energy. This is the reason for the large absorptions of “hard” X rays by materials. Lower-energy (longer-wavelength) X rays will ionize electrons in the L or M shells, thus producing other absorption “edges.” As the wavelength of the incident radiation increases further, into the “far ultraviolet” region of the spectrum, the real part of the refractive index decreases to a value much less than unity within a wide band of anomalous dispersion. The fact that n < 1 in this region may be confusing because it seems that the velocity of light exceeds c, but these waves do not propagate in the material due to the strong absorption (large value of κ). The wavelength of maximum absorption corresponds to the largest of the several “natural oscillation frequencies” of bound electrons in the material. In the visible region of the spectrum, the dispersion curve exhibits the familiar decrease in n with λ that was shown above. For example, the index of air is n ∼ = 1.000279 at λ = 486.1 nm (Fraunhofer’s “F” line) and n ∼ = 1.000276 at λ = 656.3 nm (“C” line). The corresponding values for diamond are nF = 2.4354 and nC = 2.4100. 
The closer the nearest ultraviolet absorption to the visible spectrum, the steeper will be the slope dn/dλ in the visible region and thus the larger the visible dispersion (defined below). The dispersion curve descends yet more steeply somewhere in the near infrared region and then rises due to anomalous dispersion in the vicinity of an infrared absorption band (labeled “λ2” on the graph). For quartz (crystalline SiO2), the center of this band is located at λ ≅ 8.5 μm, but the absorption already is quite strong for wavelengths as short as λ ≅ 4 μm. Most optical materials have several such infrared absorption bands and the “base level” of the index of refraction is larger after each such band. This behavior is confirmed by far-infrared measurements of the refractive index of quartz (crystalline SiO2), which varies over the interval 2.14 ≤ n ≤ 2.40 for 51 μm ≤ λ ≤ 63 μm. The large values of n ensure that the focal length of a convex quartz lens is much shorter at far-infrared wavelengths than at visible wavelengths. As the wavelength is increased still further into the radio region of the spectrum after the last absorption band, the refractive index decreases slowly due to normal dispersion from that last absorption and approaches a limiting value of $\sqrt{\epsilon / \epsilon_0}$.

2.9 Image Formation in the Ray Model

We know that light rays are deviated at interfaces between media with different refractive indices. The goal in this section is to use interfaces of specified shapes to “collect” the light and “reshape” the wavefronts in a way that recreates “images” of the original sources.

2.9.1 Refraction at a Spherical Surface

Optical systems typically are used to form images of the source distribution by constructing optical elements (“lenses”) made out of transparent media with different refractive indices to redirect the electromagnetic radiation.
Until rather recently, lenses were fabricated almost exclusively from glass, which required the optical surfaces to be ground to the desired curvature and polished to remove scratches, etc., from the grinding. Two pieces of glass are typically employed in the grinding process: the “optic” and the “tool.” Water and a grinding compound composed of flecks of some hard substance resembling sand are placed on the surface of one glass and the two surfaces rubbed together with some force applied to the top optic. In the grinding process, the surface that is easiest to fabricate is a sphere, because the two surfaces will be in contact at all translations. Glass is ground out of the center of the top piece and off of the edges of the bottom piece, leaving a concave sphere on top and a convex sphere on the bottom. The “grit” of the grinding compound is reduced gradually to leave a smoother surface. The surface is then polished using very fine “jeweler’s rouge” to produce smooth surfaces of “optical” quality. More recently, optical elements have been fabricated from thin plates cemented over a hollowed-out “grid” to lighten the weight. Also, plastics and other materials have been developed that may be cast to produce optical surfaces of various shapes with minimal polishing.

Figure: Grinding optical surfaces: a slurry of water and grinding compound (e.g., carborundum) is placed between two glass surfaces. The top glass is pushed down and moved around to grind glass from the center region of the top piece. The resulting surfaces must be spherical because they are the only curves that remain in contact at all locations.

Consider the action of a spherical surface of a medium with index n2 on an incident ray in a medium of index n1:

Figure: Refraction at a spherical surface between two media of refractive index n1 and n2. The point source is located at s and its distance to the vertex v is $\overline{sv}$ ≡ z1 > 0.
The distance from vertex v to the observation point p is $\overline{vp}$ ≡ z2 > 0. The physical distance traveled by a ray in medium n1 to the surface is $\overline{sa}$ ≡ ℓ1 and that in medium n2 is $\overline{ap}$ ≡ ℓ2. The radius of curvature of the surface is $\overline{vc}$ = $\overline{ac}$ ≡ R > 0 as drawn.

For emphasis, we repeat that z1, z2, and R are all positive in our convention. The ray intersects the surface at angle ϕ (the “position angle”) measured from the center of curvature c. The optical path length of the ray from s to p through a is:

$$OPL = n_1 \ell_1 + n_2 \ell_2 = n_1 \left(\overline{sa}\right) + n_2 \left(\overline{ap}\right)$$

The triangle △sac has sides ℓ1 and R, with third side z1 + R, while △acp has sides R and z2 − R, with third side $\overline{ap}$ ≡ ℓ2. The physical lengths ℓ1 and ℓ2 may be evaluated from the other two sides and the included angle (ϕ in △sac and π − ϕ in △acp) via the law of cosines:

$$\triangle sac \Longrightarrow \ell_1^2 = (z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]$$

$$\Longrightarrow \ell_1 = \sqrt{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}$$

$$\triangle acp \Longrightarrow \ell_2^2 = (z_2 - R)^2 + R^2 - 2R (z_2 - R) \cos[\pi - \varphi] = (z_2 - R)^2 + R^2 + 2R (z_2 - R) \cos[\varphi]$$

$$\Longrightarrow \ell_2 = \sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}$$

The corresponding optical path length is:

$$OPL = n_1 \ell_1 + n_2 \ell_2 = n_1 \sqrt{(z_1 + R)^2 + R^2 - 2R (R + z_1) \cos[\varphi]} + n_2 \sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}$$

which is obviously a function of the position angle ϕ. We can now apply Fermat’s principle to find the angle ϕ for which the OPL is a minimum:

$$\frac{d (OPL)}{d\varphi} = 0 = \frac{n_1 \cdot 2R (R + z_1) \sin[\varphi]}{2\sqrt{(z_1 + R)^2 + R^2 - 2R (R + z_1) \cos[\varphi]}} + \frac{n_2 \cdot 2R (R - z_2) \sin[\varphi]}{2\sqrt{(z_2 - R)^2 + R^2 - 2R (R - z_2) \cos[\varphi]}}$$

$$= R \sin[\varphi] \left(\frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2}\right)$$

which may be rearranged to:

$$0 = \frac{n_1 (R + z_1)}{\ell_1} + \frac{n_2 (R - z_2)}{\ell_2}$$

$$\Longrightarrow \frac{n_1 R}{\ell_1} + \frac{n_2 R}{\ell_2} = \frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}$$

$$\Longrightarrow \frac{n_1}{\ell_1} + \frac{n_2}{\ell_2} = \frac{1}{R} \left(\frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}\right)$$

This last relation between the physical path lengths ℓ1 and ℓ2 and the distances z1 and z2 is exact.
Now we use the expression for the physical path length ℓ1 to find its ratio relative to the axial distance z1 and use simple algebra to rearrange:

$$\frac{\ell_1}{z_1} = \frac{\sqrt{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}}{z_1}$$

$$= \left(\frac{(z_1 + R)^2 + R^2 - 2R (z_1 + R) \cos[\varphi]}{z_1^2}\right)^{\frac{1}{2}}$$

$$= \left(\frac{z_1^2 + R^2 + 2R z_1 + R^2 - 2R^2 \cos[\varphi] - 2R z_1 \cos[\varphi]}{z_1^2}\right)^{\frac{1}{2}}$$

$$= \left(1 + \left(\frac{2R}{z_1} + \frac{2R^2}{z_1^2}\right) \left(1 - \cos[\varphi]\right)\right)^{\frac{1}{2}}$$

This relation also is exact, but may be approximated by applying a truncated series for cos [ϕ]:

$$\cos[\varphi] = 1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots \cong 1 \quad \text{if } \varphi \cong 0$$

$$\Longrightarrow 1 - \cos[\varphi] = 1 - \left(1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots\right) = \frac{\varphi^2}{2!} - \frac{\varphi^4}{4!} + \frac{\varphi^6}{6!} - \cdots \cong 0 \quad \text{if } \varphi \cong 0$$

This leads to the first-order approximation that the path length and axial length are approximately equal:

$$\frac{\ell_1}{z_1} \cong 1 \Longrightarrow \ell_1 \cong z_1$$

Similarly, we can show that:

$$\ell_2 \cong z_2$$

This paraxial or Gaussian approximation (also called first-order optics because it is based on only the first-order term in the cosine series) is valid only for small ray angles ϕ measured from the optical axis. In words, the optical path lengths of rays that travel along the optical axis and rays that travel “away” from the axis (but still with ϕ ≅ 0) are equal. The simplified imaging equation has the form:

$$\frac{n_1}{\ell_1} + \frac{n_2}{\ell_2} = \frac{1}{R} \left(\frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}\right) \cong (n_2 - n_1) \frac{1}{R}$$

$$\Longrightarrow \frac{n_1}{z_1} + \frac{n_2}{z_2} \cong (n_2 - n_1) \frac{1}{R}$$

This is the paraxial imaging equation for a single surface; clearly it is an approximation to the true equation, and also clearly it is similar to the imaging equation we have already considered.

Object at Infinite Distance

Now consider some pairs of object and image distances z1 and z2. If the object is located at −∞, then:

$$\frac{n_1}{\infty} + \frac{n_2}{z_2} = \frac{n_2}{z_2} \cong (n_2 - n_1) \frac{1}{R}$$

$$\Longrightarrow z_2 \cong \frac{n_2 R}{n_2 - n_1} \equiv f_2 \quad \text{the “image-space focal length”}$$

which is what we “normally” think of as being the focal length of the optic.
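The paraxial single-surface equation can be evaluated directly. The following is a minimal sketch, assuming the sign convention of this section (z1, z2, and R positive as drawn); the function names and the numerical values (an air-to-glass surface with R = 50 mm) are hypothetical and chosen only for illustration.

```python
import math


def image_distance(n1, n2, R, z1):
    """Solve n1/z1 + n2/z2 = (n2 - n1)/R for the image distance z2."""
    power = (n2 - n1) / R
    vergence_out = power - n1 / z1   # n1 / inf evaluates to 0.0
    if vergence_out == 0.0:
        return math.inf              # image at infinity
    return n2 / vergence_out


def image_space_focal_length(n1, n2, R):
    """Image distance for an object at infinity: f2 = n2 R / (n2 - n1)."""
    return n2 * R / (n2 - n1)


n1, n2, R = 1.0, 1.5, 50.0   # hypothetical surface; lengths in mm

print("f2 =", image_space_focal_length(n1, n2, R))
print("z2 for distant object =", image_distance(n1, n2, R, math.inf))
print("z2 for z1 = 300 mm    =", image_distance(n1, n2, R, 300.0))
```

Passing `math.inf` as the object distance reproduces the image-space focal length, which is one quick consistency check on the formula.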
Image at Infinite Distance

If the image is located at +∞, the object distance must be:

$$\frac{n_1}{z_1} \cong (n_2 - n_1) \frac{1}{R} \Longrightarrow z_1 \cong \frac{n_1 R}{n_2 - n_1} \equiv f_1 \quad \text{the “object-space focal length”}$$

$$\frac{n_1}{f_1} = (n_2 - n_1) \frac{1}{R}$$

Also note that:

$$\frac{f_1}{f_2} = \frac{\left(\dfrac{n_1 R}{n_2 - n_1}\right)}{\left(\dfrac{n_2 R}{n_2 - n_1}\right)} = \frac{n_1}{n_2} \Longrightarrow n_1 \cdot f_2 = n_2 \cdot f_1$$

In words, the ratio of the focal lengths in the two spaces (object and image) is the ratio of the indices of refraction in the two spaces.

Rule of Thumb: Estimating focal lengths of converging lenses: For a single positive (converging) lens (i.e., not a lens “system” with multiple elements), it is easy to estimate the focal length of a lens by finding the distance from the lens to the image of a distant bright object. The requirement for “distant” is not critical; forming the image of a ceiling lamp on the floor or a tabletop will give a useful estimate for a positive lens with a short focal length.

2.9.2 Imaging with Spherical Mirrors

The equation for a single refractive surface may be used to derive the focal length of a spherical mirror by setting the refractive index of image space to the negative of that in object space:

$$\phi = \frac{1}{f} = (-n_1 - n_1) \frac{1}{R} = -2 \frac{n_1}{R}$$

In air, the equation for the focal length of a spherical mirror is:

$$f = -\frac{R}{2n} \rightarrow -\frac{R}{2} \quad \text{in air}$$

In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature; the focal length is positive (converging) if R < 0 and negative (diverging) if R > 0, as shown.

Figure: Spherical mirrors: concave mirror with negative radius of curvature $R = \overline{VC} < 0$ makes outgoing light rays converge and so f > 0; convex mirror with positive radius of curvature makes rays diverge and f < 0.

2.10 First-Order Imaging with Thin Lenses

Normally we do not consider the case of an object in one medium with the image in another; usually both object and image are in air and a lens (a “device” composed of material with different refractive index n and curved surfaces) diverts the rays to form the image.
We can derive the formula for the object and image distances if we know the radii of the lens surfaces and the indices of refraction. We merely cascade the formula for a single surface:

$$\text{At first surface: } \frac{n_1}{z_1} + \frac{n_2}{z_1'} = \frac{n_2 - n_1}{R_1}$$

$$\text{At second surface: } \frac{n_2}{z_2} + \frac{n_3}{z_2'} = \frac{n_3 - n_2}{R_2}$$

where z1 is the (usually known) object distance, z1′ is the image distance for rays refracted by the first surface, z2 is the object distance for the second surface, and z2′ is the image distance for rays exiting the second surface (and thus from the lens). For the common “convex-convex” lens, the center of curvature of the first surface is to the right of the vertex, and thus the radius R1 of the first surface is positive. Since the vertex is to the right of the center of curvature of the second surface, then R2 < 0.

If the lens is “thin”, then the ray encounters the second surface immediately after refraction at the first surface, so the ray heights at the two surfaces are the same. The object distance for the second surface is the negated image distance from the first: z2 = −z1′. Put another way, the absolute value of the image distance for the front surface |z1′| is the same as the object distance for the second surface |z2|. If the lens is “thick”, then the object distance for the second surface is different from the image distance for the first, and the ray heights will be different if the ray angle is not zero. The thickness t of the lens must satisfy the relationship:

$$z_1' + z_2 = t \Longrightarrow z_2 = t - z_1' \quad \text{for a thick lens}$$
For a thin lens with $t = 0$:

$$z_2 = 0 - z_1' \implies z_2 = -z_1' \quad \text{for a thin lens}$$

The equations for the two surfaces may be added and the right-hand side rearranged to obtain a single imaging equation for a lens with two surfaces:

$$\left(\frac{n_1}{z_1} + \frac{n_2}{z_1'}\right) + \left(\frac{n_2}{z_2} + \frac{n_3}{z_2'}\right) = \frac{n_2 - n_1}{R_1} + \frac{n_3 - n_2}{R_2} = \frac{n_3}{R_2} - \frac{n_1}{R_1} + n_2\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

For a thin lens with $t = 0$, substitute $z_2 = -z_1'$ so that the two terms in $n_2$ on the left-hand side cancel:

$$t = 0 \implies \frac{n_1}{z_1} + \frac{n_2}{z_1'} + \frac{n_2}{-z_1'} + \frac{n_3}{z_2'} = \frac{n_1}{z_1} + \frac{n_3}{z_2'}$$

$$\frac{n_1}{z_1} + \frac{n_3}{z_2'} = \frac{n_3}{R_2} - \frac{n_1}{R_1} + n_2\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

where the object is immersed in index $n_1$, the lens has index $n_2$, and the image is immersed in index $n_3$. In the usual case of both object and image in air, so that $n_3 = n_1 = 1$, the equation simplifies to:

$$\frac{1}{z_1} + \frac{1}{z_2'} = \frac{1}{R_2} - \frac{1}{R_1} + n_2\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

Note the similarity between this equation and the one we inferred from the derivation of the image plane using wave optics:

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$$

where $z_1$ and $z_2$ are the distances from the object to the lens and from the lens to the image (here called $z_1$ and $z_2'$), and we identify:

$$(n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2'} \qquad \text{(Lensmaker's Equation)}$$

which defines the focal length of the thin lens in terms of its physical parameters. This is the so-called lensmaker's equation for thin lenses IN AIR; it determines the distance $z_2'$ to the image from the object distance $z_1$, the radii of curvature $R_1$ and $R_2$ of the spherical surfaces, and the index of refraction $n_2$ of the glass. Note that the object distance $z_1$ and the image distance $z_2'$ both appear with the same algebraic sign, which may be interpreted as demonstrating an "equivalence" of the object and image because the propagation of light rays may be reversed to exchange the roles of object and image. Corresponding object and image points (or object and image lines or object and image planes) are called conjugate points (or lines or planes).
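The lensmaker's equation for a thin lens in air can be evaluated numerically. The following is a minimal sketch under the sign convention used in these notes; the function names `thin_lens_power`, `thin_lens_focal_length`, and `image_distance` are hypothetical and are introduced only for illustration, and the radii in the example are arbitrary values.

```python
import math


def thin_lens_power(n2, R1, R2):
    """Power 1/f of a thin lens in air: (n2 - 1) * (1/R1 - 1/R2).

    R1 and R2 are signed radii of curvature (positive if the center of
    curvature lies to the right of the vertex). Use math.inf for a plane
    surface; 1/inf evaluates to 0.0 in Python.
    """
    return (n2 - 1.0) * (1.0 / R1 - 1.0 / R2)


def thin_lens_focal_length(n2, R1, R2):
    """Focal length from the lensmaker's equation (inf if the power is zero)."""
    power = thin_lens_power(n2, R1, R2)
    return math.inf if power == 0.0 else 1.0 / power


def image_distance(f, z1):
    """Solve 1/z1 + 1/z2' = 1/f for z2' (assumes finite z1 with z1 != f)."""
    return z1 * f / (z1 - f)


if __name__ == "__main__":
    # Illustrative (assumed) values: glass with n2 = 1.5, |R| = 100 mm.
    f_biconvex = thin_lens_focal_length(1.5, 100.0, -100.0)
    f_plano = thin_lens_focal_length(1.5, 100.0, math.inf)
    print("double-convex focal length [mm]:", f_biconvex)
    print("plano-convex focal length [mm]:", f_plano)
    print("image distance for z1 = 300 mm [mm]:", image_distance(f_biconvex, 300.0))
```

Passing `math.inf` for a plane surface avoids a special case, since its reciprocal contributes no power.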
In the more general case where the refractive index of image space is $n_3 > 1$, so that $n_3 \neq n_1$, the power of the lens follows from the general thin-lens equation, with $n_1$ the index of object space and $n_3$ that of image space:

$$\frac{n_2 - n_1}{R_1} + \frac{n_3 - n_2}{R_2} = \frac{n_1}{z_1} + \frac{n_3}{z_2'}$$

2.10.1 Examples of Thin Lenses

1. Plano-convex lens, curved side forward ("convexo-planar lens"):

$$R_1 = |R_1| > 0, \qquad R_2 = \pm\infty \ \text{(sign has no effect)}$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{|R_1|} - \frac{1}{\infty}\right) = \frac{n_2 - 1}{|R_1|} > 0$$

If $z_1 = +\infty$, then $z_2' = f > 0$, the focal length:

$$\frac{1}{f} = \phi = \frac{n_2 - 1}{R_1} \quad \text{(system power, measured in m}^{-1} = \text{diopters)}$$

$$f = \frac{R_1}{n_2 - 1} \cong 2 R_1 \quad (\text{since } n_2 \cong 1.5 \text{ for glass})$$

We often use the "power" $\phi = f^{-1}$ (measured in m$^{-1}$ = diopters) instead of the focal length $f$ to describe the lens, since the powers of thin lenses in contact combine by addition, instead of as reciprocals of sums of reciprocals. The power measures the ability of the lens or lens system to deviate rays, i.e., to change the ray angle.

2. Plano-convex lens, plane side forward:

$$R_1 = \pm\infty, \qquad R_2 = -|R_2| < 0$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = -\frac{(n_2 - 1)}{R_2} = +\frac{(n_2 - 1)}{|R_2|} > 0$$

$$f = \frac{|R_2|}{n_2 - 1} \cong 2|R_2|$$

So the focal length of the lens is the same regardless of its orientation (front-to-back). Since the focal lengths for the two configurations (curved side in front of or behind the lens) are the same, you might assume that the same image quality can be expected for the two configurations. This is NOT the case, but the explanation requires the theory of aberrations. At this point, we will just try to give a bit of motivation for another rule of thumb, while postponing the proof.

Rule of Thumb: Orientation of Plano-Convex Lens: When using a plano-convex lens to form an image, the quality of the image is better if the power is more evenly divided between the two surfaces. This means that the curved side of the lens is placed towards the longer conjugate (which usually is towards the object) and the plane side towards the shorter conjugate.
This minimizes the spherical aberration that causes rays from a point object to cross the optical axis at different distances from the lens. This perhaps may be visualized better if we consider the case of a distant object (assume $z_1 = \infty$) and a plano-convex lens with the flat side towards the object. For an object at infinity, the rays are parallel ("collimated") both when they are incident upon and when they exit the flat surface. In other words, the flat side contributes no power to the imaging, so all of the focusing power comes from the curved surface.

Rule of thumb: when using a plano-convex lens, place the curved side towards the longer conjugate to get a better image.

3. Plano-concave lens, plane side forward:

$$R_1 = \pm\infty, \qquad R_2 = +|R_2| > 0$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{\infty} - \frac{1}{+|R_2|}\right) = -\frac{(n_2 - 1)}{|R_2|} < 0$$

$$f = -\frac{|R_2|}{n_2 - 1} \cong -2|R_2|$$

4. Double convex lens with equal radii:

$$R_1 = |R| > 0, \qquad R_2 = -R_1 = -|R|$$

$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{|R|} - \left(-\frac{1}{|R|}\right)\right) = 2\,\frac{(n_2 - 1)}{|R|} > 0$$

$$\frac{1}{f} = \phi = \frac{2 \cdot (n_2 - 1)}{|R|}$$

$$f = \frac{|R|}{2 \cdot (n_2 - 1)} \cong |R| > 0 \quad \text{if } n_2 \cong 1.5$$

2.10.2 Spherical Mirror

The mirror changes the direction of rays by reflection, which obeys the law of reflection: the angle of reflection is the negative of the angle of incidence (measured from the normal to the surface). For a concave spherical mirror, the incident ray angle varies with height above the optical axis. The difference in analysis between the single refractive surface and the mirror may be simplified by recognizing that the mirror "reverses" the direction of propagation of light, which may be modeled by setting $n_2 = -n_1 = -1$:

$$\frac{1}{f} = -\frac{1}{R} - \frac{1}{R} = -\frac{2}{R} \implies f = -\frac{R}{2}$$

In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature.
A concave mirror, whose radius of curvature is negative (center of curvature to the left of the vertex), has a positive focal length.

2.11 Image Magnifications

The most common use for a lens is to change the apparent size of an object (or image) via the magnifying properties of the lens. The mapping of object space to image space "distorts" the size and shape of the image, i.e., some regions of the image are larger and some are smaller than the original object. We can define three types of magnification: transverse, longitudinal, and angular, where the first two describe the impact of the imaging system on lengths that are respectively perpendicular to and parallel to the optical axis, while the last refers to the action on the angles of rays measured from the optical axis. Note that the very name of "magnification" is rather misleading because most imaging systems produce images that are smaller than the object; they actually "minify" the features because the magnifications are smaller than unity.

2.11.1 Transverse Magnification:

The transverse magnification $M_T$ is what we usually think of as magnification; it is the ratio of image to object dimension measured transverse to the optical axis. In the figure, note the two similar triangles $\triangle a_1 b_1 c$ and $\triangle a_2 b_2 c$. The transverse magnification of the image is the ratio of the height of the image to that of the object: $M_T = \dfrac{y_2}{y_1}$.

It is easy to see that:

$$\frac{y_1}{z_1} = \frac{|y_2|}{z_2} = -\frac{y_2}{z_2} \quad (\text{because } y_2 < 0)$$

$$\implies \frac{y_2}{y_1} \equiv M_T = -\frac{z_2}{z_1}$$

If $|M_T|$ is larger than or smaller than unity, the image is magnified or minified, respectively. If $M_T > 0$, the image is upright or erect, and if $M_T < 0$, the image is inverted ("upside down").
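The classification by the magnitude and sign of the transverse magnification can be sketched in a few lines. This is only an illustration of the rule stated above; the helper names `transverse_magnification` and `describe_image` and the sample distances are assumptions, not part of the notes.

```python
def transverse_magnification(z1, z2):
    """M_T = -z2 / z1 for object distance z1 and image distance z2."""
    return -z2 / z1


def describe_image(m_t):
    """Classify an image from M_T: size from |M_T|, orientation from its sign."""
    if abs(m_t) > 1:
        size = "magnified"
    elif abs(m_t) < 1:
        size = "minified"
    else:
        size = "unit size"
    orientation = "erect" if m_t > 0 else "inverted"
    return f"{size}, {orientation}"


if __name__ == "__main__":
    # Assumed example: object 300 mm from the lens, real image 150 mm behind it.
    m_t = transverse_magnification(300.0, 150.0)
    print(m_t, describe_image(m_t))
```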
2.11.2 Longitudinal Magnification:

The longitudinal magnification $M_L$ is the ratio of the "length" or "depth" of the image measured along the optical axis to the corresponding length of the object; it is the ratio of differential elements of length of the image and object in the limit of infinitesimal lengths:

$$M_L = \lim_{\Delta z_1 \to 0} \frac{\Delta z_2}{\Delta z_1} = \frac{dz_2}{dz_1}$$

The expression may be derived by evaluating the total derivative of the lensmaker's equation:

$$\frac{1}{z_1} + \frac{1}{z_2} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

Since the imaging equation relates the reciprocal distances $z_1^{-1}$ and $z_2^{-1}$, the longitudinal magnification varies for different object distances. The total derivative of the left-hand side of the imaging equation is:

$$d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = d\left(\frac{1}{z_1}\right) + d\left(\frac{1}{z_2}\right) = -\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2$$

The derivative of the right-hand side is:

$$d\left[(n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right] = (n - 1) \cdot d\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = 0 \quad (\text{because } n,\ R_1, \text{ and } R_2 \text{ are constants})$$

We combine these to see that:

$$-\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2 = 0 \implies -\frac{1}{z_1^2}\,dz_1 = \frac{1}{z_2^2}\,dz_2 \implies \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2$$

We can now identify the ratio of the two differential lengths along the axis as the longitudinal magnification $M_L$:

$$M_L \equiv \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0$$

The longitudinal magnification is negative because the image moves away from the lens (increasing $z_2$) as the object moves towards the lens (decreasing $z_1$). The longitudinal magnification affects the irradiance of the image (i.e., the "flux density" of the rays at the image); if $|M_L|$ is large, then the light in the vicinity of an on-axis location is "spread out" over a longer longitudinal dimension at the image, which requires the irradiance of the image to decrease.

The scaling of the 3-D "image" along the three axes.
The scaling along the "transverse" axes $x$ and $y$ defines the transverse magnification, while the scaling of the image along the $z$-axis is determined by the longitudinal magnification.

The effect of longitudinal magnification on the irradiance of the image of a uniformly luminous rod of length $ab$. The section at $z_1 = 2f$ is imaged with unit negative transverse magnification at $z_2 = 2f$. Sections of the rod with $z_1 > 2f$ are imaged at $z_2 < 2f$, and the energy density is remapped to account for the nonlinear distance relationship $\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$.

2.11.3 Angular Magnification

This is the ratio of the angles of the outgoing ray and the corresponding incoming ray measured relative to the optical axis. Angular magnification is particularly relevant for systems that do not form images, e.g., afocal telescopes. We shall shortly utilize this concept when considering the single-lens magnifier.

$$M_\theta = \frac{\theta_{out}}{\theta_{in}}$$

If $|M_\theta| > 1$, then the angle of the emerging ray is larger than that of the corresponding entering ray. This will increase the angular separation between rays generated by two objects so that it will be easier for the eye to resolve them. The angular magnification is sometimes called the magnifying power of the lens.

2.12 Single Thin Lenses

2.12.1 Positive Lens

The power of a single lens with two surfaces is determined by the lensmaker's equation:

$$\phi = \frac{1}{f} = \phi_1 + \phi_2 = (n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$

The power is positive if $\frac{1}{R_1} > \frac{1}{R_2}$, e.g., whenever $0 < R_1 < |R_2|$. The most common case is the "double convex" lens where $R_1 > 0$, $R_2 < 0$, which means that the ray encounters positive power at both surfaces. The action of a single thin positive lens with known focal length on an object with known location may be solved graphically by sketching three specific rays from the tip of the object:

1. the ray parallel to the optical axis; this ray is refracted by the lens to pass through the image-space focal point F,
2. the ray through the center of the lens, which is not refracted by the thin lens and so maintains the same angle relative to the optical axis, and
3. the ray through the object-space focal point F′ to the lens; this ray is refracted and travels parallel to the optical axis.

The intersection of these three rays (or obviously of any two) is the location of the image of the tip of the object.

The example in the figure closely matches the situation where the image is an inverted replica of the object, so that $h' = -h$ and $M_T = -1$. The two equations that must be satisfied are:

$$z_2 = z_1 \implies M_T = -1$$

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} \implies z_1 = z_2 = 2 \cdot f$$

This situation where the object and image distances are twice the focal length is often called imaging at equal conjugates.

This drawing assumes that the indices of refraction in object and image space are identical. If the indices are different (e.g., if the object is in water and the image in air), then the imaging equation must be modified. With lens index $n$, object-space index $n_1$, and image-space index $n_2$, the cascade of the two single-surface equations gives:

$$\phi = \frac{n - n_1}{R_1} + \frac{n_2 - n}{R_2} = \frac{n_1}{z_1} + \frac{n_2}{z_2}$$

If the refractive indices in object and image spaces are larger than that of the lens, such as a case where the object and image are in glass or water and the lens is "made of" air, the curvatures must be reversed, so that $R_1 < 0$ and $R_2 > 0$ to make a positive lens.

Lens made of rare medium (e.g., air) within a dense medium (e.g., glass, water). The reversal of refractive indices requires inverting the signs of the radii of curvature.

2.12.2 Negative Lens

A lens with negative power at both surfaces may be constructed if $R_1$ is negative and $R_2$ is positive. Two (or more) rays that have passed through a lens with negative power will exhibit a larger divergence on the output side than on the input side.

2.12.3 Meniscus Lenses

A lens with radii of curvature with the same sign on both surfaces is a meniscus lens.
If both radii are positive, then the sum of the powers of the two surfaces is:

$$\phi = \frac{n - 1}{|R_1|} + \frac{1 - n}{|R_2|} = (n - 1) \cdot \left(\frac{1}{|R_1|} - \frac{1}{|R_2|}\right)$$

which may be positive or negative depending on the relative sizes of $R_1$ and $R_2$; the power is positive if $R_2 > R_1$ and negative if $R_2 < R_1$. An example of a meniscus lens with positive power is shown in the figure.

Meniscus lens with positive power; the radii of curvature of both surfaces are positive since the vertices are to the left of the centers, but the fact that $R_2 > R_1$ ensures that $\phi > 0$.

Examples of meniscus lenses with positive and negative power are also shown:

Meniscus lenses with positive and negative powers from the Newport optics catalog. The red lines represent rays that show the respective converging and diverging actions of the lenses.

2.12.4 Simple Microscope (magnifier, "magnifying glass," "loupe")

This is arguably the simplest imaging system, but some of the concepts it illustrates are sufficiently sophisticated that many optickers and/or imaging scientists may not understand them entirely. The simple microscope is a single lens with positive focal length that is used to form a larger image on the retina than could be formed with the eye alone. It also may be called a magnifying glass (if handheld) or a loupe (if designed to rest on the object).

You may know already that the eye lens is deformed by ciliary muscles that are relaxed when the lens is "flatter," i.e., the radii of curvature of the surfaces are larger so the focal length is longer. To view an object "close up," the focal length of the eye lens must be shortened by making the lens shape more spherical. This is accomplished by tightening the ciliary muscles (which is the reason why your eyes get tired after an extended time of viewing objects up close).
The closest distance to an object that appears to be sharply focused by the unaided eye is the near point, which depends on the flexibility of the deformable eye lens and the capability of the ciliary muscles; both vary among individuals, and with age for a single individual. The distance to the near point may be as close as 50 mm $\cong$ 2 in for a young child and in the range of 1000 mm to 2000 mm for an elderly person. This reduction in "accommodation" for close objects is one of the signs of aging. The near point of an "ideal" eye is assumed to be 250 mm $\cong$ 10 in from the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing the angular subtense of fine details for those individuals. For this reason, nearsighted individuals in ancient times (before optical correction) often were attracted to professions requiring fine work, such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in these crafts.

The reference for angular magnification is the angle subtended by the object if viewed at the near point of the average eye, so that $z_1 = 250$ mm. If the object height is $y$, the angle when viewed at the near point is:

$$\theta_{250\,\mathrm{mm}} = \tan^{-1}\left[\frac{y}{250\ \mathrm{mm}}\right] \cong \frac{y}{250\ \mathrm{mm}}$$

where the first-order approximation $\tan[\theta] \cong \theta$ if $\theta \cong 0$ is used in the last step.

Magnifier with Object at Focal Point of Positive Lens

If the object is positioned at the object-space (front) focal point of a positive lens with focal length $f_{lens}$, then the rays from the "tip" of the object are parallel when they exit the lens and so may be viewed "in focus" by an eye with a relaxed lens, as for an object at an infinite distance away. The angle subtended by the object one focal length away is:

$$\theta_{lens} = \tan^{-1}\left[\frac{y}{f_{lens}}\right] \cong \frac{y}{f_{lens}}$$

Magnifier with object at focal point of lens.
Figure (a) at top shows the angle $\theta_{250\,\mathrm{mm}}$ subtended by the object when located at the near point; (b) shows the angle $\theta_{lens}$ subtended by the object when located at the object-space focal point of the lens. The blue ray in (b) emerges parallel to the optic axis, which shows that the object distance $z_1 = f$.

The angular magnification or magnifying power of the magnifier is the ratio of the angle subtended by the object when viewed at the closer distance through the lens to the angular subtense viewed at the near point:

$$M_\theta = \frac{\theta_{lens}}{\theta_{250\,\mathrm{mm}}} = \frac{\tan^{-1}\left[\dfrac{y}{f_{lens}}\right]}{\tan^{-1}\left[\dfrac{y}{250\ \mathrm{mm}}\right]} \cong \frac{\left(\dfrac{y}{f_{lens}}\right)}{\left(\dfrac{y}{250\ \mathrm{mm}}\right)}$$

$$M_\theta = \frac{250\ \mathrm{mm}}{f_{lens}}, \quad \text{object at focal point}$$

If the focal length of the magnifying lens is, say, $f = 50$ mm, then the magnifying power of the lens for the object at the focal point is:

$$M_\theta = \frac{250\ \mathrm{mm}}{50\ \mathrm{mm}} = 5$$

Magnifier with Image Formed at Near Point

We can instead use the magnifying lens held close to the eye to form a virtual image at the near point of the eye. This means that the distance from the lens to the virtual image formed by the lens is the distance to the near point: $\overline{V'O'} = z_2 = -250$ mm. Substitute this distance into the imaging equation:

$$\frac{1}{z_1} + \frac{1}{-250\ \mathrm{mm}} = \frac{1}{f} \implies \frac{1}{z_1} = \frac{1}{f} + \frac{1}{250\ \mathrm{mm}} \implies z_1 = \frac{250\ \mathrm{mm} \cdot f}{250\ \mathrm{mm} + f}$$

The angle subtended by the object at the near point is the same as before:

$$\theta_{250\,\mathrm{mm}} = \tan^{-1}\left[\frac{y_1}{250\ \mathrm{mm}}\right] \cong \frac{y_1}{250\ \mathrm{mm}}$$

but the angle subtended by the image when positioned at the near point and viewed through the lens is different:

$$\theta_{lens} = \tan^{-1}\left[\frac{y_2}{|-250\ \mathrm{mm}|}\right] \cong \frac{y_2}{250\ \mathrm{mm}} = \frac{y_1}{z_1}$$

where the similarity of the triangles has been used. This expression may be recast by substituting the expression for $z_1$:

$$\theta_{lens} \cong \frac{y_1}{z_1} = \frac{y_1}{\left(\dfrac{250\ \mathrm{mm} \cdot f}{250\ \mathrm{mm} + f}\right)} = y_1 \cdot \left(\frac{250\ \mathrm{mm} + f}{250\ \mathrm{mm} \cdot f}\right)$$

The magnifying power is:

$$M_\theta = \frac{\theta_{lens}}{\theta_{250\,\mathrm{mm}}} = \frac{y_1 \cdot \left(\dfrac{250\ \mathrm{mm} + f}{250\ \mathrm{mm} \cdot f}\right)}{\left(\dfrac{y_1}{250\ \mathrm{mm}}\right)} = \frac{250\ \mathrm{mm} + f}{f}$$

$$M_\theta = \frac{250\ \mathrm{mm}}{f_{lens}} + 1, \quad \text{image at near point}$$

Magnifier with image at near point of eye.
The top figure again shows the angle $\theta_{250\,\mathrm{mm}}$ subtended by the object when located at the near point. The second figure shows the image at the near point, which is more distant than the object.

2.13 Systems of Thin Lenses

The images produced by systems of thin lenses may be located by finding the "intermediate" image produced by the first lens, which becomes in turn the object for the second lens, which generates an image that is the object for the third lens, etc. This type of analysis also may be applied directly to the more realistic case of "thick" lenses, where the first "lens" actually represents the first surface of the thick lens and the light propagates through the glass between the surfaces. Though straightforward, this "sequential" solution for the image may be tedious and also not very illuminating (pun intended) about the action of the system of lenses. The object distance for the $n$th lens will be denoted by $z_n$ and the corresponding image distance by the primed quantity $z_n'$.

2.13.1 Two-Lens System

Consider a two-lens system with first lens L1 and second lens L2 separated by the distance $t$. The object for the system shown in the figure is labelled O and the corresponding image O′, the object- and image-space focal points are F and F′, and the object- and image-space vertices (first and last surfaces of the system) are V and V′.

Imaging by a system of two thin lenses L1 and L2 separated by the distance $t$. The object and image distances for the first lens are $z_1$ and $z_1'$ and for the second lens are $z_2$ and $z_2'$.

From the diagram, we see that the image distance $z_1'$ from the first lens, the object distance $z_2$ for the second lens, and the lens separation $t$ are related by:

$$z_1' + z_2 = t$$

so the object distance for the second lens is $z_2 = t - z_1'$.
The imaging equation for the first lens determines $z_1'$:

$$\frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1} \implies \frac{1}{z_1'} = \frac{1}{f_1} - \frac{1}{z_1} = \frac{z_1 - f_1}{z_1 f_1} \implies z_1' = \frac{z_1 f_1}{z_1 - f_1}$$

If $z_1 = \infty$, then:

$$z_1' = \lim_{z_1 \to \infty} \frac{z_1 f_1}{z_1 - f_1} = f_1 \cdot \lim_{z_1 \to \infty} \left(\frac{z_1}{z_1 - f_1}\right) = f_1 \cdot 1 = f_1$$

In words, the image distance from the first lens for an object at $\infty$ is the focal length of the first lens, as it should be. The object distance to the second lens is $z_2 = t - z_1'$, which may be rewritten in terms of $z_1$, $f_1$, and $t$ for the general case:

$$z_2 = t - z_1' = t - \frac{z_1 f_1}{z_1 - f_1} = \frac{z_1 t - f_1 t - z_1 f_1}{z_1 - f_1} = \frac{z_1 (t - f_1) - f_1 t}{z_1 - f_1}$$

In the limit of infinite object distance, the object distance to the second lens is:

$$z_2\,[\text{for } z_1 = \infty] = \lim_{z_1 \to \infty} \left(\frac{z_1}{z_1 - f_1} \cdot (t - f_1) - \frac{f_1 t}{z_1 - f_1}\right) = 1 \cdot (t - f_1) - 0 = t - f_1$$

which is the difference between the separation of the lenses and the focal length of the first lens (the distance from the first lens to its image-space focal point); this often is a negative distance (i.e., a virtual object for the second lens).
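The first step of the sequential analysis can be sketched numerically. This is a minimal illustration assuming thin lenses in air and the sign convention of these notes; the function names `image_distance` and `second_lens_object_distance` are hypothetical, and the numbers are sample values only.

```python
import math


def image_distance(f, z):
    """Solve the thin-lens equation 1/z + 1/z' = 1/f for z'.

    An object at infinity is imaged at the focal length; an object at the
    focal point is imaged at infinity.
    """
    if math.isinf(z):
        return f
    if z == f:
        return math.inf
    return z * f / (z - f)


def second_lens_object_distance(f1, t, z1):
    """Object distance for the second lens: z2 = t - z1'."""
    return t - image_distance(f1, z1)


if __name__ == "__main__":
    # Assumed sample values: f1 = 100 mm, separation t = 75 mm.
    print(second_lens_object_distance(100.0, 75.0, math.inf))  # t - f1
    print(second_lens_object_distance(100.0, 75.0, 1000.0))    # finite object
```

A negative result indicates a virtual object for the second lens, as noted in the text.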
In the general case, apply the imaging equation for the second lens and substitute the expression for $z_2$:

$$\frac{1}{z_2'} = \frac{1}{f_2} - \frac{1}{z_2} = \frac{1}{f_2} - \frac{z_1 - f_1}{z_1 (t - f_1) - f_1 t} = \frac{t - \dfrac{z_1}{(z_1 - f_1)}(f_1 + f_2) + \dfrac{f_1 f_2}{(z_1 - f_1)}}{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{(z_1 - f_1)}}$$

$$\implies z_2' = \frac{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{(z_1 - f_1)}}{t - \dfrac{z_1}{(z_1 - f_1)}(f_1 + f_2) + \dfrac{f_1 f_2}{(z_1 - f_1)}} = \frac{f_2 \cdot \left(t - \dfrac{f_1 \cdot z_1}{(z_1 - f_1)}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{(z_1 - f_1)}}$$

The image distance for a specified (non-infinite) object location is called the back focal distance by some authors:

$$BFD = z_2' = \overline{V'O'} = \frac{f_2 \cdot \left(t - \dfrac{f_1 \cdot z_1}{(z_1 - f_1)}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{(z_1 - f_1)}}$$

In the limit of infinite object distance, the BFD becomes the back focal length BFL:

$$\lim_{z_1 \to \infty} [z_2'] = z_2'[f_1, f_2, t; z_1 = \infty] \equiv \overline{V'F'} = \lim_{z_1 \to \infty} \left(\frac{f_2 \cdot \left(t - f_1 \cdot \dfrac{z_1}{(z_1 - f_1)}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{(z_1 - f_1)}}\right)$$

$$= \frac{f_2 \cdot (t - f_1 \cdot 1)}{t - 1 \cdot (f_1 + f_2) - 0 \cdot f_1 f_2} = \frac{t \cdot f_2 - f_1 f_2}{t - (f_1 + f_2)} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$

$$BFL = \overline{V'F'} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$

These complicated expressions for the image distances measured from the second lens, in terms of the two focal lengths $f_1$ and $f_2$, the separation $t$, and the distance $z_1$ from the object to the first lens, are useful, but they tell little on their face about the entire "lens system." We would much prefer establishing relationships from the object to the lens system and from the system to the image. The first step in this analysis is to define an equivalent or effective focal length for the entire system, which is the focal length of the equivalent single thin lens.

2.13.2 Effective (Equivalent) Focal Length

We can use the results just derived to find an expression for the imaging action of a two-lens system by finding the location and focal length of the equivalent single lens that would generate the same image.
This is an important concept, so we will do a rigorous derivation, which is perhaps simplified by adding some details to the figure:

Ray diagram of system of two positive thin lenses to illustrate the concept of "effective" (or "equivalent") focal length $f_{eff}$, back focal length $BFL = z_2' = \overline{V'F'}$, and principal point H′.

The continuations of the input and outgoing rays intersect at B, whose projection onto the optical axis is at H′; this is the location of the equivalent single lens that would generate the same outgoing ray from the incoming ray. The distance from H′, the image-space principal point, to F′ is the image-space effective (or equivalent) focal length:

$$\overline{H'F'} \equiv f_{eff}$$

We have already evaluated the back focal length, which is the image location for an object at infinity:

$$\overline{V'F'} = z_2'[z_1 = \infty] = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$

Compare two sets of similar triangles shown in the figures: $\triangle(AVF_1') \sim \triangle(CV'F_1')$ and $\triangle(BH'F') \sim \triangle(CV'F')$.

From the first pair of triangles $\triangle(AVF_1') \sim \triangle(CV'F_1')$, we can construct ratios of their "heights" and "axial lengths:"

$$\frac{h_1}{\overline{VF_1'}} = \frac{h_2}{\overline{V'F_1'}} \implies \frac{h_2}{h_1} = \frac{\overline{V'F_1'}}{\overline{VF_1'}}$$

Now note that the distance $\overline{VF_1'} = f_1$, while $\overline{V'F_1'}$ may be rewritten:

$$\overline{V'F_1'} = \overline{VF_1'} - \overline{VV'} = f_1 - t$$

so the ratio may be rewritten:

$$\frac{h_2}{h_1} = \frac{f_1 - t}{f_1}$$

From the second pair of similar triangles $\triangle(BH'F') \sim \triangle(CV'F')$, we can define the distance $\overline{H'F'} \equiv f_{eff}$ and $\overline{V'F'} = BFL = z_2'[z_1 = \infty]$, so we now have a second expression for the ratio:

$$\frac{h_2}{h_1} = \frac{\overline{V'F'}}{\overline{H'F'}} = \frac{BFL}{f_{eff}}$$

Equate the two expressions for $h_2/h_1$:

$$\frac{f_1 - t}{f_1} = \frac{BFL}{f_{eff}} \implies \frac{1}{f_{eff}} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1}$$

Now substitute the formula for the back focal length BFL, which is $z_2'$ if $z_1 = \infty$:

$$z_2' = \frac{f_2 \cdot (t - f_1)}{t - (f_1 + f_2)} \implies \frac{1}{z_2'} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2}$$

$$\implies \frac{1}{f_{eff}} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2} \cdot \frac{f_1 - t}{f_1}$$

which may be rearranged to obtain a relationship for the reciprocal of the effective focal length in terms of the reciprocals of the individual focal lengths:

$$\frac{1}{f_{eff}} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{f_2 \cdot f_1} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2}$$

$$\frac{1}{f_{eff}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \implies f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

These two equivalent expressions specify what is certainly the most important equation we have derived to date and arguably the most important to be derived in this class. It determines the effect on the image of separating two thin lenses by some distance $t$. This expression may also be written in terms of the powers of the two lenses, where the power of the $n$th lens is the reciprocal of the focal length, $\phi_n \equiv f_n^{-1}$:

$$\phi_{eff} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$

Note that if

$$t = f_1 + f_2 = \frac{1}{\phi_1} + \frac{1}{\phi_2} = \frac{\phi_1 + \phi_2}{\phi_1 \phi_2}$$

then $f_{eff} = \infty \implies BFL = +\infty$ and $\phi_{eff} = 0$; the object and image are both an infinite distance from the system. The focal points are located at $\pm\infty$ and the system is called afocal. Such a system has infinite focal length and no power, which means that the image of an object at infinity is also at infinity. Since $z_1 = z_2' = \infty$, the transverse magnification is zero. However, such a system exhibits a useful angular magnification, as we shall see.

Back Focal Length and Image-Space Principal Point

We have evaluated the back focal length:

$$BFL = \overline{V'F'} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$

and the system focal length:

$$f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

We now define the image-space principal point H′ to be the point that is located one effective focal length from the image-space focal point, i.e., so that $\overline{H'F'} = f_{eff}$:

$$\overline{H'F'} \equiv f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$

We can think of H′ as the location of the single equivalent thin lens that generates the same outgoing ray that emerges from the two-lens system. For a single thin lens, H′ coincides with the image-space vertex V′, which in turn coincides with the object-space vertex V since the thin lens has thickness $t = 0$.
From the equation for the BFL and the definition of the principal point, we can also specify the distance from the principal point to the vertex:

$$f_{eff} \equiv \overline{H'F'} = \overline{H'V'} + \overline{V'F'} = \overline{H'V'} + BFL$$

$$\implies \overline{H'V'} = f_{eff} - BFL = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} - \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$

$$\overline{H'V'} = \frac{f_2 \cdot t}{(f_1 + f_2) - t}$$

We can (and will) derive corresponding results in the object space, i.e., object-space principal and focal points.

A pair of positive thin lenses showing the image-space principal and focal points H′ and F′, respectively.

Compare Back Focal "Length" and Back Focal "Distance"

As the object distance decreases from $\infty$, the distance from the rear vertex to the image typically increases, so that the BFD for a finite object distance typically is larger than the BFL for an infinite object distance. This can be seen by comparing the two expressions for some specimen focal lengths. For $f_1 = 100$ mm, $f_2 = 25$ mm, and $t = 75$ mm, the focal length of the equivalent single lens is:

$$f_{eff} = \left(\frac{1}{100\ \mathrm{mm}} + \frac{1}{25\ \mathrm{mm}} - \frac{75\ \mathrm{mm}}{100\ \mathrm{mm} \cdot 25\ \mathrm{mm}}\right)^{-1} = +50\ \mathrm{mm}$$

The back focal length (distance from rear vertex to focal point) is:

$$BFL = z_2'[z_1 = \infty] = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{25\ \mathrm{mm} \cdot (75\ \mathrm{mm} - 100\ \mathrm{mm})}{75\ \mathrm{mm} - (100\ \mathrm{mm} + 25\ \mathrm{mm})} = 12.5\ \mathrm{mm}$$

If the object distance is decreased from $z_1 = \infty$ to $z_1 = 1000$ mm, the back focal distance is:

$$BFD = \frac{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{(z_1 - f_1)}}{t - \dfrac{z_1}{(z_1 - f_1)}(f_1 + f_2) + \dfrac{f_1 f_2}{(z_1 - f_1)}} = \frac{f_2 \cdot t - \dfrac{z_1 \cdot f_1 f_2}{z_1 - f_1}}{(t - f_2) - \dfrac{z_1}{z_1 - f_1} \cdot f_1}$$

$$BFD\,[z_1 = 1\ \mathrm{m}] = \frac{25\ \mathrm{mm} \cdot 75\ \mathrm{mm} - \dfrac{1000\ \mathrm{mm} \cdot 100\ \mathrm{mm} \cdot 25\ \mathrm{mm}}{1000\ \mathrm{mm} - 100\ \mathrm{mm}}}{(75\ \mathrm{mm} - 25\ \mathrm{mm}) - \dfrac{1000\ \mathrm{mm}}{1000\ \mathrm{mm} - 100\ \mathrm{mm}} \cdot 100\ \mathrm{mm}}$$

$$= \frac{25\ \mathrm{mm} \cdot 75\ \mathrm{mm} - 100\ \mathrm{mm} \cdot 25\ \mathrm{mm} \cdot \dfrac{1000\ \mathrm{mm}}{(1000\ \mathrm{mm} - 100\ \mathrm{mm})}}{75\ \mathrm{mm} - \dfrac{1000\ \mathrm{mm}}{(1000\ \mathrm{mm} - 100\ \mathrm{mm})}(100\ \mathrm{mm} + 25\ \mathrm{mm}) + \dfrac{100\ \mathrm{mm} \cdot 25\ \mathrm{mm}}{(1000\ \mathrm{mm} - 100\ \mathrm{mm})}} \approx 14.773\ \mathrm{mm} > BFL$$

In words, as the object distance decreases from infinity, the image distance moves "back" away from the focal point.
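The comparison of BFL and BFD above can be checked numerically. The following is a sketch of the closed-form expressions from the text, assuming thin lenses in air; the function names are hypothetical and chosen only for this illustration.

```python
def effective_focal_length(f1, f2, t):
    """f_eff = f1*f2 / ((f1 + f2) - t); undefined for the afocal case."""
    return f1 * f2 / ((f1 + f2) - t)


def back_focal_length(f1, f2, t):
    """BFL = (f1 - t)*f2 / ((f1 + f2) - t), the image distance for z1 = infinity."""
    return (f1 - t) * f2 / ((f1 + f2) - t)


def back_focal_distance(f1, f2, t, z1):
    """BFD for a finite object distance z1 (assumes z1 != f1)."""
    z1_image = z1 * f1 / (z1 - f1)            # image distance from the first lens
    return f2 * (t - z1_image) / ((t - f2) - z1_image)


if __name__ == "__main__":
    f1, f2, t = 100.0, 25.0, 75.0              # values from the example, in mm
    print("f_eff [mm]:", effective_focal_length(f1, f2, t))
    print("BFL   [mm]:", back_focal_length(f1, f2, t))
    print("BFD   [mm]:", back_focal_distance(f1, f2, t, 1000.0))
```

For these values the BFD comes out larger than the BFL, in agreement with the statement that the image moves away from the focal point as the object approaches.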
Front Focal Length

The front focal length (FFL) $\overline{FV}$ is the distance $z_1$ in the case where $z_2' = \infty$. It is calculated by setting the denominator of the expression for $z_2'$ to zero:

$$(t - f_2) - \frac{z_1 f_1}{z_1 - f_1} = 0 \implies \frac{z_1 f_1}{z_1 - f_1} = t - f_2 \implies \frac{z_1}{z_1 - f_1} = \frac{t - f_2}{f_1}$$

$$\implies z_1 f_1 = (t - f_2)(z_1 - f_1) \implies z_1 f_1 = t z_1 - t f_1 - z_1 f_2 + f_1 f_2 \implies z_1 (f_1 + f_2 - t) = f_1 f_2 - t f_1$$

$$\implies \lim_{z_2' \to \infty} z_1 = \overline{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = FFL$$

Note that this expression has the same form as the back focal length except that $f_1$ and $f_2$ are "swapped".

Front Focal Distance

Also note that the front focal distance (FFD) is the axial distance from an object to the first surface (front vertex) of the imaging system; it applies for finite object distances. This is synonymous with the term working distance, a concept often used in microscopy. By the same symmetry, with $z_2'$ the image distance measured from the second lens:

$$FFD = \overline{OV} = \frac{f_1 \cdot \left(t - \dfrac{f_2 \cdot z_2'}{(z_2' - f_2)}\right)}{t - \dfrac{1}{(z_2' - f_2)} \cdot \left(z_2' \cdot (f_1 + f_2) - f_1 f_2\right)}$$

Object-Space Principal Point

We have already shown how to find the location of the equivalent single lens on the "output side" by extending the rays entering and exiting the system until they meet. We can locate the equivalent single lens in "object space" by "reversing" the system and introducing rays from the left again. Since we know the distance from the object-space focal point to the object-space vertex and the effective focal length, we can find the distance from the vertex to the principal point in object space.
$$\overline{FH} = f_{eff} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} = \overline{FV} + \overline{VH} = FFL + \overline{VH} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} + \overline{VH}$$

This implies that the distance from the object-space vertex to the object-space principal point is:

$$\overline{VH} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} - \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t}$$

$$\overline{VH} = \frac{f_1 \cdot t}{(f_1 + f_2) - t}$$

2.13.3 Summary of Distances for Two-Lens System

$$\begin{array}{ll}
f_{eff} = \overline{H'F'} = \overline{FH} & = \dfrac{f_1 \cdot f_2}{(f_1 + f_2) - t} \\[2ex]
BFL = \overline{V'F'} & = \dfrac{f_2 \cdot (f_1 - t)}{(f_1 + f_2) - t} \\[2ex]
\overline{H'V'} = \overline{H'F'} - \overline{V'F'} & = \dfrac{f_2 \cdot t}{(f_1 + f_2) - t} \\[2ex]
FFL = \overline{FV} & = \dfrac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} \\[2ex]
\overline{VH} = \overline{FH} - \overline{FV} & = \dfrac{f_1 \cdot t}{(f_1 + f_2) - t}
\end{array}$$

2.13.4 "Effective Power" of Two-Lens System

The expression for the power of the system composed of two lenses in air with focal lengths $f_1$ and $f_2$ is:

$$\phi_{eff}\,[\mathrm{diopters}] \equiv \frac{1}{f_{eff}\,[\mathrm{m}]} = \frac{1}{f_1\,[\mathrm{m}]} + \frac{1}{f_2\,[\mathrm{m}]} - \frac{t}{f_1 f_2}$$

$$\phi_{eff}\,[\mathrm{diopters}] = \phi_1 + \phi_2 - \phi_1 \phi_2 t$$

Clearly the power is zero if the separation distance $t$ is equal to the sum of the focal lengths; this is the recipe for a telescope. If the two lenses have positive power and the separation is just less than the sum of the focal lengths, the effective focal length can be very large. This is also the case if one of the two lenses has negative power (so that the numerator $f_1 f_2$ is negative) and the separation is just larger than the sum of the focal lengths (so that the denominator is negative and approximately zero).

2.13.5 Lenses in Contact: t = 0

If the lenses are in contact, then $t = 0$ and the front and back focal lengths are equal to the focal length of the "equivalent single thin lens":

$$FFL = BFL = \frac{f_1 f_2}{f_1 + f_2} = f_{eff}, \quad \text{if } t = 0$$

$$\implies \frac{1}{f_{eff}} = \frac{1}{f_1} + \frac{1}{f_2}, \quad \text{if } t = 0$$

Two "thin" positive lenses in contact. The focal length of the system is shorter than the focal length of either lens, and may be evaluated to see that $f_{eff} = \frac{f_1 f_2}{f_1 + f_2}$. The image-space principal point is the location of the "equivalent thin lens". Since both lenses are "thin", the principal point coincides with the locations of both lenses, so that V′ = H′ = H = V.
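The five distances in the summary share the denominator $(f_1 + f_2) - t$, so they can be collected in one small helper. This is a sketch only: the function name `two_lens_distances` and the dictionary keys are hypothetical labels, and the afocal case is reported as an error because every expression diverges there.

```python
def two_lens_distances(f1, f2, t):
    """Return the summary distances for two thin lenses in air separated by t.

    Keys: f_eff, BFL (V'F'), HpVp (H'V'), FFL (FV), VH.
    """
    denom = (f1 + f2) - t
    if denom == 0:
        # t = f1 + f2: zero power, focal points at infinity.
        raise ValueError("afocal system: t equals f1 + f2")
    return {
        "f_eff": f1 * f2 / denom,
        "BFL": f2 * (f1 - t) / denom,
        "HpVp": f2 * t / denom,
        "FFL": f1 * (f2 - t) / denom,
        "VH": f1 * t / denom,
    }


if __name__ == "__main__":
    # Lenses in contact (t = 0): FFL = BFL = f_eff and both principal
    # points sit at the common vertex. The focal lengths are sample values.
    for name, value in two_lens_distances(100.0, 50.0, 0.0).items():
        print(name, value)
```

Setting `t = 0` reproduces the lenses-in-contact result, which gives a quick sanity check of the general expressions.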
The power of the system composed of two thin lenses in contact is the sum of the powers:

$$\varphi_{\mathrm{eff}}\,[\text{diopters}] = \varphi_1 + \varphi_2 - \varphi_1 \varphi_2 \cdot 0 = \varphi_1 + \varphi_2 \quad \text{for two thin lenses in contact}$$

This is the assumed system for the magnifier with the lens held “close to the eye.”

2.13.6 Positive Lenses Separated by t < f1 + f2

If two positive thin lenses are separated by less than the sum of the focal lengths, the image-space focal point $\mathrm{F'}$ is closer to the first lens than it would have been had the second lens been absent. In the case shown, the effective focal length of the system is $f_{\mathrm{eff}} < f_1$ (which holds whenever $t < f_1$). We can apply the equation for $f_{\mathrm{eff}}$ to this case to see that:

$$f_{\mathrm{eff}} = \frac{f_1 f_2}{(f_1 + f_2) - t} > 0 \quad \text{if } f_1 + f_2 > t > 0$$

[Figure: A pair of positive thin lenses separated by less than the sum of the focal lengths.]

Consider a specific example with $f_1 = 100$ mm, $f_2 = 50$ mm, and $t = 75$ mm. The focal length of the equivalent single lens is:

$$f_{\mathrm{eff}} = \frac{f_1 f_2}{(f_1 + f_2) - t} = \frac{(100\ \text{mm})(50\ \text{mm})}{(100\ \text{mm} + 50\ \text{mm}) - 75\ \text{mm}} = \frac{200}{3}\ \text{mm} = 66\tfrac{2}{3}\ \text{mm}$$

The image formed by the first lens is located at its focal point:

$$z_1' = \left(\frac{1}{f_1} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{100\ \text{mm}} - \frac{1}{\infty}\right)^{-1} = 100\ \text{mm}$$

The object distance to the second lens is therefore the difference $t - z_1'$:

$$z_2 = t - z_1' = 75\ \text{mm} - 100\ \text{mm} = -25\ \text{mm}$$

The image of an object located at $z_1 = \infty$ appears at $z_2'$:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{50\ \text{mm}} - \frac{1}{-25\ \text{mm}}\right)^{-1} = \frac{50}{3}\ \text{mm} = 16\tfrac{2}{3}\ \text{mm}$$

so that $\mathrm{V'F'} = 16\tfrac{2}{3}$ mm, measured from the rear vertex $\mathrm{V'}$ of the system. We already know that the system focal length is $66\tfrac{2}{3}$ mm, so the image-space principal point $\mathrm{H'}$ (the position of the equivalent thin lens) is located $66\tfrac{2}{3}$ mm IN FRONT of the system focal point, i.e., 50 mm in front of the second lens and 25 mm behind the first lens.
$$\mathrm{H'F'} = f_{\mathrm{eff}} = 66\tfrac{2}{3}\ \text{mm}$$
$$\mathrm{V'F'} = \mathrm{BFL} = 16\tfrac{2}{3}\ \text{mm}$$
$$\mathrm{H'V'} = \mathrm{H'F'} - \mathrm{V'F'} = 66\tfrac{2}{3}\ \text{mm} - 16\tfrac{2}{3}\ \text{mm} = 50\ \text{mm}$$

We have already shown how to find the location of the equivalent single lens on the “output side” by extending the rays entering and exiting the system until they meet. We can locate the equivalent single lens in “object space” by “reversing” the system, as shown in the figure. The “first” lens in the system is now (what we have called the second lens) $L_2$ with $f_2 = 50$ mm. The “second” lens is $L_1$ with $f_1 = 100$ mm and the separation is $t = 75$ mm. The resulting effective focal length remains unchanged at $f_{\mathrm{eff}} = \frac{200}{3}$ mm $= 66\tfrac{2}{3}$ mm. If we bring in a ray from an object at $\infty$, the “intermediate” image formed by $L_2$ is located at the focal point of $L_2$:

$$z_1' = \left(\frac{1}{f_2} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{50\ \text{mm}} - \frac{1}{\infty}\right)^{-1} = 50\ \text{mm}$$

Thus the object distance to $L_1$ is:

$$z_2 = t - z_1' = 75\ \text{mm} - 50\ \text{mm} = +25\ \text{mm}$$

The image of the object at $z_1 = \infty$ produced by the entire system is located at $z_2'$:

$$z_2' = \left(\frac{1}{f_1} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{100\ \text{mm}} - \frac{1}{+25\ \text{mm}}\right)^{-1} = -\frac{100}{3}\ \text{mm} = -33\tfrac{1}{3}\ \text{mm}$$

measured from the “second” lens $L_1$ (or equivalently from the second vertex). The image distance is negative, so the image lies on the incoming side of the second lens and is thus virtual. The object-space principal point $\mathrm{H}$ is the point such that the distance $\mathrm{FH} = f_{\mathrm{eff}} = 66\tfrac{2}{3}$ mm, which means that in the reversed system $\mathrm{H}$ is located $33\tfrac{1}{3}\ \text{mm} + 66\tfrac{2}{3}\ \text{mm} = 100$ mm IN FRONT of $L_1$, i.e., 25 mm in front of $L_2$. When we “re-reverse” the system to graph the object- and image-space principal points, $\mathrm{H}$ is located “behind” the lens $L_2$, as shown in the graphical rendering of the entire system:

[Figure: The principal and focal points of the two-lens imaging system in both object and image spaces. The object-space principal point is the location of the equivalent thin lens if the imaging system is reversed.]
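The “reverse the system” procedure just described can be sketched in a few lines: trace a collimated beam through the two lenses in the original order to get $\mathrm{V'F'}$, then swap the lenses and trace again. This is only an illustrative sketch; the helper names `image_distance` and `focus_from_infinity` are assumptions introduced here, and the numbers are those of the example with $f_1 = 100$ mm, $f_2 = 50$ mm, $t = 75$ mm.

```python
from fractions import Fraction

def image_distance(z, f):
    """Thin-lens image distance from 1/z + 1/z' = 1/f (z=None: object at infinity)."""
    if z is None:
        return Fraction(f)
    return 1 / (Fraction(1) / f - Fraction(1) / z)

def focus_from_infinity(f_front, f_back, t):
    """Distance from the back lens to the focus of a collimated input beam."""
    z1_prime = image_distance(None, f_front)   # intermediate image at focal point
    z2 = t - z1_prime                          # object distance for the back lens
    return image_distance(z2, f_back)

f1, f2, t = 100, 50, 75                        # mm, values from the text
f_eff = Fraction(f1 * f2, f1 + f2 - t)

bfl = focus_from_infinity(f1, f2, t)           # original orientation: V'F'
hp_vp = f_eff - bfl                            # H'V' = H'F' - V'F'

rev = focus_from_infinity(f2, f1, t)           # reversed system: L2 first, then L1
h_to_l1 = f_eff - rev                          # principal point to L1 (reversed system)

print(bfl, hp_vp, rev, h_to_l1)                # 50/3 50 -100/3 100
```

The last value, 100 mm, is the separation between $L_1$ and the object-space principal point $\mathrm{H}$, which places $\mathrm{H}$ 25 mm beyond $L_2$ in the original orientation.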
We can now use these locations of the equivalent thin lens in the two spaces to locate the images by applying the thin-lens (Gaussian) imaging equation, BUT the distances $z$ and $z'$ are respectively measured from the object $\mathrm{O}$ to the object-space principal point $\mathrm{H}$ and from the image-space principal point $\mathrm{H'}$ to the image point $\mathrm{O'}$. The process is demonstrated after first locating the images via a direct calculation.

“Brute Force” Calculation of Image

Now consider the location and magnification of the image created by the original two-lens imaging system (with $L_1$ in front) for an object located 1000 mm in front of the system (so that $\mathrm{OV} = 1000$ mm). We can locate the image step by step.

Intermediate image created by $L_1$:

$$z_1' = \left(\frac{1}{f_1} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{100\ \text{mm}} - \frac{1}{1000\ \text{mm}}\right)^{-1} = \frac{1000}{9}\ \text{mm} \cong 111.11\ \text{mm}$$

Transverse magnification of intermediate image:

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\frac{1000}{9}\ \text{mm}}{1000\ \text{mm}} = -\frac{1}{9}$$

Distance from intermediate image to $L_2$:

$$z_2 = t - z_1' = 75\ \text{mm} - \frac{1000}{9}\ \text{mm} = -\frac{325}{9}\ \text{mm} \cong -36.11\ \text{mm}$$

Distance from $L_2$ to final image:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{50\ \text{mm}} - \frac{1}{-\frac{325}{9}\ \text{mm}}\right)^{-1} = +\frac{650}{31}\ \text{mm} \cong +20.97\ \text{mm}$$

Transverse magnification of second image:

$$(M_T)_2 = -\frac{\frac{650}{31}\ \text{mm}}{-\frac{325}{9}\ \text{mm}} = +\frac{18}{31}$$

The transverse magnification of the image from the entire system is the product of the transverse magnifications from each lens:

$$M_T = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{9}\right) \cdot \left(+\frac{18}{31}\right) = -\frac{2}{31}$$

which indicates that the image is minified and inverted.

Imaging Equation using Principal Points

We have just seen that the object- and image-space principal points are the “reference” locations from which the system focal length is measured:

$$f_{\mathrm{eff}} = \mathrm{FH} = \mathrm{H'F'}$$

In exactly the same way, these principal points are the “reference” locations from which the object and image distances are measured:

$$z = \mathrm{OH}, \qquad z' = \mathrm{H'O'}$$

The ray entering the system can be modeled as traveling from the object $\mathrm{O}$ to the object-space principal point $\mathrm{H}$.
The resulting outgoing (image) ray travels from the image-space principal point $\mathrm{H'}$ to the image point $\mathrm{O'}$. This may seem a little “weird”, but actually makes perfect sense if we relate the measurements to the equation for a single thin lens. In that situation, focal lengths are measured from the object-space focal point to the thin lens and from the lens to the image-space focal point. In other words, the object- and image-space vertices $\mathrm{V}$ and $\mathrm{V'}$ of a thin lens coincide with the principal points $\mathrm{H}$ and $\mathrm{H'}$. We know that an object located at the lens ($z = 0$) generates an image at the lens ($z' = 0$) with magnification of $+1$; the heights of the object and image at the principal points are identical. In the realistic system where the object- and image-space principal points are at different locations, the image of an object located at the object-space principal point is formed at the image-space principal point with unit transverse magnification $M_T = +1$. In other words, the principal points are the locations of conjugate points with unit transverse magnification.

Contrast this with the situation where the object distance $\mathrm{OH} = 2f$, so that the image distance $\mathrm{H'O'} = 2f$ with transverse magnification $M_T = -1$:

$$\mathrm{OH} = z = 2f$$
$$\frac{1}{z} + \frac{1}{z'} = \frac{1}{f} \Longrightarrow z' = \mathrm{H'O'} = 2f$$
$$M_T = -\frac{2f}{2f} = -1$$

This case, where the object and image distances are equal so that the transverse magnification is $-1$, is often called imaging at equal conjugates.

Note the positions of the principal and focal planes of the system we just analyzed: $f_1 = +100$ mm, $f_2 = +50$ mm, and $t = +75$ mm. The principal points are “crossed,” which means that the object-space principal point is farther towards image space than the image-space principal point ($\mathrm{H}$ is “behind” $\mathrm{H'}$).
Such a system is more “compact,” because the image is closer to the object-space principal point, so that $\mathrm{F'}$ is closer than $\mathrm{V'O'}$.

[Figure: Principal points of an imaging system. The dashed ray from the object at $\mathrm{O}$ reaches the object-space principal point $\mathrm{H}$ with height $h$. The image ray (solid line) departs from the image-space principal point $\mathrm{H'}$ with the same height $h$ and goes to the image point $\mathrm{O'}$, so that the distances $\mathrm{OH} = z$ and $\mathrm{H'O'} = z'$ satisfy the imaging equation $\frac{1}{z} + \frac{1}{z'} = \frac{1}{f_{\mathrm{eff}}}$.]

Location of Image using Principal Points

We can also analyze this system by using the model of the single thin lens located at the object- and image-space principal points. We have already shown that the focal length of the system is:

$$f_{\mathrm{eff}} = \mathrm{FH} = \mathrm{H'F'} = +\frac{200}{3}\ \text{mm}$$

The object and image distances $z$ and $z'$ of the single lens equivalent to the two-lens system are respectively measured from the principal points: $z = \mathrm{OH}$ and $z' = \mathrm{H'O'}$. The object distance is measured to the object-space principal point, which is 100 mm behind $L_1$ (or $\mathrm{V}$), thus the object distance is the distance from $\mathrm{O}$ to $L_1$ plus 100 mm:

$$z = \mathrm{OV} + \mathrm{VH} = 1000\ \text{mm} + 100\ \text{mm} = 1100\ \text{mm}$$

The single-lens imaging equation may be used to find the image distance $z'$, which now is MEASURED FROM THE IMAGE-SPACE PRINCIPAL POINT $\mathrm{H'}$ (and NOT from the image-space vertex $\mathrm{V'}$):

$$z' = \mathrm{H'O'} = \left(\frac{1}{f_{\mathrm{eff}}} - \frac{1}{z}\right)^{-1} = \left(\frac{1}{\frac{200}{3}\ \text{mm}} - \frac{1}{1100\ \text{mm}}\right)^{-1} = \frac{2200}{31}\ \text{mm} \cong 70.97\ \text{mm}$$

The image distance from the vertex is calculated by subtracting the distance from the image-space principal point $\mathrm{H'}$ to the image-space vertex $\mathrm{V'}$:

$$\mathrm{V'O'} = \mathrm{H'O'} - \mathrm{H'V'} = \frac{2200}{31}\ \text{mm} - 50\ \text{mm} = \frac{650}{31}\ \text{mm} \cong +20.97\ \text{mm}$$

The resulting transverse magnification is:

$$M_T = -\frac{z'}{z} = -\frac{\frac{2200}{31}\ \text{mm}}{1100\ \text{mm}} = -\frac{2}{31} \cong -0.065$$

Both the image distance and the transverse magnification match the values obtained with the step-by-step calculation performed above (as they must!).
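The agreement between the two methods can be confirmed with a short script. This is a minimal sketch under the thin-lens assumptions of the text; the helper `thin_lens` and the variable names are hypothetical, and exact fractions are used so the two results can be compared for equality.

```python
from fractions import Fraction

def thin_lens(z, f):
    """Return (image distance z', transverse magnification) for a thin lens in air."""
    z_prime = 1 / (Fraction(1) / f - Fraction(1) / z)
    return z_prime, -z_prime / z

f1, f2, t, OV = 100, 50, 75, 1000      # mm, values from the worked example

# (a) "Brute force": image through L1, then through L2.
z1p, m1 = thin_lens(Fraction(OV), f1)
z2p, m2 = thin_lens(t - z1p, f2)
brute = (z2p, m1 * m2)                 # (V'O', M_T)

# (b) Equivalent thin lens, distances referenced to H and H'.
d = (f1 + f2) - t
f_eff = Fraction(f1 * f2, d)
VH = Fraction(f1 * t, d)               # vertex to object-space principal point
HpVp = Fraction(f2 * t, d)             # image-space principal point to rear vertex
zp, m = thin_lens(OV + VH, f_eff)      # z = OH, z' = H'O'
principal = (zp - HpVp, m)             # convert H'O' to V'O'

print(brute, principal)
```

Both tuples evaluate to an image distance of 650/31 mm and a magnification of -2/31.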
2.13.7 Cardinal Points

The object-space and image-space focal and principal points are four of the six so-called cardinal points that determine the paraxial properties of an imaging system. There are three pairs of locations where one of each pair is in object space and the other is in image space. The object- and image-space focal points are $\mathrm{F}$ and $\mathrm{F'}$, while the principal points $\mathrm{H}$ and $\mathrm{H'}$ are the locations on the axis in object and image space that are images of each other with transverse magnification $M_T = +1$. The nodal points $\mathrm{N}$ and $\mathrm{N'}$ are the points in object and image space where the angles of the entering and exiting rays are identical, which means that the angular magnification of rays “into” and “out of” the nodal points is $M_\theta = +1$. The principal and nodal points coincide for systems with the object and image spaces in the same medium (e.g., both object space and image space in air). A table of significant points on the axis of a paraxial system is given below:

| Axial Point | Object Space (front) | Image Space (back) | Conjugate Points? (object and image?) |
|---|---|---|---|
| Focal Points | $\mathrm{F}$ | $\mathrm{F'}$ | No |
| Nodal Points | $\mathrm{N}$ | $\mathrm{N'}$ | Yes: $M_\theta = +1$ |
| Principal Points | $\mathrm{H}$ | $\mathrm{H'}$ | Yes: $M_T = +1$ |
| Vertices | $\mathrm{V}$ | $\mathrm{V'}$ | No |
| Object/Image | $\mathrm{O}$ | $\mathrm{O'}$ | Yes: $M_T = -\frac{z'}{z} = -\frac{\mathrm{H'O'}}{\mathrm{OH}}$ |
| Entrance/Exit Pupils | $\mathrm{E}$ | $\mathrm{E'}$ | Yes, $M_T$ varies |
| “Equal Conjugates” | $\mathrm{OH} = 2 f_{\mathrm{eff}}$ | $z' = \mathrm{H'O'} = 2 f_{\mathrm{eff}}$ | Yes: $M_T = -1$ |

2.13.8 Lenses Separated by t = f1 + f2: Afocal System (Telescope)

If the two lenses are separated by the sum of the focal lengths, then an object at $\infty$ forms an image at $\infty$; the system focal length is infinite. Since the focal points are both located at infinity, we say that the system is afocal; it has zero power, i.e., rays that enter the system parallel to one another also exit it parallel to one another. If the focal length of the first lens is longer than that of the second, the system is a telescope.
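The zero-power condition follows directly from the effective-power expression $\varphi_{\mathrm{eff}} = \varphi_1 + \varphi_2 - \varphi_1 \varphi_2 t$ of Section 2.13.4. A minimal numerical sketch (the function name `effective_power` is an assumption made here; powers are in inverse millimetres because the lengths are given in mm):

```python
from fractions import Fraction

def effective_power(f1, f2, t):
    """phi_eff = phi1 + phi2 - phi1*phi2*t for two thin lenses in air.

    With f1, f2, t in mm the result is in 1/mm (multiply by 1000 for diopters).
    """
    phi1, phi2 = Fraction(1) / f1, Fraction(1) / f2
    return phi1 + phi2 - phi1 * phi2 * t

# Two positive lenses (Keplerian layout) separated by f1 + f2 = 125 mm.
print(effective_power(100, 25, 125))    # 0

# Positive objective with negative ocular (Galilean layout), f1 + f2 = 75 mm.
print(effective_power(100, -25, 75))    # 0

# Just inside the afocal separation the focal length is very long.
print(1 / effective_power(100, 25, 124))  # 2500
```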
[Figure: Two thin lenses separated by the sum of their focal lengths. An object located an infinite distance from the first lens forms an “intermediate” image at the image-space focal point $\mathrm{F}_1'$ of the first lens. The second lens forms an image at infinity. Both object- and image-space focal lengths of the equivalent system are infinite: $f = f' = \infty$. The system has “no” focal points; it is afocal.]

The focal length of this system satisfies:

$$\frac{1}{f_{\mathrm{eff}}} = 0 \Longrightarrow \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 \cdot f_2} = 0$$
$$\left(\frac{1}{f_1} + \frac{1}{f_2}\right) - \frac{f_1 + f_2}{f_1 f_2} = 0 \Longrightarrow t = f_1 + f_2$$

which shows that the separation between the two lenses is $t = f_1 + f_2$.

Angular Magnification of a Telescope

The telescope has infinite focal length and therefore no “power,” but you already know that it does “something.” Consider the system’s effect on a ray that enters the first lens at its center at angle $\theta$, so it is transmitted through the lens with no change in angle. The ray crosses the axis at the first lens and travels the distance $z_2 = f_1 + f_2$ to the second lens, where it is deviated to make the angle $\theta'$ with the optical axis. We need to relate $\theta$ and $\theta'$ to evaluate the angular magnification.

[Figure: Angular magnification of a telescope. The red ray strikes the center of the first lens at angle $\theta$ and is transmitted without deviation (because the sides are parallel at the center and the lens is thin). The ray is deviated by the second lens to angle $\theta'$. The angular magnification is the ratio of these two angles.]

From the figure, note that the angle of the entering ray is positive and that of the exiting ray is negative. The angle of the entering ray may be determined from the triangle “between” the lenses with sides $(f_1 + f_2)$ and $h$:

$$\tan[\theta] = \frac{h}{f_1 + f_2} \cong \theta$$

To find the exiting angle $\theta'$, we need to find the distance from the second lens to the point where the ray crosses the axis.
This is easy to find using the imaging equation for a thin lens in air:

$$\frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2} \Longrightarrow z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2}$$

where the object distance $z_2$ is the distance between the lenses:

$$z_2 = t = f_1 + f_2$$

so the image distance for the red ray is:

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(f_1 + f_2) \cdot f_2}{(f_1 + f_2) - f_2} = (f_1 + f_2) \cdot \frac{f_2}{f_1}$$

The angle $\theta'$ satisfies the condition:

$$\tan[\theta'] = -\frac{h}{z_2'} = -\frac{h}{(f_1 + f_2) \cdot \frac{f_2}{f_1}} = -\frac{f_1}{f_2} \cdot \frac{h}{f_1 + f_2} \cong \theta'$$

So the angular magnification is:

$$M_\theta = \frac{\theta'}{\theta} \cong \frac{-\frac{f_1}{f_2} \cdot \left(\frac{h}{f_1 + f_2}\right)}{\left(\frac{h}{f_1 + f_2}\right)} = -\frac{f_1}{f_2}$$

where the negative sign means that the two angles have different algebraic signs. In words, the angular magnification of a telescope is the ratio of the focal lengths of the lenses. If the two lenses are both positive (Keplerian telescope), then the angular magnification is negative. If the objective (first lens) has positive power and the ocular (second lens) is negative (Galilean telescope), then the angular magnification is positive. The angular magnification shows that two distant objects separated by a small angle (such as a double star in the sky) will be separated by a larger angle if viewed through a telescope.

2.13.9 Positive Lenses Separated by t = f1 or t = f2

We now continue the sequence of examples for two positive lenses separated by increasing distances. If two positive lenses are separated by the focal length of the first lens, then the focal length of the system is:

$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - f_1} = \frac{f_1 \cdot f_2}{f_2} = f_1 \quad (\text{if } t = f_1)$$

In words, the focal length of a system of two lenses separated by the focal length of the first lens is equal to the focal length of the first lens. If the two lenses are separated by the focal length of the second lens, then the system focal length is $f_2$:

$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - f_2} = \frac{f_1 \cdot f_2}{f_1} = f_2 \quad (\text{if } t = f_2)$$

Recall that the transverse magnification is approximately proportional to the focal length if the object is distant:

$$M_T = -\frac{z'}{z} = -\frac{1}{z} \cdot \left(\frac{z \cdot f}{z - f}\right)$$
$$= -f \cdot \frac{1}{z - f} = -\frac{f}{z} \cdot \left(\frac{1}{1 - \frac{f}{z}}\right) = -\frac{f}{z} \sum_{n=0}^{+\infty} \left(\frac{f}{z}\right)^{n} = -\sum_{n=0}^{+\infty} \left(\frac{f}{z}\right)^{n+1} \cong -\frac{f}{z} \propto -f \quad \text{if } z \gg f$$

where the formula for the converging geometric series has been used. In words, the transverse magnification of a distant object formed by an imaging system is approximately proportional to the focal length (which is why long focal lengths are used to image distant objects).

For the purpose of this example, we analyze the second case because it is the basis for probably the most common application of imaging optics. The extension to the first case is trivial. Since the focal length of the system is identical to the focal length of the second lens, this raises the question of how the image changes when the front lens is added.

[Figure: Effect of adding lens $L_1$ at the object-space focal point of lens $L_2$, so that $t = f_2$ and $f_{\mathrm{eff}} = f_2$. The upper sketch is the lens $L_2$ alone, and the lower drawing shows the situation with $L_1$ added.]

Consider a specific case with $f_2 = 100$ mm and $f_1 = 200$ mm. If only $L_2$ is present and the object distance is $z_2 = 1100$ mm, then the image distance is:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{100\ \text{mm}} - \frac{1}{1100\ \text{mm}}\right)^{-1} = 110\ \text{mm}$$

The associated transverse magnification is:

$$(M_T)_{L_2\ \text{alone}} = -\frac{z_2'}{z_2} = -\frac{+110\ \text{mm}}{+1100\ \text{mm}} = -\frac{1}{10}$$

Now add $L_1$ at the front focal point of $L_2$ and find the associated image. The object distance to $L_1$ is 1100 mm − 100 mm = 1000 mm.
The first lens forms an image at distance:

$$z_1' = \left(\frac{1}{f_1} - \frac{1}{z_1}\right)^{-1} = \left(\frac{1}{200\ \text{mm}} - \frac{1}{1000\ \text{mm}}\right)^{-1} = 250\ \text{mm}$$

with transverse magnification:

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+250\ \text{mm}}{+1000\ \text{mm}} = -\frac{1}{4}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = 100\ \text{mm} - 250\ \text{mm} = -150\ \text{mm}$$

and the resulting image distance behind lens $L_2$ is:

$$z_2' = \left(\frac{1}{f_2} - \frac{1}{z_2}\right)^{-1} = \left(\frac{1}{100\ \text{mm}} - \frac{1}{-150\ \text{mm}}\right)^{-1} = +60\ \text{mm}$$

Compare the image distances behind lens $L_2$ and the system focal lengths without and with $L_1$ in the system:

$$z_2'\ (\text{without } L_1) = \mathrm{V'O'}\ (\text{without } L_1) = +110\ \text{mm} > \mathrm{V'O'}\ (\text{with } L_1) = +60\ \text{mm}$$

so the image has moved “closer” to lens $L_2$, while

$$f_{\mathrm{eff}}\ (\text{without } L_1) = 100\ \text{mm} = f_{\mathrm{eff}}\ (\text{with } L_1)$$

Now check the other attributes of the image. Recall that $M_T = -0.1$ if using $L_2$ alone. If using both lenses, the transverse magnification of the image formed by the second lens is:

$$(M_T)_2 = -\frac{60\ \text{mm}}{-150\ \text{mm}} = +\frac{2}{5}$$

The magnification of the system is the product of the magnifications due to each lens:

$$M_T\ (\text{with } L_1 \text{ and } L_2) = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{4}\right)\left(+\frac{2}{5}\right) = -\frac{1}{10} = M_T\ (L_2 \text{ alone})$$
$$M_T\ (\text{without } L_1) = M_T\ (\text{with } L_1) \quad \text{if } t = f_2$$

which is the same as for lens $L_2$ alone! The transverse magnification of the system is not changed by the addition of lens $L_1$ with focal length $f_1$ placed at the front focal point of lens $L_2$. If $f_1 > 0$, the image distance measured from $L_2$ is shorter if $L_1$ is present than if $L_1$ is missing. Correspondingly, if the first lens has negative power ($f_1 < 0$), the image distance measured from $L_2$ is longer if $L_1$ is present than if $L_1$ is missing. Put another way, the addition of lens $L_1$ located at the object-space focal point of lens $L_2$ moves the principal points and focal points by equal distances either “forward” (towards $L_2$) if $f_1 > 0$ or “backwards” (farther from $L_2$) if $f_1 < 0$, but the focal length is unchanged.
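The claim that a lens placed at the front focal point of $L_2$ shifts the image without changing the magnification can be tested for both signs of $f_1$. The sketch below is illustrative only; `thin_lens` and `image_behind_l2` are hypothetical helper names, and the negative-lens case ($f_1 = -200$ mm) is an extra value chosen here, not one worked in the text.

```python
from fractions import Fraction

def thin_lens(z, f):
    """Return (image distance z', transverse magnification) for a thin lens in air."""
    z_prime = 1 / (Fraction(1) / f - Fraction(1) / z)
    return z_prime, -z_prime / z

def image_behind_l2(f2, object_to_l2, f1=None):
    """Image distance from L2 and system magnification.

    If f1 is given, L1 is placed at the front focal point of L2 (t = f2).
    """
    if f1 is None:
        return thin_lens(Fraction(object_to_l2), f2)
    t = f2
    z1p, m1 = thin_lens(Fraction(object_to_l2 - t), f1)
    z2p, m2 = thin_lens(t - z1p, f2)
    return z2p, m1 * m2

alone = image_behind_l2(100, 1100)              # L2 alone
with_pos = image_behind_l2(100, 1100, f1=200)   # positive L1 (case in the text)
with_neg = image_behind_l2(100, 1100, f1=-200)  # negative L1 (assumed extra case)

for label, (z2p, mt) in [("alone", alone), ("f1=+200", with_pos), ("f1=-200", with_neg)]:
    print(label, z2p, mt)
```

The three image distances are 110 mm, 60 mm and 160 mm, while the magnification is −1/10 in every case.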
This system demonstrates the principle of eyeglass lenses, where the ideal location for the corrective lens is at the object-space focal point of the eyelens (this is the reason that eyeglasses are “on your nose”). The corrective action of a negative lens $L_1$ placed at the front focal point of $L_2$ moves the image location “backwards” (away from $L_2$) to correct “nearsightedness” without changing the transverse magnification of the imaging system. A positive lens $L_1$ placed at the front focal point of $L_2$ will move the image “forwards” (towards $L_2$) to correct “farsightedness.”

2.13.10 Positive Lenses Separated by t > f1 + f2

If the two positive lenses are separated by more than the sum of the focal lengths, the focal length of the resulting system is negative:

$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} < 0$$

If the object distance is $\infty$, the first lens forms an “intermediate” image at its image-space focal point, i.e., at $z_1' = f_1$. Since the object distance $z_2$ measured from the second lens is larger than $f_2$, a “real” image is formed by the second lens at the system focal point $\mathrm{F'}$. If we extend the exiting ray until it intersects the incoming ray from the object at infinity, we can locate the equivalent single thin lens for the system, i.e., the image-space principal point $\mathrm{H'}$. In this case, this is located farther from the second lens than the focal point. The effective focal length $f_{\mathrm{eff}} = \mathrm{H'F'} < 0$, so the system has negative power.

[Figure: The system composed of two thin lenses separated by $t > f_1 + f_2$. The image-space focal point $\mathrm{F'}$ of the system is beyond the second lens, but the image-space principal point $\mathrm{H'}$ is located even farther from $L_2$. The distance $\mathrm{H'F'} = f_{\mathrm{eff}} < 0$, so the system has negative power!]

2.13.11 Compound Microscopes

We have already discussed the simple magnifier, where the object is located closer to the positive lens than the focal length, thus forming a larger upright virtual image close to the near point of the eye.
In the compound magnifier (more commonly called the compound microscope) formed from two lenses, the objective and eyelens generally have a short positive focal length and a longer focal length, respectively. The focal points of the two lenses are separated by a fixed distance, the “tube length,” which is now standardized by the Royal Microscopical Society as $t = 160$ mm, though some companies manufacture other lengths (e.g., Leitz with $t = 170$ mm). Although it does not matter in this class, it is important to ensure that the objective is used with the correct tube length to minimize aberrations in the final image. Modern microscope systems are often “infinity corrected,” which means that the object is located in the front focal plane of the objective so that the rays emerging are parallel (collimated). This feature allows a beamsplitter to be introduced in the light path for a second eyelens, camera, or other apparatus. A lens within the microscope tube (the “tube lens,” duh) creates an intermediate image that is viewed by the eyelens.

In more traditional microscopes, the object typically is located just beyond the focal point of the short-focal-length positive objective lens (so that the object distance $z_1 \gtrsim f_1$), thus forming a large real inverted image that is positioned at the front focal point of the ocular (eye lens). The eye lens then forms an image at infinity, i.e., the parallel rays emerging from the ocular are viewed by a relaxed eye.

Microscope objectives and eyepieces are labeled by “magnifying powers,” e.g., 10X - 40X for the objective and 10X for the ocular. The total magnification is the product, so that a 10X objective and a 10X ocular yield a magnification of 100X.
The magnifying power of an objective with focal length $f_1$ and tube length 160 mm is:

$$M_1 = -\frac{160\ \text{mm}}{f_1}$$

For example, objectives with these focal lengths have magnifying powers:

$$f_1 = 16\ \text{mm} \Longrightarrow |M_1| = 10\text{X}$$
$$f_1 = 1.6\ \text{mm} \Longrightarrow |M_1| = 100\text{X}$$

The magnifying power of the eyelens is calculated from the same formula used for the simple magnifier:

$$(M_\theta)_2 = \frac{250\ \text{mm}}{f_2}$$

with sample value:

$$f_2 = 25.4\ \text{mm} \Longrightarrow (M_\theta)_2 \cong 10\text{X}$$

The magnifying power of the compound microscope is the product of the two magnifying powers:

$$\text{M.P.} = M_1 \cdot (M_\theta)_2 = \frac{-160\ \text{mm}}{f_1} \cdot \frac{250\ \text{mm}}{f_2} \cong \frac{-160\ \text{mm}}{1.6\ \text{mm}} \cdot \frac{250\ \text{mm}}{25.4\ \text{mm}} \cong -1000\text{X}$$

where again the negative sign means that the image is inverted.

2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations

From the list of distances for a two-lens system:

$$f_{\mathrm{eff}} = \mathrm{H'F'} = \mathrm{FH} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$
$$\mathrm{BFL} = \mathrm{V'F'} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$
$$\mathrm{H'V'} = \mathrm{H'F'} - \mathrm{V'F'} = \frac{f_2 \cdot t}{(f_1 + f_2) - t}$$
$$\mathrm{FFL} = \mathrm{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t}$$
$$\mathrm{VH} = \mathrm{FH} - \mathrm{FV} = \frac{f_1 \cdot t}{(f_1 + f_2) - t}$$

we can determine the impact of the lens separation $t$ for the specific example $f_1 = +100$ mm, $f_2 = +25$ mm:

| $t$ | FFL | BFL | $f_{\mathrm{eff}}$ |
|---|---|---|---|
| 0 mm | +20 mm | +20 mm | +20 mm |
| +25 mm $= f_2$ | 0 mm | +18.75 mm | +25 mm $= f_2$ |
| +50 mm | $-33\tfrac{1}{3}$ mm | $+16\tfrac{2}{3}$ mm | $+33\tfrac{1}{3}$ mm |
| +75 mm | −100 mm | +12.5 mm | +50 mm |
| +100 mm $= f_1$ | −300 mm | 0 mm | +100 mm $= f_1$ |
| +125 mm $= f_1 + f_2$ | ∞ | ∞ | ∞ (afocal) |
| +150 mm | +500 mm | +50 mm | −100 mm |
| +175 mm | +300 mm | +37.5 mm | −50 mm |

[Figure: The effect of varying the lens separation $t$ on the effective focal length $f_{\mathrm{eff}}$ for $f_1 = +100$ mm and $f_2 = +25$ mm, with a magnified view in (b). The system is afocal if $t = f_1 + f_2 = 125$ mm; $f_{\mathrm{eff}} > 0$ for $t < f_1 + f_2$ and $f_{\mathrm{eff}} < 0$ for $t > f_1 + f_2$.]

2.13.13 Systems of One Positive and One Negative Lens

We also consider the case where $f_1 = +100$ mm and $f_2 = -25$ mm.
The focal length for $t = 0$ is:

$$f_{\mathrm{eff}} = \left(\frac{1}{f_1} + \frac{1}{f_2}\right)^{-1} = \left(\frac{1}{+100\ \text{mm}} + \frac{1}{-25\ \text{mm}}\right)^{-1} = -\frac{100}{3}\ \text{mm} \cong -33.33\ \text{mm}$$

The system focal length is negative for $t < f_1 + f_2 = 75$ mm, the system is afocal for $t = 75$ mm, and the focal length is positive for $t > 75$ mm.

[Figure: The effect of varying the lens separation $t$ on the effective focal length $f_{\mathrm{eff}}$ for $f_1 = +100$ mm and $f_2 = -25$ mm, with a magnified view in (b). The system is afocal if $t = f_1 + f_2 = 75$ mm; $f_{\mathrm{eff}} < 0$ for $t < f_1 + f_2$ and $f_{\mathrm{eff}} > 0$ for $t > f_1 + f_2$.]

2.13.14 Newtonian Form of Imaging Equation

We have already seen the familiar Gaussian form of the imaging equation:

$$\frac{1}{z} + \frac{1}{z'} = \frac{1}{f}$$

An equivalent form is obtained by defining the distances $x$ and $x'$ that are the differences between the object and image distances and the focal length:

$$z = x + f \Longrightarrow x = z - f$$
$$z' = x' + f \Longrightarrow x' = z' - f$$

In the case of a real object $\mathrm{O}$ and real image $\mathrm{O'}$ as shown in the figure, both $x$ and $x'$ are positive.

[Figure: The definition of the parameters $x$, $x'$ in the Newtonian form of the imaging equation. For a real image, both $x$ and $x'$ are positive.]

By simple substitution into the imaging equation, we obtain:

$$\frac{1}{f} = \frac{1}{x + f} + \frac{1}{x' + f} = \frac{(x' + f) + (x + f)}{(x + f) \cdot (x' + f)} = \frac{x + x' + 2f}{x x' + (x + x') f + f^2}$$
$$\Longrightarrow f = \frac{x x' + (x + x') \cdot f + f^2}{(x + x') + 2f}$$
$$\Longrightarrow x \cdot x' + f^2 = 2 f^2$$
$$\Longrightarrow x \cdot x' = f^2$$

This is the Newtonian form of the imaging equation. The same expression applies for virtual images, but the sign of the distances must be adjusted, as shown:

[Figure: The parameters $x$, $x'$ of the Newtonian form for a virtual image.]
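A quick numerical check of $x \cdot x' = f^2$ for both a real and a virtual image may be helpful. This is a minimal sketch; the function name `newtonian_product` and the sample object distances are assumptions chosen for illustration (the case $z = f$ is excluded because the image is then at infinity).

```python
from fractions import Fraction

def newtonian_product(z, f):
    """Return x * x' for object distance z and focal length f (same units).

    z' comes from the Gaussian form 1/z + 1/z' = 1/f, then x = z - f and
    x' = z' - f as defined in the text.
    """
    z_prime = 1 / (Fraction(1) / f - Fraction(1) / z)
    x, x_prime = z - f, z_prime - f
    return x * x_prime

print(newtonian_product(1100, 100))  # real image (z > f):    10000
print(newtonian_product(50, 100))    # virtual image (z < f): 10000
```

In the virtual-image case both $x = -50$ mm and $x' = -200$ mm are negative, so the product is still $f^2$.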
2.13.15 Example (1) of Two-Lens System

Find the cardinal points of the two-lens system

$$f_1 = +100\ \text{mm}, \quad f_2 = +25\ \text{mm}, \quad t = +50\ \text{mm}$$

The effective focal length is:

$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} = \frac{100\ \text{mm} \cdot 25\ \text{mm}}{100\ \text{mm} + 25\ \text{mm} - 50\ \text{mm}} = +\frac{100}{3}\ \text{mm} = +33\tfrac{1}{3}\ \text{mm}$$

Now find the location of the focal point from the formula for the back focal length:

$$\mathrm{BFL} = \mathrm{V'F'} = \frac{f_2 \cdot (f_1 - t)}{(f_1 + f_2) - t} = \frac{25\ \text{mm} \cdot (100\ \text{mm} - 50\ \text{mm})}{(100\ \text{mm} + 25\ \text{mm}) - 50\ \text{mm}} = \frac{50}{3}\ \text{mm}$$

Alternatively, we can track a ray from infinity through the system. The image distance from the first lens is $f_1 = +100$ mm, so the object distance to the second lens is

$$z_2 = t - f_1 = 50\ \text{mm} - 100\ \text{mm} = -50\ \text{mm}$$

The image distance from the second lens is:

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-50\ \text{mm}) \cdot (+25\ \text{mm})}{(-50\ \text{mm}) - (+25\ \text{mm})} = \frac{50}{3}\ \text{mm} = \mathrm{V'F'}$$

(as a parenthetical note, this is half the focal length). We can now draw the image-space focal and principal points:

[Figure: Image-space focal and principal points of the two-lens system.]

To find the object-space focal point, we can evaluate the front focal length for $f_1 = +100$ mm, $f_2 = +25$ mm, $t = +50$ mm:

$$\mathrm{FFL} = \mathrm{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\ \text{mm}) \cdot (25\ \text{mm} - 50\ \text{mm})}{(100\ \text{mm} + 25\ \text{mm}) - 50\ \text{mm}} = -\frac{100}{3}\ \text{mm}$$

which says that the object-space focal point is to the right of the object-space vertex. From the effective focal length, we can locate the object-space principal point:

$$\mathrm{FH} = f_{\mathrm{eff}} = +\frac{100}{3}\ \text{mm}$$
$$\mathrm{FV} = \mathrm{FH} + \mathrm{HV}$$
$$-\frac{100}{3}\ \text{mm} = +\frac{100}{3}\ \text{mm} + \mathrm{HV}$$
$$\Longrightarrow \mathrm{HV} = -\frac{100}{3}\ \text{mm} - \frac{100}{3}\ \text{mm} = -\frac{200}{3}\ \text{mm}$$

Alternatively, we “turn the system around” and bring in light from the left.
The image distance from the “first lens” (actually $L_2$) is equal to its focal length:

$$z_1' = f_2 = +25\ \text{mm}$$

So the object distance to the lens with $f_1 = +100$ mm is:

$$z_2 = t - z_1' = 50\ \text{mm} - 25\ \text{mm} = +25\ \text{mm}$$

So the distance from this lens to the image-space focal point of the reversed system is:

$$z_2' = \frac{z_2 \cdot f_1}{z_2 - f_1} = \frac{(+25\ \text{mm}) \cdot (+100\ \text{mm})}{(+25\ \text{mm}) - (+100\ \text{mm})} = -\frac{100}{3}\ \text{mm}$$

The object-space focal point is virtual, and the object-space principal point is located at the distance $f_{\mathrm{eff}}$ behind it in the reversed system.

We can now reverse the second case and plot the four cardinal points ($\mathrm{F}$, $\mathrm{F'}$, $\mathrm{H}$, $\mathrm{H'}$) on the same graph:

[Figure: Object-space and image-space cardinal points for the two-lens system with $f_1 = +100$ mm, $f_2 = +25$ mm, $t = +50$ mm. The ray from infinity on the object side is in red, that from infinity on the image side is in blue.]

In this case, the object-space focal point $\mathrm{F}$ just happens to coincide with the image-space principal point $\mathrm{H'}$ and the same is true for the object-space principal point $\mathrm{H}$ and the image-space focal point $\mathrm{F'}$. This is of no real significance, since the two spaces are independent.

Images from System: (1) Object at Object-Space Focal Point

An object located at the object-space (“front”) focal point of the system is at the distance equal to the FFL from the first lens. In this case:

$$z_1 = \mathrm{FFL} = -\frac{100}{3}\ \text{mm}$$
$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{\left(-\frac{100}{3}\ \text{mm}\right) \cdot 100\ \text{mm}}{\left(-\frac{100}{3}\ \text{mm}\right) - (100\ \text{mm})} = +25\ \text{mm}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +50\ \text{mm} - 25\ \text{mm} = 25\ \text{mm}$$

which is the same as the focal length of the second lens, which means that the image distance from the second lens is infinite (as expected).

Images from System: (2) Object at Object-Space Principal Point

An object located at the object-space (“front”) principal point of the system is at the distance equal to $\mathrm{FFL} - f_{\mathrm{eff}}$ from the first lens.
In this case:

$$z_1 = \mathrm{FFL} - f_{\mathrm{eff}} = -\frac{100}{3}\ \text{mm} - \frac{100}{3}\ \text{mm} = -\frac{200}{3}\ \text{mm}$$
$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{\left(-\frac{200}{3}\ \text{mm}\right) \cdot 100\ \text{mm}}{\left(-\frac{200}{3}\ \text{mm}\right) - (100\ \text{mm})} = +40\ \text{mm}$$
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{40\ \text{mm}}{-\frac{200}{3}\ \text{mm}} = +\frac{3}{5}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +50\ \text{mm} - 40\ \text{mm} = +10\ \text{mm}$$
$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{10\ \text{mm} \cdot 25\ \text{mm}}{10\ \text{mm} - 25\ \text{mm}} = -\frac{50}{3}\ \text{mm}$$
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{-\frac{50}{3}\ \text{mm}}{10\ \text{mm}} = +\frac{5}{3}$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\text{system}} = (M_T)_1 \cdot (M_T)_2 = \left(+\frac{3}{5}\right) \cdot \left(+\frac{5}{3}\right) = +1$$

as expected for the object and image at the principal points.

Images from System: (3) Equal Conjugates

If we move the object so that it is one focal length from the focal point and two focal lengths from the principal point, the object distance is:

$$z_1 = \mathrm{FFL} + f_{\mathrm{eff}} = -\frac{100}{3}\ \text{mm} + \frac{100}{3}\ \text{mm} = 0\ \text{mm}$$
$$z_1' = 0\ \text{mm}$$
$$(M_T)_1 = +1$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +50\ \text{mm} - 0\ \text{mm} = +50\ \text{mm}$$
$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{+50\ \text{mm} \cdot 25\ \text{mm}}{+50\ \text{mm} - 25\ \text{mm}} = +50\ \text{mm}$$
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{50\ \text{mm}}{50\ \text{mm}} = -1$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\text{system}} = (M_T)_1 \cdot (M_T)_2 = (+1) \cdot (-1) = -1$$

as expected for the object and image at the equal-conjugate points.

2.13.16 Example (2) of Two-Lens System: Telephoto Lens

Now consider a system composed of a positive lens and a negative lens separated by just a bit more than the sum of the focal lengths: $f_1 = +100$ mm, $f_2 = -25$ mm, and $t = +80$ mm. The focal length of the equivalent thin lens is $f_{\mathrm{eff}} = 500$ mm:

$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{100\ \text{mm} \cdot (-25\ \text{mm})}{100\ \text{mm} + (-25\ \text{mm}) - 80\ \text{mm}} = +500\ \text{mm}$$

Note that the focal length of the system is MUCH longer than the focal length of either lens. Now locate the image-space focal point and principal point.
For an object located at $\infty$, the BFL is found by substitution into the appropriate equation:

$$\mathrm{BFL} = \mathrm{V'F'} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{(100\ \text{mm} - 80\ \text{mm}) \cdot (-25\ \text{mm})}{(100\ \text{mm} + (-25\ \text{mm})) - 80\ \text{mm}} = 100\ \text{mm}$$

The image of an object at $\infty$ is located 100 mm behind the second lens, and thus 180 mm behind the first lens; this distance $\mathrm{VF'} = 180$ mm is the physical length, which is MUCH shorter than the focal length of 500 mm. This is the advantage of a telephoto lens; the focal length is much longer than the lens itself. The location of the image-space principal point is determined from the back and equivalent focal lengths:

$$\mathrm{H'F'} = \mathrm{H'V'} + \mathrm{V'F'}$$
$$500\ \text{mm} = \mathrm{H'V'} + 100\ \text{mm}$$
$$\mathrm{H'V'} = +400\ \text{mm}$$
$$\mathrm{H'V} = \mathrm{H'V'} - \mathrm{VV'} = 400\ \text{mm} - 80\ \text{mm} = +320\ \text{mm}$$

so the principal point is located 320 mm in front of the object-space vertex $\mathrm{V}$. A sketch of the system and the image-space cardinal points is shown below:

[Figure: Image-space focal and principal points of the telephoto system. The equivalent focal length of the system is $f_{\mathrm{eff}} = +500$ mm, but the image-space focal point is only +100 mm behind the rear vertex $\mathrm{V'}$. The image-space principal point is 500 mm in front of the focal point.]

The object-space focal point is located by applying the expression for the front focal length:

$$\mathrm{FFL} = \mathrm{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\ \text{mm}) \cdot ((-25\ \text{mm}) - 80\ \text{mm})}{(100\ \text{mm} + (-25\ \text{mm})) - 80\ \text{mm}} = +2100\ \text{mm}$$

which is far in front of the object-space vertex $\mathrm{V}$. The object-space principal point is found from:

$$\mathrm{FH} = \mathrm{FV} + \mathrm{VH}$$
$$+500\ \text{mm} = +2100\ \text{mm} + \mathrm{VH}$$
$$\mathrm{VH} = 500\ \text{mm} - 2100\ \text{mm} = -1600\ \text{mm}$$
$$\Longrightarrow \mathrm{HV} = -\mathrm{VH} = +1600\ \text{mm}$$

So the object-space principal point is very far in front of the first vertex.

[Figure: Object-space focal and principal points of the telephoto system. Both are located far ahead of the front vertex $\mathrm{V}$.]
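The telephoto numbers above follow from the same five summary formulas, so they can be reproduced with a short script. This is a sketch under the thin-lens assumptions of the text; the function name `cardinal_distances` and the “telephoto ratio” (physical length divided by focal length) are labels introduced here for illustration.

```python
from fractions import Fraction

def cardinal_distances(f1, f2, t):
    """f_eff, BFL (V'F'), H'V', FFL (FV) and VH for two thin lenses in air."""
    f1, f2, t = Fraction(f1), Fraction(f2), Fraction(t)
    d = (f1 + f2) - t
    return {
        "f_eff": f1 * f2 / d,
        "BFL": f2 * (f1 - t) / d,
        "HpVp": f2 * t / d,
        "FFL": f1 * (f2 - t) / d,
        "VH": f1 * t / d,
    }

f1, f2, t = 100, -25, 80                 # mm, telephoto example from the text
c = cardinal_distances(f1, f2, t)

physical_length = t + c["BFL"]           # front vertex to image-space focal point, VF'
telephoto_ratio = physical_length / c["f_eff"]

print(c["f_eff"], c["BFL"], c["HpVp"], c["FFL"], c["VH"])  # 500 100 400 2100 -1600
print(physical_length, telephoto_ratio)                    # 180 9/25
```

A ratio below 1 (here 9/25 = 0.36) expresses the statement that the lens is physically much shorter than its focal length.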
We can locate the image of an object at a finite distance, say 3 m in front of the first lens ($\overline{OV} = 3000$ mm), using three methods: (1) "brute-force" calculation, (2) the Gaussian imaging formula for distances measured from the principal points, and (3) the Newtonian imaging equation.

(1) "Brute-Force" Calculation

The distance from the object to the first thin lens is 3000 mm, so the intermediate image distance satisfies:

$$\frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1}$$

$$z_1' = \left(\frac{1}{100\text{ mm}} - \frac{1}{3000\text{ mm}}\right)^{-1} = \frac{3000}{29}\text{ mm} \cong 103.45\text{ mm}$$

The transverse magnification of the image from the first lens is:

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{1}{29}$$

The object distance to the second lens is negative:

$$z_2 = t - z_1' = 80\text{ mm} - \frac{3000}{29}\text{ mm} = -\frac{680}{29}\text{ mm} \cong -23.45\text{ mm}$$

so the object for the second lens is virtual. The image distance from the second lens is:

$$\frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2}$$

$$z_2' = \left(-\frac{1}{25\text{ mm}} - \left(-\frac{29}{680\text{ mm}}\right)\right)^{-1} = +\frac{3400}{9}\text{ mm} \cong +377.8\text{ mm}$$

The corresponding transverse magnification is:

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{+\frac{3400}{9}\text{ mm}}{-\frac{680}{29}\text{ mm}} = +\frac{145}{9} \cong +16.1$$

The system magnification is the product of the component transverse magnifications:

$$M_T = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{29}\right) \cdot \left(-\frac{+\frac{3400}{9}\text{ mm}}{-\frac{680}{29}\text{ mm}}\right) = -\frac{5}{9}$$

(2) Gaussian Formula

Now evaluate the same image using the Gaussian formula for distances measured from the principal points. The distance from the object to the object-space principal point is:

$$z = \overline{OH} = \overline{OV} + \overline{VH} = 3000\text{ mm} + (-1600\text{ mm}) = +1400\text{ mm}$$

The image distance measured from the image-space principal point is found from the Gaussian image formula:

$$\frac{1}{z'} = \frac{1}{f_{\mathrm{eff}}} - \frac{1}{z} \implies z' = \overline{H'O'} = \left(\frac{1}{500\text{ mm}} - \frac{1}{1400\text{ mm}}\right)^{-1} = +\frac{7000}{9}\text{ mm} \cong 777.8\text{ mm}$$

The distance from the rear vertex to the image is found from the known value $\overline{H'V'} = +400$ mm:

$$\overline{V'O'} = \overline{H'O'} - \overline{H'V'} = +\frac{7000}{9}\text{ mm} - 400\text{ mm} = \frac{3400}{9}\text{ mm} \cong 377.8\text{ mm}$$

thus matching the distance obtained using "brute force".
The transverse magnification of the image created by the system is:

$$M_T = -\frac{z'}{z} = -\frac{+\frac{7000}{9}\text{ mm}}{+1400\text{ mm}} = -\frac{5}{9}$$

(3) Newtonian Lens Formula

Now repeat the calculation for the image position using the Newtonian lens formula. The distance from the object to the object-space focal point is:

$$x = \overline{OF} = \overline{OV} + \overline{VF} = \overline{OV} - \overline{FV} = 3000\text{ mm} - 2100\text{ mm} = 900\text{ mm}$$

Therefore the distance from the image-space focal point to the image is:

$$x' = \overline{F'O'} = \frac{f_{\mathrm{eff}}^2}{x} = \frac{(500\text{ mm})^2}{900\text{ mm}} = \frac{2500}{9}\text{ mm} \cong 277.8\text{ mm}$$

So the distance from the rear (image-space) vertex V' to the image is:

$$\overline{V'O'} = \overline{V'F'} + \overline{F'O'} = 100\text{ mm} + \frac{2500}{9}\text{ mm} = \frac{3400}{9}\text{ mm} \cong 377.8\text{ mm}$$

which again agrees with the result obtained by the other two methods.

2.13.17 Images from Telephoto System

Image (1): Object at Object-Space Focal Point

An object located at the object-space ("front") focal point of the system is at the distance equal to the FFL from the first lens. In this case:

$$z_1 = \mathrm{FFL} = +2100\text{ mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(+2100\text{ mm}) \cdot 100\text{ mm}}{(+2100\text{ mm}) - (100\text{ mm})} = +105\text{ mm}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +80\text{ mm} - 105\text{ mm} = -25\text{ mm}$$

which is the same as the focal length of the second lens, which means that the image distance from the second lens is infinite (as expected):

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-25\text{ mm}) \cdot (-25\text{ mm})}{(-25\text{ mm}) - (-25\text{ mm})} = \infty$$

Image (2) from Telephoto System: Object at Object-Space Principal Point

An object located at the object-space ("front") principal point of the system is at the distance $\mathrm{FFL} - f_{\mathrm{eff}}$ from the first lens.
In this case:

$$z_1 = \mathrm{FFL} - f_{\mathrm{eff}} = 2100\text{ mm} - 500\text{ mm} = 1600\text{ mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(1600\text{ mm}) \cdot 100\text{ mm}}{(1600\text{ mm}) - (100\text{ mm})} = +\frac{320}{3}\text{ mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+\frac{320}{3}\text{ mm}}{1600\text{ mm}} = -\frac{1}{15}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +80\text{ mm} - \frac{320}{3}\text{ mm} = -\frac{80}{3}\text{ mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{\left(-\frac{80}{3}\text{ mm}\right) \cdot (-25\text{ mm})}{\left(-\frac{80}{3}\text{ mm}\right) - (-25\text{ mm})} = -400\text{ mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-400\text{ mm})}{\left(-\frac{80}{3}\text{ mm}\right)} = -15$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\mathrm{system}} = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{15}\right) \cdot (-15) = +1$$

which again confirms that the transverse magnification is that expected for the object and image at the principal points.

Image (3) from Telephoto System: Equal Conjugates

If we move the object so that it is one focal length from the focal point and two focal lengths from the principal point, the object distance is:

$$z_1 = \mathrm{FFL} + f_{\mathrm{eff}} = 2100\text{ mm} + 500\text{ mm} = 2600\text{ mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(+2600\text{ mm}) \cdot 100\text{ mm}}{(+2600\text{ mm}) - (100\text{ mm})} = +104\text{ mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{(+104\text{ mm})}{(2600\text{ mm})} = -\frac{1}{25}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +80\text{ mm} - 104\text{ mm} = -24\text{ mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-24\text{ mm}) \cdot (-25\text{ mm})}{(-24\text{ mm}) - (-25\text{ mm})} = +600\text{ mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(+600\text{ mm})}{(-24\text{ mm})} = +25$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\mathrm{system}} = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{25}\right) \cdot (25) = -1$$

as expected for the object and image at the equal-conjugate points.

2.13.18 Example (3) of Two-Lens System: Two Negative Lenses

Now consider a system composed of two negative lenses separated by the sum of the magnitudes of their focal lengths: $f_1 = -100$ mm, $f_2 = -25$ mm, and $t = +125$ mm. The focal length of the equivalent thin lens is:

$$f_{\mathrm{eff}} = \overline{H'F'} = \overline{FH} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{(-100\text{ mm}) \cdot (-25\text{ mm})}{(-100\text{ mm}) + (-25\text{ mm}) - 125\text{ mm}} = -10\text{ mm}$$

Note that the focal length of the system is negative and shorter in magnitude than that of either lens.
Now locate the focal points. For an object located at $\infty$, the BFL and FFL are found by substitution into the appropriate equations:

$$\mathrm{BFL} = \overline{V'F'} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{(-100\text{ mm} - 125\text{ mm}) \cdot (-25\text{ mm})}{(-100\text{ mm}) + (-25\text{ mm}) - 125\text{ mm}} = -\frac{45}{2}\text{ mm} = -22.5\text{ mm}$$

$$\mathrm{FFL} = \overline{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(-100\text{ mm}) \cdot (-25\text{ mm} - 125\text{ mm})}{(-100\text{ mm}) + (-25\text{ mm}) - 125\text{ mm}} = -60\text{ mm}$$

Images from System: (1) Object at Object-Space Focal Point

An object located at the object-space ("front") focal point of the system is at the distance equal to the FFL from the first lens. In this case:

$$z_1 = \mathrm{FFL} = -60\text{ mm} \quad \text{(virtual object)}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-60\text{ mm}) \cdot (-100\text{ mm})}{(-60\text{ mm}) - (-100\text{ mm})} = +150\text{ mm}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +125\text{ mm} - 150\text{ mm} = -25\text{ mm}$$

which is the same as the focal length of the second lens, which means that the image distance from the second lens is infinite (as expected):

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-25\text{ mm}) \cdot (-25\text{ mm})}{(-25\text{ mm}) - (-25\text{ mm})} = \frac{625\text{ mm}^2}{0\text{ mm}} = \infty$$

Images from System: (2) Object at Object-Space Principal Point

An object located at the object-space ("front") principal point of the system is at the distance $\mathrm{FFL} - f_{\mathrm{eff}}$ from the first lens. In this case:

$$z_1 = \mathrm{FFL} - f_{\mathrm{eff}} = -60\text{ mm} - (-10\text{ mm}) = -50\text{ mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-50\text{ mm}) \cdot (-100\text{ mm})}{(-50\text{ mm}) - (-100\text{ mm})} = +100\text{ mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+100\text{ mm}}{-50\text{ mm}} = +2$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +125\text{ mm} - 100\text{ mm} = +25\text{ mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(+25\text{ mm}) \cdot (-25\text{ mm})}{(+25\text{ mm}) - (-25\text{ mm})} = -12.5\text{ mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-12.5\text{ mm})}{(+25\text{ mm})} = +\frac{1}{2}$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\mathrm{system}} = (M_T)_1 \cdot (M_T)_2 = (+2) \cdot \left(+\frac{1}{2}\right) = +1$$

which again confirms that the transverse magnification is that expected for the object and image at the principal points.
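The lens-by-lens checks in these examples all follow the same recipe: image the object through the first lens, subtract the intermediate image distance from the separation to get the next object distance, and multiply the magnifications. A minimal sketch of that recipe follows; the function name `cascade_thin_lenses` and its argument layout are assumptions made for this illustration, and the sign convention is the one used in these notes ($z' = z f/(z - f)$, $M_T = -z'/z$, $z_{k+1} = t - z_k'$).

```python
from fractions import Fraction


def cascade_thin_lenses(z1, focal_lengths, separations):
    """Image an object through a sequence of thin lenses in air.

    z1            object distance from the first lens (positive if real)
    focal_lengths focal length of each lens, in order
    separations   separation t between consecutive lenses
    Returns (final image distance from the last lens, total M_T).
    Hypothetical helper for illustration only.
    """
    z = Fraction(z1)
    m_total = Fraction(1)
    z_img = None
    for k, f in enumerate(focal_lengths):
        f = Fraction(f)
        if z == f:
            raise ValueError(f"object at focal point of lens {k + 1}: image at infinity")
        z_img = z * f / (z - f)          # thin-lens imaging equation
        if z != 0:
            m_total *= -z_img / z        # M_T = -z'/z
        # z == 0: object sits at the lens, image coincides with it, M_T = +1
        if k < len(separations):
            z = Fraction(separations[k]) - z_img
    return z_img, m_total


# Two negative lenses from the text: f1 = -100 mm, f2 = -25 mm, t = 125 mm
z_pp, m_pp = cascade_thin_lenses(-50, [-100, -25], [125])   # object at H
z_ec, m_ec = cascade_thin_lenses(-70, [-100, -25], [125])   # equal conjugates
print(float(z_pp), m_pp)
print(float(z_ec), m_ec)
```

The same function can be applied to the telephoto example with the object 3000 mm in front of the first lens, which is the "brute-force" calculation carried out by hand earlier.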
Images from System: (3) Equal Conjugates

If we move the object so that it is one focal length from the focal point and two focal lengths from the principal point, the object distance is:

$$z_1 = \mathrm{FFL} + f_{\mathrm{eff}} = -60\text{ mm} + (-10\text{ mm}) = -70\text{ mm}$$

$$z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-70\text{ mm}) \cdot (-100\text{ mm})}{(-70\text{ mm}) - (-100\text{ mm})} = +\frac{700}{3}\text{ mm} = 233\tfrac{1}{3}\text{ mm}$$

$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\left(\frac{700}{3}\text{ mm}\right)}{(-70\text{ mm})} = +\frac{10}{3}$$

The object distance to the second lens is:

$$z_2 = t - z_1' = +125\text{ mm} - \frac{700}{3}\text{ mm} = -\frac{325}{3}\text{ mm} \cong -108.3\text{ mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{\left(-\frac{325}{3}\text{ mm}\right) \cdot (-25\text{ mm})}{\left(-\frac{325}{3}\text{ mm}\right) - (-25\text{ mm})} = -32.5\text{ mm}$$

$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-32.5\text{ mm})}{\left(-\frac{325}{3}\text{ mm}\right)} = -\frac{3}{10}$$

The system magnification for that object distance is the product of the two:

$$(M_T)_{\mathrm{system}} = (M_T)_1 \cdot (M_T)_2 = \left(+\frac{10}{3}\right) \cdot \left(-\frac{3}{10}\right) = -1$$

as expected for the object and image at the equal-conjugate points.

2.14 Plane and Spherical Mirrors

One of the most familiar optical elements is the plane mirror (you probably see one every morning!). For each ray incident at angle $\theta$ measured from the normal to the surface, a reflected ray is generated at angle $-\theta$ relative to the normal.

Consider a full sphere with reflective surface on the inside and a point object O at the center, as shown in (a) in the figure. All rays from the object encounter the surface at normal incidence and reflect back to form an image at the center. We can infer the focal length of the spherical concave mirror from this observation by noting that the object and image distances are identically R, so the focal length is determined by the thin-lens imaging equation:

$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2}, \qquad z_1 = z_2 = R \implies \frac{1}{f} = \frac{1}{R} + \frac{1}{R} = \frac{2}{R} \implies f = \frac{R}{2}$$

Note that in this case of a complete sphere, the algebraic sign of the radius of curvature is not well defined, but since rays converge to form the image, the focal length clearly must be positive.
Because the object and image distances are equal, this clearly is imaging at equal conjugates with transverse magnification $M_T = -1$:

$$M_T = -\frac{z_2}{z_1} = -\frac{2 \cdot f}{2 \cdot f} = -1$$

The negative sign on $M_T$ means that if the object source is moved "upward" from its position on the horizontal axis at the center, then the reflected rays will converge to a point "below" the optic axis, as shown in part (b) of the figure. In part (c) of the figure, half of the spherical mirror surface is removed so that all rays emitted towards the left will escape without striking the mirror and all rays emitted towards the right will strike the surface one time before returning to the "image" at the center and then escaping to the left. This mirror surface clearly makes rays converge to a real image coincident with the object and so must have a positive focal length EVEN THOUGH the radius of curvature R is negative (because V is to the right of C).

Figure: Spherical mirror: (a) rays from a point source at the center of the sphere are all normal to the surface and reflect back upon themselves to form a point image at the object, so that $z_1 = z_2 = R$; (b) if the point source is moved "upward", the image moves "downward," which shows that $M_T = -1$; (c) half the sphere is removed, leaving a hemisphere with $R = \overline{VC} < 0$.

Figure: Derivation of the focal length of a concave spherical mirror. The magnified section at the bottom shows the triangles used to evaluate f in terms of R: $f = -\frac{R}{2}$ in the paraxial approximation.

We can consider the hemispherical concave mirror with radius of curvature $R = \overline{VC} < 0$. Even though the radius is negative, we have already inferred that the focal length of this system is positive since the image rays converge, so we have:

$$f = \frac{|R|}{2} = \frac{-R}{2} = -\frac{R}{2}$$

Consider a ray from an object at infinity that is close to (and parallel to) the optical axis, as shown in the figure.
From triangle $\Delta CAV$ in the magnified view, it is apparent that:

$$\sin[\theta] = \frac{x}{-R} = \frac{x}{\overline{CV}} = \frac{x}{-\overline{VC}}$$

From $\Delta F'AV$, we see that

$$\tan[2\theta] = \frac{x}{\overline{F'V'}}$$

Now apply the paraxial approximation that $\sin[\theta] \cong \tan[\theta] \cong \theta$ if $\theta \cong 0$:

$$\sin[\theta] = \frac{x}{-R} \cong \theta \implies x = -R \cdot \theta$$

$$\tan[2\theta] = \frac{x}{f} \cong 2\theta \implies x = f \cdot 2\theta$$

Now equate the two terms to find a relationship between f and R:

$$-R \cdot \theta = f \cdot 2\theta \implies f = -\frac{R}{2}$$

This expression for the focal length may be substituted into the imaging equation for a single thin lens:

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} = -\frac{2}{R}$$

For the case just considered of a concave surface, $R < 0$ and $f > 0$. If the object distance $z_1 > f$, then the image distance $z_2$ is positive, BUT IS MEASURED FROM RIGHT TO LEFT.

If the mirror is a convex spherical surface with $R = \overline{VC} > 0$, the image of a ray from an object at infinity crosses the axis at the image-space focal point behind the mirror, so the optic makes rays diverge and therefore has negative power.

Figure: Convex mirror has positive radius of curvature ($R > 0$) but the reflected rays diverge and so the surface has negative focal length via $f = -\frac{R}{2}$.

2.14.1 Comparison of Thin Lens and Concave Mirror

Figure: Comparison of the vertices, focal points, principal points, and equal-conjugate points of a concave mirror and a thin lens. The vertices and the principal points coincide in both cases so that $M_T = +1$ for object and image at the vertex of the mirror and at the surfaces of the lens. The object- and image-space focal points of the mirror coincide at the distance $f_{\mathrm{eff}} = -\frac{R}{2}$ for the mirror, and the equal conjugate points are located at the center of curvature so that $z_1 = z_2 = 2 f_{\mathrm{eff}}$. For the lens, the equal conjugate points are also located such that $z_1 = z_2 = 2 f_{\mathrm{eff}}$ with $M_T = -1$.

2.15 Stops and Pupils

In any multielement optical system, the beam of light that passes through the system is shaped like a solid circular "spindle" with different radii at different axial locations.
The diameter of one optical element will limit the size of the ray spindle that exits the system; this limiting element is the aperture stop of the system and may be a lens or an aperture with no power (an iris diaphragm) that is placed specifically to limit the diameter of the ray cone. A larger exiting ray cone means that more light reaches the image to make it brighter, so the diameter of the aperture stop is the limiting factor for image "brightness." Consider the example of a two-lens system with an iris positioned between them shown in the figure. The iris limits the cone of rays from the object at O.

Figure: Schematic of the aperture stop S and entrance and exit pupils E and E', respectively, for a system formed from two positive lenses and an iris with no power. The entrance pupil E is the image of the stop S seen from the left through the first lens L1, while the exit pupil is the image of S seen from the right through the second lens L3. Note that the element that is the stop may vary with object location O.

Obviously, the aperture stop in an imaging system composed of a single lens is that lens. In a two-element system, the stop will be one of the two lenses, determined by the relative diameters and the locations of the lenses. The image of the stop seen from the input "side" of the lens is the entrance pupil, which determines the angular spread of the ray cone from an object point that "gets into" the optical system, and thus determines the "brightness" of the image. The image of the stop seen from the output "side" is the exit pupil (once called the Ramsden disk). In an imaging system intended for viewing by eye, it is useful to locate the exit pupil at the iris of the eye and to match its diameter to that of the iris of the eye to ensure that all light through the optical system makes it into the eye to form the viewable image.
2.15.1 Focal Ratio: f-number

For multilens systems, the size of the entrance pupil determines the angular extent of the ray cone that enters the system from a point source. The figure shows a simple hypothetical imaging system with object-space and image-space principal points H and H', respectively, and an aperture stop of diameter $d_0$ as the first element in the system (the same analysis applies for systems with the entrance pupil at other locations for an object at infinity). In this system, the stop is also the entrance pupil. A point source at infinity creates a plane wave through the entrance pupil, which is then incident on the object-space principal plane H with the same diameter. The unit transverse magnification of the two principal planes ensures that the light emerging from the image-space principal plane H' has that same diameter $d_0 = d_{NP}$. The cone angle of rays incident on the image plane at the image-space focal point F' is the ratio of the diameter to the distance $\overline{H'F'} = f_{\mathrm{eff}}$:

$$\frac{d_0}{f_{\mathrm{eff}}} = \frac{d_{NP}}{f_{\mathrm{eff}}}$$

This means that the focal ratio of the system is:

$$f/\# = \frac{f_{\mathrm{eff}}}{d_{NP}}$$

Note that a corresponding expression could be constructed based on the diameter of the exit pupil, but the propagation distance then would have to be the distance from the exit pupil to the image, which (in this case) is longer than the effective focal length.

Figure: Specification of the system focal ratio: the plane wave from a point source at infinity is incident through the aperture stop with diameter $d_0$ onto the object-space principal plane H. The light emerging from the image-space principal plane H' has the same diameter $d_0$. The light propagates the focal length $f_{\mathrm{eff}}$ to the image. The angle of the ray cone is $\frac{d_0}{f_{\mathrm{eff}}}$, which is the reciprocal of the system focal ratio f/#. This f-number specifies the ability of the system to collect light.
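As a small numerical sketch of the definition above, the focal ratio and the paraxial cone angle are reciprocals of each other. The function names and the sample values (a 500 mm system with a 50 mm entrance pupil) are hypothetical and chosen only for illustration.

```python
def focal_ratio(f_eff_mm, d_entrance_pupil_mm):
    """f/# = f_eff / d_NP for an object at infinity (illustrative helper)."""
    if d_entrance_pupil_mm <= 0:
        raise ValueError("entrance-pupil diameter must be positive")
    return f_eff_mm / d_entrance_pupil_mm


def paraxial_cone_angle(f_eff_mm, d_entrance_pupil_mm):
    """Full angle (radians, paraxial) of the ray cone converging to F':
    d_NP / f_eff, the reciprocal of the focal ratio."""
    return d_entrance_pupil_mm / f_eff_mm


# Hypothetical system: f_eff = 500 mm, entrance pupil diameter 50 mm
n = focal_ratio(500.0, 50.0)
angle = paraxial_cone_angle(500.0, 50.0)
print(f"f/{n:g}, cone angle = {angle:g} rad")
```

Halving the entrance-pupil diameter doubles the f-number and halves the cone angle, which is the sense in which a larger f-number corresponds to a system that collects less light.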
2.15.2 Example: Focal Ratio of Lens-Aperture Systems

The focal ratio of a single thin lens obviously is the ratio of the focal length to the diameter of the lens:

$$f/\# = \frac{f}{d_0}$$

Note that the smallest possible focal ratio exists for a full sphere (which is anything but thin, and the paraxial approximation certainly does not apply over its full diameter). It might be useful to determine the focal ratio for such a case with "normal" glass ($n = 1.5$). The focal length of the sphere in the (ridiculously invalid) thin-lens paraxial approximation where $R = 12.5$ mm is obtained from the lensmaker's equation:

$$f = \left((n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right)^{-1} = \left((1.5 - 1)\left(\frac{1}{12.5\text{ mm}} - \frac{1}{-12.5\text{ mm}}\right)\right)^{-1} = 12.5\text{ mm}$$

The focal ratio is:

$$f/\# = \frac{f}{d_0} = \frac{12.5\text{ mm}}{25\text{ mm}} = \frac{1}{2}$$

This is ridiculously invalid because it assumes that the sphere is simultaneously "thin" and "fat." If we instead assume the spherical lens is composed of two thin lenses at the vertices, each with the power of a single surface:

$$f_1 = f_2 = \left(\frac{1.5 - 1}{12.5\text{ mm}}\right)^{-1} = 25\text{ mm}, \qquad t = 25\text{ mm}$$

$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{25\text{ mm} \cdot 25\text{ mm}}{25\text{ mm} + 25\text{ mm} - 25\text{ mm}} = 25\text{ mm}$$

$$\mathrm{BFL} = \frac{(f_1 - t) \cdot f_2}{f_1 + f_2 - t} = 0$$

Single Thin Lens + Aperture "in front"

Consider a system with a diaphragm (iris or aperture) of diameter $d_0$ located at a distance t "in front" of the lens with focal length $f_1$ and diameter $d_1$. Since the aperture has no power to refract light ($\varphi = 0$ diopters), its "focal length" is infinite ($f_0 = \infty$). The focal length of the two-"lens" system is:

$$f_{\mathrm{eff}} = \frac{f_0 \cdot f_1}{(f_0 + f_1) - t} = f_1 \cdot \lim_{f_0 \to \infty}\left(\frac{f_0}{(f_0 + f_1) - t}\right) = f_1$$

which makes sense: the focal length of a system consisting of one refracting element and one "nonrefracting" element is that of the refracting lens. For an object at infinity ($z_1 = \infty \implies z_2 = f_1$), the diaphragm is the aperture stop if its diameter is smaller than that of the lens:

$$d_0 < d_1 \implies \text{iris is aperture stop}$$

and the iris is also the entrance pupil.
The focal ratio of the system is:

$$f/\# = \frac{f_1}{d_0}$$

The exit pupil may be located by applying the imaging equation:

$$z_{XP} = \frac{t \cdot f_1}{t - f_1}$$

which shows that the exit pupil is virtual ("behind" the lens as seen from image space) if $t < f_1$. Note that if $t = f_1$, so that the aperture is located at the object-space focal point of the system, then the distance from the lens to the exit pupil is infinite: the system is "telecentric in image space." The exit pupil is real (and may be visualized on an observation screen) if $z_{XP} > 0 \implies t > f_1$.

Consider some examples with $f_1 = 100$ mm, $d_1 = 25$ mm, $t = 25$ mm, and $d_0 = 10$ mm. If the iris is deleted, then the focal ratio is:

$$f/\# = \frac{f_{\mathrm{eff}}}{d_1} = \frac{100\text{ mm}}{25\text{ mm}} = f/4$$

With the iris in place, it is the stop and entrance pupil. The location, magnification, and diameter of the exit pupil are:

$$z_{XP} = \frac{t \cdot f_1}{t - f_1} = \frac{25\text{ mm} \cdot 100\text{ mm}}{25\text{ mm} - 100\text{ mm}} = -\frac{100}{3}\text{ mm}$$

$$M_{XP} = -\frac{z_{XP}}{t} = -\frac{-\frac{100}{3}\text{ mm}}{25\text{ mm}} = +\frac{4}{3}$$

$$d_{XP} = d_0 \cdot M_{XP} = \frac{40}{3}\text{ mm} = 13\tfrac{1}{3}\text{ mm}$$

Because the iris is the stop and entrance pupil, the focal ratio is:

$$f/\# = \frac{f_{\mathrm{eff}}}{d_{NP}} = \frac{100\text{ mm}}{10\text{ mm}} = f/10$$

Single Thin Lens + Aperture "behind"

If the lens comes first in the system, then we need to find the condition on the iris diameter to determine if it is the aperture stop. At some risk of confusion, we'll maintain the notation where the diameter of the lens is $d_1$ and that of the aperture is $d_0$ even though it is second in the system. For an object at infinity, the figure shows that the distance to the iris must be less than the focal length to have any possibility of being the aperture stop. The image of the aperture seen from object space is located at

$$z = \frac{t \cdot f_1}{t - f_1}$$

which is positive (so the entrance pupil is real) only if $t > f_1$; for $t < f_1$ the distance is negative and the entrance pupil is virtual.
The transverse magnification of the entrance pupil is:

$$M_T = -\frac{z}{t} = -\frac{f_1}{t - f_1}$$

which implies that the diameter of the image of the iris is:

$$d_0' = M_T \cdot d_0$$

If we use the same numerical values as before but with the iris "behind," the distance to the entrance pupil is:

$$z_{NP} = \frac{t \cdot f_1}{t - f_1} = \frac{25\text{ mm} \cdot 100\text{ mm}}{25\text{ mm} - 100\text{ mm}} = -\frac{100}{3}\text{ mm}$$

$$M_{NP} = -\frac{z_{NP}}{25\text{ mm}} = -\frac{-\frac{100}{3}\text{ mm}}{25\text{ mm}} = +\frac{4}{3}$$

$$d_{NP} = +\frac{4}{3} \cdot 10\text{ mm} = \frac{40}{3}\text{ mm}$$

This is the diameter of the incoming beam at the lens, so the focal ratio is:

$$f/\# = \frac{f_{\mathrm{eff}}}{d_{NP}} = \frac{100\text{ mm}}{\frac{40}{3}\text{ mm}} = f/7.5$$

Figure: Three examples of systems: the first is a single thin lens with the aperture stop at the lens, so the stop coincides with the entrance and exit pupils; the second moves the iris "in front" of the lens so that it is also the entrance pupil; in the third, the iris is behind the lens and the magnified diameter of the entrance pupil is the relevant parameter for the focal ratio.

2.15.3 Example: Exit Pupils of Telescopic Systems

Galilean Telescope

Consider a telescopic system, such as binoculars, composed of an objective lens L1 with diameter $d_1$ and an eyelens L2 with diameter $d_2$, where the two lenses are separated by the sum of their focal lengths. Take the specific example of a Galilean telescope with $f_1 = +200$ mm, $d_1 = 50$ mm, $f_2 = -25$ mm, $d_2 = 25$ mm, and $t = f_1 + f_2 = 175$ mm. We have already seen that the angular magnification of the system is the ratio of the focal lengths of the two lenses:

$$M_\theta = -\frac{f_1}{f_2} = -\frac{+200\text{ mm}}{-25\text{ mm}} = +8$$

To determine which element is the aperture stop for a ray incident from an object at infinity, we need to determine where the ray through the edge of the first lens strikes the second lens. In this case, it strikes well within the lens diameter; the ray height at the second lens is:

$$y = \frac{d_1}{2} \cdot \left(1 - \frac{t}{f_1}\right) = 25\text{ mm} \cdot \left(1 - \frac{175\text{ mm}}{200\text{ mm}}\right) = \frac{25}{8}\text{ mm} = 3.125\text{ mm} < \frac{d_2}{2}$$

so the first lens is the aperture stop, and therefore also the entrance pupil.
Figure: Location of aperture stop for the specified Galilean telescope. Since the ray from infinity that strikes the edge of the positive lens passes well within the boundary of the negative lens, the aperture stop is the positive lens for an object at infinity.

The exit pupil is the image of the aperture stop (first lens) seen through the second lens, which has negative focal length, ensuring that the exit pupil will be virtual. The distance from the stop to the second lens is:

$$z_2 = t = f_1 + f_2 = 175\text{ mm}$$

and the image distance from the second lens is:

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{175\text{ mm} \cdot (-25\text{ mm})}{175\text{ mm} - (-25\text{ mm})} = -\frac{175}{8}\text{ mm} = -21.875\text{ mm}$$

The size of the exit pupil is determined from the transverse magnification:

$$M_T = -\frac{z_2'}{z_2} = -\frac{-\frac{175}{8}\text{ mm}}{175\text{ mm}} = +\frac{1}{8}$$

Since the diameter of the stop is $d_1 = 50$ mm, the diameter of the exit pupil is:

$$d_{XP} = M_T \cdot d_{\mathrm{Stop}} = +\frac{1}{8} \cdot 50\text{ mm} = +6.25\text{ mm}$$

For the Galilean telescope, the exit pupil is virtual (located 21.875 mm "behind" the eyelens) and small.

Keplerian Telescope

Now repeat the analysis for a corresponding Keplerian telescope with $f_1 = +200$ mm, $d_1 = 50$ mm, $f_2 = +25$ mm, $d_2 = 25$ mm, $t = f_1 + f_2 = 225$ mm, and angular magnification:

$$M_\theta = -\frac{f_1}{f_2} = -\frac{+200\text{ mm}}{+25\text{ mm}} = -8$$

Again, the ray from an object at infinity through the edge of the first lens has height at the second lens:

$$y = \frac{d_1}{2} \cdot \left(1 - \frac{t}{f_1}\right) = 25\text{ mm} \cdot \left(1 - \frac{225\text{ mm}}{200\text{ mm}}\right) = -\frac{25}{8}\text{ mm} = -3.125\text{ mm}, \qquad |y| < \frac{d_2}{2}$$

The first element is still the stop and the entrance pupil. The image of the first lens through the second is the exit pupil; its location and size are determined using the thin-lens imaging equation:

$$z_2 = t = f_1 + f_2 = 225\text{ mm}$$

$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{225\text{ mm} \cdot 25\text{ mm}}{225\text{ mm} - 25\text{ mm}} = \frac{225}{8}\text{ mm} = +28.125\text{ mm}$$

$$M_T = -\frac{z_2'}{z_2} = -\frac{\frac{225}{8}\text{ mm}}{225\text{ mm}} = -\frac{1}{8}$$

$$d_{XP} = d_{\mathrm{Stop}} \cdot M_T = 50\text{ mm} \cdot \left(-\frac{1}{8}\right) = -6.25\text{ mm}$$

The exit pupil is "real" (outside of the system at a distance of 28.125 mm beyond the eyelens) and inverted.
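The two telescope analyses above follow identical steps, so they can be sketched as one routine: trace the edge ray of the objective to the eyelens to confirm that the objective is the stop, then image the objective through the eyelens. The function name `telescope_exit_pupil` and the returned keys are assumptions for this illustration; the routine only covers the case worked in the text, in which the objective is the stop for an object at infinity.

```python
from fractions import Fraction


def telescope_exit_pupil(f1, d1, f2, d2):
    """Exit pupil of a two-lens afocal telescope (t = f1 + f2).

    Illustrative helper: raises if the eyelens, not the objective,
    limits the ray bundle from an on-axis object at infinity.
    """
    f1, d1, f2, d2 = (Fraction(v) for v in (f1, d1, f2, d2))
    t = f1 + f2
    y2 = d1 / 2 * (1 - t / f1)          # edge-ray height at the eyelens
    if abs(y2) > d2 / 2:
        raise ValueError("edge ray misses the eyelens: objective is not the stop")
    z_xp = t * f2 / (t - f2)            # image of the objective through the eyelens
    m_t = -z_xp / t                     # transverse magnification of the pupil
    return {
        "t": t,
        "y2": y2,
        "z_xp": z_xp,                   # negative: virtual exit pupil
        "m_t": m_t,
        "d_xp": m_t * d1,               # sign carries the inversion
        "M_theta": -f1 / f2,
    }


galilean = telescope_exit_pupil(200, 50, -25, 25)
keplerian = telescope_exit_pupil(200, 50, 25, 25)
print({k: float(v) for k, v in galilean.items()})
print({k: float(v) for k, v in keplerian.items()})
```

With the values from the text, the Galilean case gives a virtual exit pupil 21.875 mm from the eyelens and the Keplerian case a real one at 28.125 mm, both 6.25 mm in diameter.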
In both of the telescopes just considered, note that the diameter of the exit pupil is (apart from the sign) the ratio of the focal length of the eyepiece and the focal ratio of the objective lens:

$$d_{XP} = -\frac{f_2}{\left(\frac{f_1}{d_1}\right)} = \frac{d_1}{\left(-\frac{f_1}{f_2}\right)} = \frac{d_1}{M_\theta}$$

$$(d_{XP})_{\mathrm{Galilean}} = \frac{50\text{ mm}}{+8} = 6.25\text{ mm}$$

$$(d_{XP})_{\mathrm{Keplerian}} = \frac{50\text{ mm}}{-8} = -6.25\text{ mm}$$

In words, the diameter of the exit pupil is equal to the ratio of the diameter of the entrance pupil (which is the objective in this case) and the magnifying power; more power means a smaller exit pupil.

Common binoculars used for birdwatching are listed as "10 × 50," which means that the angular magnification (magnifying power) is 10 and the diameter of the entrance pupil (which is that of the objective lens) is 50 mm, or about 2 in. The diameter of the exit pupil is:

$$d_{XP} = \frac{50\text{ mm}}{10} = 5\text{ mm}$$

Until recently, the most common variety of binocular was the "7 × 50," which has a magnifying power of 7 and objectives with $d = 50$ mm, so the diameter of the exit pupil is:

$$d_{XP} = \frac{50\text{ mm}}{7} \simeq 7\text{ mm}$$

This is a close match to the diameter of the iris of the dark-adapted eye, which makes these binoculars a good choice for astronomical viewing; for that reason, 7 × 50 binoculars were known as "night glasses." When used with the smaller iris diameter of the eye during daytime, much of the diameter of the exit pupil would illuminate the opaque iris and not contribute to the brightness of the image on the retina.

For a formerly common amateur telescope with a mirror objective with $d_1 = 6\text{ in} \cong 150$ mm and a focal length $f_1 = 48\text{ in} \cong 1220$ mm, the focal ratio is:

$$f/\# = \frac{48\text{ in}}{6\text{ in}} = 8$$

so the diameter of the exit pupil when viewed through an eyelens with focal length $f_2$ is

$$d_{XP} = \frac{f_2}{f/\#} = \frac{f_2}{8}$$

If the focal length of the eyelens is $f_2 = 25\text{ mm} \cong 1$ in, then the diameter of the exit pupil is about 3 mm, which is pretty small.
If the focal length of the eyelens is $f_2 = 4\text{ mm} \cong \frac{1}{6}$ in, the magnifying power of the system is:

$$M_\theta = \frac{f_1}{f_2} \cong \frac{48\text{ in}}{\frac{1}{6}\text{ in}} = +288$$

which is a large number that will impress a naive user. BUT the diameter of the exit pupil is very small:

$$d_{XP} = \frac{f_2}{8} = \frac{\frac{1}{6}\text{ in}}{8} = \frac{1}{48}\text{ in} \cong 0.5\text{ mm}$$

so it would be very difficult to "see" anything through this telescope. This illustrates the flaw in the strategy that was once used often by manufacturers of cheap telescopes intended as gifts for children; the manufacturers would often quote a very large value for the magnifying power that required an eyepiece with a very short focal length and therefore a very small exit pupil. The images were very difficult to see by novices and experienced users alike.

The location of the exit pupil also is important. It is useful to have it placed "outside" the imaging system where the eye would be located so that it is feasible to get all of the light through the pupil into the eye. The distance from the rear vertex of the system to the exit pupil is the eye relief:

$$\overline{V'E'} = \text{eye relief}$$

An imaging system with "lots of" eye relief may be easier to view through, since the location where the eye is optimally placed is back away from the eyelens. An example of a system that needs a large eye relief is a rifle scope, where the eyepiece lens will be located "far" in front of the viewing eye.

For different object distances, it is possible for the aperture stop to "move around," i.e., the element that defines the aperture stop may change with object distance. The locations and sizes of the pupils are determined by applying the ray-optics imaging equation to these objects. To some, the concept of finding the "image of a lens" may seem confusing, but it is no different from before: just think of the lens as a regular opaque object at its location and find the images through the optics that come after (for the exit pupil) or that came before (entrance pupil).
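The exit-pupil arithmetic for the binoculars and the f/8 telescope discussed in this section can be sketched in a few lines. The function names are hypothetical, and the 7 mm figure for the dark-adapted eye is taken from the discussion of "night glasses" above as a nominal value rather than a measured one.

```python
def exit_pupil_from_power(d_objective_mm, magnifying_power):
    """|d_XP| = d_1 / |M_theta| (illustrative helper)."""
    return d_objective_mm / abs(magnifying_power)


def exit_pupil_from_focal_ratio(f_eyelens_mm, objective_focal_ratio):
    """|d_XP| = f_2 / (f/#) of the objective (illustrative helper)."""
    return f_eyelens_mm / objective_focal_ratio


DARK_ADAPTED_IRIS_MM = 7.0  # nominal value assumed for comparison

cases = [
    ("10 x 50 binoculars", exit_pupil_from_power(50.0, 10)),
    ("7 x 50 binoculars", exit_pupil_from_power(50.0, 7)),
    ("f/8 telescope, 25 mm eyelens", exit_pupil_from_focal_ratio(25.0, 8)),
    ("f/8 telescope, 4 mm eyelens", exit_pupil_from_focal_ratio(4.0, 8)),
]
for label, d_xp in cases:
    fraction_of_iris = d_xp / DARK_ADAPTED_IRIS_MM
    print(f"{label}: d_XP = {d_xp:.2f} mm ({fraction_of_iris:.0%} of a 7 mm iris)")
```

The last case makes the point of the cheap-telescope example concrete: the 4 mm eyelens gives a large magnifying power but an exit pupil of only about 0.5 mm.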
Which element in a multielement system is the "stop" depends on the relative sizes of the lenses. In the first case shown below, the first lens (the objective) is small enough that it acts as the stop (and thus also the entrance pupil). The image of the objective lens seen through the eyelens is the exit pupil, and is "between" the two lenses and very small. Because the exit pupil is small and "remote" (located "within" the optical system), so is the field of view of the Galilean telescope. In the second example, the smaller eyelens is the stop and also the exit pupil, while the image of the eyelens seen through the objective is the entrance pupil and is far behind the eyelens and relatively large.

More Examples of Galilean and Keplerian Telescopes

Consider the two two-lens telescope designs. The Galilean telescope has a positive-power objective and a negative-power ocular or eyelens. The Keplerian telescope has a positive objective and a positive eyelens. Assume that the objective is identical in the two cases with $f_1 = +100$ mm and $d_1 = 30$ mm. The focal lengths and diameters of the oculars (eyepieces) are $f_2 = \pm 15$ mm and $d_2 = +15$ mm (these are the approximate dimensions and focal lengths of the lenses in the OSA Optics Discovery Kit). The lenses of a telescope are separated by $f_1 + f_2$ ($f_1 + f_2 \cong 85$ mm and 115 mm for the Galilean telescope and Keplerian telescope, respectively). We want to locate the stops and pupils.

The stop is found by tracing a ray from an object at $\infty$ through the edge of the first element and finding the ray height at the second lens. If this ray height is small enough to pass through the second lens, then the first lens is the stop; if not, then the second lens is the stop.

Figure: Galilean telescope for object at $z_1 = +\infty$: (a) the objective lens is the aperture stop and entrance pupil because it limits the cone of entering rays.
The image of the stop seen through the eyelens is the (very small) exit pupil; (b) the larger objective means that the eyelens is the aperture stop and the exit pupil. The image of the eyelens seen through the objective is the entrance pupil, and is behind the eyelens because the object distance to the objective is less than the focal length.

Consider the Galilean telescope first. The ray height at the first lens is the "semidiameter" of the lens: $\frac{d_1}{2} = 15$ mm; it is not called the "radius" to avoid confusion with a "radius of curvature." From there, the ray height would decrease to 0 mm at a distance of $f_1 = +100$ mm, but it first encounters the negative lens at a distance of $t = +85$ mm. The ray height at this lens is

$$\frac{100\text{ mm} - 85\text{ mm}}{100\text{ mm}} \cdot 15\text{ mm} = 2.25\text{ mm}$$

which is much smaller than the lens semidiameter of $\frac{d_2}{2} = 7.5$ mm. Hence the first lens (the objective lens) is the stop. The entrance pupil is the image of the stop through all of the elements that come before the stop. In this case, the first lens is also the entrance pupil and its transverse magnification is unity. The exit pupil is the image of the stop through all elements that come afterwards, which is the negative lens. The distance to the "object" is $f_1 + f_2 = 85$ mm, so the imaging equation is used to locate the exit pupil and determine its magnification:

$$\frac{1}{85\text{ mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{-15\text{ mm}}$$

$$z' = \left(-\frac{1}{15\text{ mm}} - \frac{1}{85\text{ mm}}\right)^{-1} = -\frac{51}{4}\text{ mm} = -12.75\text{ mm}$$

$$M_T = -\frac{z'}{z} = -\frac{-12.75\text{ mm}}{85\text{ mm}} = +0.15$$

The exit pupil is upright, but more important, its distance from the second lens is negative; the exit pupil is a virtual image and not accessible to the eye. The viewer "sees" the exit pupil in front of the eye. This limits the field of view of the Galilean telescope.

Follow the same procedure to determine the stop and locate the pupils and their magnifications for the Keplerian telescope. The ray height at the first lens for an object located at $\infty$ is again 15 mm.
The ray height decreases to 0 mm at the focal point, and then continues to decrease (becoming negative) until the ray encounters the ocular lens at a distance of $f_1 + f_2 = 115$ mm. The ray height h at this lens is determined from similar triangles:

$$\frac{15\text{ mm}}{-h} = \frac{100\text{ mm}}{15\text{ mm}} \implies h = -2.25\text{ mm}$$

So the first lens is the stop and entrance pupil (with unit magnification) in this case too. The distance from the stop to the second lens is $f_1 + f_2 = 115$ mm, so the imaging equation for locating the exit pupil and determining its magnification is:

$$\frac{1}{115\text{ mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{+15\text{ mm}}$$

$$z' = \left(+\frac{1}{15\text{ mm}} - \frac{1}{115\text{ mm}}\right)^{-1} = +\frac{69}{4}\text{ mm} = +17.25\text{ mm}$$

$$M_T = -\frac{z'}{z} = -\frac{+17.25\text{ mm}}{115\text{ mm}} = -0.15$$

The exit pupil is a real image of the aperture stop in the Keplerian telescope; we can place our eye at it and see a larger field of view.

Vignetting

The location of the aperture stop is determined for an object located "on" the optical axis. If the object is "off" the axis, the cone of rays that gets through the system is "skewed" or "tilted." If other elements in the system (lenses or diaphragms) constrain parts of the skewed cone of rays, then the cone of rays is truncated and the brightness of the image is reduced; this phenomenon is "vignetting."

Figure: Example of vignetting; the brightness of the scene at the edges is reduced due to the presence of an "out-of-focus" aperture in the system.

2.15.4 Pupils and Diffraction

The concept of pupils may be combined with diffraction to evaluate the effective focal ratio (f/number) of the imaging system. For a single thin lens, the diffraction spot is determined by the size and shape specified by the pupil function $p[x, y]$ or $p(r)$ and the distance to the image. If the lens has a circular pupil of diameter $d_0$, the pupil function

$$p(r) = \mathrm{CYL}\left(\frac{r}{d_0}\right)$$

determines the extent of the ray cone that enters the system.
We derived the resulting diffraction pattern, which is proportional to a scaled circularly symmetric sombrero function. This function is the analogue of the SINC function that uses the first-order Bessel function, and therefore is sometimes called the “besinc” function:

$$h(r) \propto \frac{\pi d_0^2}{4} \cdot \mathrm{SOMB}\!\left(\frac{r}{\lambda_0 z_2 / d_0}\right)$$

If the object distance is large, then the image distance $z_2 \simeq f$ and the amplitude of the impulse response is:

$$h(r) \propto \mathrm{SOMB}\!\left(\frac{r}{\lambda_0 f / d_0}\right)$$

The diameter of the Airy disk is approximately:

$$D_0 \cong 2.44\,\lambda_0 \left(\frac{f}{d_0}\right) = 2.44 \cdot \lambda_0 \cdot f/\#$$

2.15.5 Field Stop

As suggested by its name, a field stop limits the field of view of the system. It may be as simple as the finite size of the sensor (e.g., a rectangular piece of photosensitive emulsion or a CCD sensor), or it may be placed at an intermediate image within the system or even at the object itself. Images of the field stop are located at the same locations as intermediate images of the object.

2.16 Marginal and Chief Rays

Many important characteristics of an optical system, including the possible presence of vignetting, are determined by the trace of two specific rays through the imaging system. For an object O with image O′, aperture stop S, entrance pupil E, and exit pupil E′, the marginal ray traces from the center of O to the edge of S and back to the center of O′. The chief ray (or principal ray) is traced from the edge of O (or edge of the “field of view”) through the center of S to the edge of O′. Since E and E′ are images of the stop S, the marginal and chief rays also go through the edges and centers of the pupils, respectively.

The marginal ray is specified by its ray heights $y$ and ray angle $u$ at different points on the optical axis; the corresponding notation for the chief ray includes “overscores” or “bars”: $\bar{y}$, $\bar{u}$. Heights and angles of the marginal ray after refraction at a surface are “primed,” e.g., $y'$ and $u'$. The corresponding quantities for the chief ray are $\bar{y}'$ and $\bar{u}'$.
From the definition of the marginal ray, an object or image is located at any location (value of $z$) where $y = 0$. Similarly, the aperture stop, entrance pupil, and exit pupil are located at values of $z$ where $\bar{y} = 0$. An image exists wherever the marginal ray crosses the axis, and the aperture stop or pupils are located wherever the chief ray crosses the axis. Complete specification of these two rays is sufficient to characterize the location of object and image(s), the field of view, and the magnifications.

The chief ray is the axis of the unvignetted light beam from a point at the edge of the field of view. The radius of the unvignetted light beam (perhaps more appropriately called the semidiameter to avoid potential confusion with the “radius of curvature”) is the sum of the magnitudes of the heights of the marginal and chief rays:

$$\frac{d_\text{unvignetted}}{2} = |y| + |\bar{y}| \quad \text{at any location } z$$

Figure 2.2: The marginal and chief rays for a two-element imaging system where the second element is the stop. The marginal ray comes from the center of the object O, grazes the edge of the stop, and passes through the center of the image O′. The chief ray travels from the edge of the object through the center of the stop to the edge of the image.

Because paraxial calculations are linear, it is customary to normalize the ray heights and angles for the calculation and then scale the results to satisfy the conditions of the specific system. For example, we generally select the chief ray height $\bar{y} = 1$ and the marginal ray angle $u = 1$ at the object. Clearly the choice of unit ray angle (in radians) is inconsistent with the paraxial approximation, but this is just a computational convenience because all quantities are scalable.

2.16.1 Telecentricity

If the aperture stop is located such that the entrance and/or exit pupils are at infinity, then the system is telecentric.
One way to do this is to place the aperture stop at one of the focal points of the system, which means that the corresponding pupil is at the same location and the other pupil is at infinity. As shown in the figure, if the stop is located at the object-space focal point of a single thin lens, then the entrance pupil is at the same location and the exit pupil is at infinity in image space; this is an image-space telecentric system.

Figure: Telecentric system consisting of a single thin lens with the aperture stop placed at the object-space focal point, showing the chief ray (solid blue) and marginal ray (red). The chief ray intersects the optical axis at that focal point and so emerges from the lens parallel to the optical axis. The dashed blue lines parallel to the chief ray intersect at the image. The defocused image is the same height as the focused image.

If the stop is located at the image-space focal plane, then the entrance pupil is at infinity, forming an object-space telecentric system. If either the entrance or exit pupil is at infinity, then the chief ray must be parallel to the optical axis on that side of the imaging system. This means that the system transverse magnification will be constant even if the image is blurry. Put another way, a blurred image has the correct magnification.

A “double telecentric” system is an afocal system (telescope) with the stop located at the common focal plane of the two lenses. This means that both the entrance and exit pupils are at infinity. The fact that the magnification of the system does not depend on accuracy of focusing makes telecentric systems particularly useful for metrology.

Figure: Double telecentric system with the aperture stop at the common focal point of the two lenses. The marginal ray is shown in red and the chief ray in solid blue.
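As a rough numerical check of the image-space telecentric case, the sketch below traces a chief ray through a single thin lens in air using $\bar{y}_\text{lens} = d \cdot \bar{u}$ (the chief ray has zero height at the stop) and the thin-lens refraction $\bar{u}' = \bar{u} - \bar{y}/f$. The function name, the focal length, and the ray angles are illustrative assumptions, not values taken from these notes.

```python
def chief_ray_after_thin_lens(f_mm, stop_to_lens_mm, u_bar):
    """Trace a paraxial chief ray that crosses the axis at the aperture stop.

    f_mm            : focal length of the thin lens (illustrative value)
    stop_to_lens_mm : distance from the aperture stop to the lens
    u_bar           : chief-ray angle at the stop, in radians
    Returns (height at the lens, angle after the lens).
    """
    y_bar = stop_to_lens_mm * u_bar      # chief ray starts at height 0 at the stop
    u_bar_out = u_bar - y_bar / f_mm     # thin-lens refraction in air
    return y_bar, u_bar_out


f = 50.0  # hypothetical focal length in mm

# Stop at the object-space focal point: every chief ray leaves parallel to the axis
for angle in (0.01, 0.05, 0.10):
    _, u_out = chief_ray_after_thin_lens(f, f, angle)
    print(f"stop at F: u_bar = {angle:.2f} rad, |u_bar'| = {abs(u_out):.6f} rad")

# Stop 30 mm in front of the lens instead: the emerging chief ray stays tilted
_, u_tilted = chief_ray_after_thin_lens(f, 30.0, 0.05)
print(f"stop at 30 mm: u_bar' = {u_tilted:.6f} rad")
```

Because the emerging chief-ray angle is zero for every field angle when the stop sits at the focal point, the image height does not change as the detector is moved through focus, which is the property the text exploits for metrology.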
2.16.2 Marginal and Chief Rays for Telescopes

The marginal ray of an afocal system used to image an object at infinity travels parallel to the optical axis before the first lens and after the last ($u = 0$, $u' = 0$). The relative sizes of the two lenses determine which is the aperture stop; for a Galilean telescope, the aperture stop is usually the negative ocular lens.

MORE TO COME

Chapter 3
Tracing Rays Through Optical Systems

The imaging equation(s) become quite complicated in systems with more than a very few lenses. However, we can determine the effect of the optical system by ray tracing, where the action on two (or more) rays is determined. Ray tracing may be paraxial or exact. Historically, graphical, matrix, or worksheet ray tracing was commonly used in optical design, but most ray tracing is now implemented in computer software, so exact solutions are more common than they once were.

3.1 Paraxial Ray Tracing Equations

Consider the schematic of a two-element optical system made of thick lenses, so the vertices and principal planes of individual lenses do not coincide.

Figure: Schematic of ray tracing of a provisional marginal ray from an object at an infinite distance. The system has two elements and the locations $H_n$ and $H'_n$ are the principal planes of the $n$th element. The ray height at the $n$th element is $y_n$ and the ray angle during transfer between elements $n-1$ and $n$ is $u_n$.

The two elements are represented by their two principal “planes”, which are the planes of unit magnification. The refractive power of the first element changes the ray angle of the input ray. In the example shown, the input ray angle $u_1 = 0$ radians, i.e., the ray is parallel to the optical axis. The height of this ray above the axis at the object-space principal plane $H_1$ is $y_1$ units.
The ray emerges from the principal plane $H'_1$ at the same height $y_1$ but with a new ray angle $u_2$. The ray “transfers” to the second element through the distance $t_2$ in the index $n_2$ and has ray height $y_2$ at principal plane $H_2$. The ray emerges from the principal plane $H'_2$ at the same height but a new angle $u_3$.

3.1.1 Paraxial Refraction

Figure 3.1: Refraction of a paraxial ray at a surface with radius of curvature $R$ between media with refractive indices $n$ and $n'$. The ray height and angle at the surface are $y$ and $u$, respectively. The angle of the ray measured at the center of curvature is $\alpha$. The height and angle immediately after refraction are $y$ and $u'$. The object and image distances are $s$ and $s'$ (which are now called $z$ and $z'$ in the text).

Consider refraction of a paraxial ray emitted from the object O at a surface with radius of curvature $R$. For a paraxial ray, the surface may be drawn as “vertical”. The height of the ray at the surface is $y$. From the drawing, the incoming ray angle $u$ measured from the optical axis is:

$$u = \tan^{-1}\!\left[\frac{y}{z}\right] \cong \frac{y}{z} > 0$$

and the corresponding equation for the outgoing ray measured from the optical axis is:

$$u' = \tan^{-1}\!\left[\frac{y}{z'}\right] \cong \frac{y}{z'} > 0$$

The angle of the height of the ray at the refractive surface measured from the center of curvature is:

$$\alpha = -\tan^{-1}\!\left[\frac{y}{R}\right] \cong -\frac{y}{R}$$

The incident and refracted angles measured from the surface normal at height $y$ are the angles of incidence and refraction.
From the drawing:

$$i = u - \alpha, \qquad i' = u' - \alpha$$

Now apply Snell's law in the paraxial approximation:

$$n \sin[i] = n' \sin[i'] \implies n \cdot i \cong n' \cdot i'$$

$$n \cdot (u - \alpha) = n' \cdot (u' - \alpha)$$

$$\implies n'u' \cong nu - n\alpha + n'\alpha = nu + \alpha\,(n' - n) = nu + \left(-\frac{y}{R}\right)(n' - n) = nu - y \cdot \frac{n' - n}{R} \equiv nu - y \cdot \phi$$

$$n'u' \cong nu - y \cdot \phi$$

This is the paraxial refraction equation in terms of the incident angle $u$, refracted angle $u'$, ray height $y$, surface power $\phi = \frac{1}{f}$, and indices of refraction $n$ and $n'$. Solved for the power, it reads:

$$\phi = \frac{nu - n'u'}{y}$$

3.1.2 Paraxial Transfer

Figure: Paraxial transfer from one surface to the next in a medium with refractive index $n'$.

The transfer equation determines the ray height $y'$ at the next surface given the initial ray height $y$, the physical distance $t'$, and the ray angle $u'$ in the medium with index $n'$. From the drawing, we have:

$$y' = y + t' \cdot u'$$

$$y' = y + \left(\frac{t'}{n'}\right) \cdot (n'u')$$

where the substitution was made to put the ray angle in the same form $n'u'$ that appeared in the refraction equation. The distance $\frac{t'}{n'} \le t'$ is called the reduced thickness (note the potential for confusing the reduced thickness $\frac{t'}{n'}$ with the optical path length $n't'$).

3.1.3 Linearity of the Paraxial Refraction and Transfer Equations

Note that both the paraxial refraction and transfer equations are linear in the height and angle, i.e., neither includes any operations involving squares or nonlinear functions (such as sine, tangent, or logarithm). Among other things, this means that they may be scaled by direct multiplication to obtain other “equivalent” rays, for example to match the marginal ray height to the semidiameter of the aperture stop or the chief ray angle to the semidiameter of the field stop.
For example, the output angle may be scaled by scaling the input ray angle and the height by a constant factor $\alpha$ (a scale factor here, not the angle $\alpha$ used in the refraction drawing):

$$\alpha\,(nu - y\phi) = \alpha \cdot (nu) - (\alpha \cdot y)\,\phi = \alpha\,(n'u')$$

We will often take advantage of this linear scaling property to find the exact marginal and chief rays from their provisional counterparts.

3.1.4 Paraxial Ray Tracing

To characterize the paraxial properties of a system, two provisional rays are traced:

1. Initial height of marginal ray at first surface: $y = 1.0$, initial marginal ray angle $nu = 0$;
2. Initial height of chief ray at first surface: $\bar{y} = 0.0$, initial chief ray angle $n\bar{u} = 1$.

We have already named these rays; the first is the provisional marginal ray that intersects the optical axis at the object (and thus also at every image of the object). The second ray (distinguished by the overscore) is called the provisional chief (or principal) ray and travels from the edge of the object to the edge of the field of view through the center of the stop (and thus through the center of the pupils, which are images of the stop). Since the paraxial ray tracing equations are linear, these provisional rays may be scaled to the parameters of the system.

The process of ray tracing is perhaps best introduced by example. Consider a two-element, three-surface system. The first surface is the cornea, with radius of curvature in the model of $R_1 = +7.8$ mm. The “aqueous humor” between the cornea and the lens has a thickness in the model of 3.6 mm and refractive index of $n_2 = 1.336$. The surfaces of the lens have radii of curvature $R_2 = +10$ mm and $R_3 = -6$ mm, thickness of 3.6 mm, and refractive index $n_3 = 1.413$. The “vitreous humor” between the lens and the retina has the same refractive index of $n_4 = 1.336$ as the “aqueous humor.”

Figure: Marginal and chief rays traced through the three-surface optical system.

The refraction at the first surface changes the angle but not the height of a ray from the object.
If the incident ray angle is 0 radians, then the new ray angle for the provisional marginal ray is:

$$(n'u')_1 = (nu)_1 - y_1\,[\text{mm}] \cdot \phi_1\,[\text{mm}^{-1}] = 0 - (1.0)(+0.043077) = -0.043077\ \text{radian}$$

Note that we are retaining 6 decimal places in this calculation to ensure the best result at the end. We will then round the value to a more reasonable accuracy.

The transfer equation for the provisional marginal ray between the first and second surface changes the height of the ray but not the angle. The height at the second surface is:

$$y'_1 = y_1 + \left(\frac{t'}{n'}\right)_1 (n'u')_1\ [\text{mm}] = 1 + \frac{3.6}{1.336}\,(-0.043077) = +0.883924\ \text{mm}$$

Thus the ray exits the first surface at the “reduced angle” $n'u' \cong -0.04$ radians and arrives at the second surface at height $y' \cong +0.88$ units. The corresponding equations for the chief ray at the first surface are:

$$(n'\bar{u}')_1 = (n\bar{u})_1 - \bar{y}_1 \phi_1 = 1 - (0.0)(+0.043077) = 1\ \text{radian}$$

$$\bar{y}'_1 = \bar{y}_1 + \left(\frac{t'}{n'}\right)_1 (n'\bar{u}')_1 = 0 + \frac{3.6}{1.336}\,(1) = +2.694611\ \text{mm} \cong 2.695\ \text{mm}$$

Since the provisional chief ray went through the center (vertex) of the first surface, where $\bar{y} = 0$, its angle did not change. The height of the chief ray at the second surface is proportional to the ray angle.

Ray-Tracing Table

The equations may be evaluated in sequence to compute the rays through the system. These are presented in the table. Each column in the table represents a surface in the system and the “primed” quantities refer to distances and angles following the surface. In words, the values of $t'$ in the second row are the distances from the surface in that column to the next surface.
| Parameter | Initial | Surface 1 | Surface 2 | Surface 3 | Image surface |
| $R$ | | +7.8 mm | +10.0 mm | −6.0 mm | |
| $t'$ | | 3.6 mm | 3.6 mm | ⇓ (to be found) | |
| $n'$ | 1.0 | 1.336 | 1.413 | 1.336 | |
| $-\phi = -\frac{n'-n}{R}$ | | −0.043077 mm⁻¹ | −0.007700 mm⁻¹ | −0.012833 mm⁻¹ | |
| $\frac{t'}{n'}$ | | 3.6 mm / 1.336 = 2.694611 mm | 3.6 mm / 1.413 = 2.547771 mm | [12.699 mm] | |
| Rays ⇓ | | | | | |
| $y$ | 1 mm | 1 mm | 0.883924 mm | 0.756833 mm | 0 mm |
| $n'u'$ | 0 | −0.043077 radian | −0.049883 radian | −0.059596 radian | −0.059596 radian |
| $\bar{y}$ | | 0 mm | 2.694611 mm | 5.189519 mm | 16.779317 mm |
| $n'\bar{u}'$ | 1 radian | 1 radian | 0.979251 radian | 0.912654 radian | |

The ray trace indicates that the provisional marginal ray emerges from the last surface with height and angle

$$\begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} 0.756833\ \text{mm} \\ -0.059596\ \text{radians} \end{bmatrix}$$

These are used to calculate the (bracketed) distance to the image location, where the marginal ray height is 0:

$$y' = 0 = y + \frac{t'}{n'}\,(n'u') \implies 0 = (+0.756833) + \frac{t'}{n'}\,(-0.059596)$$

$$\implies \frac{t'}{n'} = \frac{+0.756833}{0.059596} \cong +12.699\ \text{mm}$$

This is the “reduced distance” in the image medium with index $n_4$; the physical distance $t'$ is:

$$t' = \frac{+0.756833\ \text{mm}}{0.059596} \cdot n' = 12.699 \cdot 1.336 \cong 16.966\ \text{mm}$$

The height and angle of the provisional chief ray at the image location are $\bar{y} \cong 16.78$ mm and $n'\bar{u}' \cong 0.91$ radians, respectively, which may be scaled to the size of a known sensor to determine the field of view.

This particular system is often used as a model for the human eye with the lens “relaxed” to view objects at ∞. The first surface represents the cornea of the eye, while the other two surfaces are the front and back of the lens. Note that the power of the cornea (0.043077 mm⁻¹ ≅ 43 diopters) is considerably larger than the powers of the lens surfaces (7.7 diopters and 12.8 diopters, respectively).

3.2 Matrix Formulation of Paraxial Ray Tracing

The same linear paraxial ray tracing equations may be conveniently implemented as matrices acting on ray vectors for the marginal and chief rays whose components are the height and angle.
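Before the matrix form is developed, it may help to see the worksheet trace of Section 3.1.4 written as code. The sketch below is one possible arrangement, not the author's implementation; the function name `trace` and the tuple layout are assumptions made for illustration. It applies the refraction equation $n'u' = nu - y\phi$ at each surface and the transfer equation $y' = y + (t'/n')(n'u')$ between surfaces, using the eye-model data from the table.

```python
def trace(y, nu, surfaces):
    """Paraxial trace of one ray through a sequence of surfaces.

    y, nu    : ray height (mm) and reduced angle n*u just before the first surface
    surfaces : tuples (R, n_before, n_after, t_after); t_after is the physical
               distance to the next surface, or None after the last surface
    Returns a list of (height at surface, reduced angle after surface).
    """
    rows = []
    for R, n, n_prime, t_prime in surfaces:
        phi = (n_prime - n) / R                  # surface power in 1/mm
        nu = nu - y * phi                        # refraction: n'u' = nu - y*phi
        rows.append((y, nu))
        if t_prime is not None:
            y = y + (t_prime / n_prime) * nu     # transfer over the reduced thickness
    return rows


# Eye model from the text: cornea, front of lens, back of lens
eye = [
    (+7.8, 1.000, 1.336, 3.6),
    (+10.0, 1.336, 1.413, 3.6),
    (-6.0, 1.413, 1.336, None),
]

marginal = trace(1.0, 0.0, eye)   # provisional marginal ray: y = 1, nu = 0
chief = trace(0.0, 1.0, eye)      # provisional chief ray:    y = 0, nu = 1

# The image lies where the marginal ray height returns to zero
y_last, nu_last = marginal[-1]
reduced_image_distance = -y_last / nu_last
image_distance = reduced_image_distance * 1.336   # physical distance in the vitreous humor

for name, rows in (("marginal", marginal), ("chief", chief)):
    for k, (height, angle) in enumerate(rows, start=1):
        print(f"{name:8s} surface {k}: y = {height:.6f} mm, n'u' = {angle:.6f}")
print(f"image distance: reduced {reduced_image_distance:.3f} mm, "
      f"physical {image_distance:.3f} mm")
```

The printed heights and angles can be compared column by column with the ray-tracing table; small differences in the last decimal place come from rounding of intermediate values in the hand calculation.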
The ray vectors may be defined as:

$$\text{paraxial marginal ray vector: } \begin{bmatrix} y \\ nu \end{bmatrix} \qquad \text{paraxial chief ray vector: } \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix}$$

Note that there is nothing magical about the convention for the ordering of $y$ and $nu$ (i.e., which goes “on top” of the vector); this is the convention used by Roland Shack at the Optical Sciences Center at the University of Arizona, but Willem Brouwer's book “Matrix Methods in Optical Instrument Design” uses the opposite order. The choice of convention determines the form of the system matrix, but the two choices are equivalent.

In this notation, the two column vectors that represent the marginal and chief rays may be combined to form a ray matrix $L$:

$$L \equiv \begin{bmatrix} \begin{pmatrix} y \\ nu \end{pmatrix} \begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix} \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}$$

which may be evaluated at any point in the system. The determinant of this ray matrix is:

$$\det[L] = y \cdot (n\bar{u}) - (nu) \cdot \bar{y} \equiv \aleph$$

which we shall show to be a constant, the so-called Lagrange invariant. In words, the Lagrange invariant is the product of the chief ray height and marginal ray angle subtracted from the product of the marginal ray height and chief ray angle. We denote it by the symbol ℵ (“aleph,” chosen here for the simple reason that it is distinctive). We shall see that ℵ is unaffected by both refraction and transfer, and therefore is invariant as we progress through different locations in the system.

3.2.1 Refraction Matrix

Given the ray vectors or the ray matrix, we can now define operators for refraction and transfer.
Recall that paraxial refraction of a marginal ray and of a chief ray at a surface with power $\phi$ changes the ray angles but not the heights (at the surfaces):

$$n'u' = nu - y \cdot \phi \quad \text{for the marginal ray}$$

$$n'\bar{u}' = n\bar{u} - \bar{y} \cdot \phi \quad \text{for the chief ray}$$

The refraction process may be written as a matrix $R$; its product with the ray matrix has the same ray heights and different angles:

$$R \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix}, \qquad R = \begin{bmatrix} a & c \\ b & d \end{bmatrix}$$

where we need to evaluate the four values $a$ through $d$. Consider the action of the refraction matrix on the marginal ray:

$$R \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y \\ n'u' \end{bmatrix}$$

$$a y + c \cdot (nu) = y \implies a = 1,\ c = 0$$

$$b y + d \cdot (nu) = n'u' = nu - y \cdot \phi \implies b = -\phi,\ d = 1$$

Substitute these values to see the form of the refraction matrix:

$$R = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix}$$

The determinant of the refraction matrix is:

$$\det R = \det \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} = (1)(1) - (-\phi)(0) = 1$$

The action of a refraction matrix $R$ on a ray matrix $L$ is:

$$R L = L'$$

$$\begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ nu - y \cdot \phi & n\bar{u} - \bar{y} \cdot \phi \end{bmatrix}$$

The determinant of the ray matrix after refraction is:

$$\det[L'] = y\,(n\bar{u} - \bar{y} \cdot \phi) - \bar{y}\,(nu - y \cdot \phi) = y \cdot n\bar{u} - y\bar{y} \cdot \phi - \bar{y} \cdot nu + y\bar{y} \cdot \phi = y \cdot n\bar{u} - \bar{y} \cdot nu = \aleph = \det[L]$$

which confirms that the Lagrangian invariant is not affected by refraction.
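The algebra above can be spot-checked numerically. The following sketch applies a refraction matrix to an arbitrary ray matrix using exact rational arithmetic and confirms that the heights and the determinant are unchanged. The helper names `matmul2`, `det2`, and `refraction`, as well as the numerical values, are assumptions introduced here for illustration.

```python
from fractions import Fraction as F


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def refraction(phi):
    """Refraction matrix for a surface (or thin lens) of power phi."""
    return [[F(1), F(0)], [-phi, F(1)]]


# Ray matrix L: first column is the marginal ray (y, nu),
# second column is the chief ray (y_bar, nu_bar). Values are arbitrary.
L = [[F(3), F(-2)],
     [F(1, 10), F(1, 4)]]

R = refraction(F(1, 50))      # hypothetical surface power of 1/50 per mm
L_after = matmul2(R, L)

print("heights before:", L[0], "after:", L_after[0])
print("invariant before:", det2(L), "after:", det2(L_after))
```

Exact fractions are used so that the equality of the two determinants is not obscured by floating-point rounding.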
3.2.2 Ray Transfer Matrix

The transfer of the marginal ray from one surface to the next within the medium with index $n'$ is

$$y' = y + \frac{t'}{n'}\,(n'u')$$

which also may be written as the product of a transfer matrix $T$ with the marginal ray vector:

$$T \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} y + \left(\frac{t'}{n'}\right)(n'u') \\ n'u' \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

so the determinant of the transfer matrix also is 1:

$$\det \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} = (1)(1) - (0)\left(\frac{t'}{n'}\right) = 1$$

The action of the transfer matrix $T$ on the ray matrix $L$ is:

$$L' = T L = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y + \left(\frac{t'}{n'}\right) n'u' & \bar{y} + \left(\frac{t'}{n'}\right) n'\bar{u}' \\ n'u' & n'\bar{u}' \end{bmatrix}$$

and the determinant of the ray matrix after the transfer operation is:

$$\det[L'] = \det[T L] = \left(y + \frac{t'}{n'}\,n'u'\right)(n'\bar{u}') - \left(\bar{y} + \frac{t'}{n'}\,n'\bar{u}'\right)(n'u')$$

$$= y \cdot n'\bar{u}' + \frac{t'}{n'}\,n'u' \cdot n'\bar{u}' - \bar{y} \cdot n'u' - \frac{t'}{n'}\,n'\bar{u}' \cdot n'u' = y \cdot n'\bar{u}' - \bar{y} \cdot n'u' = \aleph = \det[L]$$

so the determinants of the ray matrix before and after transfer are also identically the Lagrangian invariant ℵ. In other words, neither the refraction nor the transfer matrix has any effect on the determinant of a ray matrix, so the Lagrangian invariant is preserved by refraction and by transfer (hence its name!).

Ray Transfer Matrix for an Optical System

The refraction and transfer matrices may be combined in sequence to model a complete system. If we start with the marginal ray vector at the input object, the first operation is transfer to the first surface.
The next is refraction by that surface, transfer to the next, and so forth until a final transfer to the output image:

$$\left(T_n R_n \cdots T_2 R_2 T_1 R_1 T_0\right) L_\text{object} = L_\text{image}$$

If the initial ray matrix is located at the object (as usual), the marginal ray height is zero, so the ray matrices at the object and at any image have the form:

$$L_\text{object} = \begin{bmatrix} 0 & \bar{y}_\text{in} \\ (nu)_\text{in} & (n\bar{u})_\text{in} \end{bmatrix}, \qquad L_\text{image} = \begin{bmatrix} 0 & \bar{y}_\text{out} \\ (nu)_\text{out} & (n\bar{u})_\text{out} \end{bmatrix}$$

so the system from object to image is:

$$S \equiv T_n R_n \cdots T_2 R_2 T_1 R_1 T_0, \qquad S \cdot L_\text{object} = L_\text{image}$$

$$\left(T_n R_n \cdots T_2 R_2 T_1 R_1 T_0\right) \begin{bmatrix} 0 & \bar{y}_\text{in} \\ (nu)_\text{in} & (n\bar{u})_\text{in} \end{bmatrix} = \begin{bmatrix} 0 & \bar{y}_\text{out} \\ (nu)_\text{out} & (n\bar{u})_\text{out} \end{bmatrix}$$

Note that the individual refraction and transfer matrices are sequenced in inverse order, i.e., the last operation is written first (leftmost) in the product for the system. The transfer matrix $T_0$ acts on the input ray matrix first, so it must appear on the right.

Ray Matrix for Provisional Marginal and Chief Rays

The system is characterized by using provisional marginal and chief rays located at the object. The linearity of the computations ensures that the rays may be scaled subsequently to satisfy other system constraints, such as the diameter of the stop. The provisional marginal ray at the object has height $y = 0$ and ray angle $nu = +1$, while the provisional chief ray at the object has height $\bar{y} = +1$ and angle $n\bar{u} = 0$. Thus the provisional ray matrix at the object is:

$$L_0 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

3.2.3 “Vertex-to-Vertex Matrix” for System

We can construct a matrix that represents JUST the optical system by excluding the input ray matrix, the transfer matrix from object to object-space vertex, the transfer from image-space vertex to image, and the output ray matrix. This subset is the “vertex-to-vertex matrix” $M_{VV'}$ of the system and is a complete specification of the paraxial properties of the system.
The general form for the matrix is:

$$M_{VV'} = \left(R_n \cdots T_2 R_2 T_1 R_1\right) = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$$

where $A$, $B$, $C$, $D$ are factors to be determined from the various refractions and transfers for a specific system. The entries $A$ and $D$ in the matrix are “pure” numbers (without units), while $B$ and $C$ have dimensions of length and reciprocal length, respectively. From matrix algebra, it is possible to show that the determinant of a matrix product is the product of the determinants. We already know that the determinant of the matrix for any transfer or refraction is unity, which establishes a constraint on the vertex-to-vertex matrix:

$$\det[M_{VV'}] = \det R_n \cdot \det T_{n-1} \cdots \det R_2 \cdot \det T_1 \cdot \det R_1 = 1 \cdot 1 \cdots 1 \cdot 1 = 1$$

$$\implies \det[M_{VV'}] = 1 \implies AD - BC = 1$$

Consider a simple example of the matrix $M_{VV'}$ for a two-lens system with powers $\phi_1 = (f_1)^{-1}$ and $\phi_2 = (f_2)^{-1}$ separated by $t$. The product of the two refraction matrices and the transfer matrix is:

$$M_{VV'} = R_2 T_1 R_1 = \begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix}$$

$$M_{VV'} = \begin{bmatrix} 1 - \phi_1 t & t \\ -\phi_\text{eff} & 1 - \phi_2 t \end{bmatrix}$$

where the known expression for the system power

$$\frac{1}{f_\text{eff}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 \cdot f_2} \implies \phi_\text{eff} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$

has been substituted in the last expression. It is easy to confirm that the determinant of this system matrix is unity.

The four elements $A$, $B$, $C$, $D$ may be combined to find useful system metrics in terms of the elements of the vertex-to-vertex matrix $M_{VV'}$. In the last two rows, $m$ is the transverse magnification and the reduced distances $t_1$ and $t_2$ are taken as positive for a real object and a real image, as in the transfer matrices used throughout this chapter:

| Quantity | Expression |
| effective focal length of system | $f_\text{eff} = \dfrac{1}{\phi_\text{eff}} = -\dfrac{1}{C}$ |
| front focal length | $\text{FFL} = \dfrac{FV}{n} = -\dfrac{D}{C}$ |
| back focal length | $\text{BFL} = \dfrac{V'F'}{n'} = -\dfrac{A}{C}$ |
| distance from front vertex to object-space principal point | $\dfrac{VH}{n} = \dfrac{D - 1}{C}$ |
| distance from image-space principal point to rear vertex | $\dfrac{H'V'}{n'} = \dfrac{A - 1}{C}$ |
| distance from rear vertex to image (if object distance $t_1$ is known) | $\dfrac{V'O'}{n'} = t_2 = \dfrac{m - A}{C} = -\dfrac{B + A t_1}{D + C t_1}$ |
| distance from object to front vertex (if image distance $t_2$ is known) | $\dfrac{OV}{n} = t_1 = \dfrac{\frac{1}{m} - D}{C} = -\dfrac{B + D t_2}{A + C t_2}$ |

When evaluating matrices, note that you need to retain plenty of significant figures in the calculation (at least 6) to ensure that the derived values are sufficiently accurate.

3.2.4 Example 1: System of Two Positive Thin Lenses

To illustrate, consider the system of two thin lenses in the last section with $f_1 = +100$ mm, $f_2 = +50$ mm, and $t = 75$ mm, which we showed to have $f_\text{eff} = +\frac{200}{3}$ mm ≅ 66.7 mm. The system matrix is:

$$M_{VV'} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix}$$

$$= \begin{bmatrix} 1 & 0 \\ -\frac{1}{50\ \text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100\ \text{mm}} & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75\ \text{mm} \\ -\frac{3}{200\ \text{mm}} & -\frac{1}{2} \end{bmatrix}$$

and its determinant evaluates to one:

$$\det \begin{bmatrix} \frac{1}{4} & 75\ \text{mm} \\ -\frac{3}{200\ \text{mm}} & -\frac{1}{2} \end{bmatrix} = -\frac{1}{8} + \frac{9}{8} = 1$$

From the values in the last section, we can see that

$$B = 75\ \text{mm} = t, \qquad -\frac{1}{C} = \frac{200}{3}\ \text{mm} = f_\text{eff}$$

which in turn demonstrates our old result that the power of a two-lens system is:

$$C = -\frac{1}{f_\text{eff}} \implies \phi = \phi_1 + \phi_2 - \phi_1 \phi_2 t = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2}$$

The input ray matrix consists of the provisional marginal and chief rays at the object, which “pass through” the transfer matrix from object to front surface. For example, if the object is located 1000 mm from the front vertex, the transfer matrix is:

$$T_0 = \begin{bmatrix} 1 & 1000\ \text{mm} \\ 0 & 1 \end{bmatrix}$$

If a ray is “cast out” from the center of the object ($y = 0$) at an angle of 1 radian, the ray at the front vertex is:

$$T_0 \begin{bmatrix} y \\ nu \end{bmatrix} = T_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1000\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

In words, the height of the provisional marginal ray at the front vertex is 1000 mm and the angle is 1 radian, a HUGE angle, but remember that all equations in this paraxial assumption are linear, so the angle and ray height can be scaled to any value.
The emerging provisional marginal ray is:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75\ \text{mm} \\ -\frac{3}{200\ \text{mm}} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1000\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 325\ \text{mm} \\ -\frac{31}{2} \end{bmatrix}$$

In words, the marginal ray that leaves an object 1000 mm in front of the front vertex at an angle of 1 radian emerges from the image-space vertex with height $y' = 325$ mm and angle $n'u' = -\frac{31}{2}$ radians.

To find the location of the image, find the distance at which the marginal ray height $y = 0$:

$$T_{V'O'} \begin{bmatrix} 325\ \text{mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 325\ \text{mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix}$$

$$\implies 325\ \text{mm} + \left(-\frac{31}{2}\right) \cdot \frac{t'}{n'} = 0$$

$$\implies \frac{t'}{1} = 325\ \text{mm} \cdot \left(+\frac{2}{31}\right) = +\frac{650}{31}\ \text{mm} \cong +20.97\ \text{mm}$$

which agrees with the result obtained earlier. We observed that the transverse magnification of the image in this configuration is

$$M_T = -\frac{z'}{z} = -\frac{H'O'}{OH} = -\frac{2}{31} \cong -0.0645$$

so the provisional marginal ray at the image point is:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ M_T^{-1} \end{bmatrix}$$

Figure: The marginal ray out of the vertex-to-vertex matrix for the object distance OV = 1000 mm.

Back Focal Length (BFL)

The image of an object located at ∞ is the image-space focal point of the system. This ray enters the system with angle $nu = 0$ and arbitrary height, which we can model as $y = 1$. The emerging ray is:

$$\begin{bmatrix} \frac{1}{4} & 75 \\ -\frac{3}{200} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix}$$

The ray height is $\frac{1}{4}$ mm and the angle is $n'u' = -\frac{3}{200}$. The distance to the point where the ray height is zero is the back focal distance:

$$T_{V'F'} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{3}{200} \end{bmatrix}$$

$$\implies \frac{1}{4} + \left(-\frac{3}{200\ \text{mm}}\right) \cdot \frac{t'}{n'} = 0$$

$$\implies \text{BFL} = V'F' = \frac{t'}{1} = \frac{1}{4} \times \frac{200\ \text{mm}}{3} = \frac{100}{6}\ \text{mm} \cong 16.7\ \text{mm}$$

Front Focal Length (FFL): Ray Through “Reversed” System

To find the front focal distance, we can trace the “provisional” marginal ray “backwards” through the system, or trace it through the “reversed” system where the lenses are placed in the opposite order.
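The forward calculations above (system matrix, emerging ray, image distance, and back focal length) can be checked with exact rational arithmetic before the system is reversed. This is a minimal sketch under the thin-lens-in-air assumption of Example 1; the helper names (`thin_lens`, `transfer`, `apply`, `distance_to_axis`) are introduced here for illustration only.

```python
from fractions import Fraction as F


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, ray):
    """Apply a 2x2 matrix to a ray given as (y, nu)."""
    y, nu = ray
    return (m[0][0] * y + m[0][1] * nu, m[1][0] * y + m[1][1] * nu)


def thin_lens(f):
    """Refraction matrix of a thin lens with focal length f (in air)."""
    return [[F(1), F(0)], [F(-1) / f, F(1)]]


def transfer(t):
    """Transfer matrix over a reduced distance t."""
    return [[F(1), F(t)], [F(0), F(1)]]


def distance_to_axis(ray):
    """Reduced distance t'/n' at which the ray (y, nu) reaches y = 0."""
    y, nu = ray
    return -y / nu


f1, f2, t = F(100), F(50), F(75)          # Example 1 values, in mm
M = matmul2(thin_lens(f2), matmul2(transfer(t), thin_lens(f1)))

# Provisional marginal ray from an object 1000 mm in front of the first lens
ray_at_vertex = apply(transfer(1000), (F(0), F(1)))
ray_out = apply(M, ray_at_vertex)          # expect y = 325, nu = -31/2

image_distance = distance_to_axis(ray_out)                 # expect 650/31 mm
bfl = distance_to_axis(apply(M, (F(1), F(0))))             # expect 50/3 mm

print("M_VV' =", M)
print("emerging ray:", ray_out)
print("V'O' =", image_distance, "mm;  BFL =", bfl, "mm")
```

The matrices are multiplied in the order $R_2 T_1 R_1$, i.e., the first element encountered by the ray appears on the right.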
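The “reversed” system matrix is:

$$(M_{VV'})_\text{reversed} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{100} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{50} & 1 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 75 \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix}$$

Note that the “diagonal” elements of the “forward” and “reversed” vertex-to-vertex matrices are “swapped”, while the “off-diagonal” elements are identical.

If the input ray height is 1 and the angle is 0, the outgoing ray from the reversed matrix is:

$$\begin{bmatrix} -\frac{1}{2} & 75\ \text{mm} \\ -\frac{3}{200\ \text{mm}} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} \\ -\frac{3}{200} \end{bmatrix} \implies \text{FFL} = FV = -\frac{-\frac{1}{2}\ \text{mm}}{-\frac{3}{200}} = -\frac{100}{3}\ \text{mm}$$

which matches $-D/C$ for the forward matrix. The negative sign indicates that the object-space focal point lies behind the front vertex, i.e., inside the system.

3.2.5 Example 2: Telephoto Lens

To illustrate, we apply the vertex-to-vertex matrix for the thin-lens telephoto considered in the last section with $f_1 = +100$ mm, $f_2 = -25$ mm, and $t = +80$ mm:

$$M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{-25\ \text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100\ \text{mm}} & 1 \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & 80\ \text{mm} \\ -\frac{1}{500\ \text{mm}} & \frac{21}{5} \end{bmatrix}$$

$$\implies f_\text{eff} = -\frac{1}{C} = +500\ \text{mm}$$

$$\implies \text{BFL} = -\frac{A}{C} = -\left(\frac{1}{5}\right) \cdot (-500\ \text{mm}) = +100\ \text{mm}$$

$$\implies \text{FFL} = -\frac{D}{C} = -\left(\frac{21}{5}\right) \cdot (-500\ \text{mm}) = +2100\ \text{mm}$$

$$\implies \frac{VH}{n} = \frac{D - 1}{C} = \left(\frac{21}{5} - 1\right) \cdot (-500\ \text{mm}) = -1600\ \text{mm} \implies HV = +1600\ \text{mm}$$

$$\implies \frac{V'H'}{n'} = \frac{1 - A}{C} = \left(1 - \frac{1}{5}\right) \cdot (-500\ \text{mm}) = -400\ \text{mm} \implies H'V' = +400\ \text{mm}$$

Both principal points therefore lie in front of the system: H is 1600 mm in front of the front vertex and H′ is 400 mm in front of the rear vertex, consistent with $H'V' = f_\text{eff} - \text{BFL}$.

If the object is located 1000 mm from the first surface, the provisional marginal ray at the front vertex of the system is:

$$T_0 \begin{bmatrix} y \\ nu \end{bmatrix} = T_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1000\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1000\ \text{mm} \\ 1 \end{bmatrix}$$

The height of the provisional marginal ray at the front vertex is 1000 mm and the angle is 1 radian, which are huge values, but they can be scaled to any value because all equations are linear. The emerging ray is:

$$\begin{bmatrix} \frac{1}{5} & 80\ \text{mm} \\ -\frac{1}{500\ \text{mm}} & \frac{21}{5} \end{bmatrix} \begin{bmatrix} 1000\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 280\ \text{mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

In words, the marginal ray from an object 1000 mm in front of the lens emerges with height 280 mm and angle of $+\frac{11}{5}$ radians.

To find the location of the image, find the distance at which the marginal ray height $y = 0$:

$$T_{V'O'} \begin{bmatrix} 280\ \text{mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 280\ \text{mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 0 \\ \frac{11}{5} \end{bmatrix}$$

$$\implies 280\ \text{mm} + \left(+\frac{11}{5}\right) \cdot \frac{t'}{n'} = 0$$

$$\implies \frac{t'}{1} = 280\ \text{mm} \cdot \left(-\frac{5}{11}\right) = -\frac{1400}{11}\ \text{mm} \cong -127.3\ \text{mm}$$

which indicates that the image is virtual. (Figure out why!) The magnification of the image in this configuration follows from the distances measured from the principal points, $H'O' = H'V' + V'O' = 400\ \text{mm} - \frac{1400}{11}\ \text{mm} = \frac{3000}{11}$ mm and $OH = OV + VH = 1000\ \text{mm} - 1600\ \text{mm} = -600$ mm:

$$M_T = -\frac{z'}{z} = -\frac{H'O'}{OH} = -\frac{\frac{3000}{11}\ \text{mm}}{-600\ \text{mm}} = +\frac{5}{11} \cong +0.455$$

3.2.6 $M_{VV'}$ Derived From Two Rays

Consider the action of the vertex-vertex matrix on two rays that we know both before and after the system. For two arbitrary (but noncollinear) rays, we have:

$$M_{VV'} \begin{bmatrix} y_1 \\ nu_1 \end{bmatrix} = \begin{bmatrix} y'_1 \\ n'u'_1 \end{bmatrix}, \qquad M_{VV'} \begin{bmatrix} y_2 \\ nu_2 \end{bmatrix} = \begin{bmatrix} y'_2 \\ n'u'_2 \end{bmatrix}$$

In actual use, the marginal ray and chief ray are the rays of choice. The marginal ray goes from the center of the object to the center of the image while grazing the edge of the aperture stop (and therefore the edge of the entrance and exit pupils), while the chief ray goes from the edge of the object through the center of the aperture stop (and therefore of the pupils) to the edge of the image. The vertex-vertex matrix applied to the incoming marginal ray from the center of the object yields the emerging marginal ray:

$$M_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

and the same relation for the chief ray is:

$$M_{VV'} \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix} = \begin{bmatrix} \bar{y}' \\ n'\bar{u}' \end{bmatrix}$$

We can combine the two vectors to form a 2 × 2 matrix:

$$M_{VV'} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix}, \qquad M_{VV'}\,L = L'$$

We can now use the properties of the 2 × 2 matrix to derive the form of the vertex-vertex matrix:

$$(M_{VV'} L)\,L^{-1} = L' L^{-1}$$

$$(M_{VV'} L)\,L^{-1} = M_{VV'} \left(L L^{-1}\right) = M_{VV'} \cdot I$$

$$\implies L' L^{-1} = M_{VV'}$$

In words, we can evaluate the vertex-vertex matrix from its action on the marginal and chief rays.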
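A quick numerical illustration of the relation $M_{VV'} = L' L^{-1}$: the sketch below pushes two arbitrary, non-collinear rays through the Example 1 system and then recovers the system matrix from the input and output rays alone. The helper names and the particular input rays are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse2(m):
    """Inverse of a 2x2 ray matrix; its determinant is the invariant of the two rays."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if det == 0:
        raise ValueError("the two rays are collinear; the ray matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# System used only to generate the "known" output rays (Example 1 values).
M_true = [[F(1, 4), F(75)],
          [F(-3, 200), F(-1, 2)]]

# Columns are two input rays: (y, nu) and (y_bar, nu_bar); arbitrary values.
L_in = [[F(2), F(-1)],
        [F(1, 10), F(1, 5)]]
L_out = matmul2(M_true, L_in)

# Recover the system matrix from the rays alone: M = L' L^{-1}
M_recovered = matmul2(L_out, inverse2(L_in))
print("recovered matrix:", M_recovered)
print("matches the original system:", M_recovered == M_true)
```

The check for a zero determinant reflects the requirement in the text that the two rays be noncollinear; otherwise the ray matrix cannot be inverted.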
The inverse of the input-ray matrix is easy to derive:

$$L = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} \implies L^{-1} = \frac{1}{\det L} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} = \frac{1}{y \cdot n\bar{u} - \bar{y} \cdot nu} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix} \equiv \frac{1}{\aleph} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}$$

where $\aleph \equiv y \cdot n\bar{u} - \bar{y} \cdot nu$ is the previously defined Lagrangian invariant. So the vertex-vertex matrix has the form:

$$M_{VV'} = L' L^{-1} = \frac{1}{\aleph} \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}$$

$$= \frac{1}{\aleph} \begin{bmatrix} y' \cdot n\bar{u} - \bar{y}' \cdot nu & y \cdot \bar{y}' - \bar{y} \cdot y' \\ n'u' \cdot n\bar{u} - n'\bar{u}' \cdot nu & n'\bar{u}' \cdot y - n'u' \cdot \bar{y} \end{bmatrix}$$

$$= \frac{1}{\aleph} \begin{bmatrix} \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix} & \begin{vmatrix} y & \bar{y} \\ y' & \bar{y}' \end{vmatrix} \\[2ex] -\begin{vmatrix} nu & n\bar{u} \\ n'u' & n'\bar{u}' \end{vmatrix} & \begin{vmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{vmatrix} \end{bmatrix}$$

where we have used the shorthand notation for the determinant in the last expression:

$$\det \begin{bmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{bmatrix} = \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix}$$

3.3 Object-to-Image (Conjugate) Matrix

Write the matrix that carries a “test ray” with height $y$ and angle $u$ in index $n$ from the object plane to the image plane in the same $ABCD$ form used for the vertex-vertex matrix:

$$\begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}$$

$$y' = A \cdot y + B \cdot (nu)$$

$$n'u' = C \cdot y + D \cdot (nu)$$

For rays emerging from one plane and converging to the corresponding “conjugate” plane (the image), the output ray height at the image is a function ONLY of the ray height at the object; the angles of the rays at the object do not matter, since all rays from one object point converge to the same image point. In mathematical terms:

$$y' = A y + B \cdot (nu) = f[y] \ \text{(does not depend on angle)} \implies B = 0 \implies y' = A \cdot y$$

We know the relationship between $y'$ and $y$ is the transverse magnification:

$$\frac{y'}{y} = M_T = A$$

Figure: Rays $(a, b, c)$ diverge from the object and converge as $(a', b', c')$ to form the image; the choice of specific ray angle at the object has no effect on the location of the convergence; only the heights of the rays at the object matter.
If we define the angular magnification to be the ratio of the angles “from” the object and “to” the image:

\[
M_\theta \equiv \frac{\Delta u'}{\Delta u}
\]

we can find a relationship from the matrices:

\[
n'u_1' = C\cdot y + D\cdot(nu_1), \qquad n'u_2' = C\cdot y + D\cdot(nu_2)
\]

Evaluate the difference of these:

\[
n'\left(u_2' - u_1'\right) = C\cdot y - C\cdot y + D\cdot\left(nu_2 - nu_1\right)
\ \Longrightarrow\ n'\cdot\Delta u' = n\cdot D\cdot\Delta u
\]
\[
\Longrightarrow\ \frac{\Delta u'}{\Delta u} \equiv M_\theta = \frac{n}{n'}\cdot D
\ \Longrightarrow\ D = \frac{n'}{n}\cdot M_\theta
\]

We can combine these two observations to see the form of the “conjugate-to-conjugate” matrix:

\[
M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\dfrac{1}{f_{\text{eff}}} & \dfrac{n'}{n}\cdot M_\theta \end{bmatrix}
\]

We know that the determinant of this matrix must also be one, which implies that:

\[
M_T\cdot\frac{n'}{n}M_\theta = 1\ \Longrightarrow\ \frac{n'}{n}M_\theta = \frac{1}{M_T}
\]

so we can also write the conjugate matrix as:

\[
M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\dfrac{1}{f_{\text{eff}}} & \dfrac{1}{M_T} \end{bmatrix}
\]

The principal planes H and H′ are those for which $M_T = +1$:

\[
M_{HH'} = \begin{bmatrix} +1 & 0 \\ -\dfrac{1}{f_{\text{eff}}} & +1 \end{bmatrix}
\]

The points of equal conjugates are related by $M_T = -1$, so the object-image matrix for these points is:

\[
M_{OO'} = \begin{bmatrix} -1 & 0 \\ -\dfrac{1}{f_{\text{eff}}} & -1 \end{bmatrix}
\]

We can include the translation matrices from object to vertex and from vertex to image along with the vertex-to-vertex matrix $M_{VV'}$:

\[
M_{VV'} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]

The matrix that relates two conjugate planes (object O and image O′) may be obtained by adding transfer matrices for the appropriate reduced distances from the object to the front vertex, $t_1 = \overline{OV}/n_1$, and from the rear vertex to the image, $t_2 = \overline{V'O'}/n_2$, which yields for $n_1 = n_2 = 1$:

\[
M_{OO'} = \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix} M_{VV'} \begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix}
= \begin{bmatrix} M_T & 0 \\ -\phi & \dfrac{1}{M_T} \end{bmatrix}
\]
\[
\Longrightarrow\ M_T = A + t_2 C = \left(C t_1 + D\right)^{-1}, \qquad \phi = -C, \qquad 0 = (A + t_2 C)\,t_1 + B + t_2 D
\]

We know that the marginal ray heights at the object and image are zero ($y_{\text{in}} = y_{\text{out}} = 0$), which sets some limits on the “conjugate-to-conjugate” matrix.
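Apply this matrix to the ray matrix L at the object and at the image, $M_{OO'}L = L'$:

\[
\begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix}
\begin{bmatrix} 0 & \bar{y}_{\text{in}} \\ (nu)_{\text{in}} & (n\bar{u})_{\text{in}} \end{bmatrix}
= \begin{bmatrix} 0 & \bar{y}_{\text{out}} \\ (nu)_{\text{out}} & (n\bar{u})_{\text{out}} \end{bmatrix}
\]

Evaluate the inverse matrix $L^{-1}$ and apply to both sides from the right, $(M_{OO'}L)\,L^{-1} = L'L^{-1}$:

\[
\begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix}
= \begin{bmatrix} 0 & \bar{y}_{\text{out}} \\ (nu)_{\text{out}} & (n\bar{u})_{\text{out}} \end{bmatrix}
\begin{bmatrix} 0 & \bar{y}_{\text{in}} \\ (nu)_{\text{in}} & (n\bar{u})_{\text{in}} \end{bmatrix}^{-1}
= \begin{bmatrix}
\dfrac{\bar{y}_{\text{out}}}{\bar{y}_{\text{in}}} & 0 \\[2ex]
\dfrac{(n\bar{u})_{\text{out}}\cdot(nu)_{\text{in}} - (nu)_{\text{out}}\cdot(n\bar{u})_{\text{in}}}{\bar{y}_{\text{in}}\,(nu)_{\text{in}}} & \dfrac{(nu)_{\text{out}}}{(nu)_{\text{in}}}
\end{bmatrix}
\]

The ratio of the chief ray heights at the object and image is the transverse magnification, $\bar{y}_{\text{out}}/\bar{y}_{\text{in}} \equiv M_T$, whereas the ratio of the marginal ray angles is $(nu)_{\text{out}}/(nu)_{\text{in}} = 1/M_T$.

Example: System with Two Positive Thin Lenses

Again, consider the example of a system composed of two thin lenses with f1 = +100 mm, f2 = +50 mm, and t = +75 mm:

\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 75\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{100\ \text{mm}} & 1 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{4} & 75\ \text{mm} \\ -\dfrac{3}{200\ \text{mm}} & -\dfrac{1}{2} \end{bmatrix}
\]

From the table of properties of the matrix, we see that:

\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} = +\frac{200}{3}\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{100}{3}\ \text{mm} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = +\frac{50}{3}\ \text{mm} \\
\overline{VH} &= \frac{D-1}{C} = +100\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = +50\ \text{mm}
\end{aligned}
\]

which again match the results obtained before. The matrix that relates the object and image planes for the two-lens system presented above is:

\[
T_2\, M_{VV'}\, T_1
= \begin{bmatrix} 1 & \dfrac{650}{31}\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \dfrac{1}{4} & 75\ \text{mm} \\ -\dfrac{3}{200\ \text{mm}} & -\dfrac{1}{2} \end{bmatrix}
\begin{bmatrix} 1 & 1000\ \text{mm} \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -\dfrac{2}{31} & 0 \\ -\dfrac{3}{200\ \text{mm}} & -\dfrac{31}{2} \end{bmatrix}
\]

which has the form of the principal plane matrix except the diagonal elements are not both unity. However, note that they are reciprocals of each other, so that

\[
\det\begin{bmatrix} -\dfrac{2}{31} & 0 \\ -\dfrac{3}{200\ \text{mm}} & -\dfrac{31}{2} \end{bmatrix} = 1
\]

We had evaluated the transverse magnification in this configuration to be $M_T = -\tfrac{2}{31}$, so we note that the upper-left component of the conjugate-to-conjugate matrix is the transverse magnification.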
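The two-lens example above can be reproduced in a few lines. This is a sketch under the assumptions of the example (thin lenses in air, distances in mm); the helper names `lens`, `transfer`, and `matmul` are illustrative and not from the notes. The image distance is found by requiring the upper-right element of the conjugate matrix to vanish, $t_2 = -(A t_1 + B)/(C t_1 + D)$.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def lens(f):
    """Thin-lens refraction matrix for focal length f (mm), in air."""
    return ((F(1), F(0)), (-1 / F(f), F(1)))

def transfer(t):
    """Transfer matrix over distance t (mm) in air."""
    return ((F(1), F(t)), (F(0), F(1)))

# Vertex-to-vertex matrix: R2 * T * R1 (rightmost matrix acts first).
M = matmul(lens(50), matmul(transfer(75), lens(100)))
(A, B), (C, D) = M

cardinal = {
    "f_eff": -1 / C,
    "FFL": -D / C,
    "BFL": -A / C,
    "VH": (D - 1) / C,
    "H'V'": (A - 1) / C,
}

# Object 1000 mm in front of the first lens; solve B-element = 0 for t2.
t1 = F(1000)
t2 = -(A * t1 + B) / (C * t1 + D)
M_OO = matmul(transfer(t2), matmul(M, transfer(t1)))

for name, value in cardinal.items():
    print(name, "=", value, "mm")
print("image distance t2 =", t2, "mm, M_T =", M_OO[0][0])
```

The upper-left element of `M_OO` is the transverse magnification and the lower-right element is its reciprocal, as required by the unit determinant.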
The general form of a conjugate-to-conjugate matrix is:

\[
M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\phi & \dfrac{1}{M_T} \end{bmatrix}
\]

and the specific form that relates the principal planes with $M_T = 1$ is

\[
M_{HH'} = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix}
\]

This is the matrix of the equivalent “single thin lens.”

3.3.1 Matrix of the “Relaxed” Eye (focused at ∞)

The vertex-to-vertex matrix for the three refractions and two transfers is:

\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\phi_3 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \dfrac{t_2'}{n_2'} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \dfrac{t_1'}{n_1'} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix}
\]

where the individual terms evaluate to:

\[
\begin{aligned}
\phi_1 &= \frac{n_1' - n_1}{R_1} = \frac{1.336 - 1}{7.8\ \text{mm}} = 4.3077\times 10^{-2}\ \text{mm}^{-1} = 43.077\ \text{m}^{-1} = 43.077\ \text{Diopters} \\
\frac{t_1'}{n_1'} &= \frac{3.6\ \text{mm}}{1.336} = 2.6946\ \text{mm} \\
\phi_2 &= \frac{n_2' - n_2}{R_2} = \frac{1.413 - 1.336}{10\ \text{mm}} = 0.77\times 10^{-2}\ \text{mm}^{-1} = 7.7\ \text{Diopters} \\
\frac{t_2'}{n_2'} &= \frac{3.6\ \text{mm}}{1.413} = 2.5478\ \text{mm} \\
\phi_3 &= \frac{n_3' - n_3}{R_3} = \frac{1.336 - 1.413}{-6\ \text{mm}} = 1.2833\times 10^{-2}\ \text{mm}^{-1} = 12.833\ \text{Diopters}
\end{aligned}
\]

so the vertex-to-vertex matrix has the form:

\[
M_{VV'} = \begin{bmatrix} 0.75683 & 5.1895\ \text{mm} \\ -5.9596\times 10^{-2}\ \text{mm}^{-1} & 0.91265 \end{bmatrix}
\]
\[
\Longrightarrow\ f_{\text{eye}} = \left(5.9596\times 10^{-2}\ \text{mm}^{-1}\right)^{-1} = +16.780\ \text{mm}
\]
\[
\Longrightarrow\ \phi_{\text{eye}} = 5.9596\times 10^{-2}\ \text{mm}^{-1} = 59.596\ \text{m}^{-1} \cong 60\ \text{Diopters}
\]

A ray from infinity has a ray angle of zero, but the ray height is limited by the iris. If we assume a ray height of 1 mm, then the output ray vector is:

\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \begin{bmatrix} 0.75683 & 5.1895\ \text{mm} \\ -5.9596\times 10^{-2}\ \text{mm}^{-1} & 0.91265 \end{bmatrix}
\begin{bmatrix} 1\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 0.75683\ \text{mm} \\ -5.9596\times 10^{-2} \end{bmatrix}
\]

3.4 Vertex-Vertex Matrices of Simple Imaging Systems

We now get to where the “rubber meets the road”: the discussion of simple examples of actual imaging systems. It is useful to emphasize that optical systems may create a real image that can be “sensed” by a CCD or photographic emulsion, while those intended for human viewing produce virtual images or are afocal (image at infinity).
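The relaxed-eye matrix of Section 3.3.1 is a convenient arithmetic check, since it chains five matrices. The sketch below multiplies them in floating point using the radii, indices, and thicknesses quoted in that section; the function names are illustrative, not from the notes.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def refraction(phi):
    """Refraction matrix for a surface of power phi (1/mm)."""
    return ((1.0, 0.0), (-phi, 1.0))

def transfer(reduced_t):
    """Transfer matrix for a reduced thickness t/n (mm)."""
    return ((1.0, reduced_t), (0.0, 1.0))

# Surface powers (1/mm) and reduced thicknesses (mm) quoted in the text.
phi1 = (1.336 - 1.0) / 7.8
phi2 = (1.413 - 1.336) / 10.0
phi3 = (1.336 - 1.413) / (-6.0)
t1_reduced = 3.6 / 1.336
t2_reduced = 3.6 / 1.413

# The rightmost matrix acts first: R3 * T2 * R2 * T1 * R1.
M = refraction(phi1)
for step in (transfer(t1_reduced), refraction(phi2),
             transfer(t2_reduced), refraction(phi3)):
    M = matmul(step, M)

f_eye = -1.0 / M[1][0]          # effective focal length in mm
power_diopters = 1000.0 / f_eye  # 1/mm converted to 1/m

print(M)
print(f"f_eye = {f_eye:.3f} mm, power = {power_diopters:.2f} D")
```

Applying `M` to the ray (1 mm, 0) simply returns its first column, which is the output ray vector shown in that section.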
3.4.1 Magnifier (“magnifying glass,” “loupe”)

The magnifier or loupe is a lens (or system of lenses) with positive focal length that is used to make the image on the retina larger than could be formed with the eye alone. Recall that when the ciliary muscles that deform the eye lens are relaxed, the lens becomes “flatter,” increasing the focal length. To view an object “close up,” the focal length of the lens must shorten by making the lens more spherical. The closest distance at which an object appears sharply focused by the unaided eye is the “near point,” which depends on the flexibility of the deformable eyelens and the capability of the ciliary muscles; both vary from individual to individual, and with age for a single individual. The distance to the near point may be as close as 50 mm for a young child and as far as 1000 mm to 2000 mm for an elderly person. This reduction in “accommodation” is one of the signs of aging. The near point of an “ideal” eye is assumed to be 250 mm ≅ 10 in from the front surface. For nearsighted individuals, the near point is closer to the eye, which increases the angular subtense of fine details for those individuals. For this reason, nearsighted individuals in ancient times (before optical correction) often were attracted to professions requiring fine work, such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in these crafts.

In use, the object is held closer to the eye than the near point and viewed through the positive lens. The object is placed closer to the lens than its focal length, so that the lens creates a virtual image “behind” the lens at the near point.
If the focal length of the magnifying lens is f = 50 mm, the virtual image is at the near point (image distance −250 mm), and the object distance is z, the object-to-image matrix is:

\[
M_{OO'} = \begin{bmatrix} 1 & -250\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & z\ \text{mm} \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 6 & (6\cdot z - 250)\ \text{mm} \\ -\dfrac{1}{50\ \text{mm}} & 1 - \dfrac{1}{50}z \end{bmatrix}
\]

Since this has the form of an “object-to-image” matrix, the off-diagonal element in the upper-right corner must evaluate to zero:

\[
(6\cdot z - 250)\ \text{mm} = 0\ \Longrightarrow\ z = \frac{250}{6}\ \text{mm} = 41\tfrac{2}{3}\ \text{mm}
\]

The diagonal element in the upper-left corner of the “object-to-image” matrix is the transverse magnification:

\[
M_T = +6 = 1 + \frac{250\ \text{mm}}{f}
\]

This is the transverse magnification of the magnifier if the image is at the near point. If the object is located at the object-space focal point, then the image is at infinity:

\[
M_{OO'} = \begin{bmatrix} 1 & \infty\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 50\ \text{mm} \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} \infty & 0 \\ -\dfrac{1}{50\ \text{mm}} & 0 \end{bmatrix}
\]

3.4.2 Galilean Telescope of Thin Lenses

The Galilean telescope is an afocal system formed from an objective lens with positive power and an eyelens with negative power, separated by the sum of the focal lengths. If the focal lengths of the objective and eyelens are f1 = +200 and f2 = −25 units, the separation is t = (200 − 25) = 175 units. The system matrix is:

\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-25\ \text{mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 175\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(+200\ \text{mm})} & 1 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{8} & 175\ \text{mm} \\ 0 & 8 \end{bmatrix}
\]

Note that the system power φ = 0 ⟹ f_eff = ∞, as it must be for an afocal system (both object- and image-space focal points at infinity). The ray from an object at ∞ with unit height generates the outgoing ray:

\[
\begin{bmatrix} y'\ [\text{mm}] \\ n'u' \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{8} & 175\ \text{mm} \\ 0 & 8 \end{bmatrix}
\begin{bmatrix} 1\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{8}\ \text{mm} \\ 0 \end{bmatrix}
\]

so the outgoing ray is at height 1/8 mm and the angle is zero; both incoming and outgoing rays are parallel to the axis. Note that the diagonal elements of $M_{VV'}$ are positive and the determinant is 1.
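For a “provisional” chief ray into the system with height 0 and angle 1, the outgoing ray is:

\[
\begin{bmatrix} y'\ [\text{mm}] \\ n'u' \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{8} & 175\ \text{mm} \\ 0 & 8 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 175\ \text{mm} \\ 8 \end{bmatrix}
\]

So the outgoing ray angle is 8 times larger; this is the angular magnification of the telescope. The image is upright since the incoming and outgoing ray angles are both positive. The matrix of an afocal system has C = 0 and D = m_θ, and the unit determinant then requires A = 1/m_θ:

\[
M_{VV'}\ (\text{afocal system}) = \begin{bmatrix} \dfrac{1}{m_\theta} & B \\ 0 & m_\theta \end{bmatrix}
\]

where the element B depends on the lens separation (B = 175 mm in this example).

3.4.3 Keplerian Telescope of Thin Lenses

Consider the Keplerian telescope with f1 = +200 and f2 = +25 units and separation t = (200 + 25) = 225 units. The system matrix is:

\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(25\ \text{mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 225\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(+200\ \text{mm})} & 1 \end{bmatrix}
= \begin{bmatrix} -\dfrac{1}{8} & 225\ \text{mm} \\ 0 & -8 \end{bmatrix}
\]

The diagonal elements are negative, the determinant is 1, and the system power φ = 0 ⟹ f_eff = ∞. The element D = −8 specifies that the angular magnification is 8 and the image is inverted. The ray from an object at ∞ with unit height generates the outgoing ray:

\[
\begin{bmatrix} y'\ [\text{mm}] \\ n'u' \end{bmatrix}
= \begin{bmatrix} -\dfrac{1}{8} & 225\ \text{mm} \\ 0 & -8 \end{bmatrix}
\begin{bmatrix} 1\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} -\dfrac{1}{8}\ \text{mm} \\ 0 \end{bmatrix}
\]

so the outgoing ray is at height −1/8 mm (the image is “inverted”) and the angle is zero. The “provisional” chief ray into the system has height 0 and angle 1; the outgoing ray is:

\[
\begin{bmatrix} y'\ [\text{mm}] \\ n'u' \end{bmatrix}
= \begin{bmatrix} -\dfrac{1}{8} & 225\ \text{mm} \\ 0 & -8 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} 225\ \text{mm} \\ -8 \end{bmatrix}
\]

So the outgoing ray angle is 8 times larger than that of the incoming ray but negative (which implies that the image is inverted).

3.4.4 Thick Lenses

The matrix method is convenient for thick lenses. Consider a thick lens made of glass with n′ = 1.5, radii of curvature R1 = +50 mm and R2 = −100 mm, and thickness t′ (which we shall vary).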
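It is useful to evaluate the focal length of the single “thin” lens with these radii and refractive index from the lensmaker’s equation:

\[
\frac{1}{f} = (n - 1)\cdot\left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]
\[
f = \left((1.5 - 1.0)\cdot\left(\frac{1}{50\ \text{mm}} - \frac{1}{-100\ \text{mm}}\right)\right)^{-1} = +\frac{200}{3}\ \text{mm} = 66\tfrac{2}{3}\ \text{mm}
\]

The powers of the two surfaces are:

\[
\phi_1 = \frac{n' - n}{R_1} = \frac{1.5 - 1}{50\ \text{mm}} = +\frac{0.5}{50\ \text{mm}} = +\frac{1}{100\ \text{mm}}
\]
\[
\phi_2 = \frac{n - n'}{R_2} = \frac{1 - 1.5}{-100\ \text{mm}} = \frac{-0.5}{-100\ \text{mm}} = +\frac{1}{200\ \text{mm}}
\]

so if the thickness is zero, the power and focal length evaluate to:

\[
\phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1\cdot\phi_2\cdot t
= \left(+\frac{1}{100\ \text{mm}}\right) + \left(+\frac{1}{200\ \text{mm}}\right) - \left(+\frac{1}{100\ \text{mm}}\right)\cdot\left(+\frac{1}{200\ \text{mm}}\right)\cdot 0
= \frac{3}{200\ \text{mm}}
\]
\[
\frac{1}{f_{\text{eff}}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1\cdot f_2}
\ \Longrightarrow\ f_{\text{eff}} = +\frac{200}{3}\ \text{mm}
\]

which agrees with the result obtained from the lensmaker’s equation. The system matrix for the lens with thickness t′ may be evaluated with this parameter:

\[
M_{VV'} = R_2\, T_1\, R_1
= \begin{bmatrix} 1 & 0 \\ -\left(+\dfrac{1}{200\ \text{mm}}\right) & 1 \end{bmatrix}
\begin{bmatrix} 1 & \dfrac{t'}{1.5}\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\left(+\dfrac{1}{100\ \text{mm}}\right) & 1 \end{bmatrix}
\]
\[
= \begin{bmatrix}
1 - 0.0066667\cdot t' & 0.6666667\cdot t'\ \text{mm} \\
\dfrac{1}{100\ \text{mm}}\left(0.0033333\cdot t' - 1\right) - \dfrac{1}{200\ \text{mm}} & 1 - 0.0033333\cdot t'
\end{bmatrix}
\]

Note that the thickness t′ is present in each of the four terms in the matrix.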
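The dependence of the cardinal points on thickness can be tabulated directly from this matrix. The sketch below assumes the lens of this section (n′ = 1.5, R1 = +50 mm, R2 = −100 mm, in air) and evaluates $R_2 T_1 R_1$ for several thicknesses; `thick_lens_matrix` and `cardinal_points` are illustrative names, not from the notes.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def thick_lens_matrix(t, n=1.5, r1=50.0, r2=-100.0):
    """Vertex-to-vertex matrix R2 * T * R1 of a thick lens in air (mm)."""
    phi1 = (n - 1.0) / r1          # power of the front surface
    phi2 = (1.0 - n) / r2          # power of the rear surface
    refr1 = ((1.0, 0.0), (-phi1, 1.0))
    trans = ((1.0, t / n), (0.0, 1.0))   # reduced thickness t/n
    refr2 = ((1.0, 0.0), (-phi2, 1.0))
    return matmul(refr2, matmul(trans, refr1))

def cardinal_points(m):
    """f_eff, FFL, BFL, VH, H'V' from the table of matrix properties."""
    (a, _b), (c, d) = m
    return {
        "f_eff": -1.0 / c,
        "FFL": -d / c,
        "BFL": -a / c,
        "VH": (d - 1.0) / c,
        "H'V'": (a - 1.0) / c,
    }

table = {t: cardinal_points(thick_lens_matrix(t)) for t in (0, 1, 2, 5, 10)}
for t, row in table.items():
    values = ", ".join(f"{k} = {v:.3f}" for k, v in row.items())
    print(f"t' = {t:2d} mm: {values}")
```

Because the values are computed without intermediate rounding, the last digits can differ slightly from hand calculations that round the matrix elements first.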
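Now we can derive matrices for different values of the thickness: t′ = 0 mm, 1 mm, 2 mm, 5 mm, and 10 mm, where we substitute into the table of properties to find the BFL, FFL, VH, and H′V′.

t′ = 0 mm (thin lens)

\[
M_{VV'}(t' = 0\ \text{mm}) = \begin{bmatrix}
1 - 0.0066667\cdot 0 & 0.6666667\cdot 0\ \text{mm} \\
\dfrac{1}{100\ \text{mm}}\left(0.0033333\cdot 0 - 1\right) - \dfrac{1}{200\ \text{mm}} & 1 - 0.0033333\cdot 0
\end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\dfrac{3}{200\ \text{mm}} & 1 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} = +\frac{200}{3}\ \text{mm} = 66\tfrac{2}{3}\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{1}{\left(-\frac{3}{200\ \text{mm}}\right)} = +\frac{200}{3}\ \text{mm} = f_{\text{eff}} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{1}{\left(-\frac{3}{200\ \text{mm}}\right)} = +\frac{200}{3}\ \text{mm} = f_{\text{eff}} \\
\overline{VH} &= \frac{D-1}{C} = \frac{(1-1)}{\left(-\frac{3}{200\ \text{mm}}\right)} = 0\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = \frac{(1-1)}{\left(-\frac{3}{200\ \text{mm}}\right)} = 0\ \text{mm}
\end{aligned}
\]

All quantities correspond to the values we would expect for the single thin lens: the front and back focal lengths are identical to the effective focal length, which means that the principal points coincide with the vertices; they are all located AT the lens.

t′ = 1 mm

\[
M_{VV'}(t' = 1\ \text{mm}) = \begin{bmatrix}
1 - 0.0066667\cdot 1 & 0.6666667\cdot 1\ \text{mm} \\
\dfrac{1}{100\ \text{mm}}\left(0.0033333\cdot 1 - 1\right) - \dfrac{1}{200\ \text{mm}} & 1 - 0.0033333\cdot 1
\end{bmatrix}
= \begin{bmatrix} 0.99333 & 0.66667\ \text{mm} \\ -\dfrac{1}{66.814\ \text{mm}} & 0.99667 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} \cong 66.814\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{0.99667}{\left(-\frac{1}{66.814\ \text{mm}}\right)} = 66.592\ \text{mm} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{0.99333}{\left(-\frac{1}{66.814\ \text{mm}}\right)} = 66.368\ \text{mm} \\
\overline{VH} &= \frac{D-1}{C} = \frac{(0.99667 - 1)}{\left(-\frac{1}{66.814\ \text{mm}}\right)} = 0.2225\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = \frac{(0.99333 - 1)}{\left(-\frac{1}{66.814\ \text{mm}}\right)} = 0.4456\ \text{mm}
\end{aligned}
\]

So the object- and image-space principal planes are within the lens and close to the surfaces. Note that the front and back focal lengths are slightly different: the image-space principal point is “more within the lens” since the second surface has less power than the front surface.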
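t′ = 2 mm

\[
M_{VV'}(t' = 2\ \text{mm}) = \begin{bmatrix}
1 - 0.0066667\cdot 2 & 0.6666667\cdot 2\ \text{mm} \\
\dfrac{1}{100\ \text{mm}}\left(0.0033333\cdot 2 - 1\right) - \dfrac{1}{200\ \text{mm}} & 1 - 0.0033333\cdot 2
\end{bmatrix}
= \begin{bmatrix} 0.98667 & 1.3333\ \text{mm} \\ -1.4933\times 10^{-2}\ \text{mm}^{-1} & 0.99333 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-1.4933\times 10^{-2}\ \text{mm}^{-1}\right)} \cong 66.966\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = 66.519\ \text{mm} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = 66.073\ \text{mm} \\
\overline{VH} &= \frac{D-1}{C} = \frac{(0.99333 - 1)}{\left(-1.4933\times 10^{-2}\ \text{mm}^{-1}\right)} = 0.4467\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = \frac{(0.98667 - 1)}{\left(-1.4933\times 10^{-2}\ \text{mm}^{-1}\right)} = 0.8926\ \text{mm}
\end{aligned}
\]

Note that the same “behavior” exists for this lens: the image-space principal point is farther “inside” the lens than the object-space principal point.

t′ = 5 mm

\[
M_{VV'}(t' = 5\ \text{mm}) = \begin{bmatrix}
1 - 0.0066667\cdot 5 & 0.6666667\cdot 5\ \text{mm} \\
\dfrac{1}{100\ \text{mm}}\left(0.0033333\cdot 5 - 1\right) - \dfrac{1}{200\ \text{mm}} & 1 - 0.0033333\cdot 5
\end{bmatrix}
= \begin{bmatrix} 0.96667 & 3.3333\ \text{mm} \\ -1.4833\times 10^{-2}\ \text{mm}^{-1} & 0.98333 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-1.4833\times 10^{-2}\ \text{mm}^{-1}\right)} \cong 67.417\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = 66.293\ \text{mm} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = 65.170\ \text{mm} \\
\overline{VH} &= \frac{D-1}{C} = \frac{(0.98333 - 1)}{\left(-1.4833\times 10^{-2}\ \text{mm}^{-1}\right)} = 1.1238\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = \frac{(0.96667 - 1)}{\left(-1.4833\times 10^{-2}\ \text{mm}^{-1}\right)} = 2.247\ \text{mm}
\end{aligned}
\]

t′ = 10 mm

\[
M_{VV'}(t' = 10\ \text{mm}) = \begin{bmatrix}
1 - 0.0066667\cdot 10 & 0.6666667\cdot 10\ \text{mm} \\
\dfrac{1}{100\ \text{mm}}\left(0.0033333\cdot 10 - 1\right) - \dfrac{1}{200\ \text{mm}} & 1 - 0.0033333\cdot 10
\end{bmatrix}
= \begin{bmatrix} 0.93333 & 6.6667\ \text{mm} \\ -1.4667\times 10^{-2}\ \text{mm}^{-1} & 0.96667 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-1.4667\times 10^{-2}\ \text{mm}^{-1}\right)} \cong 68.180\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{0.96667}{\left(-1.4667\times 10^{-2}\ \text{mm}^{-1}\right)} = 65.908\ \text{mm} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{0.93333}{\left(-1.4667\times 10^{-2}\ \text{mm}^{-1}\right)} = 63.635\ \text{mm} \\
\overline{VH} &= \frac{D-1}{C} = \frac{(0.96667 - 1)}{\left(-1.4667\times 10^{-2}\ \text{mm}^{-1}\right)} = 2.2724\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = \frac{(0.93333 - 1)}{\left(-1.4667\times 10^{-2}\ \text{mm}^{-1}\right)} = 4.5456\ \text{mm}
\end{aligned}
\]

From these results, we see that the effective focal length gets LONGER as the lens gets THICKER for the same radii of curvature, and that the image-space principal point “penetrates” farther inside the lens as the lens thickness is increased.

3.4.5 Microscope

A simple microscope is also composed of two lenses (assumed to be “thin” in this discussion, though the optical components generally are composed of multiple elements). The distance t between the image-space (rear) focal point of the first lens and the object-space (front) focal point of the ocular (the “tube length”) is fixed, often at t = 160 mm. The first lens (the “objective”) has a (very) short focal length and the object typically is placed just “outside” its object-space focal point, so that z1 ≃ f1. The objective generates a real image between the objective and the eyepiece (or “ocular”), which is a lens with a short focal length used as a simple magnifier. Assume f1 = 5 mm and f2 = 50 mm. Note that the matrix below is evaluated with the eyelens term −1/(−50 mm) and a lens separation of 160 mm:

\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-50\ \text{mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(5\ \text{mm})} & 1 \end{bmatrix}
= \begin{bmatrix} -31 & 160\ \text{mm} \\ -\dfrac{41}{50\ \text{mm}} & \dfrac{21}{5} \end{bmatrix},
\qquad
\det\begin{bmatrix} -31 & 160\ \text{mm} \\ -\dfrac{41}{50\ \text{mm}} & \dfrac{21}{5} \end{bmatrix} = 1
\]
\[
\begin{aligned}
f_{\text{eff}} &= -\frac{1}{C} = +\frac{50}{41}\ \text{mm} \cong +1.220\ \text{mm} \\
\text{FFL} = \overline{FV} &= -\frac{D}{C} = -\frac{\left(\frac{21}{5}\right)}{\left(-\frac{41}{50\ \text{mm}}\right)} = +\frac{210}{41}\ \text{mm} \cong +5.12\ \text{mm} \\
\text{BFL} = \overline{V'F'} &= -\frac{A}{C} = -\frac{-31}{\left(-\frac{41}{50\ \text{mm}}\right)} = -\frac{1550}{41}\ \text{mm} \cong -37.8\ \text{mm} \\
\overline{VH} &= \frac{D-1}{C} = \frac{\left(\frac{21}{5} - 1\right)}{\left(-\frac{41}{50\ \text{mm}}\right)} = -\frac{160}{41}\ \text{mm} \cong -3.902\ \text{mm} \\
\overline{H'V'} &= \frac{A-1}{C} = \frac{-31 - 1}{\left(-\frac{41}{50\ \text{mm}}\right)} = \frac{1600}{41}\ \text{mm} = 39.02\ \text{mm}
\end{aligned}
\]

Including a transfer of 3 mm from the object to the front vertex gives the object-to-rear-vertex matrix:

\[
M_{OV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-50\ \text{mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(5\ \text{mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 3\ \text{mm} \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -31 & 67\ \text{mm} \\ -\dfrac{41}{50\ \text{mm}} & \dfrac{87}{50} \end{bmatrix}
\]

3.5 Image Location and Magnification

\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}, \qquad M_T = -\frac{z_2}{z_1} \cong -\frac{f}{z_1}\ \text{in the usual case}
\]
\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}\ \Longrightarrow\ z_2 = \left(\frac{1}{f} - \frac{1}{z_1}\right)^{-1} = \frac{z_1 f}{z_1 - f}
\]
\[
M_T = -\frac{z_2}{z_1} = -\frac{f}{z_1 - f} \cong -\frac{f}{z_1} \propto f \quad \text{if } z_1 \gg f
\]

In words, if the object distance z1 is large (compared to the focal length f), then the transverse magnification is (approximately) proportional to the focal length.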
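A quick numerical check of this proportionality, as a sketch: the object distance of 10 m and the two focal lengths below are arbitrary illustrative values, not taken from the notes.

```python
def image_distance(z1, f):
    """Image distance z2 from 1/z1 + 1/z2 = 1/f (all in mm)."""
    return z1 * f / (z1 - f)

def transverse_magnification(z1, f):
    """M_T = -z2/z1 = -f / (z1 - f)."""
    return -image_distance(z1, f) / z1

z1 = 10_000.0                      # distant object, 10 m away (assumed)
m_short = transverse_magnification(z1, 50.0)
m_long = transverse_magnification(z1, 100.0)
ratio = m_long / m_short

print(f"M_T(f = 50 mm)  = {m_short:.5f}")
print(f"M_T(f = 100 mm) = {m_long:.5f}")
print(f"ratio = {ratio:.4f}")
```

The ratio is close to, but slightly larger than, 2 because the exact expression is −f/(z1 − f) rather than −f/z1.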
Therefore, doubling the focal length doubles the magnification if the object is distant (with the caveat that the magnification is still negative and smaller than unity, −1 < M_T < 0).

3.6 Marginal and Chief Rays for the System

\[
L = \left[\begin{pmatrix} y \\ nu \end{pmatrix}\begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix}\right]
= \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix},
\qquad
\det[L] = y\cdot n\bar{u} - \bar{y}\cdot nu \equiv \aleph
\]

The marginal ray goes through the center of the object and any image(s) (i.e., the point where the marginal ray crosses the optical axis is either the object or an image of the object). It also “grazes” the edge of the aperture stop, so if we know the location and the diameter of the aperture stop in the system, we can scale the marginal ray so that its height matches the semidiameter of the aperture stop at that location. The chief ray goes through the center of the stop (and of the entrance and exit pupils), so we set the chief ray height at the location of the stop to be zero and its angle to be arbitrary (say unity), then propagate that provisional ray “forward” towards the image-space vertex and “backwards” towards the object-space vertex (note that when tracing “backwards” toward the first lens, the matrices in the ray trace must be inverted). During the tracing, we find the element that most constrains the chief ray, and then scale the provisional chief ray to make sure that it gets “through” the other elements. The angle of the chief ray emerging from the front vertex towards the object is the half-angle of the field of view; the angle of the chief ray emerging from the image-space vertex is the half-angle of the image field at the sensor.

3.6.1 Examples of Marginal and Chief Rays for Systems

In the lab, you constructed Keplerian and/or Galilean telescopes with an iris diaphragm at various locations. We can use this as a model for demonstrating how to evaluate the marginal and chief rays.
To evaluate the location of the stop, we must know the diameters as well as the locations of the lenses. We can cast a provisional marginal ray into the system from the object to determine which element is the aperture stop. We then scale the provisional marginal ray so that its height and the semidiameter of the stop “match.” We then propagate a provisional chief ray forward and backward from the center of the stop and scale its angle so that it grazes the element that constrains it. From the angle of the chief ray entering and exiting the system, we can determine the field of view. We will use the Galilean telescope as the first example.

Example 1: Galilean telescope, object at ∞

Consider a telescope with the following parameters:

L1: f1 = +200 mm, d1 = 40 mm
L2: f2 = −40 mm, d2 = 5 mm
t = f1 + f2 = 160 mm

\[
R_1 = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix},
\qquad
T = \begin{bmatrix} 1 & 160\ \text{mm} \\ 0 & 1 \end{bmatrix},
\qquad
R_2 = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \text{mm}} & 1 \end{bmatrix}
\]

The vertex-vertex matrix of this system is

\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\]

for which element C = 0, which is characteristic of an afocal system. For an object at infinity, the provisional marginal ray into the system has an angle of zero and a height equal to the semidiameter of the first element:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} \dfrac{d_1}{2} \\ 0 \end{bmatrix} = \begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
\]

We can propagate this ray through the first lens and translate it to the second lens:

\[
T\,R_1 \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} 1 & 160\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\ \text{mm} \\ -\dfrac{1}{10} \end{bmatrix}
\]

In words, the height of the provisional marginal ray at the second lens is 4 mm.
Note that the ray after the second lens has the form:

\[
M_{VV'} \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\ \text{mm} \\ 0 \end{bmatrix}
\]

so that the height of the provisional marginal ray at the second lens is the same before and after refraction (no surprise there) and the ray angle after the second lens is 0 (parallel to the optical axis, again no surprise). Note that the ray height at L2 is larger than the specified semidiameter of the second lens:

\[
y' > \frac{d_2}{2} = \frac{5\ \text{mm}}{2} = 2.5\ \text{mm}\ \Longrightarrow\ L_2\ \text{is the aperture stop}
\]

This means two things: (1) that the second lens is the aperture stop, and (2) that we must scale the height and angle of the provisional marginal ray to ensure that it grazes the edge of the stop. The scaling factor is the ratio of the semidiameter of the stop to the height of the provisional marginal ray at the stop:

\[
\frac{\left(\frac{d_2}{2}\right)}{y\ \text{at}\ L_2} = \frac{2.5\ \text{mm}}{4\ \text{mm}} = \frac{5}{8}
\]

We apply this scale factor to the marginal ray at all locations in the system. The marginal ray at the first lens from an object at infinite distance is:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at}\ L_1}
= \frac{5}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{5}{8}\cdot\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 12.5\ \text{mm} \\ 0 \end{bmatrix}
\]

which means that the marginal ray strikes the first lens well inside of the semidiameter; the entering “tube” of rays does not fill the lens.

Now that we know that the second lens is the aperture stop, we can propagate a provisional chief ray from the center of the stop in both directions. One possible choice for the provisional chief ray is:

\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]

where again an angle of 1 radian is HUGE, but we will scale it based on the parameters of the rest of the system. Propagate this ray through the stop (towards image space) to obtain

\[
R_2 \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]

so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is the stop, because the ray passes through the center of the lens.
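The provisional chief ray may be propagated from the stop “backwards” towards the first lens. The translation matrix is inverted because we are tracing from right to left, against the direction of the light:

\[
T^{-1}\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \left(\begin{bmatrix} 1 & +160\ \text{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -160\ \text{mm} \\ 1 \end{bmatrix}
\]

The height of the provisional chief ray at the first element is negative, which means that it is BELOW the optical axis at a MUCH LARGER distance than the semidiameter d1/2 = 20 mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:

\[
\frac{\left(\frac{d_1}{2}\right)}{|y|} = \frac{20\ \text{mm}}{160\ \text{mm}} = \frac{1}{8}
\]

So now go back to the original prescription for the provisional chief ray and scale it to obtain the “actual” chief ray:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\ \Longrightarrow\
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{at}\ L_2}
= \frac{1}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} 0\ \text{mm} \\ \dfrac{1}{8} \end{bmatrix},
\qquad
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{at}\ L_1} = \begin{bmatrix} -20\ \text{mm} \\ \dfrac{1}{8} \end{bmatrix}
\]

We can now propagate this ray through L1. The chief ray emerging from the front vertex is:

\[
R_1^{-1}\,T^{-1}\begin{bmatrix} 0\ \text{mm} \\ \dfrac{1}{8} \end{bmatrix}
= \left(\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix}\right)^{-1}
\left(\begin{bmatrix} 1 & +160\ \text{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ \dfrac{1}{8} \end{bmatrix}
= \begin{bmatrix} -20\ \text{mm} \\ \dfrac{1}{40} \end{bmatrix}
\]

Now propagate this chief ray forwards through the system by multiplying by $M_{VV'}$:

\[
\begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} -20\ \text{mm} \\ \dfrac{1}{40} \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ \dfrac{1}{8} \end{bmatrix}
\]

which has height of zero emerging from L2 (the aperture stop), as expected.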
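The procedure of Example 1 (cast a provisional marginal ray, identify the stop, rescale, then trace a provisional chief ray backwards from the stop) can be written out step by step. This is a sketch using the prescription of Example 1; the helper names (`apply`, `inverse_unimodular`, and so on) are illustrative, and it uses the fact that ray-transfer matrices have unit determinant to invert them.

```python
from fractions import Fraction as F

def apply(m, ray):
    """Apply a 2x2 ray-transfer matrix to a ray (height, reduced angle)."""
    (a, b), (c, d) = m
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def inverse_unimodular(m):
    """Inverse of a 2x2 matrix whose determinant is 1."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

def lens(f):
    return ((F(1), F(0)), (-1 / F(f), F(1)))

def transfer(t):
    return ((F(1), F(t)), (F(0), F(1)))

R1, T, R2 = lens(200), transfer(160), lens(-40)
semi_L1, semi_L2 = F(20), F(5, 2)      # semidiameters d1/2 and d2/2 in mm

# Provisional marginal ray from infinity fills L1; find its height at L2.
marginal_at_L2 = apply(T, apply(R1, (semi_L1, F(0))))
L2_is_stop = abs(marginal_at_L2[0]) > semi_L2
scale_marginal = semi_L2 / abs(marginal_at_L2[0])
marginal_at_L1 = (scale_marginal * semi_L1, F(0))

# Provisional chief ray (0, 1) at the stop, traced backwards to L1.
chief_prov_at_L1 = apply(inverse_unimodular(T), (F(0), F(1)))
scale_chief = semi_L1 / abs(chief_prov_at_L1[0])
chief_at_stop = (F(0), scale_chief)
chief_entering = apply(inverse_unimodular(R1),
                       apply(inverse_unimodular(T), chief_at_stop))

print("L2 is the aperture stop:", L2_is_stop)
print("marginal ray at L1:", marginal_at_L1)
print("chief ray entering the system:", chief_entering)
```

The entering chief-ray angle is the half-angle of the field of view, so doubling it gives the full field of view of the system.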
The field of view of the system is twice the angle at the front of L1:

\[
\text{FoV} = 2\cdot\frac{1}{40}\ \text{radian} = \frac{1}{20}\ \text{radian} = \frac{1}{20}\cdot\frac{180^\circ}{\pi} \cong 2.864^\circ
\]

The exit pupil is (obviously) located at the aperture stop L2, while the entrance pupil is the image of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation of the chief ray emerging from the front vertex:

\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}\ (\text{emerging from front vertex}) = \begin{bmatrix} -20\ \text{mm} \\ \dfrac{1}{40} \end{bmatrix}
\]

The height is −20 mm and the angle is 1/40 radian, so the distance from the front vertex to the location where the ray crosses the optical axis is:

\[
z_{NP} = -\frac{-20\ \text{mm}}{\frac{1}{40}} = +800\ \text{mm}
\]

The distance from the vertex to the entrance pupil is positive, so the pupil is behind the objective and is virtual. The transverse magnification of the entrance pupil is:

\[
M_T = -\frac{800\ \text{mm}}{-160\ \text{mm}} = +5
\]

so the diameter of the entrance pupil is magnified:

\[
d_{NP} = 5\cdot 5\ \text{mm} = 25\ \text{mm}
\]

Figure: Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with aperture stop at second lens (eyepiece).

Example 2: Galilean telescope with aperture stop at FIRST lens, object at ∞

We already know that the height of the provisional marginal ray at the second lens was y = 4 mm, so we can select a diameter for L2 whose semidiameter exceeds this value, so that the aperture stop is now the first lens:

L1: f1 = +200 mm, d1 = 40 mm
L2: f2 = −40 mm, d2 = 10 mm
t = f1 + f2 = 160 mm

The vertex-vertex matrix is the same as before:

\[
M_{VV'} = \begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\]

We know from the results just calculated that if d2 = 10 mm, then its semidiameter exceeds the height of the provisional marginal ray, so the aperture stop then becomes the first lens.
The marginal ray we calculated for the first lens then becomes the actual marginal ray; at the first lens, the marginal ray is:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}\ (\text{at}\ L_1) = \begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
\]

and the marginal ray leaving the system after L2 is:

\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}\ (\text{after}\ L_2)
= M_{VV'}\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\ \text{mm} \\ 0 \end{bmatrix}
\]

Since the aperture stop has moved from L2 to L1, we have to evaluate a different chief ray; it will go through the center of L1, so the provisional chief ray at L1 is:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}\ (\text{at}\ L_1) = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]

After the first refraction, the provisional chief ray is:

\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}}\ (\text{after}\ L_1)
= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]

which again should be no surprise: since the chief ray goes through the center of L1, the lens has no impact on the ray. Now propagate the provisional chief ray to L2 by applying the translation matrix:

\[
T\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 160\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 160\ \text{mm} \\ 1 \end{bmatrix}
\]

so the height of the chief ray is again MUCH larger than the semidiameter of the lens. The scaling factor that must be applied to the provisional chief ray is the ratio of the semidiameter of L2 to the ray height:

\[
\frac{\left(\frac{d_2}{2}\right)}{y} = \frac{5\ \text{mm}}{160\ \text{mm}} = \frac{5}{160} = \frac{1}{32}
\]

Therefore the true chief ray at the first lens is:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}\ (\text{at}\ L_1)
= \frac{1}{32}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{1}{32}\cdot\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ \dfrac{1}{32} \end{bmatrix}
\]

In words, the angle of the chief ray into the first lens (and therefore into the aperture stop) is 1/32 radians, so the full-angle field of view of the system is:

\[
\text{FoV} = 2\cdot u = \frac{1}{16}\ \text{radian} = \frac{1}{16}\cdot\frac{180^\circ}{\pi} \cong 3.58^\circ
\]

which is larger than the field of view in the first case with the smaller diameter for L2.
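The two field-of-view results can be compared with a short calculation. The sketch below assumes the Example 2 prescription (stop at L1, lens separation 160 mm, L2 semidiameter 5 mm) and takes the entering chief-ray angle of 1/40 radian from Example 1; the variable names are illustrative.

```python
import math

separation = 160.0        # mm, distance from L1 (the stop) to L2
semidiameter_L2 = 5.0     # mm, d2 / 2 with d2 = 10 mm

# Provisional chief ray (height 0, angle 1) at L1 reaches L2 at height t * 1.
provisional_height_at_L2 = separation * 1.0
scale = semidiameter_L2 / provisional_height_at_L2   # 1/32

chief_angle_example2 = scale * 1.0                   # radians, entering L1
fov_example2_deg = math.degrees(2.0 * chief_angle_example2)

chief_angle_example1 = 1.0 / 40.0                    # radians, from Example 1
fov_example1_deg = math.degrees(2.0 * chief_angle_example1)

print(f"Example 1 FoV = {fov_example1_deg:.3f} degrees")
print(f"Example 2 FoV = {fov_example2_deg:.3f} degrees")
```

Moving the stop to the objective and enlarging the eyelens therefore widens the field of view by the ratio (1/32)/(1/40) = 1.25.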
Just for fun, propagate both the marginal and chief rays through the system at the same time:

\[
M_{VV'}\begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} & 0\ \text{mm} \\ 0 & \dfrac{1}{32} \end{bmatrix}
= \begin{bmatrix} 4\ \text{mm} & 5\ \text{mm} \\ 0 & \dfrac{5}{32} \end{bmatrix}
= \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix}
\]

So the height of the marginal ray after the second lens is 4 mm and the ray angle is 0 radians (it propagates to the image at infinity), while the chief ray height after L2 is 5 mm and the angle is 5/32 radians. The full angle of the image field is 10/32 = 5/16 radians ≅ 17.9°.

Figure: Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with stop at first lens.

The entrance pupil coincides with the aperture stop in this system, while the exit pupil is the image of the aperture stop seen through L2. The object distance to the stop is f1 + f2 = 160 mm, so the exit pupil distance is:

\[
z_{XP} = \frac{z_1\cdot f_2}{z_1 - f_2} = \frac{160\ \text{mm}\cdot(-40\ \text{mm})}{160\ \text{mm} - (-40\ \text{mm})} = -32\ \text{mm}
\]

and the diameter of the exit pupil is:

\[
d_{XP} = M_T\cdot 40\ \text{mm} = -\frac{-32\ \text{mm}}{160\ \text{mm}}\cdot 40\ \text{mm} = +8\ \text{mm}
\]

Example 3: Galilean telescope with aperture stop between lenses, object at ∞

Now consider the result if we place an iris diaphragm with diameter d = 8 mm midway between L1 and L2. The prescription for the system is:

L1: f1 = +200 mm, d1 = 40 mm
L2: f2 = −40 mm, d2 = 10 mm
t = f1 + f2 = 160 mm
S: VS = 80 mm, SV′ = 80 mm, d_Stop = 8 mm

The matrix for the imaging elements is unchanged:

\[
M_{VV'} = \begin{bmatrix} \dfrac{1}{5} & 160\ \text{mm} \\ 0 & 5 \end{bmatrix}
\]

but we need to confirm that the new iris is the aperture stop.
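Cast in a provisional marginal ray from an object at infinity:

\[
R_1\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 20\ \text{mm} \\ -\dfrac{1}{10} \end{bmatrix}
\]

Now propagate this ray to the iris, located at a distance of 80 mm after L1:

\[
T\begin{bmatrix} 20\ \text{mm} \\ -\dfrac{1}{10} \end{bmatrix}
= \begin{bmatrix} 1 & 80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ -\dfrac{1}{10} \end{bmatrix}
= \begin{bmatrix} 12\ \text{mm} \\ -\dfrac{1}{10} \end{bmatrix}
\ \Longrightarrow\ y = 12\ \text{mm} > \frac{d_{\text{Stop}}}{2} = \frac{8\ \text{mm}}{2} = 4\ \text{mm at the iris}
\]

So again we need to scale the provisional marginal ray by the ratio:

\[
\frac{\left(\frac{d_{\text{Stop}}}{2}\right)}{y} = \frac{4\ \text{mm}}{12\ \text{mm}} = \frac{1}{3}
\]

So the marginal ray at the first lens is:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}
= \frac{1}{3}\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \dfrac{20}{3}\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 6\tfrac{2}{3}\ \text{mm} \\ 0 \end{bmatrix}
\]

Now propagate this ray through the first surface to the iris:

\[
\begin{bmatrix} 1 & 80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} \dfrac{20}{3}\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\ \text{mm} \\ -\dfrac{1}{30} \end{bmatrix}
\]

We can now propagate this from the iris to and through the second lens:

\[
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 4\ \text{mm} \\ -\dfrac{1}{30} \end{bmatrix}
= \begin{bmatrix} \dfrac{4}{3}\ \text{mm} \\ 0 \end{bmatrix}
\]

So the marginal ray exiting the system is at a height of 4/3 mm and an angle of 0 radians (parallel to the axis, as expected for a telescope).

Now propagate the provisional chief ray backward (toward L1) from the iris; the translation from the iris is:

\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix},
\qquad
\left(\begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix}\right)^{-1}\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & -80\ \text{mm} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -80\ \text{mm} \\ 1 \end{bmatrix}
\]

If we propagate the provisional chief ray from the iris towards L2, we obtain:

\[
\begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} +80\ \text{mm} \\ 1 \end{bmatrix}
\]

Note that both ray heights are too large, but the height of the provisional chief ray at L2 exceeds the semidiameter by a much larger factor than its height at L1; the ratios of semidiameter to ray height are:

\[
\frac{\left(\frac{d_1}{2}\right)}{80\ \text{mm}} = \frac{20\ \text{mm}}{80\ \text{mm}} = \frac{1}{4},
\qquad
\frac{\left(\frac{d_2}{2}\right)}{80\ \text{mm}} = \frac{5\ \text{mm}}{80\ \text{mm}} = \frac{1}{16}
\]

So the second lens constrains the chief ray.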
Apply the scaling factor to the provisional chief ray to find the true chief ray at the iris: ⎡ 1 ⎣ · 16 0 mm 1 ⎤ ⎤ ⎡ 0 mm y ⎦ ⎦ = ⎣ 1 ⎦=⎣ nu at 16 ⎤ ⎡ stop Propagate it “forward” towards and through L1 to find the prescription for the chief ray entering 132 CHAPTER 3 TRACING RAYS THROUGH OPTICAL SYSTEMS the system: ⎛⎡ ⎤⎞−1 ⎛⎡ ⎤⎞−1 ⎡ ⎤ ⎡ ⎤ 1 0 0 mm −5 mm 1 +80 mm ⎝⎣ ⎦⎠ ⎝⎣ ⎦⎠ ⎣ 1 ⎦ = ⎣ ⎦ 1 3 − 1 0 1 +200 mm 16 80 ⎡ ⎤ ⎡ ⎤ −5 mm y ⎣ ⎦ ⎦ =⎣ 3 nu into L1 80 The field of view of the system is twice the chief ray angle into the system: FoV = 2 · 3 3 3 180 ◦ ∼ radians = radians = · = 4.30◦ 80 40 40 π Propagate the chief ray towards and through L2 to find the chief ray exiting the system: ⎤⎡ ⎡ ⎤⎡ ⎤ ⎡ ⎤ 0 mm +5 mm 1 0 1 +80 mm ⎦⎣ 1 ⎦ = ⎣ ⎣ ⎦⎣ ⎦ 3 − −401mm 1 0 1 16 16 ⎤ ⎡ ⎡ ⎤ +5 mm y ⎦ ⎣ ⎦ =⎣ 3 nu out of L2 16 Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with iris diaphragm between lenses. 133 3.6 MARGINAL AND CHIEF RAYS FOR THE SYSTEM Example 4: Keplerian telescope, object at ∞ Substitute a positive lens with the diameter of 5 mm for L2 , which also means that we have to change the distance between the lenses: L1 : f1 = +200 mm, d1 = 40 mm L2 : f2 = +40 mm, d2 = 5 mm t = f1 + f2 = 240 mm The vertex-vertex (system) matrix is: ⎤⎡ ⎤⎡ ⎤ ⎡ 1 0 1 240 mm 1 0 ⎦⎣ ⎦⎣ ⎦ MVV0 = ⎣ − +2001 mm 1 0 1 − +401mm 1 ⎤ ⎡ 1 − +240 mm ⎦ MVV0 = ⎣ 5 0 −5 The prescription for provisional marginal ray into system from object at infinity has the same ray height as the semidiameter of L1 : ⎤ ⎤ ⎡ ⎡ 20 mm y ⎦ ⎦ ⎣ =⎣ 0 nu provisional The outgoing provisional marginal ray from the system is: ⎤⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ 20 mm −4 mm y − 15 240 mm ⎦⎣ ⎦=⎣ ⎦ ⎦ MVV0 ⎣ =⎣ 0 −5 0 0 nu provisional Since the ray height of the provisional ray is larger than the semidiameter aperture stop: d2 y0 > =⇒ L2 is aperture stop 2 so we must scale the provisional marginal ray by a factor ⎤ ⎤ ⎤ ⎡ ⎡ á ¢! 
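$$\begin{bmatrix} y \\ nu \end{bmatrix} = \frac{\left( \frac{d_2}{2} \right)}{|y'|} \cdot \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \frac{\frac{5}{2}\ \text{mm}}{4\ \text{mm}} \cdot \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \frac{5}{8} \cdot \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}$$

$$\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at L1}} = \begin{bmatrix} \frac{5}{8} \cdot 20\ \text{mm} \\ 0 \end{bmatrix} = \begin{bmatrix} 12.5\ \text{mm} \\ 0 \end{bmatrix}$$

Now to the chief ray; the provisional chief ray emerging from the center of the aperture stop has zero height and an angle of unit magnitude:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix}$$

The ray is propagated to the first lens:

$$\mathcal{T}^{-1} \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix} = \left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix} = \begin{bmatrix} +240\ \text{mm} \\ -1 \end{bmatrix}$$

so the height of the provisional chief ray at the first element is |y| = 240 mm, which is MUCH larger than the semidiameter $\frac{d_1}{2} = 20$ mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:

$$\frac{20\ \text{mm}}{240\ \text{mm}} = \frac{1}{12}$$

So now go back to the original prescription for the provisional chief ray:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix} \implies \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ -\frac{1}{12} \end{bmatrix}$$

We can now propagate it from the rear vertex to and through the front vertex of the system. The chief ray emerging from the front vertex is:

$$\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \right)^{-1} \left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1} \left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 0\ \text{mm} \\ -\frac{1}{12} \end{bmatrix} = \begin{bmatrix} +20\ \text{mm} \\ +\frac{1}{60} \end{bmatrix}$$

In words, the chief ray height at the front surface is y = 20 mm and the chief ray angle is $nu = +\frac{1}{60}$ radian (where the change of sign relative to the angle at the stop again just means that the ray angle into the system is the negative of that emerging therefrom). The field of view of the system is twice the angle:

$$\text{FoV} = 2 \cdot \frac{1}{60}\ \text{radian} = \frac{1}{30}\ \text{radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ$$

Marginal ray (red) and chief ray (blue) from object at infinity traced through Keplerian telescope with aperture stop at second lens.

Example 5: Keplerian telescope, stop at eyepiece, nearby object ($\overline{OV} = 500$ mm)

Consider a telescope with the following parameters.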
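L1: f1 = +200 mm, d1 = 40 mm
L2: f2 = +40 mm, d2 = 5 mm
t = f1 + f2 = 240 mm
z1 = $\overline{OV}$ = 500 mm

The provisional marginal ray goes from the center of the object to the edge of the first lens, through the system, and to the center of the image. The first provisional ray is:

$$\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} (\text{at object}) = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}$$

It is useful to locate the image by propagating this provisional ray through the system:

$$\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 240\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \right) \cdot \begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 140\ \text{mm} \\ -5 \end{bmatrix}$$

So the image location relative to the rear vertex is:

$$\overline{V'O'} = -\frac{y}{nu} = -\frac{140\ \text{mm}}{-5\ \text{radians}} = +28\ \text{mm}$$

so the image is real. Now find the height of the provisional marginal ray at L1:

$$\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} (\text{at L1}) = \begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 500\ \text{mm} \\ 1 \end{bmatrix}$$

where the ray height is MUCH too large and must be scaled to “fit” into the lens. The scale factor is:

$$\frac{\left( \frac{d_1}{2} \right)}{y\ (\text{at lens})} = \frac{20\ \text{mm}}{500\ \text{mm}} = \frac{1}{25}$$

So the second iteration of the provisional marginal ray at the front of the first lens is:

$$\frac{1}{25} \cdot \begin{bmatrix} 500\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 20\ \text{mm} \\ \frac{1}{25} \end{bmatrix}$$

which has a much smaller incident angle.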
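Now propagate this ray through the first lens to the second lens:

$$\mathcal{T}\, \mathcal{R}_1 \begin{bmatrix} 20\ \text{mm} \\ \frac{1}{25} \end{bmatrix} = \begin{bmatrix} 1 & 240\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 20\ \text{mm} \\ \frac{1}{25} \end{bmatrix} = \begin{bmatrix} \frac{28}{5}\ \text{mm} \\ -\frac{3}{50} \end{bmatrix} = \begin{bmatrix} 5\tfrac{3}{5}\ \text{mm} \\ -\frac{3}{50} \end{bmatrix}$$

so the ray height is still too large; it is blocked by L2 (which therefore is the aperture stop). Scale this ray to fit into the second lens by applying the factor:

$$\frac{\left( \frac{d_2}{2} \right)}{y\ (\text{at L2})} = \frac{2.5\ \text{mm}}{\frac{28}{5}\ \text{mm}} = \frac{12.5}{28} = \frac{25}{56}$$

So the third iteration produces the actual marginal ray from an object at a distance of 500 mm from L1:

$$\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at object}} = \frac{25}{56} \cdot \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{25} \end{bmatrix} = \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix} \cong \begin{bmatrix} 0\ \text{mm} \\ 0.017857 \end{bmatrix}$$

The prescription for the marginal ray at L1 is:

$$\begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix} = \begin{bmatrix} \frac{125}{14}\ \text{mm} \\ \frac{1}{56} \end{bmatrix} \cong \begin{bmatrix} 8.929\ \text{mm} \\ \frac{1}{56} \end{bmatrix}$$

where the ray height is much smaller than the semidiameter of L1, so the lens is overly large. We can propagate this through the system to find the actual prescription for the exiting marginal ray:

$$\mathcal{M}_{VV'} \cdot \begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix} = \begin{bmatrix} -\frac{1}{5} & 240\ \text{mm} \\ 0 & -5 \end{bmatrix} \begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix}$$

$$\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at V}'} = \begin{bmatrix} \frac{5}{2}\ \text{mm} \\ -\frac{5}{56} \end{bmatrix}$$

Just to check, find the distance to the image to make sure it matches the result for the provisional marginal ray:

$$\overline{V'O'} = -\frac{y}{nu} = -\frac{\frac{5}{2}\ \text{mm}}{-\frac{5}{56}} = +28\ \text{mm}$$

which agrees with what we found earlier.

Now that we know that L2 is the aperture stop for the specified object location, we can propagate a provisional chief ray from the center of the stop in both directions. (We will find that the chief ray is unaffected by the location of the object.)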
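The iterative scaling of the provisional marginal ray in this example can be sketched as follows. This is only an illustration under the stated parameters (object 500 mm in front of L1, f1 = +200 mm, f2 = +40 mm, t = 240 mm); the function names `keplerian_marginal_ray` and `exit_ray` are hypothetical and not from the notes.

```python
from fractions import Fraction as F

def lens(f_mm):
    """Thin-lens refraction matrix acting on the ray vector (y, nu)."""
    return ((F(1), F(0)), (F(-1) / F(f_mm), F(1)))

def gap(t_mm):
    """Translation matrix over a distance t_mm in air (n = 1)."""
    return ((F(1), F(t_mm)), (F(0), F(1)))

def apply(matrix, ray):
    """Multiply a 2x2 matrix by a ray (height y in mm, reduced angle nu)."""
    (a, b), (c, d) = matrix
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def keplerian_marginal_ray(z_obj=500, f1=200, d1=40, t=240, d2=5):
    """Scale a provisional ray from the on-axis object point until it clears L1 and L2."""
    ray = (F(0), F(1))                                   # provisional ray at the object
    y_l1 = apply(gap(z_obj), ray)[0]
    s1 = min(F(1), F(d1, 2) / abs(y_l1))                 # first scaling: rim of L1
    ray = (ray[0] * s1, ray[1] * s1)
    y_l2 = apply(gap(t), apply(lens(f1), apply(gap(z_obj), ray)))[0]
    s2 = min(F(1), F(d2, 2) / abs(y_l2))                 # second scaling: rim of L2
    ray = (ray[0] * s2, ray[1] * s2)
    return ray, s1, s2

def exit_ray(ray, z_obj=500, f1=200, t=240, f2=40):
    """Trace a ray from the object plane to just after L2."""
    at_l1 = apply(gap(z_obj), ray)
    return apply(lens(f2), apply(gap(t), apply(lens(f1), at_l1)))

ray, s1, s2 = keplerian_marginal_ray()
y_out, nu_out = exit_ray(ray)
image_distance = -y_out / nu_out   # rear vertex to the axis crossing, in mm
print(ray, s1, s2, (y_out, nu_out), image_distance)
```

Whichever element requires the smaller scale factor last is the one that limits the cone of rays, which is how the sketch mirrors the identification of L2 as the aperture stop.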
The provisional chief ray is:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ +1 \end{bmatrix}$$

Propagate through the system towards image space to obtain

$$\mathcal{R}_2 \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}$$

so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is the stop, because the ray passes through the center of the lens.

The provisional chief ray may be propagated from the stop “forwards” towards the first lens. The translation matrix yields the ray height and angle at the first lens:

$$\mathcal{T}^{-1} \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} = \left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} -240\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} (\text{at L1})$$

Note that the height of the provisional chief ray at L1 is y = −240 mm, which means that it is BELOW the optical axis by a MUCH larger distance than the semidiameter $\frac{d_1}{2} = 20$ mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:

$$\frac{\left( \frac{d_1}{2} \right)}{|y|} = \frac{20\ \text{mm}}{240\ \text{mm}} = \frac{1}{12}$$

So now go back to the original prescription for the provisional chief ray and scale it to obtain the “actual” chief ray:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix} \implies \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{12} \end{bmatrix}$$

Note that this is the same chief ray as for the case where the object is at infinity. In words, the chief ray is determined by the stop and the diameters of the other elements, not by the location of the object.

We can now propagate the scaled chief ray from the rear vertex to and through the front vertex of the system. The chief ray emerging from the front vertex is:

$$\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \right)^{-1} \left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1} \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{12} \end{bmatrix} = \begin{bmatrix} -20\ \text{mm} \\ -\frac{1}{60} \end{bmatrix}$$

which has the correct ray height (equal in magnitude to the semidiameter of L1), |y| = 20 mm, and angle $nu = -\frac{1}{60}$ radian.
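The back-propagation of the chief ray from the stop can be checked with a short sketch. It assumes the same Keplerian parameters as the example and uses the fact that the inverse of a translation over t is a translation over −t, and the inverse of a thin lens of focal length f is the matrix of a lens of focal length −f. The name `keplerian_chief_ray` is hypothetical.

```python
import math
from fractions import Fraction as F

def lens(f_mm):
    """Thin-lens refraction matrix acting on the ray vector (y, nu)."""
    return ((F(1), F(0)), (F(-1) / F(f_mm), F(1)))

def gap(t_mm):
    """Translation matrix over a distance t_mm in air (n = 1)."""
    return ((F(1), F(t_mm)), (F(0), F(1)))

def apply(matrix, ray):
    """Multiply a 2x2 matrix by a ray (height y in mm, reduced angle nu)."""
    (a, b), (c, d) = matrix
    y, nu = ray
    return (a * y + b * nu, c * y + d * nu)

def keplerian_chief_ray(f1=200, d1=40, t=240):
    """Chief ray through the center of the stop at L2, traced back into object space."""
    provisional = (F(0), F(1))                    # at the center of the stop
    y_l1 = apply(gap(-t), provisional)[0]         # inverse translation: stop -> L1
    scale = min(F(1), F(d1, 2) / abs(y_l1))       # shrink the angle to fit L1
    at_stop = (provisional[0] * scale, provisional[1] * scale)
    # gap(-t) undoes gap(t); lens(-f1) undoes lens(f1)
    entering = apply(lens(-f1), apply(gap(-t), at_stop))
    return at_stop, entering

at_stop, entering = keplerian_chief_ray()
fov_deg = math.degrees(float(2 * abs(entering[1])))   # twice the chief-ray angle
print(at_stop, entering, round(fov_deg, 2))
```

Note that the object distance does not appear anywhere in `keplerian_chief_ray`, which is the computational counterpart of the remark that the chief ray is independent of the object location.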
The field of view of the system is twice the angle:

$$\text{FoV} = 2 \cdot \frac{1}{60}\ \text{radian} = \frac{1}{30}\ \text{radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ$$

The exit pupil is (obviously) located at the aperture stop L2, while the entrance pupil is the image of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation of the chief ray emerging from the front vertex:

$$\begin{bmatrix} y' \\ n'u' \end{bmatrix} (\text{emerging from front vertex}) = \begin{bmatrix} 20\ \text{mm} \\ -\frac{1}{60} \end{bmatrix}$$

The height is 20 mm and the angle is $-\frac{1}{60}$ radian, so the distance to the location where the ray crosses the optical axis is:

$$z_{V'NP} = -\frac{20\ \text{mm}}{-\frac{1}{60}} = +1200\ \text{mm}$$

in front of the objective; the entrance pupil is real and its magnification is:

$$M_T = \frac{+1200\ \text{mm}}{240\ \text{mm}} = 5$$

so the diameter of the entrance pupil is:

$$d_{NP} = 5 \cdot d_{Stop} = 5 \cdot 5\ \text{mm} = 25\ \text{mm}$$

Marginal ray (red) and chief ray (blue) from object at a distance of 500 mm from the first lens traced through Keplerian telescope with aperture stop at second lens.

Chapter 4
Depth of Field and Depth of Focus

From experience with snapshots or movies, we all know that optical images are not “in focus” for objects at all distances from the lens; objects at distances other than the one focused upon appear blurry. This is not necessarily bad; it is used as a creative tool by photographers and cinematographers to concentrate the attention of the viewer on particular objects of interest. However, in many (if not all) scientific applications, this limitation to the region of “good” imaging is detrimental; we’d like to see the entire 3-D object “in sharp focus.” For this reason, it is essential to understand the factors that affect the depth of the region of “sharp focus,” which is the so-called “depth of field” on the object as “seen” through the imaging system.
The concept of depth of field and focus and the dependence on f/# is illustrated in the figure for a specified linear dimension of “acceptable sharpness.” The extent of the cone of rays between the two locations truncated by this sharpness criterion is the “depth of focus.” Clearly this range is larger for a smaller cone angle (larger f/#). This would lead us to the conclusion that the depth of focus (and also its object-space equivalent, the depth of field) is proportional to the f/#:

$$\Delta z \propto f/\#$$

A more accurate criterion requires application of the principles of wave optics to show that diffraction induces a “blur spot” whose linear dimension also increases with focal ratio and which defines the dimension of “acceptable” blur. A hybrid combination of the principles of ray and wave optics leads to a criterion that the depths of field and of focus actually vary with the square of the f/#:

$$\Delta z \propto (f/\#)^2$$

This hybrid criterion is discussed after illustrating the concept of depths of field and focus using examples from film and television.

The depth of focus for a known linear dimension of “acceptable sharpness” depends on the angle of the cone of rays, which is determined by the focal ratio (f/#) of the system. If the cone of rays is large (small f/#), then the extent of the cone in front of and behind the point of best focus is small; if the angle of the ray cone is small (large f/#), then a wider range of depths appears “in focus.”

4.0.2 Examples of Depth of Field from Video and Film

Extensive discussion in Wikipedia at http://en.wikipedia.org/wiki/Depth_of_field

1. The Colbert Report, video image with “normal lens,” shows the difference in apparent sharpness with depth in the scene. This naturally draws attention to the object that is in focus and often serves as a cue to the audience about which is the object of interest.
There are three areas of interest at different distances from the lens, which is focused on the nearest plane (Stephen Colbert); the more distant plane where Jon Stewart sits is noticeably blurry, and the bookshelf in the most distant plane is very blurry.

Note the difference in sharpness with depth; Stephen Colbert in the foreground is in sharp focus, Jon Stewart is clearly less sharp, and the items in the background are quite blurry.

2. Sherlock © 2011, Masterpiece Mystery from the BBC, using limited depth of field to draw attention to the point of interest

This example shows how the director draws the attention of the audience to the desired point of interest. The two frames are from A Scandal in Belgravia, the first episode in the second season of Sherlock broadcast by the BBC and PBS. The two frames are taken from the same camera position and separated in time by approximately two seconds. In the first frame, “Sherlock” (Benedict Cumberbatch) is speaking about the camera phone of “Irene Adler” (Lara Pulver). After he finishes speaking, the camera focus shifts rapidly to Adler in the background for her reply. Note that her form is barely distinguishable in the first frame, which focuses the viewer’s attention upon Sherlock in the foreground.

Use of limited depth of field to draw the attention of the audience to the subject of interest. The camera shifts focus rapidly from the foreground character (at top) to the background character (at bottom).

3. Citizen Kane by Orson Welles, small aperture (large f/#) ⟹ large depth of field

Both foreground and background are in focus; note the cheek of “Mr. Bernstein” (Everett Sloane) in near foreground on right and the venetian blinds in the windows at the back. “Walter Thatcher” (George Coulouris) on left and “Charles Foster Kane” (Orson Welles) in center are in focus. The distance to the windows appears to be small because of the sharp focus.
Different frame of same scene from “Citizen Kane” shot with same focus setting. George Coulouris (as “Walter Thatcher”) and Everett Sloane (as “Mr. Bernstein”) remain in focus in the foreground. Orson Welles (as “Charles Foster Kane”) has walked to the windows, which are now clearly many feet from the foreground characters. “Kane’s” stature appears to have been diminished.

The film “Citizen Kane” (© 1941 RKO Pictures, Inc.) is famous for its creative cinematography by Gregg Toland and the director/star Orson Welles, including original camera angles (especially upward shots from the floor or even from beneath the floor plane), movements, transitions, and the use of “deep focus.” Consider the two frames from the film of a group of three characters: the standing Orson Welles in the center (at age 26 as the elderly “Charles Foster Kane,” a testament to the skill of makeup artist Maurice Seiderman), George Coulouris on the left (as “Walter Parks Thatcher,” who had been Kane’s guardian), and Everett Sloane on the right (as Kane’s assistant “Mr. Bernstein”). In the first frame, the three characters are grouped together and the entire scene appears to be in focus, from the skin on Bernstein’s face on the right to the venetian window blinds in the back. From the sharp focus of the background windows and expectations about depth of field based on past experience, viewers likely will surmise that the windows must be physically close to the characters and therefore that Kane is much taller than the background window sill. Between the first and second images, the standing Kane has taken 18 steps to walk to the windows (perhaps 35-50 feet from the foreground characters), while remaining in focus the entire time. His height is now shown to be approximately the same as the height of the window sill.
The apparent “shrinking” of his size during the walk may be interpreted as an artistic metaphor for the diminishing stature of Kane due to the partial failure of his media empire during the Depression. He subsequently walks back to the foreground to sign the agreement held by Mr. Thatcher that sells much of his publishing/broadcasting empire back to Thatcher’s bank. The very large depth of field can only be obtained by a small aperture stop, which reduces the light reaching the sensor. Clearly the emulsion must have good sensitivity (it must have been a “fast film”) and the lighting must be sufficiently strong to record “useful” images. The sequence is available on YouTube at http://www.youtube.com/watch?v=WTmVlDh2V2g. Interested readers might want to view the documentary about the movie (http://www.youtube.com/watch?v=eCkYlCBFV6w). Another scene in the movie that is interesting from the perspective of optics is the so-called “mirror scene,” which is at the end of the 1-minute clip at http://www.youtube.com/watch?v=8fIP7g9en10

Still from the “mirror scene” in “Citizen Kane.” Again, note the depth of field.

4. Spellbound, by Alfred Hitchcock (© Selznick International Pictures, Vanguard Films 1945)

The climactic scene in this classic movie is a confrontation between “Dr. Murchison” (Leo G. Carroll) and “Dr. Constance Petersen” (Ingrid Bergman), where Petersen reveals she has evidence that Murchison murdered Dr. Anthony Edwardes, whose “substitute imposter” is played by Gregory Peck. Frames from the scene are shown in the figure. The frames from the viewpoint of Dr. Murchison show the view of his hand, the gun, and Ingrid Bergman, with all apparently “in focus.” To avoid problems with depth of field, the hand and gun are actually models that are larger than life size that were positioned closer to Bergman than to the camera.
The website for Turner Classic Movies states that the scene took a week to set up and 19 takes to get the final result (http://www.tcm.com/this-month/article/18621%7C0/Spellbound.html). YouTube clip available from http://www.youtube.com/watch?v=8rDMotFmCJc.

Scenes from “Spellbound” (© Selznick International Pictures 1945), showing (a) Leo G. Carroll holding a revolver; (b) Ingrid Bergman walking towards the door as Carroll’s character aims the revolver; (c) and (d) after Bergman’s exit, the hand and gun turn towards the camera and fire.

An additional note of interest in this black-and-white film is that two color frames as the gun fires were spliced into each print by hand.

One of the two color frames of the gunshot spliced into each print of the film “Spellbound.”

5. Somewhere in Time, split-diopter lens to focus on two distances simultaneously, giving the appearance of expanded depth of field

Split-diopter lens (Fig. 5.13 from Visual Effects Cinematography by Zoran Perisic), which is attached to the front of a normal lens and which adds power on one side of the field of view.

The frame from “Somewhere in Time” (© Universal Studios, 1980) illustrates the action of the “split-diopter” lens added to the normal camera lens. Both the foreground field on the right (with Christopher Reeve as “Richard Collier”) and the left-hand background field (with Jane Seymour as “Elyse McKenna,” the white garden bench, and the trees) appear to be “in focus.” The split-diopter lens adds refractive power (thus shortening the focal length) for half the field. Because the sensor is the same distance from the rear vertex of these two “half-systems,” the object plane that is in focus in the half field with the additional power is closer to the lens. In this example, the split-diopter lens is oriented to “split” the fields through the vertical white pillar and adds power to the right half of the field.
The left side of the vertical pillar is “fuzzier” than the right side, where the features of the wood grain are visible. Note that the trees in the background on the right are out of focus, while those on the left are sharp. The audience likely does not notice the discrepancies in the image planes.

Frame from “Somewhere in Time” (© Universal Studios, 1980) showing use of “split-diopter lens.” Both foreground and background are “in focus,” but note that the left side of the foreground pillar is “fuzzy” while the right side is “sharp.”

A system consisting of both optics and sensor is “diffraction-limited” if the pixel size of the sensor (smallest resolvable spot) is smaller than the linear dimension of the diffraction spot. The system is “detector-limited” (or “sensor-limited”) if the linear dimension of the individual sensor elements is larger than the diffraction spot.

4.1 Criterion for “Acceptable Blur”

The discussion of the limiting “blur” of an imaging system may be extended to characterize the range of “distances” (or “depths”) over which images of point objects exhibit the “same” (or at least “similar”) blur dimensions. If specified in object space, the distance range is called the “depth of field;” the same metric in image space is the “depth of focus.” The depth of field may be thought of as the “zone of acceptable sharpness” for object locations. There is no single way to define the depths of field and focus, but we can rather easily derive a metric based on ray optics and a hybrid metric that includes the concept of “diffraction” from wave optics (where the wave aspects must be taken “on faith” at this point). The measurement is based upon the linear dimension B′ of the “acceptable blur.” This may be set by a metric of acceptable spatial resolution, by the size of the sensor elements, or by the diameter of the diffraction spot in the hybrid metric. Consider a hypothetical value of B′ shown in the figure.
From this value, it is easy to determine the range of possible axial distances that correspond to B′ in the ray model and use that to evaluate the corresponding dimension B in object space via the transverse magnification

$$M_T = -\frac{z'}{z} = \frac{B'}{B}.$$

The calculation of depth of field: B′ is the linear dimension of the blur for the system (either the diameter of the diffraction spot in a diffraction-limited system or the dimension of the sensor element in a detector-limited system). The locations z′ ± δ′ specify locations in image space where the geometrical blur has the same linear size. The corresponding locations in object space are the limits of the “depth of field.”

As shown in the figure for a given B′, the “blur” spots are located at two positions equidistant from the “in-focus” image. We assign the name δ′ to the distance between the “in-focus” image and the geometrically blurred images, so these two planes are located at z′ ± δ′. The depth of focus in this model is twice δ′:

$$\Delta z' = 2 \cdot \delta'$$

In the ray model, the drawing shows that:

$$\frac{D}{z'} = \frac{B'}{\delta'} \implies \delta' = B' \cdot \frac{z'}{D} \cong B' \cdot f/\#$$

(in the case where the object distance is “many” focal lengths so that the image distance is only slightly longer than a focal length). If B′ is small, then δ′ must be small as well; if the f/# is large, then δ′ is correspondingly large.

The object distances z1 and z2 corresponding to these image locations may be evaluated from the imaging equation for the corresponding image distances z1′ = z′ − δ′ and z2′ = z′ + δ′. It is easy to see that the absolute magnification |MT| is smaller for the smaller image distance, i.e., |MT| for z1′ = z′ − δ′ is smaller than |MT| for the larger image distance z2′ = z′ + δ′. The nonlinearity of the imaging equation ensures that the distances between the in-focus object distance z and the extrema are not equal, i.e., z1 − z ≠ z − z2, thus requiring labels for both: z1 = z + δ1 and z2 = z − δ2.
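The asymmetry of the two object-space limits can be illustrated numerically. The sketch below assumes a single thin lens and the imaging equation 1/z + 1/z′ = 1/f; the sample values (f = 50 mm, z = 1000 mm, D = 25 mm, B′ = 0.03 mm) and the function names are hypothetical choices for illustration, not values from the notes.

```python
def conjugate_distance(f, z):
    """Solve the thin-lens imaging equation 1/z + 1/z_conj = 1/f for z_conj."""
    return 1.0 / (1.0 / f - 1.0 / z)

def depth_of_field_limits(f, z_obj, aperture_d, blur):
    """Return (near limit z2, far limit z1, delta') for acceptable blur B' = blur."""
    z_img = conjugate_distance(f, z_obj)
    delta_img = blur * z_img / aperture_d             # delta' = B' * z' / D
    z_far = conjugate_distance(f, z_img - delta_img)  # smaller image distance -> farther object
    z_near = conjugate_distance(f, z_img + delta_img) # larger image distance -> nearer object
    return z_near, z_far, delta_img

# Hypothetical sample values, all in mm
f, z, D, B = 50.0, 1000.0, 25.0, 0.03
z_near, z_far, delta_img = depth_of_field_limits(f, z, D, B)
delta_1 = z_far - z      # extent of the depth of field beyond the in-focus plane
delta_2 = z - z_near     # extent in front of the in-focus plane
print(f"delta' = {delta_img:.4f} mm, delta_1 = {delta_1:.2f} mm, delta_2 = {delta_2:.2f} mm")
```

Because the same function maps object distance to image distance and back, `conjugate_distance` is used in both directions; the far-side extent comes out larger than the near-side extent, as the nonlinearity of the imaging equation requires.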
However, if δ′ is small, then the concept of longitudinal magnification ML allows simple approximate expressions for the object distances. We already derived a simple expression for ML in terms of the transverse magnification MT. Differentiate both sides of the imaging equation:

$$d\left( \frac{1}{z_1} + \frac{1}{z_2} \right) = d\left( \frac{1}{f} \right) = 0$$

$$d\left( \frac{1}{z_1} \right) + d\left( \frac{1}{z_2} \right) = \left( -\frac{1}{z_1^2} \right) dz_1 + \left( -\frac{1}{z_2^2} \right) dz_2 = 0$$

$$\implies \frac{dz_2}{dz_1} = -\left( \frac{z_2^2}{z_1^2} \right) = -\left( \frac{z_2}{z_1} \right)^2$$

$$M_L = \frac{(\Delta z)'}{\Delta z} = -\left( \frac{z_2}{z_1} \right)^2 = -(M_T)^2 < 0$$

The increments in object distance are related to the increments in image distance via the longitudinal magnification:

$$\delta' \cong |M_L| \cdot \delta_1 \cong |M_L| \cdot \delta_2 \implies \delta_1 \cong \delta_2 \cong \frac{\delta'}{|M_L|}$$

$$z_1 = z + \delta_1 \cong z + \frac{\delta'}{|M_L|} = z + \frac{\delta'}{M_T^2}$$

$$z_2 = z - \delta_2 \cong z - \frac{\delta'}{|M_L|} = z - \frac{\delta'}{M_T^2}$$

So the depth of field is proportional to the f/# and to the linear dimension of the acceptable blur:

$$\Delta z = z_1 - z_2 = \delta_1 + \delta_2 \cong 2 \cdot \frac{\delta'}{|M_L|} = 2 \cdot \frac{\delta'}{M_T^2} = 2 \cdot \frac{B' \cdot f/\#}{M_T^2}$$

$$\Delta z \cong 2 \cdot \left( \frac{B'}{M_T^2} \right) \cdot f/\# \propto f/\#$$

In the detector-limited case where the blur dimension is determined by the pixel dimension b′, the depth of field is proportional to the f/#:

$$\Delta z \cong 2 \cdot \frac{b'}{M_T^2} \cdot f/\# \propto f/\# \quad \text{(in ray model)}$$

Note that the depth of field is larger in “slower” systems (with large f-numbers and small cone angles).

If we add the wave concept of “diffraction,” the linear dimension B′ is determined by the diffraction pattern, which may be written in terms of the wavelength and the focal ratio. Assume that the linear dimension of image blur has been measured for a particular imaging system at the specific pair of object and image distances (z and z′ respectively) of interest.

Blur in a diffraction-limited system with aperture diameter D. The image of the point source is a diffraction pattern at the image plane whose linear dimension (using some criterion) is B′.
For example, the image of a point source located a distance z from the system could be measured to find this limiting “blur diameter” B′, where the prime indicates that the measurement is made in image space. In a diffraction-limited system, the discussion of Fraunhofer diffraction in imaging shows that one possible measure for B′ is the diameter of the central lobe of the diffraction spot:

$$B' = 2.44 \cdot \lambda_0 \cdot \frac{z'}{D} \cong 2.44 \cdot \lambda_0 \cdot \frac{f}{D} = 2.44 \cdot \lambda_0 \cdot f/\#$$

$$\Delta z \cong 2 \cdot (2.44 \cdot \lambda_0 \cdot f/\#) \cdot \frac{f/\#}{M_T^2} = 4.88 \cdot \frac{\lambda_0 \cdot (f/\#)^2}{M_T^2} \quad \text{(if accounting for diffraction)}$$

So the depths of field and of focus are proportional to the square of the f/# in the diffraction-limited case.

4.2 Depth of Field via Rayleigh’s Quarter-Wave Rule

We can also derive the depth of focus by finding the range of image locations that satisfy Rayleigh’s rule applied to defocus, and then transform those image distances back into object space via the imaging relation to find the depth of field. The necessary task is to find the change in the image location for a given change in the wavefront error at the edge of the pupil. In the figure, the ideal reference wavefront has radius R1 (R1 ≅ f if the object is a large distance away) and the wavefront with defocus has radius

$$R_2 = R_1 + \delta' \cong f + \delta',$$

where δ′ is the change in location of the focal plane with an added quadratic phase of $\Delta W_{020} = \pm \frac{\lambda_0}{4}$. The quadratic-phase approximation to the new wavefront is:

$$W[x, y] = \frac{x^2 + y^2}{2 R_2} = \frac{x^2 + y^2}{2 (R_1 + \delta')} = \frac{x^2 + y^2}{2 R_1 \left( 1 + \frac{\delta'}{R_1} \right)}$$

$$= \frac{x^2 + y^2}{2 R_1} \cdot \left( 1 + \frac{\delta'}{R_1} \right)^{-1} = \frac{x^2 + y^2}{2 R_1} \cdot \sum_{n=0}^{+\infty} \left( -\frac{\delta'}{R_1} \right)^n$$

$$= \frac{x^2 + y^2}{2 R_1} \left( 1 + (-1) \frac{\delta'}{R_1} + \frac{(-1)(-2)}{2!} \left( \frac{\delta'}{R_1} \right)^2 + \frac{(-1)(-2)(-3)}{3!} \left( \frac{\delta'}{R_1} \right)^3 + \cdots \right)$$

$$\cong \frac{x^2 + y^2}{2 R_1} \left( 1 - \frac{\delta'}{R_1} \right) \quad \left( \text{if } \left| \frac{\delta'}{R_1} \right| \cong \left| \frac{\delta'}{f} \right| \ll 1 \right)$$

$$= \frac{x^2 + y^2}{2 R_1} - \delta' \cdot \frac{x^2 + y^2}{2 R_1^2}$$

where the first term is the quadratic-phase approximation to the ideal wavefront and the second term is the additional effect of the defocus.

Change in image position δ′ as a function of the wavefront error ΔW = W020 for defocus.

In the limit where the object distance is large, the image distance R1 is approximately equal to the focal length f, so this expression simplifies to:

$$W[x, y] \cong \frac{x^2 + y^2}{2f} - \delta' \left( \frac{x^2 + y^2}{2 f^2} \right)$$

$$\implies \Delta W[x, y] \cong W[x, y] - \frac{x^2 + y^2}{2f} \cong -\delta' \left( \frac{x^2 + y^2}{2 f^2} \right)$$

If the wavefront error is positive, ΔW > 0 ⟹ δ′ < 0, which means that the image moves “towards” the lens as shown in the figure. The magnitude of the wavefront error at the edge of the pupil (where, say, $x = \frac{d_0}{2}$ and y = 0) is:

$$|\Delta W| = \left| \Delta W\left[ x = \frac{d_0}{2}, y = 0 \right] \right| = \delta' \cdot \frac{\left( \frac{d_0}{2} \right)^2 + 0^2}{2 f^2} = \delta' \cdot \frac{d_0^2}{8 f^2}$$

We can now apply Rayleigh’s rule that the image is effectively ideal if the maximum wavefront error is less than a quarter wave, so that the single-sided depth of focus is easy to evaluate:

$$\frac{\lambda_0}{4} > |\Delta W| = \delta' \cdot \frac{d_0^2}{8 f^2} \implies \delta' \cong \frac{\lambda_0}{4} \cdot \frac{8 f^2}{d_0^2} = 2 \lambda_0 \left( \frac{f^2}{d_0^2} \right) = 2 \lambda_0 \left( \frac{f}{d_0} \right)^2$$

$$\implies \delta' \cong 2 \lambda_0 \cdot (f/\#)^2 \quad \text{using Rayleigh’s rule for ideal imaging}$$

In visible light with λ0 ≅ 0.5 μm, the change in image position under the Rayleigh criterion is

$$\delta'\,[\lambda_0 \cong 0.5\ \mu\text{m}] \cong (f/\#)^2\ [\mu\text{m}]$$

In words, an image in visible light appears to be “in focus” if the distance of the actual image plane from the ideal image plane in micrometers is no larger than the square of the f/#. For example, if the lens is used at f/4, the actual image plane must be within 16 μm of the ideal location; if at f/16, the actual image plane must be within 256 μm ≅ 0.25 mm of the ideal location. Note the similarities and the differences with the rule of thumb that the size of the diffraction spot in micrometers is equal to the f/#.
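The Rayleigh relation δ′ ≅ 2λ0(f/#)² and the wavefront error at the pupil edge can be evaluated with a short sketch. The function names and the sample lens (f = 50 mm with a 12.5 mm pupil, i.e. f/4) are assumptions made for illustration only.

```python
def rayleigh_defocus(f_number, wavelength):
    """Single-sided defocus tolerance delta' = 2 * lambda0 * (f/#)^2 (same units as wavelength)."""
    return 2.0 * wavelength * f_number ** 2

def edge_wavefront_error(delta_img, pupil_diameter, focal_length):
    """Magnitude of the defocus wavefront error at the pupil edge: delta' * d0^2 / (8 f^2)."""
    return delta_img * pupil_diameter ** 2 / (8.0 * focal_length ** 2)

wavelength_um = 0.5                      # visible light, in micrometers
for n in (4, 16):
    print(f"f/{n}: delta' = {rayleigh_defocus(n, wavelength_um)} um")

# Consistency check in millimeters for an assumed f = 50 mm lens with d0 = 12.5 mm (f/4):
wavelength_mm = 0.0005
delta_mm = rayleigh_defocus(50.0 / 12.5, wavelength_mm)
print(edge_wavefront_error(delta_mm, 12.5, 50.0), wavelength_mm / 4)
```

The last line prints the pupil-edge wavefront error next to λ0/4; the two agree because the tolerance was defined by setting that error equal to a quarter wave.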
The depth of focus is twice this value because we can defocus on either side of the ideal image plane:

$$\text{Depth of focus: } (\Delta z)' = 2 \delta' \cong 4 \lambda_0 (f/\#)^2 \cong 2 \cdot (f/\#)^2\ [\mu\text{m}]$$

Now convert this to the object space via the longitudinal magnification to find the depth of field:

$$M_L = \frac{(\Delta z)'}{\Delta z} = \frac{\delta'}{\delta} = -(M_T)^2$$

$$\Delta z = 2 \cdot \delta \cong \frac{(\Delta z)'}{|M_L|} = \frac{(\Delta z)'}{(M_T)^2}$$

$$\Delta z \cong \frac{4 \lambda_0 (f/\#)^2}{(M_T)^2}$$

which again is proportional to the square of the f-ratio and is quite similar to the “hybrid” metric for depth of field in the diffraction-limited case from the last section:

$$\text{Depth of field: } \Delta z_{Hybrid} \cong 4.88 \cdot \left( \frac{\lambda_0 (f/\#)^2}{(M_T)^2} \right) \simeq \Delta z_{Rayleigh} \cong 4 \cdot \left( \frac{\lambda_0 (f/\#)^2}{(M_T)^2} \right)$$

These two expressions are quite similar; the fact that they are not identical should be no surprise since they were derived using different assumptions. Note that the depth of field increases as the square of the f/#, so stopping down the lens by a factor of 2 has a big impact: it increases the depth of field by about a factor of 4. Since the transverse magnification is less than unity for most real imaging setups (and a lot less for distant objects), the depth of field increases rapidly as the object distance increases.

It might be useful to do an example. Consider a normal lens with f = 50 mm acting in visible light (λ0 = 500 nm = 0.5 μm) with the aperture wide open (say, f/2 so that the diameter of the entrance pupil is d0 = 25 mm) imaging a nearby object with z1 = 1 m:

$$z_2 = \left( \frac{1}{50\ \text{mm}} - \frac{1}{1000\ \text{mm}} \right)^{-1} \cong 52.63\ \text{mm}$$

$$M_T = -\frac{z_2}{z_1} = -\frac{52.63\ \text{mm}}{1000\ \text{mm}} = -0.05263$$

where (again) the negative sign on the transverse magnification means that the image is “upside down” compared to the object. The depth of focus is:

$$\text{depth of focus at } f/2\text{: } (\Delta z)' = 2 \delta' \cong 4 \cdot 0.5\ \mu\text{m} \cdot 2^2 = 8\ \mu\text{m}$$

And the depth of field is obtained by dividing by the square of the transverse magnification:

$$\text{depth of field at } f/2\text{: } \Delta z \cong \frac{(\Delta z)'}{M_T^2} = \frac{8\ \mu\text{m}}{(-0.05263)^2} \cong 2.89\ \text{mm}$$

If we stop the lens down to, say, f/16 (a factor of 8), the depths of focus and field are much larger:

$$\text{depth of focus at } f/16\text{: } (\Delta z)' = 2 \delta' \cong 4 \cdot 0.5\ \mu\text{m} \cdot 16^2 = 512\ \mu\text{m} \cong 0.5\ \text{mm}$$

$$\text{depth of field at } f/16\text{: } \Delta z \cong \frac{512\ \mu\text{m}}{(-0.05263)^2} \cong 185\ \text{mm}$$

If the object is a large distance away, say z1 = 100 m with the lens wide open at f/2, the transverse magnification is much smaller:

$$z_2 = \left( \frac{1}{50\ \text{mm}} - \frac{1}{100\ \text{m}} \right)^{-1} \cong 50.025\ \text{mm}$$

$$M_T = -\frac{z_2}{z_1} = -\frac{50.025\ \text{mm}}{100\ \text{m}} = -5.0025 \times 10^{-4}$$

The depth of focus is the same as it was for the close-up image at f/2:

$$(\Delta z)' = 4 \cdot 0.5\ \mu\text{m} \cdot 2^2 = 8\ \mu\text{m}$$

but the much smaller value for the transverse magnification means that the depth of field is much larger, both at f/2 and at f/16:

$$\Delta z \cong \frac{8\ \mu\text{m}}{(-5.0025 \times 10^{-4})^2} \cong 32\ \text{m} \quad (f/2)$$

$$\Delta z \cong \frac{512\ \mu\text{m}}{(-5.0025 \times 10^{-4})^2} \cong 2\ \text{km} \quad (f/16)$$

Depth of field of lens focused at z1 = 20 ft ≅ 6 m for three focal ratios: f/1.8, f/5.6, and f/16, showing increase in depth of field with increasing focal ratio (from http://www.engadget.com).
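The Rayleigh-based estimate Δz ≅ 4λ0(f/#)²/MT² is easy to evaluate for other object distances and focal ratios. The following sketch assumes a thin lens and works in millimeters; the function name `rayleigh_depths` is hypothetical, and the call shown uses the distant-object case (f = 50 mm, z1 = 100 m) from the text.

```python
def rayleigh_depths(f_mm, z1_mm, f_number, wavelength_mm=0.0005):
    """Return (M_T, depth of focus, depth of field) in mm from Rayleigh's quarter-wave rule."""
    z2_mm = 1.0 / (1.0 / f_mm - 1.0 / z1_mm)            # thin-lens imaging equation
    m_t = -z2_mm / z1_mm                                # transverse magnification
    depth_of_focus = 4.0 * wavelength_mm * f_number ** 2
    depth_of_field = depth_of_focus / m_t ** 2          # divide by |M_L| = M_T^2
    return m_t, depth_of_focus, depth_of_field

for n in (2, 16):
    m_t, dz_img, dz_obj = rayleigh_depths(50.0, 100_000.0, n)
    print(f"f/{n}: M_T = {m_t:.4e}, depth of focus = {dz_img * 1000:.0f} um, "
          f"depth of field = {dz_obj / 1000:.1f} m")
```

Keeping every length in one unit (mm here) avoids the unit slips that are easy to make when micrometers, millimeters, and meters appear in the same expression.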
4.3 Hyperfocal Distance

The last example just presented, where the object distance z1 = 100 m and the depth of field Δz ≅ 2 km, suggests another useful imaging metric: the shortest object distance for which the depth of field extends to infinity, which is called the hyperfocal distance $(z_1)_{hyperfocal}$. The corresponding image distance $(z_2)_{hyperfocal}$ is the sum of the focal length and the “defocus distance” δ′:

$$(z_1)_{hyperfocal} + \delta_1 = \infty \implies (z_2)_{hyperfocal} - \delta' = f \implies (z_2)_{hyperfocal} = f + \delta'$$

The hyperfocal object distance $(z_1)_{hyperfocal}$ satisfies the imaging equation for this image distance:

$$\frac{1}{(z_1)_{hyperfocal}} + \frac{1}{(z_2)_{hyperfocal}} = \frac{1}{f}$$

$$(z_1)_{hyperfocal} = \left( \frac{1}{f} - \frac{1}{f + \delta'} \right)^{-1} = \frac{f^2 + \delta' f}{\delta'} = f + \frac{f^2}{\delta'} \cong f + \frac{f^2}{2 \lambda_0 (f/\#)^2}$$

$$\text{Hyperfocal Distance: } (z_1)_{hyperfocal} \cong \frac{f^2}{2 \lambda_0 (f/\#)^2}$$

where we can also interpret this in terms of the diameter of the diffraction spot:

$$(z_1)_{hyperfocal} \cong \frac{f^2}{(2 \lambda_0\, f/\#) \cdot (f/\#)} = \frac{f^2}{(f/\#) \cdot d_{\text{diffraction spot}}}$$

where $d_{\text{diffraction spot}} \cong 2 \cdot \lambda_0 \cdot f/\#$. So if we have a so-called “normal lens” with f = 50 mm acting at f/2 (close to wide open) and in light with λ0 = 500 nm, the hyperfocal distance is:

$$(z_1)_{hyperfocal} \cong \frac{(50\ \text{mm})^2}{2 \cdot 500\ \text{nm} \cdot 2^2} \cong 625\ \text{m}$$

which is quite distant. If we stop the lens down to f/16, we get:

$$(z_1)_{hyperfocal} \cong \frac{(50\ \text{mm})^2}{2 \cdot 500\ \text{nm} \cdot 16^2} \cong 9.8\ \text{m}$$

which is quite a lot closer to the lens. This means that objects at all distances in the interval 10 m ≲ z1 < ∞ should appear to be “in focus” if the lens is used at f/16.

4.4 Methods for Increasing Depth of Field

1. Google Lens: http://www.google.com/patents/US6320979
2. Focus stacking: digital combinations of images collected at different focus settings. Different images are combined based on local sharpness to produce an image with extended depth of field.
3. Light-field camera = plenoptic camera that captures the four-dimensional field [x, y, z, t].
An example of such a camera is the Lytro, which uses a matrix of microlenses to collect ray direction information in addition to color and lightness. This stored information allows recovery of focused information at different depths.
4. Cameras with different focal settings for different colors of light. The information is combined digitally to merge the sharp edge data from the color with the large f/# with the blurrier structure in the other colors.

4.5 Sidebar: Transverse Magnification vs. Focal Length

It may be useful to derive the relationship between transverse magnification and focal length for a given object distance. We know the imaging equation for object distance $z_1$, image distance $z_2$, and focal length $f$:

$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$$

We already know that for an imaging system consisting of two or more lenses, the object distance is measured to the object-space principal point, the image distance is measured from the image-space principal point, and the focal length is replaced by the effective focal length. For a specific object distance $z_1$ and a fixed focal length $f$, the equation may be rearranged to determine the image distance:

$$z_2 = \frac{z_1 \cdot f}{z_1 - f}$$

We can substitute this into the expression for the transverse magnification:

$$M_T = -\frac{z_2}{z_1} = -\frac{\left(\frac{z_1 \cdot f}{z_1 - f}\right)}{z_1} = -\frac{f}{z_1 - f} = \frac{f}{f - z_1} = -\frac{f}{z_1}\left(\frac{1}{1 - \frac{f}{z_1}}\right)$$

If the focal length is shorter than the object distance, then $\left|\frac{f}{z_1}\right| < 1$ and the second factor may be expanded:

$$\begin{aligned}
M_T &= -\frac{f}{z_1} \cdot \left(\frac{1}{1 - \frac{f}{z_1}}\right) \\
&= -\frac{f}{z_1} \cdot \sum_{n=0}^{\infty} \left(\frac{f}{z_1}\right)^n \\
&= -\frac{f}{z_1} \cdot \left(1 + \frac{f}{z_1} + \left(\frac{f}{z_1}\right)^2 + \cdots\right) \\
&= -\frac{f}{z_1} - \left(\frac{f}{z_1}\right)^2 - \left(\frac{f}{z_1}\right)^3 - \cdots \\
&\cong -\frac{f}{z_1} \quad \text{if } f \ll z_1
\end{aligned}$$

where the geometric series for $(1 - t)^{-1}$ has been used.
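As a quick numerical illustration of how fast the approximation $M_T \cong -f/z_1$ converges, the sketch below compares the exact expression $M_T = f/(f - z_1)$ with partial sums of the geometric series for $(1 - t)^{-1}$, $t = f/z_1$. The function names are hypothetical and the object distance $z_1 = 100 \cdot f$ is chosen only as an example.

```python
def magnification_exact(f, z1):
    """Exact thin-lens transverse magnification M_T = f / (f - z1)."""
    return f / (f - z1)


def magnification_series(f, z1, terms=1):
    """Partial sum of M_T = -(f/z1) * (1 + t + t^2 + ...), with t = f/z1.

    terms=1 gives the leading approximation M_T ~ -f/z1.
    Valid only when |f/z1| < 1.
    """
    t = f / z1
    return -sum(t ** (n + 1) for n in range(terms))


if __name__ == "__main__":
    f, z1 = 1.0, 100.0   # object at 100 focal lengths (any consistent unit)
    print("exact   :", magnification_exact(f, z1))
    for terms in (1, 2, 3):
        print("%d term(s):" % terms, magnification_series(f, z1, terms))
```

For distant objects the first term alone is already within about one percent of the exact value, which is why the single-term form is used in what follows.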
For a lens with a fixed focal length $f$ but two object distances $(z_1)_a$ and $(z_1)_b$, the transverse magnifications are:

$$(M_T)_a \cong -\frac{f}{(z_1)_a}, \qquad (M_T)_b \cong -\frac{f}{(z_1)_b}$$

so the difference in transverse magnifications is:

$$\begin{aligned}
\Delta M_T = (M_T)_a - (M_T)_b &\cong \left(-\frac{f}{(z_1)_a}\right) - \left(-\frac{f}{(z_1)_b}\right) \\
&= (-f) \cdot \left(\frac{1}{(z_1)_a} - \frac{1}{(z_1)_b}\right) \\
&= (-f) \cdot \left(\frac{(z_1)_b - (z_1)_a}{(z_1)_a \cdot (z_1)_b}\right) \\
&= f \cdot \frac{(z_1)_a - (z_1)_b}{(z_1)_a \cdot (z_1)_b} = f \cdot \frac{\Delta z_1}{(z_1)_a \cdot (z_1)_b}
\end{aligned}$$

We have already seen that the transverse magnification varies with the focal length of the lens:

$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{z_2} \cdot \left(\frac{z_2}{z_1} + 1\right) = \frac{1}{z_2} \cdot \left(1 - \left(-\frac{z_2}{z_1}\right)\right) = \frac{1}{z_2} \cdot (1 - M_T)$$

$$\implies \frac{z_2}{f} = 1 - M_T \implies \frac{f}{z_2} = \frac{1}{1 - M_T}$$

If the object distance $z_1$ is large, then $|M_T|$ is close to zero, which means that we can substitute the geometric series:

$$\frac{1}{1 - t} = \sum_{n=0}^{+\infty} t^n \quad \text{if } |t| < 1$$

$$\frac{f}{z_2} = \frac{1}{1 - M_T} = \sum_{n=0}^{+\infty} (M_T)^n = 1 + M_T + (M_T)^2 + \cdots \cong 1 + M_T \quad \text{if } |M_T| < 1 \implies z_2 \simeq f$$

which implies that the magnification increases with the focal length. We should check this for some known cases. If the object distance is $z_1 = +\infty$, then $z_2 = f$ and:

$$z_1 = \infty \implies \frac{f}{z_2} = 1 \cong 1 + |M_T| \implies |M_T| \cong 0, \text{ the correct answer}$$

If the object distance is $z_1 = 100 \cdot f$, then the image distance and approximate transverse magnification are:

$$z_2 = \frac{100}{99} f \implies \frac{f}{z_2} = \frac{99}{100} \cong 1 + M_T \implies M_T \cong -\frac{1}{100}$$

The actual transverse magnification is:

$$M_T = -\frac{\left(\frac{100}{99}\right)}{100} = -\frac{1}{99} \cong -\frac{1}{100}$$

so the approximation is still quite good.

Now consider two distant objects $a$ and $b$ at object distances $(z_1)_a > (z_1)_b \gg f$. We have:

$$\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} = \frac{\Delta z_1}{f}$$

$$\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} \cong (1 + M_T)_a - (1 + M_T)_b = (M_T)_a - (M_T)_b = \Delta M_T$$

$$\frac{\Delta z_1}{f} \cong \Delta M_T$$

which shows that the difference in transverse magnifications decreases as the focal length $f$ increases for fixed $\Delta z_1$.
In words, if two distant objects are separated along the optical axis by the distance $\Delta z_1$, the transverse magnifications for the two objects are more similar if the focal length $f$ is large, which gives the impression to the viewer that the objects are "close together." Consider the example shown below; the subjects are a pair of 15-inch-diameter Rodman smoothbore cannon dating from 1864 that are preserved on restored carriages at Fort Foote, Maryland, near my childhood home (when I was growing up, the two barrels had not been mounted, but were lying on the ground). The near and distant cannons are separated by the fixed distance $\Delta z_1$. The images were taken with a zoom lens: the first used a "telephoto" setting with equivalent focal length $f_1 = 140$ mm for the 35 mm film format (the actual focal length was $f_1 = 22.2$ mm). The second image was taken with equivalent focal length $f_2 = 32$ mm for the 35 mm format (a "wide-angle" lens; the actual focal length was $f_2 = 6.6$ mm). The difference in transverse magnifications clearly is smaller with the long focal length (first image), where the distant cannon is readily visible; the tiny distant cannon is barely visible in the second image. The transverse magnifications for the background cannon differ by nearly a factor of 2.5 for the two images. This effect leads to the statement that telephoto lenses "compress" the depth of field (though some vigorously dispute this statement for psychological reasons!).

Illustration of the variation in transverse magnification with focal length of the lens. The equivalent focal length of the lens used to make the top image is $f \cong 140$ mm (telephoto) and that for the bottom is $f \cong 32$ mm (wide angle). The background cannon is much smaller in the second image.

Chapter 5 Aberrations

Aberrations may be loosely defined as deviations from predicted behavior of an optical system.
Chromatic aberrations describe deviations from predicted behavior due to variations in the refractive index for different wavelengths of light. Monochromatic aberrations are variations from calculated behavior due to the approximations used. For example, if we use just the first-order approximation

$$\sin[\theta] \cong \tan[\theta] \cong \theta$$

we can describe the deviations from predicted first-order behavior as the third-order aberrations. The aberrations may be described in terms of waves or of rays. The wave aberration is the departure of the wavefront from the ideal spherical wave that "should" emerge from the exit pupil of the system to the image:

$$p[x, y] \cdot \exp[+i\Phi[x, y]] = p[x, y] \cdot \exp[+i\pi W[x, y]]$$

where $W[x, y]$ is the scalar wave aberration function measured in units of $\pi$ radians at each point in the exit pupil. Note that the spherical wave "converges" to a real image or "diverges" from a virtual image. The wave aberration function is the difference of the actual emerging wave from the ideal sphere, which has the form:

$$x^2 + y^2 + z^2 = R^2 \implies z = R \cdot \sqrt{1 - \frac{x^2 + y^2}{R^2}}$$

5.1 Chromatic Aberration

In the earliest days of optics, all optical systems were constructed from single lenses ("singlets") and therefore suffered from chromatic aberrations due to the physical mechanism of dispersion. We saw that the index of refraction of optical materials decreases with increasing wavelength $\lambda$ in regions of normal dispersion. At longer wavelengths in a regime with normal dispersion, a lens with positive power will have less refractive power $\phi$ (longer focal length $f$). Conversely, a lens with negative power will have a longer negative focal length at longer wavelengths. The impact of chromatic aberration on the image is minimized if the focal length is long and the focal ratio is large. For this reason, early telescopes for astronomical viewing were made very long, in part for magnification and in part to reduce the visibility of chromatic aberrations.
The aerial telescope of Johannes Hevelius with a focal length of $f = 45$ m $\cong 148$ ft and an aperture diameter of $d \cong 220$ mm $\cong 8.5$ in.

The observation that different glasses have different dispersions is the basis for the principle of achromatization (from the Greek words for "without color"), where two optical elements made from glasses with different dispersion characteristics are combined to match the focal lengths at two different wavelengths (typically red and blue). An achromatic doublet is fabricated from a positive element made from crown glass with a lower refractive index and lower dispersion, and a negative element made of flint glass with a larger refractive index and a larger dispersion. For an achromat with a positive focal length (converging lens), the lens is made of a positive lens from crown glass and a negative lens from flint glass so that the chromatic aberrations act in opposition to match at the two wavelengths. If the component lenses are in contact (and often the curvatures are designed to match so that they may be cemented together), then the positive element must have the larger power (shorter focal length).

Lens systems may be built that correct for three or more wavelengths. It may be obvious that the number of elements must match or exceed the number of corrected wavelengths. Apochromats have at least three elements to correct the focal length at three different wavelengths (typically red, green, and blue) and are fabricated from three glass elements with different dispersion characteristics. Of course, the need for the additional element(s) means that apochromats tend to be more expensive than achromats.

Principle of the achromat: the first singlet lens exhibits chromatic aberration because of the dispersion of the glass ($n_{\text{red}} < n_{\text{green}} < n_{\text{blue}}$), which means that red light focuses farther away.
Add a second element of flint glass with negative power that matches the focal lengths for red and blue light to form an "achromat."

Apochromat made of three elements to correct focus at three wavelengths.

The traditional wavelengths used to design optics were specified by Fraunhofer based on absorption lines in the solar spectrum:

Line    λ [nm]    n for Crown    n for Flint
C       656.28    1.51418        1.69427
D       589.59    1.51666        1.70100
F       486.13    1.52225        1.71748

The design of achromats is based on the dispersion of the glass, which we already specified:

Refractivity: $n_D - 1$, with $1.5 \le n_D \le 1.75$
Mean Dispersion: $n_F - n_C > 0$, the difference between the blue and red indices
Partial Dispersion: $n_D - n_C > 0$, the difference between the yellow and red indices
Abbé Number: $\nu \equiv \dfrac{n_D - 1}{n_F - n_C}$, the ratio of refractivity and mean dispersion, with $25 \le \nu \le 65$

For a single thin lens, the power of the system is:

$$\phi = \frac{1}{f} = (n - 1) \cdot \left(\frac{1}{R_1} - \frac{1}{R_2}\right) \equiv (n - 1) \cdot (C_1 - C_2)$$

where $C \equiv \dfrac{1}{R}$ is the curvature. The effect of dispersion on the power is obtained by differentiating:

$$\frac{d\phi}{dn} = (C_1 - C_2) = \frac{\phi}{n - 1} \implies d\phi = \phi \cdot \frac{dn}{n - 1} = \phi \cdot \frac{n_F - n_C}{n_D - 1} \equiv \frac{\phi}{\nu}$$

where $\nu$ is the Abbé number.

For a two-lens system, we have already determined the formula for the power:

$$\phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$

$$\implies d\phi_{\text{eff}} = d\phi_1 + d\phi_2 - \phi_2 t \cdot d\phi_1 - \phi_1 t \cdot d\phi_2 = (1 - \phi_2 t) \cdot d\phi_1 + (1 - \phi_1 t) \cdot d\phi_2$$

The power at the two wavelengths is matched so that:

$$d\phi_{\text{eff}} = 0 = (1 - \phi_2 t) \cdot d\phi_1 + (1 - \phi_1 t) \cdot d\phi_2 = (1 - \phi_2 t) \cdot \frac{\phi_1}{\nu_1} + (1 - \phi_1 t) \cdot \frac{\phi_2}{\nu_2}$$

$$\implies -(1 - \phi_2 t) \cdot \frac{\phi_1}{\nu_1} = (1 - \phi_1 t) \cdot \frac{\phi_2}{\nu_2}$$

$$\implies t = \frac{\dfrac{\nu_1}{\phi_1} + \dfrac{\nu_2}{\phi_2}}{\nu_1 + \nu_2} = \frac{f_1 \nu_1 + f_2 \nu_2}{\nu_1 + \nu_2}, \qquad \phi_{\text{eff}} = \frac{\phi_1 \nu_1 + \phi_2 \nu_2}{\nu_1 + \nu_2}$$

If the two lenses are in contact so that $t = 0$, then:

$$\frac{\phi_2}{\phi_1} = \frac{f_1}{f_2} = -\frac{\nu_2}{\nu_1}$$

for an achromat that has the same focal length for red light (C line, $\lambda = 656.28$ nm) and blue light (F line, $\lambda = 486.13$ nm). Note that it is possible to use the same glass and adjust the focal lengths and distance to achromatize.
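The achromatization conditions can be tried out on the crown and flint indices tabulated above. The following is a minimal sketch under stated assumptions: thin lenses, the contact condition $\phi_1/\nu_1 + \phi_2/\nu_2 = 0$ together with $\phi_1 + \phi_2 = \phi$, and a target focal length of 100 mm chosen only for illustration. The function names are hypothetical.

```python
def abbe_number(n_c, n_d, n_f):
    """Abbe number nu = (n_D - 1) / (n_F - n_C)."""
    return (n_d - 1.0) / (n_f - n_c)


def contact_doublet_powers(total_power, nu1, nu2):
    """Thin lenses in contact (t = 0).

    Solves phi1 + phi2 = total_power and phi1/nu1 + phi2/nu2 = 0.
    """
    phi1 = total_power * nu1 / (nu1 - nu2)
    phi2 = -total_power * nu2 / (nu1 - nu2)
    return phi1, phi2


def achromatic_separation(f1, f2, nu1, nu2):
    """Separation t = (f1*nu1 + f2*nu2) / (nu1 + nu2) that makes d(phi_eff) = 0."""
    return (f1 * nu1 + f2 * nu2) / (nu1 + nu2)


if __name__ == "__main__":
    # Indices at the C, D, and F lines from the table above.
    nu_crown = abbe_number(1.51418, 1.51666, 1.52225)
    nu_flint = abbe_number(1.69427, 1.70100, 1.71748)
    print("nu (crown) = %.2f, nu (flint) = %.2f" % (nu_crown, nu_flint))

    # Assumed target: a cemented doublet with f = 100 mm (power of 10 diopters).
    phi1, phi2 = contact_doublet_powers(10.0, nu_crown, nu_flint)
    print("crown element: f1 = %.1f mm" % (1e3 / phi1))
    print("flint element: f2 = %.1f mm" % (1e3 / phi2))
```

The positive crown element comes out with a shorter focal length than the doublet as a whole, and the flint element is negative and weaker, in line with the qualitative description of the achromat.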
If $\nu_1 = \nu_2 \equiv \nu$, then:

$$t = \frac{f_1 \nu_1 + f_2 \nu_2}{\nu_1 + \nu_2} \to t = \frac{(f_1 + f_2)\,\nu}{2\nu} = \frac{f_1 + f_2}{2}$$

$$\frac{1}{f_{\text{eff}}} = \phi_{\text{eff}} = \frac{\phi_1 + \phi_2}{2} \implies f_{\text{eff}} = \frac{2}{\left(\dfrac{1}{f_1} + \dfrac{1}{f_2}\right)} = 2 \cdot \frac{f_1 f_2}{f_1 + f_2}$$

5.2 Third-Order Optics, Monochromatic Aberrations

Aberrations may be interpreted as corrections to the paraxial imaging behavior of optics that result from adding the second term to the approximations for the trigonometric functions:

$$\sin[\varphi] \cong \varphi - \frac{\varphi^3}{3!}, \qquad \cos[\varphi] \cong 1 - \frac{\varphi^2}{2!}, \qquad \tan[\varphi] \cong \varphi + \frac{\varphi^3}{3}$$

The expression for the cosine may be substituted into the formula for the path length $\ell_1$ of the ray in terms of the object distance $z_1$, the angle $\varphi$, and the radius of curvature $R$:

$$\frac{\ell_1}{z_1} = \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)(1 - \cos[\varphi])\right)^{\frac{1}{2}}$$

$$\left(\frac{\ell_1}{z_1}\right)_{\text{third order}} = \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)\left(1 - \left(1 - \frac{\varphi^2}{2!}\right)\right)\right)^{\frac{1}{2}} = \left(1 + \frac{R\varphi^2}{z_1}\left(\frac{R}{z_1} + 1\right)\right)^{\frac{1}{2}}$$

$$\implies \ell_1 \cong \left(z_1^2 + R\varphi^2 \cdot (R + z_1)\right)^{\frac{1}{2}}$$

which is a significantly more complicated expression than the first-order solution:

$$\left(\frac{\ell_1}{z_1}\right)_{\text{first order}} \cong 1 \implies \ell_1 \cong z_1$$

The wavefront emerging from the aperture of the system (the exit pupil) may be characterized by its shape or by rays at different locations in the pupil that are orthogonal to the wavefront. The rays are defined by the end-point coordinates in the pupil plane (with height $r$ from which they emerge) and in the image plane (with height $r_0$ to which they travel). The deviations of the wave or of the rays from the ideal behavior are characterized by the concept of ray aberrations, which typically are specified as a set of numerical values (coefficients) that describe the amount of deviation of the ray or of the wavefront from the ideal. The order of the aberrations is determined by the highest power of the term kept in the expansion for the sine in Snell's law:

$$\sin[\theta] = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots$$

The higher powers in the expansion account for the growing deviation of the first-order calculation from the actual behavior at larger off-axis angles.
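To get a feel for how much each additional term of the sine expansion matters, the sketch below evaluates the truncated series at a few angles and prints the error relative to `math.sin`. The function name `sine_series` and the sample angles are illustrative choices, not taken from the notes.

```python
import math


def sine_series(theta, order):
    """Maclaurin series of sin(theta) truncated after the term theta**order.

    order = 1 is the paraxial (first-order) approximation,
    order = 3 adds the third-order term, and so on (odd orders only).
    """
    total = 0.0
    for k in range((order + 1) // 2):
        power = 2 * k + 1
        total += (-1) ** k * theta ** power / math.factorial(power)
    return total


if __name__ == "__main__":
    for degrees in (5, 15, 30):
        theta = math.radians(degrees)
        exact = math.sin(theta)
        errors = [abs(sine_series(theta, order) - exact) for order in (1, 3, 5)]
        print("%2d deg: first-order error %.2e, third-order %.2e, fifth-order %.2e"
              % (degrees, errors[0], errors[1], errors[2]))
```

The first-order error grows roughly as $\theta^3/6$, which is the size of the third-order correction itself.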
We can also consider deviations of the actual wavefront from the ideal in first-order paraxial or Gaussian optics. For example, a translation of the ideal wavefront down the z-axis from the "ideal" image location may be characterized by an "aberration" that is called defocus. The decomposition of the wavefront into deviations from the ideal requires six coefficients of powers of $r$ and $r_0$:

Spherical Aberration    $r^4$
Coma                    $r^3 r_0 \cos[\theta]$
Astigmatism             $r^2 r_0^2 \cos^2[\theta]$
Curvature of Field      $r^2 r_0^2$
Distortion              $r\, r_0^3 \cos[\theta]$
Piston Error            $r_0^4$

The last of these, piston error, is a measure of a z-axis translation of the wavefront analogous to defocus. As such, it has no effect on the image and often is not included in the list of aberrations.

In spherical aberration with positive coefficients, the rays from the margin of the pupil cross the axis closer to the optic than the paraxial rays. The image of a point object created by a system with spherical aberration shows a bright central region surrounded by a "halo" of light from the margin of the pupil.

Spherical aberration describes the deviation of the rays emerging from the pupil from the ideal convergence to an image point. If the aberration coefficient is positive, the rays emerging from the margin of the pupil cross the optical axis closer to the optic than the paraxial rays close to the axis. In other words, the focal length for marginal rays is shorter than that for paraxial rays. Spherical aberration is a circularly symmetric deviation of the wavefront from the quadratic-phase ideal of Gaussian optics. The resulting wavefront emerging from the pupil varies as the 4th power of the pupil coordinates, which has the shape of a china bowl. This shows that the rays near the edge of the pupil are directed towards a point on the axis that is closer to the optic.
Since spherical aberration is a function only of the pupil-plane coordinates, it describes a shift-invariant deviation that may be characterized by an impulse response.

The shape of the wavefronts emerging from the pupil for spherical aberration (black) and defocus (red). Marginal rays emerging from a pupil that exhibits spherical aberration will cross the axis (i.e., "focus") closer to the pupil than the paraxial rays.

For coma, the deviations from ideal performance are larger for larger values of the image-plane coordinate $r_0$. If a point source and its image are located on axis, coma in the system will have no effect on the image, but the image of a point source located off axis will be spread differently at different values of the image-plane coordinates. The image of an off-axis point source will be "teardrop" shaped.

To introduce the concept of monochromatic aberrations, consider the complex amplitude of the wavefront diverging from a specific object point $[x_0, y_0]$ to the location $[x, y]$ in the entrance pupil:

$$w[x, y; x_0, y_0] = p[x, y] \cdot \exp[+i\,\Phi[x, y; x_0, y_0]]$$

where:

$$\exp[+i\,\Phi[x, y; x_0, y_0]] = \exp\left[+2\pi i \frac{z_1}{\lambda_0}\right] \cdot \exp\left[+i\pi \frac{r^2}{\lambda_0}\left(\frac{1}{z_1} - \frac{1}{f}\right)\right] \cdot \exp[+2\pi i \cdot \Delta\Phi[x, y; x_0, y_0]]$$

Here $\Phi$ is the phase at the pupil due to a point source located at $[x_0, y_0]$ in the object plane, which includes the quadratic phase of the ideal "spherical" wavefront converging to the image point plus any phase error $\Delta\Phi[x, y; x_0, y_0]$, and $p[x, y]$ specifies the magnitude function of the pupil (the so-called apodization function). A similar expression may be written for light converging to the image point $[x_0', y_0']$ from the location $[x', y']$ in the exit pupil. If the actual wavefront at $[x, y]$ in the pupil lags behind the ideal sphere (actually a paraboloid), then the light from that location converging to the image plane must have been emitted earlier in time; the phase difference $\Delta\Phi$ at that location $[x, y]$ in the pupil is positive.
The map of $\Delta\Phi[x, y; x_0, y_0]$ may be decomposed into different "shapes" described by different powers of the object coordinates $[x_0, y_0]$ and of the pupil coordinates $[x, y]$. The weights of each of these different shapes present in the actual wavefront are the aberration coefficients, which are commonly used to specify the differences of the behavior from the ideal.

Comparison of ideal and actual wavefronts emerging from optical system. The difference between the wavefronts may be specified by the difference in phase or by the intersections of rays normal to the wavefront.

Alternatively, we can describe the difference in action of the optic from the ideal in terms of the "rays" from different points in the pupil. The rays are (of course) perpendicular to the wavefront emerging from the pupil. Unaberrated rays should all cross the optical axis exactly at the image point. Rays from an aberrated wavefront will cross at different locations.

Rays from different points on the wavefront emerging from the pupil of an optic with spherical aberration; the rays cross the optical axis at different locations.

The aberration function specifies the difference in optical phase between the actual and ideal wavefronts that converge to the ideal real image point (or diverge from the ideal virtual image point). Since the shape of the wavefront due to a point object generally varies with its location in the object plane, the aberration function generally depends on coordinates in both the object and pupil planes; it is a 4-D function. The coordinates used in the calculations of the rays are shown in the figure:

Coordinates used to evaluate aberrations. Light propagates from the pupil plane (coordinates without subscripts) over the distance $z_2$ to the image plane (coordinates with subscripts). Note that the pupil and image plane coordinates are normalized so that $r_{\max} = (r_0)_{\max} = 1$.
A ray of light with wavelength $\lambda_0$ that emerges from the exit pupil at $[x, y]$ and crosses the image plane at $[x_0, y_0]$ has the form:

$$w[x, y; x_0, y_0] = p[x, y] \cdot \exp[+2\pi i \cdot \Phi[x, y; x_0, y_0]]$$

where $p[x, y]$ specifies the magnitude of the pupil transmittance of the exit pupil (the so-called apodization function) and $\Phi[x, y; x_0, y_0]$ is the phase at the pupil for an object point at coordinates $[x_0, y_0]$ emerging from the pupil at $[x, y]$. The phase includes the converging "spherical" (actually parabolic) wave and the phase difference term:

$$\Phi[x, y; x_0, y_0] = \frac{r^2}{2\lambda_0}\left(\frac{1}{f} - \frac{1}{z_2}\right) + \Delta\Phi[x, y; x_0, y_0]$$

We consider the locations in polar coordinates: the image location is $[x_0, y_0] = (r_0, \alpha)$ and the pupil coordinates are $[x, y] = (r, \theta)$. If the optical system has a circular cross-section (i.e., if the optical system is rotationally symmetric), then the behavior of the aberration does not depend on the absolute azimuthal coordinates but only on their difference, so that we can consider a three-dimensional description based on the radial coordinates $r$, $r_0$, and the relative azimuthal angle $\theta - \alpha \equiv \varphi$; i.e., we can write the phase error function in the form $\Delta\Phi[r, r_0, \varphi]$.
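The relative phase between the object point and a location in the pupil is $2\pi$ radians (per cycle) multiplied by the number of cycles, which is the distance between the locations in the object plane and in the pupil divided by the wavelength $\lambda_0$:

$$\text{distance: } R = \left\{z^2 + (r\cos\theta - r_0\cos\alpha)^2 + (r\sin\theta - r_0\sin\alpha)^2\right\}^{\frac{1}{2}}$$

$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] &= 2\pi\frac{R}{\lambda_0} = \frac{2\pi}{\lambda_0}\left\{z^2 + (r\cos\theta - r_0\cos\alpha)^2 + (r\sin\theta - r_0\sin\alpha)^2\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + \left(r^2\cos^2\theta + r_0^2\cos^2\alpha - 2rr_0\cos\theta\cos\alpha\right) + \left(r^2\sin^2\theta + r_0^2\sin^2\alpha - 2rr_0\sin\theta\sin\alpha\right)\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + r^2 + r_0^2 - 2rr_0(\cos\theta\cos\alpha + \sin\theta\sin\alpha)\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + r^2 + r_0^2 - 2rr_0\cos[\theta - \alpha]\right\}^{\frac{1}{2}} \\
&= 2\pi \cdot \frac{z}{\lambda_0}\left\{1 + \left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\theta - \alpha]\right]\right\}^{\frac{1}{2}} \\
&\equiv 2\pi \cdot \frac{z}{\lambda_0}\left\{1 + \left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right]\right\}^{\frac{1}{2}}
\end{aligned}$$

This expression may be expanded into a power series via the binomial theorem:

$$(1 + u)^n = 1 + \frac{n}{1!}u + \frac{n(n - 1)}{2!}u^2 + \cdots \implies (1 + u)^{\frac{1}{2}} = 1 + \frac{1}{2}u - \frac{1}{8}u^2 + \frac{1}{16}u^3 - \cdots$$

In the current expression, we can identify:

$$u \equiv \left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi] \implies \frac{1}{2}u = \left(\frac{r^2 + r_0^2}{2z^2}\right) - \left(\frac{rr_0}{z^2}\right)\cos[\varphi]$$

$$\begin{aligned}
-\frac{1}{8}u^2 &= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right]^2 \\
&= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right)^2 + \left(\frac{2rr_0}{z^2}\cos[\varphi]\right)^2 - 2\left(\frac{r^2 + r_0^2}{z^2}\right)\left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right] \\
&= -\frac{1}{8}\left[\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4}\right) + \left(\frac{4r^2r_0^2}{z^4}\right)\cos^2[\varphi] - 4\left(\frac{r^2 + r_0^2}{z^2}\right)\left(\frac{rr_0}{z^2}\right)\cos[\varphi]\right] \\
&= -\frac{1}{8}\left[\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4}\right) + \left(\frac{4r^2r_0^2}{z^4}\right)\cos^2[\varphi] - 4\left(\frac{r^3r_0}{z^4}\cos[\varphi] + \frac{rr_0^3}{z^4}\cos[\varphi]\right)\right] \\
&= -\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4}\right) - \left(\frac{r^2r_0^2}{2z^4}\right)\cos^2[\varphi] + \left(\frac{r^3r_0}{2z^4}\right)\cos[\varphi] + \left(\frac{rr_0^3}{2z^4}\right)\cos[\varphi]
\end{aligned}$$

So the power series for the phase function truncated to the second order becomes:

$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\ & 2\pi\frac{z}{\lambda_0}\left(1 + \left(\frac{r^2 + r_0^2}{2z^2}\right) - \left(\frac{rr_0}{z^2}\right)\cos[\varphi]\right) \\
&+ 2\pi\frac{z}{\lambda_0}\left(-\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4}\right) - \left(\frac{r^2r_0^2}{2z^4}\right)\cos^2[\varphi] + \left(\frac{r^3r_0}{2z^4}\right)\cos[\varphi] + \left(\frac{rr_0^3}{2z^4}\right)\cos[\varphi]\right)
\end{aligned}$$

Now we can multiply through by the leading factor of $2\pi\dfrac{z}{\lambda_0}$, which produces 10 terms: a constant, three terms from the first-order polynomial, and six from the second-order polynomial:

$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\ & 2\pi\frac{z}{\lambda_0} + 2\pi\frac{r^2 + r_0^2}{2\lambda_0 z} - 2\pi\frac{rr_0}{\lambda_0 z}\cos[\varphi] \\
&- 2\pi\frac{r^4 + r_0^4 + 2r^2r_0^2}{8\lambda_0 z^3} - 2\pi\frac{r^2r_0^2}{2\lambda_0 z^3}\cos^2[\varphi] + 2\pi\frac{r^3r_0}{2\lambda_0 z^3}\cos[\varphi] + 2\pi\frac{rr_0^3}{2\lambda_0 z^3}\cos[\varphi] \\
=\ & 2\pi\frac{z}{\lambda_0} + 2\pi\frac{r^2}{2\lambda_0 z} + 2\pi\frac{r_0^2}{2\lambda_0 z} - 2\pi\frac{rr_0}{\lambda_0 z}\cos[\varphi] \\
&- 2\pi\frac{r^4}{8\lambda_0 z^3} - 2\pi\frac{r_0^4}{8\lambda_0 z^3} - 2\pi\frac{r^2r_0^2}{4\lambda_0 z^3} - 2\pi\frac{r^2r_0^2}{2\lambda_0 z^3}\cos^2[\varphi] + 2\pi\frac{r^3r_0}{2\lambda_0 z^3}\cos[\varphi] + 2\pi\frac{rr_0^3}{2\lambda_0 z^3}\cos[\varphi]
\end{aligned}$$

which may be reordered into:

$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\ & 2\pi\frac{z}{\lambda_0} \\
&+ \frac{\pi}{\lambda_0 z}r^2 + \frac{\pi}{\lambda_0 z}r_0^2 - \frac{2\pi}{\lambda_0 z} \cdot r\,r_0\cos[\varphi] \\
&- \frac{\pi}{4\lambda_0 z^3}r^4 + \frac{\pi}{\lambda_0 z^3}r^3r_0\cos[\varphi] - \frac{\pi}{2\lambda_0 z^3}r^2r_0^2 \\
&- \frac{\pi}{\lambda_0 z^3}r^2r_0^2\cos^2[\varphi] + \frac{\pi}{\lambda_0 z^3}r\,r_0^3\cos[\varphi] - \frac{\pi}{4\lambda_0 z^3}r_0^4
\end{aligned}$$

In other words, we have "decomposed" the phase of the spherical wave into terms with different powers of the coordinate in the pupil plane (with coordinates $[x, y] = (r, \theta)$) and in the image plane (with coordinates $[x_0, y_0] = (r_0, \alpha)$) in a manner analogous to the decomposition into sinusoidal components in the Fourier transform. Our goal will be to decompose the phase difference between the ideal and actual wavefronts using these same terms. Again, since the system is assumed circularly symmetric, only the difference in azimuthal coordinates $\theta - \alpha \equiv \varphi$ is relevant.

5.2.1 Names of Aberrations

The difference in the shape of the "actual" wavefront from the ideal spherical wavefront is decomposed into the same terms as the phase; each term has its unique "shape" and name, and will be described by a coefficient that determines "how much" of each "shape" is present in the phase difference.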
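The ten-term decomposition of the spherical-wave phase can be checked numerically against the exact path length. The sketch below is an illustration under assumed sample values ($z = 10$, $\lambda_0 = 1$, small $r$ and $r_0$ in the same arbitrary units); the function names are hypothetical. It compares the exact phase $2\pi R/\lambda_0$, the binomial series truncated after the $u^2$ term, and the explicit sum of ten terms.

```python
import math


def phase_exact(r, r0, phi, z, wavelength):
    """Exact phase 2*pi*R/lambda0 for the pupil-to-image path length R."""
    distance = math.sqrt(z * z + r * r + r0 * r0 - 2.0 * r * r0 * math.cos(phi))
    return 2.0 * math.pi * distance / wavelength


def phase_series(r, r0, phi, z, wavelength):
    """Binomial expansion (1 + u)^(1/2) ~ 1 + u/2 - u^2/8 applied to the phase."""
    u = (r * r + r0 * r0) / z ** 2 - 2.0 * r * r0 * math.cos(phi) / z ** 2
    return 2.0 * math.pi * (z / wavelength) * (1.0 + u / 2.0 - u * u / 8.0)


def phase_ten_terms(r, r0, phi, z, wavelength):
    """The same truncated series written out as ten separate terms."""
    c = math.cos(phi)
    a = math.pi / (wavelength * z)        # scale of the three quadratic terms
    b = math.pi / (wavelength * z ** 3)   # scale of the six quartic terms
    terms = [
        2.0 * math.pi * z / wavelength,   # constant (propagation)
        a * r ** 2,                       # quadratic in pupil (defocus shape)
        a * r0 ** 2,                      # quadratic in image height (piston)
        -2.0 * a * r * r0 * c,            # bilinear (tip-tilt shape)
        -b * r ** 4 / 4.0,                # spherical aberration shape
        b * r ** 3 * r0 * c,              # coma shape
        -b * r ** 2 * r0 ** 2 / 2.0,      # field curvature shape
        -b * r ** 2 * r0 ** 2 * c ** 2,   # astigmatism shape
        b * r * r0 ** 3 * c,              # distortion shape
        -b * r0 ** 4 / 4.0,               # quartic piston
    ]
    return sum(terms)


if __name__ == "__main__":
    args = (0.1, 0.05, 0.7, 10.0, 1.0)    # r, r0, phi, z, lambda0 (assumed values)
    print("exact     :", phase_exact(*args))
    print("series    :", phase_series(*args))
    print("ten terms :", phase_ten_terms(*args))
```

The residual between the exact and truncated values is of the order of the neglected $u^3/16$ term, so it shrinks rapidly as $r/z$ and $r_0/z$ decrease.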
From the series above, we can apply weighting coefficients to the three relevant coordinates distinguished by subscripts: the index $j$ of the power of the radial coordinate $r_0$ at the image (the "image height"), the index $m$ of the power of the radial coordinate $r$ at the pupil, and the index $n$ of the power of $\cos[\varphi]$. From the series above we can see that only some powers are included in the summation, so we can write the phase difference as

$$\begin{aligned}
\Delta\Phi[x, y; x_0, y_0, z] &= \Phi_{\text{ideal}}[x, y; x_0, y_0, z] - \Phi_{\text{actual}}[x, y; x_0, y_0, z] = \sum_{j,m,n} W_{jmn}\, r_0^{\,j}\, r^m \cos^n\varphi \\
&= W_{000} && \text{(propagation from pupil to image)} \\
&+ W_{200}\, r_0^2 && \text{(piston error)} \\
&+ W_{111}\, r_0\, r \cos\varphi && \text{(tip-tilt)} \\
&+ W_{020}\, r^2 && \text{(defocus)} \\
&+ W_{040}\, r^4 && \text{(spherical aberration)} \\
&+ W_{131}\, r_0\, r^3 \cos\varphi && \text{(coma)} \\
&+ W_{220}\, r_0^2\, r^2 && \text{(curvature of field)} \\
&+ W_{222}\, r_0^2\, r^2 \cos^2\varphi && \text{(astigmatism)} \\
&+ W_{311}\, r_0^3\, r \cos\varphi && \text{(distortion)} \\
&+ W_{400}\, r_0^4 && \text{(piston error)} \\
&+ \cdots
\end{aligned}$$

The coefficients $W_{jmn}$ measure the "amplitudes" of the individual terms and typically are specified in units of wavelengths (the "number of waves" of the aberration) at the edge of the pupil (i.e., at $r = 1$); they must be multiplied by $2\pi$ radians per wavelength to convert to phase angle. For example, a sample system might be specified as having "one-half wave of spherical and a quarter wave of astigmatism."

Shift Invariant or Not?

Note that phase errors that depend on $r_0$ will produce different images for different image "heights" and therefore are shift-variant effects that strictly cannot be characterized by impulse responses and/or transfer functions. That being said, it is common practice to examine the "impulse response" and/or the "transfer function" in a local region as though the aberration were shift invariant, which allows the analyst to create a ("pseudo") frequency-domain description of the action of the aberration.
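The sum over $W_{jmn}$ translates directly into code. The sketch below is a hypothetical helper (the dictionary layout keyed by $(j, m, n)$ is an assumption made for illustration) that evaluates the wave aberration in waves at normalized coordinates and converts it to phase, using the sample system quoted above with one-half wave of spherical aberration and a quarter wave of astigmatism.

```python
import math


def wave_aberration(coefficients, r, r0, phi):
    """Sum of W_jmn * r0**j * r**m * cos(phi)**n, in waves.

    coefficients: dict mapping (j, m, n) -> W_jmn in waves
    r, r0: normalized pupil and image radial coordinates (0..1)
    phi: relative azimuth theta - alpha, in radians
    """
    return sum(w * r0 ** j * r ** m * math.cos(phi) ** n
               for (j, m, n), w in coefficients.items())


def phase_error(coefficients, r, r0, phi):
    """Phase error in radians: 2*pi radians per wave of aberration."""
    return 2.0 * math.pi * wave_aberration(coefficients, r, r0, phi)


if __name__ == "__main__":
    # "One-half wave of spherical and a quarter wave of astigmatism"
    sample = {(0, 4, 0): 0.5, (2, 2, 2): 0.25}
    for r0 in (0.0, 0.5, 1.0):
        waves = wave_aberration(sample, r=1.0, r0=r0, phi=0.0)
        print("image height r0 = %.1f: %.4f waves at the pupil edge" % (r0, waves))
```

Only the astigmatism term changes with $r_0$ in this example, which is the shift-variant behavior discussed above; the spherical term is the same at every image height.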
5.2.2 Aberration Coefficients

To get an idea of the behavior in the wavefront due to these terms, we can plot graphs of these "shapes" at the pupil for specified locations in the object plane. The examples are plotted for different object locations and assuming that $\lambda_0 = z_2 = 1$. The aberrations are grouped by the numerical powers of the radial terms in the series, e.g., $j + m = 0$ for $W_{000}$; $j + m = 2$ for $W_{200}$, $W_{111}$, and $W_{020}$; $j + m = 4$ for $W_{040}$, $W_{131}$, etc. You might expect that the second-order grouping would include $W_{200}$ (piston error), $W_{111}$ (tip-tilt), and $W_{020}$ (defocus). However, for historical reasons, the groupings are based on the powers for the "rays" derived from the "wavefronts" via the gradient operator (a first-order derivative), so these three form the group of the first-order aberrations. The terms with $j + m = 4$ are the third-order aberrations, etc.

Zero-Order Term: Propagation

constant phase (zero-order piston error = propagation from pupil to image):

$$\Delta\Phi[x, y; x_0, y_0, z] = 2\pi \cdot W_{000} \cdot \begin{cases} \dfrac{1}{\lambda_0} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$

The coefficient $W_{000}$ is the number of incremental wavelengths due to propagation "downstream" from the object to the pupil, which is a normal part of the imaging; it is not considered to be an aberration. In any event, its only effect on the irradiance is the constant attenuation of the image field due to the inverse square law, identical to the constant phase term in the Fresnel and Fraunhofer diffraction terms.

zero-order term, constant phase, piston error aberration

Second-Order Wave (First-Order Ray) Aberrations:

These include the three terms for which the sums of the powers of $r$ and $r_0$ equal two. Since the rays are oriented orthogonal to the wavefront (and therefore must be calculated by derivatives), these correspond to the "first-order" aberrations for rays.
In fact, these three terms often are not considered to be aberrations since the only one that has a degrading effect on an irradiance image is defocus, which may (of course) be compensated by changing the location of the sensor so that it coincides with the image.

Constant Phase: First-Order Piston Error

constant phase (first-order piston error):

$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{200} \cdot \begin{cases} +\dfrac{r_0^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$

This is an additional constant phase due to the off-axis location in the image plane; it is quadratic in the image coordinate, but constant in the pupil coordinate, so it is a constant for a particular image location. Since this measures the "constant" phase difference, it has no effect on the measured irradiance and therefore no impact on the quality of the image.

constant phase from first-order terms: piston error

Bilinear Phase: "Tip-Tilt"

linear phase from both object and pupil (tip or tilt):

$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{111} \cdot \begin{cases} -\dfrac{r\,r_0}{\lambda_0 z}\cos[\varphi] & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$

A phase that has linear contributions from the pupil location $r$ and image location $r_0$ (a "bilinear" phase) means that the shape of the field emerging from the pupil for a particular object location is a "flat" plane tilted in proportion to the off-axis position of the object and the image. Because it is a linear phase in the pupil, it displaces the resulting image towards the direction where the phase is negative.

In atmospheric imaging scenarios (imaging along a vertical path through turbulence), the time-varying tip-tilt aberration is dominant. For example, the centers of the images of individual stars appear to move around over short time intervals of the order of hundredths of a second. The correction of tip-tilt aberration has a very significant positive effect on the quality of the resulting image.
For an example, see the animated GIF file at URL: http://www.ast.cam.ac.uk/~optics/Lucky_Web_Site/100Her_10ms_200fr.gif

first-order linear term, tip-tilt error

Quadratic-Phase Error, Focus Shift = "Defocus"

quadratic phase $\implies$ defocus = focus shift:

$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{020} \cdot \begin{cases} +\dfrac{r^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$

This quadratic term is the error in the Fresnel propagation from the exit pupil if the observation plane does not coincide with the image plane and is therefore called "defocus." Since it is not a result of flaws in the optics, it is often not considered to be an "aberration," but there is reason to do so in some applications. As an example, consider the atmospheric imaging scenario mentioned under tip-tilt; any time-varying quadratic contribution to the relative phase displaces the focal plane (slightly), so images through atmospheric turbulence with quadratic contributions appear to go in and out of focus over short time intervals (but, as already mentioned, the tip-tilt aberration is dominant, totalling 87% of the light energy under certain assumptions; see Noll, JOSA, 66, pp. 207-211, 1976 and van Dam & Lane, JOSA A, 19, pp. 745-752).

first-order quadratic term, focus shift error = "defocus"

Since defocus is a function only of the pupil-plane coordinates, it is shift invariant at the image plane; the effect of defocus does not vary with "image height" and therefore may be described by an impulse response and a transfer function. For example, consider a small first-order focus error of $\pi$ radians at the edge of a rectangular pupil with linear dimension $d_0 = 1$ unit. The complex-valued wavefront has the form shown:

Pupil function with defocus of $\pi$ radians at edge of the pupil ("half-wave of defocus"): (a) real part; (b) imaginary part; (c) magnitude; (d) phase, showing quadratic nature.
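The transfer function of this defocused pupil can be sketched in one dimension. The code below assumes a 1-D slice of the square pupil of width $d_0 = 1$ with phase $2\pi W (2x)^2$, where $W$ is the number of waves of defocus at the edge, and uses the standard result that the incoherent transfer function is the normalized autocorrelation of the pupil. For this particular pupil the autocorrelation has the closed form $\sin[8\pi W s(1 - s)] / (8\pi W s)$ for normalized frequency $0 < s < 1$ (cutoff at $s = 1$), which follows from integrating the product of the shifted pupils over their overlap. The function names are hypothetical.

```python
import cmath
import math


def pupil(x, waves):
    """1-D pupil of unit width with `waves` of defocus at the edge x = +/-0.5."""
    if abs(x) > 0.5:
        return 0j
    return cmath.exp(2j * math.pi * waves * (2.0 * x) ** 2)


def otf_numeric(s, waves, samples=2000):
    """Autocorrelation of the pupil at normalized shift s (midpoint-rule sum)."""
    s = abs(s)
    half_width = (1.0 - s) / 2.0          # half-width of the overlap region
    if half_width <= 0.0:
        return 0.0
    dx = 2.0 * half_width / samples
    total = 0j
    for k in range(samples):
        x = -half_width + (k + 0.5) * dx
        total += pupil(x + s / 2.0, waves) * pupil(x - s / 2.0, waves).conjugate()
    return (total * dx).real              # imaginary part cancels by symmetry


def otf_analytic(s, waves):
    """Closed form sin(8*pi*W*s*(1 - s)) / (8*pi*W*s), valid for waves != 0."""
    s = abs(s)
    if s == 0.0:
        return 1.0
    if s >= 1.0:
        return 0.0
    return math.sin(8.0 * math.pi * waves * s * (1.0 - s)) / (8.0 * math.pi * waves * s)


if __name__ == "__main__":
    for s in (0.1, 0.3, 0.5, 0.7, 0.9):
        print("s = %.1f: numeric %+.5f, analytic %+.5f"
              % (s, otf_numeric(s, 0.5), otf_analytic(s, 0.5)))
```

With one-half wave of defocus the closed form touches zero at $s = 0.5$; with larger values of $W$ the sine argument exceeds $\pi$ over part of the band and the transfer function becomes negative there.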
The incoherent transfer function is the scaled autocorrelation of the pupil and the impulse response is the inverse Fourier transform. The MTF has a zero at the normalized spatial frequency $\rho \cong 0.5$. Note that the image with defocus is "wider" and the peak irradiance is "smaller" than the diffraction-limited image.

(a) MTF of incoherent optical system with square aperture with one-half wave of defocus compared to MTF without defocus (red); (b) psf with one-half wave of defocus (black) and without defocus (red).

Other examples of transfer functions (MTFs) and impulse responses for square apertures with different amounts of defocus (measured in waves at the edge of the pupil) are shown. Note in particular that the intermediate frequencies are degraded more rapidly than either the smallest or largest spatial frequencies. Note that the MTF at certain frequencies is negative, which means that the modulation has changed sign ("lighter" regions in the original object become "darker" in the defocused image). This can be seen in an object with different spatial frequencies.

MTF and corresponding psfs for square pupil with different amounts of defocus from $\lambda_0/4$ at the edge of the pupil to $1.5\lambda_0$. Note that the decrease in MTF is most pronounced at intermediate spatial frequencies. For larger amounts of defocus, the MTF goes negative over regions of the frequency domain (contrast reversal). The psf widens with increasing defocus.

The spatial frequency of a "radial grating" $f[x, y]$ increases as the reciprocal of the distance from the center. In the examples shown, the irradiance is biased up so that its normalized maximum and minimum amplitudes are 1 and 0, respectively. The grating is imaged through a real optical system onto a CCD sensor that samples the image and thus the image is aliased at large spatial frequencies (near the center).
The three images are at the focal plane (i.e., “in focus”) and with two increments of defocus. Track a radial line in the original (in red) to see that the amplitude of the in-focus image does not vary from unity (except where there is aliasing), while the defocused image exhibits several changes in phase, from light to dark to light, etc. The contrast of the smallest spatial frequency (at the edge of the image) is reversed in the image with more defocus, and this image also exhibits more changes in phase.

[Figure: Effect of two increments of defocus on the image of a radial grating. The negative regions of the MTF of defocus imply that the contrast of those spatial frequencies is “reversed” (darker gray → lighter gray and vice versa). Track the “lightness” along the red lines to see the contrast reversals. Note that the “in-focus” image exhibits some sampling (“aliasing”) artifacts in the center where the azimuthal spatial frequency is large. This artifact is often called “spurious resolution,” because the object is not reproduced at the locations of the phase change.]

5.2.3 Fourth-Order (Third-Order Ray) Aberrations: the “Seidel aberrations”

$-\dfrac{r^4}{2\lambda_0 z^3}$ ⟹ no variation at object, quartic phase at pupil ⟹ spherical aberration, $W_{040}$ (LSI)

$+\dfrac{r\,r_0^3 \cos[\phi]}{2\lambda_0 z^3}$ ⟹ cubic phase at object, linear phase at pupil ⟹ coma, $W_{131}$

$-\dfrac{r^2 r_0^2}{4\lambda_0 z^3}$ ⟹ quadratic phase at object and pupil ⟹ field curvature, $W_{220}$

$-\dfrac{r^2 r_0^2}{2\lambda_0 z^3}\cos^2[\phi]$ ⟹ quadratic phase at object and pupil + azimuth variation ⟹ astigmatism, $W_{222}$

$+\dfrac{r^3 r_0}{2\lambda_0 z^3}\cos[\phi]$ ⟹ linear phase at object, cubic phase at pupil ⟹ distortion, $W_{311}$

$-\dfrac{r_0^4}{8\lambda_0 z^3}$ ⟹ quartic phase at object, no variation at pupil ⟹ third-order piston error, $W_{400}$

Note that four of these six terms have even powers of both the pupil coordinate $r$ and the image coordinate $r_0$, whereas coma and distortion include odd powers of both.
Spherical Aberration

This is the simplest third-order aberration to describe mathematically since it depends only on the coordinates in the pupil plane; its effect is constant across the image plane. This means that spherical aberration is the only one of the six Seidel terms that is shift invariant (and may therefore be described as a convolution). The wavefront shape for spherical aberration resembles a deeper “bowl” than the paraboloid for defocus. Note that the negative sign on the phase means that the spherical aberration is negative if the phase contribution is positive.

quartic phase at pupil, no variation at object:

$$\Delta\Phi\left[x,y;x_0,y_0\right] = 2\pi \cdot W_{040} \cdot \begin{cases} \left(-\dfrac{r^4}{2\lambda_0 z^3}\right) & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$

[Figure: quadratic term from second order of expansion: spherical aberration]

If the numerical coefficient of spherical aberration is positive, then rays from the marginal regions of the pupil have a steeper slope than those from the paraxial region near the optical axis. In other words, the “marginal focus” is closer to the lens than the ideal “paraxial focus.” The paraxial image of a point object is not “sharp” but exhibits a halo of light around a bright central core.

[Figure: Negative coefficient of spherical aberration of positive lens: rays from the margin of the pupil cross the axis closer to the optic than paraxial rays. The image of a point object at the paraxial focus exhibits a bright central region surrounded by a “halo” of light from the margin of the pupil.]

Because it is a shift-invariant effect at the image plane, spherical aberration may be described by an impulse response and by a transfer function.
Spherical aberration is a distortion of the true spherical wavefront that makes a “deeper bowl” so that the incremental phase error is large near the edge of the pupil (far from the optical axis, for the marginal part of the wave) and small near the center of the pupil (near the optical axis, for the paraxial part of the wave).

[Figure: Example of quartic wavefront error of spherical aberration compared to quadratic error from defocus. Spherical aberration error is a “deeper bowl.”]

Consider an example for spherical aberration where the phase error is π radians at the edge of a square pupil, the same phase error at the edge that was considered for defocus. The profiles of the phase in the pupil are:

[Figure: Pupil function for one-half wave of spherical aberration: (a) real part; (b) imaginary part; (c) magnitude; (d) phase in units of π radians, showing the fourth-power behavior.]

The incoherent MTF shows a significant decrease as the frequency approaches cutoff and the psf is noticeably wider and “shorter:”

[Figure: (a) MTF of incoherent optical system with square aperture with one-half wave of negative spherical aberration at the edge of the pupil compared to MTF without aberration (red); (b) psf with one-half wave of aberration (black) and without aberration (red). Note that the image with spherical aberration is “shorter” and “fatter.”]

[Figure: MTF and corresponding psfs for square pupil with different amounts of spherical aberration from $\lambda_0/4$ at the edge of the pupil to $1.5\lambda_0$.]

The MTF behaves similarly to that for defocus; it decreases most rapidly at the middle frequencies rather than at the smallest or largest, and it may go negative at some frequencies. The MTF for spherical aberration decreases more slowly than for defocus because the phase changes more slowly except near the edge of the pupil.
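The comparison between defocus and spherical aberration can be reproduced numerically by autocorrelating a sampled pupil. The sketch below makes a simplifying assumption that is not in the notes: it treats a one-dimensional pupil on $|u| \le 1$, so the quartic error is $u^4$ rather than the full two-dimensional $r^4$. The names `otf_1d`, `defocus`, and `spherical` are hypothetical helpers for this illustration.

```python
import numpy as np

def otf_1d(phase_fn, n=2048):
    """Normalized autocorrelation of a 1-D pupil on |u| <= 1.

    phase_fn : function returning the pupil phase error in radians at u
    Returns (s, otf) with s normalized so that the cutoff is at s = 1.
    """
    u = (np.arange(n) + 0.5) * (2.0 / n) - 1.0      # sample midpoints across the pupil
    pupil = np.exp(1j * phase_fn(u))                # unit magnitude, aberrated phase
    acf = np.correlate(pupil, pupil, mode="full")   # numpy conjugates the second argument
    acf = acf[n - 1:] / acf[n - 1]                  # non-negative lags, unity at zero lag
    s = np.arange(n) / n                            # lag divided by the pupil width
    return s, acf

def defocus(u):
    return np.pi * u**2        # half wave of defocus at the pupil edge

def spherical(u):
    return -np.pi * u**4       # half wave of spherical aberration at the pupil edge

s, h_ideal = otf_1d(lambda u: np.zeros_like(u))
_, h_def = otf_1d(defocus)
_, h_sph = otf_1d(spherical)

mid = len(s) // 2              # index of s = 0.5
print(f"MTF at s = 0.5: ideal {abs(h_ideal[mid]):.3f}, "
      f"defocus {abs(h_def[mid]):.3f}, spherical {abs(h_sph[mid]):.3f}")
```

In this one-dimensional model the MTF with defocus vanishes at `s = 0.5` while the MTF with the same edge error of spherical aberration is still nonzero there, which agrees with the statement that the quartic error degrades the MTF more slowly.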
The uncorrected optical system in the Hubble Space Telescope suffered from significant spherical aberration due to flaws in the primary mirror that were disguised during mirror testing.

Spherical aberration of the wave emerging from different parts of the pupil may be partially balanced by changing the focus, i.e., by “adding defocus.” For example, the phase at the edge of the pupil may be compensated by applying a defocus aberration in the opposite direction so that

$$2\pi \cdot W_{040} \cdot \left(-\frac{1^4}{2\lambda_0 z^3}\right) + 2\pi \cdot W_{020} \cdot \frac{1^2}{2\lambda_0 z} = 0 \Longrightarrow W_{020} = \frac{W_{040}}{z^2}$$

If we use defocus to cancel the phase error due to spherical aberration at the edge of the pupil, the resulting transfer function and image have the form shown, so that the image is improved markedly by using the appropriate amount of defocus.

[Figure: Application of defocus to balance spherical aberration at edge of square pupil: (a) MTF without aberrations (black), with 1/2 wave of spherical aberration (red), and after balancing with -1/2 wave of defocus; (b) corresponding impulse responses.]

Coma

cubic phase at object, linear phase at pupil:

$$\Delta\Phi\left[x,y;x_0,y_0\right] = 2\pi \cdot W_{131} \cdot \begin{cases} +\dfrac{r_0^3\, r \cos[\phi]}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$

The surface shape is proportional to the cube of the image height and proportional to the height of the ray in the pupil. This produces a different phase error, and therefore different images, for different values of the image height $r_0$, as shown in the example. The images have a “comet-like” shape, hence the name for the aberration.

[Figure: Star field imaged through optical system with coma; elongation of the star images increases with distance from optical axis (which is located below bottom of the image). Credit: “Star Gazing with Telescope and Camera,” George T. Keene, Amphoto, Garden City, 1967, p. 93.]
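Returning to the balancing of spherical aberration with defocus described earlier in this section, the improvement can be estimated from the on-axis peak of the impulse response relative to the unaberrated peak (the Strehl ratio). The sketch below is a one-dimensional illustration under stated assumptions: the pupil is normalized to $|u| \le 1$, the factors of $z$ are absorbed into the coefficients so that balancing at the edge means equal magnitudes of `W020` and `W040` in waves, and the helper name `strehl_1d` is hypothetical.

```python
import numpy as np

def strehl_1d(phase_fn, n=4001):
    """On-axis psf peak relative to the unaberrated peak for a 1-D pupil on |u| <= 1."""
    u = np.linspace(-1.0, 1.0, n)
    # The on-axis amplitude is the mean of the complex pupil function.
    return abs(np.mean(np.exp(1j * phase_fn(u)))) ** 2

W040 = 0.5     # waves of spherical aberration at the pupil edge (the half-wave example)
W020 = W040    # defocus chosen so the two errors cancel at u = 1 (normalized units)

def spherical_only(u):
    return 2.0 * np.pi * (-W040 * u**4)

def balanced(u):
    return 2.0 * np.pi * (W020 * u**2 - W040 * u**4)

strehl_spherical = strehl_1d(spherical_only)
strehl_balanced = strehl_1d(balanced)
print(f"Strehl ratio, spherical aberration only: {strehl_spherical:.2f}")
print(f"Strehl ratio, balanced with defocus:     {strehl_balanced:.2f}")
```

The balanced phase error is zero at both the center and the edge of the pupil and never exceeds one eighth of a wave in between, so the peak of the impulse response recovers most of its unaberrated value.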
Curvature of Field

quadratic phase from object and pupil:

$$\Delta\Phi\left[x,y;x_0,y_0\right] = 2\pi \cdot W_{220} \cdot \begin{cases} -\dfrac{r_0^2\, r^2}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$

As indicated by the name, the “best” images in systems with this aberration are on a curved surface. Some imaging systems (e.g., Schmidt cameras) are deliberately designed with curved fields because this produces good images over wide fields of view. The sensors used in wide-field Schmidt astronomical cameras were glass plates that were “predistorted” prior to being installed in the camera. Since the plates could be as large as 14" square, this was a touchy operation.

Astigmatism

The Greek word for “points” is “stigmata,” so that a system with astigmatism is not capable of producing points. It focuses “horizontal” and “vertical” patterns at different focal planes, as shown:

[Figure: Astigmatism focuses vertical and horizontal lines at different planes (horizontal lines in the “sagittal” plane and vertical lines in the “meridional” plane). http://www.olympusmicro.com/primer/anatomy/aberrations.html]

The aberration coefficient for astigmatism is:

quadratic phase from object and pupil and azimuthal variation:

$$\Delta\Phi\left[x,y;x_0,y_0\right] = 2\pi \cdot W_{222} \cdot \begin{cases} -\dfrac{1}{2\lambda_0 z^3}\, r_0^2\, r^2 \cos^2[\phi] & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$

The error is quadratic with an azimuthal dependence; the additional quadratic is maximized along the azimuthal direction $\phi = 0$ and is zero along the orthogonal direction. It therefore adds an azimuthally dependent “focusing” power. In other words, object lines oriented along different directions are focused at different distances from the optic. The eye systems of many people exhibit astigmatism, which means that the corrective lenses must have different powers along the orthogonal axes; in other words, lenses with cylindrical power are needed. Lenses that have been corrected for astigmatism are known as anastigmats.
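The link between the astigmatism term and cylindrical power can be made explicit with one step of algebra. This is a sketch that assumes the azimuth $\phi$ is measured from the $x$-axis of the pupil, so that $x = r\cos[\phi]$; the notes do not state this convention explicitly.

$$-\frac{r_0^2\, r^2 \cos^2[\phi]}{2\lambda_0 z^3} = -\frac{r_0^2}{2\lambda_0 z^3}\, x^2 \qquad \text{compared with defocus:} \qquad +\frac{r^2}{2\lambda_0 z} = +\frac{x^2 + y^2}{2\lambda_0 z}$$

Under that assumption the astigmatism term is a quadratic phase in $x$ alone, i.e., a focus error that acts along one pupil axis and not along the other, with a strength that grows as the square of the image height $r_0$. This is the behavior of a cylindrical lens, which is why cylindrical power is used to correct it.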
Distortion

cubic phase at pupil, linear phase at object, azimuthal variation:

$$\Delta\Phi\left[x,y;x_0,y_0\right] = 2\pi \cdot W_{311} \cdot \begin{cases} +\dfrac{r_0\, r^3 \cos[\phi]}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$

This is a cubic dependence on the pupil coordinate and a linear variation in the image coordinate. Like coma, the effect of distortion varies with image height.

The image shapes resulting from distortion with coefficients of different algebraic signs are different. If $W_{311} < 0$ or $W_{311} > 0$, the images suffer from “pincushion distortion” or “barrel distortion,” respectively.

[Figure: Images of a grid object through systems with (a) no aberrations; (b) “pincushion” distortion ($W_{311} < 0$); (c) “barrel” distortion ($W_{311} > 0$).]

Piston Error

quartic phase at object:

$$\Delta\Phi\left[x,y;x_0,y_0\right] = 2\pi \cdot W_{400} \cdot \begin{cases} -\dfrac{r_0^4}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$

This is a constant phase due to the off-axis distance at the image plane and has no effect on the irradiance of the image, hence it often is not considered to be an aberration. However, it does have an important effect on optical systems with “sparse” primary elements, such as multiple-mirror telescopes.

[Figure: constant term from second-order expansion: piston error]

Of course, the ultimate resolution of optical systems may be due in part to other uncontrollable factors. For example, ground-based astronomical telescopes are ultimately limited by random variations in local air temperature that create random variations in the refractive index of atmospheric “patches.” These variations are often decomposed into the Seidel aberrations. The constant phase (“piston”) error has no effect on the irradiance (the squared magnitude of the amplitude). Linear phase errors move the image from side to side and/or top to bottom (“tip-tilt”).
Quadratic phase errors (“defocus”) add or subtract power from the lens to move the image plane along the axis forwards (towards the optic) or backwards (away from the optic), respectively. In correction for atmospheric phase errors, the tip-tilt error is most significant, which means that correcting this aberration significantly improves the image quality. The field of correcting atmospheric aberrations is called “adaptive optics,” and is an active research area.

5.2.4 Zernike Polynomials

It should be no surprise that other useful decompositions of the wavefront errors exist. Another common set of basis functions is the Zernike polynomials, which are often used for fitting data from interferometric optical testing (though NOT in the presence of air turbulence; Zernikes have little value in this situation). The Zernike polynomials are functions of radial and azimuthal coordinates that describe “surfaces” on the unit circle such that the average value of each nonconstant polynomial is zero:

$$Z_n^{\ell}(r,\phi) = R_n^{\ell}(r) \cdot \cos(\ell \cdot \phi)$$

$$Z_n^{-\ell}(r,\phi) = R_n^{\ell}(r) \cdot \sin(\ell \cdot \phi)$$

where the radial part is defined as:

$$R_n^{\ell}(r) = \begin{cases} \displaystyle\sum_{k=0}^{(n-\ell)/2} \frac{(-1)^k\,(n-k)!}{k! \cdot \left(\dfrac{n+\ell}{2}-k\right)! \cdot \left(\dfrac{n-\ell}{2}-k\right)!} \cdot r^{\,n-2k} & \text{if } n-\ell \text{ is even} \\ 0 & \text{if } n-\ell \text{ is odd} \end{cases}$$

So that:

$$R_0^0(r) = \frac{(-1)^0 \cdot 0!}{0! \cdot 0! \cdot 0!} \cdot r^0 = 1 \Longrightarrow Z_0^0(r,\phi) = 1 \cdot \cos(0 \cdot \phi) = 1$$

$$R_1^1(r) = \frac{(-1)^0 \cdot 1!}{0! \cdot 1! \cdot 0!} \cdot r^1 = r \Longrightarrow Z_1^1(r,\phi) = r \cdot \cos(1 \cdot \phi) = r \cdot \cos(\phi)$$

$$Z_1^{-1}(r,\phi) = R_1^1(r) \cdot \sin(1 \cdot \phi) = r \cdot \sin(\phi)$$

etc. One advantage of the Zernike polynomials is that distinct polynomials are orthogonal over the unit circle (i.e., the scalar product of any pair of distinct Zernike polynomials vanishes). For the radial parts with the same azimuthal index:

$$\int_{r=0}^{r=1} R_n^{\ell}(r) \cdot R_m^{\ell}(r)\; r\, dr \propto \begin{cases} 1 & \text{if } n = m \\ 0 & \text{if } n \neq m \end{cases} \equiv \delta_{nm}$$

where $\delta_{nm}$ is the Kronecker delta function. The set of the first 36 (nonconstant) Zernike polynomials yields a decomposition with minimum RMS wavefront error.
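The radial polynomials and their orthogonality can be checked with exact rational arithmetic. The sketch below evaluates the sum in the definition of the radial part term by term; the function names `radial_coeffs` and `radial_inner` and the dictionary representation of a polynomial are choices made for this illustration, and the azimuthal index is written `l` in the code.

```python
from fractions import Fraction
from math import factorial

def radial_coeffs(n, l):
    """Coefficients {power of r: value} of the Zernike radial polynomial R_n^l(r)."""
    l = abs(l)
    if (n - l) % 2:
        return {}                          # the radial part vanishes when n - l is odd
    coeffs = {}
    for k in range((n - l) // 2 + 1):
        num = (-1) ** k * factorial(n - k)
        den = (factorial(k)
               * factorial((n + l) // 2 - k)
               * factorial((n - l) // 2 - k))
        coeffs[n - 2 * k] = Fraction(num, den)
    return coeffs

def radial_inner(n, m, l):
    """Exact integral of R_n^l(r) * R_m^l(r) * r over 0 <= r <= 1."""
    a, b = radial_coeffs(n, l), radial_coeffs(m, l)
    # The integral of r**(pa + pb + 1) from 0 to 1 is 1 / (pa + pb + 2).
    return sum(ca * cb * Fraction(1, pa + pb + 2)
               for pa, ca in a.items() for pb, cb in b.items())

print(radial_coeffs(2, 0))       # coefficients of 2 r^2 - 1
print(radial_inner(2, 4, 0))     # distinct radial orders, same azimuthal index
print(radial_inner(2, 2, 0))     # same radial order
```

For a pair with equal indices the integral is nonzero (here 1/6), which is the proportionality constant hidden in the ∝ sign of the orthogonality relation.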
Since they all represent wavefront errors at the exit pupil, the corresponding impulse responses and transfer functions may be calculated; the former are shown in a figure.

[Figure: First 28 Zernike polynomials ordered by azimuthal index (horizontally) and radial index (vertically). Ref: http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files.]

[Figure: psfs (impulse responses) of the aberrations for each of the first 28 Zernike polynomials (ref: http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files/image096.gif)]

5.3 Structural Aberration Coefficients

Structural aberration coefficients are due to the “configuration” or “orientation” of the lens. We have just seen that the lensmaker’s equation ensures that there are many prescriptions for a thin lens with a fixed focal length made from one glass. For example, if $n_2 = 1.5$ and $f = 100$ mm, we can have $R_1 = R_2 = 100$ mm (double convex) or $R_1 = 50$ mm and $R_2 = \infty$ (plano-convex, curved side towards object) or $R_1 = \infty$ and $R_2 = 50$ mm (plano-convex, curved side towards image), and many other possibilities. It is perhaps logical that the aberrations from these different prescriptions will be different too. The calculation leads to one of the “rules of thumb” for optical systems: a better image is generated by an optical system if the side of the optic with the larger radius is on the side with the shorter conjugate, which “divides” the power of the lens more equally between the two surfaces. For example, for a plano-convex lens with the source point at infinity (so that the image is at the focal point), the image exhibits better quality if the curved side of the lens is towards the object. With the flat side towards the object, the front flat surface contributes no power to the image.

5.4 Optical Imaging Systems and Sampling

Q factor

5.5 Optical System “Rules of Thumb”

1.
If imaging with a singlet lens, the aberrations are smaller if the lens surface with more curvature (shorter radius of curvature) is on the side of the longer conjugate. Since the transverse magnification is smaller than 1 in most cases (distant object), the “more curved” side of the lens should be towards the distant object. This divides the power of the surfaces more evenly and minimizes the spherical aberration.

2. If imaging in visible light, the diameter of the diffraction spot in micrometers is approximately equal to the f-number of the system.

3. The MTF at the Rayleigh limit is about 9% (www.normankoren.com/Tutorials/MTF1A.html). Lenses are sharpest in the interval of about two stops between the (small) aperture where diffraction starts to dominate and two stops smaller than the maximum aperture. For 35mm lenses, the maximum aperture often is of the order of f/2 to f/2.8, so two stops smaller is typically f/4 to f/5.6. The aperture at which diffraction starts to dominate depends on wavelength, but is generally accepted as about f/22. Therefore the sharpest range for a 35mm lens is between about f/5.6 and f/11. At larger apertures (smaller f/ numbers), resolution is limited by aberrations (astigmatism, coma, etc.); at small apertures, resolution is limited by diffraction. The MTF if the lens is used “wide open” is almost always poorer than the MTF at f/8 because of the aberrations. Note that this discussion does not consider the effects of the sensor, just the lens.

4. The image is visually unaberrated if the Strehl ratio $D \gtrsim 0.8 \Longrightarrow \sigma(\Delta W) \lesssim 0.075 \cdot \lambda_0 \Longrightarrow \Delta W_{max} \lesssim \dfrac{\lambda_0}{4} \Longrightarrow \sigma_{\Delta W} \lesssim \dfrac{\lambda_0}{14}$.

5. If imaging in visible light, the image appears to be “in focus” if the defocus distance measured in micrometers is smaller than $(f/\#)^2$.

6. Depending on source, the resolution $r$ of a lens in line pairs per mm is approximately $\dfrac{1390}{f/\#} \lesssim r \lesssim \dfrac{1600}{f/\#}$.

7. More to come...
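Several of the rules of thumb above follow from standard diffraction formulas, and a few lines of arithmetic show how the numbers arise. The sketch below is illustrative only: the function names are hypothetical, the default wavelengths are assumed values for visible light, and the formulas (Airy disk diameter, quarter-wave depth of focus, Rayleigh separation, and the Maréchal approximation for the Strehl ratio) are swapped in as plausible origins of the rules rather than taken from the notes.

```python
import math

def airy_diameter_um(f_number, wavelength_um=0.5):
    """Diameter of the Airy disk out to the first dark ring, in micrometers."""
    return 2.44 * wavelength_um * f_number

def depth_of_focus_um(f_number, wavelength_um=0.5):
    """Focus shift that gives a quarter wave of defocus at the pupil edge, in micrometers."""
    return 2.0 * wavelength_um * f_number ** 2

def rayleigh_lp_per_mm(f_number, wavelength_um=0.55):
    """Reciprocal of the Rayleigh separation 1.22 * wavelength * (f/#), in line pairs per mm."""
    return 1000.0 / (1.22 * wavelength_um * f_number)

def strehl_marechal(rms_waves):
    """Marechal approximation to the Strehl ratio for an RMS wavefront error in waves."""
    return math.exp(-(2.0 * math.pi * rms_waves) ** 2)

f_number = 8.0
print(f"rule 2: spot diameter at f/8  = {airy_diameter_um(f_number):.2f} um")
print(f"rule 4: Strehl at lambda/14   = {strehl_marechal(1 / 14):.3f}")
print(f"rule 5: depth of focus at f/8 = {depth_of_focus_um(f_number):.1f} um")
print(f"rule 6: resolution at f/8     = {rayleigh_lp_per_mm(f_number):.0f} lp/mm")
```

Under these assumptions the two constants in rule 6 are reproduced by the Rayleigh criterion at wavelengths near 0.51 µm and 0.59 µm, which may or may not be how the quoted sources arrived at them.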