RAY OPTICS NOTES - RIT Center for Imaging Science

Transcription

Ray Optics for Imaging Systems
Course Notes for IMGS-321
11 December 2013
Roger Easton
Chester F. Carlson Center for Imaging Science
Rochester Institute of Technology
54 Lomb Memorial Drive
Rochester, NY 14623
1-585-475-5969
easton@cis.rit.edu
Contents
Preface
0.1 References
1 Introduction
1.1 Models of Light and Propagation
1.1.1 Ray model of light (“geometrical optics”)
1.1.2 Wave model of light (“physical optics”)
1.1.3 Photon model of light (“quantum optics”)
2 Ray (Geometric) Optics
2.1 What is an imaging system?
2.1.1 Simplest Imaging System: Pinhole in Absorber
2.2 First-Order Optics
2.3 Third-Order Optics
2.3.1 Higher-Order Approximations
2.4 Notations and Sign Conventions
2.4.1 Nature of Objects and Images
2.5 Human Eye
2.6 Principle of Least Time
2.7 Fermat’s Principle for Reflection
2.7.1 Plane Mirrors
2.8 Fermat’s Principle for Refraction
2.8.1 Dispersion
2.8.2 Refractive Constants for Glasses
2.9 Image Formation in the Ray Model
2.9.1 Refraction at a Spherical Surface
2.9.2 Imaging with Spherical Mirrors
2.10 First-Order Imaging with Thin Lenses
2.10.1 Examples of Thin Lenses
2.10.2 Spherical Mirror
2.11 Image Magnifications
2.11.1 Transverse Magnification
2.11.2 Longitudinal Magnification
2.11.3 Angular Magnification
2.12 Single Thin Lenses
2.12.1 Positive Lens
2.12.2 Negative Lens
2.12.3 Meniscus Lenses
2.12.4 Simple Microscope (magnifier, “magnifying glass,” “loupe”)
2.13 Systems of Thin Lenses
2.13.1 Two-Lens System
2.13.2 Effective (Equivalent) Focal Length
2.13.3 Summary of Distances for Two-Lens System
2.13.4 “Effective Power” of Two-Lens System
2.13.5 Lenses in Contact: t = 0
2.13.6 Positive Lenses Separated by t < f₁ + f₂
2.13.7 Cardinal Points
2.13.8 Lenses separated by t = f₁ + f₂: Afocal System (Telescope)
2.13.9 Positive Lenses Separated by t = f₁ or t = f₂
2.13.10 Positive Lenses Separated by t > f₁ + f₂
2.13.11 Compound Microscopes
2.13.12 Two Positive Lenses with Different Focal Lengths and Different Separations
2.13.13 Systems of One Positive and One Negative Lens
2.13.14 Newtonian Form of Imaging Equation
2.13.15 Example (1) of Two-Lens System
2.13.16 Example (2) of Two-Lens System: Telephoto Lens
2.13.17 Images from Telephoto System
2.13.18 Example (3) of Two-Lens System: Two Negative Lenses
2.14 Plane and Spherical Mirrors
2.14.1 Comparison of Thin Lens and Concave Mirror
2.15 Stops and Pupils
2.15.1 Focal Ratio: f-number
2.15.2 Example: Focal Ratio of Lens-Aperture Systems
2.15.3 Example: Exit Pupils of Telescopic Systems
2.15.4 Pupils and Diffraction
2.15.5 Field Stop
2.16 Marginal and Chief Rays
2.16.1 Telecentricity
2.16.2 Marginal and Chief Rays for Telescopes
3 Tracing Rays Through Optical Systems
3.1 Paraxial Ray Tracing Equations
3.1.1 Paraxial Refraction
3.1.2 Paraxial Transfer
3.1.3 Linearity of the Paraxial Refraction and Transfer Equations
3.1.4 Paraxial Ray Tracing
3.2 Matrix Formulation of Paraxial Ray Tracing
3.2.1 Refraction Matrix
3.2.2 Ray Transfer Matrix
3.2.3 “Vertex-to-Vertex Matrix” for System
3.2.4 Example 1: System of Two Positive Thin Lenses
3.2.5 Example 2: Telephoto Lens
3.2.6 M_VV′ Derived From Two Rays
3.3 Object-to-Image (Conjugate) Matrix
3.3.1 Matrix of the “Relaxed” Eye (focused at ∞)
3.4 Vertex-Vertex Matrices of Simple Imaging Systems
3.4.1 Magnifier (“magnifying glass,” “loupe”)
3.4.2 Galilean Telescope of Thin Lenses
3.4.3 Keplerian Telescope of Thin Lenses
3.4.4 Thick Lenses
3.4.5 Microscope
3.5 Image Location and Magnification
3.6 Marginal and Chief Rays for the System
3.6.1 Examples of Marginal and Chief Rays for Systems
4 Depth of Field and Depth of Focus
4.0.2 Examples of Depth of Field from Video and Film
4.1 Criterion for “Acceptable Blur”
4.2 Depth of Field via Rayleigh’s Quarter-Wave Rule
4.3 Hyperfocal Distance
4.4 Methods for Increasing Depth of Field
4.5 Sidebar: Transverse Magnification vs. Focal Length
5 Aberrations
5.1 Chromatic Aberration
5.2 Third-Order Optics, Monochromatic Aberrations
5.2.1 Names of Aberrations
5.2.2 Aberration Coefficients
5.2.3 Fourth-Order (Third-Order Ray) Aberrations
5.2.4 Zernike Polynomials
5.3 Structural Aberration Coefficients
5.4 Optical Imaging Systems and Sampling
5.5 Optical System “Rules of Thumb”
Preface
This book is intended to introduce the mathematical tools that can be applied to model and predict
the action of optical imaging systems.
0.1 References:
Many references exist for the subject of wave optics, some from the point of view of physics and many
others from the subdiscipline of optics. Unfortunately, relatively few from either camp concentrate
on the aspects that are most relevant to imaging.
Useful Optics Texts:
[P3] (the three) Pedrottis, Introduction to Optics, Pearson Prentice-Hall, 2007.
[G] Gaskill, Jack D., Linear Systems, Fourier Transforms, and Optics, John Wiley, 1978.
[JG] Goodman, Joseph, Introduction to Fourier Optics, Third Edition, Roberts & Company, 2005.
[H] Eugene Hecht, Optics, 4th Edition, Addison-Wesley, 2002.
[PON] Reynolds, DeVelis, Parrent, Thompson, The New Physical Optics Notebook, SPIE, 1989.
[BW] Max Born and Emil Wolf, Principles of Optics, 7th Expanded Edition, Cambridge University Press, 2005.
[GF] Grant R. Fowles, Introduction to Modern Optics (Second Edition), Dover Publications, 1975.
[RHW] Robert H. Webb, Elementary Wave Optics, Dover Publications, 1997.
[FLS] R. Feynman, R. Leighton, M. Sands, The Feynman Lectures on Physics, Addison-Wesley, 1964.
[KF] M.V. Klein and T.E. Furtak, Optics, Second Edition, Wiley, 1986.
[JW] F. Jenkins and H. White, Fundamentals of Optics, 4th Edition, McGraw-Hill, 1976.
[NP] A. Nussbaum and R. Phillips, Contemporary Optics for Scientists and Engineers, Prentice-Hall, 1976.
[I] K. Iizuka, Engineering Optics, Springer-Verlag, 1985.
[FBS] D. Falk, D. Brill, and D. Stork, Seeing the Light, Harper and Row, 1986.
Lawrence Mertz, Transformations in Optics, John Wiley & Sons, 1965.
Physics Texts with useful discussions:
[HR] D. Halliday and R. Resnick, Physics, 3rd Edition, Wiley, 1978.
[C] F. Crawford, Waves, Berkeley Physics Series Vol. III, McGraw-Hill, 1968.
John D. Jackson, Classical Electrodynamics, Third Edition, Wiley, 1998, §6.
Feynman, Leighton, and Sands, Lectures on Physics, particularly Volume I §25-§33 and Volume II §32-§33.
Curriculum: Geometrical Optics and Imaging
1. Models for light propagation
   (a) ray model (“geometric optics”)
   (b) wave model (“physical optics”)
   (c) photon model (quantum optics)
2. First-order optics
   (a) third-order optics, aberrations
   (b) higher-order approximations
3. Sign conventions for distances and angles
   (a) Nature of objects and images (real and virtual)
4. Human eye
5. Refractive index
   (a) Optical path length
   (b) Fermat’s principle of least time (P3 §2.2, H §4.5, BW §3.3)
   (c) Snell’s law for reflection: θ₂ = −θ₁
       i. plane mirrors
   (d) Snell’s law for refraction: n₁ sin[θ₁] = n₂ sin[θ₂]
       i. plane interface between two media
   (e) Dispersion (variation in n with λ)
       i. relationship between mean refractive index and dispersion
       ii. crown and flint glasses
   (f) Dispersing prisms
6. Refraction at a Spherical Surface
   (a) Paraxial approximation, imaging equation
   (b) Reflection at a spherical surface
7. Imaging with thin lenses
   (a) Imaging equation in terms of object and image distances and focal length
   (b) system “power”
   (c) spherical mirrors
   (d) object/image conjugates
   (e) Image magnifications
       i. Transverse magnification
       ii. Longitudinal magnification
       iii. Angular magnification
   (f) Single thin lenses
       i. positive lens
       ii. negative lens
       iii. meniscus lens
       iv. simple microscope
   (g) Systems of thin lenses
       i. lenses in contact
       ii. effective focal length and power of two-lens system
       iii. focal and principal points
       iv. afocal systems (telescopes)
       v. eyeglasses
       vi. compound microscopes
       vii. Newtonian form of imaging equation
       viii. telephoto lens
       ix. Stops and pupils
           A. aperture stop
           B. entrance and exit pupils
           C. field stop
   (h) Marginal and chief (principal) rays
       i. telecentricity
8. Tracing rays through optical systems
   (a) paraxial ray tracing equations
       i. paraxial refraction
       ii. paraxial transfer
       iii. linearity of equations
   (b) matrix formulation of paraxial ray tracing
       i. refraction matrix
       ii. transfer matrix
       iii. Lagrangian invariant
       iv. vertex-to-vertex matrix for imaging system
       v. object-to-image (conjugate) matrix
       vi. matrix for eye model
   (c) Examples of imaging system matrices
       i. magnifier
       ii. Galilean telescope
       iii. Keplerian telescope
       iv. thick lens
       v. microscope
   (d) image location and magnification
   (e) Depth of field and depth of focus
       i. examples from film and video
       ii. criterion for “acceptable blur”
       iii. depth of field via Rayleigh’s quarter-wave rule
       iv. hyperfocal distance
       v. methods for increasing depth of field
       vi. transverse magnification vs. focal length
   (f) Aberrations
       i. Chromatic aberration
          A. achromatic doublet
          B. apochromatic triplet
       ii. Third-Order (Seidel) Aberrations
          A. spherical aberration (relation to defocus)
          B. coma
          C. astigmatism
          D. distortion
          E. curvature of field
          F. piston error
9. Computed Ray Tracing, OSLO™
Chapter 1
Introduction
The obvious first question to consider is “what is optics” (or perhaps “what are optics?” heh, heh).
One reasonable definition of optics is the application of physical principles and observed phenomena
to manipulate “light” in useful ways. This presupposes the definition of “light,” which I specify as
electromagnetic radiation of any “color,” temporal frequency, and wavelength. This is more general
than the definition put forth by humanocentrics (e.g., color scientists), but is much more reasonable
in our field, where we want to take advantage of all measurable radiation to learn information
about objects that emit, reflect, refract, or otherwise modify radiation. The definition in imaging
is somewhat narrower: the application of the properties of materials and of light to form “images,”
which are “recognizable (though approximate) replicas of the spatial and spectral distribution of
light reflected, transmitted, and/or emitted by an object.”
To design optical image-forming systems, we must model the propagation of light from the
object (source) to the optic, the action of the optic on the incident light distribution, and finally
propagation from the optic to the sensor. The last step, in which the sensor converts the spatial (and possibly
spectral) distribution of incident light into measurable physical and/or chemical changes in some
medium, is outside the scope of this discussion.
We hope to find a mathematical model of optical imaging as a “system,” where an output distribution g is created from an input object distribution f by the action of an imaging system O,
e.g., g [x, y, λ] = O {f [x, y, z, λ]}. We generally use this model to (try to) solve the inverse imaging
problem by inferring the input object from the output image and knowledge of the system. The task
may be difficult or even impossible; it is easy to see one difficulty because most sensors measure only
a 2-D distribution of monochromatic light and therefore cannot possibly recover the three spatial
dimensions of a realistic object from a single image.
Schematic of an optical system that acts on an input with three spatial dimensions, time, and
wavelength f[x, y, z, t, λ] to produce a 2-D monochrome (gray scale) image g[x′, y′].
1.1 Models of Light and Propagation
To be able even to write down, let alone solve, the imaging equation(s) for optical systems, we
need to specify the mathematical model of light that will describe its behavior as it propagates and
interacts with input objects, optical systems, and output sensors. To simplify the descriptions in
the different contexts, three physical models for light and its interactions are used that are (loosely
speaking) distinguished by the physical scale of the phenomena:
1.1.1 Ray model of light (“geometrical optics”)
1. macroscopic-scale phenomena (e.g., reflection, refraction)
(a) light propagates as RAYS that travel in straight lines until encountering a change in
properties of a medium or an interface between media. Except to differentiate the color
of light, the wavelength λ and temporal frequency ν of the light are assumed to be zero
and infinity, respectively (λ→0, ν→∞), which means that there are no effects due to
diffraction;
(b) uses Fermat’s principle of least time to derive Snell’s law, which describes the phenomena
of reflection and refraction;
(c) useful for designing imaging systems (to locate the images and determine their magnifications);
(d) calculations for modeling the behavior of optical systems (lenses and/or mirrors) are
(relatively) simple and may be easily implemented in software;
(e) the quality of images from the system is assessed in terms of aberrations of the optical
system, which describe deviations of the image from ideal behavior.
1.1.2 Wave model of light (“physical optics”):
1. microscopic-scale phenomena (diffraction/interference, reflection, refraction, refractive index, ...)
(a) considers light (electromagnetic radiation) to propagate as WAVES;
(b) propagation and interaction of light are described by Maxwell’s equations;
(c) light propagates with velocity c in vacuum ($c \lesssim 3 \times 10^{8}\ \mathrm{m\,s^{-1}}$) and velocity v < c in
transparent materials;
(d) light is described by its wavelength in vacuum λ₀ and oscillation frequency ν₀, whose
values affect any interactions with matter;
(e) the oscillation frequency ν₀ of waves emitted by a particular light source is constant
regardless of medium and is related to the vacuum wavelength λ₀ via:
\[ \lambda_0 \cdot \nu_0 = c \]
(f) the ratio of the propagation velocities in vacuum and in a medium is the index of refraction
of the medium:
\[ n \equiv \frac{c}{v} \]
(g) the wavelength of the wave in a medium is shorter than the “vacuum wavelength” λ₀ by the factor n:
\[ \lambda_{\mathrm{medium}} = \frac{\lambda_0}{n} \]
(h) wave optics explains the image-forming phenomena of reflection, refraction, diffraction
(and interference, which is really just another name for diffraction) and the phenomena
of polarization and dispersion that affect the quality of images;
(i) mathematical calculations in wave optics are more “complicated” than those in ray optics
and often not easy to implement in computers. For example, it is difficult to evaluate the
exact form of light after propagating a short distance from the source;
(j) uses the Huygens-Fresnel principle to derive the mathematical model for propagation of
light, which is often divided into three regions:
i. linear, shift-invariant model in the Rayleigh-Sommerfeld diffraction region (valid
everywhere)
ii. linear, shift-invariant approximation in the near field for propagation by a “sufficiently large” distance from the source (Fresnel diffraction)
iii. linear, shift-variant approximation in the far field for propagation to “very large”
distances from the source (Fraunhofer diffraction);
(k) wave/physical optics is useful for assessing the quality of the images produced by systems.
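Relations (e) through (g) in the list above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names, the 500 nm example wavelength, and the index n = 1.5 (a typical order of magnitude for glass) are illustrative assumptions.

```python
# Speed of light in vacuum in m/s (the exact SI value).
C_VACUUM = 299_792_458.0


def frequency(vacuum_wavelength):
    """Oscillation frequency nu_0 = c / lambda_0; it does not change with the medium."""
    return C_VACUUM / vacuum_wavelength


def velocity_in_medium(n):
    """Propagation velocity v = c / n, from the definition n = c / v."""
    return C_VACUUM / n


def wavelength_in_medium(vacuum_wavelength, n):
    """Wavelength inside the medium: lambda_medium = lambda_0 / n."""
    return vacuum_wavelength / n


if __name__ == "__main__":
    lambda_0 = 500e-9  # assumed example: 500 nm in vacuum
    n_glass = 1.5      # assumed example index
    print("frequency [Hz]:", frequency(lambda_0))
    print("velocity in medium [m/s]:", velocity_in_medium(n_glass))
    print("wavelength in medium [m]:", wavelength_in_medium(lambda_0, n_glass))
```

Because the frequency is fixed by the source, the velocity in the medium divided by the wavelength in the medium returns the same ν₀ as in vacuum.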
1.1.3 Photon model of light (“quantum optics”):
1. atomic-scale phenomena (emission and absorption of radiation)
(a) light is composed of PHOTONS with both wave and particle characteristics;
(b) used to explain/analyze the physical interaction of light and matter, such as emission by
sources (e.g., lasers), and the photoelectric effect in sensors;
(c) Fundamental relationships:
\[ E_0 = h\nu_0 = \frac{hc}{\lambda_0} \quad \text{and momentum} \quad p = \frac{E_0}{c} = \frac{h}{\lambda_0} \]
where h is Planck’s constant:
\[ h \cong 6.626 \times 10^{-34}\ \mathrm{J\,s} \cong 4.136 \times 10^{-15}\ \mathrm{eV\,s} \]
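As a quick numerical check of the photon relations E₀ = hc/λ₀ and p = h/λ₀, the sketch below evaluates them for one wavelength. It is illustrative only: the function names and the 500 nm example are assumptions, and the constants are the exact SI values rather than the rounded ones quoted above.

```python
H_JS = 6.62607015e-34        # Planck's constant in J s (exact SI value)
C_VACUUM = 299_792_458.0     # speed of light in vacuum in m/s (exact SI value)
J_PER_EV = 1.602176634e-19   # joules per electron-volt (exact SI value)


def photon_energy_joules(vacuum_wavelength):
    """Photon energy E_0 = h c / lambda_0 in joules."""
    return H_JS * C_VACUUM / vacuum_wavelength


def photon_energy_ev(vacuum_wavelength):
    """Photon energy converted to electron-volts."""
    return photon_energy_joules(vacuum_wavelength) / J_PER_EV


def photon_momentum(vacuum_wavelength):
    """Photon momentum p = h / lambda_0 in kg m/s."""
    return H_JS / vacuum_wavelength


if __name__ == "__main__":
    lambda_0 = 500e-9  # assumed example: 500 nm in vacuum
    print("h [eV s]:", H_JS / J_PER_EV)
    print("E [eV]:", photon_energy_ev(lambda_0))
    print("p [kg m/s]:", photon_momentum(lambda_0))
```

Dividing the energy in joules by c reproduces the momentum, which is the relation p = E/c.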
Phenomena described by the ray and wave models are most relevant to imaging, though the
quantum model is vital for understanding the properties and artifacts of light sensing. You probably
have seen some consideration of ray optics in undergraduate physics, and any such experience will
be useful in this course. The most common treatments of optics consider rays first because the
mathematical models and calculations are simpler. However, the preparation of linear systems you
just had makes it possible and even desirable to consider the wave model first by applying the
concepts of the impulse response and transfer function; these may significantly simplify the concepts
and calculations.
There are several goals to be reached by the conclusion of this discussion; we want to have the
capabilities to do several things:
• locate the image(s) of an object generated by the lens, mirror, or system of lenses and/or
mirrors;
• determine the “character” (real or virtual) and the size(s) (i.e., the transverse magnification)
of the image(s);
• determine the “field of view” of the imaging system, i.e., the angular subtense of the object
that is imaged;
• determine the range of distances in the scene from the optical system that appears to be “in
focus” (the depth of field);
• determine the capability of the optics to distinguish closely spaced objects — this is the “spatial
resolution” of the system (often specified in terms of measurements from the “point spread
function” or the “modulation transfer function” = “MTF,” which are optical analogues of
the “impulse response” and “transfer function” that are considered in the course on Fourier
methods);
• understand the constraints on system performance due to the properties of materials used in the
imaging system, such as the variation in refractive index of glass with wavelength (dispersion).
Much of this discussion (especially about depth of field and spatial resolution) will benefit from
concepts derived in the course on Fourier methods, but we must also be aware of the limitations in
these concepts due to nonlinearities and/or shift-variant properties of the optical system.
Chapter 2
Ray (Geometric) Optics
Ray optics (commonly, though unfortunately, called “geometric optics”) uses the model of light as a
ray to evaluate the locations and properties of images created by systems of lenses and/or mirrors.
It does not consider any effects due to the wave model of light, such as interference or diffraction
(which are actually just different words for the same phenomenon: “interference” considers few light
sources and “diffraction” considers an infinite number, or just “many”). The subject of ray optics
may be subdivided into categories of “first-order,” “third-order,” and even higher-order optical
computations. It also cannot explain other wave-propagation phenomena, such as total internal
reflection.
2.1 What is an imaging system?
As a simple definition, we may consider an imaging system to map the distribution of the input
“object” to a “similar” distribution at the output “image” (where the meaning of “similar” is to be
determined). Often the input and output amplitudes are represented in different units. For example,
the input often is electromagnetic radiation with units of, say, watts per unit area, while the output
may be a transparent negative emulsion measured in dimensionless units of “density” or “transmittance.” In other words, the system often changes the form of the energy; it is a “transducer.”
In the ray model, we can think of the imaging system as “selecting” and/or “redirecting” rays of
light to map the energy onto the image sensor. The “selection” or “redirection” process uses some
type of physical interaction between light and matter to remap the energy emitted or modified by
the object onto the sensor. Among the more obvious physical interactions in our experience are
refraction and reflection, but these are not the only, nor even the simplest, possible mechanisms.
The very simplest interaction between light and matter is absorption, where the light energy is
transferred to matter and “disappears.” Of course, it does not really “vanish”; most often it is
converted into heat in the matter. It is no longer available to create an image, however, so it may as well
have “disappeared.” We can use an absorber to create the simplest imaging system: the pinhole
camera.
2.1.1 Simplest Imaging System: Pinhole in Absorber
Consider a 3-D volume of space that contains the object. Occasionally, a ray of light emitted (or
reflected) from a location in the volume is selected by the pinhole and reaches the sensor.
every point in space is “in focus” on the sensor
transverse magnification M_T determined by relative distances:
\[ M_T = -\frac{z_2}{z_1} \]
negative sign means image is inverted
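The pinhole magnification M_T = −z₂/z₁ is easy to evaluate directly. The sketch below is an illustration under stated assumptions: the text does not define the two distances explicitly, so z₁ is taken here to be the object-to-pinhole distance and z₂ the pinhole-to-sensor distance, both entered as positive magnitudes. The function names are hypothetical.

```python
def pinhole_magnification(z1, z2):
    """Transverse magnification M_T = -z2 / z1 of a pinhole camera.

    Assumed meanings: z1 is the object-to-pinhole distance and z2 is the
    pinhole-to-sensor distance, both positive magnitudes in the same units.
    """
    if z1 <= 0 or z2 <= 0:
        raise ValueError("z1 and z2 must be positive distances")
    return -z2 / z1


def image_height(object_height, z1, z2):
    """Image height on the sensor; a negative value means an inverted image."""
    return pinhole_magnification(z1, z2) * object_height


if __name__ == "__main__":
    # Assumed example: a 1 m tall object 2 m from the pinhole, sensor 0.1 m behind it.
    print("M_T =", pinhole_magnification(2.0, 0.1))
    print("image height [m] =", image_height(1.0, 2.0, 0.1))
```

Moving the sensor farther from the pinhole (larger z₂) enlarges the image in proportion, which follows from the straight-line ray paths through the pinhole.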
The number of rays from the object that actually reach the image is small. The interaction
with the sensor requires the quantum model of discrete energy packets, so the number of packets
is small if the hole diameter is small. If the object is a uniformly emitting planar source, the
numbers of packets measured from different locations in the field are different (Poisson statistics);
these numerical variations in what should be identical measurements appear as “noise.” The metric
of noise is determined by the mean value μ of the signal and the variation about that mean, which
is described by the standard deviation σ. The signal-to-noise ratio is a dimensionless quantity that
may be defined many ways, but we’ll use a simple definition that will suit this purpose:
\[ \mathrm{SNR} \equiv \frac{\mu}{\sigma} = \frac{\mu}{\sqrt{\mu}} = \sqrt{\mu} \]
More photons lead to larger signals (μ ↑) and a larger standard deviation (σ ↑), but the mean increases
faster than the standard deviation σ = √μ, so the SNR is larger:
better statistics and less relative noise.
The “quality” of the image depends on the diameter d₀ of the pinhole. The statistics improve with the
number of photons, which requires a larger dose or a larger pinhole. However, the “blur” quality of the image is better for
a smaller pinhole because there is less uncertainty in the ray path.
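The trade-off can be put in numbers with the definition SNR = μ/σ = √μ. The sketch below is illustrative and rests on one assumption that the text does not state: that the mean photon count scales with the area of the pinhole, and therefore with the square of its diameter. The function names are hypothetical.

```python
import math


def poisson_snr(mean_count):
    """SNR = mu / sigma with sigma = sqrt(mu) for Poisson-distributed counts."""
    if mean_count <= 0:
        raise ValueError("mean_count must be positive")
    sigma = math.sqrt(mean_count)  # Poisson statistics: variance equals the mean
    return mean_count / sigma


def scaled_mean_count(mean_count, diameter_ratio):
    """Mean count after scaling the pinhole diameter by diameter_ratio.

    Assumption (not from the notes): collected photons scale with pinhole area.
    """
    return mean_count * diameter_ratio ** 2


if __name__ == "__main__":
    for mu in (10.0, 100.0, 10_000.0):
        print("mean", mu, "-> SNR", poisson_snr(mu))
    # Doubling the diameter quadruples the count and doubles the SNR.
    print(poisson_snr(scaled_mean_count(100.0, 2.0)))
```

Under that assumption the SNR grows only linearly with pinhole diameter, while the geometric blur grows with it as well, which is why neither extreme is ideal.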
How to improve?
Longer exposure time
multiple pinholes
Depth of field
Redirect rays:
reflective pinholes
Reflection
Refraction
Diffraction (wave property), e.g., holography
2.2 First-Order Optics
Of most concern to us will be “first-order,” “paraxial,” or “Gaussian” optics, where the angles of
light rays measured relative to the optical axis are assumed to be small. The ray heights therefore
remain small as the rays propagate down the optical axis; this is the source of the common
term “paraxial optics,” meaning that the rays remain near the optical axis. If the ray angle
θ ≅ 0, we can approximate trigonometric functions by the first terms in their
power-series expansions (the “Taylor series”):
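\[
\begin{aligned}
f[x] &= \frac{(x-x_0)^0}{0!} \cdot f[x_0] + \frac{(x-x_0)^1}{1!} \cdot \left( \left. \frac{df}{dx} \right|_{x=x_0} \right) + \frac{(x-x_0)^2}{2!} \cdot \left( \left. \frac{d^2 f}{dx^2} \right|_{x=x_0} \right) + \cdots + \frac{1}{n!} \left. \frac{d^n f}{dx^n} \right|_{x=x_0} \cdot (x-x_0)^n + \cdots \\
&= \sum_{n=0}^{\infty} \frac{(x-x_0)^n}{n!} \cdot f^{(n)}[x_0]
\end{aligned}
\]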
If the base value and the derivatives are evaluated at the origin, we have a “Maclaurin series:”
\[ f[x] = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}[0] \cdot x^n \]
The Maclaurin series for the sine is:
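\[
\begin{aligned}
\sin[\theta] &= \sum_{n=0}^{\infty} \frac{1}{n!} \cdot \left. \frac{d^n}{d\theta^n} \left( \sin[\theta] \right) \right|_{\theta=0} \cdot \theta^n \\
&= \frac{1}{0!} \cdot \sin[0] \cdot \theta^0 + \frac{1}{1!} \cdot (+\cos[0]) \cdot \theta^1 + \frac{1}{2!} \cdot (-\sin[0]) \cdot \theta^2 + \frac{1}{3!} \cdot (-\cos[0]) \cdot \theta^3 + \frac{1}{4!} \cdot (+\sin[0]) \cdot \theta^4 + \cdots \\
&= 0 + \theta + 0 - \frac{\theta^3}{3!} + 0 + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots \\
&= \theta - \frac{\theta^3}{6} + \frac{\theta^5}{120} - \cdots
\end{aligned}
\]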
Note that only odd powers of θ are present in the series for sin [θ], because the sine is an odd
(antisymmetric) function that satisfies the condition sin [−θ] = − sin [+θ].
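The term-by-term construction of the sine series can be mirrored in a few lines of code. This is a minimal sketch with a hypothetical function name; it uses the fact that the derivatives of the sine evaluated at the origin cycle through sin[0], cos[0], −sin[0], −cos[0], that is 0, +1, 0, −1.

```python
import math


def sine_maclaurin(theta, n_terms):
    """Sum the terms n = 0 .. n_terms-1 of the Maclaurin series of sin(theta).

    Even-n terms are included but contribute zero, as in the derivation.
    """
    # Values of the n-th derivative of sin at 0, repeating with period 4.
    derivatives_at_zero = (0.0, 1.0, 0.0, -1.0)
    total = 0.0
    for n in range(n_terms):
        total += derivatives_at_zero[n % 4] * theta ** n / math.factorial(n)
    return total


if __name__ == "__main__":
    theta = 0.5  # radians, assumed example angle
    for n_terms in (2, 4, 6, 12):
        approx = sine_maclaurin(theta, n_terms)
        print(n_terms, approx, approx - math.sin(theta))
```

With two terms the sum is just θ (the first-order approximation), and with four terms it is θ − θ³/6 (the third-order approximation).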
The corresponding series for the even (or symmetric) cosine includes only even powers of θ:
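\[ \cos[\theta] = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots = \sum_{n=0}^{\infty} (-1)^n \frac{\theta^{2n}}{(2n)!} \]
\[ \implies \lim_{\theta \cong 0} \{\cos[\theta]\} = 1 \]
\[ \implies \cos[\theta] \cong 1 - \frac{\theta^2}{2} \]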
So the approximation of the cosine with two terms is the difference of a constant and a parabola.
The series for the (odd, antisymmetric) tangent is less commonly known and includes only the
odd powers of θ:
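\[ \tan[\theta] = \theta + \frac{\theta^3}{3} + \frac{2}{15}\theta^5 + \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}\, 2^{2n} \left( 2^{2n} - 1 \right)}{(2n)!} B_{2n}\, \theta^{2n-1} \implies \lim_{\theta \cong 0} \{\tan[\theta]\} = \theta \]
where B_n is the n-th Bernoulli number (in the signed convention, B₂ = 1/6, B₄ = −1/30, B₆ = 1/42). The first-, third-, and fifth-order series approximations for
the tangent are: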
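\[
\begin{aligned}
\tan[\theta] &\cong \theta \quad \text{for } \frac{\pi}{2} > |\theta| \gtrsim 0 \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} + \frac{2}{15}\theta^5
\end{aligned}
\]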
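The tangent coefficients 1, 1/3, and 2/15 can be generated from the Bernoulli numbers with exact rational arithmetic. The sketch below is illustrative: it assumes the modern signed convention for the Bernoulli numbers (B₂ = 1/6, B₄ = −1/30), for which the coefficient of θ^(2n−1) is (−1)^(n−1) 2^(2n) (2^(2n) − 1) B_2n / (2n)! with n starting at 1. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(m_max):
    """Return [B_0, ..., B_m_max] from the standard recurrence (B_1 = -1/2)."""
    numbers = [Fraction(1)]
    for m in range(1, m_max + 1):
        partial = sum(comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-partial / (m + 1))
    return numbers


def tangent_coefficient(n, numbers):
    """Coefficient of theta**(2n - 1) in the series for tan(theta), n >= 1."""
    sign = (-1) ** (n - 1)
    factor = Fraction(sign * 2 ** (2 * n) * (2 ** (2 * n) - 1), factorial(2 * n))
    return factor * numbers[2 * n]


if __name__ == "__main__":
    b = bernoulli_numbers(8)
    for n in range(1, 5):
        print("theta^%d:" % (2 * n - 1), tangent_coefficient(n, b))
```

Some older references tabulate only the magnitudes of the Bernoulli numbers, in which case the alternating sign factor is dropped.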
The validity of these approximations is perhaps more obvious from the graphs, where we can see
that sin[θ] ≲ θ and tan[θ] ≳ θ for small positive values of θ.
[Figure: plot versus theta.] Comparison of θ (black), sin[θ] (red), and tan[θ] (blue) for 0 ≤ θ ≤ +0.5 radians, showing that
sin[θ] ≲ θ and tan[θ] ≳ θ over this domain.
The corresponding first-order approximation to the cosine is the unit constant:
\[ \lim_{\theta \to 0} \{\cos[\theta]\} = 1 \]
[Figure: plot versus theta for 0 ≤ θ ≤ 0.2 radians.] cos[θ] (red) compared to its first-order approximation, the unit constant (black), showing that
the two are very similar for small values of θ.
The advantage of the first-order approximation is that evaluation of the ray heights and angles
becomes simple because of the proportionality.
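To see how far the first-order approximations can be trusted, the sketch below tabulates their relative errors for a few angles. It is illustrative only: the function name and the list of sample angles are assumptions, and the errors are computed against the library functions in Python's math module.

```python
import math


def paraxial_errors(theta):
    """Relative errors of the first-order approximations for theta > 0 (radians).

    sin(theta) ~ theta, tan(theta) ~ theta, cos(theta) ~ 1.
    """
    return {
        "sin": (theta - math.sin(theta)) / math.sin(theta),
        "tan": (theta - math.tan(theta)) / math.tan(theta),
        "cos": (1.0 - math.cos(theta)) / math.cos(theta),
    }


if __name__ == "__main__":
    for degrees in (1, 5, 10, 20, 30):
        errors = paraxial_errors(math.radians(degrees))
        print(
            "%2d deg  sin: %+.4f%%  tan: %+.4f%%  cos: %+.4f%%"
            % (degrees, 100 * errors["sin"], 100 * errors["tan"], 100 * errors["cos"])
        )
```

The sign of each error matches the graphs: θ overestimates the sine and underestimates the tangent, and the errors grow roughly as θ².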
2.3 Third-Order Optics
It likely is obvious from the definition of first-order optics that “third-order” optics includes the
second term in the expansions:
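\[
\begin{aligned}
\sin[\theta] &\cong \theta - \frac{\theta^3}{3!} = \theta - \frac{\theta^3}{6} \\
\tan[\theta] &\cong \theta + \frac{\theta^3}{3} \\
\cos[\theta] &\cong 1 - \frac{\theta^2}{2!} = 1 - \frac{\theta^2}{2}
\end{aligned}
\]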
[Figure: plot versus theta for 0 ≤ θ ≤ 0.5 radians.] Comparison of the third-order approximations of sin[θ] (red) and tan[θ] (blue) to the linear term θ
(black).
Note that the third-order approximation for the cosine is a biased parabola:
[Figure: plot versus theta for 0 ≤ θ ≤ 0.5 radians.] cos[θ] (black) and its third-order approximation 1 − θ²/2 (red).
The results for ray angles using third-order optics will differ from those of first-order optics; these
differences lead to image aberrations.
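One way to see the difference in ray angles is to apply both approximations to Snell's law for refraction, n₁ sin[θ₁] = n₂ sin[θ₂]. The sketch below is an illustration under stated assumptions, not a method taken from these notes. The first-order form is n₁θ₁ = n₂θ₂. The third-order form comes from substituting sin[θ] ≅ θ − θ³/6 on both sides and keeping terms through θ₁³, which gives θ₂ ≅ rθ₁ + (r³ − r)θ₁³/6 with r = n₁/n₂. The function names and the air-to-glass indices are hypothetical examples.

```python
import math


def refraction_angle_exact(theta1, n1, n2):
    """Refracted angle from the exact Snell's law n1 sin(theta1) = n2 sin(theta2)."""
    s = n1 * math.sin(theta1) / n2
    if abs(s) > 1.0:
        raise ValueError("no refracted ray exists for this angle and index pair")
    return math.asin(s)


def refraction_angle_first_order(theta1, n1, n2):
    """First-order (paraxial) form: n1 theta1 = n2 theta2."""
    return n1 * theta1 / n2


def refraction_angle_third_order(theta1, n1, n2):
    """Third-order form: keep terms through theta1**3 in Snell's law."""
    r = n1 / n2
    return r * theta1 + (r ** 3 - r) * theta1 ** 3 / 6.0


if __name__ == "__main__":
    n_air, n_glass = 1.0, 1.5  # assumed example indices
    for degrees in (5, 15, 30):
        t1 = math.radians(degrees)
        exact = refraction_angle_exact(t1, n_air, n_glass)
        first = refraction_angle_first_order(t1, n_air, n_glass)
        third = refraction_angle_third_order(t1, n_air, n_glass)
        print(degrees, exact, first - exact, third - exact)
```

The gap between the first-order angle and the exact angle grows roughly as the cube of the incidence angle, so rays far from the axis deviate most from the first-order prediction.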
2.3.1 Higher-Order Approximations
We clearly can add additional terms to the power series that will increase the accuracy of any
calculations at the cost of significantly more complexity.
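The cost of extra accuracy can be made concrete by counting how many nonzero terms of the sine series are needed to reach a given tolerance. This is a minimal sketch with a hypothetical function name; the tolerance values and the cap of 100 terms are assumptions chosen for illustration.

```python
import math


def sine_terms_needed(theta, tolerance, max_terms=100):
    """Count the nonzero (odd-power) terms needed so that the partial sum of
    the sine series is within `tolerance` of math.sin(theta)."""
    partial = 0.0
    term = theta  # first nonzero term of the series
    for k in range(1, max_terms + 1):
        partial += term
        if abs(partial - math.sin(theta)) <= tolerance:
            return k
        # Next odd-power term: multiply by -theta^2 / ((2k)(2k + 1)).
        term *= -theta * theta / ((2 * k) * (2 * k + 1))
    raise ValueError("tolerance not reached within max_terms terms")


if __name__ == "__main__":
    for theta in (0.1, 0.5, 1.0):
        for tol in (1e-3, 1e-6, 1e-9):
            print(theta, tol, sine_terms_needed(theta, tol))
```

One term corresponds to first-order optics and two terms to third-order optics; larger angles or tighter tolerances require more terms.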
2.4 Notations and Sign Conventions
One of the simplest and most difficult aspects of ray optics is the set of conventions to be adopted for
all of the quantities to be measured. As in many aspects of optics, there are competing choices for
conventions that have their own distinct advantages, but that lead to different equations for image
locations, etc. We are going to use the directed distance convention, where distances are positive
if measured from left to right. The problem becomes remembering which are the points measured
“from” and “to,” respectively. The figure shows sign conventions for the different quantities. Note
that in all cases, light travels from left to right in all media with positive refractive index (n > 0), so
the distances are positive if measured in the same direction of light travel and negative if measured
in the other direction.
Sign conventions for distances, heights, angles, and curvatures. The distance is positive if measured
from left to right; the height is positive if the endpoint is above the axis; the angle from the axis or
from a normal is positive if measured in the counterclockwise direction (positive θ); and the
curvature is positive if its center is to the right of the vertex (intersection of the surface and the
optical axis).
Now consider the example in the figure where an optical system acts on a red “object” (the
upright red arrow) located at the object point labeled by O to produce an “image” at O′. The
horizontal black line is the line of symmetry of the optical system and is called the “optical axis.”
Sign conventions for a specific case: the object height at O is positive, while the image height at O′
is negative. The angle θ of the (blue) ray from the base of the object to the (green) first surface is
positive. The radius of curvature R of the first surface is positive.
The front and rear surfaces of the optical system are shown in green; their intersections with the
optical axis are the vertices of the system. The object space includes all features to the left of the
vertex V that is closer to the object, so V is the object-space vertex of the imaging system. Similarly,
the image space includes all features to the right of the vertex V′ that is closer to the image O′,
so V′ is the image-space vertex. The ray shown in blue from the object O to the green optical
surface makes an angle θ measured from the optical axis to the ray; since this angle is measured
counterclockwise, it is a positive angle θ > 0. The image-space ray from V′ to O′ measured from
the axis is a clockwise angle, so θ′ < 0.
The front surface of the optical system has a radius of curvature R that is measured from the
vertex to the center of curvature, i.e. $R = \overline{VC}$, where the overscored pair of letters denotes the
distance from the first feature to the second. In this case, the distance from V to C is measured
from left to right, so $\overline{VC} \equiv R > 0$. In the same manner, the distance from the rear vertex V′ to
its center of curvature C′ is measured from right to left, so $R' \equiv \overline{V'C'} < 0$; R′ is negative in this
example.
Two other features are shown in the figure that we have not yet described, one each in object
and image space. F and F′ are object-space and image-space focal points, respectively. They
are endpoints of the object-space and image-space focal lengths; the other endpoints are either
the vertices (if the lenses are “thin”) or the principal points (which we shall label as H and H′,
respectively). That discussion will have to wait until later.
We will often have the need to propagate a light ray through an optical system consisting of
a set of different thin lenses or a set of surfaces separated by different media. The cascade of
calculations requires distances measured from the object to the lens or front surface and from lens
or back surface to the image. The need to express multiple distances will be addressed by both
subscripts and “primed” notation, depending on context, where the “unprimed” notation will refer
to the distance before the lens or surface and the “primed” notation to that after. When multiple
surfaces are needed, the first will be denoted by the subscript “1,” the next by “2,” etc.
Notation can also be a problem. The two different lower-case Greek letters for “phi” (straight φ
and cursive ϕ) will be used in different ways: φ represents the “power” of a lens or surface and is
measured in reciprocal length, most commonly reciprocal meters (m⁻¹), which is named the diopter.
The cursive phi (ϕ) will be used to represent an angle, and therefore is dimensionless. The cursive
letter f is used to represent a function, e.g., f [x, y, t], whereas the “straight” letter f will be used
to denote the focal length with dimensions of length. This means that:
$$\phi = \frac{1}{f}$$
2.4.1 Nature of Objects and Images:
1. Real Object: Rays incident on the lens are diverging from the source; the object distance is
positive
2. Virtual Object: Rays into the lens are converging toward the “source” located “behind” the
lens; object distance is negative
3. Real Image: Rays emerging from the lens are converging toward the image; image distance is
positive
4. Virtual Image: Rays emerging from the lens are diverging, so that the “image” is behind the
lens and the image distance is negative
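The four cases above depend only on the signs of the object and image distances, which can be summarized in a short Python sketch. The function name `classify` is hypothetical, and treating a distance of exactly zero as "not positive" is an assumption made only to keep the example short.

```python
def classify(object_distance, image_distance):
    """Label an object/image pair from the signs of the two distances.

    Positive object distance: rays diverge from a real object.
    Positive image distance: rays converge toward a real image.
    """
    obj = "real object" if object_distance > 0 else "virtual object"
    img = "real image" if image_distance > 0 else "virtual image"
    return obj, img

# Illustrative distances in arbitrary length units.
print(classify(+200.0, +50.0))
print(classify(+20.0, -60.0))
print(classify(-30.0, +15.0))
```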
2.5 Human Eye
Since this course considers optics of imaging systems, and since the images generated by many
optical systems are viewed by human eyes, we need to at least introduce the optics of the eye; we
will consider it in more detail when we trace rays through the “standard” eye model later.
The optics of the human eye include the curved surface (the “cornea,” which exhibits most of
the power of the system) and a deformable lens. The system is intended to form an image on the
retina, which is a fixed distance from the cornea. The lens is deformed by action of ciliary muscles
to change the plane that is viewed “in focus.” When the muscles are relaxed, the lens is “flatter,”
i.e., the radii of curvature of the surfaces are larger. To view an object “close up,” the focal length
of the eye lens must be shortened by making the lens shape more spherical. This is accomplished by
tightening the ciliary muscles (which is the reason why your eyes get tired after an extended time
of viewing objects up close).
If the retina is located “too far” from the cornea, so that the image is “in front” of the retina
when the muscles are relaxed, then the eye sees a “blurry” image of distant objects, but nearby
objects may be well focused. This is the condition of “nearsightedness” or “myopia.” If the retina
is “too close” to the cornea, the image is focused behind it and the eye sees distant objects more
sharply than nearby objects (“hyperopia” or “farsightedness”).
2.6 Principle of Least Time
The mathematical model of ray optics is based on a principle stated by Fermat. Long before that,
Hero of Alexandria hypothesized a model of light propagation that could be called the principle of
least distance:
A ray of light traveling between two arbitrary points
traverses the shortest possible path in space. (Hero of Alexandria)
This statement applies to reflection and transmission through homogeneous media (i.e., the medium
is characterized by a single index of refraction). However, Hero’s principle is not valid if the object
and observation points are located in different media (as is the normal situation for refraction) or if
multiple media are present between the points.
In 1657, Pierre Fermat modified Hero’s statement to formulate the principle of least time (which
actually works):
A light ray travels the path that requires the least time to traverse. (Fermat)
The laws of reflection and refraction may be easily derived from Fermat’s principle. A moving ray
(or car, bullet, or baseball) traveling a distance s at a velocity v requires t seconds:
$$t = \frac{s}{v}$$
If the ray travels at different velocities for different increments of distance, the total travel time is
the summation over the different distances and different velocities:
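$$t = \sum_{m=1}^{M} \frac{s_m}{v_m}$$
If we define the velocity of a light ray in a medium of index n to be $v = \frac{c}{n}$, then:
$$t = \sum_{m=1}^{M} \frac{s_m}{\left(\frac{c}{n_m}\right)} = \frac{1}{c}\sum_{m=1}^{M} (n_m s_m) \equiv \frac{\mathrm{OPL}}{c}$$
where the optical path length OPL is defined:
$$\mathrm{OPL} \equiv \sum_{m=1}^{M} (n_m s_m)$$
For a single medium, the optical path length is:
$$\mathrm{OPL} \equiv n \cdot s$$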
Note that the optical path length is at least as long as the physical path length because light
travels more slowly in the medium than in vacuum (n_m ≥ 1); it is the distance that a ray would
travel in vacuum in the same time that it takes to travel the physical distance s in the medium.
The principle of least time may be restated: a light ray requires the least time to
traverse the path with the shortest optical path length, or:
A ray traverses the route with the shortest optical path length.
This suggests a philosophical question, “How does the light ray know which path to take before
it leaves the source?” I leave it to you to ponder this question, but will say that the difficulty of
formulating an answer suggests the limitation of the (simple) ray model for light propagation.
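The summation for the optical path length and the travel time can be sketched in a few lines of Python. The function names and the three-segment path (air, a thin glass plate, air) are illustrative assumptions; only the relations OPL = Σ n_m s_m and t = OPL/c come from the text.

```python
C = 299_792_458.0  # speed of light in vacuum, in m/s

def optical_path_length(segments):
    """Sum n_m * s_m over (index, physical length) pairs; same unit as s_m."""
    return sum(n * s for n, s in segments)

def travel_time(segments):
    """Travel time t = OPL / c for lengths given in meters."""
    return optical_path_length(segments) / C

# Hypothetical path: 1 m of air, 1 cm of glass (n = 1.5), 0.5 m of air.
path = [(1.0, 1.0), (1.5, 0.01), (1.0, 0.5)]
physical_length = sum(s for _, s in path)
print(physical_length, optical_path_length(path), travel_time(path))
```

The optical path length exceeds the physical length only by the contribution of the glass segment, (n − 1)·s for that segment.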
2.7 Fermat’s Principle for Reflection
Now consider the path traveled upon reflection that minimizes an easily evaluated optical path
length:
Schematic for determining the angle of reflection using Fermat’s principle.
As drawn, the angle θ1 is positive (measured from the normal to the ray) and θ2 is negative (from the
normal to the ray). The ray travels in the same medium of index n both before and after reflection.
The components of the optical path length are:
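$$\overline{so} = \sqrt{h^2 + x^2}$$
$$\overline{op} = \sqrt{b^2 + (a - x)^2}$$
And the expression for the total optical path length OPL is:
$$\mathrm{OPL} = n \cdot (\overline{so} + \overline{op}) = n\left(\sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2}\right) = \mathrm{OPL}[x] \quad \text{(a function of } x\text{)}$$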
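By Fermat’s principle, the path traveled is the one that minimizes the optical path length OPL, so the
position of o along the x-axis is found by setting the derivative of OPL with respect to x to zero:
$$\frac{d\,\mathrm{OPL}}{dx} = \frac{d}{dx}\left(n\left(\sqrt{h^2 + x^2} + \sqrt{b^2 + (a - x)^2}\right)\right) = 0$$
$$= n \cdot \left(\frac{2x}{2\sqrt{h^2 + x^2}} + \frac{-2(a - x)}{2\sqrt{b^2 + (a - x)^2}}\right)$$
$$\implies \frac{x}{\sqrt{h^2 + x^2}} - \frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0$$
$$\implies \frac{x}{\sqrt{h^2 + x^2}} = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$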
From the drawing, note that:
$$\sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}}$$
$$\sin[-\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$
$$\implies \sin[\theta_1] = \sin[-\theta_2]$$
$$\implies \theta_2 = -\theta_1$$
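The same result can be checked numerically by minimizing the optical path length over x instead of differentiating. The sketch below is a minimal illustration in Python; the golden-section search, the function names, and the values of h, b, and a are assumptions chosen for the example (the notes themselves use the derivative).

```python
import math

def reflection_opl(x, h, b, a, n=1.0):
    """Optical path length n * (so + op) for a reflection point at x."""
    return n * (math.hypot(h, x) + math.hypot(b, a - x))

def golden_section_min(f, lo, hi, tol=1e-9):
    """Locate the minimum of a unimodal function f on the interval [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    while hi - lo > tol:
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if f(c) < f(d):
            hi = d  # the minimum lies in [lo, d]
        else:
            lo = c  # the minimum lies in [c, hi]
    return 0.5 * (lo + hi)

h, b, a = 1.0, 2.0, 3.0  # illustrative geometry in arbitrary units
x_min = golden_section_min(lambda x: reflection_opl(x, h, b, a), 0.0, a)

# Sines of the incidence and (negated) reflection angles at the minimum.
sin_theta1 = x_min / math.hypot(h, x_min)
sin_minus_theta2 = (a - x_min) / math.hypot(b, a - x_min)
print(x_min, sin_theta1, sin_minus_theta2)
```

At the minimum the two sines should agree to within the search tolerance, which is the numerical counterpart of θ2 = −θ1.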
In words, the magnitudes of the angles of incidence and reflection are equal (as already derived
by evaluating Maxwell’s equations at the boundary). The negative sign is necessary because of
the sign convention for the angle; the angle is measured from the normal and increases in the
counterclockwise direction, but the reversal of the propagation direction of the ray means that it
also may be “explained” by assuming that the index of refraction for the image space is the negative
of that for the object space.
Snell’s law for reflection at interface.
Note that Snell’s law for reflection does not include either refractive index n, which means that
the outgoing ray angle is not affected by the different refractive indices of the two media, so the
image location and quality are not influenced by the indices. The “amount” of the ray that is reflected
IS affected by the two refractive indices via the Fresnel equations, which require the principles of
wave optics for explanation. At this point, we will just introduce the relationship without proof. If
light is incident normally to the interface between two media (θ = 0) with refractive indices n1 and
n2 , the reflectivity of the surface obeys:
$$R = \left(\frac{n_1 - n_2}{n_1 + n_2}\right)^2 \text{ if } \theta = 0$$
If the first medium is air with n ≅ 1 and the second is glass with n ≅ 1.5, the reflectivity is:
$$R = \left(\frac{1 - 1.5}{1 + 1.5}\right)^2 = 0.04$$
Note that the reflectivity is the same if the first medium is glass and the second is air:
$$R = \left(\frac{1.5 - 1}{1.5 + 1}\right)^2 = 0.04$$
The reflectivity at different incident angles obeys more complicated expressions, in part because the
light must be decomposed into different polarizations depending on the direction of oscillation of
the electric field.
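The normal-incidence formula is simple enough to evaluate directly. The following Python sketch reproduces the air-glass value quoted above; the function name `normal_incidence_reflectivity` is an illustrative choice, and the formula applies only at θ = 0.

```python
def normal_incidence_reflectivity(n1, n2):
    """Reflectivity R = ((n1 - n2) / (n1 + n2))**2, valid only at theta = 0."""
    return ((n1 - n2) / (n1 + n2)) ** 2

air_to_glass = normal_incidence_reflectivity(1.0, 1.5)
glass_to_air = normal_incidence_reflectivity(1.5, 1.0)
print(air_to_glass, glass_to_air)
```

Because the difference of indices is squared, swapping the two media leaves the result unchanged.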
2.7.1 Plane Mirrors
Other than perhaps the pinhole, the simplest image forming system is the plane mirror, which is
so familiar that it may seem hardly worth mentioning. Clearly its action obeys Snell’s reflection
law that θ2 = −θ1, which means that the appearance of an image is “reversed” relative to the
object, i.e., the parity of the image is inverted. It also allows introduction of the concepts of object
space and image space, which will be used thenceforth and forevermore. The object space is the
locus of points where objects may exist, which is all points “in front of” the mirror (real objects)
and “behind” the mirror (virtual objects) . A real object forms a virtual image “behind” the mirror,
and a virtual object forms a real image “in front of” the mirror. In other words, the object and
image spaces for reflection by a plane mirror both include the entire 3-D space.
Object and image space for a plane mirror. Rays diverging from a real object form a virtual image
“behind” the mirror, but rays converging to a virtual object “behind” the mirror form a real image
“in front of” the mirror.
2.8 Fermat’s Principle for Refraction:
Schematic for refraction using Fermat’s principle.
In this drawing, both θ1 and θ2 are positive (measured from the normal to the interface in the
counterclockwise direction). The optical path length is:
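$$\mathrm{OPL} = n_1 \cdot \overline{so} + n_2 \cdot \overline{op} = n_1\sqrt{h^2 + x^2} + n_2\sqrt{b^2 + (a - x)^2}$$
By Fermat’s principle, the path traveled is the one for which OPL is minimized, so we again set the
derivative of OPL with respect to x to zero and identify trigonometric functions for the resulting ratios.
$$\frac{d\,\mathrm{OPL}}{dx} = n_1\frac{2x}{2\sqrt{h^2 + x^2}} + n_2\frac{-2(a - x)}{2\sqrt{b^2 + (a - x)^2}} = 0$$
$$\implies n_1\frac{x}{\sqrt{h^2 + x^2}} - n_2\frac{a - x}{\sqrt{b^2 + (a - x)^2}} = 0$$
$$\sin[\theta_1] = \frac{x}{\sqrt{h^2 + x^2}}$$
$$\sin[\theta_2] = \frac{a - x}{\sqrt{b^2 + (a - x)^2}}$$
$$\implies n_1 \sin[\theta_1] = n_2 \sin[\theta_2]$$
$$\implies \text{Snell's Law for refraction}$$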
Note that with this sign convention, Snell’s law may be applied to reflection by setting the refractive
index of the second medium to be the negative of the first:
$$n_1 \sin[\theta_1] = n_2 \sin[\theta_2]$$
$$\implies n_1 \sin[\theta_1] = -n_1 \sin[\theta_2]$$
$$\implies -\sin[\theta_1] = \sin[\theta_2]$$
$$\implies \theta_2 = -\theta_1$$
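Snell’s law and its use for reflection (with the second index set to the negative of the first) can be sketched as one Python function. The name `refraction_angle` is hypothetical, and returning `None` when no real refracted angle exists is a design choice of this example, not something stated in the notes.

```python
import math

def refraction_angle(theta1, n1, n2):
    """Solve n1 * sin(theta1) = n2 * sin(theta2) for theta2 (radians).

    Returns None if |n1 * sin(theta1) / n2| > 1, in which case no real
    refracted angle exists (an assumption of this sketch).
    """
    s = n1 * math.sin(theta1) / n2
    if abs(s) > 1.0:
        return None
    return math.asin(s)

theta1 = math.radians(30.0)  # illustrative incident angle
theta2_glass = refraction_angle(theta1, 1.0, 1.5)   # air into glass
theta2_mirror = refraction_angle(theta1, 1.0, -1.0)  # reflection: n2 = -n1
print(math.degrees(theta2_glass), math.degrees(theta2_mirror))
```

Setting n2 = −n1 returns −θ1, in agreement with the law of reflection derived earlier.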
The expression of Snell’s law for refraction is general, but we can easily apply the first-order paraxial
approximation that sin[θ] ≅ θ if the ray angles are small (θ_n ≅ 0):
$$n_1 \sin[\theta_1] = n_2 \sin[\theta_2] \implies n_1 \cdot \theta_1 = n_2 \cdot \theta_2 \text{ in paraxial approximation}$$
$$\implies \theta_2 = \frac{n_1}{n_2} \cdot \theta_1 \text{ in paraxial approximation}$$
2.8.1 Dispersion
Unlike the reflection law, Snell’s law for refraction DOES include the refractive indices. This means
that the angle of refraction will change as the indices change, as with wavelength. All (or perhaps
I should say ALL) transparent materials exhibit a variation in refractive index with wavelength,
which is called dispersion. Note that the features of dispersion depend on the material (e.g., glass).
The full explanation of dispersion is beyond the scope of this course, so we will just describe its
effects.
In a transparent material over the range of visible wavelengths, the refractive index n DECREASES
with increasing λ. In the study of wave optics, this ensures that the phase velocity for the
“average” wave, $v_\phi = \frac{\omega}{k}$, is larger than the group or modulation velocity $\frac{d\omega}{dk}$. Among other
things, this ensures that a signal transmitted as a modulation of a light wave cannot travel at a
speed faster than the velocity of light. A schematic dispersion curve for a hypothetical glass is shown in
the figure; note that the slope of the dispersion curve decreases in magnitude with increasing λ; the curve “flattens
out” as λ increases in the visible range.
Typical dispersion curve for glass at visible wavelengths, showing the decrease in n with increasing
λ and the three spectral wavelengths specified by Fraunhofer and used to specify the “refractivity”,
“mean dispersion”, and “partial dispersion” of a material.
The refractive indices for several real glasses show an additional feature of dispersion curves:
the relationship between the “amount” of dispersion and the refractive index. Glasses with lower
refractive index (n ≅ 1.5, the so-called crown glasses) have a “flatter” graph and therefore less
dispersion. In other words, n_blue is larger than n_red, but not much larger, so that the smaller the
refractive index, the smaller the dispersion. Flint glasses have larger values of the refractive index
(n ≅ 1.7) and larger variations across the visible spectrum:
$$(n_{\text{blue}} - n_{\text{red}})_{\text{flint}} > (n_{\text{blue}} - n_{\text{red}})_{\text{crown}}$$
Dispersion curves for various optical glasses as a function of wavelength λ in the visible region of
the spectrum (measured in Angstroms, where 1 Å = 0.1 nm = 10⁻¹⁰ m, 4000 Å = 400 nm). The rapid
rise in the index at wavelengths in the ultraviolet region is due to the atomic resonances there.
If we use the paraxial approximation for rays in air (n1 ≅ 1) entering a glass with refractive index n2, the
outgoing ray angle θ2 is:
$$\theta_2 = \frac{1}{n_2} \cdot \theta_1 \text{ in paraxial approximation}$$
Dispersion ensures that (n2)_blue > (n2)_red, which means that (θ2)_blue < (θ2)_red and the deviation
angle δ_blue > δ_red.
Since the outgoing ray angles are different for different colors, images will be formed at different
distances in different colors. This is the source of chromatic aberration in imaging systems.
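To put numbers on this effect, the Python sketch below applies the paraxial relation θ2 = θ1/n2 to blue and red light entering a crown glass. The indices are the F-line and C-line values tabulated for the crown glass in Section 2.8.2; the incident angle and the definition of the deviation as δ = θ1 − θ2 are assumptions made for the example.

```python
def paraxial_refraction_angle(theta1, n2, n1=1.0):
    """Paraxial Snell's law: theta2 = (n1 / n2) * theta1, angles in radians."""
    return n1 * theta1 / n2

N_BLUE = 1.52225  # crown glass at the F line (486.13 nm)
N_RED = 1.51418   # crown glass at the C line (656.28 nm)

theta1 = 0.10  # incident angle in radians (illustrative value)
theta2_blue = paraxial_refraction_angle(theta1, N_BLUE)
theta2_red = paraxial_refraction_angle(theta1, N_RED)

# Assumed definition: deviation = change of ray direction at the surface.
delta_blue = theta1 - theta2_blue
delta_red = theta1 - theta2_red
print(theta2_blue, theta2_red, delta_blue, delta_red)
```

The difference between the two refracted angles is small for a single surface, but it is enough to shift the blue and red image locations.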
Effect of dispersion on refraction: since the refractive index for red light is smaller, the angle of
refraction measured from the normal is larger. Put another way, this means that the deviation
angle due to refraction is smaller for red light than for blue light.
In imaging, we often think of dispersion in refractive elements as an unfortunate “bug” in the
system, but you probably also know that it can be a very useful feature; it provides a tool for
spreading white light into its constituent spectrum in a dispersing prism.
Dispersing prism with the two refractions, showing that the angle of deviation from the original
path is larger for blue light than for red light.
From the figure, note that the angle of deviation of the ray from the original path is larger for blue
light due to the dispersion of light:
$$\delta_{\text{blue}} > \delta_{\text{red}} \text{ for prism}$$
The relationship between the wavelength and the deviation angle is complicated for refraction.
As a side comment, note that light may also be dispersed into its spectrum by the phenomenon
of diffraction in gratings. However, the relationship between the wavelength and the deviation angle
for diffraction is very simple: the angle of deviation is proportional to the wavelength (for small
angles):
$$\delta \propto \lambda \implies \delta_{\text{blue}} < \delta_{\text{red}} \text{ for grating}$$
This means that it is easier to construct an accurate spectrometer based on diffraction than based
on refractive dispersion.
2.8.2 Refractive Constants for Glasses
The refractive properties of glass are approximately specified by the refractivity and the measured
differences in refractive index at the three Fraunhofer wavelengths F, D, and C:
Refractivity: $n_D - 1$, where $1.5 \le n_D \le 1.75$
Mean Dispersion: $n_F - n_C > 0$ (difference between blue and red indices)
Partial Dispersion: $n_D - n_C > 0$ (difference between yellow and red indices)
Abbé Number: $\nu \equiv \frac{n_D - 1}{n_F - n_C}$ (ratio of refractivity and mean dispersion), $25 \le \nu \le 65$
(note that larger dispersions result in smaller Abbé numbers)
Glasses are specified by six-digit numbers abcdef, where nD = 1.abc, to three decimal places,
and the Abbé number ν = de.f . Note that larger values of the refractivity mean that the refractive
index is larger and thus so is the deviation angle in Snell’s law. A larger Abbé number means that
the mean dispersion is smaller and thus there will be a smaller difference in the angles of refraction.
Such glasses with larger Abbé numbers and smaller indices and less dispersion are crown glasses,
while glasses with smaller Abbé numbers are flint glasses, which are “denser”. Examples of glass
specifications include Borosilicate crown glass (BSC), which has a specification number of 517645, so
its refractive index in the D line is 1.517 and its Abbé number is ν = 64.5. The specification number
for a common flint glass is 619364, so nD = 1.619 (relatively large) and ν = 36.4 (smallish). Now
consider the refractive indices in the three lines for two different glasses: “crown” (with a smaller n)
and “flint”:
Line    λ [nm]    n for Crown    n for Flint
C       656.28    1.51418        1.69427
D       589.59    1.51666        1.70100
F       486.13    1.52225        1.71748
The glass specification numbers for the two glasses are evaluated to be:
For the crown glass:
refractivity: $n_D - 1 = 0.51666 \cong 0.517$
Abbé number: $\nu = \frac{1.51666 - 1}{1.52225 - 1.51418} \cong 64.0$
Glass number = 517640
For the flint glass:
refractivity: $n_D - 1 = 0.70100 \cong 0.701$
Abbé number: $\nu = \frac{1.70100 - 1}{1.71748 - 1.69427} \cong 30.2$
Glass number = 701302
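The two evaluations above follow the same recipe, which can be written as a short Python sketch. The function names `abbe_number` and `glass_code` are illustrative; rounding the refractivity to three decimal places and the Abbé number to one decimal place is the reading of the six-digit code used in this section.

```python
def abbe_number(n_d, n_f, n_c):
    """Abbe number: ratio of refractivity (n_D - 1) to mean dispersion (n_F - n_C)."""
    return (n_d - 1.0) / (n_f - n_c)

def glass_code(n_d, n_f, n_c):
    """Six-digit code abcdef with n_D = 1.abc and Abbe number = de.f."""
    refractivity_digits = round((n_d - 1.0) * 1000)
    abbe_digits = round(abbe_number(n_d, n_f, n_c) * 10)
    return f"{refractivity_digits:03d}{abbe_digits:03d}"

# Indices from the table: (n_D, n_F, n_C) for each glass.
crown = (1.51666, 1.52225, 1.51418)
flint = (1.70100, 1.71748, 1.69427)
print(glass_code(*crown), glass_code(*flint))
```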
Dispersion curve of a material from very short to very long wavelengths. The index increases with
increasing λ as additional resonances are passed, but the index of refraction decreases with
increasing wavelength in the visible wavelengths (bold face).
The dispersion curves for optically transparent materials, such as glass and air, exhibit some very
similar features, though the details may be significantly different. Starting at very short wavelengths
(λ ≅ 0), the refractive index n is approximately unity. In words, the wavelength is so short (and
the oscillation frequency so large) that the energy per photon is very large, so that photons pass
through the material without interacting with the atoms; the material appears to be vacuum. For
longer (but still very short) wavelengths (“hard” X rays), the refractive index actually is slightly
less than unity, which means that X rays incident on a prism are refracted away from the prism’s
base, rather than towards the base in the manner of visible light. This is the reason why X rays can
be totally reflected at grazing incidence, which is the focusing mechanism used in X-ray telescopes
(such as Chandra). As the wavelength of the incident light increases further, though still within the
X-ray region, the radiation incident on the material is heavily absorbed; this is the “K-absorption
edge” where the energy of the incident X rays is just sufficient to ionize an electron in the innermost
atomic “shell,” the “K shell.” For example, the wavelength of this absorption is λK ≅ 0.67 nm
for silicon. Other absorptions occur at yet longer wavelengths (smaller incident photon energies),
where electrons in the L and M shells, etc., of the atom are ionized. The spectrum of a material
with a large atomic number (and thus several filled electron shells) will exhibit several such resonant
absorptions.
Ionization of a K-shell electron by an incoming X ray of sufficient energy. This is the reason for
the large absorptions of “hard” X rays by materials. Lower-energy (longer-wavelength) X rays will
ionize electrons in the L or M shells, thus producing other absorption “edges.”
As the wavelength of the incident radiation increases further, into the “far ultraviolet” region of
the spectrum, the real part of the refractive index decreases to a value much less than unity within
a wide band of anomalous dispersion. The fact that n < 1 in this region may be confusing because
it seems that the velocity of light exceeds c, but these waves do not propagate in the material due
to the strong absorption (large value of κ). The wavelength of maximum absorption corresponds to
the largest of the several “natural oscillation frequencies” of bound electrons in the material.
In the visible region of the spectrum, the dispersion curve exhibits the familiar decrease in n
with λ that was shown above. For example, the index of air is n ≅ 1.000279 at λ = 486.1 nm
(Fraunhofer’s “F” line) and n ≅ 1.000276 at λ = 656.3 nm (“C” line). The corresponding values for
diamond are nF = 2.4354 and nC = 2.4100. The closer the nearest ultraviolet absorption to the
visible spectrum, the steeper will be the slope dn/dλ in the visible region and thus the larger the visible
dispersion (defined above).
The dispersion curve descends yet more steeply somewhere in the near infrared region and then
rises due to anomalous dispersion in the vicinity of an infrared absorption band (labeled “λ2” on
the graph). For quartz (crystalline SiO2), the center of this band is located at λ ≅ 8.5 μm, but the
absorption already is quite strong for wavelengths as short as λ ≅ 4 μm. Most optical materials have
several such infrared absorption bands and the “base level” of the index of refraction is larger after
each such band. This behavior is confirmed by far-infrared measurements of the refractive index of
quartz (crystalline SiO2), which varies over the interval 2.14 ≤ n ≤ 2.40 for 51 μm ≤ λ ≤ 63 μm. The
large values of n ensure that the focal length of a convex quartz lens is much shorter at far-infrared
wavelengths than at visible wavelengths.
As the wavelength is increased still further into the radio region of the spectrum after the last
absorption band, the refractive index decreases slowly due to normal dispersion from that last
absorption and approaches a limiting value of $\sqrt{\epsilon / \epsilon_0}$.
2.9 Image Formation in the Ray Model
We know that light rays are deviated at interfaces between media with different refractive indices.
The goal in this section is to use interfaces of specified shapes to “collect” the light and “reshape”
the wavefronts in a way that recreates “images” of the original sources.
2.9.1 Refraction at a Spherical Surface
Optical systems typically are used to form images of the source distribution by constructing optical
elements (“lenses”) made out of transparent media with different refractive indices to redirect the
electromagnetic radiation. Until rather recently, lenses were fabricated almost exclusively from
glass, which required the optical surfaces to be ground to the desired curvature and polished to
remove scratches, etc., from the grinding. Two pieces of glass are typically employed in the grinding
process: the “optic” and the “tool.” Water and a grinding compound composed of flecks of some
hard substance resembling sand are placed on the surface of one glass and the two surfaces rubbed
together with some force applied to the top optic. In the grinding process,
the surface that is easiest to fabricate is a sphere, because the two surfaces will be in contact
at all translations. Glass is ground out of the center of the top piece and off of the edges of the
bottom piece, leaving a concave sphere on top and a convex sphere on the bottom. The “grit”
of the grinding compound is reduced gradually to leave a smoother surface. The surface is then
polished using very fine “jeweler’s rouge” to produce smooth surfaces of “optical” quality. More
recently, optical elements have been fabricated from thin plates cemented over a hollowed-out “grid”
to lighten the weight. Also plastics and other materials have been developed that may be cast to
produce optical surfaces of various shapes with minimal polishing.
Grinding optical surfaces: a slurry of water and grinding compound (e.g., carborundum) is placed
between two glass surfaces. The top glass is pushed down and moved around to grind glass from the
center region of the top piece. The resulting surfaces must be spherical because they are the only
curves that remain in contact at all locations.
Consider the action of a spherical surface of a medium with index n2 on an incident ray in a
medium of index n1 :
Refraction at a spherical surface between two media of refractive index n1 and n2 .
The point source is located at s and its distance to the vertex v is $\overline{sv} \equiv z_1 > 0$. The distance
from vertex v to the observation point p is $\overline{vp} \equiv z_2 > 0$. The physical distance traveled by a ray in
medium n1 to the surface is $\overline{sa} \equiv \ell_1$ and that in medium n2 is $\overline{ap} \equiv \ell_2$. The radius of curvature of
the surface is $\overline{vc} = \overline{ac} \equiv R > 0$ as drawn. For emphasis, we repeat that z1, z2, and R are all positive
in our convention. The ray intersects the surface at angle ϕ (the “position angle”) measured from
the center of curvature c. The optical path length of the ray from s to p through a is
$$\mathrm{OPL} = n_1 \ell_1 + n_2 \ell_2 = n_1 (\overline{sa}) + n_2 (\overline{ap})$$
The triangle △sac has sides ℓ1, R, and z1 + R, while △acp has sides ℓ2, R, and z2 − R.
The physical lengths ℓ1 and ℓ2 may be evaluated from the
other two sides and the included angle (ϕ for △sac and π − ϕ for △acp) via the law of cosines:
$$\triangle sac \implies \ell_1^2 = (z_1 + R)^2 + R^2 - 2R(z_1 + R)\cos[\varphi]$$
$$\implies \ell_1 = \sqrt{(z_1 + R)^2 + R^2 - 2R(z_1 + R)\cos[\varphi]}$$
$$\triangle acp \implies \ell_2^2 = (z_2 - R)^2 + R^2 - 2R(z_2 - R)\cos[\pi - \varphi]$$
$$\implies \ell_2 = \sqrt{(z_2 - R)^2 + R^2 + 2R(z_2 - R)\cos[\varphi]}$$
$$= \sqrt{(z_2 - R)^2 + R^2 - 2R(R - z_2)\cos[\varphi]}$$
The corresponding optical path length is:
$$\mathrm{OPL} = n_1 \ell_1 + n_2 \ell_2$$
$$= n_1 \cdot \left(\sqrt{(z_1 + R)^2 + R^2 - 2R(R + z_1)\cos[\varphi]}\right) + n_2 \cdot \left(\sqrt{(z_2 - R)^2 + R^2 - 2R(R - z_2)\cos[\varphi]}\right)$$
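which is obviously a function of the position angle ϕ. We can now apply Fermat’s principle to find
the angle ϕ for which the OPL is a minimum:
$$\frac{d}{d\varphi}(\mathrm{OPL}) = 0$$
$$= \frac{n_1 \cdot 2R(R + z_1)\sin[\varphi]}{2\sqrt{(z_1 + R)^2 + R^2 - 2R(R + z_1)\cos[\varphi]}} + \frac{n_2 \cdot 2R(R - z_2)\sin[\varphi]}{2\sqrt{(z_2 - R)^2 + R^2 - 2R(R - z_2)\cos[\varphi]}}$$
$$= R\sin[\varphi]\left(\frac{n_1(R + z_1)}{\ell_1} + \frac{n_2(R - z_2)}{\ell_2}\right)$$
which may be rearranged to:
$$0 = R\sin[\varphi]\left(\frac{n_1(R + z_1)}{\ell_1} + \frac{n_2(R - z_2)}{\ell_2}\right) \implies 0 = \frac{n_1(R + z_1)}{\ell_1} + \frac{n_2(R - z_2)}{\ell_2}$$
$$\implies \frac{n_1 R}{\ell_1} + \frac{n_2 R}{\ell_2} = \frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}$$
$$\implies \frac{n_1}{\ell_1} + \frac{n_2}{\ell_2} = \frac{1}{R}\left(\frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}\right)$$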
This last relation between the physical path lengths ℓ1 and ℓ2 and the distances z1 and z2 is exact.
Now we use the expression for the physical path length ℓ1 to find its ratio relative to the axial
distance z1 and use simple algebra to rearrange:
$$\frac{\ell_1}{z_1} = \frac{\sqrt{(z_1 + R)^2 + R^2 - 2R(z_1 + R)\cos[\varphi]}}{z_1}$$
$$= \left(\frac{(z_1 + R)^2 + R^2 - 2R(z_1 + R)\cos[\varphi]}{z_1^2}\right)^{\frac{1}{2}}$$
$$= \left(\frac{z_1^2 + R^2 + 2Rz_1 + R^2 - 2R^2\cos[\varphi] - 2Rz_1\cos[\varphi]}{z_1^2}\right)^{\frac{1}{2}}$$
$$= \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)(1 - \cos[\varphi])\right)^{\frac{1}{2}}$$
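This relation also is exact, but may be approximated by applying a truncated series for cos[ϕ]:
$$\cos[\varphi] = 1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots \cong 1 \text{ if } \varphi \cong 0$$
$$\implies 1 - \cos[\varphi] = 1 - \left(1 - \frac{\varphi^2}{2!} + \frac{\varphi^4}{4!} - \frac{\varphi^6}{6!} + \cdots\right)$$
$$= \frac{\varphi^2}{2!} - \frac{\varphi^4}{4!} + \frac{\varphi^6}{6!} - \cdots \cong 0 \text{ if } \varphi \cong 0$$
This leads to the first-order approximation that the path length and axial length are approximately
equal:
$$\frac{\ell_1}{z_1} \cong 1 \implies \ell_1 \cong z_1$$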
Similarly, we can show that:
$$\ell_2 \cong z_2$$
This paraxial or Gaussian approximation (also called first-order optics because it is based on only
the first-order term in the cosine series) is valid only for small ray angles ϕ measured from the optical
axis. In words, the optical path lengths of rays that travel along the optical axis and rays that travel
“away” from the axis (but still with ϕ ≅ 0) are equal.
The simplified imaging equation has the form:
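$$\frac{1}{R}\left(\frac{n_2 z_2}{\ell_2} - \frac{n_1 z_1}{\ell_1}\right) \cong \frac{1}{R}(n_2 - n_1)$$
$$\implies \frac{n_1}{z_1} + \frac{n_2}{z_2} \cong \frac{1}{R}(n_2 - n_1)$$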
This is the paraxial imaging equation for a single surface; clearly it is an approximation to the true
equation, and also clearly it is similar to the imaging equation we have already considered.
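The paraxial imaging equation can be solved for the image distance z2 once the object distance, the two indices, and the radius are known. The Python sketch below is a minimal illustration; the function name `image_distance`, the sample values, and the convention of returning infinity when the outgoing rays are parallel are assumptions of the example.

```python
import math

def image_distance(z1, n1, n2, R):
    """Solve n1/z1 + n2/z2 = (n2 - n1)/R for z2 (paraxial, single surface)."""
    power = (n2 - n1) / R
    remaining = power - n1 / z1  # what is left for the n2/z2 term
    if remaining == 0.0:
        return math.inf  # outgoing rays are parallel: image at infinity
    return n2 / remaining

# Illustrative air-to-glass surface with R = 0.1 (arbitrary length unit).
n1, n2, R = 1.0, 1.5, 0.1
print(image_distance(0.6, n1, n2, R))       # finite object distance
print(image_distance(math.inf, n1, n2, R))  # very distant object
```

A negative return value corresponds to a virtual image in the sign convention of Section 2.4.1.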
Object at Infinite Distance
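Now consider some pairs of object and image distances z1 and z2. If the object is located infinitely
far to the left (z1 = ∞), then:
$$\frac{n_1}{\infty} + \frac{n_2}{z_2} = \frac{n_2}{z_2} \cong \frac{1}{R}(n_2 - n_1)$$
$$\implies z_2 \cong \frac{n_2 R}{n_2 - n_1} \equiv f_2 \text{ the “image-space focal length”}$$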
which is what we “normally” think of as being the focal length of the optic.
Image at Infinite Distance
If the image is located at +∞, the object distance must be
$$\frac{n_1}{z_1} \cong \frac{1}{R}(n_2 - n_1) \implies z_1 \cong \frac{n_1 R}{n_2 - n_1} \equiv f_1 \text{ the “object-space focal length”}$$
$$\frac{n_1}{f_1} = \frac{1}{R}(n_2 - n_1)$$
Also note that:
$$\frac{f_1}{f_2} = \frac{\left(\frac{n_1 R}{n_2 - n_1}\right)}{\left(\frac{n_2 R}{n_2 - n_1}\right)} = \frac{n_1}{n_2} \implies n_1 \cdot f_2 = n_2 \cdot f_1$$
In words, the ratio of the focal lengths in the two spaces (object and image) is the ratio of the indices
of refraction in the two spaces.
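The two focal lengths and their ratio can be evaluated with a few lines of Python. The function name `focal_lengths` and the air-to-water example values are illustrative assumptions; the formulas are the paraxial expressions for f1 and f2 derived above.

```python
def focal_lengths(n1, n2, R):
    """Return (f1, f2): object-space and image-space focal lengths of one surface."""
    f1 = n1 * R / (n2 - n1)
    f2 = n2 * R / (n2 - n1)
    return f1, f2

# Hypothetical surface between air and water with R = 8 (arbitrary length unit).
n1, n2, R = 1.0, 1.333, 8.0
f1, f2 = focal_lengths(n1, n2, R)
print(f1, f2, f1 / f2, n1 / n2)
```

The last two printed values should agree, since the ratio of the focal lengths equals the ratio of the indices.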
Rule of Thumb: Estimating focal lengths of converging lenses: For a single positive
(converging) lens (i.e., not a lens “system” with multiple elements), it is easy to estimate the focal
length of a lens by finding the distance from the lens to the image of a distant bright object. The
requirement for “distant” is not critical; forming the image of a ceiling lamp on the floor or a tabletop
will give a useful estimate for a positive lens with a short focal length.
2.9.2 Imaging with Spherical Mirrors
The equation for a single refractive surface may be used to derive the focal length of a spherical
mirror by setting the refractive index of image space to the negative of that in object space:
$$\phi = \frac{1}{f} = (-n_1 - n_1)\,\frac{1}{R} = -2\,\frac{n_1}{R}$$
In air, the equation for the focal length of a spherical mirror is:
$$f = -\frac{R}{2n} \to -\frac{R}{2} \quad \text{in air}$$
In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature;
the focal length is positive (converging) if R < 0 and negative (diverging) if R > 0, as shown.
Spherical mirrors: concave mirror with negative radius of curvature R = VC < 0 makes outgoing
light rays converge and so f > 0; convex mirror with positive radius of curvature makes rays diverge
and f < 0.
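The sign behavior of the mirror can be illustrated with a short Python sketch that reuses the single-surface power with n2 = −n1. The function name and the radii of ±200 mm are assumptions chosen only for illustration.

```python
def mirror_focal_length(R, n=1.0):
    """Focal length of a spherical mirror immersed in a medium of index n.

    Uses the single-surface power (n2 - n1)/R with n2 = -n1 = -n,
    so that 1/f = -2n/R and f = -R/(2n).
    """
    n1, n2 = n, -n
    power = (n2 - n1) / R
    return 1.0 / power


# Hypothetical radii: concave mirror (R < 0) and convex mirror (R > 0), in mm.
for label, R in [("concave, R = -200 mm", -200.0), ("convex, R = +200 mm", 200.0)]:
    print(f"{label}: f = {mirror_focal_length(R):+.1f} mm")
```

With the sign convention of these notes, the concave mirror gives a positive focal length and the convex mirror a negative one.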
2.10 First-Order Imaging with Thin Lenses
Normally we do not consider the case of an object in one medium with the image in another — usually
both object and image are in air and a lens (a “device” composed of material with different refractive
index n and curved surfaces) diverts the rays to form the image. We can derive the formula for the
object and image distances if we know the radii of the lens surfaces and the indices of refraction.
We merely cascade the formula for a single surface:
At first surface:
$$\frac{n_1}{z_1} + \frac{n_2}{z_1'} = \frac{n_2 - n_1}{R_1}$$
At second surface:
$$\frac{n_2}{z_2} + \frac{n_3}{z_2'} = \frac{n_3 - n_2}{R_2}$$
where z1 is the (usually known) object distance, z1′ is the image distance for rays refracted by the
first surface, z2 is the object distance for the second surface, and z2′ is the image distance for rays
exiting the second surface (and thus from the lens). For the common “convex-convex” lens, the
center of curvature of the first surface is to the right of the vertex, and thus the radius R1 of the
first surface is positive. Since the vertex is to the right of the center of curvature of the second
surface, R2 < 0. If the lens is “thin”, then the ray encounters the second surface immediately
after refraction at the first surface, so the ray heights at the two surfaces are the same. The object
distance for the second surface is the negated image distance from the first: z2 = −z1′. Put another
way, the absolute value of the image distance for the front surface |z1′| is the same as the object
distance for the second surface |z2|. If the lens is “thick”, then the object distance for the second
surface differs in magnitude from the image distance for the first, and the ray heights will be
different if the ray angle is not zero. The thickness t of the lens must satisfy the relationship:
$$z_1' + z_2 = t \Longrightarrow z_2 = t - z_1' \quad \text{(thick lens)}$$
For a thin lens with t = 0:
$$z_2 = 0 - z_1' \Longrightarrow z_2 = -z_1' \quad \text{(thin lens)}$$
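The equations for the two surfaces may be added and the RHS may be rearranged to obtain a
single imaging equation for a lens with two surfaces:
$$\left(\frac{n_1}{z_1} + \frac{n_2}{z_1'}\right) + \left(\frac{n_2}{z_2} + \frac{n_3}{z_2'}\right) = \left(\frac{n_2 - n_1}{R_1}\right) + \left(\frac{n_3 - n_2}{R_2}\right)$$
$$= \frac{n_3}{R_2} - \frac{n_1}{R_1} + n_2\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$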
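For a thin lens with t = 0, substitute z2 = −z1′ (i.e., z1′ = −z2) to obtain:
$$t = 0 \Longrightarrow \left(\frac{n_1}{z_1} + \frac{n_2}{-z_2}\right) + \left(\frac{n_2}{z_2} + \frac{n_3}{z_2'}\right) = \frac{n_1}{z_1} + \frac{n_3}{z_2'}$$
$$\frac{n_1}{z_1} + \frac{n_3}{z_2'} = \frac{n_3}{R_2} - \frac{n_1}{R_1} + n_2\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$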
where the object is immersed in index n1 , the lens has index n2 , and the image is immersed in index
n3 .
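In the usual case of both object and image in air so that n3 = n1 = 1, the equation simplifies to:
$$\frac{1}{z_1} + \frac{1}{z_2'} = \frac{1}{R_2} - \frac{1}{R_1} + n_2\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$
$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$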
Note the similarity between this equation and that we inferred from the derivation of the image
plane using wave optics:
$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$$
where the distances z1 and z2 from the object to the lens and from the lens to the image in that
equation correspond to z1 and z2′ here, and we identify:
$$(n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = \frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2'} \qquad \text{(Lensmaker’s Equation)}$$
which defines the focal length of the thin lens in terms of its physical parameters.
This is the so-called lensmaker’s equation for thin lenses IN AIR; it determines the distance z2′ to
the image from the object distance z1, the radii of curvature R1 and R2 of the spherical surfaces, and
the index of refraction n2 of the glass. Note that the object distance z1 and the image distance z2′ both
appear with the same algebraic sign, which may be interpreted as demonstrating an “equivalence”
of the object and image because the propagation of light rays may be reversed to exchange the roles
of object and image. Corresponding object and image points (or object and image lines or object
and image planes) are called conjugate points (or lines or planes).
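In the more general case where the refractive index of object space is n1 and that of image space is
n3 ≠ n1, the power of the lens follows from the thin-lens equation for three media given above:
$$\phi = \frac{n_1}{z_1} + \frac{n_3}{z_2'} = \frac{n_2 - n_1}{R_1} + \frac{n_3 - n_2}{R_2}$$
which reduces to the lensmaker’s equation if n1 = n3 = 1.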
2.10.1 Examples of Thin Lenses
1. Plano-convex lens, curved side forward (“convexo-planar lens”)
R1 = |R1| > 0
R2 = ±∞ (sign has no effect)
$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{|R_1|} - \frac{1}{\infty}\right) = \frac{n_2 - 1}{|R_1|} > 0$$
If z1 = +∞, then z2′ = f > 0, the focal length:
$$\frac{1}{f} = \frac{n_2 - 1}{R_1} = \phi \quad \text{(system power, measured in m}^{-1} = \text{diopters)}$$
$$f = \frac{R_1}{n_2 - 1} \cong 2R_1 \quad (\text{since } n_2 \cong 1.5 \text{ for glass})$$
We often use the “power” φ = f⁻¹ (measured in m⁻¹ = diopters) instead of the focal length
f to describe the lens, since powers of different lenses combine by addition, instead of as
reciprocals of sums of reciprocals. The power measures the ability of the lens or lens system
to deviate rays, i.e., to change the ray angle.
2. Plano-convex lens, plane side forward:
R1 = ±∞
R2 = −|R2| < 0
$$\frac{1}{z_1} + \frac{1}{z_2'} = -\frac{(n_2 - 1)}{R_2} = +\frac{(n_2 - 1)}{|R_2|} > 0$$
$$f = \frac{|R_2|}{n_2 - 1} \cong 2\,|R_2|$$
So the focal length of the lens is the same regardless of its orientation (front-to-back). Since
the focal lengths for the two configurations (curved side in front or behind lens) are the same,
you might assume that the same image quality can be expected for the two configurations.
This is NOT the case, but the explanation requires the theory of aberrations. At this point,
we will just try to give a bit of motivation for another rule of thumb, while postponing the
proof.
Rule of Thumb: Orientation of Plano-Convex Lens: When using a plano-convex lens
to form an image, the quality of the image is better if the power is more evenly divided between
the two surfaces. This means that the curved side of the lens is placed towards the longer
conjugate (which usually is towards the object) and the plane side towards the shorter conjugate.
This minimizes the spherical aberration that causes rays from a point object to cross the
optical axis at different distances from the lens. This perhaps may be visualized better if we
consider the case of a distant object (assume z1 = ∞) and a plano-convex lens with the flat
side towards the object. For an object at infinity, the rays incident upon the lens are parallel
(“collimated”) both when they are incident upon and when they exit the flat surface. In other
words, the flat side contributes no power to the imaging, so all of the focusing power comes
from the curved surface.
Rule of thumb: when using a plano-convex lens, place the curved side towards the longer conjugate
to get a better image.
3. Plano-concave, plane side forward:
R1 = ±∞
R2 = +|R2| > 0
$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{\infty} - \frac{1}{+|R_2|}\right) = -\frac{(n_2 - 1)}{|R_2|} < 0$$
$$f = -\frac{|R_2|}{n_2 - 1} \cong -2\,|R_2|$$
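4. Double convex lens with equal radii:
R1 = |R| > 0
R2 = −R1 = −|R|
$$\frac{1}{z_1} + \frac{1}{z_2'} = (n_2 - 1)\left(\frac{1}{|R|} - \left(-\frac{1}{|R|}\right)\right) = 2\,\frac{(n_2 - 1)}{|R|} > 0$$
$$\frac{1}{f} = \phi = \frac{2 \cdot (n_2 - 1)}{|R|}$$
$$f = \frac{|R|}{2 \cdot (n_2 - 1)} \cong |R| > 0 \quad \text{if } n_2 \cong 1.5$$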
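The four examples above can be reproduced with a few lines of Python. This is a minimal sketch of the thin-lens lensmaker's equation in air; the function names, the index n = 1.5, and the radius |R| = 100 mm are assumptions made for illustration, and a plane surface is represented by an infinite radius.

```python
import math


def thin_lens_power(n, R1, R2):
    """Power of a thin lens in air: phi = (n - 1) * (1/R1 - 1/R2).

    A plane surface is passed as math.inf (either sign), since 1/inf = 0.
    """
    return (n - 1.0) * (1.0 / R1 - 1.0 / R2)


def thin_lens_focal_length(n, R1, R2):
    """Focal length f = 1/phi; returns math.inf for a lens with zero power."""
    phi = thin_lens_power(n, R1, R2)
    return math.inf if phi == 0 else 1.0 / phi


n = 1.5      # assumed index of the glass
R = 100.0    # assumed magnitude of the radius of curvature, in mm
examples = {
    "1. plano-convex, curved side forward": (R, math.inf),
    "2. plano-convex, plane side forward": (math.inf, -R),
    "3. plano-concave, plane side forward": (math.inf, R),
    "4. double convex, equal radii": (R, -R),
}
for name, (R1, R2) in examples.items():
    print(f"{name}: f = {thin_lens_focal_length(n, R1, R2):+.1f} mm")
```

For n = 1.5 the sketch gives f = ±2|R| for the plano lenses and f = |R| for the symmetric double-convex lens, in agreement with the approximations quoted in the list.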
2.10.2 Spherical Mirror
The mirror changes the direction of rays by reflection, which obeys the law of reflection: the
angle of reflection is the negative of the angle of incidence (measured from the normal to the
surface). For a concave spherical mirror, the incident ray angle varies with height above the optical
axis. The difference in analysis between the single refractive surface and the mirror may be simplified
by recognizing that the mirror “reverses” the direction of propagation of light, which may be modeled
by setting n2 = −n1 = −1:
$$\frac{1}{f} = \frac{-1}{R} - \frac{1}{R} = -\frac{2}{R} \Longrightarrow f = -\frac{R}{2}$$
In words, the magnitude of the focal length of a spherical mirror is half of the radius of curvature.
A concave mirror, whose radius of curvature is negative (center to left of vertex), has a positive
focal length.
2.11 Image Magnifications
The most common use for a lens is to change the apparent size of an object (or image) via the
magnifying properties of the lens. The mapping of object space to image space “distorts” the size
and shape of the image, i.e., some regions of the image are larger and some are smaller than the
original object. We can define three types of magnification: transverse, longitudinal, and angular,
where the first two describe the impact of the imaging system on lengths that are respectively
perpendicular to and parallel to the optical axis, while the last refers to the action on the angles
of rays measured from the optical axis. Note that the very name of “magnification” is rather
misleading because most imaging systems produce images that are smaller than the object; they
actually “minify” the features because the magnifications are smaller than unity.
2.11.1 Transverse Magnification:
The transverse magnification MT is what we usually think of as magnification; it is the ratio of
image to object dimension measured transverse to the optical axis. In the figure, note the two similar
triangles △a1b1c and △a2b2c:
The transverse magnification of the image is the ratio of the height of the image to that of the
object: MT = y2/y1.
It is easy to see that:
$$\frac{y_1}{z_1} = \frac{|y_2|}{z_2} = -\frac{y_2}{z_2} \quad (\text{because } y_2 < 0)$$
$$\Longrightarrow \frac{y_2}{y_1} \equiv M_T = -\frac{z_2}{z_1}$$
If |MT | is larger than or smaller than unity, the image is magnified or minified, respectively. If
MT > 0, the image is upright or erect and if MT < 0, the image is inverted (“upside down”).
2.11.2 Longitudinal Magnification:
The longitudinal magnification ML is the ratio of the “length” or “depth” of the image measured
along the optical axis to the corresponding length of the object; the longitudinal magnification is
the ratio of differential elements of length of the image and object, which approach an infinitesimal
in the limit:
$$M_L = \lim_{\Delta z_1 \to 0} \frac{\Delta z_2}{\Delta z_1} = \frac{dz_2}{dz_1}$$
The expression may be derived by evaluating the total derivative of the lensmaker’s equation.
$$\frac{1}{z_1} + \frac{1}{z_2} = (n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$
Since the imaging equation relates the reciprocal distances z1⁻¹ and z2⁻¹, the longitudinal
magnification varies for different object distances. The total derivative of the left-hand side of the
imaging equation is:
$$d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = d\left(\frac{1}{z_1}\right) + d\left(\frac{1}{z_2}\right) = -\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2$$
The derivative of the right-hand side is:
$$d\left[(n - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right] = (n - 1) \cdot d\left(\frac{1}{R_1} - \frac{1}{R_2}\right) = 0 \quad (\text{because } n, R_1, \text{ and } R_2 \text{ are constants})$$
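We combine these to see that:
$$-\frac{1}{z_1^2}\,dz_1 - \frac{1}{z_2^2}\,dz_2 = 0 \Longrightarrow -\frac{1}{z_1^2}\,dz_1 = \frac{1}{z_2^2}\,dz_2 \Longrightarrow \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2$$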
We can now identify the ratio of the two differential lengths along the axis as the longitudinal
magnification ML:
$$M_L \equiv \frac{dz_2}{dz_1} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0$$
The longitudinal magnification is negative because the image moves away from the lens (increasing
z2 ) as the object moves towards the lens (decreasing z1 ). The longitudinal magnification affects the
irradiance of the image (i.e., the “flux density” of the rays at the image); if |ML | is large, then the
light in the vicinity of an on-axis location is “spread out” over a longer longitudinal dimension at
the image, which requires the irradiance of the image to decrease.
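The relation ML = −(MT)² can be checked numerically by differentiating the thin-lens equation with a finite difference. The sketch below is a minimal Python illustration; the function names and the sample values f = 100 mm and z1 = 300 mm are assumptions, and the real-object sign convention of this section (z1 > 0, z2 > 0 for a real image) is used.

```python
def image_distance(z1, f):
    """Solve 1/z1 + 1/z2 = 1/f for z2 (undefined when z1 equals f)."""
    return 1.0 / (1.0 / f - 1.0 / z1)


def transverse_magnification(z1, f):
    """M_T = -z2/z1."""
    return -image_distance(z1, f) / z1


def longitudinal_magnification(z1, f):
    """M_L = -(z2/z1)^2 = -(M_T)^2."""
    return -transverse_magnification(z1, f) ** 2


def longitudinal_magnification_numeric(z1, f, dz=1e-4):
    """Central-difference estimate of dz2/dz1."""
    return (image_distance(z1 + dz, f) - image_distance(z1 - dz, f)) / (2.0 * dz)


f, z1 = 100.0, 300.0   # hypothetical values, in mm
print("z2  =", image_distance(z1, f))
print("M_T =", transverse_magnification(z1, f))
print("M_L (formula) =", longitudinal_magnification(z1, f))
print("M_L (numeric) =", longitudinal_magnification_numeric(z1, f))
```

The two estimates of ML agree to within the accuracy of the finite difference, and both are negative: the image moves away from the lens as the object approaches it.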
The scaling of the 3-D “image” along the three axes. The scaling along the “transverse” axes x and
y defines the transverse magnification, while the scaling of the image along the z-axis is determined
by the longitudinal magnification.
The effect of longitudinal magnification on the irradiance of the image of a uniformly luminous rod
of length ab. The section at z1 = 2f is imaged with unit negative transverse magnification at
z2 = 2f. Sections of the rod with z1 > 2f are imaged at z2 < 2f, and the energy density is
remapped to account for the nonlinear distance relationship 1/z1 + 1/z2 = 1/f.
2.11.3 Angular Magnification
This is the ratio of the angles of the outgoing ray and the corresponding incoming ray measured
relative to the optical axis. Angular magnification is particularly relevant for systems that do not
form images, e.g., afocal telescopes. We shall shortly utilize this concept when considering the
single-lens magnifier.
$$M_\theta = \frac{\theta_{\text{out}}}{\theta_{\text{in}}}$$
If |Mθ| > 1, then the angle of the emerging ray is larger than that of the corresponding entering
ray. This will increase the angular separation between rays generated by two objects so that it will
be easier for the eye to resolve them. The angular magnification is sometimes called the magnifying
power of the lens.
2.12 Single Thin Lenses
2.12.1 Positive Lens
The power of a single lens with two surfaces is determined by the lensmaker’s equation:
$$\phi = \frac{1}{f} = \phi_1 + \phi_2 = (n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)$$
The power is positive if 1/R1 > 1/R2, e.g., if 0 < R1 < |R2|. The most common case is the “double
convex” lens where R1 > 0, R2 < 0, which means that the ray encounters positive power at both
surfaces. The action of a single thin positive lens with known focal length on an object with known
location may be solved graphically by sketching three specific rays from the tip of the object:
1. the ray parallel to the optical axis; this ray is refracted by the lens to pass through the
image-space focal point F′,
2. the ray through the center of the lens, which is not refracted by the thin lens and so maintains
the same angle relative to the optical axis, and
3. the ray through the object-space focal point F to the lens; this ray is refracted and travels
parallel to the optical axis.
The intersection of these three rays (or obviously of any two) is the location of the image of
the tip of the object:
The example in the figure closely matches the situation where the image is an inverted replica of
the object, so that h′ = −h and MT = −1. The two equations that must be satisfied are
$$z_2 = z_1 \Longrightarrow M_T = -1$$
$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} \Longrightarrow z_1 = z_2 = 2 \cdot f$$
This situation where the object and image distances are twice the focal length is often called imaging
at equal conjugates.
This drawing assumes that the indices of refraction in object and image space are identical. If the
indices are different (e.g., if the object is in water and the image in air), then the imaging equation
must be modified. For object-space index n1, lens index n2, and image-space index n3:
$$\phi = \frac{n_2 - n_1}{R_1} - \frac{n_2 - n_3}{R_2} = \frac{n_1}{z_1} + \frac{n_3}{z_2}$$
If the refractive indices in object and image spaces are larger than that of the lens, such as a
case where the object and image are in glass or water and the lens is “made of” air, the curvatures
must be reversed, so that R1 < 0 and R2 > 0 to make a positive lens.
Lens made of rare medium (e.g., air) within a dense medium (e.g., glass, water). The reversal of
refractive indices requires inverting of the signs of the radii of curvature.
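The graphical three-ray construction described for the positive lens can also be carried out numerically. The following Python sketch places a thin lens at x = 0 with the object tip at (−z1, h), intersects the parallel ray with the chief ray, and compares the result with the height of the focal ray; the function name, the coordinate choice, and the sample values are assumptions made for illustration (a real object with z1 > f in a uniform medium).

```python
def image_of_tip_by_ray_construction(z1, h, f):
    """Locate the image of the object tip (-z1, h) formed by a thin lens at x = 0.

    Ray 1: parallel to the axis at height h, then through F' at (f, 0):
           y = h - (h/f) * x for x > 0.
    Ray 2: through the lens center, undeviated: y = -(h/z1) * x.
    Ray 3: through F at (-f, 0), then parallel to the axis at the height
           where it meets the lens: y = -h*f/(z1 - f).
    Returns (x_image, y_image, y_ray3); requires z1 != f.
    """
    slope1 = -h / f
    slope2 = -h / z1
    x_image = h / (slope2 - slope1)   # solve h + slope1*x = slope2*x
    y_image = slope2 * x_image
    y_ray3 = -h * f / (z1 - f)
    return x_image, y_image, y_ray3


# Hypothetical example at equal conjugates: z1 = 2f.
x_img, y_img, y3 = image_of_tip_by_ray_construction(z1=200.0, h=10.0, f=100.0)
print("image distance z2 =", x_img)
print("image height      =", y_img)
print("height of ray 3   =", y3)
```

All three rays meet at the same point, and for z1 = 2f the sketch returns z2 = 2f with an inverted image of the same size, i.e., MT = −1.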
2.12.2 Negative Lens
A lens with negative power at both surfaces may be constructed if R1 is negative and R2 is positive.
Two (or more) rays that have passed through a lens with negative power will exhibit a larger
divergence on the output side than on the input side.
2.12.3 Meniscus Lenses
A lens with radii of curvature with the same sign on both surfaces is a meniscus lens. If both radii
are positive, then the power is the sum of the powers of the two surfaces:
$$\phi = \phi_1 + \phi_2 = \frac{n - 1}{|R_1|} + \frac{1 - n}{|R_2|} = (n - 1) \cdot \left(\frac{1}{|R_1|} - \frac{1}{|R_2|}\right)$$
which may be positive or negative depending on the relative sizes of R1 and R2; the power is positive
if R2 > R1 and negative if R2 < R1. An example of a meniscus lens with positive power is shown in
the figure.
Meniscus lens with positive power; the radii of curvature of both surfaces are positive since the
vertices are to the left of the centers, and the fact that R2 > R1 ensures that φ > 0.
Examples of meniscus lenses with positive and negative power are also shown:
Meniscus lenses with positive and negative powers from the Newport optics catalog. The red lines
represent rays that show the respective converging and diverging actions of the lenses.
2.12.4 Simple Microscope (magnifier, “magnifying glass,” “loupe”)
This is arguably the simplest imaging system, but some of the concepts it illustrates are sufficiently
sophisticated that many optickers and/or imaging scientists may not understand them entirely. The
simple microscope is a single lens with positive focal length that is used to form a larger image on
the retina than could be formed with the eye alone. (It also may be called the magnifying
glass if handheld or a loupe if designed to rest on the object.) You may know already that the
eye lens is deformed by ciliary muscles that are relaxed when the lens is “flatter,” i.e., the radii of
curvature of the surfaces are larger so the focal length is longer. To view an object “close up,” the
focal length of the eye lens must be shortened by making the lens shape more spherical. This is
accomplished by tightening the ciliary muscles (which is the reason why your eyes get tired after an
extended time of viewing objects up close).
The closest distance to an object that appears to be sharply focused by the unaided eye is the
near point, which (obviously) depends on the flexibility of the deformable eye lens and the capability
of the ciliary muscles, which (obviously) vary with individual, and with age for a single individual.
The distance to the near point may be as close as 50 mm ≅ 2 in for a young child and in the range
1000 mm to 2000 mm for an elderly person. This reduction in “accommodation” for close
objects is one of the signs of aging. The near point of an “ideal” eye is assumed to be 250 mm ≅ 10 in
from the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing
the angular subtense of fine details for those individuals. For this reason, nearsighted individuals
in ancient times (before optical correction) often were attracted to professions requiring fine work,
such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in
these crafts.
The reference for angular magnification is the angle subtended by the object if viewed at the
near point of the average eye so that z1 = 250 mm. If the object height is y, the angle when viewed
at the near point is:
$$\theta_{250\,\text{mm}} = \tan^{-1}\left[\frac{y}{250\,\text{mm}}\right] \cong \frac{y}{250\,\text{mm}}$$
where the first-order approximation tan[θ] ≅ θ if θ ≅ 0 is used in the last step.
Magnifier with Object at Focal Point of Positive Lens
If the object is positioned at the object-space (front) focal point of a positive lens with focal length
flens, then the rays from the “tip” of the object are parallel when they exit the lens and so may be
viewed “in focus” by an eye with a relaxed lens, as for an object at an infinite distance. The angle
subtended by the object one focal length away is:
$$\theta_{\text{lens}} = \tan^{-1}\left[\frac{y}{f_{\text{lens}}}\right] \cong \frac{y}{f_{\text{lens}}}$$
Magnifier with object at focal point of lens. Figure (a) at top shows the angle θ250 mm subtended by
the object when located at the near point; (b) shows the angle θlens subtended by the object when
located at the object-space focal point of the lens. The blue ray in (b) emerges parallel to the optic
axis, which shows that the object distance z1 = f .
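The angular magnification or magnifying power of the magnifier is the ratio of the angle subtended
by the object when viewed at the closer distance through the lens to the angular subtense viewed
at the near point:
$$M_\theta = \frac{\theta_{\text{lens}}}{\theta_{250\,\text{mm}}} = \frac{\tan^{-1}\left[\dfrac{y}{f_{\text{lens}}}\right]}{\tan^{-1}\left[\dfrac{y}{250\,\text{mm}}\right]} \cong \frac{\left(\dfrac{y}{f_{\text{lens}}}\right)}{\left(\dfrac{y}{250\,\text{mm}}\right)}$$
$$M_\theta = \frac{250\,\text{mm}}{f_{\text{lens}}}, \quad \text{object at focal point}$$
If the focal length of the magnifying lens is, say, f = 50 mm, then the magnifying power of the lens
for the object at the focal point is:
$$M_\theta = \frac{250\,\text{mm}}{50\,\text{mm}} = 5$$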
Magnifier with Image Formed at Near Point
We can instead use the magnifying lens held close to the eye to form a virtual image at the near
point of the eye. This means that the distance from the lens to the virtual image formed by the lens
is the distance to the near point: V′O′ = z2 = −250 mm. Substitute this distance into the imaging
equation:
$$\frac{1}{z_1} + \frac{1}{-250\,\text{mm}} = \frac{1}{f} \Longrightarrow \frac{1}{z_1} = \frac{1}{f} + \frac{1}{250\,\text{mm}} \Longrightarrow z_1 = \frac{250\,\text{mm} \cdot f}{250\,\text{mm} + f}$$
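The angle subtended by the object at the near point is the same as before:
$$\theta_{250\,\text{mm}} = \tan^{-1}\left[\frac{y_1}{250\,\text{mm}}\right] \cong \frac{y_1}{250\,\text{mm}}$$
but the angle subtended by the image when positioned at the near point viewed through the lens is
different:
$$\theta_{\text{lens}} = \tan^{-1}\left[\frac{y_2}{|-250\,\text{mm}|}\right] \cong \frac{y_2}{250\,\text{mm}} = \frac{y_1}{z_1}$$
where the similarity of the triangles has been used. This expression may be recast by substituting
the expression for z1:
$$\theta_{\text{lens}} \cong \frac{y_1}{z_1} = \frac{y_1}{\left(\dfrac{250\,\text{mm} \cdot f}{250\,\text{mm} + f}\right)} = \frac{y_1}{f} \cdot \left(\frac{250\,\text{mm} + f}{250\,\text{mm}}\right)$$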
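The magnifying power is:
$$M_\theta = \frac{\theta_{\text{lens}}}{\theta_{250\,\text{mm}}} = \frac{\left(\dfrac{y_1}{f}\right) \cdot \left(\dfrac{250\,\text{mm} + f}{250\,\text{mm}}\right)}{\left(\dfrac{y_1}{250\,\text{mm}}\right)} = \frac{250\,\text{mm} + f}{f}$$
$$M_\theta = \frac{250\,\text{mm}}{f_{\text{lens}}} + 1, \quad \text{image at near point}$$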
Magnifier with image at near point of eye. The top figure again shows the angle θ250 mm subtended
by the object when located at the near point. The second figure shows the image at the near point,
which is more distant than the object.
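The two magnifier configurations can be compared with a short Python sketch. The function names are hypothetical, and the 250 mm near point is the conventional “ideal eye” value used in this section; the 50 mm focal length repeats the example given for the object at the focal point.

```python
NEAR_POINT_MM = 250.0   # conventional near point of the "ideal" eye


def magnifying_power_object_at_focus(f_mm, near_point_mm=NEAR_POINT_MM):
    """M = 250 mm / f: object at the front focal point, image at infinity."""
    return near_point_mm / f_mm


def magnifying_power_image_at_near_point(f_mm, near_point_mm=NEAR_POINT_MM):
    """M = 250 mm / f + 1: virtual image formed at the near point."""
    return near_point_mm / f_mm + 1.0


def object_distance_for_image_at_near_point(f_mm, near_point_mm=NEAR_POINT_MM):
    """z1 = 250 mm * f / (250 mm + f), from 1/z1 + 1/(-250 mm) = 1/f."""
    return near_point_mm * f_mm / (near_point_mm + f_mm)


f = 50.0   # mm
print("object at focal point :", magnifying_power_object_at_focus(f))
print("image at near point   :", magnifying_power_image_at_near_point(f))
print("object distance z1 for image at near point (mm):",
      object_distance_for_image_at_near_point(f))
```

For f = 50 mm the magnifying power increases from 5 to 6 when the image is moved from infinity to the near point, at the cost of an accommodated (non-relaxed) eye.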
2.13 Systems of Thin Lenses
The images produced by systems of thin lenses may be located by finding the “intermediate” image
produced by the first lens, which then becomes in turn the object for the second lens, which generates
an image that is the object for the third lens, etc. This type of analysis also may be applied
directly to the more realistic case of “thick” lenses, where the first “lens” actually represents the
first surface of the thick lens and the light propagates through the glass between the surfaces.
Though straightforward, this “sequential” solution for the image may be tedious and also not very
illuminating (pun intended) about the action of the system of lenses. The object distance for
the nth lens will be denoted by zn and the corresponding image distance by the primed quantity zn′.
2.13.1 Two-Lens System
Consider a two-lens system with first lens L1 and second lens L2 separated by the distance t. The
object for the system shown in the figure is labelled by O and the corresponding image by O′, the
object- and image-space focal points are F and F′, and the object- and image-space vertices (first
and last surfaces of the system) by V and V′.
Imaging by a system of two thin lenses L1 and L2 separated by the distance t. The object and
image distances for the first lens are z1 and z1′ and for the second lens are z2 and z2′.
From the diagram, we see that z1′ (the image distance from the first lens), z2 (the object distance
for the second lens), and the lens separation t are related by:
$$z_1' + z_2 = t$$
so the object distance for the second lens is z2 = t − z1′. The imaging equation for the first lens
determines z1′:
$$\frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1} \Longrightarrow \frac{1}{z_1'} = \frac{1}{f_1} - \frac{1}{z_1} = \frac{z_1 - f_1}{z_1 f_1} \Longrightarrow z_1' = \frac{z_1 f_1}{z_1 - f_1}$$
If z1 = ∞, then:
$$z_1' = \lim_{z_1 \to \infty} \frac{z_1 f_1}{z_1 - f_1} = f_1 \cdot \lim_{z_1 \to \infty} \left(\frac{z_1}{z_1 - f_1}\right) = f_1 \cdot 1 = f_1$$
In words, the image distance from the first lens for an object at ∞ is the focal length of the first
lens, as it should be. The object distance to the second lens is z2 = t − z1′, which may be rewritten
in terms of z1, f1, and t for the general case:
$$z_2 = t - z_1' = t - \frac{z_1 f_1}{z_1 - f_1} = \frac{z_1 t - f_1 t - z_1 f_1}{z_1 - f_1} = \frac{z_1 (t - f_1) - f_1 t}{z_1 - f_1}$$
In the limit of infinite object distance, the object distance to the second lens is:
$$z_2\,[\text{for } z_1 = \infty] = \lim_{z_1 \to \infty} \left(\frac{z_1}{z_1 - f_1} \cdot (t - f_1) - \frac{f_1 t}{z_1 - f_1}\right) = 1 \cdot (t - f_1) - 0 = t - f_1$$
which is the difference between the separation of the lenses and the focal length of the first lens
(the distance to its image-space focal point); this often is a negative distance (i.e., a virtual object
for the second lens).
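In the general case, apply the imaging equation for the second lens and substitute the expression
for z2:
$$\frac{1}{z_2'} = \frac{1}{f_2} - \frac{1}{z_2} = \frac{1}{f_2} - \frac{z_1 - f_1}{z_1 (t - f_1) - f_1 t} = \frac{t - \dfrac{z_1}{(z_1 - f_1)}\,(f_1 + f_2) + \dfrac{f_1 f_2}{(z_1 - f_1)}}{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{(z_1 - f_1)}}$$
$$\Longrightarrow z_2' = \frac{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{(z_1 - f_1)}}{t - \dfrac{z_1}{(z_1 - f_1)}\,(f_1 + f_2) + \dfrac{f_1 f_2}{(z_1 - f_1)}} = \frac{f_2 \cdot \left(t - \dfrac{f_1 \cdot z_1}{(z_1 - f_1)}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{(z_1 - f_1)}}$$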
The image distance for a specified (non-infinite) object location is called the back focal distance by
some authors:
$$BFD = z_2' = \mathbf{V'O'} = \frac{f_2 \cdot \left(t - \dfrac{f_1 \cdot z_1}{(z_1 - f_1)}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{(z_1 - f_1)}}$$
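In the limit of infinite object distance, the BFD becomes the back focal length BFL:
$$\lim_{z_1 \to \infty} [z_2'] = z_2'\,[f_1, f_2, t;\ z_1 = \infty] \equiv \mathbf{V'F'}$$
$$= \lim_{z_1 \to \infty} \left(\frac{f_2 \cdot \left(t - f_1 \cdot \dfrac{z_1}{(z_1 - f_1)}\right)}{t - \dfrac{z_1 \cdot (f_1 + f_2) - f_1 f_2}{(z_1 - f_1)}}\right) = \frac{f_2 \cdot (t - f_1 \cdot 1)}{t - 1 \cdot (f_1 + f_2) + 0 \cdot f_1 f_2}$$
$$= \frac{t \cdot f_2 - f_1 f_2}{t - (f_1 + f_2)} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$
$$\boxed{BFL = \mathbf{V'F'} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}}$$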
These complicated expressions for the image distances measured from the second lens in terms of
the two focal lengths f1 and f2, the separation t, and the distance z1 from the object to the first
lens are useful, but on their face they tell us little about the entire “lens system.” We would much
prefer establishing relationships from the object to the lens system and from the system to the image.
The first step in this analysis is to define an equivalent or effective focal length for the entire system,
which is the focal length of the equivalent single thin lens.
2.13.2 Effective (Equivalent) Focal Length
We can use the results just derived to find an expression for the imaging action of a two-lens system
by finding the location and focal length of the equivalent single lens that would generate the same
image. This is an important concept, so we will do a rigorous derivation, which is perhaps simplified
by adding some details to the figure:
Ray diagram of system of two positive thin lenses to illustrate the concept of “effective” (or
“equivalent”) focal length feff, back focal length BFL = z2′ = V′F′, and principal point H′.
The continuations of the incoming and outgoing rays intersect at B, whose projection onto the optical
axis is at H′; this is the location of the equivalent single lens that would generate the same outgoing
ray from the incoming ray. The distance from H′, the image-space principal point, to F′ is the
image-space effective (or equivalent) focal length:
$$\mathbf{H'F'} \equiv f_{\text{eff}}$$
We have already evaluated the back focal length, which is the image location for an object at infinity:
$$\mathbf{V'F'} = z_2'\,[z_1 = \infty] = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t}$$
Compare two sets of similar triangles shown in the figures: △(AVF1′) ∼ △(CV′F1′) and
△(BH′F′) ∼ △(CV′F′).
From the first pair of triangles, △(AVF1′) ∼ △(CV′F1′), we can construct ratios of their
“heights” and “axial lengths”:
$$\frac{h_1}{\mathbf{VF_1'}} = \frac{h_2}{\mathbf{V'F_1'}} \Longrightarrow \frac{h_2}{h_1} = \frac{\mathbf{V'F_1'}}{\mathbf{VF_1'}}$$
Now note that the distance VF1′ = f1, while V′F1′ may be rewritten:
$$\mathbf{V'F_1'} = \mathbf{VF_1'} - \mathbf{VV'} = f_1 - t$$
so the ratio may be rewritten:
$$\boxed{\frac{h_2}{h_1} = \frac{f_1 - t}{f_1}}$$
From the second pair of similar triangles, △(BH′F′) ∼ △(CV′F′), we can define the distance
H′F′ ≡ feff and V′F′ = BFL = z2′[z1 = ∞], so we now have two expressions for the ratio:
$$\frac{h_2}{h_1} = \frac{\mathbf{V'F'}}{\mathbf{H'F'}} = \frac{BFL}{f_{\text{eff}}}$$
$$\boxed{\frac{h_2}{h_1} = \frac{BFL}{f_{\text{eff}}}}$$
Equate the two boxed equations:
$$\frac{f_1 - t}{f_1} = \frac{BFL}{f_{\text{eff}}} \Longrightarrow \frac{1}{f_{\text{eff}}} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1}$$
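Now substitute the formula for the back focal length BFL, which is z2′ if z1 = ∞:
$$z_2' = \frac{f_2 \cdot (t - f_1)}{t - (f_1 + f_2)} \Longrightarrow \frac{1}{z_2'} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2}$$
$$\Longrightarrow \frac{1}{f_{\text{eff}}} = \frac{1}{BFL} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2} \cdot \frac{f_1 - t}{f_1}$$
which may be rearranged to obtain a relationship for the reciprocal of the effective focal length in
terms of the reciprocals of the individual focal lengths:
$$\frac{1}{f_{\text{eff}}} = \frac{(f_1 + f_2) - t}{(f_1 - t) \cdot f_2} \cdot \frac{f_1 - t}{f_1} = \frac{(f_1 + f_2) - t}{f_2 \cdot f_1} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2}$$
$$\boxed{\frac{1}{f_{\text{eff}}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2}} \Longrightarrow \boxed{f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}}$$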
These two equivalent expressions specify what is certainly the most important equation we have
derived to date and arguably the most important to be derived in this class. It determines the effect
on the image of separating two thin lenses by some distance t.
This expression may also be written in terms of the powers of the two lenses, where the power
of the nth lens is the reciprocal of the focal length: φn ≡ fn⁻¹.
$$\phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$
Note that if
$$t = f_1 + f_2 = \frac{1}{\phi_1} + \frac{1}{\phi_2} = \frac{\phi_1 + \phi_2}{\phi_1 \phi_2}$$
then feff = ∞ =⇒ BFL = +∞ and φeff = 0; the object and image are both an infinite distance
from the system. The focal points are located at ±∞ and the system is called afocal. Such a system
has infinite focal length and no power, which means that the image of an object at infinity is also
at infinity. Since z1 = z2′ = ∞, then the transverse magnification is zero. However, such a system
exhibits a useful angular magnification, as we shall see.
Back Focal Length and Image-Space Principal Point
We have evaluated the back focal length:
$$BFL = \mathbf{V'F'} = \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$
and the system focal length:
$$f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$
We now define the image-space principal point H′ to be the point that is located one effective focal
length from the image-space focal point, i.e., so that H′F′ = feff:
$$\mathbf{H'F'} \equiv f_{\text{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t}$$
We can think of H′ as the location of the single equivalent thin lens that generates the same outgoing
ray that emerges from the two-lens system. For a single thin lens, H′ coincides with the image-space
vertex V′, which in turn coincides with the object-space vertex V since the thin lens has thickness
t = 0.
From the equation for the BFL and the definition of the principal point, we can also specify the
distance from the principal point to the vertex:
$$f_{\text{eff}} \equiv \mathbf{H'F'} = \mathbf{H'V'} + \mathbf{V'F'} = \mathbf{H'V'} + BFL$$
$$\Longrightarrow \mathbf{H'V'} = f_{\text{eff}} - BFL = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} - \frac{f_1 \cdot f_2 - f_2 \cdot t}{(f_1 + f_2) - t}$$
$$\boxed{\mathbf{H'V'} = \frac{f_2 \cdot t}{(f_1 + f_2) - t}}$$
We can (and will) derive corresponding results in the object space, i.e., object-space principal and
focal points.
A pair of positive thin lenses showing the image-space principal and focal points H′ and F′,
respectively.
Compare Back Focal “Length” and Back Focal “Distance”
As the object distance decreases from ∞, the distance from the rear vertex to the image typically
increases, so that the BFD for a finite object distance typically is larger than the BFL for an infinite
object distance. This can be seen by comparing the two expressions for some specimen focal lengths.
For f1 = 100 mm, f2 = 25 mm, and t = 75 mm, the focal length of the equivalent single lens is:
$$f_{\text{eff}} = \left(\frac{1}{100\,\text{mm}} + \frac{1}{25\,\text{mm}} - \frac{75\,\text{mm}}{100\,\text{mm} \cdot 25\,\text{mm}}\right)^{-1} = +50\,\text{mm}$$
The back focal length (distance from rear vertex to focal point) is:
$$BFL = z_2'\,[z_1 = \infty] = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{25\,\text{mm} \cdot (75\,\text{mm} - 100\,\text{mm})}{75\,\text{mm} - (100\,\text{mm} + 25\,\text{mm})} = 12.5\,\text{mm}$$
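If the object distance is decreased from z1 = ∞ to z1 = 1000 mm, the back focal distance is:
$$BFD = \frac{f_2 \cdot t - f_1 \cdot f_2 \cdot \dfrac{z_1}{(z_1 - f_1)}}{t - \dfrac{z_1}{(z_1 - f_1)}\,(f_1 + f_2) + \dfrac{f_1 f_2}{(z_1 - f_1)}} = \frac{f_2 \cdot t - \dfrac{z_1}{z_1 - f_1} \cdot f_1 f_2}{(t - f_2) - \dfrac{z_1}{z_1 - f_1} \cdot f_1}$$
$$BFD\,[z_1 = 1\,\text{m}] = \frac{25\,\text{mm} \cdot 75\,\text{mm} - \dfrac{1000\,\text{mm}}{1000\,\text{mm} - 100\,\text{mm}} \cdot 100\,\text{mm} \cdot 25\,\text{mm}}{(75\,\text{mm} - 25\,\text{mm}) - \dfrac{1000\,\text{mm}}{1000\,\text{mm} - 100\,\text{mm}} \cdot 100\,\text{mm}}$$
$$= \frac{25\,\text{mm} \cdot 75\,\text{mm} - 100\,\text{mm} \cdot 25\,\text{mm} \cdot \dfrac{1000\,\text{mm}}{(1000\,\text{mm} - 100\,\text{mm})}}{75\,\text{mm} - \dfrac{1000\,\text{mm}}{(1000\,\text{mm} - 100\,\text{mm})}\,(100\,\text{mm} + 25\,\text{mm}) + \dfrac{100\,\text{mm} \cdot 25\,\text{mm}}{(1000\,\text{mm} - 100\,\text{mm})}}$$
$$\approx 14.773\,\text{mm} > BFL$$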
In words, as the object distance decreases from infinity, the image distance moves “back” away from
the focal point.
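The numbers in this comparison can be reproduced by tracing the object through each lens in turn and comparing with the closed-form expression for the BFD. The following Python sketch does so for the values f1 = 100 mm, f2 = 25 mm, and t = 75 mm used above; the function names are hypothetical, and the code assumes thin lenses in air and an object that is not located exactly at a focal point.

```python
import math


def thin_lens_image(z, f):
    """Solve 1/z + 1/z' = 1/f for the image distance z'.

    An object at infinity is imaged at the focal point; an object exactly at
    the focal point (z == f) is not handled and raises ZeroDivisionError.
    """
    if math.isinf(z):
        return f
    return 1.0 / (1.0 / f - 1.0 / z)


def back_focal_distance(z1, f1, f2, t):
    """Sequential imaging: lens 1, transfer by z2 = t - z1', then lens 2."""
    z1_prime = thin_lens_image(z1, f1)
    z2 = t - z1_prime
    return thin_lens_image(z2, f2)


def back_focal_distance_closed_form(z1, f1, f2, t):
    """BFD = f2*(t - f1*z1/(z1 - f1)) / ((t - f2) - f1*z1/(z1 - f1))."""
    r = z1 / (z1 - f1)
    return f2 * (t - f1 * r) / ((t - f2) - f1 * r)


f1, f2, t = 100.0, 25.0, 75.0                      # mm, values from the text
bfl = back_focal_distance(math.inf, f1, f2, t)     # object at infinity
bfd = back_focal_distance(1000.0, f1, f2, t)       # object at 1 m
print(f"BFL = {bfl:.3f} mm")
print(f"BFD (sequential)  = {bfd:.3f} mm")
print(f"BFD (closed form) = {back_focal_distance_closed_form(1000.0, f1, f2, t):.3f} mm")
```

The sequential trace needs only the single-lens imaging equation, which is why it extends directly to systems with more than two lenses.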
Front Focal Length
The front focal length (FFL) FV is the distance z1 in the case where z2′ = ∞. It is calculated by
setting the denominator of the expression for z2′ to zero:
$$(t - f_2) - \frac{z_1 f_1}{z_1 - f_1} = 0 \Longrightarrow \frac{z_1 f_1}{z_1 - f_1} = t - f_2 \Longrightarrow \frac{z_1}{z_1 - f_1} = \frac{t - f_2}{f_1}$$
$$\Longrightarrow z_1 f_1 = (t - f_2)(z_1 - f_1)$$
$$\Longrightarrow z_1 f_1 = t z_1 - t f_1 - z_1 f_2 + f_1 f_2$$
$$\Longrightarrow z_1 (f_1 + f_2 - t) = f_1 f_2 - t f_1$$
$$\Longrightarrow \lim_{z_2' \to \infty} z_1 = \mathbf{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = FFL$$
Note that this expression has the same form as the back focal length except that f1 and f2 are
“swapped”.
Front Focal Distance
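Also note that the front focal distance (FFD) is the axial distance from an object to the first surface
(front vertex) of the imaging system; it applies for finite object distances. This is synonymous with
the working distance, a concept often used in microscopy. It has the same form as the BFD with
f1 and f2 exchanged and with the image distance z2′ in place of the object distance z1:
$$FFD = \mathbf{OV} = \frac{f_1 \cdot \left(t - \dfrac{f_2 \cdot z_2'}{(z_2' - f_2)}\right)}{t - \dfrac{1}{(z_2' - f_2)} \cdot \left(z_2' \cdot (f_1 + f_2) - f_1 f_2\right)}$$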
Object-Space Principal Point
We have already shown how to find the location of the equivalent single lens on the “output side”
by extending the rays entering and exiting the system until they meet. We can locate the equivalent
single lens in “object space” by “reversing” the system and introducing rays from the left again.
Since we know the distance from the object-space focal point to the object-space vertex and the
effective focal length, we can find the distance from the vertex to principal point in object space.
$$\mathrm{FH} = f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} = \mathrm{FV} + \mathrm{VH} = \mathrm{FFL} + \mathrm{VH} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} + \mathrm{VH}$$
This implies that the distance from the object-space vertex to the object-space principal point is:
$$\mathrm{VH} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} - \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{f_1 \cdot t}{(f_1 + f_2) - t}$$
2.13.3
Summary of Distances for Two-Lens System
$$\begin{aligned}
f_{\mathrm{eff}} = \mathrm{H'F'} = \mathrm{FH} &= \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} \\
\mathrm{BFL} = \mathrm{V'F'} &= \frac{f_2 \cdot (f_1 - t)}{(f_1 + f_2) - t} \\
\mathrm{H'V'} = \mathrm{H'F'} - \mathrm{V'F'} &= \frac{f_2 \cdot t}{(f_1 + f_2) - t} \\
\mathrm{FFL} = \mathrm{FV} &= \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} \\
\mathrm{VH} = \mathrm{FH} - \mathrm{FV} &= \frac{f_1 \cdot t}{(f_1 + f_2) - t}
\end{aligned}$$
2.13.4
“Effective Power” of Two-Lens System
The expression for the power of the system composed of two lenses in air with focal lengths f1 and
f2 is:
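$$\phi_{\mathrm{eff}}\,[\mathrm{diopters}] \equiv \frac{1}{f_{\mathrm{eff}}\,[\mathrm{m}]} = \frac{1}{f_1\,[\mathrm{m}]} + \frac{1}{f_2\,[\mathrm{m}]} - \frac{t}{f_1 f_2}$$
$$\phi_{\mathrm{eff}}\,[\mathrm{diopters}] = \phi_1 + \phi_2 - \phi_1 \phi_2 t$$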
The power is zero if the separation distance t is equal to the sum of the focal lengths, because
φ1 + φ2 = φ1 φ2 t implies t = 1/φ1 + 1/φ2 = f1 + f2; this is the
recipe for a telescope. If the two lenses have positive power and the separation is just less than the
sum of the focal lengths, the effective focal length can be very large. This is also the case if one of the
two lenses has negative power (so that the numerator f1 f2 of feff is negative) and the separation is just larger
than the sum of the focal lengths (so that the denominator is negative and approximately zero).
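The summary formulas and the power expression can be collected into two small helper functions. This is only a sketch: the names `two_lens_distances` and `effective_power` and the dictionary keys are illustrative choices, not notation from the notes.

```python
def two_lens_distances(f1, f2, t):
    """Distances for two thin lenses in air (all in the same length unit)."""
    d = (f1 + f2) - t
    if d == 0:
        raise ValueError("afocal system: t = f1 + f2, the focal length is infinite")
    return {
        "f_eff": f1 * f2 / d,      # H'F' = FH
        "BFL": f2 * (f1 - t) / d,  # V'F'
        "HpVp": f2 * t / d,        # H'V' = H'F' - V'F'
        "FFL": f1 * (f2 - t) / d,  # FV
        "VH": f1 * t / d,          # FH - FV
    }


def effective_power(f1_m, f2_m, t_m):
    """Power in diopters; focal lengths and separation must be in meters."""
    phi1, phi2 = 1.0 / f1_m, 1.0 / f2_m
    return phi1 + phi2 - phi1 * phi2 * t_m


dist = two_lens_distances(100.0, 25.0, 75.0)  # mm
print(dist)
print(effective_power(0.100, 0.025, 0.075))   # about 20 diopters, i.e. f_eff = 50 mm
```

For f1 = 100 mm, f2 = 25 mm, and t = 75 mm the results reproduce feff = 50 mm and BFL = 12.5 mm, and the power vanishes (to rounding) when t = 125 mm.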
2.13.5
Lenses in Contact: t = 0
If the lenses are in contact, then t = 0 and the front and back focal lengths are equal to the focal
length of the “equivalent single thin lens”:
$$\mathrm{FFL} = \mathrm{BFL} = \frac{f_1 f_2}{f_1 + f_2} = f_{\mathrm{eff}}, \quad \text{if } t = 0$$
$$\implies \frac{1}{f_{\mathrm{eff}}} = \frac{1}{f_1} + \frac{1}{f_2}, \quad \text{if } t = 0$$
Two “thin” positive lenses in contact. The focal length of the system is shorter than the focal
lengths of either, and may be evaluated to see that feff = f1 f2 / (f1 + f2). The image-space principal point is
the location of the “equivalent thin lens”. Since both lenses are “thin”, the principal point coincides
with the locations of both lenses, so that V′ = H′ = H = V.
The power of the system composed of two thin lenses in contact is the sum of the powers:
φeff [Diopters] = φ1 + φ2 − φ1 φ2 · 0
= φ1 + φ2 for two thin lenses in contact
This is the assumed system for the magnifier with the lens held “close to the eye.”
2.13.6
Positive Lenses Separated by t < f1 + f2
If two positive thin lenses are separated by less than the sum of the focal lengths, the image-space
focal point F′ is closer to the first lens than it would have been had the second lens been absent.
The effective focal length of the system is feff < f1 whenever t < f1. We can apply the equation for feff to this
case to see that the focal length is positive:
$$f_{\mathrm{eff}} = \frac{f_1 f_2}{(f_1 + f_2) - t} > 0 \quad \text{if } f_1 + f_2 > t > 0$$
and that feff grows without bound as t approaches f1 + f2.
A pair of positive thin lenses separated by less than the sum of the focal lengths.
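Consider a specific example with f1 = 100 mm, f2 = 50 mm, and t = 75 mm. The focal length of
the equivalent single lens is:
$$f_{\mathrm{eff}} = \frac{f_1 f_2}{(f_1 + f_2) - t} = \frac{(100\,\mathrm{mm})(50\,\mathrm{mm})}{(100\,\mathrm{mm} + 50\,\mathrm{mm}) - 75\,\mathrm{mm}} = \frac{200}{3}\,\mathrm{mm} = 66\tfrac{2}{3}\,\mathrm{mm}$$
The image formed by the first lens is located at its focal point:
$$z_1' = \left( \frac{1}{f_1} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{\infty} \right)^{-1} = 100\,\mathrm{mm}$$
The object distance to the second lens is therefore the difference t − z1′:
$$z_2 = t - z_1' = 75\,\mathrm{mm} - 100\,\mathrm{mm} = -25\,\mathrm{mm}$$
The image of an object located at z1 = ∞ appears at z2′:
$$z_2' = \mathrm{V'F'} = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{50\,\mathrm{mm}} - \frac{1}{-25\,\mathrm{mm}} \right)^{-1} = \frac{50}{3}\,\mathrm{mm} = 16\tfrac{2}{3}\,\mathrm{mm}$$
measured from the rear vertex V′ of the system. We already know that the system focal length is
66 2/3 mm, so the image-space principal point H′ (the position of the equivalent thin lens) is located
66 2/3 mm IN FRONT of the system focal point, i.e., 50 mm in front of the second lens and 25 mm
behind the first lens.
$$\begin{aligned}
\mathrm{H'F'} &= f_{\mathrm{eff}} = 66\tfrac{2}{3}\,\mathrm{mm} \\
\mathrm{V'F'} &= \mathrm{BFL} = 16\tfrac{2}{3}\,\mathrm{mm} \\
\mathrm{H'V'} &= \mathrm{H'F'} - \mathrm{V'F'} = 66\tfrac{2}{3}\,\mathrm{mm} - 16\tfrac{2}{3}\,\mathrm{mm} = 50\,\mathrm{mm}
\end{aligned}$$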
We have already shown how to find the location of the equivalent single lens on the “output side”
by extending the rays entering and exiting the system until they meet. We can locate the equivalent
single lens in “object space” by “reversing” the system, as shown in the figure. The “first” lens in
the system is now (what we have called the second lens) L2 with f2 = 50 mm. The “second” lens is
L1 with f1 = 100 mm and the separation is t = 75 mm. The resulting effective focal length remains
unchanged at feff = 200/3 mm = 66 2/3 mm. If we bring in a ray from an object at ∞, the “intermediate”
image formed by L2 is located at the focal point of L2:
$$z_1' = \left( \frac{1}{f_2} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{50\,\mathrm{mm}} - \frac{1}{\infty} \right)^{-1} = 50\,\mathrm{mm}$$
Thus the object distance to L1 is:
$$z_2 = t - z_1' = 75\,\mathrm{mm} - 50\,\mathrm{mm} = +25\,\mathrm{mm}$$
The image of the object at z1 = ∞ produced by the entire system is located at z2′:
$$z_2' = \left( \frac{1}{f_1} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{+25\,\mathrm{mm}} \right)^{-1} = -\frac{100}{3}\,\mathrm{mm} = -33\tfrac{1}{3}\,\mathrm{mm}$$
measured from the “second” lens L1 (or equivalently from the second vertex). The negative sign means that the image lies on the
incoming side of the second lens and is thus virtual. The object-space principal point H is the point such that the
distance FH = feff = 66 2/3 mm, which means that H is located 100 mm in front of L1 in the reversed system, i.e., 25 mm in front of L2.
The “object-space” principal point H may be located by “reversing” the system and bringing in a
ray from an object at infinity.
When we “re-reverse” the system to graph the object- and image-space principal points, H is
located “behind” the lens L2 , as shown in the graphical rendering of the entire system:
The principal and focal points of the two-lens imaging system in both object and image spaces.
The object-space principal point is the location of the equivalent thin lens if the imaging system
is reversed. We can now use these locations of the equivalent thin lens in the two spaces to locate
the images by applying the thin-lens (Gaussian) imaging equation, BUT the distances z and z′ are
respectively measured from the object O to the object-space principal point H and from the image-space
principal point H′ to the image point O′. The process is demonstrated after first locating the
images via a direct calculation.
“Brute Force” Calculation of Image
Now consider the location and magnification of the image created by the original two-lens imaging
system (with L1 in front) for an object located 1000 mm in front of the system (so that OV =
1000 mm). We can locate the image step by step:
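Intermediate image created by L1:
$$z_1' = \left( \frac{1}{f_1} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{1000\,\mathrm{mm}} \right)^{-1} = \frac{1000}{9}\,\mathrm{mm} \cong 111.11\,\mathrm{mm}$$
Transverse magnification of intermediate image:
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\frac{1000}{9}\,\mathrm{mm}}{1000\,\mathrm{mm}} = -\frac{1}{9}$$
Distance from intermediate image to L2:
$$z_2 = t - z_1' = 75\,\mathrm{mm} - \frac{1000}{9}\,\mathrm{mm} = -\frac{325}{9}\,\mathrm{mm} \cong -36.11\,\mathrm{mm}$$
Distance from L2 to final image:
$$z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{50\,\mathrm{mm}} - \frac{1}{-\frac{325}{9}\,\mathrm{mm}} \right)^{-1} = +\frac{650}{31}\,\mathrm{mm} \cong +20.97\,\mathrm{mm}$$
Transverse magnification of second image:
$$(M_T)_2 = -\frac{z_2'}{z_2} = -\frac{\frac{650}{31}\,\mathrm{mm}}{-\frac{325}{9}\,\mathrm{mm}} = +\frac{18}{31}$$
The transverse magnification of the image from the entire system is the product of the transverse
magnifications from each lens:
$$M_T = (M_T)_1 \cdot (M_T)_2 = \left( -\frac{1}{9} \right) \cdot \left( +\frac{18}{31} \right) = -\frac{2}{31}$$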
which indicates that the image is minified and inverted.
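The step-by-step trace can be reproduced with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` so that the results appear as the same fractions as in the hand calculation; the helper names `thin_lens` and `trace_two_lenses` are illustrative.

```python
from fractions import Fraction


def thin_lens(z, f):
    """Return (image distance z', transverse magnification) for a thin lens in air."""
    z_prime = z * f / (z - f)
    return z_prime, -z_prime / z


def trace_two_lenses(z1, f1, f2, t):
    """Image distance from L2 and system magnification for an object z1 in front of L1."""
    z1_prime, m1 = thin_lens(z1, f1)      # intermediate image from L1
    z2 = t - z1_prime                     # object distance for L2
    z2_prime, m2 = thin_lens(z2, f2)      # final image from L2
    return z2_prime, m1 * m2


z2_prime, m_t = trace_two_lenses(Fraction(1000), Fraction(100), Fraction(50), Fraction(75))
print(z2_prime, float(z2_prime))  # image distance from the rear vertex, in mm
print(m_t, float(m_t))            # system transverse magnification
```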
Imaging Equation using Principal Points
We have just seen that the object- and image-space principal points are the “reference” locations
from which the system focal length is measured;
feff = FH = H0 F0
In exactly the same way, these principal points are the “reference” locations from which the object
and image distances are measured:
z = OH
z 0 = H0 O0
The ray entering the system can be modeled as traveling from the object O to the object-space
principal point H. The resulting outgoing (image) ray travels from the image-space principal point
H0 to the image point O0 . This may seem a little “weird”, but actually makes perfect sense if we
relate the measurements to the equation for a single thin lens. In that situation, focal lengths are
measured from the object-space focal point to the thin lens and from the lens to the image-space
focal point. In other words, the object- and image-space vertices V and V0 of a thin lens coincide
with the principal points H and H0 . We know that an object located at the lens (z = 0) generates
an image at the lens (z 0 = 0) with magnification of +1; the heights of the object and image at the
principal points are identical. In the realistic system where the object- and image-space principal
points are at different locations, the image of an object located at the object space principal point
is formed at the image-space principal point with unit transverse magnification MT = +1. In other
words, the principal points are the locations of conjugate points with unit transverse magnification.
Contrast this with the situation where the object distance OH = 2f, so that the image distance
H′O′ = 2f with transverse magnification MT = −1:
$$\begin{aligned}
\mathrm{OH} &= z = 2f \\
\frac{1}{z} + \frac{1}{z'} &= \frac{1}{f} \implies z' = \mathrm{H'O'} = 2f \\
M_T &= -\frac{2f}{2f} = -1
\end{aligned}$$
This case where the object and image distances are equal so that the transverse magnification is −1
often is called imaging at equal conjugates.
Note the positions of the principal and focal planes of the system we just analyzed: f1 = +100 mm,
f2 = +50 mm, and t = +75 mm. The principal points are “crossed,” which means that the object-space
principal point is farther towards image space than the image-space principal point (H is
“behind” H′). Such a system is more “compact,” because the image is closer to the object-space
principal point, so that F′ is closer than V′O′.
Principal points of an imaging system: The dashed ray from the object at O reaches the
object-space principal point H with height h. The image ray (solid line) departs from the
image-space principal point H′ with the same height h and goes to the image point O′, so that the
distances OH = z and H′O′ = z′ satisfy the imaging equation 1/z + 1/z′ = 1/feff.
Location of Image using Principal Points
We can also analyze this system by using the model of the single thin lens located at the object-
and image-space principal points. We have already shown that the focal length of the system is:
$$f_{\mathrm{eff}} = \mathrm{FH} = \mathrm{H'F'} = +\frac{200}{3}\,\mathrm{mm}$$
The object and image distances z and z′ of the single lens equivalent to the two-lens system are
respectively measured from the principal points: z = OH and z′ = H′O′.
The object distance is measured to the object-space principal point, which is 100 mm behind L1 (or
V), thus the object distance is the distance from O to L1 plus 100 mm:
$$z = \mathrm{OV} + \mathrm{VH} = 1000\,\mathrm{mm} + 100\,\mathrm{mm} = 1100\,\mathrm{mm}$$
The single-lens imaging equation may be used to find the image distance z′, which now is MEASURED
FROM THE IMAGE-SPACE PRINCIPAL POINT H′ (and NOT from the image-space
vertex V′).
$$z' = \mathrm{H'O'} = \left( \frac{1}{f_{\mathrm{eff}}} - \frac{1}{z} \right)^{-1} = \left( \frac{1}{\frac{200}{3}\,\mathrm{mm}} - \frac{1}{1100\,\mathrm{mm}} \right)^{-1} = \frac{2200}{31}\,\mathrm{mm} \cong 70.97\,\mathrm{mm}$$
The image distance from the vertex is calculated by subtracting the distance from the image-space
principal point H′ to the image-space vertex V′:
$$\mathrm{V'O'} = \mathrm{H'O'} - \mathrm{H'V'} = \frac{2200}{31}\,\mathrm{mm} - 50\,\mathrm{mm} = \frac{650}{31}\,\mathrm{mm} \cong +20.97\,\mathrm{mm}$$
The resulting transverse magnification is:
$$M_T = -\frac{z'}{z} = -\frac{\frac{2200}{31}\,\mathrm{mm}}{1100\,\mathrm{mm}} = -\frac{2}{31} \cong -0.065$$
Both the image distance and the transverse magnification match the values obtained with the step-by-step calculation performed above (as they must!).
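The principal-point method can be written as a short routine and compared against the step-by-step values. This is a sketch that assumes the two-lens distance formulas for feff, VH, and H′V′; the function name `image_via_principal_points` is illustrative.

```python
from fractions import Fraction


def image_via_principal_points(ov, f1, f2, t):
    """Locate the image using the equivalent thin lens at the principal points.

    ov is the object distance measured to the front vertex V.
    Returns (V'O', M_T): image distance from the rear vertex and magnification.
    """
    d = (f1 + f2) - t
    f_eff = f1 * f2 / d
    vh = f1 * t / d                        # front vertex to object-space principal point
    hv_prime = f2 * t / d                  # image-space principal point to rear vertex
    z = ov + vh                            # OH
    z_prime = z * f_eff / (z - f_eff)      # H'O' from the Gaussian imaging equation
    return z_prime - hv_prime, -z_prime / z


vo_prime, m_t = image_via_principal_points(
    Fraction(1000), Fraction(100), Fraction(50), Fraction(75)
)
print(vo_prime, m_t)
```

With exact fractions the routine returns the same image distance and magnification as the lens-by-lens trace.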
2.13.7
Cardinal Points
The object-space and image-space focal and principal points are four of the six so-called cardinal
points that determine the paraxial properties of an imaging system. There are three pairs of locations
where one of each pair is in object space and the other is in image space. The object- and image-space focal points are F and F′, while the principal points H and H′ are the locations on the axis
in object and image space that are images of each other with transverse magnification MT = +1.
The nodal points N and N0 are the points in object and image space where the ray angle of the
entering and exiting rays are identical, which means that the angular magnification of rays “into”
and “out of” the nodal points is Mθ = +1. The principal and nodal points coincide for systems with
the object and image spaces in the same medium (e.g., both object space and image space in air).
A table of significant points on the axis of a paraxial system is given below:
Axial Point             Object Space (front)   Image Space (back)    Conjugate Points? (object and image?)
Focal Points            F                      F′                    No
Nodal Points            N                      N′                    Yes: Mθ = +1
Principal Points        H                      H′                    Yes: MT = +1
Vertices                V                      V′                    No
Object/Image            O                      O′                    Yes: MT = −z′/z = −H′O′/OH
Entrance/Exit Pupils    E                      E′                    Yes, MT varies
“Equal Conjugates”      OH = 2feff             z2′ = H′O′ = 2feff    Yes: MT = −1
2.13.8
Lenses separated by t = f1 + f2 : Afocal System (Telescope)
If the two lenses are separated by the sum of the focal lengths, then an object at ∞ forms an image
at ∞; the system focal length is infinite. Since the focal points are both located at infinity, we say
that the system is afocal; it has zero power, i.e., a bundle of parallel rays that enters the system
exits as a bundle of parallel rays. If the focal length of the first lens is longer than that of the second, the system is a
telescope.
Two thin lenses separated by the sum of their focal lengths. An object located an infinite distance
from the first lens forms an “intermediate” image at the image-space focal point f1′ of the first lens.
The second lens forms an image at infinity. Both object- and image-space focal lengths of the
equivalent system are infinite: f = f′ = ∞. The system has “no” focal points; it is afocal.
The focal length of this system is:
$$\frac{1}{f_{\mathrm{eff}}} = 0 \implies \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 \cdot f_2} = \frac{f_1 + f_2}{f_1 f_2} - \frac{t}{f_1 f_2} = 0 \implies t = f_1 + f_2$$
which shows that the separation between the two lenses is t = f1 + f2.
Angular Magnification of a Telescope
The telescope has infinite focal length and therefore no “power,” but you already know that it does
“something.” Consider the system’s effect on a ray that enters the first lens at its center at angle θ,
so it is transmitted through the lens with no change in angle. The ray crosses the axis at
the first lens and travels the distance z2 = f1 + f2 to the second lens, where it is deviated to make
the angle θ′ with the optical axis. We need to relate θ and θ′ to evaluate the angular magnification.
Angular magnification of a telescope: the red ray strikes the center of the first lens at angle θ and
is transmitted without deviation (because the sides are parallel at the center and the lens is thin).
The ray is deviated by the second lens at angle θ0 . The angular magnification is the ratio of these
two angles.
From the figure, note that the angle of the entering ray is positive and that of the exiting ray is
negative. The angle of the entering ray may be determined from the triangle “between” the lenses
with sides (f1 + f2) and h:
$$\tan[\theta] = \frac{h}{f_1 + f_2} \cong \theta$$
To find the exiting angle θ′, we need to find the distance from the second lens to the point where
the ray crosses the axis. This is easy to find using the imaging equation for a thin lens in air:
$$\frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2} \implies z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2}$$
where the object distance z2 is the distance between the lenses:
$$z_2 = t = f_1 + f_2$$
so the image distance for the red ray is:
$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(f_1 + f_2) \cdot f_2}{(f_1 + f_2) - f_2} = (f_1 + f_2) \cdot \frac{f_2}{f_1}$$
The angle θ′ satisfies the condition:
$$\tan[\theta'] = -\frac{h}{z_2'} = -\frac{h}{(f_1 + f_2) \cdot \frac{f_2}{f_1}} = -\frac{f_1}{f_2} \cdot \frac{h}{f_1 + f_2} \cong \theta'$$
So the angular magnification is:
$$M_\theta = \frac{\theta'}{\theta} \cong \frac{-\frac{f_1}{f_2} \cdot \left( \frac{h}{f_1 + f_2} \right)}{\left( \frac{h}{f_1 + f_2} \right)} = -\frac{f_1}{f_2}$$
where the negative sign means that the two angles have different algebraic signs. In words, the
angular magnification of a telescope is the ratio of the focal lengths of the lenses. If the two lenses
are both positive (Keplerian telescope), then the angular magnification is negative. If the objective
(first lens) has positive power and the ocular (second lens) has negative power (Galilean telescope), then
the angular magnification is positive.
The angular magnification shows that two distant objects separated by a small angle (as a double
star in the sky) will be separated by a larger angle if viewed through a telescope.
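The ratio −f1/f2 can be checked numerically by following the same ray through the center of the first lens. The sketch below assumes a small entering angle (the default value of `theta` is an arbitrary illustrative choice) and uses the thin-lens equation for the second lens exactly as in the derivation.

```python
import math


def telescope_angular_magnification(f1, f2, theta=1e-3):
    """Trace the ray through the center of L1 for an afocal pair (t = f1 + f2)."""
    t = f1 + f2
    h = t * math.tan(theta)            # ray height at L2
    z2 = t                             # the ray crossed the axis at L1
    z2_prime = z2 * f2 / (z2 - f2)     # where the ray crosses the axis after L2
    theta_prime = math.atan(-h / z2_prime)
    return theta_prime / theta


print(telescope_angular_magnification(100.0, 25.0))   # Keplerian: close to -4
print(telescope_angular_magnification(100.0, -25.0))  # Galilean: close to +4
```

The small departure from exactly ±4 comes from using a finite angle rather than the paraxial limit.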
2.13.9
Positive Lenses Separated by t = f1 or t = f2
We now continue the sequence of examples for two positive lenses separated by increasing distances.
If two positive lenses are separated by the focal length of the first lens, then the focal length of the
system is:
$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - f_1} = \frac{f_1 \cdot f_2}{f_2} = f_1 \quad (\text{if } t = f_1)$$
In words, the focal length of a system of two lenses separated by the focal length of the first lens is
equal to the focal length of the first lens.
If the two lenses are separated by the focal length of the second lens, then the system focal length
is f2.
$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - f_2} = \frac{f_1 \cdot f_2}{f_1} = f_2 \quad (\text{if } t = f_2)$$
Recall that the transverse magnification is approximately proportional to the focal length if the
object is distant:
$$\begin{aligned}
M_T = -\frac{z'}{z} &= -\frac{\left( \frac{z \cdot f}{z - f} \right)}{z} = -f \cdot \frac{1}{z - f} = -\frac{f}{z} \cdot \left( \frac{1}{1 - \frac{f}{z}} \right) \\
&= -\frac{f}{z} \cdot \sum_{n=0}^{+\infty} \left( \frac{f}{z} \right)^{n} = -\sum_{n=0}^{+\infty} \left( \frac{f}{z} \right)^{n+1} \\
&\cong -\frac{f}{z} \propto -f \quad \text{if } z \gg f
\end{aligned}$$
where the formula for the converging geometric series has been used. In words, the transverse
magnification of a distant object formed by an imaging system is approximately proportional to the
focal length (which is why long focal lengths are used to image distant objects).
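A quick numerical comparison shows how fast the series converges for a distant object. The sketch below compares the exact magnification with partial sums of the geometric series; the function names and the sample values z = 1100 mm and f = 100 mm are illustrative choices.

```python
def mt_exact(z, f):
    """Exact thin-lens transverse magnification, M_T = -f / (z - f)."""
    return -f / (z - f)


def mt_series(z, f, terms):
    """Partial sum of M_T = -sum_{n>=0} (f/z)^(n+1); valid for z > f."""
    r = f / z
    return -sum(r ** (n + 1) for n in range(terms))


z, f = 1100.0, 100.0  # mm
print("exact      :", mt_exact(z, f))
print("first term :", mt_series(z, f, 1))   # the approximation -f/z
print("five terms :", mt_series(z, f, 5))
```

Keeping only the first term underestimates the magnitude of the magnification by roughly the fraction f/z, which is small for a distant object.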
For the purpose of this example, we analyze the second case because it is the basis for probably
the most common application of imaging optics. The extension to the first case is trivial. Since
the focal length of the system is identical to the focal length of the second lens, the natural
question is how the image changes if the front lens is added.
Effect of adding lens L1 at the object-space focal point of lens L2 , so that t = f2 and feff = f2 . The
upper sketch is the lens L2 alone, and the lower drawing shows the situation with L1 added.
Consider a specific case with f2 = 100 mm and f1 = 200 mm. If only L2 is present and the object
distance is z2 = 1100 mm, then the image distance is:
$$z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{1100\,\mathrm{mm}} \right)^{-1} = 110\,\mathrm{mm}$$
The associated transverse magnification is:
$$(M_T)_{L_2\ \mathrm{alone}} = -\frac{z_2'}{z_2} = -\frac{+110\,\mathrm{mm}}{+1100\,\mathrm{mm}} = -\frac{1}{10}$$
Now add L1 at the front focal point of L2 and find the associated image. The object distance to
L1 is 1100 mm − 100 mm = 1000 mm. The first lens forms an image at distance:
$$z_1' = \left( \frac{1}{f_1} - \frac{1}{z_1} \right)^{-1} = \left( \frac{1}{200\,\mathrm{mm}} - \frac{1}{1000\,\mathrm{mm}} \right)^{-1} = 250\,\mathrm{mm}$$
with transverse magnification:
$$(M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+250\,\mathrm{mm}}{+1000\,\mathrm{mm}} = -\frac{1}{4}$$
The object distance to the second lens is:
$$z_2 = t - z_1' = 100\,\mathrm{mm} - 250\,\mathrm{mm} = -150\,\mathrm{mm}$$
and the resulting image distance behind lens L2 is:
$$z_2' = \left( \frac{1}{f_2} - \frac{1}{z_2} \right)^{-1} = \left( \frac{1}{100\,\mathrm{mm}} - \frac{1}{-150\,\mathrm{mm}} \right)^{-1} = +60\,\mathrm{mm}$$
Compare the image distances behind lens L2 and the system focal lengths without and with L1 in
the system:
$$z_2'\,(\text{without } L_1) = \mathrm{V'O'}\,(\text{without } L_1) = +110\,\mathrm{mm} > \mathrm{V'O'}\,(\text{with } L_1) = +60\,\mathrm{mm}$$
the image has moved “closer” to lens L2.
$$f_{\mathrm{eff}}\,(\text{without } L_1) = 100\,\mathrm{mm} = f_{\mathrm{eff}}\,(\text{with } L_1)$$
Now check the other attributes of the image. Recall that MT = −0.1 if using L2 alone. If using
both lenses, the transverse magnification of the image formed by the second lens is:
$$(M_T)_2 = -\frac{60\,\mathrm{mm}}{-150\,\mathrm{mm}} = +\frac{2}{5}$$
The magnification of the system is the product of the magnifications due to each lens:
$$M_T\ \text{for system with } L_1 \text{ and } L_2 = (M_T)_1 \cdot (M_T)_2 = \left( -\frac{1}{4} \right) \left( +\frac{2}{5} \right) = -\frac{1}{10} = M_T\ \text{for } L_2 \text{ alone}$$
$$M_T\,(\text{without } L_1) = M_T\,(\text{with } L_1) \quad \text{if } t = f_2$$
which is the same as for lens L2 alone! The transverse magnification of the system is not
changed by the addition of lens L1 with focal length f1 placed at the front focal point
of lens L2. If f1 > 0, the image distance measured from L2 is shorter if L1 is present than if L1
is missing. Conversely, if the first lens has negative power (f1 < 0), the image distance measured
from L2 is longer if L1 is present than if L1 is missing. Put another way, the addition of lens L1
located at the object-space focal point of lens L2 moves the principal points and focal points by
equal distances either “forward” (towards L2) if f1 > 0 or “backwards” (farther from L2) if f1 < 0,
but the focal length is unchanged. This system demonstrates the principle of eyeglass lenses,
where the ideal location for the corrective lens is at the object-space focal point of the eyelens (this
is the reason that eyeglasses are “on your nose”). The corrective action of a negative lens L1 placed
at the front focal point of L2 moves the image location “backwards” (away from L2) to correct
“nearsightedness” without changing the transverse magnification of the imaging system. A positive
lens L1 placed at the front focal point of L2 will move the image “forwards” (towards L2) to correct
“farsightedness.”
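The eyeglass result can be tested for both signs of f1 with a short trace. In this sketch L1 sits at the front focal point of L2 (t = f2); the helper names and the choice f1 = −200 mm for the negative lens are illustrative assumptions, not values from the notes.

```python
def thin_lens(z, f):
    """Return (image distance, transverse magnification) for a thin lens in air."""
    z_prime = z * f / (z - f)
    return z_prime, -z_prime / z


def image_with_front_lens(z_to_l2, f1, f2):
    """Image distance behind L2 and system M_T when L1 is placed at t = f2 in front of L2."""
    t = f2
    z1 = z_to_l2 - t                       # object distance measured to L1
    z1_prime, m1 = thin_lens(z1, f1)
    z2_prime, m2 = thin_lens(t - z1_prime, f2)
    return z2_prime, m1 * m2


alone = thin_lens(1100.0, 100.0)
positive = image_with_front_lens(1100.0, 200.0, 100.0)
negative = image_with_front_lens(1100.0, -200.0, 100.0)
for label, (dist, mag) in (("L2 alone", alone), ("f1 = +200", positive), ("f1 = -200", negative)):
    print(f"{label:10s} image at {dist:7.2f} mm, M_T = {mag:+.3f}")
```

In all three cases the magnification stays at −0.1, while the image moves toward L2 for the positive lens and away from L2 for the negative lens.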
2.13.10
Positive Lenses Separated by t > f1 + f2
If the two positive lenses are separated by more than the sum of the focal lengths, the focal length
of the resulting system is negative:
$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} < 0$$
If the object distance is ∞, the first lens forms an “intermediate” image at its image-space focal
point, i.e., at z1′ = f1. Since the object distance z2 measured from the second lens is larger than f2, a
“real” image is formed by the second lens at the system focal point F′. If we extend the exiting ray
until it intersects the incoming ray from the object at infinity, we can locate the equivalent single
thin lens for the system, i.e., the image-space principal point H′. In this case, this is located farther
from the second lens than the focal point. The effective focal length feff = H′F′ < 0, so the system
has negative power.
The system composed of two thin lenses separated by t > f1 + f2. The image-space focal point F′ of
the system is beyond the second lens, but the image-space principal point H′ is located even farther
from L2. The distance H′F′ = feff < 0, so the system has negative power!
2.13.11
Compound Microscopes
We have already discussed the simple magnifier, where the object is located closer to the positive
lens than the focal length, thus forming a larger upright virtual image close to the near point of
the eye. In the compound magnifier (more commonly called the compound microscope) formed from
two lenses, the objective and eyelens generally have a short positive focal length and a longer focal
length, respectively. The focal points of the two lenses are separated by a fixed distance, the “tube
length,” which is now standardized by the Royal Microscopical Society as t = 160 mm, though some
companies manufacture other lengths (e.g., Leitz with t = 170 mm). Although it does not matter in this
class, it is important to ensure that the objective is used with the correct tube length to minimize
aberrations in the final image.
Modern microscope systems are often “infinity corrected,” which means that the object is located
in the front focal plane of the objective so that the rays emerging are parallel (collimated). This
feature allows a beamsplitter to be introduced in the light path for a second eyelens, camera, or other
apparatus. A lens within the microscope tube (the “tube lens,” duh) creates an intermediate image
that is viewed by the eyelens. In more traditional microscopes, the object typically is located just
beyond the focal point of the short-focal-length positive objective lens (so that the object distance
z1 ≃ f1), thus forming a large real inverted image that is positioned at the front focal point of the
ocular (eye lens). The eye lens then forms an image at infinity, i.e., the parallel rays emerging from
the ocular are viewed by a relaxed eye.
Microscope objectives and eyepieces are labeled by “magnifying powers,” e.g. 10X - 40X for the
objective and 10X for the ocular. The total magnification is the product, so that a 10X objective
and 10X ocular yield a magnification of 100X.
The magnifying power of an objective with focal length f1 and tube length 160 mm is:
$$M_1 = -\frac{160\,\mathrm{mm}}{f_1}$$
For example, objectives with these focal lengths have magnifying powers (in magnitude):
$$f_1 = 16\,\mathrm{mm} \implies |M_1| = 10\mathrm{X}$$
$$f_1 = 1.6\,\mathrm{mm} \implies |M_1| = 100\mathrm{X}$$
The magnifying power of the eyelens is calculated from the same formula used for the simple magnifier:
$$(M_\theta)_2 = \frac{250\,\mathrm{mm}}{f_2}$$
with sample value:
$$f_2 = 25.4\,\mathrm{mm} \implies M_2 \cong 10\mathrm{X}$$
The magnifying power of the compound microscope is the product of the two magnifying powers:
$$\mathrm{M.P.} = (M_\theta)_1 \cdot (M_\theta)_2 = \frac{-160\,\mathrm{mm}}{f_1} \cdot \frac{250\,\mathrm{mm}}{f_2} = \frac{-160\,\mathrm{mm}}{1.6\,\mathrm{mm}} \cdot \frac{250\,\mathrm{mm}}{25.4\,\mathrm{mm}} \cong -1000\mathrm{X}$$
where again the negative sign means that the image is inverted.
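The magnifying-power product is easy to tabulate for other objectives and oculars. The sketch below assumes the 160 mm tube length and the 250 mm near-point distance used in the formulas above; the function names and default parameters are illustrative.

```python
def objective_power(f1_mm, tube_length_mm=160.0):
    """Magnifying power of the objective, M1 = -(tube length) / f1."""
    return -tube_length_mm / f1_mm


def eyepiece_power(f2_mm, near_point_mm=250.0):
    """Magnifying power of the eyelens from the simple-magnifier formula, 250 mm / f2."""
    return near_point_mm / f2_mm


def microscope_power(f1_mm, f2_mm):
    """Magnifying power of the compound microscope as the product of the two powers."""
    return objective_power(f1_mm) * eyepiece_power(f2_mm)


print(round(microscope_power(1.6, 25.4), 1))   # about -984, i.e. nominally -1000X
print(round(microscope_power(16.0, 25.4), 1))  # about -98, i.e. nominally -100X
```

The exact product for f1 = 1.6 mm and f2 = 25.4 mm is about −984, which the nominal labels round to −1000X.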
2.13.12
Two Positive Lenses with Different Focal Lengths and Different
Separations
From the list of distances for a two-lens system:
$$\begin{aligned}
f_{\mathrm{eff}} = \mathrm{H'F'} = \mathrm{FH} &= \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} \\
\mathrm{BFL} = \mathrm{V'F'} &= \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} \\
\mathrm{H'V'} = \mathrm{H'F'} - \mathrm{V'F'} &= \frac{f_2 \cdot t}{(f_1 + f_2) - t} \\
\mathrm{FFL} = \mathrm{FV} &= \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} \\
\mathrm{VH} = \mathrm{FH} - \mathrm{FV} &= \frac{f_1 \cdot t}{(f_1 + f_2) - t}
\end{aligned}$$
we can determine the impact of the lens separation t for the specific example f1 = +100 mm, f2 = +25 mm:
t                        FFL            BFL            feff
0 mm                     +20 mm         +20 mm         +20 mm
+25 mm = f2              0 mm           +18.75 mm      +25 mm = f2
+50 mm                   −33 1/3 mm     +16 2/3 mm     +33 1/3 mm
+75 mm                   −100 mm        +12.5 mm       +50 mm
+100 mm = f1             −300 mm        0 mm           +100 mm = f1
+125 mm = f1 + f2        ∞              ∞              ∞ (afocal)
+150 mm                  +500 mm        +50 mm         −100 mm
+175 mm                  +300 mm        +37.5 mm       −50 mm
The effect of varying the lens separation t on the effective focal length feff for f1 = +100 mm and
f2 = +25 mm, with a magnified view in (b). The system is afocal if t = f1 + f2 = 125 mm; feff > 0
for t < f1 + f2 and feff < 0 for t > f1 + f2 .
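The tabulated values follow directly from the three formulas for FFL, BFL, and feff. The sketch below regenerates them for the same separations; the function name `two_lens_row` and the print layout are illustrative.

```python
def two_lens_row(t, f1=100.0, f2=25.0):
    """Return (FFL, BFL, f_eff) in mm, or None for the afocal case t = f1 + f2."""
    d = (f1 + f2) - t
    if d == 0:
        return None
    return f1 * (f2 - t) / d, f2 * (f1 - t) / d, f1 * f2 / d


print(f"{'t':>6} {'FFL':>10} {'BFL':>10} {'f_eff':>10}")
for t in (0, 25, 50, 75, 100, 125, 150, 175):
    row = two_lens_row(t)
    if row is None:
        print(f"{t:>6} {'afocal':>10}")
    else:
        ffl, bfl, feff = row
        print(f"{t:>6} {ffl:>10.2f} {bfl:>10.2f} {feff:>10.2f}")
```

The sign change of feff on either side of t = 125 mm appears in the last column of the output.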
2.13.13
Systems of One Positive and One Negative Lens
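We also consider the case where f1 = +100 mm and f2 = −25 mm. The focal length for t = 0 is:
$$f_{\mathrm{eff}} = \left( \frac{1}{f_1} + \frac{1}{f_2} \right)^{-1} = \left( \frac{1}{+100\,\mathrm{mm}} + \frac{1}{-25\,\mathrm{mm}} \right)^{-1} = \frac{-100}{3}\,\mathrm{mm} \cong -33.33\,\mathrm{mm}$$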
The system focal length is negative for t < f1 + f2 = 75 mm, the system is afocal for t = 75 mm, and
the focal length is positive for t > 75 mm.
The effect of varying the lens separation t on the effective focal length feff for f1 = +100 mm and
f2 = −25 mm, with a magnified view in (b). The system is afocal if t = f1 + f2 = 75 mm; feff < 0
for t < f1 + f2 and feff > 0 for t > f1 + f2 .
2.13.14
Newtonian Form of Imaging Equation
We have already seen the familiar Gaussian form of the imaging equation:
$$\frac{1}{z} + \frac{1}{z'} = \frac{1}{f}$$
An equivalent form is obtained by defining the distances x and x0 that are the differences between
the object and image distances and the focal length:
z = x + f =⇒ x = z − f
z 0 = x0 + f =⇒ x0 = z 0 − f
In the case of a real object O and real image O0 as shown in the figure, both x and x0 are positive.
The definition of the parameters x, x0 in the Newtonian form of the imaging equation. For a real
image, both x and x0 are positive.
By simple substitution into the imaging equation, we obtain:
$$\begin{aligned}
\frac{1}{f} &= \frac{1}{x + f} + \frac{1}{x' + f} = \frac{(x' + f) + (x + f)}{(x + f) \cdot (x' + f)} = \frac{x + x' + 2f}{x x' + (x + x') f + f^2} \\
\implies f &= \frac{x x' + (x + x') \cdot f + f^2}{(x + x') + 2f} \\
\implies x \cdot x' + f^2 &= 2 f^2 \\
\implies x \cdot x' &= f^2
\end{aligned}$$
This is the Newtonian form of the imaging equation. The same expression applies for virtual images,
but the sign of the distances must be adjusted, as shown:
The parameters x, x0 of the Newtonian form for a virtual image.
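The Newtonian relation is easy to confirm numerically for both a real and a virtual image. The sketch below takes the Gaussian image distance and forms the product of the distances measured from the focal points; the function name and the sample object distances are illustrative.

```python
from fractions import Fraction


def newtonian_product(z, f):
    """Return x * x' where x = z - f and x' = z' - f, with z' from 1/z + 1/z' = 1/f."""
    z_prime = z * f / (z - f)
    x, x_prime = z - f, z_prime - f
    return x * x_prime


f = Fraction(100)
print(newtonian_product(Fraction(1100), f))  # real image: x = 1000, x' = 10
print(newtonian_product(Fraction(50), f))    # virtual image: x = -50, x' = -200
```

Both products equal f² = 10000 mm²; in the virtual-image case x and x′ are both negative under this sign convention.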
2.13.15
Example (1) of Two-Lens System
Find the cardinal points of the two-lens system
f1 = +100 mm
f2 = +25 mm
t = +50 mm
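The effective focal length is:
$$f_{\mathrm{eff}} = \frac{f_1 \cdot f_2}{(f_1 + f_2) - t} = \frac{100\,\mathrm{mm} \cdot 25\,\mathrm{mm}}{100\,\mathrm{mm} + 25\,\mathrm{mm} - 50\,\mathrm{mm}} = +\frac{100}{3}\,\mathrm{mm} = +33\tfrac{1}{3}\,\mathrm{mm}$$
Now find the location of the focal point from the formula for the back focal length:
$$\mathrm{BFL} = \mathrm{V'F'} = \frac{f_2 \cdot (f_1 - t)}{(f_1 + f_2) - t} = \frac{25\,\mathrm{mm} \cdot (50\,\mathrm{mm} - 100\,\mathrm{mm})}{50\,\mathrm{mm} - (100\,\mathrm{mm} + 25\,\mathrm{mm})} = \frac{50}{3}\,\mathrm{mm}$$
Alternatively, we can track a ray from infinity through the system. The image distance from the
first lens is f1 = +100 mm, so the object distance to the second lens is
$$z_2 = t - f_1 = 50\,\mathrm{mm} - 100\,\mathrm{mm} = -50\,\mathrm{mm}$$
The image distance from the second lens is:
$$z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-50\,\mathrm{mm}) \cdot (+25\,\mathrm{mm})}{(-50\,\mathrm{mm}) - (+25\,\mathrm{mm})} = \frac{50}{3}\,\mathrm{mm} = \mathrm{V'F'}$$
(Parenthetically, this is half the effective focal length.)
We can now draw the image-space focal and principal points: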
To find the object-space focal point, we can evaluate the front focal length for f1 = +100 mm,
f2 = +25 mm, and t = +50 mm:
$$\mathrm{FFL} = \mathrm{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\,\mathrm{mm}) \cdot (25\,\mathrm{mm} - 50\,\mathrm{mm})}{(100\,\mathrm{mm} + 25\,\mathrm{mm}) - 50\,\mathrm{mm}} = -\frac{100}{3}\,\mathrm{mm}$$
which says that the object-space focal point is to the right of the object-space vertex. From the
effective focal length, we can locate the object-space principal point:
$$\begin{aligned}
\mathrm{FH} &= f_{\mathrm{eff}} = +\frac{100}{3}\,\mathrm{mm} \\
\mathrm{FV} &= \mathrm{FH} + \mathrm{HV} \\
-\frac{100}{3}\,\mathrm{mm} &= +\frac{100}{3}\,\mathrm{mm} + \mathrm{HV} \\
\implies \mathrm{HV} &= -\frac{100}{3}\,\mathrm{mm} - \frac{100}{3}\,\mathrm{mm} = -\frac{200}{3}\,\mathrm{mm}
\end{aligned}$$
Alternatively, we “turn the system around” and bring in light from the left. The image distance
from the “first lens” (actually L2) is equal to its focal length:
$$z_1' = f_2 = +25\,\mathrm{mm}$$
So the object distance to the lens with f1 = +100 mm is:
$$z_2 = t - z_1' = 50\,\mathrm{mm} - 25\,\mathrm{mm} = +25\,\mathrm{mm}$$
So the distance from this lens to the image-space focal point of the reversed system is:
$$z_2' = \frac{z_2 \cdot f_1}{z_2 - f_1} = \frac{(+25\,\mathrm{mm}) \cdot (+100\,\mathrm{mm})}{(+25\,\mathrm{mm}) - (+100\,\mathrm{mm})} = -\frac{100}{3}\,\mathrm{mm}$$
The object-space focal point is virtual and the object-space principal point is located at the distance feff
behind it in the reversed system.
We can now reverse the second case and plot the four cardinal points ( F, F0 , H, H0 ) on the same
graph:
Object-space and image-space cardinal points for two-lens system with f1 = +100 mm,
f2 = +25 mm, t = +50 mm. The ray from infinity on the object side is in red, that from infinity on
the image side is in blue.
In this case, the object-space focal point F just happens to coincide with the image-space principal
point H0 and the same is true for the object-space principal point H and the image-space focal point
F0 . This is of no real significance, since the two spaces are independent.
Images from System: (1) Object at Object-Space Focal Point
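An object located at the object-space (“front”) focal point of the system is at the distance equal to
the FFL from the first lens. In this case:
$$\begin{aligned}
z_1 &= \mathrm{FFL} = -\frac{100}{3}\,\mathrm{mm} \\
z_1' &= \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{\left( -\frac{100}{3}\,\mathrm{mm} \right) \cdot 100\,\mathrm{mm}}{\left( -\frac{100}{3}\,\mathrm{mm} \right) - (100\,\mathrm{mm})} = +25\,\mathrm{mm} \\
z_2 &= t - z_1' = +50\,\mathrm{mm} - 25\,\mathrm{mm} = 25\,\mathrm{mm}
\end{aligned}$$
which is the same as the focal length of the second lens, which means that the image distance from
the second lens is infinite (as expected).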
Images from System: (2) Object at Object-Space Principal Point
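An object located at the object-space (“front”) principal point of the system is at the distance
FFL − feff from the first lens. In this case:
$$\begin{aligned}
z_1 &= \mathrm{FFL} - f_{\mathrm{eff}} = -\frac{100}{3}\,\mathrm{mm} - \frac{100}{3}\,\mathrm{mm} = -\frac{200}{3}\,\mathrm{mm} \\
z_1' &= \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{\left( -\frac{200}{3}\,\mathrm{mm} \right) \cdot 100\,\mathrm{mm}}{\left( -\frac{200}{3}\,\mathrm{mm} \right) - (100\,\mathrm{mm})} = +40\,\mathrm{mm} \\
(M_T)_1 &= -\frac{z_1'}{z_1} = -\frac{40\,\mathrm{mm}}{-\frac{200}{3}\,\mathrm{mm}} = +\frac{3}{5} \\
z_2 &= t - z_1' = +50\,\mathrm{mm} - 40\,\mathrm{mm} = +10\,\mathrm{mm} \\
z_2' &= \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{10\,\mathrm{mm} \cdot 25\,\mathrm{mm}}{10\,\mathrm{mm} - 25\,\mathrm{mm}} = -\frac{50}{3}\,\mathrm{mm} \\
(M_T)_2 &= -\frac{z_2'}{z_2} = -\frac{-\frac{50}{3}\,\mathrm{mm}}{10\,\mathrm{mm}} = +\frac{5}{3}
\end{aligned}$$
The system magnification for that object distance is the product of the two:
$$(M_T)_{\mathrm{system}} = (M_T)_1 \cdot (M_T)_2 = \left( +\frac{3}{5} \right) \cdot \left( +\frac{5}{3} \right) = +1$$
as expected for the object and image at the principal points.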
Images from System: (3) Equal Conjugates
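If we move the object so that it is one focal length from the focal point and two focal lengths from
the principal point, the object distance is:
$$\begin{aligned}
z_1 &= \mathrm{FFL} + f_{\mathrm{eff}} = -\frac{100}{3}\,\mathrm{mm} + \frac{100}{3}\,\mathrm{mm} = 0\,\mathrm{mm} \\
z_1' &= 0\,\mathrm{mm} \\
(M_T)_1 &= +1 \\
z_2 &= t - z_1' = +50\,\mathrm{mm} - 0\,\mathrm{mm} = +50\,\mathrm{mm} \\
z_2' &= \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{+50\,\mathrm{mm} \cdot 25\,\mathrm{mm}}{+50\,\mathrm{mm} - 25\,\mathrm{mm}} = +50\,\mathrm{mm} \\
(M_T)_2 &= -\frac{z_2'}{z_2} = -\frac{50\,\mathrm{mm}}{50\,\mathrm{mm}} = -1 \\
(M_T)_{\mathrm{system}} &= (M_T)_1 \cdot (M_T)_2 = (+1) \cdot (-1) = -1
\end{aligned}$$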
as expected for the object and image at the equal-conjugate points.
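The three object positions of this example can be run through one small tracing routine. The sketch below treats the two special cases explicitly (an object at a lens, and an object at a front focal point, whose image is at infinity); the helper names and the use of `None` to mark an image at infinity are illustrative choices.

```python
from fractions import Fraction


def thin_lens(z, f):
    """Image distance and magnification for one thin lens; None marks an image at infinity."""
    if z == 0:                    # object at the lens is imaged onto itself with M = +1
        return Fraction(0), Fraction(1)
    if z == f:                    # object at the front focal point of the lens
        return None, None
    z_prime = z * f / (z - f)
    return z_prime, -z_prime / z


def trace(z1, f1=Fraction(100), f2=Fraction(25), t=Fraction(50)):
    """Return (image distance from L2, system M_T) for an object z1 in front of L1."""
    z1_prime, m1 = thin_lens(z1, f1)
    if z1_prime is None:
        return None, None         # collimated between the lenses; not handled in this sketch
    z2_prime, m2 = thin_lens(t - z1_prime, f2)
    if z2_prime is None:
        return None, None
    return z2_prime, m1 * m2


print(trace(Fraction(-100, 3)))   # object at F: image at infinity
print(trace(Fraction(-200, 3)))   # object at H: image at H' with M_T = +1
print(trace(Fraction(0)))         # equal conjugates: M_T = -1
```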
2.13.16
Example (2) of Two-Lens System: Telephoto Lens
Now consider a system composed of a positive lens and a negative lens separated by just a bit more
than the sum of the focal lengths: f1 = +100 mm, f2 = −25 mm, and t = +80 mm. The focal length
of the equivalent thin lens is feff = 500 mm:
\[ f_\text{eff} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{100\,\text{mm} \cdot (-25\,\text{mm})}{100\,\text{mm} + (-25\,\text{mm}) - 80\,\text{mm}} = +500\,\text{mm} \]
Note that the focal length of the system is MUCH longer than the focal lengths of either lens.
Now locate the image-space focal point and principal point. For an object located at ∞, the
BFL is found by substitution into the appropriate equation:
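\[ \mathrm{BFL} = \overline{V'F'} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{(100\,\text{mm} - 80\,\text{mm}) \cdot (-25\,\text{mm})}{(100\,\text{mm} + (-25\,\text{mm})) - 80\,\text{mm}} = 100\,\text{mm} \]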
The image of an object at ∞ is located 100 mm behind the second lens, and thus 180 mm behind
the first lens; this distance VF′ = 180 mm is the physical length, which is MUCH shorter than the
focal length of 500 mm. This is the advantage of a telephoto lens: the focal length is much longer
than the lens itself.
The location of the image-space principal point is determined from the back and equivalent focal
lengths:
\[ \overline{H'F'} = \overline{H'V'} + \overline{V'F'} \]
\[ 500\,\text{mm} = \overline{H'V'} + 100\,\text{mm} \]
\[ \overline{H'V'} = +400\,\text{mm} \]
\[ \overline{H'V} = \overline{H'V'} - \overline{VV'} = 400\,\text{mm} - 80\,\text{mm} = +320\,\text{mm} \]
so the image-space principal point is located 320 mm in front of the object-space vertex V. A sketch
of the system and the image-space cardinal points is shown below:
Image-space focal and principal points of the telephoto system. The equivalent focal length of the
system is feff = +500 mm, but the image-space focal point is only +100 mm behind the rear vertex
V′. The image-space principal point is 500 mm in front of the focal point.
The object-space focal point is located by applying the expression for the “front focal distance”:
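\[ \mathrm{FFL} = \overline{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(+100\,\text{mm}) \cdot ((-25\,\text{mm}) - 80\,\text{mm})}{(100\,\text{mm} + (-25\,\text{mm})) - 80\,\text{mm}} = +2100\,\text{mm} \]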
which is far in front of the object-space vertex V. The object-space principal point is found from:
FH = FV + VH
+500 mm = +2100 mm + VH
VH = 500 mm − 2100 mm = −1600 mm =⇒ HV = −VH = +1600 mm
So the object-space principal point is very far in front of the first vertex.
Object-space focal and principal points of the telephoto system. Both are located far ahead of the
front vertex V.
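The cardinal-point arithmetic for the telephoto example can be checked numerically. The following is a minimal sketch, assuming two thin lenses in air and the expressions for feff, BFL, and FFL quoted above; the function name `two_lens_cardinal` is illustrative and does not come from these notes. Exact fractions are used so that values such as 100/3 mm are not rounded.

```python
from fractions import Fraction


def two_lens_cardinal(f1, f2, t):
    """Return (f_eff, BFL, FFL) in the same units as the inputs.

    f_eff = f1*f2 / (f1 + f2 - t)
    BFL   = (f1 - t)*f2 / (f1 + f2 - t)   (rear vertex V' to F')
    FFL   = f1*(f2 - t) / (f1 + f2 - t)   (F to front vertex V)
    """
    f1, f2, t = Fraction(f1), Fraction(f2), Fraction(t)
    denom = f1 + f2 - t
    if denom == 0:
        raise ValueError("f1 + f2 = t: the system is afocal")
    return f1 * f2 / denom, (f1 - t) * f2 / denom, f1 * (f2 - t) / denom


# Telephoto example from the text (all distances in mm).
f_eff, bfl, ffl = two_lens_cardinal(100, -25, 80)
hp_vp = f_eff - bfl   # H'V' = H'F' - V'F'
vh = f_eff - ffl      # VH  = FH  - FV
print(f_eff, bfl, ffl, hp_vp, vh)
```

The same function applies to the other two-lens examples in this section by changing the three arguments.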
We can locate the image of an object at a finite distance, say 3 m in front of the first lens (OV =
3000 mm), using three methods: (1) “brute-force” calculation, (2) the Gaussian imaging formula for
distances measured from the principal points, and (3) the Newtonian imaging equation.
(1) “Brute-Force Calculation”
The distance from the object to the first thin lens is 3000 mm, so the intermediate image distance
satisfies:
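\[ \frac{1}{z_1} + \frac{1}{z_1'} = \frac{1}{f_1} \]
\[ z_1' = \left(\frac{1}{100\,\text{mm}} - \frac{1}{3000\,\text{mm}}\right)^{-1} = \frac{3000}{29}\,\text{mm} \cong 103.45\,\text{mm} \]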
The transverse magnification of the image from the first lens is:
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{1}{29} \]
The object distance to the second lens is negative:
\[ z_2 = t - z_1' = 80\,\text{mm} - \frac{3000}{29}\,\text{mm} = -\frac{680}{29}\,\text{mm} \cong -23.45\,\text{mm} \]
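so the object is virtual. The image distance from the second lens is:
\[ \frac{1}{z_2} + \frac{1}{z_2'} = \frac{1}{f_2} \]
\[ z_2' = \left(-\frac{1}{25\,\text{mm}} - \left(-\frac{29}{680\,\text{mm}}\right)\right)^{-1} = +\frac{3400}{9}\,\text{mm} \cong +377.8\,\text{mm} \]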
The corresponding transverse magnification is:
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{+\tfrac{3400}{9}\,\text{mm}}{-\tfrac{680}{29}\,\text{mm}} = +\frac{145}{9} \cong +16.1 \]
The system magnification is the product of the component transverse magnifications:
\[ M_T = (M_T)_1 \cdot (M_T)_2 = -\frac{1}{29} \cdot \left(-\frac{+\tfrac{3400}{9}\,\text{mm}}{-\tfrac{680}{29}\,\text{mm}}\right) = -\frac{5}{9} \]
(2) Gaussian Formula
Now evaluate the same image using the Gaussian formula for distances measured from the principal
points. The distance from the object to the object-space principal point is:
z1 = OH = OV + VH = 3000 mm + (−1600 mm) = +1400 mm
The image distance measured from the image-space principal point is found from the Gaussian image
formula:
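\[ \frac{1}{z} + \frac{1}{z'} = \frac{1}{f_\text{eff}} \implies z' = \overline{H'O'} = \left(\frac{1}{f_\text{eff}} - \frac{1}{z}\right)^{-1} = \left(\frac{1}{500\,\text{mm}} - \frac{1}{1400\,\text{mm}}\right)^{-1} = +\frac{7000}{9}\,\text{mm} \cong 777.8\,\text{mm} \]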
The distance from the rear vertex to the image is found from the known value for H0 V0 = +400 mm:
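\[ \overline{V'O'} = \overline{H'O'} - \overline{H'V'} = +\frac{7000}{9}\,\text{mm} - 400\,\text{mm} = \frac{3400}{9}\,\text{mm} \cong 377.8\,\text{mm} \]
thus matching the distance obtained using “brute force”. The transverse magnification of the image
created by the system is:
\[ M_T = -\frac{z'}{z} = -\frac{+\tfrac{7000}{9}\,\text{mm}}{+1400\,\text{mm}} = -\frac{5}{9} \]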
(3) Newtonian Lens Formula
Now repeat the calculation for the image position using the Newtonian lens formula. The distance
from the object to the object-space focal point is:
x = OF = OV + VF = OV − FV = 3000 mm − 2100 mm = 900 mm
Therefore the distance from the image-space focal point to the image is:
\[ x' = \overline{F'O'} = \frac{f_\text{eff}^2}{x} = \frac{(500\,\text{mm})^2}{900\,\text{mm}} = \frac{2500}{9}\,\text{mm} \cong 277.8\,\text{mm} \]
So the distance from the rear (image-space) vertex V0 to the image is:
\[ \overline{V'O'} = \overline{V'F'} + \overline{F'O'} = 100\,\text{mm} + \frac{2500}{9}\,\text{mm} = \frac{3400}{9}\,\text{mm} \cong 377.8\,\text{mm} \]
which again agrees with the result obtained by the other two methods.
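The agreement of the three methods can be confirmed with exact arithmetic. This is a small sketch under the sign conventions of this section (z′ = z·f/(z − f), MT = −z′/z); the helper name `thin_lens_image` and the variable names are illustrative only, and the cardinal-point values are those computed above for the telephoto system.

```python
from fractions import Fraction as F

# Telephoto system from the text (mm) and its cardinal distances.
f1, f2, t = F(100), F(-25), F(80)
f_eff, bfl, ffl = F(500), F(100), F(2100)
ov = F(3000)  # object 3 m in front of the first lens


def thin_lens_image(z, f):
    """Image distance from 1/z + 1/z' = 1/f."""
    return z * f / (z - f)


# (1) Brute force: image through lens 1, then through lens 2.
z1p = thin_lens_image(ov, f1)
z2 = t - z1p                      # negative: virtual object for lens 2
brute = thin_lens_image(z2, f2)   # V'O'
mt_brute = (-z1p / ov) * (-brute / z2)

# (2) Gaussian formula with distances measured from H and H'.
vh = f_eff - ffl                  # VH  = -1600 mm
hp_vp = f_eff - bfl               # H'V' = +400 mm
z = ov + vh                       # OH
zp = thin_lens_image(z, f_eff)    # H'O'
gauss = zp - hp_vp                # V'O'
mt_gauss = -zp / z

# (3) Newtonian formula with distances measured from F and F'.
x = ov - ffl                      # OF
newton = bfl + f_eff**2 / x       # V'O' = V'F' + F'O'

print(brute, gauss, newton, mt_brute, mt_gauss)
```

All three image distances come out as the same fraction, which is the point of the comparison.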
2.13.17 Images from Telephoto System
Image (1): Object at Object-Space Focal Point
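\[ z_1 = \mathrm{FFL} = +2100\,\text{mm} \]
\[ z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(+2100\,\text{mm}) \cdot 100\,\text{mm}}{(+2100\,\text{mm}) - (100\,\text{mm})} = +105\,\text{mm} \]
\[ z_2 = t - z_1' = +80\,\text{mm} - 105\,\text{mm} = -25\,\text{mm} \]
which is the same as the focal length of the second lens, so the image distance from the second lens
is infinite (as expected):
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-25\,\text{mm}) \cdot (-25\,\text{mm})}{(-25\,\text{mm}) - (-25\,\text{mm})} = \infty \]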
Image (2) from Telephoto System: Object at Object-Space Principal Point
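\[ z_1 = \mathrm{FFL} - f_\text{eff} = 2100\,\text{mm} - 500\,\text{mm} = 1600\,\text{mm} \]
\[ z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(1600\,\text{mm}) \cdot 100\,\text{mm}}{(1600\,\text{mm}) - (100\,\text{mm})} = +\frac{320}{3}\,\text{mm} \]
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+\tfrac{320}{3}\,\text{mm}}{1600\,\text{mm}} = -\frac{1}{15} \]
The object distance to the second lens is:
\[ z_2 = t - z_1' = +80\,\text{mm} - \frac{320}{3}\,\text{mm} = -\frac{80}{3}\,\text{mm} \]
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{\left(-\tfrac{80}{3}\,\text{mm}\right) \cdot (-25\,\text{mm})}{\left(-\tfrac{80}{3}\,\text{mm}\right) - (-25\,\text{mm})} = -400\,\text{mm} \]
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-400\,\text{mm})}{\left(-\tfrac{80}{3}\,\text{mm}\right)} = -15 \]
\[ (M_T)_\text{system} = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{15}\right) \cdot (-15) = +1 \]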
which again confirms that the transverse magnification is that expected for the object and image at
the principal points.
Image (3) from Telephoto System: Equal Conjugates
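\[ z_1 = \mathrm{FFL} + f_\text{eff} = 2100\,\text{mm} + 500\,\text{mm} = 2600\,\text{mm} \]
\[ z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(+2600\,\text{mm}) \cdot 100\,\text{mm}}{(+2600\,\text{mm}) - (100\,\text{mm})} = +104\,\text{mm} \]
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{(+104\,\text{mm})}{(2600\,\text{mm})} = -\frac{1}{25} \]
\[ z_2 = t - z_1' = +80\,\text{mm} - 104\,\text{mm} = -24\,\text{mm} \]
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-24\,\text{mm}) \cdot (-25\,\text{mm})}{(-24\,\text{mm}) - (-25\,\text{mm})} = +600\,\text{mm} \]
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(+600\,\text{mm})}{(-24\,\text{mm})} = +25 \]
\[ (M_T)_\text{system} = (M_T)_1 \cdot (M_T)_2 = \left(-\frac{1}{25}\right) \cdot (25) = -1 \]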
2.13.18 Example (3) of Two-Lens System: Two Negative Lenses
Now consider a system composed of two negative lenses: f1 = −100 mm, f2 = −25 mm, and
t = +125 mm. The focal length of the equivalent thin lens is:
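\[ f_\text{eff} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \overline{H'F'} = \overline{FH} \]
\[ f_\text{eff} = \frac{(-100\,\text{mm}) \cdot (-25\,\text{mm})}{(-100\,\text{mm}) + (-25\,\text{mm}) - 125\,\text{mm}} = -10\,\text{mm} \]
Note that the focal length of the system is negative and shorter in magnitude than that of either lens.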
Now locate the image-space focal point and principal point. For an object located at ∞, the
BFL and FFL are found by substitution into the appropriate equation:
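\[ \mathrm{BFL} = \overline{V'F'} = \frac{(f_1 - t) \cdot f_2}{(f_1 + f_2) - t} = \frac{(-100\,\text{mm} - 125\,\text{mm}) \cdot (-25\,\text{mm})}{(-100\,\text{mm}) + (-25\,\text{mm}) - 125\,\text{mm}} = -\frac{45}{2}\,\text{mm} = -22.5\,\text{mm} \]
\[ \mathrm{FFL} = \overline{FV} = \frac{f_1 \cdot (f_2 - t)}{(f_1 + f_2) - t} = \frac{(-100\,\text{mm}) \cdot (-25\,\text{mm} - 125\,\text{mm})}{(-100\,\text{mm}) + (-25\,\text{mm}) - 125\,\text{mm}} = -60\,\text{mm} \]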
(1) Object at Object-Space Focal Point
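\[ z_1 = \mathrm{FFL} = -60\,\text{mm} \quad \text{(virtual object)} \]
\[ z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-60\,\text{mm}) \cdot (-100\,\text{mm})}{(-60\,\text{mm}) - (-100\,\text{mm})} = +150\,\text{mm} \]
\[ z_2 = t - z_1' = +125\,\text{mm} - 150\,\text{mm} = -25\,\text{mm} \]
which is the same as the focal length of the second lens, so the image distance from the second lens
is infinite (as expected):
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(-25\,\text{mm}) \cdot (-25\,\text{mm})}{(-25\,\text{mm}) - (-25\,\text{mm})} = \frac{625\,\text{mm}^2}{0\,\text{mm}} = \infty \]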
Images from System: (2) Object at Object-Space Principal Point
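\[ z_1 = \mathrm{FFL} - f_\text{eff} = -60\,\text{mm} - (-10\,\text{mm}) = -50\,\text{mm} \]
\[ z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-50\,\text{mm}) \cdot (-100\,\text{mm})}{(-50\,\text{mm}) - (-100\,\text{mm})} = +100\,\text{mm} \]
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{+100\,\text{mm}}{-50\,\text{mm}} = +2 \]
\[ z_2 = t - z_1' = +125\,\text{mm} - 100\,\text{mm} = +25\,\text{mm} \]
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{(+25\,\text{mm}) \cdot (-25\,\text{mm})}{(+25\,\text{mm}) - (-25\,\text{mm})} = -12.5\,\text{mm} \]
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-12.5\,\text{mm})}{(+25\,\text{mm})} = +\frac{1}{2} \]
\[ (M_T)_\text{system} = (M_T)_1 \cdot (M_T)_2 = (+2) \cdot \left(+\frac{1}{2}\right) = +1 \]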
which again confirms that the transverse magnification is that expected for the object and image at
the principal points.
Images from System: (3) Equal Conjugates
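\[ z_1 = \mathrm{FFL} + f_\text{eff} = -60\,\text{mm} + (-10\,\text{mm}) = -70\,\text{mm} \]
\[ z_1' = \frac{z_1 \cdot f_1}{z_1 - f_1} = \frac{(-70\,\text{mm}) \cdot (-100\,\text{mm})}{(-70\,\text{mm}) - (-100\,\text{mm})} = +\frac{700}{3}\,\text{mm} = 233\tfrac{1}{3}\,\text{mm} \]
\[ (M_T)_1 = -\frac{z_1'}{z_1} = -\frac{\tfrac{700}{3}\,\text{mm}}{(-70\,\text{mm})} = +\frac{10}{3} \]
\[ z_2 = t - z_1' = +125\,\text{mm} - \frac{700}{3}\,\text{mm} = -\frac{325}{3}\,\text{mm} \cong -108.3\,\text{mm} \]
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{\left(-\tfrac{325}{3}\,\text{mm}\right) \cdot (-25\,\text{mm})}{\left(-\tfrac{325}{3}\,\text{mm}\right) - (-25\,\text{mm})} = -32.5\,\text{mm} \]
\[ (M_T)_2 = -\frac{z_2'}{z_2} = -\frac{(-32.5\,\text{mm})}{\left(-\tfrac{325}{3}\,\text{mm}\right)} = -\frac{3}{10} \]
\[ (M_T)_\text{system} = (M_T)_1 \cdot (M_T)_2 = \left(+\frac{10}{3}\right) \cdot \left(-\frac{3}{10}\right) = -1 \]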
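The same two checks (MT = +1 for an object at the object-space principal point, MT = −1 at the equal-conjugate point) can be run for all three two-lens examples of this section. The sketch below is illustrative only: the function name `image_through_two_lenses` is an assumption, and the single-lens magnification −z′/z is rewritten as f/(f − z) so that an object located exactly at a lens (z = 0, as in the first example) does not cause a division by zero.

```python
from fractions import Fraction as F


def image_through_two_lenses(z1, f1, f2, t):
    """Trace an object at distance z1 through two thin lenses.

    Returns (z2_prime, M_T): the image distance from the second lens and
    the system transverse magnification.
    """
    z1, f1, f2, t = F(z1), F(f1), F(f2), F(t)
    z1p = z1 * f1 / (z1 - f1)      # image distance from lens 1
    z2 = t - z1p                   # object distance to lens 2
    z2p = z2 * f2 / (z2 - f2)      # image distance from lens 2
    m1 = f1 / (f1 - z1)            # equals -z1'/z1
    m2 = f2 / (f2 - z2)            # equals -z2'/z2
    return z2p, m1 * m2


# (f1, f2, t) in mm for the three examples in the text.
systems = [(100, 25, 50), (100, -25, 80), (-100, -25, 125)]
results = []
for f1, f2, t in systems:
    denom = F(f1 + f2 - t)
    f_eff = F(f1 * f2) / denom
    ffl = F(f1 * (f2 - t)) / denom
    _, m_principal = image_through_two_lenses(ffl - f_eff, f1, f2, t)
    _, m_equal = image_through_two_lenses(ffl + f_eff, f1, f2, t)
    results.append((m_principal, m_equal))
print(results)
```

An object placed exactly at the front focal point (z1 = FFL) makes z2 = f2, so the function raises `ZeroDivisionError`, which corresponds to the image at infinity.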
2.14 Plane and Spherical Mirrors
One of the most familiar optical elements is the plane mirror (you probably see one every morning!).
For each ray incident at angle θ measured from the normal to the surface, a reflected ray is generated
at angle −θ relative to the normal. Consider a full sphere with reflective surface on the inside and
a point object O at the center, as shown in (a) in the figure. All rays from the object encounter the
surface at normal incidence and reflect back to form an image at the center. We can infer the focal length of
the spherical concave mirror from this observation by noting that the object and image distances
are identically R, so the focal length is determined by the thin-lens imaging equation:
\[ \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} \]
\[ z_1 = z_2 = R \implies \frac{1}{f} = \frac{1}{R} + \frac{1}{R} = \frac{2}{R} \implies f = \frac{R}{2} \]
Note that in this case of a complete sphere, the algebraic sign of the radius of curvature is not well
defined, but since rays converge to form the image, the focal length clearly must be positive. Because
the object and image distances are equal, this clearly is imaging at equal conjugates with transverse
magnification MT = −1:
\[ M_T = -\frac{z_2}{z_1} = -\frac{2 \cdot f}{2 \cdot f} = -1 \]
The negative sign on MT means that if the object source is moved “upward” from its position on
the horizontal axis at the center, then the reflected rays will converge to a point “below” the optic
axis, as shown in part (b) of the figure.
In part (c) of the figure, half of the spherical mirror surface is removed so that all rays emitted
towards the left will escape without striking the mirror and all rays emitted towards the right will
strike the surface one time before returning to the “image” at the center and then escaping to the
right. This mirror surface clearly makes rays converge to a real image coincident with the object
and so must have a positive focal length EVEN THOUGH the radius of curvature R is negative
(because V is to the right of C).
Spherical mirror: (a) rays from point source at center of sphere are all normal to the surface and
reflect back upon themselves to form a point image at object, so that z1 = z2 = R; (b) if the point
source is moved “upward”, the image moves “downward,” which shows that MT = −1; (c) half the
sphere is removed, leaving a hemisphere with R = VC < 0.
Derivation of the focal length of a concave spherical mirror. The magnified section at the bottom
shows the triangles used to evaluate f in terms of R: f = |R|/2 in the paraxial approximation.
We can consider the hemispherical concave mirror with radius of curvature R = VC < 0. Even
though the radius is negative, we have already inferred that the focal length of this system is positive
since the image rays converge, so we have:
\[ f = \frac{|R|}{2} = \frac{-R}{2} = -\frac{R}{2} \]
Consider a ray from an object at infinity that is close to (and parallel to) the optical axis, as shown
in the figure. From triangle ∆CAV in the magnified view, it is apparent that:
\[ \sin[\theta] = \frac{x}{\overline{CV}} = \frac{x}{-\overline{VC}} = \frac{x}{-R} \]
From ∆F′AV, we see that
\[ \tan[2\theta] = \frac{x}{\overline{F'V'}} \]
Now apply the paraxial approximation that sin [θ] ≅ tan [θ] ≅ θ if θ ≅ 0:
\[ \sin[\theta] = \frac{x}{-R} \cong \theta \implies x = -R \cdot \theta \]
\[ \tan[2\theta] = \frac{x}{f} \cong 2\theta \implies x = f \cdot 2\theta \]
Now equate the two terms to find a relationship between f and R:
\[ -R \cdot \theta = f \cdot 2\theta \implies f = -\frac{R}{2} \]
This expression for the focal length may be substituted into the imaging equation for a single thin
lens:
\[ \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f} = -\frac{2}{R} \]
For the case just considered of a concave surface, R < 0 and f > 0. If the object distance z1 > f ,
then the image distance z2 is positive, BUT IS MEASURED FROM RIGHT TO LEFT. If the
mirror is a convex spherical surface with R = VC > 0, the reflected ray from an object at infinity
appears to cross the axis at the image-space focal point behind the mirror, so the optic makes rays
diverge and therefore has negative power.
Convex mirror has positive radius of curvature (R > 0) but the reflected rays diverge and so the
surface has negative focal length via f = −R/2.
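The mirror relations above can be exercised with a few numbers. This is a minimal sketch assuming the sign convention of this section (R < 0 for a concave mirror, f = −R/2, image distance measured back toward the object side); the function names and the radius of 200 mm are hypothetical values chosen for illustration.

```python
def mirror_focal_length(R):
    """Paraxial focal length of a spherical mirror with radius R = VC."""
    return -R / 2.0


def mirror_image(z1, R):
    """Return (z2, M_T) from 1/z1 + 1/z2 = 1/f = -2/R."""
    f = mirror_focal_length(R)
    if z1 == f:
        raise ValueError("object at the focal point: image at infinity")
    z2 = z1 * f / (z1 - f)
    return z2, -z2 / z1


# Concave mirror, R = -200 mm: object at the center of curvature.
print(mirror_image(200.0, -200.0))
# Concave mirror, object beyond the center of curvature.
print(mirror_image(300.0, -200.0))
# Convex mirror, R = +200 mm: virtual, upright, minified image.
print(mirror_image(300.0, 200.0))
```

The first call reproduces the equal-conjugate case z1 = z2 = |R| with MT = −1; the last call gives a negative image distance, consistent with the negative power of a convex mirror.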
2.14.1 Comparison of Thin Lens and Concave Mirror
Comparison of the vertices, focal points, principal points, and equal-conjugate points of a concave
mirror and a thin lens. The vertices and the principal points coincide in both cases so that
MT = +1 for object and image at the vertex of the mirror and at the surfaces of the lens. The
object- and image-space focal points of the mirror coincide at the distance feff = −R/2,
and the equal conjugate points are located at the center of curvature so that z1 = z2 = 2feff . For the
lens, the equal conjugate points are also located such that z1 = z2 = 2feff with MT = −1.
2.15 Stops and Pupils
In any multielement optical system, the beam of light that passes through the system is shaped like
a solid circular “spindle” with different radii at different axial locations. The diameter of one optical
element will limit the size of the ray spindle that exits the system; this limiting element is the
aperture stop of the system and may be a lens or an aperture with no power (an iris diaphragm)
that is placed specifically to limit the diameter of the ray cone. A larger exiting ray cone means
that more light reaches the image to make it brighter, so the diameter of the stop is the limiting
factor for image “brightness.” Consider the example of a two-lens system with an iris positioned
between them, shown in the figure. The iris limits the cone of rays from the object at O.
Schematic of the aperture stop S and entrance and exit pupils E and E0 , respectively for a system
formed from two positive lenses and an iris with no power. The entrance pupil E is the image of
the stop S seen from the left through the first lens L1 , while the exit pupil is the image of S seen
from the right through the second lens L3 . Note that the element that is the stop may vary with
object location O.
Obviously, the aperture stop in an imaging system composed of a single lens is that lens. In a
two-element system, the stop will be one of the two lenses, determined by the relative diameters
and the locations of the lenses. The image of the stop seen from the input “side” of the lens is the
entrance pupil, which determines the angular spread of the ray cone from an object point that “gets
into” the optical system, and thus determines the “brightness” of the image. The image of the stop
seen from the output “side” is the exit pupil (once called the Ramsden disk ).
In an imaging system intended for viewing by eye, it is useful to locate the exit pupil at the iris
of the eye and to match its diameter to that of the iris of the eye to ensure that all light through
the optical system makes it into the eye to form the viewable image.
2.15.1
Focal Ratio — f-number
For multilens systems, the size of the entrance pupil determines the angular extent of the ray cone
that enters the system from a point source. The figure shows a simple hypothetical imaging system
with object-space and image-space principal points H and H0 , respectively and aperture stop of
diameter d0 as the first element in the system (the same analysis applies for systems with the
entrance pupil at other locations for an object at infinity). In this system, the stop is also the
entrance pupil. A point source at infinity creates a plane wave through the entrance pupil, which is
then incident on the object-space principal plane H with the same diameter. The unit transverse
magnification of the two principal planes ensures that the light emerging from the image-space
principal plane H0 has that same diameter d0 = dNP . The cone angle of rays incident on the image
plane at the image-space focal point F0 is the ratio of the diameter to the distance H0 F0 = feff :
\[ \frac{d_0}{f_\text{eff}} = \frac{d_\text{NP}}{f_\text{eff}} \]
This means that the focal ratio of the system is:
\[ f/\# = \frac{f_\text{eff}}{d_\text{NP}} \]
Note that a corresponding expression could be constructed based on the diameter of the exit pupil,
but the propagation distance then would have to be the distance from the exit pupil to the image,
which (in this case) is longer than the effective focal length.
Specification of the system focal ratio: the plane wave from a point source at infinity is incident
through the aperture stop with diameter d0 onto the object-space principal plane H. The light
emerging from the image-space principal plane H0 has the same diameter d0 . The light propagates
the focal length feff to the image. The angle of the ray cone is d0/feff, which is the reciprocal of
the system focal ratio f/#.
This f-number specifies the ability of the system to collect light.
2.15.2 Example: Focal Ratio of Lens-Aperture Systems
The focal ratio of a single thin lens obviously is the ratio of the focal length to the diameter of the
lens:
\[ f/\# = \frac{f}{d_0} \]
Note that the smallest possible focal ratio exists for a full sphere (which is anything but thin and
the paraxial approximation certainly does not apply over its full diameter). It might be useful to
determine the focal ratio for such a case with “normal” glass (n = 1.5). The focal length of the
sphere in the (ridiculously invalid) thin-lens paraxial approximation where R = 12.5 mm is obtained
from the lensmaker’s equation:
\[ f = \left((n_2 - 1)\left(\frac{1}{R_1} - \frac{1}{R_2}\right)\right)^{-1} = \left((1.5 - 1)\left(\frac{1}{12.5\,\text{mm}} - \frac{1}{-12.5\,\text{mm}}\right)\right)^{-1} = 12.5\,\text{mm} \]
The focal ratio is:
\[ f/\# = \frac{f}{d_0} = \frac{12.5\,\text{mm}}{25\,\text{mm}} = \frac{1}{2} \]
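This is ridiculously invalid because it assumes that the sphere is simultaneously “thin” and “fat.”
If we instead assume that the spherical lens is composed of two thin lenses at the vertices, each with
the power of a single surface, then:
\[ f_1 = f_2 = \left(\frac{1.5 - 1}{12.5\,\text{mm}}\right)^{-1} = 25\,\text{mm}, \qquad t = 25\,\text{mm} \]
\[ f_\text{eff} = \frac{f_1 \cdot f_2}{f_1 + f_2 - t} = \frac{25\,\text{mm} \cdot 25\,\text{mm}}{25\,\text{mm} + 25\,\text{mm} - 25\,\text{mm}} = 25\,\text{mm} \]
\[ \mathrm{BFL} = \frac{(f_1 - t) \cdot f_2}{f_1 + f_2 - t} = 0 \]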
Single Thin Lens + Aperture “in front”
Consider a system with a diaphragm (iris or aperture) of diameter d0 located at a distance t “in
front” of the lens with focal length f1 and diameter d1 . Since the aperture has no power to refract
light (φ = 0 diopters), then its “focal length” is infinite (f0 = ∞). The focal length of the two-“lens”
system is:
\[ f_\text{eff} = \frac{f_0 \cdot f_1}{(f_0 + f_1) - t} = f_1 \cdot \lim_{f_0 \to \infty}\left(\frac{f_0}{(f_0 + f_1) - t}\right) = f_1 \]
which makes sense: the focal length of a system consisting of one refracting element and one “nonrefracting” element is that of the refracting lens.
For an object at infinity (z1 = ∞ =⇒ z2 = f1 ), the diaphragm is the aperture stop if its diameter
is smaller than that of the lens:
d0 < d1 =⇒ iris is aperture stop
and the iris is also the entrance pupil. The focal ratio of the system is:
\[ f/\# = \frac{f_1}{d_0} \]
The exit pupil may be located by applying the imaging equation:
\[ z_\text{XP} = \frac{t \cdot f_1}{t - f_1} \]
which shows that the exit pupil is virtual (“behind” the lens as seen from image space) if t < f1 .
Note that if t = f1 so that the aperture is located at the object-space focal point of the system, then
the distance from the lens to the exit pupil is infinite: the system is “telecentric in image space.”
The exit pupil is real (and may be visualized on an observation screen) if zXP > 0 =⇒ t > f1 .
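Consider some examples with f1 = 100 mm, d1 = 25 mm, t = 25 mm, and d0 = 10 mm. If the iris
is deleted, then the lens itself is the stop and the focal ratio is:
\[ f/\# = \frac{f_\text{eff}}{d_1} = \frac{100\,\text{mm}}{25\,\text{mm}} = 4 \quad (f/4) \]
With the iris in place, it is the stop and entrance pupil. The location, magnification, and diameter
of the exit pupil are:
\[ z_\text{XP} = \frac{t \cdot f_1}{t - f_1} = \frac{25\,\text{mm} \cdot 100\,\text{mm}}{25\,\text{mm} - 100\,\text{mm}} = -\frac{100}{3}\,\text{mm} \]
\[ M_\text{XP} = -\frac{z_\text{XP}}{t} = -\frac{-\tfrac{100}{3}\,\text{mm}}{25\,\text{mm}} = +\frac{4}{3} \]
\[ d_\text{XP} = d_0 \cdot M_\text{XP} = \frac{40}{3}\,\text{mm} = 13\tfrac{1}{3}\,\text{mm} \]
Because the iris is the stop and entrance pupil, the focal ratio is:
\[ f/\# = \frac{f_\text{eff}}{d_\text{NP}} = \frac{100\,\text{mm}}{10\,\text{mm}} = 10 \quad (f/10) \]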
Single Thin Lens + Aperture “behind”
If the lens comes first in the system, then we need to find the condition of the iris diameter to
determine if it is the aperture stop. At some risk of confusion, we’ll maintain the notation where the
diameter of the lens is d1 and that of the aperture is d0 even though it is second in the system. For
an object at infinity, the figure shows that the distance to the iris must be less than the focal length
to have any possibility of being the aperture stop. The image of the aperture seen from object space
is located at
\[ z = \frac{t \cdot f_1}{t - f_1} \]
which is negative (so the entrance pupil is a virtual image of the iris) if t < f1 . The transverse
magnification of the entrance pupil is:
\[ M_T = -\frac{z}{t} = \frac{f_1}{f_1 - t} \]
which implies that the diameter of the image of the iris is:
\[ d_0' = M_T \cdot d_0 \]
If we use the same numerical values as before but with the iris “behind,” the distance to the entrance
pupil is:
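\[ z_\text{NP} = \frac{t \cdot f_1}{t - f_1} = \frac{25\,\text{mm} \cdot 100\,\text{mm}}{25\,\text{mm} - 100\,\text{mm}} = -\frac{100}{3}\,\text{mm} \]
\[ M_\text{NP} = -\frac{z_\text{NP}}{25\,\text{mm}} = -\frac{-\tfrac{100}{3}\,\text{mm}}{25\,\text{mm}} = +\frac{4}{3} \]
\[ d_\text{NP} = +\frac{4}{3} \cdot 10\,\text{mm} = \frac{40}{3}\,\text{mm} \]
This is the diameter of the incoming beam at the lens, so the focal ratio is:
\[ f/\# = \frac{f_\text{eff}}{d_\text{NP}} = \frac{100\,\text{mm}}{\tfrac{40}{3}\,\text{mm}} = 7.5 \quad (f/7.5) \]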
Three examples of systems: the first is a single thin lens with the aperture stop at the lens, so the
stop coincides with the entrance and exit pupils; the second moves the iris “in front” of the lens so
that it is also the entrance pupil; in the third, the iris is behind the lens and the magnified diameter
of the entrance pupil is the relevant parameter for the focal ratio.
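The pupil calculations for the lens-plus-iris examples share one step: imaging the iris through the lens. The sketch below assumes a single thin lens and an iris at distance t from it, and it assumes (as in the text) that the iris is the aperture stop for an object at infinity; the function name `iris_image_through_lens` is hypothetical.

```python
from fractions import Fraction as F


def iris_image_through_lens(t, f1, d0):
    """Image an iris of diameter d0 located a distance t from a thin lens.

    Returns (z, M, d_image). With the iris in front of the lens this is the
    exit pupil; with the iris behind the lens it is the entrance pupil.
    """
    t, f1, d0 = F(t), F(f1), F(d0)
    if t == f1:
        raise ValueError("iris at the focal point: pupil at infinity")
    z = t * f1 / (t - f1)
    m = -z / t
    return z, m, m * d0


f1, d1, t, d0 = 100, 25, 25, 10  # mm, values from the examples above

# Iris in front: the iris itself is the entrance pupil.
z_xp, m_xp, d_xp = iris_image_through_lens(t, f1, d0)
f_number_front = F(f1, d0)

# Iris behind: the image of the iris seen through the lens is the entrance pupil.
z_np, m_np, d_np = iris_image_through_lens(t, f1, d0)
f_number_behind = F(f1) / d_np

print(z_xp, m_xp, d_xp, f_number_front, f_number_behind)
```

The imaging step is identical in the two cases; only the interpretation of the result (exit pupil versus entrance pupil) and the diameter used in the focal ratio differ.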
2.15.3 Example: Exit Pupils of Telescopic Systems
Galilean Telescope
Consider a telescopic system, such as binoculars, composed of an objective lens L1 with
diameter d1 and an eyelens L2 with diameter d2 , where the two lenses are separated by the sum
of their focal lengths. As a specific example, take a Galilean telescope with f1 = +200 mm,
d1 = 50 mm, f2 = −25 mm, d2 = 25 mm, and t = f1 + f2 = 175 mm. We have already seen that the
angular magnification of the system is the ratio of the focal lengths of the two lenses:
\[ M_\theta = -\frac{f_1}{f_2} = -\frac{+200\,\text{mm}}{-25\,\text{mm}} = +8 \]
To determine which element is the aperture stop for a ray incident from an object at infinity, we
need to determine where this ray strikes the second lens. In this case, it strikes well within the lens
diameter — the ray height from the first lens is:
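\[ y = \frac{d_1}{2} \cdot \left(1 - \frac{t}{f_1}\right) = 25\,\text{mm} \cdot \left(1 - \frac{175\,\text{mm}}{200\,\text{mm}}\right) = \frac{25}{8}\,\text{mm} = 3.125\,\text{mm} < \frac{d_2}{2} \]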
so the first lens is the aperture stop, and therefore also the entrance pupil.
Location of aperture stop for the specified Galilean telescope. Since the ray from infinity that strikes
the edge of the positive lens passes well within the boundary of the negative lens, the aperture stop
is the positive lens for an object at infinity.
The exit pupil is the image of the aperture stop (first lens) seen through the second lens, which
has negative focal length, ensuring that the exit pupil will be virtual. The distance from the stop
to the second lens is:
z2 = t = f1 + f2 = 175 mm
and the image distance from the second lens is:
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{175\,\text{mm} \cdot (-25\,\text{mm})}{175\,\text{mm} - (-25\,\text{mm})} = -\frac{175}{8}\,\text{mm} = -21.875\,\text{mm} \]
The size of the exit pupil is determined from the transverse magnification:
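\[ M_T = -\frac{z_2'}{z_2} = -\frac{-\tfrac{175}{8}\,\text{mm}}{175\,\text{mm}} = +\frac{1}{8} \]
Since the diameter of the stop is d1 = 50 mm, the diameter of the exit pupil is:
\[ d_\text{XP} = M_T \cdot d_\text{Stop} = +\frac{1}{8} \cdot 50\,\text{mm} = +6.25\,\text{mm} \]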
For the Galilean telescope, the exit pupil is virtual (located 21.875 mm “behind” the eyelens) and
small.
Keplerian Telescope
Now repeat the analysis for a corresponding Keplerian telescope with f1 = +200 mm, d1 = 50 mm,
f2 = +25 mm, d2 = 25 mm, t = f1 + f2 = 225 mm and angular magnification:
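\[ M_\theta = -\frac{f_1}{f_2} = -\frac{+200\,\text{mm}}{+25\,\text{mm}} = -8 \]
Again, the ray from an object at infinity that strikes the edge of the first lens has the following
height at the second lens:
\[ y = \frac{d_1}{2} \cdot \left(1 - \frac{t}{f_1}\right) = 25\,\text{mm} \cdot \left(1 - \frac{225\,\text{mm}}{200\,\text{mm}}\right) = -\frac{25}{8}\,\text{mm} = -3.125\,\text{mm}, \qquad |y| < \frac{d_2}{2} \]
The first element is still the stop and the entrance pupil. The image of the first lens through the
second is the exit pupil; its location and size are determined using the thin-lens imaging equation:
\[ z_2 = t = f_1 + f_2 = 225\,\text{mm} \]
\[ z_2' = \frac{z_2 \cdot f_2}{z_2 - f_2} = \frac{225\,\text{mm} \cdot 25\,\text{mm}}{225\,\text{mm} - 25\,\text{mm}} = \frac{225}{8}\,\text{mm} = +28.125\,\text{mm} \]
\[ M_T = -\frac{z_2'}{z_2} = -\frac{\tfrac{225}{8}\,\text{mm}}{225\,\text{mm}} = -\frac{1}{8} \]
\[ d_\text{XP} = d_\text{Stop} \cdot M_T = 50\,\text{mm} \cdot \left(-\frac{1}{8}\right) = -6.25\,\text{mm} \]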
The exit pupil is “real” (outside of the system at a distance of 28.125 mm beyond the eyelens) and
inverted.
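In both of the telescopes just considered, note that the magnitude of the diameter of the exit pupil
is the ratio of the focal length of the eyepiece and the focal ratio of the objective lens:
\[ d_\text{XP} = \frac{d_1}{M_\theta}, \qquad |d_\text{XP}| = \frac{|f_2|}{\left(\frac{f_1}{d_1}\right)} = \frac{d_1}{\left|\frac{f_1}{f_2}\right|} \]
\[ (d_\text{XP})_\text{Galilean} = \frac{50\,\text{mm}}{+8} = 6.25\,\text{mm} \]
\[ (d_\text{XP})_\text{Keplerian} = \frac{50\,\text{mm}}{-8} = -6.25\,\text{mm} \]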
In words, the diameter of the exit pupil is equal to the ratio of the diameter of the entrance pupil
(which is the objective in this case) and the magnifying power; more power means a smaller exit
pupil.
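The exit-pupil results for the two telescopes can be reproduced by imaging the objective through the eyelens. This is a minimal sketch assuming that the objective is the aperture stop (as established above for both examples) and that the lenses are thin and separated by f1 + f2; the function name `telescope_exit_pupil` is illustrative.

```python
from fractions import Fraction as F


def telescope_exit_pupil(f1, d1, f2):
    """Return (z2_prime, d_xp, m_theta) for a two-lens telescope.

    z2_prime: distance from the eyelens to the exit pupil
              (negative means virtual, inside the telescope)
    d_xp:     signed exit-pupil diameter (negative means inverted)
    m_theta:  angular magnification -f1/f2
    """
    f1, d1, f2 = F(f1), F(d1), F(f2)
    z2 = f1 + f2                   # stop (objective) to eyelens
    z2p = z2 * f2 / (z2 - f2)      # thin-lens imaging equation
    mt = -z2p / z2                 # transverse magnification of the pupil
    return z2p, mt * d1, -f1 / f2


galilean = telescope_exit_pupil(200, 50, -25)
keplerian = telescope_exit_pupil(200, 50, 25)
print(galilean, keplerian)
```

In both cases the signed exit-pupil diameter equals d1 divided by the angular magnification, which is the relation stated in words above.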
Common binoculars used for birdwatching are listed as “10 × 50,” which means that the angular
magnification (magnifying power) is 10 and the diameter of the entrance pupil (which is that of the
objective lens) is 50 mm ≅ 2 in. The diameter of the exit pupil is:
\[ d_\text{XP} = \frac{50\,\text{mm}}{10} = 5\,\text{mm} \]
Until recently, the most common variety of binocular was the “7 × 50,” which has a magnifying
power of 7 and objectives with d = 50 mm, so the diameter of the exit pupil is:
\[ d_\text{XP} = \frac{50\,\text{mm}}{7} \simeq 7\,\text{mm} \]
This is a close match to the diameter of the iris of the dark-adapted eye, so these binoculars are a
good choice for astronomical viewing; for that reason, 7 × 50 binoculars were known as “night glasses.” When
used with the smaller iris diameter of the eye during daytime, much of the diameter of the exit pupil
would illuminate the opaque iris and not contribute to the brightness of the image on the retina.
For a formerly common amateur telescope with a mirror objective with d1 = 6 in ≅ 150 mm and
a focal length f1 = 48 in ≅ 1220 mm, the focal ratio is:
\[ f/\# = \frac{48\,\text{in}}{6\,\text{in}} = 8 \]
so the diameter of the exit pupil when viewed through an eyelens with focal length f2 is
\[ d_\text{XP} = \frac{f_2}{f/\#} = \frac{f_2}{8} \]
If the focal length of the eyelens is f2 = 25 mm ≅ 1 in, then the diameter of the exit pupil is about
3 mm, which is pretty small. If the focal length of the eyelens is f2 = 4 mm ≅ 1/6 in, the magnifying
power of the system is:
\[ M_\theta = \frac{f_1}{f_2} \cong \frac{48\,\text{in}}{\tfrac{1}{6}\,\text{in}} = +288 \]
which is a large number that will impress a naive user. BUT the diameter of the exit pupil is very
small:
\[ d_\text{XP} = \frac{f_2}{8} = \frac{\tfrac{1}{6}\,\text{in}}{8} = \frac{1}{48}\,\text{in} \cong 0.5\,\text{mm} \]
so it would be very difficult to “see” anything through this telescope. This illustrates the flaw in the
strategy that was once used often by manufacturers of cheap telescopes intended as gifts for children;
the manufacturers would often quote a very large value for the magnifying power that required an
eyepiece with a very short focal length and therefore a very small exit pupil. The images were very
difficult to see by novices and experienced users alike.
The location of the exit pupil also is important. It is useful to have it placed “outside” the
imaging system where the eye would be located so that it is feasible to get all of the light through
the pupil into the eye. The distance from the rear vertex of the system to the exit pupil is the eye
relief :
V0 E0 = eye relief
An imaging system with “lots of” eye relief may be easier to view through, since the location where
the eye is optimally placed is back away from the eyelens. An example of a system that needs a
large eye relief is a rifle scope, where the eyepiece lens will be located “far” in front of the viewing
eye.
For different object distances, it is possible for the aperture stop to “move around,” i.e., the
element that defines the aperture stop may change with object distance. The locations and sizes of
the pupils are determined by applying the ray-optics imaging equation to these objects. To some,
the concept of finding the “image of a lens” may seem confusing, but it is no different from before
— just think of the lens as a regular opaque object at its location and find the images through the
optics that come after (for the exit pupil) or that came before (entrance pupil).
Which element in a multielement system is the “stop” depends on the relative sizes of the lenses.
In the first case shown below, the first lens (the objective) is small enough that it acts as the stop
(and thus also the entrance pupil). The image of the objective lens seen through the eyelens is the
exit pupil, and is “between” the two lenses and very small. Because the exit pupil is small and
“remote” (located “within” the optical system), so is the field of view of the Galilean telescope. In
the second example, the smaller eyelens is the stop and also the exit pupil, while the image of the
eyelens seen through the objective is the entrance pupil and is far behind the eyelens and relatively
large.
More Examples of Galilean and Keplerian Telescopes
Consider the two two-lens telescope designs. The Galilean telescope has a positive-power objective
and a negative-power ocular or eyelens. The Keplerian telescope has a positive objective and a
positive eyelens. Assume that the objective is identical in the two cases with f1 = +100 mm and
d1 = 30 mm. The focal lengths and diameters of the oculars (eyepieces) are f = ±15 mm and
d2 = +15 mm (these are the approximate dimensions and focal lengths of the lenses in the OSA
Optics Discovery Kit). The lenses of a telescope are separated by f1 + f2 , (f1 + f2 ∼
= 85 mm and
115 mm for the Galilean telescope and Keplerian telescope, respectively). We want to locate the
stops and pupils. The stop is found by tracing a ray from an object at ∞ through the edge of the
first element and finding the ray height at the second lens. If this ray height is small enough to pass
through the second lens, then the first lens is the stop; if not, then the second lens is the stop.
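The stop-finding rule just described can be sketched in a few lines. The code assumes a two-lens telescope with thin lenses separated by f1 + f2 and an on-axis object at infinity; the function names are illustrative, and the 120 mm objective in the last call is a hypothetical value chosen only to show the case where the eyelens becomes the stop.

```python
from fractions import Fraction as F


def marginal_ray_height_at_eyelens(f1, d1, f2):
    """Height at lens 2 of the ray that enters parallel to the axis at the edge of lens 1."""
    f1, d1, f2 = F(f1), F(d1), F(f2)
    t = f1 + f2
    return (d1 / 2) * (1 - t / f1)


def aperture_stop(f1, d1, f2, d2):
    """Return 'objective' if the edge ray clears the eyelens, else 'eyelens'."""
    y = marginal_ray_height_at_eyelens(f1, d1, f2)
    return "objective" if abs(y) <= F(d2) / 2 else "eyelens"


# Values from the text: f1 = 100 mm, d1 = 30 mm, f2 = -15 or +15 mm, d2 = 15 mm.
print(marginal_ray_height_at_eyelens(100, 30, -15), aperture_stop(100, 30, -15, 15))
print(marginal_ray_height_at_eyelens(100, 30, 15), aperture_stop(100, 30, 15, 15))
# Hypothetical oversized objective: the edge ray misses the eyelens.
print(aperture_stop(100, 120, -15, 15))
```

The sign of the ray height differs between the two designs because the ray crosses the axis before reaching the eyelens in the Keplerian telescope; only its magnitude matters for identifying the stop.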
Galilean telescope for object at z1 = +∞: (a) the objective lens is the aperture stop and entrance
pupil because it limits the cone of entering rays. The image of the stop seen through the eyelens is
the (very small) exit pupil; (b) the larger objective means that the eyelens is the aperture stop and
the exit pupil. The image of the eyelens seen through the objective is the entrance pupil, and is
behind the eyelens because the object distance to the objective is less than the focal length.
Consider the Galilean telescope first. The ray height at the first lens is the “semidiameter” of the
lens: d1/2 = 15 mm; it is not called the “radius” to avoid confusion with a “radius of curvature.” From
there, the ray height would decrease to 0 mm at a distance of f1 = +100 mm, but it first encounters
the negative lens at a distance of t = +85 mm. The ray height at this lens is
\[ \frac{100\,\text{mm} - 85\,\text{mm}}{100\,\text{mm}} \cdot 15\,\text{mm} = 2.25\,\text{mm} \]
which is much smaller than the lens semidiameter of d2/2 = 7.5 mm. Hence the first lens (the objective
lens) is the stop.
The entrance pupil is the image of the stop through all of the elements that come before the stop.
In this case, the first lens is also the entrance pupil and its transverse magnification is unity. The
exit pupil is the image of the stop through all elements that come afterwards, which is the negative
lens. The distance to the “object” is f1 + f2 = 85 mm, so the imaging equation is used to locate the
exit pupil and determine its magnification:
\[ \frac{1}{85\,\text{mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{-15\,\text{mm}} \]
\[ z' = \left(-\frac{1}{15\,\text{mm}} - \frac{1}{85\,\text{mm}}\right)^{-1} = -\frac{51}{4}\,\text{mm} = -12.75\,\text{mm} \]
\[ M_T = -\frac{z'}{z} = -\frac{-12.75\,\text{mm}}{85\,\text{mm}} = 0.15 \]
The exit pupil is upright, but more important, its distance from the second lens is negative; the exit
pupil is a virtual image and not accessible to the eye. The viewer “sees” the exit pupil in front of
the eye. This limits the field of view of the Galilean telescope.
Follow the same procedure to determine the stop and locate the pupils and their magnifications
for the Keplerian telescope. The ray height at the first lens for an object located at ∞ is again
15 mm. The ray height decreases to 0 mm at the focal point, but then decreases still farther until
encountering the ocular lens at a distance of f1 + f2 = 115 mm. The ray height h at this lens is
determined from similar triangles:
\[ \frac{15\,\text{mm}}{-h} = \frac{100\,\text{mm}}{15\,\text{mm}} \implies h = -2.25\,\text{mm} \]
So the first lens is the stop and entrance pupil (with unit magnification) in this case too. The
distance from the stop to the second lens is f1 + f2 = 115 mm, so the imaging equation for locating
the exit pupil and determining its magnification is:
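\[ \frac{1}{115\,\text{mm}} + \frac{1}{z'} = \frac{1}{f_2} = \frac{1}{+15\,\text{mm}} \]
\[ z' = +\left( \frac{1}{15\,\text{mm}} - \frac{1}{115\,\text{mm}} \right)^{-1} = +\frac{69}{4}\,\text{mm} = +17.25\,\text{mm} \]
\[ M_T = -\frac{z'}{z} = -\frac{+17.25\,\text{mm}}{115\,\text{mm}} = -0.15 \]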
The exit pupil is a real image of the aperture stop in the Keplerian telescope — we can place our eye
at it and see a larger field of view.
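The two exit-pupil calculations can be checked numerically. The following is a minimal sketch in Python and is not part of the original notes; the function names `image_through_thin_lens` and `exit_pupil` are illustrative choices, and exact fractions are used so that the values $-51/4$ mm and $+69/4$ mm appear without rounding.

```python
from fractions import Fraction


def image_through_thin_lens(z, f):
    """Solve 1/z + 1/z' = 1/f for z' and return (z', M_T) with M_T = -z'/z."""
    z, f = Fraction(z), Fraction(f)
    z_prime = 1 / (1 / f - 1 / z)
    return z_prime, -z_prime / z


def exit_pupil(f1, f2):
    """Image the stop (the objective) through the eyelens at distance f1 + f2."""
    return image_through_thin_lens(f1 + f2, f2)


# All distances in mm; focal lengths are those used in the text.
galilean = exit_pupil(100, -15)
keplerian = exit_pupil(100, 15)
print("Galilean  z', M_T:", float(galilean[0]), float(galilean[1]))
print("Keplerian z', M_T:", float(keplerian[0]), float(keplerian[1]))
```

A negative $z'$ marks the virtual exit pupil of the Galilean telescope, and a positive $z'$ the real exit pupil of the Keplerian telescope.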
Vignetting
The location of the aperture stop is determined for an object located “on” the optical axis. If the
object is “off” the axis, the cone of rays that get through the system is “skewed” or “tilted.” If
other elements in the system (lenses or diaphragms) constrain parts of the skewed cone of rays,
then the cone of rays is truncated and the brightness of the image is reduced; this phenomenon is
“vignetting.”
Example of vignetting; the brightness of the scene at the edges is reduced due to the presence of an
“out-of-focus” aperture in the system.
2.15.4 Pupils and Diffraction
The concept of pupils may be combined with diffraction to evaluate the effective focal ratio (f/number)
of the imaging system. For a single thin lens, the diffraction spot is determined by the size and shape
specified by the pupil function p [x, y] or p (r) and the distance to the image. If the lens has a circular
pupil of diameter $d_0$, the pupil function
\[ p(r) = \mathrm{CYL}\left( \frac{r}{d_0} \right) \]
determines the extent of the ray cone that enters the system. We derived the resulting diffraction
pattern, which is proportional to a scaled circularly symmetric sombrero function, which is the
analogue of the SINC function using the first-order Bessel function, and therefore is sometimes
called the “besinc” function.
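\[ h(r) \propto \frac{\pi d_0^2}{4} \cdot \mathrm{SOMB}\left( \frac{r}{\left( \frac{\lambda_0 z_2}{d_0} \right)} \right) \]
If the object distance is large, then the image distance $z_2 \cong f$ and the amplitude of the impulse
response is:
\[ h(r) \propto \mathrm{SOMB}\left( \frac{r}{\left( \frac{\lambda_0 f}{d_0} \right)} \right) \]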
The diameter of the Airy disk is approximately:
\[ D_0 \cong 2.44\,\lambda_0 \left( \frac{f}{d_0} \right) = 2.44 \cdot \lambda_0 \cdot f/\# \]
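As a quick numerical illustration of the Airy-disk formula, the sketch below evaluates $D_0 \cong 2.44\,\lambda_0\,(f/d_0)$. It is not from the original notes: the function name is illustrative, and the wavelength of 0.55 µm and the 100 mm, 30 mm diameter lens are assumed example values.

```python
def airy_disk_diameter(wavelength, focal_length, aperture_diameter):
    """Approximate Airy disk diameter D0 = 2.44 * wavelength * (f / d0).

    All three arguments must use the same length unit; the result is in that unit.
    """
    f_number = focal_length / aperture_diameter
    return 2.44 * wavelength * f_number


# Assumed example: green light (0.55 um = 0.00055 mm), f = 100 mm, d0 = 30 mm.
d_airy_mm = airy_disk_diameter(0.00055, 100.0, 30.0)
print(f"Airy disk diameter: {d_airy_mm * 1000:.2f} micrometers")
```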
2.15.5 Field Stop
As suggested by its name, a field stop limits the field of view of the system. It may be as simple as
the finite size of the sensor (e.g., a rectangular piece of photosensitive emulsion or a CCD sensor),
or it may be placed at an intermediate image within the system or even at the object itself. Images
of the field stop are located at the same locations as intermediate images of the object.
2.16 Marginal and Chief Rays
Many important characteristics of an optical system, including the possible presence of vignetting,
are determined by the trace of two specific rays through the imaging system. For an object $O$ with
image $O'$, aperture stop $S$, entrance pupil $E$, and exit pupil $E'$, the marginal ray traces from the
center of $O$ to the edge of $S$ and on to the center of $O'$. The chief ray (or principal ray) is traced
from the edge of $O$ (or edge of the “field of view”) through the center of $S$ to the edge of $O'$. Since $E$
and $E'$ are images of the stop $S$, the marginal and chief rays also go through the edges and centers
of the pupils, respectively.
The marginal ray is specified by its ray heights $y$ and ray angles $u$ at different points on the
optical axis; the corresponding notation for the chief ray includes “overscores” or “bars”: $\bar{y}$, $\bar{u}$.
Heights and angles of the marginal ray after refraction at a surface are “primed,” e.g., $y'$ and $u'$.
The corresponding quantities for the chief ray are $\bar{y}'$ and $\bar{u}'$.
From the definition of the marginal ray, an object or image is located at any location (value of
$z$) where $y = 0$. Similarly, the aperture stop, entrance pupil, and exit pupil are located at values
of $z$ where $\bar{y} = 0$. An image exists wherever the marginal ray crosses the axis, and the aperture
stop or pupils are located wherever the chief ray crosses the axis. Complete specification of these
two rays is sufficient to characterize the location of object and image(s), the field of view, and the
magnifications.
The chief ray is the axis of the unvignetted light beam from a point at the edge of the field of view.
The radius of the unvignetted light beam (perhaps more appropriately called the semidiameter,
to avoid potential confusion with the “radius of curvature”) is the sum of the heights of the marginal
and chief rays:
\[ \frac{d_{\text{unvignetted}}}{2} = y + \bar{y} \quad \text{at any location } z \]
Figure 2.2: The marginal and chief rays for a two-element imaging system where the second element
is the stop. The marginal ray comes from the center of the object $O$, grazes the edge of the stop, and
passes through the center of the image $O'$. The chief ray travels from the edge of the object through the
center of the stop to the edge of the image.
Because paraxial calculations are linear, it is customary to normalize the ray heights and angles
for the calculation and then scale the results to satisfy the conditions of the specific system. For
example, we generally select the chief ray height $\bar{y} = 1$ and the marginal ray angle $u = 1$ at the object.
Clearly the choice of unit ray angle (in radians) is inconsistent with the paraxial approximation, but
this is just a computational convenience because all quantities are scalable.
2.16.1 Telecentricity
If the aperture stop is located such that the entrance and/or exit pupils are at infinity, then the
system is telecentric. One way to do this is to place the aperture stop at one of the focal points of
the system, which means that the corresponding pupil is at the same location and the other pupil
is at infinity. As shown in the figure, if the stop is located at the object-space focal point of a single
thin lens, then the entrance pupil is at the same location and the exit pupil is at infinity in image
space — this is an image-space telecentric system.
Telecentric system consisting of single thin lens with aperture stop placed at object-space focal point,
showing chief ray (solid blue) and marginal ray (red). The chief ray intersects the optical axis at that
focal point and so emerges from the lens parallel to the optical axis. The dashed blue lines parallel to
the chief ray intersect at the image. The defocused image is the same height as the focused image.
If the stop is located at the image-space focal plane, then the entrance pupil is at infinity, forming
an object-space telecentric system. If either the entrance or exit pupil is at infinity, then the chief
ray must be parallel to the optical axis on that side of the imaging system. This means that the
system transverse magnification will be constant even if the image is blurry. Put another way, a
blurred image has the correct magnification.
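The telecentric condition can be illustrated with a few lines of paraxial arithmetic. The sketch below is not from the original notes; it assumes a single thin lens in air and uses the illustrative names `chief_ray_after_lens` and `height_behind_lens`. A chief ray that crosses the axis at a stop located a distance `stop_distance` in front of the lens is transferred to the lens and then refracted; when the stop sits at the object-space focal point, the emerging angle is zero, so the chief ray height is the same at every plane behind the lens.

```python
def chief_ray_after_lens(f, stop_distance, u_bar):
    """Trace a chief ray from the stop (height 0, angle u_bar) through a thin lens.

    Returns (height at the lens, angle after the lens), using paraxial
    transfer y = t * u and refraction u' = u - y / f (lens in air).
    """
    y_at_lens = stop_distance * u_bar
    u_after = u_bar - y_at_lens / f
    return y_at_lens, u_after


def height_behind_lens(y_at_lens, u_after, z):
    """Chief ray height a distance z behind the lens."""
    return y_at_lens + z * u_after


# Assumed example values: f = 50 mm lens, chief ray angle 0.1 rad at the stop.
y_tele, u_tele = chief_ray_after_lens(50.0, 50.0, 0.1)   # stop at focal point
y_other, u_other = chief_ray_after_lens(50.0, 40.0, 0.1)  # stop elsewhere
print("stop at F:    emerging angle =", u_tele)
print("stop at 40mm: emerging angle =", u_other)
```

With the stop at the focal point, `height_behind_lens` returns the same value for any `z`, which is the statement that a defocused image has the same height as the focused image.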
A “double telecentric” system is an afocal system (telescope) with the stop located at the common
focal plane of the two lenses. This means that both the entrance and exit pupils are at infinity. The
fact that the magnification of the system does not depend on accuracy of focusing makes telecentric
systems particularly useful for metrology.
Double telecentric system with the aperture stop at the common focal point of the two lenses. The
marginal ray is shown in red and the chief ray in solid blue.
2.16.2 Marginal and Chief Rays for Telescopes
The marginal ray of an afocal system used to image an object at infinity travels parallel to the
optical axis before the first lens and after the last ($u = 0$, $u' = 0$). The relative sizes of the two
lenses determine which is the aperture stop; for a Galilean telescope, the aperture stop is usually
the negative ocular lens.
MORE TO COME
Chapter 3
Tracing Rays Through Optical Systems
The imaging equation(s) become quite complicated in systems with more than a very few lenses.
However, we can determine the effect of the optical system by ray tracing, where the action on
two (or more) rays is determined. Raytracing may be paraxial or exact. Historically, graphical,
matrix, or worksheet ray tracing were commonly used in optical design, but most ray tracing is now
implemented in computer software so that exact solutions are more commonly implemented than
heretofore.
3.1 Paraxial Ray Tracing Equations
Consider the schematic of a two-element optical system made of thick lenses, so the vertices and
principal planes of individual lenses do not coincide at the same points.
Schematic of ray tracing of a provisional marginal ray from an object at an infinite distance. The
system has two elements and the locations $H_n$ and $H_n'$ are the principal planes of the $n$th element.
The ray height at the $n$th element is $y_n$ and the ray angle during transfer between elements $n - 1$
and $n$ is $u_n$.
The two elements are represented by their two principal “planes”, which are the planes of unit
magnification. The refractive power of the first element changes the ray angle of the input ray. In
the example shown, the input ray angle $u_1 = 0$ radians, i.e., the ray is parallel to the optical axis.
The height of this ray above the axis at the object-space principal plane $H_1$ is $y_1$ units. The ray
emerges from the principal plane $H_1'$ at the same height $y_1$ but with a new ray angle $u_2$. The ray
“transfers” to the second element through the distance $t_2$ in the index $n_2$ and has ray height $y_2$ at
principal plane $H_2$. The ray emerges from the principal plane $H_2'$ at the same height but a new angle
$u_3$.
Figure 3.1: Refraction of a paraxial ray at a surface with radius of curvature $R$ between media with
refractive indices $n$ and $n'$. The ray height and angle at the surface are $y$ and $u$, respectively. The
angle of the ray measured at the center of curvature is $\alpha$. The height and angle immediately after
refraction are $y$ and $u'$. The object and image distances are $s$ and $s'$ (which are now called $z$ and
$z'$ in the text).
3.1.1 Paraxial Refraction
Consider refraction of a paraxial ray emitted from the object O at a surface with radius of curvature
R. For a paraxial ray, the surface may be drawn as “vertical”. The height of the ray at the surface
is y.
From the drawing, the incoming ray angle $u$ measured from the optical axis is:
\[ u = \tan^{-1}\left[ \frac{y}{z} \right] \cong \frac{y}{z} > 0 \]
and the corresponding equation for the outgoing ray measured from the optical axis is:
\[ u' = \tan^{-1}\left[ \frac{y}{z'} \right] \cong \frac{y}{z'} > 0 \]
The angle subtended by the ray height at the refractive surface, measured from the center of curvature, is:
\[ \alpha = -\tan^{-1}\left[ \frac{y}{R} \right] \cong -\frac{y}{R} \]
The incident and refracted angles measured from the surface normal at height $y$ are the angles of incidence
and refraction. From the drawing:
\[ i = u - \alpha \]
\[ i' = u' - \alpha \]
Now apply Snell’s law in the paraxial approximation:
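\[ n \sin[i] = n' \sin[i'] \implies n \cdot i \cong n' \cdot i' \]
\[ n \cdot (u - \alpha) = n' \cdot (u' - \alpha) \]
\[ \implies n'u' \cong nu - n\alpha + n'\alpha = nu + \alpha\,(n' - n) \]
\[ = nu + \left( -\frac{y}{R} \right) \cdot (n' - n) \]
\[ = nu - y \cdot \frac{(n' - n)}{R} \]
\[ \equiv nu - y \cdot \phi \]
\[ n'u' \cong nu - y \cdot \phi \]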
The paraxial refraction equation relates the incident angle $u$, refracted angle $u'$, ray height $y$,
surface power $\phi = \frac{1}{f}$, and indices of refraction $n$ and $n'$. Solving it for the power gives:
\[ \phi = \frac{nu - n'u'}{y} \]
3.1.2 Paraxial Transfer
Paraxial transfer from one surface to the next in a medium with refractive index $n'$.
The transfer equation determines the ray height $y'$ at the next surface given the initial ray height
$y$, the physical distance $t'$, and the ray angle $u'$ in the medium with index $n'$. From the drawing, we
have:
\[ y' = y + t' \cdot u' \]
\[ y' = y + \left( \frac{t'}{n'} \right) \cdot (n'u') \]
where the substitution was made to put the ray angle in the same form $n'u'$ that appeared in the
refraction equation. The distance $\frac{t'}{n'} \le t'$ is called the reduced thickness (note the potential for
confusing reduced thickness $\frac{t'}{n'}$ and optical path length $n't'$).
3.1.3 Linearity of the Paraxial Refraction and Transfer Equations
Note that both the paraxial refraction and transfer equations are linear in the height and angle,
i.e., neither includes any operations involving squares or nonlinear functions (such as sine, tangent,
or logarithm). Among other things, this means that they may be scaled by direct multiplication
to obtain other “equivalent” rays, as to match the marginal ray height to the semidiameter of the
aperture stop or the chief ray angle to the semidiameter of the field stop. For example, the output
angle may be scaled by scaling the input ray angle and the height by a constant factor $\alpha$:
\[ \alpha\,(nu - y\phi) = \alpha \cdot (nu) - (\alpha \cdot y)\,\phi = \alpha\,(n'u') \]
We will often take advantage of this linear scaling property to scale rays to find the exact marginal
and chief rays from their provisional counterparts.
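The scaling property can be demonstrated directly with the refraction and transfer equations. The following Python sketch is an illustration added here, not part of the original notes: the function names are illustrative, and the default power and reduced thickness are assumed example values (they happen to match the first surface of the eye model used later in these notes).

```python
def refract(y, nu, phi):
    """Paraxial refraction: the height is unchanged, n'u' = nu - y * phi."""
    return y, nu - y * phi


def transfer(y, nu, reduced_thickness):
    """Paraxial transfer: y' = y + (t'/n') * (n'u'), the angle is unchanged."""
    return y + reduced_thickness * nu, nu


def trace(y, nu, phi=0.043077, reduced_thickness=3.6 / 1.336):
    """Refract at one surface, then transfer to the next surface."""
    y, nu = refract(y, nu, phi)
    return transfer(y, nu, reduced_thickness)


k = 7.5                                   # arbitrary scale factor
y_a, nu_a = trace(1.0, 0.02)              # provisional ray
y_b, nu_b = trace(k * 1.0, k * 0.02)      # the same ray scaled at the input
print(y_b / y_a, nu_b / nu_a)             # both ratios equal k (up to rounding)
```

Scaling the input height and angle by `k` scales the output height and angle by the same factor, which is what allows provisional rays to be rescaled after the trace.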
3.1.4 Paraxial Ray Tracing
To characterize the paraxial properties of a system, two provisional rays are traced:
1. Initial height of marginal ray at first surface: $y = 1.0$, initial marginal ray angle $nu = 0$;
2. Initial height of chief ray at first surface: $\bar{y} = 0.0$, initial chief ray angle $n\bar{u} = 1$.
We have already named these rays; the first is the provisional marginal ray that intersects the
optical axis at the object (and thus also at every image of the object). The second ray (distinguished
by the overscore) is called the provisional chief (or principal) ray and travels from the edge of the
object to the edge of the field of view through the center of the stop (and thus through the center of
the pupils, which are images of the stop). Since the paraxial ray tracing equations are linear, these
provisional rays may be scaled to the parameters of the system.
The process of ray tracing is perhaps best introduced by example. Consider a two-element,
three-surface system. The first surface is the cornea, with radius of curvature in the model of
$R_1 = +7.8$ mm. The “aqueous humor” between the cornea and the lens has a thickness in the
model of 3.6 mm and refractive index $n_2 = 1.336$. The surfaces of the lens have radii of curvature
$R_2 = +10$ mm and $R_3 = -6$ mm, thickness of 3.6 mm, and refractive index $n_3 = 1.413$. The “vitreous humor” between the lens and the retina has the same refractive index, $n_4 = 1.336$, as the
“aqueous humor.”
Marginal and chief rays traced through the three-surface optical system.
The refraction at the first surface changes the angle but not the height of a ray from the object.
If the incident ray angle is 0 radians, then the new ray angle for the provisional marginal ray is:
\[ (n'u')_1 = (nu)_1 - y_1\,[\text{mm}] \cdot \phi_1\,[\text{mm}^{-1}] \]
\[ = 0 - (1.0)\,(+0.043077) \]
\[ = -0.043077 \text{ radian} \]
Note that we are retaining 6 decimal places in this calculation to ensure the best result at the end.
We will then truncate (round) the value to a more reasonable accuracy.
The transfer equation for the provisional marginal ray between the first and second surface
changes the height of the ray but not the angle. The height at the second surface is:
\[ y_1' = y_1 + \left( \frac{t'}{n'} \right)_1 (n'u')_1\ [\text{mm}] \]
\[ = 1 + \frac{3.6}{1.336}\,(-0.043077) = +0.883924\,\text{mm} \]
Thus the ray exits the first surface at the “reduced angle” $n'u' \cong -0.04$ radians and arrives at the
second surface at height $y' \cong +0.88$ units. The corresponding equations for the chief ray at the first
surface are:
\[ (n'\bar{u}')_1 = (n\bar{u})_1 - \bar{y}_1 \phi_1 \]
\[ = 1 - (0.0)\,(+0.043077) \]
\[ = 1 \text{ radian} \]
\[ \bar{y}_1' = \bar{y}_1 + \left( \frac{t'}{n'} \right)_1 (n'\bar{u}')_1 \]
\[ = 0 + \frac{3.6}{1.336}\,(1) = +2.694611\,\text{mm} \cong 2.695\,\text{mm} \]
Since the provisional chief ray passes through the vertex of the first surface (where $\bar{y}_1 = 0$), its angle
does not change. The height of the chief ray at the second surface is proportional to the ray angle.
Ray-Tracing Table
The equations may be evaluated in sequence to compute the rays through the system. These are
presented in the table. Each column in the table represents a surface in the system and the “primed”
quantities refer to distances and angles following the surface. In words, the values of $t'$ in the second row are the
distances from the surface in the column to the next surface. The bracketed value is the boxed reduced distance to the image.

Parameter            Initial    Surface 1          Surface 2          Surface 3          Image Surface
R                               +7.8 mm            +10.0 mm           −6.0 mm
t'                              3.6 mm             3.6 mm             ⇓
n'                   1.0        1.336              1.413              1.336
−φ = −(n' − n)/R                −0.043077 mm^−1    −0.007700 mm^−1    −0.012833 mm^−1
t'/n'                           2.694611 mm        2.547771 mm        [12.699 mm]

Marginal ray
y                    1 mm       1 mm               0.883924 mm        0.756833 mm        0 mm
n'u'                 0          −0.043077 radian   −0.049883 radian   −0.059596 radian   −0.059596 radian

Chief ray
ȳ                               0 mm               2.694611 mm        5.189519 mm        16.779317 mm
n'ū'                 1 radian   1 radian           0.979251 radian    0.912654 radian
The raytrace indicates that the provisional marginal ray emerges from the last surface with height
and angle
\[ \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} 0.756833\,\text{mm} \\ -0.059596\,\text{radians} \end{bmatrix} \]
These are used to calculate the (boxed) distance to
the image location (where the marginal ray height is 0):
\[ y' = 0 = y + \frac{t'}{n'}\,(n'u') \]
\[ 0 = (+0.756833) + \frac{t'}{n'}\,(-0.059596) \]
\[ \implies \frac{t'}{n'} = \frac{+0.756833}{0.059596} \cong +12.699\,\text{mm} \]
This is the “reduced distance” in the image medium with index $n_4$; the physical distance $t'$ is:
\[ \implies t' = \frac{+0.756833}{0.059596}\,\text{mm} \cdot n' = 12.699 \cdot 1.336 \cong 16.966\,\text{mm} \]
The height and angle of the provisional chief ray at the image location are $\bar{y} \cong 16.78$ mm and
$n'\bar{u}' \cong 0.91$ radians, respectively, which may be scaled to the size of a known sensor to determine
the field of view.
This particular system is often used as a model for the human eye with the lens “relaxed” to view
objects at ∞. The first surface represents the cornea of the eye, while the other two surfaces are
the front and back of the lens. Note that the power of the cornea ($0.043077\,\text{mm}^{-1} \cong 43$ diopters) is
considerably larger than the powers of the lens surfaces (7.7 diopters and 12.8 diopters, respectively).
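The worksheet calculation for the eye model can be reproduced by evaluating the refraction and transfer equations in sequence. The following Python sketch is an illustration, not part of the original notes; the function name `trace` and the layout of the `surfaces` list are assumptions made for this example, while the radii, thicknesses, and indices are those quoted in the text.

```python
# Each surface: (radius R in mm, index n' after the surface,
#                thickness t' in mm to the next surface, or None for the last).
surfaces = [
    (7.8, 1.336, 3.6),    # cornea
    (10.0, 1.413, 3.6),   # front of lens
    (-6.0, 1.336, None),  # back of lens
]


def trace(y, nu, surfaces, n_initial=1.0):
    """Return the list of (y, n'u') just after refraction at each surface."""
    n = n_initial
    rows = []
    for radius, n_next, thickness in surfaces:
        phi = (n_next - n) / radius            # surface power
        nu = nu - y * phi                      # paraxial refraction
        rows.append((y, nu))
        if thickness is not None:
            y = y + (thickness / n_next) * nu  # paraxial transfer
        n = n_next
    return rows


marginal = trace(1.0, 0.0, surfaces)   # provisional marginal ray
chief = trace(0.0, 1.0, surfaces)      # provisional chief ray

y3, nu3 = marginal[-1]
reduced_image_distance = -y3 / nu3                       # where y' = 0
image_distance = reduced_image_distance * surfaces[-1][1]
chief_height_at_image = chief[-1][0] + reduced_image_distance * chief[-1][1]

for name, rows in (("marginal", marginal), ("chief", chief)):
    for k, (y, nu) in enumerate(rows, start=1):
        print(f"{name:8s} surface {k}: y = {y:.6f} mm, n'u' = {nu:.6f}")
print(f"reduced distance = {reduced_image_distance:.3f} mm, "
      f"physical distance = {image_distance:.3f} mm")
```

Small differences in the last digits relative to the table come from carrying full floating-point precision instead of rounding intermediate values.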
3.2 Matrix Formulation of Paraxial Ray Tracing
The same linear paraxial ray tracing equations may be conveniently implemented as matrices acting
on ray vectors for the marginal and chief rays whose components are the height and angle. The ray
vectors may be defined as:
\[ \text{paraxial marginal ray vector:} \quad \begin{bmatrix} y \\ nu \end{bmatrix} \]
\[ \text{paraxial chief ray vector:} \quad \begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix} \]
Note that there is nothing magical about the convention for the ordering of y and nu (i.e., which goes
“on top” of the vector); this is the convention used by Roland Shack at the Optical Sciences Center
at the University of Arizona, but Willem Brouwer’s book “Matrix Methods in Optical Instrument
Design” uses the opposite order. Note that the choice of convention here determines the form of the
system matrix, but the two choices are equivalent.
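In this notation, the two column vectors that represent the marginal and chief rays may be
combined to form a ray matrix $L$:
\[ L \equiv \left[ \begin{pmatrix} y \\ nu \end{pmatrix} \begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix} \right] = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} \]
which may be evaluated at any point in the system. The determinant of this ray matrix is:
\[ \det[L] = y \cdot (n\bar{u}) - (nu) \cdot \bar{y} \equiv \aleph \]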
which we shall show to be a constant — the so-called Lagrange invariant. In words, the Lagrange
invariant is the product of the chief ray height and marginal ray angle subtracted from the product
of the marginal ray height and chief ray angle. We denote it by the symbol ℵ (“aleph,” chosen here
for the simple reason that it is distinctive). We shall see that ℵ is unaffected by both the refraction
and transfer, and therefore is invariant as we progress through different locations in the system.
3.2.1 Refraction Matrix
Given the ray vectors or the ray matrix, we can now define operators for refraction and transfer.
Recall that paraxial refraction of a marginal ray and of a chief ray at a surface with power φ changes
the ray angles but not the heights (at the surfaces):
\[ n'u' = nu - y \cdot \phi \quad \text{for marginal ray} \]
\[ n'\bar{u}' = n\bar{u} - \bar{y} \cdot \phi \quad \text{for chief ray} \]
The refraction process may be written as a matrix $R$; its product with the ray matrix has the same
ray heights and different angles:
\[ R \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix}, \qquad R = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \]
where we need to evaluate the four values $a$ through $d$. Consider the action of the refraction matrix on the
marginal ray:
\[ R \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} \begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y \\ n'u' \end{bmatrix} \]
\[ ay + c \cdot (nu) = y \implies a = 1,\ c = 0 \]
\[ by + d \cdot (nu) = n'u' = nu - y \cdot \phi \implies b = -\phi,\ d = 1 \]
Substitute these values to see the form of the refraction matrix:
\[ R = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \]
The determinant of the refraction matrix is:
\[ \det R = \det \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} = (1)(1) - (-\phi)(0) = 1 \]
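The action of a refraction matrix $R$ on a ray matrix $L$ is:
\[ R L = L' \]
\[ \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y & \bar{y} \\ nu - y \cdot \phi & n\bar{u} - \bar{y} \cdot \phi \end{bmatrix} \]
The determinant of the ray matrix after refraction is:
\[ \det[L'] = y\,(n\bar{u} - \bar{y} \cdot \phi) - \bar{y}\,(nu - y \cdot \phi) \]
\[ = y \cdot n\bar{u} - y\bar{y} \cdot \phi - \bar{y} \cdot nu + y\bar{y} \cdot \phi \]
\[ = y \cdot n\bar{u} - \bar{y} \cdot nu = \aleph = \det[L] \]
which confirms that the Lagrangian invariant is not affected by refraction.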
3.2.2 Ray Transfer Matrix
The transfer of the marginal ray from one surface to the next within the medium with index $n'$ is
\[ y' = y + \frac{t'}{n'}\,(n'u') \]
which also may be written as the product of a transfer matrix $T$ with the marginal ray vector:
\[ T \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} y + (n'u') \left( \frac{t'}{n'} \right) \\ n'u' \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y \\ n'u' \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]
so the determinant of the transfer matrix also is 1:
\[ \det \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} = (1)(1) - (0) \left( \frac{t'}{n'} \right) = 1 \]
The action of the transfer matrix $T$ on the ray matrix $L$ is:
\[ L' = T L = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{bmatrix} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix} \]
\[ = \begin{bmatrix} y + \left( \frac{t'}{n'} \right) \cdot n'u' & \bar{y} + \left( \frac{t'}{n'} \right) \cdot n'\bar{u}' \\ n'u' & n'\bar{u}' \end{bmatrix} \]
and the determinant of the ray matrix after the transfer operation is:
\[ \det[L'] = \det[T L] \]
\[ = \left( y + \left( \frac{t'}{n'} \right) n'u' \right) (n'\bar{u}') - \left( \bar{y} + \left( \frac{t'}{n'} \right) n'\bar{u}' \right) (n'u') \]
\[ = y \cdot n'\bar{u}' + \left( \frac{t'}{n'} \right) n'u' \cdot n'\bar{u}' - \bar{y} \cdot n'u' - \left( \frac{t'}{n'} \right) n'\bar{u}' \cdot n'u' \]
\[ = y \cdot n'\bar{u}' - \bar{y} \cdot n'u' = \aleph = \det[L] \]
so the determinants of the ray matrix before and after transfer are also identically the Lagrangian
invariant $\aleph$; in other words, neither the refraction nor the transfer matrix has any effect on the
determinant of a ray matrix, so the Lagrangian invariant is preserved by refraction or transfer (hence
its name!).
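The invariance can also be verified numerically by applying refraction and transfer matrices to a ray matrix and evaluating the determinant after each step. The following Python sketch is an illustration, not part of the original notes; the helper names are illustrative, and the ray matrix entries, powers, and separation are arbitrary assumed values. Exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def refraction(phi):
    """Refraction matrix for a surface or thin lens of power phi."""
    return [[F(1), F(0)], [-F(phi), F(1)]]


def transfer(reduced_thickness):
    """Transfer matrix for a reduced thickness t'/n'."""
    return [[F(1), F(reduced_thickness)], [F(0), F(1)]]


# Ray matrix [[y, y_bar], [nu, nu_bar]] with arbitrary (assumed) entries.
L = [[F(2), F(5)], [F(1, 10), F(-1, 4)]]
invariants = [det(L)]
for op in (refraction(F(1, 100)), transfer(75), refraction(F(1, 50))):
    L = matmul(op, L)
    invariants.append(det(L))
print([str(v) for v in invariants])
```

Every entry of `invariants` is the same number, which is the Lagrange invariant of the chosen pair of rays.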
Ray Transfer Matrix for an Optical System
The refraction and transfer matrices may be combined in sequence to model a complete system. If
we start with the marginal ray vector at the input object, the first operation is transfer to the first
surface. The next is refraction by that surface, transfer to the next, and so forth until a final transfer
to the output image:
\[ \left( T_n R_n \cdots T_2 R_2 T_1 R_1 T_0 \right) L_{\text{object}} = L_{\text{image}} \]
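If the initial ray matrix is located at the object (as usual), the marginal ray height is zero, so the
ray matrix at the object and any images has the form:
\[ L_{\text{object}} = \begin{bmatrix} 0 & \bar{y}_{\text{in}} \\ (nu)_{\text{in}} & (n\bar{u})_{\text{in}} \end{bmatrix} \]
\[ L_{\text{image}} = \begin{bmatrix} 0 & \bar{y}_{\text{out}} \\ (nu)_{\text{out}} & (n\bar{u})_{\text{out}} \end{bmatrix} \]
so the system from object to image is:
\[ S \equiv T_n R_n \cdots T_2 R_2 T_1 R_1 T_0 \]
\[ S \cdot L_{\text{object}} = L_{\text{image}} \]
\[ \left( T_n R_n \cdots T_2 R_2 T_1 R_1 T_0 \right) \begin{bmatrix} 0 & \bar{y}_{\text{in}} \\ (nu)_{\text{in}} & (n\bar{u})_{\text{in}} \end{bmatrix} = \begin{bmatrix} 0 & \bar{y}_{\text{out}} \\ (nu)_{\text{out}} & (n\bar{u})_{\text{out}} \end{bmatrix} \]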
Note that the individual refraction and transfer matrices are sequenced in inverse order, i.e., the
last matrix is the first in the sequence for the system. The transfer matrix T 0 acts on the input ray
matrix, so it must appear on the right.
Ray Matrix for Provisional Marginal and Chief Rays
The system is characterized by using provisional marginal and chief rays located at the object. The
linearity of the computations ensures that the rays may be scaled subsequently to satisfy other system
constraints, such as the diameter of the stop. The provisional marginal ray at the object has height
$y = 0$ and ray angle $nu = +1$, while the provisional chief ray at the object has height $\bar{y} = +1$ and
angle $n\bar{u} = 0$. Thus the provisional ray matrix at the object is:
\[ L_0 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]
3.2.3 “Vertex-to-Vertex Matrix” for System
We can construct a matrix that represents JUST the optical system by excluding the input ray
matrix, the transfer matrix from object to object-space vertex, the transfer from image-space vertex
to image, and the output ray matrix. This subset is the “vertex-to-vertex matrix” $M_{VV'}$ of the
system and is a complete specification of the paraxial properties of the system. The general form
for the matrix is:
\[ M_{VV'} = \left( R_n \cdots T_2 R_2 T_1 R_1 \right) = \begin{bmatrix} A & B \\ C & D \end{bmatrix} \]
where $A$, $B$, $C$, $D$ are factors to be determined from the various refractions and transfers for a specific
system. The entries $A$ and $D$ in the matrix are “pure” numbers (without units), while $B$ and $C$
have dimensions of length and reciprocal length, respectively. From matrix algebra, it is possible to
show that the determinant of a matrix product is the product of the determinants. We already
know that the determinant of the matrix for any transfer or refraction is unity, which establishes
a constraint on the vertex-to-vertex matrix:
\[ \det[M_{VV'}] = \det R_n \cdot \det T_{n-1} \cdots \det R_2 \cdot \det T_1 \cdot \det R_1 = 1 \cdot 1 \cdots 1 \cdot 1 = 1 \]
\[ \implies \det[M_{VV'}] = 1 \implies AD - BC = 1 \]
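Consider a simple example of the matrix $M_{VV'}$ for a two-lens system with powers $\phi_1 = (f_1)^{-1}$
and $\phi_2 = (f_2)^{-1}$ separated by $t$. The product of the two refraction matrices and the transfer matrix
is:
\[ M_{VV'} = R_2 T_1 R_1 = \begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} \]
\[ M_{VV'} = \begin{bmatrix} 1 - \phi_1 t & t \\ -\phi_{\text{eff}} & 1 - \phi_2 t \end{bmatrix} \]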
where the known expression for the system power
\[ \frac{1}{f_{\text{eff}}} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 \cdot f_2} \implies \phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t \]
has been substituted in the last expression. It is easy to confirm that the determinant of this system
matrix is unity.
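The two-lens product can be multiplied out numerically and compared with the closed form. The sketch below is an illustration in Python, not part of the original notes; the helper names are illustrative, and the values $f_1 = 100$ mm, $f_2 = 50$ mm, $t = 75$ mm are the two-positive-lens example used in these notes. Note that the matrices are multiplied in the order $R_2 T_1 R_1$, i.e., the first element encountered by the ray appears on the right.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def refraction(phi):
    return [[1, 0], [-phi, 1]]


def transfer(t):
    return [[1, t], [0, 1]]


def two_lens_matrix(f1, f2, t):
    """Vertex-to-vertex matrix R2 * T1 * R1 for two thin lenses in air."""
    phi1, phi2 = 1 / F(f1), 1 / F(f2)
    return matmul(refraction(phi2), matmul(transfer(F(t)), refraction(phi1)))


f1, f2, t = 100, 50, 75                       # mm
M = two_lens_matrix(f1, f2, t)
phi1, phi2 = F(1, f1), F(1, f2)
phi_eff = phi1 + phi2 - phi1 * phi2 * t       # closed-form system power
determinant = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print("A, B, C, D =", M[0][0], M[0][1], M[1][0], M[1][1])
print("f_eff =", 1 / phi_eff, "mm; det =", determinant)
```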
We have four equations in the four unknowns $A$, $B$, $C$, $D$, which may be combined to find useful
system metrics in terms of the elements in the vertex-to-vertex matrix $M_{VV'}$:
  effective focal length of system: $f_{\text{eff}} = \frac{1}{\phi_{\text{eff}}} = -\frac{1}{C}$
  front focal length: $FFL = \frac{FV}{n} = -\frac{D}{C}$
  back focal length: $BFL = \frac{V'F'}{n'} = -\frac{A}{C}$
  distance from front vertex to object-space principal point: $\frac{VH}{n} = \frac{D - 1}{C}$
  distance from image-space principal point to rear vertex: $\frac{H'V'}{n'} = \frac{1 - A}{C}$
  distance from rear vertex to image (if obj. dist. $t_1$ is known): $\frac{V'O'}{n'} = \frac{t_2}{n'} = \frac{m - A}{C} = -\frac{B - A t_1}{D - C t_1}$
  distance from object to front vertex (if image dist. $t_2$ is known): $\frac{OV}{n} = \frac{t_1}{n} = \frac{D - \frac{1}{m}}{C} = \frac{B + D t_2}{A + C t_2}$
When evaluating matrices, note that you need to retain plenty of significant figures in the calculation (at least 6) to ensure that the derived values are sufficiently accurate.
3.2.4 Example 1: System of Two Positive Thin Lenses
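To illustrate, consider the system of two thin lenses in the last section with $f_1 = +100$ mm, $f_2 =
+50$ mm, and $t = 75$ mm, which we showed to have $f_{\text{eff}} = +\frac{200}{3}\,\text{mm} \cong 66.7$ mm. The system matrix
is:
\[ M_{VV'} = \begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} \]
\[ = \begin{bmatrix} 1 & 0 \\ -\frac{1}{50\,\text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75\,\text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100\,\text{mm}} & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75\,\text{mm} \\ -\frac{3}{200\,\text{mm}} & -\frac{1}{2} \end{bmatrix} \]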
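and its determinant evaluates to one:
\[ \det \begin{bmatrix} \frac{1}{4} & 75\,\text{mm} \\ -\frac{3}{200\,\text{mm}} & -\frac{1}{2} \end{bmatrix} = 1 \]
From the values in the last section, we can see that
\[ B = 75\,\text{mm} = t \]
\[ -\frac{1}{C} = \frac{200}{3}\,\text{mm} = f_{\text{eff}} \]
which in turn demonstrates our old result that the power of a two-lens system is:
\[ C = -\frac{1}{f_{\text{eff}}} \implies \phi = \phi_1 + \phi_2 - \phi_1 \phi_2 t = \frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1 f_2} \]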
The input ray matrix consists of the provisional marginal and chief rays at the object, which
“pass through” the transfer matrix from object to front surface. For example, if the object is located
1000 mm from the front vertex, the transfer matrix is:
\[ T_0 = \begin{bmatrix} 1 & 1000\,\text{mm} \\ 0 & 1 \end{bmatrix} \]
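If a ray is “cast out” from the center of the object ($y = 0$) at an angle of 1 radian, the ray vector at the
front vertex is:
\[ T_0 \begin{bmatrix} y \\ nu \end{bmatrix} = T_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1000\,\text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix} \]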
In words, the height of the provisional marginal ray at the front vertex is 1000 mm and the angle is
1 radian, a HUGE angle, but remember that all equations in this paraxial assumption are linear, so
the angle and ray height can be scaled to any value. The emerging provisional marginal ray is:
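\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} \frac{1}{4} & 75\,\text{mm} \\ -\frac{3}{200\,\text{mm}} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1000\,\text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 325\,\text{mm} \\ -\frac{31}{2} \end{bmatrix} \]
In words, the marginal ray from an object 1000 mm in front of the front vertex, cast at an angle of
1 radian, emerges from the image-space vertex with height $y' = 325$ mm and angle $n'u' = -\frac{31}{2}$ radians.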
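To find the location of the image, find the distance until the marginal ray height $y = 0$, which is
the location of the image:
\[ T_{V'O'} \begin{bmatrix} 325\,\text{mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 325\,\text{mm} \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} \]
\[ \implies 325\,\text{mm} + \left( -\frac{31}{2} \right) \cdot \frac{t'}{n'} = 0 \]
\[ \implies \frac{t'}{1} = 325\,\text{mm} \cdot \left( +\frac{2}{31} \right) = +\frac{650}{31}\,\text{mm} \cong +20.97\,\text{mm} \]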
which agrees with the result obtained earlier. We observed that the transverse magnification of the
image in this configuration is
\[ M_T = -\frac{z'}{z} = -\frac{H'O'}{OH} = -\frac{2\,\text{mm}}{31\,\text{mm}} \cong -0.0645 \]
so the provisional marginal ray at the image point is:
\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{31}{2} \end{bmatrix} = \begin{bmatrix} 0 \\ M_T^{-1} \end{bmatrix} \]
The marginal ray out of the vertex-to-vertex matrix for the object distance OV = 1000.
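The steps of this example (transfer from the object, the vertex-to-vertex matrix, and the final transfer to the image) can be sketched in a few lines of Python. This is an illustration and not part of the original notes; the helper names `apply` and `transfer` are illustrative, and the system is assumed to be in air so that reduced and physical distances coincide.

```python
from fractions import Fraction as F

# Vertex-to-vertex matrix of the two-lens example (lengths in mm).
M = [[F(1, 4), F(75)], [F(-3, 200), F(-1, 2)]]


def apply(m, ray):
    """Apply a 2x2 matrix to a ray vector (y, nu)."""
    y, nu = ray
    return (m[0][0] * y + m[0][1] * nu, m[1][0] * y + m[1][1] * nu)


def transfer(t):
    """Transfer matrix for a distance t in air."""
    return [[F(1), F(t)], [F(0), F(1)]]


# Provisional marginal ray: leaves the axial object point at an angle of 1.
ray_at_front_vertex = apply(transfer(1000), (F(0), F(1)))
ray_at_rear_vertex = apply(M, ray_at_front_vertex)

# Distance behind the rear vertex at which the ray height returns to zero.
image_distance = -ray_at_rear_vertex[0] / ray_at_rear_vertex[1]
# With a unit input angle, the emerging angle is 1 / M_T.
magnification = 1 / ray_at_rear_vertex[1]

print("ray at rear vertex:", ray_at_rear_vertex)
print("image distance:", float(image_distance), "mm")
print("magnification:", float(magnification))
```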
Back Focal Length (BFL)
The image of an object located at ∞ is the image-space focal point of the system. This ray enters
the system with angle nu = 0 and arbitrary height, which we can model as y = 1. The emerging
ray is:
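\[ \begin{bmatrix} \frac{1}{4} & 75 \\ -\frac{3}{200} & -\frac{1}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} \]
The ray height is $\frac{1}{4}$ mm and the angle is $n'u' = -\frac{3}{200}$. The distance to the point where the ray
height is zero is the back focal distance:
\[ T_{V'F'} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{4} \\ -\frac{3}{200} \end{bmatrix} = \begin{bmatrix} 0 \\ -\frac{3}{200} \end{bmatrix} \]
\[ \implies \frac{1}{4} + \left( -\frac{3}{200\,\text{mm}} \right) \cdot \frac{t'}{n'} = 0 \]
\[ \implies BFL = \frac{t'}{1} = \frac{1}{4} \times \frac{200\,\text{mm}}{3} = \frac{100}{6}\,\text{mm} \cong 16.7\,\text{mm} \]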
Front Focal Length (FFL): Ray Through “Reversed” System
To find the front focal distance, we can trace the “provisional” marginal ray “backwards” through
the system, or trace it through the “reversed” system where the lenses are placed in the opposite
order. The “reversed” system matrix is:
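\[ (M_{VV'})_{\text{reversed}} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{100} & 1 \end{bmatrix} \begin{bmatrix} 1 & 75 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{50} & 1 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 75 \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix} \]
Note that the “diagonal” elements of the “forward” and “reversed” vertex-to-vertex matrices are
“swapped”, while the “off-diagonal” elements are identical.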
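If the input ray height is 1 and the angle is 0, the outgoing ray from the reversed matrix is:
\[ \begin{bmatrix} -\frac{1}{2} & 75\,\text{mm} \\ -\frac{3}{200} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} \\ -\frac{3}{200} \end{bmatrix} \implies FFL = FV = -\frac{\left( -\frac{1}{2}\,\text{mm} \right)}{\left( -\frac{3}{200} \right)} = -\frac{100}{3}\,\text{mm} \]
This agrees with $FFL = -\frac{D}{C}$ evaluated from the forward matrix. The negative value indicates that the
ray height is already diverging from the axis at the vertex, so the object-space focal point lies behind
the front vertex (inside the system).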
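The relationship between the forward and reversed matrices can be checked by building both products. The following Python sketch is illustrative and not part of the original notes; the helper names are assumptions, and the lenses are taken to be thin and in air.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def two_lens_matrix(f_first, f_second, t):
    """R_second * T * R_first for two thin lenses separated by t."""
    r_first = [[1, 0], [-1 / F(f_first), 1]]
    r_second = [[1, 0], [-1 / F(f_second), 1]]
    return matmul(r_second, matmul([[1, F(t)], [0, 1]], r_first))


forward = two_lens_matrix(100, 50, 75)    # light meets the 100 mm lens first
reverse = two_lens_matrix(50, 100, 75)    # lenses in the opposite order

# Ray entering the reversed system parallel to the axis at unit height.
y_out = reverse[0][0] * 1 + reverse[0][1] * 0
nu_out = reverse[1][0] * 1 + reverse[1][1] * 0

print("forward:", forward)
print("reverse:", reverse)
print("emerging ray from reversed system:", y_out, nu_out)
```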
3.2.5 Example 2: Telephoto Lens
To illustrate, we apply the vertex-to-vertex matrix for the thin-lens telephoto considered in the last
section with f1 = +100 mm, f2 = −25 mm, and t = +80 mm:
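\[ M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{-25\,\text{mm}} & 1 \end{bmatrix} \begin{bmatrix} 1 & +80\,\text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -\frac{1}{100\,\text{mm}} & 1 \end{bmatrix} \]
\[ = \begin{bmatrix} 1 - \phi_1 t & t \\ -(\phi_1 + \phi_2 - \phi_1 \phi_2 t) & 1 - \phi_2 t \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & 80\,\text{mm} \\ -\frac{1}{500\,\text{mm}} & \frac{21}{5} \end{bmatrix} \]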
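\[ \implies f_{\text{eff}} = -\frac{1}{C} = +500\,\text{mm} \]
\[ \implies BFL = -\frac{A}{C} = -\left( \frac{1}{5} \right) \cdot (-500\,\text{mm}) = +100\,\text{mm} \]
\[ \implies FFL = -\frac{D}{C} = -\left( \frac{21}{5} \right) \cdot (-500\,\text{mm}) = +2100\,\text{mm} \]
\[ \implies \frac{VH}{n} = \frac{D - 1}{C} = \left( \frac{21}{5} - 1 \right) \cdot (-500\,\text{mm}) = -1600\,\text{mm} \implies HV = +1600\,\text{mm} \]
\[ \implies \frac{H'V'}{n'} = \frac{1 - A}{C} = \left( 1 - \frac{1}{5} \right) \cdot (-500\,\text{mm}) = -400\,\text{mm} \implies V'H' = +400\,\text{mm} \]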
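These metrics follow mechanically from the four matrix elements, so they are easy to tabulate in code. The sketch below is an illustration in Python, not part of the original notes; the function name `paraxial_metrics` and the dictionary keys are assumptions, and the formulas are the ones listed for the vertex-to-vertex matrix with the system in air ($n = n' = 1$).

```python
from fractions import Fraction as F


def paraxial_metrics(A, B, C, D):
    """System metrics from the vertex-to-vertex matrix elements (n = n' = 1)."""
    if A * D - B * C != 1:
        raise ValueError("vertex-to-vertex matrix must have unit determinant")
    return {
        "f_eff": -1 / C,
        "BFL": -A / C,
        "FFL": -D / C,
        "VH": (D - 1) / C,
        "H'V'": (1 - A) / C,
    }


# Telephoto example: f1 = +100 mm, f2 = -25 mm, t = 80 mm.
telephoto = paraxial_metrics(F(1, 5), F(80), F(-1, 500), F(21, 5))
for name, value in telephoto.items():
    print(f"{name:5s} = {float(value):+.1f} mm")
```

The unit-determinant check is included because a matrix that fails it cannot be a product of refraction and transfer matrices.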
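If the object is located 1000 mm from the first surface, the ray vector at the front vertex of the
system is:
\[ T_0 \begin{bmatrix} y \\ nu \end{bmatrix} = T_0 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 1000\,\text{mm} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1000\,\text{mm} \\ 1 \end{bmatrix} \]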
The height of the provisional marginal ray at the front vertex is 1000 units and the angle is 1 radian,
which are huge values, but can be scaled to any value because all equations are linear.
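\[ \begin{bmatrix} y' \\ n'u' \end{bmatrix} = \begin{bmatrix} \frac{1}{5} & 80\,\text{mm} \\ -\frac{1}{500\,\text{mm}} & \frac{21}{5} \end{bmatrix} \begin{bmatrix} 1000\,\text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 280\,\text{mm} \\ \frac{11}{5} \end{bmatrix} \]
In words, the marginal ray from an object 1000 mm in front of the lens emerges with height 280 mm
and angle of $+\frac{11}{5}$ radians.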
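To find the location of the image, find the distance until the marginal ray height $y = 0$:
\[ T_{V'O'} \begin{bmatrix} 280\,\text{mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 1 & \frac{t'}{n'} \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 280\,\text{mm} \\ \frac{11}{5} \end{bmatrix} = \begin{bmatrix} 0 \\ \frac{11}{5} \end{bmatrix} \]
\[ \implies 280\,\text{mm} + \left( +\frac{11}{5} \right) \cdot \frac{t'}{n'} = 0 \]
\[ \implies \frac{t'}{1} = 280\,\text{mm} \cdot \left( -\frac{5}{11} \right) = -\frac{1400}{11}\,\text{mm} \cong -127.3\,\text{mm} \]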
which indicates that the image is virtual. (Figure out why!)
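The magnification of the image in this configuration is
\[ M_T = -\frac{z'}{z} = -\frac{H'O'}{OH} = +\frac{5}{11} \cong +0.455 \]
which is the reciprocal of the emerging angle $n'u' = +\frac{11}{5}$ of the provisional marginal ray, just as in
Example 1.
3.2.6 $M_{VV'}$ Derived From Two Rays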
Consider the action of the vertex-vertex matrix on two rays that we know both before and after the
system. For two arbitrary (but noncollinear) rays, we have:
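\[ M_{VV'} \begin{bmatrix} y_1 \\ nu_1 \end{bmatrix} = \begin{bmatrix} y_1' \\ n'u_1' \end{bmatrix} \]
\[ M_{VV'} \begin{bmatrix} y_2 \\ nu_2 \end{bmatrix} = \begin{bmatrix} y_2' \\ n'u_2' \end{bmatrix} \]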
In actual use, the marginal ray and chief ray are the rays of choice. The marginal ray goes from
the center of the object to the center of the image while grazing the edge of the aperture stop (and
therefore the edge of the entrance and exit pupils), while the chief ray goes from the edge of the
object through the center of the aperture stop (and therefore of the pupils) to the edge of the image.
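The vertex-vertex matrix applied to the incoming marginal ray from the center of the object yields the emerging marginal ray:
\[
M_{VV'}\begin{bmatrix} y \\ nu \end{bmatrix} = \begin{bmatrix} y' \\ n'u' \end{bmatrix}
\]
and the same relation for the chief ray is:
\[
M_{VV'}\begin{bmatrix} \bar{y} \\ n\bar{u} \end{bmatrix} = \begin{bmatrix} \bar{y}' \\ n'\bar{u}' \end{bmatrix}
\]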
We can combine the two vectors to form a 2 × 2 matrix:
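\[
M_{VV'}\begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}
= \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix}
\qquad\Longleftrightarrow\qquad
M_{VV'}\,L = L'
\]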
We can now use the properties of the 2 × 2 matrix to derive the form of vertex-vertex matrix:
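\[
\begin{aligned}
\left(M_{VV'}\,L\right)L^{-1} &= L'\,L^{-1} \\
\left(M_{VV'}\,L\right)L^{-1} &= M_{VV'}\left(L\,L^{-1}\right) = M_{VV'}\cdot I \\
\Longrightarrow\ L'\,L^{-1} &= M_{VV'}
\end{aligned}
\]
In words, we can evaluate the vertex-vertex matrix from its action on the marginal and chief rays.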
The inverse of the input-ray matrix is easy to derive:
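\[
L = \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix}
\]
\[
\Longrightarrow L^{-1} = \frac{1}{\det L}\begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}
= \frac{1}{y\cdot n\bar{u} - \bar{y}\cdot nu}\begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}
\equiv \frac{1}{\aleph}\begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}
\]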
where ℵ ≡ y · nū − ȳ · nu is the previously defined Lagrangian invariant. So the vertex-vertex
matrix has the form:
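\[
M_{VV'} = \begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix}
\cdot\left(\frac{1}{y\cdot n\bar{u} - \bar{y}\cdot nu}\right)
\begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}
= \frac{1}{\aleph}\cdot
\begin{bmatrix} y' & \bar{y}' \\ n'u' & n'\bar{u}' \end{bmatrix}
\begin{bmatrix} n\bar{u} & -\bar{y} \\ -nu & y \end{bmatrix}
\]
\[
= \frac{1}{\aleph}\cdot
\begin{bmatrix}
y'\cdot n\bar{u} - \bar{y}'\cdot nu & y\cdot\bar{y}' - \bar{y}\cdot y' \\
n'u'\cdot n\bar{u} - n'\bar{u}'\cdot nu & n'\bar{u}'\cdot y - n'u'\cdot\bar{y}
\end{bmatrix}
\]
\[
= \frac{1}{\aleph}\cdot
\begin{bmatrix}
\begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix} &
\begin{vmatrix} y & \bar{y} \\ y' & \bar{y}' \end{vmatrix} \\[2ex]
-\begin{vmatrix} nu & n\bar{u} \\ n'u' & n'\bar{u}' \end{vmatrix} &
\begin{vmatrix} y & \bar{y} \\ n'u' & n'\bar{u}' \end{vmatrix}
\end{bmatrix}
\]
where we have used the shorthand notation for the determinant in the last expression:
\[
\det\begin{bmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{bmatrix}
= \begin{vmatrix} y' & \bar{y}' \\ nu & n\bar{u} \end{vmatrix}
\]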
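As a numerical illustration of this result, the sketch below pushes two non-collinear rays through a known system matrix and then recovers that matrix from the input and output ray matrices alone. The two rays chosen here are hypothetical test rays (the "chief" ray is simply taken to cross the axis at the front vertex), and the helper names are assumptions.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    """Inverse of a 2x2 matrix via the adjugate divided by the determinant."""
    (a, b), (c, d) = m
    k = det(m)
    return ((d / k, -b / k), (-c / k, a / k))

# Known system (the thin-lens telephoto), used only to generate the output rays.
M_true = ((F(1, 5), F(80)), (F(-1, 500), F(21, 5)))

# Columns of L: marginal ray (y, nu) = (1000, 1) and a hypothetical chief ray (0, 1).
L_in = ((F(1000), F(0)), (F(1), F(1)))
L_out = matmul(M_true, L_in)

# M_VV' = L' L^-1, using nothing but the two rays before and after the system.
M_recovered = matmul(L_out, inverse(L_in))

print(M_recovered)
print(det(L_in), det(L_out))  # the Lagrangian invariant before and after
```

The equality of the two determinants reflects the fact that the system matrix has unit determinant, so the invariant is preserved.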
3.3
Object-to-Image (Conjugate) Matrix
The vertex-vertex matrix applied to a “test ray” with height y and angle u in index n from the
object to the front vertex is:
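\[
M_{VV'}\begin{bmatrix} y \\ nu \end{bmatrix}
= \begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} y \\ nu \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}
\qquad
\begin{aligned}
y' &= A\cdot y + B\cdot(nu) \\
n'u' &= C\cdot y + D\cdot(nu)
\end{aligned}
\]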
For rays emerging from one plane and converging to the corresponding “conjugate” plane (the image), the output ray height at the image is a function ONLY of the ray height at the object; the angles of the rays at the object do not matter, since all rays from one object point converge to the same image point. In mathematical terms:
\[
y' = A\,y + B\cdot(nu) = f[y]\ \text{(does not depend on angle)}
\Longrightarrow B = 0
\Longrightarrow y' = A\cdot y
\]
We know the relationship between y′ and y is the transverse magnification:
\[
\frac{y'}{y} = M_T = A
\]
Rays (a, b, c) diverge from the object and converge as (a′, b′, c′) to form the image; the choice of specific ray angle at the object has no effect on the location of the convergence; only the heights of the rays at the object matter.
If we define the angular magnification to be the ratio of the angles “from” the object and “to” the image:
\[
M_\theta \equiv \frac{\Delta u'}{\Delta u}
\]
we can find a relationship from the matrices by considering two rays that leave the same object height y at different angles:
\[
\begin{aligned}
n'u_1' &= C\cdot y + D\cdot(nu_1) \\
n'u_2' &= C\cdot y + D\cdot(nu_2)
\end{aligned}
\]
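Evaluate the difference of these:
\[
\begin{aligned}
n'\left(u_2' - u_1'\right) &= C\cdot y - C\cdot y + D\cdot\left(nu_2 - nu_1\right) \\
n'\cdot\Delta u' &= n\cdot D\cdot(\Delta u) \\
\Longrightarrow \frac{\Delta u'}{\Delta u} &\equiv M_\theta = \frac{n}{n'}\cdot D \\
\Longrightarrow D &= \frac{n'}{n}\cdot M_\theta
\end{aligned}
\]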
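We can combine these two observations to see the form of the “conjugate-to-conjugate” matrix:
\[
M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\dfrac{1}{f_{\mathrm{eff}}} & \dfrac{n'}{n}\cdot M_\theta \end{bmatrix}
\]
We know that the determinant of this matrix must also be one, which implies that:
\[
M_T\cdot\frac{n'}{n}M_\theta = 1 \Longrightarrow \frac{n'}{n}M_\theta = \frac{1}{M_T}
\]
so we can also write the conjugate matrix as:
\[
M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\dfrac{1}{f_{\mathrm{eff}}} & \dfrac{1}{M_T} \end{bmatrix}
\]
The principal planes H and H′ are those for which MT = +1:
\[
M_{HH'} = \begin{bmatrix} +1 & 0 \\ -\dfrac{1}{f_{\mathrm{eff}}} & +1 \end{bmatrix}
\]
The points of equal conjugates are related by MT = −1, so the object-image matrix for these points is:
\[
M_{OO'} = \begin{bmatrix} -1 & 0 \\ -\dfrac{1}{f_{\mathrm{eff}}} & -1 \end{bmatrix}
\]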
We can include the translation matrices from object to vertex and from vertex to image along with the vertex-to-vertex matrix:
\[
M_{VV'} = \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\]
The matrix that relates two conjugate planes (object O and image O′) may be obtained by adding transfer matrices for the appropriate reduced distances from the object to the front vertex (t1 = OV/n1) and from the rear vertex to the image (t2 = V′O′/n2), which yields for n1 = n2 = 1:
\[
\begin{aligned}
M_{OO'} &= \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix}\cdot M_{VV'}\cdot\begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & t_2 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} A & B \\ C & D \end{bmatrix}\begin{bmatrix} 1 & t_1 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix}
= \begin{bmatrix} M_T & 0 \\ -\phi & \dfrac{1}{M_T} \end{bmatrix}
\end{aligned}
\]
\[
\Longrightarrow\quad M_T = A + t_2 C = \left(C t_1 + D\right)^{-1},
\qquad \phi = -C,
\qquad 0 = (A + t_2 C)\,t_1 + B + t_2 D
\]
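We know that the marginal ray heights at the object and image are zero (y_in = y_out = 0), which sets some limits on the “conjugate-to-conjugate” matrix. Apply this matrix to the ray matrix L at the object to obtain the ray matrix L′ at the image:
\[
M_{OO'}\,L = L'
\]
\[
\begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix}
\begin{bmatrix} 0 & \bar{y}_{\mathrm{in}} \\ (nu)_{\mathrm{in}} & (n\bar{u})_{\mathrm{in}} \end{bmatrix}
= \begin{bmatrix} 0 & \bar{y}_{\mathrm{out}} \\ (nu)_{\mathrm{out}} & (n\bar{u})_{\mathrm{out}} \end{bmatrix}
\]
Evaluate the inverse matrix L⁻¹ and apply to both sides from the right:
\[
\left(M_{OO'}\,L\right)L^{-1} = L'\,L^{-1}
\]
\[
\begin{bmatrix} A + t_2 C & (A + t_2 C)\,t_1 + B + t_2 D \\ C & C t_1 + D \end{bmatrix}
= \begin{bmatrix} 0 & \bar{y}_{\mathrm{out}} \\ (nu)_{\mathrm{out}} & (n\bar{u})_{\mathrm{out}} \end{bmatrix}
\cdot\begin{bmatrix} 0 & \bar{y}_{\mathrm{in}} \\ (nu)_{\mathrm{in}} & (n\bar{u})_{\mathrm{in}} \end{bmatrix}^{-1}
\]
\[
= \begin{bmatrix}
\dfrac{\bar{y}_{\mathrm{out}}}{\bar{y}_{\mathrm{in}}} & 0 \\[2ex]
\dfrac{(n\bar{u})_{\mathrm{out}}\cdot(nu)_{\mathrm{in}} - (nu)_{\mathrm{out}}\cdot(n\bar{u})_{\mathrm{in}}}{\bar{y}_{\mathrm{in}}\,(nu)_{\mathrm{in}}} & \dfrac{(nu)_{\mathrm{out}}}{(nu)_{\mathrm{in}}}
\end{bmatrix}
\]
The ratio of the chief ray heights at the object and image is the transverse magnification, ȳ_out / ȳ_in ≡ MT, whereas the ratio of the marginal ray angles is (nu)_out / (nu)_in = 1/MT.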
Example: System with Two Positive Thin Lenses
Again, consider the example of a system composed of two thin lenses with f1 = +100 mm, f2 =
+50 mm, and t = +75 mm:
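\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 75\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{100\text{ mm}} & 1 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{4} & 75\text{ mm} \\ -\dfrac{3}{200\text{ mm}} & -\dfrac{1}{2} \end{bmatrix}
\]
From the table of properties of the matrix, we see that:
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} = +\frac{200}{3}\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{100}{3}\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = +\frac{50}{3}\text{ mm} \\
VH &= \frac{D-1}{C} = +100\text{ mm} \\
H'V' &= \frac{A-1}{C} = +50\text{ mm}
\end{aligned}
\]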
which again match the results obtained before. The matrix that relates the object and image planes for the two-lens system presented above (with the object 1000 mm in front of the first lens) is:
\[
T_2\,M_{VV'}\,T_1 =
\begin{bmatrix} 1 & \dfrac{650}{31} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} \dfrac{1}{4} & 75 \\ -\dfrac{3}{200} & -\dfrac{1}{2} \end{bmatrix}
\begin{bmatrix} 1 & 1000 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -\dfrac{2}{31} & 0 \\ -\dfrac{3}{200} & -\dfrac{31}{2} \end{bmatrix}
\]
which has the form of the principal plane matrix except the diagonal elements are not both unity. However, note that they are reciprocals of each other, so that
\[
\det\begin{bmatrix} -\dfrac{2}{31} & 0 \\ -\dfrac{3}{200} & -\dfrac{31}{2} \end{bmatrix} = 1
\]
We had evaluated the transverse magnification in this configuration to be MT = −2/31, so we note that the upper-left component of the conjugate-to-conjugate matrix is the transverse magnification.
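The image distance and magnification for this example follow directly from the condition that the upper-right element of the conjugate matrix vanishes. The sketch below is one way to organize that calculation, assuming object and image spaces in air; the function name `conjugate` is an assumption made for illustration.

```python
from fractions import Fraction as F

def conjugate(M, t1):
    """Given M_VV' and the object distance t1 (n = n' = 1), return the image
    distance t2, the transverse magnification, and the matrix M_OO'."""
    (A, B), (C, D) = M
    # Upper-right element of T2 M T1 must vanish: (A + t2*C)*t1 + B + t2*D = 0.
    t2 = -(A * t1 + B) / (C * t1 + D)
    MT = A + t2 * C
    M_OO = ((MT, (A + t2 * C) * t1 + B + t2 * D), (C, C * t1 + D))
    return t2, MT, M_OO

# Two positive thin lenses: f1 = +100 mm, f2 = +50 mm, t = +75 mm.
M_VV = ((F(1, 4), F(75)), (F(-3, 200), F(-1, 2)))
t2, MT, M_OO = conjugate(M_VV, F(1000))

print(t2, MT)
print(M_OO)
```

The lower-right element of the result is the reciprocal of the magnification, which is a convenient consistency check on any conjugate matrix computed this way.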
The general form of a conjugate-to-conjugate matrix is:
\[
M_{OO'} = \begin{bmatrix} M_T & 0 \\ -\phi & \dfrac{1}{M_T} \end{bmatrix}
\]
and the specific form that relates the principal planes with MT = 1 is
\[
M_{HH'} = \begin{bmatrix} 1 & 0 \\ -\phi & 1 \end{bmatrix}
\]
This is the matrix of the equivalent “single thin lens.”
3.3.1
Matrix of the “Relaxed” Eye (focused at ∞)
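The vertex-to-vertex matrix for the three refractions and two transfers is:
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\phi_3 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \dfrac{t_2'}{n_2'} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\phi_2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & \dfrac{t_1'}{n_1'} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\phi_1 & 1 \end{bmatrix}
\]
where the individual terms evaluate to:
\[
\begin{aligned}
\phi_1 &= \frac{n_1' - n_1}{R_1} = \frac{1.336 - 1}{7.8\text{ mm}} = 4.3077\times10^{-2}\text{ mm}^{-1} = 43.077\text{ m}^{-1} = 43.077\text{ Diopters} \\
\frac{t_1'}{n_1'} &= \frac{3.6\text{ mm}}{1.336} = 2.6946\text{ mm} \\
\phi_2 &= \frac{n_2' - n_2}{R_2} = \frac{1.413 - 1.336}{10\text{ mm}} = 0.77\times10^{-2}\text{ mm}^{-1} = 7.7\text{ Diopters} \\
\frac{t_2'}{n_2'} &= \frac{3.6\text{ mm}}{1.413} = 2.5478\text{ mm} \\
\phi_3 &= \frac{n_3' - n_3}{R_3} = \frac{1.336 - 1.413}{-6\text{ mm}} = 1.2833\times10^{-2}\text{ mm}^{-1} = 12.833\text{ Diopters}
\end{aligned}
\]
so the vertex-to-vertex matrix has the form:
\[
M_{VV'} = \begin{bmatrix} 0.75683 & 5.1895\text{ mm} \\ -5.9596\times10^{-2}\text{ mm}^{-1} & 0.91265 \end{bmatrix}
\]
\[
\Longrightarrow f_{\mathrm{eye}} = \left(5.9596\times10^{-2}\text{ mm}^{-1}\right)^{-1} = +16.780\text{ mm}
\]
\[
\Longrightarrow \phi_{\mathrm{eye}} = 5.9596\times10^{-2}\text{ mm}^{-1} = +59.596\text{ m}^{-1} \cong 60\text{ Diopters}
\]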
A ray from infinity has a ray angle of zero, but the ray height is determined from the diameter of
the iris. If we assume that the iris diameter is 1 mm, then the output ray vector is:
⎡
⎤ ⎡
⎤
⎤⎡
⎤ ⎡
0.75683
5.1895 mm
1 mm
0.756 83 mm
y0
⎣
⎦=⎣
⎦
⎦⎣
⎦=⎣
−5.9596 × 10−2 mm−1 0.91265
0
−5.959 6 × 10−2
n0 u0
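The five-matrix product for the relaxed eye is tedious by hand but short in code. The sketch below multiplies the matrices in floating point using the radii, thicknesses, and indices quoted above; the variable and function names are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def refraction(power):
    """Refraction matrix for a surface of power phi (mm^-1)."""
    return ((1.0, 0.0), (-power, 1.0))

def transfer(reduced_thickness):
    """Transfer matrix over a reduced thickness t/n (mm)."""
    return ((1.0, reduced_thickness), (0.0, 1.0))

n_air, n_aqueous, n_lens, n_vitreous = 1.0, 1.336, 1.413, 1.336

phi1 = (n_aqueous - n_air) / 7.8       # cornea, R1 = +7.8 mm
phi2 = (n_lens - n_aqueous) / 10.0     # front of eye lens, R2 = +10 mm
phi3 = (n_vitreous - n_lens) / -6.0    # back of eye lens, R3 = -6 mm
t1 = 3.6 / n_aqueous                   # reduced thickness, cornea to lens
t2 = 3.6 / n_lens                      # reduced thickness of the lens

# Rightmost factor acts first: R3 T2 R2 T1 R1.
M_eye = refraction(phi1)
for step in (transfer(t1), refraction(phi2), transfer(t2), refraction(phi3)):
    M_eye = matmul(step, M_eye)

f_eye = -1.0 / M_eye[1][0]
print(M_eye)
print(f_eye, 1000.0 / f_eye)  # focal length in mm and power in diopters
```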
3.4
Vertex-Vertex Matrices of Simple Imaging Systems
We now get to where the “rubber meets the road;” the discussion of simple examples of actual
imaging systems. It is useful to emphasize the point that optical systems may create a real image
that may be “sensed” by a CCD or photographic emulsion, while those for human viewing will
produce virtual images or are afocal (image at infinity).
3.4.1
Magnifier (“magnifying glass,” “loupe”)
The magnifier or loupe is a lens (or system of lenses) with positive focal length that is used to form a larger image on the retina than could be formed with the eye alone. Recall that
when the ciliary muscles that deform the eye lens are relaxed, the lens becomes “flatter,” increasing
the focal length. To view an object “close up,” the focal length of the lens must shorten by making
the lens more spherical. The closest distance to an object that appears to be sharply focused by
the unaided eye is the “near point,” which (obviously) depends on the flexibility of the deformable
eyelens and the capability of the ciliary muscles, which (obviously) vary with individual, and with
age for a single individual. The distance to the near point may be as close as 50 mm for a young
child and 1000 mm − 2000 mm for an elderly person. This reduction in “accommodation” is one
of the signs of aging. The near point of an “ideal” eye is assumed to be 250 mm ≅ 10 in from the front surface. For nearsighted individuals, the near point is closer to the eye, thus increasing the angular subtense of fine details for those individuals. For this reason, nearsighted individuals in ancient times (before optical correction) often were attracted to professions requiring fine work, such as goldsmithing. Since nearsightedness can be a genetic trait, descendants often continued in these crafts.
In use, the object is held closer to the eye than the near point and viewed through the positive lens, which in turn is held closer to the eye than its focal length to create a virtual image “behind” the lens at the near point. If the focal length of the magnifying lens is f = 50 mm and the virtual image is formed at the near point (image distance −250 mm), the object-to-image matrix for an object at distance z in front of the lens is:
\[
M_{OO'} =
\begin{bmatrix} 1 & -250\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{50\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & z \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 6 & 6\cdot z - 250\text{ mm} \\ -\dfrac{1}{50\text{ mm}} & 1 - \dfrac{z}{50\text{ mm}} \end{bmatrix}
\]
Since this has the form of an “object-to-image” matrix, the off-diagonal element in the upper-right corner must evaluate to zero:
\[
6\cdot z - 250\text{ mm} = 0 \Longrightarrow z = \frac{250\text{ mm}}{6} = 41\tfrac{2}{3}\text{ mm}
\]
The diagonal element in the upper-left corner of the “object-to-image” matrix is the transverse magnification
\[
M_T = +6 = 1 + \frac{250\text{ mm}}{f}
\]
This is the transverse magnification of the magnifier if the image is at the near point.
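The same bookkeeping can be written as a small function. This is a sketch under the assumptions used above (thin lens in air, virtual image at the 250 mm near point); the function name and its default argument are illustrative, not taken from the notes.

```python
from fractions import Fraction as F

def magnifier(f, image_distance=F(-250)):
    """Object distance z and transverse magnification for a thin magnifier of
    focal length f (mm) that forms its image at image_distance (mm)."""
    # M = T(image_distance) R(f) T(z) has upper-right element
    # B = z + image_distance * (1 - z / f); setting B = 0 gives z.
    z = -image_distance / (1 - image_distance / f)
    MT = 1 - image_distance / f   # upper-left element A of the same product
    return z, MT

z, MT = magnifier(F(50))
print(z, float(z), MT)
```

Calling `magnifier` with other focal lengths shows how the magnification 1 + 250 mm / f grows as the focal length shrinks.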
If the object is located at the object-space focal point, then the image is at infinity:
⎡
⎤⎡
⎤⎡
⎤
1
0
1 ∞ mm
1 50 mm
⎦⎣
⎦⎣
⎦
MOO0 = ⎣
1
−
1
0
1
0
1
50 mm µ
∙
¶¸
⎡
⎤
1
1
6
−
z
(z
−
250)
−
z
z
−
6
mm
⎢
⎥
50
50
=⎣
⎦
1
1
1− z
−
50 mm
50
⎡
⎤
∞
0
⎦
=⎣
1
−
0
50 mm
3.4.2
Galilean Telescope of Thin Lenses
The Galilean telescope is an afocal system formed from an objective lens with positive power and an eyelens with negative power separated by the sum of the focal lengths. If the focal lengths of the objective and eyelens are f1 = +200 and f2 = −25 units, the separation is t = (200 − 25) = 175 units. The system matrix is:
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-25\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 175\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(+200\text{ mm})} & 1 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{8} & 175\text{ mm} \\ 0 & 8 \end{bmatrix}
\]
Note that the system power φ = 0 ⟹ feff = ∞, as it must be for an afocal system (both object- and image-space focal points at infinity). The ray from an object at ∞ with unit height generates the outgoing ray:
\[
\begin{bmatrix} \dfrac{1}{8} & 175\text{ mm} \\ 0 & 8 \end{bmatrix}
\begin{bmatrix} 1\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{8}\text{ mm} \\ 0 \end{bmatrix}
\]
so the outgoing ray is at height 1/8 mm and the angle is zero; both incoming and outgoing rays are parallel to the axis. Note that the diagonal elements of MVV′ are positive and the determinant is 1.
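For a “provisional” chief ray into the system with height 0 and angle 1, the outgoing ray is:
\[
\begin{bmatrix} \dfrac{1}{8} & 175\text{ mm} \\ 0 & 8 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \begin{bmatrix} 175\text{ mm} \\ 8 \end{bmatrix}
\]
So the outgoing ray angle is 8 times larger; this is the angular magnification of the telescope; the image is upright since the incoming and outgoing ray angles are both positive. The form of the matrix of an afocal system is:
\[
M_{VV'}\ (\text{afocal system}) = \begin{bmatrix} \dfrac{1}{m_\theta} & B \\ 0 & m_\theta \end{bmatrix}
\]
where the upper-right element B need not vanish (B = 175 mm in this example).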
3.4.3
Keplerian Telescope of Thin Lenses
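The Keplerian telescope is built from two positive lenses; here f1 = +200 and f2 = +25 units with separation t = (200 + 25) = 225 units. The system matrix is:
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(25\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 225\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(+200\text{ mm})} & 1 \end{bmatrix}
= \begin{bmatrix} -\dfrac{1}{8} & 225\text{ mm} \\ 0 & -8 \end{bmatrix}
\]
The diagonal elements are negative, the determinant is 1, and the system power φ = 0 ⟹ feff = ∞. The lower-right element is −8, which specifies that the angular magnification is 8 and the image is inverted.
The ray from an object at ∞ with unit height generates the outgoing ray:
\[
\begin{bmatrix} -\dfrac{1}{8} & 225\text{ mm} \\ 0 & -8 \end{bmatrix}
\begin{bmatrix} 1\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \begin{bmatrix} -\dfrac{1}{8}\text{ mm} \\ 0 \end{bmatrix}
\]
so the outgoing ray is at height −1/8 mm (the image is “inverted”) and the angle is zero.
The “provisional” chief ray into the system has height 0 and angle 1; the outgoing ray is:
\[
\begin{bmatrix} -\dfrac{1}{8} & 225\text{ mm} \\ 0 & -8 \end{bmatrix}
\begin{bmatrix} 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \begin{bmatrix} 225\text{ mm} \\ -8 \end{bmatrix}
\]
So the outgoing ray angle is 8 times larger than the incoming ray but negative (which implies that the image is inverted).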
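Both telescopes can be generated from the same two-lens product, which makes the sign difference between the Galilean and Keplerian forms easy to see. The sketch below assumes thin lenses in air separated by f1 + f2; the helper names are illustrative.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two 2x2 matrices stored as ((m11, m12), (m21, m22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def thin_lens(f):
    return ((F(1), F(0)), (-1 / F(f), F(1)))

def transfer(t):
    return ((F(1), F(t)), (F(0), F(1)))

def telescope(f1, f2):
    """Afocal two-lens telescope: lenses separated by t = f1 + f2."""
    return matmul(thin_lens(f2), matmul(transfer(f1 + f2), thin_lens(f1)))

galilean = telescope(200, -25)
keplerian = telescope(200, 25)

for name, M in (("Galilean", galilean), ("Keplerian", keplerian)):
    (A, B), (C, D) = M
    # C = 0 marks an afocal system; D is the angular magnification (= -f1/f2).
    print(name, M, "afocal:", C == 0, "angular magnification:", D)
```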
3.4.4
Thick Lenses
The matrix method is convenient for thick lenses. Consider a thick lens made of glass with n′ = 1.5, radii of curvature R1 = +50 mm and R2 = −100 mm, and thickness t′ (which we shall vary). It is useful to evaluate the focal length of the single “thin” lens with these radii and refractive index from the lensmaker’s equation:
\[
\frac{1}{f} = (n-1)\cdot\left(\frac{1}{R_1} - \frac{1}{R_2}\right)
\]
\[
f = \left((1.5 - 1.0)\cdot\left(\frac{1}{50\text{ mm}} - \frac{1}{-100\text{ mm}}\right)\right)^{-1} = +\frac{200}{3}\text{ mm} = 66\tfrac{2}{3}\text{ mm}
\]
The powers of the two surfaces are:
\[
\begin{aligned}
\phi_1 &= \frac{n' - n}{R_1} = \frac{1.5 - 1}{50\text{ mm}} = +\frac{0.5}{50\text{ mm}} = +\frac{1}{100\text{ mm}} \\
\phi_2 &= \frac{n - n'}{R_2} = \frac{1 - 1.5}{-100\text{ mm}} = \frac{-0.5}{-100\text{ mm}} = +\frac{1}{200\text{ mm}}
\end{aligned}
\]
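so if the thickness is zero, the focal length evaluates to:
\[
\begin{aligned}
\phi_{\mathrm{eff}} &= \phi_1 + \phi_2 - \phi_1\cdot\phi_2\cdot t \\
&= \left(+\frac{1}{100\text{ mm}}\right) + \left(+\frac{1}{200\text{ mm}}\right) - \left(+\frac{1}{100\text{ mm}}\right)\cdot\left(+\frac{1}{200\text{ mm}}\right)\cdot 0
= \frac{3}{200\text{ mm}}
\end{aligned}
\]
\[
f_{\mathrm{eff}} = \left(\frac{1}{f_1} + \frac{1}{f_2} - \frac{t}{f_1\cdot f_2}\right)^{-1} = +\frac{200}{3}\text{ mm}
\]
which agrees with the result obtained from the lensmaker’s equation.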
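The system matrix for the lens with thickness t′ (expressed in mm) may be evaluated with this parameter:
\[
M_{VV'} = R_2\,T_1\,R_1
= \begin{bmatrix} 1 & 0 \\ -\left(+\dfrac{1}{200\text{ mm}}\right) & 1 \end{bmatrix}
\begin{bmatrix} 1 & \dfrac{t'}{1.5}\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\left(+\dfrac{1}{100\text{ mm}}\right) & 1 \end{bmatrix}
\]
\[
= \begin{bmatrix}
1 - 0.0066667\cdot t' & 0.6666667\cdot t'\text{ mm} \\
\left(0.0033333\cdot t' - 1\right)\dfrac{1}{100\text{ mm}} - \dfrac{1}{200\text{ mm}} & 1 - 0.0033333\cdot t'
\end{bmatrix}
\]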
Note that the thickness t0 is present in each of the four terms in the matrix. Now we can derive
matrices for different values of the thickness: t0 = 0 mm, 1 mm, 2 mm, 5 mm, and 10 mm, where we
substitute into the table of properties to find the BFL, FFL, VH, and H0 V0 :
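t′ = 0 mm (thin lens)
\[
M_{VV'}(t' = 0\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667\cdot 0 & 0.6666667\cdot 0\text{ mm} \\
\left(0.0033333\cdot 0 - 1\right)\dfrac{1}{100\text{ mm}} - \dfrac{1}{200\text{ mm}} & 1 - 0.0033333\cdot 0
\end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\dfrac{3}{200\text{ mm}} & 1 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} = +\frac{200}{3}\text{ mm} = 66\tfrac{2}{3}\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{1}{\left(-\frac{3}{200\text{ mm}}\right)} = +\frac{200}{3}\text{ mm} = f_{\mathrm{eff}} \\
BFL = V'F' &= -\frac{A}{C} = -\frac{1}{\left(-\frac{3}{200\text{ mm}}\right)} = +\frac{200}{3}\text{ mm} = f_{\mathrm{eff}} \\
VH &= \frac{D-1}{C} = \frac{(1-1)}{\left(-\frac{3}{200\text{ mm}}\right)} = 0\text{ mm} \\
H'V' &= \frac{A-1}{C} = \frac{(1-1)}{\left(-\frac{3}{200\text{ mm}}\right)} = 0\text{ mm}
\end{aligned}
\]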
All quantities correspond to the values we would expect for the single thin lens: the front and back
focal lengths are identical to the effective focal length, which means that the principal points coincide
with the vertices — they are all located AT the lens.
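t′ = 1 mm
\[
M_{VV'}(t' = 1\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667\cdot 1 & 0.6666667\cdot 1\text{ mm} \\
\left(0.0033333\cdot 1 - 1\right)\dfrac{1}{100\text{ mm}} - \dfrac{1}{200\text{ mm}} & 1 - 0.0033333\cdot 1
\end{bmatrix}
= \begin{bmatrix} 0.99333 & 0.66667\text{ mm} \\ -\dfrac{1}{66.814\text{ mm}} & 0.99667 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} \cong 66.814\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{0.99667}{\left(-\frac{1}{66.814\text{ mm}}\right)} = 66.592\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = -\frac{0.99333}{\left(-\frac{1}{66.814\text{ mm}}\right)} = 66.368\text{ mm} \\
VH &= \frac{D-1}{C} = \frac{(0.99667 - 1)}{\left(-\frac{1}{66.814\text{ mm}}\right)} = 0.2225\text{ mm} \\
H'V' &= \frac{A-1}{C} = \frac{(0.99333 - 1)}{\left(-\frac{1}{66.814\text{ mm}}\right)} = 0.4456\text{ mm}
\end{aligned}
\]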
So the object- and image-space principal planes are within the lens and close to the surfaces. Note
that the front and back focal lengths are slightly different: the image-space principal point is “more
within the lens” since the second surface has less power than the front surface.
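t′ = 2 mm
\[
M_{VV'}(t' = 2\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667\cdot 2 & 0.6666667\cdot 2\text{ mm} \\
\left(0.0033333\cdot 2 - 1\right)\dfrac{1}{100\text{ mm}} - \dfrac{1}{200\text{ mm}} & 1 - 0.0033333\cdot 2
\end{bmatrix}
= \begin{bmatrix} 0.98667 & 1.3333\text{ mm} \\ -\dfrac{1.4933\times10^{-2}}{\text{mm}} & 0.99333 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-\frac{1.4933\times10^{-2}}{\text{mm}}\right)} \cong 66.966\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{0.99333}{\left(-\frac{1.4933\times10^{-2}}{\text{mm}}\right)} = 66.519\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = -\frac{0.98667}{\left(-\frac{1.4933\times10^{-2}}{\text{mm}}\right)} = 66.073\text{ mm} \\
VH &= \frac{D-1}{C} = \frac{(0.99333 - 1)}{\left(-\frac{1.4933\times10^{-2}}{\text{mm}}\right)} = 0.4467\text{ mm} \\
H'V' &= \frac{A-1}{C} = \frac{(0.98667 - 1)}{\left(-\frac{1.4933\times10^{-2}}{\text{mm}}\right)} = 0.8926\text{ mm}
\end{aligned}
\]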
Note that the same “behavior” exists for this lens: the image-space principal point is farther “inside”
the lens than the object-space principal point.
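t′ = 5 mm
\[
M_{VV'}(t' = 5\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667\cdot 5 & 0.6666667\cdot 5\text{ mm} \\
\left(0.0033333\cdot 5 - 1\right)\dfrac{1}{100\text{ mm}} - \dfrac{1}{200\text{ mm}} & 1 - 0.0033333\cdot 5
\end{bmatrix}
= \begin{bmatrix} 0.96667 & 3.3333\text{ mm} \\ -\dfrac{1.4833\times10^{-2}}{\text{mm}} & 0.98333 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-\frac{1.4833\times10^{-2}}{\text{mm}}\right)} \cong 67.417\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{0.98333}{\left(-\frac{1.4833\times10^{-2}}{\text{mm}}\right)} = 66.293\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = -\frac{0.96667}{\left(-\frac{1.4833\times10^{-2}}{\text{mm}}\right)} = 65.170\text{ mm} \\
VH &= \frac{D-1}{C} = \frac{(0.98333 - 1)}{\left(-\frac{1.4833\times10^{-2}}{\text{mm}}\right)} = 1.1238\text{ mm} \\
H'V' &= \frac{A-1}{C} = \frac{(0.96667 - 1)}{\left(-\frac{1.4833\times10^{-2}}{\text{mm}}\right)} = 2.247\text{ mm}
\end{aligned}
\]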
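t′ = 10 mm
\[
M_{VV'}(t' = 10\text{ mm}) =
\begin{bmatrix}
1 - 0.0066667\cdot 10 & 0.6666667\cdot 10\text{ mm} \\
\left(0.0033333\cdot 10 - 1\right)\dfrac{1}{100\text{ mm}} - \dfrac{1}{200\text{ mm}} & 1 - 0.0033333\cdot 10
\end{bmatrix}
= \begin{bmatrix} 0.93333 & 6.6667\text{ mm} \\ -\dfrac{1.4667\times10^{-2}}{\text{mm}} & 0.96667 \end{bmatrix}
\]
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} = -\frac{1}{\left(-\frac{1.4667\times10^{-2}}{\text{mm}}\right)} \cong 68.180\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{0.96667}{\left(-\frac{1.4667\times10^{-2}}{\text{mm}}\right)} = 65.908\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = -\frac{0.93333}{\left(-\frac{1.4667\times10^{-2}}{\text{mm}}\right)} = 63.635\text{ mm} \\
VH &= \frac{D-1}{C} = \frac{(0.96667 - 1)}{\left(-\frac{1.4667\times10^{-2}}{\text{mm}}\right)} = 2.2724\text{ mm} \\
H'V' &= \frac{A-1}{C} = \frac{(0.93333 - 1)}{\left(-\frac{1.4667\times10^{-2}}{\text{mm}}\right)} = 4.5456\text{ mm}
\end{aligned}
\]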
From these results, we see that the effective focal length gets LONGER as the lens gets THICKER
for the same radii of curvature and that the image-space principal point “penetrates” more inside
the lens as the lens thickness is increased.
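The thickness sweep can be reproduced with a short loop. The sketch below is an illustrative floating-point version, assuming a lens in air and using the general two-surface form of the matrix; small differences in the last digit relative to hand-rounded values are to be expected, and the function names are assumptions.

```python
def thick_lens_matrix(n, R1, R2, t):
    """Elements (A, B, C, D) of the vertex-to-vertex matrix of a thick lens in air."""
    phi1 = (n - 1.0) / R1        # power of the first surface
    phi2 = (1.0 - n) / R2        # power of the second surface
    tr = t / n                   # reduced thickness
    A = 1.0 - phi1 * tr
    B = tr
    C = -(phi1 + phi2 - phi1 * phi2 * tr)
    D = 1.0 - phi2 * tr
    return A, B, C, D

def cardinal_points(A, B, C, D):
    """Focal lengths and principal-point locations from the matrix elements."""
    return {
        "f_eff": -1.0 / C,
        "FFL": -D / C,
        "BFL": -A / C,
        "VH": (D - 1.0) / C,
        "H'V'": (A - 1.0) / C,
    }

results = {}
for t in (0.0, 1.0, 2.0, 5.0, 10.0):
    results[t] = cardinal_points(*thick_lens_matrix(1.5, 50.0, -100.0, t))
    row = "  ".join(f"{key}={value:8.4f}" for key, value in results[t].items())
    print(f"t'={t:5.1f} mm  {row}")
```

The printed table makes the trend described above visible at a glance: the effective focal length grows with thickness while both principal points move deeper into the glass.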
3.4.5
Microscope
A simple microscope is also composed of two lenses (assumed to be “thin” in this discussion, though
the optical components generally are composed of multiple elements). The distance t between the
image-space (rear) focal point of the first lens and the object-space (front) focal point of the ocular
(the “tube length”) is fixed, often at t = 160 mm. The first lens (the “objective”) has a (very) short
focal length and the object typically is placed just “outside” its object-space focal point so that
z1 ' f1 . The objective generates a real image between the objective and eyepiece (or “ocular”),
which is a lens with a short focal length used as a simple magnifier.
Assume f1 = 5 mm, f2 = 50 mm
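\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-50\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(5\text{ mm})} & 1 \end{bmatrix}
= \begin{bmatrix} -31 & 160\text{ mm} \\ -\dfrac{41}{50\text{ mm}} & \dfrac{21}{5} \end{bmatrix}
\]
\[
\det\begin{bmatrix} -31 & 160\text{ mm} \\ -\dfrac{41}{50\text{ mm}} & \dfrac{21}{5} \end{bmatrix} = 1
\]
\[
\begin{aligned}
f_{\mathrm{eff}} &= -\frac{1}{C} = +\frac{50}{41}\text{ mm} \cong +1.220\text{ mm} \\
FFL = FV &= -\frac{D}{C} = -\frac{\left(\frac{21}{5}\right)}{\left(-\frac{41}{50\text{ mm}}\right)} = +\frac{210}{41}\text{ mm} \cong +5.12\text{ mm} \\
BFL = V'F' &= -\frac{A}{C} = -\frac{-31}{\left(-\frac{41}{50\text{ mm}}\right)} = -\frac{1550}{41}\text{ mm} \cong -37.8\text{ mm} \\
VH &= \frac{D-1}{C} = \frac{\left(\frac{21}{5} - 1\right)}{\left(-\frac{41}{50\text{ mm}}\right)} = -\frac{160}{41}\text{ mm} \cong -3.902\text{ mm} \\
H'V' &= \frac{A-1}{C} = \frac{-31 - 1}{\left(-\frac{41}{50\text{ mm}}\right)} = \frac{1600}{41}\text{ mm} = 39.02\text{ mm}
\end{aligned}
\]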
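\[
M_{OV'} =
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(-50\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{(5\text{ mm})} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 3\text{ mm} \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} -31 & 67\text{ mm} \\ -\dfrac{41}{50\text{ mm}} & \dfrac{87}{50} \end{bmatrix}
\]
3.5
Image Location and Magnification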
Image Location and Magnification
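\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f},
\qquad
M_T = -\frac{z_2}{z_1} \cong -\frac{f}{z_1}\ \text{in usual case}
\]
\[
\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}
\Longrightarrow z_2 = \left(\frac{1}{f} - \frac{1}{z_1}\right)^{-1} = \frac{z_1 f}{z_1 - f}
\]
\[
M_T = -\frac{z_2}{z_1} = -\frac{f}{z_1 - f} \cong -\frac{f}{z_1} \propto f\ \text{ if } z_1 \gg f
\]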
In words, if the object distance z1 is large (compared to the focal length f ), then the transverse
magnification is (approximately) proportional to the focal length. Therefore, doubling the focal
length doubles the magnification if the object is distant (with the caveat that the magnification is
still negative and smaller than unity, −1 < MT < 0).
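A quick numerical check of this proportionality, using hypothetical values (an object 10 m away and focal lengths of 50 mm and 100 mm) chosen only for illustration:

```python
def image_distance(z1, f):
    """Thin-lens image distance z2 from 1/z1 + 1/z2 = 1/f."""
    return z1 * f / (z1 - f)

def transverse_magnification(z1, f):
    """M_T = -z2/z1 = -f / (z1 - f)."""
    return -f / (z1 - f)

z1 = 10_000.0                      # distant object, in mm (assumed value)
m_50 = transverse_magnification(z1, 50.0)
m_100 = transverse_magnification(z1, 100.0)

print(image_distance(z1, 50.0), m_50)
print(image_distance(z1, 100.0), m_100)
print(m_100 / m_50)                # close to 2 because z1 >> f
```

The ratio is slightly larger than 2 because the exact denominator is z1 − f rather than z1; the discrepancy shrinks as the object moves farther away.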
3.6
Marginal and Chief Rays for the System
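\[
L = \left[\begin{pmatrix} y \\ nu \end{pmatrix}\begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix}\right]
= \begin{bmatrix} y & \bar{y} \\ nu & n\bar{u} \end{bmatrix},
\qquad
\det[L] = y\cdot n\bar{u} - \bar{y}\cdot nu \equiv \aleph
\]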
The marginal ray goes through the center of the object and any image(s) (i.e., the point where the
marginal ray crosses the optical axis is either the object or an image of the object). It also “grazes”
the edge of the aperture stop, so if we know the location and the diameter of the aperture stop in
the system, we can scale the height of the marginal ray so that its height matches the semidiameter
of the aperture stop at that location.
The chief ray goes through the center of the stop (and of the entrance and exit pupils), so we set
the chief ray height at the location of the stop to be zero and its angle to be arbitrary (say unity),
then propagate that provisional ray “forward” towards the image-space vertex and “backwards”
towards the object-space vertex (note that when tracing “backwards” toward the first lens, the
matrices in the ray trace must be inverted). During the tracing, we find the element that most
constrains the chief ray, and then scale the height of the provisional chief ray to make sure that it
gets “through” the other elements. The angle of the chief ray emerging from the front vertex to the
object is the half-angle of the field of view; the angle of the chief ray emerging from the image-space
vertex is the half angle of the image field at the sensor.
3.6.1
Examples of Marginal and Chief Rays for Systems
In the lab, you constructed Keplerian and/or Galilean telescopes with an iris diaphragm at various
locations. We can use this as a model for demonstrating how to evaluate the marginal and chief
rays. To evaluate the location of the stop, we must know the diameters as well as the locations of
the lenses. We can cast a provisional marginal ray into the system from the object to determine
which element is the aperture stop. We then scale the provisional marginal ray so that its height
and the semidiameter of the stop “match.” We then propagate a provisional chief ray forward and
backward from the center of the stop and scale its angle so that it grazes the element that constrains
it. From the angle of the chief ray entering and exiting the system, we can determine the field of
view. We will use the Galilean telescope as the first example.
Example 1: Galilean telescope, object at ∞
Consider a telescope with the following parameters.
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = −40 mm, d2 = 5 mm
t = f1 + f2 = 160 mm
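\[
R_1 = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\text{ mm}} & 1 \end{bmatrix},
\qquad
T = \begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix},
\qquad
R_2 = \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\]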
The vertex-vertex matrix of this system is
\[
M_{VV'} =
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\text{ mm}} & 1 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\]
for which element C = 0, which is characteristic of an afocal system. For an object at infinity, the provisional marginal ray into the system has an angle of zero and a height equal to the semidiameter of the first element.
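\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} \dfrac{d_1}{2} \\ 0 \end{bmatrix}
= \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
\]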
We can propagate this ray through the first lens and translate it to the second lens:
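\[
T\,R_1\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\text{ mm} \\ -\dfrac{1}{10} \end{bmatrix}
\]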
In words, the height of the provisional marginal ray at the second lens is 4 mm. Note that the ray after the second lens has the form:
\[
M_{VV'}\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\text{ mm} \\ 0 \end{bmatrix}
\]
so that the height of the provisional marginal ray at the second lens is the same before and after refraction (no surprise there) and that the ray angle after the second lens is 0 (parallel to the optical axis, again no surprise). Note that the ray height at L2 is larger than the specified semidiameter of the second lens:
\[
y' = 4\text{ mm} > \frac{d_2}{2} = \frac{5\text{ mm}}{2} = 2.5\text{ mm} \Longrightarrow L_2\ \text{is aperture stop}
\]
This means two things: (1) that the second lens is the aperture stop, and (2) that we must scale the height and angle of the provisional marginal ray to ensure that it grazes the edge of the stop. The scaling factor is the ratio of the semidiameter of the stop to the height of the provisional marginal ray at L2:
\[
\frac{\left(\frac{d_2}{2}\right)}{y\ \text{at}\ L_2} = \frac{2.5\text{ mm}}{4\text{ mm}} = \frac{5}{8}
\]
We apply this scale factor to the marginal ray at all locations in the system. The marginal ray at the first lens from an object at infinite distance is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } L_1}
= \frac{5}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{5}{8}\cdot\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 12.5\text{ mm} \\ 0 \end{bmatrix}
\]
which means that the marginal ray strikes the first lens well inside of the semidiameter; the entering “tube” of rays does not fill the lens.
Now that we know that the second lens is the aperture stop, we can propagate a provisional chief ray from the center of the stop in both directions. One possible choice for the provisional chief ray is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
where again an angle of 1 radian is HUGE, but we will scale it based on the parameters of the rest of the system. Propagate this ray through the stop (towards image space) to obtain
\[
R_2\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{-40\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is the stop, because the ray passes through the center of the lens.
The provisional chief ray may be propagated from the stop “backwards” towards the first lens. The translation matrix is inverted because we are tracing the ray from right to left, against the direction of the light.
\[
T^{-1}\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \left(\begin{bmatrix} 1 & +160\text{ mm} \\ 0 & 1 \end{bmatrix}\right)^{-1}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -160\text{ mm} \\ 1 \end{bmatrix}
\]
The height of the provisional chief ray at the first element is negative, which means that it is BELOW the optical axis at a MUCH LARGER distance than the semidiameter d1/2 = 20 mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:
\[
\frac{\left(\frac{d_1}{2}\right)}{|y|} = \frac{20\text{ mm}}{160\text{ mm}} = \frac{1}{8}
\]
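So now go back to the original prescription for the provisional chief ray and scale it to obtain the “actual” chief ray:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } L_2}
= \frac{1}{8}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{1}{8}\cdot\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \dfrac{1}{8} \end{bmatrix}
\]
Translated backwards to the first lens, the chief ray is
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{at } L_1}
= T^{-1}\begin{bmatrix} 0\text{ mm} \\ \dfrac{1}{8} \end{bmatrix}
= \begin{bmatrix} -20\text{ mm} \\ \dfrac{1}{8} \end{bmatrix}
\]
We can now propagate this ray through L1. The chief ray emerging from the front vertex is:
\[
R_1^{-1}\,T^{-1}\begin{bmatrix} 0\text{ mm} \\ \dfrac{1}{8} \end{bmatrix}
= \left(\begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\text{ mm}} & 1 \end{bmatrix}\right)^{-1}
\left(\begin{bmatrix} 1 & +160\text{ mm} \\ 0 & 1 \end{bmatrix}\right)^{-1}
\begin{bmatrix} 0\text{ mm} \\ \dfrac{1}{8} \end{bmatrix}
= \begin{bmatrix} -20\text{ mm} \\ \dfrac{1}{40} \end{bmatrix}
\]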
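Now propagate this chief ray forwards through the system by multiplying by MVV′:
\[
\begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} -20\text{ mm} \\ \dfrac{1}{40} \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \dfrac{1}{8} \end{bmatrix}
\]
which has height of zero emerging from L2 (the aperture stop), as expected.
The field of view of the system is twice the angle at the front of L1:
\[
\text{FoV} = 2\cdot\frac{1}{40}\text{ radian} = \frac{1}{20}\text{ radian} = \frac{1}{20}\cdot\frac{180^\circ}{\pi} \cong 2.86^\circ
\]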
The exit pupil is (obviously) located at the aperture stop L2, while the entrance pupil is the image of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation of the chief ray emerging from the front vertex:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}\ (\text{emerging from front vertex})
= \begin{bmatrix} -20\text{ mm} \\ \dfrac{1}{40} \end{bmatrix}
\]
The height is −20 mm and the angle is 1/40 radian, so the distance to the location where the ray crosses the optical axis is:
\[
z_{V,NP} = -\frac{-20\text{ mm}}{\frac{1}{40}} = +800\text{ mm}
\]
The distance from the vertex to the entrance pupil is positive, so the pupil is behind the objective and is virtual. The transverse magnification of the entrance pupil is:
\[
M_T = -\frac{800\text{ mm}}{-160\text{ mm}} = +5
\]
so the diameter of the entrance pupil is magnified:
\[
d_{NP} = 5\cdot 5\text{ mm} = 25\text{ mm}
\]
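The procedure of Example 1 (find the stop with a provisional marginal ray, then scale a provisional chief ray traced backwards from the stop) can be sketched as follows. This is a minimal illustration for this two-lens prescription only; the chief-ray part assumes the stop turned out to be L2, and all function names are hypothetical.

```python
from fractions import Fraction as F

def refract(ray, f):
    """Thin lens of focal length f: height unchanged, angle reduced by y/f."""
    y, u = ray
    return (y, u - y / F(f))

def unrefract(ray, f):
    """Inverse of refract, used when tracing against the direction of the light."""
    y, u = ray
    return (y, u + y / F(f))

def translate(ray, t):
    """Transfer over distance t in air (negative t traces backwards)."""
    y, u = ray
    return (y + F(t) * u, u)

f1, d1, f2, d2, t = 200, 40, -40, 5, 160

# Provisional marginal ray from an object at infinity, entering at the rim of L1.
prov_at_L1 = (F(d1, 2), F(0))
prov_at_L2 = translate(refract(prov_at_L1, f1), t)
scale_m = min(F(1), F(d2, 2) / abs(prov_at_L2[0]))
stop = "L2" if scale_m < 1 else "L1"
marginal_at_L1 = (scale_m * prov_at_L1[0], scale_m * prov_at_L1[1])

# Provisional chief ray through the center of the stop (assumed here to be L2),
# traced backwards to L1 and scaled so that it just passes the rim of L1.
prov_chief_at_L1 = translate((F(0), F(1)), -t)
scale_c = F(d1, 2) / abs(prov_chief_at_L1[0])
chief_at_L1 = (scale_c * prov_chief_at_L1[0], scale_c * prov_chief_at_L1[1])
chief_in = unrefract(chief_at_L1, f1)

fov = 2 * chief_in[1]                               # full field of view (radians)
entrance_pupil_distance = -chief_in[0] / chief_in[1]
entrance_pupil_diameter = 2 * marginal_at_L1[0]     # object at infinity

print(stop, marginal_at_L1, chief_in)
print(fov, entrance_pupil_distance, entrance_pupil_diameter)
```

Changing `d2` to 10 in this sketch moves the stop to L1, in which case the chief-ray section would have to start from the center of L1 instead.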
Marginal ray (red) and chief ray (blue) from object at infinity traced through Galilean telescope with
aperture stop at second lens (eyepiece).
Example 2: Galilean telescope with aperture stop at FIRST lens, object at ∞
We already know that the height of the provisional marginal ray height at the second lens was
y = 4 mm, so we can select a diameter for L2 that exceeds this value, so that the aperture stop is
now the first lens:
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = −40 mm, d2 = 10 mm
t = f1 + f2 = 160 mm
The vertex-vertex matrix is the same as before:
\[
M_{VV'} = \begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\]
We know from the results just calculated that if d2 = 10 mm, then its semidiameter exceeds the height of the provisional marginal ray, so the aperture stop then becomes the first lens. The provisional marginal ray we calculated for the first lens then becomes the actual marginal ray; at the first lens, the marginal ray is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}\ (\text{at } L_1) = \begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
\]
and the marginal ray leaving the system after L2 is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}\ (\text{after } L_2)
= M_{VV'}\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 4\text{ mm} \\ 0 \end{bmatrix}
\]
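Since the aperture stop has moved to L1 from L2, we have to evaluate a different chief ray; it will go through the center of L1, so the provisional chief ray at L1 is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}\ (\text{at } L_1) = \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
After the first refraction, the provisional chief ray is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}}\ (\text{after } L_1)
= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
\]
which again should be no surprise: since the chief ray goes through the center of L1, the lens has no impact on the ray.
Now propagate the provisional chief ray to L2 by applying the translation matrix:
\[
T\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 160\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 160\text{ mm} \\ 1 \end{bmatrix}
\]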
so the ray height of the chief ray is again MUCH larger than the semidiameter of the lens. The scaling factor that must be applied to the provisional chief ray is the ratio of the semidiameter of L2 to the ray height:
\[
\frac{\left(\frac{d_2}{2}\right)}{y} = \frac{5\text{ mm}}{160\text{ mm}} = \frac{5}{160} = \frac{1}{32}
\]
Therefore the true chief ray at the first lens is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}\ (\text{at } L_1)
= \frac{1}{32}\cdot\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{1}{32}\cdot\begin{bmatrix} 0\text{ mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\text{ mm} \\ \dfrac{1}{32} \end{bmatrix}
\]
In words, the angle of the chief ray into the first lens (and therefore into the aperture stop) is 1/32 radians, so the full-angle field of view of the system is:
\[
\text{FoV} = 2\cdot u = \frac{1}{16}\text{ radian} = \frac{1}{16}\cdot\frac{180^\circ}{\pi} \cong 3.58^\circ
\]
which is larger than the field of view in the first case with the smaller diameter for L2.
Just for fun, propagate both the marginal and chief rays through the system at the same time:
\[
M_{VV'}\left[\begin{pmatrix} y \\ nu \end{pmatrix}\begin{pmatrix} \bar{y} \\ n\bar{u} \end{pmatrix}\right]
= \left[\begin{pmatrix} y' \\ n'u' \end{pmatrix}\begin{pmatrix} \bar{y}' \\ n'\bar{u}' \end{pmatrix}\right]
\]
\[
\begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} & 0\text{ mm} \\ 0 & \dfrac{1}{32} \end{bmatrix}
= \begin{bmatrix} 4\text{ mm} & 5\text{ mm} \\ 0 & \dfrac{5}{32} \end{bmatrix}
\]
So the ray height of the marginal ray after the second lens is 4 mm and the ray angle is 0 radians (propagates to the image at infinity), while the chief ray height after L2 is 5 mm and the angle is 5/32 radians. The full angle of the image field is 10/32 = 5/16 radians ≅ 17.9°.
stop at first lens.
The entrance pupil coincides with the aperture stop in this system, while the exit pupil is the image of the aperture stop seen through L2. The object distance to the stop is f1 + f2 = 160 mm, so the exit pupil distance is:
\[
z_{XP} = \frac{z_1\cdot f_2}{z_1 - f_2} = \frac{160\text{ mm}\cdot(-40\text{ mm})}{160\text{ mm} - (-40\text{ mm})} = -32\text{ mm}
\]
and the diameter of the exit pupil is:
\[
d_{XP} = M_T\cdot 40\text{ mm} = -\frac{-32\text{ mm}}{160\text{ mm}}\cdot 40\text{ mm} = +8\text{ mm}
\]
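The exit-pupil location and size are just the thin-lens image of the stop formed by L2, which can be confirmed in a few lines. The sketch below assumes the stop is the 40 mm diameter first lens, 160 mm in front of L2; the function name is illustrative.

```python
from fractions import Fraction as F

def thin_lens_image(z1, f):
    """Image distance z2 for object distance z1 and focal length f (thin lens)."""
    return z1 * f / (z1 - f)

z_stop = F(160)     # distance from the stop (L1) to L2, in mm
f2 = F(-40)         # focal length of L2, in mm
d_stop = F(40)      # diameter of the stop (L1), in mm

z_xp = thin_lens_image(z_stop, f2)   # exit-pupil distance from L2
m_t = -z_xp / z_stop                 # transverse magnification of the stop
d_xp = m_t * d_stop                  # exit-pupil diameter

print(z_xp, m_t, d_xp)
```

The 8 mm diameter agrees with twice the 4 mm height of the marginal ray leaving L2, as it should for an object at infinity.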
Example 3: Galilean telescope with aperture stop between lenses, object at ∞
Now consider the result if we place an iris diaphragm with diameter d = 8 mm midway between L1 and L2. The prescription for the system is:
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = −40 mm, d2 = 10 mm
t = f1 + f2 = 160 mm
S : VS = 80 mm, SV′ = 80 mm, dStop = 8 mm
The matrix for the imaging elements is unchanged:
\[
M_{VV'} = \begin{bmatrix} \dfrac{1}{5} & 160\text{ mm} \\ 0 & 5 \end{bmatrix}
\]
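but we need to confirm that the new iris is the aperture stop. Cast in a provisional marginal ray from an object at infinity:
\[
R_1\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\dfrac{1}{+200\text{ mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 20\text{ mm} \\ -\dfrac{1}{10} \end{bmatrix}
\]
Now propagate this ray to the iris, located at a distance of 80 mm after L1:
\[
T\begin{bmatrix} 20\text{ mm} \\ -\dfrac{1}{10} \end{bmatrix}
= \begin{bmatrix} 1 & 80\text{ mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 20\text{ mm} \\ -\dfrac{1}{10} \end{bmatrix}
= \begin{bmatrix} 12\text{ mm} \\ -\dfrac{1}{10} \end{bmatrix}
\Longrightarrow y = 12\text{ mm} > \frac{d_{\text{Stop}}}{2} = \frac{8\text{ mm}}{2} = 4\text{ mm at iris}
\]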
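So again we need to scale the provisional marginal ray by the ratio:
\[
\frac{\left(\frac{d_{\text{Stop}}}{2}\right)}{y} = \frac{4\text{ mm}}{12\text{ mm}} = \frac{1}{3}
\]
So the marginal ray at the first lens is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}
= \frac{1}{3}\begin{bmatrix} 20\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} \dfrac{20}{3}\text{ mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 6\tfrac{2}{3}\text{ mm} \\ 0 \end{bmatrix}
\]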
Now propagate this ray through the first surface to the iris:
⎡
⎤⎡
⎤ ⎡
⎤
⎤⎡
20
1 80 mm
1
0
mm
4
mm
⎣
⎦⎣
⎦=⎣
⎦
⎦⎣ 3
1
0
1
− +2001 mm 1
0
− 30
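We can now propagate this from the iris to and through the second lens:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 4\ \text{mm} \\ -\frac{1}{30} \end{bmatrix}
= \begin{bmatrix} \frac{4}{3}\ \text{mm} \\ 0 \end{bmatrix}
\]
So the marginal ray exiting the system is at a height of 4/3 mm and an angle of 0 radians (parallel to the axis, as expected for a telescope).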
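The search for the aperture stop can be sketched as a loop over the elements: trace a provisional marginal ray, compare its height at each element with that element's semidiameter, and pick the element with the smallest ratio. The element list and the function `trace` below are illustrative names, not notation from the notes.

```python
from fractions import Fraction as F

# (name, gap from previous element in mm, focal length in mm or None, semidiameter in mm)
elements = [
    ("L1",   0,  200,  F(20)),
    ("iris", 80, None, F(4)),
    ("L2",   80, -40,  F(5)),
]

def trace(y, nu):
    """Trace the ray (y, nu); return the height at each element and the exit ray."""
    heights = {}
    for name, gap, f, _ in elements:
        y = y + gap * nu              # translation to this element
        heights[name] = y
        if f is not None:
            nu = nu - y / F(f)        # thin-lens refraction
    return heights, (y, nu)

# Provisional marginal ray from an object at infinity, at the edge of L1.
heights, _ = trace(F(20), F(0))
ratios = {name: semi / abs(heights[name]) for name, _, _, semi in elements}
stop = min(ratios, key=ratios.get)    # element that limits the ray bundle
scale = ratios[stop]

# True marginal ray: scale the provisional input height and retrace.
true_heights, exit_ray = trace(F(20) * scale, F(0))
print(stop, scale, true_heights, exit_ray)
```

With the prescription of Example 3 this selects the iris with a scale factor of 1/3, and the retraced ray has heights 20/3 mm, 4 mm, and 4/3 mm at L1, the iris, and L2.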
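Now propagate the provisional chief ray forward (toward L1) from the iris. The provisional chief ray at the stop is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]
and the translation from the iris back to L1 is:
\[
\left( \begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & -80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -80\ \text{mm} \\ 1 \end{bmatrix}
\]
If we propagate the provisional chief ray from the iris towards L2, we obtain:
\[
\begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} +80\ \text{mm} \\ 1 \end{bmatrix}
\]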
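Note that both ray heights are too large, but the height of the provisional chief ray at L2 exceeds the semidiameter of L2 by a much larger factor than its height at L1 exceeds the semidiameter of L1; the ratios are:
\[
\frac{\left(\frac{d_1}{2}\right)}{80\ \text{mm}} = \frac{20\ \text{mm}}{80\ \text{mm}} = \frac{1}{4}
\]
\[
\frac{\left(\frac{d_2}{2}\right)}{80\ \text{mm}} = \frac{5\ \text{mm}}{80\ \text{mm}} = \frac{1}{16}
\]
So the second lens constrains the chief ray. Apply the smaller scaling factor to the provisional chief ray to find the true chief ray at the iris:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at stop}}
= \frac{1}{16} \cdot \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{16} \end{bmatrix}
\]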
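Propagate it “forward” towards and through L1 to find the prescription for the chief ray entering the system:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{into } L_1}
= \left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \right)^{-1}
\left( \begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ \frac{1}{16} \end{bmatrix}
= \begin{bmatrix} -5\ \text{mm} \\ \frac{3}{80} \end{bmatrix}
\]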
The field of view of the system is twice the chief ray angle into the system:
\[
\text{FoV} = 2 \cdot \frac{3}{80}\ \text{radians} = \frac{3}{40}\ \text{radians} = \frac{3}{40} \cdot \frac{180^\circ}{\pi} \cong 4.30^\circ
\]
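Propagate the chief ray towards and through L2 to find the chief ray exiting the system:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{out of } L_2}
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{-40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & +80\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ \frac{1}{16} \end{bmatrix}
= \begin{bmatrix} +5\ \text{mm} \\ \frac{3}{16} \end{bmatrix}
\]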
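The chief-ray steps of this example (scale the provisional ray launched from the center of the iris, then carry it backward through L1 and forward through L2) can be sketched as follows. The constant names are illustrative; the arithmetic follows the thin-lens relations used above, with the inverse refraction written as nu = nu' + y/f.

```python
import math
from fractions import Fraction as F

F1, F2 = F(200), F(-40)          # focal lengths of L1 and L2, mm
GAP = F(80)                      # iris-to-lens distance on either side, mm
SEMI_L1, SEMI_L2 = F(20), F(5)   # lens semidiameters, mm

# Provisional chief ray at the center of the iris: height 0, unit angle.
y_L1 = F(0) - GAP * 1            # height at L1 (translation run backward)
y_L2 = F(0) + GAP * 1            # height at L2
scale = min(SEMI_L1 / abs(y_L1), SEMI_L2 / abs(y_L2))
nu_stop = scale                  # true chief-ray angle at the stop

# Back through L1: invert the translation, then invert the refraction.
y_in = -GAP * nu_stop
nu_in = nu_stop + y_in / F1      # inverse of nu' = nu - y/f

# Forward through L2: translate, then refract.
y_out = GAP * nu_stop
nu_out = nu_stop - y_out / F2

fov_deg = math.degrees(float(2 * nu_in))
print(scale, (y_in, nu_in), (y_out, nu_out), round(fov_deg, 2))
```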
iris diaphragm between lenses.
Example 4: Keplerian telescope, object at ∞
Substitute a positive lens with the diameter of 5 mm for L2 , which also means that we have to change
the distance between the lenses:
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = +40 mm, d2 = 5 mm
t = f1 + f2 = 240 mm
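The vertex-vertex (system) matrix is:
\[
M_{VV'} = \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 240\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\]
\[
M_{VV'} = \begin{bmatrix} -\frac{1}{5} & +240\ \text{mm} \\ 0 & -5 \end{bmatrix}
\]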
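The provisional marginal ray into the system from an object at infinity has the same ray height as the semidiameter of L1:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
\]
The outgoing provisional marginal ray from the system is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}}
= M_{VV'} \begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} -\frac{1}{5} & 240\ \text{mm} \\ 0 & -5 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} -4\ \text{mm} \\ 0 \end{bmatrix}
\]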
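Since the magnitude of the ray height of the provisional ray is larger than the semidiameter of L2, L2 is the aperture stop:
\[
|y'| > \frac{d_2}{2} \implies L_2 \text{ is aperture stop}
\]
so we must scale the provisional marginal ray by a factor:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}
= \left( \frac{\left(\frac{d_2}{2}\right)}{|y'|} \right) \cdot \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{\frac{5}{2}\ \text{mm}}{4\ \text{mm}} \cdot \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
= \frac{5}{8} \cdot \begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}}
\]
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } L_1}
= \begin{bmatrix} \frac{5}{8} \cdot 20\ \text{mm} \\ 0 \end{bmatrix}
= \begin{bmatrix} 12.5\ \text{mm} \\ 0 \end{bmatrix}
\]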
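Now to the chief ray; the provisional chief ray emerging from the center of the aperture stop has zero height and an angle of unit magnitude:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix}
\]
The ray is propagated to the first lens:
\[
T^{-1} \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix}
= \left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix}
= \begin{bmatrix} +240\ \text{mm} \\ -1 \end{bmatrix}
\]
so the height of the provisional chief ray at the first element is |y| = 240 mm, which is MUCH larger than the semidiameter d1/2 = 20 mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:
\[
\frac{20\ \text{mm}}{240\ \text{mm}} = \frac{1}{12}
\]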
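So now go back to the original prescription for the provisional chief ray:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ -1 \end{bmatrix}
\implies
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} 0\ \text{mm} \\ -\frac{1}{12} \end{bmatrix}
\]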
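We can now propagate it from the rear vertex to and through the front vertex of the system. The chief ray emerging from the front vertex is:
\[
\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \right)^{-1}
\left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ -\frac{1}{12} \end{bmatrix}
= \begin{bmatrix} +20\ \text{mm} \\ +\frac{1}{60} \end{bmatrix}
\]
In words, the chief ray height at the front surface is y = 20 mm and the chief ray angle is nu = +1/60 radian (the change of sign relative to the angle at the stop just means that the ray angle into the system is opposite in sign to the angle emerging from it). The field of view of the system is twice the angle: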
\[
\text{FoV} = 2 \cdot \frac{1}{60}\ \text{radian} = \frac{1}{30}\ \text{radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ
\]
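Example 4 leans on inverse matrices to carry the chief ray from the stop back to the front vertex. Because every refraction and translation matrix has unit determinant, the inverse is obtained by swapping the diagonal terms and negating the off-diagonal terms. The following sketch (helper names `mul`, `inv`, and `apply` are illustrative) reproduces the numbers of the example.

```python
import math
from fractions import Fraction as F

def mul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(m):
    """Inverse of a 2x2 matrix with unit determinant."""
    (a, b), (c, d) = m
    return [[d, -b], [-c, a]]

def apply(m, ray):
    """Apply a 2x2 matrix to the ray vector (y, nu)."""
    y, nu = ray
    return (m[0][0] * y + m[0][1] * nu, m[1][0] * y + m[1][1] * nu)

R1 = [[F(1), F(0)], [F(-1, 200), F(1)]]   # objective, f1 = +200 mm
T = [[F(1), F(240)], [F(0), F(1)]]        # separation t = 240 mm
R2 = [[F(1), F(0)], [F(-1, 40), F(1)]]    # eyepiece, f2 = +40 mm
M = mul(R2, mul(T, R1))                   # rightmost matrix acts first

# Marginal ray: the provisional ray is clipped by L2 (semidiameter 2.5 mm).
y_out, nu_out = apply(M, (F(20), F(0)))
scale = F(5, 2) / abs(y_out)
marginal_in = (F(20) * scale, F(0))

# Chief ray: start at the center of the stop (L2) and work back to the front.
chief_stop = (F(0), F(-1, 12))
chief_in = apply(inv(R1), apply(inv(T), apply(inv(R2), chief_stop)))
fov_deg = math.degrees(float(2 * abs(chief_in[1])))
print(M, marginal_in, chief_in, round(fov_deg, 2))
```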
Marginal ray (red) and chief ray (blue) from object at infinity traced through Keplerian telescope
with aperture stop at second lens.
Example 5: Keplerian telescope, stop at eyepiece, nearby object (OV = 500 mm)
Consider a telescope with the following parameters.
L1 : f1 = +200 mm, d1 = 40 mm
L2 : f2 = +40 mm, d2 = 5 mm
t = f1 + f2 = 240 mm
z1 = OV = 500 mm
The provisional marginal ray goes from the center of the object to the edge of the first lens, through the system, and to the center of the image. The first provisional ray is:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} \text{(at object)} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]
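It is useful to locate the image by propagating this provisional ray through the system:
\[
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 240\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 140\ \text{mm} \\ -5 \end{bmatrix}
\]
So the image location relative to the rear vertex is:
\[
V'O' = -\frac{y}{nu} = -\frac{140\ \text{mm}}{-5\ \text{radians}} = +28\ \text{mm}
\]
so the image is real.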
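As a cross-check that does not use matrices, the same image location follows from applying the thin-lens imaging equation to each lens in turn. This is a sketch under the thin-lens assumption; the function name `thin_lens_image` is illustrative.

```python
from fractions import Fraction as F

def thin_lens_image(z_mm, f_mm):
    """Image distance z' from 1/z + 1/z' = 1/f (z > 0 for a real object)."""
    return 1 / (1 / F(f_mm) - 1 / F(z_mm))

z1_img = thin_lens_image(500, 200)   # intermediate image behind L1
z2_obj = 240 - z1_img                # object distance for L2 (negative: virtual object)
z2_img = thin_lens_image(z2_obj, 40)

# Matrix result quoted above: y = 140 mm, nu = -5 at the rear vertex.
from_ray = -F(140) / F(-5)
print(z1_img, z2_obj, z2_img, from_ray)
```

The intermediate image falls behind L2, so L2 sees a virtual object, yet both routes give the same final image distance.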
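Now find the height of the provisional marginal ray at L1:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{provisional}} \text{(at } L_1)
= \begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 500\ \text{mm} \\ 1 \end{bmatrix}
\]
where the ray height is MUCH too large and must be scaled to “fit” into the lens. The scale factor is:
\[
\frac{\left(\frac{d_1}{2}\right)}{y\ \text{(at lens)}} = \frac{20\ \text{mm}}{500\ \text{mm}} = \frac{1}{25}
\]
So the second iteration of the provisional marginal ray at the front of the first lens is:
\[
\frac{1}{25} \cdot \begin{bmatrix} 500\ \text{mm} \\ 1 \end{bmatrix} = \begin{bmatrix} 20\ \text{mm} \\ \frac{1}{25} \end{bmatrix}
\]
which has a much smaller incident angle.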
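Now propagate this ray through the first lens to the second lens:
\[
T R_1 \begin{bmatrix} 20\ \text{mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} 1 & 240\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 20\ \text{mm} \\ \frac{1}{25} \end{bmatrix}
= \begin{bmatrix} \frac{28}{5}\ \text{mm} \\ -\frac{3}{50} \end{bmatrix}
= \begin{bmatrix} 5\tfrac{3}{5}\ \text{mm} \\ -\frac{3}{50} \end{bmatrix}
\]
so the ray height is still too large; it is blocked by L2 (which therefore is the aperture stop). Scale this ray to fit into the second lens by applying the factor:
\[
\frac{\left(\frac{d_2}{2}\right)}{y\ \text{(at } L_2)} = \frac{2.5\ \text{mm}}{\frac{28}{5}\ \text{mm}} = \frac{12.5}{28} = \frac{25}{56}
\]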
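So the third iteration produces the actual marginal ray from an object at a distance of 500 mm from L1:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at object}}
= \frac{25}{56} \cdot \frac{1}{25} \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix}
\cong \begin{bmatrix} 0\ \text{mm} \\ 0.017857 \end{bmatrix}
\]
The prescription for the marginal ray at L1 is:
\[
\begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} \frac{125}{14}\ \text{mm} \\ \frac{1}{56} \end{bmatrix}
\cong \begin{bmatrix} 8.929\ \text{mm} \\ 0.017857 \end{bmatrix}
\]
where the ray height is much smaller than the semidiameter of L1, so the first lens is larger than it needs to be for this object distance.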
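We can propagate this through the system to find the actual prescription for the exiting marginal ray:
\[
\begin{bmatrix} y \\ nu \end{bmatrix}_{\text{at } V'}
= M_{VV'} \cdot \begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} -\frac{1}{5} & 240\ \text{mm} \\ 0 & -5 \end{bmatrix}
\begin{bmatrix} 1 & 500\ \text{mm} \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ \frac{1}{56} \end{bmatrix}
= \begin{bmatrix} \frac{5}{2}\ \text{mm} \\ -\frac{5}{56} \end{bmatrix}
\]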
Just to check, find the distance to the image to make sure it matches the result for the provisional marginal ray:
\[
V'O' = -\frac{y}{nu} = -\frac{\frac{5}{2}\ \text{mm}}{-\frac{5}{56}} = +28\ \text{mm}
\]
which agrees with what we found earlier.
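The two successive scalings (by 1/25 and then by 25/56) multiply to 1/56, which is simply the smallest ratio of semidiameter to provisional ray height over all elements. Because rays from an on-axis point scale linearly, the iteration can be collapsed into a single pass, sketched here with illustrative names.

```python
from fractions import Fraction as F

# (name, gap from the previous plane in mm, focal length in mm, semidiameter in mm)
elements = [("L1", 500, 200, F(20)), ("L2", 240, 40, F(5, 2))]

def trace(y, nu):
    """Trace (y, nu) from the object plane; return lens heights and the exit ray."""
    heights = {}
    for name, gap, f, _ in elements:
        y = y + gap * nu          # translation to the lens
        heights[name] = y
        nu = nu - y / F(f)        # thin-lens refraction
    return heights, (y, nu)

heights, _ = trace(F(0), F(1))    # provisional ray with unit angle
ratios = {name: semi / abs(heights[name]) for name, _, _, semi in elements}
stop = min(ratios, key=ratios.get)        # element with the smallest ratio
nu_marginal = ratios[stop]                # marginal-ray angle at the object

marginal_heights, (y_out, nu_out) = trace(F(0), nu_marginal)
image_distance = -y_out / nu_out
print(stop, nu_marginal, marginal_heights, (y_out, nu_out), image_distance)
```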
Now that we know that L2 is the aperture stop for the specified object location, we can propagate a provisional chief ray from the center of the stop in both directions. (We will find that the chief ray is unaffected by the location of the object.) The provisional chief ray is:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ +1 \end{bmatrix}
\]
Propagate through the system towards image space to obtain
\[
R_2 \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ -\frac{1}{+40\ \text{mm}} & 1 \end{bmatrix}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\]
so the height and angle of the provisional chief ray are unchanged by the action of the lens L2 that is the stop, because the ray passes through the center of the lens.
The provisional chief ray may be propagated from the stop “forwards” towards the first lens. The inverse translation matrix yields the ray height and angle at the first lens:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} \text{(at } L_1)
= T^{-1} \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
= \begin{bmatrix} -240\ \text{mm} \\ 1 \end{bmatrix}
\]
Note that the height of the provisional chief ray at L1 is y = −240 mm, which means that it is BELOW the optical axis at a distance MUCH larger than the semidiameter d1/2 = 20 mm of L1. To ensure that the chief ray “gets through” the first lens, we have to scale its angle by the factor:
\[
\frac{\left(\frac{d_1}{2}\right)}{|y|} = \frac{20\ \text{mm}}{240\ \text{mm}} = \frac{1}{12}
\]
So now go back to the original prescription for the provisional chief ray and scale it to obtain the “actual” chief ray:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}} = \begin{bmatrix} 0\ \text{mm} \\ 1 \end{bmatrix}
\implies
\begin{bmatrix} y' \\ n'u' \end{bmatrix}
= \frac{1}{12} \begin{bmatrix} y' \\ n'u' \end{bmatrix}_{\text{provisional}}
= \begin{bmatrix} 0\ \text{mm} \\ \frac{1}{12} \end{bmatrix}
\]
Note that, apart from the arbitrary choice of sign, this is the same chief ray as for the case where the object is at infinity. In words, the chief ray is determined by the stop and the diameters of the other elements, not by the location of the object.
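We can now propagate the scaled chief ray from the rear vertex to and through the front vertex of the system. The chief ray emerging from the front vertex is:
\[
\left( \begin{bmatrix} 1 & 0 \\ -\frac{1}{+200\ \text{mm}} & 1 \end{bmatrix} \right)^{-1}
\left( \begin{bmatrix} 1 & +240\ \text{mm} \\ 0 & 1 \end{bmatrix} \right)^{-1}
\begin{bmatrix} 0\ \text{mm} \\ \frac{1}{12} \end{bmatrix}
= \begin{bmatrix} -20\ \text{mm} \\ -\frac{1}{60} \end{bmatrix}
\]
which has the correct ray height (its magnitude |y| = 20 mm is the semidiameter of L1) and angle nu = −1/60 radian.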
The field of view of the system is twice the magnitude of the angle:
\[
\text{FoV} = 2 \cdot \frac{1}{60}\ \text{radian} = \frac{1}{30}\ \text{radian} = \frac{1}{30} \cdot \frac{180^\circ}{\pi} \cong 1.91^\circ
\]
The exit pupil is (obviously) located at the aperture stop L2, while the entrance pupil is the image of the stop in object space, so we can evaluate the location of the entrance pupil from the calculation of the chief ray emerging from the front vertex:
\[
\begin{bmatrix} y' \\ n'u' \end{bmatrix} \text{(emerging from front vertex)} = \begin{bmatrix} -20\ \text{mm} \\ -\frac{1}{60} \end{bmatrix}
\]
The height has magnitude 20 mm and the angle has magnitude 1/60 radian, so the distance to the location where the ray crosses the optical axis is:
\[
z_{NP} = \frac{20\ \text{mm}}{\frac{1}{60}} = 1200\ \text{mm}
\]
in front of the objective; the entrance pupil is real and the magnitude of its magnification is:
\[
|M_T| = \frac{1200\ \text{mm}}{240\ \text{mm}} = 5
\]
so the diameter of the entrance pupil is:
\[
d_{NP} = 5 \cdot d_{Stop} = 5 \cdot 5\ \text{mm} = 25\ \text{mm}
\]
Marginal ray (red) and chief ray (blue) from object at a distance of 500 mm from the first lens
traced through Keplerian telescope with aperture stop at second lens.
Chapter 4
Depth of Field and Depth of Focus
From experience with snapshots or movies, we all know that the optical images are not “in focus”
for objects at all distances from the lens; objects at distances other than that focused appear blurry.
This is not necessarily bad — it is used as a creative tool by photographers and cinematographers
to concentrate the attention of the viewer on particular objects of interest. However, in many (if
not all) scientific applications, this limitation to the region of “good” imaging is detrimental; we’d
like to see the entire 3-D object “in sharp focus.” For this reason, it is essential to understand the
factors that affect the depth of the region of “sharp focus,” which is the so-called “depth of field”
on the object as “seen” through the imaging system.
The concept of depth of field and focus and the dependence on f/# is illustrated in the figure
for a specified linear dimension of “acceptable sharpness.” The extent of the cone of rays between
the two locations truncated by this sharpness criterion is the “depth of focus.” Clearly this range is
larger for a smaller cone angle (larger f/#). This would lead us to the conclusion that the depth of
focus (and also its object-space equivalent, the depth of field) is proportional to the f/#:
\[
\Delta z \propto f/\#
\]
A more accurate criterion requires the principles of wave optics: diffraction induces a “blur spot” whose linear dimension also increases with the focal ratio, and this spot sets the dimension of “acceptable” blur. A hybrid combination of the principles of ray and wave optics leads to a criterion that the depths of field and of focus actually vary with the square of the f/#:
\[
\Delta z \propto (f/\#)^2
\]
This hybrid criterion is discussed after illustrating the concept of depths of field and focus using
examples from film and television.
The depth of focus for a known linear dimension of “acceptable sharpness” depends on angle of
the cone of rays, which is determined by the focal ratio (f/# ) of the system. If the cone of rays is
large (small f/#), then the extent of the cone in front of and behind the point of best focus is small;
if the angle of the ray cone is small (large f/#), then a wider range of depths appear “in focus.”
4.0.2
Examples of Depth of Field from Video and Film
Extensive discussion in Wikipedia at http://en.wikipedia.org/wiki/Depth_of_field
1. The Colbert Report, video image with “normal lens” shows the difference in apparent sharpness with depth in the scene. This naturally draws attention to the object that is in focus and
often serves as a cue to the audience about which is the object of interest. There are three
areas of interest at different distances from the lens, which is focused on the nearest plane
(Stephen Colbert); the more distant plane where Jon Stewart sits is noticeably blurry, but the
bookshelf in the distant plane is very blurry.
Note the difference in sharpness with depth; Stephen Colbert in the foreground is in sharp focus,
Jon Stewart is clearly less sharp, and the items in the background are quite blurry.
2. Sherlock (© 2011, Masterpiece Mystery from the BBC), using limited depth of field to draw attention to the point of interest
This example shows how the director draws the attention of the audience to the desired point of
interest. The two frames are from A Scandal in Belgravia, the first episode in the second season
of Sherlock broadcast by the BBC and PBS. The two frames are taken from the same camera
position and separated in time by approximately two seconds. In the first frame, “Sherlock”
(Benedict Cumberbatch) is speaking about the camera phone of “Irene Adler” (Lara Pulver).
After he finishes speaking, the camera focus shifts rapidly to Adler in the background for her
reply. Note that her form is barely distinguishable in the first frame, which focuses the viewer’s
attention upon Sherlock in the foreground.
Use of limited depth of field to draw the attention of the audience to the subject of interest. The
camera shifts focus rapidly from the foreground character (at top) to the background character (at
bottom).
3. Citizen Kane by Orson Welles, small aperture (large f/#) =⇒ large depth of field
Both foreground and background are in focus — note cheek of “Mr. Bernstein” (Everett Sloane) in
near foreground on right and venetian blinds in the windows at the back. “Walter Thatcher”
(George Coulouris) on left and “Charles Foster Kane” (Orson Welles) in center are in focus. The
distance to the windows appears to be small because of the sharp focus.
Different frame of same scene from “Citizen Kane” shot with same focus setting. George Coulouris
(as “Walter Thatcher”) and Everett Sloane (as “Mr. Bernstein”) remain in focus in the
foreground. Orson Welles (as “Charles Foster Kane”) has walked to the windows, which are now
clearly many feet from the foreground characters. “Kane’s” stature appears to have been
diminished.
The film “Citizen Kane” (© 1941 RKO Pictures, Inc.) is famous for its creative cinematography by Gregg Toland and the director/star Orson Welles, including original camera angles
(especially upward shots from the floor or even from beneath the floor plane), movements,
transitions, and the use of “deep focus.” Consider the two frames from the film of a group of
three characters: the standing Orson Welles in the center (at age 26 as the elderly “Charles
Foster Kane,” a testament to the skill of makeup artist Maurice Seiderman), George Coulouris
on the left (as “Walter Parks Thatcher”, who had been Kane’s guardian), and Everett Sloane
on the right (as Kane’s assistant “Mr. Bernstein”). In the first frame, the three characters
are grouped together and the entire scene appears to be in focus, from the skin on Bernstein’s
face on the right to the venetian window blinds in the back. From the sharp focus of the background windows and expectations about depth of field based on past experience, viewers likely will surmise that the windows must be physically close to the characters and therefore that
Kane is much taller than the background window sill. Between the first and second images,
the standing Kane has taken 18 steps to walk to the windows (perhaps 35-50 feet from the
foreground characters), while remaining in focus the entire time. His height is now shown to
be approximately the same as the height of the window sill. The apparent “shrinking” of his
size during the walk may be interpreted as an artistic metaphor for the diminishing stature of
Kane due to the partial failure of his media empire during the Depression. He subsequently
walks back to the foreground to sign the agreement held by Mr. Thatcher that sells much of
his publishing/broadcasting empire back to Thatcher’s bank. The very large depth of field can
only be obtained by a small aperture stop, which reduces the light reaching the sensor. Clearly
the emulsion must have good sensitivity (it must have been a “fast film”) and the lighting must
be sufficiently strong to record “useful” images. The sequence is available on YouTube at
http://www.youtube.com/watch?v=WTmVlDh2V2g. Interested readers might want to view
the documentary about the movie (http://www.youtube.com/watch?v=eCkYlCBFV6w). Another scene in the movie that is interesting from the perspective of optics is the so-called “mirror
scene,” which is at the end of the 1-minute clip at http://www.youtube.com/watch?v=8fIP7g9en10
Still from the “mirror scene” in “Citizen Kane.” Again, note the depth of field.
4. Spellbound, by Alfred Hitchcock (© Selznick International Pictures, Vanguard Films 1945)
The climactic scene in this classic movie is a confrontation between “Dr. Murchison” (Leo G.
Carroll) and “Dr. Constance Petersen” (Ingrid Bergman), where Petersen reveals she has evidence that Murchison murdered Dr. Anthony Edwardes, whose “substitute imposter” is played
by Gregory Peck. Frames from the scene are shown in the figure. The frames from the viewpoint of Dr. Murchison show the view of his hand, the gun, and Ingrid Bergman, with all apparently “in focus.” To avoid problems with depth of field, the hand and gun are actually models
that are larger than life size that were positioned closer to Bergman than to the camera. The
website for Turner Classic Movies states that the scene took a week to set up and 19 takes to
get the final result (http://www.tcm.com/this-month/article/18621%7C0/Spellbound.html).
YouTube clip available from http://www.youtube.com/watch?v=8rDMotFmCJc.
Scenes from “Spellbound” (© Selznick International Pictures 1945), showing (a) Leo G. Carroll holding a revolver; (b) Ingrid Bergman walking towards the door as Carroll’s character aims the revolver; (c) and (d) after Bergman’s exit, the hand and gun turn towards the camera and fire.
An additional note of interest in this black-and-white film is that two color frames as the gun
fires were spliced into each print by hand.
One of the two color frames of the gunshot spliced into each print of the film “Spellbound.”
5. Somewhere in Time, split-diopter lens to focus on two distances simultaneously, giving the
appearance of expanded depth of field
Split-diopter lens (Fig. 5.13 from Visual effects cinematography By Zoran Perisic), which is
attached to the front of a normal lens and which adds power on one side of the field of view.
The frame from “Somewhere in Time” (© Universal Studios, 1980) illustrates the action of
the “split-diopter” lens added to the normal camera lens. Both the foreground field on the
right (with Christopher Reeve as “Richard Collier”) and left-hand background field (with Jane
Seymour as “Elyse McKenna,” the white garden bench, and the trees) appear to be “in focus.”
The split diopter lens adds refractive power (thus shortening the focal length) for half the field.
Because the sensor is the same distance from the rear vertex of these two “half-systems,” the
object plane that is in focus in the half field with the additional power is closer to the lens. In
this example, the split-diopter lens is oriented to “split” the fields through the vertical white
pillar and adds power to the right half of the field. The left side of the vertical pillar is “fuzzier”
than the right side, where the features of the wood grain are visible. Note that the trees in
the background on the right are out of focus, while those on the left are sharp. The audience
likely does not notice the discrepancies in the image planes.
Frame from “Somewhere in Time” (© Universal Studios, 1980) showing use of “split-diopter
lens.” Both foreground and background are “in focus” but note that the left side of the foreground
pillar is “fuzzy” while the right side is “sharp.”
A system consisting of both optics and sensor is “diffraction-limited” if the pixel size of the sensor
(smallest resolvable spot) is smaller than the linear dimension of the diffraction spot. The system
is “detector-limited” / “sensor-limited” if the linear dimension of the individual sensor elements is
larger than the diffraction spot.
4.1
Criterion for “Acceptable Blur”
The discussion of the limiting “blur” of an imaging system may be extended to characterize the range of “distances” (or “depths”) over which images of point objects exhibit the “same” (or at least “similar”) blur dimensions. If specified in object space, the distance range is called the “depth of field;” the same metric in image space is the “depth of focus.” The depth of field may be thought of as the “zone of acceptable sharpness” for object locations.
There is no one way to define the depths of field and focus, but we can rather easily derive a metric based on ray optics and a hybrid metric that includes the concept of “diffraction” from wave optics (where the wave aspects must be taken “on faith” at this point). The measurement is based upon the linear dimension $B'$ of the “acceptable blur.” This may be set by a metric of acceptable spatial resolution, by the size of the sensor elements, or by the diameter of the diffraction spot in the hybrid metric. Consider a hypothetical value of $B'$ shown in the figure. From this value, it is easy to determine the range of possible axial distances that correspond to $B'$ in the ray model and use that to evaluate the corresponding dimension $B$ in object space via the transverse magnification:
\[
M_T = -\frac{z'}{z} = \frac{B'}{B}
\]
The calculation of depth of field: $B'$ is the linear dimension of the blur for the system (either the diameter of the diffraction spot in a diffraction-limited system or the dimension of the sensor element in a detector-limited system). The locations $z' \pm \delta'$ specify locations in image space where the geometrical blur has the same linear size. The corresponding locations in object space are the limits of the “depth of field.”
As shown in the figure for a given $B'$, the “blur” spots are located at two positions equidistant from the “in-focus” image. We assign the name $\delta'$ to the distance between the “in-focus” image and the geometrically blurred images, so these two planes are located at $z' \pm \delta'$. The depth of focus in this model is twice $\delta'$:
\[
\Delta z' = 2 \cdot \delta'
\]
In the ray model, the drawing shows that:
\[
\frac{B'}{\delta'} = \frac{D}{z'} \implies \delta' = B' \cdot \frac{z'}{D} \cong B' \cdot f/\#
\]
(in the case where the object distance is “many” focal lengths so that the image distance is only slightly longer than a focal length). A small $B'$ gives a small $\delta'$, and a large f/# gives a large $\delta'$.
The object distances $z_1$ and $z_2$ corresponding to these image locations may be evaluated from the imaging equation for the corresponding image distances $z_1' = z' - \delta'$ and $z_2' = z' + \delta'$. It is easy to see that the absolute magnification $|M_T|$ is smaller for the smaller image distance, i.e., $|M_T|$ for $z_1' = z' - \delta'$ is smaller than $|M_T|$ for the larger image distance $z_2' = z' + \delta'$. The nonlinearity of the imaging equation ensures that the distances between the in-focus object distance $z$ and the extrema are not equal, i.e., $z_1 - z \neq z - z_2$, thus requiring labels for both: $z_1 = z + \delta_1$ and $z_2 = z - \delta_2$. However, if $\delta'$ is small, then the concept of longitudinal magnification $M_L$ allows simple approximate expressions for the object distances. We already derived a simple expression for $M_L$ in terms of the transverse magnification $M_T$ by differentiating both sides of the imaging equation (in this derivation only, $z_1$ and $z_2$ denote the object and image distances):
\[
d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = d\left(\frac{1}{f}\right) = 0
\]
\[
d\left(\frac{1}{z_1} + \frac{1}{z_2}\right) = \left(-\frac{1}{z_1^2}\right) dz_1 + \left(-\frac{1}{z_2^2}\right) dz_2 = 0
\]
\[
\implies \frac{dz_2}{dz_1} = -\left(\frac{z_2^2}{z_1^2}\right) = -\left(\frac{z_2}{z_1}\right)^2
\]
\[
M_L = \frac{(\Delta z)'}{\Delta z} = -\left(\frac{z_2}{z_1}\right)^2 = -(M_T)^2 < 0
\]
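The increments in object distance are related to the increments in image distance via the longitudinal magnification:
\[
\delta' \cong |M_L| \cdot \delta_1 \cong |M_L| \cdot \delta_2 \implies \delta_1 \cong \delta_2 \cong \frac{\delta'}{|M_L|}
\]
\[
z_1 = z + \delta_1 \cong z + \frac{\delta'}{|M_L|} = z + \frac{\delta'}{M_T^2}
\]
\[
z_2 = z - \delta_2 \cong z - \frac{\delta'}{|M_L|} = z - \frac{\delta'}{M_T^2}
\]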
So the depth of field is proportional to the f/# and to the linear dimension of the acceptable blur:
\[
\Delta z = z_1 - z_2 = \delta_1 + \delta_2 \cong 2 \cdot \frac{\delta'}{|M_L|} = 2 \cdot \frac{\delta'}{M_T^2} = 2 \cdot \frac{B' \cdot f/\#}{M_T^2}
\]
\[
\Delta z \cong 2 \cdot \left(\frac{B'}{M_T^2}\right) \cdot f/\# \propto f/\#
\]
In the detector-limited case where the blur dimension is determined by the pixel dimension $b'$, the depth of field is proportional to the f/#:
\[
\Delta z \cong 2 \cdot \frac{b'}{M_T^2} \cdot f/\# \propto f/\# \quad \text{(in ray model)}
\]
Note that the depth of field is larger in “slower” systems (with large f-numbers and small cone angles).
If we add the wave concept of “diffraction,” the linear dimension $B'$ is determined by the diffraction pattern, which may be written in terms of the wavelength and the focal ratio. Assume that the linear dimension of image blur has been measured for a particular imaging system at the specific pair of object and image distances ($z$ and $z'$ respectively) of interest:
Blur in a diffraction-limited system with aperture diameter D. The image of the point source is a diffraction pattern at the image plane whose linear dimension (using some criterion) is $B'$.
For example, the image of a point source located a distance $z$ from the system could be measured to find this limiting “blur diameter” $B'$, where the prime indicates that the measurement is made in image space. In a diffraction-limited system, the discussion of Fraunhofer diffraction in imaging shows that one possible measure for $B'$ is the diameter of the central lobe of the diffraction spot:
\[
B' = 2.44 \cdot \lambda_0 \cdot \frac{z'}{D} \cong 2.44 \cdot \lambda_0 \cdot \frac{f}{D} = 2.44 \cdot \lambda_0 \cdot f/\#
\]
Substituting this blur diameter into the ray-model expression for the depth of field gives:
\[
\Delta z \cong 2 \cdot (2.44 \cdot \lambda_0 \cdot f/\#) \cdot \frac{f/\#}{M_T^2} = 4.88 \cdot \frac{\lambda_0 \cdot (f/\#)^2}{M_T^2} \quad \text{(if accounting for diffraction)}
\]
So the depths of field and of focus are proportional to the square of the f/# in the diffraction-limited case.
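The two scalings (linear in the f/# for a fixed blur, quadratic when the blur is the diffraction spot) can be compared numerically. The function names and the sample values (a 5 μm pixel, f/4, and a magnification of −0.05) are assumptions chosen only for illustration; lengths are in micrometers.

```python
def depth_of_field_ray(blur_um, f_number, m_t):
    """Ray-model depth of field, 2 * B' * (f/#) / M_T**2, in micrometers."""
    return 2.0 * blur_um * f_number / m_t**2

def depth_of_field_hybrid(wavelength_um, f_number, m_t):
    """Hybrid metric: B' = 2.44 * lambda0 * (f/#) used in the ray-model formula."""
    blur_um = 2.44 * wavelength_um * f_number
    return depth_of_field_ray(blur_um, f_number, m_t)

# Assumed illustration values: 5 um pixel, lambda0 = 0.5 um, M_T = -0.05.
ray_f4 = depth_of_field_ray(5.0, 4.0, -0.05)
ray_f8 = depth_of_field_ray(5.0, 8.0, -0.05)
hyb_f4 = depth_of_field_hybrid(0.5, 4.0, -0.05)
hyb_f8 = depth_of_field_hybrid(0.5, 8.0, -0.05)

print(ray_f8 / ray_f4)   # ratio for the fixed-blur (detector-limited) case
print(hyb_f8 / hyb_f4)   # ratio for the diffraction-limited case
```

Doubling the f/# doubles the ray-model result and quadruples the hybrid result, which is the distinction drawn in the text.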
4.2
Depth of Field via Rayleigh’s Quarter-Wave Rule
We can also derive the depth of focus by finding the range of image locations that satisfy Rayleigh’s rule applied to defocus, and then transform those image distances back into object space via the imaging relation to find the depth of field.
The necessary task is to find the change in the image location for a given change in the wavefront error at the edge of the pupil. In the figure, the ideal reference wavefront has radius $R_1$ ($R_1 \cong f$ if the object is a large distance away) and the wavefront with defocus has radius $R_2 = R_1 + \delta' \cong f + \delta'$, where $\delta'$ is the change in location of the focal plane with an added quadratic phase of $\Delta W_{020} = \pm \frac{\lambda_0}{4}$.
The quadratic-phase approximation to the new wavefront is:
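\[
\begin{aligned}
W[x, y] &= \frac{x^2 + y^2}{2R_2} = \frac{x^2 + y^2}{2\left(R_1 + \delta'\right)} = \frac{x^2 + y^2}{2R_1\left(1 + \frac{\delta'}{R_1}\right)} \\
&= \frac{x^2 + y^2}{2R_1}\left(1 + \frac{\delta'}{R_1}\right)^{-1} = \frac{x^2 + y^2}{2R_1} \cdot \sum_{n=0}^{+\infty} (-1)^n \left(\frac{\delta'}{R_1}\right)^n \\
&= \frac{x^2 + y^2}{2R_1}\left(1 + (-1)\frac{\delta'}{R_1} + \frac{(-1)(-2)}{2!}\left(\frac{\delta'}{R_1}\right)^2 + \frac{(-1)(-2)(-3)}{3!}\left(\frac{\delta'}{R_1}\right)^3 + \cdots\right) \\
&\cong \frac{x^2 + y^2}{2R_1}\left(1 - \frac{\delta'}{R_1}\right) \quad \left(\text{if } \left(\frac{\delta'}{R_1}\right)^2 \ll \left|\frac{\delta'}{R_1}\right| \cong \left|\frac{\delta'}{f}\right| \ll 1\right) \\
&= \frac{x^2 + y^2}{2R_1} - \delta' \cdot \frac{x^2 + y^2}{2R_1^2}
\end{aligned}
\]
where the first term is the quadratic-phase approximation to the ideal wavefront and the second term is the additional effect of the defocus.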
Change in image position $\delta'$ as a function of the wavefront error $\Delta W = W_{020}$ for defocus.
In the limit where the object distance is large, the image distance $R_1$ is approximately equal to the focal length $f$, so this expression simplifies to:
\[
W[x, y] \cong \frac{x^2 + y^2}{2f} - \delta' \left(\frac{x^2 + y^2}{2f^2}\right)
\]
\[
\Delta W[x, y] \cong W[x, y] - \frac{x^2 + y^2}{2f} \implies \Delta W[x, y] \cong -\delta' \left(\frac{x^2 + y^2}{2f^2}\right)
\]
If the wavefront error is positive, $\Delta W > 0 \implies \delta' < 0$, which means that the image moves “towards” the lens as shown in the figure.
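The magnitude of the wavefront error at the edge of the pupil (where, say, $x = \frac{d_0}{2}$ and $y = 0$) is:
\[
|\Delta W| = \left|\Delta W\left[x = \frac{d_0}{2}, y = 0\right]\right| = \delta' \cdot \frac{\left(\frac{d_0}{2}\right)^2 + 0^2}{2f^2} = \delta' \cdot \frac{d_0^2}{8f^2}
\]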
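We can now apply Rayleigh’s rule that the image is effectively ideal if the maximum wavefront error is less than a quarter wave, so that the single-sided depth of focus is easy to evaluate:
\[
\frac{\lambda_0}{4} > |\Delta W| = \delta' \cdot \frac{d_0^2}{8f^2} \implies \delta' \cong \frac{\lambda_0}{2} \cdot \left(\frac{2f}{d_0}\right)^2 = 2\lambda_0 \left(\frac{f^2}{d_0^2}\right) = 2\lambda_0 \left(\frac{f}{d_0}\right)^2
\]
\[
\implies \delta' \cong 2\lambda_0 \cdot (f/\#)^2 \quad \text{using Rayleigh's rule for ideal imaging}
\]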
In visible light with $\lambda_0 \cong 0.5\ \mu\text{m}$, the change in image position under the Rayleigh criterion is
\[
\delta'\left[\lambda_0 \cong 0.5\ \mu\text{m}\right] \cong (f/\#)^2\ [\mu\text{m}]
\]
In words, an image in visible light appears to be “in focus” if the distance of the actual image plane from the ideal image plane in micrometers is no larger than the square of the f/#. For example, if the lens is used at f/4, the actual image plane must be within 16 μm of the ideal location; if at f/16, the actual image plane must be within 256 μm ≅ 0.25 mm of the ideal location. Note the similarities and the differences with the rule of thumb that the size of the diffraction spot in micrometers is equal to the f/#.
The depth of focus is twice this value because we can defocus on either side of the ideal image plane:
\[
\text{Depth of focus: } (\Delta z)' = 2\delta' \cong 4\lambda_0 (f/\#)^2 \cong 2 \cdot (f/\#)^2\ [\mu\text{m}]
\]
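The rule of thumb is easy to tabulate. The sketch below evaluates the single-sided defocus tolerance and the full depth of focus for a few focal ratios; the function name and the default wavelength of 0.5 μm are illustrative choices that mirror the text.

```python
def rayleigh_defocus_um(f_number, wavelength_um=0.5):
    """Single-sided defocus tolerance, 2 * lambda0 * (f/#)**2, in micrometers."""
    return 2.0 * wavelength_um * f_number**2

def depth_of_focus_um(f_number, wavelength_um=0.5):
    """Full depth of focus: twice the single-sided tolerance."""
    return 2.0 * rayleigh_defocus_um(f_number, wavelength_um)

for n in (2, 4, 8, 16):
    print(f"f/{n}: delta' = {rayleigh_defocus_um(n):.0f} um, "
          f"depth of focus = {depth_of_focus_um(n):.0f} um")
```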
Now convert this to the object space via the longitudinal magnification to find the depth of field:
\[
M_L = \frac{\delta'}{\delta} = \frac{(\Delta z)'}{\Delta z} = -(M_T)^2
\]
\[
\Delta z = 2 \cdot \delta \cong \frac{(\Delta z)'}{|M_L|} = \frac{(\Delta z)'}{(M_T)^2}
\]
\[
\Delta z \cong \frac{4\lambda_0 (f/\#)^2}{(M_T)^2}
\]
which again is proportional to the square of the f-ratio and is quite similar to the “hybrid” metric for depth of field in the diffraction-limited case from the last section:
\[
\text{Depth of field: } \Delta z_{\text{Hybrid}} \cong 4.88 \cdot \left(\frac{\lambda_0 (f/\#)^2}{(M_T)^2}\right) \simeq \Delta z_{\text{Rayleigh}} \cong 4 \cdot \left(\frac{\lambda_0 (f/\#)^2}{(M_T)^2}\right)
\]
The fact that the two expressions are not identical should be no surprise since they were derived using different assumptions.
Note that the depth of field increases as the square of the f/#, so stopping down the lens by a factor of 2 has a big impact: it increases the depth of field by about a factor of 4. Since the transverse magnification is less than unity for most real imaging setups (and a lot less for distant objects), the depth of field increases rapidly as the object distance increases.
It might be useful to do an example. Consider a normal lens with f = 50 mm acting in visible light (λ0 = 500 nm = 0.5 μm) with the aperture wide open (say, f/2 so that the diameter of the entrance pupil is d0 = 25 mm) imaging a nearby object with z1 = 1 m:
\[
z_2 = \left(\frac{1}{50\ \text{mm}} - \frac{1}{1000\ \text{mm}}\right)^{-1} \cong 52.63\ \text{mm}
\]
\[
M_T = -\frac{z_2}{z_1} = -\frac{52.63\ \text{mm}}{1000\ \text{mm}} = -0.05263
\]
where (again) the negative sign on the transverse magnification means that the image is “upside down” compared to the object. The depth of focus is:
\[
\text{depth of focus at f/2: } (\Delta z)' = 2\delta' \cong 4 \cdot 0.5\ \mu\text{m} \cdot 2^2 = 8\ \mu\text{m}
\]
And the depth of field is obtained by dividing by the square of the transverse magnification:
\[
\text{depth of field at f/2: } \Delta z \cong \frac{(\Delta z)'}{M_T^2} = \frac{8\ \mu\text{m}}{(-0.05263)^2} \cong 2.89\ \text{mm}
\]
If we stop the lens down to, say, f/16 (a factor of 8 in f/#), the depths of focus and field are larger by a factor of 64:
\[
\text{depth of focus at f/16: } (\Delta z)' = 2\delta' \cong 4 \cdot 0.5\ \mu\text{m} \cdot 16^2 = 512\ \mu\text{m} \cong 0.5\ \text{mm}
\]
\[
\text{depth of field at f/16: } \Delta z \cong \frac{512\ \mu\text{m}}{(-0.05263)^2} \cong 185\ \text{mm}
\]
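If the object is a large distance away, say z1 = 100 m with the lens wide open at f/2, the transverse
magnification is much smaller:
$$z_2 = \left(\frac{1}{50\ \mathrm{mm}} - \frac{1}{100\ \mathrm{m}}\right)^{-1} \cong 50.025\ \mathrm{mm}$$
$$M_T = -\frac{z_2}{z_1} = -\frac{50.025\ \mathrm{mm}}{100\ \mathrm{m}} = -5.0025 \times 10^{-4}$$
The depth of focus is the same as it was for the close-up image at f/2:
$$(\Delta z)' = 4 \cdot 0.5\ \mu\mathrm{m} \cdot 2^2 = 8\ \mu\mathrm{m}$$
but the much smaller value for the transverse magnification means that the depths of field at f/2
and at f/16 are much larger:
$$\text{depth of field at } f/2\text{: } \Delta z \cong \frac{8\ \mu\mathrm{m}}{(-5.0025 \times 10^{-4})^2} \cong 32\ \mathrm{m}$$
$$\text{depth of field at } f/16\text{: } \Delta z \cong \frac{512\ \mu\mathrm{m}}{(-5.0025 \times 10^{-4})^2} \cong 2\ \mathrm{km}$$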
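The arithmetic in these examples is easy to reproduce numerically. The following is a minimal sketch, assuming the quarter-wave depth of focus 4λ0(f/#)^2 and the thin-lens imaging equation used above; the function names and the default wavelength of 0.5 μm are illustrative choices, not part of the original notes.

```python
def depth_of_focus_um(f_number, wavelength_um=0.5):
    """Quarter-wave depth of focus (Delta z)' = 4 * lambda0 * (f/#)^2, in micrometers."""
    return 4.0 * wavelength_um * f_number ** 2


def image_distance_mm(focal_length_mm, object_distance_mm):
    """Thin-lens imaging equation 1/z1 + 1/z2 = 1/f, solved for the image distance z2."""
    return 1.0 / (1.0 / focal_length_mm - 1.0 / object_distance_mm)


def depth_of_field_um(focal_length_mm, object_distance_mm, f_number, wavelength_um=0.5):
    """Depth of field = depth of focus / (M_T)^2, in micrometers."""
    z2_mm = image_distance_mm(focal_length_mm, object_distance_mm)
    m_t = -z2_mm / object_distance_mm  # transverse magnification
    return depth_of_focus_um(f_number, wavelength_um) / m_t ** 2


if __name__ == "__main__":
    for z1_mm in (1.0e3, 1.0e5):  # object at 1 m and at 100 m
        for f_number in (2, 16):
            dof_m = depth_of_field_um(50.0, z1_mm, f_number) * 1e-6
            print(f"z1 = {z1_mm / 1e3:g} m, f/{f_number}: depth of field = {dof_m:.4g} m")
```

Because the depth of field scales as 1/(M_T)^2, changing the object distance from 1 m to 100 m changes the result by roughly four orders of magnitude at a fixed focal ratio.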
Depth of field of lens focused at z1 = 20 ft ≅ 6 m for three focal ratios: f/1.8, f/5.6, and f/16,
showing increase in depth of field with increasing focal ratio (from http://www.engadget.com).
4.3 Hyperfocal Distance
The last example just presented, where the object distance z1 = 100 m and the depth of field ∆z ≅
2 km, suggests another useful imaging metric: the shortest object distance for which the depth of
field extends to infinity. This is called the hyperfocal distance (z1)hyperfocal, and the corresponding
image distance (z2)hyperfocal is the sum of the focal length and the “defocus distance” δ′:
$$(z_1)_{\text{hyperfocal}} + \delta = \infty \implies (z_2)_{\text{hyperfocal}} - \delta' = f \implies (z_2)_{\text{hyperfocal}} = f + \delta'$$
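The hyperfocal object distance (z1)hyperfocal satisfies the imaging equation for this image distance:
$$\frac{1}{(z_1)_{\text{hyperfocal}}} + \frac{1}{(z_2)_{\text{hyperfocal}}} = \frac{1}{f}$$
$$\text{Hyperfocal Distance: } (z_1)_{\text{hyperfocal}} = \left(\frac{1}{f} - \frac{1}{f + \delta'}\right)^{-1} = \frac{f^2 + \delta' f}{\delta'} = f + \frac{f^2}{\delta'} \cong f + \frac{f^2}{2\lambda_0 (f/\#)^2} \cong \frac{f^2}{2\lambda_0 (f/\#)^2}$$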
where we can also interpret this in terms of the diameter of the diffraction spot:
$$(z_1)_{\text{hyperfocal}} \cong \frac{f^2}{(2\lambda_0\, f/\#) \cdot (f/\#)} = \frac{f^2}{(f/\#) \cdot d_{\text{diffraction spot}}}$$
where d(diffraction spot) ≅ 2 · λ0 · f/#. So if we have a so-called “normal lens” with f = 50 mm acting
at f/2 (close to wide open) and in light with λ0 = 500 nm, the hyperfocal distance is:
$$(z_1)_{\text{hyperfocal}} \cong \frac{(50\ \mathrm{mm})^2}{2 \cdot 500\ \mathrm{nm} \cdot 2^2} \cong 625\ \mathrm{m}$$
which is quite distant. If we stop the lens down to f/16, we get:
$$(z_1)_{\text{hyperfocal}} \cong \frac{(50\ \mathrm{mm})^2}{2 \cdot 500\ \mathrm{nm} \cdot 16^2} \cong 9.8\ \mathrm{m}$$
which is quite a lot closer to the lens. This means that objects at all distances in the interval
10 m ≲ z1 < ∞ should appear to be “in focus” if the lens is used at f/16.
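The two numerical values can be checked with a few lines of code. This is a minimal sketch, assuming the defocus distance δ′ = 2λ0(f/#)^2 from the derivation above; the function name and its choice of millimeter units are illustrative.

```python
def hyperfocal_distance_mm(focal_length_mm, f_number, wavelength_mm=0.5e-3):
    """Return (exact, approximate) hyperfocal distances in millimeters.

    exact       = f + f^2 / delta'
    approximate = f^2 / delta'   (valid when f^2 / delta' >> f)
    where delta' = 2 * lambda0 * (f/#)^2 is the defocus distance.
    """
    defocus_mm = 2.0 * wavelength_mm * f_number ** 2
    approximate = focal_length_mm ** 2 / defocus_mm
    exact = focal_length_mm + approximate
    return exact, approximate


if __name__ == "__main__":
    for f_number in (2, 16):
        exact, approx = hyperfocal_distance_mm(50.0, f_number)
        print(f"f/{f_number}: exact = {exact / 1e3:.3f} m, approximate = {approx / 1e3:.3f} m")
```

The exact and approximate values differ only by the focal length itself (50 mm here), which is why the leading term f is normally dropped.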
4.4 Methods for Increasing Depth of Field
1. Google Lens: http://www.google.com/patents/US6320979
2. Focus stacking: digital combinations of images collected at different focus settings. Different
images are combined based on local sharpness to produce an image with extended depth of
field.
3. Light-field camera = plenoptic camera that captures the four-dimensional light field (two
position and two direction coordinates for each ray). An example of such a camera is the Lytro,
which uses a matrix of microlenses to collect ray direction information in addition to color and
lightness. This stored information allows recovery of focused information at different depths.
4. Cameras with different focal settings for different colors of light. The information is combined
digitally so that the sharp edge data from the color with the large f/# are merged with the blurrier
structure in the other colors.
4.5 Sidebar: Transverse Magnification vs. Focal Length
It may be useful to derive the relationship between transverse magnification and focal length for a
given object distance. We know the imaging equation for object distance z1, image distance z2, and
focal length f:
$$\frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{f}$$
We already know that for an imaging system consisting of two or more lenses, the object distance is
measured to the object-space principal point, the image distance is measured from the image-space
principal point, and the focal length is replaced by the effective focal length. For a specific object
distance z1 and a fixed focal length f , the equation may be rearranged to determine the image
distance:
$$z_2 = \frac{z_1 \cdot f}{z_1 - f}$$
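We can substitute this into the expression for the transverse magnification:
$$M_T = -\frac{z_2}{z_1} = -\frac{1}{z_1}\left(\frac{z_1 \cdot f}{z_1 - f}\right) = \frac{f}{f - z_1} = \frac{f}{z_1}\left(\frac{1}{\frac{f}{z_1} - 1}\right) = -\frac{f}{z_1}\left(\frac{1}{1 - \frac{f}{z_1}}\right)$$
If the focal length is shorter than the object distance, then the term |f/z1| < 1 and the geometric
series converges:
$$\begin{aligned}
M_T &= -\left(\frac{f}{z_1}\right) \cdot \left(\frac{1}{1 - \frac{f}{z_1}}\right) \\
&= -\left(\frac{f}{z_1}\right) \cdot \sum_{n=0}^{\infty} \left(\frac{f}{z_1}\right)^n \\
&= -\left(\frac{f}{z_1}\right) \cdot \left(1 + \frac{f}{z_1} + \left(\frac{f}{z_1}\right)^2 + \cdots\right) \\
&= -\frac{f}{z_1} - \left(\frac{f}{z_1}\right)^2 - \left(\frac{f}{z_1}\right)^3 - \cdots \\
&\cong -\frac{f}{z_1} \quad \text{if } f \ll z_1
\end{aligned}$$
where the series for (1 − t)^(−1) has been used. For a lens with a fixed focal length f but two object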
distances (z1 )a and (z1 )b the transverse magnifications are:
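$$(M_T)_a \cong -\frac{f}{(z_1)_a}, \qquad (M_T)_b \cong -\frac{f}{(z_1)_b}$$
so the difference in transverse magnifications is:
$$\begin{aligned}
\Delta M_T = (M_T)_a - (M_T)_b &\cong \left(-\frac{f}{(z_1)_a}\right) - \left(-\frac{f}{(z_1)_b}\right) \\
&\cong (-f) \cdot \left(\frac{1}{(z_1)_a} - \frac{1}{(z_1)_b}\right) \\
&= (-f) \cdot \left(\frac{(z_1)_b - (z_1)_a}{(z_1)_a \cdot (z_1)_b}\right) \\
&= f \cdot \frac{(z_1)_a - (z_1)_b}{(z_1)_a \cdot (z_1)_b} = f \cdot \frac{\Delta z_1}{(z_1)_a \cdot (z_1)_b}
\end{aligned}$$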
We have already seen that the transverse magnification varies with the focal length of the lens:
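$$\frac{1}{f} = \frac{1}{z_1} + \frac{1}{z_2} = \frac{1}{z_2} \cdot \left(\frac{z_2}{z_1} + 1\right) = \frac{1}{z_2} \cdot \left(1 - \left(-\frac{z_2}{z_1}\right)\right) = \frac{1}{z_2} \cdot (1 - M_T)$$
$$\implies \frac{z_2}{f} = (1 - M_T) \implies \frac{f}{z_2} = \frac{1}{1 - M_T}$$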
If the object distance z1 is large, then |MT| is close to zero, which means that we can substitute the
geometric series:
$$\frac{1}{1 - t} = \sum_{n=0}^{+\infty} t^n \quad \text{if } |t| < 1$$
$$\frac{f}{z_2} = \frac{1}{1 - M_T} = \sum_{n=0}^{+\infty} (M_T)^n = 1 + M_T + (M_T)^2 + \cdots \cong 1 + M_T \quad \text{if } |M_T| \ll 1 \implies z_2 \simeq f$$
which implies that the magnification increases with the focal length.
We should check this for some known cases: if the object distance z1 = +∞, then z2 = f and:
$$z_1 = \infty \implies \frac{f}{z_2} = 1 \cong 1 + |M_T| \implies |M_T| \cong 0, \text{ correct answer}$$
If the object distance is z1 = 100 · f, then the image distance and approximate transverse magnification are:
$$z_2 = \frac{100}{99} f \implies \frac{f}{z_2} = \frac{99}{100} \cong 1 + M_T \implies M_T \cong -\frac{1}{100}$$
The actual transverse magnification is:
$$M_T = -\frac{\left(\frac{100}{99}\right)}{100} = -\frac{1}{99} \cong -\frac{1}{100}$$
so the approximation is still quite good.
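The quality of the far-object approximation can be tabulated for several object distances. The sketch below assumes the thin-lens results above, namely the exact magnification f/(f − z1) and its leading term −f/z1; the function names are illustrative.

```python
def transverse_magnification(focal_length, object_distance):
    """Exact thin-lens transverse magnification M_T = -z2/z1 = f / (f - z1)."""
    return focal_length / (focal_length - object_distance)


def transverse_magnification_far(focal_length, object_distance):
    """Leading term of the geometric series, M_T ~ -f/z1, valid for f << z1."""
    return -focal_length / object_distance


if __name__ == "__main__":
    f = 50.0  # focal length in mm
    for ratio in (2, 10, 100, 1000):  # object distance in units of f
        z1 = ratio * f
        exact = transverse_magnification(f, z1)
        approx = transverse_magnification_far(f, z1)
        rel_err = abs(approx - exact) / abs(exact)
        print(f"z1 = {ratio:5d} f: exact = {exact:.6f}, approx = {approx:.6f}, "
              f"relative error = {rel_err:.2%}")
```

The relative error of the approximation is f/z1, so it is already about 1% at z1 = 100 · f.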
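Now consider two distant objects a and b at object distances (z1)a > (z1)b ≫ f; we have:
$$\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} = \frac{\Delta z_1}{f}$$
$$\frac{(z_1)_a}{f} - \frac{(z_1)_b}{f} \cong (1 + M_T)_a - (1 + M_T)_b = (M_T)_a - (M_T)_b = \Delta M_T$$
$$\implies \frac{\Delta z_1}{f} \cong \Delta M_T$$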
which shows that the difference in transverse magnifications decreases as the focal length f increases
for fixed ∆z1 . In words, if two distant objects are separated along the optical axis by the distance
∆z, the transverse magnifications for the two objects are more similar if the focal length f is large,
which gives the impression to the viewer that the objects are “close together.”
Consider the example shown below; the subjects are a pair of 15-inch-diameter Rodman smoothbore
cannon dating from 1864 that are preserved on restored carriages at Fort Foote, Maryland, near my
childhood home (when I was growing up, the two barrels had not been mounted, but were lying on
the ground). The near and distant cannons are separated by the fixed distance ∆z1 . The images were
taken with a zoom lens: the first used a “telephoto” setting with equivalent focal length f1 = 140 mm
for the 35 mm film format (the actual focal length was f1 = 22.2 mm). The second image was taken
with equivalent focal length f2 = 32 mm for the 35 mm format (a “wide-angle” lens; the actual focal
length f2 = 6.6 mm). The difference in transverse magnifications clearly is smaller with the long focal
length (first image) as the distant cannon is readily visible; the tiny distant cannon is barely visible
in the second image. The transverse magnifications for the background cannon differ by nearly a
factor of 2.5 for the two images. This effect leads to the statement that telephoto lenses “compress”
the depth of field (though some vigorously dispute this statement for psychological reasons!).
Illustration of the variation in transverse magnification with focal length of the lens. The equivalent
focal length of the lens used to make the top image is f ≅ 140 mm (telephoto) and that for the
bottom is f ≅ 32 mm (wide angle). The background cannon is MUCH smaller in the second image.
Chapter 5
Aberrations
Aberrations may be loosely defined as deviations from predicted behavior of an optical system.
Chromatic aberrations describe deviations from predicted behavior due to variations in the refractive
index for different wavelengths of light. Monochromatic aberrations are variations from calculated
behavior due to the approximations used. For example, if we use just the first-order approximation
$$\sin[\theta] \cong \tan[\theta] \cong \theta$$
we can describe the deviations from predicted first-order behavior as the third-order aberrations.
The aberrations may be described in terms of waves or of rays. The wave aberration is the
departure of the wavefront from the ideal spherical wave that “should” emerge from the exit pupil
of the system to the image:
p [x, y] · exp [+iΦ [x, y]] = p [x, y] · exp [+iπW [x, y]]
where W [x, y] is the scalar wave aberration function measured in units of π radians at each point
in the exit pupil. Note that the spherical wave “converges” to a real image or “diverges” from a
virtual image.
The wave aberration function is the difference of the actual emerging wave from the ideal sphere,
which has the form:
$$x^2 + y^2 + z^2 = R^2 \implies z = R \cdot \sqrt{1 - \frac{x^2 + y^2}{R^2}}$$
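As a worked step (a sketch that assumes only the binomial series for the square root), the sphere may be expanded for pupil coordinates that are small compared with R. The first correction is the quadratic (paraboloidal) term of Gaussian optics and the next is a fourth-power term:

```latex
z = R \cdot \sqrt{1 - \frac{x^2 + y^2}{R^2}}
  \cong R - \frac{x^2 + y^2}{2R} - \frac{\left(x^2 + y^2\right)^2}{8R^3} - \cdots
```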
5.1 Chromatic Aberration
In the earliest days of optics, all optical systems were constructed from single lenses (“singlets”) and
therefore suffered from chromatic aberrations due to the physical mechanism of dispersion. We saw
that the index of refraction of optical materials decreases with increasing wavelength λ in regions of
normal dispersion. At longer wavelengths in a regime with normal dispersion, a lens with positive
power will have less refractive power φ (longer focal length f). Conversely, a lens with negative
power will have a longer negative focal length at longer wavelengths.
The impact of chromatic aberration on the image is minimized if the focal length is long and the focal
ratio is large. For this reason, early telescopes for astronomical viewing were made very long, in part
for magnification and in part to reduce the visibility of chromatic aberrations.
The aerial telescope of Johannes Hevelius with a focal length of f = 45 m ≅ 148 ft and an aperture
diameter of d ≅ 220 mm ≅ 8.5 in
The observation that different glasses have different dispersions is the basis for the principle of
achromatization (from the Greek words for without color ), where two optical elements made from
glasses with different dispersion characteristics are combined to match the focal lengths at two
different wavelengths (typically red and blue). An achromatic doublet is fabricated from a positive
element made from crown glass with a lower refractive index and lower dispersion, and a negative
element made of flint glass with a larger refractive index and a larger dispersion. For an achromat
with a positive focal length (converging lens), the chromatic aberrations of the crown and flint
elements act in opposition so that the focal lengths match at the two wavelengths. If the component
lenses are in contact (often the curvatures are designed to match so that they may be cemented
together), then the power of the positive element must be larger in magnitude than that of the
negative element (its focal length must be shorter).
Lens systems may be built that correct for three or more wavelengths. It may be obvious that the
number of elements must match or exceed the number of corrected wavelengths. Apochromats have
at least three elements to correct the focal length at three different wavelengths (typically red, green,
and blue) and are fabricated from three glass elements with different dispersion characteristics. Of
course, the need for the additional element(s) means that apochromats tend to be more expensive
than achromats.
Principle of the achromat: the first singlet lens exhibits chromatic aberration because of the
dispersion of the glass (nred < ngreen < nblue ), which means that red light focuses farther away.
Add a second element of flint glass with negative power that matches the focal lengths for red and
blue light to form an “achromat.”
Apochromat made of three elements to correct focus at three wavelengths.
The traditional wavelengths used to design optics were specified by Fraunhofer based on absorption lines in the solar spectrum:

Line    λ [nm]    n for Crown    n for Flint
C       656.28    1.51418        1.69427
D       589.59    1.51666        1.70100
F       486.13    1.52225        1.71748
The design of achromats is based on the dispersion of the glass, which we already specified:

Refractivity:        nD − 1, where 1.5 ≤ nD ≤ 1.75
Mean Dispersion:     nF − nC > 0, the difference between the blue and red indices
Partial Dispersion:  nD − nC > 0, the difference between the yellow and red indices
Abbé Number:         ν ≡ (nD − 1) / (nF − nC), the ratio of refractivity and mean dispersion, where 25 ≤ ν ≤ 65
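These definitions can be evaluated directly for the two glasses in the table of Fraunhofer lines. The following is a minimal sketch; the dictionary layout and function names are illustrative, and the index values are copied from that table.

```python
# Refractive indices at the Fraunhofer C, D, and F lines (values from the table above).
GLASSES = {
    "crown": {"C": 1.51418, "D": 1.51666, "F": 1.52225},
    "flint": {"C": 1.69427, "D": 1.70100, "F": 1.71748},
}


def refractivity(n):
    """Refractivity n_D - 1."""
    return n["D"] - 1.0


def mean_dispersion(n):
    """Mean dispersion n_F - n_C (blue index minus red index)."""
    return n["F"] - n["C"]


def abbe_number(n):
    """Abbe number: ratio of refractivity to mean dispersion."""
    return refractivity(n) / mean_dispersion(n)


if __name__ == "__main__":
    for name, indices in GLASSES.items():
        print(f"{name}: n_D - 1 = {refractivity(indices):.5f}, "
              f"n_F - n_C = {mean_dispersion(indices):.5f}, "
              f"Abbe number = {abbe_number(indices):.2f}")
```

The crown glass has the larger Abbé number (lower dispersion) and the flint glass the smaller one, which is the contrast that the achromat exploits.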
For a single thin lens, the power of the system is:
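$$\phi = \frac{1}{f} = (n - 1) \cdot \left(\frac{1}{R_1} - \frac{1}{R_2}\right) \equiv (n - 1) \cdot (C_1 - C_2), \quad \text{where } C \equiv \frac{1}{R}$$
The effect of dispersion on the power is obtained by differentiating:
$$\frac{d\phi}{dn} = (C_1 - C_2) = \frac{\phi}{n - 1} \implies d\phi = \phi \cdot \frac{dn}{n - 1} = \phi \cdot \frac{n_F - n_C}{n - 1} \equiv \frac{\phi}{\nu}$$
where ν is the Abbé number.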
For a two-lens system, we have already determined the formula for the power:
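$$\phi_{\text{eff}} = \phi_1 + \phi_2 - \phi_1 \cdot \phi_2 \cdot t$$
$$\implies d\phi_{\text{eff}} = d\phi_1 + d\phi_2 - \phi_2 t \cdot d\phi_1 - \phi_1 t \cdot d\phi_2 = (1 - \phi_2 t) \cdot d\phi_1 + (1 - \phi_1 t) \cdot d\phi_2$$
The power at the two wavelengths is matched so that:
$$d\phi_{\text{eff}} = 0 = (1 - \phi_2 t) \cdot d\phi_1 + (1 - \phi_1 t) \cdot d\phi_2 = (1 - \phi_2 t) \cdot \frac{\phi_1}{\nu_1} + (1 - \phi_1 t) \cdot \frac{\phi_2}{\nu_2}$$
$$\implies -(1 - \phi_2 t) \cdot \frac{\phi_1}{\nu_1} = (1 - \phi_1 t) \cdot \frac{\phi_2}{\nu_2}$$
$$\implies t = \frac{\frac{\nu_1}{\phi_1} + \frac{\nu_2}{\phi_2}}{\nu_1 + \nu_2} = \frac{f_1 \nu_1 + f_2 \nu_2}{\nu_1 + \nu_2}$$
$$\phi_{\text{eff}} = \frac{\phi_1 \nu_1 + \phi_2 \nu_2}{\nu_1 + \nu_2}$$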
If the two lenses are in contact so that t = 0, then:
$$\frac{\phi_2}{\phi_1} = \frac{f_1}{f_2} = -\frac{\nu_2}{\nu_1}$$
for an achromat that has the same focal length for red light (C line, λ = 656.28 nm) and blue light
(F line, λ = 486.13 nm).
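Combining the contact condition φ2/φ1 = −ν2/ν1 with φ1 + φ2 = φ (the total power at t = 0) fixes both element powers. The following sketch solves those two equations; the function name is illustrative, and the Abbé numbers in the example (about 64.0 for the crown and 30.2 for the flint) are values computed from the table of indices and are assumptions of the example rather than design data from the notes.

```python
def contact_achromat_focal_lengths(f_total, nu_1, nu_2):
    """Focal lengths (f1, f2) of a thin cemented doublet with total focal length f_total.

    Solves phi_1 + phi_2 = phi and phi_1/nu_1 + phi_2/nu_2 = 0, which gives
    phi_1 = phi * nu_1 / (nu_1 - nu_2) and phi_2 = -phi * nu_2 / (nu_1 - nu_2).
    """
    if nu_1 == nu_2:
        raise ValueError("elements in contact need glasses with different Abbe numbers")
    phi = 1.0 / f_total
    phi_1 = phi * nu_1 / (nu_1 - nu_2)
    phi_2 = -phi * nu_2 / (nu_1 - nu_2)
    return 1.0 / phi_1, 1.0 / phi_2


if __name__ == "__main__":
    f1, f2 = contact_achromat_focal_lengths(100.0, 64.02, 30.20)
    print(f"crown element: f1 = {f1:.2f} mm, flint element: f2 = {f2:.2f} mm")
```

The positive crown element comes out with the shorter focal length, in agreement with the statement that the positive power must be the larger of the two.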
Note that it is possible to use the same glass and adjust the focal lengths and distance to
achromatize. If ν 1 = ν 2 ≡ ν, then
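$$t = \frac{f_1 \nu_1 + f_2 \nu_2}{\nu_1 + \nu_2} \to t = \frac{(f_1 + f_2)\, \nu}{2\nu} = \frac{f_1 + f_2}{2}$$
$$\phi_{\text{eff}} = \frac{1}{f_{\text{eff}}} = \frac{\phi_1 + \phi_2}{2} \implies f_{\text{eff}} = \frac{2}{\left(\frac{1}{f_1} + \frac{1}{f_2}\right)} = 2 \cdot \frac{f_1 f_2}{f_1 + f_2}$$
5.2 Third-Order Optics, Monochromatic Aberrations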
Aberrations may be interpreted as corrections to the paraxial imaging behavior of optics that result
from adding the second term to the approximations for the trigonometric functions:
$$\sin[\varphi] \cong \varphi - \frac{\varphi^3}{3!}$$
$$\cos[\varphi] \cong 1 - \frac{\varphi^2}{2!}$$
$$\tan[\varphi] \cong \varphi + \frac{\varphi^3}{3}$$
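The expression for the cosine may be substituted into the formula for the path length ℓ1 of the ray
in terms of the object distance z1, the angle ϕ, and the radius of curvature R:
$$\frac{\ell_1}{z_1} = \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)(1 - \cos[\varphi])\right)^{\frac{1}{2}}$$
$$\left(\frac{\ell_1}{z_1}\right)_{\text{third order}} = \left(1 + \left(\frac{2R^2}{z_1^2} + \frac{2R}{z_1}\right)\left(1 - \left(1 - \frac{\varphi^2}{2!}\right)\right)\right)^{\frac{1}{2}} = \left(1 + \frac{R\varphi^2}{z_1}\left(\frac{R}{z_1} + 1\right)\right)^{\frac{1}{2}}$$
$$\implies (\ell_1)_{\text{third order}} \cong \left(z_1^2 + R\varphi^2 \cdot (R + z_1)\right)^{\frac{1}{2}}$$
which is a significantly more complicated expression than the first-order solution:
$$\left(\frac{\ell_1}{z_1}\right)_{\text{first order}} \cong 1 \implies \ell_1 \cong z_1$$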
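A quick numerical comparison shows how much the third-order expression improves on the first-order one. This is a sketch under the assumption that the path length has the form ℓ1² = z1² + 2R(R + z1)(1 − cos ϕ), as in the formula above; the function names and sample values are illustrative.

```python
import math


def path_length_exact(z1, radius, phi):
    """Path length from l1^2 = z1^2 + 2 R (R + z1) (1 - cos(phi))."""
    return math.sqrt(z1 ** 2 + 2.0 * radius * (radius + z1) * (1.0 - math.cos(phi)))


def path_length_third_order(z1, radius, phi):
    """Same expression with cos(phi) replaced by 1 - phi^2 / 2."""
    return math.sqrt(z1 ** 2 + radius * phi ** 2 * (radius + z1))


def path_length_first_order(z1, radius, phi):
    """Paraxial result with cos(phi) replaced by 1."""
    return z1


if __name__ == "__main__":
    z1, radius = 200.0, 50.0  # same (arbitrary) length units
    for phi in (0.05, 0.1, 0.2, 0.4):
        exact = path_length_exact(z1, radius, phi)
        print(f"phi = {phi:4.2f} rad: exact = {exact:.4f}, "
              f"third-order error = {path_length_third_order(z1, radius, phi) - exact:+.2e}, "
              f"first-order error = {path_length_first_order(z1, radius, phi) - exact:+.2e}")
```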
The wavefront emerging from the aperture of the system (the exit pupil ) may be characterized
by its shape or by rays at different locations in the pupil that are orthogonal to the wavefront. The
rays are defined by the end-point coordinates in the pupil plane (with height r from which they
emerge) and in the image plane (with height r0 to which they travel). The deviations of the wave
or of the rays from the ideal behavior are characterized by the concept of ray aberrations, which
typically are specified as a set of numerical values (coefficients) that describe the amount of deviation of the
ray or of the wavefront from the ideal. The order of the aberrations is determined by the highest
power of the term kept in the expansion for the sine in Snell’s law:
$$\sin[\theta] = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots$$
The higher powers in the expansion become significant at larger off-axis angles, where the paraxial
calculation deviates more from the actual behavior.
We can also consider deviations of the actual wavefront from the ideal in first-order paraxial or
Gaussian optics. For example, a translation of the ideal wavefront down the z-axis from the “ideal”
image location may be characterized by an “aberration” that is called defocus.
The decomposition of the wavefront into deviations from the ideal requires six coefficients of
powers of r and r0:

Spherical Aberration    r^4
Coma                    r^3 r0 cos[θ]
Astigmatism             r^2 r0^2 cos^2[θ]
Curvature of Field      r^2 r0^2
Distortion              r r0^3 cos[θ]
Piston Error            r0^4
The last of these, piston error, is a measure of a z-axis translation of the wavefront analogous to
defocus. As such, it has no effect on the image and often is not included in the list of aberrations.
In spherical aberration with positive coefficients, the rays from the margin of the pupil cross the
axis closer to the optic than the paraxial rays. The image of a point object created by a system with
spherical aberration shows a bright central region surrounded by a “halo” of light from the margin
of the pupil.
Spherical aberration describes the deviation of the rays emerging from the pupil from the ideal
convergence to an image point. If the aberration coefficient is positive, the rays emerging from the
margin of the pupil cross the optical axis closer to the optic than the paraxial rays close to axis.
In other words, the focal length for marginal rays is shorter than that for paraxial rays. Spherical
aberration is a circularly symmetric deviation of the wavefront from the quadratic-phase ideal of
Gaussian optics. The resulting wavefront error emerging from the pupil varies as the 4th power of the pupil coordinate, which has the shape of a china bowl. This shows that the rays near the edge of the pupil
are directed towards a point on the axis that is closer to the optic. Since spherical aberration is
a function only of the pupil-plane coordinates, it describes a shift-invariant deviation that may be
characterized by an impulse response.
The shape of the wavefronts emerging from the pupil for spherical aberration (black) and defocus
(red). Marginal rays emerging from a pupil that exhibits spherical aberration will cross the axis
(i.e., “focus”) closer to the pupil than the paraxial rays.
For coma, the deviations from ideal performance are larger for larger values of the
image plane coordinate r0 . If a point source and its image are located on axis, coma in the system
will have no effect on the image, but the image of a point source located off axis will be spread
differently at different values of the image plane coordinates. The image of an off-axis point source
will be “teardrop” shaped.
To introduce the concept of monochromatic aberrations, consider the complex amplitude of the
wavefront diverging from a specific object point [x0 , y0 ] to the location [x, y] in the entrance pupil:
w [x, y; x0 , y0 ] = p [x, y] · exp [+i Φ [x, y; x0 , y0 ]]
where:
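$$\Phi[x, y; x_0, y_0] = \exp\left[+2\pi i \frac{z_1}{\lambda_0}\right] \cdot \exp\left[+i\pi \frac{r^2}{\lambda_0}\left(\frac{1}{z_1} - \frac{1}{f}\right)\right] \cdot \exp\left[+2\pi i \cdot \Delta\Phi[x, y; x_0, y_0]\right]$$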
is the phase at the pupil due to a point source located at [x0 , y0 ] in the object plane, which includes
the quadratic phase of the ideal “spherical” wavefront converging to the image point plus any
phase error ∆Φ [x, y; x0 , y0 ] and p [x, y] specifies the magnitude function of the pupil (the so-called
apodization function). A similar expression may be written for light converging to the image point
[x0′, y0′] from the location [x′, y′] in the exit pupil. If the actual wavefront at [x, y] in the pupil lags
behind the ideal sphere (actually a paraboloid), then the light from that location converging to the
image plane must have been emitted earlier in time; the phase difference ∆Φ at that location [x, y]
in the pupil is positive. The map of ∆Φ [x, y; x0 , y0 ] may be decomposed into different “shapes”
described by different powers of the object coordinates [x0 , y0 ] and of the pupil coordinates [x, y].
The weights of each of these different shapes present in the actual wavefront are the aberration
coefficients, which are commonly used to specify the differences of the behavior from the ideal.
Comparison of ideal and actual wavefronts emerging from optical system. The difference between
the wavefronts may be specified by the difference in phase or by the intersections of rays normal to
the wavefront.
Alternatively, we can describe the difference in action of the optic from the ideal in terms of the
“rays” from different points in the pupil. The rays are (of course) perpendicular to the wavefront
emerging from the pupil. Unaberrated rays should all cross the optical axis exactly at the image
point. Rays from an aberrated wavefront will cross at different locations.
Rays from different points on the wavefront emerging from the pupil of an optic with spherical
aberration; the rays cross the optical axis at different locations.
The aberration function specifies the difference in optical phase between the actual and ideal
wavefronts that converge to the ideal real image point (or diverge from the ideal virtual image
point). Since the shape of the wavefront due to a point object generally varies with its location in
the object plane, the aberration function generally depends on coordinates in both the object and
pupil planes; it is a 4-D function. The coordinates used in the calculations of the rays are shown in
the figure:
Coordinates used to evaluate aberrations. Light propagates from the pupil plane (coordinates
without subscripts) over the distance z2 to the image plane (coordinates with subscripts). Note that
the pupil and image plane coordinates are normalized so that rmax = (r0 )max = 1.
A ray of light with wavelength λ0 that emerges from the exit pupil at [x, y] and crosses the image
plane at [x0 , y0 ] has the form:
w [x, y; x0 , y0 ] = p [x, y] · exp [+2πi · Φ [x, y; x0 , y0 ]]
where p [x, y] specifies the magnitude of the pupil transmittance of the exit pupil (the so-called
apodization function) and Φ [x, y; x0 , y0 ] is the phase at the pupil for an object point at coordinates
[x0 , y0 ] emerging from the pupil at [x, y]. The phase includes the converging “spherical” (actually
parabolic) wave and the phase difference term:
$$\Phi[x, y; x_0, y_0] = +i \frac{r^2}{2\lambda_0}\left(\frac{1}{f} - \frac{1}{z_2}\right) + \Delta\Phi[x, y; x_0, y_0]$$
We consider the locations in polar coordinates: the image location is [x0, y0] = (r0, α) and
the pupil coordinates [x, y] = (r, θ). If the optical system has a circular cross-section (i.e., if the
optical system is rotationally symmetric), then the behavior of the aberration does not depend on
the absolute azimuthal coordinates but only on their difference, so that we can consider a
three-dimensional description based on radial coordinates r, r0, and relative azimuthal angle θ − α ≡ ϕ;
i.e., we can write the phase error function in the form ∆Φ [r, r0, ϕ]. The relative phase between the
object point and a location in the pupil is 2π radians (per cycle) multiplied by the number of cycles,
which is the ratio of the distance between the locations in the object plane and in the pupil divided
by the wavelength λ0:
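$$\text{distance: } R = \left\{z^2 + (r\cos\theta - r_0\cos\alpha)^2 + (r\sin\theta - r_0\sin\alpha)^2\right\}^{\frac{1}{2}}$$
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] &= 2\pi \frac{R}{\lambda_0} = \frac{2\pi}{\lambda_0}\left\{z^2 + (r\cos\theta - r_0\cos\alpha)^2 + (r\sin\theta - r_0\sin\alpha)^2\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + \left(r^2\cos^2\theta + r_0^2\cos^2\alpha - 2rr_0\cos\theta\cos\alpha\right) + \left(r^2\sin^2\theta + r_0^2\sin^2\alpha - 2rr_0\sin\theta\sin\alpha\right)\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + r^2 + r_0^2 - 2rr_0(\cos\theta\cos\alpha + \sin\theta\sin\alpha)\right\}^{\frac{1}{2}} \\
&= \frac{2\pi}{\lambda_0}\left\{z^2 + r^2 + r_0^2 - 2rr_0\cos[\theta - \alpha]\right\}^{\frac{1}{2}} \\
&= 2\pi \frac{z}{\lambda_0} \cdot \left\{1 + \left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\theta - \alpha]\right]\right\}^{\frac{1}{2}} \\
&\equiv 2\pi \frac{z}{\lambda_0} \cdot \left\{1 + \left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right]\right\}^{\frac{1}{2}}
\end{aligned}$$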
This expression may be expanded into a power series via the binomial theorem:
$$(1 + u)^n = 1 + \frac{n}{1!} u + \frac{n(n - 1)}{2!} u^2 + \cdots$$
$$\implies (1 + u)^{\frac{1}{2}} = 1 + \frac{1}{2} u - \frac{1}{8} u^2 + \frac{1}{16} u^3 - \cdots$$
In the current expression, we can identify:
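$$u \equiv \left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]$$
$$\implies \frac{1}{2} u = \left(\frac{r^2 + r_0^2}{2z^2}\right) - \left(\frac{rr_0}{z^2}\right)\cos[\varphi]$$
$$\begin{aligned}
\implies -\frac{1}{8} u^2 &= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right) - \left(\frac{2rr_0}{z^2}\right)\cos[\varphi]\right]^2 \\
&= -\frac{1}{8}\left[\left(\frac{r^2 + r_0^2}{z^2}\right)^2 + \left(-\frac{2rr_0}{z^2}\cos[\varphi]\right)^2 + 2\left(\frac{r^2 + r_0^2}{z^2}\right)\left(-\frac{2rr_0}{z^2}\cos[\varphi]\right)\right] \\
&= -\frac{1}{8}\left[\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4}\right) + \left(\frac{4r^2r_0^2}{z^4}\right)\cos^2[\varphi] - 4\left(\frac{r^2 + r_0^2}{z^2}\right)\left(\frac{rr_0}{z^2}\right)\cos[\varphi]\right] \\
&= -\frac{1}{8}\left[\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{z^4}\right) + \left(\frac{4r^2r_0^2}{z^4}\right)\cos^2[\varphi] - 4\left(\frac{r^3r_0}{z^4}\cos[\varphi] + \frac{rr_0^3}{z^4}\cos[\varphi]\right)\right] \\
&= -\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4}\right) - \left(\frac{r^2r_0^2}{2z^4}\right)\cos^2[\varphi] + \left(\frac{r^3r_0}{2z^4}\right)\cos[\varphi] + \left(\frac{rr_0^3}{2z^4}\right)\cos[\varphi]
\end{aligned}$$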
So the power series for the phase function truncated to the second order becomes:
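$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\ & 2\pi \frac{z}{\lambda_0}\left(1 + \left(\frac{r^2 + r_0^2}{2z^2}\right) - \left(\frac{rr_0}{z^2}\right)\cos[\varphi]\right) \\
& + 2\pi \frac{z}{\lambda_0}\left(-\left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8z^4}\right) - \left(\frac{r^2r_0^2}{2z^4}\right)\cos^2[\varphi] + \left(\frac{r^3r_0}{2z^4}\right)\cos[\varphi] + \left(\frac{rr_0^3}{2z^4}\right)\cos[\varphi]\right)
\end{aligned}$$
Now we can multiply through by the leading factor of 2πz/λ0, which produces 10 terms: a constant,
three terms from the first-order polynomial, and six from the second-order polynomial:
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\ & 2\pi \frac{z}{\lambda_0} + 2\pi \frac{\left(r^2 + r_0^2\right)}{2\lambda_0 z} - 2\pi \frac{rr_0}{\lambda_0 z}\cos[\varphi] \\
& - 2\pi \left(\frac{r^4 + r_0^4 + 2r^2r_0^2}{8\lambda_0 z^3}\right) - 2\pi \left(\frac{r^2r_0^2}{2\lambda_0 z^3}\right)\cos^2[\varphi] + 2\pi \frac{r^3r_0}{2\lambda_0 z^3}\cos[\varphi] + 2\pi \frac{rr_0^3}{2\lambda_0 z^3}\cos[\varphi] \\
=\ & 2\pi \frac{z}{\lambda_0} + 2\pi \frac{r^2}{2\lambda_0 z} + 2\pi \frac{r_0^2}{2\lambda_0 z} - 2\pi \frac{rr_0}{\lambda_0 z}\cos[\varphi] \\
& - 2\pi \frac{r^4}{8\lambda_0 z^3} - 2\pi \frac{r_0^4}{8\lambda_0 z^3} - 2\pi \frac{r^2r_0^2}{4\lambda_0 z^3} - 2\pi \frac{r^2r_0^2}{2\lambda_0 z^3}\cos^2[\varphi] + 2\pi \frac{r^3r_0}{2\lambda_0 z^3}\cos[\varphi] + 2\pi \frac{rr_0^3}{2\lambda_0 z^3}\cos[\varphi]
\end{aligned}$$
which may be reordered into:
$$\begin{aligned}
\Phi[x, y; x_0, y_0, z] \cong\ & 2\pi \frac{z}{\lambda_0} \\
& + \frac{\pi}{\lambda_0 z} r^2 + \frac{\pi}{\lambda_0 z} r_0^2 - \frac{2\pi}{\lambda_0 z} \cdot r\, r_0 \cos[\varphi] \\
& - \frac{\pi}{4\lambda_0 z^3} r^4 + \frac{\pi}{\lambda_0 z^3} r^3 r_0 \cos[\varphi] - \frac{\pi}{2\lambda_0 z^3} r^2 r_0^2 \\
& - \frac{\pi}{\lambda_0 z^3} r^2 r_0^2 \cos^2[\varphi] + \frac{\pi}{\lambda_0 z^3} r\, r_0^3 \cos[\varphi] - \frac{\pi}{4\lambda_0 z^3} r_0^4
\end{aligned}$$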
In other words, we have “decomposed” the phase of the spherical wave into terms with different
powers of the coordinate in the pupil plane (with coordinates [x, y] = (r, θ)) and in the image plane
(with coordinates [x0 , y0 ] = (r0 , α)) in a manner analogous to the decomposition into sinusoidal
components in the Fourier transform. Our goal will be to decompose the phase difference between
the ideal and actual wavefronts using these same terms. Again, since the system is assumed circularly
symmetric, only the difference in azimuthal coordinates θ − α ≡ ϕ is relevant.
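The truncated series can be compared with the exact spherical-wave phase numerically. The sketch below works in cycles (phase divided by 2π) and assumes the expansion (1 + u)^(1/2) ≅ 1 + u/2 − u²/8 used above; the function names and the sample geometry are illustrative.

```python
import math


def phase_cycles_exact(z, r, r0, phi, wavelength):
    """Exact path length in wavelengths: sqrt(z^2 + r^2 + r0^2 - 2 r r0 cos(phi)) / lambda0."""
    return math.sqrt(z * z + r * r + r0 * r0 - 2.0 * r * r0 * math.cos(phi)) / wavelength


def phase_cycles_series(z, r, r0, phi, wavelength, order=2):
    """Binomial series (z / lambda0) * (1 + u/2 - u^2/8), truncated at the given order in u."""
    u = (r * r + r0 * r0) / z ** 2 - 2.0 * r * r0 * math.cos(phi) / z ** 2
    total = 1.0
    if order >= 1:
        total += u / 2.0
    if order >= 2:
        total -= u * u / 8.0
    return (z / wavelength) * total


if __name__ == "__main__":
    z, r, r0, phi, wavelength = 100.0, 1.0, 1.0, math.pi / 3.0, 1.0
    exact = phase_cycles_exact(z, r, r0, phi, wavelength)
    for order in (0, 1, 2):
        error = phase_cycles_series(z, r, r0, phi, wavelength, order) - exact
        print(f"order {order}: error = {error:+.3e} cycles")
```

Each additional order reduces the residual by roughly a factor of u, which is small whenever the pupil and image heights are small compared with z.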
5.2.1 Names of Aberrations
The difference in the shape of the “actual” wavefront from the ideal spherical wavefront is decomposed into the same terms as the phase; each term has its unique “shape” and name, and will be
described by a coefficient that determines “how much” of each “shape” is present in the phase difference. From the series above, we can apply weighting coefficients to the three relevant coordinates
distinguished by subscripts: the index j of the power of the radial coordinate r0 at the image (the
“image height”), the index m of the power of the radial coordinate r at the pupil, and the index n
of the power of cos [ϕ]. From the series above we can see that only some powers are included in the
summation, so we can write the phase difference as
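$$\begin{aligned}
\Delta\Phi[x, y; x_0, y_0, z] &= \Phi_{\text{ideal}}[x, y; x_0, y_0, z] - \Phi_{\text{actual}}[x, y; x_0, y_0, z] \\
&= \sum_{j,m,n} W_{jmn}\, r_0^j\, r^m \cos^n\varphi \\
&= W_{000} \ (\text{propagation from pupil to image}) \\
&\quad + W_{200}\, r_0^2 \ (\text{piston error}) + W_{111}\, r_0 r \cos\varphi \ (\text{tip-tilt}) + W_{020}\, r^2 \ (\text{defocus}) \\
&\quad + W_{040}\, r^4 \ (\text{spherical aberration}) + W_{131}\, r_0 r^3 \cos\varphi \ (\text{coma}) \\
&\quad + W_{220}\, r_0^2 r^2 \ (\text{curvature of field}) + W_{222}\, r_0^2 r^2 \cos^2\varphi \ (\text{astigmatism}) \\
&\quad + W_{311}\, r_0^3 r \cos\varphi \ (\text{distortion}) + W_{400}\, r_0^4 \ (\text{piston error}) \\
&\quad + \cdots
\end{aligned}$$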
The coefficients Wjmn measure the “amplitudes” of the individual terms and typically are specified in units of wavelengths (the “number of waves” of the aberration) at the edge of the pupil
(i.e., at r = 1); they must be multiplied by 2π radians per wavelength to convert to phase angle. For
example, a sample system might be specified as having “one-half wave of spherical and a quarter
wave of astigmatism.”
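The sum over W_jmn is straightforward to evaluate once the coefficients are given. The following is a minimal sketch for the sample system just quoted (one-half wave of spherical aberration and a quarter wave of astigmatism); the dictionary keyed by (j, m, n) and the function names are illustrative choices, and the coordinates are assumed normalized so that r = 1 at the edge of the pupil.

```python
import math


def wave_aberration(coefficients, r0, r, phi):
    """Sum of W_jmn * r0^j * r^m * cos^n(phi), in waves.

    coefficients maps (j, m, n) -> W_jmn; r0 is the normalized image height,
    r the normalized pupil radius, and phi the relative azimuth theta - alpha.
    """
    return sum(
        w * r0 ** j * r ** m * math.cos(phi) ** n
        for (j, m, n), w in coefficients.items()
    )


def phase_error_radians(coefficients, r0, r, phi):
    """Convert the aberration in waves to phase angle (2 pi radians per wave)."""
    return 2.0 * math.pi * wave_aberration(coefficients, r0, r, phi)


if __name__ == "__main__":
    sample = {
        (0, 4, 0): 0.50,  # W040: spherical aberration
        (2, 2, 2): 0.25,  # W222: astigmatism
    }
    for r0 in (0.0, 0.5, 1.0):
        waves = wave_aberration(sample, r0, r=1.0, phi=0.0)
        print(f"image height r0 = {r0:.1f}: {waves:.4f} waves at the edge of the pupil")
```

For an on-axis image point (r0 = 0) only the spherical term survives, which illustrates that spherical aberration does not depend on image height while astigmatism does.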
Shift Invariant or Not?
Note that phase errors that depend on r0 will produce different images for different image “heights”
and therefore are shift-variant effects that strictly cannot be characterized by impulse responses
and/or transfer functions. That being said, it is common practice to examine the “impulse response” and/or the “transfer function” in a local region as though the aberration were shift invariant, which allows the analyst to create a (“pseudo”) frequency-domain description of the action of
the aberration.
5.2.2 Aberration Coefficients
To get an idea of the behavior in the wavefront due to these terms, we can plot graphs of these
“shapes” at the pupil for specified locations in the object plane. The examples are plotted for
different object locations and assuming that λ0 = z2 = 1. The aberrations are grouped by the
numerical powers of the radial terms in the series, e.g., j + m = 0 for W000 , j + m = 2 for W200 ,
W111 , and W020 , j + m = 4 for W040 , W131 , etc. You might expect that the second-order grouping
would include W200 (piston error), W111 (tip-tilt), and W020 (defocus). However, for historical
reasons, the groupings are based on the powers for the “rays” derived from the “wavefronts” via
the gradient operator (a first-order derivative), so these three form the group of the first-order
aberrations. The terms with j + m = 4 are the third-order aberrations, etc.
Zero-Order Term:
Propagation:
constant phase (zero-order piston error = propagation from pupil to image):
$$\Delta\Phi[x, y; x_0, y_0, z] = 2\pi \cdot W_{000} \cdot \begin{cases} \dfrac{1}{\lambda_0} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
The coefficient W000 is the number of incremental wavelengths due to propagation “downstream”
from the object to the pupil; this propagation is a normal part of the imaging and is not considered
to be an aberration. In any event, its only effect on the irradiance is the constant attenuation of the
image field due to the inverse square law, identical to the constant phase term in the Fresnel and
Fraunhofer diffraction terms.
zero-order term, constant phase, piston error aberration
Second-Order Wave (First-Order Ray) Aberrations:
These include the three terms for which the sums of the powers of r and r0 equal two. Since the rays
are oriented orthogonal to the wavefront (and must be calculated by derivatives), these correspond to the “first-order”
aberrations for rays. In fact, these three terms often are not considered to be aberrations since the
only one that has a degrading effect on an irradiance image is defocus, which may (of course) be
compensated by changing the location of the sensor so that it coincides with the image.
Constant Phase — First-Order Piston Error
constant phase (first-order piston error):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{200} \cdot \begin{cases} +\dfrac{r_0^2}{2\lambda_0 z} & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$
This is an additional constant phase due to the off-axis location in the image plane; it is quadratic
in the image coordinate, but constant in the pupil coordinate, so it is a constant for a particular
image location. Since this measures the “constant” phase difference, it has no effect on the measured
irradiance and therefore no impact on the quality of the image.
constant phase from first-order terms: piston error
Bilinear-Phase — “Tip-Tilt”
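linear phase from both object and pupil (tip or tilt):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{111} \cdot \begin{cases} -\dfrac{rr_0}{\lambda_0 z}\cos[\varphi] & \text{if } \sqrt{x^2 + y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2 + y^2} > 1 \end{cases}$$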
A phase that has linear contributions from the pupil location r and image location r0 (a “bilinear”
phase) means that the shape of the field emerging from the pupil for a particular object location is
a “flat” plane tilted in proportion to the off-axis position of the object and the image. Because it is
a linear phase in the pupil, it displaces the resulting image towards the direction where the phase is
negative.
In atmospheric imaging scenarios (imaging along a vertical path through turbulence), the time-varying tip-tilt aberration is dominant. For example, the centers of the images of individual stars
appear to move around over short time intervals of the order of hundredths of a second. The
correction of tip-tilt aberration has a very significant positive effect on the quality of the resulting
image. For an example, see the animated GIF file at URL:
http://www.ast.cam.ac.uk/~optics/Lucky_Web_Site/100Her_10ms_200fr.gif
first-order linear term, tip-tilt error
Quadratic-Phase Error, Focus Shift = “Defocus”
quadratic phase ⟹ defocus = focus shift:
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{020} \cdot \begin{cases} +\dfrac{r^2}{2\lambda_0 z} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$
This quadratic term is the error in the Fresnel propagation from the exit pupil if the observation
plane does not coincide with the image plane and is therefore called “defocus.” Since it is not a
result of flaws in the optics, it is often not considered to be an “aberration,” but there is reason to
do so in some applications. As an example, consider the atmospheric imaging scenario mentioned
under tip-tilt; any time-varying quadratic contribution to the relative phase displaces the focal
plane (slightly), so images through atmospheric turbulence with quadratic contributions appear to
go in and out of focus over short time intervals (but, as already mentioned, the tip-tilt aberration
is dominant, totalling 87% of the light energy under certain assumptions — see Noll, JOSA, 66,
pp.207-211, 1976 and van Dam & Lane, JOSA A, 19, pp. 745-752).
first-order quadratic term, focus shift error = “defocus”
Since defocus is a function only of the pupil-plane coordinates, it is shift invariant at the image
plane; the effect of defocus does not vary with “image height” and therefore may be described by
an impulse response and a transfer function. For example, consider a small first-order focus error of
π radians at the edge of a rectangular pupil with linear dimension d0 = 1 unit. The complex-valued
wavefront has the form shown:
Pupil function with defocus of π radians at edge of the pupil (“half-wave of defocus”): (a) real
part; (b) imaginary part; (c) magnitude; (d) phase, showing quadratic nature.
The incoherent transfer function is the scaled autocorrelation of the pupil and the impulse response is its inverse Fourier transform. The MTF has a zero at the normalized spatial frequency
ρ ≅ 0.5. Note that the image with defocus is “wider” and the peak irradiance is “smaller” than the
diffraction-limited image.
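The zero of the MTF near ρ ≅ 0.5 can be reproduced numerically. This is a minimal sketch, assuming a 1-D slice through the square pupil (a quadratic phase in x² + y² over a square pupil separates into x and y factors) and a frequency axis normalized so that the cutoff is at ρ = 1; the sample count is an arbitrary choice.

```python
import numpy as np

N = 512                                   # samples across the pupil (assumed)
x = (np.arange(N) + 0.5) / N - 0.5        # pupil coordinate for a pupil of width d0 = 1
phase = np.pi * (2.0 * x) ** 2            # quadratic phase reaching pi radians at the edge
pupil = np.exp(1j * phase)

# np.correlate conjugates its second argument, so this is the complex autocorrelation.
acf = np.correlate(pupil, pupil, mode="full")
otf = acf / acf[N - 1]                    # normalize to unity at zero lag
rho = np.arange(-(N - 1), N) / N          # lag expressed as a fraction of the cutoff
mtf = np.abs(otf)

i_half = N - 1 + N // 2                   # index of rho = 0.5
i_quarter = N - 1 + N // 4                # index of rho = 0.25
print(rho[i_half], mtf[i_half])
print(rho[i_quarter], mtf[i_quarter])
```

In this 1-D model the autocorrelation integral can be done by hand and gives sin[4πρ(1 − |ρ|)] / (4πρ), which falls to zero at ρ = 0.5; the sampled result can be compared against that expression at any lag.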
(a) MTF of incoherent optical system with square aperture with one-half wave of defocus compared
to MTF without defocus (red); (b) psf with one-half wave of defocus (black) and without defocus
(red).
Other examples of transfer functions (MTFs) and impulse responses for square apertures with different amounts of defocus (measured in waves at the edge of the pupil) are shown. Note in particular
that the intermediate frequencies are degraded more rapidly than either the smallest or largest spatial
frequencies. Note that the MTF at certain frequencies is negative, which means that the modulation
has changed sign (“lighter” regions in the original object become “darker” in the defocused image).
This can be seen in an object with different spatial frequencies.
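The sign change of the transfer function can be explored with a closed-form expression. This is a sketch under the same 1-D assumption as a slice through a square pupil of unit width; the formula was derived for this illustration by evaluating the pupil autocorrelation for a quadratic phase, and the function name `defocus_otf` is hypothetical.

```python
import numpy as np

def defocus_otf(rho, waves):
    """1-D OTF of a unit-width pupil with `waves` of defocus at the pupil edge.

    Closed form of the pupil autocorrelation (derived for this sketch):
    sin(a) / (8*pi*waves*rho), where a = 8*pi*waves*rho*(1 - |rho|).
    """
    rho = np.asarray(rho, dtype=float)
    tri = np.clip(1.0 - np.abs(rho), 0.0, None)   # diffraction-limited OTF (triangle)
    # np.sinc(t) = sin(pi*t) / (pi*t), so the product equals the closed form above.
    return tri * np.sinc(8.0 * waves * rho * tri)

rho = np.linspace(0.0, 1.0, 201)
for waves in (0.25, 0.5, 1.0, 1.5):
    otf = defocus_otf(rho, waves)
    negative = rho[otf < -1e-12]
    if negative.size:
        print(f"{waves} waves: OTF is negative at some rho between "
              f"{negative.min():.3f} and {negative.max():.3f}")
    else:
        print(f"{waves} waves: OTF never goes negative")
```

Frequencies where the returned value is negative are the ones whose contrast is reversed in the image; the MTF is the absolute value of this function.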
MTF and corresponding psfs for square pupil with different amounts of defocus from λ0/4 at the edge
of the pupil to 1.5λ0 . Note that the decrease in MTF is most pronounced at intermediate spatial
frequencies. For larger amounts of defocus, the MTF goes negative over regions of the frequency
domain (contrast reversal). The psf widens with increasing defocus.
The spatial frequency of a “radial grating” f [x, y] increases as the reciprocal of the distance from
the center. In the examples shown, the irradiance is biased up so that its normalized maximum and
minimum amplitudes are 1 and 0, respectively. The grating is imaged through a real optical system
onto a CCD sensor that samples the image and thus the image is aliased at large spatial frequencies
(near the center). The three images are at the focal plane (i.e., “in focus”) and with two increments
of defocus. Track a radial line in the original (in red) to see that the amplitude of the in-focus
image does not vary from unity (except where there is aliasing), while the defocused image exhibits several
changes in phase, from light to dark to light, etc. The contrast of the smallest spatial frequency (at
the edge of the image) is reversed in the image with more defocus, and this image also exhibits more
changes in phase.
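A radial grating of the kind described can be generated in a few lines. This is a minimal sketch, assuming a sinusoidal profile in azimuth; the image size, the number of spokes, and the function names are illustrative choices rather than the parameters used for the figures.

```python
import numpy as np

def radial_grating(size=256, spokes=36):
    """Sinusoidal radial grating biased to the range [0, 1] (size and spokes are assumed values)."""
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size] - c       # pixel coordinates relative to the center
    theta = np.arctan2(y, x)                  # azimuth of each pixel
    return 0.5 * (1.0 + np.cos(spokes * theta))

def azimuthal_frequency(radius, spokes=36):
    """Cycles per pixel measured along a circle of the given radius (in pixels)."""
    return spokes / (2.0 * np.pi * radius)

grating = radial_grating()
for radius in (5.0, 20.0, 80.0):
    print(radius, azimuthal_frequency(radius))
```

Because the frequency varies as 1/radius, it exceeds the Nyquist limit of 0.5 cycles per pixel inside a radius of spokes/π pixels, which is where the sampled grating shows aliasing.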
Effect of two increments of defocus on the image of a radial grating. The negative regions of the
MTF of defocus imply that the contrast of those spatial frequencies is “reversed” (darker gray →
lighter gray and vice versa). Track the “lightness” along the red lines to see the contrast reversals.
Note that the “in-focus” image exhibits some sampling (“aliasing”) artifacts in the center where
the azimuthal spatial frequency is large.
This artifact is often called “spurious resolution,” because the object is not reproduced at the
locations of the phase change.
5.2.3 Fourth-Order (Third-Order Ray) Aberrations: the “Seidel aberrations”
$-\dfrac{r^4}{2\lambda_0 z^3}$ ⟹ no variation at object, quartic phase at pupil ⟹ spherical aberration, $W_{040}$ (LSI)
$+\dfrac{r\,r_0^3}{2\lambda_0 z^3}\cos[\phi]$ ⟹ cubic phase at object, linear phase at pupil ⟹ coma, $W_{131}$
$-\dfrac{r^2 r_0^2}{4\lambda_0 z^3}$ ⟹ quadratic phase at object and pupil ⟹ field curvature, $W_{220}$
$-\dfrac{r^2 r_0^2}{2\lambda_0 z^3}\cos^2[\phi]$ ⟹ quadratic phase at object and pupil + azimuth variation ⟹ astigmatism, $W_{222}$
$+\dfrac{r^3 r_0}{2\lambda_0 z^3}\cos[\phi]$ ⟹ linear phase at object, cubic phase at pupil ⟹ distortion, $W_{311}$
$-\dfrac{r_0^4}{8\lambda_0 z^3}$ ⟹ quartic phase at object, no variation at pupil ⟹ third-order piston error, $W_{400}$
Note that four of these six terms have even powers of both the pupil coordinate r and the image
coordinate r0 , whereas coma and distortion include odd powers of both.
Spherical Aberration
This is the simplest third-order aberration to describe mathematically since it depends only on
the coordinates in the pupil plane; its effect is constant across the image plane. This means that
spherical aberration is the only one of the six Seidel terms that is shift invariant (and may therefore
be described as a convolution). The wavefront shape for spherical aberration resembles a deeper
“bowl” than the paraboloid for defocus. Note that the negative sign on the phase means that the
spherical aberration is negative if the phase contribution is positive.
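quartic phase at pupil, no variation at object (spherical aberration):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{040} \cdot \begin{cases} \left(-\dfrac{r^4}{2\lambda_0 z^3}\right) & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$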
quartic term from second order of expansion: spherical aberration
If the numerical coefficient of spherical aberration is positive, then rays from the marginal regions
of the pupil have a steeper slope than those from the paraxial region near the optical axis. In other
words, the “marginal focus” is closer to the lens than the ideal “paraxial focus.” The paraxial image
of a point object is not “sharp” but exhibits a halo of light around a bright central core.
Negative coefficient of spherical aberration of positive lens: rays from the margin of the pupil cross
axis closer to the optic than paraxial rays. The image of a point object at the paraxial focus exhibits
a bright central region surrounded by a “halo” of light from the margin of the pupil.
Because it is a shift-invariant effect at the image plane, spherical aberration may be described
by an impulse response and by a transfer function. Spherical aberration is a distortion of the true
spherical wavefront that makes a “deeper bowl” so that the incremental phase error is large near
the edge of the pupil (far from the optical axis, for the marginal part of the wave) and small near
the center of the pupil (near the optical axis, for the paraxial part of the wave).
Example of quartic wavefront error of spherical aberration compared to quadratic error from
defocus. Spherical aberration error is a “deeper bowl.”
Consider an example for spherical aberration where the phase error is π radians at the edge of a
square pupil, the same phase error at the edge that was considered for defocus. The profiles of the
phase in the pupil are:
Pupil function for one-half wave of spherical aberration: (a) real part; (b) imaginary part; (c)
magnitude; (d) phase in units of π radians, showing the fourth-power behavior.
The incoherent MTF shows a significant decrease as the frequency approaches cutoff and the psf
is noticeably wider and “shorter:”
(a) MTF of incoherent optical system with square aperture with one-half wave of negative spherical
aberration at the edge of the pupil compared to MTF without aberration (red); (b) psf with one-half
wave of aberration (black) and without aberration (red). Note that the image with spherical
aberration is “shorter” and “fatter.”
MTF and corresponding psfs for square pupil with different amounts of spherical aberration from
λ0/4 at the edge of the pupil to 1.5λ0 . The MTF behaves much as it does for defocus; it decreases
most rapidly at the middle frequencies rather than at smallest or largest, and it may go negative at
some frequencies. The MTF for spherical aberration decreases more slowly than for defocus because
the phase changes more slowly except near the edge of the pupil.
The uncorrected optical system in the Hubble Space Telescope suffered from significant spherical
aberration due to flaws in the primary mirror that were disguised during mirror testing.
Spherical aberration of the wave emerging from different parts of the pupil may be partially
balanced by changing the focus, i.e., by “adding defocus.” For example, the phase at the edge of the
pupil may be compensated by applying a defocus aberration in the opposite direction so that
$$2\pi \cdot W_{040} \cdot \left(-\frac{1^4}{2\lambda_0 z^3}\right) + 2\pi \cdot W_{020} \cdot \frac{1^2}{2\lambda_0 z} = 0 \implies W_{020} = \frac{W_{040}}{z^2}$$
If we use defocus to cancel the phase error due to spherical aberration at the edge of the pupil, the
resulting transfer function and image have the form shown, so that the image is improved markedly
by using the appropriate amount of defocus.
Application of defocus to balance spherical aberration at edge of square pupil: (a) MTF without
aberrations (black), with 1/2 wave of spherical aberration (red), and after balancing with -1/2 wave
of defocus; (b) corresponding impulse responses.
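The benefit of balancing can be quantified with a simple 1-D model. This is a minimal sketch, assuming a normalized pupil coordinate u with the edge at |u| = 1, one-half wave of quartic error at the edge, and an equal and opposite half wave of quadratic error; the metric names `rms_phase` and `peak_ratio` are illustrative, and the 1-D profile is only an analogue of the square-pupil case in the figure.

```python
import numpy as np

u = np.linspace(-1.0, 1.0, 4001)          # normalized pupil coordinate, edge at |u| = 1
spherical = np.pi * u ** 4                # one-half wave (pi radians) of quartic error at the edge
balanced = spherical - np.pi * u ** 2     # add one-half wave of defocus with the opposite sign

def rms_phase(phi):
    """Standard deviation of the phase over the pupil, in radians."""
    return float(np.std(phi))

def peak_ratio(phi):
    """On-axis irradiance relative to the unaberrated pupil (a Strehl-like ratio)."""
    return float(np.abs(np.mean(np.exp(1j * phi))) ** 2)

for name, phi in (("spherical only", spherical), ("balanced", balanced)):
    print(f"{name}: rms = {rms_phase(phi):.3f} rad, peak ratio = {peak_ratio(phi):.3f}")
```

The balanced phase is zero at both the center and the edge of the pupil, so the residual error is confined to the intermediate zone and its RMS value is considerably smaller than that of the unbalanced quartic error.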
Coma
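cubic phase at object, linear phase at pupil, azimuthal variation (coma):
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{131} \cdot \begin{cases} +\dfrac{r_0^3\, r\cos[\phi]}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$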
The surface shape is proportional to the cube of the image height and to the height
of the ray in the pupil. This produces a different phase error, and therefore different images, for
different values of the image height r0 as shown in the example. The images have a “comet-like”
shape, hence the name for the aberration.
Star field imaged through optical system with coma; elongation of the star images increases with
distance from optical axis (which is located below bottom of the image). Credit: “Star Gazing with
Telescope and Camera,” George T. Keene, Amphoto, Garden City, 1967, p. 93.
Curvature of Field
quadratic phase from object and pupil:
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{220} \cdot \begin{cases} -\dfrac{r_0^2\, r^2}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$
As indicated by the name, the “best” images in systems with this aberration are on a curved surface.
Some imaging systems (e.g., Schmidt cameras) are deliberately designed with curved fields because this produces good images over wide fields of view. The sensors used in wide-field Schmidt
astronomical cameras were glass plates that were “predistorted” prior to being installed in the camera. Since the plates could be as large as 14" square, this was a touchy operation.
Astigmatism
The Latin word for “points” is “stigmata,” so that a system with astigmatism is not capable of
producing points. It focuses “horizontal” and “vertical” patterns at different focal planes, as shown:
Astigmatism focuses vertical and horizontal lines at different planes (horizontal lines in the
“sagittal” plane and vertical lines in the “meridional” plane)
http://www.olympusmicro.com/primer/anatomy/aberrations.html
The aberration coefficient for astigmatism is:
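quadratic phase from object and pupil and azimuthal variation:
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{222} \cdot \begin{cases} -\dfrac{1}{2\lambda_0 z^3}\, r_0^2\, r^2 \cos^2[\phi] & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$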
The error is quadratic with an azimuthal dependence; the additional quadratic is maximized along
the azimuthal direction ϕ = 0 and is zero along the orthogonal direction. It therefore adds
an azimuthally dependent “focusing” power. In other words, object lines oriented along different
directions are focused at different distances from the optic.
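The azimuthally dependent focusing power can be made explicit by writing the term in Cartesian pupil coordinates. The following short worked step assumes that the azimuth ϕ is measured from the pupil x-axis (that is, the image point is taken to lie along x), which is a choice made here for illustration:

```latex
\begin{aligned}
x &= r\cos[\phi], \qquad y = r\sin[\phi] \\
-\frac{r_0^2\, r^2 \cos^2[\phi]}{2\lambda_0 z^3}
  &= -\frac{r_0^2}{2\lambda_0 z^3}\; x^2
\end{aligned}
```

Under that assumption the astigmatism term is a quadratic phase in x alone, with a strength that grows as the square of the image height. It acts as a defocus along one axis only, in contrast to the defocus term, which is proportional to x² + y² and acts equally along both axes.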
The eye systems of many people exhibit astigmatism, which means that the corrective lenses
must have different powers along the orthogonal axes; in other words, lenses with cylindrical power
are needed.
Lenses that have been corrected for astigmatism are known as anastigmats.
Distortion
cubic phase at pupil, linear phase at object, azimuthal variation:
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{311} \cdot \begin{cases} +\dfrac{r_0\, r^3 \cos[\phi]}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$
This is a cubic dependence on the pupil coordinate and a linear variation in the image coordinate.
Like coma, the effect of distortion varies with image height.
The image shapes resulting from distortion with coefficients of different algebraic signs are different.
If W311 < 0 or W311 > 0, the images suffer from “pincushion distortion” or “barrel distortion,”
respectively.
Images of a grid object through systems with (a) no aberrations; (b) “pincushion” distortion
( W311 < 0); (c) “barrel” distortion ( W311 > 0).
Piston Error
quartic phase at object:
$$\Delta\Phi[x, y; x_0, y_0] = 2\pi \cdot W_{400} \cdot \begin{cases} -\dfrac{r_0^4}{2\lambda_0 z^3} & \text{if } \sqrt{x^2+y^2} \le 1 \\ 0 & \text{if } \sqrt{x^2+y^2} > 1 \end{cases}$$
This is a constant phase due to the off-axis distance at the image plane and has no effect on the
irradiance of the image, hence it often is not considered to be an aberration. However, it does have
an important effect on optical systems with “sparse” primary elements, such as multiple-mirror
telescopes.
constant term from second-order expansion: piston error
Of course, the ultimate resolution of optical systems may be due in part to other uncontrollable
factors. For example, ground-based astronomical telescopes are ultimately limited by random variations in local air temperature that create random variations in the refractive index of atmospheric
“patches.” These variations are often decomposed into the Seidel aberrations. The constant phase
(“piston”) error has no effect on the irradiance (the squared magnitude of the amplitude). Linear
phase errors move the image from side to side and/or top to bottom (“tip-tilt”). Quadratic phase
errors (“defocus”) add or subtract power from the lens to move the image plane along the axis
forwards (towards the optic) or backwards (away from the optic), respectively. In correction for
atmospheric phase errors, the tip-tilt error is most significant, which means that correcting this
aberration significantly improves the image quality. The field of correcting atmospheric aberrations
is called “adaptive optics,” and is an active research area.
5.2.4 Zernike Polynomials
It should be no surprise that other useful decompositions of the wavefront errors exist. Another
common set of basis functions are the Zernike polynomials, which are often used for fitting data from
interferometric optical testing (though NOT in the presence of air turbulence; Zernikes have little
value in this situation). The Zernike polynomials are functions of radial and azimuthal coordinates
that describe “surfaces” on the unit circle such that the average value of each is zero:
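$$Z_n^{\ell}(r, \phi) = R_n^{\ell}(r) \cdot \cos(\ell \cdot \phi)$$
$$Z_n^{-\ell}(r, \phi) = R_n^{\ell}(r) \cdot \sin(\ell \cdot \phi)$$
where the radial part is defined as:
$$R_n^{\ell}(r) = \begin{cases} \displaystyle\sum_{k=0}^{(n-\ell)/2} \frac{(-1)^k\,(n-k)!}{k! \cdot \left(\dfrac{n+\ell}{2}-k\right)! \cdot \left(\dfrac{n-\ell}{2}-k\right)!} \cdot r^{n-2k} & \text{if } n-\ell \text{ is even} \\ 0 & \text{if } n-\ell \text{ is odd} \end{cases}$$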
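So that:
$$R_0^0(r) = \frac{0!}{0! \cdot 0! \cdot 0!} \cdot r^0 = 1(r) \implies Z_0^0(r, \phi) = 1(r) \cdot \cos(0 \cdot \phi) = 1(r)$$
$$R_1^1(r) = \frac{(-1)^0 \cdot 1!}{0! \cdot 1! \cdot 0!} \cdot r^1 = r \implies Z_1^1(r, \phi) = r \cdot \cos(1 \cdot \phi) = r \cdot \cos(\phi)$$
$$Z_1^{-1}(r, \phi) = R_1^1(r) \cdot \sin(1 \cdot \phi) = r \cdot \sin(\phi)$$
etc.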
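The summation for the radial part is easy to evaluate by machine. This is a minimal sketch that transcribes the formula directly; the function names and the use of `l` for the azimuthal index are choices made for this illustration.

```python
from math import factorial

def zernike_radial_coeffs(n, l):
    """Return {power: coefficient} for the radial polynomial R_n^l(r).

    Transcribes the summation formula; an empty dict means the polynomial is zero.
    """
    l = abs(l)
    if l > n or (n - l) % 2:
        return {}                          # R_n^l vanishes when n - l is odd
    coeffs = {}
    for k in range((n - l) // 2 + 1):
        num = (-1) ** k * factorial(n - k)
        den = factorial(k) * factorial((n + l) // 2 - k) * factorial((n - l) // 2 - k)
        coeffs[n - 2 * k] = num // den     # the ratio is always an integer
    return coeffs

def zernike_radial(n, l, r):
    """Evaluate R_n^l at radius r, with 0 <= r <= 1."""
    return sum(c * r ** p for p, c in zernike_radial_coeffs(n, l).items())

for n, l in ((0, 0), (1, 1), (2, 0), (2, 2), (3, 1), (4, 0)):
    print(n, l, zernike_radial_coeffs(n, l))
```

The first two entries reproduce the hand calculations of R_0^0 and R_1^1; the remaining entries extend the list to the polynomials usually associated with defocus, astigmatism, coma, and spherical aberration.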
One advantage of the Zernike polynomials is that distinct polynomials are orthogonal over the unit
circle (i.e., the scalar product of any pair of distinct Zernike polynomials vanishes):
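$$\int_{r=0}^{r=1} R_n^{\ell}(r) \cdot R_m^{\ell}(r)\; r\, dr \propto \begin{cases} 1 & \text{if } n = m \\ 0 & \text{if } n \neq m \end{cases} \equiv \delta_{nm}$$
where $\delta_{nm}$ is the Kronecker delta function. The set of the first 36 (nonconstant) Zernike polynomials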
yields a decomposition with minimum RMS wavefront error. Since they all represent wavefront errors
at the exit pupil, the corresponding impulse responses and transfer functions may be calculated; the
former are shown in a figure.
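The radial orthogonality relation can be verified exactly for low orders by integrating the polynomials term by term. This is a minimal sketch using NumPy's polynomial class; the helper names are hypothetical, and both polynomials are given the same azimuthal index, as the relation requires.

```python
from math import factorial
from numpy.polynomial import Polynomial

def radial_poly(n, l):
    """R_n^l(r) as a numpy Polynomial, built from the summation formula."""
    coef = [0.0] * (n + 1)
    if 0 <= l <= n and (n - l) % 2 == 0:
        for k in range((n - l) // 2 + 1):
            coef[n - 2 * k] = ((-1) ** k * factorial(n - k)
                               / (factorial(k)
                                  * factorial((n + l) // 2 - k)
                                  * factorial((n - l) // 2 - k)))
    return Polynomial(coef)

def radial_inner(n, m, l):
    """Integral of R_n^l(r) * R_m^l(r) * r over 0 <= r <= 1."""
    integrand = radial_poly(n, l) * radial_poly(m, l) * Polynomial([0.0, 1.0])
    antiderivative = integrand.integ()
    return antiderivative(1.0) - antiderivative(0.0)

for n, m, l in ((2, 2, 0), (2, 4, 0), (1, 3, 1), (3, 3, 1)):
    print(n, m, l, radial_inner(n, m, l))
```

The integrals with n ≠ m vanish, and those with n = m come out equal to 1/(2(n + 1)), which is the constant hidden in the proportionality sign.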
First 28 Zernike polynomials ordered by azimuthal index (horizontally) and radial index (vertically).
Ref: http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files.
psfs (impulse responses) of the aberrations for each of the first 28 Zernike Polynomials (ref:
http://scien.stanford.edu/class/psych221/projects/03/pmaeda/index_files/image096.gif )
5.3 Structural Aberration Coefficients
Structural aberration coefficients are due to the “configuration” or “orientation” of the lens. We
have just seen that the lensmaker’s equation ensures that there are many prescriptions for a thin
lens with a fixed focal length made from one glass. For example, if n2 = 1.5 and f = 100 mm, we
can have R1 = R2 = 100 mm (double convex) or R1 = 50 mm and R2 = ∞ (plano-convex, curved
side towards object) or R1 = ∞ and R2 = 50 mm (plano-convex, curved side towards image), and
many other possibilities. It is perhaps logical that the aberrations from these different prescriptions
will be different too. The calculation leads to one of the “rules of thumb” for optical systems; a
better image is generated by an optical system if the side of the optic with the larger radius is on
the side with the shorter conjugate, which “divides” the power of the lens more equally between the
two surfaces.
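The three prescriptions quoted above can be checked against the thin-lens lensmaker's equation. This is a minimal sketch, assuming signed radii (positive when the center of curvature lies on the image side of the surface), so that the magnitudes quoted in the text become R2 = −100 mm and R2 = −50 mm for surfaces that are convex towards the image; the function name is illustrative.

```python
def thin_lens_focal_length(n, R1, R2):
    """Focal length from the thin-lens lensmaker's equation, 1/f = (n - 1)(1/R1 - 1/R2).

    Radii are signed; use float('inf') for a flat surface.
    """
    power = (n - 1.0) * (1.0 / R1 - 1.0 / R2)
    return 1.0 / power

inf = float("inf")
prescriptions = {
    "double convex": (100.0, -100.0),
    "plano-convex, curved side towards object": (50.0, inf),
    "plano-convex, curved side towards image": (inf, -50.0),
}
for name, (R1, R2) in prescriptions.items():
    f = thin_lens_focal_length(1.5, R1, R2)
    print(f"{name}: f = {f:.1f} mm")
```

All three shapes have the same first-order focal length, which is why the choice among them is free to be made on the basis of the aberrations.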
For example, for a plano-convex lens with the source point at infinity (so that the image is at the
focal point), the image exhibits better quality if the curved side of the lens is towards the object.
With the flat side towards the object, the front flat surface contributes no power to the image.
5.4 Optical Imaging Systems and Sampling
Q factor
5.5 Optical System “Rules of Thumb”
1. If imaging with a singlet lens, the aberrations are smaller if the lens surface with more curvature
(shorter radius of curvature) is on the side of the longer conjugate. Since the transverse
magnification is smaller than 1 in most cases (distant object), the “more curved” side of the
lens should be towards the distant object. This divides the power of the surfaces more evenly
and minimizes the spherical aberration.
2. If imaging in visible light, the diameter of the diffraction spot in micrometers is approximately
equal to the f-number of the system.
3. The MTF at the Rayleigh limit is about 9% (www.normankoren.com/Tutorials/MTF1A.html).
Lenses are sharpest in the interval of about two stops between the (small) aperture where
diffraction starts to dominate and two stops smaller than the maximum aperture. For 35mm
lenses, the maximum aperture often is of the order of f/2, so two stops smaller is typically f/5.6.
The aperture at which diffraction starts to dominate depends on wavelength, but is generally
accepted as about f/22. Therefore the sharpest range for a 35mm lens is between about
f/5.6 and f/11. At larger apertures (smaller f/ numbers), resolution is limited by aberrations
(astigmatism, coma, etc.); at small apertures, resolution is limited by diffraction. The MTF
if the lens is used “wide open” is almost always poorer than MTF at f/8 because of the
aberrations. Note that this discussion does not consider the effects of the sensor, just the lens.
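4. Image is visually unaberrated if the Strehl ratio $D \simeq 0.8 \implies \sigma(\Delta W) \lesssim 0.075 \cdot \lambda_0 \implies \Delta W_{max} \lesssim \dfrac{\lambda_0}{4} \implies \sigma_{\Delta W} \lesssim \dfrac{\lambda_0}{14}$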
5. If imaging in visible light, the image appears to be “in focus” if the defocus distance measured
in micrometers is smaller than $(f/\#)^2$.
6. Depending on source, the resolution r of a lens in line pairs per mm is approximately
$\dfrac{1390}{f/\#} \lesssim r \lesssim \dfrac{1600}{f/\#}$
7. More to come...
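Rules 2, 5, and 6 can be compared with standard diffraction formulas. This is a minimal sketch, assuming a circular aperture, a mid-visible wavelength of 0.5 µm, the Airy-disk diameter 2.44·λ·(f/#), a quarter-wave defocus tolerance of 2·λ·(f/#)², and the Rayleigh criterion for resolution; the notes do not state which criteria underlie the rules, so these choices and the function names are assumptions.

```python
wavelength_um = 0.5            # assumed mid-visible wavelength in micrometers

def airy_diameter_um(f_number, wavelength=wavelength_um):
    """Diameter of the Airy disk out to the first dark ring: 2.44 * lambda * (f/#)."""
    return 2.44 * wavelength * f_number

def quarter_wave_defocus_um(f_number, wavelength=wavelength_um):
    """Focus shift giving a quarter wave of defocus at the pupil edge: 2 * lambda * (f/#)^2."""
    return 2.0 * wavelength * f_number ** 2

def rayleigh_lp_per_mm(f_number, wavelength=wavelength_um):
    """Rayleigh-limited resolution, 1 / (1.22 * lambda * (f/#)), in line pairs per mm."""
    return 1000.0 / (1.22 * wavelength * f_number)

for N in (2.8, 5.6, 8.0, 11.0):
    print(f"f/{N}: spot = {airy_diameter_um(N):.1f} um, "
          f"defocus tolerance = {quarter_wave_defocus_um(N):.0f} um, "
          f"resolution = {rayleigh_lp_per_mm(N):.0f} lp/mm")
```

With these assumptions the spot diameter is about 1.2 times the f-number and the defocus tolerance equals the square of the f-number, both in micrometers. For wavelengths near 0.55 µm the Rayleigh numerator falls between the two values quoted in rule 6.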
