Ten Years of Lessons from Imaging of the Archimedes Palimpsest
Roger L. Easton, Jr.¹, William A. Christens-Barry², Keith T. Knox³

¹ Chester F. Carlson Center for Imaging Science, Rochester Institute of Technology, Rochester NY, USA, easton@cis.rit.edu
² Equipoise Imaging, LLC, Ellicott City MD USA, equipoise1@verizon.net
³ Air Force Research Laboratory, Kihei HI, USA, knox@cis.rit.edu

Keywords: Archimedes palimpsest, spectral imaging, image processing, pseudocolor rendering, statistical image analysis

ABSTRACT

The Archimedes Palimpsest is a circa 10th-century parchment manuscript that was erased early in the 13th century and overwritten with a Christian prayer book. Its name arises from the fact that the original texts were recognized early in the 20th century to include partial copies of seven treatises by Archimedes, the oldest extant reproductions of his writings. The manuscript was sold at auction in 1998 and lent by its new owner to the Walters Art Museum, which has supervised a ten-year collaboration by conservators, imaging scientists, and scholars to image, transcribe, and translate the original writings. Over the course of this project, a variety of imaging and processing techniques were developed and applied to clarify the original texts. These techniques evolved from spectral reflective imaging over 30 bands of visible wavelengths to a simpler protocol that produced pseudocolor rendering from fluorescence and reflective images. The final system added reflective wavelengths produced by light-emitting diodes (LEDs) in the visible and infrared regions plus low-angle (“raking”) illumination to help visualize the small-scale topography. Image processing ranged from supervised segmentation via a spectral pseudoinverse calculation to deterministic renderings in pseudocolor to unsupervised classification by principal component analysis; all methods developed proved to be useful for recovering text from some set of leaves.
The changes in technology and processing were due both to the requirements of the project and to the relentless advance in technology over the time span of the project.

HISTORY AND SIGNIFICANCE OF THE CODEX

Almost everything known about the work of Archimedes has been gleaned from three codex manuscripts, which were rather uninspiringly designated by Johan Ludvig Heiberg as “A”, “B”, and “C”. The first two vanished from scholarly view by the 16th century, but parts of seven treatises survive in the last, which is a Byzantine codex from the 10th century. Among the treatises in codex C are the only extant leaves of On the Method of Mechanical Theorems, one leaf from the only extant copy of Stomachion (“Stomachache,” likely the oldest known study of combinatorics), and the oldest known example in the original Greek of Archimedes’ most famous work, On Floating Bodies. The other treatises in the codex are On the Measurement of the Circle (in which Archimedes derived an excellent estimate of the value of π), On the Sphere and Cylinder (where he proved that volumes of a sphere and its enclosing cylinder are in the proportion 2:3), On Spiral Lines (where he derived that the area of a single revolution of a spiral – his invention – is 1/3 that of the enclosing circle), and On the Equilibrium of Planes. The manuscript included copies of the diagrams for each treatise, which have proven to be very significant in the scholarly assessment of the book. Of the seven works, the Method is arguably the most significant; it is in the form of a letter from Archimedes to Eratosthenes that outlines methods for proving mathematical conjectures from mechanical analogies.

The treatises in this codex were copied in the 10th century, probably in Constantinople, in a book format that is fairly large by modern standards; each parchment leaf measured approximately 200 mm in width and 300 mm in height.
In the 13th century, the codex was sacrificed to make a copy of an Orthodox Christian liturgical book, the Euchologion: the parchment pages were disbound, the text erased, and each folio was cut in half along the fold. The copy of the Euchologion was written over these newly cleaned pages and sheets from several other codices. Based upon readings of the colophon of the Euchologion obtained during the imaging project, we now know that Ioannes Myronas was its scribe and that the book was dedicated on Saturday, 14 April 6737 (corresponding to Easter Saturday, 1229 CE). Recycling of books in this way was a common practice due to the expense of treating goat or lamb skin to make new parchment and was feasible because of the durability of the material. The individual leaves of the new book measure approximately 155 mm wide and 200 mm tall and the new writings are perpendicular to the erased originals on all but one folio. A few lines of the original writings on each leaf were hidden in the gutter of the bound prayerbook on all folios save those in the centers of quires. Iron gall ink was used for the prayerbook text and most, if not all, of the original texts that were erased. The ink of the later Euchologion is dark brown in color and the characters are quite readable on most pages, whereas the visibility of the erased texts ranges from rather obvious to virtually invisible. The remaining ink stains of the original text are generally more “reddish” in color than the overtext. The prayer book was used in Christian Orthodox services at the Monastery of St. Sabas in the Judean desert for hundreds of years. In the 1800s, the book was placed in the library of the Metochion of the Church of the Holy Sepulchre in Constantinople, where its presence was noted in 1844 by Constantin von Tischendorf, who was most famous for “borrowing” the Codex Sinaiticus from St. Catherine’s Monastery. 
He published observations made during his visit to Constantinople in the book “Reise in den Orient,” which was published in German in 1846 and in English translation by W.E. Shuckard as “Travels in the East” in 1847. In the book, Tischendorf noted that the bishop allowed him “to make any use of the manuscripts I found. They were thirty in number, but they were altogether without any especial interest, with the exception of a palimpsest upon mathematics.” (Tischendorf, tr. by Shuckard, 1847, p.274) It is quite likely that this citation refers to the Archimedes palimpsest. Tischendorf apparently made use of the manuscript in a manner that was no doubt unforeseen by his host, since one leaf from the codex was found among his papers after Tischendorf’s death and now resides in the Cambridge University Library as Add. 1879.23.

The prayerbook was catalogued as MS 355 in the library of the Metochion in 1899 by Athanasios Papadopoulos-Kerameus. His reference noted the use of the book at St. Sabas and included a transcription of the Greek characters of a few lines of the palimpsested text that he could read. The catalogue came to the attention of the Danish philologist Johan Ludvig Heiberg, who recognized the source of the palimpsested text and traveled to Constantinople to see the book for himself in 1906. Heiberg’s only tools to assist his reading were his eyes, natural light, and probably an optical magnifying lens. The binding of the book prevented him from reading any original text within the gutter except on folios at the center of a quire, but he was still able to produce an excellent transcription of the text. During his study, Heiberg had photographs taken of at least 102 pages of the prayerbook of a then-existing total of 354; the 65 photographs of these folios survive as Ms. Phot 38 in the Royal Library in Copenhagen.
Shortly after his viewing of the palimpsest, Heiberg announced the discovery of new Archimedes text, an item deemed sufficiently important to merit a front-page story in the New York Times of 16 July 1907. Heiberg published his results from the Method in the journal Hermes in 1907 and in the second edition of his three volumes of Archimedis Opera Omnia cum Commentariis Eutocii published in 1910-1915; the first edition had been published in 1880-1881.

The manuscript disappeared from the Metochion during the upheavals in Europe in the early 20th century and was seen publicly again only after a French family put it up for auction at Christie’s in New York in 1998. Though virtually nothing definitive is known about the history of the codex during the intervening nine decades, some facts may be inferred from its condition at the time of the auction. Even the most superficial comparison of the manuscript with Heiberg’s photographs from 1906 shows the extent of the injury inflicted on the book during this period. Many pages have been severely damaged by mold that did not exist in 1906, some pages have disappeared altogether, and four pages have been painted over with portraits of Christian Evangelists.

Since the auction in 1998, Georgi Parpulov of the Walters Art Museum has unearthed a letter in French from Salomon Guerson, a vendor of antiquities in Paris, to Professor Harold Willoughby of the University of Chicago. The letter in the Willoughby Archives in the Library of the University of Chicago referenced an earlier correspondence between these two in 1932. The translated letter describes a folio that was “identified through your mediation by the curator of the Huntington library as being the manuscript of Archimedes described by J.L. Heiberg in Hermes vol. 42, p. 248. I would like you to know that I wish to sell this manuscript. … I am asking $6000.” Though not identified in the letter, the curator at the Huntington Library must have been Reginald B.
Haselden, who pioneered the application of modern technology to the study of manuscripts and who published the book “Scientific Aids for the Study of Manuscripts” in 1935. Parpulov also located the monochrome negative of a photograph of the detached f.57v of the Euchologion in the same archive; the digitized image of this negative is now included in the Archimedes Palimpsest database.

As already mentioned, four leaves were erased yet again some time after 1938, overpainted with forged icons of the four Gospels, and further distressed to appear even more aged. One hypothesis is that the icons were painted in an attempt to increase the value of these pages, or perhaps even of the intact codex, in the eyes of less-sophisticated potential purchasers since books with medieval art were generally considered to be more valuable than plain texts. Tragically, the original text on two of the overpainted leaves was from the Method. The outward appearance of the book in 1998 is shown in Figure 1 and one side of one disbound folio appears in Figure 2.

The manuscript was purchased at auction at Christie’s on 28 October 1998 for $2 million US by an anonymous American collector. Its new owner deposited the manuscript at the Walters Art Museum in Baltimore, MD, USA early in 1999 and funded an intensive program of conservation, imaging, and scholarly study for ten years. The imaging study was officially completed on the tenth anniversary of the auction with the posting of all original and many processed images at http://www.archimedespalimpsest.net, though experimental studies continue. Publication of the scholarly work on the manuscript is underway, including editions of the Method and Stomachion. A history of the project up through 2006 is available in (Netz and Noel, 2007).
Figure 1: Appearance of the Archimedes Palimpsest before the auction in 1998 (Walters Art Museum)

The existence of the works by Archimedes beneath the prayerbook text was known at the time of the auction in 1998, but other original works have been identified in the manuscript during the course of the imaging project. These include five leaves containing parts of two speeches by the Athenian orator Hypereides that were previously thought lost: four folios of “Against Diondas” and one of “Against Timandros”. The former speech reveals the political conditions in Athens after the battle at Chaeronea (Tchernetska, 2005). Seven folios contain a third recently identified unique manuscript, a commentary on Aristotle’s “Categories” by Alexander of Aphrodisias. The original text on another set of four folios is a history of St. Pantoleon, while two folios contain a 10th-century orthodox liturgical text, the Menaion. Two as-yet unidentified texts appear on six folios.

In short, the Archimedes Palimpsest is a document with writings that are significant to the fields of mathematics, history, and philosophy. This book is extraordinarily important, even unique, to classical studies. Its existence raises the question of whether other copies of the Euchologion (or other manuscripts) might still exist that were made from leaves of the original parchments.

Figure 2: Visual appearance of one leaf of the treatise “On Spiral Lines” (Euchologion f. 093v-092r), oriented with the faint Archimedes text running horizontally and the Euchologion text running vertically. (Copyright retained by the owner of the Archimedes Palimpsest).

The first, and ultimately most important, objective of the project was the conservation of the badly damaged manuscript, which was executed by the conservation staff of the Walters Art Museum under the direction of Abigail Quandt. An important aspect of conservation was the essential separation of the leaves. To better understand what they were up against, Ms.
Quandt and her colleagues dispatched microscopic core samples of a folio of text to the Canadian Conservation Institute in Ottawa. The analysis reported that the collagen in the parchment is breaking down, which further intensified the urgency of the conservation. The painstaking task required removal of both traditional hide glue and modern polyvinyl acetate cement (PVAC, close kin to wood glue used by modern carpenters) to separate the fragile leaves. Before imaging, candle wax residue left from hundreds of years of religious services had to be removed by painstaking scraping with surgical tools under the microscope. It is fair to say that these efforts by the conservation staff of the museum are largely responsible for the ultimate success of the project.

THE FORGERIES

During the forensic study of the four leaves with the forged icons in December 1999, John Lowden of the Courtauld Institute in London identified their source as the book “Miniatures des Plus Anciens Manuscrits Grecs de la Bibliothèque Nationale du VIe au XIVe Siècle” published by Henri Omont (1929). The forged paintings are 1:1 copies that were transferred via pinpricks, tracings, and preliminary sketches in red. The paintings were artificially aged by smudging dirt into the paint surface and fraying the edges of the parchment. The pair of forgeries on the facing leaves f. 57r and 63v of the Method are shown in Figure 3.

Figure 3: Forgeries on facing leaves f. 057r (top) and f. 067v (bottom) showing forged icons painted over leaves after 1938. The Archimedes text is largely obscured except in the gutter, where it is seen to run horizontally. (Copyright retained by the owner of the Archimedes Palimpsest).

Dr. Jennifer Giaccai, then on the scientific staff of the Walters Art Museum, studied the properties of the forged icons, including the paint palette.
After using Fourier-transform infrared spectroscopy to identify the pigments, she determined that one of them, phthalocyanine green, was not commercially available until 1938. She also concluded that the same palette was used in a forgery in a manuscript at Duke University whose source had been identified previously by Lowden. SPECTRAL IMAGING OF THE MANUSCRIPT The primary focus of the remainder of this paper is the imaging of the manuscript at so-called “optical” wavelengths, i.e., the spectral regime from the near-ultraviolet to the near-infrared regions that may be imaged with a silicon sensor. The imaging team for this task included Dr. Keith Knox, originally affiliated with the Xerox Research Center in Webster NY and now at the Air Force Research Laboratory in Kihei HI, Dr. William A. Christens-Barry of Equipoise Imaging, LLC in Ellicott City, MD, and Dr. Roger L. Easton, Jr. from the Chester F. Carlson Center for Imaging Science of the Rochester Institute of Technology. Dr. Easton was able to entice several graduate and undergraduate students and even high school interns to work on the project. Over the course of the project, one important lesson learned (and subsequently reinforced in work on other manuscripts) is that every method that has been imagined and implemented has been useful, if not essential, for some original manuscript in the book. In other words, the visibility of the original texts in the palimpsest may be affected little, if at all, by some technique and may be improved significantly by another. It is essential to have a variety of hardware and software tools in the arsenal and to be flexible in their application. Over the 10-year duration of the project, the entire manuscript has been imaged using several techniques and a variety of sensors. The original exploratory imaging sessions, dubbed “Phase I” took place in the summer of 2000. This work led to adoption of a standard imaging protocol for Phase II. 
The team assembled at the Walters Art Museum in Baltimore six times between 2001 and 2004 for imaging of groups of approximately 15 newly disbound and conserved folios. An additional session was held in November 2006 to reimage several pages of the manuscript using experimental techniques, and the entire codex was reimaged with a higher-resolution camera system and illumination from light-emitting diodes in August 2007, an effort that might appropriately be called “Phase III.” Some pages were imaged yet again in March 2009 to provide additional image data for new processing algorithms that had been developed.

The techniques included imaging in reflected light at different wavelengths and different angles, and imaging of visible light emitted by the parchment when illuminated by ultraviolet light (ultraviolet fluorescence imaging). Several different monochromatic and color digital sensors were used, and the development of the technology over the course of the project eventually required reimaging with newer systems. Only the spectral imaging modality will be discussed in this paper; the other methods are treated elsewhere.

In spectral imaging, multiple digital images are collected of the same scene using different wavebands. The images are subsequently combined by computer to reveal or enhance the desired features of the scene. The process often is called “multispectral imaging” for a small number N of relatively broad wavebands (e.g., N ≤ 10 with ∆λ ≥ 25 nm) or “hyperspectral imaging” for a larger number of narrower bands. The subsequent computations may be based on observations of the features or on statistics of the (sometimes quite subtle) variations in the response of the ink across the different wavebands.
Fluorescence imaging is based on the observation that some materials, particularly those composed of organic substances, absorb light photons at some energy level and subsequently emit other photons at lower energies (longer wavelengths) that are characteristic of the material. These re-emitted photons may be imaged with appropriate sensors. In the ultraviolet fluorescence imaging used in the project, the ultraviolet photons are absorbed by the parchment, which then emits visible light that could be imaged with a standard digital camera; the emitted light is dominated by blue wavelengths just longer than the incident ultraviolet radiation. As shall be demonstrated, very useful information about the original texts may be conveyed by the relative intensities of the fluorescent emission over the visible wavelengths. This information required imaging of the different spectral bands, which could be implemented with a digital camera with a color sensor or a monochrome camera with external bandpass filters. X-ray fluorescence (XRF) imaging is based on the same principle, though both the incident and emitted photons are highly energetic X rays and thus require special generation and sensing equipment. These X rays are capable of penetrating obscuring materials, such as dirt or paint. The spectrum of emitted X rays reveals the relative populations of elements at those locations, e.g., iron in the ink. XRF imaging proved essential for reading the original text on leaves with the forged icons and on leaves that were particularly grimy, such as the colophon of the Euchologion. The original goal of the imaging was to develop a collection and processing scheme capable of “stripping off” the prayerbook text to leave the original “undertext” with enhanced contrast and readability; this objective is perhaps better described as making the overtext “disappear” into the parchment so that the visibility of the original text is improved. 
The difference in color appearance of the inks provided the basis for the original strategy of collecting images under different illuminations and digitally combining these images to create new images with enhanced undertext. After considerable experimentation, a method was developed that works well for the Archimedes texts and for leaves from some of the other manuscripts. We also learned that the protocol for imaging and processing needed to be modified for use on the leaves from other original manuscripts. For example, the writings in the Alexander and St. Pantaleon texts responded rather poorly to the original processing technique, but new imaging tools have been noted to perform much better in those cases. This is one of the important lessons learned in the course of this work: that the combination of image collection technique and image processing varies for different manuscripts within the Archimedes palimpsest.

PHASE-I IMAGING AND IMAGE PROCESSING

The first test images were collected during the summer of 2000, with the goal of “stripping off” the overtext so that only the original writings remained visible. A scientific digital camera was used that was state of the art for the time, incorporating a cooled monochromatic charge-coupled device (CCD) sensor array with 1536 × 1024 picture elements (“pixels”). The cooling ensured that images had 12 bits of useful dynamic range, which means that the sensor can distinguish 2^12 = 4096 different shades of “gray” in the scenes. The five color bands imaged in the preliminary phase were selected by glass filters that transmitted different bands of light with widths ∆λ ~ 100 nm in the ultraviolet, blue, green, red, and infrared regions of the spectrum. The images were collected under three types of illumination: visible light from tungsten photofloods, “shortwave” ultraviolet lamps with emission centered about λ0 ≈ 254 nm, and “longwave” ultraviolet lamps with λ0 ~ 365 nm.
The five filters and three illuminations produced 15 spectral images for subsequent multispectral processing. The individual leaves of the book, each approximately 150mm × 190mm, were imaged in two sections at a spatial resolution of about 8 picture elements (pixels) per mm (equivalent to 220 “dots” per inch, or “dpi”); this is approximately equal to the spatial resolution of the eye at normal reading distance. The images were processed and digitally stitched via many custom (and time-consuming) computations to create processed images of the leaves. Part of the computational intensity was due to the use of glass bandpass filters in the optical path. Unavoidable small differences in the “tilts” of the filters relative to the sensor had the effect of translating the recorded images by small distances in varying directions. These translations had to be removed to “register” the images, i.e., to “line up” the pixels in each waveband, before performing spectral image processing. As it happened, the registration process was more difficult than had been envisioned. Even after significant custom image processing, the resulting images were still imperfect.

The registered images of each leaf may be envisioned as a 3-D array of image data with coordinates [x,y,λ], i.e., as a “cube” of data. The image datasets were processed using several spectral segmentation techniques to distinguish among the various classes of object present in the images, such as “parchment”, “mold”, “overwriting” (from the Euchologion), and “underwriting” (the desired original text). In general, spectral image processing methods may be divided into two basic classes: “supervised” and “unsupervised” classification (Duda, et al., 2000). In this project, the former type of algorithm was used in most of the processing, though a specific type of unsupervised classification proved essential for leaves of one manuscript. This type of processing will be described in detail later in this paper.
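The registration step described above (removing small band-to-band translations before stacking the images into an [x,y,λ] cube) can be illustrated with a short sketch. The paper does not say which registration method the team used; phase correlation is substituted here purely as an illustration, and the function names `estimate_shift` and `register_bands` are hypothetical. The sketch assumes whole-pixel translations and uses circular shifts, which a real pipeline would replace with sub-pixel resampling and cropping.

```python
import numpy as np

def estimate_shift(reference, moving):
    """Estimate the integer (row, col) shift that aligns `moving` to `reference`.

    Uses phase correlation (an assumption; not the method named in the paper).
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)
    cross_power = f_ref * np.conj(f_mov)
    cross_power /= np.abs(cross_power) + 1e-12  # keep only the phase term
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks beyond the midpoint correspond to negative shifts (FFT wrap-around).
    shift = [p if p <= n // 2 else p - n for p, n in zip(peak, correlation.shape)]
    return tuple(int(s) for s in shift)

def register_bands(bands, reference_index=0):
    """Align every band to the reference band and stack them into a cube [x, y, band]."""
    reference = bands[reference_index]
    aligned = []
    for band in bands:
        dy, dx = estimate_shift(reference, band)
        # Circular shift: adequate for a sketch, not for production stitching.
        aligned.append(np.roll(band, shift=(dy, dx), axis=(0, 1)))
    return np.stack(aligned, axis=-1)
```

With five filters and three illuminations, `bands` would hold 15 arrays of identical shape, and the returned cube would have 15 planes along its last axis.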
The mathematical basis for the supervised classification algorithm assumes that a pixel belonging to a specific class will exhibit a specific “spectral signature,” i.e., a specific set of gray values over the set of spectral bands. In supervised classification, the user specifies regions of known object type; the algorithm analyzes these for patterns of pixel brightness vs. wavelength that are used to “train” the algorithm. The remaining pixels in the scene are analyzed based on the training data to determine the likelihood of membership in each of the object classes. For example, a pixel of original text would exhibit a “redder” spectrum than one for the overtext. If the spectral signature for the specific object class m is represented as the vector $\mathbf{e}_m$, then the output values for a pixel belonging to that class should form a vector proportional to $\mathbf{e}_m$, where the proportionality constant is determined by the “lightness” of that pixel:

$$\mathbf{e}_m \equiv \begin{bmatrix} e_a \\ e_b \\ \vdots \\ e_N \end{bmatrix}_m$$

After user-supplied data are analyzed to estimate the vectors for each class, a matrix $\mathbf{E}$ of these spectral signatures is constructed:

$$\mathbf{E} \equiv \begin{bmatrix} (e_a)_1 & (e_a)_2 & \cdots & (e_a)_M \\ (e_b)_1 & (e_b)_2 & \cdots & (e_b)_M \\ \vdots & \vdots & \ddots & \vdots \\ (e_N)_1 & (e_N)_2 & \cdots & (e_N)_M \end{bmatrix}$$

To illustrate the use of this matrix, note that the class membership of a pixel composed purely of the first object class may be expressed as a vector $\boldsymbol{\alpha}$ of the form:

$$\boldsymbol{\alpha} = \begin{bmatrix} 1_1 \\ 0_2 \\ \vdots \\ 0_M \end{bmatrix}$$

where the subscripts on the vector components merely specify the class of object at that pixel. The matrix $\mathbf{E}$ is applied to this vector to produce an output vector that is the spectral signature for that class:

$$\begin{bmatrix} (e_a)_1 & (e_a)_2 & \cdots & (e_a)_M \\ (e_b)_1 & (e_b)_2 & \cdots & (e_b)_M \\ \vdots & \vdots & \ddots & \vdots \\ (e_N)_1 & (e_N)_2 & \cdots & (e_N)_M \end{bmatrix} \begin{bmatrix} 1_1 \\ 0_2 \\ \vdots \\ 0_M \end{bmatrix} = \begin{bmatrix} e_a \\ e_b \\ \vdots \\ e_N \end{bmatrix}_1$$

Note that, in general, an individual pixel is heterogeneous, i.e., it is composed of some mixture of the M object classes, which may be described by a general vector $\boldsymbol{\alpha}$ of the form:

$$\boldsymbol{\alpha} \equiv \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_M \end{bmatrix}$$

The elements in the vector $\boldsymbol{\alpha}$ clearly must sum to unity if all classes are included.
$$\sum_{m=1}^{M} \alpha_m = 1$$

In such a case, the set of measured gray values at a pixel in each of the N bands is determined by multiplying the spectral signature matrix $\mathbf{E}$ by the percentage class vector $\boldsymbol{\alpha}$ to produce a measured spectral vector $\mathbf{r}$:

$$\mathbf{E}\boldsymbol{\alpha} = \begin{bmatrix} (e_a)_1 & (e_a)_2 & \cdots & (e_a)_M \\ (e_b)_1 & (e_b)_2 & \cdots & (e_b)_M \\ \vdots & \vdots & \ddots & \vdots \\ (e_N)_1 & (e_N)_2 & \cdots & (e_N)_M \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_M \end{bmatrix} = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_N \end{bmatrix} \equiv \mathbf{r}$$

The task at hand is to determine the class membership vector $\boldsymbol{\alpha}$ from the measured vector at a pixel. The most desirable (and most obvious) solution would be to apply the inverse of the matrix $\mathbf{E}$ to the measured vector $\mathbf{r}$ to evaluate the desired percentage class vector $\boldsymbol{\alpha}$:

$$\mathbf{r} = \mathbf{E}\boldsymbol{\alpha} \;\Rightarrow\; \boldsymbol{\alpha} = \mathbf{E}^{-1}\mathbf{r}$$

However, the principles of linear algebra demonstrate that the inverse matrix $\mathbf{E}^{-1}$ may only exist if the matrix $\mathbf{E}$ is square, i.e., if the number N of measured spectral bands matches the number M of object classes, and that $\mathbf{E}^{-1}$ often does not exist even in that case. That being said, it is easy to show that it is possible to evaluate a “pseudoinverse” matrix that may exist if the number of measured spectral bands exceeds the number of object classes, which means that $\mathbf{E}$ has more rows than columns, so that N > M. Put another way, the pseudoinverse matrix may (but is not guaranteed to) exist if there are more components in the measured data set than must be computed at each pixel; it exists when the columns of $\mathbf{E}$ are linearly independent, so that $\mathbf{E}^T\mathbf{E}$ is invertible. The pseudoinverse $\mathbf{E}^\dagger$ is constructed from $\mathbf{E}$ and its transpose $\mathbf{E}^T$ and then applied to the measured vector:

$$\left(\mathbf{E}^T\mathbf{E}\right)^{-1}\mathbf{E}^T \equiv \mathbf{E}^\dagger \;\Rightarrow\; \mathbf{E}^\dagger\mathbf{r} = \left(\left(\mathbf{E}^T\mathbf{E}\right)^{-1}\mathbf{E}^T\right)\mathbf{r} = \tilde{\boldsymbol{\alpha}}$$

where the overscored tilde indicates that the result is an estimate of $\boldsymbol{\alpha}$ and not its actual value. In fact, the squared error of this estimate is a minimum compared to other possible calculations of the vector $\boldsymbol{\alpha}$; this is the so-called “least-squares solution” (Schowengerdt, 1997).

Custom software was written to implement the least-squares supervised classification to produce a processed image cube with coordinates [x, y, α]. The value of α is rendered as gray value at each pixel to produce a visualization of the class.
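The least-squares estimate above can be sketched in a few lines of NumPy. This is not the project's custom software, only a minimal illustration under stated assumptions: the function name `unmix_cube` is hypothetical, the signature matrix E is taken to have shape (N bands, M classes) with linearly independent columns, and the input cube is assumed to be already registered.

```python
import numpy as np

def unmix_cube(cube, signatures):
    """Estimate class-membership images from a registered spectral cube.

    cube:       array of shape (rows, cols, N), one plane per spectral band.
    signatures: matrix E of shape (N, M), one column per object class.
    Returns an array of shape (rows, cols, M) holding the estimate of alpha
    at every pixel, computed as (E^T E)^{-1} E^T r.
    """
    n_bands, n_classes = signatures.shape
    if cube.shape[-1] != n_bands:
        raise ValueError("cube and signature matrix disagree on the number of bands")
    if n_bands < n_classes:
        raise ValueError("need at least as many bands as classes")
    # Pseudoinverse E† = (E^T E)^{-1} E^T, obtained by solving (E^T E) X = E^T.
    pseudoinverse = np.linalg.solve(signatures.T @ signatures, signatures.T)  # (M, N)
    rows, cols, _ = cube.shape
    measured = cube.reshape(-1, n_bands).T      # (N, pixels): one column r per pixel
    alpha = pseudoinverse @ measured            # (M, pixels)
    return alpha.T.reshape(rows, cols, n_classes)
```

Each plane of the returned array can then be scaled to gray values to visualize one class, for example the “underwriting” class. In practice `np.linalg.pinv` or `np.linalg.lstsq` would be more robust numerically when E is poorly conditioned.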
For example, the gray values of the pixels in an image of the “undertext” class are determined by rendering the calculated estimate of the membership of that class as a gray value at each pixel. The example in Figure 4 compares processed images of Euchologion leaf f.70v for the “overtext” and “undertext” classes to the original appearance. The imaging team was pleased with these results and believed that they validated the proposed imaging and processing.

Figure 4: Original Multispectral Image Processing of f. 70v: (a) Appearance in visible light; (b) Output of class membership for Euchologion text (“overwriting”), where pixels in the class are rendered as “white;” (c) Output for class of Archimedes text (“underwriting”) rendered as “white.” Note the detrimental effect of the mold on this text in the upper-right quadrant of the image. (Copyright retained by the owner of the Archimedes Palimpsest).

When submitted to scholarly review by Dr. Reviel Netz of Stanford University and Dr. Natalie Tchernetska of Cambridge University, the results were judged to have inadequate spatial resolution (they were described as being too “fuzzy” or “out of focus”), and the image processing was judged not to meet their requirements for scholarly reading. This is another lesson that was taught in this phase: that the images must satisfy the scholar, not the imager.

The first problem, the “fuzziness” of the generated images, arose from two sources: insufficient spatial resolution of the original images and the aforementioned problem with image registration. The scholars judged that the image resolution was too coarse by at least a factor of three. This condition could be remedied only by increasing the magnification of images, which would require either many more image sections per page (with the consequent increase of the number of images to be stitched together by a multiplicative factor), or the use of a larger sensor with more pixels.
The image registration problem arose because pixels in images at the same location on the parchment taken at different wavelengths did not “line up” exactly, despite significant custom processing. The misalignment caused the image produced by combining the images taken at the different wavebands to appear blurry. This problem could only be solved by a fundamental change in the image collection scheme.

The second problem of inappropriate processing was due to the original goal of the imagers to create images where the overtext disappears into the background and is no longer visible. In a seeming paradox, this apparent success in fact made the Archimedes text more difficult for the scholars to read. The characters of the undertext exhibited “gaps” where the ink had been obscured by the overtext, without any visible explanation for those gaps. The scholars preferred to see both texts, but to have them distinguished by some other property of the image. In other words, the primary problem for scholarly transcription was the “faintness” of the original text, not the fact that it was overwritten.

PHASE-II SPECTRAL IMAGING

Based on the comments of the scholars about the results of the first phase, the imaging plan was modified to meet a new goal: to enhance the contrast and appearance of the original undertext while retaining visible overtext. The algorithm used to produce this result evolved from observations made while trying to create images of the Archimedes undertext separately and is arguably still a type of “supervised classification.” During the imaging, it was noted that the two texts could be segmented fairly well by using a simplified collection and processing algorithm based on the different colors of text. Because the remnants of erased text are generally “redder” than the overwritten text, these characters are virtually invisible in the red channel of images taken under white-light illumination.
In other words, the measured gray values of pixels that contain Archimedes text are approximately the same as those from the surrounding parchment when viewed in red light. Because the color of the overwriting is much more neutral (i.e., without color – it is “grayer” than the erased text), it is generally quite apparent in all channels of the visible-light images. This observation led to a simple algorithm for enhancing the original undertext. Instead of requiring images at a number of wavelengths obtained by using glass bandpass filters, only two color images were required. These were collected with a color digital camera of the type used by photojournalists that was state of the art at the time: a Kodak DCS-760 (3040 × 2008 pixels ~ 6 megapixels). The camera sensor is overlaid with a Bayer color filter array so that each pixel in the sensor measures the brightness in one of the three additive primary colors (red, green, or blue, RGB). Since each pixel measures light of only one color, the light from the other two at that pixel must be estimated from measurements at neighboring pixels. For example, if a specific pixel measures green light, the red and blue values at that pixel are estimated by interpolating the values of neighboring pixels that measure those two colors. The necessary interpolation ensures the actual spatial resolution of the image is somewhat poorer than the pixel count, but the system has the advantage that the spectrometry at a pixel is performed within the camera rather than by external filters. In this way, the issue of image registration became moot as long as the camera and manuscript do not move between images taken under the different illuminations. By using an appropriate lens (60mm Micro-Nikkor), the images exhibited a spatial resolution of approximately 25 pixels per mm (625 “dots per inch” or “dpi”). For this sensor, individual images covered areas of approximately 120mm × 80mm. 
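The interpolation step described above can be illustrated with a minimal sketch. It assumes an RGGB filter layout and a plain average of the nearest measured green samples; both are illustrative assumptions, since the layout and the interpolation used inside the camera are not specified here, and the function names are hypothetical. Only the green channel is estimated; red and blue would be handled analogously with their own sample positions.

```python
import numpy as np

def green_mask_rggb(shape):
    """Boolean mask of green sites for an ASSUMED RGGB Bayer layout."""
    rows, cols = np.indices(shape)
    return (rows + cols) % 2 == 1

def interpolate_green(raw):
    """Estimate green at every pixel of a raw Bayer frame.

    Green sites keep their measured value; red and blue sites receive
    the average of the measured green values among their 4-neighbours.
    """
    raw = raw.astype(float)
    mask = green_mask_rggb(raw.shape)
    green = np.where(mask, raw, 0.0)
    weight = mask.astype(float)
    gp = np.pad(green, 1)   # zero padding: border pixels use fewer neighbours
    wp = np.pad(weight, 1)
    neighbour_sum = gp[:-2, 1:-1] + gp[2:, 1:-1] + gp[1:-1, :-2] + gp[1:-1, 2:]
    neighbour_cnt = wp[:-2, 1:-1] + wp[2:, 1:-1] + wp[1:-1, :-2] + wp[1:-1, 2:]
    estimate = neighbour_sum / np.maximum(neighbour_cnt, 1.0)
    return np.where(mask, raw, estimate)
```

Because every estimated value is an average over neighbouring pixels, fine detail is smoothed, which is why the effective resolution is somewhat poorer than the pixel count suggests.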
Each full Euchologion folio, corresponding to one leaf of the original manuscript, was imaged in 10 overlapping sections under three illuminations, for 30 images per leaf. The three illuminations, recorded in sequence, were: white-light xenon strobe (to produce a documentary image of the visible appearance), low-wattage tungsten light, and longwave ultraviolet light (λ ~ 365 nm). To facilitate movement of the manuscript under the camera to image the ten sections, a computer-controlled x-y translation table was used that is capable of precisely locating the page beneath the imaging camera. Images of the first section were collected under all illuminations, after which the translation table centered the next section of the manuscript beneath the camera. The two images collected under low-wattage tungsten lamps and longwave ultraviolet (λ ~ 365 nm) were combined to render the processed image. The former lamps have a fairly “reddish” color temperature, while the ultraviolet illumination induces visible-light fluorescence from organic material in the parchment. Since the closest visible light to the excitation wavelength is blue, this color dominates the fluorescence, though (as we shall see) useful fluorescence information exists at longer wavelengths for some leaves. The color images under the tungsten and ultraviolet illumination were “split” into their red, green, and blue color channels to create monochrome images of the scene in each band. The neutral-gray Euchologion text appears “dark” in all three channels of the visible-light image, while the fainter “reddish” Archimedes text is “lighter” in the red channel and “darker” in green and blue of the tungsten image. The contrast of each image band was enhanced by a custom algorithm based on the gray-level statistics in a local square neighborhood with linear dimension of 401 pixels. The original gray value of each pixel was compared to the mean and variance of gray values of the pixels within the local neighborhood.
A new gray value was assigned to that pixel based on the difference of the original gray value from the mean, in units of the standard deviation within the neighborhood. The new image was typically scaled to ±3 standard deviations in gray value. For example, if the original gray value was equal to the mean, then the pixel was assigned to the middle of the output dynamic range (gray value of 128); if the pixel value was one standard deviation brighter than the mean, the new gray value would be the mid-gray value plus one third of the remaining dynamic range, etc. This process effectively eliminates gross variations in gray value due to localized effects, such as local parchment staining. The two channels used (red under tungsten illumination and blue under ultraviolet illumination) are shown after contrast enhancement in Figure 5a and Figure 5b, respectively. The gray values at pixels for both parchment and overtext are approximately equal in these two images, so that the Euchologion text “disappears” into the parchment when the two are subtracted (Figure 5c). Since the processing was not “perfect,” traces of the Euchologion text generally were still visible (as was desired).

Figure 5: Subtraction of spectral images to enhance undertext in gutter region of f.093v-092r from “On the Sphere and the Cylinder”: (a) red channel under tungsten illumination showing very little undertext; (b) blue channel under ultraviolet illumination, showing both texts; (c) difference image (“sharpie”). (Copyright retained by the owner of the Archimedes Palimpsest).

This rendering method is useful for reading and is, in fact, preferred by some scholars, most notably the late Dr. Robert Sharples of University College London. His interest in these images led to the pet name of “sharpies” in his honor. Note that the subtraction step also enhances any noise in the images, which may interfere with the reading of the text. This observation led to the development of a “pseudocolor” rendering of the same two monochrome images.
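The local contrast enhancement and subtraction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names are hypothetical, the local statistics are computed with a box filter over reflected borders, and the order of subtraction and the scaling of the difference image are choices made here for display.

```python
import numpy as np

def box_mean(img, size):
    """Mean over a size x size window (size odd), using reflected borders."""
    pad = size // 2
    padded = np.pad(img, pad, mode="reflect")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = (ii[size:size + h, size:size + w] - ii[:h, size:size + w]
             - ii[size:size + h, :w] + ii[:h, :w])
    return total / (size * size)

def local_contrast_stretch(band, size=401, n_sigma=3.0):
    """Map each pixel to 128 + (deviation from local mean, in local sigmas),
    scaled so that +/- n_sigma spans the 8-bit output range."""
    band = band.astype(float)
    mean = box_mean(band, size)
    var = np.maximum(box_mean(band * band, size) - mean * mean, 0.0)
    z = (band - mean) / np.maximum(np.sqrt(var), 1e-6)
    return np.clip(128.0 + z * (127.0 / n_sigma), 0, 255).astype(np.uint8)

def sharpie(tungsten_red, uv_blue, size=401):
    """Difference of the two enhanced bands, re-centred on mid-gray.

    ASSUMPTION: UV-blue minus tungsten-red, halved to fit 8 bits, so that
    undertext (dark in UV-blue only) renders darker than the parchment.
    """
    a = local_contrast_stretch(tungsten_red, size).astype(float)
    b = local_contrast_stretch(uv_blue, size).astype(float)
    return np.clip((b - a) / 2.0 + 128.0, 0, 255).astype(np.uint8)
```

With `n_sigma=3`, a pixel one standard deviation above the local mean maps to roughly 128 + 127/3 ≈ 170, matching the "one third of the remaining dynamic range" rule in the text. Pixels where the two enhanced bands agree (parchment and overtext) land near mid-gray in the difference image.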
The image in the red channel under tungsten illumination (with “bright” Archimedes text and “dark” Euchologion text) is placed in the red channel of the new image, while the ultraviolet-blue image (with both texts “dark”) is inserted in the blue and green channels. The Archimedes text is “bright” in the red channel and “dark” in the green and blue channels of the resulting image and thus appears with a “reddish” tint, while the Euchologion text appears in a “neutral” gray. This provides a color cue to the reader about the origin of the characters, as depicted in Figure 6 for the same region of f.093v-092r that was shown in Figure 5.

Figure 6: Pseudocolor image of gutter section of f.093v-092r from “On the Sphere and the Cylinder”, showing increased contrast of differential color rendering of undertext. (Copyright retained by the owner of the Archimedes Palimpsest).

During image collection (generally at the end of each day), the sets of 30 images for the sides of each folio were transferred from the imaging control computer to a UNIX workstation (Apple Powerbook laptop) and processed in an overnight batch by Dr. Keith Knox. This provided a rapid assessment of the quality of the images. After the entire imaging session was completed, the processed pseudocolor image sections were digitally “stitched” together by using commercially available software tools to form large images of the complete folio that cover approximately 5000 × 7500 pixels. The scholars find these pseudocolor images to be easier to read than the manuscript itself for nearly all leaves. The image sets were distributed to scholars on portable disk drives and/or as prints for their assessment. We estimate that the scholars were able to read as much as 80% of the original text from the images collected and processed during Phase II.
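The pseudocolor channel assignment described in this section can be sketched in a few lines. The function name is hypothetical, and the sketch assumes the two monochrome bands have already been registered and contrast-enhanced to the same size and data type.

```python
import numpy as np

def pseudocolor(tungsten_red, uv_blue):
    """Build an RGB image: tungsten-red band -> R, UV-blue band -> G and B."""
    if tungsten_red.shape != uv_blue.shape:
        raise ValueError("the two bands must have identical shapes")
    return np.stack([tungsten_red, uv_blue, uv_blue], axis=-1)

# Two illustrative pixels (values are made up for the example):
# an undertext pixel, bright in tungsten-red but dark in UV-blue,
# and an overtext pixel, dark in both bands.
tungsten = np.array([[200, 40]], dtype=np.uint8)
uv = np.array([[50, 40]], dtype=np.uint8)
rgb = pseudocolor(tungsten, uv)
```

In this toy case the first pixel becomes (200, 50, 50), a reddish tint, while the second becomes (40, 40, 40), a neutral dark gray, which is the color cue the rendering relies on.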
EXPLORATION OF OTHER IMAGING TECHNOLOGIES

Though the pseudocolor imaging algorithm was quite helpful for scholarly reading, it was less successful (and sometimes much less) on the leaves with painted forgeries, on damaged leaves (such as f. 001-002, the colophon of the Euchologion), and on leaves with texts other than Archimedes. To explore other tools for these pages, other researchers were invited to a conference at the Walters Art Museum in April 2004. The results obtained from the pseudocolor method were presented to the invitees, who were then encouraged to suggest other technologies that could be usefully applied to the project. Three suggestions were deemed sufficiently promising to warrant further study. Derek Walvoord, a graduate student in Imaging Science at the Rochester Institute of Technology, developed a digital transcription system that utilized context and prior knowledge of the words used by Archimedes in a probabilistic network to predict the most likely Greek characters to fit in a particular circumstance (Walvoord and Easton, 2008). The program was primarily intended to assist the scholar in identifying isolated characters that are difficult to read. Dr. William Christens-Barry suggested the use of newly available light-emitting diodes (LEDs) as spectral illumination sources. Dr. Uwe Bergmann of the Stanford Linear Accelerator Center proposed X-ray fluorescence imaging of the most troublesome leaves. All three avenues were pursued and utilized. The method and results from X-ray fluorescence are considered in other papers in this session.

Though not a result of this meeting, another occasionally useful technique was developed based on observations by Dr. Judson Hermann of Allegheny University. Dr. Hermann was trying to read the original text on folios that were originally identified by Dr. Natalie Tchernetska in October 2002 as fragments of speeches by the Greek orator Hypereides, one of the ten canonical orators of antiquity.
The only previously known examples of Hypereides’ works were on papyri discovered in the 19th century; this is the first evidence that any of his works made the technological transition from papyrus scroll to parchment codex. The speeches in the Codex are “Against Diondas” and “Against Timandros” (Tchernetska, 2005). Dr. Hermann was trying to read these original texts from the parchments with the aid of Abigail Quandt’s microscope in the Conservation Laboratory at the Walters Art Museum. He noticed that the original ink of this manuscript had completely vanished in many places, but that shallow “channels” had been excavated in the parchment by the acid in the original ink. The channels became more visible if illuminated at low raking angles. As a result of his observation, the imaging system used in the extra session in November 2006 was modified to include an illuminator similar to that used with the microscope. The value of this tool is shown in a comparison of the original pseudocolor and raking-incidence images in Figure 7.

Figure 7: Raking-incidence illumination of f. 176v with text from “Against Diondas” by Hypereides: (a) pseudocolor rendering with little (if any) apparent undertext; (b) under raking-incidence visible illumination, showing textured undertext characters; (c) same image with lines added to characters “διονδασ” = “diondas” (Copyright retained by the owner of the Archimedes Palimpsest).

ENHANCED MULTISPECTRAL IMAGING SYSTEM

A recurring theme in this project is the development of digital imaging technology over the ten-year time span. The capabilities of digital cameras in 2007 had so improved over those available at the beginning of the project in 2000 that the owner of the palimpsest was willing to fund a complete reimaging of the book. A camera system with higher resolution supplied by Stokes Imaging in Austin TX was combined with the new spectral LED illumination system designed by Dr. Christens-Barry.
The plan had been to use a monochrome camera with a 33-megapixel sensor, but delays in its delivery forced the substitution of a Sinar 54H color digital camera back (5440 × 4080 pixels with Bayer RGB screen, 22 MP). The Sinar back uses piezoelectric micropositioners to translate the sensor by full- and half-pixel increments both horizontally and vertically relative to the optical image. The full-pixel translations allow red, green, and blue values to be measured at all pixels of stationary objects, so that RGB color images may be collected that do not require the local color interpolation of a standard digital color camera. The half-pixel shifts produce images with twice the pixel count along both axes by measurements between the original pixels; the final image size is 10880 × 8160, or 88 MP. However, the additional pixels provide little additional information because the modulation is very small at the larger spatial frequencies. The change to a color camera was originally a major disappointment and also created significant problems in the image collection, including an increase in the collection time by a factor of 16, but its use later became a blessing in disguise. The color information at each pixel proved essential to the readings of the Alexander texts and led to a new collection protocol for subsequent projects, as will be described shortly.

The LED illuminators have several important attributes for use with historic and fragile manuscripts. Since LEDs generate light by electronic transitions rather than by thermal interactions, they apply little or no heat to the manuscript. The wavebands are rather narrow (∆λ ~ 40 nm), which has advantages for spectral image processing. A fairly large number of LED color bands are currently available and more can be anticipated as the technology advances, so further enhancements of the capabilities of such illumination systems are anticipated.
In fact, the LED illuminators have already proven to be useful for making high-quality renderings of colored artifacts, such as paintings. The LED illumination system constructed by Dr. Christens-Barry (now dubbed “Eureka Lights”) included one waveband centered in the near ultraviolet region of the spectrum (λ ~ 365nm), seven visible bands (λ0 ~ 445nm, 470nm, 505nm, 530nm, 570nm, 617nm, and 625nm), and three infrared bands (λ0 ~ 700nm, 735nm, and 870nm). In addition, images were collected under raking illumination obtained from separate lighting units from two sides and at two bands (λ0 ~ 470nm and 870nm) and under tungsten illumination, for a total of 16 images per sequence. The object was imaged in a darkened room so that the sensor could “see” only the illuminating or fluorescing wavelengths. This allowed removal of the infrared blocking filter from the camera to enable imaging of light at the infrared wavelengths out to λ ~ 1000nm. The sample image sequence in Figure 8 shows the color response of the sensor.

Figure 8: Imaging sequence under LED illumination with RGB sensor. Note that the apparent color of the images is dominated by the color of the illumination in the visible range. The image at the band centered about λ0 ~ 735nm appears “orange” due to the differential transmittances of the RGB filters, while that at λ0 ~ 870nm appears gray because all three RGB filters transmit approximately the same amount of light at that wavelength. (Copyright retained by the owner of the Archimedes Palimpsest)

UNEXPECTED DISCOVERIES FROM THE SPECTRAL IMAGES

The existence of the Alexander text was not realized until June 2005, when Nigel Wilson of Lincoln College of Oxford University was able to read the string of characters “αριστοτέληζ” (“Aristotle”) in the gutter of Euchologion f.80r (Figure 9). Unfortunately, the normal pseudocolor processing was of very little help on these leaves (Figure 10).
Figure 9: Pseudocolor image showing detail region of gutter of f.080v-073r with faint string of characters noted by Nigel Wilson to read “αριστοτεληζ” (“Aristotle”), also with emphasis added for clarity. (Copyright retained by the owner of the Archimedes Palimpsest)

Figure 10: Detail of f. 120v-121r with undertext of the Commentary: visual appearance on left and normal pseudocolor image on right. Note that little if any of the original text is visible in either image. (Copyright retained by the owner of the Archimedes Palimpsest)

More than three years later, in December 2008, a method to read these texts was developed thanks to a convergence of events. Dr. Noel had challenged the imaging team to process the text in the Alexander folios over the holidays. Kevin Bloechl, a first-year undergraduate student in imaging science at the Rochester Institute of Technology, made the request of Dr. Roger Easton to do some imaging work during the academic break. Expecting little, Easton gave him the task of processing one of the Alexander leaves. After one day of effort on the task, Mr. Bloechl produced a digital image of one Alexander leaf taken under ultraviolet illumination using one type of unsupervised spectral classification that is known as “principal component analysis” (PCA); the remarkable results of this process will be considered after a digression to describe the processing algorithm.

In general, unsupervised classification is based on the statistical properties of the set of N gray values that have been measured for each pixel; these may be visualized as an N-dimensional vector. The ensemble of all such vectors in the image generates an N-dimensional histogram of the data. It is natural to expect that pixels for objects in the same class would have similar vectors, so that they congregate in “clusters” in this histogram.
The processing algorithm applies some criterion to the dataset in an attempt to identify clusters that may correspond to different object “classes.” One of the simpler such algorithms is PCA, which calculates a set of N equivalent images by projecting the pixel data in the N-band histogram onto N distinct and orthogonal axes that are determined in a specific order from the statistics. The first principal component is obtained by projecting the image data onto that axis in the N-dimensional space that spans the largest variance; the second component is the projection onto the axis orthogonal to the first that spans the next largest variance; the third axis is that orthogonal to the first two and that spans the next largest variance, etc. Because the variances of the projected data decrease with increasing order of the component, it is common for the highest-order components with the smallest variance to be dominated by random noise. If the user is lucky, a class of object that is of interest may “appear” as the dominant feature in one or more of the principal components. A hypothetical schematic of the process for 3-D data is shown in Figure 11.

For these images, PCA was applied to the color images (RGB) of ultraviolet fluorescence on the Alexander leaves. We know that the raw RGB image is dominated by the response in the blue channel, which leads to the expectation that the first principal component will resemble the blue image. The second and third principal components are orthogonal to this band and thus should be dominated by some combination (weighted sum or difference) of the green and red color channels. As it happens, the original text is most visible in the image of the third principal component, which demonstrates that the text information is conveyed by the spectrum of the fluorescence rather than by the intensity or color of the reflected light. An example from f. 120v is shown in Figure 12.
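The PCA projection described above can be sketched with an eigendecomposition of the band covariance matrix. This is a generic textbook formulation, offered as an illustration rather than as the software used in the project; the function name is hypothetical, and the input is assumed to be an array of shape (height, width, N), for example N = 3 for an RGB fluorescence image.

```python
import numpy as np

def pca_bands(image):
    """Project an (H, W, N) multiband image onto its principal axes.

    Returns the N principal-component images (ordered by decreasing
    variance), the variances, and the axes as columns of a matrix.
    """
    h, w, n = image.shape
    pixels = image.reshape(-1, n).astype(float)
    pixels = pixels - pixels.mean(axis=0)       # centre the N-D histogram
    cov = np.cov(pixels, rowvar=False)          # N x N band covariance
    variances, axes = np.linalg.eigh(cov)       # eigh returns ascending order
    order = np.argsort(variances)[::-1]         # largest variance first
    variances, axes = variances[order], axes[:, order]
    components = pixels @ axes                  # projection onto each axis
    return components.reshape(h, w, n), variances, axes
```

For an image whose variance is dominated by the blue channel, the first column of `axes` lies close to the blue axis, and the remaining two columns are combinations of mostly red and green, as the text anticipates. Each component image would normally be contrast-stretched before display, since the higher-order components span a small numerical range.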
The scholars now believe that they can read most of the original Alexander text from these images, whereas very little text was transcribed from the pseudocolor images.

Figure 11: Schematic of 3-band PCA of an image with two overlapping object classes. The 3-D histogram is elongated along the blue axis, which resembles the situation with the UV fluorescence images. In PCA, the data in the histogram “cloud” are projected onto the axis that spans the largest variance and rendered from black to white to evaluate the first principal component. In this case, the projection onto PCA band 3, which is dominated by a combination of red and green channels, produces a segmented image.

Figure 12: Comparison of pseudocolor image of 120v (upper left) to PCA band 3 from visible color image under UV illumination. Note that the undertext is quite apparent in much of the PCA image. (Copyright retained by the owner of the Archimedes Palimpsest)

The observation that undertext at different locations in the scene often appears in either the second or third bands of the 3-band PCA images led to a further development of the method in the summer of 2010. The fact that the fluorescence is dominated by blue emission means that the 2nd and 3rd PCA bands are composed largely of weighted combinations of the green and red channels of the fluorescence image. This observation led to the realization that the undertext at different locations in the scene might become more visible in a pseudocolor rendering of the three PCA bands. The enhanced contrast of the displayed higher-order PCA bands translates to enhanced color contrast in the pseudocolor image. The transcribing scholar may change the rendering by dynamically rotating the hue angle of the pseudocolor image in software (hence the pet name of “Hueys” for these renderings).
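The hue rotation of a pseudocolor image can be approximated by rotating each RGB pixel about the neutral (gray) axis of the color cube. This is a stand-in chosen here for brevity; the software the scholars actually used is not described, and the function name is hypothetical. The sketch assumes RGB values scaled to the range 0 to 1.

```python
import numpy as np

def rotate_hue(rgb, degrees):
    """Rotate every pixel of an (..., 3) RGB array about the gray axis.

    Neutral pixels (R = G = B) are unchanged; colored pixels keep their
    distance from the gray axis while their hue angle advances.
    """
    theta = np.deg2rad(degrees)
    k = np.ones(3) / np.sqrt(3.0)               # unit vector along gray axis
    cross = np.array([[0.0, -k[2], k[1]],
                      [k[2], 0.0, -k[0]],
                      [-k[1], k[0], 0.0]])      # cross-product matrix of k
    # Rodrigues' rotation formula
    rot = np.eye(3) + np.sin(theta) * cross + (1.0 - np.cos(theta)) * (cross @ cross)
    out = rgb.astype(float) @ rot.T
    return np.clip(out, 0.0, 1.0)               # clipping may alter extreme colors
```

A rotation of 120 degrees cycles the primaries (pure red becomes pure green), so sweeping the angle moves a faint color difference between text and parchment through hues where the eye may separate them more easily.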
The hue angle is analogous to the azimuth angle of a three-dimensional cylindrical coordinate system with coordinates (r, θ, z), where the radial coordinate r displays the color saturation and the z axis represents the “lightness” in the scene. The method supplied significant additional text to the scholars charged with transcribing the Alexander writings; in the words of Dr. Natalie Tchernetska, “I could read some letters and even words that we could not decipher on pseudo(color image)s, sharpies, and PCAs.” Though much of the value of the method is due to the dynamic display of the pseudocolor image data, which is not captured by static printed images, Figure 13 shows three sections of f.120v-121r displayed at two different hue angles. The value of rotating the hue angle of pseudocolor renderings of higher-order PCA bands is just beginning to be explored.

Figure 13: Demonstration of the hue angle rendering of the PCA pseudocolor image of f. 120v-121r. Three different regions of the bifolio are shown at two different hue angles. The top and bottom sections are respectively in the upper and lower folio and the center sections are in the gutter. The rendering in (a) shows the text in the upper and lower folios, while that in (b) provides additional text in the gutter region. (Copyright retained by the owner of the Archimedes Palimpsest)

Finally, the six leaves of a history of the life of St. Pantoleon also do not respond well to the standard pseudocolor processing. During the reimaging of the entire manuscript under LED illumination in 2007, it was noticed that the remnants of the original ink on these leaves appeared “bright” when imaged in the band of infrared radiation centered at a wavelength of 870nm. On some pages, e.g., Euchologion f.025v-026r, the effect is quite striking (Figure 14). The scholars can read much of the text from contrast-enhanced infrared images.
This again demonstrates the value of designing the image collection and processing for each original text.

Figure 14: Comparison of detailed area of f.024v-025r with unknown original text; (a) appearance under visible light; (b) image under infrared illumination centered about λ = 870nm, showing original text as lighter than parchment; (c) one principal component processed from complete set of wavebands, showing the visible original text. (Copyright retained by the owner of the Archimedes Palimpsest)

CONCLUSIONS

The Archimedes Palimpsest Project has been a ten-year effort to conserve and extract the original texts from the 10th-century manuscript. The new readings obtained from the Method have led to a much richer understanding of the sophistication of the thought process of Archimedes. The other, previously unsuspected, original texts in the codex are judged to be of great importance in philosophy and classical history. Among the lessons learned over the course of this project is that every imaging tool and technique that we have imagined has been valuable in some application, so that it is important to have a variety of collection schemes and processing algorithms available for use. The imaging collection and processing tools have been developed specifically for this manuscript, but are applicable to a wide variety of texts. The imaging team is currently preparing to image a 9th-century Syriac palimpsest using the same tools.

PUBLICATION

The transcriptions and translations of the original texts in the Archimedes Palimpsest will be published over the next few years. The entire library of raw images of the codex is now available on the project website, http://www.archimedespalimpsest.org, under the heading “Digital Palimpsest.” The images have been released under a Creative Commons attribution license and may be used and released by anyone with appropriate credit.
ACKNOWLEDGEMENTS

The authors are deeply indebted to the owner of the Archimedes Palimpsest for supporting this work over the lifetime of the project. John R. Stokes, John T. Stokes, and Ken Boydston have supplied imaging hardware and software essential to the task. We also thank the conservation staff of the Walters Art Museum, especially Abigail Quandt, for invaluable contributions, including remaining on the job when the imaging ran overtime. We also must thank Michael B. Toth of R.B. Toth, Associates for his management expertise and Doug Emery of Emery IT for his data support. Reviel Netz, Natalie Tchernetska, and Nigel Wilson have provided essential criticism of the results of the image processing. Doug Emery’s advice on metadata was essential to the success of the last imaging session. We also want to thank Charles Dickinson, Derek Walvoord, Matt Heimbueger, Allison Bright, Alvin Spivey, Kevin Bloechl, Claire Mac Donald, and Upasana Marwah, who have assisted in the image processing while students or interns in the Chester Carlson Center for Imaging Science of the Rochester Institute of Technology.

REFERENCES

Duda, R.O., Hart, P.E., and Stork, D.G. (2000), Pattern Classification, Second Edition, John Wiley & Sons, New York.
Netz, Reviel and W. Noel (2007), The Archimedes Codex, Da Capo Press, Cambridge.
Omont, Henri (1929), Miniatures des Plus Anciens Manuscrits Grec de la Bibliothèque Nationale du VIe au XIVe Siècle, Librairie Ancienne Honore Champion, Paris.
Schowengerdt, R.A. (1997), Remote Sensing, Models and Methods for Image Processing, Second Edition, Academic Press, San Diego.
Tchernetska, Natalie (2005), New Fragments of Hypereides from the Archimedes Palimpsest, Zeitschrift für Papyrologie und Epigraphik, 154, pp. 1–6.
Tischendorf, Lobegott Friedrich Constantin (1847), Travels in the East (translated by W.E. Shuckard), Longman, Brown, Green, and Longmans, London. (available at Google books http://www.books.google.com)
Walvoord, D.J. and R.L.
Easton, Jr. (2008), Digital Transcription of Archimedes Palimpsest, IEEE Signal Processing, 25, pp. 100–104.