Mathemusical Thought - LUC Sakai
Mathemusical Thought
Aaron Greicius
Loyola University Chicago
Fall 2014
© 2015 Aaron Greicius. All Rights Reserved.

Contents

1 Introduction to Mathemusical Thought: meet the players
  1.1 Appeal to authority
  1.2 Definitions: meet the players
  1.3 Vantage points, goals, questions
  Classic 1
2 Elementary music theory
  2.1 Sound, tones and notes
  2.2 Pitch notation
  2.3 Intervals
  2.4 Onset and offset
  Classic 2
3 Frequency
  3.1 Frequency space
  3.2 The Pythagorean Legend
  3.3 The transposition group $T_{\mathrm{freq}}$
  3.4 Just tunings and equal temperament
  Classic 3
  Classic 4
4 Pitch space
  4.1 Pitch space
  4.2 The transposition group $T_{\mathrm{pitch}}$
  4.3 Comparing the two pictures
  Classic 5
5 Pitch-class space
  5.1 Octave equivalence
  Classic 6
  5.2 Equivalence relations
  5.3 Pitch-class space
  5.4 Pitch or pitch-class space?
6 Chords
  6.1 Sets and sequences
  6.2 Chords
  6.3 Operations on chords: transposition
  6.4 Operations on chords: inversion
  Classic 7
7 Chord-types
  7.1 Counting chords of the same type
  7.2 Counting chord-types
  Classic 8
8 Scales
  8.1 Generated scales
  8.2 Small-gap scales
  8.3 Scalar intervals, transpositions and inversions
  8.4 Maximally even scales
9 Wrap-up

1 Introduction to Mathemusical Thought: meet the players

1.1 Appeal to authority

What mathematicians say

• “Mathematics and music, the most sharply contrasted fields of intellectual activity which can be found, and yet related, supporting each other, as if to show forth the secret connection which ties together all activities of the mind...” –Hermann von Helmholtz

• “It is in its performance that the music comes alive and becomes part of our experience; the music exists not on the printed page, but in our minds. The same is true for mathematics; the symbols on a page are just a representation of the mathematics.
When read by a competent performer...the symbols on the printed page come alive–the mathematics lives and breathes in the mind of the reader like some abstract symphony.” –Keith Devlin

• “A mathematician, like a painter or a poet, is a maker of patterns. If his patterns are more permanent than theirs, it is because they are made of ideas. His patterns, like the painter’s or the poet’s must be beautiful; the ideas, like the colors or the words, must fit together in a harmonious way.” –G. H. Hardy (1877-1947)

What musicians say

• “Music is the arithmetic of sounds as optics is the geometry of light.” –Claude Debussy

• “The fugue is like pure logic in music.” –Frederic Chopin

• “Despite all the experience that I could have acquired in Music, as I had practiced it for quite a long time, it’s only with the help of Mathematics that I have been able to untangle my ideas, and that light made me aware of the comparative darkness in which I was before.” –Jean-Philippe Rameau

• “I am not saying that composers think in equations or charts of numbers, nor are those things more able to symbolize music. But the way composers think–the way I think–is, it seems to me, not very different from mathematical thinking.” –Igor Stravinsky

• “Music is not to be decorative; it is to be true.” –Arnold Schoenberg

1.2 Definitions: meet the players

The following represent a reduction of many carefully constructed definitions of mathematics and music found in the philosophical literature.

Music is the art of structured sound.
Mathematics is the science of abstract structure.

A philosopher will surely not be content with such definitions, but they are a useful starting point for us. In particular, they reveal both a potential hurdle to making a connection between the two fields (art/science), as well as a potential point of attack: the idea of structure.

1.3 Vantage points, goals, questions

Vantage points

The course will examine three points of contact. I list them here in order of increasing profundity (toward a deep connection), and ornamented with some fancy philosophical terms.

1. Ontological. Musical objects are very much like mathematical objects. We will describe and define the main musical parameters (melody, rhythm, harmony, timbre) in mathematical language (sets, sequences, topological spaces, groups).

2. Methodological. Mathematical thought, operations and objects are frequently employed both in the analysis and composition of music. We will look closely at examples of mathematical methods in both of these areas of musical practice.

3. Epistemological. Music often bears a strong logical quality. We speak of understanding a piece of music, of one passage of music following from another passage. Can these activities be compared to understanding or following mathematical arguments? We will explore these connections with the aid of formal logic.

Goals

The following goals are listed in order of increasing ambitiousness.

1. Get to know some classics in both music and mathematics: compositions, theorems, musical forms, proofs, etc.

2. Develop a short, “cocktail party” answer to the question: What exactly is the connection between math and music?

3. Get comfortable reading both musical scores and mathematical arguments. Come to understand better the nature of music and mathematics as practices.

4. Improve upon our “cocktail party” answer and articulate a deeper connection between music and mathematics.

Questions

Progress toward our last, most ambitious goal can be measured in part by our ability to answer the following questions:

1. Does the connection between music and math actually extend beyond the surface level, that is, beyond the fact that works of music can be seen as mathematical objects?

2. What is special about the math/music relation? Why is it any deeper than the connection between, say, math and painting, or math and improv comedy?

3. What precisely is the difference between the art of music and the science of mathematics?

Classic 1 (Musikalisches Opfer, Canon I. a 2 cancrizans, by J.S. Bach). Below you find a facsimile of J.S. Bach’s Canon I, from Musikalisches Opfer (or The Musical Offering). As the performance instructions indicate, this is an example of a crab canon. Video link. Tim Smith’s overview of Musikalisches Opfer.

Performance instructions: Instrument 1 plays through from left to right, then back. Instrument 2 plays from right to left, then back.

Below you find the two parts written out separately; in this form, each instrument now performs the music from left to right, then back.

Turn the score into a Möbius band. You should first fold the score in half lengthwise, obtaining a strip with Instrument 1 on one side and Instrument 2 on the other.

1. Describe Bach’s composition as a path along your Möbius band. Make sure your path traverses the whole piece (36 measures in all)!

2. What properties of Bach’s composition are articulated by the geometry of the Möbius band? What does the geometry say about the role of the two different instruments?

3. We could have also made a simple cylinder (or hoop) out of our score-strip; what advantage does the Möbius band representation have (if any)?

4. Compare our Möbius band representation to the one in the video. Which is better?

2 Elementary music theory

2.1 Sound, tones and notes

In Musimathics: the mathematical foundations of music, Gareth Loy distinguishes between sounds, tones and notes.

• A sound can be thought of as a physical thing, the object of study of the science of acoustics. Physical properties of sounds include frequency, intensity, envelope, decay, etc.

• A tone, according to Loy, is determined by three “sonic properties”: pitch, loudness, and timbre (or color). These properties are closely related, but not identical, to corresponding physical properties of sound: pitch and frequency, loudness and intensity, etc.
For example: “Frequency is a physical measure of vibrations per second. Pitch is the corresponding perceptual experience of frequency” (Loy, 13). As such, it seems a tone is more of a perceptual entity, something that certain sounds are transformed into by our mind.

• Lastly, a note is just a tone with the further properties of onset (when the note begins) and offset (when the note ends).

Musical sound consists mostly of notes. Common Music Notation (CMN) is a system for representing notes; it captures with varying degrees of precision their 5 defining properties (pitch, loudness, timbre, onset, offset).

2.2 Pitch notation

Pitch names

Pitch is “that property of a sound that enables it to be ordered on a scale going from low to high,” according to the ASASAT.¹ We begin by first assigning names to different pitches and identifying them with keys on the keyboard.

¹ Acoustical Society of America Standard Acoustical Terminology

[Figure: piano keyboard with each key labeled by its pitch name; the black keys carry both a sharp and a flat name.]

Some observations and terminology:

1. The sequence of pitch names repeats. For example, we see there are two occurrences of C. The corresponding pitches on the piano are not the same; they are in fact an octave apart. These two different pitches are said to be octave equivalent, and the name ‘C’ here identifies a pitch only up to octave equivalence. To specify exactly which C you mean, you add a number indicating which octave range the note falls in: e.g., ‘C2’ or ‘C6’. More on this later.

2. Our pitches are divided into white notes and black notes, depending on which piano key they correspond to.

3. A single step along our sequence, from one pitch to the very next pitch (to the left or right, whether black or white), is called a half step. A distance of two steps along our sequence is called a whole step. For example: the first C is one half step from the first C♯, and one whole step (= two half steps) from the first D.

4. Sharpening a pitch corresponds to moving one half step to the right in our sequence; flattening a pitch corresponds to moving one half step to the left.

5. There are multiple names for the same pitch. The first black note on the left is both C♯ and D♭. These are said to be two enharmonic spellings of the pitch.

White note mnemonics

I cannot refrain from including two classic mnemonic devices in (United States) music pedagogy: the perfectly inoffensive “FACE”, and the ever so creepy “Every Good Boy Does Fine”.

[Figure: treble staff showing the spaces F, A, C, E and the lines E, G, B, D, F (“Every Good Boy Does Fine”).]

Pitches on the staff

Our next step is to transport our pitch names from the keyboard diagram to a musical staff.

Some orientation and terminology:

• Here we represent our pitches as notes on a staff. Note that the alternating sequence of lines and spaces of the staff corresponds to our sequence of white notes: C-D-E-F-G-A-B.

• Since moving up the staff corresponds to moving along the sequence of white notes, this movement proceeds either in half steps or whole steps, depending on where we are in the sequence. (Compare the staff increments D-E and E-F, for example.)

• The clef symbol at the left of the staff is called the treble clef (or G clef). It tells us where G lies on the staff: viz., the second line from the bottom.

Now we add the black notes simply by applying the sharpening and flattening operations to our white notes.

Note the natural symbol (♮) that occurs in front of the note representing G. It negates the flat applied to the G before it. In general, once an accidental (i.e., a sharp or flat) is applied to a note, all subsequent instances within the measure retain this accidental, even in the absence of the ♯ or ♭ symbol.

Pitch: bass clef

The bass clef staff is another common musical staff. As the name suggests, it is suited to instruments (or voices) of a lower register. The idea is essentially the same, only now the bass clef (or F clef) symbol tells us where F lies on the staff: viz., on the second line from the top.
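Returning briefly to the keyboard observations above: half steps, sharpening, flattening and enharmonic spellings can all be phrased as arithmetic on key positions. The following is only a minimal sketch under assumptions that are not part of the notes: the 12 keys of one octave are numbered 0 to 11 starting at C, sharps and flats are typed as `#` and `b`, and the helper names `key_index` and `enharmonic` are hypothetical.

```python
# Assumed numbering (not from the notes): the keys of one octave are
# indexed 0..11 from left to right, starting at C.
WHITE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def key_index(name):
    """Return the key index (0-11) of a spelling such as 'C', 'C#' or 'Db'."""
    index = WHITE[name[0]]
    for accidental in name[1:]:
        if accidental == "#":
            index += 1  # sharpening: one half step to the right
        elif accidental == "b":
            index -= 1  # flattening: one half step to the left
        else:
            raise ValueError(f"unknown accidental: {accidental!r}")
    # A bare pitch name identifies a pitch only up to octave equivalence,
    # so indices wrap around after 12 keys.
    return index % 12


def enharmonic(a, b):
    """Two spellings are enharmonic if they name the same key."""
    return key_index(a) == key_index(b)


print(key_index("C#") - key_index("C"))  # 1: C to C-sharp is a half step
print(key_index("D") - key_index("C"))   # 2: C to D is a whole step
print(enharmonic("C#", "Db"))            # True
```

Because the indices wrap around, a spelling like `Cb` lands on the same key as `B`; the sketch deliberately ignores which octave a note lies in.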
Here is an online pitch reading tutorial I found after a not-so-exhaustive internet search. You can probably find better ones on your own.

Key signatures

Often a musical piece will consistently use a certain subset of the black notes. In such cases a key signature is introduced, declaring that a certain subset of notes will always bear a given accidental. Key signatures come in either a sharp or flat flavor. Some examples:

The first signature declares F will always be sharped, the second that F and C will always be sharped, etc. We will make more sense of this convention, in particular the sequence of sharps/flats in a signature, when we discuss scales and keys.

2.3 Intervals

We define an interval simply as a set of two pitches. The interval length is the distance, measured in half steps, between these two pitches.

Comment 2.1. In Musimathics Loy defines an interval as the difference in pitch between two pitches. As stated, this is not a well-defined notion: given pitches $P$ and $Q$, is the interval $P - Q$ or $Q - P$? We have not as yet assigned any numeric values to pitches, but once we do, you will see that according to our definition the interval length between two pitches $P$ and $Q$ is $|P - Q|$. This notion is well-defined, as $|P - Q| = |Q - P|$. Notice also that interval lengths will always be nonnegative, thanks to the absolute value.

Example 2.1. Compute the interval lengths of the following sets of pitches.

(a) 5 half steps.
(b) 4 half steps.
(c) 6 half steps.
(d) 12 half steps.

Sonorities

In Musimathics Loy sorts various intervals into sonority classes (“perfect”, “major”, “minor”, etc.) and summarizes their shared sonic properties in a nice table (Loy, 19).

Table 2.1 Interval Classification by Sonority

Class        Name      Semitones   Description
Perfect      Unison    0           Provides harmonic anchoring and framework.
             Octave    12
             Fourth    5
             Fifth     7
Major        Third     4           Provides expansive emotional color.
             Sixth     9
             Seventh   11
             Second    2
Minor        Third     3           Upper pitch is one semitone smaller than major intervals.
             Sixth     8           Minor intervals provide a contractive emotional color.
             Seventh   10
             Second    1
Diminished             6           Upper pitch is one half step less than a minor or a perfect
                                   interval. A diminished fifth is called a tritone.
Augmented              6           Upper pitch is one half step greater than a major or a perfect
                                   interval. An augmented fourth is also called a tritone.

It should be noted that the resulting names of intervals (“perfect fifth”, “minor sixth”, etc.) are not a function solely of the number of half steps, but depend also on the particular way the pitches are spelled.

Example 2.2. Both intervals below comprise 4 half steps. Describe each in terms of sonorities.

(a) This interval is a major third.
(b) This interval is a diminished fourth.

Sonority name algorithm

Given spellings of two pitches $P$ and $Q$:

1. First determine the IntervalName (“third”, “fifth”, etc.) by counting the number of lines and spaces they span (inclusive) and using the following table.

IntervalName   #Lines/Spaces
Unison         1
Second         2
Third          3
Fourth         4
Fifth          5
Octave         8

2. Then determine the IntervalQuality (“perfect”, “major”, “minor”, “diminished”, “augmented”) by computing the interval length (in half steps) and referring to Loy’s Table 2.1. (See following example.)

Example 2.3. Apply the sonority name algorithm to the following intervals.

(a) IntervalName = “fourth”. IntervalLength = 6 = 5 + 1. A perfect fourth is of length 5 half steps, as per Table 2.1. Thus this is an augmented fourth, also called a tritone.

(b) IntervalName = “fifth”. IntervalLength = 6 = 7 − 1.
A perfect fifth is of length 7 half steps. Thus this is a diminished fifth, likewise called a tritone.

(c) IntervalName = “third”. IntervalLength = 3. A third of length 3 half steps is a minor third.

(d) IntervalName = “sixth”. IntervalLength = 7 = 8 − 1. A sixth of length 8 half steps is a minor sixth. Thus this is a diminished sixth.

(e) IntervalName = “octave”. IntervalLength = 11 = 12 − 1. A perfect octave is of length 12 half steps. Thus this is a diminished octave.

Naturally, musicians do not use such an algorithm when coming up with names of intervals; instead, they simply know the names/qualities of all the white note intervals, then compare the given interval to one of these, adding a “diminished” or “augmented” as necessary. In other words, they have the chart below burned into their memory. A useful observation in this regard is that there is exactly one non-perfect white note fourth, and exactly one non-perfect white note fifth: the tritones containing B and F.

[Figure: chart of the white note fourths, fifths, thirds and sixths on the staff, each labeled by quality (P=perfect, M=major, m=minor); the single non-perfect fourth and the single non-perfect fifth are marked “TRITONE!”.]

2.4 Onset and offset

Recall what we originally set out to do: show how the 5 properties of notes are represented in CMN. We’ve spent an inordinate amount of time on pitch, and now I will proceed to give short shrift to loudness and timbre before moving on to onset and offset.

Loudness

The loudness of a note is indicated by dynamics markings. I will not attempt to improve upon Loy’s Table 2.5 (Loy, 28).

Table 2.5 CMN Indications for Dynamic Range

Name            Symbol   Meaning
Pianississimo   ppp      As soft as possible
Pianissimo      pp       Very soft
Piano           p        Soft
Mezzo piano     mp       Moderately soft
Mezzo forte     mf       Moderately loud
Forte           f        Loud
Fortissimo      ff       Very loud
Fortississimo   fff      As loud as possible

Timbre

Timbre is perhaps the slipperiest of the 5 sonic properties of a note, and we will tackle it in earnest later on. One way of describing the timbre of a note is to describe what instrument it sounds like, and this is accomplished in musical notation by simply declaring that a given score is intended for piano, or for violin, etc.

Further details about the timbre are given by notation that indicates in what manner a note should be played on a given instrument. For example, a score for violin will indicate whether a note should be played with vibrato, whether notes should be played legato or staccato, and what type of bowing should be used. All of these techniques affect the timbre of notes.

Onset, offset, duration

We introduce an abstract time variable $t$ to the score, measured in beats. The beginning of the score is set to time $t = 0$.

Using vertical bar lines, a score is divided up into measures, each of which has a well-defined (though possibly variable) number of beats. Thus at the beginning of any given measure we know how many beats have passed using some simple arithmetic.

The onset of an event in a score is the amount of time $t_1$ (in beats) between the beginning of the score and the beginning of the event; its offset is the time $t_2$ between the beginning of the score and the end of the event; the duration of the event is $t_2 - t_1$, the amount of time (in beats) the event lasts.

Note values and time signatures

The system of note values allows us to compare the durations of different types of notes.
The duration of a given note value above is always $\frac{1}{2}$ the duration of its neighbor to the left; thus two half notes make a whole note (in duration), two quarters make a half, etc.

Finally, a time signature is used both to specify the number $m$ of beats per measure, and the note value $n$ (2=half, 4=quarter, etc.) which will be assigned a duration of 1 beat. This is written in a ratio-like notation of the form $m/n$.

Only now, with all this notational equipment at our disposal, can we fully specify onsets, offsets and durations of notes in scores.

Example 2.4. There are two additional bits of notation in the score above that require explanation. The arc connecting the two E notes is called a tie; it indicates that the note is sounded only once and is “held over the bar”, for a total value of 2 quarter notes. Also, adding a dot after a note value, as we have after the last C, increases the note value by half of its undotted value; thus the last note has a value of 2 + 1 = 3 quarter notes.

1. How many beats per measure are there? Ans: 6 beats.

2. Which note value has a duration of 1 beat? Ans: the eighth note.

3. What is the duration (in beats) of the A in the third measure? Ans: 1 half = 2 quarters = 4 eighths = 4 beats.

4. What is the onset (in beats) of the A in the third measure? (Careful: the first note of the piece has onset 0.) Ans: 13 beats.

Classic 2 (Musica Ricercata, No. 1, by György Ligeti). Hungarian composer György Ligeti wrote Musica Ricercata between 1951 and 1953. Ligeti’s own description of the composition:

In 1951 I began to experiment with very simple structures of sonorities and rhythms as if to build up a new kind of music starting from nothing. My approach was frankly Cartesian, in that I regarded all the music I knew and loved as being, for my purposes, irrelevant and even invalid.

The word ‘ricercata’ is derived from the Italian verb ‘ricercare’, meaning “to search” or “to investigate”.
As such the title means something to the effect of “investigative music” or perhaps even “experimental music” (as in scientific experiment). In music a ricercar was a sort of fugue-precursor popular in the 16th and 17th centuries. Such pieces had an abstract or technical flavor; their aim was often to investigate or articulate musical consequences of a single theme using counterpoint. In Musikalisches Opfer Bach includes two ricercare based on the same “Royal theme” (“Thema Regium”) from Classic 1.

Ligeti’s Musica Ricercata contains 11 pieces. The first piece uses only 2 pitch classes (A and D), and in each subsequent piece the number of pitch classes is incremented by 1. Thus the last piece uses all 12 pitch classes, and is itself a ricercar in homage to Girolamo Frescobaldi’s (c. 1583-1643) ‘Ricercar cromatico’.

3 Frequency

We will now set about modeling sonic properties like frequency and pitch using mathematical language. This will provide an opportunity to introduce (or review) some basic mathematical notation and operations. Furthermore, we will meet two very important types of mathematical objects that you probably have not seen before: groups and topological spaces.

3.1 Frequency space

The physical property of a sound that is most strongly associated with pitch is frequency. We will say in more detail what frequency is later (when discussing timbre); for now, let us be content to say that pitched sound has a periodic (or repetitive) quality, and frequency measures the number of repetitions (or cycles) per second exhibited by the sound.

Some basic properties of frequency:

• The SI unit of measurement for frequency is the hertz (Hz), defined as 1 Hz = 1 cycle per second.

• As frequency increases, so does the perceived pitch.

• A frequency $f$, being a measure of the number of cycles per second, is a positive number, though not necessarily an integer: we can have $f = \frac{1}{2}$, $f = \sqrt{2}$, $f = \frac{1}{\pi}$, etc.
• Humans can hear frequencies ranging from around 20 Hz to 20,000 Hz (or 20 kHz).

Frequency space

A frequency $f$ is allowed to be any positive real number. The set of all real numbers is denoted $\mathbb{R}$. We define frequency space to be the set of all possible frequencies.

Definition 1. The set of all possible frequencies is called frequency space, denoted $X_{\mathrm{freq}}$. From the observation above, we see that

$$X_{\mathrm{freq}} = (0, \infty) = \{x \in \mathbb{R} : x > 0\} =: \mathbb{R}_{>0}$$

Comment 3.1. The three equalities in the definition above introduce some interval notation, set notation, and a naming convention, respectively.

1. Recall that the open interval $(a, b)$ is defined as the set of all $x$ with $a < x < b$: i.e., all numbers strictly between $a$ and $b$. Similarly the closed interval $[a, b]$ is defined as the set of all $x$ with $a \le x \le b$.

2. The second equality in the definition expresses this notion using set notation. In general we would write $(a, b) = \{x \in \mathbb{R} : a < x < b\}$, which reads: the set ($\{\dots\}$) of all elements $x$ in (‘$\in$’) the reals such that (‘$:$’) $x$ is greater than $a$ and less than $b$.

3. The last equality (‘$=:$’) is a naming notation that declares that the thing on the left will be denoted by the thing on the right. Similarly, ‘$:=$’ will be used to declare that the thing on the right will be denoted by the thing on the left.

Below you find the frequencies associated to a variety of A pitches. Our naming scheme for the pitch now includes a number indicating which octave the pitch lies in on the standard 88-key piano. The numbering scheme is calibrated on where C notes lie. Thus the lowest C on the piano is C1; as there is an A below this lowest C, that A is called A0.

We immediately observe that going up an octave corresponds to doubling the frequency. This is not a recent discovery.

3.2 The Pythagorean Legend

The history of the relation between intervals and frequencies goes back to an apocryphal story about Pythagoras (c. 580-500 BC). See Figure 1.

Here is a reading of Gaffurius’ woodcut comic.
Passing a blacksmith shop, Pythagoras notices that the sounds created by hammers of different weights striking the anvil sound nice together (consonant). When he measures the weights of the different hammers he notices that their ratios take the form of “simple” fractions: that is, fractions that can be expressed as ratios of small whole numbers. He then performs similar experiments using bells of varying dimensions, glasses of water of varying height, pipes of varying lengths, etc. Each 15 Figure 1: Woodcut from Theorica Musicae, by Franchinus Gaffurius time he observes a similar phenomenon: when the dimensions of two simultaneously sounding instruments form simple ratios (12/9=4/3, 8/6=4/3,etc.), the resulting interval is consonant.2 The story is in a sense the founding legend of the connection between music and mathematics. For Pythagoras and his followers, this was yet another striking example of mathematics governing nature. Historical consequences: earned music a place in the classical quadrivium along with arithmetic, geometry and astronomy; added fuel to the Pythagoreans’ already raging obsession with rational numbers; influenced thinking of future scientists (Aristotle, Ptolemy, Kepler), who sought examples of such ratios in astronomy–hence the so-called “music of the spheres”. Pythagoras’ conclusion, in slightly updated language, is as follows: all of these experiments, except the third, produce sounds whose frequencies form simple ratios; thus two sounds whose frequencies f1 , f2 can be expressed as a simple ratio (ratio of small integers) are consonant when sounded together. In particular, when ff12 = 21 , the interval produced is an octave. 3.3 The transposition group Tfreq Intervals as ratios Given two frequencies f1 , f2 ∈ Xfreq , the interval of their corresponding pitches is determined by the ratio f2 /f1 = c. How exactly c determines the interval is not immediately clear. For example, if f1 /f2 = 1.724, what is the corresponding interval? 
We will begin with a few simple observations. Fix f1 /f2 = c. 2 The third picture shows Pythagoras playing a stringed instrument called a monochord. The tensions of the strings here would form simple ratios. Since frequency is proportional to the square-root of the tension (assuming the length of the strings is held constant), this experiment would not produce the same phenomenon as the other three. 16 (1) Since f1 and f2 are positive, so is their ratio c. That is, c ∈ R>0 . (2) If f1 > f2 , then f1 f2 > 1, and thus c > 1. Likewise, if f1 < f2 , then c < 1. (3) To say ff12 = c is the same as saying f1 = cf2 . Thus multiplying a frequency by c corresponds to moving up (if c > 1) or down (if c < 1) by a certain interval. This operation is called a transposition (or shift, for short). Note that if c = 1, then the frequency f is unchanged; we call this the the trivial transposition (or trivial shift). Transposition group The last observation suggests that the ratios c we deal with are best understood as defining certain transpositions on the frequency space Xfreq . This motivates the following definition. Definition 2. Let Tfreq be the set of all possible ratios of frequencies. In set notation we have Tfreq = {f1 /f2 : f1 , f2 ∈ Xfreq }. We call Tfreq the transposition group of Xfreq . Comment 3.2. Since Tfreq is the set of all possible frequency ratios f1 /f2 , and since f1 and f2 are allowed to be any positive real number, it follows that the elements of Tfreq can be any positive real number; i.e., Tfreq = R>0 . Thus from now on, we will no longer think of an element c ∈ Tfreq as a ratio, but rather simply as a positive number that defines a certain transposition. When investigating how precisely an element c ∈ Tfreq acts as a transposition (or shift), it becomes clear that multiplication is the relevant operation. Let’s make this more explicit. Keep in mind for what follows that we have both Tfreq = R>0 and Xfreq = R>0 . 1. 
An element $c \in T_{\text{freq}}$ sends an arbitrary frequency $f \in X_{\text{freq}}$ to the new frequency $cf$, the product of $c$ and $f$:
\[ f \xrightarrow{\ \text{shift by } c\ } cf. \]

2. Transposing first by $d$ and then by $c$ corresponds to transposing by their product $c \cdot d$. Indeed, given any frequency $f$, we have
\[ f \xrightarrow{\ \text{shift by } d\ } d \cdot f \xrightarrow{\ \text{shift by } c\ } c \cdot (df) = (c \cdot d) f, \]
so the composite of the two arrows is the shift by $c \cdot d$.

3. To “undo” or “reverse” the transposition $c$, we simply transpose by its multiplicative inverse $\frac{1}{c}$:
\[ f \xrightarrow{\ \text{shift by } c\ } c \cdot f \xrightarrow{\ \text{shift by } 1/c\ } \tfrac{1}{c}(c \cdot f) = f, \]
so the composite of the two arrows is the trivial shift. In mathematics we say the transposition $1/c$ is the inverse of the transposition $c$.

Example 3.1. Fix a frequency $f$. Recall that we have already observed that the element $c = 2 \in T_{\text{freq}}$ corresponds to shifting up by an octave; i.e., the pitch $2f$ is exactly one octave higher than $f$. Let’s use the observations above to elaborate more on octave transpositions.

1. Following the second observation above, shifting $f$ up by 2 octaves yields the new frequency $2(2f) = 2^2 f$. More generally, shifting $f$ up $n$ octaves yields the new frequency $2^n f$. Thus for $n$ a positive integer, the element $2^n \in T_{\text{freq}}$ corresponds to shifting up $n$ octaves.

2. Following the third observation, shifting down by 1 octave corresponds to the element $\frac{1}{2} \in T_{\text{freq}}$. It then follows that shifting down by $n$ octaves corresponds to the element $(\frac{1}{2})^n \in T_{\text{freq}}$.

3. Recall that $\frac{1}{2} = 2^{-1}$, and thus that $(\frac{1}{2})^n = 2^{-n}$. We can summarize the last two observations as follows: let $n$ be any positive integer; then the element $2^n \in T_{\text{freq}}$ is transposition up by $n$ octaves, and the element $2^{-n} \in T_{\text{freq}}$ is transposition down by $n$ octaves.

Harmonic series

So far we know how the elements of the form $c = 1$ and $c = 2^m$ act as transpositions: the first is the trivial shift, the second shifts up or down by a number of octaves. What about other elements $c \in \mathbb{R}_{>0}$? We approach this question by first looking at positive integer values of $c$; that is, $c = 1, 2, 3, \dots$.
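The three observations about transpositions (a shift is a multiplication, successive shifts multiply, and $1/c$ undoes $c$) can be checked with exact arithmetic. The following is only a minimal sketch: the helper names `transpose` and `octaves` are hypothetical, and the starting frequency of 110 Hz is chosen purely for illustration.

```python
from fractions import Fraction

def transpose(f, c):
    """Shift the frequency f by the positive ratio c (illustrative helper)."""
    if f <= 0 or c <= 0:
        raise ValueError("frequencies and ratios must be positive")
    return c * f

def octaves(n):
    """Ratio 2^n for a shift of n octaves; n may be negative."""
    return Fraction(2) ** n

f = Fraction(110)  # an arbitrary starting frequency, in Hz

# Two successive octave shifts agree with a single shift by 2^2.
two_steps = transpose(transpose(f, 2), 2)
one_step = transpose(f, octaves(2))

# Shifting by c and then by 1/c is the trivial shift.
c = Fraction(3, 2)
undone = transpose(transpose(f, c), 1 / c)

print(two_steps, one_step, undone)
```

Exact fractions are used instead of floating-point numbers so that the equalities hold exactly rather than up to rounding.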
If we start with a fixed frequency $f$ and begin transposing by these values of $c$ we obtain what is called the harmonic series on $f$:
\[ f, 2f, 3f, 4f, \dots \]
Here are the approximate pitches associated to 12 terms of the harmonic series starting on $f = 110$ Hz:

[Figure: staff notation of the first 12 terms of the harmonic series on $f = 110$ Hz]

To illustrate that the above is only an approximation, note that the pitch $3f$ would have frequency 330 Hz; however, the E4 written on the staff in fact has frequency around 329.63 Hz. Why? Short answer: our tuning system is not based on the harmonic series!

Interval arithmetic

This staff pitch approximation of the harmonic series provides a means of associating familiar interval transpositions to an element $c \in T_{\text{freq}}$ when $c$ is a positive rational number: i.e., $c = \frac{m}{n}$, where $m$ and $n$ are positive integers.

1. The interval on the staff between $2f$ and $3f$ is a perfect fifth. This tells us that $c = \frac{3f}{2f} = \frac{3}{2}$ corresponds roughly to transposing up by a perfect fifth. We will call the interval determined by $c = \frac{3}{2}$ a Pythagorean fifth.

2. The interval on the staff between $3f$ and $4f$ is a perfect fourth. This tells us that $c = \frac{4f}{3f} = \frac{4}{3}$ corresponds roughly to transposing up by a perfect fourth. We will call the interval determined by $c = \frac{4}{3}$ a Pythagorean fourth.

3. Start with any frequency $f$. If we go up a Pythagorean fifth we get the frequency $f' = \frac{3}{2} f$. If we then go down a Pythagorean fourth, we get the frequency $f'' = \frac{3}{4} f' = \frac{3}{4}(\frac{3}{2} f) = \frac{9}{8} f$. Since this process corresponds roughly to going up a perfect fifth and then down a perfect fourth, we see that the ratio $c = \frac{9}{8}$ corresponds roughly to a major second!

Group structure of $T_{\text{freq}}$

In coming to understand $T_{\text{freq}} = \mathbb{R}_{>0}$, we have seen how important a role multiplication has played. As it turns out, the set $\mathbb{R}_{>0}$ taken with the multiplication operation is an important example of what is called a group in mathematics.

Definition 3.
A group is a pair $(G, \cdot)$, where $G$ is a set, and $\cdot$ is an operation which, given any two elements $g_1, g_2 \in G$, outputs a third element $h = g_1 \cdot g_2 \in G$, and which further satisfies the following axioms:

(i) The operation $\cdot$ is associative: i.e., $g_1 \cdot (g_2 \cdot g_3) = (g_1 \cdot g_2) \cdot g_3$ for all $g_1, g_2, g_3 \in G$.

(ii) There is an identity element $e \in G$ satisfying $e \cdot g = g$ and $g \cdot e = g$ for all $g \in G$.

(iii) Every $g \in G$ has an inverse in $G$: that is, there is an element $h$ such that $g \cdot h = h \cdot g = e$, the identity element. We write $h = g^{-1}$ in this case.

Let’s carefully show that $T_{\text{freq}}$ is a group.

1. We must first state explicitly what the underlying set is, and what the operation is. In this case the set is $T_{\text{freq}} = \mathbb{R}_{>0}$, the set of all positive real numbers, and the operation is simply real number multiplication.

2. Next we must show our operation is associative. This is immediate in our case as we know already that real multiplication is associative: $r(st) = (rs)t$ for any real numbers $r, s, t$.

3. Next we must identify an identity element $e$ in our set, and show it satisfies the required property. In our case we take $e = 1 \in T_{\text{freq}}$. For any other $c \in T_{\text{freq}}$ we have $1 \cdot c = c \cdot 1 = c$, again by familiar properties of real number multiplication.

4. Lastly, given any $c \in T_{\text{freq}}$ we must show there is an inverse element $d \in T_{\text{freq}}$ satisfying $c \cdot d = d \cdot c = 1$. We take $d = \frac{1}{c}$. This is indeed an element in $T_{\text{freq}}$, since if $c > 0$, then so is $\frac{1}{c}$. Once again, familiar properties of multiplication imply $c \cdot \frac{1}{c} = \frac{1}{c} \cdot c = 1$.

This all looks deceptively simple, mainly because the underlying set and operation in this example are both very familiar to us. However, the notion of a group is very general, and examples can be much more exotic than this. Here’s how:

1. The underlying set $G$ need not be a set of numbers. It may be a set of functions, or of letters, or of anything whatsoever. Furthermore, the underlying set may be finite or infinite.

2.
The group operation may have nothing to do with operations familiar to you from arithmetic. As long as the operation is well-defined and satisfies the three axioms, we have a group.

3. In particular, though the group operation must be associative, it is not required to be commutative: that is, we can have groups $(G, \cdot)$ such that $g_1 \cdot g_2$ is not necessarily equal to $g_2 \cdot g_1$ for all elements $g_1, g_2$ in $G$!

Example 3.2. Let $G = \{x, y\}$ (a set with two elements), and define an operation $*$ on $G$ as follows:
\[ x * x = x, \qquad x * y = y * x = y, \qquad y * y = x. \]
Show that $(G, *)$ is a group. (Note: as you see, we don’t always have to use $\cdot$ to denote the group operation.)

Solution: to show the operation is associative, one has to show that a number of different equalities of the form $a * (b * c) = (a * b) * c$ are true. As an example observe that $x * (y * y) = x * x = x$, and $(x * y) * y = y * y = x$; thus $x * (y * y) = (x * y) * y$. Once we know the operation is associative, we need to identify the identity and inverses. In this case, we declare $e = x$. This satisfies the identity axiom as $x * x = x$ and $x * y = y * x = y$. Finally, since $x * x = x = e$ and $y * y = x = e$, we see that all elements are their own inverses! Thus we have $x^{-1} = x$ and $y^{-1} = y$.

Example 3.3. Let $G = \mathbb{R}_{>0}$ and let $+$ denote the usual operation of real number addition. Show that $(\mathbb{R}_{>0}, +)$ is not a group. Specify exactly which axioms are satisfied, and which axioms fail.

Solution: Addition does in fact define an operation on $\mathbb{R}_{>0}$: given any positive $x, y \in \mathbb{R}_{>0}$, their sum $x + y$ is still positive, and thus lies in $\mathbb{R}_{>0}$. Furthermore, we know this operation is associative, since addition is in fact associative on all of $\mathbb{R}$. However, I claim there is no identity element in $\mathbb{R}_{>0}$ with respect to addition. Indeed suppose there were an $e \in \mathbb{R}_{>0}$ satisfying the identity axiom. Then in particular we would have $e + 1 = 1$; but this implies that $e = 0$, which is a contradiction since $0 \notin \mathbb{R}_{>0}$.
Once we know there is no identity, there is no need to look for inverses, since this notion makes use of an identity element in its definition.

Example 3.4. Now let $G = \mathbb{R}$, the set of all real numbers. Show that $(\mathbb{R}, +)$ is a group, but $(\mathbb{R}, \cdot)$ is not a group, where $+$ and $\cdot$ denote real number addition and multiplication, respectively.

Solution: consider first $(\mathbb{R}, +)$. As noted above, we know already that addition is associative on $\mathbb{R}$. We declare the identity element to be $e = 0 \in \mathbb{R}$. This satisfies the identity axiom as $0 + r = r + 0 = r$ for any $r \in \mathbb{R}$. Lastly, given any $r \in \mathbb{R}$, its inverse with respect to $+$ is $-r$, since $r + (-r) = 0 = e$. (Note: common decency prevents us from using the group notation for inverses and writing in this case $r^{-1} = -r$.)

Now consider $(\mathbb{R}, \cdot)$. Multiplication is indeed an associative operation on $\mathbb{R}$, and furthermore we can set $e = 1$ as the group identity element. In fact, we are forced to do so: if $e \cdot r = r$ for all $r \in \mathbb{R}$, then in particular $e \cdot 1 = 1$, which implies $e = 1$. Furthermore, given any nonzero $r \in \mathbb{R}$, we can define its inverse as $r^{-1} = \frac{1}{r}$. However, we cannot forget that $0 \in \mathbb{R}$, and 0 has no inverse with respect to multiplication. Indeed, we have $0 \cdot r = 0$ for all $r$, so there can be no $r$ with $0 \cdot r = 1 = e$.

3.4 Just tunings and equal temperament

Let $f \in X_{\text{freq}}$ correspond to a particular instance of C, and let $2f$ be its transposition up one octave. Between $f$ and $2f$ lie infinitely many frequencies in $X_{\text{freq}}$, and yet our tuning system, 12-tone equal temperament, makes use of only 12 of these: the 12 white and black notes of the keyboard starting with the first C and ending with B. What exactly are the corresponding frequencies of these pitches, and how did we decide upon them?

The 12-tone equal-tempered system, though itself not strictly based on the harmonic series on $f$ ($f, 2f, 3f, \dots$), is the direct descendant of tuning systems that were based on this series; we will call such systems just tunings.
After rigorously defining the 12 pitches appearing in our (unjust) equal-tempered system, we will compare this system with one of its direct ancestors, the Pythagorean tuning system.

12-tone equal temperament

We continue to let $f \in X_{\text{freq}}$ correspond to a C pitch. The pitches of equal temperament are generated using a single half step interval that divides the octave from $f$ to $2f$ into 12 equal subintervals. What ratio $c$ corresponds to this half step interval? We are tempted to take the octave interval ratio 2 and divide it by 12, yielding $2/12 = 1/6$. However this breaks 2 into 12 equal parts in an additive manner,
\[ 2 = \underbrace{\tfrac{1}{6} + \tfrac{1}{6} + \cdots + \tfrac{1}{6}}_{12 \text{ times}}, \]
and we’ve seen that multiplication is the relevant operation when dealing with frequency space. As we will see below we seek instead a number $c$ such that
\[ 2 = \underbrace{c \cdot c \cdot c \cdots c}_{12 \text{ times}}. \]

Thinking in terms of our transposition group gives a clearer perspective of things. Raising $f$ by a fixed interval step by step corresponds to fixing a $c > 1$ in $T_{\text{freq}}$ and successively multiplying $f$ by $c$. Thus the first step would be $cf$, the second step would be $c(cf) = c^2 f$, and in general the $n$-th step would be $c^n f$. To say that this fixed interval $c$ divides the octave into 12 steps means that the 12th step $c^{12} f$ brings us to $2f$: i.e., we have $c^{12} f = 2f$. Canceling $f$ on both sides, we conclude that $c^{12} = 2$. Lastly, we solve this equation for $c$ to conclude that
\[ c = 2^{1/12} = \sqrt[12]{2} \]
represents transposition up by an equal-tempered half step!

Equal-tempered intervals

Once we know that the equal-tempered half step corresponds to $c = 2^{1/12}$, we can easily represent any other equal-tempered interval in terms of $c$ by computing the interval’s length in half steps: an interval of length $n$ half steps corresponds to
\[ c^n = (2^{1/12})^n = 2^{n/12} \]
(using an old exponentiation rule).
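The rule $c^n = 2^{n/12}$ is easy to evaluate numerically. The sketch below is illustrative only; the names `HALF_STEP` and `et_ratio` are hypothetical, and the results are floating-point approximations of the exact values.

```python
HALF_STEP = 2 ** (1 / 12)  # equal-tempered half step, the 12th root of 2

def et_ratio(n):
    """Frequency ratio of an equal-tempered interval n half steps wide."""
    return 2 ** (n / 12)

# Interval names paired with their lengths in half steps
# (M = major, m = minor, P = perfect).
intervals = [("m2", 1), ("M2", 2), ("m3", 3), ("M3", 4),
             ("P4", 5), ("Tritone", 6), ("P5", 7)]

for name, n in intervals:
    print(f"{name:8s} {n:2d}  {et_ratio(n):.3f}")
```

Multiplying `HALF_STEP` by itself 12 times recovers 2 up to rounding error, which is the defining property $c^{12} = 2$.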
Thus we easily derive the following table (M = major, m = minor, P = perfect):

| Interval | Half step length | Exact value | Decimal approx. |
| m2 | 1 | $2^{1/12}$ | 1.059 |
| M2 | 2 | $2^{2/12}$ | 1.122 |
| m3 | 3 | $2^{3/12}$ | 1.189 |
| M3 | 4 | $2^{4/12}$ | 1.260 |
| P4 | 5 | $2^{5/12}$ | 1.335 |
| Tritone | 6 | $2^{6/12}$ | 1.414 |
| P5 | 7 | $2^{7/12}$ | 1.498 |

Sequence of fifths

There are other intervals besides the half step that can generate all 12 pitches of the equal-tempered system. One example is the perfect fifth $c_5 = 2^{7/12}$. For what follows I will fix a frequency $f$ corresponding to C2, though the procedure I describe works with any starting pitch. Beginning with $f$ we transpose successively by a perfect fifth and, if necessary, reduce by an octave to get a pitch between $f$ and $2f$. The table below illustrates the procedure for the first few pitches in this sequence.

| Term | Frequency | After octave adjustment | Pitch name |
| $f_0$ | $f$ | $f$ | C2 |
| $f_1$ | $2^{7/12} f_0 = 2^{7/12} f$ | $2^{7/12} f$ | G2 |
| $f_2$ | $2^{7/12} f_1 = 2^{14/12} f$ | $2^{2/12} f$ | D2 |
| $f_3$ | $2^{7/12} f_2 = 2^{9/12} f$ | $2^{9/12} f$ | A2 |
| $f_4$ | $2^{7/12} f_3 = 2^{16/12} f$ | $2^{4/12} f$ | E2 |

The resulting sequence $f_0, f_1, f_2, \dots$ is called a sequence of fifths: as the last column shows, the pitch names jump up by fifths. The procedure we described adjusts by octave at each step if necessary. Alternatively, starting at our $f$ corresponding to C2, we could simply repeatedly transpose up by a fifth to generate a 12-note sequence, and then adjust by the correct number of octaves as shown in the figure below.

[Figure: staff notation of the 12-note sequence of fifths starting on C2, followed by the same pitches adjusted by octave]

Here the last staff contains the pitches adjusted appropriately by octave. Note the enharmonic relation between F♯ and G♭ in measures 7 and 8.

Pythagorean tuning

In Gaffurius’ woodcut depiction of Pythagoras we see in each experiment the sequence (6, 8, 9, 12). The sequence gives rise to the frequency ratios
\[ 8/6 = 4/3, \qquad 9/6 = 3/2, \qquad 12/6 = 2. \]
Recall that the intervals corresponding to $4/3$ and $3/2$ are called the Pythagorean fourth and fifth, respectively, and that these are roughly equal to the equal-tempered perfect fourth and fifth.
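Two claims made so far lend themselves to a quick numerical check: that the sequence of equal-tempered fifths visits all 12 pitches, and that the Pythagorean fourth and fifth are close to their equal-tempered counterparts. The sketch below is a minimal illustration; the function name `fifths_exponents` and the ASCII pitch spellings in `NAMES` are assumptions made for the example.

```python
def fifths_exponents(num_terms):
    """Exponents k with f_n = 2^(k/12) * f after octave adjustment.

    Going up a fifth adds 7 to the exponent; reducing by an octave
    subtracts 12, so the adjusted exponent is 7n modulo 12.
    """
    return [(7 * n) % 12 for n in range(num_terms)]

# Pitch names indexed by number of half steps above C.
NAMES = ["C", "Db", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]

exponents = fifths_exponents(12)
print(exponents[:5])                 # first five terms of the sequence
print([NAMES[k] for k in exponents])

# Pythagorean versus equal-tempered fourth and fifth.
fourth_gap = abs(4 / 3 - 2 ** (5 / 12))
fifth_gap = abs(3 / 2 - 2 ** (7 / 12))
print(fourth_gap, fifth_gap)
```

The first five exponents reproduce the octave-adjusted column of the sequence-of-fifths table, and the two gaps are both smaller than 0.002.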
The Pythagorean tuning system can be generated by these two intervals by successively transposing up by a Pythagorean fourth or fifth, and then adjusting by octave as necessary, a process similar to the sequence-of-fifths procedure described above. In fact, since going up a fourth is the same as going down a fifth after octave adjustment, we can generate the Pythagorean system via a sequence of fifths going up and down from our starting frequency $f$.

Fix the frequency $f$ corresponding to C4. The table below illustrates how the Pythagorean tuning system is generated by transposing up and down by a Pythagorean fifth ($c = \frac{3}{2}$). Reading up (resp., down) the table corresponds to transposing up (resp., down) by Pythagorean fifths.

| Term | Frequency | After octave adjustment | Approx. pitch name |
| $f_6$ | $\frac{3}{2} f_5 = \frac{729}{256} f$ | $\frac{729}{512} f$ | F♯4 |
| $f_5$ | $\frac{3}{2} f_4 = \frac{243}{128} f$ | $\frac{243}{128} f$ | B4 |
| $f_4$ | $\frac{3}{2} f_3 = \frac{81}{32} f$ | $\frac{81}{64} f$ | E4 |
| $f_3$ | $\frac{3}{2} f_2 = \frac{27}{16} f$ | $\frac{27}{16} f$ | A4 |
| $f_2$ | $\frac{3}{2} f_1 = \frac{9}{4} f$ | $\frac{9}{8} f$ | D4 |
| $f_1$ | $\frac{3}{2} f$ | $\frac{3}{2} f$ | G4 |
| $f_0$ | $f$ | $f$ | C4 |
| $f_{-1}$ | $\frac{2}{3} f$ | $\frac{4}{3} f$ | F4 |
| $f_{-2}$ | $\frac{2}{3} f_{-1} = \frac{8}{9} f$ | $\frac{16}{9} f$ | B♭4 |
| $f_{-3}$ | $\frac{2}{3} f_{-2} = \frac{32}{27} f$ | $\frac{32}{27} f$ | E♭4 |
| $f_{-4}$ | $\frac{2}{3} f_{-3} = \frac{64}{81} f$ | $\frac{128}{81} f$ | A♭4 |
| $f_{-5}$ | $\frac{2}{3} f_{-4} = \frac{256}{243} f$ | $\frac{256}{243} f$ | D♭4 |
| $f_{-6}$ | $\frac{2}{3} f_{-5} = \frac{512}{729} f$ | $\frac{1024}{729} f$ | G♭4 |

(In the Frequency column, each $f_n$ on the right-hand side denotes the octave-adjusted value of the neighboring row.)

Some peculiarities about the Pythagorean system:

1. The Pythagorean half step is the interval between the Pythagorean E and F. This corresponds to $c = (4/3)/(81/64) = 256/243$. Unlike the equal-tempered half step, we cannot obtain the other intervals by taking successive transpositions by $c$. For example, we have
\[ c^2 < 9/8 < c^3, \]
showing in particular that two Pythagorean half steps do not make a Pythagorean whole step.

2. The table above contains in fact 13 distinct pitches! Although F♯ and G♭ are the same pitch in equal-tempered tuning, the frequency $f_6 = \frac{3^6}{2^9} f$ is slightly higher than our F♯, and $f_{-6} = \frac{2^{10}}{3^6} f$ is slightly lower than our G♭. The ratio $f_6/f_{-6} = 3^{12}/2^{19}$ is very close but not equal to 1. This ratio is called the Pythagorean comma.

3.
Whereas a sequence of equal-tempered fifths starting at G♭ would take us back exactly to G♭ after 12 steps, whence “circle of fifths”, the sequence of Pythagorean fifths takes us from $f_{-6}$ to $f_6$, which is slightly higher than $f_{-6}$. The sequence of Pythagorean fifths thus fails to close, whence “spiral of Pythagorean fifths”, and this failure is measured by the Pythagorean comma.

We saw that the sequence of Pythagorean fifths did not close after 12 steps, but what about after 13 steps, or 100 steps? In fact the sequence will never close. To state this more clearly, let $c = \frac{3}{2}$ and fix any frequency $f$. Then no two frequencies in the Pythagorean sequence of fifths
\[ f, cf, c^2 f, c^3 f, \dots \]
are the same, even after adjusting octaves.

Here is a proof of this fact. Suppose by contradiction that two elements in the sequence, say $c^n f$ and $c^m f$ with $n < m$, did in fact differ by a number of octaves. Then we would have
\[ c^n f = 2^{-r} c^m f, \]
where $r$ is the number of octaves between them. Canceling $f$ and substituting $c = \frac{3}{2}$, we obtain
\[ \frac{3^n}{2^n} = 2^{-r} \frac{3^m}{2^m}. \]
Now clear denominators to obtain
\[ 2^{m+r-n} = 3^{m-n}. \]
This last equality is a contradiction! Why? Since $m > n$, the right-hand side is a power of 3 greater than 1, and an integer greater than 1 cannot simultaneously be a power of 2 (the left-hand side) and a power of 3 (the right-hand side). This is a consequence of the fundamental theorem of arithmetic, which we will discuss below. Since we’ve reached a contradiction our original assumption must be false. Thus no two frequencies in the Pythagorean sequence of fifths differ by a number of octaves; the sequence never closes!

The unjustness of equal temperament

The Pythagorean system is a just one: its intervals all correspond to rational numbers, ratios $m/n$ where both $m$ and $n$ are integers. This is a consequence of this system being generated by the interval $3/2$ taken from the harmonic series. The equal-tempered system was the result of efforts to iron out some of the peculiarities of the Pythagorean system and its descendants.
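The peculiarities of the Pythagorean system mentioned here can be made concrete with exact rational arithmetic. The following is a sketch under stated assumptions: the helper names `reduce_to_octave` and `pythagorean_ratio` are hypothetical, and the finite search at the end only illustrates the never-closing property for the first 200 fifths; it does not replace the proof.

```python
from fractions import Fraction

FIFTH = Fraction(3, 2)  # the Pythagorean fifth

def reduce_to_octave(ratio):
    """Multiply or divide by 2 until 1 <= ratio < 2."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

def pythagorean_ratio(n):
    """Octave-adjusted ratio f_n / f after n fifths (n < 0 means down)."""
    return reduce_to_octave(FIFTH ** n)

# The 13 terms f_-6, ..., f_6 of the Pythagorean tuning table.
ratios = {n: pythagorean_ratio(n) for n in range(-6, 7)}

half_step = ratios[-1] / ratios[4]   # Pythagorean F over Pythagorean E
comma = ratios[6] / ratios[-6]       # F-sharp over G-flat

print(half_step, comma, float(comma))

# No fifth-multiple among the first 200 returns to the starting pitch.
closes = any(pythagorean_ratio(n) == 1 for n in range(1, 201))
print(closes)
```

Working with `Fraction` keeps every ratio exact, so the comparison of the comma with 1 is not affected by rounding.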
Though equal temperament was successful in this regard, justness was lost in the process. In fact the only intervals in equal temperament that are rational numbers are octaves; the rest, starting with the half step $2^{1/12} = \sqrt[12]{2}$ upon which the system is founded, are irrational (that is, not rational).

Classic 3 (The irrationality of $\sqrt[12]{2}$, by Euclid). Euclid has a lot of classics. You should also check out The infinitude of primes.

The argument I describe below is more often used to demonstrate that $\sqrt{2}$ is irrational, but the two proofs are very similar. Both rely on Euclid’s fundamental theorem of arithmetic, which I will describe first, along with some terminology.

A prime number is a positive integer $p \in \mathbb{Z}$ that has exactly two positive factors, namely 1 and $p$. Some examples: 1 is not prime, as it has exactly one factor; 2 is prime; 3 is prime; 4 is not prime as 1, 2 and 4 are all factors. Equivalently an integer $n > 1$ is not prime (called composite) if and only if we can factor $n = a \cdot b$ with $a > 1$ and $b > 1$. A useful property of prime numbers is that if a prime $p$ divides a product of integers $a \cdot b$, then it divides $a$ or $b$; in particular, if a prime $p$ divides a power of an integer $a^r$, then $p$ divides $a$.

The fundamental theorem of arithmetic (FTA) tells us that prime numbers are the building blocks of all integers. Formally, given any integer $n > 1$, we can: (i) decompose $n = p_1 \cdot p_2 \cdots p_k$ as a product of prime numbers $p_i$, and furthermore, (ii) this decomposition is unique. This theorem is also known as the unique factorization theorem.

Let’s look at a simple example to see how this works, and to understand what exactly is meant by “unique”. The integer $n = 12$ can be written as a product of primes in a number of ways: e.g., we have
\[ 12 = 2 \cdot 2 \cdot 3, \qquad 12 = 2 \cdot 3 \cdot 2, \qquad 12 = 2^2 \cdot 3. \]
Technically the last example expresses 12 as a product of powers of primes, but this is the standard way we write the prime factorization of 12.
These three factorizations are all different in some sense, but each factorization draws from the same set of primes $\{2, 3\}$, and furthermore, the number of times each prime of this set “appears” is the same: 2 appears twice, 3 appears once. This then is what we mean by uniqueness: given any two prime factorizations of an integer $n$, the factorizations will contain the same set of primes, and each prime of this set will appear the same number of times.

We are finally ready to prove that $c = 2^{1/12} = \sqrt[12]{2}$ is irrational: that is, we cannot write $c = \frac{m}{n}$ with $m$ and $n$ integers. Indeed suppose we could. Writing $c = 2^{1/12}$, we would have
\[ 2^{1/12} = \frac{m}{n} \tag{1} \]
for two positive integers $m$ and $n$. Canceling factors if needed in the ratio $m/n$, we can assume that $m$ and $n$ share no common factors. This is very important later on. Now raising both sides of (1) to the 12th power, we see that
\[ 2 = \frac{m^{12}}{n^{12}}, \tag{2} \]
which implies
\[ 2n^{12} = m^{12}. \tag{3} \]
Since 2 divides the left-hand side of (3), it must also divide the right-hand side. So 2 divides $m^{12}$. Since 2 is prime, this implies that 2 divides $m$. Now this implies that in fact $2^{12}$ divides $m^{12}$, since each of the twelve $m$’s in this product has a 2 in it. Now we see that the right-hand side of (3) is divisible by $2^{12}$, thus so is the left-hand side; $2^{12}$ divides $2n^{12}$. However, for this many 2’s to appear in the factorization of $2n^{12}$, we must have 2 appearing in the factorization of $n$; but then $m$ and $n$ share 2 as a common factor, which is a contradiction! Thus we cannot write $c = 2^{1/12}$ as a ratio of two integers, which means $c$ is irrational.

Classic 4 (Eleven Intrusions, by Harry Partch). Harry Partch was a 20th-century American composer who invented and composed with a just scale containing 43 distinct frequencies within the octave. The first two pieces in this cycle are scored for Bass Marimba and the Harmonic Canon, instruments invented by Partch himself.

No. 1, “Study on Olympos’ Pentatonic”. Just intonation scale.
| Frequency | $f$ | $\frac{9}{8} f$ | $\frac{6}{5} f$ | $\frac{3}{2} f$ | $\frac{8}{5} f$ | $2f$ |
| Pitch | G | A | B♭ | D | E♭ | G |

No. 2, “Study on Archytas’ Enharmonic”. A scale built of two identical tetrachords. The pitch G↑ corresponding to $\frac{28}{27}$ is in fact not included in Partch’s 43-note scale on G.

| Frequency | $f$ | $\frac{28}{27} f$ | $\frac{16}{15} f$ | $\frac{4}{3} f$ | $\frac{3}{2} f$ | $\frac{14}{9} f$ | $\frac{8}{5} f$ | $2f$ |
| Pitch | G | G↑ | A♭ | C | D | D↑ | E♭ | G |

No. 5, “The Waterfall”, scored for Adapted Guitar II and Diamond Marimba (more Partch inventions).

4 Pitch space

The multiplicative picture of frequency space and its transposition group, wherein intervals are identified with ratios of frequencies, and transposition is identified with multiplication, is forced upon us by the physical nature of sound. However, the language we use when speaking of pitch is additive in nature. Earlier we defined the interval between two pitches in terms of their difference, and we think of transposing a pitch up by an interval as addition of a certain number of half steps.

Recall that our notion of pitch is itself an abstraction of the physical property of frequency: whereas frequency is a measurable property of sound waves, pitch is a measure of how high or low we perceive a sound to be. Since our movement from the world of frequency to the world of pitch is already a movement away from the pure physical properties of sound, there is no harm in also freeing ourselves from the multiplicative picture of frequency space the world imposes on us, and choosing instead an additive picture of pitch space that suits the language we already use. Mathematically speaking, our switch of pictures will correspond to replacing the multiplicative group $(\mathbb{R}_{>0}, \cdot)$ with the additive group $(\mathbb{R}, +)$.

4.1 Pitch space

We will represent pitches as points on the real line. The pitches of 12-tone equal temperament will be embedded in the real line by declaring that the pitch C4 (“middle C”) will correspond to the real number 0 (the “middle” of the real line), and that moving up or down by $n$ half steps from C4 corresponds to taking 0 and adding or subtracting $n$.

Definition 4.
We define pitch space to be the set $X_{\text{pitch}}$ of all possible pitches and identify this with the set of all real numbers: $X_{\text{pitch}} = \mathbb{R}$. The pitches of our 12-tone equal-tempered system are embedded in $\mathbb{R}$ by (i) setting 0 equal to C4, and (ii) declaring transposition up by a half step to correspond to addition by 1.

The formal definition of $X_{\text{pitch}}$ gives us the following new picture of pitch space.

[Figure: the real line ℝ with the equal-tempered pitches G3, A♭3, A3, B♭3, B3, C4, C♯4, D4, D♯4, E4, F4 marked at the integers −5, −4, ..., 5 (ℤ inside ℝ)]

As the diagram illustrates, the pitches of 12-tone equal temperament are embedded as the integers $\mathbb{Z}$ inside of $\mathbb{R}$. (Recall that $\mathbb{Z} = \{\dots, -3, -2, -1, 0, 1, 2, \dots\}$.) The real numbers lying between successive integers correspond to pitches that lie outside of our tuning system. It is instructive to compare our pitch space picture of equal temperament with the corresponding frequency space one:

[Figure: the pitch-space picture above (ℤ inside ℝ), shown over the corresponding frequency-space picture, in which G3, C4, F4 and B♭4 are marked on ℝ>0 with axis ticks at 200, 300, 400 and 500]

4.2 The transposition group $T_{\text{pitch}}$

Given two equal-tempered pitches $P_1 < P_2$ in $X_{\text{pitch}}$, their difference $P_2 - P_1$ tells us the length in half steps of the interval between them. (Observe that since $P_1, P_2$ are equal-tempered, they will be integers, and thus so will their difference.) We generalize this notion to any two pitches in $X_{\text{pitch}}$.

Definition 5. An interval in $X_{\text{pitch}}$ is a set $\{P, Q\}$, where $P, Q \in X_{\text{pitch}}$. The length (in half steps) of the interval $\{P, Q\}$ is defined as $|P - Q|$.

Example 4.1.

1. Let $P = \frac{5}{2}$ and $Q = \frac{13}{2}$. Then the interval $\{P, Q\}$ has length $|P - Q| = |-\frac{8}{2}| = 4$ half steps. Thus the interval is a major third lying wholly outside of our tuning system!

2. Given any pitch $P$, the interval $\{P, P\}$ has length $|P - P| = 0$; this is the unison interval.

3. Let $P = 2 - \sqrt{2}$ and $Q = 2 + \sqrt{3}$. Then the interval $\{P, Q\}$ has length $|2 - \sqrt{2} - (2 + \sqrt{3})| = |-\sqrt{2} - \sqrt{3}| = \sqrt{2} + \sqrt{3}$ half steps.

Given any real number $t \in \mathbb{R}$, the operation $P \mapsto t + P$ defines a transposition operation on $X_{\text{pitch}}$.
If $t \geq 0$ this is transposition up by $t$ half steps; if $t \leq 0$ this is transposition down by $|t|$ half steps. This motivates the following definition.

Definition 6. The transposition group of $X_{\text{pitch}}$ is defined as the set $T_{\text{pitch}} = \mathbb{R}$. An element $t \in T_{\text{pitch}}$ acts as a transposition by sending a pitch $P$ to $t + P$.

Comment 4.1. As its name indicates, $T_{\text{pitch}}$ is indeed a group: namely, the group $(\mathbb{R}, +)$ of real numbers with group operation defined to be addition.

Arithmetic in $X_{\text{pitch}}$

Switching from $X_{\text{freq}}$ to $X_{\text{pitch}}$ simplifies our interval arithmetic immensely. Transpositions and interval lengths are now computed with additions and subtractions. This is more in line with how we naturally speak about pitches and intervals. Furthermore, transposing by an equal-tempered half step now corresponds to the supremely simple operation of adding the integer 1 to a pitch, as opposed to the more complicated operation of multiplying a frequency by the irrational number $2^{1/12}$.

Example 4.2. Find the $t \in T_{\text{pitch}}$ that corresponds to the following transpositions.

1. Down by a major third. Solution: $t = -4$.

2. Up by a major 7th. Solution: $t = 11$.

3. Up by a tritone. Solution: $t = 6$.

4.3 Comparing the two pictures

Working in pitch space is easier than working in frequency space: so much easier that you may be asking yourself whether we are cheating somehow. A more nuanced question: do we lose any information when we switch from our frequency picture to our pitch picture? The answer is no, as far as interval information is concerned, and we have a precise mathematical way of showing this.

For what follows let’s first drop the (ontological) distinction between frequency and pitch, referring to both simply as pitch. Then the main difference between $X_{\text{freq}}$ and $T_{\text{freq}}$, on the one hand, and $X_{\text{pitch}}$ and $T_{\text{pitch}}$ on the other, is how exactly the notion of pitch is modeled. The former models pitch multiplicatively using $\mathbb{R}_{>0}$; the latter does so additively using $\mathbb{R}$.
To convince ourselves that both models of pitch provide the same information as far as intervals are concerned, we need to show there is a way of translating (in a linguistic sense) from one model to the other, and back again. We will do so mathematically using the function $\log_2(x)$.

First consider our two different transposition groups $T_{\text{freq}} = (\mathbb{R}_{>0}, \cdot)$ and $T_{\text{pitch}} = (\mathbb{R}, +)$, where here I emphasize the group structure of both of these objects. We define a function $\phi : T_{\text{freq}} \to T_{\text{pitch}}$ by setting
\[ \phi(c) = 12 \log_2(c) \]
for any $c \in T_{\text{freq}}$.

Example 4.3.

1. Take $c = 2 \in T_{\text{freq}}$, the $T_{\text{freq}}$ representation of transposition up by an octave. Then $\phi(2) = 12 \log_2(2) = 12$. Thus $\phi$ sends $2 \in T_{\text{freq}}$ to $12 \in T_{\text{pitch}}$. That’s good, since 12 is the $T_{\text{pitch}}$ representation of transposition up by an octave (12 half steps = 1 octave).

2. Take $c = 2^{1/12} \in T_{\text{freq}}$, the $T_{\text{freq}}$ representation of the half step. Then $\phi(c) = 12 \log_2(2^{1/12}) = 12(1/12) = 1$, where here I have used the general property that $\log_2(2^r) = r$ for any $r$. Note that $1 \in T_{\text{pitch}}$ is simply the $T_{\text{pitch}}$ representation of the half step. This shows that $\phi$ correctly translates our $T_{\text{freq}}$ representation of the half step to the corresponding $T_{\text{pitch}}$ representation of the half step.

3. More generally take $c = 2^{n/12}$, the $T_{\text{freq}}$ representation of transposition up by $n$ half steps. Then $\phi(c) = 12 \log_2(2^{n/12}) = 12(n/12) = n$, which is the $T_{\text{pitch}}$ representation of this same transposition. It seems $\phi$ does a good job of translating between our frequency language and pitch language!

Here’s another nice property of our translating function $\phi$: take any $c_1, c_2 \in T_{\text{freq}}$ and let $\phi(c_1) = t_1$ and $\phi(c_2) = t_2$ be their corresponding representatives in $T_{\text{pitch}}$; then we have
\[
\begin{aligned}
\phi(c_1 \cdot c_2) &= 12 \log_2(c_1 \cdot c_2) && \text{(by definition)} \\
&= 12(\log_2(c_1) + \log_2(c_2)) && \text{(since } \log_2(a \cdot b) = \log_2(a) + \log_2(b)\text{)} \\
&= 12 \log_2(c_1) + 12 \log_2(c_2) \\
&= \phi(c_1) + \phi(c_2) \\
&= t_1 + t_2.
\end{aligned}
\]
Let’s unravel what this means.
Let $T_1$ and $T_2$ stand for the transpositions that $c_1$ and $c_2$ represent in $T_{\text{freq}}$. Then $t_1$ and $t_2$ are their corresponding representations in $T_{\text{pitch}}$, and the equalities $\phi(c_i) = t_i$ are understood as saying $\phi$ translates $c_i$ as $t_i$. Now the element $c_1 \cdot c_2$ is the $T_{\text{freq}}$ representation of transposing first by $T_2$ and then by $T_1$. If $\phi$ acts as a good translator, then $\phi(c_1 \cdot c_2)$ should be the $T_{\text{pitch}}$ representation of transposing first by $T_2$ and then by $T_1$. In other words, we should have $\phi(c_1 \cdot c_2) = t_1 + t_2$, which is exactly what the equations above show!

On the level of groups, the equation $\phi(c_1 \cdot c_2) = t_1 + t_2$ means that the function $\phi$ respects the relevant group operations on each group; it sends a product in $\mathbb{R}_{>0}$ to a sum in $\mathbb{R}$. This is one of the defining properties of what is called a group homomorphism.

Definition 7. Let $(G, \cdot)$ and $(H, *)$ be two groups. I’ve denoted the two relevant operations with different symbols so that we can keep them straight; by the same token, let $e_G$ be the identity element of $G$, and let $e_H$ be the identity element of $H$. A group homomorphism from $G$ to $H$ is a function $\phi : G \to H$ satisfying:

(i) $\phi(e_G) = e_H$,

(ii) $\phi(g_1 \cdot g_2) = \phi(g_1) * \phi(g_2)$ for all $g_1, g_2 \in G$.

Comment 4.2. The two conditions can be summarized by saying that a group homomorphism is a function from $G$ to $H$ that respects the group structure of each: it sends the identity element to the identity element, and it sends products (using $\cdot$) in $G$ to products (using $*$) in $H$.

Since we have already shown that our function $\phi : \mathbb{R}_{>0} \to \mathbb{R}$ respects the group operations, to show it is a group homomorphism we need only show that it sends the identity of $\mathbb{R}_{>0}$ (that is, the element $1 \in \mathbb{R}_{>0}$) to the identity of $\mathbb{R}$ (that is, the element $0 \in \mathbb{R}$). This is easy: $\phi(1) = 12 \log_2(1) = 12 \cdot 0 = 0$.

Translating back

It remains to show that we can translate back from pitch space language to frequency space language. We do so with the translator function $\psi : T_{\text{pitch}} \to T_{\text{freq}}$ defined by setting $\psi(t) = 2^{t/12}$.
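The two translators $\phi$ and $\psi$ are easy to experiment with numerically. The sketch below is a minimal illustration using floating-point arithmetic, so equalities are tested only up to rounding error with `math.isclose`; the sample ratios 3/2 and 5/4 are arbitrary choices.

```python
import math

def phi(c):
    """Translate a frequency ratio c > 0 into a number of half steps."""
    return 12 * math.log2(c)

def psi(t):
    """Translate a number of half steps t back into a frequency ratio."""
    return 2 ** (t / 12)

c1, c2 = 3 / 2, 5 / 4  # two arbitrary positive ratios

# phi sends products to sums (the homomorphism property).
product_side = phi(c1 * c2)
sum_side = phi(c1) + phi(c2)

# psi undoes phi, and phi undoes psi.
round_trip_freq = psi(phi(c1))
round_trip_pitch = phi(psi(7))

print(phi(2), phi(1), product_side, sum_side)
print(round_trip_freq, round_trip_pitch)
```

The values `phi(2)` and `phi(1)` come out as exactly 12.0 and 0.0, matching the octave and the identity.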
To see that $\phi$ and $\psi$ serve as complementary translators back and forth between frequency and pitch space, consider what happens when you start with any $c \in T_{\text{freq}}$, apply $\phi$ (translating it into pitch language), and then apply $\psi$ (translating back into frequency language). Using function notation the result would be
\[
\begin{aligned}
\psi(\phi(c)) &= \psi(12 \log_2(c)) && \text{(def. of } \phi\text{)} \\
&= 2^{(12 \log_2(c))/12} && \text{(def. of } \psi\text{)} \\
&= 2^{\log_2(c)} && \text{(simple algebra)} \\
&= c && \text{(since } 2^{\log_2(c)} = c \text{ for any } c > 0\text{)}
\end{aligned}
\]
We have shown that $\psi(\phi(c)) = c$, which tells us that the translator $\psi$ correctly undoes the translating work of $\phi$. The same is true going the other way: one shows in a similar manner that $\phi(\psi(t)) = t$ for any $t \in T_{\text{pitch}}$. In mathematics we say that the functions $\psi$ and $\phi$ are inverses of one another, denoted $\psi = \phi^{-1}$ and $\phi = \psi^{-1}$.

Sameness of our two pictures

Let’s collect all the nice properties of our translator function $\phi : T_{\text{freq}} \to T_{\text{pitch}}$.

1. $\phi(2^{1/12}) = 1$: $\phi$ sends the frequency space half step ($2^{1/12}$) to the pitch space half step (1).

2. $\phi(c_1 \cdot c_2) = \phi(c_1) + \phi(c_2)$: $\phi$ respects how each language expresses successive transpositions.

3. $\phi$ has an inverse translator $\psi$ going from pitch language back to frequency language.

These three properties should convince us that anything we say using our frequency language has an equivalent translation into our pitch language, and vice versa. This means that $T_{\text{freq}}$ and $T_{\text{pitch}}$ are simply two different, but equivalent pictures of the world of intervals and transpositions. We might choose to use the one picture or the other, depending on how convenient it is. Our translators $\phi$ and $\psi$ allow us to move effortlessly between the two.

Group isomorphisms

The discussion above shows that our function $\phi : \mathbb{R}_{>0} \to \mathbb{R}$ is what is known as a group isomorphism.

Definition 8. Given groups $(G, \cdot)$ and $(H, *)$, a group isomorphism is a homomorphism $\phi : G \to H$ that is invertible: that is, for which there is an inverse function $\psi = \phi^{-1} : H \to G$.
All in all this means
(i) $\phi(e_G) = e_H$,
(ii) $\phi(g_1 \cdot g_2) = \phi(g_1) * \phi(g_2)$,
(iii) $\phi$ is invertible.
We say in this case that $G$ and $H$ are isomorphic.

Comment 4.3. In the spirit of the foregoing discussion, it is useful to think of two isomorphic groups as being different pictures, or representations, of the same thing. They may look different as sets (as $\mathbb{R}_{>0}$ and $\mathbb{R}$ certainly do), but as groups they are essentially the same.

Sequence of Pythagorean fifths in $X_{\text{pitch}}$
We end with a nice illustration of the advantages of being bilingual. Recall that transposition up by a Pythagorean fifth is represented by the constant $c = \frac{3}{2} \in T_{\text{freq}}$. In pitch language this translates as
$$\phi\left(\tfrac{3}{2}\right) = 12 \log_2\left(\tfrac{3}{2}\right) =: t_{\text{Py5}}.$$
A calculation shows $t_{\text{Py5}} \approx 7.02$. This makes sense, since transposition by an equal-tempered perfect fifth is represented by the element $7$ in $T_{\text{pitch}}$. Note how much simpler the frequency representation ($c = \frac{3}{2}$) of the Pythagorean fifth is compared to the pitch representation ($t_{\text{Py5}} = 12 \log_2(\frac{3}{2})$). In general our frequency language (using $X_{\text{freq}}$ and $T_{\text{freq}}$) is better suited to deal with questions of just tuning systems (where intervals correspond to rational numbers), whereas our pitch language (using $X_{\text{pitch}}$ and $T_{\text{pitch}}$) is in a sense tailor-made to deal with the equal-tempered system.

Now let’s model the Pythagorean sequence of fifths in pitch space, starting with the pitch C4 $= 0 \in X_{\text{pitch}}$. From above, transposing by a Pythagorean fifth corresponds to addition by the constant $t_{\text{Py5}} = 12 \log_2(\frac{3}{2})$. Doing this in succession yields the sequence
$$0,\ 0 + t_{\text{Py5}},\ 0 + 2t_{\text{Py5}},\ 0 + 3t_{\text{Py5}}, \dots = 0,\ t_{\text{Py5}},\ 2t_{\text{Py5}},\ 3t_{\text{Py5}}, \dots \approx 0,\ 7.02,\ 14.04,\ 21.06, \dots$$
Recall that we must adjust by octave to ensure all pitches are within an octave of C4. In pitch space this corresponds to subtracting an appropriate multiple of 12 (12 half steps = 1 octave) from the terms above so that they lie within $[0, 12]$.
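The octave adjustment just described is easy to carry out by machine. The sketch below is an illustration only: the names `T_PY5` and `pythagorean_fifths` are made up here, and Python's `%` operator is used to subtract the appropriate multiple of 12, which places each term in the half-open interval from 0 up to (but not including) 12.

```python
import math

# Pitch-space size of a Pythagorean fifth: 12 * log2(3/2), roughly 7.02.
T_PY5 = 12 * math.log2(3 / 2)

def pythagorean_fifths(count: int) -> list:
    """Return the start pitch 0 followed by `count` successive Pythagorean
    fifths, each reduced by octaves (multiples of 12) into [0, 12)."""
    return [(k * T_PY5) % 12 for k in range(count + 1)]

sequence = pythagorean_fifths(12)

# First few octave-adjusted terms, rounded to two decimals.
rounded = [round(p, 2) for p in sequence[:4]]
print(rounded)

# The 12th transposition lands slightly above the starting pitch 0.
print(sequence[12])
```

Increasing `count` to 24 or 36 produces further terms of the same sequence.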
After octave adjustment our sequence of Pythagorean fifths then looks approximately like
$$0,\ 7.02,\ 2.04,\ 9.06, \dots$$
The three figures below show sequences of Pythagorean fifths starting at C4 $= 0 \in X_{\text{pitch}}$ and ending after 12, 24, and 36 transpositions, respectively. The marked integers $0, 1, 2, \dots, 12$ represent the pitches of the equal-tempered system between C4 and C5.

[Three figures: number lines from 0 to 12 with the integers 0, 1, 2, ..., 12 marked, showing the Pythagorean pitches after 12, 24, and 36 transpositions.]

Note in the first figure how the Pythagorean pitches are close to, but not equal to, any of the equal-tempered pitches. Here the 12th transposition is the dot right next to C4 $= 0$. The linear distance between these two is the pitch space version of the Pythagorean comma: $12 \log_2(3^{12}/2^{19}) \approx 0.235$ half steps. As the remaining figures show, when we continue the sequence from here the dots seem to entirely fill up the interval $[0, 12]$, and yet they will never again land squarely on any of the integers in $0, 1, 2, \dots, 12$.

Moral: though the Pythagorean system is best modeled in terms of $X_{\text{freq}}$ and $T_{\text{freq}}$, it is only after translating this representation into the language of $X_{\text{pitch}}$ and $T_{\text{pitch}}$ that we get a clear picture of the relation between it and our equal-tempered system. It’s a good thing to know how to speak multiple languages!

Classic 5 (Circle of fifths). We have seen how transposing successively by equal-tempered perfect fifths cycles us through all 12 pitches of equal temperament without repetition. If we start with C, the sequence is (C, G, D, A, E, B, F♯, D♭, A♭, E♭, B♭, F, C). The return to C at the end of the sequence is what earns it the name “circle of fifths”. The circle of fifths is used pervasively in music of all different eras and styles. Composers will often traverse segments of this sequence (forwards or backwards) in order to “move” a piece from one tonal region to another. Why a sequence of fifths, as opposed to say a circle of minor thirds, or even a circle of half steps?
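One way to explore this question is to generate the cycles directly. In the sketch below the pitch names are numbered 0 through 11 starting from C, a transposition is a whole number of half steps (7 for a perfect fifth, 3 for a minor third, 1 for a half step), and octave adjustment is done with `% 12`. The names `NOTE_NAMES` and `cycle` are illustrative choices, and the spelling of each black key (for example D♭ rather than C♯) is just one convention.

```python
# One conventional spelling of the 12 equal-tempered pitch names, C = 0.
NOTE_NAMES = ["C", "D♭", "D", "E♭", "E", "F", "F♯", "G", "A♭", "A", "B♭", "B"]

def cycle(step: int, start: int = 0) -> list:
    """Transpose `start` repeatedly by `step` half steps, adjusting by
    octaves, and collect the pitches visited before returning to `start`."""
    pitches = [start]
    current = (start + step) % 12
    while current != start:
        pitches.append(current)
        current = (current + step) % 12
    return pitches

fifths = cycle(7)        # equal-tempered perfect fifths
minor_thirds = cycle(3)  # minor thirds
half_steps = cycle(1)    # half steps

fifth_names = [NOTE_NAMES[p] for p in fifths]
print(fifth_names)
print(len(fifths), len(minor_thirds), len(half_steps))
```

Comparing the lengths of the three lists shows which step sizes visit every pitch name before returning to C.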
Besides the obvious observation that a circle of minor thirds does not hit all 12 pitches of our scale (C, E♭, G♭, A, C), and a circle of half steps (C, C♯, D, D♯, E, ...) can sound too predictable, there are some further properties of the circle of fifths that make it particularly appealing musically:
1. Movement by a fifth has strong harmonic significance.
2. If you take every other term in the circle of fifths, you get two subsequences that move by whole step: (C, D, E, ...) and (G, A, B, ...). Composers often separate these two strands when employing a circle of fifths.

Figure 2: Mozart, Piano Sonata No. 12, KV 332, ca. 0’54” into the performance of our example recording

Mozart, Piano Sonata No. 12, KV 332
Starting with the second to last measure of the first staff, Mozart moves down the circle of fifths, using a pattern of rising fourths (F up to B♭) followed by falling fifths (B♭ down to E♭). The sequence segment we get here is (F, B♭, E♭, A♭, D♭), at which point Mozart short-circuits things with a G instead of a G♭. Down is in general the preferred direction along the circle of fifths, mainly because the falling fifth pattern can be interpreted harmonically as a dominant (V) to tonic (I) movement, one of the most common progressions in music. We already see this movement between tonic and dominant in measures 2-4, where we oscillate between F (tonic) and C (dominant). We’ll have more to say about this later!

“Jordu”, written by Duke Jordan, performed by Clifford Brown
The circle of fifths is ubiquitous in jazz music. Here both the opening theme and the bridge section (starting at the “B” marking in the third staff) make their way down the sequence of fifths, as the chords notated above clearly indicate. Listen for the characteristic rising fourths and falling fifths pattern in the bass and piano parts.

Going up the circle of fifths
Examples where music ascends the circle of fifths are harder to come by.
The two examples below are from Haydn’s Piano Sonata No. 49 (Hob. XVI) and Beethoven’s Bagatelle No. 2, from Seven Bagatelles, Op. 33.

Figure 3: Haydn, Piano Sonata No. 49 (Hob. XVI), ca. 4’05”

Figure 4: Beethoven, Bagatelle No. 2 from Seven Bagatelles, Op. 33.

5 Pitch-class space
You will have noticed by now that when speaking of pitches we often ignore differences of octaves. More often than not we refer to the pitch C4 simply as C, or the pitch G♭7 simply as G♭. Our way of speaking of pitches suggests that we perceive a general property of C-ness and G♭-ness that transcends octave differences. This identification between octaves is common in nearly all musical traditions, as it turns out. Furthermore, research in psychoacoustics suggests that we perceive pitches that are octave transpositions of one another as being strongly related, almost interchangeable. We capture this identification mathematically by the notion of octave equivalence.

5.1 Octave equivalence
Definition 9. Two pitches $P_1$ and $P_2$ are octave equivalent to one another, written $P_1 \sim_{\text{oct}} P_2$, if they are related by transposition by a number of octaves. Recall that transposition (up or down) by a number of octaves corresponds to adding an integer multiple of 12. Thus we see that two pitches $P_1, P_2 \in X_{\text{pitch}}$ are octave equivalent if $P_1 = P_2 + 12n$ for some integer $n \in \mathbb{Z}$.

Comment 5.1. Since $P_1 = P_2 + 12n$ if and only if $P_1 - P_2 = 12n$, it follows that $P_1 \sim_{\text{oct}} P_2$ if and only if (i) $P_1 - P_2$ is an integer, and (ii) this integer is divisible by 12.

Example 5.1. Decide whether the following pairs of pitches are octave equivalent.
1. $P_1 = 19$, $P_2 = 151$. Solution: we have $P_1 - P_2 = 19 - 151 = -132$. This is an integer that is divisible by 12: $-132 = (-11) \cdot 12$. Thus $P_1 \sim_{\text{oct}} P_2$. In fact both pitches are equivalent to $P = 7$; they are all G’s!
2. $P_1 = 66$, $P_2 = 5$. Solution: we have $P_1 - P_2 = 66 - 5 = 61$. This is indeed an integer, but not divisible by 12. Thus $P_1 \not\sim_{\text{oct}} P_2$. In fact $P_1 \sim_{\text{oct}} 6$, showing that $P_1$ is an F♯, whereas $P_2$ is an F.
3. $P_1 = 50/3$, $P_2 = 122/3$. Solution: we have $P_1 - P_2 = -\frac{72}{3} = -24$, which is indeed an integer divisible by 12. Thus $P_1 \sim_{\text{oct}} P_2$. Note that neither of these pitches actually lies within the equal-tempered system.
4. $P_1 = 12\sqrt{2}$, $P_2 = 11\sqrt{2}$. Solution: we have $P_1 - P_2 = \sqrt{2}$. This is not an integer. Thus $P_1 \not\sim_{\text{oct}} P_2$.

In the last example we often took a pitch $P$ and found an octave equivalent pitch $P'$ satisfying $0 \le P' < 12$; for example, this helps when trying to identify what pitch name corresponds to a given integer pitch in $X_{\text{pitch}}$. This procedure works in general. Given any pitch $P \in X_{\text{pitch}}$ we can always find an integer $n$ such that $12n \le P < 12(n + 1)$; but then we have $P \sim_{\text{oct}} (P - 12n)$, and $P - 12n$ lies within $[0, 12)$.

Example 5.2. Take $P = 200$. Then we have $192 \le 200 < 204$. Here $192 = 12 \cdot 16$ and $204 = 12 \cdot 17$. Then $200 \sim_{\text{oct}} (200 - 12 \cdot 16)$. Since $200 - 12 \cdot 16 = 200 - 192 = 8$, we see that $P \sim_{\text{oct}} 8$, showing that our pitch is an A♭.

Classic 6 (The division algorithm). The preceding observations are an illustration of a general fact in mathematics often described as the division algorithm, though it is not really an algorithm. We state it here as a theorem.

The division algorithm. Fix a positive integer $n > 0$. Given any real number $s \in \mathbb{R}$, there is a unique integer $q$ and a unique real number $r$ satisfying (i) $s = nq + r$, where (ii) $0 \le r < n$.

The integer $q$ is called the quotient upon division by $n$; the number $r$ (not necessarily an integer) is called the remainder upon division by $n$.

Fix a positive integer $n$ and a real number $s$. To find the $q$ and $r$ such that $s = nq + r$ as in the division algorithm, simply locate $s$ between two consecutive multiples of $n$, as in the diagram below.

[Diagram: a number line marking the consecutive multiples $nq$, $n(q+1)$, $n(q+2)$, with $s$ lying between $nq$ and $n(q+1)$ and $r = s - nq$.]

Comment 5.2. Given an integer $n$ and a real number $s$, there are of course many ways to write $s = nq' + r'$ with $q'$ an integer. For example, take $n = 12$ and $s = 21.5$.
Then we have
$$21.5 = 12 \cdot 2 + (-2.5)$$
$$21.5 = 12 \cdot 0 + 21.5$$
$$21.5 = 12 \cdot (-1) + 33.5$$
However, there is only one way to do this where $0 \le r' < 12$, namely $21.5 = 12 \cdot 1 + 9.5$. Thus for $n = 12$ and $s = 21.5$ the unique choice of $q$ and $r$ is $q = 1$ and $r = 9.5$.

Definition 10. Fix a positive integer $n$. The division algorithm allows us to define a function as follows: given any real number $s \in \mathbb{R}$ as input, we define the output to be $r$, its remainder upon division by $n$. We write this as $s \bmod n = r$. Since $0 \le r < n$ (as stipulated by the division algorithm), we have defined a function
$$\operatorname{mod} n : \mathbb{R} \to [0, n), \qquad s \mapsto s \bmod n.$$

Example 5.3. Take $n = 7$. Then we have $\operatorname{mod} 7 : \mathbb{R} \to [0, 7)$. Compute $x \bmod 7$ for $x = -73$, $3\pi$, and $49$. Simply apply the division algorithm to each choice of $x$.
$$-73 = 7(-11) + 4 \ \Rightarrow\ -73 \bmod 7 = 4$$
$$3\pi \approx 7(1) + 2.425 \ \Rightarrow\ 3\pi \bmod 7 \approx 2.425$$
$$49 = 7(7) + 0 \ \Rightarrow\ 49 \bmod 7 = 0$$

Comment 5.3. When dealing with pitches $P \in X_{\text{pitch}}$, we are simply applying these concepts to the case $n = 12$. For example, to figure out what pitch name $P$ corresponds to, we simply compute $P \bmod 12$ and add this many half steps to C.

5.2 Equivalence relations
As the notation $P_1 \sim_{\text{oct}} P_2$ and our foregoing discussion suggest, when we say $P_1$ and $P_2$ are octave equivalent, we mean something to the effect that they are “kind of the same”. Indeed, we even sometimes say “$P_1$ and $P_2$ are the same up to octave equivalence”, suggesting a kind of qualified equality relation. Like actual equality, octave equivalence also enjoys some familiar basic properties:
(i) Reflexivity: $P \sim_{\text{oct}} P$ for all pitches $P \in X_{\text{pitch}}$.
(ii) Symmetry: if $P_1 \sim_{\text{oct}} P_2$, then $P_2 \sim_{\text{oct}} P_1$.
(iii) Transitivity: if $P_1 \sim_{\text{oct}} P_2$ and $P_2 \sim_{\text{oct}} P_3$, then $P_1 \sim_{\text{oct}} P_3$.
These three properties make $\sim_{\text{oct}}$ what is called an equivalence relation.

Definition 11. Let $X$ be a set, and let $R$ be any relation defined on $X$: if the relation holds between a pair $(x, y)$ of elements of $X$, we write $xRy$.
The relation $R$ is an equivalence relation if it satisfies the three following properties:
(i) Reflexivity: $xRx$ for all $x \in X$.
(ii) Symmetry: if $xRy$, then $yRx$.
(iii) Transitivity: if $xRy$ and $yRz$, then $xRz$.

Example 5.4. Take $X = \mathbb{Z}$ and define a relation $x \sim_* y$ if $y = 3^r \cdot x$ for some $r \in \mathbb{Z}$. For example $54 \sim_* 6$ since $6 = 3^{-2} \cdot 54$, but $8 \not\sim_* 12$, since we cannot write $12 = 3^r \cdot 8$ for any $r \in \mathbb{Z}$. I claim $\sim_*$ is an equivalence relation. Let’s show that it is reflexive, symmetric and transitive.
(i) Reflexive: given any $x \in \mathbb{Z}$, we have $x = 3^0 x$. Thus $x \sim_* x$.
(ii) Symmetric: suppose $x \sim_* y$. Then $y = 3^r \cdot x$ for some $r \in \mathbb{Z}$; but then $x = 3^{-r} \cdot y$. Since $-r \in \mathbb{Z}$, we have $y \sim_* x$.
(iii) Transitive: suppose $x \sim_* y$ and $y \sim_* z$. Then there are integers $r, s \in \mathbb{Z}$ such that $y = 3^r \cdot x$ and $z = 3^s \cdot y$; but then $z = 3^s \cdot y = 3^s (3^r \cdot x) = 3^{s+r} \cdot x$. Since $r + s \in \mathbb{Z}$, we have $x \sim_* z$.
We’ve proved $\sim_*$ is an equivalence relation!

Example 5.5. Of course, not all relations are equivalence relations. Consider the relation defined on $\mathbb{Z}_{>1} = \{2, 3, 4, \dots\}$ by $xRy$ if $x$ and $y$ share a common nontrivial factor, that is, a common factor greater than 1. This relation is reflexive (since $x$ is a common factor of $x$ and $x$), and symmetric (if $x$ and $y$ share a common factor, then so do $y$ and $x$), but not transitive: $x = 2$ and $y = 6$ share a common nontrivial factor (2), and $y = 6$ and $z = 15$ share a common nontrivial factor (3), but $x = 2$ and $z = 15$ share no nontrivial factor!

It is also easy to come up with “real world” examples. Let $X$ be the set of all living humans, and define $xRy$ to mean $x$ loves $y$. Tragically, this relation fails to be symmetric! In fact, sadly, this relation also fails to be reflexive.

Congruences
Mathematically speaking, our octave equivalence relation is just one example of a whole family of equivalence relations defined on the set $\mathbb{R}$.

Definition 12. Let $X = \mathbb{R}$, and let $n$ be a positive integer. Given $x, y \in \mathbb{R}$, we say that $x$ is congruent to $y$ modulo $n$ if $n \mid (x - y)$.
We write $x \equiv y \pmod{n}$ in this case.

Comment 5.4. Recall that $n \mid (x - y)$ if and only if there is an integer $r$ with $nr = x - y$. After a little algebra, we see that $x \equiv y \pmod{n}$ if and only if $x = y + nr$ for some integer $r$. Thus $x$ and $y$ are congruent modulo $n$ if and only if they differ by a multiple of $n$.

Comment 5.5. It follows immediately from the definition that $P_1 \sim_{\text{oct}} P_2$ if and only if $P_1 \equiv P_2 \pmod{12}$.

Fix a positive integer $n$. It is indeed true that congruence modulo $n$ defines an equivalence relation: that is, we have
1. $x \equiv x \pmod{n}$ for all $x$;
2. if $x \equiv y \pmod{n}$, then $y \equiv x \pmod{n}$;
3. if $x \equiv y \pmod{n}$ and $y \equiv z \pmod{n}$, then $x \equiv z \pmod{n}$.
Furthermore, this relation also respects both addition and multiplication, as the following theorem explains.

Modular substitution. Let $x, y, x', y'$ be integers with $x \equiv x' \pmod{n}$ and $y \equiv y' \pmod{n}$. Then
(i) $x + y \equiv x' + y' \pmod{n}$;
(ii) $x \cdot y \equiv x' \cdot y' \pmod{n}$.

Congruence modulo $n$ and mod $n$
You will note the notational similarity between the expressions ‘$x \equiv y \pmod{n}$’ and ‘$x \bmod n$’. There is indeed a close connection between the two, but we have to be careful when describing exactly what this is. First observe the main difference between the two:
1. To say $x \equiv y \pmod{n}$ is to assert that a certain relation holds between $x$ and $y$: namely, $x - y$ is divisible by $n$.
2. On the other hand $x \bmod n$ is simply a number: the result of plugging the input $x$ into the function $\operatorname{mod} n$.
The following theorem tells us precisely how these two notions are related.

Theorem. Fix a positive integer $n$. Let $x$ and $y$ be any real numbers. Then $x \equiv y \pmod{n}$ if and only if $x \bmod n = y \bmod n$. Furthermore, let $x \bmod n = r$. Then $x \equiv r \pmod{n}$.

Example 5.6. The last two theorems taken in tandem allow us to do much heavy lifting in terms of modular arithmetic. Fix $n = 4$. Compute $(102^3 + 61 \cdot 91) \bmod 4$.
Solution: According to the second theorem, to compute $x \bmod 4$, we can replace $x$ with anything equivalent to $x$ modulo 4.
Let’s find a nice small $x$ equivalent to $102^3 + 61 \cdot 91$ modulo 4. We first compute $102 \bmod 4 = 2$, $61 \bmod 4 = 1$, and $91 \bmod 4 = 3$. This implies by the second theorem that $102 \equiv 2 \pmod{4}$, $61 \equiv 1 \pmod{4}$, and $91 \equiv 3 \pmod{4}$. Now we can substitute these in using the first theorem (modular substitution):
$$102^3 + 61 \cdot 91 \equiv 2^3 + 1 \cdot 3 \equiv 8 + 3 \equiv 11 \equiv 3 \pmod{4}.$$
Lastly, we conclude, again using the second theorem, that $(102^3 + 61 \cdot 91) \bmod 4 = 3 \bmod 4 = 3$.
Note: we could have done this the hard way by directly computing $102^3 + 61 \cdot 91 = 1066759$, and then computing $1066759 \bmod 4 = 3$.

5.3 Pitch-class space
As we have begun to see, there are many situations where music theorists and composers alike are inclined to speak of octave equivalent pitches as being the same. In such situations, working within the $X_{\text{pitch}}$ model can be somewhat cumbersome. We are constantly having to employ the phrase “up to octave equivalence” in our discourse, or else having to apply $\operatorname{mod} 12$ in our arithmetic. Pitch-class space, denoted $X_{\text{pc}}$, is a model especially suited to such situations.

Roughly speaking, $X_{\text{pc}}$ is what you get by taking $X_{\text{pitch}}$ and simply declaring that octave equivalence, our “qualified equality relation” on $X_{\text{pitch}}$, is now to be treated as honest to goodness equality. As such $X_{\text{pc}}$ is the result of collapsing all octave equivalent pitches into a single entity, which we call a pitch-class. For example, the infinite set $\{\dots, -17, -5, 7, 19, 31, \dots\}$ in $X_{\text{pitch}}$ consisting of G4 and all of its octave transpositions is collapsed in $X_{\text{pc}}$ into a single pitch-class, which we call simply G.

How much exactly must we collapse $X_{\text{pitch}}$ to get $X_{\text{pc}}$, and what are we left with? Below you find a diagram of $X_{\text{pitch}}$ with various instances of C and G marked as reference points.
[Diagram: the real line $X_{\text{pitch}}$ with ticks at $-12$, $0$, $12$, $24$ and the various C’s and G’s marked.]

Since any pitch $P \in X_{\text{pitch}}$ is octave equivalent to a pitch in the range $[0, 12]$, when moving to $X_{\text{pc}}$ we first collapse the real line to the interval $[0, 12]$.

[Diagram: the interval from 0 to 12 with the single G marked.]

Here all of our G’s are now identified with the single G between 0 and 12; more generally, an arbitrary pitch $P \in X_{\text{pitch}}$ is identified with $P \bmod 12$, which lies within $[0, 12]$. Except for $P = 0$ and $Q = 12$, no two elements in $[0, 12]$ are octave equivalent, so we are nearly done with our collapsing; we need only identify $P = 0$ and $Q = 12$. What sort of space do we get? A circle!

[Diagram: the pitch-class circle with G marked.]

Equal-tempered pitch-classes
In the process of this collapse, our infinitely many equal-tempered pitches, represented by the integers $\mathbb{Z}$ in $X_{\text{pitch}}$, are collapsed to the finitely many integers $\{0, 1, 2, \dots, 11\}$, which divide the pitch-class circle into 12 equal arcs.

[Diagram: the integer pitches G3 $= -5$, A♭3 $= -4$, A3 $= -3$, B♭3 $= -2$, B3 $= -1$, C4 $= 0$, C♯4 $= 1$, D4 $= 2$, D♯4 $= 3$, E4 $= 4$, F4 $= 5$ marked on the real line ($\mathbb{Z}$ inside $\mathbb{R}$), collapsing down onto the 12 equally spaced points of the pitch-class circle.]

Transpositions in pitch-class space
A transposition $t \in T_{\text{pitch}}$, which acts on $X_{\text{pitch}}$ by shifting points to the left or right a certain distance, also acts on $X_{\text{pc}}$; but now the operation it defines is a rotation of our pitch-class circle!

Figure 5: Transposition by $t = 8$ is rotation by 8 steps along our integers. This is a clockwise rotation of $8 \cdot \frac{2\pi}{12} = \frac{4\pi}{3}$ radians, or 240 degrees. Here the pitch-class $P = 1$ is shown rotated to the pitch-class $P' = 9$.

Quotient objects
Our construction of $X_{\text{pc}}$ is an example of what is called quotienting out by an equivalence relation, and works with any set $X$ and any equivalence relation $\sim$ defined on $X$. Our description of this process in terms of “collapsing” elements is a bit vague. The way we make rigorous sense of this is through equivalence classes.

Definition 13. Let $X$ be a set, and let $\sim$ be an equivalence relation defined on $X$.
(i) Given an element $x \in X$, its equivalence class, $[x]_\sim$, is the set of all elements $y \in X$ that are equivalent to $x$: $[x]_\sim := \{y \in X : x \sim y\}$.
(ii) The quotient of $X$ by $\sim$, denoted $X/\!\sim$, is defined as the set of all equivalence classes of $X$: $X/\!\sim\ := \{[x]_\sim : x \in X\}$.

Here is how this looks in the example of $X_{\text{pitch}}$ with the relation of octave equivalence.
1. Here the equivalence class of a pitch $P \in X_{\text{pitch}}$, denoted $[P]_{12}$, is by definition the set of all pitches that are octave equivalent to it. Thus
$$[P]_{12} = \{\dots, P - 24, P - 12, P, P + 12, P + 24, P + 36, \dots\}.$$
In our particular musical example, we call such an equivalence class a pitch-class.
2. Note that there are many different names for a given equivalence class. For example, the equivalence class $\{\dots, -17, -5, 7, 19, 31, \dots\}$ can equally well be denoted $[7]_{12}$, $[-17]_{12}$, $[103]_{12}$, or indeed $[P]_{12}$ for any $P$ that is octave equivalent to 7.
3. Pitch-class space is then defined to be the set of all pitch-classes: that is, $X_{\text{pc}} := \{[P]_{12} : P \in X_{\text{pitch}}\}$.

Some observations:
1. Notice how we have captured the notion of collapsing pitches: we take a collection of octave equivalent pitches (the pitches $-17, -5, 7, 19, \dots$, for example), throw them all into a single set, the corresponding pitch-class, and treat this as a single object in $X_{\text{pc}}$. In this way our infinitely many pitches have been collapsed into one object: in our example $[7]_{12}$, or $[-17]_{12}$, or however you want to denote the set $\{\dots, -17, -5, 7, 19, \dots\}$.
2. How on earth does this strange set of pitch-classes correspond to a circle?
(a) Though each pitch-class $[P]_{12}$ in $X_{\text{pc}}$ has many different names, there is exactly one choice of $P$ with $0 \le P < 12$. This is how we collapsed $X_{\text{pitch}}$ first to the interval $[0, 12)$.
(b) Next we map the points $P \in [0, 12)$ in a one-to-one fashion onto the points $(x, y)$ of the unit circle using an invertible function!
In our circle representation I have used the function
$$P \mapsto \left(\sin\left(\tfrac{2\pi}{12} P\right), \cos\left(\tfrac{2\pi}{12} P\right)\right).$$

Quotient groups
Suppose a set $X$ with equivalence relation $\sim$ also happens to be a group. A natural question to ask is whether the quotient $X/\!\sim$ is also a group in a natural way. The answer is often yes!

Example 5.7. Let $G = \mathbb{Z}$ with group operation $+$. Fix an integer $n > 0$ and consider the equivalence relation $r \equiv s \pmod{n}$. We denote the quotient object in this case $\mathbb{Z}/n\mathbb{Z}$. By definition we have $\mathbb{Z}/n\mathbb{Z} = \{[r]_n : r \in \mathbb{Z}\}$, where $[r]_n = \{\dots, r - 2n, r - n, r, r + n, r + 2n, r + 3n, \dots\}$ denotes the equivalence class of all integers congruent to $r$ modulo $n$. Since every integer $r \in \mathbb{Z}$ is congruent to a unique integer $r'$ with $0 \le r' < n$, we have
$$\mathbb{Z}/n\mathbb{Z} = \{[0]_n, [1]_n, [2]_n, \dots, [n - 1]_n\}.$$
Note: by quotienting out by $n$, we have collapsed the integers to a finite set containing $n$ elements! This set has a natural group structure defined on it, namely
$$[r]_n + [s]_n = [r + s]_n.$$
When applying this rule you are allowed to use any name you want for the given equivalence classes: that is, we have $[r]_n + [s]_n = [r']_n + [s']_n$ for any choice of $r', s'$ with $r \equiv r' \pmod{n}$ and $s \equiv s' \pmod{n}$.

Example 5.8. Consider $\mathbb{Z}/8\mathbb{Z} = \{[0]_8, [1]_8, \dots, [7]_8\}$. Compute the following additions. Your answer should be given in the form $[r]_8$, where $0 \le r < 8$.
1. $[103]_8 + [-1]_8$
2. $[808]_8 + [-23]_8$
SOLUTION:
1. $[103]_8 + [-1]_8 = [102]_8 = [6]_8$, since $102 \equiv 6 \pmod{8}$.
2. $[808]_8 + [-23]_8 = [0]_8 + [1]_8 = [1]_8$, since $808 \equiv 0 \pmod{8}$ and $-23 \equiv 1 \pmod{8}$.

Spaces and quotient spaces
I promised long ago to say something about topological spaces. In mathematics, a topological space is a set with an additional bit of structure, called a topology, that allows us to say in an abstract way when two elements $x, y$ of the set are “close to one another”. As with groups, one defines what a topology is in a precise manner. The general definition is somewhat technical, and as such I will omit it here.
However, the three spaces we have seen thus far, viz., $X_{\text{freq}}$, $X_{\text{pitch}}$, and $X_{\text{pc}}$, are examples of a particular type of topological space, called a metric space, which is easier to describe. Essentially, a metric space is a set $X$ equipped with a distance function $d(x, y)$ that quantifies exactly how far elements $x$ and $y$ are from each other.

Definition 14. A metric, or distance function, on a set $X$ is a function that assigns to any pair $(x, y)$ of elements of $X$ a nonnegative real number $d(x, y)$ (called the distance between $x$ and $y$), satisfying the following properties:
(i) $d(x, y) = 0$ if and only if $x = y$,
(ii) $d(x, y) = d(y, x)$,
(iii) $d(x, z) \le d(x, y) + d(y, z)$ (the triangle inequality).
A set $X$, together with a distance function $d$, is called a metric space.

Example 5.9. For $X_{\text{pitch}}$ we define our distance function as $d(P_1, P_2) = |P_1 - P_2|$. In other words, distance in this space is just a measure of interval size! The fact that this is indeed a distance function (i.e., satisfies properties (i)-(iii) of the definition) follows from well-known properties of the absolute value.

Example 5.10. The space $X_{\text{freq}} = \mathbb{R}_{>0}$ is a subset of $\mathbb{R}$, and as such we could simply restrict the metric in the previous example to $\mathbb{R}_{>0}$. However, we want our metric to measure interval size, and so instead we define
$$d(f_1, f_2) = |12 \log_2(f_1) - 12 \log_2(f_2)| = |12 \log_2(f_1/f_2)|.$$
Any guess as to where this came from? Correct, I just used our invertible function $\phi(x) = 12 \log_2(x)$ to map frequencies to pitches, and then used the definition of distance in pitch space!

Example 5.11 (Quotient spaces). As with groups, given a metric space $X$ and an equivalence relation $\sim$ defined on it, we can ask whether the quotient object $X/\!\sim$ is also a metric space in a natural way. The answer for $X_{\text{pc}}$ is yes! How do we define a distance function on $X_{\text{pc}}$? Recall that we can identify $X_{\text{pc}}$ with the unit circle.
So we can define the distance $d([P]_{12}, [Q]_{12})$ between two pitch-classes to be the distance between their corresponding two points on the unit circle. By distance here, we mean the minimum length of the two circular arcs between the two points. As in the previous examples, technically one must prove that this definition of $d(x, y)$ satisfies the three properties of a metric. I leave this as an exercise!

Let’s compute $d(x, y)$ for $x = [25]_{12}$ and $y = [-13]_{12}$ in $X_{\text{pc}}$. First note that $x = [25]_{12} = [1]_{12}$ and $y = [-13]_{12} = [11]_{12}$. Thus $x$ and $y$ correspond to the two points on the circle below.

[Figure: the points $[1]_{12}$ and $[11]_{12}$ on the pitch-class circle, with the two arcs between them labeled by their lengths.]

The distance between these two points is the minimum of the arc lengths $(10/12) \cdot 2\pi = 10\pi/6$ and $(2/12) \cdot 2\pi = \pi/3$. Thus $d(x, y) = \pi/3$.

5.4 Pitch or pitch-class space?
Where does music live, in pitch or pitch-class space? The answer varies depending on how composers conceive their musical constructs, and on how listeners hear them.

Example–Winterreise, by Franz Schubert
In vocal music, octave shifts are often used for dramatic or virtuosic effect, and as such octave equivalence is by no means treated as actual equality. Here the natural model is usually pitch space. We look at two examples from Franz Schubert’s song cycle Winterreise, written just months before he died in 1828.

The first passage is from “Frühlingstraum”. The line “es schrien die Raben vom Dach” is sung twice. The melody in both instances is identical up to octave equivalence, and yet they are very distinct in character. Note in particular the dramatic jump from E4 up an octave and a half step to F♮5 in the second instance that results from substituting E4 for E5.

In the second example, taken from “Die Krähe”, the melody for the line “Krähe, lass mich endlich seh’n, Treue bis zum Grabe!” would look rather uninteresting if collapsed into pitch-class space.

Example–Das Wohltemperierte Klavier I, Fuga 2, by J.S.
Bach On the other hand, when composers treat a musical idea as something to be manipulated and transformed using various operations, as in contrapuntal music from the Renaissance and Enlightenment periods, or 20th century serial music, then pitch-class space is often the more natural setting to model the music. Das Wohltemperierte Klavier (or “The Well-tempered Klavier”) is a collection of two books of 24 preludes and fugues. In each book Bach cycles through all 24 possible keys (12 choices for pitch name, 2 choices for major/minor) in sequence. Thus the first fugue is in C major, the second in C minor, the third in C] major, etc. The piece is a fugue in 3 voices. The first two bars of the fugue present the subject (S) in one voice. Then in measures 3-4 a second voice plays (nearly) the same subject transposed a perfect fifth up (this is called the answer (A)), while the first voice plays what is called the counter subject (CS). The rest of the piece is built essentially out of near perfect transpositions of 234%(5 these three basic units. 6(7 $89(,:/ !"#"$%&'()(*+,-.*/-0(1 ,,-,, ,,, ,-,, ,,, ,,, ,-,, ,,, ,,-,, +/,,-, % ' )* )* ( )* )* )* )* )* )* ( ' )* .-, -,, -, )** -, )* )*** ! 1########################### $ % % & ** +)*** *** )*** )*** *** )*** ** *** 1 )** *** +)*** ** *** )*** )*** )*** )*** )*** 1 )*** **** +)***** +)**** )**** )*** )*** )** /*)* **** ***** # ** 1 * * * * 1 1 1 1 "1 1 1 1 %% & ( ( ( 1########################### 0 % 1 1 #1 ,, , ,, ,, ,, , , , , , , , , , , , , ,, , , , , % ,,-, ,-, .-,, ,-, +/-,)* ,,,-, ,,-, -,, ,,-, ,,,-, ,,, ,-,, ,,-, -,, ,,,, ,,, -,, -,, ,,- ,,,, ,,-,, ,-,)* -,, ,-,)* +-,,,, ,,-, ,,-, -,, -,, )* )* 2 /)* -)*, )** ,-)** )*** )*** )*** )* +-,)** )** 1 )*** +)*** *** *** )** )** )*** %)*** )*** )** )** *) ***#*** 1 ! 1############################ $ % % )*** +)** )*** ** .)*** )*** )**** )*** 1 -,)*** * * ** ** * ** ** ** * * * * 1 1 1 1 "1 1 1 1 %% % ( ( ( 1############################ 0 1 1 #1 !"! 
One feature that makes the countersubject so distinctive is the jump from C4 to E5: an octave plus a third. Yet Bach does not hesitate to cut this down to a jump of only a third, as in the passage below where we get C5 to E5. Thanks in part to the dense counterpoint going on, we hear this octave adjusted version of CS as being no different from the usual version.

[Score excerpt not recoverable from the source.]

6 Chords

We have not made a whole lot of progress in our goal of modeling musical objects with mathematical objects. In fact so far we only know how to model pitches.
Despair not! Most of the mathematical machinery is now in place, and this will allow us to easily model more complicated musical objects like chords, scales, melodies and rhythms. These further musical objects will be described as collections of objects (e.g., collections of pitches, of pitch-classes, or of onsets in the case of rhythms), and come in two flavors depending on whether or not order is important in these collections. For example, a chord will be defined as an unordered collection of pitches or pitch-classes, whereas a melody will be defined as an ordered sequence of pitches.

6.1 Sets and sequences

In mathematics this important distinction between unordered and ordered collections is articulated by two distinct types of fundamental mathematical objects: sets and sequences.

Recall that a set is defined simply as a collection (finite or infinite) of objects. A set is determined solely by its contents (the objects it contains), and not by any particular ordering of those objects. This is why the set $A = \{1, 2, 3\}$ can equally well be written as $A = \{2, 1, 3\}$, $A = \{2, 3, 1\}$, or $A = \{1, 2, 2, 2, 3, 1, 1\}$. In all cases $A$ is the set that contains the objects 1, 2 and 3, and only these objects.

This notion of order not mattering is captured in the very definition of set equality. Sets $A$ and $B$ are defined to be equal, written $A = B$, simply when they contain the same elements: that is, given any object $x$, we have $x \in A$ if and only if $x \in B$.

Example 6.1 (Set equality proof). Consider the sets $A = \{x \in \mathbb{R} : x^4 - 3x^2 + 2 = 0\}$ and $B = \{1, -1, \sqrt{2}, -\sqrt{2}\}$. I claim $A = B$. To prove the claim we must show that $x \in A$ if and only if $x \in B$. Observe that to prove an “if and only if” statement, we really are proving two implications: $x \in A \Rightarrow x \in B$ and $x \in B \Rightarrow x \in A$.
The second implication is easy. Each of the elements $1, -1, \sqrt{2}, -\sqrt{2}$ of $B$ is also an element of $A$, because plugging each of these numbers into the expression $x^4 - 3x^2 + 2$ yields 0, as desired. This proves $x \in B \Rightarrow x \in A$.

Now go the other way. Take $x \in A$. Then $x$ satisfies $x^4 - 3x^2 + 2 = 0$. Factoring the expression on the left, we see that
\[ x^4 - 3x^2 + 2 = (x^2 - 1)(x^2 - 2) = (x - 1)(x + 1)(x - \sqrt{2})(x + \sqrt{2}) = 0. \]
For this to be true we must have $x = 1$, $-1$, $\sqrt{2}$, or $-\sqrt{2}$. In each case we have $x \in B$. Thus we have proved $x \in A \Rightarrow x \in B$.

Note: by definition $x \in A \Rightarrow x \in B$ means $A \subset B$. Thus another way of proving $A = B$ is to prove $A \subset B$ and $B \subset A$.

For a sequence $(x_1, x_2, x_3, \ldots)$ on the other hand, order does matter. For example, in contrast to our example with sets, the following sequences containing 1, 2 and 3 are all distinct: $(1, 2, 3)$, $(2, 1, 3)$, $(2, 3, 1)$, $(1, 2, 2, 2, 3, 1, 1)$.

As with sets, the notion of sequences being ordered objects is captured by the very definition of equality of sequences: $s_1 = (x_1, x_2, \ldots, x_m)$ and $s_2 = (y_1, y_2, \ldots, y_n)$ are equal, written $s_1 = s_2$, if $n = m$ (they are of the same length) and $x_i = y_i$ for all $i$. We summarize equality of two sequences $s_1$ and $s_2$ as follows: (i) they are of the same length, (ii) they are composed of the same elements, (iii) the elements appear in the same order in both sequences.

6.2 Chords

Musical definitions of what exactly a chord is tend to appeal to a notion of “simultaneity” or “sounding together”. For example, the Harvard Dictionary of Music defines a chord as “three or more pitches sounded simultaneously or functioning as if sounded simultaneously”. The last phrase in this definition is there to accommodate examples like the following passage, wherein every measure would be recognized as an “instance” of a C-major triad.

[Score excerpt: five measures, each an instance of a C-major triad.]

As the example suggests, the identifying property of this chord is simply the unordered collection of pitches from which it is formed.
Thus we should model chords with sets!

Pitch sets and pitch-class sets

We will model chords either as sets of pitches, or sets of pitch-classes, depending on whether octave differences are taken into account.

Definition 15. We collect a number of definitions related to chords.

1. A pitch set is a (finite) set of pitches $\{P_1, P_2, \ldots, P_r\} \subset X_{\mathrm{pitch}}$.
2. A pitch-class set is a (finite) set of pitch-classes $\{[P_1]_{12}, [P_2]_{12}, \ldots, [P_r]_{12}\} \subset X_{\mathrm{pc}}$.
3. We will call both pitch sets and pitch-class sets chords.
4. Given a positive integer $n$, an $n$-chord is a chord (of either pitches or pitch-classes) containing exactly $n$ distinct objects.

Example 6.2. As pitch sets the five instances of chords in the five measures above are modeled as $\{0, 4, 7\}$, $\{4, 7, 12\}$, $\{0, 4, 7, 12, 16\}$, $\{0, 4, 7, 12\}$, and $\{7, 12, 16\}$. Note that these sets are all distinct. As pitch-class sets on the other hand they all collapse to the single pitch-class set $\{[0]_{12}, [4]_{12}, [7]_{12}\}$. For example, the third chord is
\[ \{[0]_{12}, [4]_{12}, [7]_{12}, [12]_{12}, [16]_{12}\} = \{[0]_{12}, [4]_{12}, [7]_{12}, [0]_{12}, [4]_{12}\} = \{[0]_{12}, [4]_{12}, [7]_{12}\}. \]
In almost all cases we will model chords with pitch-classes, and as such we will often drop the brackets from our notation. Thus $\{[0]_{12}, [4]_{12}, [7]_{12}\}$ will most often be written $\{0, 4, 7\}$.

Counting pitch-class sets

Let's fix $k > 0$ and consider pitch-class sets containing exactly $k$ pitch-classes taken from our 12 equal-tempered pitch-classes. The number of such sets is finite. Thus up to octave equivalence, the number of equal-tempered $k$-chords is finite. To count these we use a very useful formula that counts the number of subsets of size $k$ taken from a set of $n$ elements:
\[ \#\{\text{subsets of size } k \text{ taken from a set of } n \text{ elements}\} = \binom{n}{k} := \frac{n!}{k!\,(n-k)!}. \]
The symbol $\binom{n}{k}$ is known as a binomial coefficient due to its appearance in the binomial formula
\[ (x + y)^n = x^n + \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 + \cdots + \binom{n}{n-1} x y^{n-1} + y^n. \]
In this context we also call $\binom{n}{k}$ “$n$ choose $k$”, as it counts the number of ways of picking $k$ objects out of a collection of $n$.

We can use $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ to count equal-tempered $k$-chords. Here we are picking $k$ pitch-classes out of a collection of 12. Thus the number of $k$-chords is $\binom{12}{k}$.

k = 1. A 1-chord is called a monad. There are $\binom{12}{1} = 12!/(1!\,11!) = 12$ monads: namely $\{0\}, \{1\}, \ldots, \{11\}$ (or {C}, {C♯}, ..., {B}).

k = 2. A 2-chord is called a dyad. There are $\binom{12}{2} = 12!/(2!\,10!) = (12 \cdot 11)/2 = 66$ dyads: e.g., $\{0, 3\}$ (or {C, E♭}), $\{4, 9\}$ (or {E, A}), etc.

k = 3. A 3-chord is called a trichord. There are $\binom{12}{3} = (12 \cdot 11 \cdot 10)/(3 \cdot 2) = 220$ trichords: e.g., $\{0, 4, 7\}$ (or {C, E, G}), $\{6, 7, 11\}$ (or {F♯, G, B}), etc.

You get the idea. The naming scheme continues in this manner: 4-chords are called tetrachords, 5-chords pentachords, and 6-chords hexachords.

Triads

The first “real” chords one learns about in music theory are triads, which will also play a prominent role in our mathematical approach.

Definition 16. A triad is a trichord $\{P_1, P_2, P_3\}$ built as follows:

1. Pick a root pitch $P_1$.
2. Transpose $P_1$ up a third (major or minor) to get $P_2$, called the third of the triad.
3. Transpose $P_2$ up a third (major or minor) to get $P_3$, called the fifth of the triad.

The choice of major/minor in each of steps (2) and (3) determines four different types of triad:

major + minor ⇔ major triad
minor + major ⇔ minor triad
minor + minor ⇔ diminished triad (fifth is diminished)
major + major ⇔ augmented triad (fifth is augmented)

Comment 6.1. The process above is often summarized by saying a triad is a “stack of thirds”. You can create chords by stacking other intervals as well. For example, {B♭, C, E♭, F} is a stack of perfect fourths.

Example 6.3. Below you find all four types of triads built on roots C and F♯, respectively.

[Score excerpt: the four types of triads on roots C and F♯.]

Notation.
The table below illustrates how we denote triads using the pitch name of the root (capital or lower-case depending on major/minor) along with additional symbols in the augmented and diminished case.

Type         Examples                              Notation
Major        {C, E, G}, {F♯, A♯, C♯}, ...          C, F♯, ...
Minor        {C, E♭, G}, {F♯, A, C♯}, ...          c, f♯, ...
Diminished   {C, E♭, G♭}, {F♯, A, C}, ...          c°, f♯°, ...
Augmented    {C, E, G♯}, {F♯, A♯, D}, ...          C+, F♯+, ...

How many triads are there in the equal-tempered universe? The following observations allow us to count them. As usual we work with pitch-classes.

1. Major, minor and diminished triads have a unique root. This is not true of an augmented triad, as each of its pitches can serve as a root: e.g., {C, E, G♯} can be built starting with either C, E, or G♯ as the root.
2. Thus to each of the 12 pitch-classes we can associate the three triads (major, minor or diminished) of which it is the unique root. This gives rise to a total of 36 major, minor or diminished triads.
3. A simple count (try it on a keyboard) shows there are only 4 augmented triads: namely {C, E, G♯}, {D♭, F, A}, {D, F♯, A♯}, {E♭, G, B}.
4. Thus there are a total of 36 + 4 = 40 equal-tempered triads up to octave equivalence!

Geometric representations of chords

The geometric nature of $X_{\mathrm{pitch}}$ (the real line) and $X_{\mathrm{pc}}$ (the unit circle) allows us to easily picture chords, either as a collection of points plotted on the real line (when using pitches), or as a collection of points in the circle (when using pitch-classes). Below you find these two types of representations for the chord {C4, G4, E5}.

[Figure: the chord {C4, G4, E5} plotted on the real line (pitch space) and on the circle (pitch-class space).]

The geometry offers some insight into the four different types of triads. Below you find the C♯ major, B minor, G diminished, and C augmented triads.

[Figure: the C♯ major, B minor, G diminished, and C augmented triads plotted on the pitch-class circle.]

Functionality of triads

Looking at the geometry, the major and minor triads are the most irregular of the four, at least in the sense of dividing up the circle into arcs of three distinct lengths. And yet in traditional harmony major and minor triads are considered to be the most “stable” of the four triads; diminished and augmented triads are treated as “unstable” triads that should “resolve” to a major or minor triad.

If we think of our triads as descendants from corresponding triads in a just system (e.g., Pythagorean, or just intonation), then a possible explanation for this dichotomy presents itself. Starting with a fixed frequency $f$, we can build just major/minor triads using “simple ratios”: e.g., $\{f, (5/4)f, (3/2)f\}$ is a major triad in just intonation, and $\{f, (6/5)f, (3/2)f\}$ is a minor triad. To build diminished or augmented triads, on the other hand, we have to use “not so simple” (thus less consonant) ratios. For example, the simplest instance of the diminished triad in the just intonation system is {B, D, F} $= \{(15/8)f, (9/8)f, (4/3)f\}$, which contains the “dissonant interval” {B, F} with frequency ratio 45/32.

Seventh chords

If you add one more third to a triad, you get what is called a seventh chord.

Definition 17. A seventh chord is a 4-chord built by stacking thirds. In other words we create a seventh chord by starting with a root $P_1$ and transposing up three times by thirds to obtain the chord $\{P_1, P_2, P_3, P_4\}$. The interval $\{P_1, P_4\}$ in this case is a seventh (hence the name), and the set $\{P_1, P_2, P_3\}$ is a triad.
The quality of both of these subchords determines the seventh chord's name as follows:

Thirds      $\{P_1, P_2, P_3\}$   $\{P_1, P_4\}$   $\{P_1, P_2, P_3, P_4\}$
M3+m3+M3    major                 M7               major seventh
M3+m3+m3    major                 m7               dominant seventh
m3+M3+M3    minor                 M7               minor/major seventh
m3+M3+m3    minor                 m7               minor seventh
m3+m3+M3    diminished            m7               half-diminished seventh
m3+m3+m3    diminished            d7               diminished seventh
M3+M3+m3    augmented             M7               augmented/major seventh

Comment 6.2. What about the stack M3+M3+M3? In this case $\{P_1, P_2, P_3\}$ is an augmented triad, and $P_4$ is an octave above $P_1$, which makes $\{P_1, P_2, P_3, P_4\} = \{P_1, P_2, P_3\}$ as pitch-class sets. Thus we do not get a 4-chord in this case.

Notation. The notation for seventh chords is similar to that of triads. As before, we illustrate with examples. Note: there is a wide variety of accepted chord notation. Here I follow the convention Dmitri Tymoczko outlines in his Music 105 lecture notes.

Type              Examples                                  Notation
Major             {C, E, G, B}, {F♯, A♯, C♯, E♯}            Cmaj7, F♯maj7
Dominant          {C, E, G, B♭}, {F♯, A♯, C♯, E}            C7, F♯7
Minor/major       {C, E♭, G, B}, {F♯, A, C♯, E♯}            cmaj7, f♯maj7
Minor             {C, E♭, G, B♭}, {F♯, A, C♯, E}            c7, f♯7
Half-diminished   {C, E♭, G♭, B♭}, {F♯, A, C, E}            cø7, f♯ø7
Diminished        {C, E♭, G♭, B♭♭}, {F♯, A, C, E♭}          c°7, f♯°7
Augmented/major   {C, E, G♯, B}, {F♯, A♯, C×, E♯}           C+maj7, F♯+maj7

6.3 Operations on chords: transposition

We now model two types of musical procedures performed on chords with corresponding mathematical operations performed on pitch and pitch-class sets. The first, transposition, is an easy generalization of the transposition operations we defined on pitch and pitch-class space. The second, inversion, is an altogether new type of operation.

Definition 18. Any real number $\alpha \in \mathbb{R}$ defines a transposition operation $t_\alpha$ on chords (whether pitch or pitch-class sets), defined as follows. Given a pitch set $X = \{P_1, P_2, \ldots, P_n\}$, we define
\[ t_\alpha(X) = \{P_1 + \alpha, P_2 + \alpha, \ldots, P_n + \alpha\}. \]
Given a pitch-class set $X = \{[P_1]_{12}, [P_2]_{12}, \ldots, [P_r]_{12}\}$, we define
\[ t_\alpha(X) = \{[P_1 + \alpha]_{12}, [P_2 + \alpha]_{12}, \ldots, [P_r + \alpha]_{12}\}. \]
Again, if the context is clear, we will drop the bracket notation and simply compute everything modulo 12.

Comment 6.3. In both cases we see that a transposition $t_\alpha$ is an operation which takes any chord and transposes each pitch in the chord by $\alpha$.

Example 6.4. Consider the transposition $t_{-8}$. As an operation, this takes any chord and transposes each pitch down a minor sixth. For example, given the pitch set $\{-10, 4, 5, 19\}$, we have
\[ \{-10, 4, 5, 19\} \mapsto t_{-8}(\{-10, 4, 5, 19\}) = \{-18, -4, -3, 11\}. \]
When dealing with pitch-class sets, we do the same thing, but take everything modulo 12. For example, given the pitch-class set $\{[2]_{12}, [4]_{12}, [5]_{12}, [7]_{12}\}$, we have
\[ \{2, 4, 5, 7\} \mapsto t_{-8}(\{2, 4, 5, 7\}) = \{2 - 8, 4 - 8, 5 - 8, 7 - 8\} = \{-6, -4, -3, -1\} = \{6, 8, 9, 11\}. \]

Geometric representation of transpositions

As operations on pitch sets, transpositions $t_\alpha$ act as horizontal shifts (right if $\alpha > 0$, left if $\alpha < 0$).

[Figure: the pitch set $\{-10, 4, 5, 19\}$ and its image $\{-18, -4, -3, 11\}$ under $t_{-8}$, plotted on the real line.]

As operations on pitch-class sets, transpositions $t_\alpha$ act as rotations (clockwise if $\alpha > 0$, counterclockwise if $\alpha < 0$).

[Figure: a pitch-class set and its image under $t_{-8}$, a rotation by $8(\pi/6)$.]

12-tone equal-tempered transposition group

As the discussion above illustrates, when restricting our attention to pitch-class sets taken from the 12-tone equal-tempered system, that is to subsets $X \subset \{0, 1, 2, \ldots, 11\}$, we (i) only consider transpositions $t_\alpha$ with $\alpha$ an integer (since otherwise $t_\alpha$ would transpose us straight out of the equal-tempered universe), and (ii) only care what $\alpha$ is up to congruence modulo 12. This motivates the following definition.

Definition 19. Write $\mathbb{Z}/12\mathbb{Z} = \{0, 1, \ldots, 11\}$; that is, we denote $[i]_{12}$ by $i$. The group of equal-tempered pitch-class transpositions is defined as
\[ T_{12} := \{t_0, t_1, t_2, \ldots, t_{11}\}, \]
with group operation $t_i \circ t_j = t_{i+j}$, where the index $i + j$ is taken modulo 12.
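The transposition operations of Definitions 18 and 19 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: pitch-classes are modeled as the integers 0 through 11, chords as frozensets, and the function names `transpose` and `compose` are hypothetical choices, not notation from these notes.

```python
# Illustrative sketch: pitch-classes are integers 0..11, chords are frozensets.
# The names `transpose` and `compose` are assumptions made for this example.

def transpose(chord, alpha):
    """Apply t_alpha to a pitch-class set: add alpha to every element mod 12."""
    return frozenset((p + alpha) % 12 for p in chord)

def compose(i, j):
    """Index of t_i composed with t_j in T_12, namely (i + j) mod 12."""
    return (i + j) % 12

chord = frozenset({2, 4, 5, 7})

# Example 6.4: transposing down a minor sixth (t_{-8}).
print(sorted(transpose(chord, -8)))

# Only alpha mod 12 matters, so t_{-8} and t_4 act identically.
print(transpose(chord, -8) == transpose(chord, 4))

# Applying t_8 after t_7 agrees with applying t_{7+8} = t_3 directly.
print(transpose(transpose(chord, 7), 8) == transpose(chord, compose(8, 7)))
```

Using frozensets mirrors the decision to model chords as sets: repeated or reordered pitch-classes do not change the chord.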
Comment 6.4. It might be objected that $T_{12}$ is simply $\mathbb{Z}/12\mathbb{Z}$ in disguise. More precisely, the map $t_i \mapsto [i]_{12}$ is an obvious isomorphism between the two groups. So why bother introducing new notation? Answer: we wish to understand the elements of $T_{12}$ as transpositions, certain operations acting on the set of pitch-class sets. This explains also why we use the composition symbol ‘$\circ$’ to denote the group operation, so that $t_i \circ t_j$ is understood as the composition of two transpositions done in succession.

Comment 6.5. As usual, if the context is clear, we will denote elements of $T_{12}$ simply as $t_i$, and use modular substitution freely when computing with the group. For example, note that $t_{-8} = t_4$ in $T_{12}$, since $-8 \equiv 4 \pmod{12}$. This is a nice, succinct way of saying that transposing down by a m6 is the same thing as transposing up by a M3 up to octave equivalence. This also means that in our earlier example we could have computed
\[ t_{-8}(\{2, 4, 5, 7\}) = t_4(\{2, 4, 5, 7\}) = \{6, 8, 9, 11\}. \]
More generally, for any $i \in \{0, 1, 2, \ldots, 11\}$, we have $t_{-i} = t_{12-i}$ in $T_{12}$, since $-i \equiv 12 - i \pmod{12}$. This means transposing down by any interval $i$ is the same as transposing up by its inverse interval $12 - i$ up to octave equivalence, as we have already had occasion to observe.

Perhaps the most concrete way of thinking of elements of $T_{12}$ is as rotations around the circle by a number of ticks. In this light equalities like $t_{-i} = t_{12-i}$ are fairly obvious consequences of “clock arithmetic”!

6.4 Operations on chords: inversion

We motivate the definition of inversion by first treating it as an operation on melodies, as opposed to chords. Consider the Contrapunctus No. 5, from J.S. Bach's The Art of Fugue (Die Kunst der Fuge). The piece is built out of various transpositions of the subject and its inversion, shown below.

[Score excerpt: the subject of Contrapunctus No. 5 and its inversion.]

Roughly speaking, we obtain the pitch content of the inversion by reflecting the pitches of the subject through the horizontal axis determined by the pitch A.
This is not literally the case in the example shown. To be a true reflection the C in the inversion would have to be replaced by a C♯, which would give the inversion a major key feel. As such the example is of what is called a diatonic inversion.

Algebraic definition of inversion

It is not difficult to come up with an algebraic description of inversion. To reflect a pitch $P$ through the axis determined by A4 $= 9$ we simply subtract $P - 9$ from the pitch twice: the first subtraction brings the pitch to A, the second subtraction reflects it through to the other side! Thus the combined map sends
\[ P \mapsto P - 2(P - 9) = -P + 18. \]
Let's confirm this operation maps the subject in Bach's example to its inverted form:

P     Name   P' = −P + 18   Name
9     A4     9              A4
2     D4     16             E5
4     E4     14             D5
5     F4     13             C♯5
...   ...    ...            ...

We now can define a general inversion with respect to any fixed pitch or pitch class $Q_0$.

Definition 20. We give separate definitions for pitches and for pitch classes.

1. Fix a pitch $Q_0$. We define inversion with respect to $Q_0$ to be the function $i_{Q_0}$ defined by
\[ i_{Q_0}(P) = P - 2(P - Q_0) = -P + 2Q_0. \]
2. Fix a pitch class $[Q_0]_{12}$. We define inversion with respect to $[Q_0]_{12}$ to be the function $i_{Q_0}$ defined by
\[ i_{Q_0}([P]_{12}) = -[P]_{12} + 2[Q_0]_{12} = [-P + 2Q_0]_{12}. \]

Comment 6.6. There are a lot of letters in play here. Note that $Q_0$ is fixed, and is part of the definition of the inversion. The input here is $P$ (or $[P]_{12}$).

Example 6.5. The pitch $Q_0$ in the definition of $i_{Q_0}$ need not be equal-tempered! For example, consider the inversion with respect to $Q_0 = 1.5$. By definition we have
\[ i_{Q_0}(P) = -P + 2Q_0 = -P + 2(1.5) = -P + 3. \]
This inversion still maps integers to integers, and thus equal-tempered pitches to equal-tempered pitches! Geometrically this reflects pitches through the horizontal line halfway between C♯4 and D4. In particular we have
\[ i_{1.5}(\text{C♯4}) = i_{1.5}(1) = -1 + 3 = 2 = \text{D4}, \quad \text{and} \quad i_{1.5}(\text{D4}) = i_{1.5}(2) = -2 + 3 = 1 = \text{C♯4}, \]
as expected.

Example 6.6.
Consider the same function as an operation on pitch-class space: that is, take $[Q_0]_{12} = [1.5]_{12}$. As usual we will drop the brackets and use modular substitution with impunity. We have $i_{1.5}(P) = -P + 3$ as before. Let's compute $i_{1.5}(P)$ for $P = 0, 1, 2, \ldots, 11$ and see what transformation of the circle we get:

P             0   1   2   3   4    5    6   7   8   9   10   11
i_{1.5}(P)    3   2   1   0   11   10   9   8   7   6   5    4

[Figure: the action of $i_{1.5}$ on the pitch-class circle $X_{\mathrm{pc}}$.]

We see that inversions in pitch-class space are also reflections. Now the function $i_{Q_0}(P)$ reflects points of the circle through the diameter defined by the points $Q_0$ and $Q_0 + 6$. When $Q_0$ is either an integer or half-integer (i.e., $Q_0 = i$ or $Q_0 = i/2$ for some $i \in \{0, 1, \ldots, 11\}$), then $i_{Q_0}(P)$ sends equal-tempered pitch classes to equal-tempered pitch classes. Below you find typical examples of each case:

[Figure: the reflections $i_{1.5}$ and $i_{10}$ of the pitch-class circle $X_{\mathrm{pc}}$.]

Applying inversions to chords

As with transpositions, our inversion functions $i_{Q_0}$ naturally define operations on chords $X = \{P_1, P_2, \ldots, P_n\}$; we simply apply $i_{Q_0}$ to each element of the chord.

Definition 21. Fix $Q_0$ and let $X = \{P_1, P_2, \ldots, P_n\}$. We define
\[ i_{Q_0}(X) = \{i_{Q_0}(P_1), i_{Q_0}(P_2), \ldots, i_{Q_0}(P_n)\}. \]

Comment 6.7. What do inversions do geometrically to chords? The foregoing discussion already gives us the answer: they reflect them! In pitch space chords get reflected through a certain line (determined by the particular inversion); in pitch-class space, chords get reflected through a certain diameter (again, determined by the particular inversion).

Example. Consider the inversion $i_2(P) = -P + 4$ acting on the D major triad $X = \{2, 6, 9\}$, considered as a pitch-class set. Then we have
\[ i_2(X) = \{i_2(2), i_2(6), i_2(9)\} = \{2, -2, -5\} = \{2, 10, 7\}. \]

[Figure: $X = \{2, 6, 9\}$ and $i_2(X) = \{2, 7, 10\}$ plotted on the pitch-class circle $X_{\mathrm{pc}}$.]

Let $X = \{2, 6, 9\}$ (the D major triad) and $Y = \{2, 7, 10\}$ (the G minor triad).
In the last example we saw that $i_2(X) = Y$; our inversion transformed a major triad to a minor triad. The observation is true in general: any inversion $i_{Q_0}$ maps any major triad to a minor triad, and vice versa! The geometry of pitch-class space makes this relatively clear. In pitch-class space we identify a major triad as a sequence of a major third followed by a minor third going clockwise around the circle. When you invert this chord, reflecting it through some given diameter, you are left with a sequence of a major third followed by a minor third going counterclockwise around the circle. This is nothing more than a minor triad, as you can readily verify. At work here is the fact that mathematically speaking a reflection is an orientation reversing transformation of the circle.

Interaction of transpositions and inversions

What happens if you first apply an inversion, and then a transposition, or vice versa? What sort of operation results? Let's focus on pitch-class space, and consider the inversion $i_0(P) = -P + 2 \cdot 0 = -P$. Take any one of our transpositions $t_j \in T_{12}$ and consider the composition $t_j \circ i_0$. To see what sort of operation this is, we see what it does to an arbitrary pitch-class $P$:
\[ t_j \circ i_0(P) = t_j(i_0(P)) = t_j(-P) = -P + j = -P + 2(j/2) = i_{j/2}(P). \]
We just proved that $t_j \circ i_0 = i_{j/2}$ for all $t_j \in T_{12}$. In other words, when we compose a transposition with the inversion $i_0$, we get another inversion!

What happens if we compose in the opposite order: that is, what sort of operation is $i_0 \circ t_j$? Again we see what it does to an arbitrary $P$:
\[ i_0 \circ t_j(P) = i_0(t_j(P)) = i_0(P + j) = -(P + j) = -P - j = -P + (12 - j) = -P + 2\left(\frac{12 - j}{2}\right) = i_{\frac{12-j}{2}}(P), \]
where the step $-P - j = -P + (12 - j)$ is a modular substitution. We just proved that $i_0 \circ t_j = i_{\frac{12-j}{2}}$ for all $j$. So composing in the other order also yields an inversion!

Adding inversions to $T_{12}$

What if we wanted to enlarge our group $T_{12}$ of transpositions by adding the inversion $i_0$, continuing to use composition as our group operation?
Since $t_j \circ i_0 = i_{j/2}$ for any $j$, if we want the group operation to be well-defined we also have to add $i_{j/2}$ to our new group for all $j \in \{0, 1, \ldots, 11\}$. There are exactly 12 of these $i_{j/2}$, and one can show in fact that they are precisely all the inversions of pitch-class space which map equal-tempered pitches to equal-tempered ones. The resulting set
\[ \{t_0, t_1, \ldots, t_{11}, i_0, i_{1/2}, i_1, \ldots, i_5, i_{11/2}\} \]
thus has exactly 24 elements. Is this a group? Yes. In fact, if we set $e = t_0$, $t = t_1$ and $i = i_0$, then you can show that the set above is precisely
\[ \{e, t, t^2, \ldots, t^{11}, i, ti, t^2 i, \ldots, t^{11} i\}, \]
where $t$ and $i$ satisfy
\[ t^{12} = i^2 = e, \qquad i t^j = t^{12-j} i. \]
Look familiar?

Definition 22. Set $t = t_1$ and $i = i_0$, considered as transformations of pitch-class space. The group of equal-tempered transpositions and inversions is the group
\[ M_{12} = \{e, t, t^2, \ldots, t^{11}, i, ti, t^2 i, \ldots, t^{11} i\} \]
consisting of all transpositions and inversions of equal-tempered pitch classes. The elements $t$ and $i$ satisfy the relations
\[ t^{12} = i^2 = e, \qquad i t^j = t^{12-j} i, \]
making $M_{12}$ isomorphic to the dihedral group $D_{12}$.

Comment 6.8. Geometrically speaking $M_{12}$ is a subgroup of the group $M$ of all rigid motions of the circle; the ‘M’ thus stands for ‘motion’. A rigid motion is a transformation of a space which preserves distance. You can prove that every rigid motion of the circle is either a rotation (transposition) or a reflection (inversion). It follows that $M$ is actually an infinite group, though it has a similar structure to $D_{12}$: if we let $i$ be the same inversion (reflection) as above, and let $t_\theta$ be transposition (rotation) by $\theta$, then every element of $M$ can be written as $t_\theta$ or $t_\theta i$ for some real number $\theta \in [0, 12)$. We obtain the finite group $M_{12}$ by taking only those $t_\theta$ and $t_\theta i$ with $\theta = j$ an integer. This is precisely the set of those rotations and reflections which map integer classes to integer classes on our circle.
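The structure of $M_{12}$ can be checked by brute force. The following Python sketch is an illustration only: it assumes an encoding, not taken from these notes, in which the pair `(sign, shift)` stands for the map $P \mapsto \text{sign} \cdot P + \text{shift}$ modulo 12, so that `(1, j)` is $t_j$ and `(-1, j)` is the inversion $i_{j/2}$. The helper names `act`, `compose` and `t_power` are hypothetical.

```python
from itertools import product

# Assumed encoding (for illustration): the pair (sign, shift) stands for the
# map P -> sign * P + shift (mod 12). sign = +1 gives the transposition
# t_shift; sign = -1 gives the inversion i_{shift/2}.

def act(g, p):
    """Apply the group element g to the pitch-class p."""
    sign, shift = g
    return (sign * p + shift) % 12

def compose(g, h):
    """Return g o h: apply h first, then g."""
    s1, a1 = g
    s2, a2 = h
    return (s1 * s2, (s1 * a2 + a1) % 12)

def t_power(j):
    """The transposition t^j."""
    return (1, j % 12)

i = (-1, 0)  # the inversion i_0: P -> -P

# The 24 elements t^j and t^j i for j = 0, ..., 11.
M12 = [t_power(j) for j in range(12)] + [compose(t_power(j), i) for j in range(12)]

# Closure under composition, and the relation i t^j = t^(12-j) i.
closed = all(compose(g, h) in M12 for g, h in product(M12, repeat=2))
relation_holds = all(
    compose(i, t_power(j)) == compose(t_power(12 - j), i) for j in range(12)
)

# The inversion i_2 (P -> -P + 4) sends the D major triad to the G minor triad.
g_minor = {act((-1, 4), p) for p in {2, 6, 9}}

print(len(set(M12)), closed, relation_holds, sorted(g_minor))
```

Storing each element as a pair rather than as a Python function makes equality of group elements a simple tuple comparison, which is what allows the closure check to be written in one line.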
The group $M_{12}$ can also be thought of as the group of rigid motions (or symmetries) of the regular dodecagon, the regular polygon with 12 sides.

Classic 7 (Béla Bartók's Mikrokosmos). Mikrokosmos by Hungarian composer Béla Bartók (1881-1945) is a collection of 153 piano studies, ranging from beginner to professional level, published in 6 volumes. Volumes 5 and 6 contain many little masterpieces of twentieth century composition, and in a similar manner to the works of Bach, are a treasure trove for those looking for interesting examples of various composition techniques.

Bartók was a towering figure among Hungarian composers. Ligeti himself, though never a student of Bartók's, very deliberately attempted to get out from under his shadow. This was in fact one of Ligeti's goals in writing Musica Ricercata, though as you may notice from the following Mikrokosmen, he didn't completely succeed.

Mikrokosmos, No. 140, “Free Variations”

[Score excerpt: Mikrokosmos No. 140.]

The right hand in mm. 13-19 is clearly an inversion of the left hand in mm. 1-7. Octave equivalence is an issue here. Do we consider this an operation on pitch space, in which case it would be $i_4$, the inversion with respect to E4? Or do we consider it as an operation on pitch-class space, in which case it would be $i_9$, the inversion with respect to A?

Mikrokosmos, No. 142, “From the Diary of a Fly”

Mikrokosmos No. 142 makes prominent use of melodic inversion through the axis splitting G and A♭ ($i_{7.5}$), as well as chordal inversion through the axis splitting E♭ ($i_{13.5}$):

[Score excerpt: Mikrokosmos No. 142.]

7 Chord-types

When discussing triads we naturally sorted them into 4 different types: major, minor, diminished and augmented. Musically speaking, what type of triad a chord is, is determined by its intervallic content. Mathematically speaking, the type of a triad is determined by the various distances between pitches; more precisely, by the sequence of these distances, moving around the circle clockwise.
This is precisely the information that is preserved when we apply a transposition to a chord, and is the motivation for the following definition.

Definition 23. We work within pitch-class space. Two chords $X$ and $Y$ are of the same type if the one is a transposition of the other: that is, if there is a transposition $t_i \in T_{12}$ such that $t_i(X) = Y$. This defines an equivalence relation on the set of all chords (pitch-class sets). The equivalence classes determined by this relation are called chord-types. Thus given a chord $X$, the set of all its transpositions forms a chord-type, which we call the chord-type of $X$. Furthermore, chords $X$ and $Y$ have the same chord-type if and only if they are transpositions of one another.

Comment 7.1. Two chords being of the same type is very closely related, though not identical, to the notion of two shapes in the plane being congruent. Shapes $X_1$ and $X_2$ are congruent if the one can be obtained from the other via a rigid motion: that is, some combination of translation, rotation and reflection. Our definition of chord-type excludes the reflection option. Why? If we allowed inversion, then we would have to say that $\{0, 4, 7\}$ is the same type as $\{0, 3, 7\}$, which would erase the difference between major and minor triads! The music theorist Allen Forte, who died this October (2014) at the age of 87, and who is largely responsible for introducing these set theory ideas into music theory, did in fact include inversion when considering chord-types. Thus according to his taxonomy, the major and minor triad are one and the same beast!

Example 7.1. The chord-type of $X = \{0, 4, 7\}$ = {C, E, G} is defined as the set of all equal-tempered transpositions of $X$. This is just the set
\[ \{\{0, 4, 7\}, \{1, 5, 8\}, \{2, 6, 9\}, \ldots\} = \{\{\text{C, E, G}\}, \{\text{D♭, F, A♭}\}, \{\text{D, F♯, A}\}, \ldots\} \]
of all twelve of the major triads. In this sense our equivalence class does indeed capture the property of being a major chord.

Example 7.2.
Compute the chord-type of $X = \{0, 3, 6, 9\}$ = {C, E♭, G♭, A = B♭♭}. Begin by computing the transpositions of $X$ one by one:

$t_0(X) = \{0, 3, 6, 9\}$
$t_1(X) = \{1, 4, 7, 10\}$
$t_2(X) = \{2, 5, 8, 11\}$
$t_3(X) = \{3, 6, 9, 12\} = \{3, 6, 9, 0\} = X$

Once we are back at $X$ the list repeats. Thus the chord-type of $X$ is
\[ \{\{0, 3, 6, 9\}, \{1, 4, 7, 10\}, \{2, 5, 8, 11\}\} = \{c^{\circ 7}, c♯^{\circ 7}, d^{\circ 7}\}. \]
These are precisely all three diminished seventh chords. So in this case our chord-type captures the property of being a diminished seventh chord!

The prime form of a chord-type

Not all of our chord-types have natural names like ‘minor triad’ or ‘half-diminished seventh’. To make up for this, we establish a convention for picking one particular representative of a chord-type to serve as its name. We will call this the prime form of a chord-type [3].

[3] Forte defines the prime form of a chord with respect to all transpositions and inversions. This of course yields a different notion of chord-type.

Definition 24. The prime form of a chord-type is the unique element $X = \{P_1, P_2, \ldots, P_r\}$ (where $0 \le P_1 < P_2 < \cdots < P_r \le 11$) in the equivalence class such that

1. $P_1 = 0$,
2. $P_r$ is as small as possible,
3. the sequence $(P_2, P_3, \ldots, P_{r-1})$ is as small as possible with respect to the lexicographic order.

Comment 7.2. Ordering sequences lexicographically is the same thing as “alphabetizing” them using the natural ordering of the integers. Thus, for example, the sequence $(1, 4, 2, 3, 1, 4)$ is smaller than the sequence $(1, 4, 2, 5, 1, 4)$; the first term where they differ is the fourth, and $3 < 5$.

Example 7.3. Consider the chord $X = \{1, 4, 6, 9, 10\}$. To find the prime form of its chord-type, we first find all the transpositions of $X$ satisfying properties (1) and (2) from the definition. To do this draw $X$ on the circle and find the biggest gap between consecutive points (going clockwise). Each time you find a gap of this maximal length between pitches $P$ and $Q$ (in clockwise order), rotate the second pitch $Q$ to 0.
In our case the biggest gap is 3, occurring between 1 and 4, 6 and 9, and 10 and 1. These give rise to three transpositions of X satisfying (1) and (2):

$X_1$ = {0, 2, 5, 6, 9}
$X_2$ = {0, 1, 4, 7, 9}
$X_3$ = {0, 3, 5, 8, 9}

The prime form is now the smallest of these three, ordered alphabetically. This is {0, 1, 4, 7, 9}.

7.1 Counting chords of the same type

Why are there 12 major triads, but only 3 diminished seventh chords? Why do both these numbers divide 12? Is there in general a way of knowing how many chords of a certain type there are, or equivalently, of knowing how many different transpositions there are of a given chord X? Group theory provides the answers to all these questions! The results are so (dare I say) beautiful and far-reaching, that I’m sure you will forgive my having to introduce some more terminology.

Group actions

Definition 25. Let G be a group and let S be a set. A group action of G on S is a rule which, given a group element $g \in G$ and an element $s \in S$ of our set, outputs a new element $s' = g \cdot s$ of S, and which satisfies the following axioms:
(i) $e \cdot s = s$ for all $s \in S$ (identity element acts trivially);
(ii) $(g_1 g_2) \cdot s = g_1 \cdot (g_2 \cdot s)$ for all $g_1, g_2 \in G$ and $s \in S$ (associativity).

Comment 7.3. This is just a generalization of our current setup. We have a group $G = T_{12}$ of transpositions that acts on the set S of all chords. The rule for the action $g \cdot s$ is simply given by my definition of transposition: given a transposition $t_j$ in $T_{12}$ and a chord X in S, we define $t_j \cdot X = t_j(X)$.

Stabilizers and orbits

Definition 26. Suppose G acts on the set S. Fix an element $s \in S$. The orbit of s, denoted $O_s$, is the set of all translates of s by elements of G:
$O_s = \{g \cdot s : g \in G\} = \{g_1 s, g_2 s, \dots\}$.
The stabilizer of s, denoted $\mathrm{Stab}_s$, is the set of elements of G which fix s:
$\mathrm{Stab}_s = \{g \in G : g \cdot s = s\}$.

Comment 7.4. In our setup, given a chord X, its orbit $O_X = \{t_0(X), t_1(X), \dots, t_{11}(X)\}$ is precisely its chord-type, the set of all transpositions of X.
The stabilizer of X is $\mathrm{Stab}_X = \{t_j \in T_{12} : t_j(X) = X\}$, the set of transpositions that fix the chord.

Lagrange’s Theorem

We now state a wonderful theorem that allows us to count the translates of an element s (in our context this will be the transpositions of a chord) by counting the elements of its stabilizer $\mathrm{Stab}_s$. The latter is often easier to count, which is why the theorem is so useful.

Theorem (Lagrange’s Theorem). Let the finite group G act on the set S. Given any $s \in S$, we can count the size of its orbit as follows:
$\#O_s = \#G / \#\mathrm{Stab}_s$.

Example 7.4. Let $G = T_{12}$ and let S be the set of all chords. Take X = {0, 4, 7}. We have $\mathrm{Stab}_X = \{t_0\}$, since no other transposition maps X to itself, as one readily verifies. Thus according to the theorem, the number of transpositions of X is $\#O_X = \#G / \#\mathrm{Stab}_X = 12/1 = 12$, as we already verified! Take Y = {0, 3, 6, 9}. We have $\mathrm{Stab}_Y = \{t_0, t_3, t_6, t_9\}$, as one readily verifies. It follows that $\#O_Y = \#G / \#\mathrm{Stab}_Y = 12/4 = 3$, as we already verified.

Example 7.5. Continue with $G = T_{12}$ and S the set of all chords. How many chords are of the same type as X = {0, 1, 3, 6, 7, 9}? We must count its different transpositions. We have $\mathrm{Stab}_X = \{t_0, t_6\}$ in this case, from which it follows that $\#O_X = \#G / \#\mathrm{Stab}_X = 12/2 = 6$. In this case using the formula is faster than computing the chord-type of X directly, which would be $\{t_0(X) = X, t_1(X), t_2(X), t_3(X), t_4(X), t_5(X)\}$.

Transpositional and inversional symmetry

We easily conclude from Lagrange’s theorem that a chord X has fewer than 12 distinct transpositions if and only if there is a nontrivial transposition $t_j$ such that $t_j(X) = X$. Since $t_j$ acts as rotation, it follows that if $t_j(X) = X$, then the chord X admits a rotational symmetry when considered as a collection of points on the circle. Accordingly, we call such chords (and their corresponding chord-types) symmetric, or more specifically, we say in this case that the chord X has transpositional symmetry by j.
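The orbit and stabilizer counts above are easy to check by brute force. The following is a minimal sketch, not part of the original notes: the function names `transpose`, `orbit` and `stabilizer` are my own, and chords are modeled as Python sets of integers modulo 12.

```python
def transpose(chord, j):
    """Apply t_j to a pitch-class set: add j to every element, modulo 12."""
    return frozenset((p + j) % 12 for p in chord)


def orbit(chord):
    """The set of all distinct transpositions of the chord (its chord-type)."""
    return {transpose(chord, j) for j in range(12)}


def stabilizer(chord):
    """The indices j for which t_j maps the chord to itself as a set."""
    chord = frozenset(chord)
    return [j for j in range(12) if transpose(chord, j) == chord]


examples = [
    ("major triad", {0, 4, 7}),
    ("diminished seventh", {0, 3, 6, 9}),
    ("augmented triad", {0, 4, 8}),
]
for name, chord in examples:
    stab = stabilizer(chord)
    # The theorem predicts #O_X = #G / #Stab_X, with #G = 12.
    assert len(orbit(chord)) == 12 // len(stab)
    print(name, len(orbit(chord)), stab)
```

For the three chords listed, the loop reports orbit sizes 12, 3 and 4, in agreement with the formula $\#O_X = 12 / \#\mathrm{Stab}_X$.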
We have already seen two examples of symmetric chord-types: written in prime form these are {0, 4, 8} (transpositional symmetry by 4) and {0, 3, 6, 9} (transpositional symmetry by 3), the augmented triad and diminished seventh chord. Similarly, we say a chord X has inversional symmetry if there is an inversion $i_{Q_0}$ such that $i_{Q_0}(X) = X$. Since inversion is reflection, geometrically this means the chord X admits an axis of symmetry.

7.2 Counting chord-types

Recall that there are 220 different trichords. How many different chord-types do these divide up into? We have the four chord-types corresponding to the four types of triads. These account for just 3 · 12 + 4 = 40 of the 220 trichords. How do the remaining 180 break up into chord-types? More generally, we can fix an n and ask how many different chord-types there are among the set of all n-chords. This is not such an easy counting problem, but luckily group theory comes to our aid in spectacular fashion. From the group theoretic standpoint, when a group G acts on a set S, it decomposes S into distinct orbits $O_s$. We often want to know how many different orbits S decomposes into under this action. This question has a surprising answer in the form of Burnside’s Lemma.

Classic 8 (Burnside’s lemma).

It turns out we can count the number of distinct orbits in S by counting something seemingly totally unrelated: the number of elements of S fixed by a given group element $g \in G$.

Definition 27. Let G be a group acting on a set S. Given $g \in G$, the set of fixed points of g is defined as the set
$\mathrm{Fix}_g := \{s \in S : g \cdot s = s\}$
of all elements of S fixed by g.

Comment 7.5. There are clearly some structural similarities between $\mathrm{Stab}_s$ and $\mathrm{Fix}_g$, and it is easy to get the two notions muddled up. The following diagram might be useful in keeping things straight, at least as far as showing where everything lives. Everything in the left-most column lives in G; everything in the right lives in S.
G                                                  acts on    S
$g \in G$                                          ⟹          $\mathrm{Fix}_g = \{s \in S : g \cdot s = s\}$
$\mathrm{Stab}_s = \{g \in G : g \cdot s = s\}$    ⟸          $s \in S$

Burnside’s lemma. Let $G = \{g_1, g_2, \dots, g_m\}$ be a finite group acting on a set S. Let $N_{\text{orbits}}$ be the number of orbits that S decomposes into under this action. Then
$$N_{\text{orbits}} = \frac{1}{\#G} \sum_{g \in G} \#\mathrm{Fix}_g = \frac{1}{\#G}\left(\#\mathrm{Fix}_{g_1} + \#\mathrm{Fix}_{g_2} + \cdots + \#\mathrm{Fix}_{g_m}\right).$$

Comment 7.6. I omit the proof here, as I did also with Lagrange’s theorem, simply because of time constraints. I’m happy to report, however, that I consider both proofs well within your mathematical capabilities at this point.

We apply the preceding to the situation when $G = T_{12}$ acts on S, the set of all n-chords, in which case orbits correspond to chord-types of n-chords. We start off with n = 2 to get warmed up.

Example 7.6. The set S of all 2-chords decomposes into a number of distinct chord-types under the action of $T_{12}$. Since a 2-chord $\{P_1, P_2\}$ is just an interval, we call these chord-types interval-types. We count $N_2$, the number of interval-types, using Burnside’s lemma:
$$N_2 = \frac{1}{\#T_{12}} \sum_{t_j \in T_{12}} \#\mathrm{Fix}_{t_j} = \frac{1}{12}\left(\#\mathrm{Fix}_{t_0} + \#\mathrm{Fix}_{t_1} + \cdots + \#\mathrm{Fix}_{t_{11}}\right).$$
So now all we have to do is count $\mathrm{Fix}_{t_j}$; that is, for each transposition $t_j$ we have to count the number of intervals $\{P_1, P_2\}$ it fixes. This is easy. First off we have $\#\mathrm{Fix}_{t_0} = 66$, since $t_0$ fixes all $66 = \binom{12}{2}$ dyads. Next we observe that $\#\mathrm{Fix}_{t_6} = 6$, as it fixes all and only the 6 tritones {0, 6}, {1, 7}, ..., {5, 11}. Lastly, $\#\mathrm{Fix}_{t_j} = 0$ for all other j. Why? Think geometrically! Putting it all together, we conclude that
$$N_2 = \frac{1}{12}(66 + 6) = 72/12 = 6.$$

Comment 7.7. Does the last example make sense? The conclusion was that there are only 6 different interval types, but shouldn’t there be 12 types (minor/major second, minor/major third, etc.) corresponding to the twelve possible lengths of the interval?
Prime form    Interval type
{0, 1}        minor second = major seventh
{0, 2}        major second = minor seventh
{0, 3}        minor third = major sixth
{0, 4}        major third = minor sixth
{0, 5}        perfect fourth = perfect fifth
{0, 6}        tritone

Figure 6: The prime forms for the 6 different interval-types

Ah, we have forgotten that we work in pitch-class space, so there are in fact only 6 possible lengths of an interval $\{P_1, P_2\}$: we take the shortest path around the circle! Put another way, all inversely related intervals define the same interval-type!

Example 7.7. Take n = 3. The set S of all 3-chords decomposes into a number of distinct chord-types under the action of $T_{12}$. Call these trichord-types. We count $N_3$, the number of trichord-types, using Burnside’s lemma:
$$N_3 = \frac{1}{12}\left(\#\mathrm{Fix}_{t_0} + \#\mathrm{Fix}_{t_1} + \cdots + \#\mathrm{Fix}_{t_{11}}\right).$$
Note: our computation will differ from the n = 2 case, as now we count how many trichords $\{P_1, P_2, P_3\}$ are fixed by each $t_j$. As before $\#\mathrm{Fix}_{t_0} = 220 = \binom{12}{3}$, since $t_0$ fixes all trichords. Furthermore, $\#\mathrm{Fix}_{t_4} = 4$, as $t_4$ fixes the 4 different augmented triads. It follows that $t_8 = t_4 \circ t_4$ also fixes these and only these trichords; thus $\#\mathrm{Fix}_{t_8} = 4$. Lastly, $\#\mathrm{Fix}_{t_j} = 0$ for all other j by another geometric argument: for a trichord to be fixed by a rotation, it must be symmetric, hence an augmented triad. Thus
$$N_3 = \frac{1}{12}(220 + 4 + 4) = 228/12 = 19.$$
There are exactly 19 different trichord-types!

8 Scales

How should we model musical scales? Our first inclination, especially after listening to our downstairs neighbor diligently pound through all 12 major scales, is to treat these as sequences of pitch-classes. Order matters here, right? For example, we would represent the C major scale as the sequence (0, 2, 4, 5, 7, 9, 11) (or (C, D, E, F, G, A, B), using pitch names) and the F♯ major scale as (6, 8, 10, 11, 1, 3, 5) (or (F♯, G♯, A♯, B, C♯, D♯, E♯), using pitch names).
Looking at these two sequences carefully, however, we see that the particular ordering of the chosen pitches here is not all that interesting: we have just listed the given pitch-classes in their natural order around the pitch-class circle. In other words, our sequences don’t contain much more information than the corresponding sets {0, 2, 4, 5, 7, 9, 11} and {6, 8, 10, 11, 1, 3, 5}. (The sequences do in fact contain one more piece of information, namely who goes first, but this is easily dealt with.) Furthermore, scales are often treated by composers as fixed collections of pitches from which they draw subsets in order to build chords or melodies. Accordingly we will model scales as sets of pitch-classes, just as we did with chords, but will develop some additional theory to reflect their particular musical functions.

Note: for the rest of this section we will work exclusively in pitch-class space. Accordingly we will drop the bracket notation, and use modular arithmetic with impunity.

Definition 28. We will call a scale any subset $X = \{P_1, P_2, \dots, P_r\}$ of r distinct pitch-classes. Most of the scales we will consider will be equal-tempered, which means as usual that the $P_i \in \{0, 1, 2, \dots, 11\}$ are taken from our set of 12 equal-tempered pitch-classes. Furthermore, we typically will insist that r ≥ 5; i.e., you should have at least 5 pitches to be considered a scale. Lastly, as with chords we say that two equal-tempered scales X and Y are of the same type if there is a transposition $t_j \in T_{12}$ such that $t_j(X) = Y$ (i.e., the one is a transposition of the other). The equivalence classes determined by this relation are called scale-types.

Example 8.1. The scale {0, 2, 4, 5, 7, 9, 11} is called the C diatonic scale. It can be described as starting with the pitch C and applying in order the following sequence of whole (W = M2) and half step (H = m2) transpositions: (W, W, H, W, W, W, H).
The same description in terms of W and H allows us to define diatonic scales starting with any pitch. Thus the G diatonic scale is just {7, 9, 11, 0, 2, 4, 6} = {0, 2, 4, 6, 7, 9, 11}. It is clear each such diatonic scale is just a transposition of the C diatonic scale, and thus together these comprise a single scale-type, which we call diatonic. How many different diatonic scales are there? Is it possible, for example, that the G♭ diatonic scale is just the same thing as the B♭ diatonic scale written in a different order? Use Lagrange’s theorem! The stabilizer of the C diatonic collection is trivial, so its orbit has 12/1 = 12 different transpositions in it. There are indeed 12 different diatonic scales!

Modes

Let’s return to our earnest piano student downstairs. Listening more closely we hear he actually plays two different versions of the C diatonic scale: one that begins with C, the sequence (0, 2, 4, 5, 7, 9, 11), and one that begins with A, the sequence (9, 11, 0, 2, 4, 5, 7). This difference is not captured currently in our mathematical model of scales, but we fix this easily with the notion of a mode.

Mode                     W-H sequence             Name
(C, D, E, F, G, A, B)    (W, W, H, W, W, W, H)    C ionian (or C major)
(D, E, F, G, A, B, C)    (W, H, W, W, W, H, W)    D dorian
(E, F, G, A, B, C, D)    (H, W, W, W, H, W, W)    E phrygian
(F, G, A, B, C, D, E)    (W, W, W, H, W, W, H)    F lydian
(G, A, B, C, D, E, F)    (W, W, H, W, W, H, W)    G mixolydian
(A, B, C, D, E, F, G)    (W, H, W, W, H, W, W)    A aeolian (or A natural minor)
(B, C, D, E, F, G, A)    (H, W, W, H, W, W, W)    B locrian

Figure 7: The seven modes of the C diatonic scale

Definition 29. Let $X = \{P_1, P_2, \dots, P_r\}$ be a scale, and assume the pitch-classes are written in a clockwise sequential order. A mode of X is the sequence $(P_j, P_{j+1}, \dots, P_r, P_1, \dots, P_{j-1})$ you get by starting with a pitch $P_j$ and working around the scale in clockwise fashion. Given a mode $(Q_1, Q_2, \dots, Q_r)$, we call the i-th pitch in the mode the i-th scale degree of the mode (or scale degree i), denoted $\hat{i}$. We will use the same terminology when dealing with scales too, at least when their names indicate a preferred “first” pitch. For example, in the D diatonic scale, scale degree 3 is $\hat{3}$ = F♯, and scale degree 7 is $\hat{7}$ = C♯.

Diatonic modes

In general a scale containing r distinct pitches will have r different modes, determined by who goes first. As with chords, we transpose modes simply by transposing each pitch:
$t_j((P_1, P_2, \dots, P_r)) := (t_j(P_1), t_j(P_2), \dots, t_j(P_r))$.
Since transposition preserves the W-H sequences above, these define the different mode-types for diatonic modes, and we can use them to generate any mode starting with any pitch. For example the F dorian mode is (F, G, A♭, B♭, C, D, E♭). However, it is perhaps easier to just remember the white note modes and transpose these accordingly.

Comment 8.1. It is not just scalar runs played by our downstairs neighbor that we identify with a particular mode. When analyzing music, we often describe entire passages as being written in a particular mode: e.g., “this passage is in F lydian”, or “here the composer switches to a G dorian mode”. Such assertions indicate two musical properties:
1. the underlying scalar collection the composer is using, and
2. a particular pitch that is given special emphasis, sometimes called the tonal center of the passage.
For example, a passage written in G dorian makes use of the F diatonic collection = {F, G, A, B♭, C, D, E} = {0, 2, 4, 5, 7, 9, 10}, and gives special emphasis to G somehow: perhaps the melody begins and ends on G, for example. In general the scalar collection (1) can be indicated fairly unambiguously, whereas the tonal center (2) can be trickier to identify.

Example 8.2 (“Paddy’s Green Shamrock Shore”, performed by Paul Brady).
All pitches here are white notes (note the F♯ in the signature is always made natural!), making the scalar collection here C diatonic. As G is clearly the tonal center of the piece, we conclude it is written in G mixolydian.

Example 8.3 (“I can’t explain”, The Who). (I lifted this example from Tymoczko’s Music 105 lecture notes.) The piece opens with the repeated sequence of major triads E-D-A-E. Collecting all the pitches in these chords yields the A diatonic collection {A, B, C♯, D, E, F♯, G♯}. As E is clearly the preferred pitch, this is E mixolydian.

8.1 Generated scales

Recall that we can generate the entire 12-tone scale by starting with a pitch and transposing up repeatedly by a perfect fifth. If we start with F, then the first seven pitches in this sequence are precisely the pitches of the C diatonic scale: {F, C, G, D, A, E, B}. We say in this case that the scale is generated.

Definition 30. A scale X is generated if there is a pitch $P_1$ and a transposition t such that
$X = \{P_1, t(P_1), t^2(P_1), \dots, t^{r-1}(P_1)\}$.
(As usual, $t^j(P)$ means transpose by t a total of j times.)

Pentatonic scale

The first five pitches of the sequence of fifths starting on F comprise what is called a pentatonic scale: {D, F, G, A, C}. By definition it is a generated scale, and a subscale of the diatonic scale. The prime form of the pentatonic scale is {0, 2, 4, 7, 9}. I will name a pentatonic scale according to the unique pitch that functions as 0 in the prime form. Thus {F, G, A, C, D} is the F pentatonic scale. As the naming scheme suggests, there are 12 different pentatonic scales. (Use Lagrange’s theorem!) Note that the black notes of the keyboard {6, 8, 10, 1, 3} also form a pentatonic scale: the G♭ pentatonic scale. This should come as no surprise. If we pick up our sequence of fifths where we left off after generating the white notes, we get precisely the five black notes {G♭, D♭, A♭, E♭, B♭}.
This is most often the first pentatonic scale we meet in our musical life, and the black note pattern is the best way of remembering the intervallic content of the pentatonic scale.

Stacks of stacks

If you try creating generated scales with smaller intervals, thirds for example, you get “scales” with fewer than five notes, and with sizable gaps between pitches. For example, if we start with C = 0 and use a major third as our generating interval, we get the “scale” {0, 4, 8}, which is none other than our augmented triad. Though this stack of thirds is not enough to form a scale, we can combine it with other augmented triads to get various 6-note scales:

{0, 4, 8} + {1, 5, 9} = {0, 1, 4, 5, 8, 9}, the hexatonic scale (or augmented scale)
{0, 4, 8} + {2, 6, 10} = {0, 2, 4, 6, 8, 10}, the whole-tone scale
{0, 4, 8} + {3, 7, 11} = {0, 3, 4, 7, 8, 11} = $t_3$({0, 1, 4, 5, 8, 9}).

Note that the whole-tone scale is a generated scale, using transposition by a whole tone. The stabilizer of the whole-tone scale is $H = \{t_0, t_2, t_4, t_6, t_8, t_{10}\}$. Thus there are 12/6 = 2 whole-tone scales. We name them as follows:

WT-0 = {0, 2, 4, 6, 8, 10} = {C, D, E, F♯, G♯, A♯}
WT-1 = {1, 3, 5, 7, 9, 11} = {D♭, E♭, F, G, A, B}

Octatonic scale

If we play the same game with a minor third, we begin with C = 0 and generate a diminished seventh chord {0, 3, 6, 9}. Up to transposition, combining any two distinct diminished seventh chords always produces the same scale-type, called the octatonic:

{0, 3, 6, 9} + {1, 4, 7, 10} = {0, 1, 3, 4, 6, 7, 9, 10}

How many different octatonic scales are there? The stabilizer is $H = \{t_0, t_3, t_6, t_9\}$. Thus there are 12/4 = 3 different octatonic scales. We will denote them as follows:

$\mathrm{Oct}_{0,1}$ = {0, 1, 3, 4, 6, 7, 9, 10}
$\mathrm{Oct}_{0,2}$ = {0, 2, 3, 5, 6, 8, 9, 11}
$\mathrm{Oct}_{1,2}$ = {1, 2, 4, 5, 7, 8, 10, 11}

We pause here to collect information about our current list of scale-types.
Name          Step sequence               Prime form                    Stab                       Interval vector
diatonic      (2, 2, 1, 2, 2, 2, 1)       {0, 1, 3, 5, 6, 8, 10}        $\{t_0\}$                  254361
pentatonic    (2, 2, 3, 2, 3)             {0, 2, 4, 7, 9}               $\{t_0\}$                  032140
hexatonic     (1, 3, 1, 3, 1, 3)          {0, 1, 4, 5, 8, 9}            $\langle t_4 \rangle$      303630
whole-tone    (2, 2, 2, 2, 2, 2)          {0, 2, 4, 6, 8, 10}           $\langle t_2 \rangle$      060603
octatonic     (1, 2, 1, 2, 1, 2, 1, 2)    {0, 1, 3, 4, 6, 7, 9, 10}     $\langle t_3 \rangle$      448444

The interval vector $a_1 a_2 a_3 a_4 a_5 a_6$ of a scale gives the number $a_i$ of intervals contained in the scale of length i half steps. The step sequence just indicates the number of half steps between successive pitches in the scale. These sequences can be read straight off of the prime form, though I have cycled the diatonic sequence around to its most familiar form (viz., “whole, whole, half, whole, whole...”). For the stabilizer column, the notation $\langle t_j \rangle$ denotes the subgroup of $T_{12}$ generated by $t_j$. Thus $\langle t_2 \rangle = \{t_0, t_2, t_4, t_6, t_8, t_{10}\}$, and $\langle t_3 \rangle = \{t_0, t_3, t_6, t_9\}$.

[Figure: Geometric summary with inversional symmetry indicated.]

8.2 Small-gap scales

One of the defining characteristics of the diatonic scale is that the gaps between successive pitches are no more than 2 half steps, and that there are never two consecutive gaps of size one half step. It is natural then to consider all scale-types satisfying these two properties, as they will be in some sense diatonic-like. It turns out that, up to translation, there are not so many.

whole-tone    {C, D, E, F♯, G♯, A♯}           (2, 2, 2, 2, 2, 2)
diatonic      {C, D, E, F, G, A, B}           (2, 2, 1, 2, 2, 2, 1)
acoustic      {C, D, E, F♯, G, A, B♭}         (2, 2, 2, 1, 2, 1, 2)
octatonic     {C, C♯, D♯, E, F♯, G, A, B♭}    (1, 2, 1, 2, 1, 2, 1, 2)

(Note: The acoustic scale is so called, as these pitches are the equal-tempered best approximation of the first 7 pitches of the harmonic scale.)
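The claim that only a handful of scale-types have gaps of at most 2 half steps and no two consecutive half steps can be checked by exhaustive search. Here is a rough sketch, with names of my own choosing (`canonical`, `diatonic_like_types`). It assumes a scale-type is represented by its cyclic step sequence, so that rotating the sequence gives the same type.

```python
from itertools import product


def canonical(steps):
    """Pick the smallest rotation of a cyclic step sequence as its representative."""
    steps = tuple(steps)
    return min(steps[i:] + steps[:i] for i in range(len(steps)))


def diatonic_like_types():
    """Cyclic sequences of steps 1 and 2 summing to 12 with no two adjacent 1s."""
    types = set()
    # With steps of size 1 or 2 summing to 12, the number of pitches r is 6..12.
    for r in range(6, 13):
        for steps in product((1, 2), repeat=r):
            if sum(steps) != 12:
                continue
            # Circular check: the last step is adjacent to the first.
            if any(steps[i] == 1 and steps[(i + 1) % r] == 1 for i in range(r)):
                continue
            types.add(canonical(steps))
    return sorted(types, key=lambda s: (len(s), s))


for steps in diatonic_like_types():
    print(len(steps), steps)
```

Under these assumptions the search returns exactly four step sequences, which are rotations of the whole-tone, acoustic, diatonic and octatonic sequences in the table above.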
In A Geometry of Music Dmitri Tymoczko defines an n-gap scale to be one where the gap between successive pitches is at most n half steps. He groups 2-gap and 3-gap scales under the general heading of small-gap scales. Tymoczko adds two 3-gap seven note scales (harmonic minor and harmonic major) to our list, and we will follow suit here, yielding the following (final) table of scale-types:

pentatonic        {C, D, E, G, A}                 (2, 2, 3, 2, 3)
hexatonic         {C, C♯, E, F, G♯, A}            (1, 3, 1, 3, 1, 3)
whole-tone        {C, D, E, F♯, G♯, A♯}           (2, 2, 2, 2, 2, 2)
diatonic          {C, D, E, F, G, A, B}           (2, 2, 1, 2, 2, 2, 1)
acoustic          {C, D, E, F♯, G, A, B♭}         (2, 2, 2, 1, 2, 1, 2)
harmonic minor    {C, D, E♭, F, G, A♭, B}         (2, 1, 2, 2, 1, 3, 1)
harmonic major    {C, D, E, F, G, A♭, B}          (2, 2, 1, 2, 1, 3, 1)
octatonic         {C, C♯, D♯, E, F♯, G, A, B♭}    (1, 2, 1, 2, 1, 2, 1, 2)

As observed by Tymoczko, this collection of scales is “tonally complete” in the following sense: any chord X which does not contain a chromatic cluster (three or more consecutive pitches separated by half step) is contained within one of these scales.

Claude Debussy, Préludes I, “Voiles”
Igor Stravinsky, Petroushka, II. Chez Petroushka
Olivier Messiaen, Vingt Regards sur l’Enfant-Jésus, I. Regard du Père

8.3 Scalar intervals, transpositions and inversions

Tymoczko likes to think of a scale as a ruler that measures pitch-class space in a particular way, in terms of scalar steps.

Definition 31. Given a scale $X = \{P_1, P_2, \dots, P_r\}$, where we assume the $P_i$ are listed in clockwise order, we say an interval of the form $\{P_i, P_{i+k}\}$ has a scalar length of k (scalar) steps. Following the interval naming conventions of the diatonic scale, we call $\{P_i, P_{i+1}\}$ a (scalar) second, $\{P_i, P_{i+2}\}$ a (scalar) third, etc.

Example 8.4. Let $X = \mathrm{Oct}_{0,1}$ = {0, 1, 3, 4, 6, 7, 9, 10}. Then {0, 4} is an octatonic fourth, since 4 is three scalar steps up from 0. Similarly, {1, 7} is an octatonic fifth. Let X = {0, 2, 4, 7, 9}, the C pentatonic scale.
Then X has two different kinds of pentatonic seconds: those of chromatic length 2 ({0, 2}, {2, 4}, {7, 9}), and those of chromatic length 3 ({4, 7}, {9, 0}).

Scalar transposition

Definition 32. Given a scale $X = \{P_1, P_2, \dots, P_r\}$ written in clockwise order, we define scalar transposition by k steps to be the function $st_k : X \to X$ defined as $st_k(P_i) = P_{i+k}$, where we must take i + k modulo r for this to make sense. Intuitively, the scalar transposition $st_k$ shifts each pitch k places “forward” in the scale.

Example 8.5. Let X = {0, 2, 4, 5, 7, 9, 11}, the C diatonic scale, and consider the scalar transposition $st_1$ that shifts everything up by 1. Then we have $st_1(0) = 2$, $st_1(2) = 4$, $st_1(4) = 5$, ..., $st_1(9) = 11$, $st_1(11) = 0$. Note that unlike normal transposition, scalar transpositions move pitches by a varying amount. They do not preserve the (chromatic) distance between scale pitches, but they do preserve the scalar distances!

Comment 8.2. Once we know how to define scalar transpositions on the pitches of a scale, we go on to define scalar transpositions of subsets (chords) and sequences (modes, melodies) in the usual way. For example, let X = {0, 2, 4, 5, 7, 9, 11} again, and consider the scalar melody “Do a deer”: (0, 2, 4). Transposing this up by 1 scalar step yields the new melody (2, 4, 5), which is “Re a drop (of golden sun)”. The example is Tymoczko’s, and his point is that though chromatically speaking the two sequences are different (W-W, versus W-H), when measured by the C diatonic scale they are somehow the same: namely, both melodies simply ascend two scale steps.

Scalar inversion

We can also define a scalar version of inversion. Fix a scale $X = \{P_1, P_2, \dots, P_r\}$, written as usual in clockwise order. To invert around a scalar pitch $P_j$, we take any pitch that is k scalar steps above $P_j$ and send it to the pitch that is k scalar steps below: that is, we want a map that sends $P_{j+k} \mapsto P_{j-k}$.
It is easy to see that the map $P_i \mapsto P_{i-2(i-j)} = P_{-i+2j}$ does the trick. As with scalar transposition, we must compute −i + 2j modulo r for this to make sense.

Definition 33. Given a scale $X = \{P_1, P_2, \dots, P_r\}$ written in clockwise order, and a choice of pitch $P_j$ in the scale, we define scalar inversion with respect to $P_j$ to be the function $si_j : X \to X$ defined as $si_j(P_i) = P_{-i+2j}$, where we must take −i + 2j modulo r for this to make sense.

Example 8.6. Return to our example from The Art of Fugue.

[Score excerpts: Subject; Inversion]

Recall that the inversion is not a strict chromatic inversion of the theme. Can we express this operation in terms of scalar operations? Yes, but to do so, we need to use the harmonic minor scale on D:
$X = \{D, E, F, G, A, B♭, C♯\} = \{P_1, P_2, \dots, P_7\}$.
Now to get the inverted form from the subject, first transpose up by 3 scale steps (using $st_3$) to make the A a D, then invert with respect to D (using $si_1$). The corresponding scalar operation is then
$si_1 \circ st_3(P_i) = si_1(P_{i+3}) = P_{-(i+3)+2} = P_{-i-1} = P_{-i+6} = si_3(P_i)$!
Let’s check that this operation exactly maps the subject onto the inverted form:

$P$                       $P_5$  $P_1$  $P_2$  $P_3$  $P_4$  $P_5$  $P_6$  $P_5$  $P_4$  $P_3$
$si_1 \circ st_3(P)$      $P_1$  $P_5$  $P_4$  $P_3$  $P_2$  $P_1$  $P_7$  $P_1$  $P_2$  $P_3$

8.4 Maximally even scales

What makes the diatonic scale so special? We have seen already that it is rich in intervallic content, as evidenced by its interval vector 254361. This is also apparent in the following property: each scalar interval of the diatonic scale comes in two chromatic flavors (m2/M2, m3/M3, etc.). A scale satisfying this property is called maximally even.

Definition 34. Given a scale $X = \{P_1, P_2, \dots, P_r\}$ written in clockwise order, we say X is maximally even if for every 1 ≤ k ≤ r − 1, the scalar intervals of size k are either all of the same chromatic length, or else come in exactly two consecutive chromatic lengths: that is, one of size ℓ half steps, the other of size ℓ + 1 half steps.

Example 8.7.
Take the C pentatonic scale $X = \{0, 2, 4, 7, 9\} = \{P_1, P_2, P_3, P_4, P_5\}$. We investigate the different chromatic flavors of each scalar interval of size k, 1 ≤ k ≤ 4.

Scalar size k      1       2       3       4
Chromatic sizes    2, 3    4, 5    7, 8    9, 10

This shows the C pentatonic is maximally even, and hence that all pentatonic scales are maximally even.

Example 8.8. Take $\mathrm{Oct}_{0,1} = \{0, 1, 3, 4, 6, 7, 9, 10\} = \{P_1, P_2, P_3, P_4, P_5, P_6, P_7, P_8\}$. We investigate the different chromatic flavors of each scalar interval of size k, 1 ≤ k ≤ 7.

Scalar size k      1       2    3       4    5       6    7
Chromatic sizes    1, 2    3    4, 5    6    7, 8    9    10, 11

This shows $\mathrm{Oct}_{0,1}$ is also maximally even, and thus the same is true for all octatonic scales.

Example 8.9. Take the hexatonic scale $X = \{0, 1, 4, 5, 8, 9\} = \{P_1, P_2, P_3, P_4, P_5, P_6\}$. We investigate the different chromatic flavors of each scalar interval of size k, 1 ≤ k ≤ 5.

Scalar size k      1       2    3       4    5
Chromatic sizes    1, 3    4    5, 7    8    9, 11

This shows the hexatonic scales are not maximally even: the scalar seconds, for example, come in two chromatic lengths, 1 and 3, which are not consecutive.

Why does this property about the relation of scalar intervals to chromatic ones deserve to be called maximally even? One would have guessed that a scale containing n distinct pitches should be called maximally even if the pitches are as evenly distributed around the pitch-class circle as possible. Put another way, for any fixed n, we can always pick n pitches that divide the circle evenly into n segments. However, when $n \nmid 12$, these pitches will not be equal-tempered! A maximally even collection should be the collection of pitches that are the best equal-tempered approximation of this perfectly even distribution. Miraculously, it turns out that our definition of maximally even is equivalent to this!

Theorem. Fix n. Let $X_n = \{0, 12/n, 2(12/n), \dots, (n-1)(12/n)\}$ be the collection of pitches that divides the circle up into equal segments.
(If $n \nmid 12$, then some pitches of $X_n$ will not be equal-tempered.) Let $Y_n$ be the equal-tempered collection you get by taking the integer points closest to each of the pitches $i(12/n)$ in $X_n$. Then $Y_n$ is maximally even. Furthermore, an equal-tempered n-pitch scale $X = \{P_1, P_2, \dots, P_n\}$ is maximally even if and only if it is a transposition of $Y_n$. Thus for each n there is a unique maximally even scale-type!

[Figure: Maximally even: n = 5]
[Figure: Maximally even: n = 7]
[Figure: Maximally even: n = 8]

9 Wrap-up

We have so far developed some fairly sophisticated techniques for understanding chords and scales (and to a lesser extent melodies), as well as some pervasive musical operations (transposition and inversion) that are applied to these chords and scales (and melodies). These tools are good at giving us a static understanding of what’s going on, say, in a particular measure, but do not yet capture the evolving or unfolding nature of a piece of music. For example, one of the most common (simplified) ways of describing a piece of music is as a chord progression, that is, as a sequence of chords $(X_1, X_2, X_3, \dots)$. We described The Who song “I can’t explain”, for example, as the sequence of chords ({E, G♯, B}, {D, F♯, A}, {A, C♯, E}, {E, G♯, B}). The question then arises: what is the logic behind chord progressions? Why or how do composers decide to move from one chord to another? Can we make sense of the notion of two chords being “close” to one another? Such questions fall under the rubric of voice leading, and I begin our wrap-up by giving an informal preview (or prelude, if you will) to some mathematical approaches to voice leading issues.

Introduction to voice leading

We begin by giving a new model for chords: namely, we will now think of a chord as a sequence of pitches $X = (P_1, P_2, P_3, \dots, P_n)$. Here order matters, and we often think of the pitch $P_i$ in the chord X as belonging to a particular voice, which we will call the i-th voice. Now for example the chord $X_1 = (0, 7)$ is different from the chord $X_2 = (7, 0)$, as in the first chord the first voice plays a C, while in the second chord the first voice plays a G. Next we define the space of n-chords to be the set of all such chords, that is,
$$\{(P_1, P_2, \dots, P_n) : P_i \in \mathbb{R}\} =: \mathbb{R}^n.$$
The set $\mathbb{R}^n$ of all n-tuples is called Euclidean n-space and comes equipped with a natural distance function: given $X = (P_1, P_2, \dots, P_n)$ and $Y = (Q_1, Q_2, \dots, Q_n)$, we define
$$d(X, Y) = \sqrt{(P_1 - Q_1)^2 + (P_2 - Q_2)^2 + \cdots + (P_n - Q_n)^2}.$$
We can use this distance function to quantify how close two chords are to one another.

What does the space of all dyads (2-chords) look like? As a set this is just $\mathbb{R}^2 = \{(x, y) : x, y \in \mathbb{R}\}$, otherwise known as the xy-plane. We can represent a given dyad X = (x, y) as a plotted point in this plane, and given two different dyads $X_1 = (x_1, y_1)$ and $X_2 = (x_2, y_2)$, the distance function $d(X_1, X_2)$ is none other than the distance between their corresponding points in the plane. At this point your instructor will draw some pictures on the board. Please be patient.

In A Geometry of Music, in order to return chords back to unordered collections, Tymoczko further quotients the space of dyads out by the relation (x, y) ∼ (y, x). What kind of space do we get when we do this? A Möbius band! In case you don’t like that realization of the Möbius band, here is how Tymoczko does it in his book. In the image below he has rotated the entire xy-plane by 45 degrees. So already for n = 2 we see that the space of chords is geometrically interesting.
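To get a feel for the distance function d(X, Y) defined above, here is a small sketch that evaluates it on ordered chords. The function name `chord_distance` and the sample chords are my own choices for illustration. Chords are assumed to be tuples of real-valued pitches of equal length, one entry per voice.

```python
import math


def chord_distance(x, y):
    """Euclidean distance between two ordered chords with the same number of voices."""
    if len(x) != len(y):
        raise ValueError("chords must have the same number of voices")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


# Same two pitches, but the voices are swapped: each voice moves by 7.
print(chord_distance((0, 7), (7, 0)))

# C major to C minor with one voice moving by a half step.
print(chord_distance((0, 4, 7), (0, 3, 7)))

# C major to F major, keeping C and moving the other voices up by 1 and 2.
print(chord_distance((0, 4, 7), (0, 5, 9)))
```

The swapped dyad comes out far from the original (distance 7√2) even though it uses the same pitches, which is one way of seeing why the quotient by (x, y) ∼ (y, x) is needed to recover unordered chords.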
Depending on whether you consider a dyad as an ordered pair of pitches, as an ordered pair of pitch classes, or as an unordered (multi)set of pitch classes (take your pick!), you get respectively (a) $\mathbb{R}^2$, (b) $\mathbb{R}^2/\sim_{\mathrm{oct}} =: T^2$, the 2-torus, or (c) $T^2/((a, b) \sim (b, a))$, the Möbius band.

For $n = 3, 4, \ldots$ we get even more interesting spaces representing trichords, tetrachords, etc. The descriptions for general $n$ are similar to the $n = 2$ case; the space of $n$-chords is represented either as $\mathbb{R}^n$ (ordered pitches), $T^n$ (ordered pitch classes), or the quotient $T^n/S_n$ (unordered multisets of pitch classes).

As exotic as the various spaces modeling $n$-chords may seem, they are the result of equivalence relations coming straight from musical notions. Furthermore, these spaces are all grounded in a very intuitive concept of the distance between two chords, which states in a quantitative way that two chords $X = (P_1, P_2, \ldots, P_n)$ and $Y = (Q_1, Q_2, \ldots, Q_n)$ are close to one another exactly when each note $P_i$ in $X$ is close to the corresponding $Q_i$ in $Y$. Even more intuitively, the two chords are close if, when playing $X$ and then $Y$ on the piano, your fingers don’t have to move very far!

These spaces also finally afford us a first, elegant mathematical model of a piece of music. The spaces themselves do not represent a particular piece of music. Rather, we think of a piece of music as describing a particular path through one of these spaces. We illustrate this idea with two examples taken from A Geometry of Music: Chopin’s E Minor Prelude, Op. 28, No. 4, and the prelude to Richard Wagner’s opera Tristan und Isolde. A fitting conclusion to this little prelude on voice leading.

[Figure: Chopin’s E Minor Prelude, Op. 28, No. 4. Tymoczko’s reduction of the opening of the piece, from A Geometry of Music.]

[Figure: Chopin’s E Minor Prelude, Op. 28, No. 4. Tymoczko’s geometric representation of the region of 4-chord space traversed during each of the cycles in the opening.]

[Figure: Prelude from Wagner’s Tristan und Isolde. Tymoczko’s geometric representation of the region of 4-chord space traversed during the prelude.]

Conclusions

Recall our original outline for the course.

1. Ontological. Musical objects are very much like mathematical objects. We will describe and define the main musical parameters (melody, rhythm, harmony, timbre) in mathematical language (sets, sequences, topological spaces, groups).

2. Methodological. Mathematical thought, operations and objects are frequently employed both in the analysis and composition of music. We will look closely at examples of mathematical methods in both of these areas of musical practice.

3. Epistemological. Music often bears a strong logical quality. We speak of understanding a piece of music, of one passage of music following from another passage. Can these activities be compared to understanding or following mathematical arguments? We will explore these connections with the aid of formal logic.

Ontological approach

I would argue that mathematics is not being used simply to model aspects of music, but rather, more directly, that music is in fact largely made up of mathematical objects (sets, sequences, rigid motions, paths, etc.), and further that a large part of music involves investigating relationships between these various objects.

As Tymoczko says, describing the Prelude from Tristan und Isolde and the pervasive use therein of the half-diminished seventh chord (or Tristan chord), “the music is in some sense ‘about’ the various ways of resolving the chord.” Similarly, we don’t just use the concept of inversion to understand Bartók’s Mikrokosmos 141 (“Subject and Reflection”); rather, the piece (and the acoustic scale it articulates) is built from this operation, and is, as Tymoczko would have it, somehow a piece about this operation.
As such, a composer is directly engaged with a vast universe of what are usually called combinatorial objects in mathematics: finite or discrete sets with varying degrees of additional structure defined on them. (Tymoczko endeavors to embed this discrete universe into an even bigger, continuous one, and in so doing brings music out of the combinatorial realm and into a geometric one.) Stravinsky himself seems to have shared this (combinatorial) view of things.

“As for myself, I experience a sort of terror when, at the moment of setting to work and finding myself before the infinitude of possibilities that present themselves, I have the feeling that everything is permissible to me.” (From Poetics of Music)

“Mathematicians will undoubtedly think this all very naive, and rightly so, but I consider that any inquiry, naive or not, is of value if only because it must lead to larger questions–in fact to the eventual mathematical formulation of music theory, and to, at long last, an empirical study of musical facts–and I mean the facts of the art of combination which is composition.” (From Expositions and Developments)

Methodological approach

Our ontological approach already provides a strong argument for why music, as compared to other arts, enjoys an especially close relationship to mathematics, but we should not be content with this. After all, just as in a Bach fugue or a Bartók Mikrokosmos, we can also create interesting wallpaper designs using rigid motions (rotations, translations, reflections, etc.), but this fact alone is not enough to justify a semester-long course on the connection between wallpaper design and mathematics.
A first reply to this objection is that whereas the connection between wallpaper design and mathematics essentially starts and ends with rigid motions, there are many more mathematical methods used in the design of music: algorithmic methods in process music, probabilistic and stochastic methods, permutation operations in 12-tone and serial music, etc. (A sequel to this course would take a close look at examples of all these methods.)

Epistemological approach

More importantly, continuing our rebuttal of the wallpaper objection, there is potentially no end to the number of mathematical methods that might be employed in the service of music. Why? Consider again our starting definitions of the two subjects:

Music is the art of structured sound.

Mathematics is the science of abstract structure.

This course was intended to give you a better sense of what is meant here by structure, if nothing else by showing you some examples in both music and mathematics. I hope that in the process it has persuaded you that the “structured sound” that is the object of musical art and the “abstract structure” that is the object of the science of mathematics are in large part the same thing!

Furthermore, in looking at these examples we begin to see that the art of music and the science of mathematics are themselves alike in nature and purpose. Both can be seen as explorations or investigations of this universe of abstract structure. In this light the composer, like the mathematician, is on the hunt for new musical structure, or seeks to articulate previously unrecognized properties of old musical structures.

Of course, the properties that make a structure interesting musically speaking are not necessarily the same as those making it interesting mathematically speaking, but this is a topic for another time! Instead, let us end with one last quote from Stravinsky, who seems to be in agreement with us once again.
“I have recently come across two sentences from the mathematician Marston Morse which express the ‘likeness’ of music and mathematics far better than I could have expressed it. Mr. Morse is only concerned with mathematics, of course, but his sentences apply to the art of musical composition more precisely than any statement I have seen by a musician: ‘Mathematics are the result of mysterious powers which no one understands, and in which the unconscious recognition of beauty must play an important part. Out of an infinity of designs a mathematician chooses one pattern for beauty’s sake and pulls it down to earth.’ ” (From Expositions and Developments)