Mathemusical Thought - LUC Sakai

Mathemusical Thought
Aaron Greicius
Loyola University Chicago
Fall 2014
© 2015 Aaron Greicius
All Rights Reserved
Contents
1 Introduction to Mathemusical Thought: meet the players
1.1 Appeal to authority
1.2 Definitions: meet the players
1.3 Vantage points, goals, questions
Classic 1
2 Elementary music theory
2.1 Sound, tones and notes
2.2 Pitch notation
2.3 Intervals
2.4 Onset and offset
Classic 2
3 Frequency
3.1 Frequency space
3.2 The Pythagorean Legend
3.3 The transposition group Tfreq
3.4 Just tunings and equal temperament
Classic 3
Classic 4
4 Pitch space
4.1 Pitch space
4.2 The transposition group Tpitch
4.3 Comparing the two pictures
Classic 5
5 Pitch-class space
5.1 Octave equivalence
Classic 6
5.2 Equivalence relations
5.3 Pitch-class space
5.4 Pitch or pitch-class space?
6 Chords
6.1 Sets and sequences
6.2 Chords
6.3 Operations on chords: transposition
6.4 Operations on chords: inversion
Classic 7
7 Chord-types
7.1 Counting chords of the same type
7.2 Counting chord-types
Classic 8
8 Scales
8.1 Generated scales
8.2 Small-gap scales
8.3 Scalar intervals, transpositions and inversions
8.4 Maximally even scales
9 Wrap-up

1 Introduction to Mathemusical Thought: meet the players
1.1 Appeal to authority
What mathematicians say
• “Mathematics and music, the most sharply contrasted fields of intellectual activity
which can be found, and yet related, supporting each other, as if to show forth the
secret connection which ties together all activities of the mind...”
–Hermann von Helmholtz
• “It is in its performance that the music comes alive and becomes part of our experience;
the music exists not on the printed page, but in our minds. The same is true for
mathematics; the symbols on a page are just a representation of the mathematics.
When read by a competent performer...the symbols on the printed page come alive–the
mathematics lives and breathes in the mind of the reader like some abstract symphony.”
–Keith Devlin
• “A mathematician, like a painter or a poet, is a maker of patterns. If his patterns are
more permanent than theirs, it is because they are made of ideas. His patterns, like
the painter’s or the poet’s must be beautiful; the ideas, like the colors or the words,
must fit together in a harmonious way.”
–G. H. Hardy (1877-1947)
What musicians say
• “Music is the arithmetic of sounds as optics is the geometry of light.”
–Claude Debussy
• “The fugue is like pure logic in music.”
–Frederic Chopin
• “Despite all the experience that I could have acquired in Music, as I had practiced it
for quite a long time, it’s only with the help of Mathematics that I have been able to
untangle my ideas, and that light made me aware of the comparative darkness in which
I was before.”
–Jean-Philippe Rameau
• “I am not saying that composers think in equations or charts of numbers, nor are those
things more able to symbolize music. But the way composers think–the way I think–is,
it seems to me, not very different from mathematical thinking.”
–Igor Stravinsky
• “Music is not to be decorative; it is to be true.”
–Arnold Schoenberg
1.2 Definitions: meet the players
The following represent a reduction of many carefully constructed definitions of
mathematics and music found in the philosophical literature.
Music is the art of structured sound.
Mathematics is the science of abstract structure.
A philosopher will surely not be content with such definitions, but they are a
useful starting point for us. In particular, they reveal both a potential hurdle to
making a connection between the two fields (art/science), as well as a potential
point of attack: the idea of structure.
1.3 Vantage points, goals, questions
Vantage points
The course will examine three points of contact. I list them here in order of
increasing profundity (toward a deep connection), and ornamented with some
fancy philosophical terms.
1. Ontological. Musical objects are very much like mathematical objects. We
will describe and define the main musical parameters (melody, rhythm,
harmony, timbre) in mathematical language (sets, sequences, topological
spaces, groups).
2. Methodological. Mathematical thought, operations and objects are frequently employed both in the analysis and composition of music. We will
look closely at examples of mathematical methods in both of these areas
of musical practice.
3. Epistemological. Music often bears a strong logical quality. We speak of
understanding a piece of music, of one passage of music following from
another passage. Can these activities be compared to understanding or
following mathematical arguments? We will explore these connections
with the aid of formal logic.
Goals
The following goals are listed in order of increasing ambitiousness.
1. Get to know some classics in both music and mathematics: compositions,
theorems, musical forms, proofs, etc.
2. Develop a short, “cocktail party” answer to the question: What exactly
is the connection between math and music?
3. Get comfortable reading both musical scores and mathematical arguments.
Come to understand better the nature of music and mathematics as practices.
4. Improve upon our “cocktail party” answer and articulate a deeper connection between music and mathematics.
Questions
Progress toward our last, most ambitious goal can be measured in part by our
ability to answer the following questions:
1. Does the connection between music and math actually extend beyond the
surface level, that is beyond the fact that works of music can be seen as
mathematical objects?
2. What is special about the math/music relation? Why is it any deeper
than the connection between say math and painting, or math and improv
comedy?
3. What precisely is the difference between the art of music, and the science
of mathematics?
Classic 1 (Musikalisches Opfer, Canon I. a 2 cancrizans, by J.S. Bach). Below
you find a facsimile of J.S. Bach’s Canon I, from Musikalisches Opfer (or The
Musical Offering). As the performance instructions indicate, this is an example
of a crab canon. Video link. Tim Smith’s overview of Musikalisches Opfer.
Performance instructions:
Instrument 1 plays through from left to right, then back.
Instrument 2 plays from right to left, then back.
Below you find the two parts written out separately; in this form, each instrument now performs the music from left to right, then back.
Turn the score into a Möbius band. You should first fold the score in half
lengthwise, obtaining a strip with Instrument 1 on one side and Instrument 2
on the other.
1. Describe Bach’s composition as a path along your Möbius band. Make
sure your path traverses the whole piece (36 measures in all)!
2. What properties of Bach’s composition are articulated by the geometry of
the Möbius band? What does the geometry say about the role of the two
different instruments?
3. We could have also made a simple cylinder (or hoop) out of our score-strip;
what advantage does the Möbius band representation have (if any)?
4. Compare our Möbius band representation to the one in the video. Which
is better?
2 Elementary music theory
2.1 Sound, tones and notes
In Musimathics: the mathematical foundations of music, Gareth Loy distinguishes between sounds, tones and notes.
• A sound can be thought of as a physical thing, the object of study of the
science of acoustics. Physical properties of sounds include frequency,
intensity, envelope, decay, etc.
• A tone, according to Loy, is determined by three “sonic properties”:
pitch, loudness, and timbre (or color). These properties are closely
related, but not identical, to corresponding physical properties of sound:
pitch and frequency, loudness and intensity, etc. For example: “Frequency
is a physical measure of vibrations per second. Pitch is the corresponding perceptual experience of frequency” (Loy, 13). As such, it seems a
tone is more of a perceptual entity, something that certain sounds are
transformed into by our mind.
• Lastly, a note is just a tone with the further properties of onset (when
the note begins) and offset (when the note ends).
Musical sound consists mostly of notes. Common Music Notation (CMN)
is a system for representing notes; it captures with varying degrees of precision
their 5 defining properties (pitch, loudness, timbre, onset, offset).
2.2 Pitch notation
Pitch names
Pitch is “that property of a sound that enables it to be ordered on a scale
going from low to high,” according to the ASASAT (see footnote 1). We begin by first assigning
names to different pitches and identifying them with keys on the keyboard.
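[Figure: two octaves of a piano keyboard labelled with pitch names. White keys, left to right: C D E F G A B C D E F G A B. Black keys, left to right: C♯/D♭, D♯/E♭, F♯/G♭, G♯/A♭, A♯/B♭, repeated in the second octave.]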
Some observations and terminology:
1. The sequence of pitch names repeats. For example, we see there are two
occurrences of C. The corresponding pitches on the piano are not the same;
they are in fact an octave apart. These two different pitches are said to
be octave equivalent, and the name ‘C’ here identifies a pitch only up
to octave equivalence. To specify exactly which C you mean, you add
a number indicating which octave range the note falls in: e.g., ‘C2’ or ‘C6’.
More on this later.
Footnote 1: Acoustical Society of America Standard Acoustical Terminology.
2. Our pitches are divided into white notes and black notes, depending
on which piano key they correspond to.
3. A single step along our sequence, from one pitch to the very next pitch (to
the left or right, whether black or white) is called a half step. A distance
of two steps along our sequence is called a whole step. For example: the
first C is one half step from the first C♯, and one whole step (= two half
steps) from the first D.
4. Sharpening a pitch corresponds to moving one half step to the right in
our sequence; flattening a pitch corresponds to moving one half step to
the left.
5. There are multiple names for the same pitch. The first black note on the
left is both C♯ and D♭. These are said to be two enharmonic spellings
of the pitch.
White note mnemonics
I cannot refrain from including two classic mnemonic devices in (United
States) music pedagogy: the perfectly inoffensive “FACE”, and the ever so
creepy “Every Good Boy Does Fine”.
Spaces of the treble staff, bottom to top: F A C E
Lines of the treble staff, bottom to top: E G B D F (Every Good Boy Does Fine)
Pitches on the staff
Our next step is to transport our pitch names from the keyboard diagram
to a musical staff.
Some orientation and terminology:
• Here we represent our pitches as notes on a staff. Note that the alternating
sequence of lines and spaces of the staff corresponds to our sequence of
white notes: C-D-E-F-G-A-B.
• Since moving up the staff corresponds to moving along the sequence of
white notes, this movement proceeds either in half steps or whole steps,
depending on where we are in the sequence. (Compare the staff increments
D-E and E-F, for example.)
• The symbol 𝄞 is called the treble clef (or G clef). It tells us where
G lies on the staff: viz., on the second line from the bottom.
Now we add the black notes simply by applying the sharpening and flattening
operations to our white notes.
Note the natural symbol (♮) that occurs in front of the note representing G.
It negates the flat applied to the G before it. In general, once an accidental
(i.e., a sharp or flat) is applied to a note, all subsequent instances within the
measure retain this accidental, even in the absence of the ♯ or ♭ symbol.
Pitch: bass clef
The bass clef staff is another common musical staff. As the name suggests,
it is suited to instruments (or voices) of a lower register.
The idea is essentially the same, only now the bass clef (or F clef) symbol
tells us where F lies on the staff: viz., on the second line from the top.
Here is an online pitch reading tutorial I found after a not-so-exhaustive
internet search. You can probably find better ones on your own.
Key signatures
Oftentimes a musical piece will consistently use a certain subset of the black
notes. In such cases a key signature is introduced, declaring that a certain
subset of notes will always bear a given accidental. Key signatures come in
either a sharp or flat flavor. Some examples:
The first signature declares F will always be sharped, the second that F and
C will always be sharped, etc. We will make more sense of this convention, in
particular the sequence of sharps/flats in a signature, when we discuss scales
and keys.
2.3 Intervals
We define an interval simply as a set of two pitches. The interval length is
the distance, measured in half steps, between these two pitches.
Comment 2.1. In Musimathics Loy defines an interval as the difference in
pitch between two pitches. As stated, this is not a well-defined notion: given
pitches P and Q, is the interval P − Q or Q − P ? We have not as yet assigned
any numeric values to pitches, but once we do, you will see that according to
our definition the interval length between two pitches P and Q is |P − Q|. This
notion is well-defined as |P − Q| = |Q − P |. Notice also that interval lengths
will always be nonnegative, thanks to the absolute value.
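As a minimal sketch of the point made in Comment 2.1, suppose (this is an assumption; the notes have not yet assigned numbers to pitches) that each pitch is labelled by the number of half steps separating it from the first C of the keyboard diagram. The dictionary `HALF_STEPS_FROM_C` and the helper `interval_length` are hypothetical names introduced only for illustration.

```python
# Hypothetical numbering: half steps counted upward from the first C
# of the keyboard diagram (only sharp spellings are listed here).
HALF_STEPS_FROM_C = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}


def interval_length(p: int, q: int) -> int:
    """Interval length |P - Q| in half steps; the order of P and Q is irrelevant."""
    return abs(p - q)


c = HALF_STEPS_FROM_C["C"]
f = HALF_STEPS_FROM_C["F"]
print(interval_length(c, f))  # 5
print(interval_length(f, c))  # 5, since |P - Q| = |Q - P|
```

Because the absolute value discards the sign, the result does not depend on which pitch is listed first, which is exactly why this notion is well-defined.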
Example 2.1. Compute the intervals between the following sets of pitches.
(a) 5 half steps.
(b) 4 half steps.
(c) 6 half steps.
(d) 12 half steps.
Sonorities
In Musimathics Loy sorts various intervals into sonority classes (“perfect”,
“major”, “minor”, etc.) and summarizes their shared sonic properties in a nice
table (Loy, 19).

Table 2.1: Interval Classification by Sonority

Class       Name     Semitones  Description
Perfect     Unison   0          Provides harmonic anchoring and framework.
            Octave   12
            Fourth   5
            Fifth    7
Major       Third    4          Provides expansive emotional color.
            Sixth    9
            Seventh  11
            Second   2
Minor       Third    3          Upper pitch is one semitone smaller than major intervals.
            Sixth    8          Minor intervals provide a contractive emotional color.
            Seventh  10
            Second   1
Diminished  -        6          Upper pitch is one half step less than a minor or a perfect
                                interval. A diminished fifth is called a tritone.
Augmented   -        6          Upper pitch is one half step greater than a major or a
                                perfect interval. An augmented fourth is also called a
                                tritone.

It should be noted that the resulting names of intervals (“perfect fifth”, “minor
sixth”, etc.) are not a function solely of the number of half steps, but depend
also on the particular way the pitches are spelled.
Example 2.2. Both intervals below comprise 4 half steps. Describe each in
terms of sonorities.
(a) This interval is a major third.
(b) This interval is a diminished fourth.
Sonority name algorithm
Given spellings of two pitches P and Q:
1. First determine the IntervalName (“third”, “fifth”, etc.) by counting the
number of lines and spaces they span (inclusive) and using the following
table.

IntervalName   Unison  Second  Third  Fourth  Fifth  Octave
#Lines/Spaces  1       2       3      4       5      8

2. Then determine the IntervalQuality (“perfect”, “major”, “minor”, “diminished”, “augmented”) by computing the interval length (in half steps)
and referring to Loy’s Table 2.1. (See following example.)
Example 2.3. Apply the sonority name algorithm to the following intervals.
(a) IntervalName=“fourth”. IntervalLength=6= 5 +1. A perfect fourth is of
length 5 half steps, as per Table 2.1. Thus this is an augmented fourth,
also called a tritone.
(b) IntervalName=“fifth”. IntervalLength=6= 7 -1. A perfect fifth is of length
7 half steps. Thus this is a diminished fifth, likewise called a tritone.
(c) IntervalName=“third”. IntervalLength=3. A third of length three half steps
is a minor third.
(d) IntervalName=“sixth”. IntervalLength=7= 8 -1. A sixth of length 8 half
steps is a minor sixth. Thus this is a diminished sixth.
(e) IntervalName=“octave”. IntervalLength=11= 12 -1. A perfect octave is of
length 12 half steps. Thus this is a diminished octave.
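The sonority name algorithm can be sketched in code as follows. This is only an illustration under stated assumptions: pitches are written as strings such as "F#4" (letter, optional "#" or "b", one octave digit, with octave numbers changing at C), and the names `parse` and `sonority_name` are hypothetical. Sixths and sevenths are taken to span 6 and 7 lines/spaces, and the reference lengths come from Table 2.1. Only the qualities discussed above are handled; anything more exotic raises a `KeyError`.

```python
LETTERS = "CDEFGAB"
WHITE_NOTE_HALF_STEPS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
ACCIDENTALS = {"": 0, "#": 1, "b": -1}
INTERVAL_NAMES = {1: "unison", 2: "second", 3: "third", 4: "fourth",
                  5: "fifth", 6: "sixth", 7: "seventh", 8: "octave"}
# Reference lengths in half steps, as in Table 2.1.
PERFECT_LENGTHS = {"unison": 0, "fourth": 5, "fifth": 7, "octave": 12}
MAJOR_LENGTHS = {"second": 2, "third": 4, "sixth": 9, "seventh": 11}


def parse(pitch):
    """Split a spelling like 'C#4' into (staff position, half-step number)."""
    letter, accidental, octave = pitch[0], pitch[1:-1], int(pitch[-1])
    staff_position = 7 * octave + LETTERS.index(letter)
    half_steps = 12 * octave + WHITE_NOTE_HALF_STEPS[letter] + ACCIDENTALS[accidental]
    return staff_position, half_steps


def sonority_name(p, q):
    """Step 1: name from lines/spaces spanned; step 2: quality from half steps."""
    pos_p, hs_p = parse(p)
    pos_q, hs_q = parse(q)
    name = INTERVAL_NAMES[abs(pos_p - pos_q) + 1]  # inclusive count
    length = abs(hs_p - hs_q)                      # interval length |P - Q|
    if name in PERFECT_LENGTHS:
        qualities = {-1: "diminished", 0: "perfect", 1: "augmented"}
        offset = length - PERFECT_LENGTHS[name]
    else:
        qualities = {-2: "diminished", -1: "minor", 0: "major", 1: "augmented"}
        offset = length - MAJOR_LENGTHS[name]
    return f"{qualities[offset]} {name}"


print(sonority_name("C4", "E4"))   # major third
print(sonority_name("C#4", "F4"))  # diminished fourth
print(sonority_name("F4", "B4"))   # augmented fourth
print(sonority_name("B4", "F5"))   # diminished fifth
```

The first two calls both involve 4 half steps yet receive different names, which illustrates the earlier remark that interval names depend on spelling and not only on length.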
Naturally, musicians do not use such an algorithm when coming up with
names of intervals; instead, they simply know the names/qualities of all the
white note intervals, then compare the given interval to one of these, adding a
“diminished” or “augmented” as necessary. In other words, they have the chart
below burned into their memory. A useful observation in this regard is that
there is exactly one non-perfect white note fourth, and exactly one non-perfect white
note fifth: the tritones containing B and F. (P=perfect, M=major, m=minor)
Lower note  C    D    E    F         G    A    B
Fourths     P4   P4   P4   TRITONE!  P4   P4   P4
Fifths      P5   P5   P5   P5        P5   P5   TRITONE!
Thirds      M3   m3   m3   M3        M3   m3   m3
Sixths      M6   M6   m6   M6        M6   m6   m6

2.4 Onset and offset
Recall what we originally set out to do: show how the 5 properties of notes are
represented in CMN. We’ve spent an inordinate amount of time on pitch, and
now I will proceed to give short shrift to loudness and timbre before moving on
to onset and offset.
Loudness
The loudness of a note is indicated by dynamics markings. I will not
attempt to improve upon Loy’s Table 2.5 (Loy, 28).

Table 2.5: CMN Indications for Dynamic Range

Name           Symbol  Meaning
Pianississimo  ppp     As soft as possible
Pianissimo     pp      Very soft
Piano          p       Soft
Mezzo piano    mp      Moderately soft
Mezzo forte    mf      Moderately loud
Forte          f       Loud
Fortissimo     ff      Very loud
Fortississimo  fff     As loud as possible
Timbre
Timbre is perhaps the slipperiest of the 5 sonic properties of a note, and we
will tackle it in earnest later on. One way of describing the timbre of a note is
to describe what instrument it sounds like, and this is accomplished in musical
notation by simply declaring that a given score is intended for piano, or for
violin, etc.
Further details about the timbre are given by notation that indicates in what
manner a note should be played on a given instrument. For example, a score
for violin will indicate whether a note should be played with vibrato, whether
notes should be played legato or staccato, and what type of bowing should be
used. All of these techniques affect the timbre of notes.
Onset, offset, duration
We introduce another abstract time variable t to the score, measured in beats.
The beginning of the score is set to time t = 0.
Using vertical bar lines, a score is divided up into measures, each of
which has a well-defined (though possibly variable) number of beats. Thus at
the beginning of any given measure we know how many beats have passed using
some simple arithmetic.
The onset of an event in a score is the amount of time t1 (in beats) between
the beginning of the score and the beginning of the event; its offset is the time
t2 between the beginning of the score and the end of the event; the duration
of the event is t2 − t1 , the amount of time (in beats) the event lasts.
Note values and time signatures
The system of note values allows us to compare the durations of different
types of notes.
The duration of a given note value above is always 1/2 the duration of its
neighbor to the left; thus two half notes make a whole note (in duration), two
quarters make a half, etc.
Finally, a time signature is used both to specify the number m of beats per
measure, and the note value n (2=half, 4=quarter, etc.) which will be assigned
a duration of 1 beat. This is notated using a ratio-like notation of the form
m/n. Only now, with all this notational equipment at our disposal, can we fully
specify onsets, offsets and durations of notes in scores.
Example 2.4.
There are two additional bits of notation in the score above that require
explanation. The arc connecting the two E notes is called a tie; it indicates
that the note is sounded only once and is “held over the bar”, for a total value
of 2 quarter notes. Also, adding a dot after a note value, as we have after the
last C, has the effect of increasing the note value by half of its own value; thus the last note has
a value of 2+1=3 quarter notes.
1. How many beats per measure are there? Ans: 6 beats.
2. Which note value has a duration of 1 beat? Ans: the eighth note.
3. What is the duration (in beats) of the A in the third measure? Ans: 1
half=2 quarters=4 eighths=4 beats.
4. What is the onset (in beats) of the A in the third measure? (Careful: the
first note of the piece has onset 0.) Ans: 13 beats.
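The arithmetic behind these answers can be sketched in a few lines. The sketch assumes a time signature of 6/8 (inferred from answers 1 and 2), a constant number of beats per measure, and that the A enters one beat into the third measure (inferred from answer 4); the function names `beats` and `measure_onset` are hypothetical.

```python
from fractions import Fraction


def beats(note_value: int, beat_unit: int, dotted: bool = False) -> Fraction:
    """Duration in beats of a 1/note_value note when the 1/beat_unit note gets one beat."""
    duration = Fraction(beat_unit, note_value)
    if dotted:
        duration *= Fraction(3, 2)  # a dot adds half of the note's own value
    return duration


def measure_onset(measure_number: int, beats_per_measure: int) -> int:
    """Onset (in beats) of the start of a measure; measures are numbered from 1, the score starts at t = 0."""
    return (measure_number - 1) * beats_per_measure


m, n = 6, 8  # assumed time signature m/n = 6/8

print(beats(2, n))               # 4: a half note lasts 4 beats
print(beats(2, n, dotted=True))  # 6: a dotted half note (3 quarters) lasts 6 beats
t1 = measure_onset(3, m) + 1     # assumed: the A starts one beat into measure 3
t2 = t1 + beats(2, n)            # offset = onset + duration
print(t1, t2, t2 - t1)           # 13 17 4
```

Measures with a variable number of beats would need a running sum of the beats in each earlier measure instead of the single multiplication used here.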
Classic 2 (Musica Ricercata, No. 1, by György Ligeti). Hungarian composer
György Ligeti wrote Musica Ricercata between 1951 and 1953. Ligeti’s own description of the composition:
In 1951 I began to experiment with very simple structures of sonorities and
rhythms as if to build up a new kind of music starting from nothing. My approach was frankly Cartesian, in that I regarded all the music I knew and loved
as being, for my purposes, irrelevant and even invalid.
The word ‘ricercata’ is derived from the Italian verb ‘ricercare’, meaning “to
search” or “to investigate”. As such the title means something to the effect
of “investigative music” or perhaps even “experimental music” (as in scientific
experiment). In music a ricercar was a sort of fugue-precursor popular in the
16th and 17th centuries. Such pieces had an abstract or technical flavor; their
aim was often to investigate or articulate musical consequences of a single theme
using counterpoint. In Musikalisches Opfer Bach includes two ricercare based
on the same “Royal theme” (“Thema Regium”) from Classic 1.
Ligeti’s Musica Ricercata contains 11 pieces. The first piece uses only 2
pitch classes (A and D), and in each subsequent piece the number of pitch
classes is incremented by 1. Thus the last piece uses all 12 pitch classes, and
is itself a ricercar in homage to Girolamo Frescobaldi’s (c. 1583-1643) ‘Ricercar
cromatico’.
3 Frequency
We will now set about modeling sonic properties like frequency and pitch
using mathematical language. This will provide an opportunity to introduce
(or review) some basic mathematical notation and operations. Furthermore, we
will meet two very important types of mathematical objects that you probably
have not seen before: groups and topological spaces.
3.1 Frequency space
The physical property of a sound that is most strongly associated with pitch is
frequency. We will say in more detail what frequency is later (when discussing
timbre); for now, let us be content to say that pitched sound has a periodic (or
repetitive) quality, and frequency measures the number of repetitions (or cycles)
per second exhibited by the sound.
Some basic properties of frequency:
• The SI unit of measurement for frequency is the hertz (Hz), defined as
1 Hz = 1 cycle per second.
• As frequency increases, so does the perceived pitch.
• A frequency f , being a measure of the number of cycles per second, is a
positive number, though not necessarily an integer: we can have f = 1/2,
f = √2, f = 1/π, etc.
• Humans can hear frequencies ranging from around 20 Hz to 20,000 Hz (or
20 kHz).
Frequency space
A frequency f is allowed to be any positive real number. The set of all real
numbers is denoted R. We define frequency space to be the set of all possible
frequencies.
Definition 1. The set of all possible frequencies is called frequency space,
denoted Xfreq . From the observation above, we see that
\[ X_{\mathrm{freq}} = (0, \infty) = \{x \in \mathbb{R} : x > 0\} =: \mathbb{R}_{>0}. \]
Comment 3.1. The three equalities in the definition above introduce some
interval notation, set notation, and a naming convention, respectively.
1. Recall that the open interval (a, b) is defined as the set of all x with
a < x < b: i.e., all numbers strictly between a and b. Similarly the closed
interval [a, b] is defined as the set of all x with a ≤ x ≤ b.
2. The second equality in the definition expresses this notion using set notation. In general we would write
(a, b) = {x ∈ R : a < x < b},
which reads: The set ({. . . }) of all elements x in (‘∈’) the reals such that
(‘ : ’) x is greater than a and less than b.
3. The last equality (‘=:’) is a naming notation that declares that the thing
on the left will be denoted by the thing on the right. Similarly, ‘:=’
will be used to declare that the thing on the right will be denoted by the
thing on the left.
Below you find the frequencies associated to a variety of A pitches. Our
naming scheme for the pitch now includes a number indicating which octave the
pitch lies in on the standard 88-key piano. The numbering scheme is calibrated
on where C notes lie. Thus the lowest C on the piano is C1; as there is an A
below this lowest C, that A is called A0.
We immediately observe that going up an octave corresponds to doubling the
frequency. This is not a recent discovery.
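As a small numerical illustration of the doubling rule, the sketch below lists frequencies for the pitches A0 through A7. It assumes the common tuning standard A4 = 440 Hz, which the notes do not state; the name `a_frequency` is hypothetical.

```python
A4_HZ = 440.0  # assumed reference tuning, not taken from the notes


def a_frequency(octave: int) -> float:
    """Frequency of the pitch A in the given octave: each octave up doubles the frequency."""
    return A4_HZ * 2.0 ** (octave - 4)


for octave in range(8):
    print(f"A{octave}: {a_frequency(octave)} Hz")

# Each A is exactly twice the frequency of the A one octave below it.
print(a_frequency(5) / a_frequency(4))  # 2.0
```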
3.2 The Pythagorean Legend
The history of the relation between intervals and frequencies goes back to an
apocryphal story about Pythagoras (c. 580-500 BC). See Figure 1.
Here is a reading of Gaffurius’ woodcut comic. Passing a blacksmith shop,
Pythagoras notices that the sounds created by hammers of different weights
striking the anvil sound nice together (consonant). When he measures the
weights of the different hammers he notices that their ratios take the form of
“simple” fractions: that is, fractions that can be expressed as ratios of small
whole numbers. He then performs similar experiments using bells of varying dimensions, glasses of water of varying height, pipes of varying lengths, etc. Each
time he observes a similar phenomenon: when the dimensions of two simultaneously sounding instruments form simple ratios (12/9 = 4/3, 8/6 = 4/3, etc.), the
resulting interval is consonant (see footnote 2).
Figure 1: Woodcut from Theorica Musicae, by Franchinus Gaffurius
The story is in a sense the founding legend of the connection between music
and mathematics. For Pythagoras and his followers, this was yet another striking example of mathematics governing nature. The legend had several historical consequences: it earned
music a place in the classical quadrivium along with arithmetic, geometry and astronomy; it added fuel to the Pythagoreans’ already raging obsession with rational
numbers; and it influenced the thinking of later scientists (Aristotle, Ptolemy, Kepler),
who sought examples of such ratios in astronomy, hence the so-called “music of
the spheres”.
Pythagoras’ conclusion, in slightly updated language, is as follows: all of
these experiments, except the third, produce sounds whose frequencies form
simple ratios; thus two sounds whose frequencies f1 , f2 can be expressed as a
simple ratio (ratio of small integers) are consonant when sounded together. In
particular, when f1 /f2 = 2/1, the interval produced is an octave.
3.3 The transposition group Tfreq
Intervals as ratios
Given two frequencies f1 , f2 ∈ Xfreq , the interval of their corresponding
pitches is determined by the ratio f1 /f2 = c. How exactly c determines the
interval is not immediately clear. For example, if f1 /f2 = 1.724, what is the
corresponding interval?
We will begin with a few simple observations. Fix f1 /f2 = c.
Footnote 2: The third picture shows Pythagoras playing a stringed instrument called a monochord.
The tensions of the strings here would form simple ratios. Since frequency is proportional
to the square root of the tension (assuming the length of the strings is held constant), this
experiment would not produce the same phenomenon as the other three.
(1) Since f1 and f2 are positive, so is their ratio c. That is, c ∈ R>0 .
(2) If f1 > f2 , then f1 /f2 > 1, and thus c > 1. Likewise, if f1 < f2 , then c < 1.
(3) To say f1 /f2 = c is the same as saying f1 = cf2 . Thus multiplying a frequency
by c corresponds to moving up (if c > 1) or down (if c < 1) by a certain
interval. This operation is called a transposition (or shift, for short).
Note that if c = 1, then the frequency is unchanged; we call this the
trivial transposition (or trivial shift).
Transposition group
The last observation suggests that the ratios c we deal with are best understood as defining certain transpositions on the frequency space Xfreq . This
motivates the following definition.
Definition 2. Let Tfreq be the set of all possible ratios of frequencies. In set
notation we have
Tfreq = {f1 /f2 : f1 , f2 ∈ Xfreq }.
We call Tfreq the transposition group of Xfreq .
Comment 3.2. Since Tfreq is the set of all possible frequency ratios f1 /f2 , and
since f1 and f2 are allowed to be any positive real number, it follows that the
elements of Tfreq can be any positive real number; i.e.,
Tfreq = R>0 .
Thus from now on, we will no longer think of an element c ∈ Tfreq as a ratio,
but rather simply as a positive number that defines a certain transposition.
When investigating how precisely an element c ∈ Tfreq acts as a transposition
(or shift), it becomes clear that multiplication is the relevant operation.
Let’s make this more explicit. Keep in mind for what follows that we have
both Tfreq = R>0 and Xfreq = R>0 .
1. An element c ∈ Tfreq sends an arbitrary frequency f ∈ Xfreq to the new
frequency cf , the product of c and f :
f ↦ cf   (shift by c)
2. Transposing first by d and then by c corresponds to transposing by their
product c · d. Indeed, given any frequency f , we have
f ↦ d·f ↦ c·(df) = (c·d)f   (shift by d, then shift by c; overall, shift by c·d)
3. To “undo” or “reverse” the transposition c, we simply transpose by its
multiplicative inverse 1/c:
f ↦ c·f ↦ (1/c)·(c·f) = f   (shift by c, then shift by 1/c; overall, the trivial shift)
In mathematics we say the transposition 1/c is the inverse of the transposition c.
Example 3.1. Fix a frequency f . Recall that we have already observed that
the element c = 2 ∈ Tfreq corresponds to shifting up by an octave; i.e., the
pitch 2f is exactly one octave higher than f . Let’s use the observations above
to elaborate more on octave transpositions.
1. Following the second observation above, shifting f up by 2 octaves yields
the new frequency 2(2f) = 2^2 f. More generally, shifting f up n octaves
yields the new frequency 2^n f. Thus for n a positive integer, the element
2^n ∈ Tfreq corresponds to shifting up n octaves.
2. Following the third observation, shifting down by 1 octave corresponds to
the element 1/2 ∈ Tfreq . It then follows that shifting down by n octaves
corresponds to the element (1/2)^n ∈ Tfreq .
3. Recall that 1/2 = 2^{-1}, and thus that (1/2)^n = 2^{-n}. We can summarize
the last two observations as follows: let n be any positive integer; then
the element 2^n ∈ Tfreq is transposition up by n octaves, and the element
2^{-n} ∈ Tfreq is transposition down by n octaves.
Harmonic series
So far we know how the elements of the form c = 1 and c = 2^m (with m a nonzero integer) act as
transpositions: the first is the trivial shift, the second shifts up or down by a
number of octaves. What about other elements c ∈ R>0 ?
We approach this question by first looking at positive integer values of c; that
is, c = 1, 2, 3, . . . . If we start with a fixed frequency f and begin transposing by
these values of c we obtain what is called the harmonic series on f :
f, 2f, 3f, 4f, ...
Here are the approximate pitches associated to 12 terms of the harmonic
series starting on f = 110 Hz:
To illustrate that the above is only an approximation, note that the pitch 3f
would have frequency 330 Hz, whereas the E4 written on the staff in fact has
frequency around 329.63 Hz. Why? Short answer: our tuning system is not
based on the harmonic series!
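The gap between 330 Hz and 329.63 Hz can be reproduced with a short Python sketch. It assumes the common convention A4 = 440 Hz, which these notes do not state (though it is consistent with f = 110 Hz being an A); the helper names equal_tempered and nearest_equal_tempered are hypothetical.

```python
# Harmonic series on f = 110 Hz versus the nearest equal-tempered pitch.
# Assumption: A4 = 440 Hz; the notes do not fix a reference pitch.
import math

A4 = 440.0  # assumed reference frequency in Hz


def equal_tempered(half_steps_from_a4):
    """Frequency of the pitch a given number of half steps above A4."""
    return A4 * 2 ** (half_steps_from_a4 / 12)


def nearest_equal_tempered(freq):
    """Equal-tempered frequency closest (measured in half steps) to freq."""
    n = round(12 * math.log2(freq / A4))
    return equal_tempered(n)


f = 110.0
harmonics = [k * f for k in range(1, 13)]  # f, 2f, ..., 12f

for k, h in enumerate(harmonics, start=1):
    # harmonic number, exact harmonic, nearest equal-tempered frequency
    print(k, h, round(nearest_equal_tempered(h), 2))
```

The third row compares 3f = 330 Hz with the equal-tempered E4, five half steps below A4.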
Interval arithmetic
This staff pitch approximation of the harmonic series provides a means of
associating familiar interval transpositions to an element c ∈ Tfreq when c is a
positive rational number: i.e., c = m/n, where m and n are positive integers.
1. The interval on the staff between 2f and 3f is a perfect fifth. This tells us
that c = 3f/2f = 3/2 corresponds roughly to transposing up by a perfect fifth.
We will call the interval determined by c = 3/2 a Pythagorean fifth.
2. The interval on the staff between 3f and 4f is a perfect fourth. This tells
us that c = 4f/3f = 4/3 corresponds roughly to transposing up by a perfect
fourth. We will call the interval determined by c = 4/3 a Pythagorean
fourth.
3. Start with any frequency f . If we go up a Pythagorean fifth we get the
frequency f′ = (3/2)f . If we then go down a Pythagorean fourth, we get
the frequency f″ = (3/4)f′ = (3/4)((3/2)f) = (9/8)f . Since this process corresponds
roughly to going up a perfect fifth and then down a perfect fourth, we see
that the ratio c = 9/8 corresponds roughly to a major second!
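The interval arithmetic in the list above can be checked with exact rational numbers. The following is a minimal Python sketch using the standard fractions module; the variable names are illustrative and not part of the notes.

```python
from fractions import Fraction

fifth = Fraction(3, 2)    # Pythagorean fifth
fourth = Fraction(4, 3)   # Pythagorean fourth

# Up a fifth, then down a fourth: multiply by 3/2, then by the inverse of 4/3.
major_second = fifth * (1 / fourth)
print(major_second)       # 9/8

# Up a fifth, then up a fourth: the product of the two ratios is one octave.
octave = fifth * fourth
print(octave)             # 2
```

Working with Fraction rather than floating point keeps every ratio exact, which matters when comparing intervals that differ only slightly.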
Group structure of Tfreq
In coming to understand Tfreq = R>0 , we have seen how important a role
multiplication has played. As it turns out, the set R>0 taken with the multiplication operation is an important example of what is called a group in mathematics.
Definition 3. A group is a pair (G, ·), where G is a set, and · is an operation
which, given any two elements g1 , g2 ∈ G outputs a third element h = g1 ·g2 ∈ G,
and which further satisfies the following axioms:
(i) The operation · is associative: i.e., g1 · (g2 · g3 ) = (g1 · g2 ) · g3 for all
g1 , g2 , g3 ∈ G.
(ii) There is an identity element e ∈ G satisfying e · g = g and g · e = g for
all g ∈ G.
(iii) Every g ∈ G has an inverse in G: that is, there is an element h such that
g · h = h · g = e, the identity element. We write h = g^{-1} in this case.
Let’s carefully show that Tfreq is a group.
1. We must first state explicitly what the underlying set is, and what the
operation is. In this case the set is Tfreq = R>0 , the set of all positive
numbers, and the operation is simply real number multiplication.
2. Next we must show our operation is associative. This is immediate in our
case as we know already real multiplication is associative: r(st) = (rs)t,
for any real numbers r, s, t.
3. Next we must identify an identity element e in our set, and show it satisfies
the required property. In our case we take e = 1 ∈ Tfreq . For any
c ∈ Tfreq we have 1 · c = c · 1 = c, again by familiar properties of real
number multiplication.
4. Lastly, given any c ∈ Tfreq we must show there is an inverse element
d ∈ Tfreq satisfying c · d = d · c = 1. We take d = 1/c. This is indeed
an element of Tfreq , since if c > 0, then 1/c > 0 as well. Once again, familiar
properties of multiplication imply c · (1/c) = (1/c) · c = 1.
This all looks deceptively simple, mainly because the underlying set and
operation in this example are both very familiar to us. However, the notion of a
group is very general, and examples can be much more exotic than this. Here’s
how:
1. The underlying set G need not be a set of numbers. It may be a set
of functions, or of letters, or of anything whatsoever. Furthermore, the
underlying set may be finite or infinite.
2. The group operation may have nothing to do with operations familiar to
you from arithmetic. As long as the operation is well-defined and satisfies
the three axioms, we have a group.
3. In particular, though the group operation must be associative, it is not
required to be commutative: that is, we can have groups (G, ·) such that
g1 · g2 is not necessarily equal to g2 · g1 for all elements g1 , g2 in G!
Example 3.2. Let G = {x, y} (a set with two elements), and define an operation ∗ on G as follows:
x ∗ x = x
x ∗ y = y ∗ x = y
y ∗ y = x
Show that (G, ∗) is a group. (Note: as you see, we don’t always have to use ·
to denote the group operation.)
Solution: to show the operation is associative, one has to show that a
number of different equalities of the form a ∗ (b ∗ c) = (a ∗ b) ∗ c are true. As an
example observe that
x ∗ (y ∗ y) = x ∗ x = x, and
(x ∗ y) ∗ y = y ∗ y = x;
thus x ∗ (y ∗ y) = (x ∗ y) ∗ y.
Once we know the operation is associative, we need to identify the identity
and inverses. In this case, we declare e = x. This satisfies the identity axiom as
x ∗ x = x and x ∗ y = y ∗ x = y. Finally, since x ∗ x = x = e and y ∗ y = x = e,
we see that all elements are their own inverses! Thus we have x^{-1} = x and
y^{-1} = y.
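Because G has only two elements, all eight associativity equalities, the identity axiom and the inverse axiom can be checked exhaustively. Here is one possible brute-force sketch in Python; the function names are hypothetical, and the dictionary op simply encodes the operation table of Example 3.2.

```python
from itertools import product

G = ["x", "y"]
# Operation table from Example 3.2: op[(a, b)] is a * b.
op = {("x", "x"): "x", ("x", "y"): "y", ("y", "x"): "y", ("y", "y"): "x"}


def is_associative(elements, table):
    """Check a*(b*c) == (a*b)*c for every triple of elements."""
    return all(
        table[(a, table[(b, c)])] == table[(table[(a, b)], c)]
        for a, b, c in product(elements, repeat=3)
    )


def find_identity(elements, table):
    """Return an element e with e*g == g*e == g for all g, or None."""
    for e in elements:
        if all(table[(e, g)] == g and table[(g, e)] == g for g in elements):
            return e
    return None


def find_inverses(elements, table, e):
    """Map each g to an h with g*h == h*g == e (None if no such h exists)."""
    return {
        g: next((h for h in elements
                 if table[(g, h)] == e and table[(h, g)] == e), None)
        for g in elements
    }


assoc = is_associative(G, op)
identity = find_identity(G, op)
inverses = find_inverses(G, op, identity)
print(assoc, identity, inverses)
```

The same three functions work for any finite operation table, so they can be reused to test other small candidate groups.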
Example 3.3. Let G = R>0 and let + denote the usual operation of real
number addition. Show that (R>0 , +) is not a group. Specify exactly which
axioms are satisfied, and which axioms fail.
Solution: Addition does in fact define an operation on R>0 : given any
positive x, y ∈ R>0 , their sum x + y is still positive, and thus lies in R>0 .
Furthermore, we know this operation is associative, since addition is in fact
associative on all of R.
However, I claim there is no identity element in R>0 with respect to addition.
Indeed suppose there were an e ∈ R>0 satisfying the identity axiom. Then in
particular we would have e + 1 = 1; but this implies that e = 0, which is a
contradiction since 0 ∉ R>0 .
Once we know there is no identity, there is no need to look for inverses, since
this notion makes use of an identity element in its definition.
Example 3.4. Now let G = R, the set of all real numbers. Show that (R, +) is
a group, but (R, ·) is not a group, where + and · denote real number addition
and multiplication, respectively.
Solution: consider first (R, +). As noted above, we know already that
addition is associative on R. We declare the identity element to be e = 0 ∈ R.
This satisfies the identity axiom as 0 + r = r + 0 = r for any r ∈ R. Lastly
given any r ∈ R, its inverse with respect to + is −r, since r + (−r) = 0 = e.
(Note: common decency prevents us from using the group notation for inverses
and writing in this case r^{-1} = −r.)
Now consider (R, ·). Multiplication is indeed an associative operation on R,
and furthermore we can set e = 1 as the group identity element; in fact, we are
forced to do so: if e · r = r for all r ∈ R, then in particular e · 1 = 1, which
implies e = 1. Furthermore given any nonzero r ∈ R, we can define its inverse
as r^{-1} = 1/r. However, we cannot forget that 0 ∈ R, and 0 has no inverse with
respect to multiplication. Indeed, we have 0 · r = 0 for all r, so there can be no
r with 0 · r = 1 = e.
3.4 Just tunings and equal temperament
Let f ∈ Xfreq correspond to a particular instance of C, and let 2f be its transposition up one octave. Between f and 2f lie infinitely many frequencies in Xfreq ,
and yet our tuning system, 12-tone equal temperament, makes use of only 12
of these: the 12 white and black notes of the keyboard starting with the first C
and ending with B.
What exactly are the corresponding frequencies of these pitches, and how
did we decide upon them?
The 12-tone equal-tempered system, though itself not strictly based on the
harmonic series on f (f , 2f , 3f ,. . . ), is the direct descendant of tuning systems
that were based on this series; we will call such systems just tunings. After rigorously defining the 12 pitches appearing in our (unjust) equal-tempered system,
we will compare this system with one of its direct ancestors, the Pythagorean
tuning system.
12-tone equal temperament
We continue to let f ∈ Xfreq correspond to a C pitch.
The pitches of equal temperament are generated using a single half step
interval that divides the octave from f to 2f into 12 equal subintervals. What
ratio c corresponds to this half step interval?
We are tempted to take the octave interval ratio 2 and divide it by 12,
yielding 2/12 = 1/6. However this breaks 2 into 12 equal parts in an additive
manner,
2 = 1/6 + 1/6 + ··· + 1/6   (12 times),
and we’ve seen that multiplication is the relevant operation when dealing with
frequency space. As we will see below we seek instead a number c such that
2 = c · c · c ··· c   (12 times).
Thinking in terms of our transposition group gives a clearer perspective of
things.
Raising f by a fixed interval step by step corresponds to fixing a c > 1 in
Tfreq and successively multiplying f by c. Thus the first step would be cf , the
second step would be c(cf) = c^2 f , and in general the n-th step would be c^n f .
To say that this fixed interval c divides the octave into 12 steps means that
the 12-th step c^{12} f brings us to 2f : i.e., we have c^{12} f = 2f . Canceling f on
both sides, we conclude that c^{12} = 2. Lastly, we solve this equation for c to
conclude that
c = 2^{1/12} (the twelfth root of 2)
represents transposition up by an equal-tempered half step!
Equal-tempered intervals
Once we know that the equal-tempered half step corresponds to c = 2^{1/12},
we can easily represent any other equal-tempered interval in terms of c by computing the interval’s length in half steps: an interval of length n half steps
corresponds to
c^n = (2^{1/12})^n = 2^{n/12} (using an old exponentiation rule).
Thus we easily derive the following table (M=major, m=minor, P=perfect):
Interval   Half step length   Exact value   Decimal approx.
m2         1                  2^{1/12}      1.059
M2         2                  2^{2/12}      1.122
m3         3                  2^{3/12}      1.189
M3         4                  2^{4/12}      1.26
P4         5                  2^{5/12}      1.335
Tritone    6                  2^{6/12}      1.414
P5         7                  2^{7/12}      1.498
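The decimal column of this table follows directly from the rule that n half steps correspond to 2^{n/12}. A short Python sketch that regenerates it (the names INTERVALS and equal_tempered_ratio are illustrative only):

```python
INTERVALS = ["m2", "M2", "m3", "M3", "P4", "Tritone", "P5"]


def equal_tempered_ratio(n):
    """Frequency ratio for an interval of n equal-tempered half steps."""
    return 2 ** (n / 12)


# One row per interval: (name, length in half steps, ratio rounded to 3 places).
table = [(name, n, round(equal_tempered_ratio(n), 3))
         for n, name in enumerate(INTERVALS, start=1)]

for name, n, approx in table:
    print(f"{name:8s}{n:3d}   2^({n}/12)   {approx}")
```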
Sequence of fifths
There are other intervals besides the half step that can generate all 12
pitches of the equal-tempered system. One example is the perfect fifth c5 =
2^{7/12}. For what follows I will fix a frequency f corresponding to C2, though the
procedure I describe works with any starting pitch.
Beginning with f we transpose successively by a perfect fifth, and if necessary, reduce by an octave to get a pitch between f and 2f . The table below
illustrates the procedure for the first few pitches in this sequence.
Term   Frequency                     After octave adjustment   Pitch name
f0     f                             f                         C2
f1     2^{7/12} f0 = 2^{7/12} f      2^{7/12} f                G2
f2     2^{7/12} f1 = 2^{14/12} f     2^{2/12} f                D2
f3     2^{7/12} f2 = 2^{9/12} f      2^{9/12} f                A2
f4     2^{7/12} f3 = 2^{16/12} f     2^{4/12} f                E2
The resulting sequence f0 , f1 , f2 , . . . is called a sequence of fifths: as the last
column shows, the pitch names jump up by fifths.
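In terms of exponents, the procedure amounts to repeatedly adding 7 and reducing modulo 12. The Python sketch below records only the exponent n of each term 2^{n/12} f; the list NOTE_NAMES, which spells every black key as a sharp, is an assumption made for illustration.

```python
# Pitch names for the exponents 0..11 above C (sharps only, by assumption).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def sequence_of_fifths(steps=12):
    """Exponents n, meaning frequency 2^(n/12) * f, after octave adjustment."""
    exponents = []
    n = 0
    for _ in range(steps):
        exponents.append(n)
        # Up a fifth is 7 half steps; subtracting 12 reduces by an octave.
        n = (n + 7) % 12
    return exponents


exponents = sequence_of_fifths()
names = [NOTE_NAMES[n] for n in exponents]
print(exponents)
print(names)
```

The first five entries reproduce the table (C, G, D, A, E), and the twelve exponents together cover every value from 0 to 11 exactly once.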
The procedure we described adjusts by octave at each step if necessary.
Alternatively, starting at our f corresponding to C2, we could simply repeatedly
transpose up by fifth to generate a 12-note sequence, and then adjust by the
correct number of octaves as shown in the figure below.
Here the last staff contains the pitches adjusted appropriately by octave. Note
the enharmonic relation between F♯ and G♭ in measures 7 and 8.
Pythagorean tuning
In Gaffurius’ woodcut depiction of Pythagoras we see in each experiment
the sequence (6, 8, 9, 12). The sequence gives rise to the frequency ratios
8/6 = 4/3 , 9/6 = 3/2 , 12/6 = 2 .
Recall that the intervals corresponding to 4/3 and 3/2 are called the Pythagorean
fourth and fifth, respectively, and that these are roughly equal to the equal-tempered perfect fourth and fifth.
The Pythagorean tuning system can be generated by these two intervals by
successively transposing up by a Pythagorean fourth or fifth, and then adjusting by octave as necessary, a process similar to the sequence-of-fifths procedure
described above.
In fact, since going up a fourth is the same as going down a fifth after octave
adjustment, we can generate the Pythagorean system via a sequence of fifths
going up and down from our starting frequency f .
Fix the frequency f corresponding to C4. The table below illustrates how
the Pythagorean tuning system is generated by transposing up and down by a
Pythagorean fifth (c = 3/2). Reading up (resp., down) the table corresponds to
transposing up (resp., down) by Pythagorean fifths.
Term   Frequency                    After octave adjustment   Approx. pitch name
f6     (3/2) f5 = (729/256) f       (729/512) f               F♯4
f5     (3/2) f4 = (243/128) f       (243/128) f               B4
f4     (3/2) f3 = (81/32) f         (81/64) f                 E4
f3     (3/2) f2 = (27/16) f         (27/16) f                 A4
f2     (3/2) f1 = (9/4) f           (9/8) f                   D4
f1     (3/2) f                      (3/2) f                   G4
f0     f                            f                         C4
f−1    (2/3) f                      (4/3) f                   F4
f−2    (2/3) f−1 = (8/9) f          (16/9) f                  B♭4
f−3    (2/3) f−2 = (32/27) f        (32/27) f                 E♭4
f−4    (2/3) f−3 = (64/81) f        (128/81) f                A♭4
f−5    (2/3) f−4 = (256/243) f      (256/243) f               D♭4
f−6    (2/3) f−5 = (512/729) f      (1024/729) f              G♭4
Some peculiarities about the Pythagorean system:
1. The Pythagorean half step is the interval between the Pythagorean
E and F. This corresponds to c = (4/3)/(81/64) = 256/243. Unlike the
equal-tempered half step, we cannot obtain the other intervals by taking
successive transpositions by c. For example, we have c^2 < 9/8 < c^3,
showing in particular that two Pythagorean half steps do not make a
Pythagorean whole step.
2. The table above contains in fact 13 distinct pitches! Although F♯ and G♭
are the same pitch in equal-tempered tuning, the frequency f6 = (3^6/2^9) f is
slightly higher than our F♯, and f−6 = (2^{10}/3^6) f is slightly lower than our G♭.
The ratio f6/f−6 = 3^{12}/2^{19} is very close but not equal to 1. This ratio is
called the Pythagorean comma.
3. Whereas a sequence of equal-tempered fifths starting at G♭ would take us
back exactly to G♭ after 12 steps, whence “circle of fifths”, the sequence of
Pythagorean fifths takes us from f−6 to f6 , which is slightly higher than
f−6 . The sequence of Pythagorean fifths thus fails to close, whence “spiral
of Pythagorean fifths”, and this failure is measured by the Pythagorean
comma.
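The Pythagorean ratios, the Pythagorean half step and the comma can all be recomputed exactly with rational arithmetic. This is a minimal Python sketch; the helper octave_adjust and the dictionary pythagorean are hypothetical names, with pythagorean[k] standing for the octave-adjusted ratio f_k / f.

```python
from fractions import Fraction


def octave_adjust(ratio):
    """Multiply or divide by 2 until the ratio lies in the range [1, 2)."""
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio


fifth = Fraction(3, 2)
# Ratios f_k / f for k = -6, ..., 6, after octave adjustment.
pythagorean = {k: octave_adjust(fifth ** k) for k in range(-6, 7)}

half_step = pythagorean[-1] / pythagorean[4]   # Pythagorean F over E
comma = pythagorean[6] / pythagorean[-6]       # f6 over f-6

print(pythagorean[6], pythagorean[-6])
print(half_step)
print(comma, float(comma))
```

Printing float(comma) shows how close the comma is to 1 while the exact Fraction confirms that it is not equal to 1.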
We saw that the sequence of Pythagorean fifths did not close after 12 steps,
but what about after 13 steps, or 100 steps? In fact the sequence will never
close.
To state this more clearly, let c = 3/2 and fix any frequency f . Then no two
frequencies in the Pythagorean sequence of fifths
f, cf, c^2 f, c^3 f, ...
are the same, even after adjusting octaves.
Here is a proof of this fact. Suppose by contradiction that two elements in
the sequence, say c^n f and c^m f with n < m, did in fact differ by a number of
octaves. Then we would have
c^n f = 2^{-r} c^m f,
where r is the number of octaves between them. Canceling f and substituting
c = 3/2, we obtain
3^n/2^n = 2^{-r} · 3^m/2^m.
Now clear denominators to obtain
2^{m+r−n} = 3^{m−n}.
This last equality is a contradiction! Why? Since n < m, the right-hand side is a power of 3 greater than 1, and an integer
greater than 1 cannot simultaneously be a power of 2 (the left-hand side) and a power of 3 (the right-hand side).
This is a consequence of the fundamental theorem of arithmetic, which we will
discuss below. Since we’ve reached a contradiction our original assumption must
be false. Thus no two frequencies in the Pythagorean sequence of fifths differ
by a number of octaves; the sequence never closes!
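A computer can only test finitely many steps, so the following Python sketch is an illustration of the claim rather than a substitute for the proof. It uses the fact that c^m f and c^n f differ by octaves exactly when c^{m−n} is a power of 2, so it is enough to test the powers of c = 3/2 themselves. The function names are hypothetical.

```python
from fractions import Fraction


def is_power_of_two(q):
    """True if the positive fraction q equals 2^r for some integer r."""
    def pow2(k):
        # A positive integer is a power of 2 when it has a single 1 bit.
        return k & (k - 1) == 0

    num, den = q.numerator, q.denominator
    # Fractions are stored in lowest terms, so q is a power of 2 exactly
    # when one of num, den is 1 and the other is a power of 2.
    return (den == 1 and pow2(num)) or (num == 1 and pow2(den))


def first_closure(max_steps):
    """Smallest m <= max_steps with (3/2)^m a power of 2, or None."""
    c = Fraction(3, 2)
    for m in range(1, max_steps + 1):
        if is_power_of_two(c ** m):
            return m
    return None


print(first_closure(200))
```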
The unjustness of equal temperament
The Pythagorean system is a just one: its intervals all correspond to rational
numbers, ratios m/n where both m and n are integers. This is a consequence
of this system being generated by the interval 3/2 taken from the harmonic
series. The equal-tempered system was the result of efforts to iron out some of
the peculiarities of the Pythagorean system and its descendants. Though equal
temperament was successful in this regard, justness was lost in the process.
In fact the only intervals in equal temperament that are rational numbers are
octaves; the rest, starting with the half step 2^{1/12} (the twelfth root of 2) upon which the
system is founded, are irrational (that is, not rational).
Classic 3 (The irrationality of the twelfth root of 2, by Euclid). Euclid has a lot of classics.
You should also check out The infinitude of primes.
The argument I describe below is more often used to demonstrate that √2 is
irrational, but the two proofs are very similar. Both rely on Euclid’s fundamental
theorem of arithmetic, which I will describe first, along with some terminology.
A prime number is a positive integer p ∈ Z that has exactly two factors,
namely 1 and p. Some examples: 1 is not prime, as it has exactly one factor; 2
is prime; 3 is prime; 4 is not prime as 1, 2 and 4 are all factors. Equivalently
an integer n > 1 is not prime (called composite) if and only if we can factor
n = a · b with a > 1 and b > 1. A useful property of prime numbers is that if a
prime p divides a product of integers a · b, then it divides a or b; in particular,
if a prime p divides a power a^r of an integer a, then p divides a.
The fundamental theorem of arithmetic (FTA) tells us that prime numbers are the building blocks of all integers. Formally, given any integer
n > 1, we can:
(i) decompose n = p1 · p2 ··· pk as a product of prime numbers pi , and
furthermore,
(ii) this decomposition is unique.
This theorem is also known as the unique factorization theorem. Let’s look at a
simple example to see how this works, and to understand what exactly is meant
by “unique”. The integer n = 12 can be written as a product of primes in a
number of ways: e.g., we have
12 = 2 · 2 · 3
12 = 2 · 3 · 2
12 = 2^2 · 3
Technically the last example expresses 12 as a product of powers of primes, but
this is the standard way we write the prime factorization of 12. These three
factorizations are all different in some sense, but each factorization draws from
the same set of primes {2, 3}, and furthermore, the number of times each prime
of this set “appears” is the same: 2 appears twice, 3 appears once. This then is
what we mean by uniqueness: given any two prime factorizations of an integer
n, the factorizations will contain the same set of primes, and each prime of this
set will appear the same number of times.
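This notion of uniqueness, the same primes with the same multiplicities, is what a multiset records. The Python sketch below factors an integer by trial division and stores the result in a Counter; prime_factorization is a hypothetical helper written for illustration, not an efficient algorithm.

```python
from collections import Counter


def prime_factorization(n):
    """Return a Counter mapping each prime factor of n > 1 to its multiplicity."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = Counter()
    p = 2
    while p * p <= n:
        # Divide out p as many times as possible before moving on.
        while n % p == 0:
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors[n] += 1
    return factors


twelve = prime_factorization(12)
print(twelve)
# The orderings 2*2*3 and 2*3*2 describe the same multiset of primes.
print(Counter([2, 2, 3]) == Counter([2, 3, 2]) == twelve)
```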
We are finally ready to prove that c = 2^{1/12} (the twelfth root of 2) is irrational: that is, we
cannot write c = m/n with m and n integers. Indeed suppose we could. Writing
c = 2^{1/12}, we would have
2^{1/12} = m/n   (1)
for two integers m and n. Canceling factors if needed in the ratio m/n, we can
assume that m and n share no common factors. This is very important later
on.
Now raising both sides of (1) to the 12th power, we see that
2 = m^{12}/n^{12},   (2)
which implies
2n^{12} = m^{12}.   (3)
Since 2 divides the left-hand side of (3), it must also divide the right-hand side. So 2
divides m^{12}. Since 2 is prime, this implies that 2 divides m. Now this implies
that in fact 2^{12} divides m^{12}, since each of the twelve m’s in this product has a
2 in it. Now we see that the right-hand side of (3) is divisible by 2^{12}, thus so is
the left-hand side; 2^{12} divides 2n^{12}. However, for this many 2’s to appear in the
factorization of 2n^{12}, we must have 2 appearing in the factorization of n; but
then m and n share 2 as a common factor, which is a contradiction! Thus we
cannot write c = 2^{1/12} as a ratio of two integers, which means c is irrational.
Classic 4 (Eleven Intrusions, by Harry Partch). Harry Partch was a 20th-century American composer who invented and composed with a just scale containing 43 distinct frequencies within the octave.
The first two pieces in this cycle are scored for Bass Marimba and the
Harmonic Canon, instruments invented by Partch himself.
No. 1, “Study on Olympos’ Pentatonic”. Just intonation scale.
Frequency   f   (9/8)f   (6/5)f   (3/2)f   (8/5)f   2f
Pitch       G   A        B♭       D        E♭       G
No. 2, “Study on Archytas’ Enharmonic”. A scale built of two identical
tetrachords. The pitch G↑ corresponding to 28/27 is in fact not included in Partch’s
43-note scale on G.
Frequency   f   (28/27)f   (16/15)f   (4/3)f   (3/2)f   (14/9)f   (8/5)f   2f
Pitch       G   G↑         A♭         C        D        D↑        E♭       G
No. 5, “The Waterfall”, scored for Adapted Guitar II and Diamond Marimba,
more Partch inventions.
4 Pitch space
The multiplicative picture of frequency space and its transposition group,
wherein intervals are identified with ratios of frequencies, and transposition is
identified with multiplication, is forced upon us by the physical nature of sound.
However, the language we use when speaking of pitch is additive in nature.
Earlier we defined the interval between two pitches in terms of their difference,
and we think of transposing a pitch up by an interval to be addition by a certain
number of half steps.
Recall that our notion of pitch is itself an abstraction of the physical property
of frequency: whereas frequency is a measurable property of sound waves, pitch
is a measure of how high or low we perceive a sound to be. Since our movement
from the world of frequency to the world of pitch is already a movement away
from the pure physical properties of sound, there is no harm in also freeing
ourselves from the multiplicative picture of frequency space the world imposes
on us, and choosing instead an additive picture of pitch space that suits the
language we already use.
Mathematically speaking, our switch of pictures will correspond to replacing
the multiplicative group (R>0 , ·) with the additive group (R, +).
4.1 Pitch space
We will represent pitches as points on the real line. The pitches of 12-tone
equal temperament will be embedded in the real line by declaring that the pitch
C4 (“middle C”) will correspond to the real number 0 (the “middle” of the real
line), and that moving up or down by n half steps from C4 corresponds to taking
0 and adding or subtracting n.
Definition 4. We define pitch space to be the set Xpitch of all possible pitches
and identify this with the set of all real numbers: Xpitch = R.
The pitches of our 12-tone equal-tempered system are embedded in R by
(i) setting 0 equal to C4, and
(ii) declaring transposition up by a half step to correspond to addition by 1.
The formal definition of Xpitch gives us the following new picture of pitch
space.
Pitch:    G3   A♭3   A3   B♭3   B3   C4   C♯4   D4   D♯4   E4   F4
Number:   −5   −4    −3   −2    −1   0    1     2    3     4    5
(The marked points are the integers Z, sitting inside the real line R.)
As the diagram illustrates, the pitches of 12-tone equal temperament are embedded as the integers Z inside of R. (Recall that Z = {..., −3, −2, −1, 0, 1, 2, ...}.)
The real numbers lying between successive integers correspond to pitches that
lie outside of our tuning system. It is instructive to compare our pitch space
picture of equal temperament with the corresponding frequency space one:
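[Figure: the pitch-space number line above (G3 = −5 through F4 = 5, with Z inside R), drawn over the corresponding frequency axis R>0 with ticks at 200, 300, 400 and 500 and the pitches G3, C4, F4 and B♭4 marked.]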
4.2 The transposition group Tpitch
Given two equal-tempered pitches P1 < P2 in Xpitch , their difference P2 − P1
tells us the length in half steps of the interval between them. (Observe that
since P1 , P2 are equal-tempered, they will be integers, and thus so will their
difference.) We generalize this notion to any two pitches in Xpitch .
Definition 5. An interval in Xpitch is a set {P, Q}, where P, Q ∈ Xpitch . The
length (in half steps) of the interval {P, Q} is defined as |P − Q|.
Example 4.1.
1. Let P = 5/2 and Q = 13/2. Then the interval {P, Q} has length |P − Q| =
|−8/2| = 4 half steps. Thus the interval is a major third lying wholly
outside of our tuning system!
2. Given any pitch P , the interval {P, P } has length |P − P | = 0; this is the
unison interval.
3. Let P = 2 − √2 and Q = 2 + √3. Then the interval {P, Q} has length
|2 − √2 − (2 + √3)| = |−√2 − √3| = √2 + √3 half steps.
Given any real number t ∈ R, the operation
P ↦ t + P
defines a transposition operation on Xpitch . If t ≥ 0 this is transposition up by
t half steps; if t < 0 this is transposition down by |t| half steps. This motivates
the following definition.
Definition 6. The transposition group of Xpitch is defined as the set Tpitch =
R. An element t ∈ Tpitch acts as a transposition by sending a pitch P to t + P .
Comment 4.1. As its name indicates, Tpitch is indeed a group: namely, the
group (R, +), of real numbers with group operation defined to be addition.
Arithmetic in Xpitch
Switching from Xfreq to Xpitch simplifies our interval arithmetic immensely.
Transpositions and interval lengths are now computed with additions and subtractions. This is more in line with how we naturally speak about pitches and
intervals. Furthermore transposing by an equal-tempered half step now corresponds to the supremely simple operation of adding the integer 1 to a pitch, as
opposed to the more complicated operation of multiplying a frequency by the
irrational number 2^{1/12}.
Example 4.2. Find the t ∈ Tpitch that corresponds to the following transpositions.
1. Down by a major third.
Solution: (t = −4)
2. Up by a major 7th.
Solution: (t = 11)
3. Up by a tritone.
Solution: (t = 6)
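Since pitches are real numbers and transposition is addition, these computations take one line each. The Python sketch below is illustrative only: interval_length and transpose are hypothetical helper names, and C4 = 0 follows Definition 4.

```python
from fractions import Fraction


def interval_length(p, q):
    """Length in half steps of the interval {p, q} in pitch space."""
    return abs(p - q)


def transpose(p, t):
    """Transpose the pitch p by t half steps (t may be negative)."""
    return t + p


C4 = 0  # middle C, as in Definition 4

# Example 4.1(1): P = 5/2 and Q = 13/2 are 4 half steps apart.
print(interval_length(Fraction(5, 2), Fraction(13, 2)))

# Example 4.2, starting from C4.
down_major_third = transpose(C4, -4)   # lands on A-flat 3
up_major_seventh = transpose(C4, 11)
up_tritone = transpose(C4, 6)
print(down_major_third, up_major_seventh, up_tritone)
```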
4.3 Comparing the two pictures
Working in pitch space is easier than working in frequency space: so much
easier that you may be asking yourself whether we are cheating somehow. A
more nuanced question: do we lose any information when we switch from our
frequency picture to our pitch picture? The answer is no, as far as interval
information is concerned, and we have a precise mathematical way of showing
this.
For what follows let’s first drop the (ontological) distinction between frequency and pitch, referring to both simply as pitch. Then the main difference
between Xfreq and Tfreq , on the one hand, and Xpitch and Tpitch on the other,
is how exactly the notion of pitch is modeled. The former models pitch multiplicatively using R>0 ; the latter does so additively using R.
To convince ourselves that both models of pitch provide the same information
as far as intervals are concerned, we need to show there is a way of translating
(in a linguistic sense) from one model to the other, and back again. We will do
so mathematically using the function log2 (x).
First consider our two different transposition groups Tfreq = (R>0 , ·) and
Tpitch = (R, +), where here I emphasize the group structure of both of these
objects. We define a function
φ : Tfreq → Tpitch
by setting φ(c) = 12 log2 (c) for any c ∈ Tfreq .
Example 4.3.
1. Take c = 2 ∈ Tfreq , the Tfreq representation of transposition up by an
octave. Then φ(2) = 12 log2 (2) = 12. Thus φ sends 2 ∈ Tfreq to 12 ∈
Tpitch . That’s good, since 12 is the Tpitch representation of transposition
up by an octave (12 half steps=1 octave).
2. Take c = 2^{1/12} ∈ Tfreq , the Tfreq representation of the half step. Then
φ(c) = 12 log2 (2^{1/12}) = 12(1/12) = 1, where here I have used the general
property that log2 (2^r) = r for any r. Note that 1 ∈ Tpitch is simply
the Tpitch representation of the half step. This shows that φ correctly
translates our Tfreq representation of the half step to the corresponding
Tpitch representation of the half step.
3. More generally take c = 2^{n/12}, the Tfreq representation of transposition up
by n half steps. Then φ(c) = 12 log2 (2^{n/12}) = 12(n/12) = n, which is the
Tpitch representation of this same transposition. It seems φ does a good
job of translating between our frequency language and pitch language!
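The translator φ is easy to experiment with numerically. In the Python sketch below, phi is a direct transcription of φ(c) = 12 log2(c) using math.log2; results are floating-point values, so they should be read as approximations except where the arithmetic is exact.

```python
import math


def phi(c):
    """Translate a frequency ratio c > 0 into a number of half steps."""
    return 12 * math.log2(c)


octave = phi(2)                   # the octave: 12 half steps
half_step = phi(2 ** (1 / 12))    # the equal-tempered half step: about 1
pythagorean_fifth = phi(3 / 2)    # close to, but not exactly, 7 half steps

print(octave, half_step, pythagorean_fifth)
```

The last value illustrates the earlier remark that the Pythagorean fifth 3/2 is only roughly equal to the equal-tempered perfect fifth of 7 half steps.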
Here’s another nice property of our translating function φ: take any c1 , c2 ∈
Tfreq and let φ(c1 ) = t1 and φ(c2 ) = t2 be their corresponding representatives
in Tpitch ; then we have
φ(c1 · c2 ) = 12 log2 (c1 · c2 )   (by definition)
= 12(log2 (c1 ) + log2 (c2 ))   (since log2 (a · b) = log2 (a) + log2 (b))
= 12 log2 (c1 ) + 12 log2 (c2 )
= φ(c1 ) + φ(c2 ) = t1 + t2 .
Let’s unravel what this means. Let T1 and T2 stand for the transpositions that c1
and c2 represent in Tfreq . Then t1 and t2 are their corresponding representations
in Tpitch , and the equalities φ(ci ) = ti are understood as saying φ translates ci
as ti .
Now the element c1 · c2 is the Tfreq representation of transposing first by T2
and then by T1 . If φ acts as a good translator, then φ(c1 · c2 ) should be the
Tpitch representation of transposing first by T2 and then by T1 . In other words, we
should have φ(c1 · c2 ) = t1 + t2 , which is exactly what the equations above show!
On the level of groups, the equation φ(c1 · c2 ) = t1 + t2 means that the
function φ respects the relevant group operations on each group; it sends a
product in R>0 to a sum in R. This is one of the defining properties of what is
called a group homomorphism.
Definition 7. Let (G, ·) and (H, ∗) be two groups. I’ve denoted the two relevant
operations with different symbols so that we can keep them straight; by the same
token, let eG be the identity element of G, and let eH be the identity element
of H. A group homomorphism from G to H is a function
φ: G → H
satisfying:
(i) φ(eG ) = eH ,
(ii) φ(g1 · g2 ) = φ(g1 ) ∗ φ(g2 ) for all g1 , g2 ∈ G.
Comment 4.2. The two conditions can be summarized by saying that a group
homomorphism is a function from G to H that respects the group structure of
each: it sends the identity element to the identity element, and it sends products
(using ·) in G to products (using ∗) in H.
Since we have already shown that our function φ : R>0 → R respects the
group operations, to show it is a group homomorphism we need only show that
it sends the identity of R>0 (that is, the element 1 ∈ R>0 ) to the identity of R
(that is, the element 0 ∈ R). This is easy: φ(1) = 12 log2 (1) = 12 · 0 = 0.
Translating back
It remains to show that we can translate back from pitch space language to
frequency space language. We do so with the translator function
ψ : Tpitch → Tfreq
defined by setting ψ(t) = 2^(t/12) .
To see that φ and ψ serve as complementary translators back and forth
between frequency and pitch space, consider what happens when you start with
any c ∈ Tfreq , apply φ (translating it into pitch language), and then apply ψ
(translating back into frequency language). Using function notation the result
would be
ψ(φ(c)) = ψ(12 log2 (c)) (def. of φ)
= 2^((12 log2 (c))/12) (def. of ψ)
= 2^(log2 (c)) (simple algebra)
= c (since 2^(log2 (c)) = c for any c > 0)
We have shown that ψ(φ(c)) = c, which tells us that the translator ψ correctly
undoes the translating work of φ. The same is true going the other way: one
shows in a similar manner that φ(ψ(t)) = t for any t ∈ Tpitch .
In mathematics we say that the functions ψ and φ are inverses of one
another, denoted ψ = φ^(−1) and φ = ψ^(−1) .
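The properties of the two translators can also be checked numerically. What follows is only a sketch: the function names phi and psi mirror φ and ψ, the sample ratios 3/2 and 5/4 are chosen just for illustration, and floating-point arithmetic means we compare with a tolerance rather than exact equality.

```python
import math

def phi(c: float) -> float:
    """Translate a frequency ratio c > 0 into a number of half steps."""
    return 12 * math.log2(c)

def psi(t: float) -> float:
    """Translate a number of half steps t back into a frequency ratio."""
    return 2 ** (t / 12)

c1, c2 = 3 / 2, 5 / 4  # sample ratios, chosen only for illustration

# Homomorphism property: a product of ratios becomes a sum of half steps.
print(math.isclose(phi(c1 * c2), phi(c1) + phi(c2)))
# The identity 1 of R>0 is sent to the identity 0 of R.
print(phi(1) == 0.0)
# psi undoes phi, and phi undoes psi (up to rounding).
print(math.isclose(psi(phi(c1)), c1), math.isclose(phi(psi(7.0)), 7.0))
```

A numerical check like this is of course not a proof; it only confirms the algebra above on a few sample values.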
Sameness of our two pictures
Let’s collect all the nice properties of our translator function φ : Tfreq →
Tpitch .
1. φ(2^(1/12) ) = 1: φ sends the frequency space half step (2^(1/12) ) to the pitch
space half step (1).
2. φ(c1 ·c2 ) = φ(c1 )+φ(c2 ): φ respects how each language expresses successive
transpositions.
3. φ has an inverse translator ψ going from pitch language back to frequency
language.
These three properties should convince us that anything we say using our frequency language has an equivalent translation into our pitch language, and vice
versa. This means that Tfreq and Tpitch are simply two different, but equivalent
pictures of the world of intervals and transpositions.
We might choose to use the one picture or the other, depending on how
convenient it is. Our translators φ and ψ allow us to move effortlessly between
the two.
Group isomorphisms
The discussion above shows that our function φ : R>0 → R is what is known
as a group isomorphism.
Definition 8. Given groups (G, ·) and (H, ∗), a group isomorphism is a
homomorphism φ : G → H that is invertible: that is, for which there is an
inverse function ψ = φ^(−1) : H → G. All in all this means
(i) φ(eG ) = eH ,
(ii) φ(g1 · g2 ) = φ(g1 ) ∗ φ(g2 ),
(iii) φ is invertible.
We say in this case that G and H are isomorphic.
Comment 4.3. In the spirit of the foregoing discussion, it is useful to think
of two isomorphic groups as being different pictures, or representations, of the
same thing. They may look different as sets (as R>0 and R certainly do), but
as groups they are essentially the same.
Sequence of Pythagorean fifths in Xpitch
We end with a nice illustration of the advantages of being bilingual.
Recall that transposition up by a Pythagorean fifth is represented by the
constant c = 3/2 ∈ Tfreq .
In pitch language this translates as
φ(3/2) = 12 log2 (3/2) =: tPy5 .
A calculation shows tPy5 ≈ 7.02. This makes sense, since transposition by an
equal-tempered perfect fifth is represented by the element 7 in Tpitch .
Note how much simpler the frequency representation (c = 3/2) of the Pythagorean
fifth is compared to the pitch representation (tPy5 = 12 log2 (3/2)). In general our
frequency language (using Xfreq and Tfreq ) is better suited to deal with questions of just tuning systems (where intervals correspond to rational numbers),
whereas our pitch language (using Xpitch and Tpitch ) is in a sense tailor-made
to deal with the equal-tempered system.
Now let’s model the Pythagorean sequence of fifths in pitch space, starting
with the pitch C4 = 0 ∈ Xpitch . From above, transposing by a Pythagorean fifth
corresponds to addition by the constant tPy5 = 12 log2 (3/2). Doing this in
succession yields the sequence
0, 0 + tPy5 , 0 + 2tPy5 , 0 + 3tPy5 · · · =
0, tPy5 , 2tPy5 , 3tPy5 , · · · ≈
0, 7.02, 14.04, 21.06, . . .
Recall that we must adjust by octave to ensure all pitches are within an octave
of C4. In pitch space this corresponds to subtracting an appropriate multiple
of 12 (12 half steps=1 octave) from the terms above so that they lie within
[0, 12]. After octave adjustment our sequence of Pythagorean fifths then looks
approximately like
0, 7.02, 2.04, 9.06, . . .
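The octave-adjusted sequence is easy to generate by machine. The sketch below assumes Python's % operator as the octave adjustment (for a positive modulus it returns a value in [0, 12)); the names T_PY5 and pythagorean_fifths are made up for this illustration.

```python
import math

T_PY5 = 12 * math.log2(3 / 2)  # Pythagorean fifth in half steps, about 7.02

def pythagorean_fifths(count: int) -> list[float]:
    """Return the pitches k * T_PY5 for k = 0, ..., count - 1,
    each shifted by octaves into the range [0, 12)."""
    return [(k * T_PY5) % 12 for k in range(count)]

for pitch in pythagorean_fifths(4):
    print(round(pitch, 2))  # 0.0, 7.02, 2.04, 9.06
```

Asking for more terms, for example pythagorean_fifths(13), shows the sequence drifting past C4 = 0 instead of returning to it.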
The three figures below show sequences of Pythagorean fifths starting at
C4= 0 ∈ Xpitch and ending after 12, 24, and 36 transpositions, respectively.
The marked integers 0, 1, 2, . . . , 12 represent the pitches of the equal-tempered
system between C4 and C5.
[Figures: sequences of Pythagorean fifths in Xpitch after 12, 24, and 36 transpositions, each drawn on the interval [0, 12] with the integers 0, 1, 2, . . . , 12 marked.]
Note in the first figure how the Pythagorean pitches are close to, but not equal
to any of the equal-tempered pitches. Here the 12th transposition is the dot
right next to C4=0. The linear distance between these two is the pitch space
version of the Pythagorean comma: 12 log2 (3^12 /2^19 ) ≈ 0.235 half steps.
As the remaining figures show, when we continue the sequence from here the
dots seem to entirely fill up the interval [0, 12], and yet they will never again
land squarely on any of the integers in 0, 1, 2, . . . , 12.
Moral: though the Pythagorean system is best modeled in terms of Xfreq
and Tfreq , it is only after translating this representation into the language of
Xpitch and Tpitch that we get a clear picture of the relation between it and
our equal-tempered system. It’s a good thing to know how to speak multiple
languages!
Classic 5 (Circle of fifths). We have seen how transposing successively by equal-tempered perfect fifths cycles us through all 12 pitches of equal-temperament
without repetition. If we start with C, the sequence is
(C, G, D, A, E, B, F♯, D♭, A♭, E♭, B♭, F, C).
The return to C at the end of the sequence is what earns it the name “circle of
fifths”.
The circle of fifths is used pervasively in music of all different eras and styles.
Composers will often traverse segments of this sequence (forwards or backwards)
in order to “move” a piece from one tonal region to another.
Why a sequence of fifths, as opposed to say a circle of minor thirds, or even a
circle of half steps? Besides the obvious observation that a circle of minor thirds
does not hit all 12 pitches of our scale (C, E♭, G♭, A, C), and a circle of half
steps (C, C♯, D, D♯, E, . . . ) can sound too predictable, there are some further
properties of the circle of fifths that make it particularly appealing musically:
1. Movement by a fifth has strong harmonic significance.
2. If you take every other term in the circle of fifths, you get two subsequences
that move by whole step: (C, D, E, . . . ), and (G, A, B, . . . ). Composers
often separate these two strands when employing a circle of fifths.
Figure 2: Mozart, Piano Sonata No. 12, KV 332, ca. 0’54” into the performance
of our example recording
Mozart, Piano Sonata No. 12, KV 332
Starting with the second to last measure of the first staff, Mozart moves
down the circle of fifths, using a pattern of rising fourths (F up to B♭) followed
by falling fifths (B♭ down to E♭). The sequence segment we get here is (F, B♭,
E♭, A♭, D♭), at which point Mozart short-circuits things with a G instead of a
G♭.
Down is in general the preferred direction along the circle of fifths, mainly
because the falling fifth pattern can be interpreted harmonically as a dominant
(V) to tonic (I) movement, one of the most common progressions in music. We
already see this movement between tonic and dominant in measures 2-4, where
we oscillate between F (tonic) and C (dominant). We’ll have more to say about
this later!
“Jordu”, written by Duke Jordan, performed by Clifford Brown
The circle of fifths is ubiquitous in jazz music. Here both the opening theme
as well as the bridge section (starting at the “B” marking in the third staff)
make their way down the sequence of fifths, as the chords notated above clearly
indicate. Listen for the characteristic rising fourths and falling fifths pattern in
the bass and piano parts.
Going up the circle of fifths
Examples where music ascends the circle of fifths are harder to come by. The
two examples below are from Haydn’s Piano Sonata No. 49 (Hob. XVI) and
Beethoven’s Bagatelle No. 2, from Seven Bagatelles, Op. 33.
5
Pitch-class space
You will have noticed by now that when speaking of pitches we often ignore
differences of octaves. More often than not we refer to the pitch C4 simply as
C, or the pitch G♭7 simply as G♭. Our way of speaking of pitches suggests that
we perceive a general property of C-ness and G♭-ness that transcends octave
differences.
This identification between octaves is common in nearly all musical traditions, as it turns out. Furthermore, research in psychoacoustics suggests that we
perceive pitches that are octave transpositions of one another as being strongly
related, almost interchangeable.
We capture this identification mathematically by the notion of octave equivalence.
Figure 3: Haydn, Piano Sonata No. 49 (Hob. XVI), ca. 4’05”
5.1
Octave equivalence
Definition 9. Two pitches P1 and P2 are octave equivalent to one another,
written P1 ∼oct P2 if they are related by transposition by a number of octaves.
Recall that transposition (up or down) by a number of octaves corresponds to
adding an integer multiple of 12. Thus we see that two pitches P1 , P2 ∈ Xpitch
are octave equivalent if
P1 = P2 + n12
for some integer n ∈ Z.
Comment 5.1. Since P1 = P2 + n12 if and only if P1 − P2 = n12, it follows
that P1 ∼oct P2 if and only if
(i) P1 − P2 is an integer, and
(ii) this integer is divisible by 12.
Example 5.1. Decide whether the following pairs of pitches are octave equivalent.
1. P1 = 19, P2 = 151 Solution: we have P1 − P2 = 19 − 151 = −132. This is an
integer that is divisible by 12: −132 = (−11) · 12. Thus P1 ∼oct P2 . In fact both
pitches are equivalent to P = 7; they are all G’s!
2. P1 = 66, P2 = 5 Solution: we have P1 − P2 = 66 − 5 = 61. This is indeed an
integer, but not divisible by 12. Thus P1 ≁oct P2 . In fact P1 ∼oct 6, showing that P1
is an F♯, whereas P2 is an F.
Figure 4: Beethoven, Bagatelle No. 2 from Seven Bagatelles, Op. 33.
3. P1 = 50/3, P2 = 122/3 Solution: we have P1 − P2 = −72/3 = −24, which is
indeed an integer divisible by 12. Thus P1 ∼oct P2 . Note that neither of these pitches
actually lies within the equal-tempered system.
4. P1 = 12√2, P2 = 11√2 Solution: we have P1 − P2 = √2. This is not an integer.
Thus P1 ≁oct P2 .
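The test in Comment 5.1 can be sketched in a few lines. To avoid rounding trouble with pitches like 50/3, this illustration uses Python's exact Fraction type; the function name octave_equivalent is made up here, and the sketch only handles integer or rational pitches (an irrational pitch such as 12√2 has no exact representation of this kind).

```python
from fractions import Fraction

def octave_equivalent(p1, p2) -> bool:
    """True when p1 - p2 is an integer divisible by 12.

    Inputs should be ints or Fractions so that the arithmetic is exact.
    """
    return (Fraction(p1) - Fraction(p2)) % 12 == 0

print(octave_equivalent(19, 151))                            # True
print(octave_equivalent(66, 5))                              # False
print(octave_equivalent(Fraction(50, 3), Fraction(122, 3)))  # True
```

The single condition "remainder 0 upon division by 12" covers both parts of Comment 5.1 at once: a difference that is not an integer can never leave remainder 0.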
In the last example we often took a pitch P and found an octave equivalent
pitch P′ satisfying 0 ≤ P′ < 12; for example, this helps when trying to identify
what pitch name corresponds to a given integer pitch in Xpitch . This procedure
works in general. Given any pitch P ∈ Xpitch we can always find an integer n
such that
12n ≤ P < 12(n + 1);
but then we have P ∼oct (P − 12n), and P − 12n lies within [0, 12).
Example 5.2. Take P = 200. Then we have 192 ≤ 200 < 204. Here 192 =
12 · 16 and 204 = 12 · 17. Then 200 ∼oct (200 − 12 · 16). Since 200 − 12 · 16 =
200 − 192 = 8, we see that P ∼oct 8, showing that our pitch is an A♭.
Classic 6 (The division algorithm). The preceding observations are an illustration of a general fact in mathematics often described as the division algorithm,
though it is not really an algorithm. We state it here as a theorem.
The division algorithm. Fix a positive integer n > 0. Given any real number
s ∈ R, there is a unique integer q and a unique real number r satisfying
(i) s = nq + r, where
(ii) 0 ≤ r < n.
The integer q is called the quotient upon division by n; the number r (not
necessarily an integer) is called the remainder upon division by n.
Fix a positive integer n and a real number s. To find the q and r such that
s = nq + r as in the division algorithm, simply locate s between two
consecutive multiples of n, as in the diagram below.
[Diagram: a number line with nq, n(q + 1), and n(q + 2) marked; s lies between nq and n(q + 1), and r = s − nq.]
Comment 5.2. Given an integer n and a real number s, there are of course
many ways to write s = nq′ + r′ with q′ an integer.
For example, take n = 12 and s = 21.5. Then we have
21.5 = 12 · 2 + (−2.5)
21.5 = 12 · 0 + (21.5)
21.5 = 12 · (−1) + 33.5
However, there is only one way to do this where 0 ≤ r′ < 12, namely
21.5 = 12 · 1 + 9.5.
Thus for n = 12 and s = 21.5 the unique choice of q and r is q = 1 and r = 9.5.
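Locating s between two consecutive multiples of n amounts to taking the floor of s/n. Here is a minimal sketch of that idea; the function name divide is an assumption made for this illustration, and with floating-point inputs the remainder is only as exact as the underlying arithmetic.

```python
import math

def divide(s: float, n: int) -> tuple[int, float]:
    """Return (q, r) with s = n*q + r, q an integer, and 0 <= r < n."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    q = math.floor(s / n)  # nq is the largest multiple of n not exceeding s
    r = s - n * q
    return q, r

print(divide(21.5, 12))  # (1, 9.5)
print(divide(200, 12))   # (16, 8)
print(divide(-73, 7))    # (-11, 4)
```

Note that the floor, not truncation toward zero, is what makes the remainder nonnegative for negative s.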
Definition 10. Fix a positive integer n. The division algorithm allows us to
define a function as follows: given any real number as input s ∈ R, we define
the output to be r, its remainder upon division by n. We write this as
s mod n = r.
Since 0 ≤ r < n (as stipulated by the division algorithm), we have defined a
function
mod n : R → [0, n)
s ↦ s mod n.
Example 5.3. Take n = 7. Then we have
mod 7 : R → [0, 7).
Compute x mod 7 for x = −73, 3π, and 49.
Simply apply the division algorithm to each choice of x.
−73 = 7(−11) + 4 ⇒ −73 mod 7 = 4
3π ≈ 7(1) + 2.425 ⇒ 3π mod 7 ≈ 2.425
49 = 7(7) + 0 ⇒ 49 mod 7 = 0
Comment 5.3. When dealing with pitches P ∈ Xpitch , we are simply applying
these concepts to the case n = 12. For example, to figure out what pitch name
P corresponds to, we simply compute P mod 12 and add this many half steps
to C.
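As a small illustration of this recipe, the sketch below looks up a pitch name from P mod 12. The list NAMES and the function pitch_class_name are made up for this example, and each black key is given both of its common spellings, since the choice between sharp and flat depends on musical context. For a positive modulus, Python's % operator returns a value in [0, n), so it agrees with the function mod n defined above.

```python
# Pitch names indexed by the number of half steps above C.
NAMES = ["C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
         "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B"]

def pitch_class_name(p: int) -> str:
    """Name of the equal-tempered pitch p, ignoring its octave."""
    return NAMES[p % 12]

print(pitch_class_name(19), pitch_class_name(151))  # G G
print(pitch_class_name(66), pitch_class_name(5))    # F♯/G♭ F
print(pitch_class_name(200))                        # G♯/A♭
```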
5.2
Equivalence relations
As the notation P1 ∼oct P2 and our foregoing discussion suggest, when we say
P1 and P2 are octave equivalent, we mean something to the effect of they are
“kind of the same”. Indeed, we even sometimes say “P1 and P2 are the same
up to octave equivalence”, suggesting a kind of qualified equality relation.
Like actual equality, octave equivalence also enjoys some familiar basic properties:
(i) Reflexivity: P ∼oct P for all pitches P ∈ Xpitch .
(ii) Symmetry: if P1 ∼oct P2 , then P2 ∼oct P1 .
(iii) Transitivity: if P1 ∼oct P2 and P2 ∼oct P3 , then P1 ∼oct P3 .
These three properties make ∼oct what is called an equivalence relation.
Definition 11. Let X be a set, and let R be any relation defined on X: if the
relation holds between a pair (x, y) of elements of X, we write xRy.
The relation R is an equivalence relation if it satisfies the three following
properties:
(i) Reflexivity: xRx for all x ∈ X.
(ii) Symmetry: if xRy, then yRx.
(iii) Transitivity: if xRy and yRz, then xRz.
Example 5.4. Take X = Z and define a relation x ∼∗ y if y = 3^r · x for some
r ∈ Z. For example 54 ∼∗ 6 since 6 = 3^(−2) · 54, but 8 ≁∗ 12, since we cannot
write 12 = 3^r · 8 for any r ∈ Z.
I claim ∼∗ is an equivalence relation. Let’s show that it is reflexive, symmetric and transitive.
(i) Reflexive: given any x ∈ Z, we have x = 3^0 · x. Thus x ∼∗ x.
(ii) Symmetric: suppose x ∼∗ y. Then y = 3^r · x for some r ∈ Z; but then
x = 3^(−r) · y. Since −r ∈ Z, we have y ∼∗ x.
(iii) Transitive: suppose x ∼∗ y and y ∼∗ z. Then there are integers r, s ∈ Z
such that y = 3^r · x and z = 3^s · y; but then z = 3^s · y = 3^s (3^r · x) = 3^(s+r) · x.
Since r + s ∈ Z, we have x ∼∗ z.
We’ve proved ∼∗ is an equivalence relation!
Example 5.5. Of course, not all relations are equivalence relations.
Consider the relation defined on Z>1 = {2, 3, 4, . . . } by xRy if x and y
share a common nontrivial factor–that is, a common factor greater than 1. This
relation is reflexive (since x is a common factor of x and x), and symmetric (if
x and y share a common factor, then so do y and x), but not transitive: x = 2
and y = 6 share a common nontrivial factor (2), and y = 6 and z = 15 share a
common nontrivial factor (3), but x = 2 and z = 15 share no nontrivial factor!
It is also easy to come up with “real world” examples. Let X be the set of
all living humans, and define xRy to mean x loves y. Tragically, this relation
fails to be symmetric! In fact, sadly, this relation also fails to be reflexive.
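The failure of transitivity for the common-factor relation can be confirmed mechanically. In this sketch the function name related is made up for the illustration; two integers share a nontrivial factor exactly when their greatest common divisor exceeds 1.

```python
from math import gcd

def related(x: int, y: int) -> bool:
    """True when x and y share a common factor greater than 1."""
    return gcd(x, y) > 1

# 2 R 6 and 6 R 15 both hold, yet 2 R 15 fails: R is not transitive.
print(related(2, 6), related(6, 15), related(2, 15))  # True True False
```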
Congruences
Mathematically speaking, our octave equivalence relation is just one example
of a whole family of equivalence relations defined on the set R.
Definition 12. Let X = R, and let n be a positive integer. Given x, y ∈ R,
we say that x is congruent to y modulo n, if n | (x − y). We write x ≡ y
(mod n) in this case.
Comment 5.4. Recall that n | (x − y) if and only if there is an integer r with
nr = x − y. After a little algebra, we see that x ≡ y (mod n) if and only if
x = y + nr for some integer r. Thus x and y are congruent modulo n if and
only if they differ by a multiple of n.
Comment 5.5. It follows immediately from the definition that P1 ∼oct P2 if
and only if P1 ≡ P2 (mod 12).
Fix a positive integer n. It is indeed true that congruence modulo n defines
an equivalence relation: that is, we have
1. x ≡ x (mod n) for all x;
2. if x ≡ y (mod n), then y ≡ x (mod n);
3. if x ≡ y (mod n) and y ≡ z (mod n), then x ≡ z (mod n).
Furthermore, this relation also respects both addition and multiplication in R,
as the following theorem explains.
Modular substitution. Let x, y, x′, y′ be integers with x ≡ x′ (mod n) and
y ≡ y′ (mod n). Then
(i) x + y ≡ x′ + y′ (mod n);
(ii) x · y ≡ x′ · y′ (mod n).
Congruence modulo n and mod n
You will note the notational similarity between the expressions ‘x ≡ y
(mod n)’ and ‘x mod n’. There is indeed a close connection between the two,
but we have to be careful when describing exactly what this is.
First observe the main difference between the two:
1. To say x ≡ y (mod n) is to assert that a certain relation holds between x
and y: namely, x − y is divisible by n.
2. On the other hand x mod n is simply a number–the result of plugging the
input x into the function mod n.
The following theorem tells us precisely how these two notions are related.
Theorem. Fix a positive integer n. Let x and y be any real numbers. Then
x ≡ y (mod n) if and only if x mod n = y mod n.
Furthermore, let x mod n = r. Then x ≡ r (mod n).
Example 5.6. The last two theorems taken in tandem allow us to do much
heavy lifting in terms of modular arithmetic.
Fix n = 4. Compute (102^3 + 61 · 91) mod 4.
Solution: According to the second theorem, to compute x mod 4, we can replace
x with anything equivalent to x modulo 4. Let’s find a nice small x equivalent
to 102^3 + 61 · 91 modulo 4.
We first compute 102 mod 4 = 2, 61 mod 4 = 1, and 91 mod 4 = 3.
This implies by the second theorem that 102 ≡ 2 (mod 4), 61 ≡ 1 (mod 4),
and 91 ≡ 3 (mod 4).
Now we can substitute these in using the first theorem (modular substitution):
(102^3 + 61 · 91) ≡ (2^3 + 1 · 3) (mod 4)
≡ (8 + 3) (mod 4)
≡ 11 (mod 4)
≡ 3 (mod 4).
Lastly, we conclude, again using the second theorem, that (102^3 + 61 · 91) mod 4 =
3 mod 4 = 3.
Note: we could have done this the hard way by directly computing 102^3 +
61 · 91 = 1066759, and then computing 1066759 mod 4 = 3.
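Both routes through this computation can be compared in a few lines. This is only a sketch; the variable names direct and reduced are made up for the illustration, and Python's % operator plays the role of mod n.

```python
n = 4

# The hard way: compute the large number first, then reduce.
direct = (102**3 + 61 * 91) % n

# Modular substitution: reduce each term first, then combine and reduce.
reduced = ((102 % n) ** 3 + (61 % n) * (91 % n)) % n

print(102**3 + 61 * 91, direct, reduced)  # 1066759 3 3
```

The second route never handles a number larger than 11, which is the whole point of modular substitution.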
5.3
Pitch-class space
Pitch-class space
As we have begun to see, there are many situations where music theorists
and composers alike are inclined to speak of octave equivalent pitches as being
the same. In such situations, working within the Xpitch model can be somewhat
cumbersome. We are constantly having to employ the phrase “up to octave
equivalence” in our discourse, or else having to apply mod12 in our arithmetic.
Pitch-class space, denoted Xpc , is a model especially suited to such situations. Roughly speaking Xpc is what you get by taking Xpitch and simply
declaring that octave equivalence, our “qualified equality relation” on Xpitch , is
now to be treated as honest to goodness equality.
As such Xpc is the result of collapsing all octave equivalent pitches into a
single entity, which we call a pitch-class. For example, the infinite set
{. . . , −17, −5, 7, 19, 31, . . . } in Xpitch consisting of G4 and all of its octave transpositions
is collapsed in Xpc into a single pitch-class, which we call simply G. How much
exactly must we collapse Xpitch to get Xpc , and what are we left with?
Below you find a diagram of Xpitch with various instances of C and G marked
as reference points.
[Diagram: the real line Xpitch with tick marks at −12, 0, 12, and 24, and the instances of C and G marked.]
Since any pitch P ∈ Xpitch is octave equivalent to a pitch in the range [0, 12],
when moving to Xpc we first collapse the real line to the interval [0, 12].
[Diagram: the interval from 0 to 12, with the one remaining G marked.]
Here all of our G’s are now identified with the single G between 0 and 12;
more generally, an arbitrary pitch P ∈ Xpitch is identified with P mod 12,
which lies within [0, 12]. Except for P = 0 and Q = 12, no two elements in
[0, 12] are octave equivalent, so we are nearly done with our collapsing; we need
only identify P = 0 and Q = 12. What sort of space do we get?
A circle!
[Diagram: the pitch-class circle obtained by identifying 0 with 12, with G marked.]
Equal-tempered pitch-classes
In the process of this collapse, our infinitely many equal tempered pitches, represented
by the integers Z in Xpitch , are collapsed to the finitely many integers {0, 1, 2, . . . 11}, which
divide the pitch-class circle into 12 equal arcs.
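[Diagram: the equal-tempered pitches Z inside R (G3 = −5, A♭3 = −4, A3 = −3, B♭3 = −2, B3 = −1, C4 = 0, C♯4 = 1, D4 = 2, D♯4 = 3, E4 = 4, F4 = 5) collapsing (⇓) onto the 12 integer points of the pitch-class circle.]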
Figure 5: Transposition by t = 8 is rotation by 8 steps along our integers. This is a clockwise
rotation of 8 · 2π/12 = 4π/3 radians, or 240 degrees. Here the pitch-class P = 1 is shown rotated
to the pitch-class P′ = 9.
Transpositions in pitch-class space
A transposition t ∈ Tpitch , which acts on Xpitch by shifting points to the left or right
a certain distance, also acts on Xpc ; but now the operation it defines is a rotation of our
pitch-class circle!
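The rotation picture can be sketched numerically. In this illustration the function names transpose_pitch_class and rotation_angle are made up, pitch-classes are represented by their representative in [0, 12), and the angle is measured clockwise as in Figure 5.

```python
import math

def transpose_pitch_class(p: float, t: float) -> float:
    """Apply the transposition t to the pitch-class with representative p."""
    return (p + t) % 12

def rotation_angle(t: float) -> float:
    """Clockwise rotation, in radians, induced by transposing t half steps."""
    return (t % 12) * 2 * math.pi / 12

print(transpose_pitch_class(1, 8))                # 9
print(round(math.degrees(rotation_angle(8)), 6))  # 240.0
```

Transposing by 12 half steps gives a rotation angle of 0: an octave transposition does nothing at all in pitch-class space.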
Quotient objects
Our construction of Xpc is an example of what is called quotienting out by an equivalence
relation, and works with any set X and any equivalence relation ∼ defined on X.
Our description of this process in terms of “collapsing” elements is a tad bit vague. The
way we make rigorous sense of this is through equivalence classes.
Definition 13. Let X be a set, and let ∼ be an equivalence relation defined on X.
(i) Given an element x ∈ X, its equivalence class, [x]∼ , is the set of all elements y ∈ X
that are equivalent to x:
[x]∼ := {y ∈ X : x ∼ y}.
(ii) The quotient of X by ∼, denoted X/ ∼, is defined as the set of all equivalence classes
of X:
X/ ∼:= {[x]∼ : x ∈ X}.
Here is how this looks in the example of Xpitch with relation octave equivalence.
1. Here the equivalence class of a pitch P ∈ Xpitch , denoted [P ]12 , is by
definition the set of all pitches that are octave equivalent to it. Thus
[P ]12 = {...P − 24, P − 12, P, P + 12, P + 24, P + 36, . . . }
In our particular musical example, we call such an equivalence class a
pitch-class.
2. Note that there are many different names for a given equivalence class.
For example, the equivalence class
{. . . , −17, −5, 7, 19, 31, . . . }
can equally well be denoted [7]12 , [−17]12 , [103]12 , or indeed [P ]12 for any
P that is octave equivalent to 7.
3. Pitch-class space is then defined to be the set of all pitch-classes: that is,
Xpc := {[P ]12 : P ∈ Xpitch }.
Some observations:
1. Notice how we have captured the notion of collapsing pitches: we take a
collection of octave equivalent pitches (the pitches −17, −5, 7, 19, . . . , for
example), throw them all into a single set, the corresponding pitch-class,
and treat this as a single object in Xpc . In this way our infinitely many
pitches (the pitches −17, −5, 7, 19, . . . , for example) have been collapsed
into one object: in our example [7]12 , or [−17]12 , or however you want to
denote the set {. . . , −17, −5, 7, 19, . . . }.
2. How on earth does this strange set of pitch-classes correspond to a circle?
(a) Though each pitch-class [P ]12 in Xpc has many different names, there
is exactly one choice of P with 0 ≤ P < 12. This is how we collapsed
Xpitch first to the interval [0, 12).
(b) Next we map the points P ∈ [0, 12) in a 1-1 fashion onto the
points (x, y) of the unit circle using an invertible function! In our
circle representation I have used the function
P ↦ (sin((2π/12) P ), cos((2π/12) P )).
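The map onto the circle is easy to sketch in code. The function name circle_point is made up for this illustration; reducing P mod 12 first means any name for a pitch-class lands on the same point.

```python
import math

def circle_point(p: float) -> tuple[float, float]:
    """Point (x, y) on the unit circle for the pitch-class of p."""
    angle = 2 * math.pi * (p % 12) / 12
    return math.sin(angle), math.cos(angle)

print(circle_point(0))                       # (0.0, 1.0): C sits at the top
print(circle_point(7) == circle_point(19))   # True: both are G
print(circle_point(12) == circle_point(0))   # True: 12 is glued to 0
```

Putting sin in the first coordinate and cos in the second places P = 0 at the top of the circle and makes increasing P run clockwise, like the hours on a clock face.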
Quotient groups
Suppose a set X with equivalence relation ∼ also happens to be a group.
A natural question to ask is whether the quotient X/ ∼ is also a group in a
natural way. The answer is often yes!
Example 5.7. Let G = Z with group operation +. Fix an integer n > 0 and
consider the equivalence relation r ≡ s (mod n). We denote the quotient object
in this case Z/nZ. By definition we have
Z/nZ = {[r]n : r ∈ Z},
where [r]n = {. . . , r − 2n, r − n, r, r + n, r + 2n, r + 3n, . . . } denotes the equivalence
class of all integers congruent to r modulo n. Since every integer r ∈ Z is
congruent to a unique integer r′ with 0 ≤ r′ < n, we have
Z/nZ = {[0]n , [1]n , [2]n , . . . , [n − 1]n }
Note: by quotienting out by n, we have collapsed the integers to a finite set
containing n elements!
This set has a natural group structure defined on it, namely
[r]n + [s]n = [r + s]n .
When applying this rule you are allowed to use any name you want for the
given equivalence classes: that is, we have [r]n + [s]n = [r′]n + [s′]n for any choice
of r′, s′ with r ≡ r′ (mod n) and s ≡ s′ (mod n).
Example 5.8. Consider Z/8Z = {[0]8 , [1]8 , . . . , [7]8 }. Compute the following
additions. Your answer should be given in the form [r]8 , where 0 ≤ r < 8.
1. [103]8 + [−1]8
2. [808]8 + [−23]8
SOLUTION:
1. [103]8 + [−1]8 = [102]8 = [6]8 , since 102 ≡ 6 (mod 8).
2. [808]8 + [−23]8 = [0]8 + [1]8 (since 808 ≡ 0 (mod 8), and −23 ≡ 1 (mod 8))
= [1]8 .
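Addition in Z/nZ can be sketched by always reporting the representative between 0 and n − 1. The function name add_classes is made up for this illustration.

```python
def add_classes(r: int, s: int, n: int) -> int:
    """Return the representative in 0, ..., n - 1 of [r]_n + [s]_n."""
    return (r + s) % n

print(add_classes(103, -1, 8))   # 6
print(add_classes(808, -23, 8))  # 1

# The answer does not depend on which names we use for the two classes:
print(add_classes(808 % 8, -23 % 8, 8))  # 1
```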
Spaces and quotient spaces
I promised long ago to say something about topological spaces. In mathematics, a topological space is a set with an additional bit of structure, called a
topology, that allows us to say in an abstract way when two elements x, y of the
set are “close to one another”.
As with groups, one defines what a topology is in a precise manner. The
general definition is somewhat technical, and as such I will omit it here. However, the three spaces we have seen thus far, viz., Xfreq , Xpitch , and Xpc , are
examples of a particular type of topological space, called a metric space, which
is easier to describe. Essentially, a metric space is a set X equipped with a
distance function d(x, y) that quantifies exactly how far elements x and y are
from each other.
Definition 14. A metric, or distance function, on a set X is a function that
assigns to any pair (x, y) of elements of X a nonnegative real number d(x, y)
(called the distance between x and y), satisfying the following properties:
(i) d(x, y) = 0 if and only if x = y,
(ii) d(x, y) = d(y, x),
(iii) d(x, z) ≤ d(x, y) + d(y, z) (the triangle inequality).
A set X, together with a distance function d, is called a metric space.
Example 5.9. For Xpitch we define our distance function as d(P1 , P2 ) = |P1 −
P2 |. In other words, distance in this space is just a measure of interval size!
The fact that this is indeed a distance function (i.e., satisfies properties (i)-(iii)
of the definition), follows from well known properties of the absolute value.
Example 5.10. The space Xfreq = R>0 is a subset of R, and as such we could
simply restrict the metric in the previous example to R>0 .
However, we want our metric to measure interval size, and so instead we
define
d(f1 , f2 ) = |12 log2 (f1 ) − 12 log2 (f2 )| = |12 log2 (f1 /f2 )|.
Any guess as to where this came from?
Correct, I just used our invertible function φ(x) = 12 log2 (x) to map frequencies to pitches, and then used the definition of distance in pitch space!
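As a quick illustration of this metric, the sketch below measures the distance between two frequencies in half steps. The function name freq_distance and the sample frequencies (440, 660, and 880 Hz) are assumptions chosen only for the example.

```python
import math

def freq_distance(f1: float, f2: float) -> float:
    """Distance between two frequencies, measured in half steps."""
    return abs(12 * math.log2(f1 / f2))

print(freq_distance(440, 880))            # 12.0: one octave apart
print(round(freq_distance(440, 660), 2))  # 7.02: a Pythagorean fifth
```

Note that the distance depends only on the ratio f1/f2, not on the difference f1 − f2, which is exactly what we want from a measure of interval size.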
Example 5.11 (Quotient spaces). As with groups, given a metric space X and
an equivalence relation ∼ defined on it, we can ask whether the quotient object
X/ ∼ is also a metric space in a natural way. The answer for Xpc is yes!
How do we define a distance function on Xpc ? Recall that we can identify
Xpc with the unit circle. So we can define the distance d([P ]12 , [Q]12 ) between
two pitch-classes to be the distance between their corresponding two points on
the unit circle. By distance here, we mean the minimum length of the two
circular arcs between the two points.
As in the previous examples, technically one must prove that this definition
of d(x, y) satisfies the three properties of a metric. I leave this as an exercise!
Let’s compute d(x, y) for x = [25]12 and y = [−13]12 in Xpc . First note that
x = [25]12 = [1]12 and y = [−13]12 = [11]12 . Thus x and y correspond to the
two points on the circle below.
[Diagram: the pitch-class circle with the points for [1]12 and [11]12 marked, along with the two arcs between them.]
The distance between these two points is the minimum of the arc lengths
(10/12)2π = 10π/6 and (2/12)2π = π/3. Thus d(x, y) = π/3.
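The same computation can be sketched in general. The function name pc_distance is made up for this illustration; it counts half steps one way around the circle, takes the shorter of the two directions, and converts to arc length on the unit circle.

```python
import math

def pc_distance(p: float, q: float) -> float:
    """Length of the shorter arc between the pitch-classes of p and q."""
    steps = (p - q) % 12              # half steps going one way around
    shorter = min(steps, 12 - steps)  # the other way is 12 - steps
    return shorter * 2 * math.pi / 12

print(math.isclose(pc_distance(25, -13), math.pi / 3))  # True
print(pc_distance(7, 19))                               # 0.0
```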
5.4
Pitch or pitch-class space?
Where does music live, in pitch or pitch-class space? The answer varies depending on how composers conceive their musical constructs, and on how listeners
hear them.
Example–Winterreise, by Franz Schubert
In vocal music, octave shifts are often used for dramatic or virtuosic effect,
and as such octave equivalence is by no means treated as actual equality. Here
the natural model is usually pitch space.
We look at two examples from Franz Schubert’s song cycle Winterreise,
written just months before he died in 1828.
The first passage is from “Frühlingstraum”. The line “es schrien die Raben
vom Dach” is sung twice. The melody in both instances is identical up to octave
equivalence, and yet they are very distinct in character. Note in particular the
dramatic jump from E4 up an octave and a half step to F♮5 in the second
instance that results from substituting E4 for E5.
In the second example, taken from “Die Krähe”, the melody for the line
“Krähe, lass mich endlich seh’n, Treue bis zum Grabe!” would look rather uninteresting if collapsed into pitch-class space.
Example–Das Wohltemperierte Klavier I, Fuga 2, by J.S. Bach
On the other hand, when composers treat a musical idea as something to
be manipulated and transformed using various operations, as in contrapuntal
music from the Renaissance and Enlightenment periods, or 20th century serial
music, then pitch-class space is often the more natural setting to model the
music.
Das Wohltemperierte Klavier (or “The Well-tempered Klavier”) is a collection of two books of 24 preludes and fugues. In each book Bach cycles through
all 24 possible keys (12 choices for pitch name, 2 choices for major/minor) in
sequence. Thus the first fugue is in C major, the second in C minor, the third
in C♯ major, etc.
The piece is a fugue in 3 voices. The first two bars of the
fugue present the subject (S) in one voice. Then in measures 3-4 a second voice
plays (nearly) the same subject transposed a perfect fifth up (this is called the
answer (A)), while the first voice plays what is called the counter subject (CS).
The rest of the piece is built essentially out of near perfect transpositions of
234%(5
these three basic units.
6(7
$89(,:/
!"#"$%&'()(*+,-.*/-0(1
,,-,, ,,, ,-,, ,,, ,,, ,-,, ,,, ,,-,, +/,,-,
%
' )* )* ( )* )* )*
)* )* )* (
' )* .-, -,, -, )** -, )* )***
! 1###########################
$ % % & ** +)*** *** )*** )*** *** )*** ** *** 1 )** *** +)*** ** *** )*** )*** )*** )*** )*** 1 )*** **** +)***** +)**** )**** )*** )*** )** /*)* **** ***** #
** 1
* * *
* 1
1
1
1
"1
1
1
1
%% &
(
(
(
1###########################
0 %
1
1
#1
,, , ,, ,, ,,
,
, , , , , , , , , , , ,, , , , ,
% ,,-, ,-, .-,, ,-, +/-,)* ,,,-, ,,-, -,, ,,-, ,,,-, ,,, ,-,, ,,-, -,, ,,,, ,,, -,, -,, ,,- ,,,, ,,-,, ,-,)* -,, ,-,)* +-,,,, ,,-, ,,-, -,, -,, )* )*
2 /)* -)*, )** ,-)** )*** )*** )*** )* +-,)** )** 1 )*** +)*** *** *** )** )** )*** %)*** )*** )** )** *) ***#*** 1
! 1############################
$ % % )*** +)** )*** ** .)*** )*** )**** )*** 1 -,)***
* * ** ** *
** ** **
* * * * 1
1
1
1
"1
1
1
1
%% %
(
(
(
1############################
0
1
1
#1
!"!
One feature that makes the counter subject so distinctive is the jump from C4
to E5: an octave plus a third. Yet Bach does not hesitate to cut this down to a
jump of only a third, as in the passage below where we get C5 to E5. Thanks
in part to the dense counterpoint going on, we hear this octave-adjusted version
of CS as being no different from the usual version.
6
Chords
We have not made a whole lot of progress in our goal of modeling musical
objects with mathematical objects. In fact so far we only know how to model
pitches.
Despair not! Now most of the mathematical machinery is in place, and
this will allow us to easily model more complicated musical objects like chords,
scales, melodies and rhythms.
These further musical objects will be described as collections of objects (e.g.,
collections of pitches, of pitch-classes, or of onsets in the case of rhythms), and
come in two flavors depending on whether or not order is important in these
collections. For example, a chord will be defined as an unordered collection of
pitches or pitch-classes, whereas a melody will be defined as an ordered sequence
of pitches.
6.1
Sets and sequences
In mathematics this important distinction between unordered and ordered collections is articulated by two distinct types of fundamental mathematical objects: sets and sequences.
Recall that a set is defined simply as a collection (finite or infinite) of objects.
A set is determined solely by its contents (the objects it contains), and not by
any particular ordering of those objects. This is why the set A = {1, 2, 3} can
equally well be written as
A = {2, 1, 3}, A = {2, 3, 1}, A = {1, 2, 2, 2, 3, 1, 1}.
In all cases A is the set that contains the objects 1, 2 and 3, and only these
objects.
This notion of order not mattering is captured in the very definition of set
equality. Sets A and B are defined to be equal, written A = B, simply when
they contain the same elements: that is, given any object x , we have x ∈ A if
and only if x ∈ B.
Example 6.1 (Set equality proof). Consider the sets $A = \{x \in \mathbb{R} : x^4 - 3x^2 + 2 = 0\}$
and $B = \{1, -1, \sqrt{2}, -\sqrt{2}\}$. I claim A = B.
To prove the claim we must show that x ∈ A if and only if x ∈ B. Observe that to prove an “if and only if” statement, we really are proving two
implications: x ∈ A ⇒ x ∈ B and x ∈ B ⇒ x ∈ A.
The second implication is easy. Each of the elements $1, -1, \sqrt{2}, -\sqrt{2}$ of B is
also an element of A because plugging each of these numbers into the expression
$x^4 - 3x^2 + 2$ yields 0, as desired. This proves x ∈ B ⇒ x ∈ A.
Now go the other way. Take x ∈ A. Then x satisfies $x^4 - 3x^2 + 2 = 0$.
Factoring the expression on the left, we see that
$$x^4 - 3x^2 + 2 = (x^2 - 1)(x^2 - 2) = (x - 1)(x + 1)(x - \sqrt{2})(x + \sqrt{2}) = 0.$$
For this to be true we must have $x = 1$, $-1$, $\sqrt{2}$, or $-\sqrt{2}$. In each case we have
x ∈ B. Thus we have proved x ∈ A ⇒ x ∈ B.
Note: by definition x ∈ A ⇒ x ∈ B means A ⊂ B. Thus another way of
proving A = B is to prove A ⊂ B and B ⊂ A.
For a sequence (x1 , x2 , x3 , . . . ) on the other hand, order does matter. For
example, in contrast to our example with sets the following sequences containing
1, 2 and 3 are all distinct:
(1, 2, 3), (2, 1, 3), (2, 3, 1), (1, 2, 2, 2, 3, 1, 1).
As with sets the notion of sequences being ordered objects is captured by
the very definition of equality of sequences: s1 = (x1 , x2 , . . . , xm ) and s2 =
(y1 , y2 , . . . , yn ) are equal, written s1 = s2 , if n = m (they are of the same
length) and xi = yi for all i.
We summarize equality of two sequences s1 and s2 as follows:
(i) they are of the same length,
(ii) they are composed of the same elements,
(iii) the elements appear in the same order in both sequences.
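The distinction is easy to try out on a computer. Here is a minimal sketch in Python (our choice of language for illustration; nothing in these notes depends on it), using Python's built-in sets for sets and tuples for sequences. The variable names are ours.

```python
# Sets: equality depends only on contents, not on order or repetition.
A = {1, 2, 3}
same_set = (A == {2, 1, 3}) and (A == {2, 3, 1}) and (A == {1, 2, 2, 2, 3, 1, 1})

# Sequences (modeled here as tuples): length, elements and order all matter.
s1 = (1, 2, 3)
s2 = (2, 1, 3)
s3 = (1, 2, 2, 2, 3, 1, 1)
all_distinct = (s1 != s2) and (s1 != s3) and (s2 != s3)

print(same_set)      # True
print(all_distinct)  # True
```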
6.2
Chords
Musical definitions of what exactly a chord is tend to appeal to a notion of
“simultaneity” or “sounding together”. For example, the Harvard Dictionary
of Music defines a chord as “three or more pitches sounded simultaneously or
functioning as if sounded simultaneously”.
The last phrase in this definition is there to accommodate examples like the
following passage, wherein every measure would be recognized as an “instance”
of a C-major triad.
As the example suggests, the identifying property of this chord is simply the
unordered collection of pitches from which it is formed. Thus we should model
chords with sets!
Pitch sets and pitch-class sets
We will model chords either as sets of pitches, or sets of pitch-classes, depending on whether octave differences are taken into account.
Definition 15. We collect a number of definitions related to chords.
1. A pitch set is a (finite) set of pitches {P1 , P2 , . . . , Pr } ⊂ Xpitch .
2. A pitch-class set is a (finite) set of pitch-classes {[P1 ]12 , [P2 ]12 , . . . [Pr ]12 } ⊂
Xpc .
3. We will call both pitch sets and pitch-class sets chords.
4. Given a positive integer n, an n-chord is a chord (of either pitches or
pitch-classes) containing exactly n distinct objects.
Example 6.2.
As pitch sets the five instances of chords in the five measures above are modeled
as
{0, 4, 7}, {4, 7, 12}, {0, 4, 7, 12, 16}, {0, 4, 7, 12}, and, {7, 12, 16}.
Note that these sets are all distinct.
As pitch-class sets on the other hand they all collapse to the single pitch-class
set {[0]12 , [4]12 , [7]12 }. For example, the third chord is
$$\{[0]_{12}, [4]_{12}, [7]_{12}, [12]_{12}, [16]_{12}\} = \{[0]_{12}, [4]_{12}, [7]_{12}, [0]_{12}, [4]_{12}\} = \{[0]_{12}, [4]_{12}, [7]_{12}\}.$$
In almost all cases we will model chords with pitch-classes, and as such we
will often drop the brackets from our notation. Thus {[0]12 , [4]12 , [7]12 } will most
often be written {0, 4, 7}.
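The collapse from pitch sets to pitch-class sets amounts to reducing each pitch modulo 12. A small Python sketch, assuming integer pitches with C4 = 0 as in these notes (the helper name pitch_class_set is ours):

```python
def pitch_class_set(pitches):
    """Reduce a collection of integer pitches modulo 12 (octave equivalence)."""
    return frozenset(p % 12 for p in pitches)

# The five pitch sets from Example 6.2.
chords = [{0, 4, 7}, {4, 7, 12}, {0, 4, 7, 12, 16}, {0, 4, 7, 12}, {7, 12, 16}]

distinct_pitch_sets = {frozenset(c) for c in chords}
distinct_pc_sets = {pitch_class_set(c) for c in chords}

print(len(distinct_pitch_sets))  # 5: all distinct as pitch sets
print(len(distinct_pc_sets))     # 1: a single pitch-class set
```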
Counting pitch-class sets
Let’s fix k > 0 and consider pitch-class sets containing exactly k pitch-classes taken from our 12 equal-tempered pitch-classes. The number of such
sets is finite. Thus up to octave equivalence, the number of equal-tempered
k-chords is finite.
To count these we use a very useful formula that counts the number of
subsets of size k taken from a set of n elements.
$$\#\{\text{subsets of size } k \text{ taken from a set of } n \text{ elements}\} =: \binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$
The symbol $\binom{n}{k}$ is known as a binomial coefficient due to its appearance
in the binomial formula
$$(x + y)^n = x^n + \binom{n}{1} x^{n-1} y + \binom{n}{2} x^{n-2} y^2 + \cdots + \binom{n}{n-1} x y^{n-1} + y^n.$$
In this context we also call $\binom{n}{k}$ “n choose k”, as it counts the number of ways of
picking k objects out of a collection of n. We can use $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ to count
equal-tempered k-chords. Here we are picking k pitch-classes out of a collection
of 12. Thus the number of k-chords is $\binom{12}{k}$.
k = 1 A 1-chord is called a monad. There are $\binom{12}{1}$ = 12!/(1!11!) = 12 monads:
namely {0}, {1}, . . . , {11} (or {C}, {C♯}, . . . , {B}).
k = 2 A 2-chord is called a dyad. There are $\binom{12}{2}$ = 12!/(2!10!) = (12·11)/2 = 66
dyads: e.g., {0, 3} (or {C, E♭}), {4, 9} (or {E, A}), etc.
k = 3 A 3-chord is called a trichord. There are $\binom{12}{3}$ = (12 · 11 · 10)/(3 · 2) = 220
trichords: e.g., {0, 4, 7} (or {C, E, G}), {6, 7, 11} (or {F♯, G, B}), etc.
You get the idea. The naming scheme continues in this manner: 4-chords are
called tetrachords, 5-chords pentachords, and 6-chords hexachords.
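These counts can be double-checked by brute force. The sketch below (Python, with names of our own choosing) lists every k-element subset of the 12 pitch-classes and compares the tally with the binomial coefficient computed by the standard library function math.comb.

```python
from itertools import combinations
from math import comb

names = {1: "monads", 2: "dyads", 3: "trichords",
         4: "tetrachords", 5: "pentachords", 6: "hexachords"}

counts = {}
for k, name in names.items():
    # Enumerate all k-element subsets of {0, 1, ..., 11} and count them.
    brute_force = sum(1 for _ in combinations(range(12), k))
    assert brute_force == comb(12, k)  # agrees with 12! / (k! (12 - k)!)
    counts[k] = brute_force
    print(k, name, brute_force)
```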
Triads
The first “real” chords one learns about in music theory are triads, which
will also play a prominent role in our mathematical approach.
Definition 16. A triad is a trichord {P1 , P2 , P3 } built as follows:
1. Pick a root pitch P1 .
2. Transpose P1 up a third (major or minor) to get P2 , called the third of
the triad.
3. Transpose P2 up a third (major or minor) to get P3 , called the fifth of
the triad.
The choice of major/minor in each of steps (2) and (3) determines four different
types of triad:
major + minor   ⇔   major triad
minor + major   ⇔   minor triad
minor + minor   ⇔   diminished triad (fifth is diminished)
major + major   ⇔   augmented triad (fifth is augmented)
Comment 6.1. The process above is often summarized by saying a triad is a
“stack of thirds”. You can create chords by stacking other intervals as well. For
example, {B♭, C, E♭, F} is a stack of perfect fourths.
Example 6.3. Below you find all four types of triads built on roots C and F♯,
respectively.
Notation. The table below illustrates how we denote triads using the pitch
name of the root (capital or lower-case depending on major/minor) along with
additional symbols in the augmented and diminished case.
Type         Examples                           Notation
Major        {C, E, G}, {F♯, A♯, C♯}, . . .     C, F♯, . . .
Minor        {C, E♭, G}, {F♯, A, C♯}, . . .     c, f♯, . . .
Diminished   {C, E♭, G♭}, {F♯, A, C}, . . .     c°, f♯°, . . .
Augmented    {C, E, G♯}, {F♯, A♯, D}, . . .     C+, F♯+, . . .
How many triads are there in the equal-tempered universe? The following
observations allow us to count them. As usual we work with pitch-classes.
1. Major, minor and diminished triads have a unique root. This is not true
of an augmented triad, as each of its pitches can serve as a root: e.g.,
{C, E, G♯} can be built starting with either C, E, or G♯ as the root.
2. Thus to each of the 12 pitch-classes we can associate the three triads
(major, minor or diminished) of which it is the unique root. This gives
rise to a total of 36 major, minor or diminished triads.
3. A simple count (try it on a keyboard) shows there are only 4 augmented
triads: namely {C, E, G♯}, {D♭, F, A}, {D, F♯, A♯}, {E♭, G, B}.
4. Thus there are a total of 36 + 4 = 40 equal-tempered triads up to octave
equivalence!
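The count of 40 can also be confirmed by machine. The following Python sketch builds every triad as a stack of two thirds on each of the 12 roots, working with pitch-classes; the names TRIAD_TYPES and triad are ours, introduced only for this illustration.

```python
M3, m3 = 4, 3  # a major third is 4 semitones, a minor third is 3

# (first third, second third) for each type, as in Definition 16.
TRIAD_TYPES = {
    "major": (M3, m3),
    "minor": (m3, M3),
    "diminished": (m3, m3),
    "augmented": (M3, M3),
}

def triad(root, kind):
    """Pitch-class set of the triad of the given kind built on root."""
    first, second = TRIAD_TYPES[kind]
    return frozenset({root % 12, (root + first) % 12, (root + first + second) % 12})

all_triads = {triad(r, kind) for r in range(12) for kind in TRIAD_TYPES}
augmented = {triad(r, "augmented") for r in range(12)}

print(len(all_triads))  # 40
print(len(augmented))   # 4
```

Because the triads are stored as sets, the three roots of an augmented triad automatically produce the same object, which is exactly why only 4 of them appear.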
Geometric representations of chords
The geometric nature of Xpitch (the real line) and Xpc (the unit circle) allows
us to easily picture chords–either as a collection of points plotted on the real
line (when using pitches), or as a collection of points in the circle (when using
pitch-classes). Below you find these two types of representations for the chord
{C4, G4, E5}.
[Figure: the chord {C4, G4, E5} plotted as points on the real line (pitch space, with axis ticks from −2 to 22) and as points on the circle (pitch-class space).]
The geometry offers some insight into the four different types of triads. Below
you find the C♯ major, B minor, G diminished, and C augmented triads.
[Figure: the C♯ major, B minor, G diminished, and C augmented triads plotted on the pitch-class circle.]
Functionality of triads
Looking at the geometry, the major and minor triads are the most irregular
of the four, at least in the sense of dividing up the circle into arcs of three
distinct lengths.
And yet in traditional harmony major and minor triads are considered to
be the most “stable” of the four triads; diminished and augmented triads are
treated as “unstable” triads that should “resolve” to a major or minor triad.
If we think of our triads as descendants from corresponding triads in a just
system (e.g., Pythagorean, or just intonation), then a possible explanation for
this dichotomy presents itself.
Starting with a fixed frequency f , we can
build just major/minor triads using “simple ratios”: e.g., {f, (5/4)f, (3/2)f } is
a major triad in just intonation, and {f, (6/5)f, (3/2)f } is a minor triad. To build diminished or augmented triads, on
the other hand, we have to use “not so simple” (thus less consonant) ratios. For
example, the simplest instance of the diminished triad in the just intonation
system is {B, D, F} = {(15/8)f, (9/8)f, (4/3)f }, which contains the “dissonant
interval” {B, F} with frequency ratio 45/32.
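The ratios involved are easy to compute exactly. The sketch below uses Python's fractions module, takes the base frequency f to be 1, and assumes the usual just-intonation values 5/4 for the major third and 3/2 for the perfect fifth; the variable names are ours.

```python
from fractions import Fraction

# Just-intonation major triad on a base frequency f = 1.
major_triad = [Fraction(1), Fraction(5, 4), Fraction(3, 2)]

# Ratios between adjacent notes: a major third followed by a minor third.
major_steps = [major_triad[1] / major_triad[0], major_triad[2] / major_triad[1]]

# The diminished triad {B, D, F} with the frequencies quoted above.
B, D, F = Fraction(15, 8), Fraction(9, 8), Fraction(4, 3)
tritone = B / F  # ratio of the interval {B, F}

print(major_steps)  # [Fraction(5, 4), Fraction(6, 5)]
print(tritone)      # 45/32
```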
Seventh chords
If you add one more third to a triad, you get what is called a seventh chord.
Definition 17. A seventh chord is a 4-chord built by stacking thirds. In other
words we create a seventh chord by starting with a root P1 and transposing up
three times by thirds to obtain the chord {P1 , P2 , P3 , P4 }.
The interval {P1 , P4 } in this case is a seventh (hence the name), and the set
{P1 , P2 , P3 } is a triad. The quality of both of these subchords determines the
seventh chord’s name as follows:
thirds      {P1 , P2 , P3 }   {P1 , P4 }   {P1 , P2 , P3 , P4 }
M3+m3+M3    major             M7           major seventh
M3+m3+m3    major             m7           dominant seventh
m3+M3+M3    minor             M7           minor/major seventh
m3+M3+m3    minor             m7           minor seventh
m3+m3+M3    diminished        m7           half-diminished seventh
m3+m3+m3    diminished        d7           diminished seventh
M3+M3+m3    augmented         M7           augmented/major seventh
Comment 6.2. What about the stack M3+M3+M3? In this case {P1 , P2 , P3 }
is an augmented triad, and P4 is an octave above P1 , which makes {P1 , P2 , P3 , P4 } =
{P1 , P2 , P3 } as pitch-class sets. Thus we do not get a 4-chord in this case.
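One way to see that exactly seven stacks of three thirds give genuine 4-chords is to enumerate all eight stacks. The Python sketch below does this on the root C = 0, working with pitch-classes; the function name stack_of_thirds is ours.

```python
from itertools import product

M3, m3 = 4, 3  # semitone sizes of the major and minor third

def stack_of_thirds(root, thirds):
    """Pitch-class set obtained by stacking the given thirds on root."""
    current = root
    pcs = [current % 12]
    for step in thirds:
        current += step
        pcs.append(current % 12)
    return frozenset(pcs)

# Number of distinct pitch-classes in each of the 2^3 = 8 possible stacks.
sizes = {thirds: len(stack_of_thirds(0, thirds))
         for thirds in product((M3, m3), repeat=3)}
four_note_stacks = [thirds for thirds, n in sizes.items() if n == 4]

print(len(four_note_stacks))  # 7
print(sizes[(M3, M3, M3)])    # 3: the top note falls on the root's pitch-class
```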
Notation. The notation for seventh chords is similar to that of triads. As
before, we illustrate with examples. Note: there is a wide variety of accepted
chord notation. Here I follow the convention Dmitri Tymoczko outlines in his
Music 105 lecture notes.
Major             {C, E, G, B}, {F♯, A♯, C♯, E♯}       Cmaj7 , F♯maj7
Dominant          {C, E, G, B♭}, {F♯, A♯, C♯, E}       C7 , F♯7
Minor/major       {C, E♭, G, B}, {F♯, A, C♯, E♯}       cmaj7 , f♯maj7
Minor             {C, E♭, G, B♭}, {F♯, A, C♯, E}       c7 , f♯7
Half-diminished   {C, E♭, G♭, B♭}, {F♯, A, C, E}       c∅7 , f♯∅7
Diminished        {C, E♭, G♭, B♭♭}, {F♯, A, C, E♭}     c°7 , f♯°7
Augmented/major   {C, E, G♯, B}, {F♯, A♯, C× , E♯}     C+maj7 , F♯+maj7
6.3
Operations on chords: transposition
We now model two types of musical procedures performed on chords with corresponding mathematical operations performed on pitch and pitch-class sets. The
first, transposition, is an easy generalization of the transposition operations we
defined on pitch and pitch-class space. The second, inversion, is an altogether
new type of operation.
Definition 18. Any real number α ∈ R defines a transposition operation tα
on chords (whether pitch or pitch-class sets), defined as follows.
Given a pitch set X = {P1 , P2 , . . . , Pn }, we define
tα (X) = {P1 + α, P2 + α, . . . , Pn + α}.
Given a pitch-class set X = {[P1 ]12 , [P2 ]12 , . . . , [Pr ]12 }, we define
tα (X) = {[P1 + α]12 , [P2 + α]12 , . . . , [Pr + α]12 }.
Again, if the context is clear, we will drop the bracket notation and simply
compute everything modulo 12.
Comment 6.3. In both cases we see that a transposition tα is an operation
which takes any chord and transposes each pitch in the chord by α.
Example 6.4. Consider the transposition t−8 . As an operation, this takes any
chord and transposes each pitch down a minor sixth.
For example, given the pitch set {−10, 4, 5, 19}, we have
$$\{-10, 4, 5, 19\} \mapsto t_{-8}(\{-10, 4, 5, 19\}) = \{-18, -4, -3, 11\}.$$
When dealing with pitch-class sets, we do the same thing, but take everything
modulo 12. For example, given the pitch-class set {[2]12 , [4]12 , [5]12 , [7]12 }, we
have
$$\{2, 4, 5, 7\} \mapsto t_{-8}(\{2, 4, 5, 7\}) = \{2 - 8, 4 - 8, 5 - 8, 7 - 8\} = \{-6, -4, -3, -1\} = \{6, 8, 9, 11\}.$$
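Both computations can be reproduced with a few lines of Python. This is only a sketch of Definition 18 for chords stored as Python sets; the function names are ours, and the pitch-class version simply reduces modulo 12 after adding α.

```python
def transpose_pitch_set(X, alpha):
    """t_alpha on a pitch set: add alpha to every pitch."""
    return {p + alpha for p in X}

def transpose_pc_set(X, alpha):
    """t_alpha on a pitch-class set: add alpha, then reduce modulo 12."""
    return {(p + alpha) % 12 for p in X}

shifted_pitches = transpose_pitch_set({-10, 4, 5, 19}, -8)
shifted_classes = transpose_pc_set({2, 4, 5, 7}, -8)

print(sorted(shifted_pitches))  # [-18, -4, -3, 11]
print(sorted(shifted_classes))  # [6, 8, 9, 11]
```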
Geometric representation of transpositions
As operations on pitch sets, transpositions tα act as horizontal shifts (right
if α > 0, left if α < 0).
[Figure: the pitch set {−10, 4, 5, 19} and its image {−18, −4, −3, 11} under t−8 , each plotted on the real line from −20 to 20.]
As operations on pitch-class sets, transpositions tα act as rotations (clockwise
if α > 0, counterclockwise if α < 0).
[Figure: t−8 acting on a pitch-class set as a rotation of the circle by 8(π/6).]
12-tone equal-tempered transposition group
As the discussion above illustrates, when restricting our attention to pitch-class sets taken from the 12-tone equal-tempered system, that is to subsets
X ⊂ {0, 1, 2, . . . , 11}, we
(i) only consider transpositions tα with α an integer (since otherwise tα would
transpose us straight out of the equal-tempered universe), and
(ii) only care what α is up to congruence modulo 12.
This motivates the following definition.
Definition 19. Write Z/12Z = {0, 1, . . . , 11}; that is, we denote [i]12 by i. The
group of equal-tempered pitch-class transpositions is defined as T12 :=
{t0 , t1 , t2 , . . . , t11 }, with group operation
$t_i \circ t_j = t_{i+j}$, where the index i + j is computed modulo 12.
Comment 6.4. It might be objected that T12 is simply Z/12Z in disguise.
More precisely, the map
$t_i \mapsto [i]_{12}$
is an obvious isomorphism between the two groups. So why bother introducing
new notation?
Answer: we wish to understand the elements of T12 as transpositions, certain
operations acting on pitch-class sets. This explains also why we use the
composition symbol ‘◦’ to denote the group operation, so that ti ◦tj is understood
as the composition of two transpositions done in succession.
Comment 6.5. As usual, if the context is clear, we will denote elements of
T12 simply as ti , and use modular substitution freely when computing with the
group.
For example, note that t−8 = t4 in T12 , since −8 ≡ 4 (mod 12). This is a
nice, succinct way of saying that transposing down by a m6 is the same thing
as transposing up by a M3 up to octave equivalence.
This also means that in our earlier example we could have computed
t−8 ({2, 4, 5, 7}) = t4 ({2, 4, 5, 7}) = {6, 8, 9, 11}.
More generally, for any i ∈ {0, 1, 2, . . . , 11}, we have t−i = t12−i in T12 , since
−i ≡ 12 − i (mod 12). This means transposing down by any interval i is the
same as transposing up by its inverse interval 12 − i up to octave equivalence,
as we have already had occasion to observe.
Perhaps the most concrete way of thinking of elements of T12 is as rotations
around the circle by a number of ticks. In this light equalities like t−i = t12−i
are fairly obvious consequences of “clock arithmetic”!
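These identities can be checked mechanically by treating each transposition as a function on the twelve pitch-classes. In the Python sketch below (helper names ours), two functions are compared by listing their values on 0, 1, . . . , 11.

```python
def t(i):
    """The transposition t_i on pitch-classes, returned as a function."""
    return lambda p: (p + i) % 12

def compose(f, g):
    """The composition f after g: first apply g, then f."""
    return lambda p: f(g(p))

def as_tuple(f):
    """Table of values of a map on {0, ..., 11}, so that maps can be compared."""
    return tuple(f(p) for p in range(12))

# t_i composed with t_j equals t_{i+j} for all i, j.
closure_ok = all(as_tuple(compose(t(i), t(j))) == as_tuple(t(i + j))
                 for i in range(12) for j in range(12))

# Down a minor sixth is the same as up a major third, up to octave equivalence.
down_m6_is_up_M3 = as_tuple(t(-8)) == as_tuple(t(4))

# More generally t_{-i} = t_{12-i}.
inverse_rule = all(as_tuple(t(-i)) == as_tuple(t(12 - i)) for i in range(12))

print(closure_ok, down_m6_is_up_M3, inverse_rule)  # True True True
```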
6.4
Operations on chords: inversion
We motivate the definition of inversion by first treating it as an operation on
melodies, as opposed to chords.
Consider the Contrapunctus No. 5, from J.S. Bach’s The Art of Fugue (Die
Kunst der Fuge).
The piece is built out of various transpositions of the subject and its inversion, shown below.
Roughly speaking, we obtain the pitch content of the inversion by reflecting the
pitches of the subject through the horizontal axis determined by the pitch A.
This is not literally the case in the example shown. To be a true reflection the
C in the inversion would have to be replaced by a C♯, which would give the
inversion a major key feel. As such the example is of what is called a diatonic
inversion.
Algebraic definition of inversion
It is not difficult to come up with an algebraic description of inversion. To
reflect a pitch P through the axis determined by A4 = 9 we simply subtract
P − 9 from the pitch twice: the first subtraction brings the pitch to A, the
second subtraction reflects it through to the other side! Thus the combined
map sends
P ↦ P − 2(P − 9) = −P + 18.
Let’s confirm this operation maps the subject in Bach’s example to its inverted
form:
P     Name   P′ = −P + 18   Name
9     A4     9              A4
2     D4     16             E5
4     E4     14             D5
5     F4     13             C♯5
...   ...    ...            ...
We now can define a general inversion with respect to any fixed pitch or pitch
class Q0 .
Definition 20. We give separate definitions for pitches and for pitch classes.
1. Fix a pitch Q0 . We define inversion with respect to Q0 to be the
function iQ0 defined by
$$i_{Q_0}(P) = P - 2(P - Q_0) = -P + 2Q_0.$$
2. Fix a pitch class [Q0 ]12 . We define inversion with respect to [Q0 ]12 to
be the function iQ0 defined by
$$i_{Q_0}([P]_{12}) = -[P]_{12} + 2[Q_0]_{12} = [-P + 2Q_0]_{12}.$$
Comment 6.6. There are a lot of letters in play here. Note that Q0 is fixed,
and is part of the definition of the inversion. The input here is P (or [P ]12 ).
Example 6.5. The pitch Q0 in the definition of iQ0 need not be equal-tempered!
For example, consider the inversion with respect to Q0 = 1.5. By definition we
have
iQ0 (P ) = −P + 2Q0 = −P + 2(1.5) = −P + 3.
This inversion still maps integers to integers, and thus equal-tempered pitches
to equal-tempered pitches!
Geometrically this reflects pitches through the horizontal line halfway between C♯4 and D4. In particular we have
$i_{1.5}(\text{C♯4}) = i_{1.5}(1) = -1 + 3 = 2 = \text{D4}$, and
$i_{1.5}(\text{D4}) = i_{1.5}(2) = -2 + 3 = 1 = \text{C♯4}$,
as expected.
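Inversion on pitches is a one-line function, which makes it easy to experiment with. The Python sketch below (the function name invert is ours) recomputes the first rows of the table for the Bach subject, with pitches numbered so that C4 = 0 and A4 = 9, and then the swap of C♯4 and D4 from Example 6.5.

```python
def invert(p, q0):
    """Inversion of the pitch p with respect to q0: p -> -p + 2*q0."""
    return -p + 2 * q0

# Opening pitches of the subject: A4, D4, E4, F4.
subject = [9, 2, 4, 5]
inverted = [invert(p, 9) for p in subject]

# Inversion about Q0 = 1.5 exchanges C#4 = 1 and D4 = 2.
swap = (invert(1, 1.5), invert(2, 1.5))

print(inverted)  # [9, 16, 14, 13]
print(swap)      # (2.0, 1.0)
```

Applying invert twice with the same q0 returns the original pitch, as one expects of a reflection.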
Example 6.6. Consider the same function as an operation on pitch-class space:
that is, take [Q0 ]12 = [1.5]12 . As usual we will drop the brackets and use modular
substitution with impunity. We have i1.5 (P ) = −P + 3 as before. Let’s compute
i1.5 (P ) for P = 0, 1, 2, . . . 11 and see what transformation of the circle we get:
P          0   1   2   3   4    5    6   7   8   9   10   11
i1.5 (P )  3   2   1   0   11   10   9   8   7   6   5    4
[Figure: the reflection i1.5 acting on the pitch-class circle Xpc .]
We see that inversions in pitch-class space are also reflections. Now the
function iQ0 (P ) reflects points of the circle through the diameter defined by the
points Q0 and Q0 + 6. When Q0 is either an integer or half-integer (i.e., Q0 = i
or Q0 = i/2 for some i ∈ {0, 1, . . . , 11}), then iQ0 maps equal-tempered pitch
classes to equal-tempered pitch classes. Below you find typical examples of each case:
[Figure: the reflections i1.5 and i10 acting on the pitch-class circle Xpc .]
Applying inversions to chords
As with transpositions, our inversion functions iQ0 naturally define operations on chords X = {P1 , P2 , . . . , Pn }; we simply apply iQ0 to each element of
the chord.
Definition 21. Fix Q0 and let X = {P1 , P2 , . . . , Pn }. We define
iQ0 (X) = {iQ0 (P1 ), iQ0 (P2 ), . . . , iQ0 (Pn )}.
Comment 6.7. What do inversions do geometrically to chords? The foregoing
discussion already gives us the answer: they reflect them! In pitch space chords
get reflected through a certain line (determined by the particular inversion);
in pitch-class space, chords get reflected through a certain diameter (again,
determined by the particular inversion).
Example
Consider the inversion i2 (P ) = −P + 4 acting on the D major triad X =
{2, 6, 9}, considered as a pitch-class set. Then we have
$$i_2(X) = \{i_2(2), i_2(6), i_2(9)\} = \{2, -2, -5\} = \{2, 10, 7\}.$$
[Figure: X = {2, 6, 9} and i2 (X) = {2, 7, 10} plotted on the pitch-class circle Xpc .]
Let X = {2, 6, 9} (the D major triad) and Y = {2, 7, 10} (the G minor triad).
In the last example we saw that i2 (X) = Y ; our inversion transformed a major
triad to a minor triad.
The observation is true in general: any inversion iQ0 maps any major triad
to a minor triad, and vice versa!
The geometry of pitch-class space makes this relatively clear. In pitch-class
space we identify a major triad as a sequence of a major third followed by a
minor third going clockwise around the circle. When you invert this chord,
reflecting it through some given diameter, you are left with a sequence of a
major third followed by a minor third going counterclockwise around the circle.
This is nothing more than a minor triad, as you can readily verify.
At work here is the fact that mathematically speaking a reflection is an
orientation reversing transformation of the circle.
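For equal-tempered pitch-classes the claim can also be verified exhaustively. The Python sketch below applies each of the twelve inversions iQ0 with Q0 = 0, 1/2, 1, . . . , 11/2 to every major and minor triad; the helper names are ours, and the code assumes 2Q0 is an integer.

```python
def invert_pc_set(X, q0):
    """Apply i_{q0} to each pitch-class in X (2*q0 is assumed to be an integer)."""
    return frozenset(int(-p + 2 * q0) % 12 for p in X)

def major(r):
    return frozenset({r % 12, (r + 4) % 12, (r + 7) % 12})

def minor(r):
    return frozenset({r % 12, (r + 3) % 12, (r + 7) % 12})

majors = {major(r) for r in range(12)}
minors = {minor(r) for r in range(12)}
axes = [j / 2 for j in range(12)]  # Q0 = 0, 0.5, 1, ..., 5.5

# The example from the text: i_2 sends D major to G minor.
d_major_inverted = invert_pc_set({2, 6, 9}, 2)

# Every inversion sends major triads to minor triads and vice versa.
swaps = (all(invert_pc_set(X, q) in minors for X in majors for q in axes)
         and all(invert_pc_set(X, q) in majors for X in minors for q in axes))

print(sorted(d_major_inverted))  # [2, 7, 10]
print(swaps)                     # True
```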
Interaction of transpositions and inversions
What happens if you first apply an inversion, and then a transposition, or
vice versa? What sort of operation results?
Let’s focus on pitch-class space, and consider the inversion i0 (P ) = −P +
2 · 0 = −P . Take any one of our transpositions tj ∈ T12 and consider the
composition tj ◦ i0 . To see what sort of operation this is, we see what it does
to an arbitrary pitch-class P :
$$(t_j \circ i_0)(P) = t_j(i_0(P)) = t_j(-P) = -P + j = -P + 2(j/2) = i_{j/2}(P).$$
We just proved that
$$t_j \circ i_0 = i_{j/2}$$
for all tj ∈ T12 . In other words, when we compose a transposition with the
inversion i0 , we get another inversion! What happens if we compose in the
opposite order: that is, what sort of operation is i0 ◦ tj ? Again we see what it
does to an arbitrary P :
$$(i_0 \circ t_j)(P) = i_0(t_j(P)) = i_0(P + j) = -(P + j) = -P - j = -P + (12 - j) \ \text{(modular subst.)} = -P + 2\left(\tfrac{12 - j}{2}\right) = i_{(12-j)/2}(P).$$
We just proved that
$$i_0 \circ t_j = i_{(12-j)/2}$$
for all j. So composing in the other order also yields an inversion!
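Both identities can be confirmed numerically on the twelve equal-tempered pitch-classes. In this Python sketch (names ours) each operation is a function on {0, . . . , 11}, and two operations are compared through their tables of values.

```python
def t(j):
    """Transposition t_j on pitch-classes."""
    return lambda p: (p + j) % 12

def inv(q0):
    """Inversion i_{q0} on pitch-classes; q0 is an integer or a half-integer."""
    return lambda p: int(-p + 2 * q0) % 12

def compose(f, g):
    """The composition f after g: first apply g, then f."""
    return lambda p: f(g(p))

def table(f):
    return tuple(f(p) for p in range(12))

# t_j after i_0 equals i_{j/2}.
first = all(table(compose(t(j), inv(0))) == table(inv(j / 2))
            for j in range(12))

# i_0 after t_j equals i_{(12-j)/2}.
second = all(table(compose(inv(0), t(j))) == table(inv((12 - j) / 2))
             for j in range(12))

print(first, second)  # True True
```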
Adding inversions to T12
What if we wanted to enlarge our group T12 of transpositions by adding the inversion i0 ,
continuing to use composition as our group operation?
Since tj ◦ i0 = ij/2 for any j, if we want the new set to be closed under composition we also
have to add ij/2 to our new group for all j ∈ {0, 1, . . . , 11}.
There are exactly 12 of these ij/2 , and one can show in fact that they are precisely all
the inversions of pitch-class space which map equal-tempered pitches to equal-tempered ones.
The resulting set
{t0 , t1 , . . . , t11 , i0 , i1/2 , . . . , i5 , i11/2 }
thus has exactly 24 elements.
Is this a group? Yes. In fact, if we set e = t0 , t = t1 and i = i0 , then you can show that
the set above is precisely
$$\{e, t, t^2, \dots, t^{11}, i, ti, t^2 i, \dots, t^{11} i\},$$
where t and i satisfy
$$t^{12} = i^2 = e, \qquad i t^j = t^{12-j} i.$$
Look familiar?
Definition 22. Set t = t1 and i = i0 , considered as transformations of pitch-class space. The group of equal-tempered transpositions and inversions
is the group
$$M_{12} = \{e, t, t^2, \dots, t^{11}, i, ti, t^2 i, \dots, t^{11} i\}$$
consisting of all transpositions and inversions of equal-tempered pitch classes.
The elements t and i satisfy the relations
$$t^{12} = i^2 = e, \qquad i t^j = t^{12-j} i,$$
making M12 isomorphic to the dihedral group D12 .
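The group can be built explicitly on a computer by storing each operation as its table of values on the pitch-classes 0, . . . , 11. The Python sketch below (all names ours) generates the 24 elements from t and i, then checks closure under composition and the two relations.

```python
from itertools import product

# Each map on {0, ..., 11} is stored as the 12-tuple of its values.
e = tuple(range(12))                        # identity
t = tuple((p + 1) % 12 for p in range(12))  # transposition by one semitone
i = tuple((-p) % 12 for p in range(12))     # inversion i_0

def compose(f, g):
    """The composition f after g: first apply g, then f."""
    return tuple(f[g[p]] for p in range(12))

def power(f, n):
    """The n-fold composition of f with itself (the identity when n = 0)."""
    result = e
    for _ in range(n):
        result = compose(f, result)
    return result

M12 = ({power(t, j) for j in range(12)}
       | {compose(power(t, j), i) for j in range(12)})

size = len(M12)
closed = all(compose(f, g) in M12 for f, g in product(M12, repeat=2))
relations = (power(t, 12) == e and compose(i, i) == e
             and all(compose(i, power(t, j)) == compose(power(t, 12 - j), i)
                     for j in range(12)))

print(size, closed, relations)  # 24 True True
```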
Comment 6.8. Geometrically speaking M12 is a subset of the set M of all
rigid motions of the circle; the ‘M’ thus stands for ‘motion’. A rigid motion is
a transformation of a space which preserves distance.
You can prove that every rigid motion of the circle is either a rotation (transposition) or a reflection (inversion). It follows that M is actually an infinite
group, though it has a similar structure to D12 : if we let i be the same inversion
(reflection) as above, and let tθ be transposition (rotation) by θ, then every
element of M can be written as tθ or tθ i for some real number θ ∈ [0, 12).
We obtain the finite group M12 by taking only those tθ and tθ i with θ = j
an integer. This is precisely the set of those rotations and reflections which map
integer classes to integer classes on our circle.
This can also be thought of as the group of rigid motions (or symmetries) of
the regular dodecagon (the regular 12-sided polygon).
Classic 7 (Béla Bartók’s Mikrokosmos). Mikrokosmos by Hungarian composer
Béla Bartók (1881-1945) is a collection of 153 piano studies, ranging from beginner to professional level, published in 6 volumes.
Volumes 5 and 6 contain many little masterpieces of twentieth century composition, and in a similar manner to the works of Bach, are a treasure trove for
those looking for interesting examples of various composition techniques.
Bartók was a towering figure among Hungarian composers. Ligeti himself,
though never a student of Bartók’s, very deliberately attempted to get out from
under his shadow. This was in fact one of Ligeti’s goals in writing Musica
Ricercata, though as you may notice from the following Mikrokosmen, he didn’t
completely succeed.
Mikrokosmos, No. 140, “Free Variations”
The right hand in mm. 13-19 is clearly an inversion of the left hand
in mm. 1-7. Octave equivalence is an issue here. Do we consider this an
operation on pitch space, in which case it would be i4 , the inversion with respect
to E4? Or do we consider it as an operation on pitch-class space, in which case
it would be i9 , the inversion with respect to A?
Mikrokosmos, No. 142, “From the Diary of a Fly”
Mikrokosmos No. 142 makes prominent use of melodic inversion through
the axis splitting G and A♭ (i7.5 ), as well as chordal inversion through the axis
splitting E♭ (i13.5 ):
7
Chord-types
When discussing triads we naturally sorted them into 4 different types: major, minor, diminished and augmented. Musically speaking, the type of a triad
is determined by its intervallic content.
Mathematically speaking the type of a triad is determined by the various distances between pitches–more precisely, the sequence of these distances, moving
around the circle clockwise.
This is precisely the information that is preserved when we apply a transposition to a chord, and is the motivation for the following definition.
Definition 23. We work within pitch-class space. Two chords X and Y are of
the same type if the one is a transposition of the other: that is, if there is a
transposition ti ∈ T12 such that ti (X) = Y .
This defines an equivalence relation on the set of all chords (pitch-class sets).
The equivalence classes determined by this relation are called chord-types.
Thus given a chord X, the set of all its transpositions forms a chord-type,
which we call the chord-type of X. Furthermore, chords X and Y have the same
chord-type if and only if they are transpositions of one another.
Comment 7.1. Two chords being of the same type is very closely related,
though not identical, to the notion of two shapes in the plane being congruent.
Shapes X1 and X2 are congruent if the one can be obtained from the other via a
rigid motion: that is, some combination of translation, rotation and reflection.
Our definition of chord-type excludes the reflection option. Why? If we
allowed inversion, then we would have to say that {0, 4, 7} is the same type as
{0, 3, 7}, which would erase the difference between major and minor triads!
The music theorist Allen Forte, who died this October (2014) at the age of
87, and who is largely responsible for introducing these set theory ideas into
music theory, did in fact include inversion when considering chord-types. Thus
according to his taxonomy, the major and minor triad are one and the same
beast!
Example 7.1. The chord-type of X = {0, 4, 7} = {C, E, G} is defined as the
set of all equal-tempered transpositions of X. This is just the set
{{0, 4, 7}, {1, 5, 8}, {2, 6, 9}, . . . } = {{C, E, G}, {D♭, F, A♭}, {D, F♯, A}, . . . }
of all twelve of the major triads. In this sense our equivalence class does indeed
capture the property of being a major chord.
Example 7.2. Compute the chord-type of X = {0, 3, 6, 9} = {C, E♭, G♭, A = B♭♭}.
Begin by computing the transpositions of X one by one:
t0(X) = {0, 3, 6, 9}
t1(X) = {1, 4, 7, 10}
t2(X) = {2, 5, 8, 11}
t3(X) = {3, 6, 9, 12} = X!!
Thus the chord-type of X is
{{0, 3, 6, 9}, {1, 4, 7, 10}, {2, 5, 8, 11}} = {c°7, c♯°7, d°7}.
These are precisely all three diminished seventh chords. So in this case our
chord-type captures the property of being a diminished seventh chord!
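The two chord-type computations above can be reproduced mechanically. What follows is a minimal Python sketch under stated assumptions: a chord is modelled as a set of integers mod 12, and the names `transpose` and `chord_type` are illustrative helpers, not notation from these notes.

```python
def transpose(chord, j):
    """Apply the transposition t_j to a chord (a set of pitch-classes 0..11)."""
    return frozenset((p + j) % 12 for p in chord)


def chord_type(chord):
    """Return the set of all equal-tempered transpositions of the chord."""
    return {transpose(chord, j) for j in range(12)}


major = chord_type({0, 4, 7})
dim7 = chord_type({0, 3, 6, 9})

print(len(major))                        # 12 major triads
print(sorted(sorted(c) for c in dim7))   # the 3 diminished seventh chords
```

Because the result is a set, repeated transpositions such as t3(X) = X for the diminished seventh chord collapse automatically.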
The prime form of a chord-type
Not all of our chord-types have natural names like ‘minor triad’ or ‘half-diminished seventh’. To make up for this, we establish a convention for picking
one particular representative of a chord-type to serve as its name. We will call
this the prime form of a chord-type.³
Definition 24. The prime form of a chord-type is the unique element X =
{P1, P2, . . . , Pr} (where 0 ≤ P1 < P2 < · · · < Pr ≤ 11) in the equivalence class
such that
1. P1 = 0,
2. Pr is as small as possible,
3. the sequence (P2, P3, . . . , Pr−1) is as small as possible with respect to the
lexicographic order.
Comment 7.2. Ordering sequences lexicographically is the same thing as “alphabetizing” them using the natural ordering of the integers. Thus, for example,
the sequence (1, 4, 2, 3, 1, 4) is smaller than the sequence (1, 4, 2, 5, 1, 4); the first
term where they differ is the fourth, and 3 < 5.
3 Forte defines the prime form of a chord with respect to all transpositions and inversions.
This of course yields a different notion of chord-type.
Example 7.3. Consider the chord X = {1, 4, 6, 9, 10}.
To find the prime form of its chord-type, we first find all the transpositions
of X satisfying properties (1) and (2) from the definition. To do this draw X on
the circle and find the biggest gap between consecutive points (going clockwise).
Each time you find a gap of this maximal length between pitches P and Q (in
clockwise order), rotate the second pitch Q to 0.
In our case the biggest gap is 3, occurring between 1 and 4, 6 and 9, and 10
and 1. These give rise to three transpositions of X satisfying (1) and (2):
X1 = {0, 2, 5, 6, 9}
X2 = {0, 1, 4, 7, 9}
X3 = {0, 3, 5, 8, 9}
The prime form is now the smallest of these three, ordered alphabetically. This
is {0, 1, 4, 7, 9}.
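The prime-form recipe of Definition 24 can be sketched in a few lines of Python. This is only an illustration under the transposition-only convention used here (not Forte's, which also allows inversion); the function name `prime_form` and the tuple representation are assumptions of the sketch. It tries every rotation that puts a chord tone on 0, then picks the candidate with the smallest last entry, breaking ties lexicographically.

```python
def prime_form(chord):
    """Prime form under transposition only, following Definition 24."""
    candidates = []
    for p in chord:
        # Transpose so that the chord tone p lands on 0, then list in increasing order.
        rotated = tuple(sorted((q - p) % 12 for q in chord))
        candidates.append(rotated)
    # Smallest last entry first; ties broken by lexicographic order of the whole tuple
    # (all candidates start with 0, so this orders the middle entries as required).
    return min(candidates, key=lambda t: (t[-1], t))


print(prime_form({1, 4, 6, 9, 10}))   # (0, 1, 4, 7, 9)
```

Rotating every chord tone to 0 is slower than the "biggest gap" shortcut of Example 7.3, but it needs no case analysis and gives the same answer.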
7.1 Counting chords of the same type
Why are there 12 major triads, but only 3 diminished seventh chords? Why
do both these numbers divide 12? Is there in general a way of knowing how
many chords of a certain type there are, or equivalently, of knowing how many
different transpositions a given chord X has?
Group theory provides the answers to all these questions! The results are so
(dare I say) beautiful and far-reaching, that I’m sure you will forgive my having
to introduce some more terminology.
Group actions
Definition 25. Let G be a group and let S be a set. A group action of G on
S is a rule which given a group element g ∈ G and an element s ∈ S of our set,
outputs a new element s′ = g · s of S, and which satisfies the following axioms:
(i) e · s = s for all s ∈ S (identity element acts trivially);
(ii) (g1 g2 ) · s = g1 · (g2 · s) for all g1 , g2 ∈ G and s ∈ S (associativity).
Comment 7.3. This is just a generalization of our current setup. We have a
group G = T12 of transpositions that acts on the set S of all chords. The rule
for the action g · s is simply given by my definition of transposition: given a
transposition tj in T12 and a chord X in S, we define tj · X = tj (X).
Stabilizers and orbits
Definition 26. Suppose G acts on the set S. Fix an element s ∈ S.
The orbit of s, denoted Os, is the set of all translates of s by elements of G:
Os = {g · s : g ∈ G} = {g1 · s, g2 · s, . . . }.
The stabilizer of s, denoted Stabs, is the set of elements of G which fix s:
Stabs = {g ∈ G : g · s = s}.
Comment 7.4. In our setup, given a chord X, its orbit
OX = {t0 (X), t1 (X), . . . , t11 (X)}
is precisely its chord-type, the set of all transpositions of X.
The stabilizer of X is
StabX = {tj ∈ T12 : tj (X) = X},
the set of transpositions that fix the chord.
Lagrange’s Theorem
We now state a wonderful theorem (more commonly known as the orbit-stabilizer
theorem, a close cousin of Lagrange’s theorem on subgroups) that allows us to
count the translates of an element s (in our context this will be the transpositions
of a chord) by counting the elements of its stabilizer Stabs. The latter is often
easier to count, which is why the theorem is so useful.
Theorem (Lagrange’s Theorem). Let the finite group G act on the set S. Given
any s ∈ S, we can count the size of its orbit as follows:
#Os = #G/# Stabs .
Example 7.4. Let G = T12 and let S be the set of all chords.
Take X = {0, 4, 7}. We have StabX = {t0 }, since any other transposition
does not map X to itself, as one readily verifies. Thus according to the theorem,
the number of transpositions of X is
#OX = #G/# StabX = 12/1 = 12,
as we already verified!
Take Y = {0, 3, 6, 9}. We have StabY = {t0 , t3 , t6 , t9 }, as one readily verifies.
It follows that
#OY = #G/# StabY = 12/4 = 3,
as we already verified.
Example 7.5. Continue with G = T12 and S the set of all chords. How many
chords are of the same type as X = {0, 1, 3, 6, 7, 9}?
We must count its different transpositions. We have StabX = {t0, t6} in this
case (indeed t6(X) = {6, 7, 9, 0, 1, 3} = X), from which it follows that
#OX = #G/# StabX = 12/2 = 6.
In this case using the formula is faster than computing the chord-type of X
directly, which would be
{t0(X) = X, t1(X), t2(X), t3(X), t4(X), t5(X)}.
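The counting formula is easy to check by machine for the chords of Example 7.4. The sketch below is illustrative only: chords are Python sets of integers mod 12, transpositions t_j are represented by their index j, and the names `stabilizer` and `orbit` are choices made for this sketch.

```python
def transpose(chord, j):
    """Apply t_j to a chord given as a set of pitch-classes 0..11."""
    return frozenset((p + j) % 12 for p in chord)


def stabilizer(chord):
    """Indices j of the transpositions t_j that fix the chord."""
    return [j for j in range(12) if transpose(chord, j) == frozenset(chord)]


def orbit(chord):
    """All distinct transpositions of the chord, i.e. its chord-type."""
    return {transpose(chord, j) for j in range(12)}


for chord in ({0, 4, 7}, {0, 3, 6, 9}, {0, 4, 8}):
    stab, orb = stabilizer(chord), orbit(chord)
    # The theorem predicts: #orbit = 12 / #stabilizer.
    print(sorted(chord), stab, len(orb), 12 // len(stab))
```

For the major triad the stabilizer is just t0 and the orbit has 12 elements; for the diminished seventh chord the stabilizer has 4 elements and the orbit has 3.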
Transpositional and inversional symmetry
We easily conclude from Lagrange’s theorem that a chord X has fewer than 12
distinct transpositions if and only if there is a nontrivial transposition tj such
that tj (X) = X.
Since tj acts as rotation, it follows that if tj (X) = X, then the chord X
admits a rotational symmetry when considered as a collection of points on the
circle. Accordingly, we call such chords (and their corresponding chord-types)
symmetric, or more specifically, we say in this case that the chord X has
transpositional symmetry by j.
We have already seen two examples of symmetric chord-types: written in
prime form these are {0, 4, 8} (transpositional symmetry by 4) and {0, 3, 6, 9}
(transpositional symmetry by 3), the augmented triad and diminished seventh
chord.
Similarly, we say a chord X has inversional symmetry if there is an inversion iQ0 such that iQ0 (X) = X. Since inversion is reflection, geometrically
this means the chord X admits an axis of symmetry.
7.2 Counting chord-types
Recall that there are 220 different trichords. How many different chord-types
do these divide up into? We have the four chord-types corresponding to the
four types of triads. These account for just 3 · 12 + 4 = 40 of the 220 trichords.
How do the remaining 180 break up into chord-types?
More generally, we can fix an n and ask how many different chord-types there
are among the set of all n-chords. This is not such an easy counting problem,
but luckily group theory comes to our aid in spectacular fashion.
From the group theoretic standpoint, when a group G acts on a set S, it
decomposes S into distinct orbits Os . We often want to know how many different
orbits S decomposes into under this action.
This question has a surprising answer in the form of Burnside’s Lemma.
Classic 8 (Burnside’s lemma). It turns out we can count the number of distinct
orbits in S by counting something seemingly totally unrelated: the number of
elements of S fixed by a given group element g ∈ G.
Definition 27. Let G be a group acting on a set S. Given g ∈ G the set of
fixed points of g is defined as the set
Fixg := {s ∈ S : g · s = s}
of all elements of S fixed by g.
Comment 7.5. There are clearly some structural similarities between Stabs and
Fixg, and it is easy to get the two notions muddled up. The following diagram
might be useful in keeping things straight, at least as far as showing where
everything lives. Everything in the left-most column lives in G; everything in
the right-most column lives in S.
G                              acts on   S
g ∈ G                          =⇒        Fixg = {s ∈ S : g · s = s}
Stabs = {g ∈ G : g · s = s}    ⇐=        s ∈ S
Burnside’s lemma. Let G = {g1, g2, . . . , gm} be a finite group acting on a set
S. Let Norbits be the number of orbits that S decomposes into under this action.
Then
\[ N_{\mathrm{orbits}} = \frac{1}{\#G} \sum_{g \in G} \#\mathrm{Fix}_g = \frac{1}{\#G}\left(\#\mathrm{Fix}_{g_1} + \#\mathrm{Fix}_{g_2} + \cdots + \#\mathrm{Fix}_{g_m}\right). \]
Comment 7.6. I omit the proof here, as I did also with Lagrange’s theorem,
simply because of time constraints. I’m happy to report however, that I consider
both proofs well within your mathematical capabilities at this point.
We apply the preceding to the situation when G = T12 acting on S, the set
of all n-chords, in which case orbits correspond to chord-types of n-chords. We
start off with n = 2 to get warmed up.
Example 7.6. The set S of all 2-chords decomposes into a number of distinct
chord-types under the action of T12 . Since a 2-chord {P1 , P2 } is just an interval,
we call these chord-types interval-types. We count N2, the number of interval-types, using Burnside’s lemma:
\[ N_2 = \frac{1}{\#T_{12}} \sum_{t_j \in T_{12}} \#\mathrm{Fix}_{t_j} = \frac{1}{12}\left(\#\mathrm{Fix}_{t_0} + \#\mathrm{Fix}_{t_1} + \cdots + \#\mathrm{Fix}_{t_{11}}\right). \]
So now all we have to do is count Fixtj; that is, for each transposition tj we
have to count the number of intervals {P1, P2} it fixes. This is easy.
First off we have # Fixt0 = 66, since t0 fixes all 66 = (12 choose 2) dyads. Next we observe that # Fixt6 = 6, as it fixes all and only the 6 tritones {0, 6}, {1, 7}, . . . , {5, 11}.
Lastly, # Fixtj = 0 for all other j. Why? Think geometrically! Putting it all
together, we conclude that
\[ N_2 = \frac{1}{12}(66 + 6) = 72/12 = 6. \]
Comment 7.7. Does the last example make sense? The conclusion was that
there are only 6 different interval types, but shouldn’t there be 12 types (minor/major second, minor/major third, etc.) corresponding to the twelve possible
lengths of the interval?
Prime form   Interval type
{0, 1}       minor second = major seventh
{0, 2}       major second = minor seventh
{0, 3}       minor third = major sixth
{0, 4}       major third = minor sixth
{0, 5}       perfect fourth = perfect fifth
{0, 6}       tritone
Figure 6: The prime forms for the 6 different interval-types
Ah, we have forgotten that we work in pitch-class space, so there are in
fact only 6 possible lengths of an interval {P1 , P2 }: we take the shortest path
around the circle! Put another way, all inversely related intervals define the
same interval-type!
Example 7.7. Take n = 3. The set S of all 3-chords decomposes into a number
of distinct chord-types under the action of T12 . Call these trichord-types. We
count N3 , the number of trichord-types, using Burnside’s lemma:
\[ N_3 = \frac{1}{12}\left(\#\mathrm{Fix}_{t_0} + \#\mathrm{Fix}_{t_1} + \cdots + \#\mathrm{Fix}_{t_{11}}\right). \]
Note: our computation will differ from the n = 2 case, as now we count how
many trichords {P1, P2, P3} are fixed by each tj.
As before # Fixt0 = 220 = (12 choose 3), since t0 fixes all trichords. Furthermore,
# Fixt4 = 4 as t4 fixes the 4 different augmented triads. It follows that t8 = t4 ◦ t4
also fixes these and only these trichords; thus # Fixt8 = 4. Lastly, # Fixtj = 0
for all other j by another geometric argument: for a trichord to be fixed by a
rotation, it must be symmetric, hence an augmented triad.
Thus
\[ N_3 = \frac{1}{12}(220 + 4 + 4) = 228/12 = 19. \]
There are exactly 19 different trichord-types!
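The Burnside counts for n = 2 and n = 3 can be double-checked by brute force. The following Python sketch is an illustration, not part of the original argument: it computes the average number of fixed n-chords over the twelve transpositions, and separately counts orbits directly by sweeping through all n-chords. The function names are assumptions of the sketch.

```python
from itertools import combinations


def transpose(chord, j):
    """Apply t_j to a chord given as a collection of pitch-classes 0..11."""
    return frozenset((p + j) % 12 for p in chord)


def count_types_burnside(n):
    """Number of n-chord types via Burnside: average of #Fix over T12."""
    chords = [frozenset(c) for c in combinations(range(12), n)]
    total_fixed = sum(1 for j in range(12) for c in chords if transpose(c, j) == c)
    return total_fixed // 12


def count_types_bruteforce(n):
    """Number of n-chord types by marking off whole orbits one at a time."""
    seen, count = set(), 0
    for c in combinations(range(12), n):
        c = frozenset(c)
        if c not in seen:
            count += 1
            seen.update(transpose(c, j) for j in range(12))
    return count


print(count_types_burnside(2), count_types_burnside(3))   # 6 19
```

The two functions should agree for every n; the Burnside version never has to decide which chords belong to the same orbit, which is exactly the point of the lemma.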
8 Scales
How should we model musical scales? Our first inclination, especially after listening to our downstairs neighbor diligently pound through all 12 major
scales, is to treat these as sequences of pitch-classes. Order matters here, right?
For example, we would represent the C major scale as the sequence (0, 2, 4, 5, 7, 9, 11)
(or (C, D, E, F, G, A, B), using pitch names) and the F♯ major scale as (6, 8, 10, 11, 1, 3, 5)
(or (F♯, G♯, A♯, B, C♯, D♯, E♯), using pitch names).
Looking at these two sequences carefully, however, we see that the particular
ordering of the chosen pitches here is not all that interesting: we have just
listed the given pitch-classes in their natural order around the pitch-class circle.
In other words, our sequences don’t contain much more information than the
corresponding sets {0, 2, 4, 5, 7, 9, 11} and {6, 8, 10, 11, 1, 3, 5}. (The sequences
do in fact contain one more piece of information, namely who goes first, but
this is easily dealt with.)
Furthermore, scales are often treated by composers as fixed collections of
pitches from which they draw subsets in order to build chords or melodies.
Accordingly we will model scales as sets of pitch-classes, just as we did with
chords, but will develop some additional theory to reflect their particular musical
functions. Note: for the rest of this section we will work exclusively in pitch-class space. Accordingly we will drop the bracket notation, and use modular
arithmetic with impunity.
Definition 28. We will call a scale any subset X = {P1 , P2 , . . . , Pr } of r
distinct pitch-classes.
Most of the scales we will consider will be equal-tempered, which means as
usual that the Pi ∈ {0, 1, 2, . . . , 11} are taken from our set of 12 equal-tempered
pitch-classes.
Furthermore, we typically will insist that r ≥ 5; i.e., you should have at least
5 pitches to be considered a scale.
Lastly, as with chords we say that two equal-tempered scales X and Y are
of the same type if there is a transposition tj ∈ T12 such that tj (X) = Y (i.e.,
the one is a transposition of the other). The equivalence classes determined by
this relation are called scale-types.
Example 8.1. The scale {0, 2, 4, 5, 7, 9, 11} is called the C diatonic scale.
It can be described as starting with the pitch C and applying in order the
following sequence of whole (W = M2) and half step (H = m2) transpositions:
(W, W, H, W, W, W, H).
The same description in terms of W and H allows us to define diatonic scales
starting with any pitch. Thus the G diatonic scale is just {7, 9, 11, 0, 2, 4, 6} =
{0, 2, 4, 6, 7, 9, 11}.
It is clear each such diatonic scale is just a transposition of the C diatonic
scale, and thus together these comprise a single scale-type, which we call diatonic.
How many different diatonic scales are there? Is it possible, for example,
that the G♭ diatonic scale is just the same thing as the B♭ diatonic scale written
in a different order?
Use Lagrange’s theorem! The stabilizer of the C diatonic collection is trivial,
so its orbit has 12/1 = 12 different transpositions in it. There are indeed 12
different diatonic scales!
Modes
Let’s return to our earnest piano student downstairs. Listening more closely
we hear he actually plays two different versions of the C diatonic scale: one that
begins with C, the sequence (0, 2, 4, 5, 7, 9, 11), and one that begins with A, the
sequence (9, 11, 0, 2, 4, 5, 7).
This difference is not captured currently in our mathematical model of scales,
but we fix this easily with the notion of a mode.
Mode                     W-H sequence             Name
(C, D, E, F, G, A, B)    (W, W, H, W, W, W, H)    C ionian (or C major)
(D, E, F, G, A, B, C)    (W, H, W, W, W, H, W)    D dorian
(E, F, G, A, B, C, D)    (H, W, W, W, H, W, W)    E phrygian
(F, G, A, B, C, D, E)    (W, W, W, H, W, W, H)    F lydian
(G, A, B, C, D, E, F)    (W, W, H, W, W, H, W)    G mixolydian
(A, B, C, D, E, F, G)    (W, H, W, W, H, W, W)    A aeolian (or A natural minor)
(B, C, D, E, F, G, A)    (H, W, W, H, W, W, W)    B locrian
Figure 7: The seven modes of the C diatonic scale
Definition 29. Let X = {P1, P2, . . . , Pr} be a scale, and assume the pitch-classes are written in a clockwise sequential order. A mode of X is the sequence
(Pj, Pj+1, . . . , Pr, P1, . . . , Pj−1) you get by starting with a pitch Pj and working
around the scale in clockwise fashion.
Given a mode (Q1, Q2, . . . , Qr), we call the i-th pitch in the mode the
i-th scale degree of the mode (or scale degree i), denoted î. We will use
the same terminology when dealing with scales too, at least when their names
indicate a preferred “first” pitch. For example, in the D diatonic scale, scale
degree 3 is 3̂ = F♯, and scale degree 7 is 7̂ = C♯.
Diatonic modes
In general a scale containing r distinct pitches will have r different modes,
determined by who goes first. As with chords, we transpose modes simply by
transposing each pitch:
tj ((P1, P2, . . . , Pr)) := (tj (P1), tj (P2), . . . , tj (Pr)).
Since transposition preserves the W-H sequences above, these define the different mode-types for diatonic modes, and we can use them to generate any mode
starting with any pitch. For example the F dorian mode is (F, G, A♭, B♭, C, D, E♭).
However, it is perhaps easier to just remember the white note modes and transpose these accordingly.
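Generating a diatonic mode from its W-H sequence is a short computation. The Python sketch below is one possible way to do it, with whole and half steps encoded as 2 and 1 half steps; the names `DIATONIC_STEPS`, `MODE_NAMES`, `mode_steps` and `build_mode` are illustrative assumptions, not notation from these notes.

```python
DIATONIC_STEPS = [2, 2, 1, 2, 2, 2, 1]   # W W H W W W H, measured in half steps
MODE_NAMES = ["ionian", "dorian", "phrygian", "lydian",
              "mixolydian", "aeolian", "locrian"]


def mode_steps(name):
    """Rotate the ionian step sequence to get the W-H sequence of the named mode."""
    k = MODE_NAMES.index(name)
    return DIATONIC_STEPS[k:] + DIATONIC_STEPS[:k]


def build_mode(first_pitch, name):
    """List the pitch-classes of the mode, starting on first_pitch."""
    pitches = [first_pitch % 12]
    for step in mode_steps(name)[:-1]:   # the last step just returns to the start
        pitches.append((pitches[-1] + step) % 12)
    return tuple(pitches)


print(build_mode(5, "dorian"))    # F dorian: (5, 7, 8, 10, 0, 2, 3)
print(build_mode(9, "aeolian"))   # A aeolian: (9, 11, 0, 2, 4, 5, 7)
```

Reading 5, 7, 8, 10, 0, 2, 3 as pitch names gives F, G, A♭, B♭, C, D, E♭, in agreement with the F dorian mode written out above.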
Comment 8.1. It is not just scalar runs played by our downstairs neighbor that
we identify with a particular mode. When analyzing music, we often describe
entire passages as being written in a particular mode: e.g., “this passage is in
F lydian”, or “here the composer switches to a G dorian mode”.
Such assertions indicate two musical properties:
1. the underlying scalar collection the composer is using, and
2. a particular pitch that is given special emphasis, sometimes called the
tonal center of the passage.
For example, a passage written in G dorian makes use of the F diatonic collection
= {F, G, A, B♭, C, D, E} = {0, 2, 4, 5, 7, 9, 10}, and gives special emphasis to G
somehow: perhaps the melody begins and ends on G, for example.
In general the scalar collection (1) can be indicated fairly unambiguously,
whereas the tonal center (2) can be trickier to identify.
Example 8.2 (“Paddy’s Green Shamrock Shore”, performed by Paul Brady).
All pitches here are white notes (note the F♯ in the signature is always made
natural!), making the scalar collection here C diatonic. As G is clearly the tonal
center of the piece, we conclude it is written in G mixolydian.
Example 8.3 (“I can’t explain”, The Who). (I lifted this example from Tymoczko’s Music 105 lecture notes.) The piece opens with the repeated sequence
of major triads E-D-A-E. Collecting all the pitches in these chords yields the A
diatonic collection {A, B, C♯, D, E, F♯, G♯}. As E is clearly the preferred pitch,
this is E mixolydian.
8.1 Generated scales
Recall that we can generate the entire 12-tone scale by starting with a pitch
and transposing up repeatedly by a perfect fifth.
If we start with F, then the first seven pitches in this sequence are precisely
the pitches of the C diatonic scale:
{F, C, G, D, A, E, B}.
We say in this case that the scale is generated.
Definition 30. A scale X is generated if there is a pitch P1 and a transposition t such that
X = {P1, t(P1), t^2(P1), . . . , t^(r−1)(P1)}.
(As usual, t^j(P) means transpose by t a total of j times.)
Pentatonic scale
The first five pitches of the sequence of fifths starting on F comprise what
is called a pentatonic scale: {D, F, G, A, C}. By definition it is a generated
scale, and a subscale of the diatonic scale.
The prime form of the pentatonic scale is {0, 2, 4, 7, 9}. I will name a pentatonic scale according to the unique pitch that functions as 0 in the prime form.
Thus {F, G, A, C, D} is the F pentatonic scale. As the naming scheme suggests,
there are 12 different pentatonic scales. (Use Lagrange’s theorem!)
Note that the black notes of the keyboard {6, 8, 10, 1, 3} also form a pentatonic scale: the G♭ pentatonic scale. This should come as no surprise. If we
pick up our sequence of fifths where we left off after generating the white notes,
we get precisely the five black notes {G♭, D♭, A♭, E♭, B♭}. This is most often the
first pentatonic scale we meet in our musical life, and the black note pattern is
the best way of remembering the intervallic content of the pentatonic scale.
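The stack-of-fifths picture is easy to experiment with. In the Python sketch below, which is an illustration only, a generated scale is built by repeatedly adding a fixed interval mod 12; the helper name `generated_scale` is an assumption of the sketch, and F = 5 and the perfect fifth = 7 half steps follow the conventions of these notes.

```python
def generated_scale(start, interval, size):
    """Return [P, t(P), t^2(P), ...] (size terms), where t transposes by `interval`."""
    return [(start + k * interval) % 12 for k in range(size)]


fifths_from_F = generated_scale(5, 7, 12)   # F, C, G, D, A, E, B, then the black notes

print(sorted(fifths_from_F[:5]))    # F pentatonic: [0, 2, 5, 7, 9]
print(sorted(fifths_from_F[:7]))    # C diatonic:   [0, 2, 4, 5, 7, 9, 11]
print(sorted(fifths_from_F[7:]))    # black notes:  [1, 3, 6, 8, 10]
```

Changing the interval to 4 or 3 reproduces the short "scales" discussed next: the augmented triad and the diminished seventh chord.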
Stacks of stacks
If you try creating generated scales with smaller intervals, thirds for example, you get “scales” with fewer than five notes, and with sizable gaps between
pitches.
For example, if we start with C = 0 and use a major third as our generating
interval, we get the “scale” {0, 4, 8}, which is none other than our augmented
triad. Though this stack of thirds is not enough to form a scale, we can combine
it with other augmented triads to get various 6-note scales:
{0, 4, 8} + {1, 5, 9} = {0, 1, 4, 5, 8, 9}      the hexatonic scale (or augmented scale)
{0, 4, 8} + {2, 6, 10} = {0, 2, 4, 6, 8, 10}    the whole-tone scale
{0, 4, 8} + {3, 7, 11} = {0, 3, 4, 7, 8, 11} = t3({0, 1, 4, 5, 8, 9}).
Note that the whole-tone scale is a generated scale, using transposition by a
whole tone. The stabilizer of the whole-tone scale is H = {t0 , t2 , t4 , t6 , t8 , t10 }.
Thus there are 12/6 = 2 whole-tone scales. We name them as follows:
WT-0 = {0, 2, 4, 6, 8, 10} = {C, D, E, F♯, G♯, A♯}
WT-1 = {1, 3, 5, 7, 9, 11} = {D♭, E♭, F, G, A, B}
Octatonic scale
If we play the same game with a minor third, we begin with C = 0 and
generate a diminished seventh chord {0, 3, 6, 9}. Up to transposition, combining
any two such diminished seventh chords always produces the same scale-type,
called the octatonic:
{0, 3, 6, 9} + {1, 4, 7, 10} = {0, 1, 3, 4, 6, 7, 9, 10}
How many different octatonic scales are there? The stabilizer is H = {t0 , t3 , t6 , t9 }.
Thus there are 12/4 = 3 different octatonic scales. We will denote them as follows:
Oct0,1 = {0, 1, 3, 4, 6, 7, 9, 10}
Oct0,2 = {0, 2, 3, 5, 6, 8, 9, 11}
Oct1,2 = {1, 2, 4, 5, 7, 8, 10, 11}
We pause here to collect information about our current list of scale-types.
Name         Step sequence               Prime form                   Stab    Interval vector
diatonic     (2, 2, 1, 2, 2, 2, 1)       {0, 1, 3, 5, 6, 8, 10}       {t0}    254361
pentatonic   (2, 2, 3, 2, 3)             {0, 2, 4, 7, 9}              {t0}    032140
hexatonic    (1, 3, 1, 3, 1, 3)          {0, 1, 4, 5, 8, 9}           ⟨t4⟩    303630
whole-tone   (2, 2, 2, 2, 2, 2)          {0, 2, 4, 6, 8, 10}          ⟨t2⟩    060603
octatonic    (1, 2, 1, 2, 1, 2, 1, 2)    {0, 1, 3, 4, 6, 7, 9, 10}    ⟨t3⟩    448444
The interval vector a1 a2 a3 a4 a5 a6 of a scale gives the number ai of intervals
contained in the scale of length i half steps.
The step sequence just indicates the number of half steps between successive
pitches in the scale. These sequences can be read straight off of the prime form,
though I have cycled the diatonic sequence around to its most familiar form
(viz., “whole, whole, half, whole, whole...”).
For the stabilizer column, the notation ⟨tj⟩ denotes the subgroup of T12
generated by tj. Thus ⟨t2⟩ = {t0, t2, t4, t6, t8, t10}, and ⟨t3⟩ = {t0, t3, t6, t9}.
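The interval-vector and stabilizer columns of the table can be recomputed directly from the prime forms. The Python sketch below is illustrative: `interval_vector` counts, for each pair of scale pitches, the shorter way around the pitch-class circle, and `stabilizer` lists the indices j with t_j fixing the scale. Both names are assumptions of the sketch.

```python
from itertools import combinations


def interval_vector(scale):
    """Return [a1, ..., a6], where a_i counts pairs of pitches i half steps apart."""
    vector = [0] * 6
    for a, b in combinations(sorted(scale), 2):
        d = (b - a) % 12
        vector[min(d, 12 - d) - 1] += 1   # measure the shorter way around the circle
    return vector


def stabilizer(scale):
    """Indices j such that transposition by j maps the scale to itself."""
    s = frozenset(scale)
    return [j for j in range(12) if frozenset((p + j) % 12 for p in s) == s]


scales = {
    "diatonic": {0, 1, 3, 5, 6, 8, 10},
    "pentatonic": {0, 2, 4, 7, 9},
    "hexatonic": {0, 1, 4, 5, 8, 9},
    "whole-tone": {0, 2, 4, 6, 8, 10},
    "octatonic": {0, 1, 3, 4, 6, 7, 9, 10},
}
for name, scale in scales.items():
    print(name, interval_vector(scale), stabilizer(scale))
```

A scale with r pitches contains r(r − 1)/2 intervals in total, so the entries of each vector sum to that number; this gives a quick sanity check on the table.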
Geometric summary with inversional symmetry indicated
[Figure: the five scale-types above drawn on the pitch-class circle]
8.2 Small-gap scales
One of the defining characteristics of the diatonic scale is that the gaps between
successive pitches are no more than 2 half steps, and that there are never two
consecutive gaps of size one half step. It is natural then to consider all scale-types satisfying these two properties, as they will be in some sense diatonic-like.
It turns out that, up to transposition, there are not so many.
whole-tone   {C, D, E, F♯, G♯, A♯}            (2, 2, 2, 2, 2, 2)
diatonic     {C, D, E, F, G, A, B}            (2, 2, 1, 2, 2, 2, 1)
acoustic     {C, D, E, F♯, G, A, B♭}          (2, 2, 2, 1, 2, 1, 2)
octatonic    {C, C♯, D♯, E, F♯, G, A, B♭}     (1, 2, 1, 2, 1, 2, 1, 2)
(Note: The acoustic scale is so called, as these pitches are the equal-tempered
best approximation of the first 7 pitches of the harmonic scale.)
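The claim that only four scale-types satisfy the two diatonic-like properties can be checked by exhaustive search. The Python sketch below is an illustration under stated assumptions: it runs through all sets of 5 to 12 pitch-classes, keeps those whose successive gaps are at most 2 half steps with no two consecutive half steps, and collects them by prime form (transposition only). The names `steps`, `is_diatonic_like` and `prime_form` are choices made for this sketch.

```python
from itertools import combinations


def steps(scale):
    """Gaps in half steps between successive pitches, going once around the circle."""
    s = sorted(scale)
    return [(s[(i + 1) % len(s)] - s[i]) % 12 for i in range(len(s))]


def is_diatonic_like(scale):
    """Gaps of at most 2 half steps, and never two consecutive half steps."""
    st = steps(scale)
    if max(st) > 2:
        return False
    return not any(st[i] == 1 and st[(i + 1) % len(st)] == 1 for i in range(len(st)))


def prime_form(scale):
    """Prime form under transposition only: smallest last entry, then lexicographic."""
    candidates = [tuple(sorted((q - p) % 12 for q in scale)) for p in scale]
    return min(candidates, key=lambda t: (t[-1], t))


types = set()
for n in range(5, 13):
    for scale in combinations(range(12), n):
        if is_diatonic_like(scale):
            types.add(prime_form(scale))

for t in sorted(types, key=len):
    print(t, steps(t))
```

The search should return exactly four prime forms, one each for the whole-tone, diatonic, acoustic and octatonic scale-types.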
In A Geometry of Music Dmitri Tymoczko defines an n-gap scale to be one
where the gap between successive pitches is at most n half steps. He groups 2-gap and 3-gap scales under the general heading of small-gap scales. Tymoczko
adds two 3-gap seven-note scales (harmonic minor and harmonic major)
to our list, and we will follow suit here, yielding the following (final) table of
scale-types:
pentatonic       {C, D, E, G, A}                  (2, 2, 3, 2, 3)
hexatonic        {C, C♯, E, F, G♯, A}             (1, 3, 1, 3, 1, 3)
whole-tone       {C, D, E, F♯, G♯, A♯}            (2, 2, 2, 2, 2, 2)
diatonic         {C, D, E, F, G, A, B}            (2, 2, 1, 2, 2, 2, 1)
acoustic         {C, D, E, F♯, G, A, B♭}          (2, 2, 2, 1, 2, 1, 2)
harmonic minor   {C, D, E♭, F, G, A♭, B}          (2, 1, 2, 2, 1, 3, 1)
harmonic major   {C, D, E, F, G, A♭, B}           (2, 2, 1, 2, 1, 3, 1)
octatonic        {C, C♯, D♯, E, F♯, G, A, B♭}     (1, 2, 1, 2, 1, 2, 1, 2)
As observed by Tymoczko, this collection of scales is “tonally complete” in the
following sense: any chord X which does not contain a chromatic cluster (three
or more consecutive pitches separated by half step) is contained within one of
these scales.
Claude Debussy, Préludes I, “Voiles”
Igor Stravinsky, Petroushka, II. Chez Petroushka
Olivier Messiaen, Vingt Regards sur l’Enfant-Jésus, I. Regard du Père
8.3 Scalar intervals, transpositions and inversions
Tymoczko likes to think of a scale as a ruler that measures pitch-class space in
a particular way, in terms of scalar steps.
Definition 31. Given a scale X = {P1 , P2 , . . . , Pr }, where we assume the Pi
are listed in clockwise order, we say an interval of the form {Pi , Pi+k } has a
scalar length of k (scalar) steps.
Following the interval naming conventions of the diatonic scale, we call
{Pi , Pi+1 } a (scalar) second, {Pi , Pi+2 } a (scalar) third, etc.
Example 8.4. Let X = Oct0,1 = {0, 1, 3, 4, 6, 7, 9, 10}. Then {0, 4} is an octatonic fourth, since 4 is three scalar steps up from 0. Similarly, {1, 7} is an
octatonic fifth.
Let X = {0, 2, 4, 7, 9}, the C pentatonic scale. Then X has two different
kinds of pentatonic seconds: those of chromatic length 2 ({0, 2}, {2, 4}, {7, 9}),
and those of chromatic length 3 ({4, 7}, {9, 0}).
Scalar transposition
Definition 32. Given a scale X = {P1 , P2 , . . . , Pr } written in clockwise order,
we define scalar transposition by k steps to be the function stk : X → X
defined as stk (Pi ) = Pi+k , where we must take i + k modulo r for this to make
sense.
Intuitively, the scalar transposition stk shifts each pitch k places “forward”
in the scale.
Example 8.5. Let X = {0, 2, 4, 5, 7, 9, 11}, the C diatonic scale, and consider the scalar transposition st1 that shifts everything up by 1. Then we have
st1 (0) = 2, st1 (2) = 4, st1 (4) = 5, . . . , st1 (9) = 11, st1 (11) = 0.
Note that unlike normal transposition, scalar transpositions move pitches by
a varying amount. They do not preserve the (chromatic) distance between scale
pitches, but they do preserve the scalar distances!
Comment 8.2. Once we know how to define scalar transpositions on the pitches
of a scale, we go on to define scalar transpositions of subsets (chords) and
sequences (modes, melodies) in the usual way.
For example, let X = {0, 2, 4, 5, 7, 9, 11} again, and consider the scalar
melody “Do a deer”: (0, 2, 4). Transposing this up by 1 scalar step yields the
new melody (2, 4, 5), which is “Re a drop (of golden sun)”.
The example is Tymoczko’s, and his point is that though chromatically
speaking the two sequences are different (W-W, versus W-H), when measured
by the C diatonic scale they are somehow the same: namely, both melodies
simply ascend two scale steps.
Scalar inversion
We can also define a scalar version of inversion. Fix a scale X = {P1, P2, . . . , Pr},
written as usual in clockwise order. To invert around a scalar pitch Pj, we take
any pitch that is k scalar steps above Pj and send it to the pitch that is k
scalar steps below: that is, we want a map that sends
Pj+k ↦ Pj−k.
Writing i = j + k, so that k = i − j, it is easy to see that the map
Pi ↦ Pi−2(i−j) = P−i+2j
does the trick. As with scalar transposition, we must compute −i + 2j modulo
r for this to make sense.
Definition 33. Given a scale X = {P1 , P2 , . . . , Pr } written in clockwise order,
and choice of pitch Pj in the scale, we define scalar inversion with respect
to Pj to be the function sij : X → X defined as sij (Pi ) = P−i+2j , where we
must take −i + 2j modulo r for this to make sense.
Example 8.6. Return to our example from The Art of Fugue.
[Musical example: Subject and Inversion]
Recall that the inversion is not a strict chromatic inversion of the theme. Can
we express this operation in terms of scalar operations?
Yes, but to do so, we need to use the harmonic minor scale on D:
X = {D, E, F, G, A, B♭, C♯} = {P1, P2, . . . , P7}.
Now to get the inverted form from the subject, first transpose up by 3 scale
steps (using st3 ) to make the A a D, then invert with respect to D (using si1 ).
The corresponding scalar operation is then
si1 ◦ st3 (Pi ) = si1 (Pi+3 ) = P−(i+3)+2 = P−i−1 = P−i+6 = si3 (Pi )!
Let’s check that this operation exactly maps the subject onto the inverted form:
P:               P5  P1  P2  P3  P4  P5  P6  P5  P4  P3
si1 ◦ st3 (P):   P1  P5  P4  P3  P2  P1  P7  P1  P2  P3
8.4 Maximally even scales
What makes the diatonic scale so special? We have seen already that it is
rich in intervallic content, as evidenced by its interval vector 254361. This is
also apparent in the following property: each scalar interval of the diatonic scale
comes in two chromatic flavors (m2/M2, m3/M3, etc.). A scale satisfying this
property is called maximally even.
Definition 34. Given a scale X = {P1, P2, . . . , Pr} written in clockwise order,
we say X is maximally even if for every 1 ≤ k ≤ r − 1, the scalar intervals of
size k are either all of the same chromatic length, or else come in exactly two
consecutive chromatic lengths: that is, one of size ℓ half steps, the other of size
ℓ + 1 half steps.
Example 8.7. Take the C pentatonic scale X = {0, 2, 4, 7, 9} = {P1, P2, P3, P4, P5}.
We investigate the different chromatic flavors of each scalar interval of size k,
1 ≤ k ≤ 4.
Scalar size k   Chromatic sizes
1               2, 3
2               4, 5
3               7, 8
4               9, 10
This shows the C pentatonic is maximally even, and hence that all pentatonic
scales are maximally even.
Example 8.8. Take
Oct0,1 = {0, 1, 3, 4, 6, 7, 9, 10} = {P1 , P2 , P3 , P4 , P5 , P6 , P7 , P8 }.
We investigate the different chromatic flavors of each scalar interval of size k,
1 ≤ k ≤ 7.
Scalar size k   Chromatic sizes
1               1, 2
2               3
3               4, 5
4               6
5               7, 8
6               9
7               10, 11
This shows Oct0,1 is also maximally even, and thus the same is true for all
octatonic scales.
Example 8.9. Take the hexatonic scale X = {0, 1, 4, 5, 8, 9} = {P1 , P2 , P3 , P4 , P5 , P6 }.
We investigate the different chromatic flavors of each scalar interval of size k,
1 ≤ k ≤ 5.
Scalar size k   Chromatic sizes
1               1, 3
2               4
3               5, 7
4               8
5               9, 11
This shows the hexatonic scales are not maximally even: the scalar seconds, for
example, come in two chromatic lengths, 1 and 3, which are not consecutive.
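The tables in the last three examples can be generated automatically, which also gives a direct test of Definition 34. The Python sketch below is illustrative; the names `scalar_interval_sizes` and `is_maximally_even` are assumptions of the sketch, and scales are given as sets of integers mod 12.

```python
def scalar_interval_sizes(scale, k):
    """Set of chromatic lengths taken by the scalar intervals of size k."""
    s = sorted(scale)
    r = len(s)
    return {(s[(i + k) % r] - s[i]) % 12 for i in range(r)}


def is_maximally_even(scale):
    """Definition 34: each scalar size has one chromatic length, or two consecutive ones."""
    for k in range(1, len(scale)):
        sizes = scalar_interval_sizes(scale, k)
        if len(sizes) > 2 or max(sizes) - min(sizes) > 1:
            return False
    return True


pentatonic = {0, 2, 4, 7, 9}
hexatonic = {0, 1, 4, 5, 8, 9}

for k in range(1, 5):
    print(k, sorted(scalar_interval_sizes(pentatonic, k)))
print(is_maximally_even(pentatonic), is_maximally_even(hexatonic))   # True False
```

Running the same loop on the octatonic scale {0, 1, 3, 4, 6, 7, 9, 10} with k from 1 to 7 reproduces the table of Example 8.8.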
Why does this property about the relation of scalar intervals to chromatic
ones deserve to be called maximally even? One would have guessed that a scale
containing n distinct pitches should be called maximally even if the pitches are
as evenly distributed around the pitch-class circle as possible.
Put another way, for any fixed n, we can always pick n pitches that divide
the circle evenly into n segments. However, when n ∤ 12, these pitches will
not be equal-tempered! A maximally even collection should be the collection of
pitches that are the best equal-tempered approximation of this perfectly even
distribution.
Miraculously, it turns out that our definition of maximally even is equivalent
to this!
Theorem. Fix n. Let Xn = {0, 12/n, 2(12/n), . . . , (n − 1)(12/n)} be the collection
of pitches that divides the circle up into n equal segments. (If n ∤ 12, then some
pitches of Xn will not be equal-tempered.) Let Yn be the equal-tempered collection
you get by taking the integer points closest to each of the pitches i(12/n) in Xn .
Then Yn is maximally even. Furthermore, an equal-tempered n-pitch scale
X = {P1 , P2 , . . . , Pn } is maximally even if and only if it is a transposition of
Yn .
Thus for each n there is a unique maximally even scale-type!
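The theorem can be tested by brute force for small n. The sketch below is only an illustration under stated assumptions: the names rounded_even_scale and transpositions are ours, and when a pitch i(12/n) lies exactly halfway between two integers (as happens for n = 8) we round down, a tie-breaking choice the theorem statement does not specify.

```python
from fractions import Fraction
from itertools import combinations
import math

def is_maximally_even(scale, modulus=12):
    """Each scalar interval size has one chromatic length or two consecutive ones."""
    pcs = sorted({p % modulus for p in scale})
    r = len(pcs)
    for k in range(1, r):
        sizes = {(pcs[(i + k) % r] - pcs[i]) % modulus for i in range(r)}
        if max(sizes) - min(sizes) > 1:
            return False
    return True

def rounded_even_scale(n):
    """Y_n: nearest integer to each i*(12/n); exact ties round down (our assumption)."""
    return [math.ceil(Fraction(12 * i, n) - Fraction(1, 2)) for i in range(n)]

def transpositions(scale):
    """All twelve transpositions of a scale, as sets of pitch classes."""
    return {frozenset((p + t) % 12 for p in scale) for t in range(12)}

for n in (5, 7, 8):
    y = rounded_even_scale(n)
    # Every n-note subset of the twelve pitch classes that passes the definition.
    found = {frozenset(c) for c in combinations(range(12), n) if is_maximally_even(c)}
    print(n, y, is_maximally_even(y), found == transpositions(y))
```

For n = 5, 7 and 8 the rounded collections are a pentatonic, a diatonic and an octatonic scale respectively, and the exhaustive search finds no maximally even scales other than their transpositions.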
[Figure: Maximally even: n = 5]
[Figure: Maximally even: n = 7]
[Figure: Maximally even: n = 8]
Wrap-up
We have so far developed some fairly sophisticated techniques for understanding
chords and scales (and to a lesser extent melodies), as well as some pervasive
musical operations (transposition and inversion) that are applied to these chords
and scales (and melodies).
These tools are good at giving us a static understanding of what’s going on,
say, in a particular measure, but do not yet capture the evolving or unfolding
nature of a piece of music.
For example, one of the most common (simplified) ways of describing a piece
of music is as a chord progression, that is, as a sequence of chords
(X1 , X2 , X3 , . . . ).
We described The Who song “I can’t explain”, for example, as the sequence of
chords
({E, G♯, B}, {D, F♯, A}, {A, C♯, E}, {E, G♯, B}) .
The question then arises: what is the logic behind chord progressions? Why or
how do composers decide to move from one chord to another? Can we make
sense of the notion of two chords being “close” to one another?
Such questions fall under the rubric of voice leading, and I begin our wrap-up
by giving an informal preview (or prelude, if you will) to some mathematical
approaches to voice leading issues.
Introduction to voice leading
We begin by giving a new model for chords: namely, we will now think of
a chord as a sequence of pitches X = (P1 , P2 , P3 , . . . , Pn ). Here order matters,
and we often think of the pitches Pi in the chord X as belonging to a particular
voice, which we will call the i-th voice.
Now for example the chord X1 = (0, 7) is different from the chord X2 =
(7, 0), as in the first chord the first voice plays a C, while in the second chord
the first voice plays a G.
Next we define the space of n-chords to be the set of all such chords, that is,
{(P1 , P2 , . . . , Pn ) : Pi ∈ R} =: R^n.
The set R^n of all n-tuples is called Euclidean n-space and comes equipped with a
natural distance function: given X = (P1 , P2 , . . . , Pn ) and Y = (Q1 , Q2 , . . . , Qn ),
we define
d(X, Y) = \sqrt{(P_1 - Q_1)^2 + (P_2 - Q_2)^2 + \cdots + (P_n - Q_n)^2}.
We can use this distance function to quantify how close two chords are to one
another. What does the space of all dyads (2-chords) look like?
As a set this is just R^2 = {(x, y) : x, y ∈ R}, otherwise known as the xy-plane.
We can represent a given dyad X = (x, y) as a plotted point in this plane, and
given two different dyads X1 = (x1 , y1 ) and X2 = (x2 , y2 ), the distance function
d(X1 , X2 ) is none other than the distance between their corresponding points
in the plane.
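To make the distance function concrete, here is a minimal Python sketch. The function name chord_distance is ours; chords are assumed to be equal-length sequences of pitches, with C = 0 and one unit per half step.

```python
import math

def chord_distance(x, y):
    """Euclidean distance between two chords given as equal-length pitch sequences."""
    if len(x) != len(y):
        raise ValueError("chords must have the same number of voices")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))

# The two dyads (0, 7) and (7, 0): both voices move by 7 half steps.
print(chord_distance((0, 7), (7, 0)))

# C major to C minor with the middle voice moving a half step.
print(chord_distance((0, 4, 7), (0, 3, 7)))
```

The first distance is the square root of 98, so as ordered chords (0, 7) and (7, 0) are far apart even though they contain the same two notes; the second distance is exactly 1.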
At this point your instructor will draw some pictures on the board. Please
be patient.
In A Geometry of Music, in order to return chords to unordered collections,
Tymoczko further quotients the space of dyads (with octave-equivalent pitches already identified) by
the relation (x, y) ∼ (y, x). What kind of space do we get when we do this?
A Möbius band!
In case you don’t like that realization of the Möbius band, here is how
Tymoczko does it in his book. In the image below he has rotated the entire
xy-plane by 45 degrees.
So already for n = 2 we see that the space of chords is geometrically interesting.
Depending on whether you consider a dyad as an ordered pair of pitches, as an
ordered pair of pitch classes, or as an unordered (multi)set of pitch classes (take
your pick!), you get respectively
(a) R^2 ,
(b) R^2 / ∼oct =: T^2 , the 2-torus, or
(c) T^2 /(a, b) ∼ (b, a), the Möbius band.
For n = 3, 4, . . . we get even more interesting spaces representing trichords,
tetrachords, etc. The descriptions for general n are similar to the n = 2 case;
the space of n-chords is represented either as R^n (ordered pitches), T^n (ordered
pitch classes), or the quotient T^n /S_n (unordered multisets of pitch classes). As
exotic as the various spaces modeling n-chords may seem, they are the result of
equivalence relations coming straight from musical notions. Furthermore these
spaces are all grounded with a very intuitive concept of the distance between two
chords, which states in a quantitative way that two chords X = (P1 , P2 , . . . , Pn )
and Y = (Q1 , Q2 , . . . , Qn ) are close to one another exactly when each note Pi in
X is close to the corresponding Qi in Y . Even more intuitively, the two chords
are close if when playing X and then Y on the piano, your fingers don’t have
to move very far!
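The effect of the two equivalence relations on distance can be sketched in Python. This is an illustration under our own assumptions rather than Tymoczko's construction: we take the distance in each quotient space to be the smallest Euclidean distance over all octave shifts (and, for unordered chords, over all matchings of notes), and the function names are hypothetical.

```python
from itertools import permutations
import math

def circle_distance(p, q):
    """Distance between two pitches once octaves are identified (never more than 6)."""
    d = abs(p - q) % 12
    return min(d, 12 - d)

def torus_distance(x, y):
    """Distance between ordered chords of pitch classes (points of T^n)."""
    return math.sqrt(sum(circle_distance(p, q) ** 2 for p, q in zip(x, y)))

def unordered_distance(x, y):
    """Distance between unordered chords: best matching of notes of x with notes of y."""
    return min(torus_distance(x, perm) for perm in permutations(y))

print(torus_distance((0, 7), (12, 19)))         # same pitch classes, an octave apart
print(torus_distance((0, 7), (7, 0)))           # voices swapped: still distinct in T^2
print(unordered_distance((0, 7), (7, 0)))       # identified once order is forgotten
print(unordered_distance((0, 4, 7), (9, 0, 4))) # C major to A minor: only G moves, to A
```

Trying every permutation is only practical for small chords, which is all we need here.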
These spaces also finally afford us a first, elegant mathematical model of
a piece of music. The spaces themselves do not represent a particular piece
of music. Rather, we think of a piece of music as describing a particular path
through one of these spaces.
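As a small illustration of the path idea, the sketch below treats the “I can’t explain” progression as four points in the space of unordered trichords and measures the length of each step. The pitch-class numbering (C = 0), the function name step_length, and the choice to measure each step by the smallest total motion over all matchings of notes are our assumptions.

```python
from itertools import permutations
import math

# The progression E major, D major, A major, E major as pitch classes (C = 0).
E_MAJOR, D_MAJOR, A_MAJOR = (4, 8, 11), (2, 6, 9), (9, 1, 4)
progression = [E_MAJOR, D_MAJOR, A_MAJOR, E_MAJOR]

def step_length(x, y):
    """Smallest Euclidean motion from x to y, octaves identified, over all note matchings."""
    def circ(p, q):
        d = abs(p - q) % 12
        return min(d, 12 - d)
    return min(
        math.sqrt(sum(circ(p, q) ** 2 for p, q in zip(x, perm)))
        for perm in permutations(y)
    )

# One length per consecutive pair of chords, then the total length of the path.
steps = [step_length(a, b) for a, b in zip(progression, progression[1:])]
print(steps, sum(steps))
```

The first step (every note moves down a whole step) is longer than the other two, in which one note is held and the others move by a half step and a whole step.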
We illustrate this idea with two examples taken from A Geometry of Music:
Chopin’s E Minor Prelude, Op. 28, No. 4, and the prelude to Richard Wagner’s
opera Tristan und Isolde. A fitting conclusion to this little prelude on voice
leading.
Chopin’s E Minor Prelude, Op. 28, No. 4
[Figure: Tymoczko’s reduction of the opening of the piece, from A Geometry of Music.]
[Figure: Tymoczko’s geometric representation of the region of 4-chord space traversed during each of the cycles in the opening.]
Prelude from Wagner’s Tristan und Isolde
[Figure: Tymoczko’s geometric representation of the region of 4-chord space traversed during the prelude.]
Conclusions
Recall our original outline for the course.
1. Ontological. Musical objects are very much like mathematical objects. We
will describe and define the main musical parameters (melody, rhythm,
harmony, timbre) in mathematical language (sets, sequences, topological
spaces, groups).
2. Methodological. Mathematical thought, operations and objects are frequently employed both in the analysis and composition of music. We will
look closely at examples of mathematical methods in both of these areas
of musical practice.
3. Epistemological. Music often bears a strong logical quality. We speak of
understanding a piece of music, of one passage of music following from
another passage. Can these activities be compared to understanding or
following mathematical arguments? We will explore these connections
with the aid of formal logic.
Ontological approach
I would argue that mathematics is not being used simply to model aspects
of music, but rather, more directly, that music is in fact largely made up of
mathematical objects (sets, sequences, rigid motions, paths, etc.), and further
that a large part of music involves investigating relationships between these
various objects.
As Tymoczko says, describing the Prelude from Tristan und Isolde and the
pervasive use therein of the half-diminished seventh chord (or Tristan chord),
“the music is in some sense ‘about’ the various ways of resolving the chord.”
Similarly, we don’t just use the concept of inversion to understand Bartók’s
Mikrokosmos 141 (“Subject and Reflection”); rather the piece (and the acoustic
scale it articulates) is built from this operation, and is, as Tymoczko would have
it, somehow a piece about this operation.
As such a composer is directly engaged with a vast universe of what are
usually called combinatorial objects in mathematics: finite or discrete sets with
varying degrees of additional structure defined on them.
(Tymoczko endeavors to embed this discrete universe into an even bigger,
continuous one, and in so doing brings music out of the combinatorial realm
and into a geometric one.)
Stravinsky himself seems to have shared this (combinatorial) view of things.
“As for myself, I experience a sort of terror when, at the moment of setting
to work and finding myself before the infinitude of possibilities that present
themselves, I have the feeling that everything is permissible to me.”
(From Poetics of Music)
“Mathematicians will undoubtedly think this all very naive, and rightly so, but
I consider that any inquiry, naive or not, is of value if only because it must lead
to larger questions–in fact to the eventual mathematical formulation of music
theory, and to, at long last, an empirical study of musical facts–and I mean the
facts of the art of combination which is composition.”
(From Expositions and Developments)
Methodological approach
Our ontological approach already provides a strong argument for why music,
as compared to other arts, enjoys an especially close relationship to mathematics, but we should not be content with this.
After all, just as with a Bach fugue or a Bartók Mikrokosmos, we can also create
interesting wall paper designs using rigid motions (rotations, translations, reflections, etc.), but this fact alone is not enough to justify a semester-long course
on the connection between wall paper design and mathematics.
A first reply to this objection is that whereas the connection between wall
paper design and mathematics essentially starts and ends with rigid motions,
there are many more mathematical methods used in the design of music: algorithmic methods in process music, probabilistic methods, stochastic methods,
permutation operations in 12-tone and serial music, etc. (A sequel
to this course would take a close look at examples of all these methods.)
Epistemological approach
More importantly, continuing our rebuttal of the wall paper objection, there
is potentially no end to the number of mathematical methods that might be
employed in the service of music. Why?
Consider again our starting definitions of the two subjects:
Music is the art of structured sound.
Mathematics is the science of abstract structure.
This course was intended to give you a better sense of what is meant here by
structure, if nothing else by showing you some examples in both music and mathematics. I hope that in the process it has persuaded you that the “structured
sound” that is the object of musical art and the “abstract structure” that is the
object of the science of mathematics are in large part the same thing!
Furthermore, in looking at these examples we begin to see that the art
of music and the science of mathematics are themselves alike in nature and
purpose. Both can be seen as explorations or investigations of this universe
of abstract structure. In this light the composer, like the mathematician, is
on the hunt for new musical structures, or seeks to articulate previously unrecognized
properties of old musical structures.
Of course, the properties that make a structure interesting musically speaking are not necessarily the same as those making it interesting mathematically
speaking, but this is a topic for another time! Instead, let us end with one last
quote from Stravinsky, who seems to be in agreement with us once again.
“I have recently come across two sentences from the mathematician Marston
Morse which express the ‘likeness’ of music and mathematics far better than
I could have expressed it. Mr. Morse is only concerned with mathematics, of
course, but his sentences apply to the art of musical composition more precisely
than any statement I have seen by a musician: ‘Mathematics are the result
of mysterious powers which no one understands, and in which the unconscious
recognition of beauty must play an important part. Out of an infinity of designs
a mathematician chooses one pattern for beauty’s sake and pulls it down to
earth.’ ”
(From Expositions and Developments)