A Phylogenetic Analysis of the Caminalcules

Transcription

A Phylogenetic Analysis of the Caminalcules
Society of Systematic Biologists
A Phylogenetic Analysis of the Caminalcules. I. The Data Base
Author(s): Robert R. Sokal
Reviewed work(s):
Source: Systematic Zoology, Vol. 32, No. 2 (Jun., 1983), pp. 159-184
Published by: Taylor & Francis, Ltd. for the Society of Systematic Biologists
Stable URL: http://www.jstor.org/stable/2413279 .
Accessed: 02/04/2012 22:07
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at .
http://www.jstor.org/page/info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms
of scholarship. For more information about JSTOR, please contact support@jstor.org.
Taylor & Francis, Ltd. and Society of Systematic Biologists are collaborating with JSTOR to digitize, preserve
and extend access to Systematic Zoology.
http://www.jstor.org
Syst.Zool., 32(2):159-184, 1983
A PHYLOGENETIC
ANALYSIS OF THE CAMINALCULES.
L. THE DATA BASE
ROBERT R. SOKAL
State University
ofNew Yorkat StonyBrook,
DepartmentofEcologyand Evolution,
StonyBrook,New York11794
Abstract.-The Caminalcules are a group of "organisms" generated artificiallyaccording to
principles believed to resemble those operating in real organisms. A reanalysis of an earlier
data matrixof the Caminalcules revealed some inconsistenciesand errorswhich necessitated
recoding of some characters.The resultingdifferenceswith earlierresultsare minor.The images
of all 77 Caminalcules are featured,those of the 48 fossilspecies forthe firsttime.The characters
of the Caminalcules are defined and a data matrixis furnishedforall Recent and fossilspecies.
A new phenetic standard is proposed for the Caminalcules which divides them into five
"genera." The true cladogram is revealed for the firsttime. Recent Caminalcules have evolved
over 19 time periods. Five branches correspond to the phenetic genera but originateat greatly
differingtime periods. Four lines terminatein fossils.
A series of measures forquantifyingevolutionarychange is defined,including measures for
homoplasy,parallelism,and reversal.A survey is made of these measures and of other statistics
of relevance to systematicsfor 19 data sets fromthe numerical taxonomic literature.The Caminalcules turn out to be compatible to data sets on real organisms with respect to all these
measures,as well as with respect to evolutionaryrates and species longevities. Thus, questions
raised by an analysis of the Caminalcules should be of interestto systematistsconcerned with
the analysis of data sets on real organisms. [Phenetic classifications;cladistic classifications;
estimatedcladograms; homoplasy; Wagner trees; Caminalcules; numerical taxonomy.]
This paper,and othersto follow,takesadvantage of the opportunityaffordedby a
group of artificialorganismswith a known
phylogeny,the Caminalcules,to throwlight
on some of the questions concerningprinciples and proceduresthatcurrentlyengage
theattentionoftaxonomists.
Thereis considerabledisagreementon the relativemeritsof
pheneticand cladisticclassifications
(Sneath
and Sokal, 1973;Eldredgeand Cracraft,
1980;
Wiley, 1981). Some workers contend that
classifications
based on phylogeneticprinciples are empiricallybetterby variouscriteria
of optimality(Farris,1977, 1979a,b; Mickevich,1978a,1980;Schuh and Polhemus,1980;
Schuh and Farris,1981). These claims have
been questionedby otherswho findthe evidence and methodologypresented to be
flawed(Colless, 1980;Rohlfand Sokal, 1980,
1981; Sokal and Rohlf,1981a; Rohlf et al.,
1983a,b). The empiricalstudies,and the argumentspro and con phylogeneticclassificationsderivedin such investigations,
suffer
froma major impediment.All of the phylogeneticclassificationsreportedin the literatureare only estimatesof the true phylogeny, which is unknown for all real
organisms.By contrastthe variousresultsof
this study merit serious attentionbecause
they can be examined against the benchmarkof the truephylogenyof the group.
The Caminalculesare artifactscreatedby
the late ProfessorJosephH. Camin of the
Universityof Kansas and in effectrepresent
a single simulationof the evolutionaryprocess by rules that have not been made explicit.However,readerswill findthatthese
organisms,which have the advantage over
othersimulationsin presentinga visual record to the investigator,
illustratea varietyof
evolutionaryphenomena and are therefore
of considerable pedagogical and heuristic
value. The relevanceof this data set to currentlyactiveissues in systematics
will readily becomeevidentto thereaderofthisseries.
I shall show thatwithrespectto a substantial
arrayof measurableproperties,the Caminalcules are well withinthe range of empirically observed values for real taxonomic
groupsand that,conversely,forno property
of consequence in numericaltaxonomyare
the Caminalcules beyond the range of observedvalues in real organisms.
At the suggestionof the Editorand some
159
160
SYSTEMATIC ZOOLOGY
reviewers,this series of publicationsis initiatedin thispaper with the presentationof
the data on which previousand succeeding
studies have been based. I furnishthe images of the previouslypublished 29 Recent
species and forthe firsttime the 48 "fossil"
species. I also presentforthe firsttime the
truephylogenyof the Caminalculesas generatedby ProfessorCamin. WiththeseillustrationsI providea listof the descriptionsof
charactersas adopted in my laboratoryas
well as a data matrixgiving the character
statesforall 106charactersforeach ofthe 77
Recentand fossilCaminalcules.In addition
to presentinga new standardpheneticclassification,the paper describesa number of
measures for taxonomic and evolutionary
propertiesand compares the Caminalcules
withdata setson real organismswithrespect
to thesemeasures.Subsequentstudiesin this
serieswill treatestimatesof the true cladogram,theinclusionoffossilsin pheneticand
congruenceand charcladisticclassifications,
acterstability,and OTU stability.
VOL. 32
Ehrlichof StanfordUniversityand Dr. W.
Wayne Moss of the Philadelphia Academy
of Sciencesin additionto myself.The originals drawn on the ditto mastersappear to
have been lost followingthe death of ProfessorCamin in 1979.
All examinationsof the Caminalculesfor
numericaltaxonomicstudieshave been carried out on the xeroxesof the images. Illustrationsof all 29 Recent OTUs have been
published three times previously (Sokal,
1966;Rohlfand Sokal, 1967;Sokal and Rohlf,
1980). For this purpose,inked copies of the
xeroxed images were photographed. Althoughthe artist'scopies of the originalxeroxesare quite faithful,
inevitablysome fine
detailhas been alteredor lost.Thus,notevery
characterstate describedbelow can be unequivocallyrecognizedin the featuredillustrations.The versionof the 29 RecentOTUs
published in Sokal (1966) was "beautified"
by thepublisher'sartistand cannotbe relied
upon fordetail.The images of the 48 fossils
were newly inked forthisstudyand all difcharacteristics
can be observed
ferentiating
ORIGIN OF THE DATA BASE
in them.
The RecentOTUs, numbered1 to 29, are
The originalintentionin generatingthe
Caminalcules was to study the nature of shown in Figure 1. The fossilOTUs, given
code names by Camin, were rantaxonomicjudgment(eventuallypublished different
by Sokal and Rohlf,1980), but work with domly assigned numbers 30 to 77 by me.
theseanimalshas led to otherdevelopments They are shown in Figure2.
The truecladogramofthegroupwas comin numericaltaxonomicmethodology,such
as an early method of numericalcladistics municatedto me by Camin in 1970.But al(Camin and Sokal, 1965) and a method for though this informationwas employed in
obtaining taxonomic structureby random the computationsleading to the analysis of
and systematic
scanningofbiologicalimages taxonomicjudgment(Sokal and Rohlf,1980),
(Sokal and Rohlf, 1966; Rohlf and Sokal, access to it was restrictedeven forworkers
1967). Other experiments on taxonomic on this project.I did not become intimately
treeuntil1981
judgmenthave also been based on the Cam- familiarwiththephylogenetic
inalcules(Moss, 1971;Sokal, 1974;Moss and during the final analyses leading to this
manuscript.
Hansell, 1980).
Readers of this paper and of subsequent
Camin drew the Caminalculesusing master stencilsfor dittomachines.The genetic ones in thisseriesshould note thatI use the
in the sense in which I origcontinuityofthe Caminalculeswas achieved -termcladogram
by Camin by tracingsuccessivedrawingsof inallycoined it (Camin and Sokal, 1965;also
the animals,permittingthe preservationof independentlycoined with the same meanall charactersexceptfor such modifications ing by Mayr, 1965). One definitionof this
as were desired.Xeroxcopies of the images meaning(Sneathand Sokal, 1973:29)is as "A
of the RecentOTUs were made available in branching... networkof ancestor-descentheearly1960s,thoseofthefossilOTUs some dant relationships."This definitiondiffers
years later.Independentxeroxcopies of all fromthe several meanings attachedto the
OTUs are in the possession of Dr. Paul A. termcladogramby various cladists(e.g., El-
1983
161
CAMINALCULES: DATA BASE
o
u'
U)
in
C
0'~~~~~~~~~0
00.
04~~~~~~~~~~~~4
U.
U
0
0
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~~~~~~1
0
0
~~~~~~~~~~~~~~~~~~u
0
U)
04 .4
0
V
*0~~~~~~~~~~~~~~~
~~~~~~~~~~*
~ ~ ~
04
*0
~ ~ ~ ~ ~
K
IEI
_~~
~~~~~~~4
co
~~~~
~~~~~~~~~~~~~~~
C.4
162
SYSTEMATIC ZOOLOGY
VOL. 32
-
.o
c, o
g .4
CNJ~
~ ~ ~~ ~ ~ ~ ~ ~
%6oo
-~~~~~~~~~~~~
0e
00~~~~~~
0%
~~~~~~~~~~~~~
*~~~~~~~~~~~~~~~~~~~,
o
cli~~~~~~~~~~~~~~~~~~~~~~~~~b
r)
ODs
6**
" 0
C)
-U,
163
CAMINALCULES: DATA BASE
1983
Nl~
~
~
'
N
NA~~~~~~
*
00
'~~~~~~~~~~~~~~~~~~~~~~~~~0~~~~0~~~~~~~~~~~
rl9j?~
I')~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~4
~ ~
164
SYSTEMATIC ZOOLOGY
dredgeand Cracraft,1980;Nelson and Platnick,1981;Wiley,1981;see also Sneath,1982).
Thus,cladogramas used in thisand succeeding papers refersto a branchingsequence
depictingthe actualor hypothesizedgenealogy of the OTUs. It does not include length
ofbranchesofthetreeand is not a statement
about the evolutionor patternof character
states akin to "nested synapomorphy
schemes."
Camin did not keep a writtenrecordof
characterchanges. In the 1960s,A. J.Boyce
and I. Huber prepared a data matrixfor a
numerical'pheneticstudy of the 29 Recent
OTUs, describing86 characters.This data
matrixwas the basis of various phenetic
analyses of the Caminalcules (Sokal, 1966;
Sokal and Rohlf,1966,1980;Rohlfand Sokal,
1967). For the presentstudyit became necessaryto map the coded charactersonto the
now known cladogramand duringthisprocess each characterwas re-evaluatedforthe
of the
29 RecentOTUs. This re-examination
imagesled to the discoveryof severalinconsistenciesin the Boyce and Huber data matrix.Altogether13 ofthe2,494entriesin that
29 X 86 matrixseemed to be in errorand
were corrected.Originalcharacter58 was deletedbecause it was discoveredto be invariant. All measurementcharacterswere rewith earlier
measuredand small differences
measurementswere recorded.Because the
images available duringmy currentstudies
are poorer quality xerox copies than those
available to Boyceand Huber,the resolution
of some of the more minutemorphological
and in a fewcases
featureswas moredifficult
characterswere recoded into fewer,coarser
classes.
Remeasurements,
changesin logic,and revision of characterstate coding accounted
for changes in 33 of the 85 characters.To
accountforthe featuresof the fossilOTUs,
which had not been coded previously,
another12 characterswere redefinedand 21
new characterswere added. This augmented
data matrix comprises 77 OTUs and 106
charactersof which 65 are binary,31 are ordered integercharacters,and 10 are measurementcharacters.Fromthismatrixan 85
characterX 29 OTU subsetwas extractedfor
the RecentOTUs. This matrixhas 48 binary
VOL. 32
characters,
27 orderedintegercharactersand
10 measurementcharacters.Both the 106
characterX 77 OTU matrixand the 85 characterX 29 OTU matrix contain characters
withNCs (no comparisoncodes). All NCs in
thisstudyare logical(i.e.,presenceofa given
stateforcharacterh makingit impossibleto
definea stateforcharacteri) ratherthandue
to missinginformation.
In the largermatrix,
79 ofthe 106 characterscontainNC codes; in
thesmaller,59 ofthe85 characters
have NCs.
The measurementcharactersare available in
two versions.One is as directmeasurements
in millimeters,
the otheris coded as ordered
integerstates but with gaps omitted.The
forone characterwas
rangeof measurements
divided into 10 equally wide classes coded 0
to 9. Each characterstatewas then assigned
to one of these classes, but classes lacking
observed characterstates were ultimately
omitted.Thus, if therewere no OTUs with
measurements
in class3, thesubsequentclass
4 was renumbered3. The numberof states
in the 10 charactersresultingfromthis procedure ranges fromfourto nine. A list definingthe states of each characteris furnished in the Appendix. The actual data
matrix(integercoded version) is shown in
Table 1.
The consequences of recoding the measurementcharactersin integerscale were
minimal.Matrixcorrelationsforthe 29-OTU
study between the resemblance matrices
based on these two methods of character
coding are very high (0.998 and 0.997, respectively,for r and d, the correlationand
taxonomicdistancecoefficients).
The classificationsresultingfrom these two coding
methods are identical. In this and subsequent papers,only the data matrixusing integerscale codingof measurementcharacter
states is featured,since this is the simpler
coding preferredfor the several cladistic
methodsemployed.
The effectsof correctingand recodingthe
originalBoyceand Huber data matrixforthe
be29 OTUs were minor.Matrixcorrelations
tween the two versionsof the resemblance
matricesare high (0.980 for r, 0.969 for d).
UPGMA phenogramsbased on theseresemblancematricesare close;thestrictconsensus
index CIc forthe classifications
based on the
1983
CAMINALCULES:
DATA
BASE
165
two correlationmatricesis 0.889, for those the logical choice as a new standardfor a
based on thetwodistancematrices0.815.This pheneticclassification
ofthe29 RecentOTUs
index (Rohlf,1982) rangesbetween 0 forno to be comparedwiththetruecladogram.This
consensusand 1 forperfectconsensus.It is phenogramalso has the highestcophenetic
identicalto the consensusforkindex of Col- correlationcoefficient
of any pheneticclasless (1980) and the proportionalconsensus sificationof the Caminalcules computed so
index of Sokal and Rohlf(1981a).
far(0.965).
The differences
betweenthe classifications Five major clusters,the "genera" of the
based on correlationsand distancescomput- Caminalcules,can be seen in thephenogram
ed fromthe originalBoyce and Huber data in Figure3. These fallnaturallyintotwo mamatrix(see fig.3 of Rohlfand Sokal, 1967) jor groups,thatcontaininggeneraA, B, and
and thosebased on theupdated,integer-cod- F, and a second groupcontainingC and DE.
ed versionall involvetaxonomicaffinities
at The statusof D, consistingof OTUs 3 and 4,
the "species" level. The "generic"classifica- is problematical.
Only in one previousstudy,
tionis the same forbothversions.When the based on a mechanicalmethodof scanning
new classifications
were coriparedto 49 sub- the images (Rohlfand Sokal, 1967: fig.3A),
jective classificationsof the Caminalcules was D unequivocallyseparatedfromE at a
producedby 22 experimentalsubjects(Sokal level of pheneticsimilarityequal to thatof
and Rohlf,1980),thenew phenogramsofthe the othergenera-A, B, C, and F. The Boyce
Caminalculeswere foundto be no more(and and Huber correlation
phenogram(Rohlfand
no less) similarto these intuitiveclassifica- Sokal, 1967: fig.3C) shows D joining E at a
tionsthan the earlierBoyceand Huber clas- level closerthanthatof at leastone member
sifications.
of A joining its genus. Intuitiveclassifications by various taxonomists(Sokal and
A NUMERICAL PHENETIC CLASSIFICATION
Rohlf,1980)had frequentlyassignedspecies
For pheneticanalyses the data were sub- 3 and 4 to genus E, althoughsome had sinjectedto standardtaxometricproceduresus- gled out the divergenceof these two OTUs
ing the NTSYS systemof numericaltaxo- which are unique in having rudimentary
nomiccomputerprograms(Rohlfet al., 1980). posteriorappendagesas well as an elongated
Characterswere standardizedbefore com- body obliteratingthe neck. Since the pheputationof correlationand taxonomicdis- nogramin Figure3 clearlyaffiliatesD with
tance coefficients
betweenOTUs.
E, it seems appropriateto join the two toRohlfand Sokal (1967) noted thatpheno- getherintoa singlegenuswhichI have called
gramsbased on distancesand correlationsof DE to preservecontinuitywith earlierpubthe Caminalculesdiffersubstantiallyin the lications.
indicated.
specificas well as genericaffinities
THE TRUE CLADOGRAM
Previousworkby Rohlfand Sokal (1967)and
The truecladogramas furnishedby J.H.
by Sokal and Rohlf(1980) showed that the
classification
based on the correlationmatrix Camin is shown in Figure4. The diversity
correspondedmore closely than that based of the taxonwas generatedover 19 timepeon the distance matrix to the taxonomic riods.Lines undergoingevolutionarychange
structurenoted by individual taxonomists are indicatedin Figure4 by slanted(as disgroupingthe Caminalculesby conventional, tinctfromvertical)lines. Species thatunderintuitivemethods.This undoubtedlyoccurs went evolutionarychange during a given
because correlationcoefficients
are moresen- timeperiod are shown as solid circlesat the
sitiveto shape than are distancecoefficientsend of thatperiod.Ancestralspeciesthatare
(Rohlfand Sokal,1965),and itis shape rather not continuedinto the next time period as
than size on which taxonomiststendto base verticallines are consideredto have become
theirjudgmentson taxonomicaffinity.
For extinctand are indicated by small hollow
this reason,the UPGMA clustering(shown circles.Fivebranchesleadingto Recentforms,
in Fig. 3) of the correlationmatrixbased on correspondingto the five phenetic genera,
the updated,integer-codeddata matrixwas can be recognized but these originate at
166
SYSTEMATIC ZOOLOGY
TABLE
1-5
1
2
3
4
5
6-10
11-15
VOL. 32
1. Data matrixof 77 Recent and fossil Caminalcules.a
16-20
21-25
26-30
31-35
36-40
41-45
46-50
51-55
56-60
61-65
66-70
71-75
76-77
11111 11111 11111 11111 11110 11111 11111 11101 11111 11111 11111 11111 11111 11111 11111 11
11111 01110 01111 11111 OlliX 11111 11111 iliXi
11111 11111 11111 11111 11111 11111 11111 11
XXXXX lXXXO OXXXX XXXXX lXXXX XXXXX XXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XX
00110 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00010 00000 00001 00
XX1OX XXXXXXXXXX XXXXX XXXXXXXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXOX XXXXX XXXX1 XX
6
7
8
9
10
10000 12200 00220 10011
Xllll
10111 11112 X1122
XXXXX XXXXX XXXX1 XXxO1
00000 01000 00000 00000
11111 10111 11010 11111
00004 10310 10000 00050 00000 00001 00000 01002 00001 00010 21000 00
11110 2X121 21111 11101 11111 1X112 11111 11111 12111 lliXi
02111 11
XXXXX OXXOX OXXXX XXXXXXXXXX XXXXO XXXXX XXXXX XOXXX XXXXX XOXXX XX
00001 00000 00000 00010 00000 00000 00000 00000 00000 00000 10000 00
11110 11011 11111 11101 11111 11111 11111 11101 10111 11111 00111 11
11
12
13
14
15
20000 OXOOO OOXOX 30000
00000 00000 00000 00011
XXXXX XXXXX XXXXX XXX55
XXXXX XXXXX XXXXX XXXlX
12001 10000 02000 01011
OOOOX O1XOO 00000 OOOXO 00000 01000 00000 OOOXO OXOOO 00010 XXOOO 00
00000 10010 10000 01100 00000 00101 00000 00000 00000 00001 01000 00
XXXXX 5XX3X 5XXXX XO1XX XXXXXXX3X3 XXXXX XXXXX XXXXXXXXX2 X4XXX XX
XXXXX 1XXOX 2XXXX XOOXX XXXXXXXOXO XXXXX XXXXX XXXXXXXXXO XOXXX XX
01001 10011 10010 10110 11000 00101 10201 01000 10000 11011 01000 00
16
17
18
19
20
XlXXX
01000
11111
54444
01001
21
22
23
24
25
01111 OXOOO OlOOX 00100 01100 00001 00001 10000 10100 10010 00101 11000 00011 11000 00001 10
1XXXX 1X221 lX22X 21X22 OXX12 2222X 2213X X2222 X3X33 X22X2 32X2X XX222 321XX XX122 2222X X3
Xliii
XXXXX XlXXX XXlXX xoixx xxxxo xxxxi OxXXX iXiXX iXXOX XXiXO lOXXX Xxxii iOXXX XXXX1 ix
XOioo XXXXX XiXXX XXOXX XOXXX XXXXX XXXXX XXXXX OXXXX OXXXX XXOXX XXXXXXXXOX XXXXXXXXXO XX
X1010 XXXXX XOXXX XXOXX XOXXX XXXXX XXXXX XXXXX OXXXX OXXXX XXiXX XXXXX XXXOX XXXXX XXXX1 XX
26
27
28
29
30
OXXXX OXOOO OXOOX OOxil OXXOO lOOiX lOOOX XO100 XOXOO XOXl OOXOX XXOOO OOOXX XXOO1 OlOOX XO
XXXXX XXXXX XXXXX XXX34 XXXXX 5XX5X 3XXXX XXOXX XXXXX XX2X3 XXXXX XXXXX XXXXX XXXX1 X5XXX XX
XXXXX XXXXX XXXXX XXX23 XXXXX 6XX4X 3XXXX XXOXX XXXXX XX2X2 XXXXX XXXXX XXXXX XXXX1 X5XXX XX
00000 iXOil ilOOX 00000 10000 00000 00100 00000 00000 00000 00000 00000 00000 00000 00000 00
XXXXX 1XXOO 1XXXX XXXXX 2XXXX XXXXX XXNXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XX
31
32
33
34
35
12222 00000
XXXXX 10011
XXXXX X11XX
XXXXX X11XX
XXXXX 2XXOO
36
37
38
39
40
X2012
XlXXO
XiXXX
01001
00000
41
42
43
44
45
00000 11111 10111 00000 10001 00101 01100 11011 00000 00010 01011 01111 01000 01000 10110 00
01111 00000 01000 00100 Ol1OO 00000 00000 00000 10100 10000 00100 10000 00011 10000 00001 10
XXXXX XIXXX XXOXX X21XX XXXXX XXXXXXXXXX lXOXX 1XXXX XXlXX OXXXX XXX1N 2XXXX XXXX1 OX
Xllll
10000 00000 00000 11000 00010 01000 00011 00000 00010 01000 00000 00000 00100 00110 00000 01
2XXXX XXXXX XXXXX 2OXXX XXX2X XIXXX XXX12 XXXXX XXXOX X2XXX XXXXX XXXXX XX2XX XX22X XXXXX XO
46
47
48
49
50
11111 00000 01000 11100 01110 01000 00011 00000 10110 11000 00100 10000 00111
20000 XXXXX XOXXX 12OXX X002X XIXXX XXX03 XXXXX OX02X O1XXX XXOXX OXXXX XX202
2XXXX XXXXX XXXXX O1XXX XXXOX XOXXX XXXX1 XXXXX XXX1X XOXXX XXXXX XXXXX XXlXl
00000 00000 00000 00000 10000 00000 00010 00000 00000 00000 00000 00000 00000
00000 00000 00000 00011 00000 1001,0 10000 00000 00000 00101 00000 00000 00000
51
52
53
54
55
XXXXX XXXXX XXXXX XXX13
XXXXX XXXXX XXXXX XXX10
10000 00000 00000 11000
01221 00000 02000 00200
XX02X XXXXX XOXXX XXlXX
56
57
58
59
60
XX20X XXXXX X3XXX XXlXX X11XX XXXXX XXXXX XXXXX lXlXX 1XXXX XXlXX XXXXX XXXOX 1XXXX XXXXO XX
XiXXO XXXXX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXX XX
XXXXX XXXXX XXXXX XXXO1 XXXXX OXXOX OXXXX XObXX XXXXX XXOXO XXXXX XXXXX XXXXX XXXXO XOXXX XX
XXlOX XXXXX XOXXX XXOXX XOOXX XXXXXXXXXX XXXXX OXOXX OXXXX XXOXX OXXXX XXXOX OXXXX XXXXO XX
22023 XXXXX XlXXX 323XX X133X X2XXX XXXN6 XXXXX 3X44X 33XXX XX2XX 3XXXX XX536 lX32X XXXX3 6N
61
62
63
64
65
XOX1O
X1212
43233
11111
01111
XXXXXXOXXX XXXXXXXXXX XXXXXXXXXX XXXXX XXXXXXXXXX XXOXX XXXXX XXXXXXXXXX XXXXX XX
00000 01000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00
10111 11110 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11
4X034 4400X 56378 53350 75063 71334 21202 32322 35415 20412 23000 20434 44553 06013 32
OXOOO OlOOX 00000 01000 00000 00000 00000 10000 00000 00100 00000 00000 10000 00000 00
02000
lXOOO
XXO11
XX112
lXXXX
11200 02210 02000 00001 00000 20100 21000 00200 00000 00121 20110 00002 10
XXXOO lXXXO OX002OQOlOX 20001 XOXOO XXOOO 0OX02 02000 OOXXX X2XXO OOOOX XO
XXX44 XXXX2 4X04X 46X7X X432X X6X76 XX464 61X6X 7X111 61XXX XXXX3 1524X X7
XXXOO XXXX1 OXOlX 02X2X X221X X2X22 XX121 22X2X 2X221 22XXX XXXX2 1112X X2
XXXXX lXXXX XXXXX XXOXX XXXXN XXXXXXXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XX
XXXXX X2XXX XX2XX X22XX X2XXX XXXXX XXXXX 2XXXX 2XXXX XX2XX XXXXX XXX2X 2XXXX XXXX1 XX
XXXXX XlXXX XXOXX XlOXX XOXXX XXXXX XXXXX OXXXX OXXXX XXiXX XXXXX XXXOX lXXXX XXXXX XX
XXXXX XiXXX XXXXX XOXXX XXXXX XXXXX XXXXX XXXXXXXXXX XXlXX XXXXX XXXXX OXXXX XXXXX XX
22222 21222 10100 21112 01202 02211 22222 10111 11222 02122 12222 02101 12102 20220 11
00000 00000 00001 00000 00000 10000 00000 00000 00000 00000 00000 00000 00000 00000 00
10110
OX22X
XX12X
00000
00000
00001 11
XXXXO 02
XXXXX 1X
00000 00
01000 00
XXXXX lXX4X 2XXXX XXXXX XXXXX XXOX1 XXXXX XXXXX XXXXX XXXXX X4XXX XX
XXXXX 2XX1X 1XXXX XXXXXXXXXX XXlXl XXXXX XXXXX XXXXX XXXXXX2XXX XX
00010 01000 00001 00000 00000 01000 00000 00000 00100 00110 00000 00
02200 00000 00000 00000 20200 20000 00200 20000 00020 20000 00002 00
XOOXX XXXXXXXXXX XXXXX lXOXX OXXXX XXOXX OXXXX XXX3X OXXXX XXXX2 XX
xxxxx XXXXX XXOXX XXOXX XXXXX XXXXX XXXXX OXXXX OXXXX XXOXX OXXXX Xxxix
XXXXX XlXXX XX2XX X02XX XXXXXXXXiX XXXXX 2X2XX 2XXXX XXlXX 2XXXX XXX2X
33333 33332 74310 33363 17313 13324 33333 33333 37131 33323 33333 33533
00000 01000 11100 01110 01000 00011 00000 11111 11000 10100 10000 10111
11111 11111 10111 11110 11111 11111 11111 11111 11111 11111 11111 Nllll
XXXXXXXXX1 XX
OXXXX XXXX2 2X
33542 31332 32
10110 00001 11
11101 11111 11
a Columns are species code numbers. Recent OTUs are
species 1-29. Rows are the 106 charactersdescribed in the Appendix. The first85
charactersare the integer-codeddata for the Recent OTUs. The rest are those necessary fordescribing the fossils. The characterstates,ranging
numericallyfrom0 to 8, are in the body of the table. The following two stateshave special symbols:X-no comparison (NC) code; N-represents
negative one (-1).
1983
167
CAMINALCULES: DATA BASE
TABLE 1.
1-5
6-10
11-15
16-20
21-25
26-30
31-35
Continued.
36-40
41-45
46-50
51-55
56-60
61-65
66-70
71-75
76-77
66
67
68
69
70
XOOO1 00000
XXXX1 xxXXX
10000 33333
Oxxxx xxxxx
XXXXX 00000
71
72
73
74
75
XXXXX XXXXX XXXXXXXX1o XXXXX lXXOX OXXXX XXXXXXXXXX XXXX1 XXXXX XXXXXXXXXX XXXXX XOXXX XX
XXXXX XXXXX XXXXX XXXXO XXXXX Xxxix lXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XOXXX XX
XxXXX xxxxX XXXXX XXXOX XXXXX lXXXX XXXXX XXXXXXXXXX XXXXO XXXXX XXXXX XXXXX XXXXX XXXXXXX
13333 33333 33323 01333 33311 30233 33332 33313 33333 30333 33333 33333 23233 33113 33333 33
OXXXX XXXXX XXXOX XOXXX XXX1o XXOXX XXXXO XXXOX XXXXX XXXXXXXXXX XXXXX OXlXX XXOOX XXXXX XX
76
77
78
79
80
Xliii
XOllO
Xxoxx
XXllX
XX02X
81
82
83
84
85
XlXXl XXOXX XlOXX XXXXXXXXXX XXXXXXXXXX XXXXX lXXXX lXXXX OXlXX XXXXO XXXXO XXXXX XXXXX XX
XOXX1 XXXXX XOXXX XXXXXXXXXX XXXXXXXXXX XXXXX lXXXX lXXXX XXOXX XXXXXXXXXX XXXXX XXXXX XX
XXXXX XX2XX XXOXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX 3XXXX XXXXO XXXX1 XXXXX XXXXX XX
X2222 XXXXX X2XXX XX2XX X20XX XXXXO XXXXO OXXXX 2XOXX 2XXOX XX2XO OOXXX XXX20 1OXXX XXXX2 OX
00330 00000 00000 00056 00000 50040 50000 00100 00000 01304 00000 00000 00020 00002 04003 00
86
87
88
89
90
XXXXX XXXXX XXXXX XXXXX XOXXX XXXX2 XXXXX 1XXXX XXXXX XXXiX XXXX1 X2XXX XXXXX X3XXX XXXXX XX
xxXXX xxXXx XXXXXXXXXX XXXXX XXXXX XXXXX lXXXX XXXXX XXXOX XXXX1 XXXXX XXXXX XXXXX XXXXX XX
Xxxxx xxxxx XXXxX XXXXXXXXXX XXXXX XXXXX OXXXX XXXXX XXXXX XXXX1 XXXXX XXXXX XXXXX XXXXX XX
xxXXx xxxXx xxxxX XXXXXXXXXX XXXXO XXXXX XXXXX XXXXX XXXXXXXXXX XiXXX XXXXX XXXXXXXXXX XX
XOOOO XXXXX XOXXX XXOXX XXOXX XXXXX XXXX2 XXXXX OXOXX OXXXX XXOXX OXXXX XXX02 OXXXX XXXXO 1X
91
92
93
94
95
XXXXX XOOXX XXOOO XXXOO XXXXO OXOOX OiXiX XiiOX XiXii XXiiO lOXiX iXOOO iOXXX XXXXi OOOiX Xi
XXXXX XXXXX XXXXX XXXXXXXXXX XXXXXXOXOX XOOXX XOXOO XXOiX OXXOX OXXXX OXXXX XXXXO XXXOX XO
Xxxxx xxxxx xxxxx XXXXXXXXXX XXXXO XXXXX OXXXX XXXXX XXXXX XXXXO XOXXX XXXXX XiXXX XXXXX XX
XXXXX XXXXX XXXXXXXXXX XXXXX XXXXi XXXXX OXXXX XXXXX XXXXX XXXXi XiXXX XXXXX XOXXX XXXXX XX
xxXXX xxXXx XXXXXXXXXX XXXXX XXXXi XXXXX XXXXX XXXXX XXXXXXXXXO XlXXX XXXXXXXXXX XXXXX XX
96
97
98
99
100
XXXXX XXXXX XXXXX XOXXX XXXXx XXXXX XXXXX XXXXX Xxxix XXXXX XXXXX XXXXXXXXXX XXXXX XXXXX Xi
10000 XXXxXXOXXX iiOXX XOOiX XiXXX XXXOi XXXXX OXOiX OiXXX XXOXX OXXXX XXioi OXiiX XXXXO 10
OXXXX XXXXX XXXXX OOXXX XXXOX XOXXX XXXio XXXXX XXXOX XOXXX XXXXX XXXXX XXOXX XXOOX XXXXX Xi
XXOOO XXXXX XOXXX XXOXX XOOXX XXXXX XXXXXXXXXX OXOXX OXXXX XXOXX iXXXX XXXOX OXXXX XXXXO XX
00000 00000 00000 00000 00000 00000 00000 00000 01000 00000 10000 00000 10000 00000 00000 00
101
102
103
104
105
XXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXXXOXXX XXXXX lXXXX
XOOOX 00000 00000 OXOXX OOXOX XOOXO XOOOO 00010 00000 OOXOX 20000
XXXXX 00000 OXOOO XXXOO OXXXO OXOOO OOOOX 00000 XOXOO XXOOO iOXOO
XOOOO 00000 00000 XXOOO OOOXO OXOOl 00010 00000 01000 OXOOO 10000
xxxxx xxxxx xxxxx xxxxx xxxxx xxxxi xxxox xxxxx xoxxx xxxxx oxxxx
106
XXXXX XXXXX XXXXX XXXXX XXXXX XXXXO XXXXX XXXXX XXXXX XXXXXXXXXX XiXXX XXXXX XXXXX XXXXX XX
00000 OXOll OOlOX 10010 10000 00000 00000 00101 00000 00000 XOOOO OOOXi 01000 00
XXXXX XXXOO XXlXX OXXOX OXXXX XXXXX XXXXXXXOXO XXXXX XXXXXXXXXX XXXX1 XOXXX XX
30333 11133 30013 31333 33331 33333 03233 11333 33033 33333 23101 03113 33330 13
xxxxx Oo1xx xxxoX XOXxX XXXXO XXXXX XXlXX lOXXX XXXXX XXXXX OXOXO XXOOX XXXXX OX
OXOOO Xxxii OXXXO iXO10 lOOOX 00000 XOXOO XXOO1 OOXOO 00000 XOXXX XOXXO OlOOX XO
01100 OliXO XXOll OlOXX lXX1o lOOOX OOOXO 10000 iXOOl 10100 00001 XOXll lOXXO 11001 00
XlOXX XOOXX Xxxii XlXXX lxxix lXXXX XXXXX OXXXX OXXX1 OXOXX XXXXO XXX1o lXXXX llXXl XX
XXXXXXXXXX XXXXXXlXXX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXX XXXXX XXXOX XXXXXXXXXO XX
XOXXX XXXXX XXXOO XlXXX OXXOX OXXXX XXXXXXXXXX XXXXO XXXXX XXXXX Xxxix lXXXX OOXX1 XX
X2XXX XXXXX XXX10 XOXXX 1XXOX OXXXX XXXXX XXXXX XXXX1 XXXXX XXXXX XXXOX 1XXXX 2OXXO XX
greatlydiffering
timeperiods.Thereare also
four lineages that underwentevolutionary
changebeforebecomingextinct.The extinct
terminal species of these fossil lines are
shown as large hollow circles.I have indicated the amount of evolutionarychange(path lengthof the internode)based on 85
characters
by thelengthofthethickenedbars
along the slante,dlines. Note that path
lengthsforlines leading to fossilsare based
on only 85 charactersto make themcomparable to path lengthsof Recentforms.Path
lengthsbased on all 106 characterswill be
illustratedin paper III of this series.
XXXXX 2XXXX XXXXX XXXXX XX
00000 XOOOO OO1XX OXOOO 00
10000 XOXXX XOXXO OOOOX XO
01000 10000 OOXXO 00000 01
xixxx xxxxx xxxxx xxxxx xo
MEASURES OF PHENETIC AND
EVOLUTIONARY CHANGE
To describeevolutionarychanges in this
and succeeding papers, various statistics
based on the charactersand the true cladogram are needed. They are summarizedin
Table 2. Some termsneed to be defined.An
entiretaxon consistsof t OTUs and is describedby n characters.OTUs are labelledas
as 1, ... , hi
1, ... , j, k,.. ., t and characters
if..., n. The mostrecentcommonancestor
ofall oftheOTUs is o. z is summationover
all OTUs indexedby j, usually t OTUs.
168
1.00
0.80
SYSTEMATIC ZOOLOGY
VOL. 32
4 3 22 12 2 5 18 23 16 27 24 17 1 6 10 11 9 21 7 15 8 14 13 28 25 26 19 29 20
X
0.600.40-
0-0.20
-0.40
FIG. 3. Standard phenogram of the 29 Recent Caminalcules based on the updated data set of this study.
It is based on the data matrixwith integer codes substitutedfor the measurementsand was obtained by
and UPGMA
correlationcoefficients
standardizationof charactersfollowedby computationof product-moment
clustering.The ordinate is in correlationcoefficientscale. The phenetic genera are labeled with capital letters.
Let us assume that we know the true
cladogramof the taxon,as is the case in the
Caminalcules.The nodes of the treewill be
the Recent OTUs and their ancestors,and
the internodeswill be the lines of descent
connectingthese taxa. ThroughoutI shall
assume that evolutionary(characterstate)
change occursonly along internodes.Let I11
be the sum of the lengthsin characterstate
changesover all internodesalong the directed path fromOTU j to o, the most recent
common ancestor (root) of the taxon,
summedover all characters.Quantitylohas
been called the patristiclengthof OTU j by
Farris(1969) and others.I shall referto it by
theneutralterm"path length,"since it mea-
sureshomoplasticas well as patristicresemblance (Sneath and Sokal, 1973).Next,let us
define Lmax(u) =
Ioas the sum of such
lengthsover all t OTUs. The quantityLmax(u)
is the maximallengthwhich the truecladogram could assume if it were rearrangedso
thateveryOTU evolves singlyand independentlyfromthe ancestoro, with the OTUs
diverging from that ancestor in bushlike
fashion and repeating separatelyfor each
OTU the changesin characterstatesthatactually occurredalong the common evolutionarystems in the true cladogram.Thus
Lmax(u) is a measurewhich indicatestheupper
possible bound of parallelismand reversals
FIG. 4. True cladogram of the Caminalcules as furnishedby J.H. Camin. Morphological change occurred
during 19 time periods fromtime 1 to time 20 along the ordinate. Vertical lines indicate periods without
morphological change, slanted lines indicate such change. The amount of evolutionarychange (path length
of the internode)based on 85 charactersis shown by the length of the thickenedbars along the slanted lines.
To furnishan indication of scale, the length of the internodesubtending OTU 54 is one Manhattan distance
unit, thatsubtending OTU 76 is 10 such units. Path lengths based on 106 characters,necessary fordifferentiating fossil species, are shown in a subsequent publication (Sokal 1983b: fig. 2). Squares identifyRecent
1983
CAMINALCULES: DATA BASE
F
B
2619 2029
20-illl
26
19476
/6 21 11 10
11111
C
DE
9
4 3
22 5 12 2182316
20 72
2
17 -
24
9
29
66
28
68
53
63
61
3
18
14
13 -19
12 II
50
-
61
56
6
7
51
344
10
9-
761525
34
I
10
30
7
67
~~~~~~~~~44
~~~~~~~2
33
57
67
8
7 -404
30
54
6
36
49
3
37
32
74
14
13
39
557
5
2
25
413
15
3
7
1226
16
4
A
I 151413288
272417
7 21
18
169
62
5
~~~~~~60
5
5
73
species, black circlesfossilspecies. However, note thatsome fossilspecies extend into the Recent (e.g., species
8). Hollow circles indicate extinctspecies. Large hollow circles terminateextinctlineages whereas the tiny
ones symbolize extinctspecies whose lineages continue with evolutionary change. The numbers next to
squares and circles identifythe species. Recent species have been given the numbers 1 through29 familiar
fromthe literature.Fossil species were assigned numbers 30 through 77 at random. Note that, although
Camin indicated evolution between species 58 and 52, my coding shows no morphological change between
these two forms.The internode between species 36 and 55 appears to be a similar case, but it exhibits
morphological change for charactersnumbered 86-106. The phenetic genera of the Caminalcules are identifiedby bracketsacross the top of the tree.
170
SYSTEMATIC ZOOLOGY
VOL. 32
2. Summaryof formulasof phenetic and evolutionarychange.a
TABLE
Ijo= Sum of lengths in characterstate changes over all internodesalong the directed path fromOTU
j to most recent common ancestor o
= i;, where 2 is summation over all OTUs
Lma,,(,,)
MD,, =
I
j
IX X,JXI where : is summationover all charactersand X, is the characterstateforcharacter
i and OTU j. OTU o is the most recentcommon ancestor
Lmax(l)=
3
MD,,
m,(l)
R, = Lmax(u)IlL
= Range in characterstatesof characteri forthe assemblage of t OTUs plusthe most recentcommon
ri(tO)
ancestor o
Lmzn(,)=
r,(t,,)
= Minimum length forthe taxon,given any tree structureoriginatingfromthe common ancestor o,
Lmln(u,
and an evolutionarymodel forallowable types of characterstate changes
Lac,= Sum of path lengths over all internodes,given knowledge of the true cladogram
Lest= Sum of path lengths over all internodes of an estimatedcladogram
- La,t)I(L.X(u) - Lmin(,))(Dendritic index)
DI = (Lm,,x(u)
H =La.tILmm(l)
= 1/C, where C is the consistencyindex of Kluge and Farris(1969)
H* LestL,Lm,n()*
DR =
(l,k ik
MDjk)! ; MD,k, where 3
Ik
(deviation ratio of J.S. Farris)
is summation of all OTU pairs except for pairs jj and kk
jk
-2), where the subscript M refers to the subtaxon M
Pbr,= Lbr,.,gI(lpp' + Lb,avg), where p is the most recent common ancestor of subtaxon M and p' is the most
Lb,avg
= L.t,MI(2tM
recentcommon ancestor of thatsubtaxon that is also an ancestor of a Recent nonmemberof M
Presented in order of appearance in text.Quantities with asterisksare based on estimatedratherthan true cladograms.
in the data, given the distributionof characterstatesover RecentOTUs and the characterstatesof the fossilOTUs.
A generallyshorterlengthforeach OTU
j resultsif one computes
of the ratio of unnecessary changes (reversalsand repeatedforwardchanges)among
the evolutionarysteps for that OTU. Consequently,
R =
1.0
(2)
is a measureof the amountof reversalsand
repeatsin characterstatechangesforthe enwhere Xi1is the characterstateof OTU j for tire taxon consideredas a bush. Note that
character
is summation
overthen reversalsnear the base will be repeatedin
i and
various OTUs and, hence, weighted more
characters.
NotethatMDj, is theManhattan heavily.
o. It is
distancebetweenOTU j and ancestor
Two measuresof minimumlengthof the
the minimumpossibleevolutionary
length treewere considered.DefineLmin(l)=ri(t=),
MD,,,
X= xi-
Lmax(u)ILmax(l)?
Xiol(1
o. We can
foreachOTU j fromtheancestor
assembletheselengthsto forma newupper where ri(t,)standsforthe range in character
statesof characteri forthe assemblageofthe
boundlengthof the entiretreeLmax(l)
-
MD,,givena minimallengthbushfromthe
ancestor.
NotethatsinceMDjo? 1jo0Lmax(l)
Forany OTU j otherthantheancestor
Lmax(u).
> 1.0is a measure
o theratioRjo= 1jo/MD1O
t OTUs in thetaxonplusitsmostrecentcommon ancestoro. Length Lmin(l) is the minimumamountofevolutionnecessaryforproducingthetaxonomicdiversityofthe t OTUs
in thetaxon.It is a theoreticalquantity,rare-
1983
CAMINALCULES: DATA BASE
4,
Lmnt,e) Lmin(U)
I
II
HOMOPLASY
FIG. 5.
L ct
Lmox(,e)
Lmax(iu)
DENDRITIC INDEX
Schematicto show the relationsamong the
measures of phenetic and evolutionary change discussed in the text.Note thatthe various statisticshave
been placed along a line representingthe range of
possible lengths of an evolutionarytree in positions
that they might assume in a "typical" tree. In any
actual case, the length of any segments of this line
may differfromthat shown in the figureand may
indeed be zero. The double-arrowed bracket above
the line indicates that La,Cmay vary between Lm()
and L
and is undefined with respect to its position vis-a-visLmX(,)The double headed arrows at the
bottomof the figureindicate the proportionsof the
length due to homoplasy and to its dendritic structure. The location of the boundary between these
two (the opposed arrowheads) is at the value forL,,.
171
ancestor,can be computedonly by enumeration,which is practicalforsmall numbers
of OTUs only.For mostreal data sets,informationis available only forRecentOTUs. In
such cases an estimatedcladogramis produced by a numericalcladisticalgorithmusing an outgroupOTU believedto be close to
the mostrecentcommonancestoror a vector
of characterstatesbelieved to be primitive.
As a resultofthecladisticestimationprocess,
the HTU ultimatelyspecifiedas the mostrecent commonancestorof the group may be
different
fromthe outgroupfirstfurnished.
In thesecases,thestatistics
Lmax(l)' Lmax(u)'
Lmin(l)'
and
Lmin(u) must
be defined with respect to
the estimatedcladogramand the HTU representingthehypothesizedmostrecentcommon ancestor.To distinguishthese statistics
based on estimatedratherthan true cladograms,I have added asterisksto theirsymbol. The equivalent of the quantityLactfor
the estimatedcladogramobtained by a numericalcladisticproceduresuch as a Wagner
treealgorithmis the lengthLest.For any given data set and evolutionarymodel,given a
ly, if ever, obtained in real data because of
theactualdistribution
ofcharacterstatesover
the OTUs. For Lmin(l)
to be realized, there
would have to exista cladogramforthe taxon forwhich all characterswould have to be
compatible.In such a case therecould be no specified root o, Lest> Lmzn(u)*.This estimate
reversalsor parallelisms.Since this is a hy- by various computationalalgorithmswill
pothetical,largely unattainable minimum frequently
notproducetheminimumlength
* Note that Lestcan be less than or
length, a second minimum length Lm,n(,)Lmzn(u)
needs to be definedas the minimumlength greaterthan La,t
forthe taxon with t OTUs, given any tree
Thesequantitiespermitcomputationofthe
structure originating from the common followingstatistics.
The saving in evolutionancestoro and an evolutionarymodel forthe ary lengthby the known treestructurecan
allowable types of characterstate changes. be defined as Lmax(u)- Lact.Reversals and par> Lmin(l)
since some homo- allelismsare in bothcoefficients
NecessarilyLmin(,)
but theyare
morenumerousin Lmax(u)
plasy is usuallypresent.
since thisis a bushGiven knowledgeof the true cladogram, like structurerepeatingthe path length of
the actual lengthof the tree,Lact,is the sum common shared stems separatelyfor each
of the lengthsover all the internodes.Thus OTU. A dendriticindex can be defined
it representsall the evolutionarychanges
DI= (Lmax(u) -Lact)I(Lmax(u) - Lmin(l)),
(3)
over all charactersthathave taken place on
thistree.Clearly,Lmln(u)< Lact < Lmax(u)but Lact which expressesthe savings in lengthdue
could be greateror smallerthan Lmax(l). The to the tree's dendriticstructuredeparting
relationsamong the various quantitiesare fromthat of a bush as a proportionof the
illustratedin Figure5.
tree'smaximumpossible range in length.If
QuantitiesLactand Lmax(u)are computable DI = 0, thenthe taxonis a bush and thereis
only in those extremelyrare cases, such as no common evolution. If DI = 1, then the
the presentone, in which the true evolu- characterstatesare fullycompatibleon the
tionarysequence is known; quantitiesLmin(l) cladogramand thereare no parallelismsor
and Lmax(l)requireat least knowledgeof the reversals.Because basal internodesare more
most recent common ancestor o. Quantity ofteninvolved in determiningIjothan are
internodesnear the tips of the tree, DI is
Lmin(u)'even with knowledge of the common
SYSTEMATIC ZOOLOGY
172
more heavily affectedby savings in length
in the dendriticstructurenear the base. For
data setswithunknowncladogeniesone can
define
DI*
(Lmax(u)*-Lest)I(Lmax(u)*
- Lmn(l)*).
(4)
VOL. 32
wise homoplasticdistancesdivided by the
sum of the Manhattandistancesamong all
pairs of OTUs. The pairwise homoplastic
distancesforOTUs j and k are foundas 1]kMD1k,whereas the Manhattandistancesare
MD,k.Therefore,
A second quantity,Lact - Lmln(l) describes
DR= z (,k- MDjk)/I MDjk,
the excess in evolutionarylength over the
jk
jk
(8)
absolute hypotheticalminimum.It may be
usefulto partitionthis excessinto two parts where z is summationof all OTU pairs jk
as follows:
jk
except for pairs jj and kk. This ratio is afLmin(l) = (Lmln(u)
Lact
Lmin(l))
fectedmoreby homoplasyat the base of the
+ (Lact -Lmin(u)).
(5) tree, even though the denominator also
The firstquantifiesthe necessaryparallel- countsbasal lengthsrepeatedly,because any
isms to allow forthe departurefroma fully excess length due to homoplasy will be
compatible distributionof characterstates counted more oftenfor basal than for terover OTUs and the second term describes minal internodes.
the extraparallelismsand reversalsthat ocA second measureof characterreversalR2
curredin the actual evolutionaryhistoryof is analogous to the DR ratio.It is definedas
the group over the minimumamount nec- the sum of the pairwisedistancesdue to reessaryto accountforthe observeddistribu- versals and nonparallel repeated forward
tion of characterstatesover OTUs. Biologi- changes divided by the sum of the homocallyspeaking,however,the two termsmay plastic distancesamong all pairs of OTUs,
to distinguishsincebothdescribe which is the numeratorof DR. It indicates
be difficult
departuresfromperfectconsistencyof char- the proportionof homoplasy due to reacter states with the cladogeny. Clearly, versals.Its 1-complementis the proportion
is a measureof homoplasyas it due to parallelisms.
Lact -Lmin(l)
is conventionallyunderstoodand when exIn studyingsubtaxawithina largertaxon,
pressed as a proportionof the maximum as, for example, the genera in the present
possible range of lengthof the treeLmax(u)study,it is usefulto definethe above statistics also for subtaxa. In such a case when
Lmin(l) is simply1 - DI (see Fig. 5).
An alternativestatistic,
adoptedin thispa- workingwithsubtaxonM, I employthemost
recentcommonancestorPM' or simplyp, of
per,is
H = LactILmin(l1)
(6) that taxon, replacing o by p in the above
formulas.Summationsover OTUs will then
which expressesthe homoplasy as a ratio, be carriedout not over
the t OTUs of the
necessarilygreaterthan 1. The amount by entirestudy,but only over the tM
OTUs of
which H is greaterthan unity is the extra subtaxonM.
lengthof the actual tree,expressedas a proIt is usefulto recordthe path lengthofthe
portionof the minimumtree lengthneces- stem of each of the subtaxa.
Following the
saryforevolutionof the characterstates.For earliersymbolism,thiscan be defined
as 1pp,,,
data sets with unknown cladogenies
where p' is the most recentcommonancesH = LestILmln(l)*.
(7) torthatis also an ancestorof a Recentnonmemberof subtaxonM. An average branch
Note that H* = 1/C where C is the consis- lengthLbr,avg
of the internodessubtendedby
tencyindex of Kluge and Farris(1969),also the most recentcommon ancestorp is also
employed by Kluge (1976) and Mickevich useful.It can be computedby dividingLact,M,
the observedlengthof the treerepresenting
(1978a).
A thirdmeasureof homoplasyis DR, the the subtaxon,by 2tM- 2, the numberof its
deviationratiofeaturedin Farris'WAGNER internodes.Comparing lpp,with Lbr,avgcon78 program,which is the sum of the pair- traststhe evolution on the stem preceding
1983
CAMINALCULES: DATA BASE
173
pectsin 19 zoologicaldata sets,rangingfrom
8 to 97 OTUs and based on from20 to 139
characterslackingNCs. None of these data
sets was subjected to the exhaustive analysisto which the Caminalculeswere treated
(9) (recountedin subsequentpapers of this sePbrl
Lbr,avgl(lpp'+ Lbr,avg).
An additional complicationarises when- ries)fortheobvious reasonsthatsuch a projever NCs are presentas in thisstudy.I have ectwould have takenseveraladditionalyears,
adopted two conventions,which are modi- thatfindingoutgroupsforthemwould have
fications of the Manhattan distance. In takenmuchspecializedknowledge,and that
"transparent" measure the NC state is the truecladogeniesof the data are of course
thoughtof as an unknown.For computation unknown.As will be seen below,the results
of distancesbetween terminalOTUs, char- foralmost all parametersbracketthose obacter states for OTU pairs, one or both of tainedforthe Caminalcules.It is not expectin treetopologythat
which had NCs fora given character,were ed thatthe differences
ignoredduringthe computationof the dis- might be accomplishedby repeated applitance and NCs were passed over during cation of a cladisticalgorithmto the same
computation,of path lengths. Thus a 0 - data set in the hope of findinga shortertree
NC - 3 - 4 path is four units long. In would alter these relationswith respectto
"opaque" measureNCs are considereda dis- homoplasyand othermeasures.I have sumtinctcharacterstate and changes along the marized my conclusionsin Table 3 forhotreefroman expressedcharacterstateto an moplasy, tree symmetryand adequacy of
unexpressedone (NC) are considereda sin- characters.
Homoplasy.-Threeindices of homoplasy
gle step. SuccessiveNC statescount as zero
steps and a change froman NC to an ex- have been mentionedearlierand in the litpressedstateis again a singlestep,regardless erature.These are the 1-complementof the
of the magnitudeof the expressedcharacter dendriticindex(1 - DI), thehomoplasyratio
state.Thus the path 0 - NC - 3 - 4 is of H, which is the reciprocalof C, the consislength3. Computingdistancesbetweenter- tencyindex of Kluge and Farris(1969),and
betweentwo NCs Farris'deviationratio DR. In the CaminalminalOTUs, thedifference
was consideredto be zero,thatbetweenany cules,two values can be obtainedforeach of
numericalstateand an NC was considered theseindices.One is based on thetruecladoto be one. Opaque distance was employed gram and should be the correctmeasure of
mainlybecause it made it simple to express homoplasyin the groupforthe given index;
path lengthforany internodealong the tree the other,comparableto those furnishedin
and to partitionthe path lengthinto homo- the literatureon real organisms,is based on
plastic and reversaldistancesas needed in the estimatedcladogram.As an estimateI
the nextsectionof thispaper. In transparent employedthe approximateWagnertreeobmeasureit is notalways possibleto uniquely tained with the WAGNER 78 programdeveloped by J. S. Farris,using the distance
estimatepath lengthforeach internode.
Wagner procedure and midpoint rooting.
Opaque distanceswere analyzed to permit
RELEVANCE OF CAMINALCULES FOR
computation of all homoplasy statistics.
SYSTEMATIC INQUIRY
However, forthe two statisticsthat can be
It is of interestto examine in how many computed in transparentmeasure (H and
ways theCaminalculesresembledata setson DR), these values are reportedas well. Stathe tisticsbased on estimatedcladogramsare disrealorganisms,sincethiswill strengthen
relevanceof resultsobtainedfromthem for tinguishedby affixingan asteriskto their
systematicinquiry.Below I inspectas many symbols. ApproximateWagner trees were
separateaspectsof the Caminalculesas seem computedforthe 19 data sets, again using
to me relevantto systematicmethods and the WAGNER 78 programand rootedby the
principles.I also examine threeof these as- midpointmethod.Becausethesedatasetslack
the mostrecentcommonancestorof the genus with the subsequent evolution within
thegenus.The averagebranchlengthcan be
expressedas a branchlengthproportion,
=
174
SYSTEMATIC ZOOLOGY
TABLE
VOL. 32
3. Minima and maxima of tree statisticsfor 19 zoological data sets and the Caminalcules.a
Caminalcules
Tree statistics
Homoplasy
1 - DI*
H*
C
DR*
Symmetry
BSUM2
BSUM3
SHAO2
SHAQ3
COLLESS2
Low value
Estimated
True
High value
0.0559b
0.1326
1.417
0.7057
0.4646c
6.690d
0.8621b
0.1124b
0.1795
0.1745
2.327
0.4298
0.2597d
0.4900
0.2420
0.6186
0.5225
0.3331
0.3779
0.0753
0.7720
0.7146
0.1693
0.7714e
0.3462g
0.8021c
0.6840c
0.4889g
1.16h
2.93
2.93'
6.62i
1.160b
0.1495d
-0.0976'
0.4483h
0.3421h
0.1061'
1.3591
1.7395d
Adequacy
nit
High and low values referto numerical values, not necessarilyto the propertydescribed. Thus, high values of C indicate low homoplasy,
and high values of BSUM2, BSUM3, and COLLESS2 indicate low symmetry(high asymmetry).The values shown are based on means of three
runs of the WAGNER 78 program with midpoint rooting for the symmetrymeasures, and on single estimatesfor the other measures. Other
cladistic estimateswere run as well; fordetails see text.The resultsconformto those shown in this table. The data sets exhibitingextremevalues
are identifiedin footnotes.For explanations of tree statisticssee text.
b Leptopodomorpha (Schuh and Polhemus, 1980).
c Orthopteroidinsects (Blackithand Blackith,1968).
d Hoplitiscomplex (Michener and Sokal, 1957).
e WesternBufo(Feder, 1979).
f Drosophila(real set; Throckmorton,1968).
8 Hemoglobin , (Dayhoff,1969).
h Dasyuridae (Archer,1976).
Is 5.34 if binary coded data are considered.
l Pygopodidae (binary coding; Kluge, 1976).
a
NCs, the distinctionbetween opaque and
transparent
measur(edoes not apply to them.
As an experiment,I also triedtreeswith an
outgrouprootingusing OTU 1 as the outgroup (clearly not a recommendedprocedure,althoughit has been used by some numericalcladists;e.g.,Mickevich,1978b;Farris,
1979a).Only midpointrootingis reportedin
Table 3. Resultsforthe OTU 1 rootingprocedurewere similar.
It is worthemphasizingagain that,both
for the Caminalcules and the 19 real data
sets, the above method provides only approximate Wagner trees. Better estimates
could have been obtainedwithfurther
work
and were indeed obtained forthe Caminalcules (see Sokal, 1983a).I am employingthe
"cruder"estimateforthe Caminalcules,since
it is comparableto the estimatesin the 19
data sets.Presumablyhomoplasywould decrease if the length of the trees could be
shortenedalgorithmically.
But,on the average, thiswould occurproportionately
forall
data setsand, sincetheCaminalculesare cur-
rentlybracketedby the data sets on real organisms,theypresumablywould stillbe so
bracketedafterthe length of all trees had
been reduced.
In the Caminalcules 1 - DI = 0.1745 and
1 - DI* = 0.1326.In the 19 data sets 1 - DI*
rangesfrom0.0559in the Leptopodomorpha
(Schuh and Polhemus,1980) to 0.4646in 12
orthopteroidinsects(Blackithand Blackith,
1968). Because the true trees for these real
organismsare not known, the propercomparison with the Caminalculesis with 1 DI* fromthe approximateWagnerestimate.
For the truecladogramH = 2.327,and for
the Wagner estimateH* = 1.417 and 1.261
for opaque and transparentestimates,respectively,yielding correspondingconsistency indices of 0.4298,0.7057,and 0.8048.
Mickevich(1978a) reporteda range of consistencyindicesfrom0.33 forAedesand papilionidsto 0.86 forcytochromeC and globin.
In the 19 data sets,H* rangesfrom1.160in
the Leptopodomorphato 6.690in bees ofthe
Hoplitiscomplex(Michenerand Sokal, 1957),
1983
CAMINALCULES: DATA BASE
175
corresponding to consistency indices of WAGNER program,rootedat the trueancestor,and an estimatedWagnertreeobtained
0.8621and 0.1495,respectively.
The deviationratio(DR) forthetrueclado- with the WAGNER 78 programwith midgram is 1.3591,a very high value, but DR* pointrooting.The latterwas replicatedthree
is 0.1795 and 0.3852 for opaque and trans- timeswithrandomlypermutedinputorders
parentWagnerestimates,respectively.This of OTUs.
great discrepancycan be accounted for by
These same statisticswere also calculated
the numerous reversalsin the true clado- forsimilarlyestimatedcladograms(exceptfor
gram,especiallyalong the stemsdefiningthe the impossibletrueancestorrooting)forthe
genera. Rememberthat the estimateopti- 19 data sets.Again,rootingusingOTU 1 was
mizes the positionof the HTUs and, in con- employed additionallyon an experimental
sequence,attemptsto minimizehomoplasy. basis for trees obtained by both programs.
The formulaforcomputingthe deviationra- Not all data sets were suitableforeach estitio involves all pairwise distancesbetween mationmethodand statistic,
but no reported
OTUs and, therefore,
repeatedlycountsthese rangeofvalues is based on fewerthan14data
basal relations.Actuallytheparallelismcom- sets.In Table 3 resultsare reportedonly for
ponentofthe homoplasyis relativelylow in estimatedcladogramsbased on the WAGthe Caminalcules.For the lower triangular NER 78 programand midpointrooting,but
matrixin opaque measure,the sum of the below the otherresultsare summarizedas
homoplastic distances is 21,225, of which well. For WAGNER 78 with midpointroot17,093is due to reversalsand 4,132is due to ing, the truecladogramis containedwithin
parallelisms.The Wagnertree algorithmin the range of observed symmetriesfor all
tryingto obtaina shortestlengthtreecancels coefficientsexcept SHAO3. But the true
many of the reversals,so thatthe deviation cladogramshows more symmetrythan the
ratio actually observed is relatively low, observedrange of symmetryvalues fores0.1795,whereasthedeviationratioofthetrue timatedcladogramsbased on the PHYLIP
cladogramis much higher.The rangeof ob- WAGNER program and on WAGNER 78
served values of DR* in the 19 data sets is with OTU 1 as an outgroup.Yet, when esfrom 0.1124 in the Leptopodomorpha to timated cladograms are computed for the
1.7395 in the Hoplitiscomplex. If reversals Caminalcules,these are almost always less
are as common in real organismsas in the symmetrical
than the truecladogram.In TaCaminalcules and if we knew their true ble 3 formidpointrootedWagnertreesthe
cladograms,real organismsmight well ex- symmetriesof estimatedcladogramsof the
hibitgenerallyhigherdeviationratios.Real Caminalculesare containedwithintherange
and apparent homoplasy in the Caminal- of observedvalues and, when otherestimatcules is well within the reportedrange for ed cladogramsare consideredas well, this
real organisms.
relationholds in 17 out of 20 comparisons.
5ymmetry.-Thetrue cladogram of the Since theother19 classifications
are all based
Caminalculesis fairlysymmetrical.
No uni- on estimatedcladograms,it is the estimated
versallyacceptedcriterionof symmetryhas cladogramsoftheCaminalculesthatmustbe
yet been established,but K. T. Shao (pers. compared with these data sets ratherthan
comm.),who is investigatingthis problem, the truecladogram.Thus on thisbasis, also,
has used fiveseparateindicesto describedif- it appears that the Caminalcules are not
ferentaspectsofsymmetry.
Theseare BSUM2 atypical.
and BSUM3 (modifiedfromSackin, 1972),
forresolvingthe
Adequacyof thecharacters
SHAO2 and SHAO3 (Shao,pers.comm.),and cladogram.-The29 Recent OTUs in a fully
treewould derivefrom27 bifurCOLLESS2 (Colless, 1982).These coefficients bifurcating
were computedforthetruecladogramofthe cations,not countingthe basal bifurcation
at
Caminalculesas well as forseveralestimated the root.Since one of the branchingpoints
cladogramsof the Caminalcules. The esti- in the true cladogramis a trifurcation,
26
mated cladogramsinclude one obtained by synapomorphiesare minimally needed to
means of Joseph Felsenstein's PHYLIP resolve the tree. How can we measure the
176
SYSTEMATIC ZOOLOGY
VOL. 32
adequacy of the data set forthistask?At the termsof the charactersavailable,in addition
simplest level, one can count characters. to the known trifurcation-which goes
Fewer than 26 binarycharacterscannot,re- counterto the assumptionsof mostcladistic
solve the tree. Subjectedto additive binary methods.Correlatedcharactersundoubtedly
coding, the 85 X 29 data yield 155 binary definefurcationsredundantly,and parallelcharacters-superficially
a more than ade- ism allows a single characterto definefurquate number.However, such an approach cationsin different
partsof the tree.Unforis too crude since it does not allow forchar- tunately,knowledgeof thistypeis virtually
actercorrelations.In fact,we know that in impossibleto obtain fromreal data sets,beadditionto the single trifurcation,
threebi- cause the truecladogramsare unknown.Usfurcationsin the true cladogram(19-26,11- ing estimatedcladogramsforsuch inferences
21, 18-23)are not supportedby any evident would tend to make the argumentcircular.
evolutionarychange in the stems subtend- In any case,it appears thatthe Caminalcules
ing them.Thus,even thoughthereare more are unlikelyto be less capableofhavingtheir
thanfivetimesas manybinarycharactersin cladogramresolved than most data sets in
thisdata set than are necessaryto definethe systematics.
tree,they are distributedacross the tree in
Becauseofthe incompletenessofthe fossil
such a way as to makeit impossibleto define record,less is known fromreal organismsof
more than 23 branchingpoints. Most nu- thefollowingthreeaspects.However,it may
merical cladisticstudies are carriedout on be of some interestto treatthese topics at
far fewer charactersand because the true least brieflyin an effortto describewhere
cladogramsof real organismsare unknown, the Caminalculesare locatedwith respectto
it is generallynot known whetherthe data evolutionary
parameters-evolutionary
rates,
matrixis adequate forresolvingthetruetree. specieslongevities,and speciation-extinction
In the 19 data sets the ratio n/t,where n is rates-thatmightbe employedin simulation
the numberof charactersand t the number studies.It will be seen thatthe Caminalcules
of OTUs, ranges from 1.16 in Dasyurus are not in contradictionwith such findings
(Archer,1976)to 6.62 forbinarycoded mem- as are reportedin the paleobiologicalliterabersofthe Pygopodidae(Kluge, 1976).These ture.
figurescomparewith 2.93 and 5.34 formulrates.-A frequencydistribuEvolutionary
tistateand binary coded Caminalcules,re- tion of the path lengths(amount of evoluspectively(see Table 3). Since the values for tionarychange) of each internodefor each
the 19 data sets reflectdifferencesin char- timeperiodshows extremeclumping.When
actercoding,theymust be interpretedwith examinedagainstPoisson expectations(Fig.
caution. Nevertheless,it is clear that the 6), the overdispersionis highly significant
Caminalculesfallwell withinthe bounds of (P < 0.001). Thus, thereare many more pedata fromthe literature.
riodsofno evolutionas well as moreperiods
One can examine the true cladogramto with substantialamountsof evolutionthan
determinethe number of OTUs subtended expected.Two explanationscan be advanced
by any given furcation.It ranges from2 to forthis phenomenon.(1) The Caminalcules
22 OTUs. These figuresshould minimallybe are similar to real organismsin that their
matchedby the numbersof OTUs (greater evolutionis of organicformas a whole raththan one) sharingany one characterstateto er than independent for each character.
providesynapomorphiesforthe recognition Changes in form in the Caminalcules inof these furcations.When the requireddis- volve variouscorrelatedcharactersand thus
tributionof OTU numbers is compared to those segments of the cladogram during
theactualdistributions
ofsuch numbers,one which extensive morphological evolution
findsat least as many observedfrequencies occurred will exhibit greater amounts of
as are required.However,thisis insufficient change.(2) Therealso maybe local clumping
evidence forthe adequacy of a data set be- of evolutionarychangeson the treefornoncause we already know that three bifurca- biologicalreasons which have an analog in
tionsin the Caminalculesare not resolvedin real phylogeneticprocesses. We must as-
1983
CAMINALCULES: DATA BASE
8-
177
5-
6
z
03
4
7
............
L
0
>7
201
6 _
1 23 4
~~~~~
4
7
8
FIG. 6. Observed and expected frequenciesof evolutionary rates (path length per evolutionary time
period) in the true cladogram of the Caminalcules.
These data are based only on those lines leading to
Recent OTUs. The bar diagram of the "skyline" (bars
above the abscissa) indicates the expected frequencies of a Poisson distribution;the "inverted skyline"
(shaded portion of the bars) indicates the observed
frequencies.Bothfrequenciesare given in square roots
of actual values to emphasize the deviations. Where
the invertedskylinedoes not reach the abscissa,there
are fewerobserved than expected frequencies.Wherever it reaches below the abscissa, there is an excess
of observed frequencies over expected frequencies.
The ordinateshows square rooted frequencies,while
the abscissa expresses evolutionaryrates.
sume thatCamin in constructing
his treewas
at leastsomewhatgoal orientedso thatas he
evolved any one line he carriedit through
to some majormorphologicalchange rather
thanabandon the changein midcourse.This
processalso tends to induce inhomogeneity
of rates among the lines which resultsin
clumpingof the distribution
of evolutionary
changes when consideredoverall. The bio-
o
2
4
6
8
10
12
14
16
18
20
TIME PERIOD
FIG. 7. Average evolutionaryrate plotted against
time period. The rate is expressed along the ordinate
as average path length in the (unit) time interval
preceding the point in time shown along the abscissa. The path lengthsare computed on the basis of
106 characters.The means of this figureare based on
lines leading to Recent OTUs only.
logicalanalogis to assumethatmajorchanges
in environmentor in genetic architecture
leading to adaptive radiation are ongoing
processeswith a momentumof theirown.
The data in Figure6 are based on onlythose
lines leading to RecentOTUs. When all internodesincludingthose leading to extinct
terminalspeciesare examined,theresultsare
very similarto those already reportedand
illustratedin Figure6. Similardata are hard
to obtain in real organismsbecause of the
incompletenessof the fossilrecord,but the
observed patternof evolutionarychange is
at least biologicallyplausible.
In Figure7, I show a graphof the average
evolutionaryratesoverall evolutionarylines
foreach of the 19 time periods (again, only
forlines leadingto RecentOTUs; the results
includinglines leadingto extinctspeciesare
quite similar).The ratesare computedfrom
the 106 characterX 77 OTU data base. There
is clear evidenceof differential
evolutionary
ratesthroughtime.Betweentimes7 and 11,
rates were generally lower than at other
times. By time 7, all genera except for DE
and C had alreadybeen definedbut major
within-genusdiversityhad not yet begun
exceptin genus A. A runs-up-and-down
test
(Sokal and Rohlf,1981b),significantat P <
178
SYSTEMATIC ZOOLOGY
0.01,demonstratesthe alternatingnatureof
the rate changes. Periods of higher change
alternatedwithperiodsoflowerchangemore
frequently
thancould be expectedby chance
alone. Relevant comparisonswith real organisms are hard to come by. That evolutionaryratesdifferover the fossilrecordhas
been well established since the work of
Simpson (1944, 1949,1953).Yet, because paleontologicalseriesare usuallybased on few
charactersand because the fossillineagesare
not well known, estimatesof evolutionary
ratesbased on multiplecharactersas in Figure 7 are lacking.For singlecharacters,
rates
do vary with time within species (Stanley
[1979:58ff.]
illustratestwo examples) as well
as within and among genera (Gingerich,
1974,1979).Clearly,thepatternofevolutionary (phenetic)ratesobservedin the Caminalcules is not uncharacteristicof observationsmade on real organisms.
Specieslongevities.-Theknown cladogeny
of the Caminalculesalso permitsan examinationof the distributionof species longevities.These are shown in Figure8 separately
forthe 29 Recent species,as well as forall
Recent-and-fossil
species in the group. The
longevitiesof Recent species are shown in
two different
ways. In Figure8A, I show the
actual longevityfromthe time of originof
each species to the Recentor to the time of
extinction.Expressedas a cumulative percentage,it can be read as the percentageof
livingspeciesthatextendbackwardsby various amounts of evolutionarytime. Such
curves are generallynot used by paleontologists (Stanley, 1979:113),because the incompletenessofthefossilrecordwould make
estimatesof longevitiesinaccurate.In the
Caminalcules,of course,this graph is fully
descriptiveand accurate.This curve can be
contrastedwiththatin Figure8B,a so-called
Lyelliancurve,which depictsthepercentage
ofRecentspeciesthatcan be foundin faunas
ofa givenage. As expected,withan increase
in age ofthefaunalassembly,thepercentage
of its species thatsurviveto the Recentdecreases. None of the five species extantat
time period 4 survivedto Recenttimes,the
faunal assemblyat time period 5 being the
firstone to contributea memberspecies to
the Recentfaunalassemblage.Althoughthe
VOL. 32
A
100 IL
>
80 60 -
-
40
D
20 _
RECENT
---- RECENT+FOSSIL
'\
s _
2
4
6
-_^s
8
10 12 14 16
LONGEVITIESOF SPECIES IN TIME PERIODS
Z Z
I:
X<LL\
J
B
100
80
QO(
0
60 _
:z
40
wz w
3
0 0
W AL
-
20
il
2
2
4
6
8
I
tt
I ._
10 12 14 16
TIME PERIODS AGO (FAUNAL AGE)
FIG. 8. Species longevities. (A) Longevities of
species in given time periods plotted as cumulative
percent,i.e., the percentage of living species thatextended backwards by various amounts of evolutionary time. Solid line-Recent species only; dashed
line-Recent and fossilspecies. (B) Percentage of Recent species that can be found in faunas of a given
age.
total number of species in faunal assemblages was nondecreasingwith time, the
number of species survivingto the Recent
did not increase proportionately.This explains the lack of monotonicity
in Figure8B.
Hence,thereare timeintervalsduringwhich
the numberof species in the faunal assembly increased without a correspondingincreasein the numberof species survivingto
the Recent,producinga loweringof the percentageof extantspecies in the fossilfauna.
In Figure8A, the longevitycurve forRecent OTUs can be compared with that including both Recentand fossilOTUs. Both
curvesresemblethehollow curvesof similar
data fromthepaleontologicalliterature,
with
theRecent-and-fossil
curve,comprisingmore
data,presentingthe smootherappearanceas
mightbe expected.Note thatbothcurvesdecline to zero at time period 16. No Caminalcule species lived for more than that
1983
CAMINALCULES: DATA BASE
amountoftime.Corresponding
graphsin the
paleontologicalliterature
generallyapproach
an asymptoteat a higherpercentagebecause
the faunal assembliesare not tracedback a
sufficient
time forall the Recent species to
disappear.The Lyellian curve,Figure8B, is
not atypicalwith respectto those published
in the literature(see figs.5-8 or 9-3 in Stanley, 1979).
The model under which ProfessorCamin
evolved the Caminalcules requires each
species to have a longevityof at least one
time period. The observed distributionof
longevities is highly clumped (P < 0.001
againstPoissonexpectations);
thereare many
morespeciesthanexpectedbecomingextinct
aftera single time intervaland substantial
numbers survive for quite long periods of
time.The latterphenomenonseemsto agree
withobservationson realorganisms.The excess of short-livedspeciesis moredifficult
to
demonstratein real organismsbecause there
would be an inherentbias againstobserving
these species in fossil assemblages. Moreover, real organismsare not limitedto discrete time amounts as were the Caminalcules. That is, species existingforless than
one timeunitclearlymustoccurin the fossil
record.We may conclude,however,thatdespitesome peculiaritiesdue to theirmode of
generation,the distribution
of longevitiesin
the Caminalcules resembles similar distributionsobservedin real organisms.
ratios.-In the CamSpeciation-extinction
inalcules,56.0% of all evolutionarychanges
are accompanied by extinction, whereas
54.4% of lines showing no evolutionary
change(stasis;verticallines in Fig. 4) lead to
extinctions.Thus thereis no preferencefor
one or the othertype of process leading to
extinctionbuilt into the evolutionarytree.
The availabilityofcompleterecordsin the
Caminalculespermitscomputationof thenet
rate of increase in number of species (R),
speciationrate(S) and extinctionrate(E) for
all time periods. In real data, R and E are
usuallyapproximatedby variousindirectapproaches and S is estimatedas S = R + E.
Here these values can be directlycomputed
and they are R = 0.213, S = 0.538 and E =
0.324. Note that if R is estimatedusing an
exponential growth model (Stanley, 1979:
179
104),the resultingvalue is 0.177,an underestimateofthe truerate.The absolutevalues
ofR, S, and E cannotbe contrasted
withthose
of real organismssince the time intervals
chosen for the Caminalcules are arbitrary
But the relativemagnitudesof these quantitiescan be compared.The SIE ratioin the
Caminalcules is 1.661. This compares with
similar ratios estimatedunder varying assumptionsforPlio-Pleistocenemammals of
Europewhich rangefrom1.310to 2.048,and
fortemperateand subtropicalbivalvesofthe
Pacific which range from 1.667 to 3.333
(Stanley,1979:117).Bothof these groupsare
consideredby Stanleyto be undergoingactive adaptive radiationinto the present.By
this criterion,the Caminalculesalso do not
differmarkedlyfromreal organisms.
Conclusions.-Bythe six criteriaexamined
above, the Caminalculesfallwell withinthe
rangeof observedvalues forreal organisms.
At least with respectto these criteria,the
Caminalcules simulate real organisms,although,of course,this is not to imply that
theremay not be otheraspectsof evolutionary change in real organismsthat will be
found lacking in the Caminalcules.But by
these criteriaalone the Caminalcules offer
considerablescope foranalyses of methods
and principlesof numericalpheneticsand
cladisticsand of taxonomyin general.
ACKNOWLEDGMENTS
This is ContributionNo. 463 in Ecology and Evolutionfromthe StateUniversityof New York at Stony
Brook.The studywas greatlyaided by two unusually
competentassistants.Leora Ben Ami initiatedthe recoding of the characters.Barbara Thomson completed this task,took over the very considerable amount
of computations,and prepared the tables, chartsand
illustrations,
becoming so engrossedin thiswork that
she became transmogrifiedinto a Caminalcule on
Halloween's eve 1982. Throughoutthis work, I have
had the unstintedcounsel of myfriendand colleague
F. JamesRohlf. Much advice and computational assistance were given by other members of the Stony
BrookNumerical TaxonomyGroup: Kent Fiala, Gene
Hart, and Kwang-tso (Chet) Shao. I am indebted to
JosephFelsenstein forhis valuable PHYLIP program
package, to Kent Fiala for use of the CLINCH program,and to Leslie F. Marcus forfurnishingour laboratorywith a copy of the WAGNER 78 program.The
manuscriptbenefitedgreatlyfroma criticalreading
by Messrs. Fiala, Hart, Rohlf and Shao as well as by
criticalreviews by Raymond B. Phillips, University
of Oklahoma, and two anonymous reviewers. The
180
SYSTEMATIC ZOOLOGY
sectionson evolutionaryrates,species longevities,and
speciation-extinction
ratioswere read by Michael Bell,
Lev R. Ginzburg, and Jeffrey
S. Levinton at Stony
Brook, who provided useful comments. Barbara
McKay uncomplainingly wordprocessed innumerable versions of this manuscript.JoyceSchirmerprepared the final illustrations.Last but farfromleast I
must acknowledge my debt to the late ProfessorJoseph H. Camin, my formercolleague at the Universityof Kansas. Withouthis fancifulimaginationcoupled with a thorough understandingof systematics,
there would not have been any Caminalcules. The
computationswere carried out on a UNIVAC computer at the Stony Brook Computer Center and on
the MODCOMP minicomputer facility at the Departmentof Ecology and Evolution.This researchwas
supported by grantDEB 80-03508 fromthe National
Science Foundation,whose continued support of my
researchis much appreciated.
VOL. 32
ogy. Pages 41-77 in Phylogenetic analysis and paleontology (J.Cracraftand N. Eldredge, eds.). Columbia Univ. Press, New York.
KLUGE,A. G. 1976. Phylogenetic relationships in
the lizard family Pygopodidae: An evaluation of
theory,methods and data. Misc. Publ. Mus. Zool.,
Univ. Michigan, No. 152.
KLUGE, A. G., AND J. S. FARRIS. 1969. Quantitative
phyleticsand the evolution of anurans. Syst,Zool.,
18:1-32.
MAYR,E. 1965. Numerical phenetics and taxonomic
theory.Syst. Zool., 14:73-97.
MICHENER,
C. D., AND R. R. SOKAL. 1957. A quantitativeapproach to a problem in classification.Evolution, 11:130-162.
MICKEVICH,M. F. 1978a. Taxonomic congruence.
Syst. Zool., 27:143-158.
MICKEVICH,M. F. 1978b. Taxonomic congruence.
Ph.D. Dissertation, State Univ. of New Yprk at
Stony Brook,Stony Brook,New York.
MICKEVICH,M. F. 1980. Taxonomic congruence:
REFERENCES
Rohlf and Sokal's misunderstanding.Syst. Zool.,
ARCHER, M. 1976. The dasyurid dentition and its
29:162-176.
relationshipto thatof didelphids, thylacinids,bor- Moss, W. W. 1971. Taxonomic repeatability:An exhyaenids (Marsupicarnivora) and peramelids
perimentalapproach. Syst. Zool., 20:309-330.
(Peramelina:Marsupialia). Aust.J.Zool., Suppl. Ser. Moss, W. W., AND R. I. C. HANSELL. 1980. Subjective
No. 39.
classification:The effectsof characterweightingon
BLACKITH, R. E., AND R. M. BLACKITH. 1968. A nudecision space. Bull. Classific.Soc., 4(4):2-6.
merical taxonomyof orthopteroidinsects. Aust. J. NELSON, G., AND N. PLATNICK. 1981. Systematicsand
Zool., 16:111-141.
biogeography:Cladistics and vicariance. Columbia
CAMIN, J.H., AND R. R. SOKAL. 1965. A method for
Univ. Press, New York.
deducing branchingsequences in phylogeny.Evo- ROHLF, F. J. 1982. Consensus indices forcomparing
lution, 19:311-326.
classifications.Math. Biosci., 59:131-144.
COLLESS, D. H. 1980. Congruence between morpho- ROHLF, F. J.,D. H. COLLESS, AND G. HART. 1983a.
metricand allozyme data for the Menidia species:
Taxonomic congruence-A reanalysis.Pages 82-86
A reappraisal. Syst. Zool., 29:288-299.
in Numerical taxonomy. Proceedings of a NATO
COLLESS, D. H. 1982. [Review of]Phylogenetics:The
Advanced Study Institute (J. Felsenstein, ed.).
theory and practice of phylogenetic systematics.
NATO Adv. Study Inst. Ser. G (Ecological SciSyst. Zool., 31:100-104.
ences), No. 1. Springer-Verlag,New York.
DAYHOFF, 0. (ed.). 1969. Atlas of protein sequence
ROHLF, F. J.,D. H. COLLESS,AND G. HART. 1983b.
and structure.Vol. 4. National Biomedical ReTaxonomiccongruencere-examined.Syst.Zool., 32:
144-158.
search Foundation, Silver Spring, Maryland.
ELDREDGE, N., AND J.CRACRAFT. 1980. Phylogenetic ROHLF, F. J.,J. KISHPAUGH,AND D. KIRK. 1980.
patternsand the evolutionary process. Columbia
NTSYS, numerical taxonomysystemof multivariate statisticalprograms.Tech. Rep. StateUniv. New
Univ. Press, New York.
FARRIS,J.S. 1969. A successive approximationsapYork, Stony Brook.
proach to characterweighting. Syst.Zool., 18:374- ROHLF,F. J.,AND R. R. SOKAL. 1965. Coefficientsof
385.
correlationand distance in numerical taxonomy.
FARRIS,J.S. 1977. On the phenetic approach to verUniv. Kansas Sci. Bull., 45:3-27.
tebrateclassification.Pages 823-850 in Major pat- ROHLF, F. J., AND R. R. SOKAL. 1967. Taxonomterns in vertebrateevolution (M. K. Hecht, P. C.
ic structure from randomly and systematically
Goody, and B. M. Hecht, eds.). Plenum, New York.
scanned biological images. Syst. Zool., 16:246-260.
FARRIS,J.S. 1979a. On the naturalness of phyloge- ROHLF,F. J.,AND R. R. SOKAL. 1980. Comments on
netic classification. Syst. Zool., 28:200-214.
taxonomiccongruence. Syst. Zool., 29:97-101.
FARRIS,J. S. 1979b. The information content of the
ROHLF,F. J.,AND R. R. SOKAL. 1981. Comparing nuphylogeneticsystem.Syst. Zool., 28:483-519.
merical taxonomicstudies. Syst. Zool., 30:459-490.
FEDER,J.H. 1979. Natural hybridizationand genetic SACKIN,M. J. 1972. "Good" and "bad" phenograms.
divergence between the toads Bufo boreas and Bufo
Syst. Zool., 21:225-226.
punctatus.Evolution, 33:1089-1097.
SCHUH,R. T., AND J. S. FARRIS. 1981. Methods for
GINGERICH,P. D. 1974. Stratigraphicrecord of Early
investigatingtaxonomic congruence and their apEocene Hyopsodus and the geometry of mammalian
plication to the Leptopodomorpha. Syst. Zool., 30:
phylogeny.Nature, 248:107-109.
331-351.
GINGERICH, P. D. 1979. Stratophenetic approach to
1980. Analysis of
SCHUH, R. T., AND J.T. POLHEMUS.
taxonomiccongruence among morphological,ecophylogeny reconstruction in vertebrate paleontol-
1983
CAMINALCULES: DATA BASE
logical, and biogeographic data sets for the Leptopodomorpha (Hemiptera). Syst. Zool., 29:1-26.
SIMPSON, G. G. 1944. Tempo and mode in evolution.
Columbia Univ. Press, New York.
SIMPSON, G. G. 1949. The meaningof evolution.Yale
Univ. Press, New Haven, Connecticut.
SIMPSON, G. G. 1953. The major featuresof evolution. Columbia Univ. Press, New York.
SNEATH, P. H. A. 1982. [Review of] Systematicsand
biogeography:Cladisticsand vicariance.Syst.Zool.,
31:208-217.
SNEATH, P. H. A., AND R. R. SOKAL. 1973. Numerical
taxonomy.W. H. Freeman,San Francisco.
SOKAL, R. R. 1966. Numerical taxonomy.Sci. Am.,
215:106-116.
SOKAL, R. R. 1974. Classification:Purposes, principles, progress,prospects.Science, 185:1115-1123.
SOKAL, R. R. 1983a. A phylogenetic analysis of the
Caminalcules. II. Estimating the true cladogram.
Syst. Zool., 32:185-201.
SOKAL,R. R. 1983b. A phylogeneticanalysis of the
181
Caminalcules. III. Fossils and classification.Syst.
Zool. (in press).
SOKAL,R. R., AND F. J.ROHLF. 1966. Random scanning of taxonomiccharacters.Nature,210:461-462.
SOKAL,R. R., ANDF. J.ROHLF. 1980. An experiment
in taxonomicjudgment. Syst. Bot., 5:341-365.
SOKAL,R. R., AND F. J.ROHLF. 1981a. Taxonomic
congruence in the Leptopodomorpha re-examined.
Syst. Zool., 30:309-325.
SOKAL,R. R., AND F. J.ROHLF. 1981b. Biometry.2nd
ed. W. H. Freeman,San Francisco.
STANLEY, S. M. 1979. Macroevolution, patternand
process. W. H. Freeman,San Francisco.
THROCKMORTON, L. H. 1968. Concordance and discordance of taxonomic characters in Drosophila
classification.Syst. Zool., 17:355-387.
WILEY,E. 0. 1981. Phylogenetics.JohnWiley,New
York.
Received14 January1983; accepted11 April1983.
APPENDIX
ListofCharacters
for77 Fossiland RecentCaminalculesa,b
Head and neck
1. Head junction complex (folded) (1) or not (0). [73, 39]
2. If complex, degree of folding complete (1) or partial (0). [73, 10]
3. If partial,secondary junction narrow (0) or broad (1). [10, 21]
4. P (1) or A (0) of horn. [75, 73]
5. If horn present,pointed (1) or flattened(0). [75, 64]
6. Length of head in mm fromrear end of folded section to front,excluding horn and anteriorprojections.
If head not complex, measure fromcollar. Recoded as integer in the following manner: (0) 9-10.9; (1)
10.9-12.8; (2) 12.8-14.7; (3) 20.4-22.3; (4) 24.2-26.1; (5) 26.1-28. [73, 65, 60, 28, 25, 39]
7. Anteriorend of head concave (0), flat(1) or convex (2). [71, 73, 62]
8. If convex, rounded (0) or sharply pointed (1). [62, 15]
9. P (1) or A (0) of anteriorprojections.[71, 73]
10. P (1) or A (0) of eyes. [73, 59]
11. If present, states of fusion of eyes: (0) two separate eyes; (1) grown togetherbut two eyes still
discernible; (2) grown togetherinto oblong approximatelythe size of two eyes; (3) grown together
into circle approximatelythe size of one eye. [73, 47, 1, 16]
12. P (1) or A (0) of eye stalks. [37, 73]
13. If stalked,length of stalk (excluding eye) in mm. Recoded as integerin the following manner: (0)
3-4.5; (1) 4.5-5.9; (2) 6-7.5; (3) 10.5-12; (4) 13.5-15; (5) 16.5-18. [37, 38, 70, 48, 72, 19]
I These characterswere originallydefined by A. J.Boyce and I. Huber and were revised by B. Thomson and R. R. Sokal,
and conventions: P-presence; A-absence. NumbAbbreviations
bers in parenthesesare characterstatecodes. Numbers in square brackets following definitionof each character are the OTU numbers of
representativeOTUs that exhibit the characterstates in the orderdescribedin thelist.Thus forcharacter1 [73, 39] means thatOTU 73 is an
example of a complex (folded) head junctionand OTU 39 is an example
of a not complex head junction. Whenever an "If" statementis denied,
as in character2 for OTUs whose state for character 1 is 0, an NC is
recorded forthe OTU for that character.Since all measurementcharacterswere recorded to the nearestmm, there is no ambiguitycreated
by shared, more refined class limits of adjacent classes as listed for
some characters(e.g., for character6, a measurementof 10 mm was
coded 0, while one of 11 mm was coded 1).
The character numbers 1-85 employed in this list correspond to
those employed in earlier studies and originallydefined by Boyce and
Huber. Characters numbered 86-106 were needed to describe additional observed differencesin fossil species. In order to preserve both
the old characternumbering systemand at the same time the logical
order of charactersin the list, it became necessary to list some char-
acters out of numerical order. The following paragraph gives the location of all charactersthatare not in strictnumerical order in the list.
Characters26-28 in this list follow upon character22; 58 follows 13;
59 follows 56; 60 and 61 follow 48; 62 follows 47; 78 follows 80; 84
follows 90 (23); 85 follows 63; 86-90 follow 23; 91 and 92 follow 34;
93-95 follow 35; 96 follows 45; 97 follows 62 (47); 98 follows 61 (48);
99 follows 55; 100 and 101 follow 85 (63); 102 follows 67; 103 follows
69; and 104-106 follow 75.
The following lists of characterscomprise the several body regions:
head and neck, 1-17, 58 (total of 18); anteriorappendages, 18-30, 84,
86-90 (total of 19); posterior appendages, 31-38, 91-95 (total of 13);
collar and abdomen, 39-57, 59-63, 85, 96-101 (total of 31); abdominal
pores, 64-83, 102-106 (total of 25). The overall total is 106 characters.
The images shown in Figure 1 (OTUs 1-29) were traced from the
originals, whereas those in Figure 2 were traced fromxeroxes of the
originals. In the process and in the reductionnecessaryforillustration
here some detail visible to the data coders was necessarily lost. The
states of at least one character(17) are quite indetectable in the published figures.Lengths in millimetersare given in terms of the dimensions of the original images. As reduced in Figures 1 and 2, these
measurementsmust be multiplied by 0.387.
182
SYSTEMATIC ZOOLOGY
VOL. 32
58. If stalked, stalks fused (1) or not (0). [20, 37]
14. If stalks not fused, tips divergent(0), parallel (1) or convergent(2). [37, 19, 31]
15. Top of head depressed (0), flat(1) or crested (2). [73, 41, 53]
16. If crested,single (0) or lobate (1). [53, 2]
17. P (1) of A (0) of groove in neck. [12, 73]
Anteriorappendages
18. P (1) or A (0) of anteriorappendages. [73, 15]
19. If appendages present,length in mm fromtangent to posteriorend of abdomen (excluding posterior appendages) up to the anterior end of the longer appendage. Recoded as integer in the
following manner: (0) 40-46.3; (1) 46.3-52.6; (2) 52.6-58.9; (3) 58.9-65.2; (4) 65.2-71.5; (5) 71.5-77.8;
(6) 77.8-84.1; (7) 84.1-90.4; (8) 96.7-103. [73, 74, 45, 43, 66, 24, 17, 19, 20]
20. If appendages present,P (1) or A (0) of flexion(elbow) in appendages. [41, 73]
21. If appendages present,end of appendage divided (1) or not (0). [56, 73]
22. If not divided, end of appendage is tendril(0), sharplypointed (1), tapered or rounded (2) or
knobbed (3). [21, 68, 73, 45]
26. If not divided, P (1) or A (0) of flange.[38, 73]
27. If flange present, length expressed as sum of right and left sides in mm. Recoded as
integerin the following manner: (0) 17-20; (1) 23-26; (2) 32-34.9; (3) 35-37.9; (4) 38-41;
(5) 44-47. [38, 70, 48, 50, 20, 29]
28. If flange present, width expressed as sum of right and left sides in mm. Recoded as
integer in the following manner: (0) 5-6.9; (1) 10.7-12.6; (2) 14.5-16.4; (3) 16.418.3; (4) 18.3-20.2; (5) 20.2-22.1; (6) 22.1-24. [38, 70, 48, 31, 29, 72, 26]
23. If divided, two divisions (0) or three (1). [22, 56]
86. If two divisions, ending as fingers(0), clamp (pincers without joint) (1), pincers with
joint (2) or two wavy spikes (3). [22, 49, 30, 67]
87. If ending as clamp, closed (0) or open (1). [49, 36]
88. If open, clamp is small (0) or large (1). [36, 55]
89. If ending as pincers with joint, serratededges P (1) or A (0). [57, 30]
90. If three divisions, one "finger" much longer than the other two (2), one fingerslightly
longer than the other two (1) or all fingersthe same size (0). [65, 76, 56]
84. If divided, bulbs on ends of divisions present on all divisions (2), present on some divisions
(1) or absent (0). [18, 66, 56]
24. If bulbs present on all divisions, P (1) or A (0) of claws. [12, 46]
25. If bulbs present on all divisions, P (1) or A (0) of pads. [75, 46]
29. If appendages present,P (1) or A (0) of pigment. [9, 73]
30. If pigment present,distributedin small dots (-1), small circles (0), large circles (1) or very
broad areas (2). [33, 9, 11, 21]
Posteriorappendages
31. Appendage single (0), partially double (1) or completely double (2). (Completely double means that
appendage originates from 2 stalks ratherthan 1; it does not necessarily mean that the 2 halves are
separated.) [73, 43, 18]
32. If single, disclike (pedestal like) (0), platelike (1) or propellerlike (2). [73, 40, 36]
33. If disclike, width of disc in mm. Recoded as integer in the following manner: (0) 5-5.7; (1)
5.7-6.4; (2) 6.4-7.1; (3) 7.8-8.5; (4) 8.5-9.2; (5) 9.9-10.6; (6) 10.6-11.3; (7) 11.3-12. [13, 58, 39,
38, 37, 72, 42, 77]
34. If disclike, stalk long (>3 mm) (2), short(<3 mm) (1) or absent (0). [74, 73, 28]
91. If disclike, round (0) or square (1). [73, 74]
92. If square, P (1) or A (0) of division lines. [49, 74]
35. If platelike, no indentation (-1), weakly emarginate(curved indentation) (0), cleft(V-indentation) (1), or deeply cleft(V-indentationcontinued into division line) (2). [40, 33, 11, 6]
93. If propellerlike,blades rounded (0) or pointed (1). [36, 67]
94. If propellerlike,stalk short (<5 mm) (0) or long (>5 mm) (1). [36, 55]
95. If stalk long, straight(0) or twisted(1). [55, 30]
36. If completelydouble, ends pointed (0), clubbed (1) or platelike (2). [3, 75, 18]
37. If platelike, divided (1) or not (0). [66, 18]
38. If divided, two divisions (0) or three (1). [66, 53]
Collarand abdomen
39. Rim of abdomen plain (0), narrowlyraised (1) or broadly raised (2). [64, 45, 73]
40. Anteriormargin of abdomen well delineated (0) or imperfectlydelineated (1). [73, 31]
41. P (1) or A (0) of abdominal ridge (raised area on central posteriorabdomen). [73, 45]
1983
CAMINALCULES: DATA BASE
183
42. P (1) or A (0) of large areas of color on anteriorabdomen. [56, 73]
43. If present, whole (2), bisected (1), quadrisected and square (0) or quadrisected and irregularly
shaped (-1). [66, 46, 56, 65]
44. P (1) or A (0) of spots on anteriorabdomen. [44, 73]
45. If present,small (0), large (1) or large and cross-shaped (2). [44, 27, 35]
96. If small, 2 (0) or 4 (1) spots. [17, 44]
46. P (1) or A (0) of postabdominal spots. [44, 73]
47. If postabdominal spots present,large (open circles) (0), medium (> 1.5 and ?2.0 mm avg.) (1), small
(>1.0 and <1.5 mm avg.) (2) or dots (<1.0 mm) (3). [56, 47, 44, 35]
62. If large, anteriorlateral spots medially fused (0), posteriorlyfused (1) or free (2). [66, 53, 56]
97. If postabdominal spots present,P (1) or A (0) of a group of fourfree posteriorspots. [44, 56]
48. If group present,posteriorgroup of fouris stronglyconcave (0), weakly concave (1), or convex
(2). [24, 44, 69]
60. If postabdominal spots present,number of free (not fused) spots on postabdomen: zero (-1), two
(0), four (1), six (2), eight (3), ten (4), twelve (5) or fourteen(6). [77, 3, 66, 53, 56, 44, 63, 76]
61. If six or eight large(char. 45), freespots, P (1) or A (0) of gap between second and thirdrows
of spots. [64, 18]
98. If postabdominal spots present, and anterior abdominal spots (char. 44) also present, spots connected into racquet design (either by lines or by fusion) (1) or not (0). [77, 44]
49. Posteriorend rounded (0) or angular (1). [73, 21]
50. P (1) or A (0) of dorsal fin.[50, 73]
51. If present,length of attachmentin mm. Recoded as integerin the following manner: (0) 7-7.7; (1)
7.7-8.4; (2) 8.4-9.1; (3) 9.8-10.5; (4) 13.3-14. [48, 50, 31, 20, 29]
52. If present,posteriorapex projectingbeyond rear of attachment(2), not projecting(1) or projecting
anteriorto rear of attachment(0). [72, 50, 20]
53. Posteriorabdomen swollen (1) or not (0). [35, 73]
54. Posteriorbars (apparentlyfused spots) absent (0), single (1) or double (2). [73, 2, 43]
55. If double, shortestdistance between bars in mm. Recoded as integerin the following manner: (0)
1-1.8; (1) 1.8-2.6; (2) 6.6-7.4; (3) 8.2-9. [43, 18, 75, 64]
99. If double, elongate (0) or fatand roughly triangular(1). [43, 56]
56. If elongate, straight(0), curved (1), weakly J-shaped(posteriorends thickened)(2), or strongly
J-shaped(3). [64, 43, 3, 12]
59. If double, P (1) or A (0) of large cross-shapedspot in postabdomen median to anteriortips of bars.
[3, 56]
57. If single, semicircular(0) or U-shaped (1). [5, 2]
63. Greatestwidth of abdomen in mm. Recoded as integer in the following manner: (0) 15.95-17.95; (1)
19.95-21.95; (2) 21.95-23.95; (3) 23.95-25.95; (4) 27.95-29.95; (5) 29.95-31.95; (6) 33.95-35.95; (7) 36. [20,
48, 54, 73, 35, 63, 24, 47]
85. Length of abdomen in mm frombase to tip of collar. Recoded as integer in the following manner: (0)
38-42.9; (1) 42.9-47.8; (2) 47.8-52.7; (3) 52.7-57.6; (4) 57.6-62.5; (5) 72.3-77.2; (6) 82.1-87. [73, 47, 64, 75,
50, 19, 20]
100. P (1) or A (0) of lateral flippers.[42, 73]
101. If present,small (0), medium (1) or large (2). [42, 51, 61]
Abdominalpores
64. P (1) or A (0) of large pore in column 10 (most anteriorposition). [45, 73]
65. Group II (columns 4 and 5) has zero pores (-1), one pore (0) or two pores (1). [61, 69, 73]
66. If two pores, fused (1) or not (0). [5, 73]
67. If fused, fusion complete (0) or incomplete (1). [48, 5]
102. If not fused,number of pores looking like slits: zero (0), one (1) or two (2). [73, 68, 51]
68. Group III (columns 6-10) has one (0), two (1), three (2) or five (3) pores. [64, 18, 43, 73]
69. If two or three pores, one pore looks like a slit (1) or not (0). [43, 76]
103. If five pores, the two pores closest to group II look like slits (1) or not (0). [56, 73]
70. If five pores, P (1) or A (0) of fusion. [50, 73]
71. If fusion present,complete (0) or incomplete (1). [29, 50]
72. If complete, one (0) or two (1) groups of fused pores. [72, 29]
73. If incomplete,P (1) or A (0) of a group of three fused pores. [26, 50]
74. Group I (columns 1-3) has zero (0), one (1), two (2) or three (3) pores. [47, 24, 35, 73]
75. If one or two pores, one pore looks like a slit (1) or not (0). [63, 35]
104. If two or three pores, pore nearest posteriorenlarged (1) or not (0). [42, 73]
105. If three pores present and posteriorone enlarged, eniarged pore apparentlysee-through(1)
or not (0). [30, 42]
106. If see-through,fused with posteriorpore fromother side's Group I (1) or not (0). [57,
30]
184
SYSTEMATIC ZOOLOGY
VOL. 32
76. If three pores, separated as three open pores of any size (0) or in some way modified(1). [73, 46]
77. If modified,modifiedby fusion (1) or some other way (0). [64, 46]
79. If fusion,two (0) or three (1) pores fused. [50, 64]
80. If fusion,broad and complete (0), broad and incomplete (1), or slit-like(2). [64, 66, 71]
78. If broad and complete and three pores fused (char. 79), both lateral pores reduced
and joining median one (1) or not (0). [22, 64]
81. If modifiedin some other way, P (1) or A (0) of line (or slit) connecting pores. [46, 65]
82. If present,two (0) or three (1) interconnectedpores. [53, 46]
83. If absent, slits in column 1 (0), column 2 (1), columns 1 and 2 (2), or column 3 (3).
[60, 65, 8, 51]