A Phylogenetic Analysis of the Caminalcules
Transcription
A Phylogenetic Analysis of the Caminalcules
Society of Systematic Biologists A Phylogenetic Analysis of the Caminalcules. I. The Data Base Author(s): Robert R. Sokal Reviewed work(s): Source: Systematic Zoology, Vol. 32, No. 2 (Jun., 1983), pp. 159-184 Published by: Taylor & Francis, Ltd. for the Society of Systematic Biologists Stable URL: http://www.jstor.org/stable/2413279 . Accessed: 02/04/2012 22:07 Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org. Taylor & Francis, Ltd. and Society of Systematic Biologists are collaborating with JSTOR to digitize, preserve and extend access to Systematic Zoology. http://www.jstor.org Syst.Zool., 32(2):159-184, 1983 A PHYLOGENETIC ANALYSIS OF THE CAMINALCULES. L. THE DATA BASE ROBERT R. SOKAL State University ofNew Yorkat StonyBrook, DepartmentofEcologyand Evolution, StonyBrook,New York11794 Abstract.-The Caminalcules are a group of "organisms" generated artificiallyaccording to principles believed to resemble those operating in real organisms. A reanalysis of an earlier data matrixof the Caminalcules revealed some inconsistenciesand errorswhich necessitated recoding of some characters.The resultingdifferenceswith earlierresultsare minor.The images of all 77 Caminalcules are featured,those of the 48 fossilspecies forthe firsttime.The characters of the Caminalcules are defined and a data matrixis furnishedforall Recent and fossilspecies. A new phenetic standard is proposed for the Caminalcules which divides them into five "genera." The true cladogram is revealed for the firsttime. Recent Caminalcules have evolved over 19 time periods. Five branches correspond to the phenetic genera but originateat greatly differingtime periods. Four lines terminatein fossils. A series of measures forquantifyingevolutionarychange is defined,including measures for homoplasy,parallelism,and reversal.A survey is made of these measures and of other statistics of relevance to systematicsfor 19 data sets fromthe numerical taxonomic literature.The Caminalcules turn out to be compatible to data sets on real organisms with respect to all these measures,as well as with respect to evolutionaryrates and species longevities. Thus, questions raised by an analysis of the Caminalcules should be of interestto systematistsconcerned with the analysis of data sets on real organisms. [Phenetic classifications;cladistic classifications; estimatedcladograms; homoplasy; Wagner trees; Caminalcules; numerical taxonomy.] This paper,and othersto follow,takesadvantage of the opportunityaffordedby a group of artificialorganismswith a known phylogeny,the Caminalcules,to throwlight on some of the questions concerningprinciples and proceduresthatcurrentlyengage theattentionoftaxonomists. Thereis considerabledisagreementon the relativemeritsof pheneticand cladisticclassifications (Sneath and Sokal, 1973;Eldredgeand Cracraft, 1980; Wiley, 1981). Some workers contend that classifications based on phylogeneticprinciples are empiricallybetterby variouscriteria of optimality(Farris,1977, 1979a,b; Mickevich,1978a,1980;Schuh and Polhemus,1980; Schuh and Farris,1981). These claims have been questionedby otherswho findthe evidence and methodologypresented to be flawed(Colless, 1980;Rohlfand Sokal, 1980, 1981; Sokal and Rohlf,1981a; Rohlf et al., 1983a,b). The empiricalstudies,and the argumentspro and con phylogeneticclassificationsderivedin such investigations, suffer froma major impediment.All of the phylogeneticclassificationsreportedin the literatureare only estimatesof the true phylogeny, which is unknown for all real organisms.By contrastthe variousresultsof this study merit serious attentionbecause they can be examined against the benchmarkof the truephylogenyof the group. The Caminalculesare artifactscreatedby the late ProfessorJosephH. Camin of the Universityof Kansas and in effectrepresent a single simulationof the evolutionaryprocess by rules that have not been made explicit.However,readerswill findthatthese organisms,which have the advantage over othersimulationsin presentinga visual record to the investigator, illustratea varietyof evolutionaryphenomena and are therefore of considerable pedagogical and heuristic value. The relevanceof this data set to currentlyactiveissues in systematics will readily becomeevidentto thereaderofthisseries. I shall show thatwithrespectto a substantial arrayof measurableproperties,the Caminalcules are well withinthe range of empirically observed values for real taxonomic groupsand that,conversely,forno property of consequence in numericaltaxonomyare the Caminalcules beyond the range of observedvalues in real organisms. At the suggestionof the Editorand some 159 160 SYSTEMATIC ZOOLOGY reviewers,this series of publicationsis initiatedin thispaper with the presentationof the data on which previousand succeeding studies have been based. I furnishthe images of the previouslypublished 29 Recent species and forthe firsttime the 48 "fossil" species. I also presentforthe firsttime the truephylogenyof the Caminalculesas generatedby ProfessorCamin. WiththeseillustrationsI providea listof the descriptionsof charactersas adopted in my laboratoryas well as a data matrixgiving the character statesforall 106charactersforeach ofthe 77 Recentand fossilCaminalcules.In addition to presentinga new standardpheneticclassification,the paper describesa number of measures for taxonomic and evolutionary propertiesand compares the Caminalcules withdata setson real organismswithrespect to thesemeasures.Subsequentstudiesin this serieswill treatestimatesof the true cladogram,theinclusionoffossilsin pheneticand congruenceand charcladisticclassifications, acterstability,and OTU stability. VOL. 32 Ehrlichof StanfordUniversityand Dr. W. Wayne Moss of the Philadelphia Academy of Sciencesin additionto myself.The originals drawn on the ditto mastersappear to have been lost followingthe death of ProfessorCamin in 1979. All examinationsof the Caminalculesfor numericaltaxonomicstudieshave been carried out on the xeroxesof the images. Illustrationsof all 29 Recent OTUs have been published three times previously (Sokal, 1966;Rohlfand Sokal, 1967;Sokal and Rohlf, 1980). For this purpose,inked copies of the xeroxed images were photographed. Althoughthe artist'scopies of the originalxeroxesare quite faithful, inevitablysome fine detailhas been alteredor lost.Thus,notevery characterstate describedbelow can be unequivocallyrecognizedin the featuredillustrations.The versionof the 29 RecentOTUs published in Sokal (1966) was "beautified" by thepublisher'sartistand cannotbe relied upon fordetail.The images of the 48 fossils were newly inked forthisstudyand all difcharacteristics can be observed ferentiating ORIGIN OF THE DATA BASE in them. The RecentOTUs, numbered1 to 29, are The originalintentionin generatingthe Caminalcules was to study the nature of shown in Figure 1. The fossilOTUs, given code names by Camin, were rantaxonomicjudgment(eventuallypublished different by Sokal and Rohlf,1980), but work with domly assigned numbers 30 to 77 by me. theseanimalshas led to otherdevelopments They are shown in Figure2. The truecladogramofthegroupwas comin numericaltaxonomicmethodology,such as an early method of numericalcladistics municatedto me by Camin in 1970.But al(Camin and Sokal, 1965) and a method for though this informationwas employed in obtaining taxonomic structureby random the computationsleading to the analysis of and systematic scanningofbiologicalimages taxonomicjudgment(Sokal and Rohlf,1980), (Sokal and Rohlf, 1966; Rohlf and Sokal, access to it was restrictedeven forworkers 1967). Other experiments on taxonomic on this project.I did not become intimately treeuntil1981 judgmenthave also been based on the Cam- familiarwiththephylogenetic inalcules(Moss, 1971;Sokal, 1974;Moss and during the final analyses leading to this manuscript. Hansell, 1980). Readers of this paper and of subsequent Camin drew the Caminalculesusing master stencilsfor dittomachines.The genetic ones in thisseriesshould note thatI use the in the sense in which I origcontinuityofthe Caminalculeswas achieved -termcladogram by Camin by tracingsuccessivedrawingsof inallycoined it (Camin and Sokal, 1965;also the animals,permittingthe preservationof independentlycoined with the same meanall charactersexceptfor such modifications ing by Mayr, 1965). One definitionof this as were desired.Xeroxcopies of the images meaning(Sneathand Sokal, 1973:29)is as "A of the RecentOTUs were made available in branching... networkof ancestor-descentheearly1960s,thoseofthefossilOTUs some dant relationships."This definitiondiffers years later.Independentxeroxcopies of all fromthe several meanings attachedto the OTUs are in the possession of Dr. Paul A. termcladogramby various cladists(e.g., El- 1983 161 CAMINALCULES: DATA BASE o u' U) in C 0'~~~~~~~~~0 00. 04~~~~~~~~~~~~4 U. U 0 0 ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~~~~~~1 0 0 ~~~~~~~~~~~~~~~~~~u 0 U) 04 .4 0 V *0~~~~~~~~~~~~~~~ ~~~~~~~~~~* ~ ~ ~ 04 *0 ~ ~ ~ ~ ~ K IEI _~~ ~~~~~~~4 co ~~~~ ~~~~~~~~~~~~~~~ C.4 162 SYSTEMATIC ZOOLOGY VOL. 32 - .o c, o g .4 CNJ~ ~ ~ ~~ ~ ~ ~ ~ ~ %6oo -~~~~~~~~~~~~ 0e 00~~~~~~ 0% ~~~~~~~~~~~~~ *~~~~~~~~~~~~~~~~~~~, o cli~~~~~~~~~~~~~~~~~~~~~~~~~b r) ODs 6** " 0 C) -U, 163 CAMINALCULES: DATA BASE 1983 Nl~ ~ ~ ' N NA~~~~~~ * 00 '~~~~~~~~~~~~~~~~~~~~~~~~~0~~~~0~~~~~~~~~~~ rl9j?~ I')~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~4 ~ ~ 164 SYSTEMATIC ZOOLOGY dredgeand Cracraft,1980;Nelson and Platnick,1981;Wiley,1981;see also Sneath,1982). Thus,cladogramas used in thisand succeeding papers refersto a branchingsequence depictingthe actualor hypothesizedgenealogy of the OTUs. It does not include length ofbranchesofthetreeand is not a statement about the evolutionor patternof character states akin to "nested synapomorphy schemes." Camin did not keep a writtenrecordof characterchanges. In the 1960s,A. J.Boyce and I. Huber prepared a data matrixfor a numerical'pheneticstudy of the 29 Recent OTUs, describing86 characters.This data matrixwas the basis of various phenetic analyses of the Caminalcules (Sokal, 1966; Sokal and Rohlf,1966,1980;Rohlfand Sokal, 1967). For the presentstudyit became necessaryto map the coded charactersonto the now known cladogramand duringthisprocess each characterwas re-evaluatedforthe of the 29 RecentOTUs. This re-examination imagesled to the discoveryof severalinconsistenciesin the Boyce and Huber data matrix.Altogether13 ofthe2,494entriesin that 29 X 86 matrixseemed to be in errorand were corrected.Originalcharacter58 was deletedbecause it was discoveredto be invariant. All measurementcharacterswere rewith earlier measuredand small differences measurementswere recorded.Because the images available duringmy currentstudies are poorer quality xerox copies than those available to Boyceand Huber,the resolution of some of the more minutemorphological and in a fewcases featureswas moredifficult characterswere recoded into fewer,coarser classes. Remeasurements, changesin logic,and revision of characterstate coding accounted for changes in 33 of the 85 characters.To accountforthe featuresof the fossilOTUs, which had not been coded previously, another12 characterswere redefinedand 21 new characterswere added. This augmented data matrix comprises 77 OTUs and 106 charactersof which 65 are binary,31 are ordered integercharacters,and 10 are measurementcharacters.Fromthismatrixan 85 characterX 29 OTU subsetwas extractedfor the RecentOTUs. This matrixhas 48 binary VOL. 32 characters, 27 orderedintegercharactersand 10 measurementcharacters.Both the 106 characterX 77 OTU matrixand the 85 characterX 29 OTU matrix contain characters withNCs (no comparisoncodes). All NCs in thisstudyare logical(i.e.,presenceofa given stateforcharacterh makingit impossibleto definea stateforcharacteri) ratherthandue to missinginformation. In the largermatrix, 79 ofthe 106 characterscontainNC codes; in thesmaller,59 ofthe85 characters have NCs. The measurementcharactersare available in two versions.One is as directmeasurements in millimeters, the otheris coded as ordered integerstates but with gaps omitted.The forone characterwas rangeof measurements divided into 10 equally wide classes coded 0 to 9. Each characterstatewas then assigned to one of these classes, but classes lacking observed characterstates were ultimately omitted.Thus, if therewere no OTUs with measurements in class3, thesubsequentclass 4 was renumbered3. The numberof states in the 10 charactersresultingfromthis procedure ranges fromfourto nine. A list definingthe states of each characteris furnished in the Appendix. The actual data matrix(integercoded version) is shown in Table 1. The consequences of recoding the measurementcharactersin integerscale were minimal.Matrixcorrelationsforthe 29-OTU study between the resemblance matrices based on these two methods of character coding are very high (0.998 and 0.997, respectively,for r and d, the correlationand taxonomicdistancecoefficients). The classificationsresultingfrom these two coding methods are identical. In this and subsequent papers,only the data matrixusing integerscale codingof measurementcharacter states is featured,since this is the simpler coding preferredfor the several cladistic methodsemployed. The effectsof correctingand recodingthe originalBoyceand Huber data matrixforthe be29 OTUs were minor.Matrixcorrelations tween the two versionsof the resemblance matricesare high (0.980 for r, 0.969 for d). UPGMA phenogramsbased on theseresemblancematricesare close;thestrictconsensus index CIc forthe classifications based on the 1983 CAMINALCULES: DATA BASE 165 two correlationmatricesis 0.889, for those the logical choice as a new standardfor a based on thetwodistancematrices0.815.This pheneticclassification ofthe29 RecentOTUs index (Rohlf,1982) rangesbetween 0 forno to be comparedwiththetruecladogram.This consensusand 1 forperfectconsensus.It is phenogramalso has the highestcophenetic identicalto the consensusforkindex of Col- correlationcoefficient of any pheneticclasless (1980) and the proportionalconsensus sificationof the Caminalcules computed so index of Sokal and Rohlf(1981a). far(0.965). The differences betweenthe classifications Five major clusters,the "genera" of the based on correlationsand distancescomput- Caminalcules,can be seen in thephenogram ed fromthe originalBoyce and Huber data in Figure3. These fallnaturallyintotwo mamatrix(see fig.3 of Rohlfand Sokal, 1967) jor groups,thatcontaininggeneraA, B, and and thosebased on theupdated,integer-cod- F, and a second groupcontainingC and DE. ed versionall involvetaxonomicaffinities at The statusof D, consistingof OTUs 3 and 4, the "species" level. The "generic"classifica- is problematical. Only in one previousstudy, tionis the same forbothversions.When the based on a mechanicalmethodof scanning new classifications were coriparedto 49 sub- the images (Rohlfand Sokal, 1967: fig.3A), jective classificationsof the Caminalcules was D unequivocallyseparatedfromE at a producedby 22 experimentalsubjects(Sokal level of pheneticsimilarityequal to thatof and Rohlf,1980),thenew phenogramsofthe the othergenera-A, B, C, and F. The Boyce Caminalculeswere foundto be no more(and and Huber correlation phenogram(Rohlfand no less) similarto these intuitiveclassifica- Sokal, 1967: fig.3C) shows D joining E at a tionsthan the earlierBoyceand Huber clas- level closerthanthatof at leastone member sifications. of A joining its genus. Intuitiveclassifications by various taxonomists(Sokal and A NUMERICAL PHENETIC CLASSIFICATION Rohlf,1980)had frequentlyassignedspecies For pheneticanalyses the data were sub- 3 and 4 to genus E, althoughsome had sinjectedto standardtaxometricproceduresus- gled out the divergenceof these two OTUs ing the NTSYS systemof numericaltaxo- which are unique in having rudimentary nomiccomputerprograms(Rohlfet al., 1980). posteriorappendagesas well as an elongated Characterswere standardizedbefore com- body obliteratingthe neck. Since the pheputationof correlationand taxonomicdis- nogramin Figure3 clearlyaffiliatesD with tance coefficients betweenOTUs. E, it seems appropriateto join the two toRohlfand Sokal (1967) noted thatpheno- getherintoa singlegenuswhichI have called gramsbased on distancesand correlationsof DE to preservecontinuitywith earlierpubthe Caminalculesdiffersubstantiallyin the lications. indicated. specificas well as genericaffinities THE TRUE CLADOGRAM Previousworkby Rohlfand Sokal (1967)and The truecladogramas furnishedby J.H. by Sokal and Rohlf(1980) showed that the classification based on the correlationmatrix Camin is shown in Figure4. The diversity correspondedmore closely than that based of the taxonwas generatedover 19 timepeon the distance matrix to the taxonomic riods.Lines undergoingevolutionarychange structurenoted by individual taxonomists are indicatedin Figure4 by slanted(as disgroupingthe Caminalculesby conventional, tinctfromvertical)lines. Species thatunderintuitivemethods.This undoubtedlyoccurs went evolutionarychange during a given because correlationcoefficients are moresen- timeperiod are shown as solid circlesat the sitiveto shape than are distancecoefficientsend of thatperiod.Ancestralspeciesthatare (Rohlfand Sokal,1965),and itis shape rather not continuedinto the next time period as than size on which taxonomiststendto base verticallines are consideredto have become theirjudgmentson taxonomicaffinity. For extinctand are indicated by small hollow this reason,the UPGMA clustering(shown circles.Fivebranchesleadingto Recentforms, in Fig. 3) of the correlationmatrixbased on correspondingto the five phenetic genera, the updated,integer-codeddata matrixwas can be recognized but these originate at 166 SYSTEMATIC ZOOLOGY TABLE 1-5 1 2 3 4 5 6-10 11-15 VOL. 32 1. Data matrixof 77 Recent and fossil Caminalcules.a 16-20 21-25 26-30 31-35 36-40 41-45 46-50 51-55 56-60 61-65 66-70 71-75 76-77 11111 11111 11111 11111 11110 11111 11111 11101 11111 11111 11111 11111 11111 11111 11111 11 11111 01110 01111 11111 OlliX 11111 11111 iliXi 11111 11111 11111 11111 11111 11111 11111 11 XXXXX lXXXO OXXXX XXXXX lXXXX XXXXX XXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XX 00110 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00010 00000 00001 00 XX1OX XXXXXXXXXX XXXXX XXXXXXXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXOX XXXXX XXXX1 XX 6 7 8 9 10 10000 12200 00220 10011 Xllll 10111 11112 X1122 XXXXX XXXXX XXXX1 XXxO1 00000 01000 00000 00000 11111 10111 11010 11111 00004 10310 10000 00050 00000 00001 00000 01002 00001 00010 21000 00 11110 2X121 21111 11101 11111 1X112 11111 11111 12111 lliXi 02111 11 XXXXX OXXOX OXXXX XXXXXXXXXX XXXXO XXXXX XXXXX XOXXX XXXXX XOXXX XX 00001 00000 00000 00010 00000 00000 00000 00000 00000 00000 10000 00 11110 11011 11111 11101 11111 11111 11111 11101 10111 11111 00111 11 11 12 13 14 15 20000 OXOOO OOXOX 30000 00000 00000 00000 00011 XXXXX XXXXX XXXXX XXX55 XXXXX XXXXX XXXXX XXXlX 12001 10000 02000 01011 OOOOX O1XOO 00000 OOOXO 00000 01000 00000 OOOXO OXOOO 00010 XXOOO 00 00000 10010 10000 01100 00000 00101 00000 00000 00000 00001 01000 00 XXXXX 5XX3X 5XXXX XO1XX XXXXXXX3X3 XXXXX XXXXX XXXXXXXXX2 X4XXX XX XXXXX 1XXOX 2XXXX XOOXX XXXXXXXOXO XXXXX XXXXX XXXXXXXXXO XOXXX XX 01001 10011 10010 10110 11000 00101 10201 01000 10000 11011 01000 00 16 17 18 19 20 XlXXX 01000 11111 54444 01001 21 22 23 24 25 01111 OXOOO OlOOX 00100 01100 00001 00001 10000 10100 10010 00101 11000 00011 11000 00001 10 1XXXX 1X221 lX22X 21X22 OXX12 2222X 2213X X2222 X3X33 X22X2 32X2X XX222 321XX XX122 2222X X3 Xliii XXXXX XlXXX XXlXX xoixx xxxxo xxxxi OxXXX iXiXX iXXOX XXiXO lOXXX Xxxii iOXXX XXXX1 ix XOioo XXXXX XiXXX XXOXX XOXXX XXXXX XXXXX XXXXX OXXXX OXXXX XXOXX XXXXXXXXOX XXXXXXXXXO XX X1010 XXXXX XOXXX XXOXX XOXXX XXXXX XXXXX XXXXX OXXXX OXXXX XXiXX XXXXX XXXOX XXXXX XXXX1 XX 26 27 28 29 30 OXXXX OXOOO OXOOX OOxil OXXOO lOOiX lOOOX XO100 XOXOO XOXl OOXOX XXOOO OOOXX XXOO1 OlOOX XO XXXXX XXXXX XXXXX XXX34 XXXXX 5XX5X 3XXXX XXOXX XXXXX XX2X3 XXXXX XXXXX XXXXX XXXX1 X5XXX XX XXXXX XXXXX XXXXX XXX23 XXXXX 6XX4X 3XXXX XXOXX XXXXX XX2X2 XXXXX XXXXX XXXXX XXXX1 X5XXX XX 00000 iXOil ilOOX 00000 10000 00000 00100 00000 00000 00000 00000 00000 00000 00000 00000 00 XXXXX 1XXOO 1XXXX XXXXX 2XXXX XXXXX XXNXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XX 31 32 33 34 35 12222 00000 XXXXX 10011 XXXXX X11XX XXXXX X11XX XXXXX 2XXOO 36 37 38 39 40 X2012 XlXXO XiXXX 01001 00000 41 42 43 44 45 00000 11111 10111 00000 10001 00101 01100 11011 00000 00010 01011 01111 01000 01000 10110 00 01111 00000 01000 00100 Ol1OO 00000 00000 00000 10100 10000 00100 10000 00011 10000 00001 10 XXXXX XIXXX XXOXX X21XX XXXXX XXXXXXXXXX lXOXX 1XXXX XXlXX OXXXX XXX1N 2XXXX XXXX1 OX Xllll 10000 00000 00000 11000 00010 01000 00011 00000 00010 01000 00000 00000 00100 00110 00000 01 2XXXX XXXXX XXXXX 2OXXX XXX2X XIXXX XXX12 XXXXX XXXOX X2XXX XXXXX XXXXX XX2XX XX22X XXXXX XO 46 47 48 49 50 11111 00000 01000 11100 01110 01000 00011 00000 10110 11000 00100 10000 00111 20000 XXXXX XOXXX 12OXX X002X XIXXX XXX03 XXXXX OX02X O1XXX XXOXX OXXXX XX202 2XXXX XXXXX XXXXX O1XXX XXXOX XOXXX XXXX1 XXXXX XXX1X XOXXX XXXXX XXXXX XXlXl 00000 00000 00000 00000 10000 00000 00010 00000 00000 00000 00000 00000 00000 00000 00000 00000 00011 00000 1001,0 10000 00000 00000 00101 00000 00000 00000 51 52 53 54 55 XXXXX XXXXX XXXXX XXX13 XXXXX XXXXX XXXXX XXX10 10000 00000 00000 11000 01221 00000 02000 00200 XX02X XXXXX XOXXX XXlXX 56 57 58 59 60 XX20X XXXXX X3XXX XXlXX X11XX XXXXX XXXXX XXXXX lXlXX 1XXXX XXlXX XXXXX XXXOX 1XXXX XXXXO XX XiXXO XXXXX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXX XXXXXXXXXX XXXXXXXXXX XXXXXXXXXX XXXXX XX XXXXX XXXXX XXXXX XXXO1 XXXXX OXXOX OXXXX XObXX XXXXX XXOXO XXXXX XXXXX XXXXX XXXXO XOXXX XX XXlOX XXXXX XOXXX XXOXX XOOXX XXXXXXXXXX XXXXX OXOXX OXXXX XXOXX OXXXX XXXOX OXXXX XXXXO XX 22023 XXXXX XlXXX 323XX X133X X2XXX XXXN6 XXXXX 3X44X 33XXX XX2XX 3XXXX XX536 lX32X XXXX3 6N 61 62 63 64 65 XOX1O X1212 43233 11111 01111 XXXXXXOXXX XXXXXXXXXX XXXXXXXXXX XXXXX XXXXXXXXXX XXOXX XXXXX XXXXXXXXXX XXXXX XX 00000 01000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00 10111 11110 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11111 11 4X034 4400X 56378 53350 75063 71334 21202 32322 35415 20412 23000 20434 44553 06013 32 OXOOO OlOOX 00000 01000 00000 00000 00000 10000 00000 00100 00000 00000 10000 00000 00 02000 lXOOO XXO11 XX112 lXXXX 11200 02210 02000 00001 00000 20100 21000 00200 00000 00121 20110 00002 10 XXXOO lXXXO OX002OQOlOX 20001 XOXOO XXOOO 0OX02 02000 OOXXX X2XXO OOOOX XO XXX44 XXXX2 4X04X 46X7X X432X X6X76 XX464 61X6X 7X111 61XXX XXXX3 1524X X7 XXXOO XXXX1 OXOlX 02X2X X221X X2X22 XX121 22X2X 2X221 22XXX XXXX2 1112X X2 XXXXX lXXXX XXXXX XXOXX XXXXN XXXXXXXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XX XXXXX X2XXX XX2XX X22XX X2XXX XXXXX XXXXX 2XXXX 2XXXX XX2XX XXXXX XXX2X 2XXXX XXXX1 XX XXXXX XlXXX XXOXX XlOXX XOXXX XXXXX XXXXX OXXXX OXXXX XXiXX XXXXX XXXOX lXXXX XXXXX XX XXXXX XiXXX XXXXX XOXXX XXXXX XXXXX XXXXX XXXXXXXXXX XXlXX XXXXX XXXXX OXXXX XXXXX XX 22222 21222 10100 21112 01202 02211 22222 10111 11222 02122 12222 02101 12102 20220 11 00000 00000 00001 00000 00000 10000 00000 00000 00000 00000 00000 00000 00000 00000 00 10110 OX22X XX12X 00000 00000 00001 11 XXXXO 02 XXXXX 1X 00000 00 01000 00 XXXXX lXX4X 2XXXX XXXXX XXXXX XXOX1 XXXXX XXXXX XXXXX XXXXX X4XXX XX XXXXX 2XX1X 1XXXX XXXXXXXXXX XXlXl XXXXX XXXXX XXXXX XXXXXX2XXX XX 00010 01000 00001 00000 00000 01000 00000 00000 00100 00110 00000 00 02200 00000 00000 00000 20200 20000 00200 20000 00020 20000 00002 00 XOOXX XXXXXXXXXX XXXXX lXOXX OXXXX XXOXX OXXXX XXX3X OXXXX XXXX2 XX xxxxx XXXXX XXOXX XXOXX XXXXX XXXXX XXXXX OXXXX OXXXX XXOXX OXXXX Xxxix XXXXX XlXXX XX2XX X02XX XXXXXXXXiX XXXXX 2X2XX 2XXXX XXlXX 2XXXX XXX2X 33333 33332 74310 33363 17313 13324 33333 33333 37131 33323 33333 33533 00000 01000 11100 01110 01000 00011 00000 11111 11000 10100 10000 10111 11111 11111 10111 11110 11111 11111 11111 11111 11111 11111 11111 Nllll XXXXXXXXX1 XX OXXXX XXXX2 2X 33542 31332 32 10110 00001 11 11101 11111 11 a Columns are species code numbers. Recent OTUs are species 1-29. Rows are the 106 charactersdescribed in the Appendix. The first85 charactersare the integer-codeddata for the Recent OTUs. The rest are those necessary fordescribing the fossils. The characterstates,ranging numericallyfrom0 to 8, are in the body of the table. The following two stateshave special symbols:X-no comparison (NC) code; N-represents negative one (-1). 1983 167 CAMINALCULES: DATA BASE TABLE 1. 1-5 6-10 11-15 16-20 21-25 26-30 31-35 Continued. 36-40 41-45 46-50 51-55 56-60 61-65 66-70 71-75 76-77 66 67 68 69 70 XOOO1 00000 XXXX1 xxXXX 10000 33333 Oxxxx xxxxx XXXXX 00000 71 72 73 74 75 XXXXX XXXXX XXXXXXXX1o XXXXX lXXOX OXXXX XXXXXXXXXX XXXX1 XXXXX XXXXXXXXXX XXXXX XOXXX XX XXXXX XXXXX XXXXX XXXXO XXXXX Xxxix lXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XOXXX XX XxXXX xxxxX XXXXX XXXOX XXXXX lXXXX XXXXX XXXXXXXXXX XXXXO XXXXX XXXXX XXXXX XXXXX XXXXXXX 13333 33333 33323 01333 33311 30233 33332 33313 33333 30333 33333 33333 23233 33113 33333 33 OXXXX XXXXX XXXOX XOXXX XXX1o XXOXX XXXXO XXXOX XXXXX XXXXXXXXXX XXXXX OXlXX XXOOX XXXXX XX 76 77 78 79 80 Xliii XOllO Xxoxx XXllX XX02X 81 82 83 84 85 XlXXl XXOXX XlOXX XXXXXXXXXX XXXXXXXXXX XXXXX lXXXX lXXXX OXlXX XXXXO XXXXO XXXXX XXXXX XX XOXX1 XXXXX XOXXX XXXXXXXXXX XXXXXXXXXX XXXXX lXXXX lXXXX XXOXX XXXXXXXXXX XXXXX XXXXX XX XXXXX XX2XX XXOXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX XXXXX 3XXXX XXXXO XXXX1 XXXXX XXXXX XX X2222 XXXXX X2XXX XX2XX X20XX XXXXO XXXXO OXXXX 2XOXX 2XXOX XX2XO OOXXX XXX20 1OXXX XXXX2 OX 00330 00000 00000 00056 00000 50040 50000 00100 00000 01304 00000 00000 00020 00002 04003 00 86 87 88 89 90 XXXXX XXXXX XXXXX XXXXX XOXXX XXXX2 XXXXX 1XXXX XXXXX XXXiX XXXX1 X2XXX XXXXX X3XXX XXXXX XX xxXXX xxXXx XXXXXXXXXX XXXXX XXXXX XXXXX lXXXX XXXXX XXXOX XXXX1 XXXXX XXXXX XXXXX XXXXX XX Xxxxx xxxxx XXXxX XXXXXXXXXX XXXXX XXXXX OXXXX XXXXX XXXXX XXXX1 XXXXX XXXXX XXXXX XXXXX XX xxXXx xxxXx xxxxX XXXXXXXXXX XXXXO XXXXX XXXXX XXXXX XXXXXXXXXX XiXXX XXXXX XXXXXXXXXX XX XOOOO XXXXX XOXXX XXOXX XXOXX XXXXX XXXX2 XXXXX OXOXX OXXXX XXOXX OXXXX XXX02 OXXXX XXXXO 1X 91 92 93 94 95 XXXXX XOOXX XXOOO XXXOO XXXXO OXOOX OiXiX XiiOX XiXii XXiiO lOXiX iXOOO iOXXX XXXXi OOOiX Xi XXXXX XXXXX XXXXX XXXXXXXXXX XXXXXXOXOX XOOXX XOXOO XXOiX OXXOX OXXXX OXXXX XXXXO XXXOX XO Xxxxx xxxxx xxxxx XXXXXXXXXX XXXXO XXXXX OXXXX XXXXX XXXXX XXXXO XOXXX XXXXX XiXXX XXXXX XX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXi XXXXX OXXXX XXXXX XXXXX XXXXi XiXXX XXXXX XOXXX XXXXX XX xxXXX xxXXx XXXXXXXXXX XXXXX XXXXi XXXXX XXXXX XXXXX XXXXXXXXXO XlXXX XXXXXXXXXX XXXXX XX 96 97 98 99 100 XXXXX XXXXX XXXXX XOXXX XXXXx XXXXX XXXXX XXXXX Xxxix XXXXX XXXXX XXXXXXXXXX XXXXX XXXXX Xi 10000 XXXxXXOXXX iiOXX XOOiX XiXXX XXXOi XXXXX OXOiX OiXXX XXOXX OXXXX XXioi OXiiX XXXXO 10 OXXXX XXXXX XXXXX OOXXX XXXOX XOXXX XXXio XXXXX XXXOX XOXXX XXXXX XXXXX XXOXX XXOOX XXXXX Xi XXOOO XXXXX XOXXX XXOXX XOOXX XXXXX XXXXXXXXXX OXOXX OXXXX XXOXX iXXXX XXXOX OXXXX XXXXO XX 00000 00000 00000 00000 00000 00000 00000 00000 01000 00000 10000 00000 10000 00000 00000 00 101 102 103 104 105 XXXXX XXXXX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXXXOXXX XXXXX lXXXX XOOOX 00000 00000 OXOXX OOXOX XOOXO XOOOO 00010 00000 OOXOX 20000 XXXXX 00000 OXOOO XXXOO OXXXO OXOOO OOOOX 00000 XOXOO XXOOO iOXOO XOOOO 00000 00000 XXOOO OOOXO OXOOl 00010 00000 01000 OXOOO 10000 xxxxx xxxxx xxxxx xxxxx xxxxx xxxxi xxxox xxxxx xoxxx xxxxx oxxxx 106 XXXXX XXXXX XXXXX XXXXX XXXXX XXXXO XXXXX XXXXX XXXXX XXXXXXXXXX XiXXX XXXXX XXXXX XXXXX XX 00000 OXOll OOlOX 10010 10000 00000 00000 00101 00000 00000 XOOOO OOOXi 01000 00 XXXXX XXXOO XXlXX OXXOX OXXXX XXXXX XXXXXXXOXO XXXXX XXXXXXXXXX XXXX1 XOXXX XX 30333 11133 30013 31333 33331 33333 03233 11333 33033 33333 23101 03113 33330 13 xxxxx Oo1xx xxxoX XOXxX XXXXO XXXXX XXlXX lOXXX XXXXX XXXXX OXOXO XXOOX XXXXX OX OXOOO Xxxii OXXXO iXO10 lOOOX 00000 XOXOO XXOO1 OOXOO 00000 XOXXX XOXXO OlOOX XO 01100 OliXO XXOll OlOXX lXX1o lOOOX OOOXO 10000 iXOOl 10100 00001 XOXll lOXXO 11001 00 XlOXX XOOXX Xxxii XlXXX lxxix lXXXX XXXXX OXXXX OXXX1 OXOXX XXXXO XXX1o lXXXX llXXl XX XXXXXXXXXX XXXXXXlXXX XXXXX XXXXX XXXXXXXXXX XXXXX XXXXX XXXXX XXXOX XXXXXXXXXO XX XOXXX XXXXX XXXOO XlXXX OXXOX OXXXX XXXXXXXXXX XXXXO XXXXX XXXXX Xxxix lXXXX OOXX1 XX X2XXX XXXXX XXX10 XOXXX 1XXOX OXXXX XXXXX XXXXX XXXX1 XXXXX XXXXX XXXOX 1XXXX 2OXXO XX greatlydiffering timeperiods.Thereare also four lineages that underwentevolutionary changebeforebecomingextinct.The extinct terminal species of these fossil lines are shown as large hollow circles.I have indicated the amount of evolutionarychange(path lengthof the internode)based on 85 characters by thelengthofthethickenedbars along the slante,dlines. Note that path lengthsforlines leading to fossilsare based on only 85 charactersto make themcomparable to path lengthsof Recentforms.Path lengthsbased on all 106 characterswill be illustratedin paper III of this series. XXXXX 2XXXX XXXXX XXXXX XX 00000 XOOOO OO1XX OXOOO 00 10000 XOXXX XOXXO OOOOX XO 01000 10000 OOXXO 00000 01 xixxx xxxxx xxxxx xxxxx xo MEASURES OF PHENETIC AND EVOLUTIONARY CHANGE To describeevolutionarychanges in this and succeeding papers, various statistics based on the charactersand the true cladogram are needed. They are summarizedin Table 2. Some termsneed to be defined.An entiretaxon consistsof t OTUs and is describedby n characters.OTUs are labelledas as 1, ... , hi 1, ... , j, k,.. ., t and characters if..., n. The mostrecentcommonancestor ofall oftheOTUs is o. z is summationover all OTUs indexedby j, usually t OTUs. 168 1.00 0.80 SYSTEMATIC ZOOLOGY VOL. 32 4 3 22 12 2 5 18 23 16 27 24 17 1 6 10 11 9 21 7 15 8 14 13 28 25 26 19 29 20 X 0.600.40- 0-0.20 -0.40 FIG. 3. Standard phenogram of the 29 Recent Caminalcules based on the updated data set of this study. It is based on the data matrixwith integer codes substitutedfor the measurementsand was obtained by and UPGMA correlationcoefficients standardizationof charactersfollowedby computationof product-moment clustering.The ordinate is in correlationcoefficientscale. The phenetic genera are labeled with capital letters. Let us assume that we know the true cladogramof the taxon,as is the case in the Caminalcules.The nodes of the treewill be the Recent OTUs and their ancestors,and the internodeswill be the lines of descent connectingthese taxa. ThroughoutI shall assume that evolutionary(characterstate) change occursonly along internodes.Let I11 be the sum of the lengthsin characterstate changesover all internodesalong the directed path fromOTU j to o, the most recent common ancestor (root) of the taxon, summedover all characters.Quantitylohas been called the patristiclengthof OTU j by Farris(1969) and others.I shall referto it by theneutralterm"path length,"since it mea- sureshomoplasticas well as patristicresemblance (Sneath and Sokal, 1973).Next,let us define Lmax(u) = Ioas the sum of such lengthsover all t OTUs. The quantityLmax(u) is the maximallengthwhich the truecladogram could assume if it were rearrangedso thateveryOTU evolves singlyand independentlyfromthe ancestoro, with the OTUs diverging from that ancestor in bushlike fashion and repeating separatelyfor each OTU the changesin characterstatesthatactually occurredalong the common evolutionarystems in the true cladogram.Thus Lmax(u) is a measurewhich indicatestheupper possible bound of parallelismand reversals FIG. 4. True cladogram of the Caminalcules as furnishedby J.H. Camin. Morphological change occurred during 19 time periods fromtime 1 to time 20 along the ordinate. Vertical lines indicate periods without morphological change, slanted lines indicate such change. The amount of evolutionarychange (path length of the internode)based on 85 charactersis shown by the length of the thickenedbars along the slanted lines. To furnishan indication of scale, the length of the internodesubtending OTU 54 is one Manhattan distance unit, thatsubtending OTU 76 is 10 such units. Path lengths based on 106 characters,necessary fordifferentiating fossil species, are shown in a subsequent publication (Sokal 1983b: fig. 2). Squares identifyRecent 1983 CAMINALCULES: DATA BASE F B 2619 2029 20-illl 26 19476 /6 21 11 10 11111 C DE 9 4 3 22 5 12 2182316 20 72 2 17 - 24 9 29 66 28 68 53 63 61 3 18 14 13 -19 12 II 50 - 61 56 6 7 51 344 10 9- 761525 34 I 10 30 7 67 ~~~~~~~~~44 ~~~~~~~2 33 57 67 8 7 -404 30 54 6 36 49 3 37 32 74 14 13 39 557 5 2 25 413 15 3 7 1226 16 4 A I 151413288 272417 7 21 18 169 62 5 ~~~~~~60 5 5 73 species, black circlesfossilspecies. However, note thatsome fossilspecies extend into the Recent (e.g., species 8). Hollow circles indicate extinctspecies. Large hollow circles terminateextinctlineages whereas the tiny ones symbolize extinctspecies whose lineages continue with evolutionary change. The numbers next to squares and circles identifythe species. Recent species have been given the numbers 1 through29 familiar fromthe literature.Fossil species were assigned numbers 30 through 77 at random. Note that, although Camin indicated evolution between species 58 and 52, my coding shows no morphological change between these two forms.The internode between species 36 and 55 appears to be a similar case, but it exhibits morphological change for charactersnumbered 86-106. The phenetic genera of the Caminalcules are identifiedby bracketsacross the top of the tree. 170 SYSTEMATIC ZOOLOGY VOL. 32 2. Summaryof formulasof phenetic and evolutionarychange.a TABLE Ijo= Sum of lengths in characterstate changes over all internodesalong the directed path fromOTU j to most recent common ancestor o = i;, where 2 is summation over all OTUs Lma,,(,,) MD,, = I j IX X,JXI where : is summationover all charactersand X, is the characterstateforcharacter i and OTU j. OTU o is the most recentcommon ancestor Lmax(l)= 3 MD,, m,(l) R, = Lmax(u)IlL = Range in characterstatesof characteri forthe assemblage of t OTUs plusthe most recentcommon ri(tO) ancestor o Lmzn(,)= r,(t,,) = Minimum length forthe taxon,given any tree structureoriginatingfromthe common ancestor o, Lmln(u, and an evolutionarymodel forallowable types of characterstate changes Lac,= Sum of path lengths over all internodes,given knowledge of the true cladogram Lest= Sum of path lengths over all internodes of an estimatedcladogram - La,t)I(L.X(u) - Lmin(,))(Dendritic index) DI = (Lm,,x(u) H =La.tILmm(l) = 1/C, where C is the consistencyindex of Kluge and Farris(1969) H* LestL,Lm,n()* DR = (l,k ik MDjk)! ; MD,k, where 3 Ik (deviation ratio of J.S. Farris) is summation of all OTU pairs except for pairs jj and kk jk -2), where the subscript M refers to the subtaxon M Pbr,= Lbr,.,gI(lpp' + Lb,avg), where p is the most recent common ancestor of subtaxon M and p' is the most Lb,avg = L.t,MI(2tM recentcommon ancestor of thatsubtaxon that is also an ancestor of a Recent nonmemberof M Presented in order of appearance in text.Quantities with asterisksare based on estimatedratherthan true cladograms. in the data, given the distributionof characterstatesover RecentOTUs and the characterstatesof the fossilOTUs. A generallyshorterlengthforeach OTU j resultsif one computes of the ratio of unnecessary changes (reversalsand repeatedforwardchanges)among the evolutionarysteps for that OTU. Consequently, R = 1.0 (2) is a measureof the amountof reversalsand repeatsin characterstatechangesforthe enwhere Xi1is the characterstateof OTU j for tire taxon consideredas a bush. Note that character is summation overthen reversalsnear the base will be repeatedin i and various OTUs and, hence, weighted more characters. NotethatMDj, is theManhattan heavily. o. It is distancebetweenOTU j and ancestor Two measuresof minimumlengthof the the minimumpossibleevolutionary length treewere considered.DefineLmin(l)=ri(t=), MD,,, X= xi- Lmax(u)ILmax(l)? Xiol(1 o. We can foreachOTU j fromtheancestor assembletheselengthsto forma newupper where ri(t,)standsforthe range in character statesof characteri forthe assemblageofthe boundlengthof the entiretreeLmax(l) - MD,,givena minimallengthbushfromthe ancestor. NotethatsinceMDjo? 1jo0Lmax(l) Forany OTU j otherthantheancestor Lmax(u). > 1.0is a measure o theratioRjo= 1jo/MD1O t OTUs in thetaxonplusitsmostrecentcommon ancestoro. Length Lmin(l) is the minimumamountofevolutionnecessaryforproducingthetaxonomicdiversityofthe t OTUs in thetaxon.It is a theoreticalquantity,rare- 1983 CAMINALCULES: DATA BASE 4, Lmnt,e) Lmin(U) I II HOMOPLASY FIG. 5. L ct Lmox(,e) Lmax(iu) DENDRITIC INDEX Schematicto show the relationsamong the measures of phenetic and evolutionary change discussed in the text.Note thatthe various statisticshave been placed along a line representingthe range of possible lengths of an evolutionarytree in positions that they might assume in a "typical" tree. In any actual case, the length of any segments of this line may differfromthat shown in the figureand may indeed be zero. The double-arrowed bracket above the line indicates that La,Cmay vary between Lm() and L and is undefined with respect to its position vis-a-visLmX(,)The double headed arrows at the bottomof the figureindicate the proportionsof the length due to homoplasy and to its dendritic structure. The location of the boundary between these two (the opposed arrowheads) is at the value forL,,. 171 ancestor,can be computedonly by enumeration,which is practicalforsmall numbers of OTUs only.For mostreal data sets,informationis available only forRecentOTUs. In such cases an estimatedcladogramis produced by a numericalcladisticalgorithmusing an outgroupOTU believedto be close to the mostrecentcommonancestoror a vector of characterstatesbelieved to be primitive. As a resultofthecladisticestimationprocess, the HTU ultimatelyspecifiedas the mostrecent commonancestorof the group may be different fromthe outgroupfirstfurnished. In thesecases,thestatistics Lmax(l)' Lmax(u)' Lmin(l)' and Lmin(u) must be defined with respect to the estimatedcladogramand the HTU representingthehypothesizedmostrecentcommon ancestor.To distinguishthese statistics based on estimatedratherthan true cladograms,I have added asterisksto theirsymbol. The equivalent of the quantityLactfor the estimatedcladogramobtained by a numericalcladisticproceduresuch as a Wagner treealgorithmis the lengthLest.For any given data set and evolutionarymodel,given a ly, if ever, obtained in real data because of theactualdistribution ofcharacterstatesover the OTUs. For Lmin(l) to be realized, there would have to exista cladogramforthe taxon forwhich all characterswould have to be compatible.In such a case therecould be no specified root o, Lest> Lmzn(u)*.This estimate reversalsor parallelisms.Since this is a hy- by various computationalalgorithmswill pothetical,largely unattainable minimum frequently notproducetheminimumlength * Note that Lestcan be less than or length, a second minimum length Lm,n(,)Lmzn(u) needs to be definedas the minimumlength greaterthan La,t forthe taxon with t OTUs, given any tree Thesequantitiespermitcomputationofthe structure originating from the common followingstatistics. The saving in evolutionancestoro and an evolutionarymodel forthe ary lengthby the known treestructurecan allowable types of characterstate changes. be defined as Lmax(u)- Lact.Reversals and par> Lmin(l) since some homo- allelismsare in bothcoefficients NecessarilyLmin(,) but theyare morenumerousin Lmax(u) plasy is usuallypresent. since thisis a bushGiven knowledgeof the true cladogram, like structurerepeatingthe path length of the actual lengthof the tree,Lact,is the sum common shared stems separatelyfor each of the lengthsover all the internodes.Thus OTU. A dendriticindex can be defined it representsall the evolutionarychanges DI= (Lmax(u) -Lact)I(Lmax(u) - Lmin(l)), (3) over all charactersthathave taken place on thistree.Clearly,Lmln(u)< Lact < Lmax(u)but Lact which expressesthe savings in lengthdue could be greateror smallerthan Lmax(l). The to the tree's dendriticstructuredeparting relationsamong the various quantitiesare fromthat of a bush as a proportionof the illustratedin Figure5. tree'smaximumpossible range in length.If QuantitiesLactand Lmax(u)are computable DI = 0, thenthe taxonis a bush and thereis only in those extremelyrare cases, such as no common evolution. If DI = 1, then the the presentone, in which the true evolu- characterstatesare fullycompatibleon the tionarysequence is known; quantitiesLmin(l) cladogramand thereare no parallelismsor and Lmax(l)requireat least knowledgeof the reversals.Because basal internodesare more most recent common ancestor o. Quantity ofteninvolved in determiningIjothan are internodesnear the tips of the tree, DI is Lmin(u)'even with knowledge of the common SYSTEMATIC ZOOLOGY 172 more heavily affectedby savings in length in the dendriticstructurenear the base. For data setswithunknowncladogeniesone can define DI* (Lmax(u)*-Lest)I(Lmax(u)* - Lmn(l)*). (4) VOL. 32 wise homoplasticdistancesdivided by the sum of the Manhattandistancesamong all pairs of OTUs. The pairwise homoplastic distancesforOTUs j and k are foundas 1]kMD1k,whereas the Manhattandistancesare MD,k.Therefore, A second quantity,Lact - Lmln(l) describes DR= z (,k- MDjk)/I MDjk, the excess in evolutionarylength over the jk jk (8) absolute hypotheticalminimum.It may be usefulto partitionthis excessinto two parts where z is summationof all OTU pairs jk as follows: jk except for pairs jj and kk. This ratio is afLmin(l) = (Lmln(u) Lact Lmin(l)) fectedmoreby homoplasyat the base of the + (Lact -Lmin(u)). (5) tree, even though the denominator also The firstquantifiesthe necessaryparallel- countsbasal lengthsrepeatedly,because any isms to allow forthe departurefroma fully excess length due to homoplasy will be compatible distributionof characterstates counted more oftenfor basal than for terover OTUs and the second term describes minal internodes. the extraparallelismsand reversalsthat ocA second measureof characterreversalR2 curredin the actual evolutionaryhistoryof is analogous to the DR ratio.It is definedas the group over the minimumamount nec- the sum of the pairwisedistancesdue to reessaryto accountforthe observeddistribu- versals and nonparallel repeated forward tion of characterstatesover OTUs. Biologi- changes divided by the sum of the homocallyspeaking,however,the two termsmay plastic distancesamong all pairs of OTUs, to distinguishsincebothdescribe which is the numeratorof DR. It indicates be difficult departuresfromperfectconsistencyof char- the proportionof homoplasy due to reacter states with the cladogeny. Clearly, versals.Its 1-complementis the proportion is a measureof homoplasyas it due to parallelisms. Lact -Lmin(l) is conventionallyunderstoodand when exIn studyingsubtaxawithina largertaxon, pressed as a proportionof the maximum as, for example, the genera in the present possible range of lengthof the treeLmax(u)study,it is usefulto definethe above statistics also for subtaxa. In such a case when Lmin(l) is simply1 - DI (see Fig. 5). An alternativestatistic, adoptedin thispa- workingwithsubtaxonM, I employthemost recentcommonancestorPM' or simplyp, of per,is H = LactILmin(l1) (6) that taxon, replacing o by p in the above formulas.Summationsover OTUs will then which expressesthe homoplasy as a ratio, be carriedout not over the t OTUs of the necessarilygreaterthan 1. The amount by entirestudy,but only over the tM OTUs of which H is greaterthan unity is the extra subtaxonM. lengthof the actual tree,expressedas a proIt is usefulto recordthe path lengthofthe portionof the minimumtree lengthneces- stem of each of the subtaxa. Following the saryforevolutionof the characterstates.For earliersymbolism,thiscan be defined as 1pp,,, data sets with unknown cladogenies where p' is the most recentcommonancesH = LestILmln(l)*. (7) torthatis also an ancestorof a Recentnonmemberof subtaxonM. An average branch Note that H* = 1/C where C is the consis- lengthLbr,avg of the internodessubtendedby tencyindex of Kluge and Farris(1969),also the most recentcommon ancestorp is also employed by Kluge (1976) and Mickevich useful.It can be computedby dividingLact,M, the observedlengthof the treerepresenting (1978a). A thirdmeasureof homoplasyis DR, the the subtaxon,by 2tM- 2, the numberof its deviationratiofeaturedin Farris'WAGNER internodes.Comparing lpp,with Lbr,avgcon78 program,which is the sum of the pair- traststhe evolution on the stem preceding 1983 CAMINALCULES: DATA BASE 173 pectsin 19 zoologicaldata sets,rangingfrom 8 to 97 OTUs and based on from20 to 139 characterslackingNCs. None of these data sets was subjected to the exhaustive analysisto which the Caminalculeswere treated (9) (recountedin subsequentpapers of this sePbrl Lbr,avgl(lpp'+ Lbr,avg). An additional complicationarises when- ries)fortheobvious reasonsthatsuch a projever NCs are presentas in thisstudy.I have ectwould have takenseveraladditionalyears, adopted two conventions,which are modi- thatfindingoutgroupsforthemwould have fications of the Manhattan distance. In takenmuchspecializedknowledge,and that "transparent" measure the NC state is the truecladogeniesof the data are of course thoughtof as an unknown.For computation unknown.As will be seen below,the results of distancesbetween terminalOTUs, char- foralmost all parametersbracketthose obacter states for OTU pairs, one or both of tainedforthe Caminalcules.It is not expectin treetopologythat which had NCs fora given character,were ed thatthe differences ignoredduringthe computationof the dis- might be accomplishedby repeated applitance and NCs were passed over during cation of a cladisticalgorithmto the same computation,of path lengths. Thus a 0 - data set in the hope of findinga shortertree NC - 3 - 4 path is four units long. In would alter these relationswith respectto "opaque" measureNCs are considereda dis- homoplasyand othermeasures.I have sumtinctcharacterstate and changes along the marized my conclusionsin Table 3 forhotreefroman expressedcharacterstateto an moplasy, tree symmetryand adequacy of unexpressedone (NC) are considereda sin- characters. Homoplasy.-Threeindices of homoplasy gle step. SuccessiveNC statescount as zero steps and a change froman NC to an ex- have been mentionedearlierand in the litpressedstateis again a singlestep,regardless erature.These are the 1-complementof the of the magnitudeof the expressedcharacter dendriticindex(1 - DI), thehomoplasyratio state.Thus the path 0 - NC - 3 - 4 is of H, which is the reciprocalof C, the consislength3. Computingdistancesbetweenter- tencyindex of Kluge and Farris(1969),and betweentwo NCs Farris'deviationratio DR. In the CaminalminalOTUs, thedifference was consideredto be zero,thatbetweenany cules,two values can be obtainedforeach of numericalstateand an NC was considered theseindices.One is based on thetruecladoto be one. Opaque distance was employed gram and should be the correctmeasure of mainlybecause it made it simple to express homoplasyin the groupforthe given index; path lengthforany internodealong the tree the other,comparableto those furnishedin and to partitionthe path lengthinto homo- the literatureon real organisms,is based on plastic and reversaldistancesas needed in the estimatedcladogram.As an estimateI the nextsectionof thispaper. In transparent employedthe approximateWagnertreeobmeasureit is notalways possibleto uniquely tained with the WAGNER 78 programdeveloped by J. S. Farris,using the distance estimatepath lengthforeach internode. Wagner procedure and midpoint rooting. Opaque distanceswere analyzed to permit RELEVANCE OF CAMINALCULES FOR computation of all homoplasy statistics. SYSTEMATIC INQUIRY However, forthe two statisticsthat can be It is of interestto examine in how many computed in transparentmeasure (H and ways theCaminalculesresembledata setson DR), these values are reportedas well. Stathe tisticsbased on estimatedcladogramsare disrealorganisms,sincethiswill strengthen relevanceof resultsobtainedfromthem for tinguishedby affixingan asteriskto their systematicinquiry.Below I inspectas many symbols. ApproximateWagner trees were separateaspectsof the Caminalculesas seem computedforthe 19 data sets, again using to me relevantto systematicmethods and the WAGNER 78 programand rootedby the principles.I also examine threeof these as- midpointmethod.Becausethesedatasetslack the mostrecentcommonancestorof the genus with the subsequent evolution within thegenus.The averagebranchlengthcan be expressedas a branchlengthproportion, = 174 SYSTEMATIC ZOOLOGY TABLE VOL. 32 3. Minima and maxima of tree statisticsfor 19 zoological data sets and the Caminalcules.a Caminalcules Tree statistics Homoplasy 1 - DI* H* C DR* Symmetry BSUM2 BSUM3 SHAO2 SHAQ3 COLLESS2 Low value Estimated True High value 0.0559b 0.1326 1.417 0.7057 0.4646c 6.690d 0.8621b 0.1124b 0.1795 0.1745 2.327 0.4298 0.2597d 0.4900 0.2420 0.6186 0.5225 0.3331 0.3779 0.0753 0.7720 0.7146 0.1693 0.7714e 0.3462g 0.8021c 0.6840c 0.4889g 1.16h 2.93 2.93' 6.62i 1.160b 0.1495d -0.0976' 0.4483h 0.3421h 0.1061' 1.3591 1.7395d Adequacy nit High and low values referto numerical values, not necessarilyto the propertydescribed. Thus, high values of C indicate low homoplasy, and high values of BSUM2, BSUM3, and COLLESS2 indicate low symmetry(high asymmetry).The values shown are based on means of three runs of the WAGNER 78 program with midpoint rooting for the symmetrymeasures, and on single estimatesfor the other measures. Other cladistic estimateswere run as well; fordetails see text.The resultsconformto those shown in this table. The data sets exhibitingextremevalues are identifiedin footnotes.For explanations of tree statisticssee text. b Leptopodomorpha (Schuh and Polhemus, 1980). c Orthopteroidinsects (Blackithand Blackith,1968). d Hoplitiscomplex (Michener and Sokal, 1957). e WesternBufo(Feder, 1979). f Drosophila(real set; Throckmorton,1968). 8 Hemoglobin , (Dayhoff,1969). h Dasyuridae (Archer,1976). Is 5.34 if binary coded data are considered. l Pygopodidae (binary coding; Kluge, 1976). a NCs, the distinctionbetween opaque and transparent measur(edoes not apply to them. As an experiment,I also triedtreeswith an outgrouprootingusing OTU 1 as the outgroup (clearly not a recommendedprocedure,althoughit has been used by some numericalcladists;e.g.,Mickevich,1978b;Farris, 1979a).Only midpointrootingis reportedin Table 3. Resultsforthe OTU 1 rootingprocedurewere similar. It is worthemphasizingagain that,both for the Caminalcules and the 19 real data sets, the above method provides only approximate Wagner trees. Better estimates could have been obtainedwithfurther work and were indeed obtained forthe Caminalcules (see Sokal, 1983a).I am employingthe "cruder"estimateforthe Caminalcules,since it is comparableto the estimatesin the 19 data sets.Presumablyhomoplasywould decrease if the length of the trees could be shortenedalgorithmically. But,on the average, thiswould occurproportionately forall data setsand, sincetheCaminalculesare cur- rentlybracketedby the data sets on real organisms,theypresumablywould stillbe so bracketedafterthe length of all trees had been reduced. In the Caminalcules 1 - DI = 0.1745 and 1 - DI* = 0.1326.In the 19 data sets 1 - DI* rangesfrom0.0559in the Leptopodomorpha (Schuh and Polhemus,1980) to 0.4646in 12 orthopteroidinsects(Blackithand Blackith, 1968). Because the true trees for these real organismsare not known, the propercomparison with the Caminalculesis with 1 DI* fromthe approximateWagnerestimate. For the truecladogramH = 2.327,and for the Wagner estimateH* = 1.417 and 1.261 for opaque and transparentestimates,respectively,yielding correspondingconsistency indices of 0.4298,0.7057,and 0.8048. Mickevich(1978a) reporteda range of consistencyindicesfrom0.33 forAedesand papilionidsto 0.86 forcytochromeC and globin. In the 19 data sets,H* rangesfrom1.160in the Leptopodomorphato 6.690in bees ofthe Hoplitiscomplex(Michenerand Sokal, 1957), 1983 CAMINALCULES: DATA BASE 175 corresponding to consistency indices of WAGNER program,rootedat the trueancestor,and an estimatedWagnertreeobtained 0.8621and 0.1495,respectively. The deviationratio(DR) forthetrueclado- with the WAGNER 78 programwith midgram is 1.3591,a very high value, but DR* pointrooting.The latterwas replicatedthree is 0.1795 and 0.3852 for opaque and trans- timeswithrandomlypermutedinputorders parentWagnerestimates,respectively.This of OTUs. great discrepancycan be accounted for by These same statisticswere also calculated the numerous reversalsin the true clado- forsimilarlyestimatedcladograms(exceptfor gram,especiallyalong the stemsdefiningthe the impossibletrueancestorrooting)forthe genera. Rememberthat the estimateopti- 19 data sets.Again,rootingusingOTU 1 was mizes the positionof the HTUs and, in con- employed additionallyon an experimental sequence,attemptsto minimizehomoplasy. basis for trees obtained by both programs. The formulaforcomputingthe deviationra- Not all data sets were suitableforeach estitio involves all pairwise distancesbetween mationmethodand statistic, but no reported OTUs and, therefore, repeatedlycountsthese rangeofvalues is based on fewerthan14data basal relations.Actuallytheparallelismcom- sets.In Table 3 resultsare reportedonly for ponentofthe homoplasyis relativelylow in estimatedcladogramsbased on the WAGthe Caminalcules.For the lower triangular NER 78 programand midpointrooting,but matrixin opaque measure,the sum of the below the otherresultsare summarizedas homoplastic distances is 21,225, of which well. For WAGNER 78 with midpointroot17,093is due to reversalsand 4,132is due to ing, the truecladogramis containedwithin parallelisms.The Wagnertree algorithmin the range of observed symmetriesfor all tryingto obtaina shortestlengthtreecancels coefficientsexcept SHAO3. But the true many of the reversals,so thatthe deviation cladogramshows more symmetrythan the ratio actually observed is relatively low, observedrange of symmetryvalues fores0.1795,whereasthedeviationratioofthetrue timatedcladogramsbased on the PHYLIP cladogramis much higher.The rangeof ob- WAGNER program and on WAGNER 78 served values of DR* in the 19 data sets is with OTU 1 as an outgroup.Yet, when esfrom 0.1124 in the Leptopodomorpha to timated cladograms are computed for the 1.7395 in the Hoplitiscomplex. If reversals Caminalcules,these are almost always less are as common in real organismsas in the symmetrical than the truecladogram.In TaCaminalcules and if we knew their true ble 3 formidpointrootedWagnertreesthe cladograms,real organismsmight well ex- symmetriesof estimatedcladogramsof the hibitgenerallyhigherdeviationratios.Real Caminalculesare containedwithintherange and apparent homoplasy in the Caminal- of observedvalues and, when otherestimatcules is well within the reportedrange for ed cladogramsare consideredas well, this real organisms. relationholds in 17 out of 20 comparisons. 5ymmetry.-Thetrue cladogram of the Since theother19 classifications are all based Caminalculesis fairlysymmetrical. No uni- on estimatedcladograms,it is the estimated versallyacceptedcriterionof symmetryhas cladogramsoftheCaminalculesthatmustbe yet been established,but K. T. Shao (pers. compared with these data sets ratherthan comm.),who is investigatingthis problem, the truecladogram.Thus on thisbasis, also, has used fiveseparateindicesto describedif- it appears that the Caminalcules are not ferentaspectsofsymmetry. Theseare BSUM2 atypical. and BSUM3 (modifiedfromSackin, 1972), forresolvingthe Adequacyof thecharacters SHAO2 and SHAO3 (Shao,pers.comm.),and cladogram.-The29 Recent OTUs in a fully treewould derivefrom27 bifurCOLLESS2 (Colless, 1982).These coefficients bifurcating were computedforthetruecladogramofthe cations,not countingthe basal bifurcation at Caminalculesas well as forseveralestimated the root.Since one of the branchingpoints cladogramsof the Caminalcules. The esti- in the true cladogramis a trifurcation, 26 mated cladogramsinclude one obtained by synapomorphiesare minimally needed to means of Joseph Felsenstein's PHYLIP resolve the tree. How can we measure the 176 SYSTEMATIC ZOOLOGY VOL. 32 adequacy of the data set forthistask?At the termsof the charactersavailable,in addition simplest level, one can count characters. to the known trifurcation-which goes Fewer than 26 binarycharacterscannot,re- counterto the assumptionsof mostcladistic solve the tree. Subjectedto additive binary methods.Correlatedcharactersundoubtedly coding, the 85 X 29 data yield 155 binary definefurcationsredundantly,and parallelcharacters-superficially a more than ade- ism allows a single characterto definefurquate number.However, such an approach cationsin different partsof the tree.Unforis too crude since it does not allow forchar- tunately,knowledgeof thistypeis virtually actercorrelations.In fact,we know that in impossibleto obtain fromreal data sets,beadditionto the single trifurcation, threebi- cause the truecladogramsare unknown.Usfurcationsin the true cladogram(19-26,11- ing estimatedcladogramsforsuch inferences 21, 18-23)are not supportedby any evident would tend to make the argumentcircular. evolutionarychange in the stems subtend- In any case,it appears thatthe Caminalcules ing them.Thus,even thoughthereare more are unlikelyto be less capableofhavingtheir thanfivetimesas manybinarycharactersin cladogramresolved than most data sets in thisdata set than are necessaryto definethe systematics. tree,they are distributedacross the tree in Becauseofthe incompletenessofthe fossil such a way as to makeit impossibleto define record,less is known fromreal organismsof more than 23 branchingpoints. Most nu- thefollowingthreeaspects.However,it may merical cladisticstudies are carriedout on be of some interestto treatthese topics at far fewer charactersand because the true least brieflyin an effortto describewhere cladogramsof real organismsare unknown, the Caminalculesare locatedwith respectto it is generallynot known whetherthe data evolutionary parameters-evolutionary rates, matrixis adequate forresolvingthetruetree. specieslongevities,and speciation-extinction In the 19 data sets the ratio n/t,where n is rates-thatmightbe employedin simulation the numberof charactersand t the number studies.It will be seen thatthe Caminalcules of OTUs, ranges from 1.16 in Dasyurus are not in contradictionwith such findings (Archer,1976)to 6.62 forbinarycoded mem- as are reportedin the paleobiologicalliterabersofthe Pygopodidae(Kluge, 1976).These ture. figurescomparewith 2.93 and 5.34 formulrates.-A frequencydistribuEvolutionary tistateand binary coded Caminalcules,re- tion of the path lengths(amount of evoluspectively(see Table 3). Since the values for tionarychange) of each internodefor each the 19 data sets reflectdifferencesin char- timeperiodshows extremeclumping.When actercoding,theymust be interpretedwith examinedagainstPoisson expectations(Fig. caution. Nevertheless,it is clear that the 6), the overdispersionis highly significant Caminalculesfallwell withinthe bounds of (P < 0.001). Thus, thereare many more pedata fromthe literature. riodsofno evolutionas well as moreperiods One can examine the true cladogramto with substantialamountsof evolutionthan determinethe number of OTUs subtended expected.Two explanationscan be advanced by any given furcation.It ranges from2 to forthis phenomenon.(1) The Caminalcules 22 OTUs. These figuresshould minimallybe are similar to real organismsin that their matchedby the numbersof OTUs (greater evolutionis of organicformas a whole raththan one) sharingany one characterstateto er than independent for each character. providesynapomorphiesforthe recognition Changes in form in the Caminalcules inof these furcations.When the requireddis- volve variouscorrelatedcharactersand thus tributionof OTU numbers is compared to those segments of the cladogram during theactualdistributions ofsuch numbers,one which extensive morphological evolution findsat least as many observedfrequencies occurred will exhibit greater amounts of as are required.However,thisis insufficient change.(2) Therealso maybe local clumping evidence forthe adequacy of a data set be- of evolutionarychangeson the treefornoncause we already know that three bifurca- biologicalreasons which have an analog in tionsin the Caminalculesare not resolvedin real phylogeneticprocesses. We must as- 1983 CAMINALCULES: DATA BASE 8- 177 5- 6 z 03 4 7 ............ L 0 >7 201 6 _ 1 23 4 ~~~~~ 4 7 8 FIG. 6. Observed and expected frequenciesof evolutionary rates (path length per evolutionary time period) in the true cladogram of the Caminalcules. These data are based only on those lines leading to Recent OTUs. The bar diagram of the "skyline" (bars above the abscissa) indicates the expected frequencies of a Poisson distribution;the "inverted skyline" (shaded portion of the bars) indicates the observed frequencies.Bothfrequenciesare given in square roots of actual values to emphasize the deviations. Where the invertedskylinedoes not reach the abscissa,there are fewerobserved than expected frequencies.Wherever it reaches below the abscissa, there is an excess of observed frequencies over expected frequencies. The ordinateshows square rooted frequencies,while the abscissa expresses evolutionaryrates. sume thatCamin in constructing his treewas at leastsomewhatgoal orientedso thatas he evolved any one line he carriedit through to some majormorphologicalchange rather thanabandon the changein midcourse.This processalso tends to induce inhomogeneity of rates among the lines which resultsin clumpingof the distribution of evolutionary changes when consideredoverall. The bio- o 2 4 6 8 10 12 14 16 18 20 TIME PERIOD FIG. 7. Average evolutionaryrate plotted against time period. The rate is expressed along the ordinate as average path length in the (unit) time interval preceding the point in time shown along the abscissa. The path lengthsare computed on the basis of 106 characters.The means of this figureare based on lines leading to Recent OTUs only. logicalanalogis to assumethatmajorchanges in environmentor in genetic architecture leading to adaptive radiation are ongoing processeswith a momentumof theirown. The data in Figure6 are based on onlythose lines leading to RecentOTUs. When all internodesincludingthose leading to extinct terminalspeciesare examined,theresultsare very similarto those already reportedand illustratedin Figure6. Similardata are hard to obtain in real organismsbecause of the incompletenessof the fossilrecord,but the observed patternof evolutionarychange is at least biologicallyplausible. In Figure7, I show a graphof the average evolutionaryratesoverall evolutionarylines foreach of the 19 time periods (again, only forlines leadingto RecentOTUs; the results includinglines leadingto extinctspeciesare quite similar).The ratesare computedfrom the 106 characterX 77 OTU data base. There is clear evidenceof differential evolutionary ratesthroughtime.Betweentimes7 and 11, rates were generally lower than at other times. By time 7, all genera except for DE and C had alreadybeen definedbut major within-genusdiversityhad not yet begun exceptin genus A. A runs-up-and-down test (Sokal and Rohlf,1981b),significantat P < 178 SYSTEMATIC ZOOLOGY 0.01,demonstratesthe alternatingnatureof the rate changes. Periods of higher change alternatedwithperiodsoflowerchangemore frequently thancould be expectedby chance alone. Relevant comparisonswith real organisms are hard to come by. That evolutionaryratesdifferover the fossilrecordhas been well established since the work of Simpson (1944, 1949,1953).Yet, because paleontologicalseriesare usuallybased on few charactersand because the fossillineagesare not well known, estimatesof evolutionary ratesbased on multiplecharactersas in Figure 7 are lacking.For singlecharacters, rates do vary with time within species (Stanley [1979:58ff.] illustratestwo examples) as well as within and among genera (Gingerich, 1974,1979).Clearly,thepatternofevolutionary (phenetic)ratesobservedin the Caminalcules is not uncharacteristicof observationsmade on real organisms. Specieslongevities.-Theknown cladogeny of the Caminalculesalso permitsan examinationof the distributionof species longevities.These are shown in Figure8 separately forthe 29 Recent species,as well as forall Recent-and-fossil species in the group. The longevitiesof Recent species are shown in two different ways. In Figure8A, I show the actual longevityfromthe time of originof each species to the Recentor to the time of extinction.Expressedas a cumulative percentage,it can be read as the percentageof livingspeciesthatextendbackwardsby various amounts of evolutionarytime. Such curves are generallynot used by paleontologists (Stanley, 1979:113),because the incompletenessofthefossilrecordwould make estimatesof longevitiesinaccurate.In the Caminalcules,of course,this graph is fully descriptiveand accurate.This curve can be contrastedwiththatin Figure8B,a so-called Lyelliancurve,which depictsthepercentage ofRecentspeciesthatcan be foundin faunas ofa givenage. As expected,withan increase in age ofthefaunalassembly,thepercentage of its species thatsurviveto the Recentdecreases. None of the five species extantat time period 4 survivedto Recenttimes,the faunal assemblyat time period 5 being the firstone to contributea memberspecies to the Recentfaunalassemblage.Althoughthe VOL. 32 A 100 IL > 80 60 - - 40 D 20 _ RECENT ---- RECENT+FOSSIL '\ s _ 2 4 6 -_^s 8 10 12 14 16 LONGEVITIESOF SPECIES IN TIME PERIODS Z Z I: X<LL\ J B 100 80 QO( 0 60 _ :z 40 wz w 3 0 0 W AL - 20 il 2 2 4 6 8 I tt I ._ 10 12 14 16 TIME PERIODS AGO (FAUNAL AGE) FIG. 8. Species longevities. (A) Longevities of species in given time periods plotted as cumulative percent,i.e., the percentage of living species thatextended backwards by various amounts of evolutionary time. Solid line-Recent species only; dashed line-Recent and fossilspecies. (B) Percentage of Recent species that can be found in faunas of a given age. total number of species in faunal assemblages was nondecreasingwith time, the number of species survivingto the Recent did not increase proportionately.This explains the lack of monotonicity in Figure8B. Hence,thereare timeintervalsduringwhich the numberof species in the faunal assembly increased without a correspondingincreasein the numberof species survivingto the Recent,producinga loweringof the percentageof extantspecies in the fossilfauna. In Figure8A, the longevitycurve forRecent OTUs can be compared with that including both Recentand fossilOTUs. Both curvesresemblethehollow curvesof similar data fromthepaleontologicalliterature, with theRecent-and-fossil curve,comprisingmore data,presentingthe smootherappearanceas mightbe expected.Note thatbothcurvesdecline to zero at time period 16. No Caminalcule species lived for more than that 1983 CAMINALCULES: DATA BASE amountoftime.Corresponding graphsin the paleontologicalliterature generallyapproach an asymptoteat a higherpercentagebecause the faunal assembliesare not tracedback a sufficient time forall the Recent species to disappear.The Lyellian curve,Figure8B, is not atypicalwith respectto those published in the literature(see figs.5-8 or 9-3 in Stanley, 1979). The model under which ProfessorCamin evolved the Caminalcules requires each species to have a longevityof at least one time period. The observed distributionof longevities is highly clumped (P < 0.001 againstPoissonexpectations); thereare many morespeciesthanexpectedbecomingextinct aftera single time intervaland substantial numbers survive for quite long periods of time.The latterphenomenonseemsto agree withobservationson realorganisms.The excess of short-livedspeciesis moredifficult to demonstratein real organismsbecause there would be an inherentbias againstobserving these species in fossil assemblages. Moreover, real organismsare not limitedto discrete time amounts as were the Caminalcules. That is, species existingforless than one timeunitclearlymustoccurin the fossil record.We may conclude,however,thatdespitesome peculiaritiesdue to theirmode of generation,the distribution of longevitiesin the Caminalcules resembles similar distributionsobservedin real organisms. ratios.-In the CamSpeciation-extinction inalcules,56.0% of all evolutionarychanges are accompanied by extinction, whereas 54.4% of lines showing no evolutionary change(stasis;verticallines in Fig. 4) lead to extinctions.Thus thereis no preferencefor one or the othertype of process leading to extinctionbuilt into the evolutionarytree. The availabilityofcompleterecordsin the Caminalculespermitscomputationof thenet rate of increase in number of species (R), speciationrate(S) and extinctionrate(E) for all time periods. In real data, R and E are usuallyapproximatedby variousindirectapproaches and S is estimatedas S = R + E. Here these values can be directlycomputed and they are R = 0.213, S = 0.538 and E = 0.324. Note that if R is estimatedusing an exponential growth model (Stanley, 1979: 179 104),the resultingvalue is 0.177,an underestimateofthe truerate.The absolutevalues ofR, S, and E cannotbe contrasted withthose of real organismssince the time intervals chosen for the Caminalcules are arbitrary But the relativemagnitudesof these quantitiescan be compared.The SIE ratioin the Caminalcules is 1.661. This compares with similar ratios estimatedunder varying assumptionsforPlio-Pleistocenemammals of Europewhich rangefrom1.310to 2.048,and fortemperateand subtropicalbivalvesofthe Pacific which range from 1.667 to 3.333 (Stanley,1979:117).Bothof these groupsare consideredby Stanleyto be undergoingactive adaptive radiationinto the present.By this criterion,the Caminalculesalso do not differmarkedlyfromreal organisms. Conclusions.-Bythe six criteriaexamined above, the Caminalculesfallwell withinthe rangeof observedvalues forreal organisms. At least with respectto these criteria,the Caminalcules simulate real organisms,although,of course,this is not to imply that theremay not be otheraspectsof evolutionary change in real organismsthat will be found lacking in the Caminalcules.But by these criteriaalone the Caminalcules offer considerablescope foranalyses of methods and principlesof numericalpheneticsand cladisticsand of taxonomyin general. ACKNOWLEDGMENTS This is ContributionNo. 463 in Ecology and Evolutionfromthe StateUniversityof New York at Stony Brook.The studywas greatlyaided by two unusually competentassistants.Leora Ben Ami initiatedthe recoding of the characters.Barbara Thomson completed this task,took over the very considerable amount of computations,and prepared the tables, chartsand illustrations, becoming so engrossedin thiswork that she became transmogrifiedinto a Caminalcule on Halloween's eve 1982. Throughoutthis work, I have had the unstintedcounsel of myfriendand colleague F. JamesRohlf. Much advice and computational assistance were given by other members of the Stony BrookNumerical TaxonomyGroup: Kent Fiala, Gene Hart, and Kwang-tso (Chet) Shao. I am indebted to JosephFelsenstein forhis valuable PHYLIP program package, to Kent Fiala for use of the CLINCH program,and to Leslie F. Marcus forfurnishingour laboratorywith a copy of the WAGNER 78 program.The manuscriptbenefitedgreatlyfroma criticalreading by Messrs. Fiala, Hart, Rohlf and Shao as well as by criticalreviews by Raymond B. Phillips, University of Oklahoma, and two anonymous reviewers. The 180 SYSTEMATIC ZOOLOGY sectionson evolutionaryrates,species longevities,and speciation-extinction ratioswere read by Michael Bell, Lev R. Ginzburg, and Jeffrey S. Levinton at Stony Brook, who provided useful comments. Barbara McKay uncomplainingly wordprocessed innumerable versions of this manuscript.JoyceSchirmerprepared the final illustrations.Last but farfromleast I must acknowledge my debt to the late ProfessorJoseph H. Camin, my formercolleague at the Universityof Kansas. Withouthis fancifulimaginationcoupled with a thorough understandingof systematics, there would not have been any Caminalcules. The computationswere carried out on a UNIVAC computer at the Stony Brook Computer Center and on the MODCOMP minicomputer facility at the Departmentof Ecology and Evolution.This researchwas supported by grantDEB 80-03508 fromthe National Science Foundation,whose continued support of my researchis much appreciated. VOL. 32 ogy. Pages 41-77 in Phylogenetic analysis and paleontology (J.Cracraftand N. Eldredge, eds.). Columbia Univ. Press, New York. KLUGE,A. G. 1976. Phylogenetic relationships in the lizard family Pygopodidae: An evaluation of theory,methods and data. Misc. Publ. Mus. Zool., Univ. Michigan, No. 152. KLUGE, A. G., AND J. S. FARRIS. 1969. Quantitative phyleticsand the evolution of anurans. Syst,Zool., 18:1-32. MAYR,E. 1965. Numerical phenetics and taxonomic theory.Syst. Zool., 14:73-97. MICHENER, C. D., AND R. R. SOKAL. 1957. A quantitativeapproach to a problem in classification.Evolution, 11:130-162. MICKEVICH,M. F. 1978a. Taxonomic congruence. Syst. Zool., 27:143-158. MICKEVICH,M. F. 1978b. Taxonomic congruence. Ph.D. Dissertation, State Univ. of New Yprk at Stony Brook,Stony Brook,New York. MICKEVICH,M. F. 1980. Taxonomic congruence: REFERENCES Rohlf and Sokal's misunderstanding.Syst. Zool., ARCHER, M. 1976. The dasyurid dentition and its 29:162-176. relationshipto thatof didelphids, thylacinids,bor- Moss, W. W. 1971. Taxonomic repeatability:An exhyaenids (Marsupicarnivora) and peramelids perimentalapproach. Syst. Zool., 20:309-330. (Peramelina:Marsupialia). Aust.J.Zool., Suppl. Ser. Moss, W. W., AND R. I. C. HANSELL. 1980. Subjective No. 39. classification:The effectsof characterweightingon BLACKITH, R. E., AND R. M. BLACKITH. 1968. A nudecision space. Bull. Classific.Soc., 4(4):2-6. merical taxonomyof orthopteroidinsects. Aust. J. NELSON, G., AND N. PLATNICK. 1981. Systematicsand Zool., 16:111-141. biogeography:Cladistics and vicariance. Columbia CAMIN, J.H., AND R. R. SOKAL. 1965. A method for Univ. Press, New York. deducing branchingsequences in phylogeny.Evo- ROHLF, F. J. 1982. Consensus indices forcomparing lution, 19:311-326. classifications.Math. Biosci., 59:131-144. COLLESS, D. H. 1980. Congruence between morpho- ROHLF, F. J.,D. H. COLLESS, AND G. HART. 1983a. metricand allozyme data for the Menidia species: Taxonomic congruence-A reanalysis.Pages 82-86 A reappraisal. Syst. Zool., 29:288-299. in Numerical taxonomy. Proceedings of a NATO COLLESS, D. H. 1982. [Review of]Phylogenetics:The Advanced Study Institute (J. Felsenstein, ed.). theory and practice of phylogenetic systematics. NATO Adv. Study Inst. Ser. G (Ecological SciSyst. Zool., 31:100-104. ences), No. 1. Springer-Verlag,New York. DAYHOFF, 0. (ed.). 1969. Atlas of protein sequence ROHLF, F. J.,D. H. COLLESS,AND G. HART. 1983b. and structure.Vol. 4. National Biomedical ReTaxonomiccongruencere-examined.Syst.Zool., 32: 144-158. search Foundation, Silver Spring, Maryland. ELDREDGE, N., AND J.CRACRAFT. 1980. Phylogenetic ROHLF, F. J.,J. KISHPAUGH,AND D. KIRK. 1980. patternsand the evolutionary process. Columbia NTSYS, numerical taxonomysystemof multivariate statisticalprograms.Tech. Rep. StateUniv. New Univ. Press, New York. FARRIS,J.S. 1969. A successive approximationsapYork, Stony Brook. proach to characterweighting. Syst.Zool., 18:374- ROHLF,F. J.,AND R. R. SOKAL. 1965. Coefficientsof 385. correlationand distance in numerical taxonomy. FARRIS,J.S. 1977. On the phenetic approach to verUniv. Kansas Sci. Bull., 45:3-27. tebrateclassification.Pages 823-850 in Major pat- ROHLF, F. J., AND R. R. SOKAL. 1967. Taxonomterns in vertebrateevolution (M. K. Hecht, P. C. ic structure from randomly and systematically Goody, and B. M. Hecht, eds.). Plenum, New York. scanned biological images. Syst. Zool., 16:246-260. FARRIS,J.S. 1979a. On the naturalness of phyloge- ROHLF,F. J.,AND R. R. SOKAL. 1980. Comments on netic classification. Syst. Zool., 28:200-214. taxonomiccongruence. Syst. Zool., 29:97-101. FARRIS,J. S. 1979b. The information content of the ROHLF,F. J.,AND R. R. SOKAL. 1981. Comparing nuphylogeneticsystem.Syst. Zool., 28:483-519. merical taxonomicstudies. Syst. Zool., 30:459-490. FEDER,J.H. 1979. Natural hybridizationand genetic SACKIN,M. J. 1972. "Good" and "bad" phenograms. divergence between the toads Bufo boreas and Bufo Syst. Zool., 21:225-226. punctatus.Evolution, 33:1089-1097. SCHUH,R. T., AND J. S. FARRIS. 1981. Methods for GINGERICH,P. D. 1974. Stratigraphicrecord of Early investigatingtaxonomic congruence and their apEocene Hyopsodus and the geometry of mammalian plication to the Leptopodomorpha. Syst. Zool., 30: phylogeny.Nature, 248:107-109. 331-351. GINGERICH, P. D. 1979. Stratophenetic approach to 1980. Analysis of SCHUH, R. T., AND J.T. POLHEMUS. taxonomiccongruence among morphological,ecophylogeny reconstruction in vertebrate paleontol- 1983 CAMINALCULES: DATA BASE logical, and biogeographic data sets for the Leptopodomorpha (Hemiptera). Syst. Zool., 29:1-26. SIMPSON, G. G. 1944. Tempo and mode in evolution. Columbia Univ. Press, New York. SIMPSON, G. G. 1949. The meaningof evolution.Yale Univ. Press, New Haven, Connecticut. SIMPSON, G. G. 1953. The major featuresof evolution. Columbia Univ. Press, New York. SNEATH, P. H. A. 1982. [Review of] Systematicsand biogeography:Cladisticsand vicariance.Syst.Zool., 31:208-217. SNEATH, P. H. A., AND R. R. SOKAL. 1973. Numerical taxonomy.W. H. Freeman,San Francisco. SOKAL, R. R. 1966. Numerical taxonomy.Sci. Am., 215:106-116. SOKAL, R. R. 1974. Classification:Purposes, principles, progress,prospects.Science, 185:1115-1123. SOKAL, R. R. 1983a. A phylogenetic analysis of the Caminalcules. II. Estimating the true cladogram. Syst. Zool., 32:185-201. SOKAL,R. R. 1983b. A phylogeneticanalysis of the 181 Caminalcules. III. Fossils and classification.Syst. Zool. (in press). SOKAL,R. R., AND F. J.ROHLF. 1966. Random scanning of taxonomiccharacters.Nature,210:461-462. SOKAL,R. R., ANDF. J.ROHLF. 1980. An experiment in taxonomicjudgment. Syst. Bot., 5:341-365. SOKAL,R. R., AND F. J.ROHLF. 1981a. Taxonomic congruence in the Leptopodomorpha re-examined. Syst. Zool., 30:309-325. SOKAL,R. R., AND F. J.ROHLF. 1981b. Biometry.2nd ed. W. H. Freeman,San Francisco. STANLEY, S. M. 1979. Macroevolution, patternand process. W. H. Freeman,San Francisco. THROCKMORTON, L. H. 1968. Concordance and discordance of taxonomic characters in Drosophila classification.Syst. Zool., 17:355-387. WILEY,E. 0. 1981. Phylogenetics.JohnWiley,New York. Received14 January1983; accepted11 April1983. APPENDIX ListofCharacters for77 Fossiland RecentCaminalculesa,b Head and neck 1. Head junction complex (folded) (1) or not (0). [73, 39] 2. If complex, degree of folding complete (1) or partial (0). [73, 10] 3. If partial,secondary junction narrow (0) or broad (1). [10, 21] 4. P (1) or A (0) of horn. [75, 73] 5. If horn present,pointed (1) or flattened(0). [75, 64] 6. Length of head in mm fromrear end of folded section to front,excluding horn and anteriorprojections. If head not complex, measure fromcollar. Recoded as integer in the following manner: (0) 9-10.9; (1) 10.9-12.8; (2) 12.8-14.7; (3) 20.4-22.3; (4) 24.2-26.1; (5) 26.1-28. [73, 65, 60, 28, 25, 39] 7. Anteriorend of head concave (0), flat(1) or convex (2). [71, 73, 62] 8. If convex, rounded (0) or sharply pointed (1). [62, 15] 9. P (1) or A (0) of anteriorprojections.[71, 73] 10. P (1) or A (0) of eyes. [73, 59] 11. If present, states of fusion of eyes: (0) two separate eyes; (1) grown togetherbut two eyes still discernible; (2) grown togetherinto oblong approximatelythe size of two eyes; (3) grown together into circle approximatelythe size of one eye. [73, 47, 1, 16] 12. P (1) or A (0) of eye stalks. [37, 73] 13. If stalked,length of stalk (excluding eye) in mm. Recoded as integerin the following manner: (0) 3-4.5; (1) 4.5-5.9; (2) 6-7.5; (3) 10.5-12; (4) 13.5-15; (5) 16.5-18. [37, 38, 70, 48, 72, 19] I These characterswere originallydefined by A. J.Boyce and I. Huber and were revised by B. Thomson and R. R. Sokal, and conventions: P-presence; A-absence. NumbAbbreviations bers in parenthesesare characterstatecodes. Numbers in square brackets following definitionof each character are the OTU numbers of representativeOTUs that exhibit the characterstates in the orderdescribedin thelist.Thus forcharacter1 [73, 39] means thatOTU 73 is an example of a complex (folded) head junctionand OTU 39 is an example of a not complex head junction. Whenever an "If" statementis denied, as in character2 for OTUs whose state for character 1 is 0, an NC is recorded forthe OTU for that character.Since all measurementcharacterswere recorded to the nearestmm, there is no ambiguitycreated by shared, more refined class limits of adjacent classes as listed for some characters(e.g., for character6, a measurementof 10 mm was coded 0, while one of 11 mm was coded 1). The character numbers 1-85 employed in this list correspond to those employed in earlier studies and originallydefined by Boyce and Huber. Characters numbered 86-106 were needed to describe additional observed differencesin fossil species. In order to preserve both the old characternumbering systemand at the same time the logical order of charactersin the list, it became necessary to list some char- acters out of numerical order. The following paragraph gives the location of all charactersthatare not in strictnumerical order in the list. Characters26-28 in this list follow upon character22; 58 follows 13; 59 follows 56; 60 and 61 follow 48; 62 follows 47; 78 follows 80; 84 follows 90 (23); 85 follows 63; 86-90 follow 23; 91 and 92 follow 34; 93-95 follow 35; 96 follows 45; 97 follows 62 (47); 98 follows 61 (48); 99 follows 55; 100 and 101 follow 85 (63); 102 follows 67; 103 follows 69; and 104-106 follow 75. The following lists of characterscomprise the several body regions: head and neck, 1-17, 58 (total of 18); anteriorappendages, 18-30, 84, 86-90 (total of 19); posterior appendages, 31-38, 91-95 (total of 13); collar and abdomen, 39-57, 59-63, 85, 96-101 (total of 31); abdominal pores, 64-83, 102-106 (total of 25). The overall total is 106 characters. The images shown in Figure 1 (OTUs 1-29) were traced from the originals, whereas those in Figure 2 were traced fromxeroxes of the originals. In the process and in the reductionnecessaryforillustration here some detail visible to the data coders was necessarily lost. The states of at least one character(17) are quite indetectable in the published figures.Lengths in millimetersare given in terms of the dimensions of the original images. As reduced in Figures 1 and 2, these measurementsmust be multiplied by 0.387. 182 SYSTEMATIC ZOOLOGY VOL. 32 58. If stalked, stalks fused (1) or not (0). [20, 37] 14. If stalks not fused, tips divergent(0), parallel (1) or convergent(2). [37, 19, 31] 15. Top of head depressed (0), flat(1) or crested (2). [73, 41, 53] 16. If crested,single (0) or lobate (1). [53, 2] 17. P (1) of A (0) of groove in neck. [12, 73] Anteriorappendages 18. P (1) or A (0) of anteriorappendages. [73, 15] 19. If appendages present,length in mm fromtangent to posteriorend of abdomen (excluding posterior appendages) up to the anterior end of the longer appendage. Recoded as integer in the following manner: (0) 40-46.3; (1) 46.3-52.6; (2) 52.6-58.9; (3) 58.9-65.2; (4) 65.2-71.5; (5) 71.5-77.8; (6) 77.8-84.1; (7) 84.1-90.4; (8) 96.7-103. [73, 74, 45, 43, 66, 24, 17, 19, 20] 20. If appendages present,P (1) or A (0) of flexion(elbow) in appendages. [41, 73] 21. If appendages present,end of appendage divided (1) or not (0). [56, 73] 22. If not divided, end of appendage is tendril(0), sharplypointed (1), tapered or rounded (2) or knobbed (3). [21, 68, 73, 45] 26. If not divided, P (1) or A (0) of flange.[38, 73] 27. If flange present, length expressed as sum of right and left sides in mm. Recoded as integerin the following manner: (0) 17-20; (1) 23-26; (2) 32-34.9; (3) 35-37.9; (4) 38-41; (5) 44-47. [38, 70, 48, 50, 20, 29] 28. If flange present, width expressed as sum of right and left sides in mm. Recoded as integer in the following manner: (0) 5-6.9; (1) 10.7-12.6; (2) 14.5-16.4; (3) 16.418.3; (4) 18.3-20.2; (5) 20.2-22.1; (6) 22.1-24. [38, 70, 48, 31, 29, 72, 26] 23. If divided, two divisions (0) or three (1). [22, 56] 86. If two divisions, ending as fingers(0), clamp (pincers without joint) (1), pincers with joint (2) or two wavy spikes (3). [22, 49, 30, 67] 87. If ending as clamp, closed (0) or open (1). [49, 36] 88. If open, clamp is small (0) or large (1). [36, 55] 89. If ending as pincers with joint, serratededges P (1) or A (0). [57, 30] 90. If three divisions, one "finger" much longer than the other two (2), one fingerslightly longer than the other two (1) or all fingersthe same size (0). [65, 76, 56] 84. If divided, bulbs on ends of divisions present on all divisions (2), present on some divisions (1) or absent (0). [18, 66, 56] 24. If bulbs present on all divisions, P (1) or A (0) of claws. [12, 46] 25. If bulbs present on all divisions, P (1) or A (0) of pads. [75, 46] 29. If appendages present,P (1) or A (0) of pigment. [9, 73] 30. If pigment present,distributedin small dots (-1), small circles (0), large circles (1) or very broad areas (2). [33, 9, 11, 21] Posteriorappendages 31. Appendage single (0), partially double (1) or completely double (2). (Completely double means that appendage originates from 2 stalks ratherthan 1; it does not necessarily mean that the 2 halves are separated.) [73, 43, 18] 32. If single, disclike (pedestal like) (0), platelike (1) or propellerlike (2). [73, 40, 36] 33. If disclike, width of disc in mm. Recoded as integer in the following manner: (0) 5-5.7; (1) 5.7-6.4; (2) 6.4-7.1; (3) 7.8-8.5; (4) 8.5-9.2; (5) 9.9-10.6; (6) 10.6-11.3; (7) 11.3-12. [13, 58, 39, 38, 37, 72, 42, 77] 34. If disclike, stalk long (>3 mm) (2), short(<3 mm) (1) or absent (0). [74, 73, 28] 91. If disclike, round (0) or square (1). [73, 74] 92. If square, P (1) or A (0) of division lines. [49, 74] 35. If platelike, no indentation (-1), weakly emarginate(curved indentation) (0), cleft(V-indentation) (1), or deeply cleft(V-indentationcontinued into division line) (2). [40, 33, 11, 6] 93. If propellerlike,blades rounded (0) or pointed (1). [36, 67] 94. If propellerlike,stalk short (<5 mm) (0) or long (>5 mm) (1). [36, 55] 95. If stalk long, straight(0) or twisted(1). [55, 30] 36. If completelydouble, ends pointed (0), clubbed (1) or platelike (2). [3, 75, 18] 37. If platelike, divided (1) or not (0). [66, 18] 38. If divided, two divisions (0) or three (1). [66, 53] Collarand abdomen 39. Rim of abdomen plain (0), narrowlyraised (1) or broadly raised (2). [64, 45, 73] 40. Anteriormargin of abdomen well delineated (0) or imperfectlydelineated (1). [73, 31] 41. P (1) or A (0) of abdominal ridge (raised area on central posteriorabdomen). [73, 45] 1983 CAMINALCULES: DATA BASE 183 42. P (1) or A (0) of large areas of color on anteriorabdomen. [56, 73] 43. If present, whole (2), bisected (1), quadrisected and square (0) or quadrisected and irregularly shaped (-1). [66, 46, 56, 65] 44. P (1) or A (0) of spots on anteriorabdomen. [44, 73] 45. If present,small (0), large (1) or large and cross-shaped (2). [44, 27, 35] 96. If small, 2 (0) or 4 (1) spots. [17, 44] 46. P (1) or A (0) of postabdominal spots. [44, 73] 47. If postabdominal spots present,large (open circles) (0), medium (> 1.5 and ?2.0 mm avg.) (1), small (>1.0 and <1.5 mm avg.) (2) or dots (<1.0 mm) (3). [56, 47, 44, 35] 62. If large, anteriorlateral spots medially fused (0), posteriorlyfused (1) or free (2). [66, 53, 56] 97. If postabdominal spots present,P (1) or A (0) of a group of fourfree posteriorspots. [44, 56] 48. If group present,posteriorgroup of fouris stronglyconcave (0), weakly concave (1), or convex (2). [24, 44, 69] 60. If postabdominal spots present,number of free (not fused) spots on postabdomen: zero (-1), two (0), four (1), six (2), eight (3), ten (4), twelve (5) or fourteen(6). [77, 3, 66, 53, 56, 44, 63, 76] 61. If six or eight large(char. 45), freespots, P (1) or A (0) of gap between second and thirdrows of spots. [64, 18] 98. If postabdominal spots present, and anterior abdominal spots (char. 44) also present, spots connected into racquet design (either by lines or by fusion) (1) or not (0). [77, 44] 49. Posteriorend rounded (0) or angular (1). [73, 21] 50. P (1) or A (0) of dorsal fin.[50, 73] 51. If present,length of attachmentin mm. Recoded as integerin the following manner: (0) 7-7.7; (1) 7.7-8.4; (2) 8.4-9.1; (3) 9.8-10.5; (4) 13.3-14. [48, 50, 31, 20, 29] 52. If present,posteriorapex projectingbeyond rear of attachment(2), not projecting(1) or projecting anteriorto rear of attachment(0). [72, 50, 20] 53. Posteriorabdomen swollen (1) or not (0). [35, 73] 54. Posteriorbars (apparentlyfused spots) absent (0), single (1) or double (2). [73, 2, 43] 55. If double, shortestdistance between bars in mm. Recoded as integerin the following manner: (0) 1-1.8; (1) 1.8-2.6; (2) 6.6-7.4; (3) 8.2-9. [43, 18, 75, 64] 99. If double, elongate (0) or fatand roughly triangular(1). [43, 56] 56. If elongate, straight(0), curved (1), weakly J-shaped(posteriorends thickened)(2), or strongly J-shaped(3). [64, 43, 3, 12] 59. If double, P (1) or A (0) of large cross-shapedspot in postabdomen median to anteriortips of bars. [3, 56] 57. If single, semicircular(0) or U-shaped (1). [5, 2] 63. Greatestwidth of abdomen in mm. Recoded as integer in the following manner: (0) 15.95-17.95; (1) 19.95-21.95; (2) 21.95-23.95; (3) 23.95-25.95; (4) 27.95-29.95; (5) 29.95-31.95; (6) 33.95-35.95; (7) 36. [20, 48, 54, 73, 35, 63, 24, 47] 85. Length of abdomen in mm frombase to tip of collar. Recoded as integer in the following manner: (0) 38-42.9; (1) 42.9-47.8; (2) 47.8-52.7; (3) 52.7-57.6; (4) 57.6-62.5; (5) 72.3-77.2; (6) 82.1-87. [73, 47, 64, 75, 50, 19, 20] 100. P (1) or A (0) of lateral flippers.[42, 73] 101. If present,small (0), medium (1) or large (2). [42, 51, 61] Abdominalpores 64. P (1) or A (0) of large pore in column 10 (most anteriorposition). [45, 73] 65. Group II (columns 4 and 5) has zero pores (-1), one pore (0) or two pores (1). [61, 69, 73] 66. If two pores, fused (1) or not (0). [5, 73] 67. If fused, fusion complete (0) or incomplete (1). [48, 5] 102. If not fused,number of pores looking like slits: zero (0), one (1) or two (2). [73, 68, 51] 68. Group III (columns 6-10) has one (0), two (1), three (2) or five (3) pores. [64, 18, 43, 73] 69. If two or three pores, one pore looks like a slit (1) or not (0). [43, 76] 103. If five pores, the two pores closest to group II look like slits (1) or not (0). [56, 73] 70. If five pores, P (1) or A (0) of fusion. [50, 73] 71. If fusion present,complete (0) or incomplete (1). [29, 50] 72. If complete, one (0) or two (1) groups of fused pores. [72, 29] 73. If incomplete,P (1) or A (0) of a group of three fused pores. [26, 50] 74. Group I (columns 1-3) has zero (0), one (1), two (2) or three (3) pores. [47, 24, 35, 73] 75. If one or two pores, one pore looks like a slit (1) or not (0). [63, 35] 104. If two or three pores, pore nearest posteriorenlarged (1) or not (0). [42, 73] 105. If three pores present and posteriorone enlarged, eniarged pore apparentlysee-through(1) or not (0). [30, 42] 106. If see-through,fused with posteriorpore fromother side's Group I (1) or not (0). [57, 30] 184 SYSTEMATIC ZOOLOGY VOL. 32 76. If three pores, separated as three open pores of any size (0) or in some way modified(1). [73, 46] 77. If modified,modifiedby fusion (1) or some other way (0). [64, 46] 79. If fusion,two (0) or three (1) pores fused. [50, 64] 80. If fusion,broad and complete (0), broad and incomplete (1), or slit-like(2). [64, 66, 71] 78. If broad and complete and three pores fused (char. 79), both lateral pores reduced and joining median one (1) or not (0). [22, 64] 81. If modifiedin some other way, P (1) or A (0) of line (or slit) connecting pores. [46, 65] 82. If present,two (0) or three (1) interconnectedpores. [53, 46] 83. If absent, slits in column 1 (0), column 2 (1), columns 1 and 2 (2), or column 3 (3). [60, 65, 8, 51]