Electronic Supplementary Material Details of sample collection, lab, and analytical methods
Sample collection. Stool samples were collected opportunistically from the forest floor in Ndundulu in 2005-2007 while following groups of kipunji for ecological study (Jones, unpubl.). The locations of freshly excreted samples (Supplementary Table 1) were recorded using handheld GPS units (Garmin 60Cx, Garmin, UK) before each individual scat was placed in RNAlater stabilization solution (Ambion, Inc., Austin, TX). Based on the collection locations, dates, and conditions, we are confident that the sequenced samples come from six different animals. All samples were exported under permits from the Wildlife Division of Tanzania and the Convention on International Trade in Endangered Species.
Tissue samples came from two specimens. The first, a male subadult kipunji, was found dead by Claire Bracebridge on 15 July 2008 in Livingstone Forest (Rungwe District, Mbeya Region, Tanzania; 9.20483° S, 33.89046° E, WGS84, 1872 m), approximately 2.5 km east of Kiguru village. This specimen was prepared as a museum study skin, skull, and fluid-preserved carcass and is deposited at the Wildlife Conservation Society (WCS) Southern Highlands Conservation Programme office, Mbeya, Tanzania (SHCP 2458). Muscle and kidney samples were preserved in EDTA buffer solution. The second specimen was a living but injured kipunji held by residents of Syukula village on Mt. Rungwe (9.166510° S, 33.63304° E, 1770 m) and discovered by Noah Mpunga on 1 May 2007. The tail tip of this animal had been cut, and Mpunga was able to remove a small amount of tissue from the wound before releasing the animal otherwise unharmed (tissue sampling was conducted in accordance with the guidelines of the American Society of Mammalogists' Animal Care and Use Committee [Gannon et al. 2007]). This sample was delivered to and exported by WTS and is logged in his field catalogue (archived at the Field Museum, Chicago, IL, USA) as WTS 9308.
Lab methods. We extracted DNA from stool samples using the Qiagen DNA Stool Mini Kit (QIAGEN, Hilden, Germany) protocol for liquid samples, and from tissue samples using the PureGene Animal Tissue kit (Gentra Systems Inc., Minneapolis, MN, USA). Extractions from stool samples were done in a separate lab in a PCR-free building. We PCR-amplified DNA fragments in 10-25 µL reactions with Promega GoTaq (Promega Corp., Madison, WI) following the manufacturer’s recommended PCR protocol. PCR amplification of DNA from stool samples also included bovine serum albumin (BSA; New England Biolabs, Ipswich, MA) at a concentration of 0.1 mg/mL. We used the following primers: for 12S rRNA, L1091 and H1478 (Kocher et al. 1989); for COI, OWMCO-If and OWMCO-Ir (Lorenz et al. 2005); for COII, CO2F2 (Davenport et al. 2006) and BCO2R1 (Switzer et al. 2005); and for ND4/5, A896LF, A896HR, B896LF, B896HR, or H12652 (Newman et al. 2004). We purified 20 µL of the amplicons either enzymatically, using 0.25 µL exonuclease I, 0.50 µL shrimp alkaline phosphatase, and 2.0 µL 10x buffer (USB Corp., Cleveland, OH, USA) at 37°C for 15 minutes followed by 80°C for 15 minutes, or by vacuum filtration through a Millipore HTS plate (Millipore Corp., Billerica, MA, USA) following the manufacturer’s instructions. Amplicons were cycle sequenced in both directions using the amplification primers and ABI BigDye 3.1 dye-terminator chemistry (Applied Biosystems, Foster City, CA), purified by centrifugal filtration through Sephadex G-50 fine (Amersham Biosciences, Uppsala, Sweden) in a multiscreen filter plate (Millipore Corp., Billerica, MA, USA), and sequenced on an ABI 3130 or ABI 3130xl automated sequencer. Sequences were visualized, edited, and assembled in Sequencher 4.8 (GeneCodes, Ann Arbor, MI). Not all fragments were sequenced for all stool samples (Table 1). Sequences have been deposited in GenBank under accession numbers GU068059–GU068086.
Taxon sampling. We downloaded representative sequences for non-papionin outgroups and all available papionin sequences for the sequenced fragments, excluding those thought by their authors to be nuclear copies (numts). These fragments allow us to maximize our comparison to Papio sequences previously published by other investigators, especially Zinner et al. (2009), Burrell et al. (2009), Wildman et al. (2004), Newman et al. (2004), Switzer et al. (2005), van der Kuyl et al. (1995), and Lorenz et al. (2005). As a result, the data sets for different fragments contain substantially different sampling within Papio. All mitochondrial genes are linked and therefore share a phylogenetic history, so concatenation is justified and should provide more power than analyses of single fragments. However, individual analyses can help to identify unexpected conflict, which could be a sign of contamination or of the amplification of numts. Because a different sample of Papio sequences is available for each fragment, individual analyses also allowed us to make the greatest use of comparative sequence data. For outgroups and cercopithecines outside Papio, we concatenated sequence fragments from a single individual (for example, from complete mitochondrial genomes) wherever possible, but from different individuals where necessary to maintain adequate taxon sampling. Sampling of individuals within Papio is critical in this study, and concatenating fragments from different individuals at this level could influence results, so we excluded from the combined analysis any Papio individual for which COI, COII, and 12S were not all available. Only one Papio sequence includes all four fragments, so we excluded the ND4/ND5 fragment from the combined analysis. Because all Rungwecebus stool samples were identical for all sequenced genes, we included this Ndundulu haplotype only once in the analyses.
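The inclusion rule for the combined analysis (keep an individual only if 12S, COI, and COII are all available, then concatenate) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name, the record layout, and the toy sequences are all hypothetical and chosen only to show the filtering logic.

```python
from typing import Dict

# Fragments required for the combined analysis; ND4/ND5 is left out, as in the text.
REQUIRED = ("12S", "COI", "COII")

def concatenate_complete(records: Dict[str, Dict[str, str]],
                         required=REQUIRED) -> Dict[str, str]:
    """Concatenate fragments for individuals that have every required fragment.

    records maps an individual's label to {fragment name: aligned sequence}.
    Individuals missing any required fragment are skipped rather than
    patched with sequence from another individual.
    """
    combined = {}
    for individual, fragments in records.items():
        if all(fragments.get(name) for name in required):
            combined[individual] = "".join(fragments[name] for name in required)
    return combined

# Made-up toy sequences, purely for illustration.
records = {
    "papio_A": {"12S": "ACGT", "COI": "GGCC", "COII": "TTAA"},
    "papio_B": {"12S": "ACGA", "COI": "GGCT"},  # lacks COII, so it is dropped
    "papio_C": {"12S": "ACGT", "COI": "GGCC", "COII": "TTAG", "ND4/5": "CCCC"},
}
combined = concatenate_complete(records)
print(sorted(combined))
```

In this toy input, the individual without COII is excluded, and the ND4/5 fragment of the third individual is ignored because it is not in the required set.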
Mitochondrial sequence verification. The amplification of nuclear copies of mitochondrial genes (numts) is often a concern with mitochondrial data. We used two analytical approaches to screen for possible numt contamination. First, we checked every individual fragment to make sure it conformed to the expected characteristics of a true mitochondrial sequence, including base composition, absence of frameshift mutations and stop codons (for coding genes), and absence of mutations in conserved, pairing stem regions (for 12S rRNA and the three tRNAs). Second, we analyzed the four individual fragments separately. It is unlikely that all four of our sequence fragments are parts of a single numt insertion, as most numts are shorter than the span of the mitochondrial genome that such an insertion would have to cover. It is also unlikely that four independent numt insertions would result in the same gene tree topology for the separate fragments. Our sequences have the characteristics of typical mammalian mtDNA, and the four individual data sets yield consistent phylogenetic results for the stool samples; together, these factors make it extremely unlikely that our results are due to accidental numt amplification.
Alignment. We manually aligned all sequences. For COI, COII, ND4, tRNA-His, and tRNA-Ser (AGY), alignment is unambiguous as there is no length variation. In ND5, there is a 3-bp deletion in three P. ursinus sequences (AY212057-AY212059), which we parsimoniously inferred as a single codon deletion. We aligned 12S and tRNA sequences manually to detailed secondary structure models (12S model based on Springer & Douzery 1996; tRNA structure information from Mamit, http://mamit-trna.u-strasbg.fr/). Aligning to a structural model can provide a more accurate and biologically realistic alignment and analysis by explicitly including the stem-and-loop structure of ribosomal sequences (Kjer et al. 2009). We excluded from the 12S analysis 56 alignment positions for which we considered the alignment ambiguous.
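The coding-gene part of the numt screen described under mitochondrial sequence verification (no premature stop codons, no frameshifting indels) can be sketched in Python as follows. This is an illustrative sketch, not the authors' procedure: the function name and the `frame` parameter are hypothetical, the fragment is assumed to lie inside a gene so that any stop codon is premature, and stop codons are taken from the vertebrate mitochondrial genetic code (NCBI translation table 2), in which TGA encodes tryptophan.

```python
import re

# Stop codons under the vertebrate mitochondrial code (NCBI translation table 2).
MITO_STOPS = {"TAA", "TAG", "AGA", "AGG"}

def screen_coding_fragment(seq: str, frame: int = 0) -> dict:
    """Flag features that would be unexpected in a genuine mtDNA coding fragment.

    seq   : aligned nucleotide sequence; '-' marks alignment gaps
    frame : offset (0, 1, or 2) of the first complete codon in the ungapped
            sequence; supplied by the caller (an assumption of this sketch)
    """
    ungapped = seq.upper().replace("-", "")
    coding = ungapped[frame:]
    usable = len(coding) - len(coding) % 3  # ignore a trailing partial codon
    codons = [coding[i:i + 3] for i in range(0, usable, 3)]
    # 0-based positions (ungapped coordinates) of in-frame stop codons.
    stops = [frame + 3 * i for i, codon in enumerate(codons) if codon in MITO_STOPS]
    # A gap run whose length is not a multiple of three implies a frameshift.
    frameshift = any(len(run) % 3 for run in re.findall(r"-+", seq))
    return {"premature_stops": stops, "frameshift_gap": frameshift}

clean = screen_coding_fragment("ATGTGACTA")    # TGA is Trp in this code
stopped = screen_coding_fragment("ATGAGACTA")  # AGA is a stop in this code
codon_gap = screen_coding_fragment("ATG---CTA")
shifted = screen_coding_fragment("ATG-CTAA")
print(clean, stopped, codon_gap, shifted)
```

A 3-bp gap, like the single-codon deletion inferred in ND5, passes the frameshift check, whereas a 1-bp gap is flagged.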
Our alignments, including details of excluded sites and secondary structure, are available from Dryad (http://www.datadryad.org).
Phylogenetics. We used PAUP* 4.0b10 (Swofford 2002) and MrBayes 3.1 (Ronquist & Huelsenbeck 2003) for phylogenetic analyses. We selected substitution models for likelihood bootstrapping and for individual-gene Bayesian analyses with the Akaike Information Criterion (AIC) by scoring possible models on a maximum parsimony tree. For COI and COII we applied an HKY model with gamma rate variation (G) and invariant sites (I). For 12S, we applied the doublet model (Ronquist & Huelsenbeck 2003) to paired nucleotides in the stem regions, with separate HKY parameters for the paired and unpaired partitions, a single gamma shape parameter, and relative rates varying between the two partitions. For ND4/5 we used a single GTR+I+G model. For the combined data, we used a partitioned model split into 12S pairing, 12S nonpairing, and 1st, 2nd, and 3rd codon positions. The 12S pairing partition was assigned a doublet model. All partitions were assigned an HKY model, with the rate ratio linked for the two 12S partitions and for the three coding-gene partitions. Base frequencies were estimated separately for the five partitions, and a single gamma shape parameter was estimated across all partitions. Among-partition rate variation was modeled by using a variable (Dirichlet) prior on relative rates.
In all MrBayes analyses, we used two separate runs from random starting parameters, each with four chains (1 cold, 3 heated; heating parameter 0.2). We ran each analysis for 20 million generations, sampling every 1000 generations, and excluded the first 1001 samples (1 million generations) as burnin. We used our own scripts and the package coda (Plummer et al. 2009) in R 2.8.1 (R Development Core Team 2008) to assess convergence between runs and the behavior of Markov chains within runs, including effective sample sizes, autocorrelation, and parameter variances.
In addition to posterior probability for the combined data, we estimated bootstrap support with 1000 parsimony bootstrap replicates and 500 likelihood bootstrap replicates. For likelihood bootstrapping, we used a Tamura-Nei + I + G model. We tested whether an unconstrained topology was significantly better than one with a single monophyletic Rungwecebus clade using a Shimodaira-Hasegawa test in PAUP* 4.0b10. Under a GTR+I+G model, the best unconstrained topology (log-likelihood -9972.00572) was significantly better than the best constrained topology (log-likelihood -10012.20392; RELL bootstrap P = 0.003). In a Bayes factor test, the log-Bayes factor for the unconstrained topology compared to the constrained topology, based on 15,000,000 post-burnin generations in MrBayes with identical likelihood models, was 37.77, overwhelmingly supporting the unconstrained topology.
Supplementary references:
Gannon, W.L., Sikes, R.S., & the Animal Care and Use Committee of the American Society of Mammalogists. 2007 Guidelines of the American Society of Mammalogists for the use of wild mammals in research. J. Mamm. 88, 809-823.
Kjer, K.M., Roshan, U., & Gillespie, J.G. 2009 Structural and evolutionary considerations for multiple sequence alignment of RNA, and the challenges for algorithms that ignore them. In Sequence alignment: Methods, models, concepts, and strategies (ed. Rosenberg, M.S.), pp. 105-150. Berkeley: University of California Press.
Kocher, T.D., Thomas, W.K., Meyer, A., Edwards, S.V., Pääbo, S., Villablanca, F.X., & Wilson, A.C. 1989 Dynamics of mitochondrial DNA evolution in animals: amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86, 6196-6200.
Lorenz, J.G., Jackson, W.E., Beck, J.C., & Hanner, R. 2005 The problems and promise of DNA barcodes for species diagnosis of primate biomaterials. Phil. Trans. R. Soc. B 360, 1869-1878.
Newman, T.K., Jolly, C.J., & Rogers, J. 2004 Mitochondrial phylogeny and systematics of baboons (Papio). Am. J. Phys. Anthro. 124, 17-27.
Plummer, M., Best, N., Cowles, K., & Vines, K. 2009 coda: output analysis and diagnostics for MCMC. R package version 0.13-4.
R Development Core Team 2008 R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org.
Ronquist, F. & Huelsenbeck, J.P. 2003 MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572-1574.
Springer, M.S. & Douzery, E. 1996 Secondary structure and patterns of evolution among mammalian mitochondrial 12S rRNA molecules. J. Mol. Evol. 43, 357-373.
Switzer, W.M., Salemi, M., Shanmugam, V., Gao, F., Cong, M., Kuiken, C., Bhullar, V., Beer, B.E., Vallet, D., Gautier-Hion, A., Tooze, Z., Villinger, F., Holmes, E.C., & Heneine, W. 2005 Ancient co-speciation of simian foamy viruses and primates. Nature 434, 376-380.
Swofford, D.L. 2002 PAUP*: Phylogenetic analysis using parsimony (and other methods), version 4.0b10. Sunderland, MA: Sinauer and Associates.
van der Kuyl, A.C., Kuiken, C.L., Dekker, J.T., & Goudsmit, J. 1995 Phylogeny of African monkeys based upon mitochondrial 12S rRNA sequences. J. Mol. Evol. 40, 173-180.
Wildman, D.E., Bergmann, T.J., al-Aghbari, A., Sterner, K.N., Newman, T.K., Phillips-Conroy, J.E., Jolly, C.J., & Disotell, T.R. 2004 Mitochondrial evidence for the origin of hamadryas baboons. Mol. Phylogenet. Evol. 32, 287-296.
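Note on the MCMC sample counts and likelihood difference reported under Phylogenetics: the short Python sketch below works through the arithmetic. It assumes that each run stores its starting state plus one sample every 1000 generations, which is what makes "1001 samples" correspond to 1 million generations; the variable names are illustrative only.

```python
# MCMC settings as stated in the text.
generations = 20_000_000
sample_every = 1_000
burnin_samples = 1_001

# Assumption: the starting state is stored as a sample, so one extra sample.
samples_per_run = generations // sample_every + 1
retained_per_run = samples_per_run - burnin_samples
last_burnin_generation = (burnin_samples - 1) * sample_every

# Log-likelihoods of the best unconstrained and constrained topologies (GTR+I+G).
lnl_unconstrained = -9972.00572
lnl_constrained = -10012.20392
delta_lnl = lnl_unconstrained - lnl_constrained

print(samples_per_run, retained_per_run, last_burnin_generation)
print(round(delta_lnl, 5))
```

Under that assumption each run keeps 19,000 post-burnin samples, and the unconstrained topology is favored by about 40.2 log-likelihood units, the difference evaluated by the Shimodaira-Hasegawa test.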