THE EFFECT OF TRANSMISSION MODE ON GENETIC DIVERSITY

The Pennsylvania State University
The Graduate School
Biology Department
THE EFFECT OF TRANSMISSION MODE ON GENETIC DIVERSITY IN ZUCCHINI
YELLOW MOSAIC VIRUS
A Dissertation in
Biology
by
Heather Simmons
© 2011 Heather Simmons
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Doctor of Philosophy
December 2011
The Dissertation of Heather Simmons was reviewed and approved* by the following:
Andrew G. Stephenson
Distinguished Professor of Biology and Assistant Department Head for Research
Dissertation Co-Advisor
Edward C. Holmes
Professor of Biology and Eberly College of Science Distinguished Senior Scholar
Dissertation Co-Advisor
Andrew Read
Professor of Biology and Entomology
Eberly College of Science Distinguished Senior Scholar
Chair of Committee
Fred Gildow
Professor of Plant Pathology and Head of Plant Pathology Department
Michael Axtell
Associate Professor of Biology
Douglas Cavener
Professor and Head of Biology Department
*Signatures are on file in the Graduate School
ABSTRACT
This dissertation consists of six chapters: an introduction, four data chapters and a
conclusion. In the introduction I provide general background and information on the study
system, zucchini yellow mosaic virus (ZYMV), and one of its host species, Cucurbita pepo ssp.
texana (a wild gourd). This section also provides background on the methods that I have
used, namely Bayesian coalescent and tree-building methods.
The first study (chapter two) was motivated by the fact that plant RNA viruses were
considered more genetically stable than animal RNA viruses. Animal RNA viruses are assumed
to achieve extremely high levels of genetic diversity as a result of their high mutation rates, rapid
replication rates and large population sizes. However, it was believed that the same did not hold
true for plant RNA viruses due to a combination of lower mutation rates, weaker immune
selection, and genetic bottlenecks during systemic movement through the plant
and during horizontal transmission by aphids. To test this assumption, we used a Bayesian
coalescent approach to estimate the mean rate of nucleotide substitution for the coat protein (CP)
of Pennsylvanian ZYMV samples at 5.0 × 10⁻⁴ subs/site/year (4 × 10⁻⁴ to 8 × 10⁻⁴), which is within the
range of those found for animal RNA viruses. As scant data were available on the timescale of the
evolution of this virus within the Cucurbitaceae (squash, melon, cucumber), using the same
approach we found the time to the most recent common ancestor for the lineages of ZYMV we
sampled to be approximately 400 years (HPD: 119-771 years) with a possible origin in Asia. In
addition, we found evidence in support of purifying selection (dN/dS = 0.108). We also
undertook an analysis of phylogeographical structure and found in situ evolution of ZYMV
within individual countries, suggesting intermittent movement of ZYMV across geographic
boundaries.
Since we had established that the substitution rate estimate was in accord with those
previously observed in animal RNA viruses, we sought to determine if plant RNA viruses exhibit
quantifiable intra-host genetic diversity in the second study (chapter three). Most plant viral
genetic diversity studies had focused on genetic diversity at the inter-host level; however, there
was no consistency in the results of those studies that had considered intra-host genetic diversity.
In addition, it was believed that population bottlenecks associated with both aphid-vectored
transmission, as well as with systemic movement through the plant, drastically reduced the
effective population size. Although there had been some in vitro work on the effect of population
bottlenecks on viral genetic diversity in plant viruses, little work had been conducted in natural
systems. Therefore, to assess intra-host genetic diversity, as well as the effect of the
aphid-induced population bottleneck on viral genetic diversity, we generated intra-host sequence data
for the CP gene of ZYMV from two horizontally transmitted populations: one aphid-vectored and
the other mechanically inoculated (to avoid aphid-related bottlenecks). We also sampled multiple
time points from individual plants to assess intra-host viral genetic diversity. We determined that
despite the relatively frequent generation of mutations, most of these occurred only transiently, as
they were deleterious and tended to be purged rapidly from the population. There appeared to be
more population structure in the aphid vectored clones as indicated by multiple clones bearing the
same mutations, the presence of a distinct sub-lineage, as well as several clones being more than
one mutational step away from the consensus sequence. We also observed possible evidence of
complementation occurring in trans. Unlike most comparable studies, we quantified the error rate
associated with the RT-PCR procedure used in this study. In doing so, we determined that it was
high enough to account for a portion of the mutations detected, indicating that future intra-host studies of
this nature should quantify the extent to which detected mutations are artificially induced.
Although the CP is the most frequently studied protein of ZYMV, it is a multifunctional
protein that is not the sole protein involved in aphid transmission. Therefore, we decided to
undertake full genome sequencing of these samples in the third study (chapter four). We had
sequenced a limited number of clones with conventional cloning and Sanger sequencing (we
averaged 35 clones per sample), but it was extremely difficult to detect minor variants with this
method. Thus, we sought to increase coverage by uncovering mutations present in the population
at low frequencies using deep sequencing. We used the same aphid vectored and mechanically
inoculated samples from the previous study with a few modifications: we increased the number of
time points in the field samples, and increased the number of serial passages in a mechanically
transmitted experiment. We found that mutations persist during inter-host transmission events in
both the aphid-vectored and mechanically inoculated populations, suggesting that the
vector-imposed bottleneck is not as extreme as previously supposed. Likewise, we found that mutations
persist intra-host over time, indicating that systemic bottlenecks may not constrain viral genetic
variation as severely as previously suggested. In addition, differential selective pressures as a
result of transmission mode were suggested by the presence of minor alleles that move to fixation
in the aphid vectored plants, but remain as low frequency alleles in the mechanically inoculated
plants. We determined that the high level of coverage obtained during deep sequencing makes it
the preferred method for detecting low frequency variants in the population.
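As a rough illustration of why sampling depth matters for minor variants, the sketch below computes the probability of observing a variant at least once among a given number of sampled sequences. It assumes that sequences are sampled independently and without error, which real cloning and sequencing data do not satisfy. The function names and the coverage value of 1,000 reads are hypothetical choices made for illustration and are not values from this study.

```python
import math


def detection_probability(frequency: float, n_sampled: int) -> float:
    """Chance that at least one of n_sampled sequences carries the variant."""
    if not 0.0 <= frequency <= 1.0:
        raise ValueError("frequency must lie in [0, 1]")
    if n_sampled < 0:
        raise ValueError("n_sampled must be non-negative")
    # Complement of "every sampled sequence lacks the variant".
    return 1.0 - (1.0 - frequency) ** n_sampled


def sequences_needed(frequency: float, target: float = 0.95) -> int:
    """Smallest sample size that detects the variant with probability >= target."""
    if not 0.0 < frequency < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("frequency and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - frequency))


if __name__ == "__main__":
    clones = 35       # average number of clones per sample reported above
    coverage = 1000   # hypothetical per-site read depth, for illustration only
    for freq in (0.10, 0.01, 0.001):
        print(
            f"variant at {freq:.1%}: "
            f"clones={detection_probability(freq, clones):.3f}, "
            f"reads={detection_probability(freq, coverage):.3f}, "
            f"needed for 95% detection={sequences_needed(freq)}"
        )
```

Under these simplifying assumptions, a variant present at 1% frequency would be observed in roughly 30% of samples of 35 clones, and about 300 sequences would be needed to observe it with 95% probability.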
The fourth study (chapter five) was prompted by the results I obtained while procuring
vertically transmitted samples of ZYMV for sequencing, which showed that the seed transmission
rate of ZYMV was three orders of magnitude greater than the most commonly cited rate
(0.047%). Whether or not seed transmission occurred in ZYMV was a controversial issue as the
rates in the literature ranged from 0-18.9%. Therefore, to definitively determine what the seed
transmission rate of ZYMV was in C. pepo, we measured the seed transmission rate of this virus
by visual inspection, RT-PCR, and antibody tests. We found a seed transmission rate of 1.6%
using RT-PCR, and showed that vertically infected C. pepo plants are capable of initiating
horizontal ZYMV infections, both mechanically and via an aphid vector (Myzus persicae). Thus,
it appears that ZYMV infected seeds may act as viral reservoirs, thereby accounting for the
current geographic distribution of ZYMV. We also found that vertical ZYMV infection in C.
pepo results in virtually symptomless infection and that antibody tests failed to detect vertical
ZYMV infection, suggesting that current methods used to detect seed-borne variants of this viral
pathogen need to be modified.
This dissertation explores the nucleotide substitution rate of ZYMV, the patterns and
extent of viral genetic diversity within individual hosts, the effect of transmission mode on this
diversity, as well as the vertical transmission rate of this virus. As a group, these studies reveal
the underlying mechanisms of an emerging RNA virus that will serve to aid in managing this
devastating crop pathogen. In addition, these studies highlight the need to consider how
methodological choices may impact viral population genetic results and, by extension, data
interpretation.
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGMENTS
Chapter 1 Introduction
  The study systems
  Methods
Chapter 2 Rapid evolutionary dynamics of Zucchini yellow mosaic virus
  Abstract
  Introduction
  Methods
  Results and Discussion
Chapter 3 Rapid turnover of intra-host genetic diversity in Zucchini yellow mosaic virus
  Abstract
  Introduction
  Methods
  Results
  Discussion
Chapter 4 Deep sequencing reveals persistence of intra- and inter-host genetic diversity in natural and greenhouse populations of Zucchini yellow mosaic virus
  Abstract
  Introduction
  Methods
  Results
  Discussion
Chapter 5 Experimental verification of seed transmission of Zucchini yellow mosaic virus
  Abstract
  Introduction
  Methods
  Results
  Discussion
Chapter 6 Discussion
References
LIST OF FIGURES
Figure 1-1: Diagram depicting different cell types that can be infected by the virus, from Principles of Plant Virology
Figure 2-1: Maximum likelihood tree of 55 ZYMV CP sequences
Figure 3-1: Experimental design of study
Figure 3-2: Minimum spanning tree of the sequences
Figure 3-3: Spatial distribution of mutations in the CP gene from both the field and greenhouse experiments
Figure 4-1: Schematic representation of the field experimental design showing the spatial relationship between individual plants
Figure 4-2: Representative simulation of the resampling of Illumina reads to estimate the effect of coverage on the detection threshold of minor alleles
Figure 4-3: Effect of coverage on the probability of detecting the ZYMV coat protein alleles
Figure 4-4: Variation in allele frequency over time and space of ZYMV variants
Figure 4-5: Distribution of mutations across the ZYMV genome under field and greenhouse conditions
Figure 5-1: Minimum-spanning tree of the seed clones
LIST OF TABLES
Table 2-1: Bayesian estimates of population dynamic and evolutionary parameters of the CP gene of ZYMV
Table 3-1: Summary of the ZYMV CP sequences from each infected plant under aphid-vectored (field) and mechanically-inoculated (greenhouse) transmission
Table 4-1: Summary of genome coverage statistics of Illumina sequence data
Table 4-2: Summary of the 27 variants found in more than one sample
ACKNOWLEDGMENTS
First and foremost, I would like to thank my advisor Dr. Andrew Stephenson for his
support over the past five years. In particular, I am extremely grateful that he had enough faith in
me to allow me to pursue my ideas, no matter how ridiculous, and without which this thesis
would not have been possible. I feel honored and privileged to have had the opportunity to work
with Dr. Edward Holmes from whom I have learned more than I could possibly begin to list in
the allotted space. It has been a real pleasure to have the opportunity to work with Dr. Fred
Gildow, to whom I am indebted both personally and professionally for his advice, as well as the
use of his lab, greenhouse and resources. I would also like to thank Drs Andrew Read and
Michael Axtell for their invaluable insights, comments and contributions to this thesis.
In addition to my committee I have been extremely fortunate to be surrounded by a
tremendous network of collaborators, colleagues and friends. I would like to thank Dr. Stephen
Schaeffer, who has always been happy to provide advice, equipment and freezer space. I am
extremely grateful for the expertise and help that I obtained from Tony Omesis and William
Sackett: Tony for maintaining my endless experiments in the greenhouse, and William for his
advice, maintenance of my plants and aphids, as well as for teaching me how to perform
transmission tests. I am deeply appreciative of Kari Peter for taking me under her wing and I am
indebted to Siobain Duffy and Ben Dickins for their advice and help throughout my PhD. I am
extremely grateful to my undergrads, Melinda Bothe and Sarah Scanlon, who slaved away
performing thousands of mini preps and RNA extractions. I would also like to thank the
314-office crew (Miruna Sasu, Andre Wallace, Lindsey Swierk, Renee Rosier and honorary office
crew member Dominique Cowart) for their support and kindness. To my friend and colleague,
Joseph Dunham, I would not have made it this far without you.
Last but not least I would like to thank my husband, Aaron Parker, for his incredible
support and patience throughout this process, and most of all for not divorcing me, and to my son
Bradley, who has had to sacrifice so much while I have been in school: thank you for not
disowning me.
Chapter 1
Introduction
As a result of rapid replication rates, large population sizes, and high mutation rates,
populations of RNA viruses are thought to exhibit extremely high levels of genetic diversity.
Understanding the patterns of intra-host viral diversity is key to understanding the underlying
evolutionary mechanisms in RNA viruses, as high levels of genetic diversity have been linked to
the capacity of these viruses to evade host resistance mechanisms (Feuer et al., 1999; Lech et al.,
1996), switch hosts (Jerzak et al., 2008), and alter virulence (Acosta-Leal et al., 2011).
Estimates of the rates of molecular evolution in RNA viruses range from 10⁻² to 10⁻⁵
nucleotide substitutions per site per year (subs/site/year) (Duffy et al., 2008). When I began my
dissertation project, it was believed that plant RNA viruses evolved more slowly than their animal
counterparts (Blok et al., 1987; Fraile et al., 1997; Kim et al., 2005; Marco & Aranda, 2005;
Rodriguez-Cerezo et al., 1991). This was thought to be due to weaker immune-mediated
selection, lower mutation rates and the effects of population bottlenecks on the viral population
(Garcia-Arenal et al., 2001). Hence, the first goal of this dissertation is to examine this
assumption that plant viruses evolve more slowly than their animal counterparts by
computing the mean substitution rate for the coat protein (CP) of Zucchini yellow mosaic
virus (ZYMV).
Although consensus sequences are valuable for inferring phylogenetic relationships
between populations, they are less informative of intra-host genetic diversity because the
consensus sequence represents average viral diversity within a population, typically the most
prevalent viral strains, masking the diversity of individual virions. In addition, most plant RNA
viral genetic diversity studies had been conducted at the inter-host level, and those that had
examined intra-host viral genetic diversity reported conflicting results. For instance, limited
(<0.1%) intra-host genetic diversity was observed by Turturo et al. (2005) in Grapevine
leafroll-associated virus 3, while higher levels of intra-host diversity were observed by Teycheney (2005)
in Banana mild mosaic virus, with divergence levels of more than 15% in a third of
the sequences obtained. Intermediate levels of nucleotide diversity (ranging from 0 to 2.4%) were
found by Jridi et al. (2006) in Plum pox virus measured over 13 years in a Prunus tree.
Although the population sizes achieved by plant RNA viruses are expected to be
extremely high (e.g. 10¹¹ to 10¹² virions per infected leaf in Tobacco mosaic virus) (Garcia-Arenal
et al., 2003), it is believed that the effective population sizes (Ne) are significantly lower
(García-Arenal et al., 2001), mostly as the result of population bottlenecks. Population bottlenecks
are thought to occur during several stages in the viral lifecycle: during vector transmission, during
systemic movement through the plant (as the virus moves from cell to cell and from tissue
to tissue), and as the virus enters the germ line. In fact, several studies report extremely low
numbers of virions being transmitted per transmission event. Moury et al. (2007), using an in vitro
system, reported that, on average, only 0.5-3.2 Potato virus Y virions are transmitted per aphid; Ali et
al (2006) determined the number of virions transmitted from mechanically infected squash plants
to healthy plants via aphids (Aphis gossypii and Myzus persicae) was three virions on average for
both aphid species. Betancourt et al (2008) using Cucumber mosaic virus (CMV) estimated that
only one or two complete genomes of this multipartite virus are transmitted by aphids. Similar
drastic population bottlenecks have been reported during systemic movement. For instance,
Sacristan et al. (2003), using Tobacco mosaic virus (TMV), estimated the founding
population in a new leaf after systemic movement within tobacco to be between two and 20
virions, and French & Stenger (2003) determined that approximately four virions of Wheat streak
mosaic virus appeared to be involved in the invasion of new tillers of wheat. Likewise, Li &
Roossinck (2004) reported similar results from examining the movement of 12 experimental
mutants of CMV in tobacco: an average of seven mutants were found in
the eighth leaf and an average of five in the 15th leaf (distance from inoculated leaf). Genetic
bottlenecks have also been observed in cell-to-cell movement of Soil-borne wheat mosaic virus,
where Miyashita & Kishino (2010) determined the cell-to-cell bottleneck to be ~6 virions for the
initial movement from the infected cell and ~5 virions in subsequent movements. Therefore,
severe bottlenecks appear to be a common modifier of plant viral populations and are likely to have
a large impact on virus evolution.
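To make the potential effect of such bottlenecks concrete, the following sketch simulates repeated transmission events in which only a few virions found the next population. It is a neutral-drift toy model, assuming binomial sampling of founders followed by regrowth with no mutation and no selection, so it is not a description of any of the cited experiments. The function names, the starting frequency and the number of replicates are hypothetical; the founder number of three simply echoes the aphid estimate of Ali et al (2006) cited above.

```python
import random


def transmit(frequency: float, founders: int, rng: random.Random) -> float:
    """Variant frequency after a bottleneck of `founders` randomly drawn virions."""
    if founders < 1:
        raise ValueError("founders must be at least 1")
    # Each founder independently carries the variant with probability `frequency`.
    carriers = sum(1 for _ in range(founders) if rng.random() < frequency)
    return carriers / founders


def serial_bottlenecks(frequency: float, founders: int, events: int,
                       rng: random.Random) -> list[float]:
    """Frequency trajectory across successive bottlenecks (pure drift)."""
    trajectory = [frequency]
    for _ in range(events):
        frequency = transmit(frequency, founders, rng)
        trajectory.append(frequency)
    return trajectory


def loss_fraction(frequency: float, founders: int, events: int,
                  replicates: int, seed: int = 0) -> float:
    """Fraction of replicate lineages in which the variant ends at frequency zero."""
    rng = random.Random(seed)
    lost = sum(
        serial_bottlenecks(frequency, founders, events, rng)[-1] == 0.0
        for _ in range(replicates)
    )
    return lost / replicates


if __name__ == "__main__":
    # Hypothetical scenario: a variant at 10% frequency passed through
    # five transmission events with different founder numbers.
    for founders in (3, 20, 200):
        lost = loss_fraction(0.10, founders, events=5, replicates=2000)
        print(f"founders={founders}: variant lost in {lost:.1%} of replicates")
```

In this model a variant that is lost at any bottleneck cannot reappear, so small founder numbers are expected to purge low-frequency variants quickly.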
Although several studies have explored the effect of artificially induced population
bottlenecks, very little work had been done to assess the effects of population bottlenecks as they
occur in nature (Li & Roossinck, 2004). In addition, a comparative study undertaken by
Schneider and Roossinck (2001) showed that mutation frequencies tended to be higher in plant
protoplasts than in intact plants, indicating that in vitro studies are not necessarily representative
of in planta conditions. Thus, the second goal of this dissertation is to assess the impact of
population bottlenecks on intra- and inter-host genetic diversity in plants growing under
greenhouse and field conditions.
The Study Systems
Zucchini yellow mosaic virus
Zucchini yellow mosaic virus (ZYMV), a member of the family Potyviridae, is a single-stranded,
positive-sense RNA virus. Although ZYMV was initially discovered in Italy in 1973, it
was not formally described until 1981 (Lisa et al., 1981). Remarkably, within the next two
decades, this virus achieved a worldwide distribution and is thus considered to be an emerging
virus (Desbiez & Lecoq, 1997). Although the virus is present in temperate, subtropical and
tropical regions, few potential reservoirs have been identified. Natural infection appears to be
limited to members of the Cucurbitaceae, and the virus has been reported in wild cucurbits in the
United States, Jordan and Sudan; however, no natural reservoirs have been reported in temperate
regions (Desbiez & Lecoq, 1997).
Symptom severity is dependent upon the time of infection — the younger the plant is
when infection occurs, the more severe the resulting symptoms. In addition, the strain of ZYMV
and the environmental conditions, particularly temperature, appear to affect symptom severity
(Desbiez and Lecoq, 1997). ZYMV infection often results in severe stunting of the entire plant, as
well as a distinctive yellow mottling of the leaves, and infected leaves often exhibit blistering and
laciniation (Desbiez & Lecoq, 1997). The fruits of ZYMV-infected plants are often mottled and
distorted and, although they are edible, they tend to be unmarketable. Cucurbit (squash, melon and
cucumber) production in the United States alone is estimated to be worth $1.5 billion per annum,
and cucurbits rank among the 15 most important agricultural crops in the United States (Cantliffe
et al., 2007). Given that ZYMV has the capacity to reduce agricultural yields by up to 94%, it is an
extremely significant crop pathogen (Blua & Perring, 1989).
Virus transmission
ZYMV is transmitted by aphids in a non-persistent manner, also known as non-circulative
or stylet-borne transmission (Nault, 1997). The virions remain on the stylet of the
aphid, which is believed to act as a “flying syringe”. Acquisition and inoculation occur
during a brief (< 1 min) epidermal puncture that is part of a gustatory-based food selection process
(Nault & Styer, 1972, Powell & Hardie, 2000). The intracellular portion of the aphid probe has
been divided into three sub-phases (II-1, II-2 and II-3) (Powell et al., 1995). Aphids are thought
to acquire virions during a brief (5-10 seconds) intracellular probe of either epidermal or
mesophyll cells (Lopez-Abella & Bradley, 1969; Powell, 1991) in II-3 (Martin et al., 1997). Viral
inoculation is thought to occur while the aphids are ejecting watery saliva during the first
intracellular puncture (II-1) (Powell, 2005). Ejection of watery saliva continues until a mesophyll
or epidermal cell is punctured, at which point it is believed that the watery saliva may switch to
gelling saliva (Martin et al., 1997). The virions are thought to associate with the distal third of the
maxillary stylet (Wang et al., 1996).
The virus is transmitted in what is termed the helper strategy, which differs from the
capsid strategy, in that the coat protein (CP) does not interact directly with the aphid stylet but
rather the CP interacts with the aphid mouthpart through an intermediary called the Helper
Component protein (HC-Pro). Therefore, transmission occurs when the DAG motif on the CP
interacts with the PTK region of the HC-Pro and a secondary motif on the HC-Pro (KLSC)
interacts with the stylet. The key difference between the helper and capsid strategies is that in the
helper strategy, the HC-Pro and virion can be picked up separately with the effect that a given
HC-Pro can transmit a virion from a completely different plant or even from a different leaf of the
same host (Pirone & Blanc, 1996). This may have significant implications for the maintenance of
genetic diversity in the viruses that use this strategy.
To date, 26 aphid species have been shown to be capable of transmitting ZYMV (Katis et
al, 2006), although with differing efficiencies. The two most efficient transmitters of ZYMV in
laboratory and field tests have been shown to be Myzus persicae and Aphis gossypii, with 41%
and 35% efficiencies, respectively (Castle et al., 1992).
The aphid vector remains viruliferous for a very limited time period (~five hours at 21°C)
after acquisition of the virus (Fereres et al., 1992), which suggests that aphids may not be directly
involved in the long-distance dissemination of ZYMV. This, in combination with the current
worldwide distribution of ZYMV and the fact that there are no known reservoirs of ZYMV in
temperate regions, raises the possibility that vertical transmission of ZYMV may be instrumental
in the dissemination of this virus. Thus, the third goal of this dissertation is to assess the rate
of seed transmission in ZYMV and to determine if vertically infected plants are capable of
initiating horizontal infections.
Seed transmission within the Potyviridae is not uncommon, but how the virus enters the
germ line is currently unknown. However, there is some evidence in Pea seed-borne mosaic virus
(PSbMV) that the virus uses the suspensor as a mode of entry into the embryonic tissues. Once
fertilization has occurred the zygote will undergo an asymmetrical cell division, resulting in a
small apical cell, which will become the embryo and a larger basal cell (commonly called the
suspensor) (Wang & Maule 1994). The function of the suspensor is to provide nutrients for the
growing embryo from the endosperm. The suspensor in pea during the early stages of seed
development appears to be anchored close to the micropyle (a tiny opening in the ovule through
which the pollen tube enters), as well as maintaining close contact with the endosperm wall
(Wang & Maule 1994) (Fig. 1-1). It is believed that the virus moves from the maternal cells in the
micropyle to the endospermic cytoplasm and embryonic suspensor from which it invades the
embryo (Roberts et al., 2003).
Genomic organization and protein function
The ZYMV genome is ~9,600 nt long with a virus-encoded protein (VPg) covalently
linked to the 5′ end and a polyadenylated 3′ end. The spatial arrangement is typical of the
Potyviridae, and protein functions are described here in genomic order. P1 encodes a proteinase and, along
with the third protein (P3), is the least conserved region in the viral genome. In addition, P1 has
been shown to enhance amplification and movement of the virus (Urcuqui-Inchima et al., 2001).
Due to the low conservation of sequence identity between potyviruses, it is believed P1 may be
involved in host-virus interactions (Shukla et al., 1991). The HC-Pro is required for aphid
transmission, and has proteinase activity that is responsible for cleaving the HC-P3 junction
(Shukla et al., 1991, Urcuqui-Inchima et al., 2001). The HC-Pro is believed to be involved in
viral amplification, synergism, symptom development, and is a suppressor of post-transcriptional
gene silencing (PTGS), or RNA interference (RNAi) (Gal-on, 2007). It has been proposed that
the HC-Pro may also function to aid the entry and exit of the virus into and out of the host
vascular system (Urcuqui-Inchima et al., 2001). The P3 protein, as a result of the lack of
sequence homology, is not well characterized (Shukla et al., 1991, Urcuqui-Inchima et al., 2001),
which may suggest a virus specific function (Shukla et al., 1991). However, it has been suggested
that this protein may play a role in both virus amplification and plant pathogenicity
(Urcuqui-Inchima et al., 2001). It is believed that the P3-6K1 complex may encode a
pathogenicity determinant (Urcuqui-Inchima et al., 2001).
The Cylindrical inclusion protein (CI) protein acts as an RNA helicase as it unwinds the
RNA duplex, and may also be involved in cell-to-cell movement of the virus (Shukla et al., 1991,
Urcuqui-Inchima et al., 2001). The function of the 6K2 protein has not yet been established;
however, mutated 6K2 genes have been introduced into another potyvirus genome, tobacco etch
virus (TEV), and have shown to be either detrimental or lethal to the virus. It has also been
proposed that the 6K2 protein anchors the replication apparatus to ER-like membranes (UrcuquiInchima et al., 2001). The small nuclear inclusion protein (Nla) protein acts as a proteinase
(Shukla et al., 1991, Urcuqui-Inchima et al., 2001), and it has been suggested that it may also
posses a nuclear localization function (Urcuqui-Inchima et al., 2001). The VPg is believed to act
as a primer for viral synthesis, as well as protecting the mRNA from attack by exonucleases
(Shukla et al., 1991). The large nuclear inclusion protein (Nlb) is the RNA-dependent polymerase
for the virus. The coat (or capsid) protein (CP) is involved in encapsidation of the viral RNA,
vector transmission (Shukla et al., 1991, Urcuqui-Inchima et al., 2001), regulation of viral RNA
amplification, as well as cell-to-cell and systemic movement (Urcuqui-Inchima et al., 2001). It is
believed that the CP may function in host specificity (Shukla et al., 1991).
Entry into the cell, translation and replication
In order for the virus to gain entry into the host cell, the cell wall needs to be physically
penetrated and for ZYMV this occurs via the aphid stylet. Once the virus has gained entry into the
cell, uncoating is bidirectional, occurring first and more rapidly from the 5’ end (with 70% of the
viral RNA being uncovered within 3 minutes) and more slowly from the 3’ end (Wu & Shaw,
1996). As ZYMV is a positive sense RNA virus, it is infectious once uncoated and can be directly
translated. Although the ZYMV RNA lacks a cap structure at the 5’ end, this region is believed to
contain two regulatory regions, which are thought to direct cap-independent translation (Niepel &
Gallie, 1999) through interactions with the poly-A tail (Gallie, 2001). The VPg functions to
repress translation of capped messengers by proteolysis of eIF4G (a factor necessary for
translation of capped mRNAs) (Sachs et al., 1997). The eukaryotic translation machinery is
heavily biased to express only the 5' open reading frame. For the entire genome to be expressed,
the genome encodes a single open reading frame that codes for a large polyprotein precursor that
is processed into 10 putative proteins by three viral encoded proteases: the first protein (P1), the
helper component protein (HC-Pro) and the small nuclear inclusion protein (NIa) (Gal-on, 2007).
The proteases allow for two levels of regulation: first through the rate of proteolysis and second
through regulating the efficiency of cleavage site recognition (Merits et al., 2002). The genome is
expressed as a single ORF, which results in equimolar amounts of each protein, but this is not
always desirable, especially in the case of the polymerase. Thus potyviruses are thought to
transport their excess replication proteins (NIa and NIb) to the nucleus where they are
subsequently sequestered (Restrepo et al., 1990).
There are two stages of replication. First, the positive strand is copied into a negative
strand and, second, the negative strand is copied multiple times into positive strands. Once the
parental viral RNA is translated, the replicase proteins are available. At this point the parental
strand forms a replication complex with the newly synthesized viral proteins and replication
begins at the 3’ end of the parental virion. Replication is believed to be primed by the VPg in both
the negative and positive directions. Once formed, the negative strand serves as a template for
positive strand formation. The association of the negative strand with several growing positive
strands is called the intermediary complex, and free negative strands are not typically found in the
cell (Astier et al., 2007). The 6K2 has been proposed to anchor the replication apparatus
(Urcuqui-Inchima et al., 2001) to the replication site, which for the genus Potyvirus is believed to
be associated with the endoplasmic reticulum (ER) (Martin et al., 1995). The ER is thought to
form vesicles that protect the replication complex from host defense responses (Ahlquist et al.,
2003).
Cell-to-cell and systemic movement
For infection to occur, the virus must be capable of moving both cell-to-cell as well as
from organ-to-organ. Any infection that is halted in the first infected cells, termed subliminal
infection, will not result in systemic infection (Furusawa & Okuno, 1978). In potyviruses at least
four proteins are involved in virus movement: the CP, the HC-Pro, the CI and the VPg. It is
believed that the CP binds to the viral RNA and is involved in altering the size exclusion limit of
the plasmodesmata (thin strands of cytoplasm that pass through the cell walls of
adjacent plant cells and allow communication between them), thus facilitating cell-to-cell
movement of the virus. This phenomenon is believed to be transient and follows the infection
front (Heinlein et al., 1995; Oparka et al., 1997). The HC-Pro is also thought to increase
plasmodesmal permeability (Rojas et al., 1997), and the CI is believed to guide the CP-RNA
complex to the plasmodesmata (Rodríguez-Cerezo et al., 1997). It is currently unknown how the
VPg is involved in viral movement, but mutated VPgs in turnip mosaic virus have been shown to
reduce both cell-to-cell and systemic movement (Dunoyer et al., 2004).
Although long distance movement of plant viruses has not been studied as extensively as
cell-to-cell movement, it is clear that for the Potyviridae, the CP is necessary for long distance
spread within a plant. However, it has proved to be extremely difficult to tease apart the
independent roles that this protein plays in long distance vs. localized spread of the virus. In order
for systemic infection to occur, the virus must enter the vascular tissue. The virus moves from the
mesophyll cells and through a series of cells, which are the perivascular parenchyma, the phloem
parenchyma, the companion cells, and finally into the sieve tube elements, which are a series of
cells that are joined end-to-end and form a continuous tube through which carbon metabolites are
transported from the “source” leaves to the “sink” immature leaves (Fig. 1-1).
Figure 1-1: Diagram from Principles of Plant Virology - Genome, Pathogenicity, Virus Ecology.
© 2007, Science Publishers (English version)
Phylogeny
At least 25 strains of ZYMV have been identified (Desbiez and Lecoq, 1997).
Phylogenies of ZYMV (based on the Coat Protein) indicate three clusters of isolates exclusive of
the more distant Singapore and Reunion Island isolates (Zhao et al., 2003). The first cluster,
group I, includes the majority of the European isolates, as well as some Japanese and Chinese
isolates, and a Californian strain. Group II are all from Asia (South Korea, Taiwan, Hangzhou
and Japan), while Group III includes several Chinese isolates. Of particular interest is that the
members of Group III differ from the other two clusters in terms of the symptoms that they cause.
The group III viruses cause severe mosaic symptoms on the leaves, but not the fruits, whereas
groups I and II induce severe symptoms on both the leaves and fruits (Zhao et al., 2003).
Cucurbita pepo ssp. texana
Cucurbita pepo ssp. texana (the Texas gourd, or free-living squash) is a monoecious,
annual vine with indeterminate growth and reproduction. It is native to Northern Mexico, Texas,
and the states along the Mississippi River from Southern Illinois southward. It is thought that this
particular subspecies resulted either as an early escape from cultivation, or that it is the wild
progenitor of cultivated squashes (Decker & Wilson, 1987; Decker-Walters, 1990; Decker-Walters
et al., 2002; Lira et al., 1995). It is cross-compatible with all cultivated squash and
pumpkins, as well as annual Cucurbita taxa from Mexico (Arriaga et al., 2006). C. pepo is
considered to be the optimal host for the maintenance of ZYMV (Gal-on, 2007).
Methods
Maximum likelihood tree building
In Chapter two, we use 55 consensus sequences of the CP: six that we generated
from samples obtained from our experimental fields in Pennsylvania, and 49 from around
the world that had been deposited in GenBank. To determine the evolutionary
relationships amongst these samples, we generated a maximum likelihood (ML) tree using the
PAUP* package (Swofford, 2003). ML is a method in which a hypothesis about evolutionary
history is evaluated in terms of the probability that the proposed model of evolution and the
hypothesized tree would give rise to the observed set of sequences (Page & Holmes, 2007). ML
methods are generally considered to be more accurate than other tree building methods.
However, there are disadvantages to this method: it is very computationally intensive
and is highly dependent on the model of evolution (Huelsenbeck, 1995). Therefore, to determine
which model of evolution best fit our data, we used the program MODELTEST,
which compares 56 models of nucleotide substitution and selects the best
model based on the data (Posada & Crandall, 1998).
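To make the ML criterion concrete, the following is a minimal sketch that evaluates the likelihood of a single pairwise distance under the Jukes-Cantor (JC69) model, the simplest nucleotide substitution model. It is not the model or the tree search used in chapter two; the two short sequences and all function names are hypothetical and serve only to illustrate how a candidate value is scored by the probability of the observed data.

```python
import math


def jc69_log_likelihood(n_sites, n_diff, d):
    """Log-likelihood of two aligned sequences separated by d substitutions per site."""
    decay = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * decay   # probability that a site shows the same base
    p_diff = 0.25 - 0.25 * decay   # probability of one specific different base
    return (n_sites * math.log(0.25)               # equal base frequencies under JC69
            + (n_sites - n_diff) * math.log(p_same)
            + n_diff * math.log(p_diff))


def jc69_ml_distance(n_sites, n_diff):
    """Closed-form ML estimate of the distance under JC69."""
    p = n_diff / n_sites
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical 20-nt sequences differing at two sites (not real ZYMV data).
seq_a = "ACGTACGTACGTACGTACGT"
seq_b = "ACGTACGAACGTACCTACGT"
n_sites = len(seq_a)
n_diff = sum(a != b for a, b in zip(seq_a, seq_b))

# Score a grid of candidate distances and keep the one with the highest likelihood.
grid = [i / 1000 for i in range(1, 1001)]
best_d = max(grid, key=lambda d: jc69_log_likelihood(n_sites, n_diff, d))
print(n_diff, best_d, round(jc69_ml_distance(n_sites, n_diff), 4))
```

The grid search and the closed-form estimate should agree to within the grid spacing. A full ML tree search applies the same scoring to every branch of every candidate topology, which is why the method is computationally demanding.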
Minimum spanning tree building
Most traditional tree building methods require a fair amount of variance between the
sequences in order to accurately reconstruct relationships (Huelsenbeck & Hillis 1993), but the
clonal data generated in chapter three displayed very little variance. In fact, ~90% of the
sequences obtained were identical to the consensus and three of the twenty samples had no
mutations whatsoever. Therefore, I opted to use a minimum spanning tree approach to determine
the population structure of these sequences. The program I used, TCS, is based on a method
developed by Templeton et al. (1992) that uses statistical parsimony to infer population level
genealogies on samples with very low variance (Clement et al., 2000). After collapsing the
haplotypes, the program calculates their frequency. These frequencies are then used to estimate
haplotype outgroup probabilities. An absolute distance matrix is calculated for all pairwise
comparisons, and the probability of parsimony is calculated for these pairwise differences with a
95% probability cut-off. The number of mutational differences associated with this cut-off is the
maximum number of mutational connections between pairs of sequences that is justified by the
parsimony criterion. These connections are then used to output the
resulting minimum spanning tree or network (Clement et al., 2000).
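The first steps of this procedure (collapsing haplotypes, counting their frequencies, and computing absolute pairwise distances) can be sketched as follows. This is an illustration only: the clone sequences are hypothetical, the function names are my own, and a plain minimum spanning tree built with Prim's algorithm stands in for the statistical parsimony connection limit that TCS actually applies.

```python
from collections import Counter


def hamming(a, b):
    """Absolute number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes):
    """Prim's algorithm on pairwise Hamming distances; returns (distance, from, to) edges."""
    nodes = list(haplotypes)
    in_tree = {nodes[0]}
    edges = []
    while len(in_tree) < len(nodes):
        # Cheapest connection from the tree built so far to a haplotype outside it.
        best = min((hamming(u, v), u, v)
                   for u in in_tree for v in nodes if v not in in_tree)
        edges.append(best)
        in_tree.add(best[2])
    return edges


# Hypothetical clones from one sample: most are identical to the consensus.
clones = (["ACGTACGT"] * 6 + ["ACGTACGA"] * 2 + ["ACGAACGA"] + ["TCGTACGT"])

haplotype_counts = Counter(clones)             # collapse identical sequences
ordered = [h for h, _ in haplotype_counts.most_common()]
mst_edges = minimum_spanning_tree(ordered)     # start from the most frequent haplotype
total_mutations = sum(dist for dist, _, _ in mst_edges)

for dist, u, v in mst_edges:
    print(u, "--", dist, "--", v)
```

Starting from the most frequent haplotype mirrors the expectation that the consensus sequence sits at the centre of the network, although the resulting tree does not depend on the starting node when edge weights are unambiguous.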
BEAST
We used the BEAST package (Drummond & Rambaut, 2007) to ascertain the rate of
nucleotide substitution per site, as well as the time to the most recent common ancestor
(TMRCA) of the ZYMV CP sequences in chapter two. Time structure is a requirement for this
analysis, and I was only able to acquire a year of collection for a subset of the CP sequences
from GenBank. As a result, only 35 of the 55 sequences were used in this analysis. The BEAST
program models the rate of molecular evolution for each branch of the phylogenetic tree using the
Bayesian Markov chain Monte Carlo (MCMC) approach. This approach uses the Metropolis-Hastings
algorithm to approximate the posterior distribution. It searches along a chain of
hypothetical trees and provides an estimate of the probability that a given tree is correct (Lakner
et al., 2008).
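The logic of the Metropolis-Hastings algorithm can be illustrated with a one-parameter toy example. The sketch below samples the posterior of a substitution rate given a count of substitutions; it is not what BEAST does internally, since BEAST samples trees and many parameters jointly. The substitution count, the Poisson likelihood, the flat prior, and all names are assumptions made for illustration.

```python
import math
import random


def log_posterior(rate, n_subs, exposure):
    """Unnormalised log-posterior: Poisson likelihood with a flat prior on rate > 0."""
    if rate <= 0:
        return -math.inf
    expected = rate * exposure
    return n_subs * math.log(expected) - expected


def metropolis_hastings(n_subs, exposure, n_steps, step_sd, start, seed):
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, n_subs, exposure)
    samples = []
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_sd)    # symmetric random-walk proposal
        proposal_lp = log_posterior(proposal, n_subs, exposure)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:                  # accept, otherwise keep current state
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Hypothetical data: 9 substitutions observed over 815 sites and 22 years.
N_SUBS = 9
EXPOSURE = 815 * 22   # site-years

samples = metropolis_hastings(N_SUBS, EXPOSURE, n_steps=20000,
                              step_sd=2e-4, start=5e-4, seed=1)
kept = samples[2000:]                        # discard burn-in
posterior_mean = sum(kept) / len(kept)
analytic_mean = (N_SUBS + 1) / EXPOSURE      # mean of the exact Gamma posterior
print(posterior_mean, analytic_mean)
```

Because this toy posterior has a known closed form, the sampled mean can be checked against the analytic value, a check that is not available for the tree-valued posteriors that BEAST explores.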
Sanger sequencing
I used Sanger sequencing to generate the clonal data in chapters three and five, as well as
the consensus sequences in chapter two. After PCR amplification and purification of the
sample(s) of interest, sequencing occurs when reverse strand synthesis is performed on these
copies starting from a known primer sequence located upstream of the desired sequence in a
mixture of deoxynucleotides (dNTPs) and dideoxynucleotides (ddNTPs). The dNTPs are the
standard A, C, G and T building blocks of DNA and the ddNTPs are modified nucleotides that
lack a hydroxyl group at the 3’ carbon of the sugar, preventing a phosphodiester bond from forming
with the phosphate group of another dNTP or ddNTP. The polymerization reaction is terminated
when a ddNTP is incorporated instead of a dNTP; therefore, the mixture of both types of bases
randomly causes the extension to be terminated in a non-reversible fashion resulting in molecules
of different lengths. After denaturing and clean up, the molecules are sorted by molecular weight
using capillary electrophoresis, and the fluorescent label attached to the ddNTP is read out
sequentially in the order created by the sorting step (Kircher & Kelso, 2010).
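The chain-termination principle can be mimicked with a short simulation. This is a sketch under simplifying assumptions: the template is hypothetical and written 3’ to 5’ so that the new strand is its base-by-base complement, the termination probability is arbitrary, and fragment length stands in for molecular weight in the sorting step.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sanger_read(template_3to5, n_fragments=5000, p_terminate=0.2, seed=1):
    """Simulate chain termination and read the sequence from length-sorted fragments."""
    rng = random.Random(seed)
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    terminal_base_by_length = {}
    for _ in range(n_fragments):
        for length, base in enumerate(new_strand, start=1):
            # With probability p_terminate a labelled ddNTP is incorporated
            # instead of a dNTP, and extension stops irreversibly.
            if rng.random() < p_terminate:
                terminal_base_by_length[length] = base
                break
        # Fragments that never incorporate a ddNTP carry no label and are ignored.
    # Electrophoresis sorts fragments by size; the labels are read in that order.
    return "".join(terminal_base_by_length[n] for n in sorted(terminal_base_by_length))


template = "TACGGATCCGAT"   # hypothetical template, written 3' to 5'
read = sanger_read(template)
print(read)
```

With enough fragments every possible length is represented, so the labels read in order of fragment size reconstruct the newly synthesized strand.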
Illumina Sequencing
As cloning-free DNA amplification is possible through high-throughput sequencing
technologies such as Illumina/Solexa, in chapter four I decided to undertake a deep sequencing
approach on the samples generated in chapter two for two reasons. The first reason is that aphid
transmission of ZYMV involves more than one protein, the HC-Pro and the CP. The second
reason is that the level of coverage obtained with cloning and Sanger sequencing was fairly low (we
averaged 35X per sample in chapter three). Illumina sequencing parallelizes the sequencing
process with the result that millions of reads can be produced at one time (Morozova & Marra,
2008). During library preparation, two different adaptors are added to the 3’ and 5’ ends of each
molecule. On the surface of the flow cell (which is the solid surface of the sequencer), there are
two populations of immobilized oligonucleotides that are complementary to the two different
single-stranded adapter ends of the sequencing library. These hybridize to the single-stranded
DNA fragments, thus attaching them to the flow cell. The molecule is then bent over and
hybridized to a complementary adapter thus creating a “bridge” that serves as the template for
complementary strands. Bridge amplification is the process of bending and reverse synthesis,
whereby reverse strand synthesis starts from the hybridized portion, such that the new strand is
covalently bound to the flow cell. When the new strand bends over and attaches to another short
nucleic acid sequence complementary to the second adapter sequence attached to the free end of
the strand, it is then used to synthesize a second covalently bound reverse strand, and so on and so
forth. Once the amplification step is completed, the flow cell will contain ~ 40 million clusters,
each of which contains ~ 1000 clonal copies of a single template molecule. The process uses a
sequencing by synthesis concept that is similar to the Sanger sequencing process: the
incorporation reaction is halted after each base, then the label of the incorporated base is read, and
then the sequencing reaction continues with the incorporation of the next base. Illumina uses
reversible terminators carrying removable fluorescent labels, together with DNA polymerases that
incorporate these terminators into the chain. The terminators are fluorescently labeled with a
different color for each base, so that the sequence is inferred as the color is read at each
nucleotide step (Kircher & Kelso, 2010).
Chapter 2
Rapid evolutionary dynamics of Zucchini yellow mosaic virus
Abstract
Zucchini yellow mosaic virus (ZYMV) is an economically important virus of cucurbit
crops. However, little is known about the rate at which this virus has evolved within members of
the family Cucurbitaceae, or the timescale of its epidemiological history. Herein, we present the
first analysis of the evolutionary dynamics of ZYMV. Using a Bayesian coalescent approach we
show that the coat protein of ZYMV has evolved at a mean rate of 5.0 × 10⁻⁴ nucleotide
substitutions per site, per year. Notably, this rate is equivalent to those observed in animal RNA
viruses. Using the same approach we show that the lineages of ZYMV sampled here have an
ancestry that dates back no more than 800 years, suggesting that human activities have played a
central role in the dispersal of ZYMV. Finally, an analysis of phylogeographical structure
provides strong evidence for the in situ evolution of ZYMV within individual countries.
Introduction
Zucchini yellow mosaic virus (ZYMV), first isolated in 1973 and described in 1981 (Lisa
et al., 1981), is the cause of one of the most economically important diseases of the family
Cucurbitaceae, naturally infecting plants in more than 50 countries (Desbiez & Lecoq, 1997).
Symptoms include yellowing, stunting, leaf deformations, and misshaped and discoloured fruits,
which often renders the fruits unmarketable, drastically reducing agricultural yields (Blua &
Perring, 1989; Desbiez & Lecoq, 1997; Gal-On, 2007). Although ZYMV is widespread, few viral
reservoirs have been identified, particularly in temperate regions (Desbiez & Lecoq, 1997).
ZYMV is a single-stranded, positive-sense RNA virus of the family Potyviridae. The
primary mode of transmission is via aphids in a non-persistent manner. Although 10 aphid
species have been reported as vectors (Katis et al., 2006; Lisa et al., 1981), a wider range of
potential aphid vectors has been identified under experimental conditions (Blackman & Eastop,
2000; Katis et al., 2006). While aphid transmission is undoubtedly the main route of spread for
ZYMV, infrequent seed transmission has also been proposed (Robinson et al., 1993;
Schrijnwerkers et al., 1991), the epidemiological importance of which is uncertain (Johansen et
al., 1994).
ZYMV has a genome of 9593 nt arranged as a single open reading frame encoding a
polyprotein precursor that is processed into 10 putative proteins (Gal-On, 2007). Of these, the
coat protein (CP) is involved in the encapsidation of viral RNA, vector transmission (Shukla et
al., 1991; Urcuqui-Inchima et al., 2001), the regulation of viral RNA amplification and cell-to-cell
and systemic movement (Urcuqui-Inchima et al., 2001). Transmission occurs as a result of
the interaction between the aphid stylet, CP and the HC-Pro protein (Pirone & Blanc, 1996), such
that some mutations in CP and HC-Pro disrupt viral transmission (Gal-On, 2007; Pirone & Blanc,
1996; Shukla et al., 1991; Urcuqui-Inchima et al., 2001). The CP is also extensively used as a
tool to infer the phylogenetic relationships among viral isolates (Rybicki & Shukla, 1992; Shukla
et al., 1991).
A variety of studies have explored the extent and structure of genetic diversity in ZYMV,
particularly within a biogeographical context. Analysis of a 250 nt fragment of 160 viral isolates
sampled from 23 geographical areas revealed two major groups of ZYMV, denoted A and B, with
the former divided into three clusters (Desbiez et al., 2002). A subsequent analysis of the CP
revealed three main groups of isolates with differing geographical distributions (Zhao et al.,
2003). Group I included the majority of European isolates, as well as some from China and Japan,
and a single Californian isolate. Group II was exclusively composed of viruses from Asia, while
group III included several Chinese isolates. Notably, while group I and II isolates resulted in
mosaic symptoms on leaves and fruit distortion, group III viruses did not cause symptoms on the
fruit, but induced severe mosaic symptoms on the leaves (Zhao et al., 2003). More
phylogenetically distant ZYMV isolates were observed in Singapore and Réunion (and other
islands in the Indian Ocean representing group B of Desbiez et al., 2002), which likely reflects
their biogeographical separation (Gal-On, 2007; Zhao et al., 2003). More localized
phylogeographical studies have revealed that viruses can diffuse within specific localities, such as
Central Europe (Glasa & Pittnerova, 2006; Glasa et al., 2007; Tobias & Palkovics, 2003), perhaps
mediated by the local spread of aphids. However, isolates sampled from adjoining locations are
not always related (Pfosser & Baumann, 2002), suggesting that biogeographical structure may, to
some extent, be determined by the international trading of infected seeds (Desbiez et al., 2002;
Tobias & Palkovics, 2003).
There has also been considerable interest in using sequence data from plant RNA viruses
to infer evolutionary dynamics. Although a combination of intrinsically high rates of mutation,
rapid replication and large population sizes are thought to provide RNA viruses with abundant
genetic variation, some plant RNA viruses appear more genetically stable than their animal
counterparts (Garcia-Arenal et al., 2001, 2003). This could be due to a combination of
intrinsically lower rates of mutation (Malpica et al., 2002) and a reduced fixation rate of
advantageous non-synonymous mutations because of weaker immune selection (Garcia-Arenal et
al., 2001). Similarly, genetic bottlenecks play a major role in structuring genetic diversity during
both systemic infection (French & Stenger, 2003; Li & Roossinck, 2004; Sacristan et al., 2003)
and horizontal transmission by aphids (Ali et al., 2006).
Despite the agricultural importance of ZYMV, there has been little work documenting
either the rate of molecular evolution of this virus or the age of the sampled genetic diversity,
reflected in the time to the most recent common ancestor (TMRCA). However, this information is
central to understanding the evolutionary dynamics of plant RNA viruses in general, and
particularly whether they exhibit reduced rates of evolutionary change, which in turn may have
major implications for their ability to emerge in new host species.
Cucurbita pepo ssp. texana is an annual monoecious vine that is native to northern
Mexico, Texas, and the lower Mississippi River drainage area. It is thought to be either the wild
progenitor of the cultivated squashes (C. pepo ssp. pepo) or an early escape from cultivation
(Decker & Wilson, 1987; Decker-Walters, 1990; Decker-Walters et al., 2002; Lira et al., 1995).
Methods
ZYMV infection of plants collected during the 2006 growing season was determined
immunologically (DAS-ELISA test kit; Agdia). Leaf tissue from infected plants was then
homogenized in liquid nitrogen and RNA was extracted using a Qiagen RNeasy Plant Mini kit. First-strand
cDNA was synthesized from the extracted RNA using the Superscript III First-Strand kit
(Invitrogen). The target cDNA was then amplified directly via PCR and sequenced. The CP-specific
primers used for the cDNA, PCR and sequencing steps were: forward, 5’-AAGATTGGCACGCTA-3’;
reverse, 5’-CGGTAAATATTAGAATTAGCTCG-3’. All sequences generated
here have been submitted to GenBank and assigned accession numbers EU371645–EU371650. A
total of six ZYMV CP sequences, newly acquired here, were combined with 49 collected from GenBank
(accession numbers available from the authors on request), producing a total dataset of 55 CP
sequences, 815 nt in length. To determine the evolutionary relationships among all 55 sequences
we employed the maximum-likelihood (ML) method available within the PAUP* package
(Swofford, 2003). The best-fit model of nucleotide substitution was determined by MODELTEST
(Posada & Crandall, 1998) as TIM+I+Γ4 and this was used as the basis for tree bisection-reconnection
branch-swapping (parameter values available from the authors on request). A
bootstrap resampling approach (1000 replications), employing the ML substitution model, was
used to assess the support for individual nodes. To determine the strength of phylogenetic
clustering by country of virus isolation we employed a parsimony character mapping approach
(Carrington et al., 2005). Each ZYMV sequence was therefore assigned a character state
reflecting its country (or continent) of origin. Given the ML phylogeny for these sequences, the
minimum number of state changes needed to produce the observed distribution of country
character states was estimated using parsimony (excluding ambiguous changes). To determine the
expected number of changes under the null hypothesis of complete mixing among countries, the
states of all isolates were randomized 1000 times. The difference between the mean number of
observed and expected state changes indicates the level of geographical isolation, with statistical
significance assessed by comparing the total number of observed state changes to the number
expected under random mixing. All analyses were performed using PAUP* (Swofford, 2003).
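The parsimony randomization test can be sketched as follows. This is a minimal illustration using Fitch parsimony on a small rooted binary tree written as nested tuples; the tree, tip names and country labels are hypothetical, the function names are my own, and unlike the PAUP* analysis the sketch does not exclude ambiguous changes.

```python
import random


def fitch_changes(tree, states):
    """Return (possible ancestral states, minimum number of state changes) for a subtree."""
    if isinstance(tree, str):                  # a tip: its state is observed
        return {states[tree]}, 0
    left_set, left_n = fitch_changes(tree[0], states)
    right_set, right_n = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:                                 # children agree: no change needed here
        return shared, left_n + right_n
    return left_set | right_set, left_n + right_n + 1


def tip_names(tree):
    if isinstance(tree, str):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])


def randomization_test(tree, states, n_random=1000, seed=1):
    """Compare observed state changes with those expected under complete mixing."""
    rng = random.Random(seed)
    observed = fitch_changes(tree, states)[1]
    tips = tip_names(tree)
    labels = [states[t] for t in tips]
    null_counts = []
    for _ in range(n_random):
        rng.shuffle(labels)                    # randomize country labels across tips
        null_counts.append(fitch_changes(tree, dict(zip(tips, labels)))[1])
    expected = sum(null_counts) / n_random
    p_value = sum(n <= observed for n in null_counts) / n_random
    return observed, expected, p_value


# Hypothetical tree in which isolates cluster perfectly by country.
tree = ((("us1", "us2"), ("us3", "us4")), (("cn1", "cn2"), ("cn3", "cn4")))
states = {"us1": "USA", "us2": "USA", "us3": "USA", "us4": "USA",
          "cn1": "China", "cn2": "China", "cn3": "China", "cn4": "China"}

observed, expected, p_value = randomization_test(tree, states)
print(observed, expected, p_value)
```

Fewer observed changes than expected under randomization indicates geographical clustering; the p-value here is the fraction of randomized label sets that require no more changes than the observed one.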
The rate of nucleotide substitution per site, as well as the TMRCA of the ZYMV CP
sequences were estimated using the Bayesian Markov chain Monte Carlo approach implemented
in the BEAST package (Drummond & Rambaut, 2007). This approach analyses the distribution
of tip times on millions of plausible sampled phylogenies, so that estimates are set within a
rigorous statistical framework. As this analysis requires time-structured data, where the date of
sampling of each isolate is known, it was restricted to a subset of 35 CP sequences for which the
year of sampling was available, representing a 22 year period from 1984 to 2006. In the case of
eight Chinese viruses, sampling dates were only known to the nearest two possible years. To
account for this uncertainty, analyses were repeated using the different sampling times available.
We also compared the demographical models of a constant population size and exponential
population growth, employing both strict and relaxed (uncorrelated lognormal) molecular clocks.
Bayes factors were used to determine the best supported model. Because the TIM+I+Γ4
substitution model is unavailable in the BEAST package, the closely related GTR+I+Γ4 model
was used in its place. The extent of statistical uncertainty in parameter estimates is reflected in the
95% highest probability density values. Finally, site-specific selection pressures in the 55 CP
dataset were estimated as the ratio of non-synonymous (dN) to synonymous substitutions (dS) per
site (ratio dN/dS) using both the single likelihood ancestor counting (SLAC) and fixed effects
likelihood (FEL) methods, available at the Datamonkey facility (Kosakovsky Pond & Frost,
2005).
Results and Discussion
In accord with other studies of the phylogeography of ZYMV, distinct clusters of viral
isolates are apparent in the ML tree of 55 CP sequences (Fig. 2-1). These clusters represent: (i) a
large group of isolates sampled from a variety of locations in Asia (China, Japan, Korea and
Taiwan), Europe and the Middle-East (Austria, Germany, Israel, Italy, Hungary and Slovenia),
and USA, and previously denoted as groups I and II; (ii) China (previously denoted group III);
and (iii) Singapore and the Réunion Island (previously unclassified). We found no compelling
evidence for the existence of group II isolates (from Asia), as these fell within the phylogenetic
diversity of group I viruses, and suggest that those isolates from Singapore and the Réunion
Island are so phylogenetically distinct that they be assigned to their own group.
A number of inferences can be made from this spatial pattern. First, the greatest level of
genetic diversity, including the deepest phylogenetic split, is seen in Asia (particularly China),
including the presence of one clade of viruses that has only been observed (to date) in China.
Although this is compatible with the lineages of ZYMV sampled here having an origin in Asia,
this will need to be confirmed with a larger sample of isolates. Second, other than a virus sampled
in Florida in 1984, all other USA isolates, sampled between 1992 and 2006 and including those
newly obtained from Pennsylvania, have a single common ancestor (Fig. 2-1). Although the sample
size is small, this suggests that there has been some in situ evolution of ZYMV in the USA since
this time, without the importation of new viral material. Our parsimony analysis of geographical
structure also revealed a strongly significant clustering by country of origin compared with that
expected by chance alone (P<0.001). A similarly strong clustering was observed by continent
(Americas, Asia, Europe and the Middle-East, Indian Ocean; P<0.001). Hence, although ZYMV
is able to cross geographical boundaries as indicated by the many countries represented within
groups I/II, such gene flow is not sufficiently frequent to eradicate geographical structure. More
generally, this strong spatial clustering suggests that there is little vertical transmission of ZYMV
through cultivated cucurbits, because commercial seeds of cultivated species are likely to be
frequently transported across national borders.
Figure 2-1: ML tree of 55 ZYMV CP sequences
For viruses where the year of sampling is available, these dates are given in parentheses.
Those viruses sampled as part of this study are shaded grey. The group nomenclature depicted
represents that previously proposed for ZYMV (Zhao et al., 2003). The tree is drawn to a scale of
0.05 nt substitutions per site and bootstrap values (>90%) are shown next to the relevant nodes.
The tree is mid-point rooted for clarity only.
The best supported evolutionary model for the CP of ZYMV under our Bayesian
coalescent analysis was that of exponential population growth under a relaxed molecular clock
(Table 2-1). Under this model the mean rate of evolutionary change for ZYMV was 5.0 × 10⁻⁴
nucleotide substitutions per site, per year. Similar rates were obtained under different
demographical and molecular clock models, incorporating the different possible sampling times
for those viruses where the exact year of sampling was unknown, and using a range of prior
values for the substitution rate, indicating that they are robust (results available from the authors
on request). This high evolutionary rate falls within the normal range observed in RNA viruses,
most of which represent animal RNA viruses (Jenkins et al., 2002; Hanada et al., 2004). As such,
we find no evidence that ZYMV evolves any slower than animal RNA viruses that are subject to
the same, error-prone replication.
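To give a sense of scale, the mean rate can be converted into an expected number of substitutions across the 815 nt CP alignment. The calculation below is a back-of-envelope sketch that assumes the mean rate applies uniformly across sites and along a single lineage, which the relaxed clock model does not require; the variable names are illustrative.

```python
# Values taken from the text.
RATE = 5.0e-4                  # nucleotide substitutions per site, per year
CP_LENGTH_NT = 815             # length of the CP alignment
SAMPLING_YEARS = 2006 - 1984   # sampling period of the dated sequences

subs_per_year = RATE * CP_LENGTH_NT                # expected substitutions per year in the CP
subs_over_window = subs_per_year * SAMPLING_YEARS  # expected over the sampling period
years_per_sub = 1.0 / subs_per_year                # average waiting time per substitution

print(round(subs_per_year, 4), round(subs_over_window, 2), round(years_per_sub, 2))
```

Under these assumptions a single lineage would accumulate roughly one CP substitution every two to three years, or about nine over the 22 year sampling period.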
Table 2-1: Bayesian estimates of population dynamic and evolutionary parameters of the CP
gene of ZYMV.
HPD, Highest probability density (95 %).
Although repeated population bottlenecks undoubtedly influence the genetic structure of viral
populations in the short-term (Li & Roossinck, 2004), they will have no effect on long-term
evolutionary rates if most substitutions are selectively neutral. Similarly, although a weaker
immune response against plant RNA viruses will reduce the rate at which some non-synonymous
mutations accumulate (García-Arenal et al., 2001), the fact that these normally constitute a minor
fraction of the total number of nucleotide substitutions means that they are unlikely to have a
major impact on long-term evolutionary rates. In support of this we found no evidence for
positive selection acting on the CP of ZYMV using either the SLAC or FEL methods; the
predominant evolutionary pressure was that of negative (purifying) selection, with a mean dN/dS
of 0.108 and 106 of 271 codons negatively selected under the SLAC method. This agrees with
previous studies of the CPs of plant RNA viruses, which indicate that they are subject to
relatively strong purifying selection (Chare & Holmes, 2004). Further, the lack of positive
selection suggests that experimental passage has not had a major impact on our analyses.
Although the rapid evolutionary rates observed here for ZYMV will need to be verified for a
wider range of plant RNA viruses, the implication from this work is that mutational and
replicatory dynamics are similar across a broad range of RNA viruses.
Such high rates of evolutionary change also lead to a recent TMRCA for the isolates of
ZYMV analysed here (Table 2-1). Although there is a relatively large date range because of the
inherent sampling error on this analysis (119–771 years), these dates clearly indicate that the
spread of this virus has been recent. Indeed, these dates broadly coincide with important
ecological changes that may have assisted the spread of ZYMV, including (i) an increase in the
number of hectares of worldwide cucurbit cultivation; (ii) the cultivation of cucurbits in novel
areas with few wild Cucurbitaceae, facilitating viral transfer from a non-cucurbitaceous plant to
the cultivated cucurbits (as observed in a contemporary setting; Perring et al., 1992), and (iii) the
cultivation, in close proximity, of cucurbit crops with diverse origins, which allowed the virus to
jump to new genera of the family Cucurbitaceae. Overall, our study highlights the utility of gene
sequence data to reveal key aspects of the epidemiological history of plant RNA viruses.
Chapter 3
Rapid turnover of intra-host genetic diversity in Zucchini yellow mosaic virus
Abstract
Genetic diversity in RNA viruses is shaped by a variety of evolutionary processes,
including the bottlenecks that may occur at inter-host transmission. However, how these
processes structure genetic variation at the scale of individual hosts is only partly understood. We
obtained intra-host sequence data for the coat protein (CP) gene of Zucchini yellow mosaic virus
(ZYMV) from two horizontally transmitted populations – one via aphid, the other without – and
with multiple samples from individual plants. We show that although mutations are generated
relatively frequently within infected plants, attaining similar levels of genetic diversity to that
seen in some animal RNA viruses (mean intra-sample diversity of 0.02%), most mutations are
likely to be transient, deleterious, and purged rapidly. We also observed more population
structure in the aphid transmitted viral population, including the same mutations in multiple
clones, the presence of a sub-lineage, and evidence for the short-term complementation of
defective genomes.
Introduction
Determining the extent and structure of genetic variation in RNA viruses is central to
understanding the mechanisms that shape their evolution. The high levels of genetic diversity that
characterize many RNA viruses have been linked to their ability to adapt rapidly to changing
environments including new host species (Holmes, 2009; Jerzak et al., 2008; Woolhouse et al.,
2001), and to evade mechanisms of host resistance (Feuer et al., 1999; Lech et al., 1996).
Most estimates of the rate of molecular evolution in animal RNA viruses fall within
approximately one order of magnitude of a mean rate of 1 × 10⁻³ nucleotide substitutions per site,
per year (subs/site/year; Duffy et al., 2008). In contrast, it has previously been suggested that
plant RNA viruses are characterized by lower rates of evolutionary change, in some cases by
several orders of magnitude (Blok et al., 1987; Fraile et al., 1997; Kim et al., 2005; Marco and
Aranda, 2005; Rodríguez Cerezo et al., 1991). This major difference in evolutionary dynamics
has been attributed to intrinsically lower mutation rates, weaker immune-mediated positive
selection, and the frequent occurrence of population bottlenecks (García-Arenal et al., 2001,
2003). However, more recent analyses using longitudinally sampled gene sequence data have
resulted in substitution rate estimates in accord with those previously observed in animal RNA
viruses, at least in the short term (Fargette et al., 2008; Gibbs et al., 2008, 2010; Pagán and
Holmes, 2010). As a case in point, we previously reported a mean evolutionary rate of 5 × 10⁻⁴
subs/site/year for the coat protein (CP) of Zucchini yellow mosaic virus (ZYMV) (Simmons et al.,
2008).
Most studies of genetic diversity in plant viruses have been conducted at the inter-host
level. However, if plant RNA viruses do evolve as rapidly as suggested by the analysis of
epidemiological scale sequence data then we would also expect them to exhibit measurable
genetic diversity at the intra-host scale. Those studies undertaken to date have found varying
levels of intra-host variation. Turturo et al. (2005) observed limited (<0.1%) intra-host genetic
diversity in Grapevine leafroll-associated virus, while Jridi et al. (2006) noted that the nucleotide
diversity of Plum pox virus measured over 13 years in a prunus tree ranged from 0 to 2.4%.
Rather higher levels of intra-host diversity were observed in Banana mild mosaic virus, with
divergence levels of more than 15% in a third of the sequences obtained (Teycheney et al., 2005).
Determining the extent and patterns of intra-host genetic diversity in plant RNA viruses
is central to revealing the fundamental processes of viral evolution. Large-scale population
bottlenecks are thought to result in effective population sizes for RNA viruses that are several
orders of magnitude lower than census population numbers (García-Arenal et al., 2001).
Indeed, population bottlenecks have been documented during aphid transmission in Cucumber
mosaic virus (Ali et al., 2006; Betancourt et al., 2008) and Potato virus Y (Moury et al., 2007).
Systemic bottlenecks (that occur as the virus moves from cell-to-cell and tissue-to-tissue) may
reduce effective population sizes even further (French and Stenger, 2003; Sacristán et al., 2003;
Li and Roossinck, 2004; Miyashita and Kishino, 2010). In these circumstances genetic drift is
predicted to play a major role in the substitution dynamics of mutant alleles. However, little is
known about the frequency and impact of population bottlenecks in natural virus populations (Li
and Roossinck, 2004). As an exception, the extent of genetic diversity in Citrus tristeza virus
transmitted via aphids was reduced by an order of magnitude compared to that found in the sweet
orange (Citrus sinensis) host (Nolasco et al., 2008).
ZYMV was first isolated in 1973 in Italy, and since this time the virus has been found in
more than 50 countries as a naturally occurring infection of the Cucurbitaceae (Desbiez and
Lecoq, 1997; Desbiez et al., 2002). Viral symptoms include a distinctive yellow mottling in the
leaves, stunting of the plant, and severe deformities in the fruits and leaves (Desbiez and Lecoq,
1997; Gal-On, 2007). Production of cucurbits in the United States is valued at approximately $1.5
billion per annum (Cantliffe et al., 2007), and as ZYMV infection can reduce agricultural yields
by up to 94% (Blua and Perring, 1989), it is one of the most economically significant agricultural
pathogens in cultivated cucurbits (squash, melon and cucumber). ZYMV is a member of the
Potyviridae family of positive-sense, single-stranded encapsidated RNA viruses. The ∼9.5 kb
viral genome encodes a single polyprotein precursor that is cleaved into ten putative proteins
(Gal-On, 2007). Transmission occurs primarily via aphids in a non-persistent manner (Lisa et al.,
1981) and, to date, 26 aphid species have been shown to transmit ZYMV (Katis et al., 2006). The
viral coat protein (CP) is multifunctional and involved in cell-to-cell and systemic movement, the
regulation of viral RNA amplification (Urcuqui-Inchima et al., 2001), encapsidation of the RNA,
vector transmission (Urcuqui-Inchima et al., 2001; Shukla et al., 1991), and perhaps host
specificity (Shukla et al., 1991). ZYMV transmission is the result of an interaction between the
stylet of the aphid, the helper component protein (HC-Pro), and the conserved DAG (Asp-Ala-Gly)
region of the CP (Pirone and Blanc, 1996). The highly variable N-terminus region of the CP
is exposed on the surface of the coat protein and is thought to contain virus-specific epitopes. The
core region and C-terminus are more conserved, although the last ten amino acids of the
C-terminus may be exposed on the viral surface (Gal-On, 2007).
To obtain a better understanding of the patterns and processes of plant virus evolution at
the scale of individual hosts, we analyzed the intra-host genetic diversity of ZYMV in Cucurbita
pepo ssp. texana (a wild gourd) under two distinct modes of transmission: aphid-vectored and
mechanically-inoculated (i.e. without aphids). The aphid-vectored experiment was conducted in
an experimental field and resulted in two types of data: a time series as the virus evolves within
the host over the course of the infection, and epidemiological-scale data following the spread of
the virus as it was transmitted by aphids between hosts during the growing season. Because the
number of transmission events is not controlled, these data recapitulate the natural spread of the
virus. Using data of the first type the extent of the bottleneck imposed by the aphid during
transmission can be estimated. The second type of data allowed us to determine if mutations are
transmitted between individuals or are generated anew within each individual.
In the mechanical inoculation experiment, carried out in a greenhouse, ZYMV was
serially passaged across four generations by mechanical inoculation. By comparing these data to
those from the field study we were able to compare viral genetic diversity with and without the
aphid-imposed bottleneck. To assess the effect of intra-host systemic bottlenecks, half of the fifth
and eighth leaves from each mechanically-inoculated individual were used separately to inoculate
another individual. This follows the design of two earlier studies which showed that the number
of mutant clones present in a leaf decreased as a function of distance from the original inoculum
source, presumably as a result of systemic bottlenecks (Li and Roossinck, 2004; Ali and
Roossinck, 2010).
Methods
Field experiment
The field experiment was conducted at The Pennsylvania State University Agriculture
Experiment Station at Rock Springs, Pennsylvania, USA, using Cucurbita pepo ssp. texana (a
wild gourd). One 0.4-hectare field was laid out as a grid labeled A-L and 1–15, with
approximately six meters between plants and 180 plants per field (Fig. 3-1a). In 2007 individual
F-8 (located in the middle of one of the fields) was mechanically inoculated with ZYMV, the
consensus sequence of which has been deposited in GenBank (accession number EU371649).
When the inoculated plant, CF8, exhibited viral symptoms a leaf was collected. Plant labels are as
follows: the first character, C, designates that the sample was collected from the field; the next
letter and number (in this case F8) designate the plant coordinates within the field grid; and the
number in parentheses denotes the order in which samples were collected from an individual plant. As
neighboring plants became infected, leaf samples were collected so that a leaf sample was
gathered every two weeks from each individual that displayed disease symptoms from the onset
of visible symptoms until the host plant died (approximately 9 weeks in total). Presence of
ZYMV was detected immunologically using DAS-ELISA (Agdia, IN) and confirmed by
polymerase chain reaction (PCR) and sequencing of the viral CP. The DAS-ELISA results not
only confirmed the presence of ZYMV in the field plants but also revealed that only one of the
plants (CE7) was co-infected with another potyvirus. Leaf samples from confirmed ZYMV-infected
plants were stored at −80 °C. Although samples were collected from all of the infected
plants in the field, eleven samples, representing six individual plants, were selected for
sequencing. One plant (CF7) was sampled at three time points (August 4th, August 28th,
September 13th); three plants were sampled at two time points (CE7 on September 13th and
September 20th, CE8 on August 8th and September 13th, and CG7 on August 30th and
September 20th); and clonal sequences were sampled only once from two plants (CF8 and CG6;
Table 3-1).
Greenhouse experiment
Two individual plants were mechanically inoculated in January of 2008 with a ZYMV
sample taken from the first diseased individual from the 2007 season (CF8). The mechanical
inoculations were performed in the greenhouse using carborundum powder (500 gm). The inoculum
was prepared from infected plant tissue diluted in a phosphate buffer (0.1 M Na₂H/KH₂PO₄ buffer)
in a 1:3 ratio. The carborundum powder was dusted on the surface of the leaf, and the
inoculum was then applied with a pestle to the leaf surface. When the plants displayed disease
symptoms and exhibited at least an additional eight leaves of growth from the inoculation site
(typically 4–5 weeks), half of the fifth and eighth leaves (distance from the first inoculated leaf)
were each used separately to inoculate another individual and so on through four generations
(Fig. 3-1b). The infection rate of the mechanical inoculations was 100%. The other half of each
leaf was stored at −80 °C. We generated clonal sequence data from nine samples representing
one transmission chain. In summary, clones were generated from the fifth and eighth leaves of
individual A, the fifth and eighth leaves of individual C (which was infected from the fifth leaf of
A), the fifth leaf of individual G (which was infected from the fifth leaf of C), the fifth and eighth
leaves of individual H (which was infected from the eighth leaf of C), and the fifth leaf of
individual O (which was infected from the fifth leaf of G). In addition, we sequenced one sample
(fifth leaf of K) from the third generation from the eighth leaf of A.
Figure 3-1: Experimental design of the current study. (a) Field experiments. The schematic
shows the position of the field plants relative to each other. Plant labels are as follows: the first
character, C, designates that the sample was collected from the field, the next two characters designate the
plant coordinates within the field grid, and the number in parentheses denotes the number of
samples collected from an individual plant. The boxed images that occur between the sampled
field plants are of Aphis gossypii (cotton aphid), which serves to indicate that the spread of
infection in the field occurred naturally (i.e. was aphid vectored). (b) Greenhouse experiments.
The first field infected plant was used to infect plant A, the fifth leaf of which was used to infect
C. The fifth leaf of C was used to infect G and the eighth leaf to infect H. The fifth leaf of G was
used to infect O, and K was infected from the third generation from the eighth leaf of A.
RNA isolation, PCR analysis, cloning and sequencing
RNA was isolated from frozen leaf samples using the RNeasy® Plant Mini Kit (Qiagen,
CA). First-strand cDNA was synthesized from the extracted RNA following the protocol
provided by the supplier using the SuperScript™ III First-Strand kit (Invitrogen, CA). The target
cDNA was then amplified directly via PCR using Phusion® High-Fidelity PCR Master Mix
(Finnzymes, MA). Although we used a high fidelity Taq polymerase to reduce the number of
‘mutations’ introduced during the experimental procedure, it is impossible to fully eliminate
RT-introduced errors (see Results section). Prior to cloning with the TOPO® TA
Cloning® Kit (Invitrogen, CA), each sample was purified using the QIAquick PCR Purification
Kit (Qiagen, CA) and an A overhang was added to each sample. Before submitting samples for
sequencing at The Pennsylvania State University Nucleic Acid Facility, each sample was purified
with the QIAprep Spin Miniprep Kit (Qiagen, CA). The CP-specific primers used for the cDNA
and PCR steps were: forward: AAGTGAATTGGCACGCTA; reverse:
CGGTAAATATTAGAATTACGTCG. To ensure that mutations were valid, each clone was
sequenced in forward and reverse and manually aligned. Any mutations occurring in one
direction only were discarded. T7 forward and M13 reverse primers were used for clone
sequencing. All sequences generated here have been submitted to GenBank and assigned
accession numbers HM768168–HM768204.
Sequence analysis
All ZYMV sequences were manually aligned using Se-Al (2.0a11; kindly provided by
Andrew Rambaut, University of Edinburgh) and trimmed to cover the coat protein region: from
the CP start codon until the stop codon, for a total of 849 nucleotides (nt). Counts of the number
of mutations in each sample were undertaken manually, while pairwise genetic distances were
estimated using MEGA (version 3) (Kumar et al., 2004). Because of the very small number of
mutations observed we utilized uncorrected genetic (p) distances. As the number of cloned
sequences varies across individual plants or time points we performed a chi-squared goodness of
fit test (using R 2.10.1; 2008) to correct for the number of mutations compared to the number of
sequences. To estimate the number of nonsynonymous (dN) and synonymous substitutions (dS)
per site (ratio dN/dS), itself a measure of selection pressure, we used the Single Likelihood
Ancestor Counting (SLAC) algorithm employing the MG94 × HY85 3 × 4 substitution model in
HyPhy (Kosakovsky Pond et al., 2005). Finally, minimum spanning trees for the field and
greenhouse populations were estimated separately using the statistical parsimony approach
available in the TCS 1.21 program (Clement et al., 2000).
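As an illustration of the uncorrected (p) distance used here, the following Python sketch computes the proportion of differing sites between two aligned sequences and averages it over all pairs of clones in a sample. It is not the MEGA implementation; the function names and the toy clone set (nine copies of an arbitrary 849-nt sequence plus one single-site mutant) are assumptions made purely for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: proportion of aligned sites that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    n_diff = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return n_diff / len(seq_a)


def mean_pairwise_distance(seqs):
    """Mean p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical data: an arbitrary 849-nt "consensus" (not the ZYMV CP sequence)
consensus = "ACGT" * 212 + "A"
mutant = "T" + consensus[1:]          # one substitution at the first site
clones = [consensus] * 9 + [mutant]   # 10 clones, 9 identical to the consensus

mean_p = mean_pairwise_distance(clones)
print(f"mean pairwise p-distance: {mean_p * 100:.4f}%")
```

In this toy sample 9 of the 45 clone pairs differ at a single site, so the mean is (9/45) × (1/849), or about 0.024%. This shows how a sample in which most clones match the consensus yields a mean diversity of a few hundredths of a percent.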
Results
To determine the extent and structure of intra-host viral genetic diversity in ZYMV we
sequenced clones from 20 viral samples representing both the greenhouse and field populations.
In total, we obtained 706 clonal sequences, with an average of 35 sequences per leaf sample.
Approximately 90% of the clones sequenced were identical to the consensus sequence. Pairwise
genetic distances ranged from 0 to 0.11%, with an overall mean of 0.02% for the field and
greenhouse populations combined (Table 3-1).
Table 3-1: Summary of the ZYMV CP sequences from each infected plant under aphid-vectored
(field) and mechanically-inoculated (greenhouse) transmission.
Mutational spectrum in the field plants
We generated a total of 378 clones from 11 field samples. Of these, 329 had no mutations
and therefore matched the consensus sequence generated from the first-infected field plant.
Clones from two of the individual plants, including the first inoculated plant – CF8 and CE7(2) –
exhibited no mutations. Overall, there were a total of 47 mutated sequences and 23 different
mutations, 18 of which were singletons (occurred in one sequence only). This represents a
mutational frequency of 1.47 × 10⁻⁴ mutations per nucleotide site. Ten of the mutations were
synonymous; two sequences exhibited the same silent mutation, and 13 sequences from
individual CG7 at time point 1 showed a change from a TAG stop codon to a TAA stop codon.
There were 13 nonsynonymous mutations, three of which were found in multiple clones. Notably,
one of these nonsynonymous mutations (TTG to TAG) resulted in a premature stop codon and
was found in seven (19.4%) of the clones from plant CG7(1). A minimum spanning tree showing
the structure of this genetic diversity is shown in Fig. 3-2a. Although most mutations are only one
step away from the consensus, clear population structure was present in the form of three clones
being two mutational steps away from the consensus, a number of mutations present in multiple
clones, and in one case a mutant clone (at position 849) itself possessing a descendant mutation
(at position 786). The latter is indicative of a distinct sub-lineage, although one that is only found
at a single time-point in a single plant. Although we cannot exclude the possibility of a ZYMV
infection other than our primary inoculant, given the low level of genetic diversity and the fact
that ∼90% of the sequenced clones match the consensus this seems extremely unlikely.
DAS-ELISA tests undertaken by Agdia revealed that only one of the samples, CE7, was co-infected
with another virus, in this case Watermelon mosaic virus-2 (WMV-2). There appears to be no
significant difference in mean pairwise genetic divergence, or mean dN/dS, between this sample
and the other field samples (Table 3-1).
Previous work has suggested that 36 of the 42 amino acids of the N-terminus of the CP can be
altered with no apparent effect on the viral life-cycle and hence are highly variable (Gal-On,
2007). In our study, only five of the total of 23 mutations occurred in this region, three of which
were nonsynonymous. However, when correcting for sequence length we observed no significant
difference in the number of mutations between the N-terminus and the rest of the CP (p =
0.2618). We also observed no mutations in the conserved DAG region known to be involved in
aphid transmission.
Finally, the number of unique mutations did not differ significantly over time within
individuals (CF7: p = 0.944; CE7: p = 0.0578; CG7: p = 0.345; CE8: p = 0.418). However, the
total number of mutated sequences within an individual over time was significantly different for
two individuals (CE7: p = 0.0339 and CG7: p = 0.0077 applying the same correction).
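The correction for unequal clone numbers can be sketched as a chi-squared goodness-of-fit test in which the expected number of mutations at each time point is proportional to the number of clones sequenced. The sketch below is restricted to two time points (one degree of freedom), for which the p-value has a closed form. The counts are hypothetical and are not taken from Table 3-1, and the analysis reported here was run in R rather than with this code.

```python
import math


def chi2_gof_two_groups(observed, n_sequences):
    """Chi-squared goodness of fit for two groups (1 degree of freedom).

    observed    -- mutation counts at the two time points
    n_sequences -- number of clones sequenced at each time point
    Expected counts are the total mutations split in proportion to clones.
    """
    if len(observed) != 2 or len(n_sequences) != 2:
        raise ValueError("this sketch handles exactly two groups")
    total_obs = sum(observed)
    total_seq = sum(n_sequences)
    expected = [total_obs * n / total_seq for n in n_sequences]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 d.f., P(chi2 > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical example: 2 mutations among 30 clones, then 8 among 40 clones
stat, p_value = chi2_gof_two_groups([2, 8], [30, 40])
print(f"chi-squared = {stat:.3f}, p = {p_value:.3f}")
```

With more than two time points (as for CF7) the same statistic applies, but the p-value requires the chi-squared distribution with the appropriate degrees of freedom, which a statistics package would supply.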
Mutational spectrum in the greenhouse plants
A total of 328 clones were generated from the nine greenhouse plants, 301 of which had
no mutations and so matched the consensus sequence of the first-infected field plant. Only one
individual plant, from the third generation, exhibited no mutations. There were a total of 24
mutated sequences and 18 different mutations, 17 of which were singletons, representing a mutational
frequency of 8.7 × 10⁻⁵ mutations/site. Seven of the mutations were synonymous, and 11 were
nonsynonymous, one of the latter being found in seven clones. One stop codon mutation was
found in one sequence. Notably, none of the mutations were the same between transmission
events. Three of the 18 mutations were found in the highly variable N-terminus region of the CP,
although we again observed no mutations in the conserved DAG region. Finally, comparing the
fifth and eighth leaves within a plant, we found that the number of mutations was the same
between them in plant A, increased from one to five in plant C, and increased from two to four in
plant H. Crucially, however, we identified no shared mutations between sequenced clones from
the fifth and the eighth leaves, indicative of a rapid population turnover. Indeed, the minimum
spanning tree of these data is striking in its marked lack of population structure, such that all the
mutations are only one step away from the consensus (although one is present in seven clones;
Fig. 3-2b).
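The per-site frequencies quoted for the field and greenhouse populations appear to correspond to the number of mutated sequences divided by the total number of nucleotides sequenced (clones × 849 sites). That reading is an assumption, as the calculation is not spelled out in the text. The short Python sketch below applies it to the counts given above.

```python
CP_LENGTH_NT = 849  # length of the CP region analysed (nt)


def mutation_frequency(mutated_sequences, n_clones, n_sites=CP_LENGTH_NT):
    """Mutated sequences per nucleotide site sequenced (assumed definition)."""
    return mutated_sequences / (n_clones * n_sites)


field = mutation_frequency(47, 378)        # field: 47 mutated of 378 clones
greenhouse = mutation_frequency(24, 328)   # greenhouse: 24 mutated of 328 clones

print(f"field:      {field:.2e} per site")
print(f"greenhouse: {greenhouse:.2e} per site")
```

Under this definition both values fall within about 1% of the reported figures; the small differences may reflect rounding or a slightly different denominator.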
Figure 3-2: Minimum spanning tree of the sequences collected here. (a) Field experiments. (b)
Greenhouse experiments. The numbers along the branches represent the nucleotide position at
which each mutation occurred. The number of clones with a particular mutation is one unless
otherwise noted within the oval. Plants are labeled as in Fig. 3-1.
As the aphid vector was removed in the greenhouse experiment we might expect the
extent of purifying selection to be stronger in the field than the greenhouse. However, we
observed no marked difference in mean dN/dS ratios among these populations; a value of 0.54
(CI 95%: 0.23–0.84) was observed in the field compared to 0.66 (CI 95%: 0.34–1.13) in the
greenhouse. The high dN/dS values (>1) observed in some individual samples likely reflect a
large sampling error on the small number of mutations observed. Finally, it is notable that we
observed no clear difference in the spatial distribution of mutations along the CP between the two
experimental conditions (Fig. 3-3).
Figure 3-3: Spatial distribution of mutations in the CP gene from both the field and greenhouse
experiments. The numbers below the horizontal line represent nucleotide positions.
Mutations introduced during the experimental procedure
The error rate for the reverse transcriptase (RT) enzyme used here is reported as 2.9 ×
10⁻⁵ mutations/site/replication (personal communication, Invitrogen). Given our sequenced target
region of 849 nt, the expected number of mutations per cDNA copy of the CP gene is therefore
0.0246 (2.9 × 10⁻⁵ mutations/site/replication × 849 sites × single round of replication). We cloned
706 of these cDNA copies, leading to an overall expectation of 17.37 mutations among our 706
clones. The overall error rate including both the Phusion Taq error rate and RT error rate is 0.0377
(calculated using the Phusion Taq error rate provided by Finnzymes and the RT error rate given
above). Accounting for both the RT enzyme and Taq polymerase error rate we would expect the
total number of artefactual mutations to be ∼27. The actual number of mutations observed in our
data was 71. Although it is clear that our data contain a number of artefactual mutations, as is
likely to be true of any study of intra-host genetic variation in RNA viruses, many of the
mutations observed here will be bona fide, especially as the reported error rate for
RNA-dependent RNA polymerase is greater than that of RT (Drake et al., 1998). In addition, we used great
caution when calling mutations and only counted those that were present in both the forward and
reverse alignments, and in some cases sequenced both directions twice. Hence, our reported
introduced RT error rate is likely to be conservative. As such, it is highly unlikely that mutations
at a frequency >1 are artefactual, including the stop codon mutation in plant CG7(1).
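The arithmetic behind these expectations can be laid out as a short Python sketch. All inputs are the values quoted in the text; the combined per-copy rate of 0.0377 is taken as given, since the Phusion error rate and cycle number that underlie it are not stated here. The variable names are illustrative.

```python
RT_ERROR_RATE = 2.9e-5           # mutations/site/replication (RT, as quoted)
CP_LENGTH_NT = 849               # sequenced target region (nt)
N_CLONES = 706                   # cDNA copies cloned and sequenced
COMBINED_ERROR_PER_COPY = 0.0377 # RT + polymerase, per cDNA copy (as quoted)
OBSERVED_MUTATIONS = 71          # mutations observed in the data

# Expected RT-introduced mutations per cDNA copy (single round of RT)
rt_errors_per_copy = RT_ERROR_RATE * CP_LENGTH_NT

# Expected artefactual mutations across all clones
expected_rt = rt_errors_per_copy * N_CLONES
expected_total = COMBINED_ERROR_PER_COPY * N_CLONES

# Observed mutations in excess of the artefactual expectation
excess = OBSERVED_MUTATIONS - expected_total

print(f"RT errors per copy:        {rt_errors_per_copy:.4f}")
print(f"expected RT errors:        {expected_rt:.2f}")
print(f"expected total artefacts:  {expected_total:.1f}")
print(f"observed minus expected:   {excess:.1f}")
```

Without intermediate rounding the RT expectation is 17.38 rather than 17.37; the latter follows from rounding the per-copy value to 0.0246 before multiplying by 706. Either way the combined expectation of roughly 27 leaves more than 40 of the 71 observed mutations unaccounted for by enzyme error alone.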
Discussion
Although the level of intra-host diversity we report for ZYMV (mean=0.02%) is on
average less than that recently observed in intra-host studies of animal influenza viruses using
similar methodologies, there was considerable overlap among estimates and fewer clones were
analyzed in this case (Hoelzer et al., 2010; Iqbal et al., 2009; Murcia et al., 2010). For example, a
study of 2366 sequences of equine influenza virus resulted in a mean intra-host diversity of
0.04% (range 0.01–0.12% among samples) (Murcia et al., 2010). Hence, ZYMV appears to
exhibit mutational dynamics broadly similar to those observed in some rapidly evolving animal
RNA viruses, and as expected given the intrinsically error-prone nature of replication with
RNA-dependent RNA polymerase. The possibility of artificially induced mutations should therefore be
explored for those plant RNA viruses in which far higher levels of intra-host genetic diversity are
observed.
It is also striking that most mutations in ZYMV are transient in nature, only being
observed at a single sampling point. Indeed, we observed no mutations that were shared between
time points from individual plants. Although a certain proportion of the mutations observed are
clearly artefactual and an inherent outcome of the experimental procedures employed, particularly
singleton mutations which should be treated with caution, our results are compatible with the
notion that the majority of intra-host mutations in ZYMV are deleterious and removed by
purifying selection between sampling times. The relatively high number of stop codon mutations
observed supports this hypothesis, as does the marked difference in mean dN/dS values within
(∼0.6; herein) and between (0.108; Simmons et al., 2008) hosts. A similar turnover of apparently
transient deleterious mutations has been observed in a number of animal RNA viruses (Holmes,
2003, 2009; Hoelzer et al., 2010; Murcia et al., 2010), is supported by experimental studies of
fitness distributions in RNA viruses (Sanjuán et al., 2004), and may therefore be a common
component of intra-host viral genetic diversity. Despite this, it is notable that some short-lived
population structure was present in the field samples – manifest as clones that differed in multiple
mutations from the consensus, the same mutations present in multiple clones, and at least one
distinct viral sub-lineage – yet not so in the greenhouse experiment. It is therefore possible that
transmission mode impacts the structure of viral genetic diversity, even at the scale of individual
plants, although this is evidently an issue that needs to be reassessed with a far larger number of
clones than generated here.
Importantly, the discontinuity of mutations within individuals over time extends to
transmission: no lineages were shared between individuals during aphid transmission. This
suggests that the bottleneck imposed by the aphid is substantial, although it is also possible that
our sample size is insufficient to sample minor lineages. As the aphid-imposed bottleneck is
absent from the greenhouse experiment we might have expected to see more lineages transferred
between hosts in this case. That this does not appear to be the case from the data generated here
suggests that the intra- and inter-plant population bottlenecks are generally severe enough to
remove most genetic variation. In addition, that the number of unique mutations did not increase
during serial passaging in the greenhouse indicates that the aphid-imposed bottleneck is not the
only factor restricting genetic diversity, although this will clearly need to be explored further
using a larger number of serial passages. Irrespective of sample size, the existence of strong
population bottlenecks means that genetic drift will play a major role in substitution dynamics.
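The caveat about sample size can be made concrete with a simple binomial sampling sketch: if a variant is present at frequency f in the viral population of a leaf and n clones are sequenced, the probability of observing it at least once is 1 − (1 − f)^n. This assumes that clones are independent random draws from the population, which is an idealisation. The frequencies below are hypothetical, and n = 35 is the average number of clones per sample reported in the Results.

```python
def detection_probability(freq, n_clones):
    """Probability that a variant at frequency `freq` appears in at least
    one of `n_clones` independently sampled clones."""
    return 1.0 - (1.0 - freq) ** n_clones


N_CLONES_PER_SAMPLE = 35  # average clones per leaf sample in this study

for freq in (0.10, 0.05, 0.01):  # hypothetical variant frequencies
    p = detection_probability(freq, N_CLONES_PER_SAMPLE)
    print(f"variant at {freq:.0%}: detected with probability {p:.2f}")
```

Under these assumptions a variant at 1% frequency would be missed in roughly 70% of samples of this size, so the absence of shared mutations is compatible both with severe bottlenecks and with minor lineages that went unsampled.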
One of the most striking observations of our study was that seven clones sampled from
one leaf at one time point from one field plant contained the same stop codon mutation. Such a
high frequency of what is likely to be a deleterious mutation is suggestive of the action of
transient complementation, although this will require future experimental verification. Indeed,
that the stop codon mutation was not found at later time-points in this individual argues against
both recurrent mutation and polymerase read-through as both would be expected to have
longer-term effects.
Complementation has previously been reported in experimental infections of plant
viruses (Fraile et al., 2008; Osbourn et al., 1990). For example, a mutant Tobacco mosaic virus
with a frameshift and premature stop codon mutation in the CP was fully complemented in
transgenic plants that expressed the wild-type CP gene (Holt and Beachy, 1991).
Complementation has also been documented during viral co-infections, including truncated CP
mutants of Pepper huasteco virus that were complemented by coinfection with Taino tomato
mottle virus (Guevara-González et al., 1999). Not only is viral co-infection a frequent occurrence
in nature, but the use of transgenic squash is now commonplace in agricultural settings.
Complementation in these circumstances could theoretically lead to the inhibition of gene
silencing (Qu et al., 2003; Thomas et al., 2003), the correction of defects in movement (Callaway
et al., 2004), and perhaps even the expansion of host range (Latham and Wilson, 2008; Spitsin et
al., 1999). Given the threat that RNA viruses such as ZYMV pose to staple crop production
worldwide, the frequency and consequences of complementation in natural populations of plant
viruses clearly needs to be investigated in greater detail.
Chapter 4
Deep sequencing reveals persistence of intra- and inter-host genetic diversity
in natural and greenhouse populations of Zucchini yellow mosaic virus
Abstract
The genetic diversity in populations of RNA viruses is likely to be strongly modulated by their
life-histories, including mode of transmission. However, how transmission mode shapes patterns
of intra- and inter-host genetic diversity, particularly when acting in combination with de novo
mutation, population bottlenecks, and the selection of advantageous mutations is still poorly
understood. To address these issues, we performed in-depth next generation sequencing of
Zucchini yellow mosaic virus (ZYMV) in a wild gourd, Cucurbita pepo ssp. texana, under two
conditions: aphid-vectored and mechanically inoculated, achieving an average coverage of
~9000X. We show that mutations persist during inter-host transmission events in both the aphid
vectored and mechanically inoculated populations, suggesting that the vector-imposed
transmission bottleneck is not as extreme as previously supposed. Similarly, mutations were
found to persist within individual hosts, arguing against strong systemic bottlenecks. Strikingly,
mutations were seen to go to fixation in the aphid vectored plants, suggestive of a major fitness
advantage, but remained at low frequency in the mechanically inoculated plants. Overall, this
study highlights the utility of next generation sequencing in providing high resolution data
capable of revealing the nature of viral evolution, particularly as the full spectrum of genetic
diversity within a population may not be uncovered without sequence coverage of at least
2,500X.
Introduction
Understanding the factors that generate and maintain genetic diversity is the central goal
of evolutionary genetics. Plant pathogenic RNA viruses are ideally suited for the study of the
determinants of genetic variation because of their extremely high mutation rates, themselves due to the
lack of error-correction associated with replication by an RNA-dependent RNA polymerase, and
their rapid replication (Duffy et al., 2008). This capacity to generate genetic diversity is central to
the capacity of RNA viruses to break down host resistance mechanisms (Acosta-Leal et al., 2010;
Feuer et al., 1999; Lech et al., 1996), to adapt to new niches (Roossinck, 1997), including new
hosts (Jerzak et al., 2008), and for changes in virulence (Acosta-Leal et al., 2011).
For any RNA virus, the extent and structure of the genetic variation that occurs within
individual hosts is due to a combination of de novo mutation, genetic diversity generated through
mixed infection, natural selection, and stochastic processes such as genetic drift and the
population bottlenecks that occur both within and among hosts. However, the roles played by
these differing processes in shaping intra-host genetic variation are uncertain. For example, given
the extremely large census population sizes that plant RNA viruses can achieve (e.g., in Tobacco
mosaic virus, TMV, this has been documented to reach 10^11 to 10^12 virions per infected leaf;
García-Arenal et al., 2003), it might be expected that selection would act efficiently within hosts.
However, several studies indicate that the effective population size (Ne) of RNA viruses in nature
is several orders of magnitude lower than the census population number (García-Arenal et al.,
2001; Hughes 2009), and the duration of infection in a single host may be of insufficient length to
enable natural selection to fix beneficial mutations. As such, stochastic processes may be more
important determinants of genetic diversity at the intra-host level.
Population bottlenecks may be particularly important in plant RNA viruses. Such
bottlenecks are thought to occur during two processes: between-host vector transmission and
systemic movement within the plant. For example, the number of virions transmitted from
mechanically infected squash plants to healthy plants via aphids (Aphis gossypii and Myzus
persicae) has been estimated to be on average three virions for both aphid species (Ali et al.,
2006), and even lower numbers have been observed in Cucumber mosaic virus (CMV)
(Betancourt et al., 2008). Similar drastic population bottlenecks have been reported during
systemic movement. For instance, estimates of the founding population in a new leaf after
systemic movement during TMV infection ranged between two and 20 virions (Sacristan et al.,
2003), and only four virions of Wheat streak mosaic virus appear to be involved in the invasion of
new tillers of wheat (French & Stenger 2003). Population bottlenecks have also been observed on
a cellular level. For example, using Soil-borne wheat mosaic virus, Miyashita & Kishino (2010)
determined the cell-to-cell bottleneck to be ~6 virions for the initial movement from the infected
cell and ~5 virions in subsequent movements. Although these studies suggest that population
bottlenecks are likely to have major effects on plant virus evolution, to date there has been no
analysis of the impact of population bottlenecks using extremely high coverage data of viral
genomes, particularly as produced through next generation sequence data.
Due to its very high levels of coverage, next generation sequencing represents an
excellent tool for detecting alleles present at low frequencies. Therefore, to gain a
deeper understanding of the extent of intra-host genetic diversity in plant RNA viruses and the
processes that have generated this variation, we used deep sequencing techniques to analyze the
extent of genetic variation, and particularly the effect of population bottlenecks, in Zucchini
yellow mosaic virus (ZYMV) infecting its natural host Cucurbita pepo ssp texana (a wild gourd).
ZYMV is one of the most studied viruses of the family Potyviridae. The virus infects wild and
agronomically important members of the plant family Cucurbitaceae (squash, melon and
cucumber), causing symptoms that include yellowing and stunting of the plant, as well as severe
leaf and fruit deformities (Desbiez & Lecoq, 1997). This emerging RNA virus attained worldwide
distribution within two decades of its description (Lisa et al., 1981), and the importance of
ZYMV as a crop pathogen is underscored by the fact that it has been shown to reduce agricultural
yields up to 94% (Blua & Perring, 1989). ZYMV has a single-stranded positive-sense RNA
genome of approximately 9,600 nt, with a polyadenylated 3' end and a virus-encoded protein
(VPg) covalently linked to the 5' end. A single open reading frame codes for a large polyprotein
precursor that is processed into 10 putative proteins by three virally encoded proteases (P1,
HC-Pro and NIa) (Gal-on, 2007). As is common given the compact genomes typical of RNA viruses,
these proteins are multi-functional and as such are expected to be under fairly strong selective
constraints (Holmes, 2003).
Transmission of ZYMV primarily occurs via aphids in a non-persistent manner (Pfosser
& Baumann, 2002; Urcuqui-Inchima et al., 2001), with 26 aphid species shown to be capable of
transmitting the virus (Katis et al., 2006). Transmission depends on two conserved regions of
the HC-Pro: the KITC/KLSC region, which interacts with the aphid stylet, and the PTK region,
which interacts with the conserved DAG region in the CP (Urcuqui-Inchima et al.,
2001). This has been termed the ‘helper strategy’ as the HC-Pro acts as a bridge between the CP
and the aphid stylet, which differs from the ‘capsid strategy’ whereby the capsid protein interacts
directly with the aphid mouthparts (Pirone and Blanc, 1996). In addition, vertical transmission via
seed has been shown to occur in Cucurbita pepo at low rates (1.6%; Simmons et al., 2011).
To determine the extent and structure of genetic diversity in intra-host populations of
ZYMV, and particularly how this diversity is likely to be shaped by population bottlenecks, we
undertook deep sequencing of ZYMV populations infecting C. pepo ssp texana under two modes
of horizontal transmission: aphid-vectored and mechanically inoculated (i.e. without aphids).
From the aphid-vectored experiment, we produced both epidemiological-scale data, from which
we can determine the extent of the bottleneck imposed by the aphid during inter-host
transmission, and data on intra-host genetic variation over the course of infection. As a new leaf
sample was collected at each time point we were not only able to determine the mutational
spectrum maintained within individual plants over time, but also how intra-host viral genetic
diversity is affected by bottlenecks during systemic movement. ZYMV was also mechanically
inoculated across eight generations in a serial passaging experiment carried out in a greenhouse.
Comparison of these data with those from the field study allowed us to analyze, uniquely, the
evolution of viral genetic diversity with and without the aphid-imposed bottleneck.
Methods
Field experiment
The field experiment was conducted using C. pepo ssp. texana at The Pennsylvania State
University Agriculture Experiment Station at Rock Springs, Pennsylvania, USA. One 0.4-hectare
field with 180 plants was laid out as a grid labeled A-L and 1-15, with approximately six meters
between plants. In 2007, the plant situated in the middle of the field, F-8, was mechanically
inoculated with ZYMV that we had isolated during a previous field season (the consensus
sequence of the CP has been deposited in GenBank under accession number EU371649) (Simmons et
al., 2008). Plants are labeled as follows: the first letter and number, for example F8,
designate the plant coordinates within the field grid, and the number in parentheses denotes the
order in which samples were collected from an individual plant. When the initially inoculated
plant, F8, exhibited viral symptoms a leaf was collected. As neighboring plants became infected,
a leaf sample was collected on a weekly basis from each plant from the onset of visible symptoms
until host death (~9 weeks). Presence of ZYMV in the leaf samples was detected
immunologically using DAS-ELISA (Agdia, IN) and confirmed by RT-PCR, and the samples were
subsequently stored at -80°C. Although samples were collected from all of the infected plants in
the field, a subset of samples that were spatially related to F8 were selected so that a total of
sixteen samples representing six individual plants were used for next generation sequencing. This
subset included one plant that was sampled at four time points: F8 (July 24th, August 8th, August
13th and August 28th); two plants sampled at three time points: F7 (August 30th, September 13th
and September 20th) and G7 (August 30th, September 6th and September 20th); and three plants
sampled at two time points: E7 (September 13th and September 20th), E8 (September 13th and
September 20th), and G6 (September 20th and September 30th) (Fig 4-1).
Figure 4-1: Schematic representation of the field experimental design showing the spatial
relationship between individual plants. The first two digits designate the plant coordinates within
the field grid, and the number in parenthesis denotes the number of samples collected from an
individual plant. F8 (4) in the bottom right hand corner is the original inoculant. The arrows
represent transmission events by aphids.
Greenhouse experiment
Two individual plants were mechanically inoculated in a greenhouse at The Pennsylvania
State University in January of 2008 with a ZYMV sample taken from the first diseased individual
from the 2007 season (F8). Inoculum was prepared from infected plant tissue diluted in a
phosphate buffer (0.1 M Na2H/KH2PO4 buffer) in a 1:3 v/v ratio. Carborundum powder (500gm)
was then rubbed on the surface of the leaf, and the inoculum subsequently applied to the leaf
surface with a pestle. When the plants displayed disease symptoms and exhibited at least an
additional eight leaves of growth from the inoculation site (typically 4 to 5 weeks), half of the
fifth leaf of each plant was used to inoculate another individual. This process was repeated up to the
eighth generation. The other half of each leaf was stored at -80°C, and subsequently used for
sequencing.
RNA isolation and RT-PCR
RNA was isolated from frozen leaf samples using the RNeasy® Plant Mini Kit (Qiagen,
CA). First-strand cDNA was synthesized from the extracted RNA using five genome-specific
primers, which were designed based on the reference strain, with the SuperScript™ III
First-Strand Synthesis kit (Invitrogen, CA) following the supplier's protocol. The target
cDNA was then amplified directly via PCR using Phusion® High-Fidelity PCR Master Mix
(Finnzymes, MA). PCR was conducted following the manufacturer's protocols with HF PCR buffer
and 5 µl of first-strand product in a 50 µl total reaction volume. The following PCR conditions
were used: 98°C for 1 min, 98°C for 10 s, 58°C for 20 s, 72°C for 1 min 20 s, for a total of 20
cycles with a final 5 min 72°C extension and held at 4°C. The five primer pairs were designed with 560,
19, 141, and 151 bp overlap between adjacent amplicons across the genome. Primers (nucleotide
positions refer to the reference strain NC_003224.1):
ZYMC_F1 (nt 27-50) AGAAATCAACGAACAAGCAGACGA
ZYMC_R1 (nt 2199-2219) GCAACATCCATCAACGAAGGC
ZYMC_F2 (nt 1689-1708) GGGGGAAAGAGGGTATCATT
ZYMC_R2 (nt 3956-3973) CCAAGGGGCGTGTAGGTT
ZYMC_F3 (nt 3956-3974) TGAACCTACACGCCCCTTG
ZYMC_R3 (nt 6070-6088) TGCCCTTGCCCATAAAATA
ZYMC_F4 (nt 5947-5970) GACGAAAGCACCCATACAGACATA
ZYMC_R4 (nt 7808-7826) TGACCGACCCACCAATCCT
ZYMV_F5-2 (nt 5947-5970) GGTGGTTGGGATAGATTGATGAG
ZYMV_R5-2 (nt 9515-9534) TCCGACAGGACTACGGCATT
These primers allowed for coverage of 99% of the viral
genome. Amplicon lengths were 2192, 2314, 2134, 1879, and 1859 bp. The five PCR
products per viral sample were pooled and gel extracted using the Zymoclean Gel Recovery kit
(Zymo Research, CA) to remove background amplification product. The purified
samples were then quantified using a Qubit fluorometer (Invitrogen, CA).
Illumina Library Construction
Once quantified, samples were sheared using NEBNext dsDNA Fragmentase (New
England Biolabs, MA) following the manufacturer's recommendations. Approximately 300 ng of
pooled product were used for shearing to a desired size range of 100-300 bp. The reaction was
terminated by adding 5 µl cold 0.5 M EDTA and cleaned with DNA Clean & Concentrator-5 kit
(Zymo Research, CA). The fragmented samples were used for library construction following
the protocol of Mortazavi et al. (2008), starting at blunt-end repair. The following exceptions were
made: each cleaning step was conducted using DNA Clean & Concentrator kit and blunt-end
repair and ligation reactions were conducted using reagents from NEB. Samples were amplified
and indexes were incorporated following standard indexing protocols with a total of 18 PCR
cycles: 98°C for 1 min, 98°C for 10 s, 65°C for 30 s, 72°C for 30 s, and a final 5 min 72°C
extension and held at 4°C. Samples were then PCR purified, quantified, and diluted to 10 nM
concentration for Illumina sequencing. DNA sequencing was performed at the University of
Southern California on an Illumina GAIIx with multiplexing (12 samples per lane for the first two
lanes and eight on the last lane) for a total of three lanes on the same flow cell.
Read accuracy and the identification of variant sites
We used a standard workflow for identification of variant sites on Galaxy (Goecks et al.,
2010; Blankenberg et al., 2010) that can be accessed at http://usegalaxy.org/heteroplasmy. We
altered the workflow by increasing the maximum edit distance to seven, and the minimum
allowable coverage to a highly conservative value of 500X. The reads were mapped to the ZYMV
reference genome (NC_003224.1) using the Burrows-Wheeler Aligner (BWA; Li & Durbin,
2009), and subsequently transformed and filtered using Galaxy tools. Strand bias was accounted
for such that any variant found at a site had to be validated in both strands in order to be considered a
true variant. To control for mapping quality we excluded any sites that had a quality score less
than 30 as compared with the Illumina-supplied control (PhiX 174). According to Illumina, with
this quality score the inferred base call accuracy is 99.9%. To control for methodological errors
introduced as a result of the experimental procedures we took an extremely conservative
approach, excluding (i) any mutations that were present at a frequency of less than 1% and (ii)
any sites where the coverage was less than 500X.
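The filtering criteria described above can be summarized in a short sketch. This is not the Galaxy workflow itself: the SiteCounts record, its field names, and the passes_filters function are hypothetical and only restate the thresholds given in the text (coverage of at least 500X, variant frequency of at least 1%, quality score of at least 30, and support on both strands). The helper phred_to_accuracy uses the standard Phred relationship between quality score and error probability, which gives the 99.9% accuracy quoted for a score of 30.

```python
from dataclasses import dataclass


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred quality score to the implied base-call accuracy."""
    return 1.0 - 10.0 ** (-q / 10.0)


@dataclass
class SiteCounts:
    """Hypothetical per-site summary; field names are illustrative only."""
    position: int
    coverage: int        # filtered reads covering the site
    alt_forward: int     # variant-supporting reads on the forward strand
    alt_reverse: int     # variant-supporting reads on the reverse strand
    quality: float       # quality score assigned to the site


def passes_filters(site: SiteCounts, min_cov: int = 500,
                   min_freq: float = 0.01, min_q: float = 30.0) -> bool:
    """Apply the coverage, quality, strand, and frequency thresholds."""
    if site.coverage < min_cov:
        return False
    if site.quality < min_q:
        return False
    # Require the variant to be seen on both strands.
    if site.alt_forward == 0 or site.alt_reverse == 0:
        return False
    alt_reads = site.alt_forward + site.alt_reverse
    return alt_reads / site.coverage >= min_freq
```

A site is retained only if every condition holds, so a variant seen on a single strand is rejected regardless of its frequency.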
All nucleotide sequences generated here have been submitted to GenBank and assigned
accession numbers JN192405 to JN192428.
Mutation analysis
The consensus ZYMV sequence for each sample was manually aligned to the ZYMV
reference strain using Se-Al (2.0a11; kindly provided by Andrew Rambaut, University of
Edinburgh). Counts of the number of mutations in each sample were undertaken manually. To
determine if there was an association between fluctuation in mutation frequency and time point,
we performed a chi-square test of independence using the statistical package SPSS 13.0 (SPSS
Inc., Chicago, USA). To test if the number of mutations per individual sample was significantly
different between samples, we used both a two-sample t-test and a Mann-Whitney Test in the R
software package (R 2.12.1; 2011), with which we also computed the spatial distribution of
mutations using a Mann-Whitney U test.
Given that the frequency of ‘minor’ alleles (i.e. those < 50% in the population) was
known, we used a binomial distribution (in the software package R) to determine the probability
of uncovering that minor allele at increasing levels of coverage (number of reads). In addition, we
resampled our Illumina data at progressively lower levels of coverage in order to determine how
lower coverage levels can bias the discovery of true minor alleles. We ran a simulation (in R) in
which we re-sampled our Illumina data at each base position in the genome. As we had excluded
any variants that occurred at less than 1% frequency, we calculated the minimum threshold as the
99th percentile of a binomial distribution. Not only did this analysis indicate the coverage level at
which all variants would be uncovered, but it also revealed how at low levels of coverage the
discovery of true minor alleles tends to be biased.
Results
Genome Coverage
Twenty-four samples were successfully sequenced: 16 aphid-vectored and eight mechanically
inoculated. The proportion of the genome that was sequenced ranged from 76.5% to 95.4% with
an average of 83.7% (Table 4-1). After filtering, coverage ranged from 2,243 to 12,507 reads per
individual sample with the average coverage being 9,236. Given the high levels of coverage
attained for a relatively large number of samples, we used these data as a baseline to run
simulations in which we re-sampled the Illumina reads using a 1% cutoff to determine the
coverage level at which all variants in the population would be revealed. This analysis suggested
that at very low levels of coverage (10X or less) variants tend to be oversampled, leading to an
overestimate of the number of mutations. In contrast, coverage levels from 25X to 1000X lead to
an underestimation of the mutational spectrum. For all 24 samples, saturation, defined as the
ability to sample all variants in that population, was reached at ~2,500X coverage (Fig 4-2). Since
we averaged 9,236X coverage, we are confident that we have successfully uncovered the majority
of the variants in our populations.
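A simplified version of such a resampling exercise can be sketched as follows. This is an illustration only, not the R simulation used in this study: the function names are hypothetical, the example frequencies are minor-allele frequencies quoted elsewhere in this chapter, and each site is resampled independently under a binomial model. The sketch shows how the number of variants recovered at a 1% cutoff rises with coverage; it does not attempt to reproduce the oversampling seen at 10X or less.

```python
import random

# Minor-allele frequencies quoted in this chapter, used only as example input.
EXAMPLE_FREQS = [0.012, 0.016, 0.017, 0.026, 0.034, 0.043, 0.046, 0.097, 0.19, 0.352]


def count_called(freqs, coverage, rng, cutoff=0.01):
    """Number of sites whose resampled minor-allele frequency reaches the cutoff."""
    called = 0
    for f in freqs:
        # Draw `coverage` reads; each carries the minor allele with probability f.
        alt_reads = sum(1 for _ in range(coverage) if rng.random() < f)
        if alt_reads / coverage >= cutoff:
            called += 1
    return called


def mean_called(freqs, coverage, replicates=20, seed=1):
    """Average number of called variants over repeated resampling."""
    rng = random.Random(seed)
    total = sum(count_called(freqs, coverage, rng) for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    for cov in (10, 100, 1000, 2500):
        print(cov, mean_called(EXAMPLE_FREQS, cov))
```

Under this model, alleles whose true frequency lies close to the 1% cutoff are the last to be recovered reliably, because sampling noise can push their observed frequency below the cutoff even at high coverage.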
Table 4-1: Summary of genome coverage statistics of Illumina sequence data.
For the field samples, the first two digits designate the plant coordinates within the field grid, and
the number in parenthesis denotes the number of samples collected from an individual plant.
1 Total number of reads obtained for each sample
2 Total number of reads that mapped to the ZYMV reference strain allowing for a mismatch of 7
3 Level of coverage obtained before filtering
4 Level of coverage obtained after filtering
5 Proportion of the genome that we obtained coverage of after filtering
Figure 4-2: Representative simulation of the resampling of Illumina reads to estimate the effect
of coverage on the detection threshold of minor alleles. All samples used in the simulations
produced comparable results, and all variants were uncovered by ~2,500X coverage. The dashed
red line indicates the number of variants within the sample, so that points above the line indicate
oversampling and those below undersampling.
To further determine the power of our Illumina coverage to detect low frequency alleles,
we performed a bootstrap resampling analysis using the minor alleles found in the coat protein
gene (CP). This region was chosen as we had previously cloned and Sanger sequenced the CP of
these samples (Simmons et al., 2011). Six CP mutations were uncovered in the current study,
none of which were detected in the previous study. Four of these were sampled only once,
ranging from 1.7-4.6% in allele frequency (nucleotide positions 8547 (1.7%), 8631 (4.6%), 9009
(3.4%) and 9358 (4.3%)). The other two were found in more than one sample with allele
frequencies averaging 9.7% (8715) and 4.3% (9355). Accordingly, we found the level of
coverage needed to detect at least one read for each allele frequency to be: 1.7% ~250X; 2.1%
~200X; 3.4% ~150X; 4.3% and 4.6% ~100X and 9.7% ~50X (Fig 4-3). Hence, attaining
sufficient coverage is extremely important for detecting low frequency variants in a population,
and for obtaining an accurate characterization of genetic diversity in viral populations.
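The relationship between allele frequency and the coverage needed for detection can be illustrated with a minimal binomial sketch. The function names are hypothetical, and the 99% detection target is an assumption suggested by the 99th percentile mentioned in the Methods; the analysis in this study was carried out in R and may differ in detail. If each read carries the allele independently with probability f, the probability that at least one of n reads carries it is 1 - (1 - f)^n.

```python
import math


def prob_detect(freq: float, coverage: int) -> float:
    """P(at least one read carries the allele) under binomial sampling."""
    return 1.0 - (1.0 - freq) ** coverage


def min_coverage(freq: float, target: float = 0.99) -> int:
    """Smallest coverage giving a detection probability of at least `target`."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - freq))


if __name__ == "__main__":
    for f in (0.017, 0.034, 0.043, 0.046, 0.097):
        print(f, min_coverage(f))
```

With the assumed 99% target, the required coverage for alleles at 9.7% and 1.7% frequency falls in the same range as the ~50X and ~250X values reported above.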
Figure 4-3: Effect of coverage on the probability of detecting the ZYMV coat protein alleles
uncovered in this study. Probabilities were estimated assuming a binomial distribution. Each
color represents a different mutation, labeled with its position in the genome and allele
frequency in parentheses.
Frequency and pattern of nucleotide variants
A total of 93 variants (i.e. polymorphic mutations at a frequency >1%) were found across
the data set as a whole: 66 were found in a single sample, and 27 were found in at least two
samples. Two of the 27 were found only within the same individual, and 24 were found in more
than one individual, suggesting that these mutations were spread between hosts. Among the full
set of variants, 31/66 and 3/27 were nonsynonymous mutations (Table 4-2). In addition, 48/66
and 8/27 were unique to the field samples; 18/66 and 1/27 were unique to the greenhouse
samples; and 18/27 were shared between both experimental conditions. A chi-squared test in
which all 93 variants were considered indicated that the overall number of mutations generated in
the field was significantly higher than in the greenhouse (χ² = 29.17; P < 1 × 10^-4). However, when the
number of mutations per individual was compared between the greenhouse and the field, no significant
difference was detected (p = 0.494 by two-sample t-test; p = 0.346 by Mann-Whitney test).
Table 4-2: Summary of the 27 variants found in more than one sample. The numbers at each
nucleotide position indicate how many samples within each group have a given mutation.
Strikingly, among the 93 variants detected, 11 were present in every time point in an
individual, or in all eight of the greenhouse samples. This indicates that these mutations are
maintained during the course of infection and hence through any intra-host bottlenecks that have
occurred. These comprised: two mutations in F8 (2205 and 7688); five mutations in F7 (1704,
7317, 7821, 7824 and 9463); six mutations in E7 (2205, 7317, 7688, 7821, 7824 and 9533); four
mutations in G7 (6294, 7688, 8508 and 8517); two mutations in E8 (2205 and 7688), and one
mutation in G6 (7688). For these conserved variants, we used a chi-square test of independence to
determine whether they experienced changes in allele frequency over time. Interestingly, we
observed an association between time point and allele frequency in all cases (p < 1 × 10^-4), such that
allele frequencies have increased rapidly through time as expected if they are selectively
advantageous. In addition, seven mutations were present in at least one time point in every single
field plant (nt positions: 1254, 2205, 4626, 7317, 7688, 7821 and 9463) indicating that these
variants are maintained during inter-host transmission and hence through any population
bottlenecks that have occurred at these times. All but one of these mutations (1254) were also
found in at least one greenhouse sample. In the greenhouse samples, three mutations were shared
across serial passages (1701, 1704 and 7688).
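The logic of testing for an association between time point and allele frequency can be sketched with a 2x2 chi-square test of independence on read counts. This is an illustration only: the tests reported here were run in SPSS, the function name is hypothetical, and the read counts in the usage example are invented to resemble a change from 1.6% to 98% at roughly 9,000X coverage. For a 2x2 table the test has one degree of freedom, for which the p-value can be written exactly as erfc(sqrt(x/2)).

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 d.f.) for a 2x2 table.

    `table` is [[a, b], [c, d]]: rows are time points, columns are
    read counts for the minor and major allele.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 d.f.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical read counts: 1.6% minor allele early, 98% late.
    early = [144, 8856]
    late = [8820, 180]
    print(chi_square_2x2([early, late]))
```

Because read counts per site are in the thousands, even modest shifts in allele frequency give very small p-values, so the size of the frequency change is more informative than the p-value alone.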
The average number of mutations between our samples and the reference strain
NC_003224.1 (a Taiwanese isolate) is 464 (5.78%), which is compatible with previous studies
using consensus sequences (Simmons et al., 2008). We also compared the variants found in this
study to the other 24 full-length ZYMV genomes published on GenBank. Of the 66 mutations
observed in a single sample found in this study, 25 were present in the GenBank sequences, as
were 16 of the 27 polymorphic variants, including all seven mutations that were found to be
present at least once in every individual, suggesting that these variants may exist as polymorphic
sites in natural populations.
Variation in Allele Frequency
Of the 27 variants present in more than one sample, we found two cases (positions 2205
and 7317) in which the originally ‘minor’ allele (defined as initially less than 50% frequency; in
these cases 35.2% and 1.6%, respectively) approached fixation in later samples (both allele
frequencies reached 98%; Fig 4-4, a and b). In addition, these fixation events occurred rapidly,
taking only 59 days in both cases. Interestingly, these same two nucleotide positions are
present as polymorphic sites in the 25 ZYMV full genome sequences on GenBank (2205 in 11/25
and 7317 in 6/25), suggesting that these sites may be polymorphic in nature and may confer a
selective advantage in some host genotypes (or host species) or under some environmental
conditions. The latter idea is supported by the fact that allele frequency changes appear to be
affected by environmental conditions. For instance, the minor allele at nucleotide position 7317
increased to fixation in both time and space in the field. However, after an initial decrease, the
frequency remained constant through transmission events in the greenhouse, where
environmental conditions are relatively constant: in the first greenhouse sample the allele frequency
was 19%, it dropped to 2% in the subsequent host, and it averaged 2.5% in the remaining hosts.
Although not as striking, a similar trend was observed at nucleotide position 7688 (data not
shown). An additional two cases where the minor variant increased as the virus spread in the field
from the original inoculant, although it did not approach fixation, were also observed (nt 1254 and
9533) (Fig 4-4, c & d). At position 1254 the frequency in the original inoculant is 2.6% and
subsequently increases to 28.6%, while at position 9533 the frequency increases from 1.2% to
15%.
Figure 4-4: Variation in allele frequency over time and space of ZYMV variants. The 3D graphics
show changes in allele frequency (y-axis) during within-host infection. The x-axis shows
variation over time, or intra-host variation. The z-axis shows variation over space, or between-host
variation, at nucleotide positions 2205 (A), 7317 (B), 1254 (C) and 9533 (D). The data correspond to the
field experiment.
Spatial Distribution of Mutations
We used a bootstrap method to infer whether mutations were spatially clustered across
the genome compared to a null model, which assumed random mutation placement. Bootstrap
distributions and null distributions were calculated for the index of dispersion statistic, and then
compared using the Mann-Whitney U test (using R 2.12.1; 2011). Interestingly, field mutations
showed evidence of significant spatial clustering (p < 1 × 10^-4). In contrast, there was no significant
spatial clustering of mutations in the greenhouse samples (p ≈ 1) (Fig 4-5). We looked at the
number of mutations per gene region, and using a chi-square goodness-of-fit test (in R) determined
that the number of mutations per gene region was greater than would be expected by chance
in only two regions: NIb in the field samples and HC-Pro in the greenhouse samples. We also
found one region in the greenhouse samples (CI) in which the number of mutations was less than
would be expected by chance alone, although these results are strongly dependent on the level of
coverage attained. Despite the relatively high number of mutations observed, those genomic
regions previously suggested to constitute conserved domains in ZYMV were also conserved in
our analysis, indicating that mutations in these regions are strongly deleterious and removed
rapidly within hosts. For instance, all of the regions known to be necessary for aphid transmission
– the PTK and KLSC regions in the HC-Pro, and the DAG region in the CP – were conserved in
our samples.
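The index of dispersion used in the clustering analysis can be illustrated with a short sketch. Several choices here are assumptions rather than details of the analysis in this study: the window size of 500 nt, the function names, and the use of an empirical tail probability in place of the bootstrap and Mann-Whitney U comparison described above. The example positions are nucleotide positions mentioned in this chapter, and the genome length is the approximate value of 9,600 nt. The index is the variance-to-mean ratio of mutation counts per window; values well above those obtained under random placement suggest clustering.

```python
import random
from statistics import mean, pvariance

# Nucleotide positions mentioned in this chapter, used only as example input.
EXAMPLE_POSITIONS = [1254, 1701, 1704, 2205, 4626, 6294, 7317, 7688, 7821, 7824,
                     8508, 8517, 8547, 8631, 8715, 9009, 9355, 9358, 9463, 9533]
GENOME_LENGTH = 9600  # approximate ZYMV genome length given in the text


def index_of_dispersion(positions, genome_length, window=500):
    """Variance-to-mean ratio of mutation counts in fixed-width windows."""
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        counts[(pos - 1) // window] += 1
    m = mean(counts)
    return pvariance(counts) / m if m > 0 else 0.0


def null_dispersion(n_mutations, genome_length, replicates=1000, window=500, seed=0):
    """Index of dispersion for mutations placed uniformly at random."""
    rng = random.Random(seed)
    return [
        index_of_dispersion(
            rng.sample(range(1, genome_length + 1), n_mutations),
            genome_length, window)
        for _ in range(replicates)
    ]


def empirical_p(observed, null_values):
    """Fraction of null replicates at least as dispersed as the observed value."""
    return sum(v >= observed for v in null_values) / len(null_values)


if __name__ == "__main__":
    obs = index_of_dispersion(EXAMPLE_POSITIONS, GENOME_LENGTH)
    null = null_dispersion(len(EXAMPLE_POSITIONS), GENOME_LENGTH)
    print(obs, empirical_p(obs, null))
```

The result depends on the window size, so in practice it is worth checking that conclusions are stable across a range of window widths.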
Figure 4-5: Distribution of mutations across the ZYMV genome under field and greenhouse
conditions. The length of the ticks indicates the relative number of samples with that mutation.
Discussion
Although population bottlenecks are expected to be strong both within and between
hosts, nearly 30% of the variants we detected within our viral populations were found in more
than one sample, either within the same or a different plant. As such, the population bottlenecks
that shape the evolution of plant RNA viruses may not be as severe as previously suggested,
although this will clearly vary in a virus-specific manner. Of equal importance was the
observation that some of the initially ‘minor’ alleles rapidly went to fixation in the aphid-vectored
plants, but remained at low frequency in the mechanically inoculated plants, suggesting that they
are strongly selectively advantageous in the former environment. The dramatic increase in allele
frequency for some of these alleles in the aphid-vectored plants (e.g. 1.6% to 98% at nucleotide
position 7317), was observed in more than one plant. This result argues strongly for natural
selection and against genetic drift as the main mechanism generating the differences between the
allele frequencies in the greenhouse and field populations, as the latter process is expected to
result in fixation events over much longer time-scales. The average time for fixation of a neutral
mutation in a haploid population is Ne × generation time, which will generally equate to
time-scales measured in years, whereas the change in allele frequency recorded here has occurred over
a time period of only two months.
Also of interest in this context was the observation that regions known to be involved in
aphid transmission were conserved in all of the samples analyzed in our study. Hence, the natural
selection we observed is unlikely to be directly linked to transmission events, although it may be
indirectly linked through host-virus or host-vector interactions rather than vector-virus interactions.
Specifically, it is believed that compositional differences in saliva among aphid species may
result in differential viral transmission (Pirone & Perry, 2002). There is also evidence that the
virus may manipulate host factors to increase the plant’s attractiveness to potential vectors by
modulating color changes associated with infection (Ajayi & Dewar, 1983), olfactory cues in the
form of volatile compounds (Ngumbi et al., 2007; Medina-Ortega et al., 2009, Mauck et al.,
2010), as well as altering the mechanisms involved in virus acquisition. In addition, host factors
may be involved in optimizing vector transmission. For example, in Cauliflower mosaic virus
(CaMV) virus inclusion bodies have been shown to control aphid-mediated transmission
(Espinoza et al., 1991; Khelifa et al., 2007). Although little is known about the specific
mechanisms underlying these processes, it is possible that the differences in selection pressures
found in the present study may be due to the absence of the aphid vector in the greenhouse
experiment. This possibility notwithstanding, the effect of other environmental differences
between the field and the greenhouse experiments on allele frequencies should also be
investigated. For example, the greenhouse environment is relatively stress free, as the plants are
watered regularly, maintained within a narrow range of temperatures, have ample room and light,
and are sprayed regularly with insecticide to prevent herbivory. This is in direct contrast to our
field plants that are subjected to the vagaries of nature and experience a variety of biotic and
abiotic stresses, such as drought, herbivory and competition.
As the transmission events undertaken in the greenhouse represent a release from the
aphid vector, and hence a release from the large population bottleneck imposed by aphid
transmission, and the inoculum dose was large (half a leaf, which ensures inoculation at
saturation), we might expect the amount of genetic diversity being transmitted between
greenhouse plants to be significantly greater than in the field. It is therefore surprising that our
results indicated that greater genetic diversity is transmitted in the field experiment. Indeed, an
average of only 0.5-3.2 Potato virus Y virions are transmitted per aphid in in vitro experimental
systems (Moury et al. 2007), with similar numbers reported in vivo (Betancourt et al., 2008).
However, these estimations do not consider the huge number of aphids that may be involved in
transmitting the virus, and which could potentially overwhelm the population bottlenecks induced
by single transmission events. In support of this, experiments using suction traps found that
although aphid population size tends to fluctuate both in terms of year and location, very high
population numbers can be achieved (Katis et al. 2006), with up to 40,000 aphids being counted
in one location in one year (range 2,179 - 41,851). Similarly, studies have revealed up to four
alate (winged) and 400 apterous (non-winged) aphids per leaf per time-point on C. pepo (zucchini) (Hooks
et al. 1998). As the incidence of ZYMV has been shown to be correlated with total aphid
numbers (Basky et al., 2001), the effect of aphid population size on the effective population size
of viral populations in individual plants clearly needs to be examined in more detail.
It is also possible that the lack of severe bottlenecks in this study may be due in part to
the fact that helper-dependent transmission, such as occurs with ZYMV, may be less prone to
severe bottlenecks than transmission where the virions interact directly with the aphid stylet.
Specifically, the HC-Pro and virion do not have to be acquired simultaneously. As long as the
helper protein is capable of interacting with the aphid stylet it can assist in the transmission of
virions acquired from other parts of the host or even from different hosts, thus ameliorating the
effect of the population bottleneck. This is in direct contrast with viruses that interact directly
with the vector (Pirone & Blanc 1996). Thus, it is possible that multiple aphids transmitting the
virus between hosts, as well as the fact that ZYMV is vector transmitted via the HC-Pro,
maintained levels of genetic diversity in our study.
The genetic resolution we have achieved in this study is clearly a reflection of the
deep-amplicon sequencing used here. A previous study using some of the same samples, for which
cloning and Sanger sequencing of the CP region was undertaken, revealed that no mutations were
transmitted between individuals or within plants (Simmons et al., 2011), in marked contrast to the
results obtained here. Our simulations revealed that a coverage level of ~2,500X is needed to
reach saturation and detect all of the variants present in our populations (assuming a 1%
cutoff). We also determined that detecting an allele that comprises ~10% of the population
at least once requires approximately 50X coverage, and that detecting an allele present at
1.7% frequency at least once requires a minimum coverage of 250X. Given that in our
previous study we averaged 35 clones per sample, it is not
surprising that we were unable to uncover these mutations.
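The link between coverage and detection can be illustrated with a simple sampling model. The sketch below assumes that reads (or clones) are drawn independently, so that a variant at frequency f is seen at least once in n draws with probability 1 - (1 - f)^n. This model and the function name are illustrative assumptions and are not necessarily the simulation procedure used in this study.

```python
def detection_probability(freq: float, coverage: int) -> float:
    """Probability that a variant at frequency `freq` is sampled at least
    once in `coverage` independent draws (assumed binomial sampling)."""
    return 1.0 - (1.0 - freq) ** coverage


# Frequencies and depths taken from the text; 35 is the mean number of
# clones per sample in the earlier Sanger-based study.
for freq, coverage in [(0.10, 50), (0.017, 250), (0.017, 35)]:
    p = detection_probability(freq, coverage)
    print(f"freq={freq:.3f} coverage={coverage:>4} P(detect)={p:.3f}")
```

Under this model, a variant at 1.7% frequency would be missed more often than not when only 35 clones are sampled, which is consistent with the failure of the earlier clone-based study to recover these mutations.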
More than two thirds of the mutations observed in this study were observed in a single
sample only (66 out of 93). Thus, although there is some transmission of variants both inter- and
intra-host, the majority of the mutations generated were not transmitted either inter- or intra-host.
Whether this is the result of population bottlenecks restricting viral genetic diversity, purifying
selection acting on the viral population, or some combination of both still needs to be determined.
However, the majority of single nucleotide substitutions in RNA viruses are likely to be
deleterious (Sanjuan et al., 2004). Hence, given that approximately half of these mutations
(31/66) are nonsynonymous (compared to 3/27 mutations found in more than one sample), and
that we previously detected the mean dN/dS ratios among these populations to be ~0.6 (the coat
protein region only) (Simmons et al., 2011), it is probable that many of the mutations that
occurred in only one sample are also deleterious and will subsequently be purged from the
population.
Overall, our study reveals that, although the majority of the mutations generated within
viral populations may be deleterious, some mutations are clearly transmitted both within and
among hosts and despite the presence of population bottlenecks. Hence, although stochastic
processes must clearly play a role in structuring viral populations, these may be insufficient to
negate the action of natural selection. This latter point is dramatically highlighted by the fact that
we uncovered minor allele variants that approached fixation in time and space, strongly
suggesting that they are selectively advantageous. These findings therefore attest to a complex
pattern of changing genetic diversity in an emerging RNA virus, and will contribute to a more
complete understanding of the dynamics of evolutionary change with implications for the
management of emerging viral diseases.
Chapter 5
Experimental verification of seed transmission of Zucchini yellow mosaic virus
Abstract
Within two decades of its discovery, Zucchini yellow mosaic virus (ZYMV) achieved a
global distribution. However, whether or not seed transmission occurs in this economically
significant crop pathogen is controversial, and the relative impact of seed transmission on the
epidemiology of ZYMV remains unclear. Using reverse transcription polymerase chain reaction,
we observed a seed transmission rate of 1.6% in Cucurbita pepo subsp. texana and show that
seed-infected C. pepo plants are capable of initiating horizontal ZYMV infections, both
mechanically and via an aphid vector (Myzus persicae). We also provide evidence that
ZYMV-infected seeds may act as effective viral reservoirs, partially accounting for the current geographic
distribution of ZYMV. Finally, the observation that ZYMV infection of C. pepo seeds results in
virtually symptomless infection, coupled with our finding that an antibody test failed to detect
vertically transmitted ZYMV in infected seed, highlights the urgent need to standardize current
detection methods for seed infection.
Introduction
Since the discovery of Zucchini yellow mosaic virus (ZYMV) in Italy in 1973, and its
subsequent description in 1981 (Lisa et al., 1981), this emerging RNA virus has spread rapidly
and achieved an effectively global distribution (Debiez & Lecoq 1997). Although a number of
explanations have been put forward to account for the widespread geographic distribution and
persistence of this virus, including the international trading of infected fruit, plants, or seeds, as
well as overwintering in alternative hosts and noncolonizer aphids, the mechanisms underlying
the rapid dissemination and persistence of ZYMV remain unclear (Lecoq et al., 2003). ZYMV is
a single-stranded positive-sense RNA virus of the family Potyviridae that can result in yellowing
and stunting of the plant, as well as severe leaf and fruit deformities that can reduce yields up to
94% (Blua & Perring, 1989). Given that cucurbit (squash, melon, and cucumber) production in
the United States alone is estimated to be worth approximately $1.5 billion per year (Cantliffe et
al., 2007), the economic significance of this crop pathogen is enormous. Understanding the
epidemiology and evolution of ZYMV is therefore central to controlling this devastating crop
disease.
Viral transmission generally occurs in one of two ways: horizontally, which is the
transmission of the virus between unrelated hosts, or vertically, which is the transmission of the
virus from parent to offspring. ZYMV is horizontally transmitted in a nonpersistent manner by at
least 26 aphid species (Katis et al., 2006). Transmission occurs as a result of an interaction
between the stylet of the aphid, the helper component protein (HC-Pro), and the conserved DAG
(Asp-Ala-Gly) region of the coat protein (CP) (Pirone & Blanc, 1996). However, the current
worldwide distribution of ZYMV is unlikely to have resulted from aphid transmission alone,
particularly as the aphid vector remains viruliferous for a very limited time period (~5 h at 21°C)
after acquisition of the virus (Fereres et al., 1992). Hence, it has been suggested that the
long-distance spread of ZYMV may be the result of vertical transmission via infected seeds rather than
horizontal transmission by aphids (Davis & Mizuki, 1986; Debiez & Lecoq, 1997; Fletcher et
al., 2000; Lecoq et al., 2003; Schrijnwerkers et al., 1991; Tobias & Palkovics, 2003). Whether or
not seed transmission of ZYMV occurs remains controversial. This controversy is due in part to
the fact that the reported rates of seed transmission in cucurbits range from 0 to 18.9% (Davis &
Mizuki, 1986; Debiez & Lecoq, 1997; Fletcher et al., 2000; Gleason, 1990; Lecoq et al., 2003;
Muller et al., 2006; Riedle-Bauer et al., 2002; Robinson et al., 1993; Schrijnwerkers et al., 1991;
Tobias & Palkovics, 2003). Accurately determining the rate of seed transmission of ZYMV is of
fundamental importance for understanding the epidemiology of this major plant-pathogenic virus
and for developing and implementing strategies to control it.
Some of the reported variation in the estimates of seed transmission rates in ZYMV
undoubtedly results from differences in detection methods. For instance, using an enzyme-linked
immunosorbent assay (ELISA)-based method, Davis and Mizuki (1986) found 18.9% (246 of
1,299) of Cucurbita pepo (Black Beauty zucchini) seedlings to be infected with ZYMV.
Similarly, Fletcher et al. (2000), using DAS-ELISA, observed seed transmission rates of 3.5% for
ZYMV in C. maxima Duchesne (buttercup squash). However, their results should be interpreted
with caution because they also observed a 2% transmission rate of ZYMV in their controls
(possibly as a result of virus particles remaining on the seed coat). Muller et al. (2006) using
DAS-ELISA detected ZYMV in two of 1,000 asymptomatic Cucumis sativus L. (cucumber), C.
pepo L., and C. maxima Duchesne (pumpkin) that grew from seeds from infected plants, while
ZYMV was detected in 1.4% (15 of 1,031) seedlings of C. pepo var. styriaca (naked seed
pumpkin mutant) using a combination of both DAS-ELISA and reverse transcription–polymerase
chain reaction (RT-PCR) (Pirone & Blanc, 1996). More recently, Lecoq et al. (2003) mention
unpublished data in which no seed transmission of ZYMV was observed in 70,000 seedlings from
various Cucurbitaceae. Other studies suggest that there is only minimal, if any, seed transmission
of ZYMV in Cucumis melo L. (melon) (Gleason, 1990), and that ZYMV transmission through
seeds is probably of no epidemiological importance (Muller et al., 2006).
Finally, interpretations of rates of seed transmission for ZYMV also differ. For instance,
Robinson et al. (1993) found seed transmission rates of only 0.07% in various cucurbits and
concluded that seed transmission does not occur in ZYMV, while Schrijnwerkers et al. (1991)
found a seed transmission rate of 0.05% in C. pepo (zucchini) and concluded that seed
transmission does occur in ZYMV. Similarly, Tobias and Palkovics (2003) reported symptomatic
infections of <0.5% seeds of ZYMV-infected plants from C. pepo var. styriaca (hull-less seeded
oil pumpkin seeds) and concluded that seed transmission does occur at very low rates.
To determine what contribution seed transmission makes to the epidemiology of ZYMV,
we used C. pepo subsp. texana (wild gourd) as a model system and measured the seed
transmission rate of ZYMV by visual inspection, RT-PCR, and antibody tests (ImmunoStrips;
Agdia, Elkhart, IN). Seed transmission of ZYMV is only epidemiologically significant if
vertically infected plants are capable of initiating additional infections via horizontal
transmission. To test for horizontal transmission, we assayed the ability of the vertically infected
plants to initiate infection via mechanical inoculation and tested for the ability of an aphid vector
(Myzus persicae (Sulzer)) to nonpersistently transmit ZYMV from vertically infected plants to
healthy seedlings.
Methods
Field experiment
We harvested approximately 6,000 seeds (count estimated by weight) at the end of the
2008 growing season from ZYMV-infected C. pepo subsp. texana plants growing in four
experimental fields at The Pennsylvania State University Agricultural Research Farm at Rock
Springs, PA. The 0.4-ha experimental fields were laid out with 180 plants per field with
approximately 6 m between plants. A healthy texana plant that was mechanically inoculated with
ZYMV was placed in the middle of each field to serve as a virus source, and the virus was
subsequently spread to neighboring plants via aphids. The seeds were extracted in 4%
hydrochloric acid and washed in a 10% bleach solution to ensure that any viral infection that
occurred was not simply the result of virus on the seed coat, but rather the result of embryonic
infection. The seeds were then germinated in flats in a greenhouse. At the third true leaf stage,
ZYMV infection was determined visually, and if no symptoms were present the seedling was
discarded. Based on visual symptoms showing only slight leaf deformations, two out of 3,195
plants had ZYMV, which was verified by RNA extraction, RT-PCR, sequencing, and cloning. In
fact, the symptoms were so mild that they could have been easily overlooked or considered to be
normal in appearance. We subsequently pooled an additional 281 symptomless seedlings in
groups of 10 for a total of 29 groups, 28 with 10 seedlings apiece, and as there was a final single
seedling, this was treated as an individual group. These 29 groups were tested for ZYMV via
RT-PCR. When a group consisting of 10 seedlings tested positive for infection, this result was taken
to mean that one of 10 plants was infected. As this interpretation could have underestimated the
number of infected seedlings, our estimate of the seed transmission rate is conservative. Because
we knew the proportion of samples that tested negative, we used both the binomial distribution
and the Poisson distribution to estimate the probability that more than one seedling would test
positive in the same sample. At the end of the 2009 season, we again collected fruits from field
plants that had been naturally infected with ZYMV via aphid transmission. Although all of the
plants displayed classic visible signs of ZYMV foliar infection such as deformed, stunted leaves
with yellow mottling, the majority of the fruits showed no symptoms and appeared healthy. The
seeds were extracted and cleaned as described above, and seeds from individual plants were
pooled. The seeds were planted in flats in the greenhouse. At the third true leaf stage, a leaf tissue
sample was collected and frozen at –80°C for analysis from each of 2,336 seedlings. Samples
were pooled into batches of 10 for extraction, cDNA synthesis, and PCR. Two plants that tested
positive by RT-PCR were also tested for ZYMV using ImmunoStrips as per the manufacturer’s
protocol. The ZYMV ImmunoStrip is polyclonal and able to detect a number of isolates,
including the CT, USDA, SJBCA, CA, IT, NY, FL, and Z18 strains.
RNA isolation, PCR analysis, cloning and sequencing
RNA was isolated from frozen leaf samples using the RNeasy Plant Mini Kit (Qiagen,
Valencia, CA). First-strand cDNA was synthesized from the extracted RNA using the Superscript
III First-Strand kit (Invitrogen, Carlsbad, CA) as per the manufacturer’s protocol, and the target
cDNA was then amplified directly via PCR using Phusion High-Fidelity PCR Master Mix
(Finnzymes, Espoo, Finland; distributed by New England Biolabs, Ipswich, MA). PCR
amplification was performed for 35 cycles (Step 1: 98°C for 1 min, Step 2: 98°C for 10 s, Step 3:
64°C for 20 s (minus 1°C every cycle), Step 4: 72°C for 40 s, Step 5: cycle to step 2 for 2 cycles,
Step 6: 98°C for 10 s, Step 7: 62°C for 20 s, Step 8: 72°C for 40 s, Step 9: cycle to step 6 for 31
cycles) followed by a final extension for 5 min at 72°C. The coat protein (CP) specific primers
used for the cDNA synthesis and PCR were: forward: AAGTGAATTGGCACGCTA; reverse:
CGGTAAATATTAGAATTACGTCG.
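The cycling program above can be written out explicitly as a check on the cycle count. This is a minimal sketch that assumes "cycle to step N for k cycles" means k repeats after the first pass, a reading under which the touchdown phase (3 cycles) and the fixed-annealing phase (32 cycles) sum to the stated 35 cycles. The data layout and function name are illustrative only.

```python
from typing import Dict, List, Tuple

Step = Tuple[int, int]  # (temperature in degrees C, duration in seconds)


def build_cycles() -> List[Dict[str, Step]]:
    """Expand the two-phase touchdown program into one entry per cycle.

    The initial denaturation (98 C, 1 min) and the final extension
    (72 C, 5 min) lie outside the cycling loop and are not listed here.
    """
    cycles: List[Dict[str, Step]] = []
    # Touchdown phase: annealing starts at 64 C and drops 1 C per cycle.
    # Assumed reading: first pass plus 2 repeats = 3 cycles (64, 63, 62 C).
    for i in range(3):
        cycles.append({"denature": (98, 10), "anneal": (64 - i, 20), "extend": (72, 40)})
    # Fixed phase: annealing held at 62 C.
    # Assumed reading: first pass plus 31 repeats = 32 cycles.
    for _ in range(32):
        cycles.append({"denature": (98, 10), "anneal": (62, 20), "extend": (72, 40)})
    return cycles


cycles = build_cycles()
anneal_temps = [cycle["anneal"][0] for cycle in cycles]
print(len(cycles), anneal_temps[:4])
```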
To verify that the PCR product was indeed ZYMV, four samples were submitted for
sequencing at the Penn State Genomics Core Facility (The Pennsylvania State University,
University Park, PA). Each sample was purified with the QIAprep Spin Miniprep Kit (Qiagen).
Two samples were cloned using the TOPO TA Cloning Kit (Invitrogen); prior to cloning, each
sample was purified using the QIAquick PCR Purification Kit (Qiagen) and an A overhang was
added. Approximately 40 clones were submitted from each sample for
sequencing. To ensure that mutations were valid, each clone was sequenced in forward and
reverse, and manually aligned in the Se-Al (2.0a11) package kindly provided by Andrew
Rambaut (University of Edinburgh, UK). Any mutations occurring in only one direction were
discarded. This resulted in 71 reliably sequenced clones. The sequences were then trimmed to
cover the majority of the CP region: from the CP start codon to nucleotide (nt) 773. T7 forward
and M13 reverse primers were used for clone sequencing. All sequences generated have been
submitted to GenBank and assigned accession numbers (HQ543133 to HQ543139).
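The rule that a mutation is retained only when it appears in both sequencing directions amounts to a set intersection per clone. The sketch below is illustrative: the alignment in this study was curated manually in Se-Al, and the tuple representation, function name, and example calls are hypothetical.

```python
from typing import List, Tuple

# (nucleotide position, consensus base, observed base)
Mutation = Tuple[int, str, str]


def strand_confirmed(forward: List[Mutation], reverse: List[Mutation]) -> List[Mutation]:
    """Keep only mutations called in both the forward and reverse reads of a clone."""
    return sorted(set(forward) & set(reverse))


# Hypothetical calls for a single clone; positions and bases are made up.
forward_calls = [(120, "A", "G"), (455, "C", "T")]
reverse_calls = [(120, "A", "G")]
print(strand_confirmed(forward_calls, reverse_calls))
```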
Mechanical inoculation
To determine if mechanical transmission can occur from seed-infected plants, we grew
six healthy seedlings (noninfection determined via RT-PCR), which we mechanically inoculated
with ZYMV-infected tissue from three vertically infected plants. Each infected plant was used to
inoculate two healthy seedlings apiece. A ~3 cm² piece of infected leaf tissue was ground in
liquid nitrogen prior to being diluted in a phosphate buffer (0.1 M Na2H/KH2PO4 buffer) in a 1:3
ratio. Carborundum powder was dusted on the surface of the leaf, and the inoculum was then
applied with a pestle to the leaf surface.
Horizontal transmission from vertically infected plants
We used Myzus persicae to determine if an aphid vector could transmit the virus from a
vertically infected plant to healthy seedlings. As a positive control, we assayed a mechanically
inoculated infected plant that displayed severe ZYMV symptoms. A leaf was cut into seven
portions, and ~25 aphids were allowed to feed on six of these in the dark for 30 min; the seventh
served as a negative control. The leaf portions were then placed on noninfected seedlings
(noninfection was checked by RT-PCR) at the first true leaf stage, and these plants were left
overnight. The following day, the plants were sprayed with Endeavor (pymetrozine) (Syngenta,
Guelph, ON) diluted per the manufacturer’s protocol (0.34 g/liter) and applied at a rate of 1 liter
per 46.5 m² to kill the aphid populations, and the seedlings were left in the spray chamber
overnight before being returned to an aphid-free greenhouse. After approximately 3 weeks, a leaf
sample was collected from each seedling, and infection was determined by RT-PCR. The same
procedure as described above was then used to test if horizontal transmission could occur from
seven seed-infected plants using 10 to 60 aphids per leaf portion. Each plant was used to infect
six healthy seedlings (noninfection was checked by RT-PCR), with an additional healthy plant
serving as a control for a total of 42 seedlings.
Results
ImmunoStrips testing
The two seed-infected plants that tested positive for ZYMV via RT-PCR tested negative
using an antibody test (ImmunoStrips from Agdia). In contrast, the mechanically inoculated
positive control plant tested positive using the same test.
Seed transmission rate
In 2008, two individual seedlings and four of 281 (1.42%) samples were infected with
ZYMV (verified by RT-PCR). In 2009, 36 of 2,336 (1.54%) were infected. Hence, a total of 42 of
2,619 samples, or 1.6%, were infected by seed transmission. Using a binomial distribution, we
estimated the probability that an individual seedling would test positive to be 1.66%, while under
a Poisson distribution we estimated the same probability to be 1.67%. Thus, we believe that our
estimate of a seed transmission rate of 1.6% accurately reflects the data.
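The per-seedling estimates can be approximated from pooled test results. The sketch below uses two standard estimators: if a fraction q of pools of size k test negative, the binomial estimate of the per-seedling infection probability is 1 - q^(1/k) and the Poisson estimate is -ln(q)/k. The pool count used here (234 pools of 10 for the 2,336 seedlings sampled in 2009, of which 36 were positive) is an assumption, as the text does not state the number of pools directly.

```python
import math
from typing import Tuple


def pooled_prevalence(n_pools: int, n_positive: int, pool_size: int) -> Tuple[float, float]:
    """Estimate per-individual infection probability from pooled tests.

    Returns (binomial estimate, Poisson estimate), both derived from the
    fraction of pools that tested negative.
    """
    q = (n_pools - n_positive) / n_pools      # fraction of negative pools
    p_binomial = 1.0 - q ** (1.0 / pool_size)
    p_poisson = -math.log(q) / pool_size
    return p_binomial, p_poisson


# Assumed inputs: 234 pools of 10 seedlings, 36 positive pools (2009 data).
p_bin, p_pois = pooled_prevalence(n_pools=234, n_positive=36, pool_size=10)
print(f"binomial: {p_bin:.2%}  Poisson: {p_pois:.2%}")

# Simple proportion across both years, counting one infection per positive pool.
print(f"overall: {42 / 2619:.1%}")
```

Under these assumed inputs, both estimators give values close to the 1.66% and 1.67% reported above, only slightly higher than the conservative figure of 1.6% obtained by counting one infected seedling per positive pool.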
Horizontal transmission from vertically infected plants
We used three seed-infected plants to mechanically inoculate a total of six healthy
seedlings (two apiece). From these, we found four (66.67%) ZYMV-infected seedlings using
RT-PCR. When seven ZYMV-infected plants derived from infected seed were used as source plants
for the aphid transmission tests, the number of seedlings subsequently infected via the
aphids was three out of 42 (7.14%). This seedling infection rate of 7.14% was verified by
RT-PCR, and none of seven control plants fed on by nonviruliferous aphids became infected. In
contrast, when one mechanically infected plant was used as a source plant for an aphid
transmission test, four of the eight seedlings (50%) became infected.
Genetic diversity of ZYMV
We generated a total of 71 coat protein (CP) clones from two vertically transmitted
plants. Within this sample we found a total of seven mutations, three from one plant (designated
seed-2 (S2)) and four from the other (S1). Five of the mutations were singletons (i.e., only
observed once in the alignment), and the other was observed in two clones from the same plant
(S1). Four mutations were synonymous and three were non-synonymous. A minimum spanning
tree, displaying the mutations observed, how they differed from the consensus, as well as a
marked absence of phylogenetic structure (i.e., all mutations are one step away from the
consensus), was estimated using the statistical parsimony approach available in the TCS 1.21
program (Clement et al., 2000) (Fig 5-1).
Figure 5-1: Minimum-spanning tree of the seed clones. Numbers along branches represent the
nucleotide position at which each mutation occurred. Number of clones with a particular mutation
is one unless otherwise noted within the oval. S1 and S2 designate seed samples one and two, and
the next two digits the clone number.
It is theoretically possible that the sequences we obtained could be the result of escaped
CP transgenes from deregulated transgenic squash rather than due to seed infections (H. Lecoq,
personal communication). However, after cloning and sequencing two of the ZYMV samples, we
found that all 71 clone sequences contained 22 amino acids from the protein immediately
preceding the CP (the nuclear inclusion b). As the transgene consists of the CP alone, sequencing
a portion of the nuclear inclusion b precludes the possibility that the obtained sequences were
derived from an escaped transgene.
Discussion
We observed a seed transmission rate of 1.6% for ZYMV in C. pepo subsp. texana, as
well as evidence that vertically infected plants can act as reservoirs for horizontal transmission.
This rate is theoretically high enough for infected seeds to constitute a viable route by which
ZYMV epidemics are initiated and hence may be partially responsible for the current geographic
distribution of this devastating crop pathogen. Indeed, trace seed infection (0.001) in lettuce
mosaic potyvirus has been shown to be sufficient to affect lettuce production due to the
subsequent spread of the virus by aphids (Johansen et al., 1994). We therefore believe it is
plausible that a seed transmission rate of 1.6% may be able to initiate yearly epidemics. Also of
note is that the DAG motif, which is known to be involved in aphid transmission (Gal-On, 2007),
was not mutated in any of the cloned sequences.
Notably, the infected plants were essentially symptomless. Other than the occasional leaf
curling on the first true leaves, which could also occur as the result of mechanical damage when
emerging from the seed coat, the plants looked healthy and displayed no mottling, yellowing, or
stunting as normally seen with ZYMV infection. This finding may account for the low
transmission rates reported by authors who used visual inspection as their primary ZYMV
detection method, thereby leading to an underestimation of the true transmission rate. For
example, Gleason (1990) determined infection based on visual symptoms and reported that only
three of 6,800 C. melo (melon) seedlings displayed the typical ZYMV symptoms of foliar
distortion, mosaic and stunting. The lack of obvious viral symptoms also implies that it may be
difficult to identify the source of a ZYMV epidemic. In addition, it is possible that
healthy-appearing, seed-infected seedlings might be involved in the global spread of the virus. Lecoq et
al. (2003) demonstrated that melon fruits displaying light disease symptoms of ZYMV infection
were capable of transmitting virus via aphids at a 5% rate. It is possible that apparently healthy
looking fruits may also be instrumental in disseminating this virus. However, we previously
determined a very strong spatial clustering of ZYMV by country of origin (Simmons et al., 2008).
This suggests that although there is international gene flow of ZYMV, it does not completely
disrupt biogeographic structure, which is itself more suggestive of intermittent gene flow via the
international seed trade than seed transmission via cultivated cucurbits.
Infected seed as a reservoir of ZYMV is further supported by the observations that
overwintering sources of ZYMV are scarce, especially in temperate regions (Lecoq et al., 2003),
and there are few if any alternative hosts of ZYMV (Pirone & Blanc, 1996). In addition,
Schrijnwerkers et al. (1991) found that seed transmission rates tend to vary depending upon the
age at which the plant becomes infected with ZYMV, with plants infected at an earlier growth
stage producing more infected seed. Thus, reservoirs of ZYMV may not remain constant over
time, which would explain the observation that ZYMV epidemics often skip years
(Grafton-Cardwell et al., 1996; Lecoq et al., 2003; Luis-Atreaga et al., 1998; Rubies-Antonell et al., 1996).
That the ImmunoStrips tested negative while we were able to detect ZYMV via RT-PCR
may help to explain the conflicting vertical transmission rates found in the literature. Given that
the ImmunoStrip is polyclonal, it is unlikely that the negative result is due to strain differences. As
we only detected a small number of mutations in the clones, it is possible that the inability of the
ImmunoStrips to detect ZYMV may be the result of lower virus titers in the seed-infected
samples. However, as we only sequenced 773 nucleotides (out of 849 from the CP start codon to
the CP stop codon) of the CP, it is possible that the CP gene may have accumulated a sufficient
number of mutational differences that antibodies are no longer able to react with it.
The failure to detect vertically transmitted ZYMV using non-PCR techniques coupled
with our findings that vertically infected ZYMV can be horizontally transmitted has implications
for the international seed trade. Currently, three major organizations publish standardized testing
methods for seed health: International Seed Testing Association (ISTA), International Seed Health
Initiative (ISHI), and in the U.S. the National Seed Health System (NSHS). Of the 14 approved
methods for virus detection in seeds, three use indicator plants while the remainder use ELISA
testing (Munkvold, 2009). We therefore suggest that one of the primary objectives for control
strategies for ZYMV should be the establishment of a standardized testing protocol for the
detection of vertical infection in seeds.
Chapter 6
Discussion
This thesis explores the effect of transmission mode on the genetic diversity and
epidemiology of an emerging RNA virus, Zucchini yellow mosaic virus (ZYMV). Collectively
these studies provide information on the evolutionary rate of ZYMV, the manner in which viral
lineages are transmitted among hosts, the magnitude of transmission and systemic bottlenecks,
the amount and patterns of genetic diversity that are generated during infection, as well as the rate
of vertical transmission and its effect on the epidemiology of this virus.
Chapters two, three and four examine the genetic variation and underlying mechanisms
of evolution in populations of ZYMV at different scales ranging from the between population
level to the within individual level. The first study examined ZYMV at the between population
level and revealed that the nucleotide substitution rate of this plant RNA virus fell within the
range of those that have been observed for animal RNA viruses, which was contrary to the
prevailing thought. The scope and depth of chapters three and four were qualitatively different
from all previous studies on evolving populations, as the then-current literature lacked
information on intra-host plant RNA viral diversity, the effects of population bottlenecks on viral
genetic diversity in vivo, and on how these two aspects of the viral lifecycle interact with one
another to shape evolution. These studies describe the patterns and amount of intra-host genetic
diversity in ZYMV and elucidate how this diversity is affected by the population bottleneck
imposed by the host plant during systemic movement. In addition, these studies examine how this
intra-host genetic variation is affected by the genetic bottleneck imposed by the aphid during
inter-host transmission. These studies reveal that most intra-host mutations are deleterious and
thus tend to be removed rapidly from the population. This is further supported by a comparison of
the dN/dS ratios calculated at the between population level (chapter two) with that calculated at
the within individual level (chapter three). That there appeared to be more purifying selection
acting at the between population level (~0.1) than within individual hosts (~0.6) strongly suggests
that a large proportion of the mutations that are generated within hosts are not maintained at the
population level. The third study revealed that both inter- and intra-host population bottlenecks
are not as extreme as had been previously hypothesized. The fourth and final study revealed that
the vertical transmission rate of ZYMV is 1.6% and that seed transmission of ZYMV may be
instrumental in the worldwide dissemination of this virus.
These studies not only consider genetic diversity at increasingly finer scales in terms of
moving from the between population level to the within individual level, but also at increasingly
deeper levels of coverage. The first study examined the evolutionary dynamics of ZYMV using
consensus sequences derived from populations from widely divergent geographic regions.
However, as consensus sequence data masks the sequence variation among individual genomes,
in chapter three I generated clones from ZYMV-infected plants from our experimental fields, as
well as from serially passaged greenhouse samples, in order to gain a deeper understanding of
inter- and intra-host genetic diversity. From these samples I averaged ~35 clones per sample for 20
samples. Since this level of coverage would not allow for the detection of low-frequency alleles
within these viral populations, I undertook next-generation sequencing of these same samples
(with some modifications) in chapter four and achieved an average coverage level of ~9,000X. In
chapter three I found that no mutations were transmitted between or within hosts; however, in
chapter four, with the deeper level of coverage obtained, I found that mutations do in fact persist
both inter- and intra-host. A comparison of the methods used in chapters three and four, as well
as the analyses in chapter four, suggests that in order to uncover the full extent of genetic
variation within a population the level of coverage obtained is an extremely important parameter.
Conventional sequencing is limited by practical constraints such as time and finances, with the
effect that achieving this level of coverage is highly unlikely and only a relatively small number
of individuals from any one population are typically sampled at any one time.
Furthermore, I show that the error rate inherent in the RT-PCR procedure may skew the
results obtained from mutation analyses of RNA viruses. In the second study, I estimated that
approximately 40% of the mutations uncovered could be due to procedural error associated with
the reverse transcriptase and the PCR step. There is considerable variation in the reported rates of
intra-host genetic variation in plant RNA viruses and it is possible that these discrepancies could
be the result of artefactual mutations. It would be extremely difficult, if not impossible, to
determine which mutations are real and which are the result of procedural error particularly in the
case of singletons; however, this problem can be compensated for by increasing the depth of
sequencing and obtaining higher levels of coverage. Thus, because of the high
levels of coverage they achieve, deep sequencing approaches may be a superior
choice for elucidating genetic variation within viral populations.
These analyses reveal that results, and by extension the conclusions derived from these
results, can be strongly biased by the choice of methods used to generate the data. As a further
case in point, in chapter five we showed that the vertical transmission rate of ZYMV, and its
impact on the epidemiology of this virus, was a controversial issue, mostly as a result of variation
in detection methods. A survey of the literature indicated that detection methods included visual
inspection, antibody testing, and RT-PCR testing, resulting in estimates of vertical
transmission that ranged from 0 to 18.9%. I determined that vertical infections were often
symptomless; thus, visual detection would naturally lead to many false negatives. Likewise, I
discovered that antibody testing failed to detect vertically acquired viral infections that were
detectable via RT-PCR. This possibly accounts for the range in vertical transmission rates
recorded in the literature, and highlights the need to standardize detection methods for detecting
viral infections in crop seeds.
In chapter four I demonstrated that the population bottlenecks imposed as the viral
population moves through the plant both cell-to-cell as well as organ-to-organ were not as severe
as had been previously suggested. This was evidenced by the persistence of mutations within
individual plants over the course of infection. However, this phenomenon needs to be
investigated in greater detail, particularly as our results appear to contradict those published to
date. Thus, an assessment, via Illumina sequencing, of the genetic diversity of the viral populations
as the virus moves from leaf to leaf within a plant would be instrumental in teasing apart the
population bottleneck imposed on the virus by the host plant during systemic movement. Given
the extremely high levels that aphid populations can achieve in agricultural fields, it is highly
probable that the population bottlenecks imposed by both the aphid vector during transmission and
by the host plant as the virus moves systemically may be overwhelmed by the sheer number of
transmission events, both between individual plants and within the same plant. This is
hinted at by the fact that there was more population structure in the clones derived from the aphid
vectored samples compared to the mechanically inoculated samples in chapter three, and
underscores the need to study these systems in nature. These findings also highlight the problems
inherent in applying in vitro results to in vivo systems.
The phylogeographic analysis undertaken in the first study hinted that the manner in
which ZYMV was globally distributed required a mechanism that could both efficiently transport
the virus across geographic boundaries as well as effectively initiate horizontal infections. I found
significant clustering by country of origin, as well as by continent, which suggests that although
movement of ZYMV can and does occur it is not frequent enough to disrupt this geographical
structure. In chapter two, I assumed that vertical transmission was probably not the cause of this
movement, particularly as most of the literature at the time suggested that ZYMV infected seeds
were of little to no epidemiological significance. However, in chapter five, I demonstrated that
infected seeds not only result in infected plants but that this infection could be subsequently
transmitted by aphids, suggesting that infected seeds may act as reservoirs for this viral
infection. Consequently, the sale and transport of seeds could be responsible for the current
geographic distribution of this crop pathogen. It had been previously suggested that the
international movement of infected fruits and/or seedlings may also contribute to the global
dissemination of ZYMV (Lecoq et al., 2003). Given that ZYMV is such an economically
devastating crop pathogen, it is important for the global management of ZYMV to determine the
relative contribution of the transportation of infected fruits and seedlings versus infected seed to
the epidemiology of ZYMV.
In chapter four I determined that population bottlenecks as a result of vector transmission
and systemic movement through the plant were not as severe as previously hypothesized.
However, we did not address the population bottleneck imposed on the viral population as the
virus enters the germ line during vertical transmission. Given that infected seeds may act as
reservoirs of ZYMV and may contribute to the worldwide dissemination of this virus, an
assessment of how vertical transmission affects viral genetic diversity, as well as the effect of the
genetic bottleneck on this diversity while entering the germ line, would be an informative next
step. Illumina sequencing of the vertically transmitted samples could potentially reveal if there
are significant differences in genetic variation between the vertically and horizontally transmitted
populations. I determined in chapter five that vertically infected C. pepo plants are virtually
symptomless, which could be due either to lower viral titers or to genetic changes related to vertical
transmission. However, it is currently not clear if this is simply due to viral titer levels, if there
is an underlying genetic mechanism, or if both mechanisms act in combination. Therefore, it is
important to determine if vertical transmission rates increase over time and if Illumina sequencing
of vertically transmitted samples reveals an underlying genetic cause of this decrease in symptoms.
As vertically transmitted pathogens are dependent on their host successfully producing
infected offspring, host fecundity is considered to be more important in vertical transmission than
horizontal transmission (Froissart et al., 2010). Thus, one might assume that the fecundity of vertically
infected C. pepo plants might be significantly higher than that of horizontally infected plants, especially
over several generations. To date, I have observed that the germination rate of seeds harvested
from horizontally infected fruits appears to be significantly lower than those from healthy fruits.
However, how this compares to vertically transmitted seeds is currently unknown. In addition, as
we know the transmission rate, an estimation of the germination rate of infected seeds in
comparison to healthy seeds would aid in managing the spread of this viral pathogen.
There is some evidence to suggest that viruses may manipulate host factors to increase
the plant’s attractiveness to potential vectors by modulating olfactory cues in the form of volatile
compounds (Ngumbi et al., 2007; Medina-Ortega et al., 2009; Mauck et al., 2010). Thus, given
that the virus is being transmitted vertically, it may not influence volatiles in the same manner as
its horizontally transmitted counterparts. Therefore, an assessment of the volatiles emitted by
vertically infected plants and how they compare to those emitted by horizontally infected plants,
as well as healthy plants, may yield a deeper understanding of not only how the virus manipulates
host behavior but how this behavior influences the aphid vector.
This dissertation provides insight into the genetic variation of an RNA virus as it is
transmitted between and within host plants. As deep sequencing technologies become
increasingly more affordable, it will become possible to expand the number of viral populations
sequenced via both vertical and horizontal modes of transmission, thereby increasing our
understanding of how genetic diversity is modulated by population bottlenecks. This is of
paramount importance in terms of our capacity to predict, and perhaps limit, the spread of plant
RNA viruses. Moreover, knowledge gained by studying plant RNA viruses, which are amenable
to experimental manipulation at the field scale, may yield key insights into the tempo of evolution
and the evolution of virulence in emerging RNA viruses.
REFERENCES
Acosta-Leal, R., Bryan, B. K. and Rush, C. M. 2010. Host effect on the genetic diversification of
Beet necrotic yellow vein virus single-plant populations. Phytopathology 100:1204-1212.
Acosta-Leal, R., Duffy, S., Xiong, Z., Hammond, R. and Elena, S. 2011. Advances in Plant Virus
Evolution: Translating Evolutionary Insights into Better Disease Management.
Phytopathology. In Press. DOI: 10.1094/PHYTO-01-11-0017
Ahlquist, P., Noueiry, A., Lee, W., Kushner, D. and Dye, B. 2003. Host factors in positive-strand
RNA virus genome replication. J. Virol. 77: 8181-8186
Ajayi, O. and Dewar, A.M. 1983. The effect of barley yellow dwarf virus on field populations of
the cereal aphids, Sitobion avenae and Metopolophium dirhodum. Ann. Appl. Biol.103:1-11.
Ali, A., Li, H., Schneider, M. L., Sherman, D. J., Grey, S., Smith, D. and Roossinck, M. J. 2006.
Analysis of genetic bottlenecks during horizontal transmission of Cucumber Mosaic Virus. J.
Virol. 80:8345-8350.
Ali, A. and Roossinck, M.J. 2010. Genetic bottlenecks during systemic movement of Cucumber
Mosaic virus may vary in different host plants. Virology. 404:279-283.
Arriaga, L., Huerta, E., Lira-Saade, R., Moreno, E. and Alarcón, J. 2006. Assessing the risk of
releasing transgenic Cucurbita spp. in Mexico. Agric Ecosyst & Environ 112:291-299.
Astier, S., Albouy, J., Maury, Y., Robaglia, C. and Lecoq, H. 2007. Principles of Plant Virology –
Genome, Pathogenicity, Virus Ecology. Science Publishers
Ateya, C. D., Raccah, B. and Pirone, T. P. 1990. A point mutation in the coat protein abolishes
aphid transmissibility of a potyvirus. Virology. 178:161-165.
Basky, Z., Perring, T. and Tobias, I. 2001. Spread of zucchini yellow mosaic potyvirus in squash
in Hungary. Journal of Applied Entomology. 125:1439-0418.
Betancourt, M., Fereres, A., Fraile, A., and Garcia-Arenal, F. 2008. Estimation of the effective
number of founders that initiate an infection after aphid transmission of a multipartite plant
virus. J. Virol. 82:12416-12421
Blackman, R. L. and Eastop, V. F. 2000. Aphids of the World’s Crops: an Identification and
Information Guide, 2nd edn. London, UK: John Wiley & Sons.
Blankenberg, D., Von Kuster, G., Coraor, N., Ananda, G., Lazarus, R., Mangan, M., Nekrutenko,
A. and Taylor, J. 2010. "Galaxy: a web-based genome analysis tool for
experimentalists". Current Protocols in Molecular Biology. 2010 Jan; Chapter 19:Unit
19.10.1-21.
Blok, J., Mackenzie, A., Guy, P. and Gibbs, A.J. 1987. Nucleotide sequence comparisons of
turnip yellow mosaic virus isolates from Australia and Europe. Arch. Virol. 97:283-295.
Blua, M. and Perring, T. 1989. Effect of Zucchini Yellow Mosaic Virus on development and yield
of cantaloupe cucumis melo. Plant. Dis. 73:317-320
Callaway, A.S., George, C.G. and Lommel, S.A. 2004. A Sobemovirus coat protein gene
complements long-distance movement of a coat protein-null Dianthovirus. Virology 330:186-195.
Cantliffe, D.J., Shaw, N.L. and Stoffella, P.J. 2007. Current trends in cucurbit production in the
U.S. Acta. Hort. 731:473-478.
Carrington, C. V. F., Foster, J. E., Pybus, O. G., Bennett, S. N. and Holmes, E. C. 2005. Invasion
and maintenance of dengue virus type 2 and type 4 in the Americas. J Virol 79:14680–14687.
Castle, S.J., Perring, T.M., Farrar, C.A. and Kishaba, A.N. 1992. Field and laboratory
transmission of watermelon mosaic virus 2 and zucchini yellow mosaic virus by various aphid
species. Phytopathology 8: 235-240.
Chare, E. R. and Holmes, E. C. 2004. Selection pressures in the capsid genes of plant RNA
viruses reflect mode of transmission. J Gen Virol 85:3149–3157.
Clement, M., Posada, D. and Crandall, K.A. 2000. TCS: a computer program to estimate gene
genealogies. Mol. Ecol. 9, 1657-1660.
Davies, R.F. and Mizuki, M.K. 1986. Seed transmission of zucchini yellow mosaic virus
Phytopathology. 76:1073.
Decker, D. S. and Wilson, H. D. 1987. Allozyme variation in Cucurbita pepo complex: C. pepo
var. ovifera vs. C. texana. Syst Bot 12:263–273.
Decker-Walters, D. S. 1990. Evidence for multiple domestication of Cucurbita pepo. In Biology
and Utilization of the Cucurbitaceae, pp. 96–101. Edited by D. M. Bates, R. W. Robinson and
C. Jeffrey. Ithaca, NY: Cornell University Press.
Decker-Walters, D. S., Straub, J. E., Chung, S. M., Nakata, E. and Quemada, H. D. 2002.
Diversity in free-living populations of Cucurbita pepo Cucurbitaceae as assessed by random
amplified polymorphic DNA. Syst Bot 27, 19–28.
Desbiez, C. and Lecoq, H. 1997. Zucchini yellow mosaic virus. Plant Path. 46, 809-829.
Desbiez, C., Wipf-Scheibel, C. and Lecoq, H. 2002. Biological and serological variability,
evolution and molecular epidemiology of Zucchini yellow mosaic virus Island of Martinique.
Plant Dis. 80:203-207.
Drake, J.W. and Holland, J.J. 1999. Mutation rates among RNA viruses. Proc. Natl. Acad. Sci.
USA 96:13910-13913.
Drummond, A. J. and Rambaut, A. 2007. BEAST: Bayesian evolutionary analysis by sampling
trees. BMC Evol Biol 7:214.
Duffy, S., Shackelton, L.A. and Holmes, E.C. 2008. Rates of evolutionary change in viruses:
patterns and determinants. Nat. Rev. Genet. 9:267-276.
Dunoyer, P., Thomas, C., Harrison, S., Revers, F. and Maule, A. 2004. A cysteine-rich plant
protein potentiates Potyvirus movement through an interaction with the virus genome-linked
protein VPg. J Virol 78:2301–2309
Espinoza, A. M., Medina, V., Hull, R. and Markham, P. G. 1991. Cauliflower mosaic virus gene II
product forms distinct inclusion bodies in infected plant cells. Virology 185:337–344.
Fargette, D., Pinel, A., Rakotomalala, M., Sangu, E., Traoré, O., Sérémé, D., Sorho, F., Issaka, S.,
Hébrard, E., Séré, Y., Kanyeka, Z. and Konaté, G. 2008. Rice Yellow Mottle Virus, an RNA
plant virus, evolves as rapidly as most RNA animal viruses. J. Virol. 82, 3584–3589.
Fereres, A., Blua, M. J. and Perring, T. M. 1992. Retention and Transmission Characteristics of
Zucchini Yellow Mosaic Virus by Aphis gossypii and Myzus persicae Homoptera: Aphididae.
J. Econ. Entomol. 85:759-765.
Feuer, R., Boone, J.D., Netski, D., Morzunov, S.P. and St. Jeor, S.C. 1999. Temporal and spatial
analysis of Sin Nombre virus quasispecies in naturally infected rodents. J. Virol. 73:9544-9554.
Fraile, A., Sacristán, S. and García-Arenal, F. 2008. A quantitative analysis of complementation
of deleterious mutants in plant virus populations. Spanish J. Ag. Res. 6:195-200
Fraile, A., Escriu, F., Aranda, M.A., Malpica, J.M, Gibbs, A.J. and García-Arenal, F. 1997. A
century of tobamovirus evolution in an Australian population of Nicotiana glauca. J. Virol.
71:8316-8320
French, R. and Stenger, D. C. 2003. Evolution of Wheat streak mosaic virus: dynamics of
population growth within plants may explain limited variation. Annu. Rev. Phytopathol. 41,
199–214.
Froissart, R., Doumayrou, J., Vuillaume, F., Alizon, S. and Michalakis, Y. 2010. The virulence–
transmission trade-off in vector-borne plant viruses: a review of non-existing studies. Proc. R.
Soc. B. 365,1907-1918
Furusawa, I. and Okuno, T. 1978. Infection with BMV of mesophyll protoplasts isolated from
five plant species. J. Gen. Virol. 40:489-491.
Gaille, D. 2001. Translational control of cellular and viral mRNA. Plant Mol. Bio. 32:145-148
Gal-On, A. 2007. Zucchini yellow mosaic virus: insect transmission and pathogenicity – the tails
of two proteins. Mol. Plant Pathol. 8:139–150.
García-Arenal, F., Fraile, A. and Malpica, J.M. 2001. Variability and genetic structure of plant
virus populations. Annu. Rev. Phytopathol. 39:157-186.
García-Arenal, F., Fraile, A. and Malpica, J.M. 2003. Variation and evolution of plant virus
populations. Int. Microbiol. 6:225-232.
Gibbs, A.J., Fargette, D., García-Arenal, F. and Gibbs, M.J. 2010. Time – the emerging
dimension of plant virus studies. J. Gen. Virol. 91:13-22.
Gibbs, A.J., Ohshima, K., Phillips, M.J. and Gibbs, M.J. 2008. The prehistory of potyviruses:
their initial radiation was during the dawn of agriculture. PLoS ONE 3, e2523.
Glasa, M. and Pittnerova, S. 2006. Complete genome sequence of a Slovak isolate of Zucchini
yellow mosaic virus ZYMV provides further evidence of a close molecular relationship
among Central European ZYMV isolates. J Phytopathol 154:436–440.
Glasa, M., Svoboda, J. and Nováková, S. 2007. Analysis of the molecular and biological
variability of Zucchini yellow mosaic virus isolates from Slovakia and Czech Republic. Virus
Genes 35:415–421.
Gleason, L. 1990. Absence of Transmission of Zucchini Yellow Mosaic Virus from Seeds of
Pumpkin. Plant Dis. 74:828.
Goecks, J, Nekrutenko, A, Taylor, J and The Galaxy Team. 2010. Galaxy: a comprehensive
approach for supporting accessible, reproducible, and transparent computational research in
the life sciences. Genome Biol. 11:R86.
Grafton-Cardwell, E.E., Perring, T.M., Smith, R.F., Valencia, J. and Ferrar, C. A. 1996.
Occurrence of mosaic virus in melons in the Central Valley of California. Plant Dis. 80:1092-1097.
Guevara-González, R. G., Ramos, P. L. and Rivera-Bustamante, R. F. 1999. Complementation of
coat protein mutants of pepper huasteco geminivirus in transgenic tobacco plants.
Phytopathology 89:540-545.
Hall, J.S., French, R., Morris, T.J. and Stenger, D.C. 2001. Structure and temporal dynamics of
populations within wheat streak mosaic virus isolates. J. Virol. 75:10231-10243
Hanada, K., Suzuki, Y. and Gojobori, T. 2004. A large variation in the rates of synonymous
substitution for RNA viruses and its relationship to a diversity of viral infection and
transmission modes. Mol Biol Evol 21:1074–1080.
Heinlein, M., Epel, B. L., Padgett, H. S. and Beachy, R. N. 1995. Interaction of tobamovirus
movement proteins with the plant cytoskeleton. Science 270:1983–1985.
Hoelzer, K., Murcia, P., Baillie, G.J., Wood, J.L.N., Metzger, S., Osterrieder, K., Dubovi, E.J.,
Holmes, E.C. and Parrish, C.R. 2010. Intra-host evolutionary dynamics of canine influenza
virus in naïve and partially immune dogs. J. Virol. 84:5329-5335.
Holmes, E.C. 2003. Patterns of intra- and inter-host nonsynonymous variation reveal strong
purifying selection in dengue virus. J. Virol. 77:11296-11298.
Holmes, E.C. 2009. The Evolution and Emergence of RNA Viruses. Oxford Series in Ecology and
Evolution. Series edited by PH Harvey & RM May. Oxford University Press, Oxford.
Holt, C. A. and Beachy, R. N. 1991. In vivo complementation of infectious transcripts from
mutant tobacco mosaic virus cDNAs in transgenic plants. Virology 181:109–117
Hooks, C.R.R., Valenzuela, H.R. and Defrank, J. 1998. Incidence of pests and arthropod natural
enemies in zucchini grown with living mulches. Agriculture, Ecosystems and Environment
69:217-231.
Huelsenbeck, J.P. 1995. Performance of phylogenetic methods in simulation. Syst Biol 44:17-48.
Huelsenbeck, J. P. and Hillis, D. M. 1993. Success of phylogenetic methods in the four-taxon case.
Syst. Biol. 42:247-264.
Huet, H., Gal-On, A., Meir, E., Lecoq, H. and Raccah, B. 1994. Mutations in the helper
component HC gene of zucchini yellow mosaic virus ZYMV affect aphid transmissibility. J.
Gen. Virol. 75:1407–1414.
Hughes, A. 2009. Small Effective Population Sizes and Rare Nonsynonymous Variants in
Potyviruses. Virology 10:127-134.
Iqbal, M., Xiao, H., Baillie, G., Warry, A., Essen, S.C., Londt, B., Brookes, S. M., Brown, I. H.
and McCauley, J. W. 2009. Within-host variation of avian influenza viruses. Phil. Trans. R.
Soc. Lond. B. 364:2739-2747.
Jridi, C, Martin, J-F., Marie-Jeanne, V., Labonne, G. and Blanc, S. 2006. Distinct viral
populations differentiate and evolve independently in a single perennial host plant. J. Virol.
80:2349-2357.
Jenkins, G. M., Rambaut, A., Pybus, O. G. and Holmes, E. C. 2002. Rates of molecular evolution
in RNA viruses: a quantitative phylogenetic analysis. J Mol Evol 54:156–165.
Jerzak, G.V.S., Brown, I., Shi, P., Kramer, L.D. and Ebel, G.D. 2008. Genetic diversity and
purifying selection in West Nile virus populations are maintained during host switching.
Virology 374:256-260.
Johansen, E., Edwards, M.C., and Hampton, R.O. 1994. Seed Transmission of Viruses: Current
Perspectives. Annu. Rev. Phytopathol. 32:363-86.
Katis, N.I., Tsitsipis, J.A., Lykouressis, D.P., Papapanayotou, A., Kokinis, G.M., Perdikis, D.C.
and Manoussopoulos, I.N. 2006. Transmission of Zucchini yellow mosaic virus by colonizing
and non-colonizing aphids in Greece and new aphid vectors of the virus. J. Phytopathol.
154:293-302.
Khelifa, M., Journou, S., Krishnan, K., Gargani, D., Espérandieu, P., Blanc, B. and Drucker, M.
2007. Electron-lucent inclusion bodies are structures specialized for aphid transmission of
cauliflower mosaic virus. J. Gen. Virol. 88:2872-2880.
Kim, T., Youn, M.Y., Min, B.E., Choi, S.H., Kim, M. and Ryu, K.H. 2005. Molecular analysis of
quasispecies of Kyuri green mottle mosaic virus. Virus Res. 110:161–167
Kircher, M. and Kelso, J. 2010. High-throughput DNA sequencing – concepts and limitations.
BioEssays. 32:524–536
Kosakovsky Pond, S.L., Frost, S.D.W. and Muse, S.V. 2005. HyPhy: hypothesis testing using
phylogenies. Bioinformatics 21:676-679.
Kumar, S., Tamura, K. and Nei, M. 2004. MEGA3: Integrated software for molecular
evolutionary genetics analysis and sequence alignment. Brief Bioinform. 5:150-163.
Lakner, C., van der Mark, P., Huelsenbeck, J., Larget, B. and Ronquist, F. 2008. Efficiency of
Markov chain Monte Carlo tree proposals in Bayesian phylogenetics. Syst. Biol. 57:86-103
Latham, J.R. and Wilson, A.K. 2008. Transcomplementation and synergism in plants:
implications for viral transgenes? Mol. Plant. Path. 9:85 -103
Lech, W. J., Wang, G., Yang, Y. L., Chee, Y., Dorman, K., McCrae, D., Lazzeroni, L. C.,
Erickson, J. W., Sinsheimer, J. S. and Kaplan, A. H. 1996. In vivo sequence diversity of the
protease of human immunodeficiency virus type 1: presence of protease inhibitor-resistant
variants in untreated subjects. J. Virol. 70:2038-2043.
Lecoq, H., Desbiez, C., Wipf-Scheibel, C. and Girard, M. 2003. Potential Involvement of Melon
Fruit in the Long Distance Dissemination of Cucurbit Potyviruses. Plant Dis. 87:955-959.
Li, H. and Durbin, R. 2009. Fast and Accurate Short Read Alignment with Burrows-Wheeler
Transform. Bioinformatics. 25:1754-1760.
Li, H. and Roossinck, M.J. 2004. Genetic bottlenecks reduce population variation in an
experimental RNA virus population. J. Virol. 78, 10582-10587
Lira, R., Andres, T. C. and Nee, M. 1995. Cucurbita. Pages 1-115 in R. Lira, (ed). Systematic and
ecogeographic studies on crop genepools. Volume 9. International Plant Genetic Resources
Institute. Mexico City and Rome.
Lisa, V., Boccardo, G., D’Agostino, G., Dellavalle, G. and D’Aquilio, M. 1981. Characterization
of a potyvirus that causes Zucchini yellow mosaic. Phytopathology. 71:667–672.
Lopez-Abella, D., Bradley, R. H. E. and Harris, K. F. 1988. Correlation between stylet paths
made during superficial probing and the ability of aphids to transmit nonpersistent viruses.
Adv Dis Vector Res 5:251–285.
Luis-Arteaga, M., Alvarez, J.M., Alonso-Prados, J.L., Bernal, J.J., Garcia-Arenal, F., Lavina, A.,
Batlle, A. and Moriones, E. 1998. Occurrence, distribution and relative incidence of mosaic
viruses infecting field-grown melon in Spain. Plant Dis. 82:979-982.
Malpica, J. M., Fraile, A., Moreno, I., Obies, C. I., Drake, J. W. and Garcia-Arenal, F. 2002. The
rate and character of spontaneous mutations in an RNA virus. Genetics 162:1505–1511.
Marco, C. F. and Aranda, M. A. 2005. Genetic diversity of a natural population of Cucurbit
yellow stunting disorder virus. J. Gen. Virol. 86:815–822.
Martin, B., Collar, J. L., Tjallingii, W. F. and Fereres, A. 1997. Intracellular ingestion and
salivation by aphids may cause the acquisition and inoculation of non-persistently transmitted
plant viruses. J Gen Virol 78:2701–2705.
Mauck, K.E., De Moraes, C.M. and Mescher, M.C. 2010. Deceptive chemical signals induced by
a plant virus attract insect vectors to inferior hosts. Proc. Natl. Acad. Sci. USA 107:3600–
3605.
Medina-Ortega, K. J., Bosque-Perez, N. A., Ngumbi, E., Jimenez-Martinez, E. S. and
Eigenbrode, S. D. 2009. Rhopalosiphum padi Hemiptera: Aphididae responses to volatile
cues from Barley yellow dwarf virus-infected wheat. Environ. Entomol. 38:836 – 845.
Merits, A., Rajamaki, M., Lindholm, P., Runeberg-Roos, P., Kekarainen, T.M., Puustinen, P.,
Makelainen, K., Valkonen, J. and Saarma, M. 2002. Proteolytic processing of potyviral
proteins and polyprotein processing intermediate in insects and plant cells. J. Gen. Virol.
83:1211-1221
Miyashita, S. and Kishino, H. 2010. Estimation of the size of genetic bottlenecks in cell-to-cell
movement of Soil-borne wheat mosaic virus and the possible role of the bottlenecks in
speeding up selection of variations in trans acting genes or elements. J Virol. 84:1828-1837.
Morozova, O. and Marra, M.A. 2008. Applications of next-generation sequencing technologies in
functional genomics. Genomics 92:255–64
Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. and Wold, B. 2008. Mapping and
quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 5:621-628
Moury, B., Fabre, F. and Senoussi, R. 2007. Estimation of the number of virus particles
transmitted by an insect vector. Proc. Natl. Acad. Sci. USA 104:17891-17896.
Muller, C., Brother, H., Von Bargen, S. and Buttner, C. 2006. Zucchini yellow mosaic virus –
incidence and sources of virus infection in field-grown cucumbers and pumpkins in the
Spreewald, Germany. J. Plant Dis and Prot. 113:252-258.
Munkvold, G.P. 2009. Seed pathology progress in academia and industry. Annu. Rev.
Phytopathol. 47:285-311.
Murcia, P., Baillie, G.J., Daley, J., Elton, D., Jervis, C., Mumford, J.A., Newton, R., Parrish,
C.R., Hoelzer, K., Dougan, G., Parkhill, J., Lennard, N., Ormond, D., Moule, S., Whitwham,
A., McKinley, T.J., McCauley, J.W., Holmes, E.C., Grenfell, B.T. and Wood, J.L.N. 2010.
The intra- and inter-host evolutionary dynamics of equine influenza virus. J. Virol. 84:6943-6954.
Nault, L. R. 1997. Arthropod transmission of plant viruses: a new synthesis. Ann Entomol Soc Am
90:521–541.
Nault, L. R. and Styer, W. E. 1972. Effects of sinigrin on host selection by aphids. Entomol Exp
Appl 15:423–437.
Niepel, M. and Gallie, D.R. 1999. Identification and characterization of the functional elements
within the tobacco etch virus 5’ leader required for cap-independent translation. J. Virol. 73:
9080-9088
Nolasco, G., Fonseca, F. and Silva, G. 2008. Occurrence of genetic bottlenecks during citrus
tristeza virus acquisition by Toxoptera citricida under field conditions. Arch. Virol. 153:259-271.
Ngumbi, E., Eigenbrode, S. D., Bosque-Perez, N. A., Ding, H. and Rodriguez, A. 2007. Myzus
persicae is arrested more by blends than by individual compounds elevated in headspace of
PLRV-infected potato. J. Chem. Ecol. 33:1733–1747.
Oparka, K.J., Prior, D.A.M., Santa Cruz, S., Padgett, H.S. and Beachy, R.N. 1997. Gating of
epidermal plasmodesmata is restricted to the leading edge of expanding infection sites of
tobacco mosaic virus. Plant J. 12:781–789
Osbourn, J.K, Sarkar, S. and Wilson, M.A. 1990. Complementation of coat protein-defective
TMV mutants in transgenic tobacco plants expressing TMV coat protein. Virology 179:921-925.
Pagán, I. and Holmes, E.C. 2010. Long-term evolution of the Luteoviridae: time-scale and mode
of virus speciation. J. Virol. 84:6177-6187.
Page, R.D.M. and Holmes, E.C. 2007. Molecular Evolution. A phylogenetic Approach. Blackwell
Publishing.
Perring, T. M., Farrar, C. A., Mayberry, K. and Blua, M. J. 1992. Research reveals pattern of
cucurbit virus spread. Calif. Agric. 46:35–39.
Pfosser, M. F. and Baumann, H. 2002. Phylogeny and geographical differentiation of Zucchini
yellow mosaic virus isolates Potyviridae based on molecular analysis of the coat protein and
part of the cytoplasmic inclusion protein genes. Arch. Virol. 147:1599–1609.
Pirone, T. P. and Blanc, S. 1996. Helper-dependent vector transmission of plant viruses. Annu.
Rev. Phytopathol. 34:227–247.
Posada, D. and Crandall, K. A. 1998. Modeltest: testing the model of DNA substitution.
Bioinformatics 14:817–818.
Powell, G. 1991. Cell membrane punctures during epidermal penetration by aphids: consequences
for the transmission of two potyviruses. Ann. Appl. Biol. 119:313–321.
Powell, G. 2005. Intracellular salivation is the aphid activity associated with inoculation of
non-persistently transmitted viruses. J. Gen. Virol. 86:469–472
Powell, G. and Hardie, J. 2000. Host-selection behavior by genetically identical aphids with
different plant preferences. Physiol. Entomol. 25:54–62.
Powell, G., Pirone, T. and Hardie, J. 1995. Aphid stylet activities during potyvirus acquisition
from plants and an in vitro system that correlate with subsequent transmission. Eur. J. Plant.
Pathol. 101:411–420.
Qu, F., Ren, T. and Morris, T.J. 2003. The coat protein of turnip crinkle virus suppresses
posttranscriptional gene silencing at an early initiation step. J. Virol. 77:511–522.
R Development Core Team 2008. R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0 http://www.R-project.org.
R Development Core Team. 2011. R: A language and environment for statistical computing. R
Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0 http://www.R-project.org.
Riedle-Bauer, M., Suarez, B. and Reinprecht, H.J. 2002. Seed transmission and natural reservoirs
of Zucchini yellow mosaic virus in Cucurbita pepo var. styriaca. J. Plant Dis and Prot.
1092:200-206.
Restrepo, M.A., Freed, D.D. and Carrington, J.C. 1990. Nuclear Transport of Plant Potyviral
Proteins. The Plant Cell. 2:987-998
Roberts, I.M., Wang, D., Thomas, C.L. and Maule, A.J. 2003. Seed transmission of Pea seed
borne mosaic virus in pea exploits novel symplastic pathways and is, in part, dependent upon
chance. Protoplasma 222:31-43.
Robinson, R.W., Provvidenti, R. and Shail, J.W. 1993. Tests for Seedborne Transmission of
Zucchini Yellow Mosaic Virus. Hortscience. 287:694-696.
Rodríguez-Cerezo, E., Findlay, K., Shaw, J. G., Lomonossoff, G.P., Qiu, S.G., Linstead, P.,
Shanks, M. and Risco, C. 1997. The Coat and Cylindrical Inclusion Proteins of a Potyvirus
Are Associated with Connections between Plant Cells. Virology. 236:296-306.
Rodríguez-Cerezo, E., Elena, S. F., Moya, A. and García-Arenal, F. 1991. High genetic stability
in natural populations of the plant RNA virus tobacco mild green mosaic virus. J. Mol. Evol.
32:328–332.
Rojas, M. R., Zerbini, F. M., Allison, R. F., Gilbertson, R. L. and Lucas, W. J. 1997. Capsid
protein and helper component-proteinase function as potyvirus cell-to-cell movement
proteins. Virology 237:283–295.
Roossinck, M.J. 2007. Mechanisms of plant virus evolution. Annu. Rev. Phytopathol. 35:191-209
Rubies-Antonell, C., Ballante, M. and Turina, M. 1996. Virus infection in melon crops in
Central-Northern Italy. Inform. Fitopathol. 7-8:6-10.
Ruiz-Jarabo, C. M., Arias, A., Baranowski, E., Escarmís, E. and Domingo, E. 2000. Memory in
viral quasispecies. J. Virol. 74:3543-3547.
Rybicki, E. P. and Shukla, D. D. 1992. Coat protein phylogeny and systematics of potyviruses.
Arch. Virol. Suppl 5:139–170.
Sachs, A.B., Sarnow, P. and Hentze, M.X. 1997. Starting at the beginning, middle, and end
translation in Eucaryotes. Cell. 89:831-838
Sacristán, S., Malpica, J. M., Fraile, A. and García-Arenal, F. 2003. Estimation of population
bottlenecks during systemic movement of Tobacco mosaic virus in tobacco plants. J. Virol.
77:9906–9911.
Sanjuán, R., Moya, A. and Elena, S.F. 2004. The distribution of fitness effects caused by
single-nucleotide substitutions in an RNA virus. Proc. Natl. Acad. Sci. USA 101:8396-8401.
Schneider, B.S. and Higgs, S. 2008. The enhancement of arbovirus transmission and disease by
mosquito saliva is associated with modulation of the host immune response. Trans. Roy Soc.
Trop. Med. H. 102:400-408.
Schneider, W. L. and Roossinck, M. J. 2001. Genetic diversity in RNA virus quasispecies is
controlled by host-virus interactions. J. Virol. 75: 6566-6571
Schrijnwerkers, C. C. F. M., Huijberts, N. and Bos, L. 1991. Zucchini Yellow Mosaic virus; two
outbreaks in the Netherlands and seed transmissibility. Neth J Plant Path. 97:187-91.
Shukla, D.D., Frenkel, M.J. and Ward, C.W. 1991. Structure and function of the potyvirus
genome with special reference to the coat protein coding region. Canadian J. Plant Path.
13:178-191
Simmons, H.E., Holmes, E.C. and Stephenson, A.G. 2008. Rapid evolutionary dynamics of
zucchini yellow mosaic virus. J. Gen. Virol. 89:1081-1085.
Simmons, H.E., Holmes, E.C., Gildow, F.E., Bothe-Goralczyk, M.A. and Stephenson, A.G. 2011.
Experimental verification of seed transmission in Zucchini yellow mosaic virus. Plant Dis.
95:751-4.
Simmons, H.E., Holmes, E.C. and Stephenson, A.G. 2011. Rapid Turnover of Intra-Host Genetic
Diversity in Zucchini yellow mosaic virus. Virus Res. 155:389-96.
Spitsin, S., Steplewski, K., Fleysh, N., Belanger, H., Mikheeva, T., Shivprasad, S., Dawson, W.,
Koprowski, H. and Yusibov, V. 1999. Expression of alfalfa mosaic virus coat protein in
tobacco mosaic virus (TMV) deficient in the production of its native coat protein supports
long-distance movement of a chimeric TMV. Proc. Natl. Acad. Sci. USA 96:2549–2553.
Swofford, D. L. 2003. PAUP*. Phylogenetic Analysis Using Parsimony *and other methods,
version 4. Sunderland, MA: Sinauer Associates.
Teycheney, P-Y., Laboureau, N., Iskra-Caruana, M-L. and Candresse, T. 2005. High genetic
variability and evidence for plant-to-plant transfer of Banana mild mosaic virus. J. Gen.
Virol. 86:3179-3187.
Tobias, I. and Palkovics, L. 2003. Characterization of Hungarian isolates of zucchini yellow
mosaic virus (ZYMV, potyvirus) transmitted by seeds of Cucurbita pepo var Styriaca. Pest
Manag Sci 59:493–497.
Thomas, C.L., Leh, V., Lederer, C. and Maule, A.J. 2003. Turnip crinkle virus coat protein
mediates suppression of RNA silencing in Nicotiana benthamiana. Virology 306:33–41.
Turturo, C., Saldarelli, P., Yafeng, D., Digiaro, M., Minafra, A., Savino, V. and Martelli, G.P.
2005. Genetic variability and population structure of Grapevine leafroll-associated virus 3
isolates. J. Gen. Virol. 86:217-224.
Urcuqui-Inchima, S., Haenni, A. and Bernardi, F. 2001. Potyvirus proteins: a wealth of functions.
Virus Res. 74:157-175.
Wang, D. and Maule, A.J. 1994. A model for seed transmission of a plant virus: genetic and
structural analyses of pea embryo invasion by pea seed-borne mosaic virus. The Plant Cell
6:777-787.
Wang, R. Y., Ammar, E. D., Thornbury, D. W., Lopez-Moya, J. J. and Pirone, T. P. 1996. Loss of
potyvirus transmissibility and helper-component activity correlate with non-retention of
virions in aphid stylets. J Gen Virol 77:861–867.
Woolhouse, M. E. J., Taylor, L. H. and Haydon, D. T. 2001. Population biology of multihost
pathogens. Science 292:1109–1112.
Wu, X. and Shaw, J. 1996. Bidirectional uncoating of the genomic RNA of a helical virus. Proc.
Natl. Acad. Sci. USA 93:2981-2984.
Zhao, M. F., Chen, J., Zheng, H.-Y., Adams, M. J. and Chen, J.-P. 2003. Molecular analysis of
Zucchini yellow mosaic virus isolates from Hangzhou, China. J. Phytopathol. 151:307–311.
VITA: Heather Simmons
EDUCATION
Ph.D., 12/11, Department of Biology, The Pennsylvania State University
Advisor: Andrew Stephenson
B.S., 2006, Department of Biology, University of Oregon
Graduated cum laude with a 3.92 GPA
TEACHING EXPERIENCE
• Teaching Assistant, Biology 322 (Genetics), The Pennsylvania State University, Spring
2010
• Teaching Assistant, Bio220W (Populations and communities), The Pennsylvania State
University, Spring 2007 and 2008
• Teaching Assistant, Animal Behavior, University of Oregon, Spring 2004
• Teaching Assistant, Freshman Biology and Anatomy and Physiology, New Mexico
Junior College, 05/1998 – 05/1999
SELECTED SCHOLARSHIPS AND AWARDS
• Jeanette Ritter Mohnkern Graduate Student Scholarship for Outstanding
Achievement in Doctoral Research (2011), Dept. of Biology, PSU
• Doctoral Dissertation Improvement Grant, NSF (2010 – 2012)
• Henry W. Popp Fellowship for Outstanding Graduate Student in Plant Sciences (2010),
Dept. of Biology, PSU
• PSU Biology Department Travel Grant (to attend 6th Annual Virus Evolution
Workshop), Biology Department, PSU (2010)
• Braddock Research Award, Eberly College of Science, PSU (2010)
• J. Ben and Helen D. Hill Memorial Fund Award (2007, 2008, 2009, 2010)
• NSF Travel Grant to attend EEID (Ecology and Evolution of Infectious Diseases)
workshop and conference (2009)
• Braddock Graduate Recognition Fellowship for Outstanding New Graduate Students,
Eberly College of Science, PSU (2006)
• Para Talus Presidential Scholarship, University of Oregon (2006)
PEER-REVIEWED SCIENTIFIC PUBLICATIONS
• Simmons H.E., Holmes E.C., Gildow, F.E., Bothe-Goralczyk, M.A., & Stephenson, A.G.
(2011). Experimental verification of seed transmission in Zucchini yellow mosaic virus.
Plant Disease 95:751-4.
• Simmons H.E., Holmes E.C., & Stephenson, A.G. (2011). Rapid Turnover of Intra-Host
Genetic Diversity in Zucchini yellow mosaic virus. Virus Research 155:389-96.
• Simmons H.E., Holmes E.C., & Stephenson, A.G. (2008). Rapid evolutionary dynamics
of zucchini yellow mosaic virus. J Gen Virol. 89:1081-5.
SCIENTIFIC MANUSCRIPTS IN PREPARATION
• Simmons H.E., Dunham, J.P., Stack, J.C., Dickins, B.J.A., Pagan, I.P., Holmes E.C., &
Stephenson, A.G. Deep sequencing reveals persistence of intra- and inter- host genetic
diversity in natural and greenhouse populations of Zucchini yellow mosaic virus. (To be
submitted to Journal of General Virology)