Scoring human genomic
SNPs and mutations:
Multiplexed primer extension
with manifolds and microarrays
as solid-support
by
Tomi Pastinen
Department of Human Molecular Genetics
National Public Health Institute and
Department of Medical Genetics
University of Helsinki
Helsinki, Finland
Academic dissertation
To be publicly discussed by the permission of the
Medical Faculty of the University of Helsinki,
in the Small Lecture Hall
of the Haartman Institute
on June 20th, at 12 noon
Helsinki 2000
Supervised by
Professor Leena Peltonen (Palotie)
Department of Human Molecular Genetics,
National Public Health Institute, and
Department of Medical Genetics,
University of Helsinki,
Helsinki, Finland
Professor Ann-Christine Syvänen
Department of Human Molecular Genetics,
National Public Health Institute,
Helsinki, Finland
Reviewed by
Professor Olli-Pekka Kallioniemi
Cancer Genetics Branch,
National Human Genome Research Institute,
National Institutes of Health,
Bethesda, MD, USA
Professor Ulf Landegren,
Department of Genetics and Pathology,
Rudbeck Laboratory,
University of Uppsala,
Uppsala, Sweden
Publications of the National Public Health Institute
NPHI A5/2000
Copyright National Public Health Institute
Publisher
Kansanterveyslaitos (Folkhälsoinstitutet)
National Public Health Institute
Mannerheimintie 166
FIN-00300 Helsinki, Finland
phone +358-9-47441
telefax +358-9-47448408
ISBN 951-740-171-X
ISSN 0359-3584
ethesis (PDF) ISBN
952-91-2256-X
To Nathalie
Contents
LIST OF ORIGINAL PUBLICATIONS
SUMMARY
INTRODUCTION
REVIEW OF THE LITERATURE
  Interindividual sequence variation
  Frequency and distribution of human sequence variations
  Dynamic mutations
  Polymorphic markers in human genetics
  Genotyping technology before the PCR era
  Amplification of target DNA
  Mutation detection in amplified DNA
  Unknown mutations
  Screening for known sequence variation
    PCR RFLP
    Allele specific oligonucleotide (ASO) hybridisation
    Ligation assay
    Allele specific PCR
    Minisequencing primer extension
    Homogenous assays
    Assays with signal amplification
  DNA-array technology
    Origins of DNA microarray concept
    Array construction
    In situ synthesis
    Spotted arrays
    Array reading
  Comparative sequencing on DNA-microarrays
    Arrays in sequence scanning
    Scoring SNPs or mutations on DNA-microarrays
    ASO-hybridization based methods
    DNA-modifying enzymes in microarray genotyping
    Summary
  Practical alternatives to PCR?
  Alternatives to microarrays for multiplexing
  Use of sequence variations in modern human genetics
    “Routine” mutation/SNP-scoring
    LD mapping of complex traits
AIMS OF THE PRESENT STUDY
MATERIALS AND METHODS
  DNA samples and extraction of DNA
  Primer synthesis
  PCR amplification
  Affinity capture and ssDNA preparation
  Electrophoretic separation
  Preparation of microarrays
  Genotyping reactions
  Quantitation and interpretation of the results
  Reference methods
  Statistical methods
RESULTS AND DISCUSSION
  Design of assays
    “Length-labeled” multiplex fluorescent minisequencing
    Minisequencing primer extension arrays
    Allele specific extension arrays
  Optimization of genotype discrimination
    Length-labeled multiplex minisequencing assays
    Array-based assays
  Assay procedures
    Multiplex PCR
    Length labeled primers for multiplex minisequencing
    Array-based extension assays
  Applications
    HLA typing
    Screening for mutations and SNPs
CONCLUDING REMARKS
ACKNOWLEDGEMENTS
REFERENCES
LIST OF ORIGINAL PUBLICATIONS
(in addition, unpublished data are presented)
I
Tomi Pastinen, Jukka Partanen and Ann-Christine Syvänen:
Multiplex, fluorescent solid-phase minisequencing for efficient
screening of DNA variation. (1996) Clinical Chemistry 42:1391-7.
II
Tomi Pastinen, Ants Kurg, Andres Metspalu, Leena Peltonen
and Ann-Christine Syvänen: Minisequencing: a specific tool for DNA
analysis and diagnostics on oligonucleotide arrays. (1997) Genome
Research 7:606-14.
III
Tomi Pastinen, Kirsi Liitsola, Paavo Niini, Mika Salminen
and Ann-Christine Syvänen: Contribution of the CCR5 and MBL genes
in susceptibility to HIV-1 infection in Finnish Population. (1998) AIDS
Research and Human Retroviruses 14:695-8.
IV
Tomi Pastinen*, Markus Perola*, Paavo Niini, Joe
Terwilliger, Veikko Salomaa, Erkki Vartiainen, Leena Peltonen and
Ann-Christine Syvänen: Array-based multiplex analysis of candidate
genes reveals two independent and additive genetic risk factors for
myocardial infarction in the Finnish population. (1998) Human Molecular Genetics 7:1453-62.
V
Tomi Pastinen, Mirja Raitio, Katarina Lindroos, Paivi
Tainola, Leena Peltonen and Ann-Christine Syvänen: A system for
specific, high-throughput genotyping by allele-specific primer extension on microarrays. (2000) Genome Research in press.
*equally contributed
(IV has previously appeared in Dr. Markus Perola’s PhD thesis
May-1999)
SUMMARY
Single nucleotide polymorphisms (SNPs) represent the most
common form of sequence variation among individuals: three million
common SNPs with a population frequency of over 5% have been
estimated to be present in the human genome. Furthermore, simple
substitution mutations account for the majority of disease alleles
identified for inherited disorders. The Human Genome Project’s
sequencing effort is enabling large scale, genomewide comparative
sequencing to identify common sequence variants. Thus, a genetic
map of unprecedented resolution is being constructed containing
several hundred thousand SNP markers. High-throughput methods for
scoring allelic variants of SNPs and point mutations are imperative
not only for efficient use of the new markers and for screening for
disease mutations, but also for characterizing functional
polymorphisms affecting drug metabolism.
In this thesis, methods based on enzymatic discrimination of
sequence variation in multiplexed formats are developed and applied.
Minisequencing is based on a detection primer annealing just 5’ to
the nucleotide of interest and extending this primer with labeled
nucleoside triphosphate analogues using a DNA-polymerase. The
fidelity of the DNA-polymerase ensures that only a nucleotide complementary to the site of interest is incorporated in the reaction,
specifically identifying the allele.
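The minisequencing principle described above can be illustrated with a minimal sketch. It assumes idealized behaviour (perfect primer annealing and error-free incorporation of a single nucleotide); the function names and the sequences are hypothetical and are not taken from the thesis.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def incorporated_base(template: str, primer: str) -> str:
    """Base added to the primer's 3' end in a single-nucleotide extension.

    template: strand the primer anneals to, written 5'->3'.
    primer:   detection primer, written 5'->3'.
    """
    site = reverse_complement(primer)  # where the primer anneals on the template
    pos = template.find(site)
    if pos < 1:
        raise ValueError("primer does not anneal, or no template base lies beyond its 3' end")
    # The polymerase reads the template base immediately 5' of the annealing site
    # and incorporates its complement.
    return template[pos - 1].translate(COMPLEMENT)


def genotype(templates: list[str], primer: str) -> str:
    """Combine the signals from both chromosomes into a genotype string."""
    return "/".join(sorted({incorporated_base(t, primer) for t in templates}))


# Hypothetical sequences, invented for illustration only.
allele_1 = "CCGATGCATGCAAG"
allele_2 = "CCGGTGCATGCAAG"
primer = reverse_complement("TGCATGCAAG")

print(genotype([allele_1, allele_2], primer))  # heterozygote
print(genotype([allele_1, allele_1], primer))  # homozygote
```

In this toy model a heterozygous sample yields two different incorporated nucleotides and a homozygous sample yields one, which mirrors how the labeled nucleotide signals are read as genotypes.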
A multiplexed, fluorescent minisequencing method for scoring
SNPs at the human leukocyte antigen class II genes was developed
based on size addressing of detection primers specific for each site. A
manifold support was utilized to immobilize the amplified targets and
a multiplexed minisequencing reaction with fluorescein labeled
dideoxynucleotides was carried out. The reaction products were
analyzed by size separation by electrophoresis in an automated
sequencer. A 100% concordance of the typing results with samples of
known genotype was achieved. This convenient procedure allowed
rapid scoring of DQA1 and DRB1 alleles in a cohort of multiple sclerosis affected individuals and their parents.
DNA-microarrays are solid substrates with an ordered set of immobilized oligo- or polynucleotide probes in a miniaturized format. A
DNA-microarray based minisequencing primer extension method for
simultaneous genotyping of disease mutations and SNPs was developed. The power of genotype discrimination using the enzyme-assisted
minisequencing procedure was shown to be 10-fold better than
that of allele specific oligonucleotide hybridization (ASO) in a
pairwise comparison on the microarray format. The specificity of the
minisequencing approach allows the use of low complexity
microarrays for genotyping applications. Custom-built robotic spotters were used to construct arrays for scoring SNPs related to HIV-1
susceptibility as well as variants associated with increased risk of
myocardial infarction (MI). These assays were applied in two case-control association studies in over 600 Finns. The common chemokine
receptor gene deletion (CCR5 Δ32 bp) and variant alleles of the mannose
binding lectin gene were associated with protection against and
increased susceptibility to HIV-1 infection, respectively. Increased
risk for MI in the Finnish population was conferred by variant alleles
of the platelet glycoprotein IIIa and plasminogen activator inhibitor
type-1 genes; this predisposing effect was particularly prominent in
subjects who carried both predisposing variants.
A related enzyme-assisted genotyping method, called allele-specific extension on DNA-microarrays, was then developed to simplify the reaction procedure. In this method, only a single post-PCR
liquid handling step is required for multiplexed genotyping. The
specificity of genotype discrimination remained high, and “in-house”
manufactured miniaturized reaction chambers enabled scoring of
over 2500 genotypes from a single glass microscope slide. One fluorescent dye is required, but the use of another dye as an internal
control allowed minority mutation detection at the 5% level. A panel of 31
Finnish disease mutations and another panel for 11 SNPs were evaluated in 424 samples, with accurate assignment of known genotypes
and with an over 96% success rate. The assay for Finnish disease
mutations was recently applied to nearly 2500 population based
samples and blinded positive controls to determine carrier frequencies and the geographic variation of the carrier frequencies in Finland.
The work in this thesis demonstrates that significant improvement in single nucleotide variation scoring capacity can be achieved
with enzymatic discrimination based multiplexed methodology. Furthermore, the approaches described are suitable for any molecular
biology unit, as the reagents and instrumentation are now widely
available.
INTRODUCTION
Genomics can be understood as the study of biology using specific
tools of cloned genes and, if possible, in a genomewide manner.
Cataloguing human genes and their functions has been compared to
the construction of a “periodic table of biology” (Lander 1996) analogous to the chemical periodic table of elements. The most straightforward way of acquiring a comprehensive list of human genes is to
determine the whole genomic sequence. The current goal of the
Human Genome Project is to have the finished human genomic sequence by the year 2003 (Collins et al. 1998); a preliminary partial
alignment of the human genome sequence will be available within
the next few months. The sequence of one human chromosome has
already been released (Dunham et al. 1999), an accomplishment
that was doubted by many just 10 years ago. Just as the development of technology to analyze DNA has made the advancement in
genome sequencing possible, it is expected to open the door for the
“post-genome” or “functional genomics” era.
Sequence variation between individuals consists of a continuum
from deleterious disease mutations to neutral polymorphisms. Characterization of this variation can be utilized in mapping disease
genes, diagnostics, pharmacogenetics - defining functional variation
in drug metabolizing enzymes or receptors, individual identification,
population genetics, and evaluation of physiological relevance of
individual genes. Thus, selective resequencing to determine genetic
variation can be considered as an integral part of functional genomics.
In this thesis novel tools have been developed for analyzing single
nucleotide variation in a parallel, multiplex manner for various polymorphic and disease loci in the human genome. Our applications for
scoring variants to study Finnish genes represent an early step towards large-scale screening of polymorphisms and mutations.
REVIEW OF THE LITERATURE
Interindividual sequence variation
Faithful replication of the 3.3 × 10^9 base pairs of the nuclear human
genome is a basic property necessary for each dividing cell. A key
group of enzymes in the replication process are the DNA-polymerases,
which synthesize new DNA-strands by incorporating complementary
nucleotides with high fidelity. The enzymes often possess a 3’-5’ proofreading activity to correct for misincorporated bases. For example,
E. coli DNA polymerase III with proofreading activity has an error rate
as low as 5 × 10^-9 (Lewin 1997). Similarly, mammalian DNA polymerase
s, which synthesizes the daughter strand during replication possesses
3’-5’ exonuclease activity along with the extensive DNA repair machinery of a eucaryotic cell (see below) to ensure perfect copying of
the parent strand. However, errors in the replication occur in 1-100%
of the cell divisions as assayed in mammalian cell culture systems
(Thacker 1985).
Both endogenous and exogenous sources of replication errors
exist. Endogenous causes include spontaneous depurination of bases
(Loeb and Preston 1986) and deamination of cytosine residues (sometimes adenine) (Coulondre et al. 1978) yielding uracil (or hypoxanthine). Variation in microsatellite repeat size arises by intramolecular
slipped strand mispairing, whereas interstrand interactions such as
gene conversion or recombination give rise to minisatellite variability (Dijan 1998). Deletions and insertions are formed similarly, possibly being promoted by the surrounding direct or inverse repeats.
Exogenous mechanisms for mutations include thymidine dimerization
induced by UV light, various chemicals, such as alkylating agents
forming adducts with the DNA bases, reactive oxygen species damaging pyrimidine and purine rings, and ionizing radiation causing DNA
strand nicking and breakage. The importance of exogenous agents in
promoting mutagenesis has recently been highlighted by the increased mutation rate in mammals exposed to ionizing radiation by
the Chernobyl nuclear catastrophe (Dubrova et al. 1996, Ellegren et al.
1997).
The unavoidable damage to our DNA is counterbalanced by the
DNA repair systems present in our cells to retain viability. The crucial
nature of these repair mechanisms is evidenced by several inherited
diseases caused by defects in the system. Patients with xeroderma
pigmentosum are prone to skin cancer upon exposure to UV-light, due
to defects in the nucleotide excision repair system removing thymine
dimers and large chemical adducts (Lambert et al. 1998). Similarly,
defects in post-replication repair of double-stranded breaks can
cause Bloom syndrome (Ellis and German 1996), Nijmegen breakage
syndrome (Matsuura et al. 1998) or hereditary ovarian and breast
cancer (Zhang et al. 1998). The essential nature of the mismatch repair
system proteins is highlighted by germ-line mutations in their genes
causing nonpolyposis colon cancer (Aaltonen and Peltomäki 1994).
Frequency and distribution of human sequence variations
Interindividual sequence variation is most frequently seen in
differences in lengths of repeated sequence elements such as
minisatellites and microsatellites, as small deletions or insertions,
and as substitutions of the individual bases. Hypervariable
minisatellites with repeated units of 9-64 bp in length are mostly
located between genes and are dispersed unevenly in the genome,
preferentially in telomeric locations in human chromosomes (Lathrop
et al. 1988).
Microsatellite repeats, with repeat units of 1-4 bp, are also mostly
non-coding with the important exception of some trinucleotide repeat expansions causing inherited disorders. Mononucleotide repeats
of runs of As or Ts compose 0.3% of the nuclear genome, while dinucleotide repeats represent 0.7% of the genome, occurring approximately once in every 50 kb (Weber & May 1989).
Substitutions of single nucleotides are the most common form of
sequence variation between individuals, occurring every 300-1000 bp
in the genome (Li & Sadler 1991, Wang et al. 1998, Cargill et al. 1999,
Halushka et al. 1999). Transition mutations (C to T) are more common
than transversion (A to T or A to C) mutations, probably partly due to
instability of CpG dinucleotides (Cooper et al. 1995). Polymorphisms
in the regulatory regions of genes and sequence variants that alter
amino acids in the coding regions of human genes are significantly
suppressed by selection. This is evidenced by the similar frequency of fourfold degenerate site polymorphisms in the coding
compared to the noncoding DNA of pseudogenes, whereas
polymorphisms at twofold degenerate sites, 5’ flanking, 5’UTR and
3’UTR are less common. Finally, nucleotide substitutions at
nondegenerate sites have only 25-30% of the frequency of non-coding
polymorphisms (Li & Sadler 1991, Cargill et al. 1999, Halushka et al.
1999).
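The distinction between transitions and transversions mentioned above can be made concrete with a short sketch. It uses the standard definition (a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine; any other substitution is a transversion); the function name is hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = {ref, alt}
    if ref == alt or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Same chemical class on both sides means a transition.
    same_class = bases <= PURINES or bases <= PYRIMIDINES
    return "transition" if same_class else "transversion"


for ref, alt in [("C", "T"), ("G", "A"), ("A", "T"), ("A", "C")]:
    print(ref, "->", alt, classify_substitution(ref, alt))
```

Of the twelve possible directed substitutions, four are transitions and eight are transversions, so the observed excess of transitions is not what a uniform substitution model would predict.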
Small deletions/insertions (del/ins) usually cause frameshift
mutations and are even more significantly suppressed in the coding
regions of genes, and as assessed recently in the factor IX gene, their
frequency was only 1.5% of the mutation frequency of substitutions
(Anagnostopoulos et al. 1999). On average, common small del/ins
occur once in every 12 kb of genomic DNA (Wang et al. 1998), and a
genetic map based on these polymorphisms is being developed
(http://www.marshmed.org/genetics/).
Dynamic mutations
After the characterization of the first trinucleotide repeat expansion mutation causing the Fragile X syndrome (Fu et al. 1991), it became clear that a number of inherited disorders are caused by these
unstable repeat mutations. The repeated segments range from short
trinucleotide repeats in coding regions, such as in Huntington disease (The Huntington’s Disease Collaborative Consortium 1993), to
promoter region expansions with >10 bp repeat size, as in progressive
myoclonus epilepsy (Virtaneva et al. 1997). Some of the repeat sequences are too long or too GC-rich to be efficiently amplified and
pose a challenge to diagnostic laboratories (reviewed by McGlennen
1996). Consequently, in some cases the diagnosis is still based on
traditional Southern blotting, and the advances in techniques for
detection of sequence variation described in this thesis are not applicable for this category of mutations.
Polymorphic markers in human genetics
Initial identification of interindividual genetic variation was
made at the protein level. ABO blood groups (Landsteiner 1901) were
the first genetic polymorphic markers described for humans. Coincidentally, very near their discovery in the early 20th century, Sir
Archibald Garrod described alkaptonuria, the first inborn error of
metabolism. Garrod’s work with alkaptonuria outlined many of the
cornerstones of a monogenic disease trait, such as familial distribution, high incidence of consanguineous marriages in affected families, and a pattern of recessive inheritance as described by Mendel
earlier.
In the early days of human genetics no daily updates to marker
databases were made, and the next description of new polymorphic
markers - the Rh blood groups - was published several decades later
(Landsteiner & Wiener 1940, Levine & Stetson 1939). Gibson was the
first to characterize an enzyme defect as a cause of a human inherited
disease almost half a century after the clinical description of alkaptonuria (Gibson et al. 1948).
The next events in development of genetic markers and discovery of genetic diseases were merged by Linus Pauling’s work demonstrating that the sickle cell hemoglobin polypeptide had different
electrophoretic mobility than the wild type counterpart (Pauling et al.
1949). This introduced the molecular disease concept, and provided a
powerful tool for studying variation between individuals at the protein level. Many other abundant serum proteins were shown to be
polymorphic by protein electrophoresis. Ingram demonstrated some
years later that Pauling’s observations were due to a single amino acid
substitution in the primary polypeptide sequence (Ingram 1957).
More sensitive enzymatic staining techniques enabled detection of
polymorphisms also in less abundant proteins (Harris 1966, Lewontin
& Hubby 1966). HLA proteins assayed by immunological methods
(Dausset 1958, Bach & Voynow 1966, Amos et al. 1969) in the 1960s
illustrated the peculiar feature of these molecules as being more
diverse than all the other polymorphic markers discovered.
Monitoring of genetic variation at the DNA-level became possible when enzymatic manipulation of DNA was discovered. Restriction
enzymes (Smith & Wilcox 1970, Kelly & Smith 1970) enabled targeted
cutting of human genomic DNA into fragments that could then be
cloned into vectors and propagated in suitable bacterial plasmids
(Cohen et al. 1973). Fragmented DNA could also be blotted and hybridized after agarose gel electrophoresis (Danna & Nathans 1971) to
nitrocellulose filters (Southern et al. 1975). Soon after the discovery of
the first restriction fragment length polymorphisms (RFLPs) in the human
genome (Kan & Dozy 1978), the construction of a map of the human
genome based on these polymorphic DNA markers was suggested
(Botstein et al. 1980). The era of “reverse genetics” had begun. With
the polymorphic marker map and linkage analysis one could look for
disease causing loci in the human genome without a priori knowledge of the underlying biochemistry. The power of reverse genetics
was highlighted by the localisation of the Huntington disease gene
(Gusella et al. 1983). The highly polymorphic minisatellites (VNTRs)
(Jeffreys et al. 1985) were more informative than RFLPs, but suffered
from nonuniform distribution across the human genome, and thus they
were mostly utilized in identification of individuals rather than in
genomic mapping. By the time the next generation of polymorphic markers based on in vitro amplification of genomic DNA was suggested,
there were already over 2000 polymorphic RFLPs mapped into the
human genome (Weber 1990). The new microsatellite markers (Weber & May 1989) soon took over the genetic mapping and DNA-based
identification fields. Multiallelic microsatellites were easier to assay
after PCR amplification and polyacrylamide gel electrophoresis, and
they are also more informative than biallelic RFLPs. In 1996 there were
already >5000 mapped microsatellite markers included in the “2nd
generation genetic marker map” (Dib et al. 1996). Consequently,
finding disease genes for monogenic disorders has become considerably easier, as evidenced by over 1000 cloned genes with allelic
variants underlying inherited diseases and disease susceptibilities to
date (Antonorakis and McKusick, 2000, http://www.ncbi.nlm.nih.gov/
omim/).
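The statement that multiallelic microsatellites are more informative than biallelic RFLPs can be illustrated numerically. The sketch below uses expected heterozygosity, one common measure of marker informativeness, which is an assumption here because the text does not name a specific measure; the allele frequencies are invented for illustration.

```python
def expected_heterozygosity(allele_freqs: list[float]) -> float:
    """Expected heterozygosity H = 1 - sum(p_i^2) under random mating."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical markers: the most informative biallelic marker possible,
# and a microsatellite with ten equally frequent alleles.
biallelic = [0.5, 0.5]
microsatellite = [0.1] * 10

print(round(expected_heterozygosity(biallelic), 3))
print(round(expected_heterozygosity(microsatellite), 3))
```

Under this measure a biallelic marker cannot exceed a heterozygosity of 0.5, whereas a marker with many alleles can approach 1, which is why a denser map is needed when biallelic markers replace microsatellites.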
Currently, interest has shifted towards the development of a 3rd
generation marker map (Wang et al. 1998) based on the most common form of interindividual sequence variation - the single nucleotide polymorphisms (SNPs). The enthusiasm regarding these
biallelic markers is not only due to their extreme density, but also the
promise of easier and more accurate scoring of them. Furthermore,
the relatively high mutation rate of microsatellites (Sajantila et al.
1999) theoretically favors the use of biallelic markers. Joint efforts are
now in progress to develop up to 300,000 SNP markers within the next
2-3 years (http://www.wellcome.ac.uk/en/1/awtprerel0499n123.html).
It is believed that with the SNP markers becoming available, the
genetic dissection of complex traits common in the population would
be possible (Risch & Merikangas 1996, Chakravarti 1998). However,
criticism against the simplified linkage disequilibrium assumptions
in association studies has been put forward (Terwilliger & Weiss
1998).
Genotyping technology before the PCR era
The recombinant DNA technology made it possible to look into
the variation of the individual DNA bases in the genome. In particular,
Southern blot hybridization (Southern 1975) served as the “workhorse”
of genotyping. The method involves restriction enzyme digestion of genomic DNA in microgram amounts and usually a cloned,
radioactively labelled probe. The discovery of an RFLP marker 3’ to the
β-globin locus associating with the sickle cell trait (Kan & Dozy 1978),
using the first cloned disease-related mammalian cDNA and gene
(Maniatis et al. 1976), opened the way for DNA diagnostics of human
inherited diseases. Next, improved disease allele detecting RFLPs for
the β-globin locus (Geever et al. 1981, Chang & Kan 1982), and also for
some other common recessive diseases, such as A1AT deficiency
(Cox et al. 1985), were identified. Development of the allele specific
hybridisation method using short synthetic oligonucleotide probes
(Wallace 1979) allowed, in principle, detection of any base substitution (Conner et al. 1983), irrespective of changes in restriction sites.
Previously uncharacterized base substitutions from total genomic
DNA could be detected by the ribonuclease A cleavage method
(Myers et al. 1985). Subsequently, a more general method based on
differential electrophoretic mobility of DNA heteroduplexes in a denaturing gradient gel (Myers et al. 1985) was introduced.
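The way a base substitution in a restriction site produces an RFLP can be sketched as follows. This is a simplified model under stated assumptions: digestion is complete, the cut is placed at the start of each recognition sequence rather than at the enzyme's true cleavage position, and the sequences are invented. GAATTC is the recognition sequence of EcoRI and is used only as a familiar example.

```python
def fragment_lengths(seq: str, site: str) -> list[int]:
    """Fragment lengths after complete digestion at every occurrence of `site`.

    Simplification: the cut is placed at the first base of the recognition
    sequence, which shifts fragment sizes by a few bases relative to a real enzyme.
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


SITE = "GAATTC"
# Hypothetical alleles: allele_b carries a single substitution (A to G)
# that destroys the second recognition site.
allele_a = "ATGC" * 3 + "GAATTC" + "ATGC" * 5 + "GAATTC" + "ATGC" * 2
allele_b = "ATGC" * 3 + "GAATTC" + "ATGC" * 5 + "GAGTTC" + "ATGC" * 2

print(fragment_lengths(allele_a, SITE))
print(fragment_lengths(allele_b, SITE))
```

The two alleles give different fragment patterns after digestion, and it is this length difference that a probe detects on a Southern blot.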
During the 1960s only sequences of small RNA molecules could
be determined by fragmenting. Development of the Sanger dideoxy
(Sanger et al. 1977) and the Maxam-Gilbert chemical cleavage methods (Maxam & Gilbert 1977) made sequence determination of cloned
DNA fragments possible. It was necessary to clone the fragments to
get enough DNA, and to reduce complexity of the sample. DNA synthesis on solid supports by automated synthesizers (Gait & Sheppard
1977a, Gait & Sheppard 1977b) made it possible to use the novel
ASO-hybridisation and sequencing techniques.
Amplification of target DNA
Analysis of human genomic DNA is based on amplification of
fragments of interest from the genome to increase the copy number of
the target and to reduce the complexity of the analyzed DNA. Both of
these measures are directed to enable sensitive and specific detection of the target of interest.
The polymerase chain reaction (PCR, Figure 1) (Saiki 1986, Mullis
& Faloona 1987) has changed genome research. The possibility to
amplify specific segments of genomic DNA has enabled detection of
point mutations on a large scale. With purified thermostable
polymerases (Lawyer et al. 1989) a wide range of genomic applications were quickly developed (reviewed by Erlich et al. 1991). Modern,
rapid thermal cyclers with standardized 96-well or 384-well plate
formats allow fast amplification and set-up of reactions. Combining
enzymes possessing pronounced 3’ to 5’ exonuclease activity (Pfu)
with “designer” enzymes having minimal 5’ to 3’ exonuclease activity
(AmpliTaq) allows high-fidelity amplification of over 10 kb stretches of
DNA (Barnes 1994). Non-specific amplification is minimized by use of
DNA (Barnes 1994). Non-specific amplification is minimized by use of
“molecular switches” activating the reaction only at high temperatures (Sharkey et al. 1994, Dang & Jayasena 1996, Birch et al. 1996).
Active research and development in DNA-polymerases (for a review
see Abramson 1999) and instrumentation (Northrup et al. 1999) holds
promise for extremely facile amplification procedures. For example,
micromachined devices with continuous flow systems allow amplification of genomic DNA in merely a few minutes (Kopp et al. 1998).
The isothermal self sustained replication (3SR) reaction (Guatelli
et al. 1990) also known as nucleic acid sequence-based amplification
(NASBA) (Compton 1991) is another target amplifying method. In 3SR
the successive action of three (or two) enzymes leads to exponential
amplification of target in an isothermal reaction. Another
“polyenzymatic-primer-guided” reaction procedure is strand displacement amplification (Walker et al. 1992a, 1992b). These alternative target amplification techniques have never gained popularity in
human genetic applications as they do not provide significant advantage over PCR, being more complicated to set up and optimize.
FIGURE 1. Schematic presentation of the polymerase chain reaction (PCR).
The specificity of PCR amplification originates from the two independent
oligonucleotide primers required to anneal to the target sequence near each
other in the correct orientation in order to allow their exponential
amplification. Denaturation, annealing and extension steps are achieved by
thermal cycling, resulting in doubling of the target number at each cycle.
In reality, the efficiency of doubling is less than 100% in each cycle, and
amplification is not exponential at later stages of amplification.
[Figure labels: Target DNA, DNA pol, dNTPs; Denature, Anneal, Extend; 1st cycle, 2nd cycle, n cycles; copies]
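The cycle-by-cycle doubling of PCR, and the effect of an efficiency below the ideal, can be sketched numerically. The model below is a common textbook simplification and an assumption here: every cycle has the same efficiency and no plateau is modeled, so it describes only the exponential phase. The function name and the efficiency value of 0.8 are illustrative.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds, assuming a constant per-cycle efficiency.

    efficiency = 1.0 means perfect doubling; 0.0 means no amplification.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# Starting from a single target molecule:
for n in (10, 20, 30):
    ideal = pcr_copies(1, n)
    realistic = pcr_copies(1, n, efficiency=0.8)  # illustrative value only
    print(f"{n} cycles: ideal {ideal:.3g}, at 80% efficiency {realistic:.3g}")
```

Even a modest loss of per-cycle efficiency compounds over many cycles, so the yield after 30 cycles falls well short of the ideal 2^30 copies.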
Mutation detection in amplified DNA
The mutation detection methods can be divided into those that
scan for unknown mutations in a target region and those that
screen for previously described variation. Similarly, a nomenclature
for single nucleotide polymorphism detection methods has been
adopted with SNP discovery and SNP scoring methods, respectively.
Typically mutation scanning methodology is required in dominant
disorders with different mutations accounting for disease alleles in
each family and now in the large-scale characterization of common
SNPs in mammalian genomes. Screening or scoring of mutations and
SNPs is commonly employed in carrier screening and diagnosis of
disease mutations as well as in the many applications of SNP typing.
The rationale for division of the methods is that scanning methods are
usually labor intensive, difficult to interpret and expensive, whereas
the once the mutation or SNP has been discovered the scoring methods should provide efficient and straigthtforward techniques for
repetetive testing of the variant in large numbers of samples.
Unknown mutations
One set of methods is based on differential electrophoretic migration of DNA fragments with base substitutions in heteroduplexes, as
in denaturing gradient gel electrophoresis and denaturing high
performance liquid chromatography (DHPLC) assays, or in single
stranded DNA fragments, as in the single stranded conformational
polymorphism assay. Another set of methods is based on the cleavage
of heteroduplex molecules either by enzymes such as T4 endonuclease VII, RNase A and cleavases or chemically (reviewed by Cotton
1993, 1997, Grompe 1997). A screening method for nonsense mutations
based on in vitro transcription and translation of polypeptides has also
been developed (Powell et al. 1993), and recently combined with
mass spectrometry to enable detection of missense mutations as well
(Garvin et al. 2000). Common problems for the scanning methods
have been less than perfect sensitivity and labour intensive procedures. The DHPLC method (Underhill et al. 1997) has gained popularity because of its high sensitivity in single base variation detection
and its semi-automated procedure (O’Donovan et al. 1998), making it
a good choice for large scale variation screening projects (Cargill et
al. 1999).
The Sanger sequencing method benefitted from the introduction
of in vitro amplification and the new thermostable polymerases. Direct
sequencing of PCR amplicons was introduced (Wong et al. 1987) and
modified into a solid phase format (Hultman et al. 1989, Syvänen et
al. 1989) facilitating diagnostic sequencing. Next, the format of the
chain termination reaction itself was modified into a linearly amplifying thermal cycling procedure making the template preparation less
demanding (Murray 1989, Ruano & Kidd 1991). Currently, modified
thermostable sequencing polymerases (Tabor & Richardson 1995),
improved fluorescent dyes (Metzker et al. 1996) and sophisticated
capillary electrophoresis separation (Quesada 1997) have rendered
Sanger dideoxy sequencing robust, and hence many of the techniques
for scanning mutations are becoming obsolete. Developments in MALDI-TOF mass-spectrometry also promise ever faster
separation of sequencing fragments, though read-lengths are currently
limited to less than 100 bp (Roskey et al. 1996, Köster et al. 1996). Sequencing
by synthesis, or pyrosequencing, in which the sequential release of inorganic pyrophosphate formed upon DNA polymerase catalyzed primer
extension is monitored by a luminometric assay (Ronaghi et al. 1996),
is also possible. Demonstrated read-lengths by pyrosequencing are
not currently sufficient for de novo sequencing (Ahmadian et al. 2000),
but the method is useful for EST tag sequencing and screening of
known mutations.
Screening for known sequence variation
The principles of common mutation screening or scoring methods are illustrated in figure 2. The next paragraphs describe these
basic methods, the different reaction formats they are employed in,
and their applicability for multiplex mutation detection. The systems
employing DNA-array format are discussed separately.
PCR-RFLP
Restriction enzyme digested amplification products initially
required a radioactively labelled probe for detection (Saiki et al.
1985), but with improved specificity of the PCR method (Saiki et al.
1988) direct detection of restriction fragments using simple agarose
gel electrophoresis became possible. The drawback of the simple
PCR-RFLP method is that not all mutations change a restriction site
and artificial mismatches introduced by the amplification primers are
sometimes required to screen mutations (Cohen et al. 1988). Due to
its simplicity, PCR-RFLP has been very popular for detection of disease
mutations, and some modifications to increase capacity of gel
electrophoresis have been suggested for large-scale mutation screening (Day & Humphries 1994, Bolla et al. 1995). Slab gel electrophoresis
is, however, difficult to automate and not a suitable separation method
for high-throughput genotyping. Optimally, an internal control to
verify efficient restriction of the PCR product should be included.
While long DNA molecules are still challenging for mass-spectrometric analysis, detection of short PCR-RFLP restriction products has been demonstrated with MALDI-TOF spectrometry (Liu et al.
1995).
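The logic of PCR-RFLP genotyping can be sketched in a few lines: an allele that carries the recognition site yields two fragments, whereas an allele in which the site is destroyed leaves the amplicon intact. The amplicon sequences below are hypothetical and the function is only an illustration of the principle; EcoRI (recognition site GAATTC, cut after the first G) is used as an example enzyme.

```python
def rflp_fragments(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths after complete digestion of a linear amplicon.

    `cut_offset` is the cut position counted from the first base of the
    recognition site (1 for EcoRI, which cuts G^AATTC).
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments


if __name__ == "__main__":
    # Hypothetical 18 bp amplicons differing by one base inside the EcoRI site.
    allele_with_site = "ACGTACGAATTCTTGACC"
    allele_without_site = "ACGTACGAGTTCTTGACC"
    print(rflp_fragments(allele_with_site, "GAATTC", 1))
    print(rflp_fragments(allele_without_site, "GAATTC", 1))
```

A heterozygous sample would show the fragments of both alleles on the gel, which is also why an internal control for complete digestion is useful.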
[Figure panels: Restriction digestion (PCR, restriction enzyme);
Allele specific PCR; Oligonucleotide ligation assay (OLA; ligase);
Allele specific oligonucleotide hybridization (ASO); Minisequencing
(DNA pol; dT, dC). Labels: Target DNA, nt of interest.]
FIGURE 2. Scoring of single nucleotide variants.
Schematic presentation of the commonly employed strategies for single nucleotide
polymorphism scoring. A T to C transition is interrogated in the depicted example.
Allele specific oligonucleotide (ASO) hybridisation
Short oligonucleotide probes designed to hybridize with normal
or mutated target DNA can be used to screen for mutations, as mismatches between probe and target destabilize the hybrid. PCR
amplification provided sufficient enrichment of the target DNA of
interest to allow amplified DNA samples to be immobilized on nitrocellulose filters (Saiki et al. 1986). These “dot-blots” were first used for detection of human point mutations in β-globin and for HLA-DQA1 typing,
and were shown to be suitable for clinical diagnostics as well, in nonradioactive detection schemes using biotin labeled targets and a
colorimetric reaction to demonstrate positive signals (Saiki et al.
1988).
Multiplexing ASO hybridization by immobilization of amplified
fragments, simultaneous hybridisation with several probes followed
by elution of hybridized amplicons and finally sequencing by chemical cleavage to identify underlying mutation carriers has been used
(Shuber et al. 1997). The complex procedure of multiplexing in the
“dot blot” approach can be avoided if a “reverse dot-blot” method is
used.
In reverse dot-blot hybridisation ASO probes are immobilized
and the amplified samples are hybridized to these immobilized
probes (Saiki et al. 1989). Applications of the reverse dot-blot method
for multiplex detection of mutations have indicated that very careful
optimisation of the immobilized probes is required to achieve discrimination of several mutations in the same reaction conditions
(Wall et al. 1995). Another attempt to technically simplify ASO-hybridization is sandwich hybridization in microtiter plates (Cros et
al. 1992). Even commercial filters for ASO mutation detection have
proven to be sensitive to minor changes in the reaction conditions
(Thonnard et al. 1995). The relatively high background from mismatched hybridisation precludes the use of ASO for detection of minority mutations, as the limit of detection is 10% of mutant sequences in
mixed samples (Farr et al. 1992).
One recent suggestion to improve the power and versatility of
ASO hybridisation has been to use peptide nucleic acid (PNA) analogue
probes with mass-spectrometric detection. Human tyrosinase
gene mutations (Griffin et al. 1997) and HLA-DQA1 polymorphisms
(Ross et al. 1997) have been typed using PNA probes. PNA probes
with their neutral backbone hybridize at low ionic strength conditions allowing in principle better discrimination against mismatches.
Also, the backbone does not fragment in the MALDI-TOF conditions,
but multiplexing is limited due to the widely differing thermal
stabilities of PNA probes (Griffin & Smith 2000).
Ligation assay
In favourable reaction conditions, T4-ligase was shown to be
highly discriminative against mismatches occurring near the ligation
junction (Landegren et al. 1988, Alves & Carr 1988, Wu & Wallace
1989). Solid-phase systems for detection of mutations or SNPs based
on hapten labeling with indirect detection or the use of lanthanide
dyes with time-resolved fluorometry have been applied in oligonucleotide ligation assays (OLA) (Nickerson et al. 1990, Samiotaki et al.
1994, Tobe et al. 1996). Ligation products can be detected by mass-spectrometry as well (Jurinke et al. 1996). Thermostable ligases
(Barany & Gelfand 1991) increased the specificity and efficiency of
OLA-assays to a high level (Luo et al. 1996). The ligation assay has
been multiplexed by using fluorescently labelled ligation probes
with differential electrophoretic mobility to distinguish each mutation,
with detection in an automated sequencer (Grossman et al. 1994, Day
et al. 1995, Baron et al. 1996). Recently, optimization of the ligation
assay conditions has enabled allele distinction in the detection of mono-
or microsatellite repeats (Zirvi et al. 1999a, Zirvi et al. 1999b) and
detection of minority mutations down to the 0.2% level (Khanna et al.
1999).
Allele specific PCR
Mismatches at the 3’end of a PCR primer hinder extension of the
primer during PCR. A pair of allele specific PCR primers with 3’ends
complementary to either allele at a variable nucleotide site, and a
common non-discriminatory primer used in parallel PCR reactions
provide a convenient way for mutation detection (Wu et al. 1989,
Newton et al. 1989, Sommer et al. 1989). Primers with allele
specific mismatches in non-terminal positions can also be used for
competitive allele-specific amplification (Gibbs et al. 1989). The
advantage of the assay is that genotype assignment only requires the
detection of a positive amplification signal, for example after separation using simple EtBr-stained agarose gels (Wu et al. 1989).
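The primer design principle of allele specific PCR can be sketched as follows. This is a minimal illustration under simplifying assumptions: the template is given in the same sense as the allele specific primers, the position of the variable nucleotide is known, and the template sequence, primer length and function name are hypothetical. The common non-discriminatory primer on the opposite strand is not shown.

```python
def allele_specific_primers(
    template: str, snp_index: int, alleles: tuple[str, str], length: int = 20
) -> dict[str, str]:
    """Return one primer per allele; each primer ends (3' end) on the variable base."""
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested primer length")
    upstream = template[start:snp_index]  # shared 5' part of both primers
    return {allele: upstream + allele for allele in alleles}


if __name__ == "__main__":
    # Hypothetical target with a C/T variation at index 10 (0-based).
    target = "ACGTTGCAAGCTTAGGCTAACG"
    for allele, primer in allele_specific_primers(target, 10, ("C", "T"), length=8).items():
        print(allele, primer)
```

Each primer would be used in a separate reaction, and the genotype is read from which of the two reactions yields a product.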
Early experiments indicated that mismatch discrimination by
allele specific PCR was highly dependent on reaction conditions and
was particularly poor for purine-pyrimidine mismatches (Kwok et al.
1990). The self-propagating nature of the mismatched extension in the
PCR has hindered development of robust high-throughput assays, and
multiplexing of the reactions has been achieved only after extensive
optimisation of the reaction conditions (Ferrie et al. 1992).
Minisequencing primer extension
Incorporation of a single nucleotide by a DNA-polymerase to the
3’end of a detection primer, which anneals just 5’ to the site of interest, in a sequence specific manner was first presented 10 years ago
(Sokolov 1990, Syvanen et al. 1990, Kuppuswamy et al. 1991).
Multiplexed versions of this method are the subject of this thesis.
Several other modifications have been presented (recently reviewed
by Syvanen 1999), some of which will be discussed in detail below.
Subsequent chapters describe homogenous, “tagged” and
multiplexed gel or array-based minisequencing systems.
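The minisequencing principle can be illustrated with a short sketch. For simplicity the target is written 5'->3' in the same sense as the detection primer, so that the primer matches the bases immediately 5' of the variable site and the incorporated nucleotide equals the next base of that strand; in a real assay the primer anneals to the complementary strand. The sequences and function names are hypothetical.

```python
def minisequencing_call(sense_strand: str, primer: str) -> str:
    """Nucleotide that extends `primer` by one base on the given target."""
    pos = sense_strand.find(primer)
    if pos == -1:
        raise ValueError("primer does not anneal to this target")
    site = pos + len(primer)  # position immediately 3' of the primer
    if site >= len(sense_strand):
        raise ValueError("no template base downstream of the primer")
    return sense_strand[site]


def genotype(allele_strands: list[str], primer: str) -> str:
    """Combine the calls from every allele present in a sample."""
    return "/".join(sorted({minisequencing_call(s, primer) for s in allele_strands}))


if __name__ == "__main__":
    primer = "GCAAG"
    allele_c = "ACGTTGCAAGCTTAGG"  # hypothetical C allele
    allele_t = "ACGTTGCAAGTTTAGG"  # hypothetical T allele
    print(genotype([allele_c, allele_c], primer))  # homozygote
    print(genotype([allele_c, allele_t], primer))  # heterozygote
```

Because the same primer serves both alleles and only the incorporated nucleotide differs, a heterozygote is recognized by the presence of two signals from a single reaction format.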
The early applications of the method already illustrated the
excellent genotype discrimination provided by the fidelity of DNA-polymerases in a single set of reaction conditions for all mutations.
This allowed not only the unambiguous assignment of heterozygotes,
but also the discrimination of minority mutations down to the 0.25% level
as well as quantitation of alleles in mixed samples (Syvanen et al.
1992, 1993; Krook et al. 1992).
Separate reactions for each allele to be detected can be performed using radioactively labeled nucleoside triphosphate analogues in solid-phase (Syvanen et al. 1990, 1992, 1993) with detection
by scintillation counters. If the reaction is carried out in solution
(Sokolov et al. 1990, Kuppuswamy et al. 1991, Krook et al. 1992), the primers must be separated from excess label by electrophoresis and
detected by autoradiography.
Chemiluminescent detection of nucleotides labelled with
haptens such as FITC, DNP and biotin with alkaline phosphatase or
horseradish peroxidase conjugated antibodies avoids the use of
radioisotopes (Syvänen et al. 1990, Harju et al. 1993, Livak et al. 1994,
Pecheniuk et al. 1997, Sitbon et al. 1997, Tuuminen et al. 1997,
Nikiforov et al. 1994). In the pyrosequencing detection system the
release of pyrophosphate upon extension is monitored
luminometrically (Nyren et al. 1993). Mass-spectrometry of the extended primers has also proved to be feasible (Haff & Smirnov 1997).
Size separation of multiplexed minisequencing products in automated sequencers is discussed in the results and discussion section.
Multiplexed minisequencing MALDI-TOF detection was shown (Ross
et al. 1998) for 12 sites using detection primers differing by 2bp or
less utilizing “mass tuning” based on the different composition (= mass)
of different detection primers. Multiplexing was claimed to be extendable to 20 loci simultaneously by current mass-spectrometric
detection technology. In another multiplexed approach Li and colleagues (Li et al. 1999) used detection primers with cleavable bases,
resulting in lower mass of the detection primers, which could allow
a higher degree of multiplexing. A recent strategy for mass-spectrometric detection of primer extension products uses a
3’thiolated detection primer and a subtracted set of α-S-dNTPs (Sauer
et al. 2000). The non-phosphorothioate substrates are degraded and
diluted prior to detection, and despite multiple steps this procedure
is robust as no purification of the extension products prior to measurement is required.
Homogenous assays
Homogenous assays refer to procedures in which the separation
of the genotyping reaction product from unreacted reaction components is not required. This “closed-tube walk-away” assay format is
attractive for genotyping as carry-over contamination can be avoided
in some assays and the number of steps in the procedure is minimized. Assays utilizing intercalating dyes monitor accumulation of
double stranded amplification products during the PCR reaction
(Higuchi et al. 1992, 1993). These methods do not discriminate nonspecific amplification by-products such as primer-dimers from the target
of interest, which limits their usefulness.
Most other assays are based on the fluorescence resonance
energy transfer (FRET) phenomenon (Foster 1965), in which two fluorescent
dyes in close proximity to each other result in quenching of the
emission of one dye (donor, shorter absorption wavelength) and
increased emission of the other (acceptor) when the donor dye is
excited. The problem of non-specific amplification products also
hinders the use of “sunrise” primers with a FRET dye-pair incorporated into the loop-forming primers during synthesis, which quenches
the monitored emission wavelength if no amplification is taking place
(Nazarenko et al. 1997). A “self-probing” primer design was targeted
to avoid problems encountered with the sunrise primers (Whitcombe
et al. 1999).
In the most widespread homogenous genotyping methods the
PCR product spanning the mutated or polymorphic nucleotide is
probed with an internally hybridizing allele specific oligonucleotide
forming a FRET pair in non-hybridized state. The 5’nuclease assay
(Holland et al. 1991) is based on the Taq-polymerase 5’-3’ exonuclease activity, which cleaves the amplification product bound doubly
labelled probe causing an increase in the donor and decrease in the
acceptor dye fluorescence in an allele specific manner (Livak et al.
1995). Molecular beacons refer to stem-loop structured probes with a
quencher-dye pair in the opposite ends of the oligonucleotide which
are in very close proximity in the intact beacon probe (Tyagi et al.
1996). Upon binding to the amplified target this stem-loop organization of the probes is disrupted leading to an increase in the fluorescent dye emission. Both the 5’nuclease and the molecular beacon
assay require careful design of the probes, as the detection of variant
nucleotides is based on allele specific hybridisation. The stem-loop
structure, which has a strong tendency to self-anneal, has been found
to enhance mismatch destabilization compared to target specific
linear probes (Bonnet et al. 1999). Both the molecular beacon and
5’nuclease assay approaches allow limited multiplexing by using
spectrally resolvable common quencher-probe specific dye strategy
(Tyagi et al. 1998, Lee et al. 1999). Recently, a pairwise comparison of
these techniques suggested that the molecular beacon approach is
slightly more discriminative against single base substitutions (Tapp
et al. 2000). A ligation based homogenous assay with one common
dye-labelled primer and allele specific primers having different dyes
both forming resolvable FRET signals upon ligation has also been
devised (Chen et al. 1997a), and has the advantage of utilizing the
clear genotype discrimination by the thermostable ligase. Another
category of homogenous assay formats involves the addition of single
nucleotide extension primers to the amplified target. In homogenous
minisequencing, the inactivation of the PCR polymerase and degradation of PCR nucleotides prior to genotyping reaction are necessary.
The extension primer is labelled with one dye and the incorporated
ddNTPs are differentially labeled again creating an allele-specific
FRET signal (Chen et al. 1997b, 1998).
Fluorescence polarization (FP) detection to achieve homogenous
genotyping assays has been developed based on oligonucleotide
hybridisation detection of allele specific amplification products
(Gibson et al. 1997) or on the minisequencing principle (Chen et al.
1999). FP minisequencing has the advantage of avoiding costly probe
labelling.
A compact miniaturized, homogenous assay format (Genecards™,
Livak et al. 1999, 2nd Int’l SNP Meeting, Hohenkammer, Germany)
should allow extremely simple genotyping, but the throughput is
limited by non-multiplexed reactions, and customized sets of SNPs are
not easily created.
Assays with signal amplification
In order to detect variation at the basepair level in the human
genome with signal amplification the selectivity of the amplification
process must match that of the PCR reaction, which utilizes a pair of
oligonucleotides to unequivocally define the target region in the
genome.
The Qβ-replicase assay was originally described for RNA-hybridisation probes containing the MDV-1 RNA sequence (Chu et al.
1986). After these probes have annealed to the target the hybrids are
isolated and the probe is amplified with Qβ-replicase up to 10⁹-fold
(Lizardi & Kramer 1990). A better signal-to-noise ratio was achieved by using binary
probes and a ligation reaction followed by affinity capture and separation (Tyagi et al. 1996). The Qβ-replicase assay has not been applied
for detection of polymorphisms or mutations in human genomic DNA,
though in principle this should be possible as ligase discriminates
allelic variants well.
Combining thermostable ligase and ligation primers with the
ligation junction near the nucleotide of interest (for both
orientations) is referred to as the ligation chain reaction (LCR), and
was first introduced as a method for sensitive detection of single
nucleotide substitutions in whole genomic DNA (Barany et al. 1991).
In subsequent studies this assay did not perform as well, and detection of sequence variation required either “preamplification” using
PCR (Ferro et al. 1993) or extensive optimization with additional
mismatches in the ligation probes (Fang et al. 1995).
Padlock probes refer to linear oligonucleotides with target complementary sequences at the ends and a non-complementary linking
segment in between. Upon binding to the target the probes are circularized by a ligase in a template specific manner (Nilsson et al. 1994).
The padlock probes catenate with the target sequence and specific,
localized detection of human metaphase chromosome centromere
repeats could be demonstrated (Nilsson et al. 1997). Combination of
the padlock probes with signal amplification using rolling circle
replication (Fire and Xu 1995) with a strand displacing DNA polymerase could, in principle, allow mutation detection in just single target
molecules (Baner et al. 1998). Modifications of the padlock probe
detection and φ29 DNA polymerase mediated rolling circle amplification were shown to be sensitive and specific for single molecule
counting (Lizardi et al. 1998).
The invader assay is based on flap endonuclease (FEN) enzymes,
which recognize and cleave structures formed by two overlapping
oligonucleotides hybridized to a target DNA strand (Lyamichev et al.
1999). PCR based SNP scoring by the Invader assay provides good
discrimination for all types of nucleotide substitutions (Mein et al.
2000). The principles of the invader assay and of the modified “Invader
Squared Assay”, which provides exponential amplification, are illustrated in figure 3.
Elegant genotyping of 12 SNPs separately with the invader squared
assay was achieved using genomic DNA as a target and MALDI-TOF
detection in a 5 h procedure (Griffin et al. 1999). Branched DNA
probes have also been shown to be able to detect non-amplified DNA
very sensitively (Collins et al. 1997), but lack the specificity required for
single nucleotide polymorphism detection.
FIGURE 3. Invader assays. An “invader” oligonucleotide upstream and a
“probe” oligonucleotide downstream of the site of interest are annealed
to the target. The two oligonucleotides overlap at the site of interest
and the probe has a 5’ non-specific tail. This creates a target structure
for a thermostable flap endonuclease (FEN). FEN cleaves the probe 3’ to
the overlapping nucleotide, releasing the 5’ tail part of the primary
probe, the accumulation of which can be monitored. When the reaction is
carried out near the Tm of the signal oligo there is a constant turnover
of the oligonucleotide binding to the target DNA, thus amplifying the
signal. This assay can be used efficiently for scoring SNPs in amplified
DNA (Mein et al. 2000). Adding a secondary target, an “arrestor”
oligonucleotide and a secondary probe, and using the primary cleavage
product as an invader oligonucleotide, approximately squares the amount
of amplification when the secondary cleavage product is monitored. The
Invader squared assay resulted in million-fold amplification of the
signal, and thus detection of a mutation in a non-amplified genomic
target was shown to be feasible (Griffin et al. 1999).
[Figure panels: Invader reaction; “Squared” reaction. Labels: Target
DNA; Primary invader probe; Primary probe; Cleavage site; Primary
cleavage product; Secondary target; Secondary probe; Secondary Invader;
Secondary cleavage product; Non-cleaved probe bound to “arrestor” probe.]
DNA-array technology
DNA-microarrays were originally introduced for sequencing and
genotyping, which are discussed in detail in the following chapters,
but this assay format has since found a growing number of applications listed in Table 1. The attractiveness of the array-technology is
based on miniaturized size, parallel nature and solid-phase format,
serving to minimize reagent consumption, increase the number of
assays carried out in parallel and enable automation of the reaction
and read-out.
Table 1. Applications of microarray technology

APPLICATION                          TYPE OF ARRAY                          REFERENCES
Comparative sequencing               Synthesized or spotted                 See below
                                     oligonucleotides
Monitoring mRNA expression levels    Spotted amplified cDNA fragments       Schena et al., Brown and Botstein
                                     Oligonucleotides (Affymetrix)          Lockhart et al.
Comparative Genomic Hybridization    Amplified cDNA fragments               Behr et al., Pollack et al.
                                     Cloned genomic DNA                     Solinas-Toldo et al.
In situ detection of DNA, RNA        Embedded tissue biopsies               Kononen et al.
or proteins
Genomic mismatch scanning            PCR amplified YAC and PAC clone        Cheung et al.
                                     fragments
DNA-protein interaction assays       Oligonucleotides (Affymetrix)          Bulyk et al.
Screening of antisense therapeutics  Oligonucleotides                       Milner et al.
Phenotypic analysis of yeast         Oligonucleotides (Affymetrix)          Shoemaker et al.
deletion mutants
Mapping/ordering genomic clones      Oligonucleotide (Affymetrix)           Sapolsky & Lipshutz
Origins of DNA microarray concept
A series of theoretical papers and patent applications published
only a decade ago by several independent groups introduced the
sequencing by hybridization (SBH) approach (Drmanac &
Crkvenjakov 1987, Southern 1988, Bains & Smith 1988, Lysov et al. 1988,
Khrapko et al. 1989, Bains 1991). The original SBH technology was
intended for de novo sequencing by hybridization, which was believed to have higher throughput and be easily automatable. This was
prompted by the launch of Human Genome Project and a common
view that the Sanger dideoxy method could not be scaled up for
increased sequencing speed required.
The simple idea of reading a sequence based on the hybridisation reaction onto its constituent DNA-oligomers was presented in
two formats. Format I had target DNA immobilized on solid support
followed by sequential queries using labeled hybridisation probes
(Strezoska et al. 1991, Drmanac et al. 1993). Format II had a large
number of oligonucleotide probes immobilized either on polyacrylamide gel pads (Khrapko et al. 1989, 1991) or directly synthesized onto
derivatized glass surface (Southern et al. 1992) followed by hybridisation of a labeled target as previously described for “reverse dot-blot
hybridization”.
De novo sequence analysis was complicated by the short oligonucleotide probes (usually octamers), which required low stringency
hybridisation conditions, and by poor predictability of the result due to
secondary structures within the target. Construction of complete n-mer arrays with sufficient probe length to alleviate the problem
(for example, a complete set of 15-mers would have 10⁹ probes) was not
technically feasible due to the unrealistic number of different
oligonucleotides required (Southern 1996). The problem of low yield
hybridisation of AT-rich probes was recognized at an early stage of
SBH trials (Southern et al. 1992). Suggestions to improve the SBH
strategy include “stacking” hybridisation, with short 5-mer oligos
used with the octamer arrays to increase duplex stability and mismatch discrimination (Broude et al. 1994, Yershov et al. 1996,
Stomakhin et al. 2000). Due to these difficulties in de novo sequencing
the interest shifted to multiplex genotyping and comparative
sequencing. The following chapters describe different strategies for
comparative sequence analysis on DNA-arrays.
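The scale problem of complete n-mer arrays, and the idea of reading a target through its constituent oligomers, can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function names and the example target are hypothetical, and perfect, error-free hybridisation is assumed.

```python
def probe_count(n: int) -> int:
    """Number of different probes on a complete n-mer array (four bases per position)."""
    return 4 ** n


def spectrum(target: str, k: int) -> set[str]:
    """Set of k-mers contained in the target, i.e. the probes expected to give a signal."""
    return {target[i:i + k] for i in range(len(target) - k + 1)}


if __name__ == "__main__":
    print(probe_count(8))   # complete octamer array
    print(probe_count(15))  # complete 15-mer array, about 10^9 probes
    print(sorted(spectrum("ACGTACGGT", 4)))
```

The spectrum of a target is only a small subset of the complete array, and repeated k-mers within the target appear once in the set, which is one source of ambiguity when a sequence is reconstructed from hybridisation data alone.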
Array construction
DNA arrays are constructed by deposition and immobilization of
different polynucleotides in spatially addressable sites on a 2-D surface at high density. If a certain continuous DNA fragment is to be
scanned for sequence variation a tiled array design is employed,
which contains overlapping sets of oligonucleotides designed to
interrogate successive basepairs in the target sequence. Monitoring
recurrent variation at several different targets or sites within the same
target is usually carried out with probe sets interrogating only these
sites of interest. A third approach is to manufacture all possible sequences of a given length onto a single array, in a generic array design representing the original SBH concept that can be used for selective resequencing as well.
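A tiled design can be sketched as follows. The sketch assumes one common layout, in which each position of the target is interrogated by four probes that differ only at their central base; this layout, the probe length and the function name are assumptions made for illustration, and the probes are written in the sense of the target rather than as its complement.

```python
BASES = "ACGT"


def tiled_probes(target: str, length: int = 5) -> list[dict[str, str]]:
    """For each interrogated position return four probes differing only at the centre base."""
    if length % 2 == 0:
        raise ValueError("probe length must be odd so that a central base exists")
    half = length // 2
    tiles = []
    for centre in range(half, len(target) - half):
        window = target[centre - half:centre + half + 1]
        # Substitute every base in turn at the central (interrogating) position.
        tiles.append({b: window[:half] + b + window[half + 1:] for b in BASES})
    return tiles


if __name__ == "__main__":
    # Hypothetical 7-base target scanned with 5-mer probes.
    for tile in tiled_probes("ACGTACG"):
        print(tile)
```

Successive tiles overlap by all but one base, so a single substitution in the target changes the expected signal pattern of several neighbouring probe sets.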
In situ synthesis
The array can be manufactured either by combinatorial in situ
synthesis or by deposition of premade probes on a derivatized surface,
sometimes referred to as “off-chip” or “linear” manufacture. In situ synthesis by standard phosphoramidite chemistry on derivatized glass
surfaces was pioneered by Southern and colleagues (Maskos & Southern 1992, Southern et al. 1992, Maskos & Southern 1993). The current in
situ-synthesizer uses teflon-lined synthesis cells to apply reagents on
the glass surface. The synthesis cell is moved along the glass surface
and different parts of the surface are exposed to different
phosphoramidites. The arrays produced this way will be scanning
arrays with all possible probe lengths along the synthesis cell path
(Southern et al. 1994). Another in situ approach utilized standard
phosphoramidites on a derivatized polypropylene surface (Matson et
al. 1994), which were delivered by a multichannel fluidic system with
direct contact to the surface (Matson et al. 1995). Neither of these
approaches has been applied in large-scale genotyping. In principle
the synthesis method by Southern is limited to producing tiled sets of
oligonucleotides, which are not suitable for SNP scoring applications.
A sophisticated method for synthesis of biopolymers was introduced by Fodor and colleagues (Fodor et al. 1991). The method combined semiconductor-based photolithography and solid-phase
chemical synthesis to achieve highly parallel in situ synthesis of
biopolymers on small glass surfaces. Using phosphoramidites with
photolabile 5’ protective groups Affymetrix demonstrated the synthesis of 256 different octanucleotides on a 1.28 cm² surface in just 16
chemical coupling steps (Pease et al. 1994) - DNA-arrays were now
called “DNA-chips”. The drawback of producing high-density arrays
by the Affymetrix method has been the low step-wise yield of synthesis, varying from 92 to 94%, effectively limiting the probe lengths to 20-25 bp (McGall et al. 1997). A novel approach to utilize photolithographic technology for manufacture of DNA-arrays is based on
photoresists that allow the use of regular phosphoramidites to make
DNA-oligos with densities of up to 10⁶/cm² and provides high yields
(McGall et al. 1996, Wallraff et al. 1997).
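Why a step-wise yield of 92-94% limits the practical probe length can be seen from a simple calculation. The sketch below assumes that each nucleotide requires one coupling step and that the yield is the same at every step, so that the fraction of full-length probes is the step-wise yield raised to the power of the probe length; the function name is hypothetical.

```python
def full_length_fraction(stepwise_yield: float, probe_length: int) -> float:
    """Fraction of probes reaching full length, assuming one coupling per nucleotide
    and a constant step-wise yield (both simplifying assumptions)."""
    if not 0.0 < stepwise_yield <= 1.0:
        raise ValueError("stepwise_yield must be in (0, 1]")
    return stepwise_yield ** probe_length


if __name__ == "__main__":
    for y in (0.92, 0.94, 0.99):
        for n in (8, 20, 25):
            print(f"yield {y:.0%}, {n}-mer: {full_length_fraction(y, n):.1%} full length")
```

Under these assumptions only roughly one probe in five to one in eight reaches full length for a 25-mer at 92-94% yield, whereas a higher yield chemistry would keep most probes intact at the same length.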
Spotted arrays
The “off-chip” synthesis of oligonucleotides and deposition of
these with various methods provides a simple method for DNA-array
manufacture accessible also to less specialized chemistry laboratories. Photopolymerized gel-pads can be produced in relatively simple
steps and oligonucleotides can be covalently immobilized to these
pads (Khrapko et al. 1991, Vasiliskov et al. 1999, Proudnikov et al. 1998)
either manually (Guschin et al. 1997) or with the aid of a robotic pin-device (Yershov et al. 1996). The gel-pads can be as small as 25 by
25µm in size, in principle allowing arrays with very high densities,
which is somewhat limited by the requirement of alignment with the
deposition device. Solid glass surfaces have been derivatized with
epoxysilane (Lamture et al. 1994), phenylisothiocyanate (Guo et al.
1994), or mercaptosilane (Rogers et al. 1999) to allow covalent attachment of oligonucleotides via amino- or disulfide groups, respectively.
Recently, chemistries creating dendrimeric structures on glass surfaces increasing the surface area binding the oligonucleotides (Beier
& Hoheisel 1999) and immobilisation of acrylamide modified
oligonucleotides via co-polymerization (Rehman et al. 1999) have
been presented. In cDNA arrays with longer polynucleotide probes
immobilization on polylysine coated glass slides takes place through
electrostatic interactions between the surface and the negatively
charged DNA backbone (Schena et al. 1995).
Robotic devices based on contact printing pins (Schena et al.
1995, Shalon et al. 1996), inkjet dispensing heads (Lemmo et al. 1998,
Stimpson et al. 1998, Okamoto et al. 2000), nanoliter dispensing needles (Graves et al. 1998) and electrospray deposition (Morozov &
Morozova 1999) can be applied to deliver minute droplets of DNA-probes on surfaces, achieving densities of up to 10⁵ probes/cm².
Special array designs utilize electronic addressing of charged DNA-probes to affinity capture sites (Sosnowski et al. 1997), selective polymerization of acrylamide on optical fiber tips (Healey et al. 1997) or
randomly ordered, labeled microspheres binding DNA-probes on
optic-fibers (Steemers et al. 2000).
Array reading
Early applications of the DNA-arrays invariably used 32P-labeled
probes and phosphorimaging detection systems. The radioisotopic
labeling with 33P coupled with phosphorimaging detection is still
practical in array reading (e.g. Southern et al. 1994, Drmanac et al.
1998, II-IV, Mir et al. 1999). Development of fluorescence labeling
schemes and detection systems based on an epifluorescence
confocal scanner with photo-multiplier-tube (PMT) detector (Fodor et
al. 1991, Pease et al. 1994, Lipshutz et al. 1995) or a fluorescence microscope coupled to a cooled CCD camera (Mirzabekov 1994, Yershov
1996) led to improved resolution. Later it became apparent that, in
order to utilize hybridisation-based mutation scanning on high-density arrays, an internal control labeled with one fluorophore and a test
sample labeled with another fluorophore were needed to achieve sufficient
discrimination (Chee et al. 1996, Hacia et al. 1996). Similar dual colour
imaging is also used in expression array methods to normalize the
results (Schena et al. 1995). There are now several commercial providers of confocal array scanners, usually with two or four excitation
sources suitable for fluorophores absorbing at 488-650nm and emitting at 515-690nm.
Several other approaches for array read-out have been suggested, but are not yet widely applied. Direct integration of the DNA-array
with a charge-coupled device (CCD) (Eggers et al. 1994) or
phototransistor sensing element (Vo-Dinh et al. 1999) offers the promise
of sensitive detection of fluorescent signals in a highly compact
format. Arrays formed by bundles of optic fibers with probes attached
to their distal ends directly (Healey et al. 1997) or on beads attached
to the fibers (Steemers et al. 2000) have been described. Evanescent
wave excitation for detection of particulates near the surface is created by conducting light into the edge of a wave guide, in which light
propagates by total internal reflection. This approach was demonstrated for DNA-array applications by Stimpson and colleagues
(Stimpson et al. 1995). Recently, a system with four excitation lasers
utilizing the evanescent wave principle for detection of primer extension with labeled dideoxynucleotides was presented (Kurg et al.
2000). Developments in mass-spectrometric techniques are now being
realized also in array-based genotyping, offering potential for detection without labeling (Little et al. 1997, Tang et al. 1999). An entirely
different detection system on solid gold surfaces, based on differential
charge transduction through matched vs. mismatched DNA duplexes
measured by cyclic voltammetry, was recently suggested
(Kelley et al. 1999); it remains to be seen whether this elegant approach is applicable in true biological assays.
TABLE 2. Comparison of commonly employed DNA-microarray labelling and detection systems
LABELLING AND DETECTION SYSTEM | ADVANTAGES | SENSITIVITY | RESOLUTION
S, P and P analogues [isotope mass numbers not recoverable]; Phosphorimager | Chemical similarity to natural bases; low cost instrumentation | - | -
Fluorescent labels; Confocal scanner | Versatility with different fluorophores (i.e. internal controls); instrumentation well available | - | -
Fluorescent labels; CCD element based readers | As above; fast, enable real time measuring | - | -
(Entries marked "-" are not recoverable from the source.)
Comparative sequencing on DNA-microarrays
It is surprising how much excitement and hope have been invested in DNA-array technology (Barinaga 1991, Nature Genet.
[Editorial] 1996;14:367, Nature Genet [Editorial] 1998;18:195, Marshall
& Hodgson 1998, Lander 1999) when in practice the development has
been rather slow, particularly for genotyping SNPs. There have been
only a handful of studies extending the analysis beyond the “proof-of-principle” level. In the next two chapters, arrays for scanning
known sequence for unknown sequence variants and arrays for assaying known polymorphisms or mutations are discussed separately.
Arrays in sequence scanning
The high-density light-directed synthesized arrays by Affymetrix
(Santa Clara, CA) have dominated as the primary platforms for
resequencing arrays. Table 3 summarizes details of the published
studies of resequencing on the Affymetrix chips.
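Table 3. Resequencing on high-density oligonucleotide arrays
APPLICATION | BP SCANNED | NO OF PROBES PER BASEPAIR | NO OF SAMPLES STUDIED | REFERENCE
Exon 11 of CFTR gene | - | - | - | Cronin et al. 1996
HIV-1 pr gene | - | - | - | Kozal et al. 1996
Mitochondrial genome | - | - | - | Chee et al. 1996
Exon of BRCA1 gene | - | - | - | Hacia et al. 1996
Human SNP discovery | - | - | - | Wang et al. 1998
Coding exons of ATM gene | - | - | - | Hacia et al. 1998a
Human coding SNP discovery | - | - | - | Cargill et al. 1999
Human coding SNP discovery | - | - | - | Halushka et al. 1999
Mouse SNP discovery | - | - | - | Lindblad-Toh et al. 2000
(Numeric entries marked "-" are not recoverable from the source.)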
The earliest versions of such arrays were designed to interrogate
CFTR exon 11 sequence with a minimal set of tiled 15-mer probes
each synthesized on 365 by 365µm synthesis sites (Cronin et al. 1996).
At this point it was already stated that the simplest form of scanning
array would not be sufficiently sensitive to detect heterozygous mutations unambiguously.
Regions of the HIV-1 genome were assayed on the Affymetrix
chips, as resistance to antiviral therapy is known to be mediated by
mutations in the genes coding for the drug targets (Kozal et al. 1996).
Array resequencing with a considerably more complex design
was shown in this study to be equal in accuracy to Sanger
dideoxy sequencing. However, evaluation of the commercial
Affymetrix HIV-1 chip demonstrated that mutations present in 50% of the
studied viral population could not be detected reliably (Gunthard et
al. 1998).
Chee and colleagues applied the Affymetrix chips to a considerably larger target - the entire mitochondrial genome - using over
130,000 different probes to interrogate the sequence. A two-colour
labeling strategy to include an internal control was employed. A good
genotyping result with 98-99% accuracy in base calling was achieved
in this haploid genome.
The first large human genomic application of the high-density
arrays was resequencing of an exon of the BRCA1 gene. Interpretation
of the results was based on two different scoring methods: “gain-of-signal”, in which an increase of the mutant probe signal as compared
to the wild-type control signal is seen, and “loss-of-signal”, which assays the
normalized test and control signals over a larger region, as a true
mutation is expected to form a footprint of decreased signal at all
probes overlapping the mutant site. Due to strong mismatched hybridization at certain sites, the more complicated “loss-of-signal”
analysis was found to give better results, and 14/15 tested samples
were scored correctly.
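The “loss-of-signal” idea can be illustrated with a minimal Python sketch. It is not the scoring procedure of the cited study: it assumes one tiled probe per base position, signals that are already normalized, and a mutation that depresses the test/control ratio of every probe overlapping it. The function name, the probe length and the ratio cut-off are hypothetical choices made only for illustration.

```python
from typing import List


def loss_of_signal_sites(test: List[float], control: List[float],
                         probe_len: int = 25, max_ratio: float = 0.6) -> List[int]:
    """Return positions where every overlapping probe shows a reduced test/control ratio.

    Assumptions (illustrative, not from the source): probe i is centred on
    position i, all control signals are > 0, and max_ratio is an arbitrary cut-off.
    """
    ratios = [t / c for t, c in zip(test, control)]
    half = probe_len // 2
    hits = []
    for pos in range(half, len(ratios) - half):
        # Probes centred within +/- half of pos all overlap pos.
        window = ratios[pos - half: pos + half + 1]
        if max(window) <= max_ratio:
            hits.append(pos)
    return hits


# Made-up signals: a footprint of five depressed probes around position 5.
control = [100.0] * 12
test = [100.0] * 12
test[3:8] = [40.0] * 5
print(loss_of_signal_sites(test, control, probe_len=5))
```

Requiring the whole window to be depressed is the simplest possible footprint criterion; real array data would likely need a noise-tolerant rule.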
In a gigantic SNP-survey with over 10⁹ oligos synthesized on 149
chips, covering a 2Mb stretch of genomic DNA, the sensitivity and
specificity were both reported to be <90% (Wang et al. 1998).
Similarly, a sequence survey of the ATM gene (Hacia et al. 1998a)
and an evaluation of the commercial p53 gene array (Ahrendt et al. 1999)
resulted in sensitivities of 88-91%. Resequencing for SNP discovery
consequently used DHPLC in parallel with high-density arrays to
achieve higher sensitivity and specificity (Cargill et al. 1999); alternatively, SNPs detected on the array were treated as “candidate SNPs”
(Halushka et al. 1999). Mouse SNP discovery by high-density arrays
was reported to be more successful (Lindblad-Toh et al. 2000), likely
due to the use of inbred homozygous mouse strains.
Improved performance of poorly hybridizing probes on high-density arrays was achieved by using 5-methyluridine triphosphates in
the target (Hacia et al. 1998b). Affymetrix has also sought to extend
the utility of their sequence scanning arrays by constructing generic 8-9-mer arrays and using a ligation reaction to detect sequence differences in a test versus a control sequence (Gunderson et al. 1998). The
performance of the 9-mer array was excellent with targets up to 1.2kb
in size. Similarly, Head and colleagues (Head et al. 1997) utilized the
fidelity of a DNA-polymerase in model experiments scanning a 33-bp
stretch of the p53 gene using single nucleotide primer extension,
showing detection of variants present in as little as 5% of the target
sequence. An intriguing approach would be the use of polymerase
extension on high-density arrays, which might be possible with the
alternative array synthesis procedures (McGall et al. 1996, Wallraff et
al. 1997) or a novel inversion strategy for primers attached at their
3’ end (Kwiatkowski et al. 1999).
The less popular SBH format involving immobilisation of the
targets on filters and successive interrogation with short oligonucleotide probes has been shown to be effective in determining sequence variation in stretches of cloned DNA (Drmanac et al. 1998).
The use of this strategy is complicated by the need for several thousand hybridisation reactions to deduce the sequence, and is thus
limited to large groups automating the successive hybridisation steps
(Drmanac & Drmanac 1999).
Scoring SNPs or mutations on DNA-microarrays
Genotyping of previously characterized SNPs at several different
loci by DNA-microarrays has different key requirements than comparative sequencing. The template preparation to enrich the several genomic fragments spanning the variants of interest is more complex,
and the need for virtually 100% specificity in allele scoring is highly
demanding.
ASO-hybridization based methods
In two short reports Southern and Maskos first described optimization of ASO probes for detection of three beta-globin alleles
(Maskos and Southern 1993), followed by synthesis of the optimal
probes on a solid surface and genotyping four samples (Maskos and
Southern 1993). A convenient reaction chamber for analysis of up to
50 samples for 100 mutations in parallel and the reuse of the probe
arrays were suggested, though to date these features have not been
used in practice. Similarly, Mirzabekov and colleagues have suggested hybridization for mutation screening on oligonucleotides
immobilized to gel pads using also the beta-globin gene as a model.
“Stacking” hybridization probes, two-colour hybridisation and very
short targets (32-bp) were used to obtain five genotypes at three
variable sites (Yershov et al. 1996). Alternatively, melting curves of the
hybrids were measured in real-time to improve allelic discrimination,
which was assessed using five amplified targets and several synthetic
targets on arrays for five different nucleotide positions (Drobyshev et
al. 1997). Various modifications of ASO-hybridisation based microarray
genotyping have been presented. Guo et al. used glass supports and
determined that spacer length, surface density and use of single-stranded
target were important for good hybridisation yields. Five
tyrosinase gene mutations could be determined simultaneously from amplified,
fluorescently labeled genomic samples rendered single-stranded,
using optimized hybridization probes (Guo et al. 1994).
Another study did not find spacer arms separating the ASO-probes
from the glass surface critical, but only three samples were tested for
two mutations (Beattie et al. 1995). In situ synthesized PNA probes
were evaluated in model systems indicating some difficulties in
predicting behavior of DNA-PNA hybrids and thus limiting their usefulness in genotyping for the time being (Weiler et al. 1997). Electronically addressable electrode array hybridization was first studied
using model systems for DNA and PNA probes (Sosnowski et al. 1997,
Edman et al. 1997). The assay was applied for genotyping three
mannose binding protein (MBP) gene SNPs and one IL-1β gene SNP
with validation using 35 blinded samples regarding the MBP SNPs
(Gilles et al. 1999). Despite the claimed advantages of the electronically addressable arrays and the use of “electronic stringency” it
remains questionable whether these arrays will be useful for routine
genotyping as they are complex to manufacture and assay procedures
require specialized equipment. Similar constraints apply to the use of
the randomly ordered fiber-optic arrays with probes immobilized to
coded beads (Steemers et al. 2000).
All the above variants of ASO-hybridization on DNA microarrays
provided proof-of-principle for the approach, but have not been implemented in practice. High-density arrays generated by light-directed synthesis have been applied in larger scale studies. CFTR
alleles were interrogated on DNA-arrays after amplification of two
regions of the CFTR gene, followed by an asymmetric labeling PCR
reaction, fragmentation and dilution prior to hybridization (Cronin et
al. 1996). The assay was evaluated by typing 32 known and 10 blinded
samples and yielded 3-5-fold discrimination against mismatches in
these low complexity targets, though a threshold of 1.4-fold difference
was used. SNP-scoring was carried out by Wang and colleagues in a study
in which the SNPs were amplified in 46-plex PCR reactions. Only <400
out of the >500 sites performed well enough to allow genotype assignment. One fourth of the well performing SNPs were validated in
three individuals and two CEPH families with a good success rate
(98%) and high confidence assignment of genotypes (99.9%). The
same arrays were applied by Hacia et al. to determine allele frequencies at 214 markers using pooled samples from different populations
(Hacia et al. 1999), however details of the procedure to generate the
allele frequencies were not provided. A similar approach was used to
genotype Arabidopsis thaliana SNPs (Cho et al. 1999). Almost half of the
markers had to be discarded in this study due to imperfect discrimination of genotypes on the arrays. These studies indicate that there
will be a considerable number of polymorphisms not amenable to
high-density array scoring: at present it is unclear whether it will be
possible to predict which sites will be difficult to genotype. Furthermore, high-level PCR multiplexing is “costly” as 10-20% of markers
are lost at this stage. A commercial array (HuSNP™, Affymetrix, Santa
Clara, CA) with 1494 different SNP specific probe sets is claimed to
yield 1200-1300 usable genotypes per sample (Figure 4A), translating
into a success rate of 80-87% (Genechip® HuSNP™ Mapping Assay,
Technical Note No.1, Part No 700318, Affymetrix). It is clear that higher
success rates are required for assaying coding SNPs and mutation
panels.
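The quoted success rate follows directly from the figures given for the commercial array. The short Python check below only redoes that arithmetic; the function name is a hypothetical helper and not part of any vendor software.

```python
def success_rate(usable_genotypes: int, total_probe_sets: int) -> float:
    """Percentage of SNP probe sets that yield a usable genotype."""
    return 100.0 * usable_genotypes / total_probe_sets


TOTAL_PROBE_SETS = 1494  # SNP-specific probe sets on the array, as stated in the text
low = success_rate(1200, TOTAL_PROBE_SETS)
high = success_rate(1300, TOTAL_PROBE_SETS)
print(f"{low:.0f}-{high:.0f}%")  # prints "80-87%"
```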
DNA-modifying enzymes in microarray genotyping
In “traditional” reaction formats DNA-polymerases and ligases
have improved genotype discrimination under uniform reaction conditions, making multiplex genotyping more feasible. The use of DNA-polymerases in improving microarray genotyping was the main target
of this thesis: work by others is discussed below, and a more detailed
comparison of approaches is provided in the Results and Discussion
section. In minisequencing on DNA-microarrays, detection primers are
immobilized at their 5’ end and designed to anneal immediately adjacent to the variant nucleotide on the target; DNA-polymerase incorporates labeled ddNTPs
complementary to the site of interest with high specificity (II). The
same method has been denoted arrayed-primer-extension (APEX,
Shumaker et al. 1996, Kurg et al. 2000), nested GBA (Head et al. 1997)
and “multibase single stranded primer extension” (Dubiley et al.
1999). In the model experiment five successive basepairs in the HPRT
gene were scanned using three ssDNA samples, and extension was
carried out with T7 DNA-polymerase incorporating 32P-dNTPs in four
parallel reactions (Shumaker et al. 1996). Recently, single nucleotide
primer extension was applied for analysis of 10 allelic variants of the
β-globin gene (Figure 4B and Kurg et al. 2000). The primer extension
was performed in both orientations of the template with a set of four
ddNTPs each labeled with a different fluorophore using a modified
thermostable DNA polymerase. Results of the primer extension reactions were read with a four-colour evanescent wave laser excitation
imaging system. The average genotype discrimination was nearly 40-fold in the nine tested samples.
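A fold-discrimination figure of this kind can be computed in several ways. The Python sketch below shows one plausible definition, the signal of the expected nucleotide divided by the strongest signal among the other three channels. This definition, the function name and all intensity values are assumptions made for illustration; the cited study may have defined and measured discrimination differently.

```python
from statistics import mean
from typing import Dict


def discrimination(signals: Dict[str, float], expected_base: str) -> float:
    """Ratio of the expected-base signal to the strongest unexpected signal.

    Assumes the strongest unexpected signal is > 0 (hypothetical helper).
    """
    background = max(v for base, v in signals.items() if base != expected_base)
    return signals[expected_base] / background


# Made-up four-colour readings for three homozygous samples.
samples = [
    ({"A": 4000.0, "C": 100.0, "G": 80.0, "T": 90.0}, "A"),
    ({"A": 60.0, "C": 50.0, "G": 3000.0, "T": 75.0}, "G"),
    ({"A": 120.0, "C": 2400.0, "G": 100.0, "T": 110.0}, "C"),
]
ratios = [discrimination(sig, base) for sig, base in samples]
print(ratios, round(mean(ratios), 1))
```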
[Figure 4: panels A and B]
FIGURE 4. Images of SNP scoring by (A) ASO hybridization on high-density DNA arrays and (B) single nucleotide primer extension on spotted primer arrays.
(A) A small portion of the HuSNP™ mapping assay (Affymetrix, Santa Clara, CA) is shown. Redundant sets of ASO probes for SNPs are synthesized on the DNA chip; genotypes can be scored from each sample. Image courtesy of Affymetrix, Inc.
(B) An image of an assay for scoring six SNPs by single nucleotide primer extension with four differentially labeled ddNTPs with TIRF detection. Duplicate spots of detection primers for each nucleotide interrogated from both orientations are printed on an activated glass slide. The four emission wavelengths of the fluorophore-labelled ddNTPs are collected separately. Genotype scoring for the illustrated example is provided in the table below. Image courtesy of Dr. Ants Kurg, Tartu, Estonia.
Dubiley et al. used a similar procedure on gel-pad arrays, with a
single fluorophore and the assay divided into four separate reactions, to detect
seven β-globin alleles in eight patients (Dubiley et al. 1999). The reported
genotype discrimination was lower, with up to 20% misincorporation
rates. Detection of primer extension products by mass spectrometry
for three polymorphisms has been demonstrated on silicon wells
(Tang et al. 1999), in which the PCR product was immobilized
covalently. Despite the speed of the mass-spectrometric measurement
itself, the study by Tang and colleagues suffered from a cumbersome
reaction procedure and only limited miniaturization (4 cm² chip with
36 wells): further studies are required to demonstrate the high-throughput
potential of array-based MALDI-TOF mass spectrometry in genotyping
in practice.
Another polymerase-assisted primer extension assay is based on
two immobilized detection primers with the 3’ end complementary to one
or the other allele, denoted as “multiprimer extension
assay” (Dubiley et al. 1999) or “allele specific extension assay” (V). A
total of 56 genotypes were produced at seven sites using the
minisequencing or the multiprimer extension assay with DNA-polymerase and fragmented dsDNA templates on gel-pad arrays,
yielding similar base calling accuracy (Dubiley et al. 1999).
Ligation has thus far been applied for screening known mutations
on DNA-microarrays in one published study (Gerry et al. 1999), in
which a universal “zip-code” array was used. Zip-coded or “tagged”
arrays (Figure 5) are adopted from the molecular bar-coding strategy
of yeast deletion strains (Shoemaker et al. 1996). An array with
nine “zip-code” oligonucleotides was produced, and allele specific
ligation probes with a 5’ sequence complementary to one of the array
immobilized oligonucleotides were used in the ligation reaction. Following the ligation reaction in solution, the ligated products were hybridized
on the zip-code arrays, and nine alleles at three K-ras sites could be
demonstrated.
Groups at Affymetrix (Robert Lipshutz, The Microarray Meeting,
Scottsdale, AZ Sept. 1999) and Whitehead Institute (Hirschhorn et al.
ASHG 1999 Annual Meeting A1418) are developing the tagged single
basepair extension method (TAG-SBE) (Figure 5). The method is
based on a generic array of “tag”-sequences and a multiplex
minisequencing reaction performed in solution using differentially
labeled F-ddNTPs, with detection primers each carrying a unique 5’
sequence complementary to one of the tag-sequences immobilized on
the array. Following the minisequencing reaction, the primers are
hybridized on the tagged arrays and the incorporated nucleotides are
identified by a multicolour array-reader. The advantage of the TAG-SBE or zip-code ligation approach is the generic array design suitable
for any set of SNPs. Furthermore, high-density arrays can be applied
with DNA-polymerase based allele discrimination. The cost of the
genotyping will not be lower than with primer extension directly on
the array (Kurg et al. 2000), since the same amount of primer synthesis is required for
each new set of SNPs, and the multiplexing capacity in a solution reaction is most likely not higher than on arrays, as there are
more interacting DNA sequences in the reaction mixture.
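The read-out logic of a tagged array can be sketched in a few lines of Python. The array position (tag) identifies the SNP, and the fluorophore channel identifies the incorporated nucleotide. The tag names, SNP names, intensities and the 30% calling threshold are all hypothetical; this is an illustration of the principle, not the software of any of the groups mentioned.

```python
from typing import Dict, Tuple

BASES = ("A", "C", "G", "T")  # one fluorophore channel per labeled ddNTP


def decode_tag_array(
    tag_to_snp: Dict[str, str],
    signals: Dict[str, Tuple[float, float, float, float]],
    min_fraction: float = 0.3,  # arbitrary illustrative threshold
) -> Dict[str, str]:
    """Map each tag spot to its SNP and call the incorporated nucleotide(s)."""
    genotypes = {}
    for tag, channels in signals.items():
        strongest = max(channels)
        called = [
            base for base, value in zip(BASES, channels)
            if strongest > 0 and value >= min_fraction * strongest
        ]
        genotypes[tag_to_snp[tag]] = "/".join(called) if called else "no call"
    return genotypes


tag_to_snp = {"tag01": "SNP_1", "tag02": "SNP_2"}
signals = {
    "tag01": (950.0, 20.0, 15.0, 30.0),   # one dominant channel
    "tag02": (25.0, 480.0, 10.0, 510.0),  # two comparable channels
}
print(decode_tag_array(tag_to_snp, signals))
```

Because the tag-to-SNP table is the only locus-specific input, the same array layout and decoding step can serve any SNP set, which is the generic-design advantage described above.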
[Figure 5 panel labels: Multiplexed solution reaction with different detection oligonucleotides, each carrying a unique non-specific «TAG» sequence (OLA or minisequencing). A generic array of «TAG»-complementary probes. The site of hybridization is specific for a given variant locus, and the ligated probe/incorporated nucleotide identifies the allele at each site.]
FIGURE 5. Principles of the “Zip-code ligation” array and “TAG-SBE” array strategies.
Summary
Hybridization based SNP scoring systems have been applied in
sufficiently large array designs to demonstrate their inherent weaknesses for genotype discrimination. The seemingly unavoidable loss
of 20-40% of SNP markers/mutations due to a suboptimal assay principle is serious, and the final estimation of throughput of the systems
by users is lacking. While enzyme based systems do improve genotype discrimination they have not been utilized in simultaneous
assays for hundreds of variants, and a fair comparison of reaction
principles is thus not yet possible.
Practical alternatives to PCR?
Amplification of the target DNA and reduction of its complexity
are achieved by PCR. Parallel analysis of SNPs or mutations necessitates the use of multiplex PCR, for which several procedures have
been presented (for example, see Chamberlain and Chamberlain
1994, Shuber et al. 1995, Henegariu et al. 1997, Zangenberg et al.
1999). In practice, each application requires separate optimization of
the multiplex PCR reaction (II, IV, V, Hacia et al. 1998, Cheng et al. 1999).
Generalized rules to avoid optimization of individual reactions and to
allow higher multiplexing levels have been applied with tolerance to
losses of amplifiable genomic fragments (Wang et al. 1998, Cho et al.
1999). Whole genome amplification procedures (e.g. Zhang et al. 1992)
do not produce equal levels of copies and, more importantly, provide
no decrease of complexity. An isolated report of detection of single
nucleotide variation in total human genomic DNA using ASO dot-blots with competitive hybridization (Wu et al. 1989) is in sharp contrast to the high background hybridization of ASO-probes in Southern blots
(Conner et al. 1983). Furthermore, despite the successful detection of
point mutations in the 250-fold less complex yeast genome by high-density arrays (Winzeler et al. 1998), reduction of complexity of the
human genomic sample is likely to be a prerequisite for SNP scoring
with oligonucleotide probes.
An issue in SNP typing is whether PCR could be replaced with
methods providing higher throughput. A few nanograms of genomic
DNA are sufficient for routine PCR reactions, and 100-1000µg of genomic
DNA can be extracted routinely from a 10ml whole blood sample (personal comm. Dr. M. Perola, National Public Health Institute, Helsinki,
Finland). A single blood sample would thus be sufficient for 10⁵-10⁶
separate PCR reactions, translating into 10⁶-10⁷ SNP genotypes with
modest multiplexing.
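The order-of-magnitude estimate can be checked with a short Python calculation. Two inputs are assumptions not stated in the text: “a few nanograms” is taken as 1 ng of genomic DNA per PCR, and “modest multiplexing” as 10 SNPs per PCR. The function names are hypothetical.

```python
NG_PER_UG = 1_000  # nanograms per microgram


def pcr_reactions(dna_yield_ug: float, ng_per_pcr: float = 1.0) -> float:
    """Number of PCR reactions one DNA extraction can support."""
    return dna_yield_ug * NG_PER_UG / ng_per_pcr


def snp_genotypes(dna_yield_ug: float, ng_per_pcr: float = 1.0,
                  snps_per_pcr: int = 10) -> float:
    """Number of SNP genotypes, assuming snps_per_pcr SNPs per multiplex PCR."""
    return pcr_reactions(dna_yield_ug, ng_per_pcr) * snps_per_pcr


for yield_ug in (100, 1000):  # DNA yield from a 10 ml blood sample, as in the text
    print(f"{yield_ug} ug DNA: {pcr_reactions(yield_ug):.0e} PCRs, "
          f"{snp_genotypes(yield_ug):.0e} genotypes")
```

With these assumptions the two yields bracket the 10⁵-10⁶ reactions and 10⁶-10⁷ genotypes quoted above; using more template per reaction scales both figures down proportionally.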
Of the proposed signal amplification procedures, the rolling
circle amplification of padlock probes (Lizardi et al. 1998, Baner et al.
1998) has been demonstrated to have the ability to detect single copy
sequence in total genomic DNA (Lizardi et al. 1998). Demanding
synthesis of full-length, pure 70-90bp padlock probes (Kwiatkowski et
al. 1996) and topological factors hindering efficient rolling circle
amplification of the target bound probe on solid surfaces (Baner et al.
1998) are challenging practical issues that must be resolved. The
“Invader Squared Assay” has been applied to “simplex” detection of
SNPs (Griffin et al. 1999), and multiplexing should be as problematic as in PCR, making it uncompetitive with high-throughput
PCR-based assays in its current form. The same limitations apply to
the other target amplification or signal amplification techniques
presented earlier. Furthermore, the Invader assay requires significantly more starting material than PCR. Some experimental techniques with extreme sensitivity, such as fluorescence correlation
spectroscopy (Rigler 1995) suffer from high background signals and
are not applicable for “real-life” diagnostic applications.
Simplification of the array-based methods could be achieved by
coupling the amplification and detection on the solid-surface and
several groups are exploring ways to do this. Electronically
addressable arrays were used for the anchored SDA reaction, which
showed enhanced amplification with electronic addressing of one of
the SDA amplification primers to create “microamplification zones”.
Specific amplification of the factor V gene was achieved using 1µg of
placental DNA as the starting material (Westin et al. 2000). Localized
amplification by PCR on a polyacrylamide film has been done (Mitra
& Church 1999). Another on-chip amplification method, a form of in situ PCR, is termed
“bridge-amplification”, in which forward and reverse primers for each
amplicon are co-immobilized at specific sites on the chip [US patent
#5,641,658], resulting in double stranded amplicon “bridges” on the
array sites. These in situ amplification techniques remain curiosities
until their utility in true genotyping applications is shown.
In conclusion, at least for the first years of large-scale SNP-scoring we will have to rely on the “classical” multiplex PCR followed by
more extensively multiplexed high-throughput detection reactions.
Automation and scale-up of existing methods has made the genome
sequencing facile (Meldrum et al. 2000). Similar streamlined procedures for PCR and genotyping using submicroliter reaction volumes
are likely to be the answer for at least large SNP scoring centers. For
example, a single automated 384-well format “SNP production line”
with two microliter PCR reaction volumes and mass-spectrometric
detection of primer extension products (Sauer et al. 2000) yields >10⁴
genotypes per day (personal communication, Dr. Ivo Gut, Centre
National de Genotypage, Evry, France). “Point-of-care” testing with
targeted SNP or mutation panels might be practical to carry out using
integrated miniaturized devices (Jacobson & Ramsey 1995, Cheng et
al. 1996).
Alternatives to microarrays for multiplexing
Color-coded microspheres (Luminex, Austin, TX) with similar
degenerate sets of tagged-probes immobilized on their surface as
described for “zip-code” ligation or TAG-SBE can be used to detect
solution based multiplex minisequencing products. A flow-cytometric
measurement with two excitation lasers, one for the identification of the
microsphere color-coding (SNP) and another for identification of the
incorporated label (allele), easily resolves up to 50 different mutations in the same mixture (Dr. Allen Roses, Glaxo Wellcome, UK and
Dr. Scott White, Los Alamos National Laboratory). Hakala et al. relied
on sandwich hybridisation with fluorescein/EDANS labeled beads
and lanthanide labeled ASO probes for detection of six different
mutations, with detection on a microfluorometer (Hakala et al. 1998).
An advantage of these color-coded sphere multiplexing methods is the
potentially very fast read-out requiring no image analysis. A related,
futuristic approach pioneered by PharmaSeq corp. is based on coding
the beads by integrating a digital transponder into each bead, which
could result in almost infinite sets of different reactions in the same
mixture [US Patent #5,736,332].
Use of sequence variations in modern human genetics
“Routine” mutation/SNP-scoring
Many of the techniques with high SNP-typing capacity could be
used to simplify and improve current routine determination of sequence variation. For example HLA typing would be ideally suited for
array-based genotyping, as relatively few amplicons span a large
number of SNPs in the HLA-region. Unequivocal identification of
individuals with tens of SNPs should be feasible (Syvanen et al. 1993,
Delahunty et al. 1996). The pharmacogenetic field is an interesting
and rapidly evolving one (Evans & Relling 1999), and array-based
products are already available to study cytochrome P450
polymorphisms (Affymetrix). Despite the shift of interest towards
studying common diseases in the genetic research community, clinical geneticists now have efficient tools for diagnosis and, in some
cases, prevention of inherited disorders. Most recessive traits, even
those with high allelic heterogeneity, could be diagnosed and screened for
using mutation panels on arrays or other highly multiplexed formats.
Dominant disorders would generally require gene specific
resequencing arrays, which do not yet have sufficient sensitivity for
clinical diagnostics. An example of applying novel array-based detection methods for mutation screening on a large scale was a population-based survey of disease gene frequencies in Finland (Pastinen et al.
manuscript in preparation).
LD mapping of complex traits
Linkage disequilibrium mapping of complex diseases is based on
an approach in which a large number of markers throughout the genome, or selectively flanking candidate loci, are typed to identify
markers that are close to the disease causing mutation, leading to
cosegregation of the markers with the disease trait in families. Alternatively, sporadic disease cases can be analyzed with matched controls in association analysis based studies (Risch and Merikangas
1996), which ignited suggestions for large scale screening for SNPs
(Collins et al. 1997). A complication of such analysis with complex
traits is the old age of the predisposing mutations, which decreases
the extent of linkage disequilibrium between the marker and disease
allele (Laan & Paabo 1997).
Suggestions for the number of SNP markers required for genome-wide LD-mapping of complex traits have varied from 30,000 (Lonjou et
al. 1999) to 500,000 (Kruglyak 1999). Recent studies of genomic diversity distribution in different populations using systematic genomic
sequencing (Nickerson et al. 1998, Clark et al. 1998, Rieder et al. 1999)
or SNP-typing (Goddard et al. 2000, Moffatt et al. 2000) indicate difficulties for designing and interpreting LD-scans or SNP-based association studies. The results show highly variable distribution of LD between different genes, between different parts of the same genes and
also between different populations. Cataloguing SNPs in the coding
regions of candidate genes has been considered to be rational to
approach complex genetic traits, as one could expect that these are
more likely to be functionally significant (Cargill et al. 1999, Halushka
et al. 1999). Carefully designed studies with a good definition of the
phenotype, a large sample size and sufficient marker density are
required to settle the dispute of the usefulness of SNPs in dissection
of complex traits.
Currently, there are nearly 30,000 human genomic biallelic
polymorphisms (www.ncbi.nlm.nih/dbSNP/) in the public domain. This
number will surge to hundreds of thousands of SNPs in the near future
by large scale “non-targeted” (Masood 1999) and gene targeted SNP
discovery projects (Cargill et al. 1999, Halushka et al. 1999, Buetow et
al. 1999). While it is unlikely that many (or any) of the groups studying
complex diseases will initiate a whole genome association study,
requiring >>10⁴ SNPs typed from each sample, it is clear that the
number of SNPs required for refined mapping or candidate gene
association studies will be in the order of hundreds to thousands. This
increases the number of genotypes per study by two to three orders of
magnitude as compared to most of the association studies to date.
Some of the techniques described above have the potential to accomplish this with reasonable cost. In the next few years we will be
analyzing the genomic architecture of individual genes and whole
genome with unparalleled precision. Consequently, we are likely to
witness many exciting and important discoveries on genetic predisposition to human disease traits similar to identification of the common genetic risk factors for venous thromboemboli (Bertina et al.
1994) or protection against HIV-1 infection conferred by a single
deletion allele (Liu et al. 1996).
AIMS OF THE PRESENT STUDY
1) To develop techniques for multiplex analysis of human DNA
sequence variation based on the primer extension reaction principle.
2) To apply this technology to studying disease causing mutations and genetic variation in the Finnish population.
MATERIALS AND METHODS
DNA samples and extraction of DNA
Lymphoblastoid cell lines characterized by the 10th International
HLA Workshop and anonymous Finnish individuals were sampled for
development of the HLA-DQA1 and DRB1 genotyping system (I). Several clinicians and researchers provided samples of known carriers
and patients for development of mutation screening panels (II,V).
Participants in the FINRISK 1992 study (Vartiainen et al. 1994) were
analyzed for characterization of polymorphisms related to myocardial
infarction (see III for inclusion criteria). DNA from HIV-1 infected
Finns had been collected between 1990 and 1996 and anonymous
healthy controls from different parts of Finland were used to study the
MBL and CCR5 polymorphisms (for details see IV). DNA from laboratory personnel and anonymous blood donors along with samples from
the Finnish twin registry were used as the unknown samples for
evaluation of the mutation panel (V). DNA was prepared by a standard
phenol-chloroform extraction method (Bell et al. 1981) in all studies
except IV, in which a rapid lysis method was used (Higuchi 1989). The
concentration of DNA was determined spectrophotometrically
(Sambrook et al. 1989).
Primer synthesis
An Applied Biosystems 392 DNA synthesizer (Foster City, CA)
with standard phosphoramidite chemistry was used for synthesis of
primers in I-II, and oligonucleotides were used without purification.
Primers were purchased from Interactiva Biotechnologie GmbH (Ulm,
Germany) in all the subsequent work (III-V) and had been HPLC
purified with the exception of the amino-modified primers.
PCR amplification
Dynazyme II DNA polymerase (Helsinki, Finland) with a manual
hot-start procedure was used in all PCR reactions in I-II using an MJ
Research PTC-100 thermal cycler (Watertown, MA). In the subsequent
PCR amplifications (III-V) chemically modified Amplitaq Gold DNA
polymerase (Perkin Elmer, Branchburg, NJ) was used, and the thermal
cycling carried out in thin walled 96-well plates using MJ Research
PTC-225 thermal cycler. Each amplicon had one non-biotinylated and
another biotinylated primer to enable affinity capture for preparation
of ssDNA (I-IV). T7-RNA polymerase promoter sequences were included 5’ to the gene specific sequence in one primer of each pair to
enable in vitro RNA transcription (V). Multiplex PCR reactions (II-V)
were optimized according to signal intensities obtained in the
genotyping reactions on the arrays. Modification of the primer concentrations in the individual reaction mixes, ordering of the primer
pairs in different groups and replacement of individual primer pairs
were done to optimize multiplex amplification while the other amplification reaction parameters were kept constant. To facilitate multiplex amplification reactions, common 5’tail sequences were added to
the primers, DNA-polymerase concentrations were increased, and
longer extension times (II-IV) or a “touch-down” PCR cycling procedure (V) were used.
Affinity capture and ssDNA preparation
For HLA typing (I) the combined PCR products were captured on
streptavidin-coated manifolds (Amersham-Pharmacia Biotech,
Uppsala, Sweden) (I, Lagerkvist et al. 1994). For genotyping on arrays
by minisequencing the biotinylated PCR products were captured on
streptavidin-coated polystyrene beads (Idexx Research Products,
Westbrook, ME) (II-IV) and for reference genotyping by solid-phase
minisequencing they were captured (Syvänen et al. 1990) in
streptavidin coated microtiter wells (Labsystems, Helsinki, Finland)
(III, V). Following alkaline denaturation the captured strand was subjected to primer extension in the gel-based multiplex (I) and standard minisequencing method, while the eluted strand served as the
template in minisequencing on DNA-arrays with immobilized primers
(II-IV). In V dsDNA served as template.
Electrophoretic separation
Labeled minisequencing primers and group specific PCR products were separated in an ALF automated sequencer (Amersham-Pharmacia Biotech). The streptavidin-coated manifolds were directly
inserted into the wells of a 10% Hydrolink polyacrylamide gel (Long
Ranger, AT Biochem, Malvern, PA), in which the minisequencing
detection primers were released by denaturation. The gels were run
for 55-65 min and reloaded up to six times. The results were interpreted using the ALF Fragment Manager version 1.1 software
(Pharmacia Biotech).
Preparation of microarrays
Microscopic glass slides with teflon lined wells (Erie Scientific,
Portsmouth, NH) were treated essentially as previously described
(Lamture et al. 1994) to yield an epoxysilanized surface (II-IV). A
modification of an isothiocyanate activation method (Guo et al. 1994)
was applied on standard microscope glass slides (V). Diluted solutions (20 µM of each primer) of the NH2-modified oligonucleotides
were prepared in 0.1 M NaOH or 0.4 M NaHCO3 (pH 9.0) and immobilized on the epoxysilane or isothiocyanate surface, respectively. The
oligonucleotide solutions were spotted in an array format either
manually (II) or with a contact printing robot (III-V). Initially (III-IV)
custom-made tweezer like printing pins (Shalon et al. 1996) were
used on a modified Isel EP 1090/4 XYZ robot (Eiterfeld, Germany),
which was replaced (V) by the faster Isel Automation Flachbettanlage
2 robot equipped with two TeleChem CPH-2 (Sunnyvale, CA) printing
pins. One detection primer for each mutation or SNP to be detected
was immobilized on the minisequencing arrays (II-IV), while two
allele specific primers for each site were required for allele specific
extension arrays (V). The spot diameter was 300 µm with the custom-made pin or 125-150 µm with the CPH-2 pin; spotting density was 200
(III-IV) or 2000 spots/cm² (V), respectively. The slides were stored up
to 2 months at -20 to -70 °C.
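As a rough check on the spotting densities quoted above, the centre-to-centre pitch they imply can be computed from the density. The sketch below assumes a regular square grid, which the text does not state; only the spot diameters and densities come from the description above, and the function name is illustrative.

```python
import math

def grid_pitch_um(spots_per_cm2: float) -> float:
    """Centre-to-centre spacing (in µm) of a square grid with the given density."""
    return 1e4 / math.sqrt(spots_per_cm2)  # 1 cm = 10,000 µm

# (label, spots per cm^2, largest quoted spot diameter in µm)
for label, density, diameter_um in [
    ("custom-made pin (III-IV)", 200, 300),
    ("CPH-2 pin (V)", 2000, 150),
]:
    pitch = grid_pitch_um(density)
    print(f"{label}: pitch {pitch:.0f} µm, gap between spots {pitch - diameter_um:.0f} µm")
```

Under the square-grid assumption the quoted densities leave a clear gap between neighbouring spots for both pin types.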
Genotyping reactions
Multiplex fluorescent minisequencing was carried out on the
ssDNA templates immobilized to the manifold supports using either
T7 DNA polymerase at +37 °C or ThermoSequenase DNA polymerase
at +50 °C (both from Amersham-Pharmacia Biotech) with fluorescein
labeled ddNTPs (NEN Dupont, Herts, UK) in four parallel slots for each
sample. Following non-stringent annealing of ssDNA to the arrays the
optimized (II) reaction mixture included DyNASeq DNA-polymerase
(Finnzymes) and the four 33P-labelled ddNTPs (Amersham-Pharmacia
Biotech) in parallel wells of the slide and the reaction was allowed to
proceed at +60-65 °C for 1-15 min (II-IV). The optimized (V) allele-specific extension reaction coupled the template preparation and
detection reaction. The reaction mixture contained minute amounts of
the combined, T7-tailed multiplex PCR product, T7 RNA polymerase,
MMLV reverse transcriptase, all ribonucleotides, unlabelled dATP and
dGTP, and CY5/CY3-labelled dUTP and dCTP (Amersham-Pharmacia
Biotech). Trehalose was included in the reaction buffer to stabilize
and activate the enzymes (Carninci et al. 1998) at +52 °C and the
reaction time was 45-90 min.
Quantitation and interpretation of the results
Fluorograms showed well resolved patterns of peaks corresponding to particular DQA1 genotype or DQA1 – DRB1 subgroup genotypes in the multiplexed minisequencing procedure based on electrophoretic separation of the extended primers (I). The minisequencing
arrays were exposed to an imaging plate for 30min-2h following the
reactions (II-IV), scanned with 100µm resolution in a Fuji BAS-1500
phosphorimager (Kanagawa, Japan) and signal intensities at each spot
were quantified using Tina 2.10 software (Raytest, Straubenhardt,
Germany). The fluorescently labeled allele-specific extension reaction results were scanned using a ScanArray 4000 system (GSI
Lumonics, Watertown, MA) with 5µm resolution, using excitation at
630nm, and emission at 670nm for CY5, and at 540nm and 570nm,
respectively, for CY3 (V). The results were quantified using the
Scanalyze 2.44 software (Michael Eisen, Stanford University, CA). The
genotypes were determined for each site by dividing the signal
intensity at the nucleotide (or probe) corresponding to one allele by
the signal intensity at the nucleotide (or probe) corresponding to the
other allele. The ratios fell into three distinct
clusters, high for homozygotes at one of the alleles, approximately
one for heterozygotes, and low for homozygotes for the second allele
(II-V). For detection of minor mutations, a dual colour approach was
used. A known sample was typed on the same array as the test sample
using different fluorophores for extending each sample. The results of
test sample were then normalized according to the signal intensities
of the known control sample.
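The ratio-based genotype assignment described above can be sketched as follows. This is a minimal illustration, not the scoring procedure actually used in II-V: the cut-off values, the "no call" zone between clusters and the function name are assumptions chosen for the example, and in practice the cluster boundaries would be set separately for each site from known samples.

```python
def call_genotype(signal_a, signal_b, hom_cutoff=5.0, het_band=2.0, min_signal=0.0):
    """Call a biallelic genotype from the two allele signals at one site.

    hom_cutoff: ratio at or above which (or at or below 1/hom_cutoff) a homozygote is called.
    het_band:   ratios within [1/het_band, het_band] are called heterozygous.
    All thresholds are illustrative assumptions.
    """
    if max(signal_a, signal_b) <= min_signal:
        return "no call"            # neither allele gave a usable signal
    if signal_b == 0:
        return "AA"                 # avoid division by zero; only allele A detected
    ratio = signal_a / signal_b
    if ratio >= hom_cutoff:
        return "AA"
    if ratio <= 1.0 / hom_cutoff:
        return "BB"
    if 1.0 / het_band <= ratio <= het_band:
        return "AB"
    return "no call"                # ratio falls between the clusters

for a, b in [(1000, 20), (500, 450), (10, 900), (300, 100)]:
    print(a, b, call_genotype(a, b))
```

Leaving an explicit gap between the heterozygote band and the homozygote cut-offs means that ambiguous spots are flagged for retyping rather than forced into a cluster.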
Reference methods
In most cases requiring genotype validation or verification (III, V)
a standard solid-phase minisequencing procedure (Syvänen et al.
1990, 1992) was used. In some (the NKH and LPI mutations, V) PCR-RFLP digestion was done as described (Kure et al. 1992, Torrents et al.
1999). For Salla disease mutation (Verheijen et al. 1999) an allele
specific PCR reaction (Wu and Wallace 1989) with an internal
control amplicon was used.
Statistical methods
The statistical significance of differences in marker allele frequency distribution between cases and controls was calculated by
the χ² test or Fisher’s exact test (III, IV). In study III, the combined
effect of GpIIIa and PAI-1 was analyzed by comparing individuals
carrying 3 or 4 alleles of PlA2 or 4G with those having only 1 or 0 such
alleles. A logistic regression model including age, sex, total cholesterol, HDL-cholesterol and triglyceride levels, body mass index (BMI)
and smoking was utilized to control for environmental effects. Hardy-Weinberg equilibrium was analyzed by Genepop 1.0 software (III-V).
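For orientation, the two simplest tests mentioned above can be written out in plain Python. This is a sketch under stated assumptions: it uses the Pearson χ² statistic without continuity correction and a χ² goodness-of-fit test for Hardy-Weinberg proportions, whereas the Genepop procedure actually used is not reproduced here. The counts in the example are invented for illustration.

```python
import math

def chi2_p_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], e.g. rows = cases/controls,
    columns = allele 1/allele 2. Assumes all row and column totals are non-zero."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, chi2_p_1df(chi2)

def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test of genotype counts against Hardy-Weinberg
    proportions. Assumes both alleles are present in the sample."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, chi2_p_1df(chi2)

print(chi2_2x2(10, 20, 20, 10))          # hypothetical allele counts
print(hardy_weinberg_chi2(25, 50, 25))   # hypothetical genotype counts
```

For small expected counts an exact test is preferable to the χ² approximation sketched here.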
RESULTS AND DISCUSSION
In the following chapters the results of this thesis are presented
and discussed within the framework of closely related work by other
groups.
Design of assays
“Length-labeled” multiplex fluorescent minisequencing
A primer pair for amplifying the second exon of the DQA1 gene
was designed. Due to the extensive allelic heterogeneity of the DRB1
gene a two-step strategy for subtyping DR2 alleles was used. Allele
specific PCR reactions divided the DRB1 alleles into seven subgroups
identified by their size (Westman et al. 1993).
Initially six sites identifying 10 alleles were chosen disregarding
overlap of some detection primers, and in each case only one of the
overlapping primers performed acceptably. The selection of sites was
thereafter modified to allow discrimination of the alleles with non-overlapping primers (I), and the same design was applied for the
DRB1 subtyping primers. The minisequencing primers had 18-21 bp of
gene specific sequence and random 5’ tails to distinguish each
primer by its length. The DR2 subgroup typing primers were designed
similarly for the three sites identifying the five alleles. The sizes of
the detection primers were 18-42 bp with 3bp difference between
primers.
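The length-labeling scheme described above, with total primer sizes from 18 to 42 bp in 3-bp steps, can be illustrated with a short sketch. The gene specific lengths in the example are hypothetical values within the stated 18-21 bp range, and the function name is an assumption made for this illustration.

```python
def assign_length_labels(specific_lengths, first_total=18, step=3):
    """Return (total_length, tail_length) for each detection primer so that
    successive primers differ by `step` bases in total length."""
    labels = []
    for i, specific in enumerate(specific_lengths):
        total = first_total + i * step
        tail = total - specific          # length of the non-specific 5' tail
        if tail < 0:
            raise ValueError(f"primer {i}: gene specific part longer than {total} bases")
        labels.append((total, tail))
    return labels

# Hypothetical gene specific lengths for nine detection primers
specific = [18, 20, 21, 19, 18, 21, 20, 19, 18]
for spec, (total, tail) in zip(specific, assign_length_labels(specific)):
    print(f"specific {spec} + tail {tail} = {total} bases")
```

With a 3-bp step, the 18-42 bp range accommodates nine distinguishable primer lengths, which matches the largest number of sites typed in one reaction in the HLA assay.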
The target bound to an avidin-coated manifold was rendered
single-stranded and multiplex minisequencing was carried out in four
parallel reaction wells using fluorescein labeled ddNTPs, followed by
size separation and detection of the extended primers on the A.L.F.
sequencer.
Size separation of multiplex minisequencing products was first
proposed by Krook et al. (1992), who used three
polymorphisms in the glucose transporter and insulin receptor gene.
In that study 32P-dATP or dCTP was incorporated in the
minisequencing reaction. Four color detection on PE-ABI sequencers
has been utilized to detect HPRT mutations (Shumaker et al. 1996),
mitochondrial SNPs (Tully et al. 1996) and mouse SNPs (Lindblad-Toh
et al. 2000). The use of capillary electrophoresis for detection of
multiplexed minisequencing products has also been suggested
(Piggee et al. 1997). Detection of short dinucleotide repeats with two
minisequencing primers is also possible (Tully et al. 1996), but the
approach presented is not generally applicable as the repeat size has
to be shorter than the oligonucleotide, and furthermore, multiplexing
of dinucleotide repeat minisequencing is not feasible. We have also
designed a multiplex minisequencing assay for genotyping the common pharmacogenetic polymorphisms in CYP2D6 and CYP2C19
genes, essentially as described for the HLA-typing (Pastinen et al.
1999). This assay has proven flexible, and has recently been extended
for typing several additional polymorphisms in the CYP2D6 and NAT2
genes (Sitbon and Syvanen, in press). Generally, all the multiplexed
primer extension assays have similar designs avoiding overlapping
detection primers, and having a size difference of two or more bases
generated by synthesizing a non-specific 5’tail sequence. Also, all the
assays are based on affinity capture of the PCR products followed by
solid-phase minisequencing on the ssDNA targets. Exciting possibilities for multiplexing minisequencing with size addressing are now
offered by the exquisite resolution and accuracy of MALDI-TOF mass
detection (Ross et al. 1998).
Minisequencing primer extension arrays
Minisequencing detection primers designed to interrogate different mutations or SNPs can be addressed to discrete sites on a solid
surface. After target annealing and extension with labeled ddNTPs
the genotype at each site can be read based on the identity of the
incorporated label. Initially we immobilized detection primers for
mutations occurring in Finland, which ranged from single substitutions to small and large deletions (II). The immobilization was
achieved via a 5’amino group (Lamture et al. 1994) and non-specific
spacer tails separated the primers from the surface to enhance annealing (Guo et al. 1994). Standard minisequencing primer design
with 19 to 22 bp of gene specific sequence was used. Multiplex PCR
was performed with a biotinylated primer and ssDNA was eluted for
analysis after affinity capture. The target was annealed on 4 parallel
arrays, and the extension reactions using thermostable DNA polymerase with 33P-labelled ddNTPs were carried out followed by detection
on a phosphor-imager. A similar procedure was applied to common
sequence variants of the MBL and CCR-5 genes (IV), and SNPs related
to cardiovascular disease (III).
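The read-out principle of minisequencing described above, in which the identity of the single incorporated ddNTP reveals the template base next to the 3’ end of the detection primer, can be sketched in a few lines. The sequences and function names below are hypothetical and serve only to illustrate the logic; they are not taken from the assays in this thesis.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def incorporated_ddntp(template, primer):
    """Return the ddNTP base a polymerase would add to `primer` annealed to `template`.
    Both sequences are written 5'->3'; the primer must be fully complementary."""
    site = reverse_complement(primer)      # primer binding site on the template
    start = template.find(site)
    if start == -1:
        raise ValueError("primer does not anneal to template")
    if start == 0:
        raise ValueError("no template base beyond the primer 3' end")
    template_base = template[start - 1]    # base read by the single-base extension
    return template_base.translate(COMPLEMENT)

primer = "GATTACAGG"                        # hypothetical detection primer
allele_1 = "TTGCCTGTAATCAA"                 # hypothetical template, variant base G
allele_2 = "TTACCTGTAATCAA"                 # hypothetical template, variant base A
print(incorporated_ddntp(allele_1, primer), incorporated_ddntp(allele_2, primer))
```

A heterozygous sample would give signal for both nucleotides, which is why the ratio of the two signals rather than a single intensity is used for genotype calling.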
An analogous procedure for minisequencing primer extension on a
glass surface with 32P-dNTPs was shown to be possible in a non-multiplexed format in model experiments for detection of a 5-bp sequence of the HPRT gene (Shumaker et al. 1996). A tiled array
design to scan a 33-bp stretch of the p53 gene also employed four
reactions in parallel using FITC-ddNTPs as terminators, followed by
alkaline phosphatase mediated generation of a fluorogenic substrate
(Head et al. 1997). The tiled design demonstrates the significant
advantage of having primers rather than templates immobilized to a
solid-phase (Syvänen et al. 1990, I), as closely occurring SNPs or mutations can be interrogated in a multiplexed format (III). The limitation
imposed by competitive binding of primers in solution applies to the “TAG-SBE” approaches in which the reaction takes place in solution
(Hirschhorn et al., ASHG 1999 Meeting). Direct detection
minisequencing extension on an array using fluorescently labelled
ddNTPs in four parallel reactions has been achieved using fluorescein
(Dubiley et al. 1999) or TAMRA labels (Raitio et al., manuscript). Recently, the use of four different dye terminators in array
minisequencing in a single reaction chamber was demonstrated
(Kurg et al. 2000). The fidelity of a DNA-polymerase in single base
extension is the basis for allele discrimination in array
minisequencing assays, and the remarkably similar designs of the assays
by different groups are notable. In contrast, ASO hybridization
based array assays have highly dissimilar designs. Various approaches to generalize the design of ASO-probe arrays for mutation detection have been suggested. Highly redundant probe sets (Cronin et al.
1996, Wang et al. 1998, Cho et al. 1999), individual probe optimization
(Guo et al. 1994), monitoring of melting curves of the hybrids
(Drobyshev et al. 1997), stacking hybridization (Yershov et al. 1996),
use of chaotropic salts in the hybridization buffer (Nguyen et al. 1999)
and utilization of an electronic charge changer (Gilles et al. 1999)
have been described to overcome the inherent limitations of multiplex discrimination of genotypic variation by ASO-hybridization.
Allele specific extension arrays
DNA-polymerase extension is hindered by 3’ mismatches in the
primers. This property has commonly been applied for allele specific
PCR (Wu and Wallace 1989, Sommer et al. 1989, Newton et al. 1989),
but due to the exponential nature of the PCR reaction, even limited
mismatched extension generally requires careful optimization of
reaction conditions and multiplexing is difficult (Ferrie et al. 1992).
We designed arrays with one primer having 3’-end complementary to
one allele and another primer with 3’-end complementary with the
other allele for 31 mutations and 9 SNPs (V). Multiplex PCRs with
each amplicon having one primer with 5’ T7 RNA-polymerase promoter sequence were carried out. RNA targets were generated from
the PCR amplicons along with reverse transcriptase extension of the
allele specific primers directly on the arrays. Incorporation of CY5 or
CY3 labeled dNTPs was detected by a confocal epifluorescence
reader. This approach utilizes the fidelity of reverse transcriptase in
discrimination against terminal mismatches. Allele specific extension
rather than minisequencing is used because reverse transcriptases
do not incorporate ddNTPs with sufficient sequence specificity (II). In
a pairwise comparison of minisequencing and allele specific extension, a notably higher (100-fold) amount of PCR product was required to
achieve sufficient signal-to-noise for genotype calling in
minisequencing (Pastinen et al., unpublished). Allele specific extension of primers immobilized on arrays has also been shown to be
feasible for seven β-globin mutations by using fragmented DNA
targets, thermostable DNA-polymerase and fluorescently labelled
ddNTPs (Dubiley et al. 1999).
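The design of an allele specific primer pair, two primers that differ only in their 3’-terminal base, can be sketched as below. This is an illustration under simplifying assumptions: the sequence is hypothetical, the primers are written in the sense of the strand given (strand orientation relative to the RNA target is ignored), and the 9-mer dT spacer and 20-base specific part are example values. The function name is likewise an assumption.

```python
def allele_specific_primers(upstream, alleles, specific_len=20, spacer="T" * 9):
    """Build one primer per allele. `upstream` is the 5'->3' sequence immediately
    preceding the variant position; each primer ends in the allele base at its 3' end."""
    if specific_len < 2:
        raise ValueError("specific_len must be at least 2")
    if len(upstream) < specific_len - 1:
        raise ValueError("upstream sequence too short")
    core = upstream[-(specific_len - 1):]   # shared gene specific part
    return {allele: spacer + core + allele for allele in alleles}

# Hypothetical flanking sequence and a C/T variant
primers = allele_specific_primers("ACGTACGTACGTACGTACGTAGG", ("C", "T"))
for allele, seq in primers.items():
    print(allele, seq)
```

Because the two primers are identical apart from the terminal base, any difference in extension signal between the two spots can be attributed to the 3’ match or mismatch with the target.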
Optimization of genotype discrimination
Throughout the work presented in this thesis the goal was to
create assays which discriminate homozygote and heterozygote
genotypes simultaneously for several loci in parallel with procedures
that would be practical to apply to genomic DNA. No artificial templates such as synthetic oligos and cloned DNA fragments were used
in the work to establish the method, as assays will in most cases
perform differently with amplified genomic targets containing possible non-specific fragments and other reaction components.
Length-labeled multiplex minisequencing assays
T7-DNA polymerase incorporates fluorescein labelled ddNTPs
with different efficiencies, a feature which was only partly alleviated
by the use of Mn2+ in the reaction buffer, so that different concentrations of the nucleotides had to be used in the reaction mixture.
Our preliminary results indicated that carrying out the extension
reaction with a modified thermostable polymerase,
ThermoSequenase™, would require less optimization of individual
terminator concentrations (I), but further experience with another
application (Pastinen et al. 1999) did not support this. Also detection
primer concentrations were adjusted for each individual site. Similar
optimization steps have been used in two other multiplex
minisequencing applications (Shumaker et al. 1996, Tully et al. 1997).
Less optimization is apparently required in procedures utilizing linear
amplification of minisequencing extension by thermal cycling
(Lindblad-Toh et al. 2000, Ross et al. 1998).
Array-based assays
Studies on the hybridization behavior on solid glass supports
had indicated that sufficient spacing from the surface is required to
achieve efficient annealing of the template on the immobilized
probes (Guo et al. 1994). We tested detection primers containing a 15-mer dT-spacer, and 15-25-mer detection primers without spacer tails.
Significantly improved minisequencing extension efficiency was
observed with the tailed probes (T.P. unpublished observations), and
thus all the subsequent array assays utilized detection primers with a
15- or 9-mer dT-tail 5’ to the gene specific sequence. An
epoxysilanization procedure (Lamture et al. 1994) to immobilize the
aminated detection primers was initially applied (II-IV). In a comparison of different immobilization chemistries (Lindroos K., M.Sc. thesis,
Helsinki University of Technology, 1998) an isothiocyanate activation
procedure (Guo et al. 1994) to immobilize detection primers was
found to be superior for the minisequencing extension efficiency, and
a modified immobilization procedure on isothiocyanate surfaces was
applied thereafter (V). Primer extension on arrays (Head et al. 1997)
using disulfide modified detection primers immobilized to
mercaptosilanized glass slides via a disulfide bond exchange reaction
(Rogers et al. 1999) has also been successful. This chemistry limits the
use of reducing agents in the reaction mixture, and is thus not suitable for use in the allele-specific extension reaction (V) because the
RT enzyme requires dithiothreitol. Both minisequencing and allele
specific extension have been demonstrated on detection primers
immobilized to acrylamide gel-pads (Dubiley et al. 1999). The 3-dimensional gel-pad enables higher loading of the detection primer
compared to a plain glass surface, but the accessibility of the probes
embedded in the gel matrix to the target molecules and enzymes
may be hampered; a pairwise comparison of the
approaches would be required to evaluate this.
A non-thermostable DNA-polymerase was applied in the first
description of minisequencing on glass surface (Shumaker et al.
1996). We compared several different enzymes for their extension
specificity in multiplexed minisequencing on DNA-arrays (II), and
specific extension using 33P-ddNTPs was only achieved with modified
thermostable DNA-polymerases. The significant decrease in mismatched extension at high reaction temperatures is likely to be
based on a lower degree of secondary structure formation (Mir et al.
1999) of the probe immobilized on the array accompanied by
destabilization of non-specifically annealed targets to the detection
primers. Importantly, the use of RNA targets in minisequencing assays
is hindered by the high misincorporation rates of ddNTPs by reverse
transcriptases (II), but RNA targets were shown to be suitable in the
allele specific extension assay. MMLV enzyme with low RNAse H
activity and processivity in the allele specific extension procedure
was found to yield better genotype discrimination and minimal template independent extension compared to other reverse
transcriptases (V). Excellent genotyping results were obtained at
reaction conditions employing high temperature and trehalose in the
reaction buffer to stabilize and activate the enzyme at the elevated
reaction temperature (Carninci et al. 1998).
Assay procedures
The key considerations in development of multiplexed
genotyping assays are that the procedure should consist of a minimal
number of steps and that it should allow complete automation. Several detection technologies possess a very high capacity, including
fluorescent read-out from high-density arrays (Lipshutz et al. 1999)
and MALDI-TOF spectrometry (Griffin and Smith 2000). Template
preparation requiring PCR amplification and possibly concentration,
purification and inactivation of certain reaction components is often
overlooked. A second consideration is the general accessibility of the
method, since proprietary technologies will by necessity increase the
cost of the assays and limit their use to large centers. It is now clear
that SNP typing on a large scale will be of central importance in
human genetics in the years to come and methods should thus preferably be applicable in many molecular biology laboratories. An illustration of the importance of accessibility is the technology for cDNA
array expression analysis (Brown and Botstein 1999), which has spread
in the scientific community due to active support of its dissemination
by the original developers (Futcher 1999).
Multiplex PCR
By far the most labor intensive aspect of multiplexing the
genotyping by primer extension assays is the optimization of concurrent amplification of several genomic fragments in a single reaction
vessel. The applications presented here were targeted at distinct
polymorphisms or disease mutations, and it was not acceptable to
discard loci that were difficult to amplify, unlike the situation in random
genome-wide mapping by SNPs (Wang et al. 1998, Cho et al. 1999). In
multiplex PCR the goal is to unify the character of different amplicons
to allow their concurrent amplification (for a review, see Zangenberg
et al. 1999). Multiplex amplification requires high concentrations of
numerous synthetic oligonucleotides in a single reaction mixture,
which promotes the generation of non-target derived amplification
products or “primer-dimers” (Chou et al. 1992). The formation of
primer-dimers can be reduced by avoiding exposing the reaction to
low temperatures (for example when setting up the reaction), which
is in practice achieved by adding enzyme or nucleotides only after
the reaction reaches high temperatures. A general improvement of
multiplex PCR performance was introduced by chemically modified
DNA-polymerase (AmpliTaq Gold™), which is inactive prior to thermal activation at +95 °C. Having amplicons of similar and preferably
small size has been reported to yield well working multiplex PCRs
(Wang et al. 1998), but in practice the design of very short amplicons is often
prevented by the local nucleotide sequence. We attempted to limit
the amplicon size to 80-200bp, with some exceptions - for example, in
Batten disease, where ALU-repeat sequences flank the common mutation. A strategy to limit primer-dimer formation is to design primers
with identical dinucleotides at their 3’ends (Zangenberg et al. 1999),
which is a criterion difficult to combine with small fragment size. In
our hands there was no significant increase in multiplex amplification
success by grouping amplicons according to their 3’end primer sequences (T.P. unpublished observations). The use of common non-specific 5’tails in PCR primers (Shuber et al. 1996), and increased
concentrations of DNA-polymerase (Chamberlain and Chamberlain
1994) improved our multiplex amplification efficiency, while modification of MgCl2 or dNTP concentrations did not. Two- to nine-plex PCRs
were set up, applying common 5’tails, increased enzyme concentrations and longer extension times along with modification of the
primer concentrations of individual amplicons according to their
signal intensities in assays on arrays. Similar success has been reported by others analyzing distinct sets of mutations or
polymorphisms (Chamberlain et al. 1988, Hacia et al. 1998).
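One way to screen a multiplex primer pool for the 3’-end complementarity that favours primer-dimer formation is sketched below. This is a generic illustration, not a procedure used in this work: the primer names and sequences are hypothetical, and the choice of a four-base 3’ window is an assumption.

```python
from itertools import combinations_with_replacement

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def three_prime_clashes(primers, k=4):
    """Return pairs of primer names whose last `k` bases are mutually complementary,
    i.e. could anneal 3' end to 3' end and be extended. Self-pairs are included."""
    clashes = []
    for (n1, s1), (n2, s2) in combinations_with_replacement(primers.items(), 2):
        if s1[-k:] == reverse_complement(s2[-k:]):
            clashes.append((n1, n2))
    return clashes

pool = {                                   # hypothetical primer pool
    "F1": "GGATCCTTAGCAGC",
    "R1": "TTCAGGTAAGCTG",
    "F2": "ACCTGAAGTCCAAT",
}
print(three_prime_clashes(pool))
```

Flagged pairs could then be placed in different multiplex groups or redesigned; as noted above, grouping by 3’-end sequence did not by itself improve amplification in our hands, so such a screen is at most one of several criteria.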
Length labeled primers for multiplex minisequencing
Solid-phase minisequencing (Syvanen et al. 1990, 1992) is based
on immobilisation of amplified templates in microtiter plate wells,
followed by alkaline denaturation and minisequencing on the ssDNA
template, and finally detection of extended primers in a scintillation
counter. A limitation of the method is the requirement of a separate
reaction well for each allele to be scored, and thus only 48 genotypes
can be obtained per microtiter plate. Doubling the throughput of the
method can be achieved by carrying out the extension with two
differentially labeled ddNTPs. The multiplex fluorescent
minisequencing method for genotyping six or nine SNPs in HLA-DQA1 and DRB1 genes (I) was based on a convenient streptavidin
coated manifold support (Lagerkvist et al. 1994) for capturing amplified products. The manifold support minimized the number of
pipetting steps in the procedure, and simplified loading of the gel on
an automated sequencer. The extension was carried out with four
fluorescein labelled ddNTPs in parallel, which were then detected on
four parallel lanes of an automated sequencer. The disadvantage of
separating the four nucleotides into parallel reactions, rather than
carrying out extension with four different fluorophores in the same
reaction followed by electrophoretic analysis in a single lane
(Shumaker et al. 1996, Tully et al. 1996, Lindblad-Toh et al. 2000), was
alleviated by the easy handling of the samples and less than 60 min
separation time in reloadable gels. A single operator could perform
genotyping of 100 samples a day, translating into 900 genotypes with
the described HLA-typing system, which was a significant increase in
through-put as compared to standard solid-phase minisequencing.
Interpretation of the results was carried out using software for fragment analysis, in which a threshold level of peak height was set followed by recording of the simple “sequence” patterns unequivocally
determining alleles. As the electropherograms produced in the process are relatively simple to interpret, the automation of genotype
scoring should be feasible by modifying the base-calling software
used in standard sequencing. The multiplexing capacity of the system
is limited by synthesis of oligonucleotides of different length extended in the reaction. Allele-scoring should be robust with 2-bp
primer spacing, which would provide a 4-fold increase in throughput
when combined with detection of different incorporated nucleotides at the same
detection primer length (e.g. A to C and G to T transversions could be
analyzed with a single oligomer length).
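The throughput figures quoted above follow from simple arithmetic, written out here as a small sketch. The function and parameter names are illustrative; the numbers (96 wells, two wells per biallelic site, 100 samples per day, nine sites) come from the text.

```python
def genotypes_per_plate(wells=96, alleles_per_site=2, labels_per_well=1):
    """Genotypes scored per microtiter plate when each well reads
    `labels_per_well` alleles of a site with `alleles_per_site` alleles."""
    return wells * labels_per_well // alleles_per_site

def genotypes_per_day(samples_per_day, sites_per_sample):
    """Genotypes produced per day by a multiplex assay."""
    return samples_per_day * sites_per_sample

print(genotypes_per_plate())                   # one label per well
print(genotypes_per_plate(labels_per_well=2))  # two differentially labeled ddNTPs
print(genotypes_per_day(100, 9))               # HLA system: 100 samples, nine sites
```

This gives 48 and 96 genotypes per plate for the standard and dual-label solid-phase formats, and 900 genotypes per day for the multiplex HLA system.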
Three reports on using multiplex solid-phase minisequencing
and four different fluorophores (Tully et al. 1996, Shumaker et al. 1996,
Lindblad-Toh et al. 2000) apply magnetic streptavidin-coated beads
for immobilisation of templates and procedures include multiple
washes, centrifugation and concentration steps all limiting the
throughput without extensive automation. Also the electrophoresis
time of similar detection primer pools as used in our HLA-typing
procedure is reported to be twice as long in the four-color sequence
analyzer (Morley et al. 1999, Lindblad-Toh et al. 2000). An attractive
alternative for high-throughput genotyping would be combining a
manifold-format solid support with the four-color multiplex
minisequencing reaction separated on a standard four colour
sequencer with 96 lanes. With such a system a single operator could produce up to 10^4
SNP genotypes per day, a significant number by any measure at
present.
Array-based extension assays
We next developed primer extension on DNA-arrays to achieve
higher multiplexing potential and to avoid gel electrophoretic separation. Initial experiments were carried out on arrays prepared by
manual spotting of previously synthesized detection oligonucleotides
on the activated glass surfaces (II). Better reproducibility, miniaturization and larger production scale of arrays became possible following
construction of custom-built printing robots (P. Niini, M.Sc. thesis,
Helsinki University of Technology 2000). The first printing robot was
an industrial robot with XYZ range of movement and had a single
tweezer-like pin (Shalon et al. 1996) custom made from stainless steel
according to a model pin (kind gift from Dr. Mark Schena, Stanford
University, CA). The first pins produced spots of 300 to 500µm in diameter and the performance varied between different pins. Nevertheless, the robotic spotting enabled evaluation of the array-based
minisequencing in large number of samples (III,IV). Commercial
printing pins (Telechem, Sunnyvale, CA), a faster printing robot with
two printheads, and improved activation chemistry for immobilization
of aminated primers further improved the throughput of the array
production (V). Only one or two primers are immobilized for each
mutation to be detected, thus the simple spotting robotics provides a
large number of arrays at relatively low cost and sufficient speed.
Use of dsDNA targets in minisequencing on arrays resulted in low extension yields (II), and ssDNA targets prepared by affinity capture and elution of the other target strand were thus used (II-IV). An alternative means of producing sufficient signal intensities in array-based minisequencing is fragmentation of the PCR products along with alkaline phosphatase treatment to inactivate dNTPs, followed by concentration of the targets (Kurg et al. 2000, Raitio et al. manuscript). The ssDNA was annealed to the primer arrays, followed by a separate extension reaction in four parallel wells on teflon-lined glass surfaces containing identical primer arrays (II-IV). To facilitate analysis of a large number of samples, the well spacing was compatible with multichannel pipettors. The incorporation of 33P-ddNTPs was quantified by a phosphorimager after a short exposure. Signal ratios generally fell into distinct categories, with 5 to 100-fold differences in ratios between homozygous and heterozygous genotypes. Obvious limitations of the system are that the use of 33P-ddNTP labels requires parallel wells for the four nucleotides, and that spatial resolution is lower than with epifluorescence detection, preventing the use of medium-density arrays (≥1000 spots per cm2). The advantages offered by 33P-labeling are high sensitivity (Raitio et al., unpublished) and speed of analysis, as a 5 min scan with 100 µm resolution is sufficient to analyze up to 48 slides. Radiolabeling also avoids the problems encountered with incorporation by DNA polymerases of nucleotide analogues with bulky fluorescent reporter groups (Plaschke et al. 1998).
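As an illustration of how signal ratios of this kind can be turned into genotype categories, the following Python sketch assigns a genotype from the two allele signals measured at one site. The function name, the cut-off values and the minimum-signal filter are assumptions made for this example; they are not parameters reported in the studies cited above.

```python
def call_genotype(signal_1, signal_2, het_cutoff=3.0, hom_cutoff=10.0, min_signal=50.0):
    """Call a genotype from the signals of the two alleles at one site.

    All cut-offs are hypothetical illustration values, not thesis parameters.
    """
    # Reject sites where neither nucleotide gave a usable signal.
    if max(signal_1, signal_2) < min_signal:
        return "no call"
    # Floor both signals at 1.0 so the ratio is always defined.
    ratio = max(signal_1, 1.0) / max(signal_2, 1.0)
    if ratio >= hom_cutoff:
        return "1/1"          # homozygous for allele 1
    if ratio <= 1.0 / hom_cutoff:
        return "2/2"          # homozygous for allele 2
    if 1.0 / het_cutoff <= ratio <= het_cutoff:
        return "1/2"          # heterozygous
    return "ambiguous"        # falls between the categories


if __name__ == "__main__":
    print(call_genotype(1000.0, 20.0))
    print(call_genotype(500.0, 400.0))
```

Leaving a gap between the heterozygote and homozygote windows means that intermediate ratios are reported as ambiguous rather than forced into a category.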
A relatively large number of liquid handling steps and significant
amounts of multiplex PCR products are needed to achieve a sufficient
detection sensitivity after single-nucleotide primer extension on the
arrays. This necessitates precipitation of the template DNA. To avoid
post-PCR sample preparation, and to increase the sensitivity of the
system, an allele-specific extension procedure was devised. In this
system, the template preparation is performed concurrently with the
extension reaction with the aid of T7-RNA polymerase to generate
single stranded RNA targets, while also increasing the copy number
of templates (V). The reaction procedure compares favourably in
speed and simplicity with previously described array genotyping procedures,
as illustrated in Figure 6.
Detection is based on fluorescence labelling with a single
fluorophore. If two fluorophores are used, one of them may serve as an
internal control enabling detection of mutations representing 5% of
the target sequences (V). The developed reaction format, with silicone
rubber grids forming 80 separate reaction wells on one microscope
glass slide, allows detection of up to 24,000 genotypes from a single
slide at current spotting densities. Currently, the rate-limiting step for
the genotyping is the detection of extension signals using a confocal
epifluorescence reader followed by signal quantitation with software
designed for expression array analysis. Algorithms to automate signal
quantitation and genotype scoring are now needed to fully exploit the
capacity of the genotyping procedure.
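One simple way to approach the automated scoring called for above is to cluster the allele fractions of all samples at a locus into three groups. The Python sketch below does this with a one-dimensional k-means started from fixed centres; the function names, the starting centres and the example values are assumptions for illustration only and do not describe software used in this work.

```python
def allele_fraction(signal_1, signal_2):
    """Fraction of the total signal that comes from allele 1 (None if no signal)."""
    total = signal_1 + signal_2
    return signal_1 / total if total > 0 else None


def cluster_fractions(fractions, centers=(0.1, 0.5, 0.9), iterations=20):
    """Group allele fractions into three clusters with a 1-D k-means.

    Cluster 0 ~ homozygous allele 2, cluster 1 ~ heterozygous,
    cluster 2 ~ homozygous allele 1. Starting centres are hypothetical.
    """
    centers = list(centers)

    def nearest(value):
        return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in fractions:
            groups[nearest(value)].append(value)
        # Keep the old centre when a cluster received no points.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    labels = [nearest(value) for value in fractions]
    return centers, labels


if __name__ == "__main__":
    example = [0.02, 0.05, 0.48, 0.55, 0.97, 0.93]
    print(cluster_fractions(example))
```

Clustering per locus rather than applying one global threshold allows for primer pairs whose two alleles extend with unequal efficiency.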
FIGURE 6. Comparison of procedures for multiplexed SNP scoring on DNA arrays.
ASO hybridization: multiplex PCR with promoter-tailed primers; second, labelling PCR reaction with primers against the tails and biotin-dUTP; purification and concentration of product; hybridization; staining; scanning with one wavelength.
Minisequencing: multiplex PCR (dTTP partly replaced by dUTP); fractionation by DNase I or UNG glycosidase and inactivation of dNTPs by alkaline phosphatase; precipitation of the fragmented target; minisequencing reaction in four parallel arrays (four differentially labelled ddNTPs); scanning with one (four) wavelengths.
Allele-specific extension: multiplex PCR with promoter-tailed primers; combined template preparation and genotyping reaction on the arrays using RNA polymerase and reverse transcriptase; scanning with one wavelength.
The ASO procedure is presented as described by Wang et al.; the minisequencing procedure is presented as carried out by Raitio et al. (manuscript), with the procedure of Kurg et al. (Kurg et al. 2000) shown in parentheses. The allele-specific extension procedure is as described in V. Notably, a single liquid handling step post-PCR is sufficient for allele-specific extension on DNA arrays, enabling high-throughput genotyping without automation. Furthermore, the RNA polymerase amplifies the template further, allowing multiplex PCR reactions with minimal reaction volumes, as only a minimal volume of the pooled multiplex PCR reaction is required for genotyping.
Applications
HLA typing
The method for HLA-DQA1 genotyping and DRB1 group-specific
typing, followed by DR2-group subtyping was initially evaluated in 42
lymphoblastoid cell lines of the 10th International HLA workshop
(Kimura et al. 1992) and in 42 anonymous Finnish samples of known
HLA genotypes. Complete agreement of the typing results in these controls, along with a high success rate, allowed us to proceed to use the developed method for genotyping affected offspring and both parents from 110 families with multiple sclerosis (MS) (Pastinen et al., unpublished). The DQB1 genotyping had previously been carried out by a standard PCR-ASO method (Kimura et al. 1991). The strong LD in the HLA region is evident from the occurrence of only certain haplotype combinations of DQA1-DQB1-DRB1, and in our genotyping no exception to the previously characterized haplotype combinations was seen. The HLA association of multiple sclerosis was first described nearly 30 years ago with cellular typing techniques and later refined by DNA typing to the single haplotype DQA1*0102-DQB1*0602-DRB1*1502 (reviewed by Hillert 1994). In our MS cohort the frequency of this haplotype was 31% in MS-affected individuals and 18% in parental non-transmitted chromosomes, confirming the significant HLA association (p=0.003, two-sided Fisher’s exact test, Pastinen et al. unpublished). Further evaluation of other MS-susceptibility loci in the extended patient-parent trio material is ongoing, and HLA typing will be used to stratify patient material in studying this phenotypically diverse disorder (Tienari et al., unpublished). Minisequencing with length labelling is in commercial use (PGL Laboratories, Uppsala, Sweden) for pharmacogenetic genotyping of SNPs (Pastinen et al. 1999, Syvänen and Sitbon in press) affecting the metabolism of certain drugs (Linder et al. 1997).
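For readers unfamiliar with the test used for the haplotype frequency comparison above, a two-sided Fisher's exact test on a 2x2 table can be sketched in plain Python as follows. The function name is an assumption for this example, and the counts in the usage lines are invented solely to mimic frequencies of roughly 31% and 18%; they are not the data of the MS study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Hypothetical counts (NOT the study's data): haplotype present/absent
    # on 220 transmitted-type and 220 non-transmitted chromosomes.
    print(fisher_exact_two_sided(68, 152, 40, 180))
```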
Screening for mutations and SNPs
We initially demonstrated proof-of-principle of the array-based minisequencing method using a system of nine different mutations in the Finnish population by applying it to 14 genomic samples (II). Most methods for screening known SNPs on microarrays have, in fact, been presented only at such a limited scale or even validated only with synthetic templates (Yershov et al. 1996, Shumaker et al. 1996, Drobyshev et al. 1997, Dubiley et al. 1999, Tang et al. 1999, Gerry et al. 1999, Kurg et al. 2000). Importantly, we compared the performance of minisequencing primer extension to ASO hybridization under six different hybridization conditions.
The results of this comparison showed clearly better discrimination of genotypes by DNA polymerase-assisted minisequencing compared to the ASO hybridization method (Table 4).
Table 4. Power of genotype discrimination by minisequencing
vs. ASO hybridization on DNA-arrays.
Columns: method, probes and targets; signal ratios from normal and mutant alleles at homozygous positions (PPT TT, FV GG) and at heterozygous positions (PPT TA, FV GA); power of genotype discrimination (PPT, FV).
Rows: minisequencing (DNA target); ASO (RNA target), two sets of conditions.
Only partial results of the comparison are shown, illustrating the standard minisequencing conditions and those ASO hybridization conditions in which the best discrimination ratios were obtained. Three washing stringencies, either DNA or RNA targets, and two probe lengths were employed for ASO hybridization. Despite the several different conditions, minisequencing proved to have at least nine-fold higher genotype discrimination than ASO hybridization on a DNA array.
The practical utility of minisequencing on DNA microarrays has been established in several applications involving genotyping of SNPs and mutations in genomic samples. The following four polymorphisms were evaluated in a cohort of 111 HIV-1 infected Finns and 194 healthy controls on minisequencing microarrays. A Δ32-bp allele of the CCR5 gene, coding for a chemokine receptor, had been shown to be protective against HIV-1 infection in its homozygous form (Liu et al. 1996, Samson et al. 1996). No homozygotes for the CCR5 deletion were seen among HIV-1 infected individuals, consistent with the protective effect of the CCR5 deletion. The presence of heterozygotes among patients and controls at similar frequencies is also consistent with the lack of protection by a single allele. Despite the characteristic population history of Finns (Peltonen et al. 1999), the CCR5 allele frequency was not significantly different from that in other North European populations (Martinson et al. 1997). Mannose binding lectin (MBL) is a circulating serum protein with multiple functions in innate immunity (Turner et al. 1996): three nonsynonymous substitutions in the gene lead to decreased concentrations of circulating MBL and had been associated with increased risk for HIV-1 infection in the Danish population (Garred et al. 1997). The MBL variant alleles occurred at decreased frequency as compared to the Danish population, but homozygotes for the variant alleles were significantly enriched among HIV-1 infected individuals, supporting the role of the normal MBL allele in the first line of defence against the pathogen. A study with a large cohort of multiply exposed healthy controls and HIV-1 infected patients of the same ethnicity would be required to confirm this suggestive predisposing effect seen in Scandinavian populations.
The second association study, on genetic predisposition to myocardial infarction (MI), included four LDLR mutations accounting for the majority of FH alleles in Finland (Koivisto et al. 1995) and nine common polymorphisms previously associated with an increased risk for MI. The patients and controls were derived from the large epidemiologic study FINRISK, and finally 152 MI patients with 152 healthy matched controls were included in the study. Primary association was seen for GPIIIa and PAI-1 variant alleles, apparently increasing the risk for MI, while the other polymorphisms were not significantly associated with MI risk in the study subjects. Only one individual was carrying an FH-causing LDLR mutation. When the GPIIIa and PAI-1 variant alleles were analyzed jointly, a high predisposing effect was seen in individuals carrying 3-4 variant alleles compared to those carrying only zero or one variant allele (p=0.001 in the total sample and p=0.0005 in the male subgroup of study subjects). Another study in a Finnish autopsy material subsequently associated the GPIIIa PlA2 allele with increased risk for myocardial infarction as well (Mikkelsson et al. 1999). The study illustrated the candidate gene strategy analyzed on DNA arrays, which will be increasingly popular with the progress in coding sequence-targeted SNP discovery projects (Cargill et al. 1999, Halushka et al. 1999). The relatively limited study sizes and the analysis of only a single polymorphism in each gene call for larger and more detailed studies of the associated genes to determine their potentially causative role.
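The joint analysis of two loci described above amounts to counting variant alleles per individual and comparing groups. A minimal Python sketch of that bookkeeping is given below; the function names, the genotype encoding (0, 1 or 2 variant alleles per locus) and the counts used in the example call are assumptions for illustration and are not taken from the MI study.

```python
def combined_variant_alleles(genotype):
    """Sum the variant-allele counts (0, 1 or 2 per locus) over all loci."""
    for locus, count in genotype.items():
        if count not in (0, 1, 2):
            raise ValueError(f"invalid allele count for {locus}: {count}")
    return sum(genotype.values())


def risk_group(genotype):
    """Assign the two-locus groups compared in the text: '0-1', '2' or '3-4'."""
    total = combined_variant_alleles(genotype)
    if total >= 3:
        return "3-4"
    if total <= 1:
        return "0-1"
    return "2"


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds ratio of a 2x2 case-control table (no continuity correction)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


if __name__ == "__main__":
    print(risk_group({"GPIIIa": 2, "PAI-1": 1}))
    # Hypothetical counts, for illustration only.
    print(odds_ratio(10, 5, 4, 8))
```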
The minisequencing assay in the form described above has also been set up for analysis of 18 Finnish BRCA1 and BRCA2 mutations (Syrjakoski, Nevanlinna et al. unpublished). Recently, a simplified target preparation strategy and fluorescent labels were applied for 26 Y-chromosomal SNPs in Finnish and related populations (Raitio et al. manuscript).
A panel of 31 Finnish mutations, with the major mutations of most of the characterized Finnish disease heritage disorders and other recurrent recessive mutations along with two disease-predisposing polymorphisms, was set up based on fluorescent, allele-specific extension on DNA microarrays (Figure 7) (V).
FIGURE 7. Examples of allele-specific extension genotyping on DNA microarrays.
Three different samples (A, B and C) analyzed on allele-specific extension microarrays for genotyping SNPs and mutations. For each SNP to be genotyped, two detection primers are immobilized on a solid surface right beside each other (one pair of detection oligos is highlighted on the leftmost array). The two detection primers of each pair differ at their 3' end, which is complementary to either allele 1 or allele 2. In the detection reaction, reverse transcriptase and Cy-labelled dNTPs are used in template-dependent primer extension. Only detection primers with complementary 3' ends are efficiently extended by the enzyme, thus determining the genotype at each site. The given example represents a small portion of a microscope glass slide containing separate reaction chambers in a well format, on which samples are analyzed in microlitre-scale reaction volumes. After the reaction the slide was scanned using a ScanArray system. Sample A is heterozygous for a C to T transition in the AIRE gene (boxed); sample B is heterozygous for two transitions, G to A in the FV gene and C to T in the FSHR gene; sample C is heterozygous for a C to T transition in the NEFRIN gene.
The method was evaluated in 192 samples containing known
carriers for each of the mutations and in unknown samples from which
the putative carriers as determined by the microarray assay were
confirmed by a reference method. A high primary success rate
(96.5%) with excellent specificity (0.1% ambiguous genotypes, no miscalls)
was achieved with the simple genotyping procedure,
yielding nearly 2500 genotypes from a single microscope glass slide.
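The per-slide figure can be checked with simple arithmetic. The Python sketch below assumes that each of the 80 reaction wells holds one sample typed for the full 31-site panel, and that the primary success rate applies per genotype; both are assumptions made for this illustration, and the function names are hypothetical.

```python
import math

WELLS_PER_SLIDE = 80   # separate reaction wells per slide (from the text)
PANEL_SIZE = 31        # mutations and polymorphisms in the panel (from the text)


def genotypes_per_slide(wells=WELLS_PER_SLIDE, loci=PANEL_SIZE):
    """Genotypes obtainable from one slide, assuming one sample per well."""
    return wells * loci


def slides_needed(samples, wells=WELLS_PER_SLIDE):
    """Number of slides required for a given number of samples."""
    return math.ceil(samples / wells)


def expected_successes(samples, loci=PANEL_SIZE, success_rate=0.965):
    """Expected number of successful genotype calls at a given success rate."""
    return round(samples * loci * success_rate)


if __name__ == "__main__":
    print(genotypes_per_slide())      # 80 * 31
    print(slides_needed(192))
    print(expected_successes(192))
```

Under these assumptions one slide yields 80 x 31 = 2480 genotypes, in line with the "nearly 2500" quoted above.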
Genotyping of the common FV and HFE mutations along with 9 SNPs in these genes in a total of 233 samples (Figure 8), with clear genotype discrimination and an almost 100% success rate, further demonstrates the general applicability of the novel system.
FIGURE 8. Design and results of an SNP scoring assay.
Previously characterized sequence variants were selected from (A) the HFE gene and (B) the FV gene. The relative genomic locations of these SNPs are illustrated in the upper part of the figure. Individuals were typed on the arrays, revealing that three of the FV gene SNPs were not polymorphic in the sample studied. The genotyping results of the polymorphic SNPs are illustrated below (axes: signal ratio and signal intensity). The genotypes at the FV Leiden and HH C282Y mutations were known previously, and the samples were assigned correctly for these in all cases. Generally the signal ratios fall into three distinct clusters, unequivocally characterizing the genotype at each locus.
A large population-based screening of the 31 mutations was
carried out to determine their geographic distribution in Finland.
Putative carriers found through the array-based analysis were further
confirmed with reference methods. Approximately 2600 samples were analyzed along with blinded controls, yielding over 70,000 genotypes at a 96% success rate (Pastinen et al. in preparation). Significant variation of disease mutation frequencies was seen across Finland (Figure 9). The development of high-throughput tools for genotyping is making population-wide genetic tests possible. It will be necessary to accurately determine disease allele frequencies in the population, as this information forms the basis for evaluating the impact and cost-effectiveness of population-wide screening in disease prevention.
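How accurately a carrier frequency is known depends on the number of samples typed. As a hedged illustration, the Python sketch below computes a Wilson score confidence interval for a carrier rate and the corresponding allele frequency; the function names and the example counts are assumptions for this example and are not results of the screening study. The allele-frequency helper assumes that every carrier is heterozygous and that no homozygotes are present in the sample.

```python
from math import sqrt


def wilson_interval(carriers, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = carriers / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


def allele_frequency_from_carriers(carriers, n):
    """Allele frequency, assuming all carriers are heterozygous."""
    return carriers / (2 * n)


if __name__ == "__main__":
    # Hypothetical example: 20 carriers among 400 samples from one region.
    low, high = wilson_interval(20, 400)
    print(f"carrier rate 5.0%, interval {low:.3f} to {high:.3f}")
    print(allele_frequency_from_carriers(20, 400))
```

The Wilson interval is used here instead of the simple normal approximation because it behaves sensibly for the low carrier counts typical of rare recessive mutations.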
FIGURE 9. Summed carrier rates for the mutations in four regions of Finland (Oulu, North Karelia, Southern Botnia and Helsinki).
Four hundred samples were typed from individuals originating from each of the following regions: North Karelia, Oulu and Southern Botnia. Further samples were typed from individuals originating from Helsinki. In addition, carrier individuals served as positive controls. All the identified carriers were confirmed by the reference method. The aggregate carrier rates based on unambiguous genotyping calls are shown, demonstrating widely varying rates of carriership in the four regions studied. Furthermore, particularly prominent variation in carrier frequencies of the presumably younger mutations was seen across Finland (Pastinen et al. manuscript).
Table 5 describes the characteristics of the published array genotyping applications in which more than 500 genotypes were produced, together with the additional applications presented in this thesis. Despite the existence of high-density array production technology, the highest information density (genotypes per area) has so far been achieved on our simple spotted arrays utilizing enzymatic extension to discriminate genotypes.
TABLE 5. Genotype scoring on DNA-arrays.
Columns: application; number of SNPs or mutations; number of samples; number of genotypes; spots per cm2; probes per genotype; genotypes per cm2.
Applications listed: CF mutations (Cronin et al.); CCR5 and MBL SNPs (III); MI-associated SNPs (IV); third-generation marker map (Wang et al.); ancestral allele determination (Hacia et al.); Arabidopsis SNP map (Cho et al.); allele-specific extension on DNA arrays (V); “FinnChip” genotyping (Pastinen et al. manuscript).
Footnotes: only part of the assay sites scored genotypes reliably; some individuals were scored on the whole array and additional samples were typed for a subset of SNPs; the same assay as described by Wang et al. was used, with a subset of markers reported to score heterozygotes reliably; ten pooled DNA samples were analyzed; only about half of the genotyping capacity of the array was in fact used, as a subset of markers was typed in the pooled samples; of the probe sets synthesized on the arrays, only part of the markers performed acceptably in heterozygote discrimination.
CONCLUDING REMARKS
Deciphering the human genome sequence will finally open the door to true functional genomics studies of our own species. The few years of experience with genomics on organisms with sequenced genomes have revealed that, in order to assign function to all genes, a wide variety of tools are required. In addition to many of the high-throughput analytical methods described in this thesis, efficient computational tools are essential in determining molecular physiology (Marcotte et al. 1999a, 1999b). Deposition of the analytical data in the public domain greatly increases the power of these biocomputing-based methods, and hopefully such sharing will be seen for the large number of ongoing gene-mapping, association and expression studies in mammals as well. Databases relating genotypes to all the available phenotypic data might help to cope with the large sample sizes and marker numbers required to characterize genes for complex traits. Finally, independent validation studies of the new methods should allow the application of these high-throughput tools in clinical applications in the near future.
ACKNOWLEDGEMENTS
I wish to thank professor Jussi Huttunen, head of the National
Public Health Institute, for providing excellent research facilities and
infrastructure to carry out this work.
Right from that first visit to the Department of Human Molecular Genetics, when I met professor Leena Peltonen (Palotie) in October 1993, her openness and “action-style” were striking for the young medical student entering her room. Consequently, I initiated my PhD studies the following week in her group. I have been truly privileged to
be a part of Leena’s group and to have her positive attitude guide me
through the first steps of my research career.
For the last five years I have had the pleasure to be co-supervised by professor Ann-Christine Syvänen. Her down-to-earth approach, expertise in molecular methodology and familiarity with “the
benchtop world” made the work in this thesis happen. Among the
many things Chrisse taught me was criticism and self-confidence,
essential for carrying out genomics on the edge of Europe. Both
Leena and Chrisse are also thanked for literally never saying no - the
trust and responsibility they provided was truly flattering.
Docent Jukka Partanen, an HLA-wizard, is greatly acknowledged for his constant willingness to discuss genetics and HLA, for being an excellent collaborator and for his continuing interest towards my work. Drs Andres Metspalu and Ants Kurg are acknowledged
for their guidance on oligonucleotide immobilization and many
stimulating discussions on minisequencing in several meetings
during the past few years. Paavo Niini is thanked for building our
arrayers and providing engineering point-of-view for the project
along with his colleague Pekka Katila. The collaboration with the
Microelectronics lab at VTT led by professor Matti Leppihalme was
extremely useful not only for providing the possibility of testing various silicon chips, fluorescence detection systems and array spotters, but also for educating me on communication and work in an interdisciplinary project.
The “chip-group”, which was unfortunately formed only during the later part of my stay in Leena’s lab, was a lot of fun to work and share laughs with. Mirja’s and Katarina’s great work on “park-chemistry” immobilizing oligonucleotides and fantastic stories immobilizing me were an essential part of the work. Paivi’s efficiency, humour and ability to interpret instructions not even interpretable by their composer were always outstanding (and seeing the world’s fastest pipetting hands at work reveals the secrets of high-throughput genotyping). Minna’s ability to pool various Excel spreadsheets of repeated genotypings, carrier numbers, and so on in record time just before that important presentation or meeting was “life saving”, along with those thousands of minisequencing and PCR reactions she carried out.
Satu and Pena introduced me to the world of complex disease genetics and multiple sclerosis, even if I bailed out early. Satu’s unequivocal determinism and Pena’s enthusiasm exemplified the two ways of surviving in the competitive field. Reintroduction to multifactorial diseases and coworking with Markus were truly pleasurable, though coffee breaks and shared congress trips were even more so (as long as we didn’t discuss Star Wars). Also, with Markus we went
through two horrifying “near death experiences” - one in the Rocky
Mountains caught in a snowstorm on a small mountain road and another on the streets of LA caught as passengers in a car driven by
Lasse.
During those long evening and weekend hours in the lab I got to
know Jyrki. He could be found loading repeat gels at virtually any
time of the day; the waiting periods between loadings were often
spent talking dirty and living in a fantasy world of “young eagles”.
Sharing some special congress trips, and also my last days as a bachelor in New York City and Boston with one minute beer challenges,
along with numerous other events extended our friendship beyond
the lab environment. Lasse’s inexhaustible interest in everyday drama
of life made him the perfect listener of your worries, and the regular
“healing sessions” with Jyrki and Lasse in William K. were relaxing
(to some more than others).
Kaisu’s partying mode is seemingly always on, and her tolerance
to infantile humour was amazing during the HGM 96 meeting where
she shared most of the free time with me, Jyrki and Tuomas listening
to on-line commentary on Fitness World competitors... Kaitsu never
got tired of teaching me why Canadian hockey sucks, and Tuomas is
acknowledged for his kind advice on current trends of men’s wear...
Johanna, Petra, Maria, Miina, Naula, Jesper, Tero, Juha and Teemu were
all excellent company in the after-office-hour activities and also
helped me out during office hours.
The senior scientists Iski, Anu and Irma are all thanked for their
interest towards my work. Sari E. and Sari K. were always helpful in
sorting out reagent billing and conference trips, but particularly the
submission and publication of this thesis would not have been possible without Sari Kivikko’s tremendous help! Dr. Robert Sladek is
kindly acknowledged for revising my English.
The world of immunodiagnostics and homogeneous assays kept
me busy during the past year, which was made possible by Professor
Hans Soderlund. Hasse’s treasure trove of ideas for diagnostic assays
never ceased to amaze me and his interest towards DNA-arrays was
also important for the work in this thesis. I want to thank Adela for her
patience regarding my less than organized way of working and my
reluctance for true protein work.
The combination of the medical school and the laboratory was
only possible with the aid of Mikko, Ilkka, Antti and Samuli - poker
nights and pinball tournaments without any medical jargon were true
quality leisure time. Mikko and Ilkka kept me from sinking totally into
an eppendorf tube; without their friendship an early burn-out would
have been evident. The “Quebecois” party house in Kapyla provided
fun Halloweens and BBQs with the good company of Frank, Martin,
Markus and Michelle along with many others. Regular lunches with
Tuukka are truly missed, for his hilarious company always put me
in a good mood.
The support of my family in all aspects of my life has been
tremendous. Their silent encouragement in my career choices was
particularly valuable, never generating additional pressure to succeed.
Despite being “brutally dragged” from happy Montreal to dark
Scandinavia my wife Nathalie has always provided her support and
love even through the worst of times. This work is dedicated to her as
during the past five and a half years she has shared all the sorrows and
joys related to this thesis, borne my constant absence, and yet always encouraged me to do my best. Our dog Spugi is lastly thanked
for taking care of Nathalie when I was away!
This work has been supported by grants from the Technology
Development Centre of Finland, the Instrumentarium Foundation, EC
Biomed2 Contract no. BMH4-972013, The Hjelt Fond of the Pediatric
Research Foundation, the Emil Aaltonen Foundation, Stiftelsen Oscar
Oflund, the Finnish Medical Society Duodecim, the Rinnekoti Research Foundation and the Maud Kuistila Foundation.
Montreal, May 25, 2000
REFERENCES
Aaltonen LA. Peltomaki P. Genes involved in hereditary nonpolyposis colorectal carcinoma.
Anticancer Research.14:1657-60, 1994
Abramson RD. Thermostable DNA polymerases: an update. In: PCR Applications: protocols
for functional genomics. (ed. Innis MA, Gelfand DH and Sninsky JJ) pp. 33-48. Academic
Press, San Diego, 1999
Ahmadian A. Lundeberg J. Nyren P. Uhlen M. Ronaghi M. Analysis of the p53 tumor suppressor gene by pyrosequencing. Biotechniques. 28:140, 2000
Ahrendt SA, Halachmi S, Chow JT, Wu L, Halachmi N, Yang SC, Wehage S, Jen J, Sidransky D.
Rapid p53 sequence analysis in primary lung cancer using an oligonucleotide probe array.
Proc Natl Acad Sci U S A. 96:7382-7, 1999
Alves AM. Carr FJ. Dot blot detection of point mutations with adjacently hybridising synthetic oligonucleotide probes. Nucleic Acids Research. 16:8723, 1988
Amos DB. Bashir H. Boyle W. MacQueen M. Tiilikainen A. A simple micro cytotoxicity test.
Transplantation. 7:220-3, 1969
Anagnostopoulos T. Green PM. Rowley G. Lewis CM. Giannelli F. DNA variation in a 5-Mb
region of the X chromosome and estimates of sex-specific/type-specific mutation rates.
American Journal of Human Genetics. 64:508-17, 1999
Antonarakis SE. McKusick VA. OMIM passes the 1,000-disease-gene mark. Nature Genetics. 25:11, 2000
Bach FH. Voynow NK. One-way stimulation in mixed leukocyte cultures. Science. 153:545,
1966
Bains W, Smith GC. A novel method for nucleic acid sequence determination. J Theor Biol.
135:303-7, 1988
Bains W. Hybridization methods for DNA sequencing. Genomics. 11:294-301, 1991
Baner J. Nilsson M. Mendel-Hartvig M. Landegren U. Signal amplification of padlock probes
by rolling circle replication. Nucleic Acids Research. 26:5073-8, 1998
Barany F. Gelfand DH. Cloning, overexpression and nucleotide sequence of a thermostable
DNA ligase-encoding gene. Gene. 109:1-11, 1991
Barany F. Genetic disease detection and DNA amplification using cloned thermostable
ligase. Proceedings of the National Academy of Sciences of the United States of America.
88:189-93, 1991
Barinaga M. Will “DNA Chip” speed genome initiative? Science. 251:1489, 1991
Barnes WM. PCR amplification of up to 35-kb DNA with high fidelity and high yield from
lambda bacteriophage templates. Proceedings of the National Academy of Sciences of the
United States of America. 91:2216-20, 1994
Baron H. Fung S. Aydin A. Bahring S. Luft FC. Schuster H. Oligonucleotide ligation assay
(OLA) for the diagnosis of familial hypercholesterolemia. Nature Biotechnology. 14:1279-82, 1996
Beattie WG, Meng L, Turner SL, Varma RS, Dao DD, Beattie KL. Hybridization of DNA targets
to glass-tethered oligonucleotide probes. Mol Biotechnol. 4:213-25, 1995
Behr MA, Wilson MA, Gill WP, Salamon H, Schoolnik GK, Rane S, Small PM. Comparative
genomics of BCG vaccines by whole-genome DNA microarray. Science. 284:1520-3, 1999
Beier M, Hoheisel JD. Versatile derivatisation of solid support media for covalent bonding
on DNA-microchips. Nucleic Acids Res. 27:1970-7, 1999
Bell GI, Karam JH, Rutter WJ. Polymorphic DNA region adjacent to the 5' end of the human
insulin gene. Proc. Natl. Acad. Sci. USA 78:5759-63, 1981
Bertina RM, Koeleman BP, Koster T, Rosendaal FR, Dirven RJ, de Ronde H, van der Velden PA,
Reitsma PH. Mutation in blood coagulation factor V associated with resistance to activated
protein C. Nature. 369:64-7, 1994
Birch DE. Simplified hot start PCR. Nature. 381:445-6, 1996
Bolla MK. Haddad L. Humphries SE. Winder AF. Day IN. High-throughput method for
determination of apolipoprotein E genotypes with use of restriction digestion analysis by
microplate array diagonal gel electrophoresis. Clinical Chemistry. 41:1599-604, 1995
Bonnet G. Tyagi S. Libchaber A. Kramer FR. Thermodynamic basis of the enhanced specificity of structured DNA probes. Proceedings of the National Academy of Sciences of the
United States of America. 96:6171-6, 1999
Botstein D, White RL, Skolnick M, Davis RW. Construction of a genetic linkage map in
man using restriction fragment length polymorphisms. Am J Hum Genet. 32:314-31, 1980
Broude NE, Sano T, Smith CL, Cantor CR. Enhanced DNA sequencing by hybridization. Proc
Natl Acad Sci U S A. 91:3072-6, 1994
Brown PO, Botstein D. Exploring the new world of the genome with DNA microarrays.
Nat. Genet. 21: S33-S37, 1999
Buetow KH, Edmonson MN, Cassidy AB. Reliable identification of large numbers of candidate SNPs from public EST data. Nat Genet. 21:323-5, 1999
Bulyk ML, Gentalen E, Lockhart DJ, Church GM. Quantifying DNA-protein interactions by
double-stranded DNA arrays. Nat Biotechnol. 17:573-7, 1999
Cargill M, et al. Characterization of single-nucleotide polymorphisms in coding regions of
human genes. Nature Genet. 22:231, 1999
Carninci P, Nishiyama Y, Westover A, Itoh M, Nagaoka S, Sasaki N, Okazaki Y, Muramatsu M,
Hayashizaki Y. Thermostabilization and thermoactivation of thermolabile enzymes by
trehalose and its application for the synthesis of full length cDNA. Proc Natl Acad Sci U S A.
95:520-4, 1998
Chakravarti A. It’s raining SNPs, hallelujah? Nature Genetics. 19:216-7, 1998
Chamberlain JS, Gibbs RA, Ranier JE, Nguyen PN, Caskey CT. Deletion screening of the
Duchenne muscular dystrophy locus via multiplex DNA amplification. Nucleic Acids Res.
16:11141-56, 1988
Chamberlain JS, Chamberlain JR. Optimization of multiplex PCRs. In: Mullis KB, Ferre F,
Gibbs RA (ed). Polymerase chain reaction. Boston: Birkhäuser, pp. 38-46, 1994
Chang JC. Kan YW. A sensitive new prenatal test for sickle-cell anemia. New England Journal
of Medicine. 307: 30-2, 1982
Mein CA, Barratt BJ, Dunn MG, Siegmund T, Smith AN, Esposito L, Nutland S, Stevens HE,
Wilson AJ, Phillips MS, Jarvis N, Law S, de Arruda M, Todd JA. Evaluation of single nucleotide
polymorphism typing with Invader on PCR amplicons and its automation. Genome Res.
10:330-43, 2000
Chee M, Yang R, Hubbell E, Berno A, Huang XC, Stern D, Winkler J, Lockhart DJ, Morris MS,
Fodor SP. Accessing genetic information with high-density DNA arrays. Science. 274:610-4,
1996
Chen X, Kwok PY. Template-directed dye-terminator incorporation (TDI) assay: a homogeneous DNA diagnostic method based on fluorescence resonance energy transfer. Nucleic
Acids Res. 25:347-53, 1997
Chen X, Levine L, Kwok PY. Fluorescence polarization in homogeneous nucleic acid analysis.
Genome Res. 9:492-8, 1999
Chen X, Zehnbauer B, Gnirke A, Kwok PY. Fluorescence energy transfer detection as a
homogeneous DNA diagnostic method. Proc Natl Acad Sci U S A. 94:10756-61, 1997
Chen X. Livak KJ. Kwok PY. A homogeneous, ligase-mediated DNA diagnostic test. Genome
Research. 8:549-56, 1998
Cheng J, Fortina P, Surrey S, Kricka LJ, Wilding P. Microchip-based Devices for Molecular
Diagnosis of Genetic Diseases. Mol Diagn. 1:183-200, 1996
Cheng S, Grow MA, Pallaud C, Klitz W, Erlich HA, Visvikis S, Chen JJ, Pullinger CR, Malloy MJ,
Siest G, Kane JP. A multilocus genotyping assay for candidate markers of cardiovascular
disease risk. Genome Res. 9:936-49, 1999
Cheng S. Fockler C. Barnes WM. Higuchi R. Effective amplification of long targets from
cloned inserts and human genomic DNA. Proceedings of the National Academy of Sciences
of the United States of America. 91:5695-9, 1994
Cheung VG, Gregg JP, Gogolin-Ewens KJ, Bandong J, Stanley CA, Baker L, Higgins MJ, Nowak
NJ, Shows TB, Ewens WJ, Nelson SF, Spielman RS. Linkage-disequilibrium mapping without
genotyping. Nat Genet. 18:225-30, 1998
Cho RJ, Mindrinos M, Richards DR, Sapolsky RJ, Anderson M, Drenkard E, Dewdney J,
Reuber TL, Stammers M, Federspiel N, Theologis A, Yang WH, Hubbell E, Au M, Chung EY,
Lashkari D, Lemieux B, Dean C, Lipshutz RJ, Ausubel FM, Davis RW, Oefner PJ. Genome-wide
mapping with biallelic markers in Arabidopsis thaliana. Nat Genet. 23:203-7, 1999
Chou Q. Russell M. Birch DE. Raymond J. Bloch W. Prevention of pre-PCR mis-priming and
primer dimerization improves low-copy-number amplifications. Nucleic Acids Research.
20:1717-23, 1992
Chu BC, Kramer FR, Orgel LE. Synthesis of an amplifiable reporter RNA for bioassays.
Nucleic Acids Res. 14:5591-603, 1986
Clark, A.G. et al. Haplotype structure and population genetic inferences from
nucleotide-sequence variation in human lipoprotein lipase. Am. J. Hum. Genet. 63:595, 1998
Cohen et al. Nature 334:119, 1988
Cohen SN, Chang AC, Boyer HW, Helling RB. Construction of biologically functional bacterial
plasmids in vitro. Proc Natl Acad Sci U S A. 70:3240-4, 1973
Collins FS, Guyer MS, Chakravarti A. Variations on a theme: cataloging human DNA sequence variation. Science. 278:1580-1, 1997
Collins FS. Patrinos A. Jordan E. Chakravarti A. Gesteland R. Walters L. New goals for the U.S.
Human Genome Project: 1998-2003. Science. 282: 682-9, 1998
Collins ML, Irvine B, Tyner D, Fine E, Zayati C, et al. A branched DNA signal amplification
assay for quantification of nucleic acid targets below 100 molecules/ml Nucleic Acids Res
25:2979, 1997
Compton J. Nucleic acid sequence-based amplification. Nature 350: 91-2, 1991
Conner, B.J., A.A. Reyes, C. Morin, K. Itakura, R.L. Teplitz, and R.B. Wallace. Detection of
sickle cell beta S-globin allele by hybridization with synthetic oligonucleotides. Proc. Natl.
Acad. Sci 80: 278-282, 1983
Cooper DN, Krawczak M, Antonarakis SE. The nature and mechanisms of human gene
mutation. In: Metabolic and Molecular Bases of Inherited Disease, 7th edn. (ed. Scriver C,
Beaudet AL, Sly WS, Valle D) pp. 259-291. McGraw-Hill, New York, 1995
Cotton RG. Current methods of mutation detection. Mutation Research. 285:125-44, 1993
Cotton RG. Slowly but surely towards better scanning for mutations Trends in Genetics.
13:43-6, 1997
Coulondre C. Miller JH. Farabaugh PJ. Gilbert W. Molecular basis of base substitution
hotspots in Escherichia coli. Nature. 274:775-80, 1978
Cox DW, Woo SL, Mansfield T. DNA restriction fragments associated with alpha 1-antitrypsin
indicate a single origin for deficiency allele PI Z. Nature. 316:79-81, 1985
Cronin MT, Fucini RV, Kim SM, Masino RS, Wespi RM, Miyada CG. Cystic fibrosis mutation
detection by hybridization to light-generated DNA probe arrays. Hum Mutat. 7:244-55,
1996
Cros P, Allibert P, Mandrand B, Tiercy JM, Mach B. Oligonucleotide genotyping of HLA
polymorphism on microtitre plates. Lancet. 340:870-3, 1992
Dang C. Jayasena SD. Oligonucleotide inhibitors of Taq DNA polymerase facilitate detection
of low copy number targets by PCR. Journal of Molecular Biology. 264:268-78, 1996
Danna K, and Nathans D. Specific cleavage of simian virus 40 DNA by restriction endonuclease of Hemophilus influenzae. Proc Natl Acad Sci U S A 68:2913, 1971.
Dausset J. Acta Haematol. 20:156-166, 1958
Day DJ, Speiser PW, White PC, Barany F. Detection of Steroid 21-Hydroxylase Alleles Using
Gene-Specific PCR and a Multiplexed Ligation Detection Reaction Genomics 29: 152-162,
1995
Day IN. Humphries SE. Electrophoresis for genotyping: microtiter array diagonal gel
electrophoresis on horizontal polyacrylamide gels, hydrolink, or agarose. Analytical
Biochemistry. 222:389-95, 1994
Delahunty C, Ankener W, Deng Q, Eng J, Nickerson DA. Testing the feasibility of DNA typing
for human identification by PCR and an oligonucleotide ligation assay. Am J Hum Genet.
58:1239-46, 1996
Dib C. Faure S. Fizames C. Samson D. Drouot N. Vignal A. Millasseau P. Marc S. Hazan J.
Seboun E. Lathrop M. Gyapay G. Morissette J. Weissenbach J. A comprehensive genetic
map of the human genome based on 5,264 microsatellites Nature. 380:152-4, 1996
Dijan P. Cell 94:155-160, 1998
Drmanac and Crkvenjakov 1987 [Yugoslav Patent Application 570]
Drmanac R, Drmanac S. cDNA screening by array hybridization. Methods Enzymol.
303:165-78, 1999
Drmanac S. Kita D. Labat I. Hauser B. Schmidt C. Burczak JD. Drmanac R. Accurate sequencing
by hybridization for DNA diagnostics and individual genomics. Nature Biotechnology.
16:54-8, 1998
Drobyshev A, Mologina N, Shik V, Pobedimskaya D, Yershov G, Mirzabekov A. Sequence
analysis by hybridization with oligonucleotide microchip: identification of beta-thalassemia
mutations. Gene. 188:45-52, 1997
Dubiley S, Kirillov E, Mirzabekov A. Polymorphism analysis and gene detection by
minisequencing on an array of gel-immobilized primers. Nucleic Acids Res. 27:e19, 1999
Dubrova YE. Nesterov VN. Krouchinsky NG. Ostapenko VA. Neumann R. Neil DL. Jeffreys AJ.
Human minisatellite mutation rate after the Chernobyl accident Nature. 380:683-6, 1996
Dunham I. Shimizu N. Roe BA. Chissoe S. Hunt AR. Collins JE. Bruskiewich R. Beare DM. Clamp
M. Smink LJ. Ainscough R. Almeida JP. Babbage A. Bagguley C. Bailey J. Barlow K. Bates KN.
Beasley O. Bird CP. Blakey S. Bridgeman AM. Buck D. Burgess J. Burrill WD. O’Brien KP. et al.
The DNA sequence of human chromosome 22 Nature. 402: 489-95, 1999
Edman CF, Raymond DE, Wu DJ, Tu E, Sosnowski RG, Butler WF, Nerenberg M, Heller MJ.
Electric field directed nucleic acid hybridization on microchips. Nucleic Acids Res. 25:4907-14, 1997
Eggers M, Hogan M, Reich RK, Lamture J, Ehrlich D, Hollis M, Kosicki B, Powdrill T, Beattie K,
Smith S, et al. A microchip for quantitative detection of molecules utilizing luminescent and
radioisotope reporter groups. Biotechniques. 17:516-25, 1994
Ellegren H. Lindgren G. Primmer CR. Moller AP. Fitness loss and germline mutations in barn
swallows breeding in Chernobyl. Nature. 389:593-6, 1997
Ellis NA. German J. Molecular genetics of Bloom’s syndrome. Human Molecular Genetics. 5
Spec No:1457-63, 1996
Erlich HA. Gelfand D. Sninsky JJ. Recent advances in the polymerase chain reaction. Science.
252:1643-51, 1991
Evans WE, and Relling MV. Pharmacogenomics: Translating Functional Genomics into
rational therapeutics. Science 286:487, 1999
Fang P, Bouma S, Jou C, Gordon J, Beaudet AL. Simultaneous analysis of mutant and normal
alleles for multiple cystic fibrosis mutations by the ligase chain reaction. Hum Mutat. 6:144-51, 1995
Farr C.J., R.K. Saiki, H.A. Erlich, F. McCormick, C.J. Marshall. Analysis of RAS gene mutations
in acute myeloid leukemia by polymerase chain reaction and oligonucleotide probes. Proc.
Natl. Acad. Sci. 85: 1629-1633, 1988
Ferrie RM. Schwarz MJ. Robertson NH. Vaudin S. Super M. Malone G. Little S. Development,
multiplexing, and application of ARMS tests for common mutations in the CFTR gene.
American Journal of Human Genetics. 51:251-62, 1992
Fire A, Xu SQ Rolling replication of short DNA circles. Proc Natl Acad Sci U S A 92:4641-5,
1995
Fodor SP, Read JL, Pirrung MC, Stryer L, Lu AT, Solas D. Light-directed, spatially addressable
parallel chemical synthesis. Science. 251:767-73, 1991
Förster, T. Modern quantum chemistry, Istanbul lectures, part III, pp. 93-137. Academic Press,
New York, NY, 1965
Fotin AV, Drobyshev AL, Proudnikov DY, Perov AN, Mirzabekov AD. Parallel thermodynamic
analysis of duplexes on oligodeoxyribonucleotide microchips. Nucleic Acids Res 26:1515-21, 1998
Fu YH. Kuhl DP. Pizzuti A. Pieretti M. Sutcliffe JS. Richards S. Verkerk AJ. Holden JJ. Fenwick RG
Jr. Warren ST. et al. Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell. 67:1047-58, 1991
Futcher B. Blast ahead. Nat. Genet. 23:377-378, 1999
Gait MJ. Sheppard RC. Rapid synthesis of oligodeoxyribonucleotides. II. Machine-aided
solid-phase syntheses of two nonanucleotides and an octanucleotide. Nucleic Acids
Research. 4:4391-410, 1977
Gait MJ. Sheppard RC. Rapid synthesis of oligodeoxyribonucleotides: a new solid-phase
method. Nucleic Acids Research. 4:1135-58, 1977
Garred P, Madsen HO, Balslev U, Hofmann B, Pedersen C, Gerstoft J, Svejgaard A. Susceptibility to HIV infection and progression of AIDS in relation to variant alleles of mannose-binding lectin. Lancet 349:236-40, 1997
Geever RF. Wilson LB. Nallaseth FS. Milner PF. Bittner M. Wilson JT. Direct identification of
sickle cell anemia by blot hybridization. Proceedings of the National Academy of Sciences
of the United States of America. 78:5081-5, 1981
Gerry NP. Witowski NE. Day J. Hammer RP. Barany G. Barany F. Universal DNA microarray
method for multiplex detection of low abundance point mutations. Journal of Molecular
Biology. 292:251-62, 1999
Gibbs RA, Nguyen P-N, Caskey TC. Detection of single DNA base differences by
competitive oligonucleotide priming. Nucleic Acids Res. 17:2437-48, 1989
Gibson NJ, Gillard HL, Whitcombe D, Ferrie RM, Newton CR, Little S. A homogeneous
method for genotyping with fluorescence polarization. Clin Chem 43:1336-1341, 1997
Gibson QH The reduction of methaemoglobin in red blood cells and studies on the cause of
idiopathic methaemoglobinemia. Biochem. J. 42:13, 1948
Gilles PN, Wu DJ, Foster CB, Dillon PJ, Chanock SJ. Single nucleotide polymorphic discrimination by an electronic dot blot assay on semiconductor microchips. Nat Biotechnol. 17:365-70, 1999
Goddard KA, Hopkins PJ, Hall JM, Witte JS. Linkage disequilibrium and allele-frequency
distributions for 114 single-nucleotide polymorphisms in five populations. Am J Hum
Genet. 66:216-34, 2000
Graves DJ, Su HJ, McKenzie SE, Surrey S, Fortina P. System for preparing microhybridization
arrays on glass slides. Anal Chem. 70:5085-92, 1998
Griffin TJ, Hall JG, Prudent JR, Smith LM. Direct genetic analysis by matrix-assisted laser
desorption/ionization mass spectrometry. Proc Natl Acad Sci U S A. 96:6301-6, 1999
Griffin TJ. Smith LM. Single-nucleotide polymorphism analysis by MALDI-TOF mass spectrometry Trends in Biotechnology. 18:77-84, 2000
Griffin TJ. Tang W. Smith LM. Genetic analysis by peptide nucleic acid affinity MALDI-TOF
mass spectrometry Nature Biotechnology. 15:1368-72, 1997
Grompe M. The rapid detection of unknown mutations in nucleic acids Nature Genetics.
5:111-7, 1993
Grossman PD, Bloch W, Brinson E, Chang CC, Eggerding FA, Fung S, et al. High-density
multiplex detection of nucleic acid sequences: oligonucleotide ligation assay and sequence-coded separation. Nucleic Acids Res. 22:4527-34, 1994
Guatelli JC. Whitfield KM. Kwoh DY. Barringer KJ. Richman DD. Gingeras TR. Isothermal, in
vitro amplification of nucleic acids by a multienzyme reaction modeled after retroviral
replication. Proceedings of the National Academy of Sciences of the United States of
America, 1990
Gunderson KL, Huang XC, Morris MS, Lipshutz RJ, Lockhart DJ, Chee MS. Mutation detection
by ligation to complete n-mer DNA arrays. Genome Res. 8:1142-53, 1998
Gunthard HF, Wong JK, Ignacio CC, Havlir DV, Richman DD. Comparative performance of
high-density oligonucleotide sequencing and dideoxynucleotide sequencing of HIV type
1 pol from clinical samples. AIDS Res Hum Retroviruses. 14:869-76, 1998
Guo Z, Guilfoyle RA, Thiel AJ, Wang R, Smith LM. Direct fluorescence analysis of genetic
polymorphisms by hybridization with oligonucleotide arrays on glass supports. Nucleic
Acids Res. 22:5456-65, 1994
Guschin D, Yershov G, Zaslavsky A, Gemmell A, Shick V, Proudnikov D, Arenkov P,
Mirzabekov A. Manual manufacturing of oligonucleotide, DNA, and protein microchips.
Anal Biochem. 250:203-11, 1997
Gusella JF. Wexler NS. Conneally PM. Naylor SL. Anderson MA. Tanzi RE. Watkins PC. Ottina
K. Wallace MR. Sakaguchi AY. et al. A polymorphic DNA marker genetically linked to
Huntington’s disease. Nature. 306:234-8, 1983
Hacia JG, Fan JB, Ryder O, Jin L, Edgemon K, Ghandour G, Mayer RA, Sun B, Hsie L, Robbins
CM, Brody LC, Wang D, Lander ES, Lipshutz R, Fodor SP, Collins FS. Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide
arrays. Nat Genet. 22:164-7, 1999
Hacia JG. Sun B. Hunt N. Edgemon K. Mosbrook D. Robbins C. Fodor SP. Tagle DA. Collins FS.
Strategies for mutational analysis of the large multiexon ATM gene using high-density
oligonucleotide arrays. Genome Research. 8:1245-58, 1998a
Hacia JG. Woski SA. Fidanza J. Edgemon K. Hunt N. McGall G. Fodor SP. Collins FS. Enhanced
high density oligonucleotide array-based sequence analysis using modified nucleoside
triphosphates. Nucleic Acids Research. 26:4975-82, 1998b
Haff LA. Smirnov IP. Single-nucleotide polymorphism identification assays using a thermostable DNA polymerase and delayed extraction MALDI-TOF mass spectrometry. Genome
Research. 7:378-88, 1997
Hakala H, Virta P, Salo H, Lonnberg H. Simultaneous detection of several oligonucleotides
by time-resolved fluorometry: the use of a mixture of categorized microparticles in a
sandwich type mixed-phase hybridization assay. Nucleic Acids Res. 26:5581-8, 1998
Halushka, M.K. et al. Patterns of single-nucleotide polymorphisms in candidate genes
regulating blood-pressure homeostasis. Nature Genet. 22, 239, 1999
Harju L. Weber T. Alexandrova L. Lukin M. Ranki M. Jalanko A. Colorimetric solid-phase
minisequencing assay illustrated by detection of alpha 1-antitrypsin Z mutation. Clinical
Chemistry. 39:2282-7, 1993
Harris H Enzyme polymorphisms in man. Proc. R. Soc. Lond. (Biol) 174:1, 1966
Head SR, Rogers YH, Parikh K, Lan G, Anderson S, Goelet P, Boyce-Jacino MT. Nested genetic
bit analysis (N-GBA) for mutation detection in the p53 tumor suppressor gene. Nucleic
Acids Res. 25:5065-71, 1997
Healey BG, Matson RS, Walt DR. Fiberoptic DNA sensor array capable of detecting point
mutations. Anal Biochem. 251:270-9, 1997
Henegariu O, Heerema NA, Dlouhy SR, Vance GH, Vogt PH. Multiplex PCR: critical parameters and step-by-step protocol. Biotechniques. 23:504-11, 1997
Higuchi R. Dollinger G. Walsh PS. Griffith R. Simultaneous amplification and detection of
specific DNA sequences. Bio/Technology 10:413-7, 1992
Higuchi R. Fockler C. Dollinger G. Watson R. Kinetic PCR analysis: real-time monitoring of
DNA amplification reactions. Bio/Technology 11:1026-30, 1993
Higuchi R. Simple and rapid preparation of samples for PCR. In Ed. HA Erlich. PCR Technology: principles and applications for DNA amplification. Stockton Press, New York. 31-38,
1989
Hillert J. Human leukocyte antigen studies in multiple sclerosis. Ann Neurol. 36 Suppl:S15-7,
1994
Holland PM. Abramson RD. Watson R. Gelfand DH. Detection of specific polymerase chain
reaction product by utilizing the 5'→3' exonuclease activity of Thermus aquaticus DNA
polymerase. Proceedings of the National Academy of Sciences of the United States of
America 88:7276-80, 1991
Hultman T. Stahl S. Hornes E. Uhlen M. Direct solid phase sequencing of genomic and
plasmid DNA using magnetic beads as solid support. Nucleic Acids Research. 17:4937-46,
1989
Ingram VM A specific chemical difference between the globins of normal human and sickle
cell anaemia haemoglobin. Nature 178:792, 1956
Jamer R. Differentiating genomics companies. Nat Biotech 18:153, 2000
Jeffreys AJ, Wilson V, and Thein SL. Hypervariable ‘minisatellite’ regions in human DNA.
Nature 314:67-73, 1985
Jurinke C. van den Boom D. Jacob A. Tang K. Worl R. Koster H. Analysis of ligase chain
reaction products via matrix-assisted laser desorption/ionization time-of-flight-mass
spectrometry. Analytical Biochemistry. 237:174-81, 1996
Kan YW, and Dozy AM. Polymorphism of DNA sequence adjacent to human beta-globin
structural gene: relationship to sickle mutation. Proc Natl Acad Sci U S A 75:5631-5, 1978
Kelley SO, Boon EM, Barton JK, Jackson NM, Hill MG. Single-base mismatch detection based
on charge transduction through DNA. Nucleic Acids Res. 27:4830-7, 1999
Kelly TJ, Jr., and Smith HO. A restriction enzyme from Hemophilus influenzae. II. J Mol Biol
51:393-409, 1970
Khanna M. Park P. Zirvi M. Cao W. Picon A. Day J. Paty P. Barany F. Multiplex PCR/LDR for
detection of K-ras mutations in primary colon tumors. Oncogene. 18:27-38, 1999
Khrapko KR, Lysov YuP, Khorlin AA, Ivanov IB, Yershov GM, Vasilenko SK, Florentiev VL,
Mirzabekov AD. A method for DNA sequencing by hybridization with oligonucleotide
matrix. DNA Seq. 1:375-88, 1991
Khrapko, K.R., Lysov, P. Yu, A.A. Khorlyn, V.V. Shick, V.L. Florentiev, and A.D. Mirzabekov. An
oligonucleotide hybridization approach to DNA sequencing. FEBS Lett. 256:118-122, 1989
Kimura A, Dong Rui-P, Harada H, Sasazuki T. DNA typing of HLA Class II genes in B-lymphoblastoid cell lines homozygous for HLA. Tissue Antigens 40:5-12, 1992
Kimura A, Takehiko S. Eleventh International Histocompatibility Workshop reference
protocol for the HLA DNA-typing technique. HLA 1991. Oxford: Oxford University Press,
397-419, 1991
Koivisto UM, Viikari JS, Kontula K. Molecular characterization of minor gene rearrangements
in Finnish patients with heterozygous familial hypercholesterolemia: identification of two
common missense mutations (Gly823→Asp and Leu380→His) and eight rare mutations
of the LDL receptor gene. Am J Hum Genet. 57:789-97, 1995
Kononen J, Bubendorf L, Kallioniemi A, Barlund M, Schraml P, Leighton S, Torhorst J, Mihatsch
MJ, Sauter G, Kallioniemi OP. Tissue microarrays for high-throughput molecular profiling of
tumor specimens. Nat Med. 4:844-7, 1998
Kopp MU. Mello AJ. Manz A. Chemical amplification: continuous-flow PCR on a chip. Science.
280:1046-8, 1998
Koster H. Tang K. Fu DJ. Braun A. van den Boom D. Smith CL. Cotter RJ. Cantor CR. A strategy
for rapid and efficient DNA sequencing by mass spectrometry. Nature Biotechnology.
14:1123-8, 1996
Kozal MJ, Shah N, Shen N, Yang R, Fucini R, Merigan TC, Richman DD, Morris D, Hubbell E,
Chee M, Gingeras TR. Extensive polymorphisms observed in HIV-1 clade B protease gene
using high-density oligonucleotide arrays. Nat Med. 2:753-9, 1996
Kramer FR, Lizardi PM. Replicatable RNA reporters. Nature. 339:401-2, 1989
Krook A, Stratton IM, O’Rahilly S. Rapid and simultaneous detection of multiple mutations by
pooled and multiplex single nucleotide primer extension: application to the study of
insulin-responsive glucose transporter and insulin receptor mutations in non-insulin-dependent diabetes. Hum. Mol. Gen. 1:391-5, 1992
Kruglyak L. Prospects for whole-genome linkage disequilibrium mapping of common
disease genes. Nature Genetics. 22:139-44, 1999
Kuppuswamy MN, Hoffmann JW, Kasper CK, Spitzer SG, Groce SL, Bajaj PS. Single nucleotide
primer extension to detect genetic diseases: Experimental application to hemophilia B
(factor IX) and cystic fibrosis genes. Proc. Natl. Acad. Sci. USA 88:1143-7, 1991
Kure S, Takayanagi M, Narisawa K, Tada K, Leisti J. Identification of a common mutation in
Finnish patients with nonketotic hyperglycinemia. J Clin Invest. 90:160-4, 1992
Kurg A, Tõnisson N, Georgiou I, Shumaker J, Tollett J, Metspalu A. Arrayed Primer Extension:
Solid phase four-color DNA resequencing and mutation detection technology. Genetic
Testing 2000 (in press)
Kwiatkowski M, Fredriksson S, Isaksson A, Nilsson M, Landegren U. Inversion of in situ
synthesized oligonucleotides: improved reagents for hybridization and primer extension
in DNA microarrays. Nucleic Acids Res. 27:4710-4, 1999
Kwiatkowski M. Nilsson M. Landegren U. Synthesis of full-length oligonucleotides: cleavage
of apurinic molecules on a novel support. Nucleic Acids Research. 24:4632-8, 1996
Kwok S. Kellogg DE. McKinney N. Spasic D. Goda L. Levenson C. Sninsky JJ. Effects of
primer-template mismatches on the polymerase chain reaction: human immunodeficiency
virus type 1 model studies. Nucleic Acids Research. 18:999-1005, 1990
Laan M. Paabo S. Demographic history and linkage disequilibrium in human populations
Nature Genetics. 17:435-8, 1997
Lagerkvist A. Stewart J. Lagerstrom-Fermer M. Landegren U. Manifold sequencing: efficient
processing of large sets of sequencing reactions. Proceedings of the National Academy of
Sciences of the United States of America. 91:2245-9, 1994
Lambert WC, Kuo H-R, Lambert MW. Xeroderma pigmentosum and related disorders. In
Jameson JL (ed.) Principles of Molecular Medicine. Humana Press, NJ.
Lamture JB, Beattie KL, Burke BE, Eggers MD, Ehrlich DJ, Fowler R, Hollis MA, Kosicki BB,
Reich RK, Smith SR, et al. Direct detection of nucleic acid hybridization on the surface of a
charge coupled device. Nucleic Acids Res. 22:2121-5, 1994
Landegren U, Kaiser R, Sanders J, Hood L. A ligase-mediated gene detection technique.
Science 241:1077-80, 1988
Lander ES. Array of hope. Nat Genet. 21(1 Suppl):3-4, 1999
Lander ES. The new genomics: global views of biology. Science. 274:536-9, 1996
Landsteiner & Wiener 1940
Landsteiner, 1901
Lathrop M. Nakamura Y. O’Connell P. Leppert M. Woodward S. Lalouel JM. White R. A
mapped set of genetic markers for human chromosome 9. Genomics. 3:361-6, 1988
Lawyer FC. Stoffel S. Saiki RK. Myambo K. Drummond R. Gelfand DH. Isolation, characterization, and expression in Escherichia coli of the DNA polymerase gene from Thermus
aquaticus. Journal of Biological Chemistry. 264:6427-37, 1989
Lee LG. Livak KJ. Mullah B. Graham RJ. Vinayak RS. Woudenberg TM. Seven-color, homogeneous detection of six PCR products. Biotechniques. 27:342-9, 1999
Lemmo AV, Rose DJ, Tisone TC. Inkjet dispensing technology: applications in drug discovery. Curr Opin Biotechnol. 9:615-7, 1998
Levine & Stetson 1939
Lewin. GENES VI Chapter 15 “DNA replication” pp471-504, Oxford University Press, New
York, 1997
Lewontin RC. Hubby JL. A molecular approach to the study of genic heterozygosity in
natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics. 54:595-609, 1966
Li J, Butler JM, Tan Y, Lin H, Royer S, et al. Single nucleotide polymorphism determination
using primer extension and time-of-flight mass spectrometry. Electrophoresis 20:1258-65,
1999
Li, W.H. & Sadler, L.A. Low nucleotide diversity in man. Genetics 129, 513–523, 1991
Lindblad-Toh K, E. Winchester, M.J. Daly, D.G. Wang, J.N. Hirschhorn, J.-P. Laviolette,
K. Ardlie, D.E. Reich, E. Robinson, P. Sklar, N. Shah, D. Thomas, J.-B. Fan, T. Gingeras,
J. Warrington, N. Patil, T.J. Hudson & E.S. Lander. Large-scale discovery and genotyping of
single nucleotide polymorphisms in the mouse. Nat. Genet., 2000
Linder MW, Prough RA, Valdes R Jr. Pharmacogenetics: a laboratory tool for optimizing
therapeutic efficiency. Clin Chem. 43:254-66, 1997
Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ. High density synthetic oligonucleotide
arrays. Nat Genet. 21(1 Suppl):20-4, 1999
Lipshutz RJ, Morris D, Chee M, Hubbell E, Kozal MJ, Shah N, Shen N, Yang R, Fodor SP. Using
oligonucleotide probe arrays to access genetic diversity. Biotechniques. 19:442-7, 1995
Little DP, Braun A, O’Donnell MJ, Koster H. Mass spectrometry from miniaturized arrays for
full comparative DNA analysis. Nat Med. 3:1413-6, 1997
Liu R, Paxton WA, Choe S, Ceradini D, Martin SR, Horuk R, MacDonald ME, Stuhlmann H,
Koup RA, Landau NR. Homozygous defect in HIV-1 coreceptor accounts for resistance of
some multiply-exposed individuals to HIV-1 infection. Cell 86:367-77, 1996
Liu YH. Bai J. Zhu Y. Liang X. Siemieniak D. Venta PJ. Lubman DM. Rapid screening of genetic
polymorphisms using buccal cell DNA with detection by matrix-assisted laser
desorption/ionization mass spectrometry. Rapid Communications in Mass Spectrometry. 9:735-43,
1995
Livak KJ. Flood SJ. Marmaro J. Giusti W. Deetz K. Oligonucleotides with fluorescent dyes at
opposite ends provide a quenched probe system useful for detecting PCR product and
nucleic acid hybridization. Genome Research. 4:357-62, 1995
Livak KJ. Hainer JW. A microtiter plate assay for determining apolipoprotein E genotype and
discovery of a rare allele. Human Mutation. 3:379-85, 1994
Lizardi PM, Huang X, Zhu Z, Bray-Ward P, Thomas DC, Ward DC. Mutation detection and
single-molecule counting using isothermal rolling-circle amplification. Nat Genet. 19:225-32, 1998
Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C,
Kobayashi M, Horton H, Brown EL. Expression monitoring by hybridization to high-density
oligonucleotide arrays. Nat Biotechnol. 14:1675-80, 1996
Loeb L.A., Preston B.D. Mutagenesis by apurinic/apyrimidinic sites. Ann. Rev. Genet. 20:201-230, 1986
Lonjou C. Collins A. Morton NE. Allelic association between marker loci. Proceedings of the
National Academy of Sciences of the United States of America. 96:1621-6, 1999
Luo J. Bergstrom DE. Barany F. Improving the fidelity of Thermus thermophilus DNA ligase.
Nucleic Acids Research. 24:3071-8, 1996
Lyamichev V, Mast AL, Hall JG, Prudent JR, Kaiser MW, Takova T, Kwiatkowski RW, Sander TJ,
de Arruda M, Arco DA, Neri BP, Brow MA. Polymorphism identification and quantitative
detection of genomic DNA by invasive cleavage of oligonucleotide probes. Nat Biotechnol.
17:292-6, 1999
Lysov, P. Yu, V.L. Florentiev, A.A. Khorlyn, K.R. Khrapko, V.V. Shick, and A.D. Mirzabekov.
Dokl. Akad. Nauk. SSSR. 303:1508-1511, 1989
Maniatis T. Kee SG. Efstratiadis A. Kafatos FC. Amplification and characterization of a beta-globin gene synthesized in vitro. Cell. 8:163-82, 1976
Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. Detecting protein
function and protein-protein interactions from genome sequences. Science. 285:751-3,
1999
Marcotte EM, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D. A combined algorithm
for genome-wide prediction of protein function. Nature. 402:83-6, 1999
Marshall A, Hodgson J. DNA chips: an array of possibilities. Nat Biotechnol. 16:27-31, 1998
Martinson JJ, Chapman NH, Rees DC, Liu Y-T, Clegg JB. Global distribution of the CCR5 gene
32-basepair deletion. Nature Genet 16:100-3, 1997
Maskos U, Southern EM. A novel method for the analysis of multiple sequence variants by
hybridisation to oligonucleotides. Nucleic Acids Res. 21:2267-8, 1993
Maskos U, Southern EM. A novel method for the parallel analysis of multiple mutations in
multiple samples. Nucleic Acids Res. 21:2269-70, 1993
Maskos U, Southern EM. Oligonucleotide hybridizations on glass supports: a novel linker
for oligonucleotide synthesis and hybridization properties of oligonucleotides
synthesised in situ. Nucleic Acids Res. 20:1679-84, 1992
Masood E. As consortium plans free SNP map of human genome. Nature. 398:545-6, 1999
Matson RS, Rampal J, Pentoney SL Jr, Anderson PD, Coassin P. Biopolymer synthesis on
polypropylene supports: oligonucleotide arrays. Anal Biochem. 224:110-6, 1995
Matson RS, Rampal JB, Coassin PJ. Biopolymer synthesis on polypropylene supports. I.
Oligonucleotides. Anal Biochem. 217:306-10, 1994
Matsuura et al. Nature Genet. 19:179, 1998
Maxam AM. Gilbert W. A new method for sequencing DNA. Proceedings of the National
Academy of Sciences of the United States of America. 74:560-4, 1977
McGall G et al. J Am Chem Soc 119:5081, 1997
McGall G, Labadie J, Brock P, Wallraff G, Nguyen T, Hinsberg W. Light-directed synthesis of
high-density oligonucleotide arrays using semiconductor photoresists. Proc Natl Acad Sci
U S A 93:13555-60, 1996
McGlennen RC. Dynamic mutations pose unique challenges for the molecular diagnostics
laboratory [comment]. Clinical Chemistry. 42:1582-8, 1996
Meldrum DR, Evensen HT, Pence WH, Moody SE, Cunningham DL, Wiktor PJ. ACAPELLA-1K, A capillary-based submicroliter automated fluid handling system for genome analysis.
Genome Res. 10:95-104, 2000
Metzker ML. Lu J. Gibbs RA. Electrophoretically uniform fluorescent dyes for automated
DNA sequencing. Science. 271:1420-2, 1996
Mikkelsson J, Perola M, Kauppila LI, Laippala P, Savolainen V, Pajarinen J, Penttila A, Karhunen
PJ. The GPIIIa Pl(A) polymorphism in the progression of abdominal aortic atherosclerosis.
Atherosclerosis. 147:55-60, 1999
Milner N, Mir KU, Southern EM. Selecting effective antisense reagents on combinatorial
oligonucleotide arrays. Nat Biotechnol. 15:537-41, 1997
Mir KU, Southern EM. Determining the influence of structure on hybridization using oligonucleotide arrays. Nat Biotechnol. 17:788-92, 1999
Mirzabekov AD. DNA sequencing by hybridization—a megasequencing method and a
diagnostic tool? Trends Biotechnol. 12:27-32, 1994
Mitra RD, Church GM. In situ localized amplification and contact replication of many individual DNA molecules. Nucleic Acids Res. 27(24):e34, 1999
Moffatt MF. Traherne JA. Abecasis GR. Cookson WOCM. Single nucleotide polymorphism
and linkage disequilibrium within the TCR alpha/delta locus. Hum. Mol. Gen. 9:1011-9, 2000
Morley JM, Bark JE, Evans CE, Perry JG, Hewitt CA, Tully G. Validation of mitochondrial DNA
minisequencing for forensic casework. Int J Legal Med. 112:241-8, 1999
Morozov VN, Morozova TYa. Electrospray deposition as a method for mass fabrication of
mono- and multicomponent microarrays of biological and biologically active substances.
Anal Chem. 71:3110-7, 1999
Mullis KB. Faloona FA. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain
reaction. Methods in Enzymology. 155:335-50, 1987
Murray V. Improved double-stranded DNA sequencing using the linear polymerase chain
reaction. Nucleic Acids Res. 17:8889, 1989
Myers RM. Larin Z. Maniatis T. Detection of single base substitutions by ribonuclease
cleavage at mismatches in RNA:DNA duplexes. Science. 230:1242-6, 1985
Myers RM. Lumelsky N. Lerman LS. Maniatis T. Detection of single base substitutions in total
genomic DNA. Nature. 313:495-8, 1985
Nazarenko IA. Bhatnagar SK. Hohman RJ. A closed tube format for amplification and detection of DNA based on energy transfer. Nucleic Acids Research. 25:2516-21, 1997
Newton CR, Graham A, Heptinstall LE, Powell SJ, Summers C, Kalsheker N, et al. Analysis of
any point mutation in DNA. The amplification refractory mutation system (ARMS). Nucleic
Acids Res. 17:2503-16, 1989
Nguyen HK, Fournier O, Asseline U, Dupret D, Thuong NT. Smoothing of the thermal stability
of DNA duplexes by using modified nucleosides and chaotropic agents. Nucleic Acids Res.
27:1492-8, 1999
Nickerson DA. Kaiser R. Lappin S. Stewart J. Hood L. Landegren U. Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay. Proceedings of the National
Academy of Sciences of the United States of America. 87:8923-7, 1990
Nickerson, D.A. et al. DNA sequence diversity in a 9.7-kb region of the human lipoprotein
lipase gene. Nature Genet. 19, 233, 1998
Nilsson M. Krejci K. Koch J. Kwiatkowski M. Gustavsson P. Landegren U. Padlock probes
reveal single-nucleotide differences, parent of origin and in situ distribution of centromeric
sequences in human chromosomes 13 and 21. Nature Genetics. 16:252-5, 1997
Nilsson M. Malmgren H. Samiotaki M. Kwiatkowski M. Chowdhary BP. Landegren U. Padlock
probes: circularizing oligonucleotides for localized DNA detection. Science. 265:2085-8,
1994
Northrup MA, Christel LA, McMillan WA, Petersen K, Pourahmadi F, Western L, and Young S.
A new generation of PCR instruments and nucleic acid systems. In: PCR Applications:
protocols for functional genomics. (ed. Innis MA, Gelfand DH and Sninsky JJ) pp. 105-126.
Academic Press, San Diego, 1999
Nyren P. Karamohamed S. Ronaghi M. Detection of single-base changes using a
bioluminometric primer extension assay. Analytical Biochemistry. 244:367-73, 1997
O’Donovan MC. Oefner PJ. Roberts SC. Austin J. Hoogendoorn B. Guy C. Speight G.
Upadhyaya M. Sommer SS. McGuffin P. Blind analysis of denaturing high-performance
liquid chromatography as a tool for mutation detection. Genomics. 52:44-9, 1998
Okamoto T. Suzuki T. Yamamoto N. Microarray fabrication with covalent attachment of DNA
using Bubble Jet technology. Nature Biotech. 18:438, 2000
Parker KC, Haff L, Garvin AM. MALDI-TOF based mutation detection using tagged in vitro
synthesized peptides. Nature Biotechnology 18:95, 2000
Pastinen T, Syvänen AC, Sitbon G, Lönngren J. Fluorescent, solid-phase minisequencing
method for genotyping cytochrome P450 genes. In: PCR applications: Protocols for
functional genomics. Ed. Michael Innis, David Gelfand and John Sninsky. Academic Press. Pp.
521-536, 1999
Pauling L, Itano HA, Singer SJ, Wells IC. Sickle cell anemia: A molecular disease. Science
110:543, 1949
Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Fodor SP. Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci U S A. 91:5022-6, 1994
Pecheniuk NM. Marsh NA. Walsh TP. Dale JL. Use of first nucleotide change technology to
determine the frequency of factor V Leiden in a population of Australian blood donors.
Blood Coagulation & Fibrinolysis. 8:491-5, 1997
Peltonen L, Jalanko A, Varilo T. Molecular genetics of the Finnish disease heritage. Hum Mol
Genet. 8:1913-23, 1999
Piggee CA, Muth J, Carrilho E, Karger BL. Capillary electrophoresis for the detection of
known point mutations by single-nucleotide primer extension and laser-induced fluorescence detection. J Chromatogr A. 781:367-75, 1997
Plaschke J, Voss H, Hahn M, Ansorge W, Schackert HK. Doublex sequencing in molecular
diagnosis of hereditary diseases. Biotechniques. 24:838-41, 1998
Powell et al. New England Journal of Medicine 329:1982, 1993
Proudnikov D, Timofeev E, Mirzabekov A. Immobilization of DNA in polyacrylamide gel for
the manufacture of DNA and DNA-oligonucleotide microchips. Anal Biochem. 259:34-41,
1998
Quesada MA. Replaceable polymers in DNA sequencing by capillary electrophoresis.
Current Opinion in Biotechnology. 8:82-93, 1997
Ramsey JM, Jacobson SC, Knapp MR. Microfabricated chemical measurement systems. Nat
Med. 1:1093-6, 1995
Rehman FN, Audeh M, Abrams ES, Hammond PW, Kenney M, Boles TC. Immobilization of
acrylamide-modified oligonucleotides by co-polymerization. Nucleic Acids Res. 27:649-55,
1999
Rieder MJ. Taylor SL. Clark AG. Nickerson DA. Sequence variation in the human angiotensin
converting enzyme. Nature Genetics. 22:59-62, 1999
Rigler R. Fluorescence correlations, single molecule detection and large number screening.
Applications in biotechnology. J Biotechnol. 41:177-86, 1995
Risch N, and Merikangas K. The future of genetic studies of complex human diseases.
Science 273:1516-7, 1996
Rogers YH, Jiang-Baucom P, Huang ZJ, Bogdanov V, Anderson S, Boyce-Jacino MT. Immobilization of oligonucleotides onto a glass support via disulfide bonds: A method for preparation of DNA microarrays. Anal Biochem. 266:23-30, 1999
Ronaghi M. Karamohamed S. Pettersson B. Uhlen M. Nyren P. Real-time DNA sequencing
using detection of pyrophosphate release. Analytical Biochemistry. 242:84-9, 1996
Roskey MT. Juhasz P. Smirnov IP. Takach EJ. Martin SA. Haff LA. DNA sequencing by delayed
extraction-matrix-assisted laser desorption/ionization time of flight mass spectrometry.
Proceedings of the National Academy of Sciences of the United States of America. 93:4724-9, 1996
Ross P, Hall L, Smirnov I, and Haff L. High level multiplex genotyping by MALDI-TOF mass
spectrometry. Nat Biotechnol 16:1347-51, 1998
Ross PL. Lee K. Belgrader P. Discrimination of single-nucleotide polymorphisms in human
DNA using peptide nucleic acid probes detected by MALDI-TOF mass spectrometry.
Analytical Chemistry. 69:4197-202, 1997
Ruano G. Kidd KK. Coupled amplification and sequencing of genomic DNA. Proceedings of
the National Academy of Sciences of the United States of America. 88:2815-9, 1991
Saiki RK. Bugawan TL. Horn GT. Mullis KB. Erlich HA. Analysis of enzymatically amplified
beta-globin and HLA-DQ alpha DNA with allele-specific oligonucleotide probes. Nature.
324:163-6, 1986
Saiki RK. Chang CA. Levenson CH. Warren TC. Boehm CD. Kazazian HH Jr. Erlich HA. Diagnosis of sickle cell anemia and beta-thalassemia with enzymatically amplified DNA and nonradioactive allele-specific oligonucleotide probes. New England Journal of Medicine.
319:537-41, 1988
Saiki RK. Gelfand DH. Stoffel S. Scharf SJ. Higuchi R. Horn GT. Mullis KB. Erlich HA. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science.
239:487-91, 1988
Saiki RK. Walsh PS. Levenson CH. Erlich HA. Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes. Proceedings of the National Academy of
Sciences of the United States of America. 86:6230-4, 1989
Sajantila A. Lukka M. Syvanen AC. Experimentally observed germline mutations at human
micro- and minisatellite loci. European Journal of Human Genetics. 7:263-6, 1999
Sambrook, J., Fritsch, E.F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed.
pp. E.5. Cold Spring Harbor Laboratory Press. Cold Spring Harbor, NY, 1989
Samiotaki M. Kwiatkowski M. Parik J. Landegren U. Dual-color detection of DNA sequence
variants by ligase-mediated analysis. Genomics. 20:238-42, 1994
Samson M, Libert F, Doranz BJ, Rucker J, Liesnard C, Farber C-M, Saragosti S, Lapoumeroulie
C, Cognaux J, Forceille C, Muyldermans G, Verhofstede C, Burtonboy G, Georges M, Imai T,
Rana S, Yi Y, Smyth RJ, Collman RG, Doms RW, Vassart G, Parmentier M. Resistance to HIV-1
infection in Caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor
gene. Nature 382:722-5, 1996
Sanger F. Nicklen S. Coulson AR. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the United States of America. 74:5463-7,
1977
Sapolsky RJ, Lipshutz RJ. Mapping genomic library clones using oligonucleotide arrays.
Genomics 33:445-56, 1996
Sauer S, Lechner D, Berlin K, Lehrach H, Escary JL, Fox N, Gut IG. A novel procedure for
efficient genotyping of single nucleotide polymorphisms. Nucleic Acids Res 28:e13, 2000
Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression
patterns with a complementary DNA microarray. Science. 270:467-70, 1995
Shalon, D., S.J. Smith, and P.O. Brown. A DNA microarray system for analyzing complex DNA
samples using two-color fluorescent probe hybridization. Genome Res. 6: 639-45, 1996
Sharkey DJ. Scalice ER. Christy KG Jr. Atwood SM. Daiss JL. Antibodies as thermolabile
switches: high temperature triggering for the polymerase chain reaction. Bio/Technology.
12:506-9, 1994
Shoemaker DD, Lashkari DA, Morris D, Mittmann M, Davis RW. Quantitative phenotypic
analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy. Nat
Genet. 14:450-6, 1996
Shuber AP. Grondin VJ. Klinger KW. A simplified procedure for developing multiplex PCRs.
Genome Research. 5:488-93, 1995
Shuber AP. Michalowsky LA. Nass GS. Skoletsky J. Hire LM. Kotsopoulos SK. Phipps MF.
Barberio DM. Klinger KW. High throughput parallel analysis of hundreds of patient samples
for more than 100 mutations in multiple disease genes. Human Molecular Genetics. 6:337-47, 1997
Shumaker JM, Metspalu A, Caskey CT. Mutation detection by solid phase primer extension.
Hum Mutat. 7:346-54, 1996
Sitbon G. Hurtig M. Palotie A. Lonngren J. Syvanen AC. A colorimetric minisequencing assay
for the mutation in codon 506 of the coagulation factor V gene. Thrombosis & Haemostasis.
77:701-3, 1997
Smith HO, and Wilcox KW. A restriction enzyme from Hemophilus influenzae. I. Purification
and general properties. J Mol Biol 51:379, 1970
Sokolov BP. Primer extension technique for the detection of single nucleotide in genomic
DNA. Nucleic Acids Res. 18:3671, 1990
Solinas-Toldo S, Lampel S, Stilgenbauer S, Nickolenko J, Benner A, Dohner H, Cremer T,
Lichter P. Matrix-based comparative genomic hybridization: biochips to screen for genomic
imbalances. Genes Chromosomes Cancer 20:399-407, 1997
Sommer SS, Cassady JD, Sobell JL, Bottema CDK. A novel method for detecting point mutations or polymorphisms and its application to population screening for carriers of phenylketonuria. Mayo Clin. Proc. 64:1361-72, 1989
Sosnowski RG, Tu E, Butler WF, O’Connell JP, Heller MJ. Rapid determination of single base
mismatch mutations in DNA hybrids by direct electric field control. Proc Natl Acad Sci U S A.
94:1119-23, 1997
Southern EM 1988 [Analyzing polynucleotide sequences. International Patent Application
PCT GB 89/00460]
Southern EM, Case-Green SC, Elder JK, Johnson M, Mir KU, Wang L, Williams JC. Arrays of
complementary oligonucleotides for analysing the hybridisation behaviour of nucleic
acids. Nucleic Acids Res. 22:1368-73, 1994
Southern EM, Maskos U, Elder JK. Analyzing and comparing nucleic acid sequences by
hybridization to arrays of oligonucleotides: evaluation using experimental models.
Genomics. 13:1008-17, 1992
Southern EM. Detection of specific sequences among DNA fragments separated by gel
electrophoresis. J Mol Biol 98:503-17, 1975
Southern EM. DNA chips: analysing sequence by hybridization to oligonucleotides on a
large scale. Trends Genet. 12:110-5, 1996
Steemers FJ, Ferguson JA, Walt DR. Screening unlabeled DNA targets with randomly ordered
fiber-optic gene arrays. Nat Biotechnol. 18:91-4, 2000
Stimpson DI, Cooley PW, Knepper SM, Wallace DB. Parallel production of oligonucleotide
arrays using membranes and reagent jet printing. Biotechniques. 25:886-90, 1998
Stimpson DI, Hoijer JV, Hsieh WT, Jou C, Gordon J, Theriault T, Gamble R, Baldeschwieler JD.
Real-time detection of DNA hybridization and melting on oligonucleotide arrays by using
optical wave guides. Proc Natl Acad Sci U S A. 92:6379-83, 1995
Stomakhin AA, Vasiliskov VA, Timofeev E, Schulga D, Cotter RJ, Mirzabekov AD. DNA
sequence analysis by hybridization with oligonucleotide microchips: MALDI mass spectrometry identification of 5mers contiguously stacked to microchip oligonucleotides.
Nucleic Acids Res. 28:1193-1198, 2000
Strezoska Z, Paunesku T, Radosavljevic D, Labat I, Drmanac R, Crkvenjakov R. DNA sequencing by hybridization: 100 bases read by a non-gel-based method. Proc Natl Acad Sci U S A.
88:10089-93, 1991
Syvanen AC. Aalto-Setala K. Harju L. Kontula K. Soderlund H. A primer-guided nucleotide
incorporation assay in the genotyping of apolipoprotein E. Genomics. 8:684-92, 1990
Syvanen AC. Aalto-Setala K. Kontula K. Soderlund H. Direct sequencing of affinity-captured
amplified human DNA application to the detection of apolipoprotein E polymorphism. FEBS
Letters. 258:71-4, 1989
Syvanen AC. Ikonen E. Manninen T. Bengtstrom M. Soderlund H. Aula P. Peltonen L. Convenient and quantitative determination of the frequency of a mutant allele using solid-phase
minisequencing: application to aspartylglucosaminuria in Finland. Genomics. 12:590-5,
1992
Syvanen AC. Sajantila A. Lukka M. Identification of individuals by analysis of biallelic DNA
markers, using PCR and solid-phase minisequencing. American Journal of Human Genetics.
52:46-59, 1993
Syvanen, A.C. From gels to chips: “minisequencing” primer extension for analysis of point
mutations and single nucleotide polymorphisms. Human Mutation 13:1-10, 1999
Tabor S. Richardson CC. A single residue in DNA polymerases of the Escherichia coli DNA
polymerase I family is critical for distinguishing between deoxy- and
dideoxyribonucleotides. Proceedings of the National Academy of Sciences of the United
States of America. 92:6339-43, 1995
Tang, K., D.-J. Fu, D. Julien, A. Braun, C.R. Cantor, and H. Koster. Chip-based genotyping by
mass spectrometry. Proc. Natl. Acad. Sci. 96: 10016-10020, 1999
Tapp I, Malmberg L, Rennel E, Wik M, Syvanen AC. Homogeneous scoring of single-nucleotide polymorphisms: Comparison of the 5’-nuclease TaqMan assay and molecular
beacon probes. Biotechniques 28:0-0, 2000
Terwilliger JD. Weiss KM. Linkage disequilibrium mapping of complex disease: fantasy or
reality?. Current Opinion in Biotechnology. 9:578-94, 1998
Thacker J. The molecular nature of mutation in cultured mammalian cells: a review. Mutat.
Res. 150:431-442, 1985
The Huntington’s Disease Collaborative Research Group. A novel gene containing a
trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes.
Cell. 72:971-83, 1993
Thonnard J. Deldime F. Heusterspreute M. Delepaut B. Hanon F. De Bruyere M. Philippe M.
HLA class II genotyping: two assay systems compared. Clinical Chemistry. 41:553-6, 1995
Tobe VO. Taylor SL. Nickerson DA. Single-well genotyping of diallelic sequence variations
by a two-color ELISA-based oligonucleotide ligation assay. Nucleic Acids Research.
24:3728-32, 1996
Torrents D, Mykkanen J, Pineda M, Feliubadalo L, Estevez R, de Cid R, Sanjurjo P, Zorzano A,
Nunes V, Huoponen K, Reinikainen A, Simell O, Savontaus ML, Aula P, Palacin M. Identification
of SLC7A7, encoding y+LAT-1, as the lysinuric protein intolerance gene. Nat Genet. 21:293-6, 1999
Tully G, Sullivan KM, Nixon P, Stones RE, Gill P. Rapid detection of mitochondrial sequence
polymorphisms using multiplex solid-phase fluorescent minisequencing. Genomics.
34:107-13, 1996
Turner MW. Mannose-binding lectin: the pluripotent molecule of the innate immune
system. Imm Today 17:532-40, 1996
Tuuminen T. Ingman H. Therrell BL Jr. Kallio A. Multivariant confirmation of sickle cell disease
using a non-radioactive minisequencing reaction. Hemoglobin. 21:71-89, 1997
Tyagi S. Bratu DP. Kramer FR. Multicolor molecular beacons for allele discrimination. Nature
Biotechnology. 16:49-53, 1998
Tyagi S. Kramer FR. Molecular beacons: probes that fluoresce upon hybridization. Nature
Biotechnology. 14:303-8, 1996
Tyagi S. Landegren U. Tazi M. Lizardi PM. Kramer FR. Extremely sensitive, background-free
gene detection using binary probes and beta replicase. Proceedings of the National
Academy of Sciences of the United States of America. 93:5395-400, 1996
Underhill PA. Jin L. Lin AA. Mehdi SQ. Jenkins T. Vollrath D. Davis RW. Cavalli-Sforza LL. Oefner
PJ. Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Research. 7:996, 1997
Vartiainen E, Puska P, Pekkanen J, Tuomilehto J, Jousilahti P. Changes in risk factors explain
changes in mortality from ischaemic heart disease in Finland. BMJ. 309:23-7, 1994
Vasiliskov AV, Timofeev EN, Surzhikov SA, Drobyshev AL, Shick VV, Mirzabekov AD.
Fabrication of microarray of gel-immobilized compounds on a chip by copolymerization.
Biotechniques. 27:592, 1999
Verheijen FW, Verbeek E, Aula N, Beerens CE, Havelaar AC, Joosse M, Peltonen L, Aula P,
Galjaard H, van der Spek PJ, Mancini GM. A new gene, encoding an anion transporter, is
mutated in sialic acid storage diseases. Nat Genet. 23:462-5, 1999
Virtaneva K. D’Amato E. Miao J. Koskiniemi M. Norio R. Avanzini G. Franceschetti S.
Michelucci R. Tassinari CA. Omer S. Pennacchio LA. Myers RM. Dieguez-Lucena JL. Krahe R.
de la Chapelle A. Lehesjoki AE. Unstable minisatellite expansion causing recessively
inherited myoclonus epilepsy, EPM1. Nature Genetics. 15: 393-6, 1997
Vo-Dinh T, Alarie JP, Isola N, Landis D, Wintenberg AL, Ericson MN. DNA biochip using a
phototransistor integrated circuit. Anal Chem. 71:358-63, 1999
Walker GT. Fraiser MS. Schram JL. Little MC. Nadeau JG. Malinowski DP. Strand displacement
amplification—an isothermal, in vitro DNA amplification technique. Nucleic Acids Research.
20:1691-6, 1992
Walker GT. Little MC. Nadeau JG. Shank DD. Isothermal in vitro amplification of DNA by a
restriction enzyme/DNA polymerase system. Proceedings of the National Academy of
Sciences of the United States of America. 89:392-6, 1992
Wall J. Cai S. Chehab FF. A 31-mutation assay for cystic fibrosis testing in the clinical molecular diagnostics laboratory. Human Mutation. 5:333-8, 1995
Wallace RB. Shaffer J. Murphy RF. Bonner J. Hirose T. Itakura K. Hybridization of synthetic
oligodeoxyribonucleotides to phi chi 174 DNA: the effect of single base pair mismatch.
Nucleic Acids Research. 6:3543-57, 1979
Wallraff et al. Chemtech 22-32, 1997
Wang D.G. et al. Large-scale identification, mapping, and genotyping of single-nucleotide
polymorphisms in the human genome. Science 280, 1077–1082, 1998
Weber JL. Human DNA polymorphisms and methods of analysis. Current Opinion in
Biotechnology. 1:166-71, 1990
Weber JL. May PE. Abundant class of human DNA polymorphisms which can be typed using
the polymerase chain reaction. American Journal of Human Genetics. 44:388-96, 1989
Weiler J, Gausepohl H, Hauser N, Jensen ON, Hoheisel JD. Hybridisation based DNA
screening on peptide nucleic acid (PNA) oligomer arrays. Nucleic Acids Res. 25:2792-9,
1997
Westin L, Xu X, Miller C, Wang L, Edman CF, Nerenberg M. Anchored multiplex amplification
on a microelectronic chip array. Nat Biotechnol. 18:199-204, 2000
Westman P, Kuismin T, Partanen J, Koskimies S. An HLA-DR typing protocol using group-specific PCR-amplification followed by restriction enzyme digests. Eur J Immunogen
20:103-9, 1993
Winzeler EA, Richards DR, Conway AR, Goldstein AL, Kalman S, McCullough MJ, McCusker
JH, Stevens DA, Wodicka L, Lockhart DJ, Davis RW. Direct allelic variation scanning of the
yeast genome. Science. 281:1194-7, 1998
Wong C. Dowling CE. Saiki RK. Higuchi RG. Erlich HA. Kazazian HH Jr. Characterization of
beta-thalassaemia mutations using direct genomic sequencing of amplified single copy
DNA. Nature. 330(6146):384-6, 1987
Wu DY, Nozari G, Schold M, Conner BJ, Wallace RB. Direct analysis of single nucleotide
variation in human DNA and RNA using in situ dot hybridization. DNA. 8:135-42, 1989
Wu DY, Ugozzoli L, Pal BK, Wallace RB. Allele-specific enzymatic amplification of beta-globin
genomic DNA for diagnosis of sickle cell anemia. Proc. Natl. Acad. Sci. USA 86:2757-60, 1989
Wu DY, Wallace RB. The ligation amplification reaction (LAR) - amplification of specific DNA
sequences using sequential rounds of template-dependent ligation. Genomics 4:560-9,
1989
Yershov G, Barsky V, Belgovskiy A, Kirillov E, Kreindlin E, Ivanov I, Parinov S, Guschin D,
Drobishev A, Dubiley S, Mirzabekov A. DNA analysis and diagnostics on oligonucleotide
microchips. Proc Natl Acad Sci U S A. 93:4913-8, 1996
Zangenberg G, Saiki RK, Reynolds R. Multiplex PCR: Optimization guidelines. In: PCR
applications: Protocols for functional genomics. Ed. Michael Innis, David Gelfand and John
Sninsky. Academic Press. Pp. 73-94, 1999
Zhang H, Tombline G, Weber BL. BRCA1, BRCA2, and DNA damage response: collision or
collusion? Cell. 92:433-6, 1998
Zhang L, Cui X, Schmitt K, Hubert R, Navidi W, Arnheim N. Whole genome amplification from
a single cell: implications for genetic analysis. Proc Natl Acad Sci U S A. 89:5847-51, 1992
Zirvi M, Bergstrom DE, Saurage AS, Hammer RP, Barany F. Improved fidelity of thermostable
ligases for detection of microsatellite repeat sequences using nucleoside analogs. Nucleic
Acids Research Methods 27:e41, 1999a
Zirvi M, Nakayama T, Newman G, McCaffrey T, Paty P, Barany F. Ligase-based detection of
mononucleotide repeat sequences. Nucleic Acids Research Methods. 27:e42, 1999b