Scoring human genomic SNPs and mutations: Multiplexed primer extension with manifolds and microarrays as solid-support

by Tomi Pastinen

Department of Human Molecular Genetics, National Public Health Institute, and Department of Medical Genetics, University of Helsinki, Helsinki, Finland

Academic dissertation

To be publicly discussed by the permission of the Medical Faculty of the University of Helsinki, in the Small Lecture Hall of the Haartman Institute on June 20th, at 12 noon

Helsinki 2000

Supervised by
Professor Leena Peltonen (Palotie), Department of Human Molecular Genetics, National Public Health Institute, and Department of Medical Genetics, University of Helsinki, Helsinki, Finland
Professor Ann-Christine Syvänen, Department of Human Molecular Genetics, National Public Health Institute, Helsinki, Finland

Reviewed by
Professor Olli-Pekka Kallioniemi, Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
Professor Ulf Landegren, Department of Genetics and Pathology, Rudbeck Laboratory, University of Uppsala, Uppsala, Sweden

Publications of the National Public Health Institute NPHI A5/2000
Copyright National Public Health Institute

Julkaisija - Utgivare - Publisher
Kansanterveyslaitos, Mannerheimintie 166, 00300 Helsinki, puh. vaihde 09-47441, telefax 09-47448408
Folkhälsoinstitutet, Mannerheimvägen 166, 00300 Helsingfors, tel. växel 09-47441, telefax 09-47448408
National Public Health Institute, Mannerheimintie 166, FIN-00300 Helsinki, Finland, phone +358-9-47441, telefax +358-9-47448408

ISBN 951-740-171-X
ISSN 0359-3584
ethesis (PDF) ISBN 952-91-2256-X

To Nathalie

Contents

LIST OF ORIGINAL PUBLICATIONS
SUMMARY
INTRODUCTION
REVIEW OF THE LITERATURE
  Interindividual sequence variation
  Frequency and distribution of human sequence variations
  Dynamic mutations
  Polymorphic markers in human genetics
  Genotyping technology before the PCR era
  Amplification of target DNA
  Mutation detection in amplified DNA
  Unknown mutations
  Screening for known sequence variation
    PCR RFLP
    Allele specific oligonucleotide (ASO) hybridisation
    Ligation assay
    Allele specific PCR
    Minisequencing primer extension
    Homogenous assays
    Assays with signal amplification
  DNA-array technology
    Origins of DNA microarray concept
    Array construction
    In situ synthesis
    Spotted arrays
    Array reading
  Comparative sequencing on DNA-microarrays
    Arrays in sequence scanning
    Scoring SNPs or mutations on DNA-microarrays
    ASO-hybridization based methods
    DNA-modifying enzymes in microarray genotyping
    Summary
  Practical alternatives to PCR?
  Alternatives to microarrays for multiplexing
  Use of sequence variations in modern human genetics
    Routine mutation/SNP-scoring
    LD mapping of complex traits
AIMS OF THE PRESENT STUDY
MATERIALS AND METHODS
  DNA samples and extraction of DNA
  Primer synthesis
  PCR amplification
  Affinity capture and ssDNA preparation
  Electrophoretic separation
  Preparation of microarrays
  Genotyping reactions
  Quantitation and interpretation of the results
  Reference methods
  Statistical methods
RESULTS AND DISCUSSION
  Design of assays
    Length-labeled multiplex fluorescent minisequencing
    Minisequencing primer extension arrays
    Allele specific extension arrays
  Optimization of genotype discrimination
    Length-labeled multiplex minisequencing assays
    Array-based assays
  Assay procedures
    Multiplex PCR
    Length labeled primers for multiplex minisequencing
    Array-based extension assays
  Applications
    HLA typing
    Screening for mutations and SNPs
CONCLUDING REMARKS
ACKNOWLEDGEMENTS
REFERENCES

LIST OF ORIGINAL PUBLICATIONS (in addition unpublished data is presented)

I Tomi Pastinen, Jukka Partanen and Ann-Christine Syvänen: Multiplex, fluorescent solid-phase minisequencing for efficient screening of DNA variation. (1996) Clinical Chemistry 42:1391-7.

II Tomi Pastinen, Ants Kurg, Andres Metspalu, Leena Peltonen and Ann-Christine Syvänen: Minisequencing: a specific tool for DNA analysis and diagnostics on oligonucleotide arrays. (1997) Genome Research 7:606-14.

III Tomi Pastinen, Kirsi Liitsola, Paavo Niini, Mika Salminen and Ann-Christine Syvänen: Contribution of the CCR5 and MBL genes in susceptibility to HIV-1 infection in Finnish Population. (1998) AIDS Research and Human Retroviruses 14:695-8.
IV Tomi Pastinen*, Markus Perola*, Paavo Niini, Joe Terwilliger, Veikko Salomaa, Erkki Vartiainen, Leena Peltonen and Ann-Christine Syvänen: Array-based multiplex analysis of candidate genes reveals two independent and additive genetic risk factors for myocardial infarction in the Finnish population. (1998) Human Molecular Genetics 7:1453-62.

V Tomi Pastinen, Mirja Raitio, Katarina Lindroos, Paivi Tainola, Leena Peltonen and Ann-Christine Syvänen: A system for specific, high-throughput genotyping by allele-specific primer extension on microarrays. (2000) Genome Research in press.

*equally contributed (IV has previously appeared in Dr. Markus Perola's PhD thesis, May 1999)

SUMMARY

Single nucleotide polymorphisms (SNPs) represent the most common form of sequence variation among individuals: three million common SNPs with a population frequency of over 5% have been estimated to be present in the human genome. Furthermore, simple substitution mutations account for the majority of disease alleles identified for inherited disorders. The Human Genome Project's sequencing effort is enabling large-scale, genome-wide comparative sequencing to identify common sequence variants. Thus, a genetic map of unprecedented resolution, containing several hundred thousand SNP markers, is being constructed. High-throughput methods for scoring allelic variants of SNPs and point mutations are imperative not only for efficient use of the new markers and for screening for disease mutations, but also for characterizing functional polymorphisms affecting drug metabolism. In this thesis, methods based on enzymatic discrimination of sequence variation in multiplexed formats are developed and applied. Minisequencing is based on a detection primer that anneals just 5' to the nucleotide of interest and is extended with labeled nucleoside triphosphate analogues by a DNA polymerase.
The fidelity of the DNA polymerase ensures that only a nucleotide complementary to the site of interest is incorporated in the reaction, specifically identifying the allele. A multiplexed, fluorescent minisequencing method for scoring SNPs at the human leukocyte antigen class II genes was developed based on size addressing of detection primers specific for each site. A manifold support was utilized to immobilize the amplified targets, and a multiplexed minisequencing reaction with fluorescein-labeled dideoxynucleotides was carried out. The reaction products were analyzed by size separation by electrophoresis in an automated sequencer. A 100% concordance of the typing results with samples of known genotype was achieved. This convenient procedure allowed rapid scoring of DQA1 and DRB1 alleles in a cohort of multiple sclerosis affected individuals and their parents. DNA-microarrays are solid substrates with an ordered set of immobilized oligo- or polynucleotide probes in a miniaturized format. A DNA-microarray based minisequencing primer extension method for simultaneous genotyping of disease mutations and SNPs was developed. The power of genotype discrimination using the enzyme-assisted minisequencing procedure was shown to be 10-fold better than that of allele specific oligonucleotide hybridization (ASO) in a pairwise comparison on the microarray format. The specificity of the minisequencing approach allows the use of low complexity microarrays for genotyping applications. Custom-built robotic spotters were used to construct arrays for scoring SNPs related to HIV-1 susceptibility as well as variants associated with increased risk of myocardial infarction (MI). These assays were applied in two case-control association studies in over 600 Finns. The common chemokine receptor gene deletion (CCR5 Δ32bp) and variant alleles of the mannose binding lectin gene were associated with protection against and increased susceptibility to HIV-1 infection, respectively.
Increased risk for MI in the Finnish population was conferred by variant alleles of the platelet glycoprotein IIIa and plasminogen activator inhibitor type-1 genes; this predisposing effect was particularly prominent in subjects who carried both predisposing variants. A related enzyme-assisted genotyping method, called allele-specific extension on DNA-microarrays, was then developed to simplify the reaction procedure. In this method, only a single post-PCR liquid handling step is required for multiplexed genotyping. The specificity of genotype discrimination remained high, and in-house manufactured miniaturized reaction chambers enabled scoring of over 2500 genotypes from a single glass microscope slide. Only one fluorescent dye is required, but the use of another dye as an internal control allowed minority mutation detection at the 5% level. A panel of 31 Finnish disease mutations and another panel of 11 SNPs were evaluated in 424 samples, with accurate assignment of known genotypes and with an over 96% success rate. The assay for Finnish disease mutations was recently applied to nearly 2500 population based samples and blinded positive controls to determine carrier frequencies and the geographic variation of the carrier frequencies in Finland. The work in this thesis demonstrates that significant improvement in single nucleotide variation scoring capacity can be achieved with enzymatic discrimination based multiplexed methodology. Furthermore, the approaches described are suitable for any molecular biology unit, as the reagents and instrumentation are now widely available.

INTRODUCTION

Genomics can be understood as the study of biology using specific tools of cloned genes and, if possible, in a genome-wide manner. Cataloguing human genes and their functions has been compared to the construction of a periodic table of biology (Lander 1996), analogous to the chemical periodic table of elements.
The most straightforward way of acquiring a comprehensive list of human genes is to determine the whole genomic sequence. The current goal of the Human Genome Project is to have the finished human genomic sequence by the year 2003 (Collins et al. 1998); a preliminary partial alignment of the human genome sequence will be available within the next few months. The sequence of one human chromosome has already been released (Dunham et al. 1999), an accomplishment that was doubted by many just 10 years ago. Just as the development of technology to analyze DNA has made the advancement in genome sequencing possible, it is expected to open the door for the post-genome or functional genomics era.

Sequence variation between individuals consists of a continuum from deleterious disease mutations to neutral polymorphisms. Characterization of this variation can be utilized in mapping disease genes, diagnostics, pharmacogenetics (defining functional variation in drug metabolizing enzymes or receptors), individual identification, population genetics, and evaluation of the physiological relevance of individual genes. Thus, selective resequencing to determine genetic variation can be considered an integral part of functional genomics. In this thesis, novel tools have been developed for analyzing single nucleotide variation in a parallel, multiplex manner for various polymorphic and disease loci in the human genome. Our applications for scoring variants to study Finnish genes represent an early step towards large-scale screening of polymorphisms and mutations.

REVIEW OF THE LITERATURE

Interindividual sequence variation

Faithful replication of the 3.3 × 10^9 base pairs of the nuclear human genome is a basic property necessary for each dividing cell. A key group of enzymes in the replication process are the DNA polymerases, which synthesize new DNA strands by incorporating complementary nucleotides with high fidelity.
The enzymes often possess a 3'-5' proofreading activity to correct for misincorporated bases. For example, E. coli DNA polymerase III with proofreading activity has an error rate as low as 5 × 10^-9 (Lewin 1997). Similarly, mammalian DNA polymerase δ, which synthesizes the daughter strand during replication, possesses 3'-5' exonuclease activity which, along with the extensive DNA repair machinery of a eukaryotic cell (see below), serves to ensure perfect copying of the parent strand. However, errors in replication occur in 1-100% of the cell divisions as assayed in mammalian cell culture systems (Thacker 1985).

Both endogenous and exogenous sources of replication errors exist. Endogenous causes include spontaneous depurination of bases (Loeb and Preston 1986) and deamination of cytosine residues (sometimes adenine) (Coulondre et al. 1978), yielding uracil (or hypoxanthine). Variation in microsatellite repeat size arises by intramolecular slipped strand mispairing, whereas interstrand interactions such as gene conversion or recombination give rise to minisatellite variability (Dijan 1998). Deletions and insertions are formed similarly, possibly being promoted by the surrounding direct or inverse repeats. Exogenous mechanisms for mutations include thymidine dimerization induced by UV light; various chemicals, such as alkylating agents forming adducts with the DNA bases; reactive oxygen species damaging pyrimidine and purine rings; and ionizing radiation causing DNA strand nicking and breakage. The importance of exogenous agents in promoting mutagenesis has recently been highlighted by the increased mutation rate in mammals exposed to ionizing radiation by the Chernobyl nuclear catastrophe (Dubrova et al. 1996, Ellegren et al. 1997).

The unavoidable damage to our DNA is counterbalanced by the DNA repair systems present in our cells to retain viability. The crucial nature of these repair mechanisms is evidenced by several inherited diseases caused by defects in the system.
Patients with xeroderma pigmentosum are prone to skin cancer upon exposure to UV light, due to defects in the nucleotide excision repair system that removes thymine dimers and large chemical adducts (Lambert et al. 1998). Similarly, defects in post-replication repair of double-stranded breaks can cause Bloom syndrome (Ellis and German 1996), Nijmegen breakage syndrome (Matsuura et al. 1998) or hereditary ovarian and breast cancer (Zhang et al. 1998). The essential nature of the mismatch repair system proteins is highlighted by germ-line mutations in their genes causing nonpolyposis colon cancer (Aaltonen and Peltomäki 1994).

Frequency and distribution of human sequence variations

Interindividual sequence variation is most frequently seen as differences in the lengths of repeated sequence elements such as minisatellites and microsatellites, as small deletions or insertions, and as substitutions of individual bases. Hypervariable minisatellites with repeated units of 9-64 bp in length are mostly located between genes, and are dispersed unevenly in the genome, preferentially in telomeric locations in human chromosomes (Lathrop et al. 1988). Microsatellite repeats, with repeat units of 1-4 bp, are also mostly non-coding, with the important exception of some trinucleotide repeat expansions causing inherited disorders. Mononucleotide runs of As or Ts compose 0.3% of the nuclear genome, while dinucleotide repeats represent 0.7% of the genome, occurring approximately once in every 50 kb (Weber & May 1989). Substitutions of single nucleotides are the most common form of sequence variation between individuals, occurring every 300-1000 bp in the genome (Li & Sadler 1991, Wang et al. 1998, Cargill et al. 1999, Halushka et al. 1999). Transition mutations (e.g. C to T) are more common than transversion mutations (e.g. A to T or A to C), probably partly due to instability of CpG dinucleotides (Cooper et al. 1995).
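The distinction between transitions and transversions drawn above can be made concrete with a minimal sketch. It assumes the standard definition (a transition exchanges a purine for a purine or a pyrimidine for a pyrimidine; any purine-pyrimidine exchange is a transversion). The function name `classify_substitution` is hypothetical and introduced only for illustration; it is not part of the methods described in this thesis.

```python
# Minimal sketch: classify a single-base substitution by base chemistry.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a ref -> alt base change."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("ref and alt are identical; not a substitution")
    # Same chemical class on both sides means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("G", "A"), ("A", "T"), ("A", "C")]:
        print(f"{ref}>{alt}: {classify_substitution(ref, alt)}")
```

Of the twelve possible single-base changes, four are transitions and eight are transversions, so the observed excess of transitions is not what equal rates for all changes would produce.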
Polymorphisms in the regulatory regions of genes and sequence variants that alter amino acids in the coding regions of human genes are significantly suppressed by selection. This is evidenced by the similar frequency of polymorphisms at fourfold degenerate sites in coding DNA compared to the noncoding DNA of pseudogenes, whereas polymorphisms at twofold degenerate sites and in the 5' flanking, 5' UTR and 3' UTR regions are less common. Finally, nucleotide substitutions at nondegenerate sites have only 25-30% of the frequency of non-coding polymorphisms (Li & Sadler 1991, Cargill et al. 1999, Halushka et al. 1999). Small deletions/insertions (del/ins) usually cause frameshift mutations and are even more strongly suppressed in the coding regions of genes; as assessed recently in the factor IX gene, their frequency was only 1.5% of the mutation frequency of substitutions (Anagnostopoulos et al. 1999). On average, common small del/ins occur once in every 12 kb of genomic DNA (Wang et al. 1998), and a genetic map based on these polymorphisms is being developed (http://www.marshmed.org/genetics/).

Dynamic mutations

After the characterization of the first trinucleotide repeat expansion mutation, causing the Fragile X syndrome (Fu et al. 1991), it became clear that a number of inherited disorders are caused by these unstable repeat mutations. The repeated segments range from short trinucleotide repeats in coding regions, such as in Huntington disease (The Huntington's Disease Collaborative Consortium 1993), to promoter region expansions with a >10 bp repeat size, as in progressive myoclonus epilepsy (Virtaneva et al. 1997). Some of the repeat sequences are too long or too GC-rich to be efficiently amplified and pose a challenge to diagnostic laboratories (reviewed by McGlennen 1996).
Consequently, in some cases the diagnosis is still based on traditional Southern blotting, and the advances in techniques for detection of sequence variation described in this thesis are not applicable to this category of mutations.

Polymorphic markers in human genetics

Initial identification of interindividual genetic variation was made at the protein level. The ABO blood groups (Landsteiner 1901) were the first genetic polymorphic markers described for humans. Coincidentally, close to the time of their discovery in the early 20th century, Sir Archibald Garrod described alkaptonuria, the first inborn error of metabolism. Garrod's work with alkaptonuria outlined many of the cornerstones of a monogenic disease trait, such as familial distribution, a high incidence of consanguineous marriages in affected families, and a pattern of recessive inheritance as described earlier by Mendel. In the early days of human genetics no daily updates to marker databases were made, and the next description of new polymorphic markers - the Rh blood groups - was published several decades later (Landsteiner & Wiener 1940, Levine & Stetson 1939). Gibson was the first to characterize an enzyme defect as a cause of a human inherited disease, almost half a century after the clinical description of alkaptonuria (Gibson et al. 1948). The next events in the development of genetic markers and the discovery of genetic diseases were merged by Linus Pauling's work demonstrating that the sickle cell hemoglobin polypeptide had a different electrophoretic mobility than its wild type counterpart (Pauling et al. 1949). This introduced the molecular disease concept, and provided a powerful tool for studying variation between individuals at the protein level. Many other abundant serum proteins were shown to be polymorphic by protein electrophoresis.
Ingram demonstrated some years later that Pauling's observations were due to a single amino acid substitution in the primary polypeptide sequence (Ingram 1957). More sensitive enzymatic staining techniques enabled detection of polymorphisms also in less abundant proteins (Harris 1966, Lewontin & Hubby 1966). HLA proteins assayed by immunological methods (Dausset 1958, Bach & Voynow 1966, Amos et al. 1969) in the 1960s illustrated the peculiar feature of these molecules as being more diverse than all the other polymorphic markers discovered.

Monitoring of genetic variation at the DNA level became possible when enzymatic manipulation of DNA was discovered. Restriction enzymes (Smith & Wilcox 1970, Kelly & Smith 1970) enabled targeted cutting of human genomic DNA into fragments that could then be cloned into vectors and propagated in suitable bacterial plasmids (Cohen et al. 1973). Fragmented DNA could also be separated by agarose gel electrophoresis (Danna & Nathans 1971), blotted onto nitrocellulose filters and hybridized (Southern et al. 1975). Soon after the discovery of the first restriction fragment length polymorphisms (RFLPs) in the human genome (Kan & Dozy 1978), the construction of a map of the human genome based on these polymorphic DNA markers was suggested (Botstein et al. 1980). The era of reverse genetics had begun. With the polymorphic marker map and linkage analysis one could look for disease causing loci in the human genome without a priori knowledge of the underlying biochemistry. The power of reverse genetics was highlighted by the localisation of the Huntington disease gene (Gusella et al. 1983). The highly polymorphic minisatellites (VNTRs) (Jeffreys et al. 1985) were more informative than RFLPs, but suffered from nonuniform distribution across the human genome, and thus they were mostly utilized in identification of individuals rather than in genomic mapping.
By the time the next generation of polymorphic markers, based on in vitro amplification of genomic DNA, was suggested, there were already over 2000 polymorphic RFLPs mapped in the human genome (Weber 1990). The new microsatellite markers (Weber & May 1989) soon took over the genetic mapping and DNA-based identification fields. Multiallelic microsatellites were easier to assay after PCR amplification and polyacrylamide gel electrophoresis, and they are also more informative than biallelic RFLPs. In 1996 there were already >5000 mapped microsatellite markers included in the 2nd generation genetic marker map (Dib et al. 1996). Consequently, finding disease genes for monogenic disorders has become considerably easier, as evidenced by the over 1000 cloned genes with allelic variants underlying inherited diseases and disease susceptibilities to date (Antonarakis and McKusick 2000, http://www.ncbi.nlm.nih.gov/omim/).

Currently, interest has shifted towards the development of a 3rd generation marker map (Wang et al. 1998) based on the most common form of interindividual sequence variation - the single nucleotide polymorphisms (SNPs). The enthusiasm regarding these biallelic markers is due not only to their extreme density, but also to the promise of easier and more accurate scoring. Furthermore, the relatively high mutation rate of microsatellites (Sajantila et al. 1999) theoretically favors the use of biallelic markers. Joint efforts are now in progress to develop up to 300,000 SNP markers within the next 2-3 years (http://www.wellcome.ac.uk/en/1/awtprerel0499n123.html). It is believed that with the SNP markers becoming available, the genetic dissection of complex traits common in the population will be possible (Risch & Merikangas 1996, Chakravarti 1998). However, criticism of the simplified linkage disequilibrium assumptions in association studies has been put forward (Terwilliger & Weiss 1998).
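The statement above that multiallelic microsatellites are more informative than biallelic markers can be illustrated with expected heterozygosity, H = 1 - sum(p_i^2), where p_i are the allele frequencies. This is one common measure of marker informativeness and is used here as an assumption; the text does not say which measure was intended. The function name and the allele frequencies below are hypothetical.

```python
# Minimal sketch: expected heterozygosity as a proxy for marker informativeness.
from typing import Sequence


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Return 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if any(p < 0 for p in freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


if __name__ == "__main__":
    # Hypothetical markers: a biallelic marker at its most informative
    # (two equally common alleles) versus a microsatellite with ten
    # equally common alleles.
    biallelic = [0.5, 0.5]
    microsatellite = [0.1] * 10
    print(f"biallelic:      H = {expected_heterozygosity(biallelic):.2f}")
    print(f"microsatellite: H = {expected_heterozygosity(microsatellite):.2f}")
```

Under this measure a biallelic marker can never exceed H = 0.5, so several SNPs are needed to match one highly polymorphic microsatellite. This is consistent with the emphasis above on the density of SNPs and the ease of scoring them, rather than on the information content of any single marker.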
Genotyping technology before the PCR era

Recombinant DNA technology made it possible to look into the variation of individual DNA bases in the genome. In particular, Southern blot hybridization (Southern 1975) served as the workhorse of genotyping. The method involves restriction enzyme digestion of microgram amounts of genomic DNA and, usually, a cloned, radioactively labelled probe. The discovery of an RFLP marker 3' to the β-globin locus associating with the sickle cell trait (Kan & Dozy 1978), using the first cloned disease-related mammalian cDNA and gene (Maniatis et al. 1976), opened the way for DNA diagnostics of human inherited diseases. Next, improved disease allele detecting RFLPs for the β-globin locus (Geever et al. 1981, Chang & Kan 1982), and also for some other common recessive diseases, such as A1AT deficiency (Cox et al. 1985), were identified. Development of the allele specific hybridisation method using short synthetic oligonucleotide probes (Wallace 1979) allowed, in principle, detection of any base substitution (Conner et al. 1983), irrespective of changes in restriction sites. Previously uncharacterized base substitutions in total genomic DNA could be detected by the ribonuclease A cleavage method (Myers et al. 1985). Subsequently, a more general method based on differential electrophoretic mobility of DNA heteroduplexes in a denaturing gradient gel (Myers et al. 1985) was introduced.

During the 1960s only sequences of small RNA molecules could be determined, by fragmenting. Development of the Sanger dideoxy (Sanger et al. 1977) and the Maxam-Gilbert chemical cleavage methods (Maxam & Gilbert 1977) made sequence determination of cloned DNA fragments possible. It was necessary to clone the fragments to get enough DNA, and to reduce the complexity of the sample.
DNA synthesis on solid supports by automated synthesizers (Gait & Sheppard 1977a, Gait & Sheppard 1977b) made it possible to use the novel ASO-hybridisation and sequencing techniques.

Amplification of target DNA

Analysis of human genomic DNA is based on amplification of fragments of interest from the genome to increase the copy number of the target and to reduce the complexity of the analyzed DNA. Both of these measures are intended to enable sensitive and specific detection of the target of interest. The polymerase chain reaction (PCR, Figure 1) (Saiki 1986, Mullis & Faloona 1987) has changed genome research. The possibility of amplifying specific segments of genomic DNA has enabled detection of point mutations on a large scale. With purified thermostable polymerases (Lawyer et al. 1989), a wide range of genomic applications was quickly developed (reviewed by Erlich et al. 1991). Modern, rapid thermal cyclers with standardized 96-well or 384-well plate formats allow fast amplification and set-up of reactions. Combining enzymes possessing pronounced 3' to 5' exonuclease activity (Pfu) with designer enzymes having minimal 5' to 3' exonuclease activity (AmpliTaq) allows high-fidelity amplification of over 10 kb stretches of DNA (Barnes 1994). Non-specific amplification is minimized by the use of molecular switches that activate the reaction only at high temperatures (Sharkey et al. 1994, Dang & Jayasena 1996, Birch et al. 1996). Active research and development in DNA polymerases (for a review see Abramson 1999) and instrumentation (Northrup et al. 1999) hold promise for extremely facile amplification procedures. For example, micromachined devices with continuous flow systems allow amplification of genomic DNA in merely a few minutes (Kopp et al. 1998). The isothermal self-sustained sequence replication (3SR) reaction (Guatelli et al. 1990), also known as nucleic acid sequence-based amplification (NASBA) (Compton 1991), is another target amplifying method.
In 3SR, the successive action of three (or two) enzymes leads to exponential amplification of the target in an isothermal reaction. Another polyenzymatic, primer-guided reaction procedure is strand displacement amplification (Walker et al. 1992a, 1992b). These alternative target amplification techniques have never gained popularity in human genetic applications, as they do not provide a significant advantage over PCR and are more complicated to set up and optimize.

FIGURE 1. Schematic presentation of the polymerase chain reaction (PCR). [Panel labels: Target DNA, DNA pol, dNTPs; Denature, Anneal and Extend steps shown for the first, second and later cycles, with the number of copies after each.] The specificity of PCR amplification originates from the two independent oligonucleotide primers required to anneal to the target sequence near each other in the correct orientation in order to allow exponential amplification. Denaturation, annealing and extension steps are achieved by thermal cycling, resulting in a doubling of the target number at each cycle. In reality, the efficiency of doubling is less than 100% in each cycle, and amplification is not exponential at later stages of amplification.

Mutation detection in amplified DNA

The mutation detection methods can be divided into those that scan for unknown mutations in a target region and those that screen for previously described variation. Similarly, a nomenclature for single nucleotide polymorphism detection methods has been adopted, with SNP discovery and SNP scoring methods, respectively. Typically, mutation scanning methodology is required in dominant disorders, with different mutations accounting for disease alleles in each family, and now in the large-scale characterization of common SNPs in mammalian genomes. Screening or scoring of mutations and SNPs is commonly employed in carrier screening and diagnosis of disease mutations as well as in the many applications of SNP typing.
The rationale for this division is that scanning methods are usually labour intensive, difficult to interpret and expensive, whereas once the mutation or SNP has been discovered, the scoring methods should provide efficient and straightforward techniques for repetitive testing of the variant in large numbers of samples.

Unknown mutations

One set of methods is based on differential electrophoretic or chromatographic migration of DNA fragments with base substitutions, either in heteroduplexes, as in denaturing gradient gel electrophoresis and denaturing high performance liquid chromatography (DHPLC) assays, or in single stranded DNA fragments, as in the single stranded conformational polymorphism assay. Another set of methods is based on the cleavage of heteroduplex molecules either by enzymes such as T4 endonuclease VII, RNase A and cleavases, or chemically (reviewed by Cotton 1993, 1997, Grompe 1997). A screening method for nonsense mutations based on in vitro transcription and translation of polypeptides has also been developed (Powell et al. 1993), and was recently combined with mass spectrometry to enable detection of missense mutations as well (Garvin et al. 2000). Common problems of the scanning methods have been less than perfect sensitivity and labour intensive procedures. The DHPLC method (Underhill et al. 1997) has gained popularity because of its high sensitivity in detecting single base variation and its semi-automated procedure (O'Donovan et al. 1998), making it a good choice for large-scale variation screening projects (Cargill et al. 1999). The Sanger sequencing method benefitted from the introduction of in vitro amplification and the new thermostable polymerases. Direct sequencing of PCR amplicons was introduced (Wong et al. 1987) and modified into a solid phase format (Hultman et al. 1989, Syvänen et al. 1989), facilitating diagnostic sequencing.
Next, the format of the chain termination reaction itself was modified into a linearly amplifying thermal cycling procedure, making the template preparation less demanding (Murray 1989, Ruano & Kidd 1991). Currently, modified thermostable sequencing polymerases (Tabor & Richardson 1995), improved fluorescent dyes (Metzker et al. 1996) and sophisticated capillary electrophoresis separation (Quesada 1997) have rendered Sanger dideoxy sequencing robust, and hence many of the techniques for scanning mutations are becoming obsolete. Developments in MALDI-TOF mass spectrometry also promise ever faster separation of sequencing fragments, though currently limited to read-outs of less than 100 bp (Roskey et al. 1996, Köster et al. 1996). Sequencing by synthesis, or pyrosequencing, in which the sequential release of inorganic pyrophosphate formed upon DNA polymerase catalyzed primer extension is monitored by a luminometric assay (Ronaghi et al. 1996), is also possible. Read-lengths demonstrated by pyrosequencing are not currently sufficient for de novo sequencing (Ahmadian et al. 2000), but the method is useful for EST tag sequencing and screening of known mutations.

Screening for known sequence variation

The principles of common mutation screening or scoring methods are illustrated in Figure 2. The next paragraphs describe these basic methods, the different reaction formats in which they are employed, and their applicability to multiplex mutation detection. The systems employing the DNA-array format are discussed separately.

PCR RFLP

Restriction enzyme digested amplification products initially required a radioactively labelled probe for detection (Saiki et al. 1985), but with the improved specificity of the PCR method (Saiki et al. 1988) direct detection of restriction fragments using simple agarose gel electrophoresis became possible.
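Whether a given variant can be scored by PCR-RFLP at all depends on whether the substitution creates or abolishes a recognition site. The check can be sketched as below. The function names, the example amplicon and the 0-based coordinate are hypothetical. The sketch searches one strand only, which is sufficient for palindromic sites such as EcoRI (GAATTC); non-palindromic sites would also need a reverse-complement search.

```python
def site_positions(sequence, site):
    """Return 0-based start positions of a recognition site in a sequence."""
    sequence, site = sequence.upper(), site.upper()
    last_start = len(sequence) - len(site)
    return [i for i in range(last_start + 1) if sequence[i:i + len(site)] == site]


def rflp_effect(reference, position, alt_base, site):
    """Classify how a single-base substitution affects a restriction site.

    Returns 'created', 'abolished', 'changed' or 'unchanged'.
    """
    variant = reference[:position] + alt_base + reference[position + 1:]
    ref_sites = set(site_positions(reference, site))
    alt_sites = set(site_positions(variant, site))
    if ref_sites == alt_sites:
        return "unchanged"  # the variant is not scorable with this enzyme
    if alt_sites < ref_sites:
        return "abolished"
    if alt_sites > ref_sites:
        return "created"
    return "changed"


if __name__ == "__main__":
    amplicon = "ACGTGAATTCACGT"  # hypothetical amplicon with one EcoRI site
    print(rflp_effect(amplicon, 6, "G", "GAATTC"))
```

When the result is 'unchanged' for every available enzyme, a mismatched amplification primer that introduces an artificial site is the usual workaround.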
The drawback of the simple PCR-RFLP method is that not all mutations change a restriction site, and artificial mismatches introduced by the amplification primers are sometimes required to screen mutations (Cohen et al. 1988). Due to its simplicity, PCR-RFLP has been very popular for detection of disease mutations, and some modifications to increase the capacity of gel electrophoresis have been suggested for large-scale mutation screening (Day & Humphries 1994, Bolla et al. 1995). Slab gel electrophoresis is, however, difficult to automate and not a suitable separation method for high-throughput genotyping. Optimally, an internal control to verify efficient restriction of the PCR product should be included. While long DNA molecules are still challenging for mass-spectrometric analysis, detection of short PCR-RFLP restriction products has been demonstrated with MALDI-TOF spectrometry (Liu et al. 1995).

FIGURE 2. Scoring of single nucleotide variants. Schematic presentation of the commonly employed strategies for single nucleotide polymorphism scoring. A T to C transition is interrogated in the depicted example. [Panels: restriction digestion, allele specific PCR, oligonucleotide ligation assay (OLA), allele specific oligonucleotide hybridization (ASO), minisequencing.]

Allele specific oligonucleotide (ASO) hybridisation

Short oligonucleotide probes designed to hybridize with normal or mutated target DNA can be used to screen for mutations, as mismatches between probe and target destabilize the hybrid. PCR amplification provided sufficient enrichment of the target DNA of interest to allow amplified DNA samples to be immobilized on nitrocellulose filters (Saiki et al. 1986).
These dot-blots, first used for detection of human point mutations in β-globin and for HLA-DQA1 typing, were shown to be suitable for clinical diagnostics as well, in nonradioactive detection schemes using biotin-labeled targets and a colorimetric reaction to demonstrate positive signals (Saiki et al. 1988). ASO hybridization has been multiplexed by immobilization of amplified fragments and simultaneous hybridisation with several probes, followed by elution of hybridized amplicons and finally sequencing by chemical cleavage to identify the underlying mutation in carriers (Shuber et al. 1997). The complex procedure of multiplexing in the dot-blot approach can be avoided if a reverse dot-blot method is used. In reverse dot-blot hybridisation the ASO probes are immobilized and the amplified samples are hybridized to these immobilized probes (Saiki et al. 1989). Applications of the reverse dot-blot method for multiplex detection of mutations have indicated that very careful optimisation of the immobilized probes is required to achieve discrimination of several mutations in the same reaction conditions (Wall et al. 1995). Another attempt to technically simplify ASO hybridization is sandwich hybridization in microtiter plates (Cros et al. 1992). Even commercial filters for ASO mutation detection have proven to be sensitive to minor changes in the reaction conditions (Thonnard et al. 1995). The relatively high background from mismatched hybridisation precludes the use of ASO for detection of minority mutations, as the limit of detection is 10% of mutant sequences in mixed samples (Farr et al. 1992). One recent suggestion to improve the power and versatility of ASO hybridisation has been to use peptide nucleic acid (PNA) analogue probes with mass-spectrometric detection. Human tyrosinase gene mutations (Griffin et al. 1997) and HLA-DQA1 polymorphisms (Ross et al. 1997) have been typed using PNA probes.
PNA probes with their neutral backbone hybridize at low ionic strength, allowing in principle better discrimination against mismatches. In addition, the backbone does not fragment under MALDI-TOF conditions, but multiplexing is limited by the widely differing thermal stabilities of PNA probes (Griffin & Smith 2000).

Ligation assay

In favourable reaction conditions, T4 ligase was shown to be highly discriminative against mismatches occurring near the ligation junction (Landegren et al. 1988, Alves & Carr 1988, Wu & Wallace 1989). Solid-phase systems for detection of mutations or SNPs, based on hapten labeling with indirect detection or on the use of lanthanide dyes with time-resolved fluorometry, have been applied in oligonucleotide ligation assays (OLA) (Nickerson et al. 1990, Samiotaki et al. 1994, Tobe et al. 1996). Ligation products can be detected by mass spectrometry as well (Jurinke et al. 1996). Thermostable ligases (Barany & Gelfand 1991) increased the specificity and efficiency of OLA assays to a high level (Luo et al. 1996). The ligation assay has been multiplexed by using fluorescently labelled ligation probes with differential electrophoretic mobility to distinguish each mutation, with detection in an automated sequencer (Grossman et al. 1994, Day et al. 1995, Baron et al. 1996). Recently, optimization of the ligation assay conditions has enabled allele distinction in the detection of mono- or microsatellite repeats (Zirvi et al. 1999a, Zirvi et al. 1999b) and detection of minority mutations down to the 0.2% level (Khanna et al. 1999).

Allele specific PCR

Mismatches at the 3′ end of a PCR primer hinder extension of the primer during PCR. A pair of allele specific PCR primers with 3′ ends complementary to either allele at a variable nucleotide site, and a common non-discriminatory primer, used in parallel PCR reactions provide a convenient way for mutation detection (Wu et al. 1989, Newton et al. 1989, Sommer et al. 1989).
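The primer pair described above can be sketched in a few lines. This is an illustrative sketch only: the function name, the 0-based SNP coordinate and the fixed primer length are assumptions, and real primer design would also consider melting temperature and secondary structure. The primers are written in the sense of the given strand, so their 3′ terminal base sits on the variable site.

```python
def allele_specific_primers(template, snp_index, alleles, length=20):
    """Build one forward primer per allele, each ending (3') on the SNP.

    template  : sequence of the strand whose sense the primers share (5'->3')
    snp_index : 0-based position of the variable nucleotide (hypothetical)
    alleles   : iterable of alternative bases at that position
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = template[start:snp_index].upper()
    # Only the 3' terminal base differs between the two primers.
    return {allele.upper(): upstream + allele.upper() for allele in alleles}


if __name__ == "__main__":
    primers = allele_specific_primers("ACGTACGTTAGC", 8, ("T", "C"), length=5)
    for allele, primer in primers.items():
        print(allele, primer)
```

Each primer would be run in a separate reaction with the common reverse primer, and the genotype read from which reactions yield product.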
Primers with allele specific mismatches in non-terminal positions can also be used for competitive allele-specific amplification (Gibbs et al. 1989). The advantage of the assay is that genotype assignment only requires the detection of a positive amplification signal, for example after separation using simple EtBr-stained agarose gels (Wu et al. 1989). Early experiments indicated that mismatch discrimination by allele specific PCR was highly dependent on reaction conditions and was particularly poor for purine-pyrimidine mismatches (Kwok et al. 1990). The self-propagating nature of mismatched extension in PCR has hindered development of robust high-throughput assays, and multiplexing of the reactions has been achieved only after extensive optimisation of the reaction conditions (Ferrie et al. 1992).

Minisequencing primer extension

Sequence-specific incorporation of a single nucleotide by a DNA polymerase at the 3′ end of a detection primer, which anneals just 5′ to the site of interest, was first presented 10 years ago (Sokolov 1990, Syvänen et al. 1990, Kuppuswamy et al. 1991). Multiplexed versions of this method are the subject of this thesis. Several other modifications have been presented (recently reviewed by Syvänen 1999), some of which will be discussed in detail below. Subsequent chapters describe homogenous, tagged and multiplexed gel- or array-based minisequencing systems. The early applications of the method already illustrated the excellent genotype discrimination provided by the fidelity of DNA polymerases in a single set of reaction conditions for all mutations. This allowed not only the unambiguous assignment of heterozygotes, but also the discrimination of minority mutations down to the 0.25% level as well as quantitation of alleles in mixed samples (Syvänen et al. 1992, 1993; Krook et al. 1992).
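Genotype assignment from the two allele-specific signals of a minisequencing reaction can be sketched as a simple ratio test. All thresholds below are hypothetical placeholders chosen for illustration, not values from the cited studies. In practice they would be set empirically for each site from control samples.

```python
def call_genotype(signal_a, signal_b, min_total=100.0,
                  hom_cut=0.8, het_low=0.35, het_high=0.65):
    """Assign a genotype from the signals of two allele-specific reactions.

    The allele A fraction is signal_a / (signal_a + signal_b). Samples with
    too little total signal, or with a fraction falling between the
    homozygote and heterozygote windows, are left uncalled.
    """
    total = signal_a + signal_b
    if total < min_total:
        return "no call"
    fraction_a = signal_a / total
    if fraction_a >= hom_cut:
        return "A/A"
    if fraction_a <= 1.0 - hom_cut:
        return "B/B"
    if het_low <= fraction_a <= het_high:
        return "A/B"
    return "no call"


if __name__ == "__main__":
    # Hypothetical signal pairs (e.g. counts per minute for each nucleotide).
    for pair in [(1000, 20), (500, 450), (30, 900), (700, 300)]:
        print(pair, call_genotype(*pair))
```

Leaving ambiguous ratios uncalled, rather than forcing them into a class, keeps the error rate low at the cost of some repeat assays.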
Separate reactions for each allele to be detected can be performed using radioactively labeled nucleoside triphosphate analogues in solid phase (Syvänen et al. 1990, 1992, 1993) with detection by scintillation counters. If the reaction is carried out in solution (Sokolov et al. 1990, Kuppuswamy et al. 1991, Krook et al. 1992), primers and excess label must be separated by electrophoresis and detection carried out by autoradiography. Chemiluminescent detection of nucleotides labelled with haptens such as FITC, DNP and biotin, using alkaline phosphatase or horseradish peroxidase conjugated antibodies, avoids the use of radioisotopes (Syvänen et al. 1990, Harju et al. 1993, Livak et al. 1994, Pecheniuk et al. 1997, Sitbon et al. 1997, Tuuminen et al. 1997, Nikiforov et al. 1994). In the pyrosequencing detection system the release of pyrophosphate upon extension is monitored luminometrically (Nyren et al. 1993). Mass spectrometry of the extended primers has also proved to be feasible (Haff & Smirnov 1997). Size separation of multiplexed minisequencing products in automated sequencers is discussed in the results and discussion section. Multiplexed minisequencing with MALDI-TOF detection was shown (Ross et al. 1998) for 12 sites using detection primers differing in length by 2 bp or less, with mass tuning based on the different base composition (and hence mass) of the detection primers. Multiplexing was claimed to be extendable to 20 loci simultaneously with current mass-spectrometric detection technology. In another multiplexed approach Li and colleagues (Li et al. 1999) used detection primers with cleavable bases, resulting in a lower mass of the detection primers, which could allow a higher degree of multiplexing. A recent strategy for mass-spectrometric detection of primer extension products uses a 3′-thiolated detection primer and a subtracted set of α-S-dNTPs (Sauer et al. 2000).
The non-phosphorothioate substrates are degraded and diluted prior to detection, and despite its multiple steps this procedure is robust, as no purification of the extension products prior to measurement is required.

Homogenous assays

Homogenous assays are procedures in which the genotyping reaction product does not need to be separated from unreacted reaction components. This closed-tube, walk-away assay format is attractive for genotyping, as carry-over contamination can be avoided in some assays and the number of steps in the procedure is minimized. Assays utilizing intercalating dyes monitor the accumulation of double stranded amplification products during the PCR reaction (Higuchi et al. 1992, 1993). These methods do not discriminate nonspecific amplification by-products such as primer-dimers from the target of interest, which limits their usefulness. Most other assays are based on fluorescence resonance energy transfer (FRET) (Förster 1965), in which two fluorescent dyes in close proximity to each other result in quenching of the emission of one dye (the donor, with the shorter absorption wavelength) and increased emission of the other (the acceptor) when the donor dye is excited. The problem of non-specific amplification products also hinders the use of sunrise primers, in which a FRET dye pair is incorporated into the loop-forming primers during synthesis and quenches the monitored emission wavelength if no amplification is taking place (Nazarenko et al. 1997). A self-probing primer design was developed to avoid the problems encountered with the sunrise primers (Whitcombe et al. 1999). In the most widespread homogenous genotyping methods the PCR product spanning the mutated or polymorphic nucleotide is probed with an internally hybridizing allele specific oligonucleotide forming a FRET pair in the non-hybridized state. The 5′ nuclease assay (Holland et al.
1991) is based on the 5′-3′ exonuclease activity of Taq polymerase, which cleaves the doubly labelled probe bound to the amplification product, causing an increase in the donor and a decrease in the acceptor dye fluorescence in an allele specific manner (Livak et al. 1995). Molecular beacons are stem-loop structured probes with a quencher-dye pair at opposite ends of the oligonucleotide, which are in very close proximity in the intact beacon probe (Tyagi et al. 1996). Upon binding to the amplified target this stem-loop organization of the probe is disrupted, leading to an increase in the fluorescent dye emission. Both the 5′ nuclease and the molecular beacon assay require careful design of the probes, as the detection of variant nucleotides is based on allele specific hybridisation. The stem-loop structure, which has a strong tendency to self-anneal, has been found to enhance mismatch destabilization compared to target specific linear probes (Bonnet et al. 1999). Both the molecular beacon and the 5′ nuclease assay approaches allow limited multiplexing by using a strategy of a common quencher with spectrally resolvable probe-specific dyes (Tyagi et al. 1998, Lee et al. 1999). Recently, a pairwise comparison of these techniques suggested that the molecular beacon approach is slightly more discriminative against single base substitutions (Tapp et al. 2000). A ligation based homogenous assay, with one common dye-labelled primer and allele specific primers carrying different dyes that form resolvable FRET signals upon ligation, has also been devised (Chen et al. 1997a), and has the advantage of utilizing the clear genotype discrimination of the thermostable ligase. Another category of homogenous assay formats involves the addition of single nucleotide extension primers to the amplified target. In homogenous minisequencing, inactivation of the PCR polymerase and degradation of the PCR nucleotides prior to the genotyping reaction are necessary.
The extension primer is labelled with one dye and the incorporated ddNTPs are differentially labeled, again creating an allele-specific FRET signal (Chen et al. 1997b, 1998). Fluorescence polarization (FP) detection for homogenous genotyping assays has been developed based on oligonucleotide hybridisation detection of allele specific amplification products (Gibson et al. 1997) or on the minisequencing principle (Chen et al. 1999). FP minisequencing has the advantage of avoiding costly probe labelling. A compact, miniaturized, homogenous assay format (Genecards™, Livak et al. 1999, 2nd Intl SNP Meeting, Hohenkammer, Germany) should allow extremely simple genotyping, but the throughput is limited by non-multiplexed reactions, and customized sets of SNPs are not easily created.

Assays with signal amplification

In order to detect variation at the basepair level in the human genome with signal amplification, the selectivity of the amplification process must match that of the PCR reaction, which utilizes a pair of oligonucleotides to unequivocally define the target region in the genome. The Qβ-replicase assay was originally described for RNA hybridisation probes containing the MDV-1 RNA sequence (Chu et al. 1986). After these probes have annealed to the target, the hybrids are isolated and the probe is amplified with Qβ-replicase up to 10⁹-fold (Lizardi & Kramer 1990). A better signal-to-noise ratio was achieved by using binary probes and a ligation reaction followed by affinity capture and separation (Tyagi et al. 1996). The Qβ-replicase assay has not been applied for detection of polymorphisms or mutations in human genomic DNA, though in principle this should be possible, as ligase discriminates allelic variants well.
Combining a thermostable ligase with ligation primers that place the ligation junction near the nucleotide of interest (for both orientations) is referred to as the ligation chain reaction (LCR), and was first introduced as a method for sensitive detection of single nucleotide substitutions in whole genomic DNA (Barany et al. 1991). In subsequent studies this assay did not perform as well, and detection of sequence variation required either preamplification using PCR (Ferro et al. 1993) or extensive optimization with additional mismatches in the ligation probes (Fang et al. 1995). Padlock probes are linear oligonucleotides with target complementary sequences at the ends and a non-complementary linking segment in between. Upon binding to the target the probes are circularized by a ligase in a template specific manner (Nilsson et al. 1994). The padlock probes catenate with the target sequence, and specific, localized detection of human metaphase chromosome centromere repeats could be demonstrated (Nilsson et al. 1997). Combination of the padlock probes with signal amplification using rolling circle replication (Fire and Xu 1995) with a strand displacing DNA polymerase could, in principle, allow mutation detection in just single target molecules (Baner et al. 1998). Modifications of the padlock probe detection and Φ29 DNA polymerase mediated rolling circle amplification were shown to be sensitive and specific for single molecule counting (Lizardi et al. 1998). The Invader assay is based on flap endonuclease (FEN) enzymes, which recognize and cleave structures formed by two overlapping oligonucleotides hybridized to a target DNA strand (Lyamichev et al. 1999). PCR based SNP scoring by the Invader assay provides good discrimination for all types of nucleotide substitutions (Mein et al. 2000). The principles of the Invader assay and of the modified Invader Squared assay, which provides exponential amplification, are illustrated in Figure 3.
Elegant genotyping of 12 SNPs separately with the Invader Squared assay was achieved using genomic DNA as a target and MALDI-TOF detection in a 5 h procedure (Griffin et al. 1999). Branched DNA probes have also been shown to be able to detect non-amplified DNA very sensitively (Collins et al. 1997), but lack the specificity required for single nucleotide polymorphism detection.

FIGURE 3. Invader assays. [Diagram labels: Invader reaction and «Squared» reaction; primary invader, primary probe, cleavage site, primary cleavage product, secondary target, secondary probe, secondary invader, secondary cleavage product, non-cleaved probe bound to «arrestor» probe.] An “invader” oligonucleotide upstream and a “probe” oligonucleotide downstream of the site of interest are annealed to the target. The two oligonucleotides overlap at the site of interest and the probe has a 5′ non-specific tail. This creates a target structure for a thermostable flap endonuclease (FEN). FEN cleaves the probe 3′ to the overlapping nucleotide, releasing the 5′ tail part of the primary probe, the accumulation of which can be monitored. When the reaction is carried out near the Tm of the signal oligo there is a constant turnover of the oligonucleotide binding to the target DNA, thus amplifying the signal. This assay can be used efficiently for scoring SNPs in amplified DNA (Mein et al. 2000). Adding a secondary target, an “arrestor” oligonucleotide and a secondary probe, and using the primary cleavage product as an invader oligonucleotide, approximately squares the amount of amplification when the secondary cleavage product is monitored. The Invader Squared assay resulted in million-fold amplification of the signal, and thus detection of a mutation in a non-amplified genomic target was shown to be feasible (Griffin et al. 1999).

DNA-array technology

DNA-microarrays were originally introduced for sequencing and genotyping, which are discussed in detail in the following chapters, but this assay format has since found a growing number of applications, listed in Table 1.
The attractiveness of the array technology is based on its miniaturized size, parallel nature and solid-phase format, which serve to minimize reagent consumption, increase the number of assays carried out in parallel and enable automation of the reaction and read-out.

Table 1. Applications of microarray technology

APPLICATION | TYPE OF ARRAY | REFERENCES
Comparative sequencing | Synthesized or spotted oligonucleotides | See below
Monitoring mRNA expression levels | Spotted amplified cDNA fragments | Schena et al., Brown and Botstein
Monitoring mRNA expression levels | Oligonucleotides (Affymetrix) | Lockhart et al.
Comparative Genomic Hybridization | Amplified cDNA fragments | Behr et al., Pollack et al.
Comparative Genomic Hybridization | Cloned genomic DNA | Solinas-Toldo et al.
In situ detection of DNA, RNA or proteins | Embedded tissue biopsies | Kononen et al.
Genomic mismatch scanning | PCR amplified YAC and PAC clone fragments | Cheung et al.
DNA-protein interaction assays | Oligonucleotides (Affymetrix) | Bulyk et al.
Screening of antisense therapeutics | Oligonucleotides | Milner et al.
Phenotypic analysis of yeast deletion mutants | Oligonucleotides (Affymetrix) | Shoemaker et al.
Mapping/ordering genomic clones | Oligonucleotides (Affymetrix) | Sapolsky & Lipshutz

Origins of DNA microarray concept

A series of theoretical papers and patent applications published only a decade ago by several independent groups introduced the sequencing by hybridization (SBH) approach (Drmanac & Crkvenjakov 1987, Southern 1988, Bains & Smith 1988, Lysov et al. 1988, Khrapko et al. 1989, Bains 1991). The original SBH technology was intended for de novo sequencing by hybridization, which was believed to offer higher throughput and to be easily automatable. This was prompted by the launch of the Human Genome Project and a common view that the Sanger dideoxy method could not be scaled up to the sequencing speed required. The simple idea of reading a sequence based on its hybridisation to its constituent DNA oligomers was presented in two formats.
Format I had target DNA immobilized on a solid support followed by sequential queries using labeled hybridisation probes (Strezoska et al. 1991, Drmanac et al. 1993). Format II had a large number of oligonucleotide probes immobilized either on polyacrylamide gel pads (Khrapko et al. 1989, 1991) or directly synthesized onto a derivatized glass surface (Southern et al. 1992), followed by hybridisation of a labeled target as previously described for reverse dot-blot hybridization. De novo sequence analysis was complicated by the short oligonucleotide probes (usually octamers), which required low stringency hybridisation conditions, and by the poor predictability of the result due to secondary structures within the target. Construction of complete n-mer arrays with sufficient probe length to alleviate the problem (for example, a complete set of 15-mers would have 10⁹ probes) was not technically feasible due to the unrealistic number of different oligonucleotides required (Southern 1996). The problem of low-yield hybridisation of AT-rich probes was recognized at an early stage of SBH trials (Southern et al. 1992). Suggestions to improve the SBH strategy include stacking hybridisation, with short 5-mer oligos used with the octamer arrays to increase duplex stability and mismatch discrimination (Broude et al. 1994, Yershov et al. 1996, Stomakhin et al. 2000). Due to these difficulties in de novo sequencing, interest shifted to multiplex genotyping and comparative sequencing. The following chapters describe different strategies for comparative sequence analysis on DNA-arrays.

Array construction

DNA arrays are constructed by deposition and immobilization of different polynucleotides at spatially addressable sites on a two-dimensional surface at high density. If a certain continuous DNA fragment is to be scanned for sequence variation, a tiled array design is employed, which contains overlapping sets of oligonucleotides designed to interrogate successive basepairs in the target sequence.
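The tiled design described above, and the size of a complete n-mer array, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, each position is interrogated by four probes differing only at the central base, and probes are written in the sense of the target for readability (probes on a real array would be complementary to the labeled strand).

```python
BASES = "ACGT"


def complete_array_size(n):
    """Number of probes in a generic array of all possible n-mers."""
    return 4 ** n


def tiled_probes(target, probe_length=15):
    """Return {position: [4 probes]} interrogating successive bases.

    Each probe set covers one target position with the four possible
    bases substituted at the centre of an otherwise matching probe.
    Positions too close to either end of the target are skipped.
    """
    if probe_length % 2 == 0:
        raise ValueError("probe_length must be odd to have a central base")
    half = probe_length // 2
    target = target.upper()
    tiles = {}
    for pos in range(half, len(target) - half):
        left = target[pos - half:pos]
        right = target[pos + 1:pos + half + 1]
        tiles[pos] = [left + base + right for base in BASES]
    return tiles


if __name__ == "__main__":
    print(complete_array_size(15))  # about 10^9, as noted in the text
    for pos, probes in tiled_probes("ACGTACGTA", probe_length=5).items():
        print(pos, probes)
```

A tiled array therefore needs only about four probes per interrogated base, whereas the generic array grows as 4 to the power n, which is why complete 15-mer arrays were out of reach.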
Monitoring recurrent variation at several different targets, or at several sites within the same target, is usually carried out with probe sets interrogating only these sites of interest. A third approach is to manufacture all possible sequences of a given length onto a single array, in a generic array design that represents the original SBH concept and can be used for selective resequencing as well.

In situ synthesis

The array can be manufactured either by combinatorial in situ synthesis or by deposition of premade probes on a derivatized surface, sometimes referred to as off-chip or linear manufacture. In situ synthesis by standard phosphoramidite chemistry on derivatized glass surfaces was pioneered by Southern and colleagues (Maskos & Southern 1992, Southern et al. 1992, Maskos & Southern 1993). The current in situ synthesizer uses teflon-lined synthesis cells to apply reagents onto the glass surface. The synthesis cell is moved along the glass surface, and different parts of the surface are exposed to different phosphoramidites. The arrays produced this way are scanning arrays with all possible probe lengths along the synthesis cell path (Southern et al. 1994). Another in situ approach utilized standard phosphoramidites on a derivatized polypropylene surface (Matson et al. 1994), delivered by a multichannel fluidic system in direct contact with the surface (Matson et al. 1995). Neither of these approaches has been applied in large-scale genotyping. In principle the synthesis method of Southern is limited to producing tiled sets of oligonucleotides, which are not suitable for SNP scoring applications. A sophisticated method for synthesis of biopolymers was introduced by Fodor and colleagues (Fodor et al. 1991). The method combined semiconductor-based photolithography and solid-phase chemical synthesis to achieve highly parallel in situ synthesis of biopolymers on small glass surfaces.
Using phosphoramidites with photolabile 5′ protective groups, Affymetrix demonstrated the synthesis of 256 different octanucleotides on a 1.28 cm² surface in just 16 chemical coupling steps (Pease et al. 1994); DNA-arrays were now called DNA-chips. The drawback of producing high-density arrays by the Affymetrix method has been the low step-wise yield of synthesis, varying from 92 to 94%, which effectively limits the probe lengths to 20-25 bp (McGall et al. 1997). A novel approach to utilizing photolithographic technology for the manufacture of DNA-arrays is based on photoresists; it allows the use of regular phosphoramidites to make DNA oligos at densities of up to 10⁶/cm² and provides high yields (McGall et al. 1996, Wallraff et al. 1997).

Spotted arrays

The off-chip synthesis of oligonucleotides and their deposition by various methods provides a simple method for DNA-array manufacture that is accessible also to less specialized chemistry laboratories. Photopolymerized gel-pads can be produced in relatively simple steps, and oligonucleotides can be covalently immobilized to these pads (Khrapko et al. 1991, Vasiliskov et al. 1999, Proudnikov et al. 1998) either manually (Guschin et al. 1997) or with the aid of a robotic pin device (Yershov et al. 1996). The gel-pads can be as small as 25 by 25 µm in size, in principle allowing arrays with very high densities, although the density is somewhat limited by the requirement of alignment with the deposition device. Solid glass surfaces have been derivatized with epoxysilane (Lamture et al. 1994), phenylisothiocyanate (Guo et al. 1994), or mercaptosilane (Rogers et al. 1999) to allow covalent attachment of oligonucleotides via amino or disulfide groups. Recently, chemistries creating dendrimeric structures on glass surfaces, which increase the surface area binding the oligonucleotides (Beier & Hoheisel 1999), and immobilisation of acrylamide modified oligonucleotides via co-polymerization (Rehman et al. 1999) have been presented.
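The link between step-wise synthesis yield and practical probe length, noted above for light-directed in situ synthesis, follows from compounding the yield over every coupling step. The sketch below assumes one coupling step per base and a constant yield per step; the function name is hypothetical.

```python
def full_length_fraction(stepwise_yield, probe_length):
    """Fraction of probes at a feature that reach full length.

    Assumes one coupling step per base and the same yield at every
    step, so the fraction is simply yield ** length.
    """
    if not 0.0 < stepwise_yield <= 1.0:
        raise ValueError("stepwise_yield must be in (0, 1]")
    if probe_length < 1:
        raise ValueError("probe_length must be at least 1")
    return stepwise_yield ** probe_length


if __name__ == "__main__":
    # Yields quoted in the text (92-94%) and, for contrast, a
    # hypothetical 99% yield more typical of off-chip synthesis.
    for y in (0.92, 0.94, 0.99):
        for length in (8, 20, 25):
            frac = full_length_fraction(y, length)
            print(f"yield {y:.2f}, {length}-mer: {frac:.1%} full length")
```

Under these assumptions only a minority of the molecules at a feature are full-length 20- to 25-mers when the step-wise yield is 92-94%, whereas a higher per-step yield keeps most probes intact.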
In cDNA arrays with longer polynucleotide probes, immobilization on polylysine coated glass slides takes place through electrostatic interactions between the surface and the negatively charged DNA backbone (Schena et al. 1995). Robotic devices based on contact printing pins (Schena et al. 1995, Shalon et al. 1996), inkjet dispensing heads (Lemmo et al. 1998, Stimpson et al. 1998, Okamoto et al. 2000), nanoliter dispensing needles (Graves et al. 1998) and electrospray deposition (Morozov & Morozova 1999) can be applied to deliver minute droplets of DNA probes onto surfaces, achieving densities of up to 10⁵ probes/cm². Special array designs utilize electronic addressing of charged DNA probes to affinity capture sites (Sosnowski et al. 1997), selective polymerization of acrylamide on optical fiber tips (Healey et al. 1997) or randomly ordered, labeled microspheres binding DNA probes on optical fibers (Steemers et al. 2000).

Array reading

Early applications of the DNA-arrays invariably used 32P-labeled probes and phosphorimaging detection systems. Radioisotopic labeling with 33P coupled with phosphorimaging detection is still practical in array reading (e.g. Southern et al. 1994, Drmanac et al. 1998, II-IV, Mir et al. 1999). Development of fluorescence labeling schemes and of detection systems based on an epifluorescence confocal scanner with a photomultiplier tube (PMT) detector (Fodor et al. 1991, Pease et al. 1994, Lipshutz et al. 1995) or a fluorescence microscope coupled to a cooled CCD camera (Mirzabekov 1994, Yershov 1996) led to improved resolution. Later it became apparent that, in order to utilize hybridisation based mutation scanning on high-density arrays, an internal control labeled with one fluorophore and a test sample labeled with another fluorophore were needed to achieve sufficient discrimination (Chee et al. 1996, Hacia et al. 1996). Similar dual colour imaging is also used in expression array methods to normalize the results (Schena et al. 1995).
There are now several commercial providers of confocal array scanners, usually with two or four excitation sources suitable for fluorophores absorbing at 488-650 nm and emitting at 515-690 nm. Several other approaches to array read-out have been suggested, but not yet widely applied. Direct integration of the DNA-array with a charge-coupled device (CCD) (Eggers et al. 1994) or a phototransistor sensing element (Vo-Dinh et al. 1999) offers a promise of sensitive detection of fluorescent signals in a highly compact format. Arrays formed by bundles of optic fibers with probes attached to their distal ends directly (Healey et al. 1997) or on beads attached to the fibers (Steemers et al. 2000) have been described. Evanescent wave excitation for detection of particulates near the surface is created by conducting light into the edge of a wave guide, in which the light propagates by total internal reflection. This approach was demonstrated for DNA-array applications by Stimpson and colleagues (Stimpson et al. 1995). Recently, a system with four excitation lasers utilizing the evanescent wave principle for detection of primer extension with labeled dideoxynucleotides was presented (Kurg et al. 2000). Developments in mass-spectrometric techniques are now being realized also in array-based genotyping, offering potential for detection without labeling (Little et al. 1997, Tang et al. 1999). An entirely different detection system on solid gold surfaces, based on differential charge transduction through matched vs. mismatched DNA duplexes measured by cyclic voltammetry, was recently suggested by Kelley et al. (Kelley et al. 1999); it remains to be seen whether this elegant approach is applicable in true biological assays.

TABLE 2.
Comparison of commonly employed DNA-microarray labelling and detection systems

Labelling: radioisotopic labels. Advantages: chemical similarity of analogues to natural bases. Detection system: phosphorimager. Advantages: low-cost instrumentation.
Labelling: fluorescent labels. Advantages: versatility with different fluorophores (i.e. internal controls). Detection system: confocal scanner. Advantages: instrumentation well available.
Labelling: fluorescent labels. Advantages: as above. Detection system: CCD element. Advantages: fast, enables real-time measuring.
[The sensitivity and resolution columns of the table are not recoverable from the source.]

Comparative sequencing on DNA-microarrays

It is surprising how much excitement and hope has been invested in DNA-array technology (Barinaga 1991, Nature Genet. [Editorial] 1996;14:367, Nature Genet. [Editorial] 1998;18:195, Marshall & Hodgson 1998, Lander 1999) when in practice the development has been rather slow, particularly for genotyping SNPs. Only a handful of studies have extended the analysis beyond the proof-of-principle level. In the next two chapters, arrays for scanning known sequence for unknown sequence variants and arrays for assaying known polymorphisms or mutations are discussed separately.

Arrays in sequence scanning

The high-density arrays made by light-directed synthesis by Affymetrix (Santa Clara, CA) have dominated as the primary platforms for resequencing arrays. Table 3 summarizes details of the published studies of resequencing on the Affymetrix chips.

Table 3.
Resequencing on high-density oligonucleotide arrays

Columns: application; no. of samples; probes per base pair; no. of bp scanned; no. of studied; reference. [The numeric entries are not recoverable from the source.]
Exon of CFTR gene: Cronin et al.
HIV pr gene: Kozal et al.
Mitochondrial genome: Chee et al.
Exon of BRCA gene: Hacia et al.
Human SNP discovery: Wang et al.
Coding exons of ATM gene: Hacia et al. (a)
Human coding SNP discovery: Cargill et al.
Human coding SNP discovery: Halushka et al.
Mouse SNP discovery: Lindblad-Toh et al.

The earliest versions of such arrays were designed to interrogate the CFTR exon 11 sequence with a minimal set of tiled 15-mer probes, each synthesized on a 365 by 365 µm synthesis site (Cronin et al. 1996). At this point it was already stated that the simplest form of scanning array would not be sufficiently sensitive to detect heterozygous mutations unambiguously. Regions of the HIV-1 genome were assayed on the Affymetrix chips, as resistance to antiviral therapy is known to be mediated by mutations in the genes coding for the drug targets (Kozal et al. 1996). In this study, array resequencing with a considerably more complex design was shown to be comparable in accuracy to Sanger dideoxy sequencing. The evaluation of the commercial Affymetrix HIV-1 chip demonstrated that mutations present in 50% of the studied viral population could not be detected reliably (Gunthard et al. 1998). Chee and colleagues applied the Affymetrix chips to a considerably larger target, the entire mitochondrial genome, using over 130,000 different probes to interrogate the sequence (Chee et al. 1996). A two-colour labeling strategy was employed to include an internal control. A good genotyping result with 98-99% accuracy in base calling was achieved in this haploid genome. The first large human genomic application of the high-density arrays was resequencing of an exon of the BRCA1 gene (Hacia et al. 1996). Interpretation of the results was based on two different scoring methods. The first was gain-of-signal analysis, in which an increase of the mutant probe signal as compared to the wild-type control signal is seen.
Loss-of-signal analysis assayed the normalized test and control signals over a larger region, as a true mutation is expected to form a footprint of decreased signal at all the probes overlapping the mutant site. Due to strong mismatched hybridization at certain sites, the more complicated loss-of-signal analysis was found to give better results, and 14/15 tested samples were scored correctly. In a gigantic SNP survey with over 109 oligos synthesized on 149 chips, covering a 2 Mb stretch of genomic DNA, the sensitivity and specificity were both reported to be <90% (Wang et al. 1998). Similarly, a sequence survey of the ATM gene (Hacia et al. 1998a) and an evaluation of the commercial p53 gene array (Ahrendt et al. 1999) resulted in sensitivities of 88-91%. Resequencing for SNP discovery has consequently used DHPLC in parallel with high-density arrays to achieve higher sensitivity and specificity (Cargill et al. 1999); alternatively, SNPs detected on the array were treated as candidate SNPs (Halushka et al. 1999). Mouse SNP discovery by high-density arrays was reported to be more successful (Lindblad-Toh et al. 2000), likely due to the use of inbred homozygous mouse strains. Improved performance of poorly hybridizing probes on high-density arrays was achieved by using 5-methyluridine triphosphates in the target (Hacia et al. 1998b). Affymetrix has also sought to extend the utility of their sequence scanning arrays by constructing generic 8- and 9-mer arrays and using a ligation reaction to detect sequence differences in a test versus a control sequence (Gunderson et al. 1998). The performance of the 9-mer array was excellent with targets up to 1.2 kb in size. Similarly, Head and colleagues (Head et al. 1997) utilized the fidelity of a DNA polymerase in model experiments scanning a 33-bp stretch of the p53 gene using single nucleotide primer extension, showing detection of variants present at as little as 5% of the target sequence.
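The tiling principle and the gain-of-signal score discussed above can be sketched in a few lines of code. This is only an illustration under simplifying assumptions that are not taken from the cited studies: each interrogated position gets four probes that differ only at the central base, probes are written in the same orientation as the reference rather than as its complement, the probe length is odd, and the function names are hypothetical. The published chips used more elaborate designs.

```python
BASES = "ACGT"


def tile_probes(reference, probe_len=15):
    """Return (position, central_base, probe) tuples for a simple tiling.

    Assumption: four probes per interrogated position, identical except
    for the central base; `probe_len` is odd.
    """
    half = probe_len // 2
    probes = []
    for pos in range(half, len(reference) - half):
        left = reference[pos - half:pos]
        right = reference[pos + 1:pos + half + 1]
        for base in BASES:
            probes.append((pos, base, left + base + right))
    return probes


def gain_of_signal(signals, reference_base):
    """Ratio of the strongest non-reference probe signal to the reference
    probe signal at one position (hypothetical score; the reference
    signal is assumed to be non-zero)."""
    strongest_alt = max(v for b, v in signals.items() if b != reference_base)
    return strongest_alt / signals[reference_base]


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACG"
    for pos, base, probe in tile_probes(reference)[:4]:
        print(pos, base, probe)
    print(gain_of_signal({"A": 100, "C": 10, "G": 250, "T": 5}, "A"))
```

A ratio well above one at a single position suggests a substitution, whereas loss-of-signal analysis would instead compare normalized test and control signals across all probes overlapping the site.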
An intriguing approach would be the use of polymerase extension on high-density arrays, which might be possible with the alternative array synthesis procedures (McGall et al. 1996, Wallraff et al. 1997) or a novel inversion strategy for primers attached at their 3'-end (Kwiatkowski et al. 1999). The less popular SBH format, involving immobilisation of the targets on filters and successive interrogation with short oligonucleotide probes, has been shown to be effective in determining sequence variation in stretches of cloned DNA (Drmanac et al. 1998). The use of this strategy is complicated by the need for several thousand hybridisation reactions to deduce the sequence, and is thus limited to large groups that have automated the successive hybridisation steps (Drmanac & Drmanac 1999).

Scoring SNPs or mutations on DNA-microarrays

Genotyping of previously characterized SNPs at several different loci on DNA-microarrays has key requirements that differ from those of comparative sequencing. The template preparation needed to enrich the several genomic fragments spanning the variants of interest is more complex, and the need for virtually 100% specificity in allele scoring is highly demanding.

ASO-hybridization based methods

In two short reports, Southern and Maskos first described optimization of ASO probes for detection of three beta-globin alleles (Maskos and Southern 1993), followed by synthesis of the optimal probes on a solid surface and genotyping of four samples (Maskos and Southern 1993). A convenient reaction chamber for analysis of up to 50 samples for 100 mutations in parallel and the reuse of the probe arrays were suggested, though to date these features have not been used in practice. Similarly, Mirzabekov and colleagues have suggested hybridization for mutation screening on oligonucleotides immobilized to gel pads, also using the beta-globin gene as a model.
Stacking hybridization probes, two-colour hybridisation and very short targets (32 bp) were used to obtain five genotypes at three variable sites (Yershov et al. 1996). Alternatively, melting curves of the hybrids were measured in real time to improve allelic discrimination, which was assessed using five amplified targets and several synthetic targets on arrays for five different nucleotide positions (Drobyshev et al. 1997). Various modifications of ASO-hybridisation based microarray genotyping have been presented. Guo et al. used glass supports and determined that spacer length, surface density and the use of single-stranded target were important for good hybridisation yields. Five tyrosinase gene mutations could be determined simultaneously with optimized hybridization probes from amplified, fluorescently labeled genomic samples rendered single-stranded (Guo et al. 1994). Another study did not find spacer arms separating the ASO probes from the glass surface critical, but only three samples were tested for two mutations (Beattie et al. 1995). In situ synthesized PNA probes were evaluated in model systems; the results indicated some difficulties in predicting the behavior of DNA-PNA hybrids, which limits their usefulness in genotyping for the time being (Weiler et al. 1997). Hybridization on electronically addressable electrode arrays was first studied using model systems for DNA and PNA probes (Sosnowski et al. 1997, Edman et al. 1997). The assay was applied to genotyping three mannose binding protein (MBP) gene SNPs and one IL-1 beta gene SNP, with validation of the MBP SNPs using 35 blinded samples (Gilles et al. 1999). Despite the claimed advantages of the electronically addressable arrays and the use of electronic stringency, it remains questionable whether these arrays will be useful for routine genotyping, as they are complex to manufacture and the assay procedures require specialized equipment.
Similar constraints apply to the use of the randomly ordered fiber-optic arrays with probes immobilized to coded beads (Steemers et al. 2000). All the above variants of ASO-hybridization on DNA microarrays provided proof-of-principle for the approach, but have not been implemented in practice. High-density arrays generated by light-directed synthesis have been applied in larger scale studies. CFTR alleles were interrogated on DNA-arrays after amplification of two regions of the CFTR gene, followed by an asymmetric labeling PCR reaction, fragmentation and dilution prior to hybridization (Cronin et al. 1996). The assay was evaluated by typing 32 known and 10 blinded samples and yielded 3-5-fold discrimination against mismatches in these low-complexity targets, though a threshold of 1.4-fold difference was used. SNP scoring was carried out by Wang and colleagues (Wang et al. 1998), who amplified the SNPs in 46-plex PCR reactions. Only <400 out of the >500 sites performed well enough to allow genotype assignment. One fourth of the well-performing SNPs were validated in three individuals and two CEPH families with a good success rate (98%) and high-confidence assignment of genotypes (99.9%). The same arrays were applied by Hacia et al. to determine allele frequencies at 214 markers using pooled samples from different populations (Hacia et al. 1999); however, details of the procedure used to generate the allele frequencies were not provided. A similar approach was used to genotype Arabidopsis thaliana SNPs (Cho et al. 1999). Almost half of the markers had to be discarded in this study due to imperfect discrimination of genotypes on the arrays. These studies indicate that there will be a considerable number of polymorphisms not amenable to high-density array scoring; at present it is unclear whether it will be possible to predict which sites will be difficult to genotype. Furthermore, high-level PCR multiplexing is costly, as 10-20% of markers are lost at this stage.
A commercial array (HuSNP, Affymetrix, Santa Clara, CA) with 1494 different SNP-specific probe sets is claimed to yield 1200-1300 usable genotypes per sample (Figure 4A), translating into a success rate of 80-87% (Genechip® HuSNP Mapping Assay, Technical Note No.1, Part No 700318, Affymetrix). It is clear that higher success rates are required for assaying coding SNPs and mutation panels.

DNA-modifying enzymes in microarray genotyping

In traditional reaction formats DNA polymerases and ligases have improved genotype discrimination under uniform reaction conditions, making multiplex genotyping more feasible. The use of DNA polymerases to improve microarray genotyping was the main target of this thesis: work by others is discussed below, and a more detailed comparison of approaches is provided in the Results and Discussion section. In minisequencing on DNA-microarrays, detection primers are immobilized at their 5'-end and designed to anneal just 5' to the nucleotide on the target; a DNA polymerase incorporates labeled ddNTPs complementary to the site of interest with high specificity (II). The same method has been denoted arrayed primer extension (APEX, Shumaker et al. 1996, Kurg et al. 2000), nested GBA (Head et al. 1997) and multibase single-stranded primer extension (Dubiley et al. 1999). In the model experiment, 5 successive base pairs in the HPRT gene were scanned using three ssDNA samples, and extension was carried out with T7 DNA polymerase incorporating 32P-dNTPs in four parallel reactions (Shumaker et al. 1996). Recently, single nucleotide primer extension was applied to the analysis of 10 allelic variants of the beta-globin gene (Figure 4B and Kurg et al. 2000). The primer extension was performed in both orientations of the template with a set of four ddNTPs, each labeled with a different fluorophore, using a modified thermostable DNA polymerase. Results of the primer extension reactions were read with a four-colour evanescent wave laser excitation imaging system.
The average genotype discrimination was nearly 40-fold in the nine tested samples.

FIGURE 4. Images of SNP scoring by A) ASO hybridization on high-density DNA arrays and B) single nucleotide primer extension on spotted primer arrays. A) A small portion of the HuSNP mapping assay (Affymetrix, Santa Clara, CA) is shown. Redundant sets of ASO probes for the SNPs are synthesized on the DNA chip, and genotypes can be scored from each sample. Image courtesy of Affymetrix, Inc. B) An image of an assay for scoring six SNPs by single nucleotide primer extension with four differentially labeled ddNTPs with TIRF detection. Duplicate spots of detection primers for each nucleotide, interrogated from both orientations, are printed on an activated glass slide. The four emission wavelengths of the fluorophore-labelled ddNTPs are collected separately. Genotype scoring for the illustrated example is provided in the table below. Image courtesy of Dr. Ants Kurg, Tartu, Estonia.

Dubiley et al. used a similar procedure on gel-pad arrays, with a single fluorophore and the assay divided into four separate reactions, to detect seven beta-globin alleles in eight patients (Dubiley et al. 1999). The reported genotype discrimination was lower, with up to 20% misincorporation rates. Detection of primer extension products by mass spectrometry has been demonstrated for three polymorphisms in silicon wells in which the PCR product was immobilized covalently (Tang et al. 1999). Despite the speed of the mass-spectrometric measurement itself, the study by Tang and colleagues suffered from a cumbersome reaction procedure and only limited miniaturization (a 4 cm² chip with 36 wells): further studies are required to demonstrate the high-throughput potential of array-based MALDI-TOF spectroscopy for genotyping in practice. Another polymerase-assisted primer extension assay is based on two immobilized detection primers with the 3'-end complementary to one or the other allele, denoted the multiprimer extension assay (Dubiley et al. 1999) or the allele-specific extension assay (V).
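The minisequencing principle described above, in which the polymerase extends each detection primer by the single labeled ddNTP complementary to the template base, lends itself to a simple genotype-calling rule based on signal ratios. The sketch below is an illustration only and is not the scoring procedure of any of the cited studies: the function names and the five-fold ratio threshold are hypothetical assumptions.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def expected_ddntp(template_base):
    """ddNTP that the polymerase incorporates opposite `template_base`."""
    return COMPLEMENT[template_base]


def call_genotype(signals, min_ratio=5.0):
    """Call a genotype from the four ddNTP signals at one primer spot.

    `signals` maps each incorporated base to its intensity. The threshold
    `min_ratio` is a hypothetical choice: a call is made only if the called
    signal(s) exceed the next-strongest signal by at least this factor.
    Returns e.g. "AA" (homozygote), "AG" (heterozygote) or None (no call).
    """
    ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)
    (b1, s1), (b2, s2), (_, s3) = ranked[:3]
    if s1 >= min_ratio * s2:
        return b1 + b1
    if s2 >= min_ratio * s3:
        return "".join(sorted(b1 + b2))
    return None


if __name__ == "__main__":
    print(expected_ddntp("G"))
    print(call_genotype({"A": 1000, "C": 20, "G": 15, "T": 10}))
    print(call_genotype({"A": 900, "C": 20, "G": 800, "T": 10}))
    print(call_genotype({"A": 100, "C": 90, "G": 80, "T": 70}))
```

The called bases are those incorporated by the polymerase, so the template allele is their complement; interrogating both orientations of the template, as in the study by Kurg et al., gives two such calls per site that can be checked against each other.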
A total of 56 genotypes were produced at seven sites using the minisequencing or the multiprimer extension assay with DNA polymerase and fragmented dsDNA templates on gel-pad arrays, yielding similar base calling accuracy (Dubiley et al. 1999). Ligation has thus far been applied to screening known mutations on DNA-microarrays in one published study (Gerry et al. 1999), in which a universal zip-code array was used. Zip-coded or tagged arrays (Figure 5) are adopted from the molecular bar-coding strategy of yeast deletion strains (Shoemaker et al. 1996). An array with nine zip-code oligonucleotides was produced, and allele-specific ligation probes with a 5'-sequence complementary to one of the array-immobilized oligonucleotides were used in the ligation reaction. Following the ligation reaction in solution, the ligated products were hybridized on the zip-code arrays, and nine alleles at three K-ras sites could be demonstrated. Groups at Affymetrix (Robert Lipshutz, The Microarray Meeting, Scottsdale, AZ, Sept. 1999) and the Whitehead Institute (Hirschhorn et al. ASHG 1999 Annual Meeting A1418) are developing the tagged single base pair extension method (TAG-SBE) (Figure 5). The method is based on a generic array of tag sequences and a multiplex minisequencing reaction performed in solution using differentially labeled F-ddNTPs, with detection primers each carrying a unique 5'-sequence complementary to one of the tag sequences immobilized on the array. Following the minisequencing reaction, the primers are hybridized on the tagged arrays and the incorporated nucleotides are identified by a multicolour array reader. The advantage of the TAG-SBE or zip-code ligation approach is the generic array design suitable for any set of SNPs. Furthermore, high-density arrays can be applied with DNA polymerase based allele discrimination. The cost of the genotyping will not be lower than with primer extension directly on the array (Kurg et al.
2000), since the same amount of primer synthesis is required for each new set of SNPs, and the multiplexing capacity of the solution reaction is most likely no higher than on arrays, as there are more interacting DNA sequences in the reaction mixture.

[Figure 5 panels: a multiplexed solution reaction (OLA or minisequencing) with different detection oligonucleotides, each carrying a unique non-specific "TAG" sequence; a generic array of "TAG"-complementary probes. The site of hybridization is specific for a given variant locus, and the ligated probe or incorporated nucleotide identifies the allele at each site.]
FIGURE 5. Principles of the "Zip-code ligation" array and "TAG-SBE" array strategies.

Summary

Hybridization-based SNP scoring systems have been applied in sufficiently large array designs to demonstrate their inherent weaknesses in genotype discrimination. The seemingly unavoidable loss of 20-40% of SNP markers/mutations due to a suboptimal assay principle is serious, and a final estimate by users of the throughput of the systems is lacking. While enzyme-based systems do improve genotype discrimination, they have not been utilized in simultaneous assays for hundreds of variants, and a fair comparison of reaction principles is thus not yet possible.

Practical alternatives to PCR?

Amplification of the target DNA and reduction of its complexity are achieved by PCR. Parallel analysis of SNPs or mutations necessitates the use of multiplex PCR, for which several procedures have been presented (for example, see Chamberlain and Chamberlain 1994, Shuber et al. 1995, Henegariu et al. 1997, Zangenberg et al. 1999). In practice, each application requires separate optimization of the multiplex PCR reaction (II, IV, V, Hacia et al. 1998, Cheng et al. 1999). Generalized rules that avoid optimization of individual reactions and allow higher multiplexing levels have been applied, at the price of tolerating losses of amplifiable genomic fragments (Wang et al. 1998, Cho et al. 1999). Whole genome amplification procedures (e.g. Zhang et al.
1992) do not produce equal numbers of copies and, more importantly, provide no decrease in complexity. An isolated report of detection of single nucleotide variation in total human genomic DNA using ASO dot blots with competitive hybridization (Wu et al. 1989) is in sharp contrast to the high background hybridization of ASO probes in Southern blots (Conner et al. 1983). Furthermore, despite the successful detection of point mutations in the 250-fold less complex yeast genome by high-density arrays (Winzeler et al. 1998), reduction of the complexity of the human genomic sample is likely to be a prerequisite for SNP scoring with oligonucleotide probes. An issue in SNP typing is whether PCR could be replaced with methods providing higher throughput. A few nanograms of genomic DNA is sufficient for routine PCR reactions, and 100-1000 µg of genomic DNA can be extracted routinely from a 10 ml whole blood sample (personal comm. Dr. M. Perola, National Public Health Institute, Helsinki, Finland). A single blood sample would thus be sufficient for 10^5-10^6 separate PCR reactions, translating into 10^6-10^7 SNP genotypes with modest multiplexing. Of the proposed signal amplification procedures, the rolling circle amplification of padlock probes (Lizardi et al. 1998, Baner et al. 1998) has been demonstrated to have the ability to detect single-copy sequence in total genomic DNA (Lizardi et al. 1998). The demanding synthesis of full-length, pure 70-90 bp padlock probes (Kwiatkowski et al. 1996) and topological factors hindering efficient rolling circle amplification of the target-bound probe on solid surfaces (Baner et al. 1998) are challenging practical issues that must be resolved. The Invader Squared Assay has been applied to simplex detection of SNPs (Griffin et al. 1999), and multiplexing should be as problematic as in PCR, making it uncompetitive with high-throughput PCR-based assays in its current form.
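The estimate given above of the number of genotypes obtainable from a single blood sample can be checked with simple arithmetic. In the sketch below, the value of 1 ng of genomic DNA per PCR reaction and the 10-plex multiplexing level are assumptions chosen to represent "a few nanograms" and "modest multiplexing"; the function names are hypothetical.

```python
def pcr_reactions_per_sample(dna_yield_ug, dna_per_reaction_ng=1.0):
    """Number of PCR reactions supported by one DNA extraction.

    `dna_per_reaction_ng` defaults to 1 ng, an assumed lower bound for
    "a few nanograms"; larger inputs reduce the count proportionally.
    """
    return dna_yield_ug * 1000.0 / dna_per_reaction_ng


def genotypes_per_sample(dna_yield_ug, multiplex_level=10,
                         dna_per_reaction_ng=1.0):
    """SNP genotypes per extraction at an assumed multiplexing level."""
    reactions = pcr_reactions_per_sample(dna_yield_ug, dna_per_reaction_ng)
    return reactions * multiplex_level


if __name__ == "__main__":
    # 100-1000 µg per 10 ml of whole blood is the range quoted in the text.
    for yield_ug in (100, 1000):
        print(f"{yield_ug} µg: "
              f"{pcr_reactions_per_sample(yield_ug):.0e} reactions, "
              f"{genotypes_per_sample(yield_ug):.0e} genotypes")
```

With 5 ng per reaction instead of 1 ng the figures fall five-fold, so the quoted ranges are best read as order-of-magnitude estimates.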
The same limitations apply to the other target amplification or signal amplification techniques presented earlier. Furthermore, the Invader assay requires significantly more starting material than PCR. Some experimental techniques with extreme sensitivity, such as fluorescence correlation spectroscopy (Rigler 1995), suffer from high background signals and are not applicable to real-life diagnostic applications. Simplification of the array-based methods could be achieved by coupling the amplification and detection on the solid surface, and several groups are exploring ways to do this. Electronically addressable arrays were used for the anchored SDA reaction, which showed enhanced amplification with electronic addressing of one of the SDA amplification primers to create microamplification zones. Specific amplification of the factor V gene was achieved using 1 µg of placental DNA as the starting material (Westin et al. 2000). Localized amplification by PCR on a polyacrylamide film has been demonstrated (Mitra & Church 1999). Another on-chip amplification method, a form of in situ PCR, is termed bridge amplification; in it, forward and reverse primers for each amplicon are co-immobilized at specific sites on the chip [US patent #5,641,658], resulting in double-stranded amplicon bridges on the array sites. These in situ amplification techniques remain curiosities until their utility in true genotyping applications is shown. In conclusion, at least for the first years of large-scale SNP scoring we will have to rely on classical multiplex PCR followed by more extensively multiplexed high-throughput detection reactions. Automation and scale-up of existing methods have made genome sequencing facile (Meldrum et al. 2000). Similar streamlined procedures for PCR and genotyping using submicroliter reaction volumes are likely to be the answer, at least for large SNP scoring centers.
For example, a single automated 384-well format SNP production line with two-microliter PCR reaction volumes and mass-spectrometric detection of primer extension products (Sauer et al. 2000) yields >10^4 genotypes per day (personal communication - Dr. Ivo Gut, Centre National de Genotypage, Evry, France). Point-of-care testing with targeted SNP or mutation panels might be practical to carry out using integrated miniaturized devices (Jacobson & Ramsey 1995, Cheng et al. 1996).

Alternatives to microarrays for multiplexing

Color-coded microspheres (Luminex, Austin, TX) carrying on their surface degenerate sets of tagged probes similar to those described for zip-code ligation or TAG-SBE can be used to detect solution-based multiplex minisequencing products. A flow-cytometric measurement with two exciting lasers, one for identification of the microsphere color-coding (SNP) and another for identification of the incorporated label (allele), easily resolves up to 50 different mutations in the same mixture (Dr. Allen Roses, Glaxo Wellcome, UK and Dr. Scott White, Los Alamos National Laboratory). Hakala et al. relied on sandwich hybridisation with fluorescein/EDANS-labeled beads and lanthanide-labeled ASO probes for detection of six different mutations, with detection on a microfluorometer (Hakala et al. 1998). An advantage of these color-coded sphere multiplexing methods is a potentially very fast read-out requiring no image analysis. A related, futuristic approach pioneered by PharmaSeq Corp. is based on coding the beads by integrating a digital transponder into each bead, which could result in almost infinite sets of different reactions in the same mixture [US Patent #5,736,332].

Use of sequence variations in modern human genetics

Routine mutation/SNP-scoring

Many of the techniques with high SNP-typing capacity could be used to simplify and improve current routine determination of sequence variation.
For example, HLA typing would be ideally suited for array-based genotyping, as relatively few amplicons span a large number of SNPs in the HLA region. Unequivocal identification of individuals with tens of SNPs should be feasible (Syvanen et al. 1993, Delahunty et al. 1996). The pharmacogenetic field is an interesting and rapidly evolving one (Evans & Relling 1999), and array-based products are already available for studying cytochrome P450 polymorphisms (Affymetrix). Despite the shift of interest in the genetic research community towards studying common diseases, clinical geneticists now have efficient tools for the diagnosis and, in some cases, prevention of inherited disorders. Most recessive traits, even those with high allelic heterogeneity, could be diagnosed and screened for using mutation panels on arrays or other highly multiplexed formats. Dominant disorders would generally require gene-specific resequencing arrays, which do not yet have sufficient sensitivity for clinical diagnostics. An example of applying novel array-based detection methods to large-scale mutation screening was a population-based survey of disease gene frequencies in Finland (Pastinen et al. manuscript in preparation).

LD mapping of complex traits

Linkage disequilibrium mapping of complex diseases is based on an approach in which a large number of markers, either throughout the genome or selectively flanking candidate loci, are typed to identify markers that are close to the disease-causing mutation, leading to cosegregation of the markers with the disease trait in families. Alternatively, sporadic disease cases can be analyzed with matched controls in association studies (Risch and Merikangas 1996), which ignited suggestions for large-scale screening for SNPs (Collins et al. 1997). A complication of such analysis with complex traits is the old age of the predisposing mutations, which decreases the extent of linkage disequilibrium between the marker and the disease allele (Laan & Paabo 1997).
Suggestions for the number of SNP markers required for genome-wide LD mapping of complex traits have varied from 30,000 (Lonjou et al. 1999) to 500,000 (Kruglyak 1999). Recent studies of the distribution of genomic diversity in different populations using systematic genomic sequencing (Nickerson et al. 1998, Clark et al. 1998, Rieder et al. 1999) or SNP typing (Goddard et al. 2000, Moffatt et al. 2000) indicate difficulties in designing and interpreting LD scans or SNP-based association studies. The results show a highly variable distribution of LD between different genes, between different parts of the same gene and also between different populations. Cataloguing SNPs in the coding regions of candidate genes has been considered a rational approach to complex genetic traits, as these SNPs can be expected to be more likely to be functionally significant (Cargill et al. 1999, Halushka et al. 1999). Carefully designed studies with a good definition of the phenotype, a large sample size and sufficient marker density are required to settle the dispute over the usefulness of SNPs in the dissection of complex traits. Currently, there are nearly 30,000 human genomic biallelic polymorphisms (www.ncbi.nlm.nih/dbSNP/) in the public domain. This number will surge to hundreds of thousands of SNPs in the near future through large-scale non-targeted (Masood 1999) and gene-targeted SNP discovery projects (Cargill et al. 1999, Halushka et al. 1999, Buetow et al. 1999). While it is unlikely that many (or any) of the groups studying complex diseases will initiate a whole-genome association study, requiring >>10^4 SNPs typed from each sample, it is clear that the number of SNPs required for refined mapping or candidate gene association studies will be in the order of hundreds to thousands. This increases the number of genotypes per study by two to three orders of magnitude as compared to most of the association studies to date.
Some of the techniques described above have the potential to accomplish this at reasonable cost. In the next few years we will be analyzing the genomic architecture of individual genes and of the whole genome with unparalleled precision. Consequently, we are likely to witness many exciting and important discoveries on genetic predisposition to human disease traits, similar to the identification of the common genetic risk factor for venous thromboembolism (Bertina et al. 1994) or of the protection against HIV-1 infection conferred by a single deletion allele (Liu et al. 1996).

AIMS OF THE PRESENT STUDY

1) To develop techniques for multiplex analysis of human DNA sequence variation based on the primer extension reaction principle.
2) To apply this technology to studying disease-causing mutations and genetic variation in the Finnish population.

MATERIALS AND METHODS

DNA samples and extraction of DNA

Lymphoblastoid cell lines characterized by the 10th International HLA Workshop and anonymous Finnish individuals were sampled for development of the HLA-DQA1 and DRB1 genotyping system (I). Several clinicians and researchers provided samples from known carriers and patients for development of the mutation screening panels (II, V). Participants in the FINRISK 1992 study (Vartiainen et al. 1994) were analyzed for characterization of polymorphisms related to myocardial infarction (see III for inclusion criteria). DNA collected from HIV-1 infected Finns between 1990 and 1996 and DNA from anonymous healthy controls from different parts of Finland were used to study the MBL and CCR5 polymorphisms (for details see IV). DNA from laboratory personnel and anonymous blood donors, along with samples from the Finnish twin registry, was used as the unknown samples for evaluation of the mutation panel (V). DNA was prepared by a standard phenol-chloroform extraction method (Bell et al. 1981) in all studies except IV, in which a rapid lysis method was used (Higuchi 1989).
The concentration of DNA was determined spectrophotometrically (Sambrook et al. 1989).

Primer synthesis

An Applied Biosystems 392 DNA synthesizer (Foster City, CA) with standard phosphoramidite chemistry was used for synthesis of primers in I-II, and the oligonucleotides were used without purification. In all the subsequent work (III-V) primers were purchased from Interactiva Biotechnologie GmbH (Ulm, Germany) and had been HPLC purified, with the exception of the amino-modified primers.

PCR amplification

Dynazyme II DNA polymerase (Helsinki, Finland) with a manual hot-start procedure was used in all PCR reactions in I-II, with an MJ Research PTC-100 thermal cycler (Watertown, MA). In the subsequent PCR amplifications (III-V) chemically modified AmpliTaq Gold DNA polymerase (Perkin Elmer, Branchburg, NJ) was used, and the thermal cycling was carried out in thin-walled 96-well plates using an MJ Research PTC-225 thermal cycler. Each amplicon had one non-biotinylated and one biotinylated primer to enable affinity capture for preparation of ssDNA (I-IV). T7 RNA polymerase promoter sequences were included 5' to the gene-specific sequence in one primer of each pair to enable in vitro RNA transcription (V). Multiplex PCR reactions (II-V) were optimized according to the signal intensities obtained in the genotyping reactions on the arrays. The primer concentrations in the individual reaction mixes were modified, the primer pairs were reordered into different groups and individual primer pairs were replaced to optimize multiplex amplification, while the other amplification reaction parameters were kept constant. To facilitate multiplex amplification reactions, common 5'-tail sequences were added to the primers, DNA polymerase concentrations were increased, and longer extension times (II-IV) or a touch-down PCR cycling procedure (V) were used.
Affinity capture and ssDNA preparation

For HLA typing (I) the combined PCR products were captured on streptavidin-coated manifolds (Amersham-Pharmacia Biotech, Uppsala, Sweden) (I, Lagerkvist et al. 1994). For genotyping on arrays by minisequencing the biotinylated PCR products were captured on streptavidin-coated polystyrene beads (Idexx Research Products, Westbrook, ME) (II-IV), and for reference genotyping by solid-phase minisequencing (Syvänen et al. 1990) they were captured in streptavidin-coated microtiter wells (Labsystems, Helsinki, Finland) (III, V). Following alkaline denaturation the captured strand was subjected to primer extension in the gel-based multiplex (I) and standard minisequencing methods, while the eluted strand served as the template in minisequencing on DNA-arrays with immobilized primers (II-IV). In V dsDNA served as template.

Electrophoretic separation

Labeled minisequencing primers and group-specific PCR products were separated in an ALF automated sequencer (Amersham-Pharmacia Biotech). The streptavidin-coated manifolds were directly inserted into the wells of a 10% Hydrolink polyacrylamide gel (Long Ranger, AT Biochem, Malvern, PA), in which the minisequencing detection primers were released by denaturation. The gels were run for 55-65 min and reloaded up to six times. The results were interpreted using the ALF Fragment Manager version 1.1 software (Pharmacia Biotech).

Preparation of microarrays

Microscopic glass slides with teflon-lined wells (Erie Scientific, Portsmouth, NH) were treated essentially as previously described (Lamture et al. 1994) to yield an epoxysilanized surface (II-IV). A modification of an isothiocyanate activation method (Guo et al. 1994) was applied on standard microscope glass slides (V). Diluted solutions (20 µM of each primer) of the NH2-modified oligonucleotides were prepared in 0.1 M NaOH or 0.4 M NaHCO3 (pH 9.0) and immobilized on the epoxysilane or isothiocyanate surface, respectively.
The oligonucleotide solutions were spotted in an array format either manually (II) or with a contact printing robot (III-V). Initially (III-IV) custom-made tweezer-like printing pins (Shalon et al. 1996) were used on a modified Isel EP 1090/4 XYZ robot (Eiterfeld, Germany), which was replaced (V) by the faster Isel Automation Flachbettanlage 2 robot equipped with two TeleChem CPH-2 (Sunnyvale, CA) printing pins. One detection primer for each mutation or SNP to be detected was immobilized on the minisequencing arrays (II-IV), while two allele-specific primers for each site were required for the allele-specific extension arrays (V). The spot diameter was 300 µm with the custom-made pin or 125-150 µm with the CPH-2 pin; the spotting density was 200 (III-IV) or 2000 spots/cm² (V), respectively. The slides were stored for up to 2 months at -20 to -70 °C.

Genotyping reactions

Multiplex fluorescent minisequencing was carried out on the ssDNA templates immobilized on the manifold supports using either T7 DNA polymerase at +37 °C or ThermoSequenase DNA polymerase at +50 °C (both from Amersham-Pharmacia Biotech) with fluorescein-labeled ddNTPs (NEN Dupont, Herts, UK) in four parallel slots for each sample. Following non-stringent annealing of ssDNA to the arrays, the optimized (II) reaction mixture included DyNASeq DNA polymerase (Finnzymes) and one of the four 33P-labelled ddNTPs (Amersham-Pharmacia Biotech) in each of four parallel wells of the slide, and the reaction was allowed to proceed at +60-65 °C for 1-15 min (II-IV). The optimized (V) allele-specific extension reaction coupled the template preparation and detection reactions. The reaction mixture contained minute amounts of the combined, T7-tailed multiplex PCR product, T7 RNA polymerase, MMLV reverse transcriptase, all ribonucleotides, unlabelled dATP and dGTP, and CY5/CY3-labelled dUTP and dCTP (Amersham-Pharmacia Biotech). Trehalose was included in the reaction buffer to stabilize and activate the enzymes (Carninci et al.
1998) at +52 °C and the reaction time was 45-90 min.

Quantitation and interpretation of the results

Fluorograms showed well-resolved patterns of peaks corresponding to a particular DQA1 genotype or DQA1 DRB1 subgroup genotypes in the multiplexed minisequencing procedure based on electrophoretic separation of the extended primers (I). The minisequencing arrays were exposed to an imaging plate for 30 min to 2 h following the reactions (II-IV) and scanned with 100 µm resolution in a Fuji BAS-1500 phosphorimager (Kanagawa, Japan), and signal intensities at each spot were quantified using Tina 2.10 software (Raytest, Straubenhardt, Germany). The fluorescently labeled allele-specific extension reaction results were scanned using a ScanArray 4000 system (GSI Lumonics, Watertown, MA) with 5 µm resolution, using excitation at 630 nm and emission at 670 nm for CY5, and excitation at 540 nm and emission at 570 nm for CY3 (V). The results were quantified using the Scanalyze 2.44 software (Michael Eisen, Stanford University, CA). The genotype at each site was determined by dividing the signal intensity at the nucleotide (or probe) corresponding to one allele by the signal intensity at the nucleotide (or probe) corresponding to the other allele. The ratios fell into three distinct clusters: high for homozygotes for one allele, approximately one for heterozygotes, and low for homozygotes for the second allele (II-V). For detection of minor mutations, a dual colour approach was used. A known sample was typed on the same array as the test sample, using a different fluorophore for extending each sample. The results of the test sample were then normalized according to the signal intensities of the known control sample.

Reference methods

In most cases requiring genotype validation or verification (III, V) a standard solid-phase minisequencing procedure (Syvänen et al. 1990, 1992) was used. In some cases (the NKH and LPI mutations, V) PCR-RFLP digestion was done as described (Kure et al.
1992, Torrents et al. 1999). For the Salla disease mutation (Verheijen et al. 1999) an allele-specific PCR reaction (Wu and Wallace 1989) with an internal control amplicon was used.

Statistical methods

The statistical significance of differences in marker allele frequency distribution between cases and controls was calculated by the χ² test or Fisher's exact test (III, IV). In study III, the combined effect of GpIIIa and PAI-1 was analyzed by comparing individuals carrying 3 or 4 alleles of PlA2 or 4G with those having only 1 or 0 such alleles. A logistic regression model including age, sex, total cholesterol, HDL-cholesterol and triglyceride levels, body mass index (BMI) and smoking was utilized to control for environmental effects. Hardy-Weinberg equilibrium was analyzed with the Genepop 1.0 software (III-V).

RESULTS AND DISCUSSION

In the following chapters the results of this thesis are presented and discussed within the framework of closely related work by other groups.

Design of assays

Length-labeled multiplex fluorescent minisequencing

A primer pair for amplifying the second exon of the DQA1 gene was designed. Due to the extensive allelic heterogeneity of the DRB1 gene, a two-step strategy for subtyping DR2 alleles was used. Allele-specific PCR reactions divided the DRB1 alleles into seven subgroups identified by their size (Westman et al. 1993). Initially six sites identifying 10 alleles were chosen disregarding overlap of some detection primers, and in each case only one of the overlapping primers performed acceptably. The selection of sites was thereafter modified to allow discrimination of the alleles with non-overlapping primers (I), and the same design was applied for the DRB1 subtyping primers. The minisequencing primers had 18-21 bp of gene-specific sequence and random 5' tails to distinguish each primer by its length. The DR2 subgroup typing primers were designed similarly for the three sites identifying the five alleles.
The sizes of the detection primers were 18-42 bp with a 3-bp difference between primers. The target bound to an avidin-coated manifold was rendered single-stranded and multiplex minisequencing was carried out in four parallel reaction wells using fluorescein-labeled ddNTPs, followed by size separation and detection of the extended primers on the A.L.F. sequencer. Size separation of multiplex minisequencing products was first proposed by Krook et al. (Krook et al. 1992), who used three polymorphisms in the glucose transporter and insulin receptor genes. In that study 32P-labelled dATP or dCTP was incorporated in the minisequencing reaction. Four-color detection on PE-ABI sequencers has been utilized to detect HPRT mutations (Shumaker et al. 1996), mitochondrial SNPs (Tully et al. 1996) and mouse SNPs (Lindblad-Toh et al. 2000). The use of capillary electrophoresis for detection of multiplexed minisequencing products has also been suggested (Piggee et al. 1997). Detection of short dinucleotide repeats with two minisequencing primers is also possible (Tully et al. 1996), but the approach presented is not generally applicable, as the repeat has to be shorter than the oligonucleotide, and furthermore, multiplexing of dinucleotide repeat minisequencing is not feasible. We have also designed a multiplex minisequencing assay for genotyping the common pharmacogenetic polymorphisms in the CYP2D6 and CYP2C19 genes, essentially as described for the HLA-typing (Pastinen et al. 1999). This assay has proven flexible, and has recently been extended for typing several additional polymorphisms in the CYP2D6 and NAT2 genes (Sitbon and Syvänen, in press). Generally, all the multiplexed primer extension assays have similar designs, avoiding overlapping detection primers and having a size difference of two or more bases generated by synthesizing a non-specific 5'-tail sequence.
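The length-addressing scheme described above (detection primers of 18-42 bp separated by 3 bp, each consisting of a gene-specific part plus a non-specific 5' tail) can be illustrated with a small sketch. It only computes how long each tail would need to be; the gene-specific lengths in the example are hypothetical and are not the primers used in study I, and the function name is introduced here for illustration only.

```python
def length_ladder(specific_lengths, start=18, spacing=3):
    """Assign a 5'-tail length to each detection primer so that the total
    primer lengths form a ladder: start, start + spacing, ...

    specific_lengths: length (nt) of the gene-specific part of each primer,
    listed in the order in which the primers should appear on the ladder.
    """
    plan = []
    for i, specific in enumerate(specific_lengths):
        total = start + i * spacing      # target total length of primer i
        tail = total - specific          # non-specific 5' tail that fills the gap
        if tail < 0:
            raise ValueError(
                f"primer {i}: gene-specific part ({specific} nt) "
                f"is longer than the target length ({total} nt)"
            )
        plan.append({"total": total, "specific": specific, "tail": tail})
    return plan


# Hypothetical gene-specific lengths, all within the 18-21 nt range
# mentioned in the text (assumed values, for illustration only).
specific = [18, 21, 19, 20, 18, 21, 20, 19, 21]
ladder = length_ladder(specific)
for p in ladder:
    print(f"total {p['total']:2d} nt = {p['tail']:2d} nt tail + {p['specific']} nt specific")
```

With a start of 18 nt and 3-nt spacing, nine primers fit between 18 and 42 nt, which matches the size range quoted above. A narrower spacing would fit more primers into the same range, at the cost of requiring better size resolution.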
Also, all the assays are based on affinity capture of the PCR products followed by solid-phase minisequencing on the ssDNA targets. Exciting possibilities for multiplexing minisequencing with size addressing are now offered by the exquisite resolution and accuracy of MALDI-TOF mass detection (Ross et al. 1998).

Minisequencing primer extension arrays

Minisequencing detection primers designed to interrogate different mutations or SNPs can be addressed to discrete sites on a solid surface. After target annealing and extension with labeled ddNTPs, the genotype at each site can be read based on the identity of the incorporated label. Initially we immobilized detection primers for mutations occurring in Finland, which ranged from single substitutions to small and large deletions (II). The immobilization was achieved via a 5'-amino group (Lamture et al. 1994), and non-specific spacer tails separated the primers from the surface to enhance annealing (Guo et al. 1994). Standard minisequencing primer design with 19 to 22 bp of gene-specific sequence was used. Multiplex PCR was performed with a biotinylated primer, and ssDNA was eluted for analysis after affinity capture. The target was annealed on 4 parallel arrays, and the extension reactions using thermostable DNA polymerase with 33P-labelled ddNTPs were carried out, followed by detection on a phosphorimager. A similar procedure was applied to common sequence variants of the MBL and CCR-5 genes (IV), and to SNPs related to cardiovascular disease (III). An analogous procedure for minisequencing primer extension on a glass surface with 32P-dNTPs was shown to be possible in a non-multiplexed format in model experiments for detection of a 5-bp sequence of the HPRT gene (Shumaker et al. 1996). A tiled array design to scan a 33-bp stretch of the p53 gene also employed four reactions in parallel using FITC-ddNTPs as terminators, followed by alkaline phosphatase mediated generation of a fluorogenic substrate (Head et al. 1997).
The tiled design demonstrates the significant advantage of having primers rather than templates immobilized on a solid phase (Syvänen et al. 1990, I), as closely occurring SNPs or mutations can be interrogated in a multiplexed format (III). The limitation imposed by competitive binding of primers in solution applies to the TAG-SBE approaches, in which the reaction takes place in solution (Hirschhorn et al., ASHG 1999 Meeting). Direct detection of minisequencing extension on an array using fluorescently labelled ddNTPs in four parallel reactions has been achieved using fluorescein (Dubiley et al. 1999) or TAMRA labels (Raitio et al., manuscript). Recently, the use of four different dye terminators in array minisequencing in a single reaction chamber was demonstrated (Kurg et al. 2000). The fidelity of a DNA polymerase in single base extension is the basis for allele discrimination in array minisequencing assays, and the remarkably similar designs of the assays by different groups are notable. In contrast, ASO hybridization based array assays have highly dissimilar designs. Various approaches to generalize the design of ASO-probe arrays for mutation detection have been suggested. Highly redundant probe sets (Cronin et al. 1996, Wang et al. 1998, Cho et al. 1999), individual probe optimization (Guo et al. 1994), monitoring of melting curves of the hybrids (Drobyshev et al. 1997), stacking hybridization (Yershov et al. 1996), use of chaotropic salts in the hybridization buffer (Nguyen et al. 1999) and utilization of an electronic charge changer (Gilles et al. 1999) have been described to overcome the inherent limitations of multiplex discrimination of genotypic variation by ASO-hybridization.

Allele specific extension arrays

DNA polymerase extension is hindered by 3' mismatches in the primers. This property has commonly been applied for allele-specific PCR (Wu and Wallace 1989, Sommer et al. 1989, Newton et al.
1989), but due to the exponential nature of the PCR reaction, even limited mismatched extension is a problem: careful optimization of reaction conditions is generally required, and multiplexing is difficult (Ferrie et al. 1992). We designed arrays with one primer having a 3' end complementary to one allele and another primer with a 3' end complementary to the other allele for 31 mutations and 9 SNPs (V). Multiplex PCRs with each amplicon having one primer with a 5' T7 RNA polymerase promoter sequence were carried out. RNA targets were generated from the PCR amplicons along with reverse transcriptase extension of the allele-specific primers directly on the arrays. Incorporation of CY5- or CY3-labeled dNTPs was detected by a confocal epifluorescence reader. This approach utilizes the fidelity of reverse transcriptase in discrimination against terminal mismatches. Allele-specific extension rather than minisequencing is used because reverse transcriptases do not incorporate ddNTPs with sufficient sequence specificity (II). In a pairwise comparison of minisequencing and allele-specific extension, a notably higher (100-fold) amount of PCR product was required to achieve a sufficient signal-to-noise ratio for genotype calling in minisequencing (Pastinen et al., unpublished). Allele-specific extension of primers immobilized on arrays has also been shown to be feasible for seven β-globin mutations by using fragmented DNA targets, thermostable DNA polymerase and fluorescently labelled ddNTPs (Dubiley et al. 1999).

Optimization of genotype discrimination

Throughout the work presented in this thesis the goal was to create assays which discriminate homozygous and heterozygous genotypes simultaneously for several loci in parallel, with procedures that would be practical to apply to genomic DNA.
No artificial templates such as synthetic oligonucleotides or cloned DNA fragments were used in the work to establish the method, as assays will in most cases perform differently with amplified genomic targets containing possible non-specific fragments and other reaction components.

Length-labeled multiplex minisequencing assays

T7 DNA polymerase incorporates fluorescein-labelled ddNTPs with different efficiencies, a feature which was only partly alleviated by the use of Mn²⁺ in the reaction buffer, so that different concentrations of the nucleotides had to be used in the reaction mixture. Our preliminary results indicated that carrying out the extension reaction with a modified thermostable polymerase, ThermoSequenase, would require less optimization of individual terminator concentrations (I), but further experience with another application (Pastinen et al. 1999) did not support this. The detection primer concentrations were also adjusted for each individual site. Similar optimization steps have been used in two other multiplex minisequencing applications (Shumaker et al. 1996, Tully et al. 1997). Less optimization is apparently required in procedures utilizing linear amplification of minisequencing extension by thermal cycling (Lindblad-Toh et al. 2000, Ross et al. 1998).

Array-based assays

Studies on hybridization behavior on solid glass supports had indicated that sufficient spacing from the surface is required to achieve efficient annealing of the template to the immobilized probes (Guo et al. 1994). We tested detection primers containing a 15-mer dT-spacer, and 15-25-mer detection primers without spacer tails. Significantly improved minisequencing extension efficiency was observed with the tailed probes (T.P. unpublished observations), and thus all the subsequent array assays utilized detection primers with a 15- or 9-mer dT-tail 5' to the gene-specific sequence. An epoxysilanization procedure (Lamture et al.
1994) to immobilize the aminated detection primers was initially applied (II-IV). In a comparison of different immobilization chemistries (Lindroos K., M.Sc. thesis, Helsinki University of Technology, 1998) an isothiocyanate activation procedure (Guo et al. 1994) to immobilize detection primers was found to be superior in minisequencing extension efficiency, and a modified immobilization procedure on isothiocyanate-activated surfaces was applied thereafter (V). Primer extension on arrays (Head et al. 1997) using disulfide-modified detection primers immobilized on mercaptosilanized glass slides via a disulfide bond exchange reaction (Rogers et al. 1999) has also been successful. This chemistry limits the use of reducing agents in the reaction mixture, and is thus not suitable for use in the allele-specific extension reaction (V), because the RT enzyme requires dithiothreitol. Both minisequencing and allele-specific extension have been demonstrated on detection primers immobilized on acrylamide gel-pads (Dubiley et al. 1999). The 3-dimensional gel-pad enables higher loading of the detection primer compared to a plain glass surface, but the access of the target molecules and enzymes to the probes embedded in the gel matrix may be hampered; a pairwise comparison of the approaches would be required to evaluate this. A non-thermostable DNA polymerase was applied in the first description of minisequencing on a glass surface (Shumaker et al. 1996). We compared several different enzymes for their extension specificity in multiplexed minisequencing on DNA-arrays (II), and specific extension using 33P-ddNTPs was only achieved with modified thermostable DNA polymerases. The significant decrease in mismatched extension at high reaction temperatures is likely to be based on a lower degree of secondary structure formation (Mir et al. 1999) of the probe immobilized on the array, accompanied by destabilization of targets annealed non-specifically to the detection primers.
Importantly, the use of RNA targets in minisequencing assays is hindered by the high misincorporation rates of ddNTPs by reverse transcriptases (II), but RNA targets were shown to be suitable in the allele-specific extension assay. In the allele-specific extension procedure an MMLV enzyme with low RNase H activity and processivity was found to yield better genotype discrimination and less template-independent extension than other reverse transcriptases (V). Excellent genotyping results were obtained at reaction conditions employing high temperature and trehalose in the reaction buffer to stabilize and activate the enzyme at the elevated reaction temperature (Carninci et al. 1998).

Assay procedures

The key considerations in development of multiplexed genotyping assays are that the procedure should consist of a minimal number of steps and that it should allow complete automation. Several detection technologies possess a very high capacity, including fluorescent read-out from high-density arrays (Lipshutz et al. 1999) and MALDI-TOF spectrometry (Griffin and Smith 2000). Template preparation, which requires PCR amplification and possibly concentration, purification and inactivation of certain reaction components, is often overlooked. A second consideration is the general accessibility of the method, since proprietary technologies will by necessity increase the cost of the assays and limit their use to large centers. It is now clear that SNP typing on a large scale will be of central importance in human genetics in the years to come, and methods should thus preferably be applicable in many molecular biology laboratories. An illustration of the importance of accessibility is the technology for cDNA array expression analysis (Brown and Botstein 1999), which has spread in the scientific community due to active support of its dissemination by the original developers (Futcher 1999).
Multiplex PCR

By far the most labor-intensive aspect of multiplexing the genotyping by primer extension assays is the optimization of concurrent amplification of several genomic fragments in a single reaction vessel. The applications presented here were targeted at distinct polymorphisms or disease mutations, and it was not acceptable to discard loci that were difficult to amplify, unlike the situation in random genome-wide mapping by SNPs (Wang et al. 1998, Cho et al. 1999). In multiplex PCR the goal is to unify the character of different amplicons to allow their concurrent amplification (for a review, see Zangenberg et al. 1999). Multiplex amplification requires high concentrations of numerous synthetic oligonucleotides in a single reaction mixture, which promotes the generation of non-target derived amplification products or primer-dimers (Chou et al. 1992). The formation of primer-dimers can be reduced by avoiding exposure of the reaction to low temperatures (for example when setting up the reaction), which is in practice achieved by adding enzyme or nucleotides only after the reaction reaches high temperatures. A general improvement of multiplex PCR performance was introduced by a chemically modified DNA polymerase (AmpliTaq Gold™), which is inactive prior to thermal activation at +95 °C. Having amplicons of similar and preferably small size has been reported to yield well-working multiplex PCRs (Wang et al. 1998); in practice, however, design of very short amplicons is often prevented by the local nucleotide sequence. We attempted to limit the amplicon size to 80-200 bp, with some exceptions, for example in Batten disease, where ALU-repeat sequences flank the common mutation. A strategy to limit primer-dimer formation is to design primers with identical dinucleotides at their 3' ends (Zangenberg et al. 1999), which is a criterion difficult to combine with small fragment size.
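Grouping amplicons by the 3'-terminal dinucleotides of their primers, the strategy referred to above, can be sketched as follows. This is an illustrative sketch only: the primer sequences are made up for the example and are not primers from this work, and the grouping rule (both primers of a pair contribute their last two bases to the group key) is an assumption about how such grouping might be done.

```python
from collections import defaultdict


def three_prime_dinucleotide(primer):
    """Return the last two bases (the 3' end) of a primer written 5'->3'."""
    return primer.upper()[-2:]


def group_by_three_prime_end(primer_pairs):
    """Group amplicons whose forward and reverse primers carry the same
    pair of 3'-terminal dinucleotides.

    primer_pairs: dict mapping amplicon name -> (forward, reverse) primers.
    Returns a dict mapping a sorted dinucleotide pair -> list of amplicons.
    """
    groups = defaultdict(list)
    for name, (forward, reverse) in primer_pairs.items():
        key = tuple(sorted((three_prime_dinucleotide(forward),
                            three_prime_dinucleotide(reverse))))
        groups[key].append(name)
    return dict(groups)


# Made-up sequences, for illustration only (not real primers).
pairs = {
    "amplicon_A": ("ACGTTGCAAGGCTTAGCAA", "TTGCCATGGACTTCAGCAA"),
    "amplicon_B": ("GGATCCATTGCAGGTCAA", "CCTTGAGGATCAGTTGAA"),
    "amplicon_C": ("TTGACCGTAGGCATTCAG", "AAGGTCCATTGCAGGTAG"),
}
groups = group_by_three_prime_end(pairs)
for ends, names in groups.items():
    print(ends, names)
```

Amplicons falling into the same group would be candidates for co-amplification in one multiplex reaction under this criterion.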
In our hands there was no significant increase in multiplex amplification success by grouping amplicons according to their 3'-end primer sequences (T.P. unpublished observations). The use of common non-specific 5' tails in PCR primers (Shuber et al. 1996) and increased concentrations of DNA polymerase (Chamberlain and Chamberlain 1994) improved our multiplex amplification efficiency, while modification of MgCl2 or dNTP concentrations did not. Two- to nine-plex PCRs were set up, applying common 5' tails, increased enzyme concentrations and longer extension times along with modification of the primer concentrations of individual amplicons according to their signal intensities in assays on arrays. Similar success has been reported by others analyzing distinct sets of mutations or polymorphisms (Chamberlain et al. 1988, Hacia et al. 1998).

Length labeled primers for multiplex minisequencing

Solid-phase minisequencing (Syvänen et al. 1990, 1992) is based on immobilisation of amplified templates in microtiter plate wells, followed by alkaline denaturation and minisequencing on the ssDNA template, and finally detection of extended primers in a scintillation counter. A limitation of the method is the requirement of a separate reaction well for each allele to be scored, and thus only 48 genotypes can be obtained per microtiter plate. The throughput of the method can be doubled by carrying out the extension with two differentially labeled ddNTPs. The multiplex fluorescent minisequencing method for genotyping six or nine SNPs in the HLA-DQA1 and DRB1 genes (I) was based on a convenient streptavidin-coated manifold support (Lagerkvist et al. 1994) for capturing amplified products. The manifold support minimized the number of pipetting steps in the procedure, and simplified loading of the gel on an automated sequencer. The extension was carried out with four fluorescein-labelled ddNTPs in parallel, which were then detected on four parallel lanes of an automated sequencer.
The disadvantage of separating the four nucleotides into parallel reactions, rather than carrying out extension with four different fluorophores in the same reaction followed by electrophoretic analysis in a single lane (Shumaker et al. 1996, Tully et al. 1996, Lindblad-Toh et al. 2000), was alleviated by the easy handling of the samples and the less than 60 min separation time in reloadable gels. A single operator could perform genotyping of 100 samples a day, translating into 900 genotypes with the described HLA-typing system, which was a significant increase in throughput as compared to standard solid-phase minisequencing. Interpretation of the results was carried out using software for fragment analysis, in which a threshold level of peak height was set, followed by recording of the simple sequence patterns unequivocally determining alleles. As the electropherograms produced in the process are relatively simple to interpret, automation of genotype scoring should be feasible by modifying the base-calling software used in standard sequencing. The multiplexing capacity of the system is limited by the synthesis of oligonucleotides of different lengths extended in the reaction. Allele scoring should be robust with 2-bp primer spacing, which would provide a 4-fold increase in throughput when combined with the use of different incorporated nucleotides at the same detection primer length (e.g. A to C and G to T transversions could be analyzed with a single oligomer length). Three reports on using multiplex solid-phase minisequencing and four different fluorophores (Tully et al. 1996, Shumaker et al. 1996, Lindblad-Toh et al. 2000) apply magnetic streptavidin-coated beads for immobilisation of templates, and the procedures include multiple washes, centrifugation and concentration steps, all limiting the throughput without extensive automation.
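The throughput figures quoted above follow from simple counting, which can be written out as a short sketch. It assumes a 96-well microtiter plate with two wells (one per allele) for each genotype in standard solid-phase minisequencing, and nine sites typed per sample in the HLA system; the function names are introduced here for illustration only.

```python
def genotypes_per_plate(wells=96, wells_per_genotype=2):
    """Genotypes obtainable from one microtiter plate when each allele
    to be scored needs its own reaction well."""
    return wells // wells_per_genotype


def genotypes_per_day(samples_per_day, sites_per_sample):
    """Genotypes produced per day by a multiplex assay that types
    several sites from every sample."""
    return samples_per_day * sites_per_sample


# Standard solid-phase minisequencing: one well per allele, two alleles.
print(genotypes_per_plate())                        # 48

# Two differentially labeled ddNTPs in one well: one well per genotype.
print(genotypes_per_plate(wells_per_genotype=1))    # 96

# Multiplex HLA typing: 100 samples a day, nine sites per sample (assumed).
print(genotypes_per_day(100, 9))                    # 900
```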
Also, the electrophoresis time for detection primer pools similar to those used in our HLA-typing procedure is reported to be twice as long in the four-color sequence analyzer (Morley et al. 1999, Lindblad-Toh et al. 2000). An attractive alternative for high-throughput genotyping would be to combine a manifold-format solid support with the four-color multiplex minisequencing reaction separated on a standard four-colour sequencer with 96 lanes: a single operator could produce up to 10⁴ SNP genotypes per day, a significant number by any measure at present.

Array-based extension assays

We next developed primer extension on DNA-arrays to achieve higher multiplexing potential and to avoid gel electrophoretic separation. Initial experiments were carried out on arrays prepared by manual spotting of previously synthesized detection oligonucleotides on the activated glass surfaces (II). Better reproducibility, miniaturization and a larger production scale of arrays became possible following construction of custom-built printing robots (P. Niini, M.Sc. thesis, Helsinki University of Technology 2000). The first printing robot was an industrial robot with an XYZ range of movement and had a single tweezer-like pin (Shalon et al. 1996), custom made from stainless steel according to a model pin (kind gift from Dr. Mark Schena, Stanford University, CA). The first pins produced spots of 300 to 500 µm in diameter, and the performance varied between different pins. Nevertheless, the robotic spotting enabled evaluation of the array-based minisequencing in a large number of samples (III, IV). Commercial printing pins (Telechem, Sunnyvale, CA), a faster printing robot with two printheads, and improved activation chemistry for immobilization of aminated primers further improved the throughput of the array production (V). Only one or two primers are immobilized for each mutation to be detected; thus simple spotting robotics provides a large number of arrays at relatively low cost and sufficient speed.
Use of dsDNA targets in minisequencing on arrays resulted in low extension yields (II), and ssDNA targets prepared by affinity capture and elution of the other target strand were thus used (II-IV). An alternative means to produce sufficient signal intensities in array-based minisequencing is fragmentation of the PCR products along with alkaline phosphatase treatment to inactivate dNTPs, followed by concentration of the targets (Kurg et al. 2000, Raitio et al. manuscript). The ssDNA was annealed to the primer arrays, followed by a separate extension reaction in four parallel teflon-lined wells on the glass surface containing identical primer arrays (II-IV). To facilitate analysis of a large number of samples, the well spacing was compatible with multichannel pipettors. The incorporation of 33P-ddNTPs was quantified by a phosphorimager after a short exposure. Signal ratios generally fell into distinct categories, with 5- to 100-fold differences in ratios between homozygous and heterozygous genotypes. Obvious limitations of the system are that the use of 33P-ddNTP labels requires parallel wells for the four nucleotides, and that the spatial resolution is lower than with epifluorescence detection, preventing the use of medium-density arrays (≥1000 spots per cm²). The advantages offered by 33P-labeling are high sensitivity (Raitio et al., unpublished) and speed of analysis, as a 5 min scan with 100 µm resolution is sufficient to analyze up to 48 slides. Radiolabeling also avoids the problems encountered with incorporation by DNA polymerases of nucleotide analogues with bulky fluorescent reporter groups (Plaschke et al. 1998). A relatively large number of liquid handling steps and significant amounts of multiplex PCR products are needed to achieve sufficient detection sensitivity after single-nucleotide primer extension on the arrays. This necessitates precipitation of the template DNA.
To avoid post-PCR sample preparation and to increase the sensitivity of the system, an allele-specific extension procedure was devised. In this system, template preparation is performed concurrently with the extension reaction with the aid of T7 RNA polymerase, which generates single-stranded RNA targets while also increasing the copy number of templates (V). The reaction procedure compares favourably in speed and simplicity with other described array genotyping procedures, as illustrated in Figure 6. Detection is based on fluorescence labelling with a single fluorophore. If two fluorophores are used, one of them may serve as an internal control, enabling detection of mutations representing 5% of the target sequences (V). The developed reaction format, with silicone rubber grids forming 80 separate reaction wells on one microscope glass slide, allows detection of up to 24,000 genotypes from a single slide at current spotting densities. Currently, the rate-limiting step in genotyping is the detection of extension signals using a confocal epifluorescence reader, followed by signal quantitation with software designed for expression array analysis. Algorithms to automate signal quantitation and genotype scoring are now needed to fully exploit the capacity of the genotyping procedure.
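The capacity figures quoted above can be checked with simple arithmetic. The sketch below assumes one sample per reaction well and, for allele-specific extension, two spotted primers per SNP. Both are assumptions made for illustration, and the constant and function names are hypothetical.

```python
WELLS_PER_SLIDE = 80          # reaction wells per slide (from the text)
GENOTYPES_PER_SLIDE = 24_000  # stated maximum at current spotting densities
PRIMERS_PER_SNP = 2           # assumption: one allele-specific primer per allele

def genotypes_per_slide(wells, snps_per_well):
    """Genotypes obtained when each well holds one sample typed for snps_per_well SNPs."""
    return wells * snps_per_well

snps_per_well = GENOTYPES_PER_SLIDE // WELLS_PER_SLIDE  # 300 SNPs per well
spots_per_well = snps_per_well * PRIMERS_PER_SNP        # 600 primer spots per well

print(snps_per_well, spots_per_well)
print(genotypes_per_slide(WELLS_PER_SLIDE, 31))  # a 31-mutation panel: 2480
```

Under these assumptions the stated maximum corresponds to 300 SNPs per well, while a panel of 31 mutations typed in all 80 wells gives 2480 genotypes per slide.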
[Figure 6: flow chart with three columns.
ASO hybridization: multiplex PCR with T[…], T[…] promoter-tailed primers; second labelling PCR reaction with T[…], T[…] primers and biotin-dUTP; purification and concentration of product; hybridization; staining; scanning with one wavelength.
Minisequencing: multiplex PCR (dTTP partly replaced by dUTP); fragmentation by DNase I (or UNG glycosidase) and inactivation of dNTPs by alkaline phosphatase; precipitation of the fragmented target; minisequencing reaction in four parallel arrays (four differentially labelled ddNTPs); scanning with one (four) wavelength(s).
Allele-specific extension: multiplex PCR with T[…], T[…] promoter-tailed primers; combined template preparation and genotyping reaction on the arrays using RNA polymerase and reverse transcriptase; scanning with one wavelength.]

FIGURE 6. Comparison of procedures for multiplexed SNP scoring on DNA arrays. The ASO procedure is presented as described by Wang et al.; the minisequencing procedure is presented as carried out by Raitio et al. (manuscript), and in parentheses the procedure of Kurg et al. (Kurg et al. 2000) is shown. The allele-specific extension procedure is as described in V. Notably, a single liquid handling step post-PCR is sufficient for allele-specific extension on DNA arrays, enabling high-throughput genotyping without automation. Furthermore, the RNA polymerase amplifies the template further, allowing multiplex PCR reactions with minimal reaction volumes, as less than […] µl of the pooled multiplex PCR reaction is required for genotyping.

Applications

HLA typing

The method for HLA-DQA1 genotyping and DRB1 group-specific typing, followed by DR2-group subtyping, was initially evaluated in 42 lymphoblastoid cell lines of the 10th International HLA workshop (Kimura et al. 1992) and in 42 anonymous Finnish samples of known HLA genotypes. Complete agreement of the typing results in these controls, along with a high success rate, allowed us to proceed to use the developed method for genotyping affected offspring and both parents from 110 families with multiple sclerosis (MS) (Pastinen et al., unpublished). The DQB1 genotyping had previously been carried out by a standard PCR-ASO method (Kimura et al. 1991).
The strong LD in the HLA region is evident from the occurrence of only certain haplotype combinations of DQA1-DQB1-DRB1, and in our genotyping no exception to the previously characterized haplotype combinations was seen. The HLA association of multiple sclerosis was first described nearly 30 years ago with cellular typing techniques and later refined by DNA typing to the single haplotype DQA1*0102-DQB1*0602-DRB1*1502 (reviewed by Hillert 1994). In our MS cohort the frequency of this haplotype was 31% in MS-affected individuals and 18% in parental non-transmitted chromosomes, confirming the significant HLA association (p=0.003, two-sided Fisher's exact test, Pastinen et al. unpublished). Further evaluation of other MS-susceptibility loci in the extended patient-parent trio material is ongoing, and HLA typing will be used to stratify patient material in studying this phenotypically diverse disorder (Tienari et al., unpublished). Minisequencing with length labelling is in commercial use (PGL Laboratories, Uppsala, Sweden) for pharmacogenetic genotyping of SNPs (Pastinen et al. 1999, Syvänen and Sitbon in press) affecting the metabolism of certain drugs (Linder et al. 1997).

Screening for mutations and SNPs

We initially demonstrated proof-of-principle of the array-based minisequencing method using a system of nine different mutations in the Finnish population by applying it to 14 genomic samples (II). Most methods for screening known SNPs on microarrays have, in fact, been presented only at such a limited scale or even validated only with synthetic templates (Yershov et al. 1996, Shumaker et al. 1996, Drobyshev et al. 1997, Dubiley et al. 1999, Tang et al. 1999, Gerry et al. 1999, Kurg et al. 2000). Importantly, we compared the performance of minisequencing primer extension to ASO hybridization under six different hybridization conditions.
The results of this comparison showed clearly better discrimination of genotypes by DNA-polymerase-assisted minisequencing than by the ASO hybridization method (Table 4).

Table 4. Power of genotype discrimination by minisequencing vs. ASO hybridization on DNA-arrays.

[Table 4: numeric entries not recoverable. Columns: method; probes and targets; signal ratios from normal and mutant alleles at homozygous positions (PPT TT, FV GG) and heterozygous positions (PPT TA, FV GA); power of genotype discrimination (PPT, FV). Rows: Miniseq., […]-mer, DNA target; ASO, […]-mer, RNA target; ASO, […]-mer, RNA target.]

Only partial results of the comparison are shown, illustrating the standard minisequencing conditions and those ASO hybridization conditions in which the best discrimination ratios were obtained. Three washing stringencies, either DNA or RNA targets, and two probe lengths were employed for ASO hybridization. Despite the several different conditions, minisequencing proved to have nine- to […]-fold higher genotype discrimination than ASO hybridization on a DNA array.

The practical utility of minisequencing on DNA-microarrays has been established in several applications involving genotyping of SNPs and mutations in genomic samples. The following four polymorphisms were evaluated in a cohort of 111 HIV-1 infected Finns and 194 healthy controls on minisequencing microarrays. A 32-bp deletion allele (Δ32) of the CCR5 gene, which codes for a chemokine receptor, had been shown to be protective against HIV-1 infection in its homozygous form (Liu et al. 1996, Samson et al. 1996). No homozygotes for the CCR5 deletion were seen among HIV-1 infected individuals, consistent with a protective effect of the CCR5 deletion. The presence of heterozygotes among patients and controls at similar frequencies is also consistent with the lack of protection by a single allele. Despite the characteristic population history of Finns (Peltonen et al. 1999), the CCR5 allele frequency was not significantly different from that in other North European populations (Martinson et al. 1997). Mannose binding lectin (MBL) is a circulating serum protein with multiple functions in innate immunity (Turner et al.
1996): three nonsynonymous substitutions in the gene lead to decreased concentrations of circulating MBL and had been associated with an increased risk of HIV-1 infection in the Danish population (Garred et al. 1997). The MBL variant alleles occurred at a lower frequency than in the Danish population, but homozygotes for the variant alleles were significantly enriched among HIV-1 infected individuals, supporting a role for the normal MBL allele in the first-line defence against the pathogen. A study with a large cohort of multiply exposed healthy controls and HIV-1 infected patients of the same ethnicity would be required to confirm this suggestive predisposing effect seen in Scandinavian populations.

The second association study, on genetic predisposition to myocardial infarction (MI), included four LDLR mutations accounting for the majority of FH alleles in Finland (Koivisto et al. 1995) and nine common polymorphisms previously associated with an increased risk of MI. The patients and controls were derived from the large epidemiological study FINRISK, and finally 152 MI patients and 152 healthy matched controls were included in the study. Primary association was seen for the GPIIIa and PAI-1 variant alleles, apparently increasing the risk of MI, while the other polymorphisms were not significantly associated with MI risk in the study subjects. Only one individual carried an FH-causing LDLR mutation. When the GPIIIa and PAI-1 variant alleles were analyzed jointly, a high predisposing effect was seen in individuals carrying 3-4 variant alleles compared to those carrying only 0 or 1 variant allele (p=0.001 in total and p=0.0005 in the male subgroup of study subjects). Another study in a Finnish autopsy material subsequently associated the GPIIIa PlA2 allele with an increased risk of myocardial infarction as well (Mikkelsson et al. 1999).
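The text does not state which test produced the p-values for the joint analysis. As an illustration only, the sketch below implements a two-sided Fisher's exact test (the test named above for the HLA haplotype comparison) for a 2x2 table of cases and controls. The function name is hypothetical and the counts in the usage example are made up; they are not data from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell given fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, total)

# Made-up table: rows = patients / controls, columns = carriers / non-carriers
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 4))  # 0.035
```

An exact test of this kind is a common choice when some cells of the table hold few observations, as is likely for rare genotype combinations.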
The study illustrated the candidate gene strategy analyzed on DNA-arrays, which will become increasingly popular with the progress of coding sequence-targeted SNP discovery projects (Cargill et al. 1999, Halushka et al. 1999). The relatively limited study sizes and the analysis of only a single polymorphism in each gene call for larger and more detailed studies of the associated genes to determine their potentially causative role.

The minisequencing assay in the form described above has also been set up for analysis of 18 Finnish BRCA1 and BRCA2 mutations (Syrjakoski, Nevanlinna et al. unpublished). Recently, a simplified target preparation strategy and fluorescent labels were applied to 26 Y-chromosomal SNPs in Finnish and related populations (Raitio et al. manuscript). A panel of 31 Finnish mutations, comprising the major mutations of most of the characterized Finnish disease heritage diseases and other recurrent recessive mutations along with two disease-predisposing polymorphisms, was set up based on fluorescent, allele-specific extension on DNA-microarrays (Figure 7) (V).
[Figure 7: array images for Sample A, Sample B and Sample C, with the two allele columns marked and scale bars in µm.]

FIGURE 7. Examples of allele-specific extension genotyping on DNA microarrays. Three different samples analyzed on allele-specific extension microarrays for genotyping […] SNPs/mutations. For each SNP to be genotyped, two detection primers are immobilized on a solid surface right beside each other (one pair of detection oligos is highlighted on the leftmost array). Each pair of detection primers differs at the 3'-end, which is complementary to one or the other allele. In the detection reaction, reverse transcriptase and CY[…]-labelled dNTPs are used in template-dependent primer extension. Only detection primers with complementary 3'-ends are efficiently extended by the enzyme, thus determining the genotype at each site. The given example represents a small portion of a microscope glass slide containing 80 separate reaction chambers in […]-well format, on which samples are analyzed in […] µl reaction volumes. After a […] hr reaction time the slide was scanned using a ScanArray […] system. Sample A is heterozygous for a C to T transition in the AIRE gene (boxed), sample B is heterozygous for two transitions (G to A in the FV gene and C to T in the FSHR gene), and sample C is heterozygous for a C to T transition in the NEFRIN gene.

The method was evaluated in 192 samples containing known carriers for each of the mutations and in unknown samples, from which the putative carriers identified by the microarray assay were confirmed by a reference method. A high primary success rate (96.5%) with excellent specificity (0.1% ambiguous genotypes, no miscalls) was achieved with the simple genotyping procedure, yielding nearly 2500 genotypes from a single microscope glass slide. In addition, genotyping of the common FV and HFE mutations along with 9 SNPs in these genes in a total of 233 samples (Figure 8), with clear genotype discrimination and an almost 100% success rate, demonstrates the general applicability of the novel system.
[Figure 8: panel A, HFE gene; panel B, FV gene. Upper part: relative genomic locations of the SNPs, with scale bars in kb and bp. Lower part: plots of signal ratio and signal intensity for the polymorphic SNPs.]

FIGURE 8. Design and results of an SNP scoring assay. Previously characterized sequence variants were selected from (A) the HFE gene and (B) the FV gene. The relative genomic locations of these SNPs are illustrated in the upper part of the figure. […] individuals were typed on the arrays, revealing that three of the FV gene SNPs were not polymorphic in the sample studied. The genotyping results for the polymorphic SNPs are illustrated below. The genotypes at the FV Leiden and HH C282Y mutations were known previously, and the samples were assigned correctly for these in all cases. Generally, the signal ratios fall into three distinct clusters, unequivocally characterizing the genotype at each locus.

A large population-based screening of the 31 mutations was carried out to determine their geographic distribution in Finland. Putative carriers found through the array-based analysis were further confirmed with reference methods. Approximately 2600 samples were analyzed along with blinded controls, yielding over 70,000 genotypes at a 96% success rate (Pastinen et al. in preparation). Significant variation in disease mutation frequencies was seen across Finland (Figure 9). The development of high-throughput tools for genotyping is making population-wide genetic tests possible. It will be necessary to accurately determine disease allele frequencies in the population, as this information forms the basis for evaluating the impact and cost-effectiveness of population-wide screening in disease prevention.
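Estimating a carrier frequency from a screened sample, together with its uncertainty, can be sketched as follows. The code computes a Wilson score interval for the carrier proportion and converts a carrier rate into an allele frequency under the assumption of Hardy-Weinberg proportions. The function names and the example counts are hypothetical and are not taken from the screening data.

```python
import math

def wilson_interval(carriers, samples, z=1.96):
    """Wilson score interval for a carrier proportion (z=1.96 is about 95%)."""
    if samples <= 0:
        raise ValueError("samples must be positive")
    p = carriers / samples
    denom = 1 + z**2 / samples
    centre = (p + z**2 / (2 * samples)) / denom
    half = z * math.sqrt(p * (1 - p) / samples + z**2 / (4 * samples**2)) / denom
    return centre - half, centre + half

def allele_frequency_from_carriers(carrier_rate):
    """Solve 2q(1 - q) = carrier_rate for the rarer allele frequency q.

    Assumes Hardy-Weinberg proportions; the carrier rate cannot exceed 0.5.
    """
    if not 0 <= carrier_rate <= 0.5:
        raise ValueError("carrier_rate must lie between 0 and 0.5")
    return (1 - math.sqrt(1 - 2 * carrier_rate)) / 2

# Hypothetical counts: 8 carriers among 400 typed samples
low, high = wilson_interval(carriers=8, samples=400)
q = allele_frequency_from_carriers(8 / 400)
print(f"95% interval {low:.3f}-{high:.3f}, allele frequency {q:.4f}")
```

The width of the interval shows how strongly the estimate for a rare mutation depends on the number of samples typed per region.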
[Figure 9: regions shown are Oulu, North Karelia, Southern Botnia and Helsinki.]

FIGURE 9. Summed carrier rates for […] mutations in four regions of Finland. Four hundred samples were typed from individuals originating from each of the following regions: North Karelia, Oulu and Southern Botnia. […] samples were typed from individuals originating from Helsinki. In addition, over […] carrier individuals served as positive controls. All the identified carriers were confirmed by the reference method. The aggregate carrier rates, based on […] unambiguous genotyping calls, are shown, demonstrating widely varying rates of carriership in the four regions studied. Furthermore, particularly prominent variation in carrier frequencies of the presumably younger mutations was seen across Finland (Pastinen et al. manuscript).

Table 5 describes the characteristics of the published array genotyping applications in which more than 500 genotypes were produced, together with the additional applications presented in this thesis. Despite the existence of high-density array production technology, the highest information density (genotypes per area) has so far been achieved on our simple spotted arrays utilizing enzymatic extension to discriminate genotypes.

TABLE 5. Genotype scoring on DNA-arrays.

[Table 5: numeric entries not recoverable. Column headings include: application; number of SNPs/mutations; number of samples; number of genotypes; spots or probes per genotype; genotypes per cm². Rows: CF mutations (Cronin et al. 1996); CCR5, MBL SNPs (III); MI-associated SNPs (IV); 3rd generation marker map (Wang et al.); ancestral allele determination (Hacia et al.); Arabidopsis SNP map (Cho et al. 1999); allele-specific extension on DNA arrays (V); "Finn Chip" genotyping (Pastinen et al. manuscript).]

Table footnotes:
Out of […] assay sites, […] scored genotypes reliably.
[…] individuals were scored on the whole array; […] additional samples were typed for a subset of SNPs.
The same assay as described by Wang et al.; now […] markers were reported to score heterozygotes reliably.
Ten pooled DNA samples.
Only about half of the genotyping capacity of the array was in fact used, as a subset of markers was typed in the pooled samples.
Out of the […] probe sets synthesized on the arrays, only […] markers performed acceptably in heterozygote discrimination.

CONCLUDING REMARKS

Deciphering the human genome sequence will finally open the door to true functional genomics studies of our own species. The few years of experience with genomics on organisms with sequenced genomes have revealed that, in order to assign function to all genes, a wide variety of tools is required. In addition to many of the high-throughput analytical methods described in this thesis, efficient computational tools are essential in determining molecular physiology (Marcotte et al. 1999a, 1999b). Deposition of the analytical data in the public domain increases the power of these biocomputing-based methods greatly, and hopefully such sharing will be seen for the large number of ongoing gene-mapping, association and expression studies in mammals as well. Databases linking genotypes with all the available phenotypic data might help to meet the large sample sizes and marker numbers required to characterize genes for complex traits.
Finally, independent validation studies of the new methods should allow the application of these high-throughput tools in clinical applications in the near future.

ACKNOWLEDGEMENTS

I wish to thank Professor Jussi Huttunen, head of the National Public Health Institute, for providing excellent research facilities and infrastructure to carry out this work.

Right from my first visit to the Department of Human Molecular Genetics, when I met Professor Leena Peltonen (Palotie) in October 1993, her openness and action-style were striking for the young medical student entering her room. Consequently, I initiated my PhD studies the following week in her group. I have been truly privileged to be a part of Leena's group and to have her positive attitude guide me through the first steps of my research career.

For the last five years I have had the pleasure to be co-supervised by Professor Ann-Christine Syvänen. Her down-to-earth approach, expertise in molecular methodology and familiarity with the benchtop world made the work in this thesis happen. Among the many things Chrisse taught me were criticism and self-confidence, essential for carrying out genomics on the edge of Europe. Both Leena and Chrisse are also thanked for literally never saying no; the trust and responsibility they provided was truly flattering.

Docent Jukka Partanen, an HLA wizard, is greatly acknowledged for his constant willingness to discuss genetics and HLA, for being an excellent collaborator and for his continuing interest in my work. Drs Andres Metspalu and Ants Kurg are acknowledged for their guidance on oligonucleotide immobilization and many stimulating discussions on minisequencing in several meetings during the past few years. Paavo Niini is thanked for building our arrayers and providing an engineering point of view for the project along with his colleague Pekka Katila.
The collaboration with the Microelectronics lab at VTT, led by Professor Matti Leppihalme, was extremely useful not only for providing the possibility of testing various silicon chips, fluorescence detection systems and array-spotters, but also for educating me in communication and work in an interdisciplinary project.

The chip-group, which was unfortunately formed only during the later part of my stay in Leena's lab, was a lot of fun to work and share laughs with. Mirja's and Katarina's great work on park-chemistry immobilizing oligonucleotides and fantastic stories immobilizing me were an essential part of the work. Päivi's efficiency, humour and ability to interpret instructions not even interpretable by their composer were always outstanding (and seeing the world's fastest pipetting hands at work reveals the secrets of high-throughput genotyping). Minna's ability to pool various Excel spreadsheets of repeated genotypings, carrier numbers, and so on in record time just before that important presentation or meeting was life-saving, along with those thousands of minisequencing and PCR reactions she carried out.

Satu and Pena introduced me to the world of complex disease genetics and multiple sclerosis, even if I bailed out early. Satu's unequivocal determinism and Pena's enthusiasm exemplified the two ways of surviving in the competitive field. Reintroduction to multifactorial diseases and coworking with Markus were truly pleasurable, though coffee breaks and shared congress trips were even more so (as long as we didn't discuss Star Wars). Also, with Markus we went through two horrifying near-death experiences: one in the Rocky Mountains, caught in a snowstorm on a small mountain road, and another on the streets of LA, caught as passengers in a car driven by Lasse.

During those long evening and weekend hours in the lab I got to know Jyrki.
He could be found loading repeat gels at virtually any time of the day, and the waiting periods between loadings were often spent talking dirty and living in a fantasy world of young eagles. Sharing some special congress trips, and also my last days as a bachelor in New York City and Boston with one-minute beer challenges, along with numerous other events, extended our friendship beyond the lab environment. Lasse's inexhaustible interest in the everyday drama of life made him the perfect listener to your worries, and the regular healing sessions with Jyrki and Lasse in William K. were relaxing (to some more than others). Kaisu's partying mode is seemingly always on, and her tolerance of infantile humour was amazing during the HGM 96 meeting, where she shared most of the free time with me, Jyrki and Tuomas listening to on-line commentary on Fitness World competitors... Kaitsu never got tired of teaching me why Canadian hockey sucks, and Tuomas is acknowledged for his kind advice on current trends in men's wear... Johanna, Petra, Maria, Miina, Naula, Jesper, Tero, Juha and Teemu were all excellent company in the after-office-hour activities and also helped me out during office hours.

The senior scientists Iski, Anu and Irma are all thanked for their interest in my work. Sari E. and Sari K. were always helpful in sorting out reagent billing and conference trips, but in particular the submission and publication of this thesis would not have been possible without Sari Kivikko's tremendous help! Dr. Robert Sladek is kindly acknowledged for revising my English.

The world of immunodiagnostics and homogeneous assays kept me busy during the past year, which was made possible by Professor Hans Söderlund. Hasse's treasure trove of ideas for diagnostic assays never ceased to amaze me, and his interest in DNA-arrays was also important for the work in this thesis.
I want to thank Adela for her patience regarding my less than organized way of working and my reluctance to do true protein work.

The combination of medical school and the laboratory was only possible with the aid of Mikko, Ilkka, Antti and Samuli; poker nights and pinball tournaments without any medical jargon were true quality leisure time. Mikko and Ilkka kept me from sinking totally into an eppendorf tube; without their friendship an early burn-out would have been evident. The Quebecois party house in Käpylä provided fun Halloweens and BBQs with the good company of Frank, Martin, Markus and Michelle along with many others. Regular lunches with Tuukka are truly missed, for his hilarious company always put me in a good mood.

The support of my family in all aspects of my life has been tremendous. Their silent encouragement in my career choices was particularly valuable, never generating additional pressure to succeed. Despite being brutally dragged from happy Montreal to dark Scandinavia, my wife Nathalie has always provided her support and love, even through the worst of times. This work is dedicated to her, as during the past five and a half years she has shared all the sorrows and joys related to this thesis, borne my constant absence, and yet always encouraged me to do my best. Our dog Spugi is lastly thanked for taking care of Nathalie when I was away!

This work has been supported by grants from the Technology Development Centre of Finland, the Instrumentarium Foundation, EC Biomed2 Contract no. BMH4-972013, the Hjelt Fond of the Pediatric Research Foundation, the Emil Aaltonen Foundation, Stiftelsen Oscar Öflund, the Finnish Medical Society Duodecim, the Rinnekoti Research Foundation and the Maud Kuistila Foundation.

Montreal, May 25, 2000

REFERENCES

Aaltonen LA. Peltomaki P. Genes involved in hereditary nonpolyposis colorectal carcinoma. Anticancer Research. 14:1657-60, 1994
Abramson RD Thermostable DNA polymerases: an update.
In: PCR Applications: protocols for functional genomics. (ed. Innis MA, Gelfand DH and Sninsky JJ) pp. 33-48. Academic Press, San Diego, 1999
Ahmadian A. Lundeberg J. Nyren P. Uhlen M. Ronaghi M. Analysis of the p53 tumor suppressor gene by pyrosequencing. Biotechniques. 28:140, 2000
Ahrendt SA, Halachmi S, Chow JT, Wu L, Halachmi N, Yang SC, Wehage S, Jen J, Sidransky D. Rapid p53 sequence analysis in primary lung cancer using an oligonucleotide probe array. Proc Natl Acad Sci U S A. 96:7382-7, 1999
Alves AM. Carr FJ. Dot blot detection of point mutations with adjacently hybridising synthetic oligonucleotide probes. Nucleic Acids Research. 16:8723, 1988
Amos DB. Bashir H. Boyle W. MacQueen M. Tiilikainen A. A simple micro cytotoxicity test. Transplantation. 7:220-3, 1969
Anagnostopoulos T. Green PM. Rowley G. Lewis CM. Giannelli F. DNA variation in a 5-Mb region of the X chromosome and estimates of sex-specific/type-specific mutation rates. American Journal of Human Genetics. 64:508-17, 1999
Antonarakis SE. McKusick VA. OMIM passes the 1,000-disease-gene mark. Nature Genetics. 25:11, 2000
Bach FH. Voynow NK. One-way stimulation in mixed leukocyte cultures. Science. 153:545, 1966
Bains W, Smith GC. A novel method for nucleic acid sequence determination. J Theor Biol. 135:303-7, 1988
Bains W. Hybridization methods for DNA sequencing. Genomics. 11:294-301, 1991
Baner J. Nilsson M. Mendel-Hartvig M. Landegren U. Signal amplification of padlock probes by rolling circle replication. Nucleic Acids Research. 26:5073-8, 1998
Barany F. Gelfand DH. Cloning, overexpression and nucleotide sequence of a thermostable DNA ligase-encoding gene. Gene. 109:1-11, 1991
Barany F. Genetic disease detection and DNA amplification using cloned thermostable ligase. Proceedings of the National Academy of Sciences of the United States of America. 88:189-93, 1991
Barinaga M: Will DNA Chip speed genome initiative? Science 251:1489, 1991
Barnes WM.
PCR amplification of up to 35-kb DNA with high fidelity and high yield from lambda bacteriophage templates. Proceedings of the National Academy of Sciences of the United States of America. 91:2216-20, 1994
Baron H. Fung S. Aydin A. Bahring S. Luft FC. Schuster H. Oligonucleotide ligation assay (OLA) for the diagnosis of familial hypercholesterolemia. Nature Biotechnology. 14:1279-82, 1996
Beattie WG, Meng L, Turner SL, Varma RS, Dao DD, Beattie KL. Hybridization of DNA targets to glass-tethered oligonucleotide probes. Mol Biotechnol. 4:213-25, 1995
Behr MA, Wilson MA, Gill WP, Salamon H, Schoolnik GK, Rane S, Small PM. Comparative genomics of BCG vaccines by whole-genome DNA microarray. Science. 284:1520-3, 1999
Beier M, Hoheisel JD. Versatile derivatisation of solid support media for covalent bonding on DNA-microchips. Nucleic Acids Res. 27:1970-7, 1999
Bell GI, Karam JH, Rutter WJ. Polymorphic DNA region adjacent to the 5' end of the human insulin gene. Proc. Natl. Acad. Sci. USA 78:5759-63, 1981
Bertina RM, Koeleman BP, Koster T, Rosendaal FR, Dirven RJ, de Ronde H, van der Velden PA, Reitsma PH. Mutation in blood coagulation factor V associated with resistance to activated protein C. Nature. 369:64-7, 1994
Birch DE. Simplified hot start PCR. Nature. 381:445-6, 1996
Bolla MK. Haddad L. Humphries SE. Winder AF. Day IN. High-throughput method for determination of apolipoprotein E genotypes with use of restriction digestion analysis by microplate array diagonal gel electrophoresis. Clinical Chemistry. 41:1599-604, 1995
Bonnet G. Tyagi S. Libchaber A. Kramer FR. Thermodynamic basis of the enhanced specificity of structured DNA probes. Proceedings of the National Academy of Sciences of the United States of America. 96:6171-6, 1999
Botstein D, White RL, Skolnick M, and Davis RW. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32:314-31, 1980
Broude NE, Sano T, Smith CL, Cantor CR.
Enhanced DNA sequencing by hybridization. Proc Natl Acad Sci U S A. 91:3072-6, 1994
Brown PO, and Botstein D Exploring the new world of the genome with DNA microarrays. Nat. Genet. 21: S33-S37, 1999
Buetow KH, Edmonson MN, Cassidy AB. Reliable identification of large numbers of candidate SNPs from public EST data. Nat Genet. 21:323-5, 1999
Bulyk ML, Gentalen E, Lockhart DJ, Church GM. Quantifying DNA-protein interactions by double-stranded DNA arrays. Nat Biotechnol. 17:573-7, 1999
Cargill, M. et al. Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nature Genet. 22, 231, 1999
Carninci P, Nishiyama Y, Westover A, Itoh M, Nagaoka S, Sasaki N, Okazaki Y, Muramatsu M, Hayashizaki Y. Thermostabilization and thermoactivation of thermolabile enzymes by trehalose and its application for the synthesis of full length cDNA. Proc Natl Acad Sci U S A. 95:520-4, 1998
Chakravarti A. It's raining SNPs, hallelujah? Nature Genetics. 19:216-7, 1998
Chamberlain JS, Gibbs RA, Ranier JE, Nguyen PN, Caskey CT. Deletion screening of the Duchenne muscular dystrophy locus via multiplex DNA amplification. Nucleic Acids Res. 16:11141-56, 1988
Chamberlain, JS, and Chamberlain, J.R. Optimization of multiplex PCRs. In: Mullis, K.B., Ferre, F and Gibbs, R.A. (ed). Polymerase chain reaction. Boston: Birkhäuser pp. 38-46, 1994.
Chang JC. Kan YW. A sensitive new prenatal test for sickle-cell anemia. New England Journal of Medicine. 307: 30-2, 1982
Charles A. Mein, Bryan J. Barratt, Michael G. Dunn, Thorsten Siegmund, Annabel N. Smith, Laura Esposito, Sarah Nutland, Helen E. Stevens, Amanda J. Wilson, Michael S. Phillips, Nancy Jarvis, Scott Law, Monika de Arruda, and John A. Todd Evaluation of Single Nucleotide Polymorphism Typing with Invader on PCR Amplicons and Its Automation Genome Res. 10: 330-343, 2000
Chee M, Yang R, Hubbell E, Berno A, Huang XC, Stern D, Winkler J, Lockhart DJ, Morris MS, Fodor SP. Accessing genetic information with high-density DNA arrays.
Science. 274:610-4, 1996
Chen X, Kwok PY. Template-directed dye-terminator incorporation (TDI) assay: a homogeneous DNA diagnostic method based on fluorescence resonance energy transfer. Nucleic Acids Res. 25:347-53, 1997
Chen X, Levine L, Kwok PY. Fluorescence polarization in homogeneous nucleic acid analysis Genome Res. 9:492-8, 1999
Chen X, Zehnbauer B, Gnirke A, Kwok PY. Fluorescence energy transfer detection as a homogeneous DNA diagnostic method. Proc Natl Acad Sci U S A. 94:10756-61, 1997
Chen X. Livak KJ. Kwok PY. A homogeneous, ligase-mediated DNA diagnostic test. Genome Research. 8:549-56, 1998
Cheng J, Fortina P, Surrey S, Kricka LJ, Wilding P. Microchip-based Devices for Molecular Diagnosis of Genetic Diseases. Mol Diagn. 1:183-200, 1996
Cheng S, Grow MA, Pallaud C, Klitz W, Erlich HA, Visvikis S, Chen JJ, Pullinger CR, Malloy MJ, Siest G, Kane JP. A multilocus genotyping assay for candidate markers of cardiovascular disease risk. Genome Res. 9:936-49, 1999
Cheng S. Fockler C. Barnes WM. Higuchi R. Effective amplification of long targets from cloned inserts and human genomic DNA. Proceedings of the National Academy of Sciences of the United States of America. 91:5695-9, 1994
Cheung VG, Gregg JP, Gogolin-Ewens KJ, Bandong J, Stanley CA, Baker L, Higgins MJ, Nowak NJ, Shows TB, Ewens WJ, Nelson SF, Spielman RS. Linkage-disequilibrium mapping without genotyping. Nat Genet. 18:225-30, 1998
Cho RJ, Mindrinos M, Richards DR, Sapolsky RJ, Anderson M, Drenkard E, Dewdney J, Reuber TL, Stammers M, Federspiel N, Theologis A, Yang WH, Hubbell E, Au M, Chung EY, Lashkari D, Lemieux B, Dean C, Lipshutz RJ, Ausubel FM, Davis RW, Oefner PJ. Genome-wide mapping with biallelic markers in Arabidopsis thaliana. Nat Genet. 23:203-7, 1999
Chou Q. Russell M. Birch DE. Raymond J. Bloch W. Prevention of pre-PCR mis-priming and primer dimerization improves low-copy-number amplifications. Nucleic Acids Research. 20:1717-23, 1992
Chu BC, Kramer FR, Orgel LE.
Synthesis of an amplifiable reporter RNA for bioassays. Nucleic Acids Res. 14:5591-603, 1986 Clark, A.G. et al. Haplotype structure and population genetic inferences from nucleotide- 78 sequence variation in humanlipoprotein lipase. Am. J. Hum. Genet. 63: 595,1998 Cohen et al. Nature 334:119, 1988 Cohen SN, Chang AC, Boyer HW, Helling RB. Construction of biologically functional bacterial plasmids in vitro. Proc Natl Acad Sci U S A. 70:3240-4, 1973 Collins FS, Guyer MS, Charkravarti A. Variations on a theme: cataloging human DNA sequence variation. Science. 278:1580-1, 1997 Collins FS. Patrinos A. Jordan E. Chakravarti A. Gesteland R. Walters L. New goals for the U.S. Human Genome Project: 1998-2003. Science. 282: 682-9, 1998 Collins ML, Irvine B, Tyner D, Fine E, Zayati C, et al. A branched DNA signal amplification assay for quantification of nucleic acid targets below 100 molecules/ml Nucleic Acids Res 25:2979, 1997 Compton J. Nucleic acid sequence-based amplification. Nature 350: 91-2, 1991 Conner, B.J., A.A. Reyes, C. Morin, K. Itakura, R.L. Teplitz, and R.B. Wallace. Detection of sickle cell beta S-globin allele by hybridization with synthetic oligonucleotides. Proc. Natl. Acad. Sci 80: 278-282, 1983 Cooper DN, Krawczak M, Antonorakis SE. The nature and mechanisms of human gene mutation. In:Metabolic and Molecular Bases of Inherited Disease, 7th edn. (ed. Scriver C, Beaudet AL, Sly WS, Valle D) pp. 259-291. McGraw-Hill, New York, 1995 Cotton RG. Current methods of mutation detection. Mutation Research. 285:125-44, 1993 Cotton RG. Slowly but surely towards better scanning for mutations Trends in Genetics. 13:43-6, 1997 Coulondre C. Miller JH. Farabaugh PJ. Gilbert W. Molecular basis of base substitution hotspots in Escherichia coli. Nature. 274:775-80, 1978 Cox DW, Woo SL, Mansfield T. DNA restriction fragments associated with alpha 1-antitrypsin indicate a single origin for deficiency allele PI Z. Nature. 
316:79-81, 1985 Cronin MT, Fucini RV, Kim SM, Masino RS, Wespi RM, Miyada CG. Cystic fibrosis mutation detection by hybridization to light-generated DNA probe arrays. Hum Mutat. 7:244-55, 1996 Cros P, Allibert P, Mandrand B, Tiercy JM, Mach B. Oligonucleotide genotyping of HLA 79 polymorphism on microtitre plates. Lancet. 340:870-3, 1992 Dang C. Jayasena SD. Oligonucleotide inhibitors of Taq DNA polymerase facilitate detection of low copy number targets by PCR. Journal ofMolecular Biology. 264:268-78, 1996 Danna K, and Nathans D. Specific cleavage of simian virus 40 DNA by restriction endonuclease of Hemophilus influenzae. Proc Natl Acad Sci U S A 68 2913, 1971. Dausset J. Acta Haematol. 20:156-166, 1958 Day DJ, Speiser PW, White PC, Barany F. Detection of Steroid 21-Hydroxylase Alleles Using Gene-Specific PCR and a Multiplexed Ligation Detection Reaction Genomics 29: 152-162, 1995 Day IN. Humphries SE. Electrophoresis for genotyping: microtiter array diagonal gel electrophoresis on horizontal polyacrylamide gels, hydrolink,or agarose. Analytical Biochemistry. 222:389-95, 1994 Delahunty C, Ankener W, Deng Q, Eng J, Nickerson DA. Testing the feasibility of DNA typing for human identification by PCR and an oligonucleotide ligation assay. Am J Hum Genet. 58:1239-46, 1996 Dib C. Faure S. Fizames C. Samson D. Drouot N.Vignal A. Millasseau P. Marc S. Hazan J. Seboun E. Lathrop M. Gyapay G. Morissette J. Weissenbach J. A comprehensive genetic map of the human genome based on 5,264 microsatellites Nature. 380:152-4, 1996 Dijan P. Cell 94:155-160, 1998 Drmanac and Crkvenjakov 1987 [Yugoslav Patent Application 570] Drmanac R, Drmanac S. cDNA screening by array hybridization. Methods Enzymol. 303:165-78, 1999 Drmanac S. Kita D. Labat I. Hauser B. Schmidt C. Burczak JD. Drmanac R. Accurate sequencing by hybridization for DNA diagnostics and individual genomics. Nature Biotechnology. 16:54-8, 1998 Drobyshev A, Mologina N, Shik V, Pobedimskaya D, Yershov G, Mirzabekov A. 
Sequence analysis by hybridization with oligonucleotide microchip: identification of beta-thalassemia mutations. Gene. 188:45-52, 1997 Dubiley S, Kirillov E, Mirzabekov A. Polymorphism analysis and gene detection by minisequencing on an array of gel-immobilized primers. Nucleic Acids Res. 27:e19, 1999 80 Dubrova YE. Nesterov VN. Krouchinsky NG. Ostapenko VA. Neumann R. Neil DL. Jeffreys AJ. Human minisatellite mutation rate after the Chernobyl accident Nature. 380:683-6, 1996 Dunham I. Shimizu N. Roe BA. Chissoe S. Hunt AR. Collins JE. Bruskiewich R. Beare DM. Clamp M. Smink LJ. Ainscough R. Almeida JP. Babbage A. Bagguley C. Bailey J. Barlow K. Bates KN. Beasley O. Bird CP. Blakey S.Bridgeman AM. Buck D. Burgess J. Burrill WD. OBrien KP. et al. The DNA sequence of human chromosome 22 Nature. 402: 489-95, 1999 Edman CF, Raymond DE, Wu DJ, Tu E, Sosnowski RG, Butler WF, Nerenberg M, Heller MJ. Electric field directed nucleic acid hybridization on microchips. Nucleic Acids Res. 25:490714, 1997 Eggers M, Hogan M, Reich RK, Lamture J, Ehrlich D, Hollis M, Kosicki B, Powdrill T, Beattie K, Smith S, et al. A microchip for quantitative detection of molecules utilizing luminescent and radioisotope reporter groups. Biotechniques. 17:516-25, 1994 Ellegren H. Lindgren G. Primmer CR. Moller AP. Fitness loss and germline mutations in barn swallows breeding in Chernobyl. Nature. 389:593-6, 1997 Ellis NA. German J. Molecular genetics of Blooms syndrome. Human Molecular Genetics. 5 Spec No:1457-63, 1996 Erlich HA. Gelfand D. Sninsky JJ. Recent advances in the polymerase chain reaction. Science. 252:1643-51, 1991 Evans WE, and Relling RV. Pharmacogenomics: Translating Functional Genomics into rational therapeutics. Science 286:487, 1999 Fang P, Bouma S, Jou C, Gordon J, Beaudet AL. Simultaneous analysis of mutant and normal alleles for multiple cystic fibrosis mutations by the ligase chain reaction. Hum Mutat. 6:14451, 1995 Farr C.J., R.K. Saiki, H.A. Erlich, F. McCormick, C.J. 
Marshall. Analysis of RAS gene mutations in acute myeloid leukemia by polymerase chain reaction and oligonucleotide probes. Proc. Natl. Acad. Sci. 85: 1629-1633, 1988 Ferrie RM. Schwarz MJ. Robertson NH. Vaudin S. Super M. Malone G. Little S. Development, multiplexing, and application of ARMS tests for common mutations in the CFTR gene. American Journal of Human Genetics. 51:251-62, 1992 Fire A, Xu SQ Rolling replication of short DNA circles. Proc Natl Acad Sci U S A 92:4641-5, 81 1995 Fodor SP, Read JL, Pirrung MC, Stryer L, Lu AT, Solas D. Light-directed, spatially addressable parallel chemical synthesis. Science. 251:767-73, 1991 Foster, T. Modern quantum chemistry, Istanbul lectures, part III, pp. 93-137. Academic Press, New York, NY, 1965 Fotin AV, Drobyshev AL, Proudnikov DY, Perov AN, Mirzabekov AD. Parallel thermodynamic analysis of duplexes on oligodeoxyribonucleotide microchips. Nucleic Acids Res 26:151521, 1998 Fu YH. Kuhl DP. Pizzuti A. Pieretti M. Sutcliffe JS. Richards S.Verkerk AJ. Holden JJ. Fenwick RG Jr.Warren ST. et al.Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell. 67:1047-58, 1991 Futcher B. Blast ahead. Nat. Genet. 23:377-378, 1999 Gait MJ. Sheppard RC. Rapid synthesis of oligodeoxyribonucleotides. II. Machine-aided solid-phase syntheses of two nonanucleotides and an octanucleotide. Nucleic Acids Research. 4:4391-410, 1977 Gait MJ. Sheppard RC. Rapid synthesis of oligodeoxyribonucleotides: a new solid-phase method. Nucleic Acids Research. 4:1135-58, 1977 Garred P, Madsen HO, Balslev U, Hofmann B, Pedersen C, Gerstoft J, Svejgaard A. Susceptibility to HIV infection and progression of AIDS in relation to variant alleles of mannosebinding lectin. Lancet 349:236-40, 1997 Geever RF.Wilson LB. Nallaseth FS. Milner PF. Bittner M.Wilson JT. Direct identification of sickle cell anemia by blot hybridization. 
Proceedings of the National Academy of Sciences of the United States of America. 78:5081-5, 1981 Gerry NP.Witowski NE. Day J. Hammer RP. Barany G. Barany F. Universal DNA microarray method for multiplex detection of low abundance point mutations. Journal of Molecular Biology. 292:251-62, 1999 Gibbs RA, Ngyen P-N, Caskey TC. Detection of single DNA base differences by competetive oligonucleotide priming. Nucleic Acids Res. 17;2437-48, 1989 Gibson NJ , Gillard HL, Whitcombe D, Ferrie RM, Newton CR, Little S. A homogeneous method for genotyping with fluorescence polarization Clin Chem 43:1336-1341, 1997 82 Gibson QH The reduction of methaemoglobin in red blood cells and studies on the cause of idiopathic methaemoglobinemia. Biochem. J. 42:13, 1948 Gilles PN,Wu DJ, Foster CB, Dillon PJ, Chanock SJ. Single nucleotide polymorphic discrimination by an electronic dot blot assay on semiconductor microchips. Nat Biotechnol. 17:36570, 1999 Goddard KA, Hopkins PJ, Hall JM,Witte JS. Linkage disequilibrium and allele-frequency distributions for 114 single-nucleotide polymorphisms in five populations. Am J Hum Genet. 66:216-34, 2000 Graves DJ, Su HJ, McKenzie SE, Surrey S, Fortina P. System for preparing microhybridization arrays on glass slides. Anal Chem. 70:5085-92, 1998 Griffin TJ, Hall JG, Prudent JR, Smith LM. Direct genetic analysis by matrix-assisted laser desorption/ionization mass spectrometry. Proc Natl Acad Sci U S A. 96:6301-6, 1999 Griffin TJ. Smith LM. Single-nucleotide polymorphism analysis by MALDI-TOF mass spectrometry Trends in Biotechnology. 18:77-84, 2000 Griffin TJ. Tang W. Smith LM. Genetic analysis by peptide nucleic acid affinity MALDI-TOF mass spectrometry Nature Biotechnology. 15:1368-72, 1997 Grompe M. The rapid detection of unknown mutations in nucleic acids Nature Genetics. 5:111-7, 1993 Grossman PD, Bloch W, Brinson E, Chang CC, Eggerding FA, Fung S, et al. 
High-density multiplex detection of nucleic acid sequences: oligonucleotide ligation assay and sequence-coded separation. Nucleic Acids Res. 22;4527-34, 1994 Guatelli JC. Whitfield KM. Kwoh DY. Barringer KJ. Richman DD. Gingeras TR. Isothermal, in vitro amplification of nucleic acids by a multienzyme reaction modeled after retroviral replication. Proceedings of the National Academy of Sciences of the United States of America,1990 Gunderson KL, Huang XC, Morris MS, Lipshutz RJ, Lockhart DJ, Chee MS. Mutation detection by ligation to complete n-mer DNA arrays. Genome Res. 8:1142-53, 1998 Gunthard HF, Wong JK, Ignacio CC, Havlir DV, Richman DD. Comparative performance of high-density oligonucleotide sequencing and dideoxynucleotide sequencing of HIV type 1 pol from clinical samples. AIDS Res Hum Retroviruses. 14:869-76, 1998 83 Guo Z, Guilfoyle RA, Thiel AJ,Wang R, Smith LM. Direct fluorescence analysis of genetic polymorphisms by hybridization with oligonucleotide arrays on glass supports. Nucleic Acids Res. 22:5456-65, 1994 Guschin D, Yershov G, Zaslavsky A, Gemmell A, Shick V, Proudnikov D, Arenkov P, Mirzabekov A. Manual manufacturing of oligonucleotide, DNA, and protein microchips. Anal Biochem. 250:203-11, 1997 Gusella JF. Wexler NS. Conneally PM. Naylor SL. Anderson MA. Tanzi RE. Watkins PC. Ottina K.Wallace MR. Sakaguchi AY. et al. A polymorphic DNA marker genetically linked to Huntingtons disease. Nature. 306:234-8, 1983 Hacia JG, Fan JB, Ryder O, Jin L, Edgemon K, Ghandour G, Mayer RA, Sun B, Hsie L, Robbins CM, Brody LC, Wang D, Lander ES, Lipshutz R, Fodor SP, Collins FS. Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays. Nat Genet. 22:164-7, 1999 Hacia JG. Sun B. Hunt N. Edgemon K. Mosbrook D. Robbins C. Fodor SP. Tagle DA. Collins FS. Strategies for mutational analysis of the large multiexon ATM gene using high-density oligonucleotide arrays. Genome Research. 8:1245-58, 1998a Hacia JG. 
Woski SA. Fidanza J. Edgemon K. Hunt N. McGall G. Fodor SP. Collins FS. Enhanced high density oligonucleotide array-based sequence analysis using modified nucleoside triphosphates. Nucleic Acids Research. 26:4975-82, 1998b Haff LA. Smirnov IP. Single-nucleotide polymorphism identification assays using a thermostable DNA polymerase and delayed extraction MALDI-TOF mass spectrometry. Genome Research. 7:378-88, 1997 Hakala H, Virta P, Salo H, Lonnberg H. Simultaneous detection of several oligonucleotides by time-resolved fluorometry: the use of a mixture of categorized microparticles in a sandwich type mixed-phase hybridization assay. Nucleic Acids Res. 26:5581-8, 1998 Halushka, M.K. et al. Patterns ofsingle-nucleotide polymorphisms in candidate genes regulating blood-pressure homeostasis. Nature Genet. 22, 239, 1999 Harju L.Weber T. Alexandrova L. Lukin M. Ranki M. Jalanko A. Colorimetric solid-phase minisequencing assay illustrated by detection of alpha 1-antitrypsin Z mutation. Clinical Chemistry. 39:2282-7, 1993 84 Harris H Enzyme polymorphisms in man. Proc. R. Soc. Lond. (Biol) 174:1, 1966 Head SR, Rogers YH, Parikh K, Lan G, Anderson S, Goelet P, Boyce-Jacino MT. Nested genetic bit analysis (N-GBA) for mutation detection in the p53 tumor suppressor gene. Nucleic Acids Res. 25:5065-71, 1997 Healey BG, Matson RS, Walt DR. Fiberoptic DNA sensor array capable of detecting point mutations. Anal Biochem. 251:270-9, 1997 Henegariu O, Heerema NA, Dlouhy SR, Vance GH, Vogt PH. Multiplex PCR: critical parameters and step-by-step protocol. Biotechniques. 23:504-11, 1997 Higuchi R. Dollinger G.Walsh PS. Griffith R. Simultaneous amplification and detection of specific DNA sequences. Bio/Technology 10:413-7, 1992 Higuchi R. Fockler C. Dollinger G.Watson R. Kinetic PCR analysis: real-time monitoring of DNA amplification reactions. Bio/Technology 11:1026-30, 1993 Higuchi R. Simple and rapid preparation of samples for PCR. In Ed. HA Erlich. 
PCR Technology: principles and applications for DNA amplification. Stockton Press, New York. 31-38, 1989 Hillert J. Human leukocyte antigen studies in multiple sclerosis. Ann Neurol. 36 Suppl:S15-7, 1994 Holland PM. Abramson RD. Watson R. Gelfand DH. Detection of specific polymerase chain reaction product by utilizing the 5' 3' exonuclease activity of Thermus aquaticus DNA polymerase. Proceedings of the National Academy of Sciences of the United States of America 88:7276-80, 1991 Hultman T. Stahl S. Hornes E. Uhlen M. Direct solid phase sequencing of genomic and plasmid DNA using magnetic beads as solid support. Nucleic Acids Research. 17:4937-46, 1989 Ingram VM A specific chemical difference between the globins of normal human and sickle cell anaemia haemoglobin. Nature 178:792, 1956 Jamer R. Differentiating genomics companies. Nat Biotech 18:153, 2000 Jeffreys AJ,Wilson V, and Thein SL. Hypervariable minisatellite regions in human DNA. Nature 314 67-73, 1985 Jurinke C. van den Boom D. Jacob A. Tang K. Worl R. Koster H. Analysis of ligase chain 85 reaction products via matrix-assisted laser desorption/ionization time-of-flight-mass spectrometry. Analytical Biochemistry. 237:174-81, 1996 Kan YW, and Dozy AM. Polymorphism of DNA sequence adjacent to human beta-globin structural gene: relationship to sickle mutation. Proc Natl Acad Sci U S A 75 5631-5, 1978 Kelley SO, Boon EM, Barton JK, Jackson NM, Hill MG. Single-base mismatch detection based on charge transduction through DNA. Nucleic Acids Res. 27:4830-7, 1999 Kelly TJ, Jr., and Smith HO. A restriction enzyme from Hemophilus influenzae. II. J Mol Biol 51 393-409, 1970 Khanna M. Park P. Zirvi M. Cao W. Picon A. Day J. Paty P. Barany F. Multiplex PCR/LDR for detection of K-ras mutations in primary colon tumors. Oncogene. 18:27-38, 1999 Khrapko KR, Lysov YuP, Khorlin AA, Ivanov IB, Yershov GM, Vasilenko SK, Florentiev VL, Mirzabekov AD. A method for DNA sequencing by hybridization with oligonucleotide matrix. 
DNA Seq. 1:375-88, 1991 Khrapko, K.R., Lysov, P. Yu, A.A. Khorlyn, V.V. Shick, V.L. Florentiev, and A.D. Mirzabekov. An oligonucleotide hybridization approach to DNA sequencing. FEBS Lett. 256:118-122, 1989 Kimura A, Dong Rui-P, Harada H, Sasazuki T. DNA typing of HLA Class II genes in Blymphoblastoid cell lines homozygous for HLA. Tissue Antigens 40;5-12, 1992 Kimura A, Takehiko S. Eleventh International Histocompatibility Workshop reference protocol for the HLA DNA-typing technique. HLA 1991. Oxford: Oxford University Press, 397-419, 1991 Koivisto UM,Viikari JS, Kontula K. Molecular characterization of minor gene rearrangements in Finnish patients with heterozygous familial hypercholesterolemia: identification of two common missense mutations (Gly823>Asp and Leu380>His) and eight rare mutations of the LDL receptor gene. Am J Hum Genet. 57:789-97, 1995 Kononen J, Bubendorf L, Kallioniemi A, Barlund M, Schraml P, Leighton S, Torhorst J, Mihatsch MJ, Sauter G, Kallioniemi OP. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat Med. 4:844-7, 1998 Kopp MU. Mello AJ. Manz A. Chemical amplification: continuous-flow PCR on a chip. Science. 280:1046-8, 1998 Koster H. Tang K. Fu DJ. Braun A. van den Boom D. Smith CL. Cotter RJ. Cantor CR. A strategy 86 for rapid and efficient DNA sequencing by mass spectrometry. Nature Biotechnology. 14:1123-8, 1996 Kozal MJ, Shah N, Shen N,Yang R, Fucini R, Merigan TC, Richman DD, Morris D, Hubbell E, Chee M, Gingeras TR. Extensive polymorphisms observed in HIV-1 clade B protease gene using high-density oligonucleotide arrays. Nat Med. 2:753-9, 1996 Kramer FR, Lizardi PM. Replicatable RNA reporters. Nature. 339:401-2, 1989 Krook A, Stratton IM, ORahilly S. Rapid and simultaneous detection of multiple mutations by pooled and multiplex single nucleotide primer extension: application to the study of insulin-responsive glucose transporter and insulin receptor mutations in non-insulindependent diabetes. Hum. Mol. 
Gen. 1;391-5, 1992 Kruglyak L. Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nature Genetics. 22:139-44, 1999 Kuppuswamy MN, Hoffmann JW, Kasper CK, Spitzer SG, Groce SL, Bajaj PS. Single nucleotide primer extension to detect genetic diseases: Experimental application to hemophilia B (factor IX) and cystic fibrosis genes. Proc. Natl. Acad. Sci. USA 88;1143-7, 1991 Kure S, Takayanagi M, Narisawa K, Tada K, Leisti J. Identification of a common mutation in Finnish patients with nonketotic hyperglycinemia. J Clin Invest. 90:160-4, 1992 Kurg A, Tõnisson N, Georgiou I, Shumaker J, Tollett J, Metspalu A. Arrayed Primer Extension: Solid phase four-color DNA resequencing and mutation detection technology. Genetic Testing 2000 (in press) Kwiatkowski M, Fredriksson S, Isaksson A, Nilsson M, Landegren U. Inversion of in situ synthesized oligonucleotides: improved reagents for hybridization and primer extension in DNA microarrays. Nucleic Acids Res. 27:4710-4, 1999 Kwiatkowski M. Nilsson M. Landegren U. Synthesis of full-length oligonucleotides: cleavage of apurinic molecules on a novel support. Nucleic Acids Research. 24:4632-8, 1996 Kwok S. Kellogg DE. McKinney N. Spasic D. Goda L. Levenson C. Sninsky JJ. Effects of primer-template mismatches on the polymerase chain reaction: human immunodeficiency virus type 1 model studies. Nucleic Acids Research. 18:999-1005, 1990 Laan M. Paabo S. Demographic history and linkage disequilibrium in human populations Nature Genetics. 17:435-8, 1997 87 Lagerkvist A. Stewart J. Lagerstrom-Fermer M. Landegren U. Manifold sequencing: efficient processing of large sets of sequencing reactions. Proceedings of the National Academy of Sciences of the United States of America. 91:2245-9, 1994 Lambert WC, Kuo H-R, Lambert MW. Xeroderma pigmentosum and related disorders. In Jameson JL (ed.) Principles of Molecular Medicine. Humana Press, NJ. 
Lamture JB, Beattie KL, Burke BE, Eggers MD, Ehrlich DJ, Fowler R, Hollis MA, Kosicki BB, Reich RK, Smith SR, et al. Direct detection of nucleic acid hybridization on the surface of a charge coupled device. Nucleic Acids Res. 22:2121-5, 1994 Landegren U, Kaiser R, Sanders J, Hood L. A ligase-mediated gene detection technique. Science 241;1077-80, 1988 Lander ES. Array of hope. Nat Genet. 21(1 Suppl):3-4, 1999 Lander ES. The new genomics: global views of biology. Science. 274:536-9, 1996 Landsteiner & Wiener 1940 Landsteiner , 1901 Lathrop M. Nakamura Y. OConnell P. Leppert M. Woodward S. Lalouel JM. White R. A mapped set of genetic markers for human chromosome 9. Genomics. 3:361-6, 1988 Lawyer FC. Stoffel S. Saiki RK. Myambo K. Drummond R. Gelfand DH. Isolation, characterization, and expression in Escherichia coli of the DNA polymerase gene from Thermus aquaticus. Journal of Biological Chemistry. 264:6427-37, 1989 Lee LG. Livak KJ. Mullah B. Graham RJ. Vinayak RS. Woudenberg TM. Seven-color, homogeneous detection of six PCR products. Biotechniques. 27:342-9, 1999 Lemmo AV, Rose DJ, Tisone TC. Inkjet dispensing technology: applications in drug discovery. Curr Opin Biotechnol. 9:615-7, 1998 Levine & Stetson 1939 Lewin. GENES VI Chapter 15 DNA replication pp471-504, Oxford University Press, New York, 1997 Lewontin RC. Hubby JL. A molecular approach to the study of genic heterozygosity in natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics. 542:595-609, 1966 88 Li J, Butler JM, Tan Y, Lin H, Royer S, et al. Single nucleotide polymorphism determination using primer extension and time-of-flight mass spectrometry. Electrophoresis 20 1258-65, 1999 Li, W.H. & Sadler, L.A. Low nucleotide diversity in man. Genetics 129, 513523, 1991 Lindblad-Toh K, E. Winchester, M.J. Daly, D.G. Wang, J.N. Hirschhorn, J.-P. Laviolette, K.Ardlie, D.E. Reich, E. Robinson, P. Sklar, N. Shah, D. Thomas, J.-B. 
Fan, T.Gingeras, J.Warrington, N. Patil, T.J. Hudson & E.S. Lander. Large-scale discovery and genotyping of single nucleotide polymorphisms in the mouse. Nat. Genet., 2000 Linder MW, Prough RA, Valdes R Jr. Pharmacogenetics: a laboratory tool for optimizing therapeutic efficiency. Clin Chem. 43:254-66, 1997 Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ. High density synthetic oligonucleotide arrays. Nat Genet. 21(1 Suppl):20-4, 1999 Lipshutz RJ, Morris D, Chee M, Hubbell E, Kozal MJ, Shah N, Shen N,Yang R, Fodor SP. Using oligonucleotide probe arrays to access genetic diversity. Biotechniques. 19:442-7, 1995 Little DP, Braun A, ODonnell MJ, Koster H. Mass spectrometry from miniaturized arrays for full comparative DNA analysis. Nat Med. 3:1413-6, 1997 Liu R, Paxton WA, Choe S, Ceradini D, Martin SR, Horuk R, MacDonald ME, Stuhlmann H, Koup RA, Landau, NR. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86:367-77, 1996 Liu YH. Bai J. Zhu Y. Liang X. Siemieniak D.Venta PJ. Lubman DM. Rapid screening of genetic polymorphisms using buccal cell DNA with detection by matrix-assisted laser desorption/ ionization mass spectrometry. Rapid Communications in Mass Spectrometry. 9:735-43, 1995 Livak KJ. Flood SJ. Marmaro J. Giusti W. Deetz K. Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization. Genome Research. 4:357-62, 1995 Livak KJ. Hainer JW. A microtiter plate assay for determining apolipoprotein E genotype and discovery of a rare allele. Human Mutation. 3:379-85, 1994 Lizardi PM, Huang X, Zhu Z, Bray-Ward P, Thomas DC, Ward DC. Mutation detection and single-molecule counting using isothermal rolling-circle amplification. Nat Genet. 19:225- 89 32, 1998 Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, Brown EL. 
Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 14:1675-80, 1996 Loeb L.A., Preston B.D. Mutagenesis by apurinic/apyrimidic sites. Ann. Rev. Genet. 20:201230, 1986 Lonjou C. Collins A. Morton NE. Allelic association between marker loci. Proceedings of the National Academy of Sciences of the United States of America. 96:1621-6, 1999 Luo J. Bergstrom DE. Barany F. Improving the fidelity of Thermus thermophilus DNA ligase. Nucleic Acids Research. 24:3071-8, 1996 Lyamichev V, Mast AL, Hall JG, Prudent JR, Kaiser MW, Takova T, Kwiatkowski RW, Sander TJ, de Arruda M, ArcoDA, Neri BP, Brow MA. Polymorphism identification and quantitative detection of genomic DNA by invasive cleavage of oligonucleotide probes. Nat Biotechnol. 17:292-6, 1999 Lysov, P. Yu, V.L. Florentiev, A.A. Khorlyn, K.R. Khrapko, V.V. Shick, and A.D. Mirzabekov. Dokl. Akad. Nauk. SSSR. 303:1508-1511, 1989 Maniatis T. Kee SG. Efstratiadis A. Kafatos FC. Amplification and characterization of a betaglobin gene synthesized in vitro. Cell. 8:163-82,1976 Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates TO, Eisenberg D. Detecting protein function and protein-protein interactions from genome sequences. Science. 285:751-3, 1999 Marcotte EM, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D. A combined algorithm for genome-wide prediction of protein function. Nature. 402:83-6, 1999 Marshall A, Hodgson J. DNA chips: an array of possibilities. Nat Biotechnol. 16:27-31, 1998 Martinson JJ, Chapman NH, Rees DC, Liu Y-T, Clegg JB. Global distribution of the CCR5 gene 32-basepair deletion. Nature Genet 16:100-3, 1997 Maskos U, Southern EM. A novel method for the analysis of multiple sequence variants by hybridisation to oligonucleotides. Nucleic Acids Res. 21:2267-8, 1993 Maskos U, Southern EM. A novel method for the parallel analysis of multiple mutations in multiple samples. Nucleic Acids Res. 21:2269-70, 1993 90 Maskos U, Southern EM. 
Oligonucleotide hybridizations on glass supports: a novel linker for oligonucleotide synthesis and hybridization properties of oligonucleotides synthesised in situ. Nucleic Acids Res. 20:1679-84, 1992 Masood E. As consortium plans free SNP map of human genome. Nature. 398:545-6, 1999 Matson RS, Rampal J, Pentoney SL Jr, Anderson PD, Coassin P. Biopolymer synthesis on polypropylene supports: oligonucleotide arrays. Anal Biochem. 224:110-6, 1995 Matson RS, Rampal JB, Coassin PJ. Biopolymer synthesis on polypropylene supports. I. Oligonucleotides. Anal Biochem. 217:306-10, 1994 Matsuura et al. Nature Genet. 19:179, 1998 Maxam AM. Gilbert W. A new method for sequencing DNA. Proceedings of the National Academy of Sciences of the United States of America. 74:560-4, 1977 McGall G et al. J Am Chem Soc 119:5081, 1997 McGall G, Labadie J, Brock P, Wallraff G, Nguyen T, Hinsberg W. Light-directed synthesis of high-density oligonucleotide arrays using semiconductor photoresists. Proc Natl Acad Sci U S A 93:13555-60, 1996 McGlennen RC. Dynamic mutations pose unique challenges for the molecular diagnostics laboratory [comment]. Clinical Chemistry. 42:1582-8, 1996 Meldrum DR, Evensen HT, Pence WH, Moody SE, Cunningham DL, Wiktor PJ. ACAPELLA1K, A capillary-based submicroliter automated fluid handling system for genome analysis. Genome Res. 10:95-104, 2000 Metzker ML. Lu J. Gibbs RA. Electrophoretically uniform fluorescent dyes for automated DNA sequencing. Science. 271:1420-2, 1996 Mikkelsson J, Perola M, Kauppila LI, Laippala P, Savolainen V, Pajarinen J, Penttila A, Karhunen PJ. The GPIIIa Pl(A) polymorphism in the progression of abdominal aortic atherosclerosis. Atherosclerosis. 147:55-60, 1999 Milner N, Mir KU, Southern EM. Selecting effective antisense reagents on combinatorial oligonucleotide arrays. Nat Biotechnol. 15:537-41, 1997 Mir KU, Southern EM. Determining the influence of structure on hybridization using oligonucleotide arrays. Nat Biotechnol. 
17:788-92, 1999 91 Mirzabekov AD. DNA sequencing by hybridizationa megasequencing method and a diagnostic tool? Trends Biotechnol. 12:27-32, 1994 Mitra RD, Church GM. In situ localized amplification and contact replication of many individual DNA molecules. Nucleic Acids Res. 27(24):e34, 1999 Moffatt MF. Traherne JA. Abecasis GR. Cookson WOCM. Single nucleotide polymorphism and linkage disequilibrium within the TCR a/d locus. Hum. Mol. Gen. 9:1011-9, 2000 Morley JM, Bark JE, Evans CE, Perry JG, Hewitt CA, Tully G. Validation of mitochondrial DNA minisequencing for forensic casework. Int J Legal Med. 112:241-8, 1999 Morozov VN, Morozova TYa. Electrospray deposition as a method for mass fabrication of mono- and multicomponent microarrays of biological and biologically active substances. Anal Chem. 71:3110-7, 1999 Mullis KB. Faloona FA. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Methods in Enzymology. 155:335-50, 1987 Murray V. Improved double-stranded DNA sequencing using the linear polymerase chain reaction. Nucleic Acids Res. 17:8889, 1989 Myers RM. Larin Z. Maniatis T. Detection of single base substitutions by ribonuclease cleavage at mismatches in RNA:DNA duplexes. Science.230:1242-6, 1985 Myers RM. Lumelsky N. Lerman LS. Maniatis T. Detection of single base substitutions in total genomic DNA. Nature. 313:495-8, 1985 Nazarenko IA. Bhatnagar SK. Hohman RJ. A closed tube format for amplification and detection of DNA based on energy transfer. Nucleic Acids Research. 25:2516-21, 1997 Newton CR, Graham A, Heptinstall LE, Powell SJ, Summers C, Kalsheker N, et al. Analysis of any point mutation in DNA. The amplification refractory mutation system ( ARMS ). Nucleic Acids Res. 17;2503-16, 1989 Nguyen HK, Fournier O, Asseline U, Dupret D, Thuong NT. Smoothing of the thermal stability of DNA duplexes by using modified nucleosides and chaotropic agents. Nucleic Acids Res. 27:1492-8, 1999 Nickerson DA. Kaiser R. Lappin S. Stewart J. Hood L. 
Landegren U. Automated DNA diagnostics using an ELISA-based oligonucleotide ligation assay. Proceedings of the National Academy of Sciences of the United States of America. 87:8923-7, 1990 92 Nickerson, D.A. et al. DNA sequence diversity in a 9.7-kb region of the human lipoprotein lipase gene. Nature Genet. 19, 233, 1998 Nilsson M. Krejci K. Koch J. Kwiatkowski M. Gustavsson P. Landegren U. Padlock probes reveal single-nucleotide differences, parent of origin and in situ distribution of centromeric sequences in human chromosomes 13 and 21. Nature Genetics. 16:252-5, 1997 Nilsson M. Malmgren H. Samiotaki M. Kwiatkowski M. Chowdhary BP. Landegren U. Padlock probes: circularizing oligonucleotides for localized DNA detection. Science. 265:2085-8, 1994 Northrup MA, Christel LA, McMillan WA, Petersen K, Pourahmadi F, Western L, and Young S. A new generation of PCR instruments and nucleic acid systems. In:PCR Applications: protocols for functional genomics. (ed. Innis MA, Gelfand DH and Sninsky JJ) pp. 105-126. Academic Press, San Diego, 1999 Nyren P. Karamohamed S. Ronaghi M. Detection of single-base changes using a bioluminometric primer extension assay. Analytical Biochemistry. 244:367-73, 1997 ODonovan MC. Oefner PJ. Roberts SC. Austin J. Hoogendoorn B. Guy C. Speight G. Upadhyaya M. Sommer SS. McGuffin P. Blind analysis of denaturing high-performance liquid chromatography as a tool for mutation detection. Genomics. 52:44-9, 1998 Okamoto T. Suzuki T.Yamamoto N. Microarray fabrication with covalent attachment of DNA using Bubble Jet technology. Nature Biotech. 18:438, 2000 Parker KC; Haff L.; Garvin AM; MALDI-TOF based mutation detection using tagged in vitro synthesized peptides Nature Biotechnology 18:95, 2000 Pastinen T, Syvänen AC, Sitbon G Lönngren J: Fluorescent, solid-phase minisequencing method for genotyping cytochrome P450 genes. In: PCR applications: Protocols for functional genomics. Ed. Michael Innis, David Gelfand ja John Snitsky. Academic Press. Pp. 
521-536, 1999 Pauling L, Itano HA, Singer SJ,Wells IC. Sickle cell anemia: A molecular disease. Science 110:543, 1949 Pease AC, Solas D, Sullivan EJ, Cronin MT, Holmes CP, Fodor SP..Light-generated oligonucleotide arrays for rapid DNA sequence analysis. Proc Natl Acad Sci U S A. 91:5022-6, 1994 Pecheniuk NM. Marsh NA. Walsh TP. Dale JL. Use of first nucleotide change technology to 93 determine the frequency of factor V Leiden in a population of Australian blood donors. Blood Coagulation & Fibrinolysis. 8:491-5, 1997 Peltonen L, Jalanko A, Varilo T. Molecular genetics of the finnish disease heritage. Hum Mol Genet. 8:1913-23, 1999 Piggee CA, Muth J, Carrilho E, Karger BL. Capillary electrophoresis for the detection of known point mutations by single-nucleotide primer extension and laser-induced fluorescence detection.J Chromatogr A. 781:367-75, 1997 Plaschke J, Voss H, Hahn M, Ansorge W, Schackert HK. Doublex sequencing in molecular diagnosis of hereditary diseases. Biotechniques. 24:838-41, 1998 Powell et al. New England Journal of Medicine 329:1982, 1993 Proudnikov D, Timofeev E, Mirzabekov A. Immobilization of DNA in polyacrylamide gel for the manufacture of DNA and DNA-oligonucleotide microchips. Anal Biochem. 259:34-41, 1998 Quesada MA. Replaceable polymers in DNA sequencing by capillary electrophoresis. [Review] [65 refs] Current Opinion in Biotechnology.8:82-93, 1997 Ramsey JM, Jacobson SC, Knapp MR. Microfabricated chemical measurement systems. Nat Med. 1:1093-6, 1995 Rehman FN, Audeh M, Abrams ES, Hammond PW, Kenney M, Boles TC. Immobilization of acrylamide-modified oligonucleotides by co-polymerization. Nucleic Acids Res. 27:649-55, 1999 Rieder MJ. Taylor SL. Clark AG. Nickerson DA. Sequence variation in the human angiotensin converting enzyme. Nature Genetics.22:59-62, 1999 Rigler R. Fluorescence correlations, single molecule detection and large number screening. Applications in biotechnology. J Biotechnol. 41:177-86, 1995 Risch N, and Merikangas K. 
The future of genetic studies of complex human diseases. Science 273:1516-7, 1996
Rogers YH, Jiang-Baucom P, Huang ZJ, Bogdanov V, Anderson S, Boyce-Jacino MT. Immobilization of oligonucleotides onto a glass support via disulfide bonds: A method for preparation of DNA microarrays. Anal Biochem. 266:23-30, 1999
Ronaghi M. Karamohamed S. Pettersson B. Uhlen M. Nyren P. Real-time DNA sequencing using detection of pyrophosphate release. Analytical Biochemistry. 242:84-9, 1996
Roskey MT. Juhasz P. Smirnov IP. Takach EJ. Martin SA. Haff LA. DNA sequencing by delayed extraction-matrix-assisted laser desorption/ionization time of flight mass spectrometry. Proceedings of the National Academy of Sciences of the United States of America. 93:4724-9, 1996
Ross P, Hall L, Smirnov I, and Haff L. High level multiplex genotyping by MALDI-TOF mass spectrometry. Nat Biotechnol 16:1347-51, 1998
Ross PL. Lee K. Belgrader P. Discrimination of single-nucleotide polymorphisms in human DNA using peptide nucleic acid probes detected by MALDI-TOF mass spectrometry. Analytical Chemistry. 69:4197-202, 1997
Ruano G. Kidd KK. Coupled amplification and sequencing of genomic DNA. Proceedings of the National Academy of Sciences of the United States of America. 88:2815-9, 1991
Saiki RK. Bugawan TL. Horn GT. Mullis KB. Erlich HA. Analysis of enzymatically amplified beta-globin and HLA-DQ alpha DNA with allele-specific oligonucleotide probes. Nature. 324:163-6, 1986
Saiki RK. Chang CA. Levenson CH. Warren TC. Boehm CD. Kazazian HH Jr. Erlich HA. Diagnosis of sickle cell anemia and beta-thalassemia with enzymatically amplified DNA and nonradioactive allele-specific oligonucleotide probes. New England Journal of Medicine. 319:537-41, 1988
Saiki RK. Gelfand DH. Stoffel S. Scharf SJ. Higuchi R. Horn GT. Mullis KB. Erlich HA. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science. 239:487-91, 1988
Saiki RK. Walsh PS. Levenson CH. Erlich HA.
Genetic analysis of amplified DNA with immobilized sequence-specific oligonucleotide probes. Proceedings of the National Academy of Sciences of the United States of America. 86:6230-4, 1989
Sajantila A. Lukka M. Syvanen AC. Experimentally observed germline mutations at human micro- and minisatellite loci. European Journal of Human Genetics. 7:263-6, 1999
Sambrook, J., Fritsch, E.F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed. pp. E.5 Cold Spring Harbor Laboratory Press. Cold Spring Harbor, NY, 1989
Samiotaki M. Kwiatkowski M. Parik J. Landegren U. Dual-color detection of DNA sequence variants by ligase-mediated analysis. Genomics. 20:238-42, 1994
Samson M, Libert F, Doranz BJ, Rucker J, Liesnard C, Farber C-M, Saragosti S, Lapoumeroulie C, Cognaux J, Forceille C, Muyldermans G, Verhofstede C, Burtonboy G, Georges M, Imai T, Rana S, Yi Y, Smyth RJ, Collman RG, Doms RW, Vassart G, Parmentier M. Resistance to HIV-1 infection in Caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor gene. Nature 382:722-5, 1996
Sanger F. Nicklen S. Coulson AR. DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences of the United States of America. 74:5463-7, 1977
Sapolsky RJ, Lipshutz RJ. Mapping genomic library clones using oligonucleotide arrays. Genomics 33:445-56, 1996
Sauer S, Lechner D, Berlin K, Lehrach H, Escary JL, Fox N, Gut IG. A novel procedure for efficient genotyping of single nucleotide polymorphisms. Nucleic Acids Res 28:e13, 2000
Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 270:467-70, 1995
Shalon D, Smith SJ, Brown PO. A DNA microarray system for analyzing complex DNA samples using two-color fluorescent probe hybridization. Genome Res. 6:639-45, 1996
Sharkey DJ. Scalice ER. Christy KG Jr. Atwood SM. Daiss JL.
Antibodies as thermolabile switches: high temperature triggering for the polymerase chain reaction. Bio/Technology. 12:506-9, 1994
Shoemaker DD, Lashkari DA, Morris D, Mittmann M, Davis RW. Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy. Nat Genet. 14:450-6, 1996
Shuber AP. Grondin VJ. Klinger KW. A simplified procedure for developing multiplex PCRs. Genome Research. 5:488-93, 1995
Shuber AP. Michalowsky LA. Nass GS. Skoletsky J. Hire LM. Kotsopoulos SK. Phipps MF. Barberio DM. Klinger KW. High throughput parallel analysis of hundreds of patient samples for more than 100 mutations in multiple disease genes. Human Molecular Genetics. 6:337-47, 1997
Shumaker JM, Metspalu A, Caskey CT. Mutation detection by solid phase primer extension. Hum Mutat. 7:346-54, 1996
Sitbon G. Hurtig M. Palotie A. Lonngren J. Syvanen AC. A colorimetric minisequencing assay for the mutation in codon 506 of the coagulation factor V gene. Thrombosis & Haemostasis. 77:701-3, 1997
Smith HO, and Wilcox KW. A restriction enzyme from Hemophilus influenzae. I. Purification and general properties. J Mol Biol 51:379, 1970
Sokolov BP. Primer extension technique for the detection of single nucleotide in genomic DNA. Nucleic Acids Res. 18:3671, 1990
Solinas-Toldo S, Lampel S, Stilgenbauer S, Nickolenko J, Benner A, Dohner H, Cremer T, Lichter P. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromosomes Cancer 20:399-407, 1997
Sommer SS, Cassady JD, Sobell JL, Bottema CDK. A novel method for detecting point mutations or polymorphisms and its application to population screening for carriers of phenylketonuria. Mayo Clin. Proc. 64:1361-72, 1989
Sosnowski RG, Tu E, Butler WF, O'Connell JP, Heller MJ. Rapid determination of single base mismatch mutations in DNA hybrids by direct electric field control. Proc Natl Acad Sci U S A.
94:1119-23, 1997
Southern EM. Analyzing polynucleotide sequences. International Patent Application PCT GB 89/00460, 1988
Southern EM, Case-Green SC, Elder JK, Johnson M, Mir KU, Wang L, Williams JC. Arrays of complementary oligonucleotides for analysing the hybridisation behaviour of nucleic acids. Nucleic Acids Res. 22:1368-73, 1994
Southern EM, Maskos U, Elder JK. Analyzing and comparing nucleic acid sequences by hybridization to arrays of oligonucleotides: evaluation using experimental models. Genomics. 13:1008-17, 1992
Southern EM. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503-17, 1975
Southern EM. DNA chips: analysing sequence by hybridization to oligonucleotides on a large scale. Trends Genet. 12:110-5, 1996
Steemers FJ, Ferguson JA, Walt DR. Screening unlabeled DNA targets with randomly ordered fiber-optic gene arrays. Nat Biotechnol. 18:91-4, 2000
Stimpson DI, Cooley PW, Knepper SM, Wallace DB. Parallel production of oligonucleotide arrays using membranes and reagent jet printing. Biotechniques. 25:886-90, 1998
Stimpson DI, Hoijer JV, Hsieh WT, Jou C, Gordon J, Theriault T, Gamble R, Baldeschwieler JD. Real-time detection of DNA hybridization and melting on oligonucleotide arrays by using optical wave guides. Proc Natl Acad Sci U S A. 92:6379-83, 1995
Stomakhin AA, Vasiliskov VA, Timofeev E, Schulga D, Cotter RJ, Mirzabekov AD. DNA sequence analysis by hybridization with oligonucleotide microchips: MALDI mass spectrometry identification of 5mers contiguously stacked to microchip oligonucleotides. Nucleic Acids Res. 28:1193-1198, 2000
Strezoska Z, Paunesku T, Radosavljevic D, Labat I, Drmanac R, Crkvenjakov R. DNA sequencing by hybridization: 100 bases read by a non-gel-based method. Proc Natl Acad Sci U S A. 88:10089-93, 1991
Syvanen AC. Aalto-Setala K. Harju L. Kontula K. Soderlund H. A primer-guided nucleotide incorporation assay in the genotyping of apolipoprotein E. Genomics.
8:684-92, 1990
Syvanen AC. Aalto-Setala K. Kontula K. Soderlund H. Direct sequencing of affinity-captured amplified human DNA application to the detection of apolipoprotein E polymorphism. FEBS Letters. 258:71-4, 1989
Syvanen AC. Ikonen E. Manninen T. Bengtstrom M. Soderlund H. Aula P. Peltonen L. Convenient and quantitative determination of the frequency of a mutant allele using solid-phase minisequencing: application to aspartylglucosaminuria in Finland. Genomics. 12:590-5, 1992
Syvanen AC. Sajantila A. Lukka M. Identification of individuals by analysis of biallelic DNA markers, using PCR and solid-phase minisequencing. American Journal of Human Genetics. 52:46-59, 1993
Syvanen, A.C. From gels to chips: minisequencing primer extension for analysis of point mutations and single nucleotide polymorphisms. Human Mutation 13:1-10, 1999
Tabor S. Richardson CC. A single residue in DNA polymerases of the Escherichia coli DNA polymerase I family is critical for distinguishing between deoxy- and dideoxyribonucleotides. Proceedings of the National Academy of Sciences of the United States of America. 92:6339-43, 1995
Tang, K., D.-J. Fu, D. Julien, A. Braun, C.R. Cantor, and H. Koster. Chip-based genotyping by mass spectrometry. Proc. Natl. Acad. Sci. 96: 10016-10020, 1999
Tapp I, Malmberg L, Rennel E, Wik M, Syvanen AC. Homogeneous scoring of single-nucleotide polymorphisms: Comparison of the 5'-nuclease TaqMan assay and molecular beacon probes. Biotechniques 28:0-0, 2000
Terwilliger JD. Weiss KM. Linkage disequilibrium mapping of complex disease: fantasy or reality? Current Opinion in Biotechnology. 9:578-94, 1998
Thacker J. The molecular nature of mutation in cultured mammalian cells: a review. Mutat. Res. 150:431-442, 1985
The Huntington's Disease Collaborative Research Group. A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington's disease chromosomes. Cell. 72:971-83, 1993
Thonnard J. Deldime F. Heusterspreute M. Delepaut B. Hanon F.
De Bruyere M. Philippe M. HLA class II genotyping: two assay systems compared. Clinical Chemistry. 41:553-6, 1995
Tobe VO. Taylor SL. Nickerson DA. Single-well genotyping of diallelic sequence variations by a two-color ELISA-based oligonucleotide ligation assay. Nucleic Acids Research. 24:3728-32, 1996
Torrents D, Mykkanen J, Pineda M, Feliubadalo L, Estevez R, de Cid R, Sanjurjo P, Zorzano A, Nunes V, Huoponen K, Reinikainen A, Simell O, Savontaus ML, Aula P, Palacin M. Identification of SLC7A7, encoding y+LAT-1, as the lysinuric protein intolerance gene. Nat Genet. 21:293-6, 1999
Tully G, Sullivan KM, Nixon P, Stones RE, Gill P. Rapid detection of mitochondrial sequence polymorphisms using multiplex solid-phase fluorescent minisequencing. Genomics. 34:107-13, 1996
Turner MW. Mannose-binding lectin: the pluripotent molecule of the innate immune system. Imm Today 17:532-40, 1996
Tuuminen T. Ingman H. Therrell BL Jr. Kallio A. Multivariant confirmation of sickle cell disease using a non-radioactive minisequencing reaction. Hemoglobin. 21:71-89, 1997
Tyagi S. Bratu DP. Kramer FR. Multicolor molecular beacons for allele discrimination. Nature Biotechnology. 16:49-53, 1998
Tyagi S. Kramer FR. Molecular beacons: probes that fluoresce upon hybridization. Nature Biotechnology. 14:303-8, 1996
Tyagi S. Landegren U. Tazi M. Lizardi PM. Kramer FR. Extremely sensitive, background-free gene detection using binary probes and beta replicase. Proceedings of the National Academy of Sciences of the United States of America. 93:5395-400, 1996
Underhill PA. Jin L. Lin AA. Mehdi SQ. Jenkins T. Vollrath D. Davis RW. Cavalli-Sforza LL. Oefner PJ. Detection of numerous Y chromosome biallelic polymorphisms by denaturing high-performance liquid chromatography. Genome Research. 7:996, 1997
Vartiainen E, Puska P, Pekkanen J, Tuomilehto J, Jousilahti P.
Changes in risk factors explain changes in mortality from ischaemic heart disease in Finland. BMJ. 309:23-7, 1994
Vasiliskov AV, Timofeev EN, Surzhikov SA, Drobyshev AL, Shick VV, Mirzabekov AD. Fabrication of microarray of gel-immobilized compounds on a chip by copolymerization. Biotechniques. 27:592, 1999
Verheijen FW, Verbeek E, Aula N, Beerens CE, Havelaar AC, Joosse M, Peltonen L, Aula P, Galjaard H, van der Spek PJ, Mancini GM. A new gene, encoding an anion transporter, is mutated in sialic acid storage diseases. Nat Genet. 23:462-5, 1999
Virtaneva K. D'Amato E. Miao J. Koskiniemi M. Norio R. Avanzini G. Franceschetti S. Michelucci R. Tassinari CA. Omer S. Pennacchio LA. Myers RM. Dieguez-Lucena JL. Krahe R. de la Chapelle A. Lehesjoki AE. Unstable minisatellite expansion causing recessively inherited myoclonus epilepsy, EPM1. Nature Genetics. 15:393-6, 1997
Vo-Dinh T, Alarie JP, Isola N, Landis D, Wintenberg AL, Ericson MN. DNA biochip using a phototransistor integrated circuit. Anal Chem. 71:358-63, 1999
Walker GT. Fraiser MS. Schram JL. Little MC. Nadeau JG. Malinowski DP. Strand displacement amplification - an isothermal, in vitro DNA amplification technique. Nucleic Acids Research. 20:1691-6, 1992
Walker GT. Little MC. Nadeau JG. Shank DD. Isothermal in vitro amplification of DNA by a restriction enzyme/DNA polymerase system. Proceedings of the National Academy of Sciences of the United States of America. 89:392-6, 1992
Wall J. Cai S. Chehab FF. A 31-mutation assay for cystic fibrosis testing in the clinical molecular diagnostics laboratory. Human Mutation. 5:333-8, 1995
Wallace RB. Shaffer J. Murphy RF. Bonner J. Hirose T. Itakura K. Hybridization of synthetic oligodeoxyribonucleotides to phi chi 174 DNA: the effect of single base pair mismatch. Nucleic Acids Research. 6:3543-57, 1979
Wallraff et al. Chemtech 22-32, 1997
Wang D.G. et al. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome.
Science 280, 1077-1082, 1998
Weber JL. Human DNA polymorphisms and methods of analysis. Current Opinion in Biotechnology. 1:166-71, 1990
Weber JL. May PE. Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. American Journal of Human Genetics. 44:388-96, 1989
Weiler J, Gausepohl H, Hauser N, Jensen ON, Hoheisel JD. Hybridisation based DNA screening on peptide nucleic acid (PNA) oligomer arrays. Nucleic Acids Res. 25:2792-9, 1997
Westin L, Xu X, Miller C, Wang L, Edman CF, Nerenberg M. Anchored multiplex amplification on a microelectronic chip array. Nat Biotechnol. 18:199-204, 2000
Westman P, Kuismin T, Partanen J, Koskimies S. An HLA-DR typing protocol using group-specific PCR-amplification followed by restriction enzyme digests. Eur J Immunogen 20:103-9, 1993
Winzeler EA, Richards DR, Conway AR, Goldstein AL, Kalman S, McCullough MJ, McCusker JH, Stevens DA, Wodicka L, Lockhart DJ, Davis RW. Direct allelic variation scanning of the yeast genome. Science. 281:1194-7, 1998
Wong C. Dowling CE. Saiki RK. Higuchi RG. Erlich HA. Kazazian HH Jr. Characterization of beta-thalassaemia mutations using direct genomic sequencing of amplified single copy DNA. Nature. 330(6146):384-6, 1987
Wu DY, Nozari G, Schold M, Conner BJ, Wallace RB. Direct analysis of single nucleotide variation in human DNA and RNA using in situ dot hybridization. DNA. 8:135-42, 1989
Wu DY, Ugozzoli L, Pal BK, Wallace RB. Allele-specific enzymatic amplification of beta-globin genomic DNA for diagnosis of sickle cell anemia. Proc. Natl. Acad. Sci. USA 86:2757-60, 1989
Wu DY, Wallace RB. The ligation amplification reaction (LAR) - amplification of specific DNA sequences using sequential rounds of template-dependent ligation. Genomics 4:560-9, 1989
Yershov G, Barsky V, Belgovskiy A, Kirillov E, Kreindlin E, Ivanov I, Parinov S, Guschin D, Drobishev A, Dubiley S, Mirzabekov A. DNA analysis and diagnostics on oligonucleotide microchips. Proc Natl Acad Sci U S A.
93:4913-8, 1996
Zangenberg G, Saiki RK, Reynolds R. Multiplex PCR: Optimization guidelines. In: PCR applications: Protocols for functional genomics. Ed. Michael Innis, David Gelfand and John Sninsky. Academic Press. Pp. 73-94, 1999
Zhang H, Tombline G, Weber BL. BRCA1, BRCA2, and DNA damage response: collision or collusion? Cell. 92:433-6, 1998
Zhang L, Cui X, Schmitt K, Hubert R, Navidi W, Arnheim N. Whole genome amplification from a single cell: implications for genetic analysis. Proc Natl Acad Sci U S A. 89:5847-51, 1992
Zirvi M, Bergstrom DE, Saurage AS, Hammer RP, Barany F. Improved fidelity of thermostable ligases for detection of microsatellite repeat sequences using nucleoside analogs. Nucleic Acids Research Methods 27:e41, 1999a
Zirvi M, Nakayama T, Newman G, McCaffrey T, Paty P, Barany F. Ligase-based detection of mononucleotide repeat sequences. Nucleic Acids Research Methods. 27:e42, 1999b