Structural Analysis Methods (Méthodes d'Analyse Structurale)
M1 - BBSG, 2006-2007
Pedro Coutinho
Pedro.Coutinho@afmb.cnrs-mrs.fr
http://www.afmb.univ-mrs.fr/Pedro-M-Coutinho

Course Organisation (1)
■ Introduction (PC)
■ 1. Structural Analysis Techniques (MAS1)
◆ Nuclear Magnetic Resonance - NMR (HD)
◆ Electron Paramagnetic Resonance - EPR (BG)
◆ X-ray Crystallography (MC, GP)
◆ Fluorescence, Circular Dichroism, Infrared (JS)
◆ Electron Microscopy and Atomic Force Microscopy (PC, JS)
◆ Small-Angle Scattering and DLS (VRB)
■ 2. In-Depth Introduction (MAS2)
◆ NMR (HD)
◆ EPR (BG)
◆ Crystallography (MC, GP)
■ Final (PC)

Biochemistry Rests on a (Bio)physical Foundation
Most biological phenomena take place on small scales of space and time.

Structural Biology: Dimensions
■ Biomolecules are too small to be observed directly (e.g. in the way classical light microscopy allows).
Magnification 100X: (a) collection of cells (b) human egg (c) grain of salt (d) human hair (e) protozoan Paramecium multimicronucleatum (f) protozoan Amoeba proteus
Magnification 1000X: (a) 5 cells of the bacterium Escherichia coli, 2000 nm (b) 2 cells of the yeast Saccharomyces cerevisiae (c) red blood cell (d) white blood cell (e) spermatozoon (f) epidermal cell (g) striated muscle cell (h) nerve cell

Structural Biology: Dimensions
Magnification 100 000X: (a) collection of molecules (b) E. coli cell, 2000 nm (c) tobacco mosaic virus (d) HIV (the AIDS virus) (e) bacteriophage
Magnification 1 000 000X: (a) carbon atom, 0.1 nm (b) glucose, 0.7 nm (c) ATP (adenosine triphosphate) (d) chlorophyll (e) tRNA (f) antibody (g) ribosome (h) poliovirus, 30 nm (i) myosin (j) DNA (k) actin (l) the 10 enzymes of glycolysis (m) pyruvate dehydrogenase complex

Structural Biology: Definition
■ The study of the architecture and shape of biological macromolecules (in particular proteins and nucleic acids) and of all associated aspects.
■ Important for biology because macromolecules take part in most cellular functions.
■ Their function often depends on their ability to fold into a particular conformation.

Structural Biology: Levels of Structure
■ The functional structure is usually referred to as the tertiary and/or quaternary structure.
■ These levels depend on the lower levels of structure: primary structure (linked to composition) and secondary structure (linked to local arrangements).

Primary Structure
[Figure: chemical structures of modified nucleosides (R = ribose): inosine (I), pseudouridine (ψ), ribothymidine (T), 1-methylinosine (m1I), 5,6-dihydrouridine (DHU), 1-methylguanosine (m1G), N2-methylguanosine (m2G), N2,N2-dimethylguanosine (m2,2G).]

Secondary Structure
Tertiary Structure
Quaternary Structure...
Nucleic Acids: RNA vs DNA

Domains are Structurally Distinct Lobes in Proteins
■ Domains are structurally independent units that each have the characteristics of a small globular protein.
■ Most domains consist of 100 to 200 amino acid residues and have an average diameter of ~25 Å.
■ Domains of recently evolved proteins are frequently encoded by exons, reflecting gene fusion of simpler modules.
■ The number of domains defined by unique folds is probably limited.
■ Databases of domain folds are available on the internet, including SCOP (http://scop.mrc-lmb.cam.ac.uk/scop) and CATH (http://www.biochem.ucl.ac.uk/bsm/cath/).
◆ C = Class
◆ A = Architecture
◆ T = Topology
◆ H = Homologous superfamily

Polypeptide Chains May Associate to Form Higher-Order Macromolecular Complexes
■ To form complex machinery able to fulfil complex functions: ribosome, RNA polymerase II
■ To bring the enzymes of a metabolic pathway close together so that loss of metabolic intermediates is avoided: pyruvate dehydrogenase multienzyme complex
■ To build structures of a given geometry: viruses
■ To reduce the osmotic pressure: insulin crystal
■ To enlarge the range of possible enzyme activity characteristics by introducing cooperativity between subunits and various types of regulation: hemoglobin, cAMP-dependent protein kinase

Approaches/Constraints in Structural Biology
■ Nature/state of the macromolecules
◆ Solution
◆ Crystal (two- and three-dimensional)
◆ Adsorbed on a surface
■ Number of macromolecules
◆ Ensembles (solution, crystals, ...)
◆ Single (macro)molecule (microscopy; Biacore??)
■ Combination of experimental approaches (multi-resolution)

Structural Biology: Molecules Analysed
■ Structure-determination methods are generally based on simultaneous measurements of a large number of identical molecules (detectable signals / averaged results).
■ These methods include crystallography and most spectroscopic techniques.
■ Most techniques study the static “native states” of macromolecules. Variations of these methods allow the observation of dynamic phenomena linked to the denatured/native transition and to function-related conformational changes.
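
As a toy illustration of why measuring many identical molecules at once yields a detectable, averaged signal, the Python sketch below assumes that each molecule contributes the same true signal plus independent Gaussian noise. The noise model, the function names and the numbers are illustrative assumptions and are not part of the course material.

```python
import random
import statistics


def ensemble_mean(n_molecules, true_signal=1.0, noise_sd=5.0, rng=random):
    """Average one noisy reading per molecule (assumed model: signal + Gaussian noise)."""
    readings = [true_signal + rng.gauss(0.0, noise_sd) for _ in range(n_molecules)]
    return statistics.fmean(readings)


def spread_of_means(n_molecules, n_repeats=200, seed=0):
    """Standard deviation of the ensemble average over repeated simulated experiments."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    means = [ensemble_mean(n_molecules, rng=rng) for _ in range(n_repeats)]
    return statistics.pstdev(means)


if __name__ == "__main__":
    for n in (1, 100, 2500):
        print(f"{n:>5} molecules: spread of the averaged signal = {spread_of_means(n):.3f}")
```

Under this model the spread of the averaged signal shrinks roughly as 1/sqrt(N), which is one way to see why bulk samples give usable signals where a single molecule would be lost in the noise.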
Origins of 3D Structural Data
■ Most 3D structure data in the PDB were obtained by one of three methods:
✦ X-ray crystallography (over 80%)
✦ solution nuclear magnetic resonance (NMR) (about 16%)
✦ theoretical modeling (2%) - non-experimental
■ A few structures were determined by other methods.

                          Molecule type
Exp. Method   Proteins   Nucleic Acids   Protein/NA Complexes   Other   Total
X-ray            30288             916                   1391      28   32623
NMR               4796             720                    122       6    5644
EM                  90              10                     29       0     129
Other               76               4                      3       0      83
Total            35250            1650                   1545      34   38479

Increasing Number of Known Protein Atomic Structures

Structural Determination
[Figure panels: X-ray crystallography (synchrotron radiation X-ray crystallography); NMR spectroscopy (high-field NMR spectrometer); cryo-electron microscopy (electron microscope for cryo-EM).]

Structural Biology: Electron Microscopy
cryo-EM
◆ High-resolution information can be obtained from images of macromolecules, at a resolution of 3-10 Å (0.3-1.0 nm).

3D Determination: Electron Microscopy (EM)
Provides shape information for macromolecules. Techniques for averaging particles observed after staining exist, but are limited to a resolution of 20 Å. Cryo techniques have provided structures with resolutions as high as 7 Å. New methods of tomography are increasingly providing shape information by combining multiple images of a single particle. It is sometimes possible to determine the interaction interfaces between subunits, and individual structures may be fitted into the low-resolution density.

“Traditional” Experimental Determination of 3D Structures
[Figure: X-ray: X-rays on crystals give a diffraction pattern; direct detection of atom positions. NMR: RF pulses in a field H0 give resonance; indirect detection of H-H distances; in solution.]

3D Determination: X-ray Crystallography
Uses diffraction patterns observed after bombarding a crystallized protein with X-rays to construct 3D structures (up to 0.8 Å resolution).
In principle there is no size limit on the proteins studied using this technique, and the majority of large complexes known to atomic resolution have been solved by this method. Crystals are often difficult to obtain, and large complexes require high-quality crystals for diffraction at high resolution.
[Figure: synchrotron radiation X-ray crystallography.]

X-Ray Structural Data
■ X-ray crystal diffraction usually cannot resolve the positions of hydrogen atoms, or reliably distinguish nitrogen from oxygen from carbon.
■ The chemical identity of the terminal side-chain atoms is therefore uncertain for Asn, Gln and Thr, and is usually inferred from the protein environment of the side chain (i.e. the side-chain orientation that forms the most hydrogen bonds or makes the best electrostatic interactions is selected and built).
■ Sometimes there is also uncertainty about whether an atom that is not part of the protein is a bound water oxygen or a metal ion.
■ Some newer X-ray crystal diffraction PDB files contain hydrogen positions; these hydrogens were added by modeling. Only for the relatively small number of structures whose data extend beyond about 1.2 Å resolution is it possible to locate some of the hydrogen positions from the X-ray diffraction data.

3D Determination: Nuclear Magnetic Resonance (NMR) Spectroscopy
Measures transitions between different nuclear spin states within a magnetic field, which provide information about distances between atoms within a macromolecule. The technique is limited to proteins of up to 40 kDa. It has an increasing role in studying interaction interfaces between structures determined independently.
[Figure: high-field NMR spectrometer.]

NMR 3D Structural Data
■ NMR determines structures of proteins in solution, but is limited to molecules not much greater than 30 kDa. NMR is the method of choice for small proteins which are not readily crystallized, and yields the positions of some hydrogen atoms.
The results of NMR analysis are an ensemble of alternative models, in contrast to the unique model obtained by crystallography.

Modeling Data (in PDB)
■ Structures obtained by theoretical modeling tend to be less accurate than those obtained by experimental methods. One kind of modeling, called homology modeling, involves fitting a known sequence to the experimentally determined 3D structure of a sequence-similar molecule. Results of homology modeling are more likely to be reliable than results derived purely from theory (ab initio modeling).

Limitations of 3D Structural Data
■ Crystallization sometimes distorts portions of a structure because of contacts between neighboring molecules in the crystal.
■ However, protein crystals as used for diffraction studies are highly hydrated ("wet and gelatinous"), so structures determined from crystals are not much different from the structures of soluble proteins in aqueous solution.
■ Some molecules have been studied both by crystallography and by solution NMR, and in these cases the agreement has been excellent.

Atomic Resolution: NMR or Crystallography?
■ Both techniques determine protein structures
■ NMR uses protein in solution
■ X-ray crystallography uses protein crystals
■ Both techniques require large amounts of pure protein
■ Both techniques require expensive equipment!

Xtal vs NMR
Parameters compared:
● Resolution
● Large proteins
● Active-site details
● Stereochemical quality
● Secondary structure
● Surface structure
● Structure in solution
● Sample preparation
● “Non-crystallisable” proteins
● Intramolecular interactions
● Intermolecular interactions
Ratings, Xtal and NMR columns (row alignment not recoverable from the transcript): +++ ++ +++ +++/+ ++ + + - + + + +++ +++/+ +++/+ + + ++ ++

PDB: Example
(Entry 12CA. The entry ID and record serial number that close each line of the original listing, e.g. "12CA 2", are omitted.)

HEADER    LYASE(OXO-ACID)                         01-OCT-91   12CA
COMPND    CARBONIC ANHYDRASE /II (CARBONATE DEHYDRATASE) (/HCA II)
COMPND   2 (E.C.4.2.1.1) MUTANT WITH VAL 121 REPLACED BY ALA (/V121A)
SOURCE    HUMAN (HOMO SAPIENS) RECOMBINANT PROTEIN
AUTHOR    S.K.NAIR,D.W.CHRISTIANSON
REVDAT   1   15-OCT-92 12CA    0
JRNL        AUTH   S.K.NAIR,T.L.CALDERONE,D.W.CHRISTIANSON,C.A.FIERKE
JRNL        TITL   ALTERING THE MOUTH OF A HYDROPHOBIC POCKET.
JRNL        TITL 2 STRUCTURE AND KINETICS OF HUMAN CARBONIC ANHYDRASE
JRNL        TITL 3 /II$ MUTANTS AT RESIDUE VAL-121
JRNL        REF    J.BIOL.CHEM.                  V. 266 17320 1991
JRNL        REFN   ASTM JBCHA3  US ISSN 0021-9258                 071
REMARK   1
REMARK   2
REMARK   2 RESOLUTION. 2.4 ANGSTROMS.
REMARK   3
REMARK   3 REFINEMENT.
REMARK   3   PROGRAM PROLSQ
REMARK   3   AUTHORS HENDRICKSON,KONNERT
REMARK   3   R VALUE 0.170
REMARK   3   RMSD BOND DISTANCES 0.011 ANGSTROMS
REMARK   3   RMSD BOND ANGLES 1.3 DEGREES
REMARK   4
REMARK   4 N-TERMINAL RESIDUES SER 2, HIS 3, HIS 4 AND C-TERMINAL
REMARK   4 RESIDUE LYS 260 WERE NOT LOCATED IN THE DENSITY MAPS AND,
REMARK   4 THEREFORE, NO COORDINATES ARE INCLUDED FOR THESE RESIDUES.
...
SHEET    9 S10 LYS 257  ALA 258 -1  O LYS 257  N THR 193
SHEET   10 S10 LYS  39  TYR  40  1  O LYS  39  N ALA 258
TURN     1 T1  GLN  28  VAL  31     TYPE VIB (CIS-PRO 30)
...
CRYST1   42.700   41.700   73.000  90.00 104.60  90.00 P 21          2
ORIGX1      1.000000  0.000000  0.000000        0.00000
ORIGX2      0.000000  1.000000  0.000000        0.00000
ORIGX3      0.000000  0.000000  1.000000        0.00000
SCALE1      0.023419  0.000000  0.006100        0.00000
SCALE2      0.000000  0.023981  0.000000        0.00000
SCALE3      0.000000  0.000000  0.014156        0.00000
ATOM      1  N   TRP     5       8.519  -0.751  10.738  1.00 13.37
ATOM      2  CA  TRP     5       7.743  -1.668  11.585  1.00 13.42
ATOM      3  C   TRP     5       6.786  -2.502  10.667  1.00 13.47
...

Size Range of Xtal Macromolecular Assemblies
Size distribution in Protein Quaternary Structure (PQS: http://pqs.ebi.ac.uk).
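
As a small consistency check on the 12CA records shown in the PDB example, the Python sketch below recomputes the SCALE matrix from the CRYST1 cell and measures two backbone bond lengths from the three ATOM records. The numbers are copied from the example; the function names are illustrative. The matrix formula assumes the usual orthogonal frame for a monoclinic cell (a along x, b along y), an assumption that the identity ORIGX records are consistent with but do not prove.

```python
import math

# Cell parameters from the CRYST1 record of entry 12CA (lengths in Å, angle in degrees).
A, B, C, BETA_DEG = 42.700, 41.700, 73.000, 104.60

# Rotation part of the SCALE1-3 records as printed in the entry.
SCALE_FROM_FILE = [
    [0.023419, 0.000000, 0.006100],
    [0.000000, 0.023981, 0.000000],
    [0.000000, 0.000000, 0.014156],
]

# First three ATOM records of residue TRP 5: atom name -> (x, y, z) in Å.
ATOMS = {
    "N": (8.519, -0.751, 10.738),
    "CA": (7.743, -1.668, 11.585),
    "C": (6.786, -2.502, 10.667),
}


def monoclinic_scale(a, b, c, beta_deg):
    """Cartesian-to-fractional matrix for a monoclinic cell (alpha = gamma = 90 degrees).

    Assumed frame: a along x, b along y. The result is the inverse of
    [[a, 0, c*cos(beta)], [0, b, 0], [0, 0, c*sin(beta)]].
    """
    beta = math.radians(beta_deg)
    return [
        [1.0 / a, 0.0, -math.cos(beta) / (a * math.sin(beta))],
        [0.0, 1.0 / b, 0.0],
        [0.0, 0.0, 1.0 / (c * math.sin(beta))],
    ]


def max_abs_difference(m1, m2):
    """Largest element-wise absolute difference between two 3x3 matrices."""
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))


def bond_length(name1, name2):
    """Distance in Å between two of the atoms listed in ATOMS."""
    return math.dist(ATOMS[name1], ATOMS[name2])


if __name__ == "__main__":
    computed = monoclinic_scale(A, B, C, BETA_DEG)
    print("largest SCALE mismatch:", max_abs_difference(computed, SCALE_FROM_FILE))
    print("N-CA distance (Å):", round(bond_length("N", "CA"), 3))
    print("CA-C distance (Å):", round(bond_length("CA", "C"), 3))
```

The computed matrix agrees with the printed SCALE records to the printed precision, and the two distances come out near 1.47 Å and 1.57 Å, in the range expected for a peptide backbone in a 2.4 Å structure.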
Figure panels (PQS size distribution):
■ a) Structures of: the PDZ domain of dishevelled; CheA, a dimeric multidomain bacterial signalling molecule; aquaporin, which serves as a transmembrane water channel; and the 70S ribosome, the molecular machine for protein biosynthesis.
■ b) Distribution of the size of the entries in the PQS database. The structures of large complexes are underrepresented, given an estimated average size of a yeast complex of

Structural Biology: Types of Structural Information
■ Subunit and assembly structure: atomic or near-atomic resolution (≤ 3 Å)
■ Subunit and assembly shape: density or surface envelope at a resolution > 3 Å
■ Subunit-subunit contact: protein pairs in contact with each other (in some cases the face involved in the contact)
■ Subunit proximity: proteins close to each other in the assembly, but not necessarily in direct contact
■ Subunit stoichiometry: number of each subunit in the assembly
■ Assembly symmetry: symmetry of the arrangement of subunits in the assembly
■ Alert: part of this information is extremely difficult to obtain

Macromolecular “Structural” Techniques (1)
[Figure: techniques tabulated against the six types of structural information listed above.]
■ Grey boxes: extreme difficulty in obtaining the corresponding information

Macromolecular “Structural” Techniques (2)

Complementary Aspects
■ Structural data from the different structure-determination techniques each describe macromolecules and their complexes in a distinctive way, at different levels of detail.
■ Some techniques are highly complementary. Example: SAXS complements other analysis techniques (Xtal, EM, analytical ultracentrifugation).
■ Complementarity enables multi-resolution approaches.

Hybrid Approaches to Structure Determination of Macromolecular Complexes
■ a) Integration of a diverse set of structures, varying in reliability and resolution, into a hypothetical hybrid assembly structure.
■ b) Hybrid assembly of the 80S ribosome from yeast. Left: superposition of a comparative protein structure model for a domain in protein L2 from Bacillus stearothermophilus with the actual structure (1RL2). Right: a partial molecular model of the whole yeast ribosome, calculated by fitting atomic rRNA (not shown) and comparative protein structure models (ribbon representation) into the electron density of the 80S ribosomal particle.

Structural Biology Complements the Approaches of Molecular Biology

3D Structure to Function

3D Structure to Function: Possibilities
■ The atomic structure reveals the overall organization of the protein chain in three dimensions. From this we can identify:
◆ the residues that are buried in the core or exposed to solvent on the protein surface
◆ the shape and molecular composition of the surface
◆ the relative juxtaposition of individual groups
◆ the quaternary structure of the protein in the crystal environment or in solution at high concentration

3D Structure to Function: Complexes
■ Protein-ligand complexes are perhaps the most useful for functional information. They reveal:
◆ the nature of the ligand and where it is bound
◆ the disposition of residues in the active site, from which a catalytic mechanism may be postulated (for enzymes)
■ Classically, structures of such complexes have been determined by design, for example by adding the appropriate ligand to the crystallization medium.
■ In structural genomics, where the ligand is unknown, several examples have already been documented in which the structure has inadvertently included a ligand from the

3D Structure to Function: Biological Function
■ Structural data usually only carry information about the biochemical function of the protein.
■ Its biological role in the cell or organism is much more complex, and additional experimental information is needed in order to elucidate it.
■ In the search to determine biological function, some clues to the biochemical function of the protein will guide the choice of appropriate experiments to extend the structure-based functional predictions.

Structural Biology as a Source of Data for Organising and Understanding Biological Data
[Figure: two axes. Depth (rational drug design, physics): gene sequence, gene finding, protein sequence, structure prediction, protein structure, geometry calculation, protein surface, molecular simulation, force field, structure docking, ligand complex. Breadth (homologs, large-scale surveys, informatics), by number of molecules compared: 1; 2: pairwise comparison, sequence and structure alignment; 3-100: multiple alignment, patterns, templates, trees; 100+: databases, scoring schemes, consensus.]

3D Structure to Function: How to Infer Functional Knowledge?
■ Comparison of the protein fold, or of structural motifs within a protein, to the structural databases may reveal similarities from which biochemical and biological functional information may be inferred.
■ As well as global similarities, it is sometimes possible to identify local structural motifs that capture the essence of the biochemical function and can be used to assign function. These may result from divergent or convergent evolution.

Convergent Evolution
Trypsin and chymotrypsin evolved divergently, sharing the same fold and active site. Subtilisin evolved convergently with respect to the other two: it has the same catalytic triad but a completely different fold.

3D Profile: Aspartic Proteases
[Figure: true hits vs false hits.]
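
The trypsin/subtilisin case suggests comparing local motifs independently of the overall fold. One simple, superposition-free way to do so is to compare the matching inter-atom distances of two motifs. The Python sketch below uses made-up coordinates (not taken from any PDB entry) and hypothetical function names; it only illustrates the idea and is not the 3D-profile method named on the last slide.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Distances between every pair of points, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def distance_rmsd(motif_a, motif_b):
    """RMS difference between corresponding inter-atom distances of two motifs.

    Atoms must be listed in the same order in both motifs (e.g. Ser, His, Asp).
    The value does not change when a motif is rotated or translated.
    """
    da, db = pairwise_distances(motif_a), pairwise_distances(motif_b)
    if len(da) != len(db):
        raise ValueError("motifs must contain the same number of atoms")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


# Hypothetical three-atom motifs, coordinates in Å (illustration only).
triad_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
# Same geometry as triad_a, rotated 90 degrees about z and shifted by (10, 5, -2).
triad_b = [(10.0, 5.0, -2.0), (10.0, 8.0, -2.0), (6.0, 8.0, -2.0)]
# A differently shaped motif (twice the size of triad_a).
triad_c = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (6.0, 8.0, 0.0)]

if __name__ == "__main__":
    print("a vs b:", distance_rmsd(triad_a, triad_b))
    print("a vs c:", distance_rmsd(triad_a, triad_c))
```

Because only internal distances are compared, two motifs embedded in unrelated folds can score as similar, which is the situation described for convergently evolved active sites. Note that a distance-based score of this kind cannot distinguish a motif from its mirror image.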