Crystal Structure of the Extracellular Protein Secretion NTPase EpsE
doi:10.1016/j.jmb.2003.07.015
J. Mol. Biol. (2003) 333, 657–674

Crystal Structure of the Extracellular Protein Secretion NTPase EpsE of Vibrio cholerae

Mark A. Robien1, Brian E. Krumm1,2, Maria Sandkvist3 and Wim G. J. Hol1,2*

1 Departments of Biochemistry and Biological Structure, Biomolecular Structure Center, University of Washington, P.O. Box 357742, Seattle, WA 98195, USA
2 Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
3 American Red Cross Holland Laboratory, Department of Biochemistry, Rockville, MD 20855, USA

Type II secretion systems consist of an assembly of 12–15 Gsp proteins responsible for transporting a variety of virulence factors across the outer membrane in several pathogenic bacteria. In Vibrio cholerae, the major virulence factor cholera toxin is secreted by the Eps Type II secretion apparatus consisting of 14 Eps proteins. One of these, EpsE, is a cytoplasmic putative NTPase essential for the functioning of the Eps system and a member of the GspE subfamily of Type II secretion ATPases. The crystal structure of a truncated form of EpsE in nucleotide-liganded and unliganded states has been determined, and reveals a two-domain architecture with the four characteristic sequence “boxes” of the GspE subfamily clustering around the nucleotide-binding site of the C-domain. This domain contains two C-terminal subdomains not reported before in this superfamily of NTPases. One of these subdomains contains a four-cysteine motif that appears to be involved in metal binding, as revealed by anomalous difference density. The EpsE subunits form a right-handed helical arrangement in the crystal with extensive and conserved contacts between the C and N domains of neighboring subunits. Combining the most conserved interface with the quaternary structure of the C domain in a distant homolog, a hexameric model for EpsE is proposed which may reflect the assembly of this critical protein in the Type II secretion system.
The nucleotide ligand contacts both domains in this model. The N2-domain-containing surface of the hexamer appears to be highly conserved in the GspE family and most likely faces the inner membrane, interacting with other members of the Eps system.

0022-2836/$ - see front matter © 2003 Elsevier Ltd. All rights reserved.

*Corresponding author
E-mail address of the corresponding author: hol@gouda.bmsc.washington.edu
Abbreviation used: gsp, general secretion pathway.

Keywords: Type II secretion system; GspE secretion ATPases; Type IV secretion system

Introduction

Many Gram-negative bacteria possess a sophisticated multiprotein system for translocating periplasmic proteins across the outer membrane known as the Type II protein secretion system (T2SS), which is also referred to as the “general secretion pathway” (gsp) system.1 This machinery is responsible for the final stage of protein translocation from the periplasm to the extracellular space, while the initial stages of post-translational processing and transportation across the inner membrane are accomplished by the sec mechanism. In many pathogenic bacteria, such as Pseudomonas aeruginosa, Klebsiella pneumoniae, enterohemorrhagic and enterotoxigenic Escherichia coli (EHEC and ETEC), and Vibrio cholerae, key virulence factors are transported across the outer membrane by the T2SS.

The T2SS proteins in the human pathogen V. cholerae are encoded by the “extracellular protein secretion” (eps) genes.2–4 This V. cholerae T2SS mediates secretion of cholera toxin and several hydrolytic enzymes across the outer membrane. Quite remarkably, the ~86 kDa AB5 cholera toxin is translocated in a folded state across this membrane5,6 and is the primary agent responsible for the acute, life-threatening diarrhea that is the hallmark of cholera.7

Figure 1. Sequence and structural homologs of EpsE. (A) Representative aligned sequences of five subfamilies of ATPases within the Type II secretion family of ATPases are shown. These sequences are grouped by subfamily: the first group represents seven selected members of the GspE subfamily, including EpsE; the second group comprises three TFP ATPases from the PilB/HofB subfamily; the third group is the TFP ATPase Vibrio cholerae TcpT, which is not closely related to members of other subfamilies in the Planet analysis;17 the fourth group comprises five selected members of another large TFP ATPase subfamily, the PilT/PilU subfamily; and the fifth group comprises representatives of the ComG1 subfamily of presumptive ATPases involved in competence (DNA uptake) in Gram-positive bacteria. Residues shown in highlighted text are identical within the subfamily sequences shown; TcpT residues are highlighted based on identity to EpsE, since there are no other subfamily members for comparison. Secondary structure symbols above the EpsE sequence represent the observed secondary structure in our experimental model; shaded portions of the alignment represent the four distinguishing sequence motifs common to the broad superfamily of ATPases that includes the T2SS, TFP and ComG1 ATPases as well as the T4SS ATPases. Domains and subdomains are shown by colored bars above and below the five groups of sequences. The upright and inverted triangular symbols identify residues involved in the intersubunit interface of our experimental model, with dark triangular symbols reserved for residues which are identical in over 95% of the known GspE subfamily members. The alignment shown was chosen from among those produced by Planet,17 which were prepared using CLUSTALX without experimental information about the tertiary structure of EpsE. Manual inspection of this alignment together with our experimentally determined tertiary structure uncovers the previously obscured presence of a CM-like tetracysteine motif in the TcpT sequence, in addition to the GspE and PilB/HofB subfamilies. (B) Structure-based sequence alignment of EpsE and HP0525, a member of the VirB11 subfamily of the Type IV secretion system ATPases. Experimentally determined secondary structure is depicted above the corresponding sequence. Residues highlighted in red show instances where the DALI superposition places an identical amino acid at corresponding positions in the alignment. Shaded boxes show the position of the four characteristic sequence motifs found in the Type II/Type IV ATPases. The colored bar at the top of the block of sequences shows the boundaries of the domains identified from the experimental EpsE structure. Residues marked with an asterisk make close contact with the bound nucleotide in the structures; black circles indicate residues implicated in NTPase activity by proximity to the nucleotide and structural homology to similar residues in other NTPase structures. Red circles are placed at the proposed positions of N2-nucleotide contacts in a putative HP0525-like “closed” conformation. Upright triangles mark the positions of the observed intersubunit N:C′ contacts in each protein; inverted triangles mark the residues involved in the C:C′ interfaces. This Figure was prepared using ESPript 2.0.53

The typical T2SS consists of 12–15 different proteins. In V. cholerae there are 14: designated EpsA to EpsN8 and VcpD/PilD.9,10 The generic names of the orthologs of these gsp proteins in other species are GspA to GspN and GspO. The focus of this article is the “E” component of the T2SS from V. cholerae (Figure 1). This cytosolic EpsE protein is associated with the cytoplasmic face of the inner membrane of V.
cholerae, forming a complex with at least two bitopic inner membrane T2SS proteins, EpsL and EpsM.11,12 Several reports have established that other GspE proteins form similar complexes in other species.13–15 Furthermore, additional inner membrane components including GspC, GspF, and GspG have been found to interact with the GspE-L-M subcomplex.13,15,16

EpsE and its T2SS homologs in the GspE subfamily of proteins belong to a large superfamily of “Type II/IV Secretion NTPases” which have been analyzed by Planet et al.17 These authors17 divided 148 genes from this superfamily into a “Type II family” and a “Type IV family”, which are each further subdivided into subfamilies according to characteristic amino acid sequence patterns. The GspE subfamily to which EpsE belongs represents a set of closely related putative ATPases involved in Type II secretion. It should be noted, however, that other subfamilies in the “Type II family” identified by Planet are not components of Type II Secretion Systems. Many are, instead, components involved in Type 4 pilus (TFP) biogenesis. Although TFP assemblies and the T2SS multi-protein complexes are distinctly different, some components of these machineries are homologous,18 particularly the TFP and GspE ATPases.1,17 TFP ATPases comprise two subfamilies exhibiting 30–50% amino acid sequence identity with the GspE subfamily. In contrast, members of the GspE and TFP ATPases are only distant homologs of the “Type IV family” of secretion ATPases as classified by Planet et al.17 Another subfamily of the “Type II family” of secretion NTPases, ComG1, has yet another function and is associated with a bacterial competence mechanism responsible for DNA uptake through a multi-protein complex.19 Representatives of the subfamilies of the “Type II family” of secretion NTPases are shown in the family sequence alignment of Figure 1(A).
This article describes the three-dimensional structure of EpsE, the first member of the entire “Type II family” of secretion NTPases whose crystal structure has been solved. The structure of a Type IV Secretion NTPase has been reported previously: HP0525,20 a member of the VirB11 subfamily in the classification of Planet et al.17 This ATPase from Helicobacter pylori is a component of a Type IV Secretion System involved in injecting the H. pylori CagA protein into gastric epithelial cells.21 The 2.5 Å structure of HP0525 revealed a two-domain protein with ADP bound to the anticipated nucleotide-binding site (Yeo et al.).20 Comparison of the EpsE and HP0525 amino acid sequences using BLAST22 revealed a sequence identity of ~32% confined to 113 amino acid residues in the C-terminal domain of the 330-residue HP0525 ATPase. No sequence homology between EpsE and HP0525 was detected for either the 250 N-terminal residues or for the remaining 130 C-terminal residues of EpsE.

Here, we (i) report the crystal structure of an N-terminally truncated version of V. cholerae EpsE in liganded and unliganded states; (ii) compare these structures with those of the closest, yet distant, relatives; (iii) analyze the effects of published mutagenesis studies in the GspE subfamily of ATPases; and (iv) propose a model for a hexameric arrangement of EpsE which may reflect features of the GspE ATPases when functioning in the Type II secretion systems. Knowledge of the EpsE structure provides a platform for the design of inhibitors of this subfamily of secretion ATPases. Given that EpsE and its subfamily members are essential for the functioning of the Type II Secretion Systems in a group of pathogens of major medical relevance,8 interfering with the functioning of these enzymes could significantly diminish the extracellular transport of virulence factors and consequently decrease the severity of several important bacterial diseases.
Results

Structure of the monomer

The crystal structures of both unliganded and liganded EpsEΔ90, an N-terminal deletion variant of hexahistidine-tagged EpsE, were solved in space group P6₁22 using SeMet SAD methods, followed by model building and refinement to resolutions of 2.5 Å and 2.7 Å, respectively (Table 1). Twelve heavy atom sites were found by SOLVE.23 There are 12 methionine residues in the sequence of EpsEΔ90, but the selenium of the N-terminal selenomethionine was not detectable. One of the 12 sites appeared to be a metal-binding site.

The EpsEΔ90 subunit (Figure 2(A)) is a two-domain protein composed of an N2 domain and a C domain connected by a 15-residue linker. The C domain is further subdivided into C1, CM and C2 subdomains (Figure 2(B)). This nomenclature implicitly allows the deleted 90 residues to be referred to as the N1 domain. The N2 domain comprises residues 101–225 and consists of a six-stranded antiparallel β sheet forming one concave face of the domain, and helices αA, αB and αC, which comprise the convex face of N2. The C1 domain exhibits a topology with a central β sheet composed of six parallel strands with a seventh antiparallel β15 strand; three and four helices flank the two faces of this sheet. Consultation of the SCOP database24 of protein structures reveals that this topology is similar to that observed in ABC transporter ATPases, AAA ATPases, and other RecA-like proteins, but differs in strand order or the presence and position of the antiparallel strand from other classes of P-loop NTPases.

Figure 2. Structure of the EpsE monomer. (A) EpsE ribbon structure with domains colored as follows: N2, cyan; C1, dark blue; CM, yellow; C2, green. Due to loops with weak electron density, some residues between αA and αB, β4 and β5, and αJ and β14 could not be incorporated into the final structure. Hence, these loops are shown with dotted lines. The position of a bound molecule of AMPPNP observed in the 2.7 Å dataset is indicated. The positions of 11 selenomethionine residues and the metal site identified by SOLVE are also shown, together with the anomalous Fourier map contoured at +4σ. This Figure and Figures 3(A) and (B), 4, 5(A) and (B), and 6(A)–(C) were generated with the programs MOLSCRIPT54 and Xtalview,49 and rendered with Raster3D.55 (B) Surface representation of the EpsE monomer and bound AMPPNP, showing the positions of the domains and subdomains, with N2 colored cyan; C1, dark blue; CM, yellow; and C2, green. The small ~300 Å² contact between the N2 domain (cyan) and CM subdomain (yellow) is seen in the foreground. The surface of the linker residues (white) is visible behind the nucleotide, AMPPNP.

Table 1. Data reduction and refinement statistics

                                      Unliganded       Liganded
Wavelength (Å)                        0.9748           0.9791
Resolution (Å)                        60–2.50          60–2.70
Unique reflections                    18,982           15,414
Total reflections                     241,296          160,321
Rsym (last shell)                     10.2 (65.3)      15.2 (57.7)
Rpim (last shell)                     3.0 (18.6)       4.6 (16.7)
I/σ (last shell)                      18.6 (4.3)       11.1 (3.8)
Multiplicity                          12.71            10.40

Refinement
Number of protein atoms               3045             3034
Number of water molecules             111              25
Number of metal ions                  1                1
Number of other atoms                 1 (Cl⁻)          31 (AMPPNP)
Resolution (Å)                        50–2.5           50–2.7
Rwork/Rfree                           21.4/26.5        23.8/28.4

RMS deviations from ideal geometry
Bond lengths (Å)                      0.011            0.010
Bond angles (°)                       1.374            1.348
Chirality                             0.090            0.089

Ramachandran analysis
Most favorable                        302 (92.1%)      300 (89.8%)
Allowed                               22 (6.7%)        29 (8.7%)
Generously allowed                    2 (0.6%)         3 (0.9%)
Disallowed                            2 (0.6%)         2 (0.6%)
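Two of the derived quantities in Table 1 can be recomputed from other entries in the same table. The short Python sketch below is a reader's cross-check and not part of the authors' workflow; it assumes that multiplicity is the ratio of total to unique reflections and that the Ramachandran percentages are taken over the sum of the four categories. The dictionary and function names are illustrative.

```python
# Values transcribed from Table 1; names are illustrative, not from the paper.
TABLE1 = {
    "unliganded": {"unique": 18982, "total": 241296, "multiplicity": 12.71,
                   "rama_counts": [302, 22, 2, 2],
                   "rama_pct": [92.1, 6.7, 0.6, 0.6]},
    "liganded": {"unique": 15414, "total": 160321, "multiplicity": 10.40,
                 "rama_counts": [300, 29, 3, 2],
                 "rama_pct": [89.8, 8.7, 0.9, 0.6]},
}


def multiplicity(total, unique):
    """Average number of observations per unique reflection (assumed definition)."""
    return total / unique


def percentages(counts):
    """Percentage of residues in each Ramachandran category, one decimal place."""
    n = sum(counts)
    return [round(100.0 * c / n, 1) for c in counts]


if __name__ == "__main__":
    for name, row in TABLE1.items():
        m = multiplicity(row["total"], row["unique"])
        print(name, round(m, 2), percentages(row["rama_counts"]))
```

Under these assumptions the recomputed values can be compared directly with the multiplicity and percentage entries printed in the table.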
The C1 domain, made up of residues 240–392 and 442–450, contains all four characteristic sequence motifs previously identified in all members of the T2SS and T4SS subfamilies:1,17,25 the “Walker A”, “Walker B”, “Asp” and “His” boxes. The small CM subdomain, containing residues 393–441, is a hairpin-like meandering loop with a conserved tetracysteine motif that binds a metal cation near the sharply bent proximal end of the loop. The CM subdomain protrudes from the C1 domain, resulting in effectively no contacts between these domains. The C2 subdomain, spanning the C-terminal residues 451–500, is spatially interposed between the C1 and CM subdomains, and consists of four short helical segments arranged in a single layer along the convex face of the large C1 domain (Figure 2(A)). The interface between the C2 and CM subdomains, which buries ~1230 Å² of surface area, features several residues, including a salt bridge between Arg394 and Glu495, which are strictly conserved within the GspE subfamily (Figure 1(A)), and several hydrogen bonds between the loops composed of residues 393–396 and 489–495. Contacts between the N2 domain and the C domains within a single subunit are quite limited, with a buried interdomain surface of less than 300 Å² between the β2–β3 and β14–β15 loops (Figure 2(B)) in the structures from both liganded and unliganded crystals.

Nucleotide-binding site

One crystal, co-crystallized in the presence of 10 mM AMPPNP at 14 °C and solved at 2.7 Å resolution (Table 1), has the active site occupied by a molecule of AMPPNP in the anti conformation. The AMPPNP ligand has well defined density (Figure 3(A)), and makes contacts solely with residues belonging to the C1 domain or the linker (Figures 2(C) and 3(B)). The protein structures of unliganded EpsEΔ90 and liganded EpsEΔ90 are very similar (rmsd, 0.3 Å for 377 Cα atoms).
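The domain and subdomain boundaries quoted in the monomer description can be collected into a small lookup, which is handy when reading the residue-level discussion that follows. This is a minimal Python sketch using only the residue ranges stated in the text; the names `DOMAINS` and `domain_of` are illustrative. Residue 240 appears in both the linker and the C1 ranges as printed, so the sketch simply returns the first match.

```python
# Inclusive residue ranges as stated in the text. Residue 240 is listed in
# both the linker (226-240) and C1 (240-392); the first match wins here.
DOMAINS = [
    ("N2", [(101, 225)]),
    ("linker", [(226, 240)]),
    ("C1", [(240, 392), (442, 450)]),
    ("CM", [(393, 441)]),
    ("C2", [(451, 500)]),
]


def domain_of(residue):
    """Return the (sub)domain name for a residue number, or None if unassigned."""
    for name, ranges in DOMAINS:
        if any(lo <= residue <= hi for lo, hi in ranges):
            return name
    return None


if __name__ == "__main__":
    for res in (156, 234, 270, 397, 495):
        print(res, domain_of(res))
```

Residues 1–90 fall in the deleted region (the implied N1 domain) and are returned as unassigned by this lookup.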
We did not find structural evidence to suggest that the presence of bound nucleotide leads to a significant conformational change, such as a change in the spatial relationship between the N2 and the C domains.

The adenyl moiety of the AMPPNP makes hydrogen bonds with the backbone carbonyls of Leu239 and Arg441. The side-chains of Leu234 and Leu239 make additional hydrophobic contacts with the adenyl ring. The ribose moiety of AMPPNP makes hydrogen bonds with the side-chains of Thr232 and Arg441. In contrast to the adenyl and ribose moieties, the phosphate tail of the nucleotide forms an extensive network of close contacts with EpsE (Figure 3(B)). Oxygen atoms of the α-phosphate form hydrogen bonds with the backbone amide groups of Gly269, Lys270, Ser271, Thr272 and the side-chain of Thr272. Oxygen atoms of the β-phosphate hydrogen bond with the backbone amide groups of Ser268, Gly269, Lys270, Ser271 and the side-chain of Lys270. The oxygen atoms of the γ-phosphate form hydrogen bonds with the backbone amide of Gly267 and the side-chains of Thr266 and Lys270. The phosphates are located at the N terminus of helix α, allowing for a favorable interaction of the charge with the helix dipole.26

The protein surface in the vicinity of the nucleotide-binding site forms a binding groove with a bowl-shaped depression surrounding the position of the γ-phosphate. The Walker A, Walker B, Asp, and His boxes contribute key residues that form the walls of this bowl (Figure 3(D)). The Walker A box is responsible for forming an extensive hydrogen-bonding network with the phosphate tail. For each of the remaining three boxes, the protein surface features a prominently exposed residue that likely plays a role in the putative NTPase activity of EpsE. The Asp box contains an exposed Glu296 side-chain that is positioned within 5.5 Å of an oxygen of the γ-phosphate and 3.9 Å of a terminal oxygen of the β-phosphate.
The Walker B box features the Glu334 side-chain within 5.3 Å of the γ-phosphate. In the His box, an imidazole nitrogen of His359 is 6.9 Å from the nearest phosphate oxygen atoms of the AMPPNP.

Metal-binding site in the CM domain

A metal ion is found tetrahedrally coordinated by the Sγ of four cysteine residues, Cys397, Cys400, Cys430 and Cys433. Regarding the nature of the metal, crystallography cannot discriminate readily between different divalent cations. However, the cysteine-coordinated metal in EpsE is tentatively modeled as zinc for the following reasons. First, the vast majority of tetracoordinated metal sites with four ligating cysteine residues contain zinc. Consultation of the MDB database27 of metalloprotein sites from the PDB discloses 172 structures with a total of 300 metal-containing Cys4 sites. Of these 300 sites, 264 contain Zn as the metal, 31 contain Fe, and 2 contain Ga. The remaining three instances contain Cu, Cd, and Ni, respectively. Secondly, at the wavelengths we employed, Zn has an anomalous signal of 2.5 electrons, Se has an anomalous signal of over 3.8 electrons, and Fe, Ni, Co, Cu, and Cd have anomalous signals of between 1.5 and 2.2 electrons. Nonetheless, the anomalous peaks in the anomalous difference Fourier maps at the metal sites are 8.3σ and 11.8σ, in the liganded and unliganded EpsE structures, respectively, which is between 59% and 76% of the mean anomalous peak heights for the selenium sites. The ratio between the anomalous peak heights is thus more consistent with Zn than the other metals, in particular Fe. Construction of the model in the area of the metal was complicated by relatively weak electron density resulting in high B-factors, particularly for residues 427 through 437.
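The two arguments for zinc can be restated numerically. The Python sketch below uses only the figures quoted above and rests on one simplifying assumption that the text does not state explicitly: that anomalous peak height scales roughly with the anomalous signal of the scatterer, with comparable occupancy and B-factors at the metal and selenium sites. Because the Se signal is quoted as "over 3.8 electrons", the computed ratios are upper bounds. All names are illustrative.

```python
# Anomalous signals in electrons, as quoted in the text.
F_DOUBLE_PRIME = {"Se": 3.8, "Zn": 2.5}   # Se is a lower bound ("over 3.8")
OTHER_METALS_F2 = (1.5, 2.2)              # range quoted for Fe, Ni, Co, Cu, Cd
OBSERVED_RATIO = (0.59, 0.76)             # metal peak / mean Se peak, from the text

# Cys4 site counts from the MDB survey quoted in the text.
CYS4_SITES = {"Zn": 264, "Fe": 31, "Ga": 2, "Cu": 1, "Cd": 1, "Ni": 1}


def expected_ratio(f2_metal, f2_se=F_DOUBLE_PRIME["Se"]):
    """Expected metal/Se peak ratio, ASSUMING peak height scales with the signal."""
    return f2_metal / f2_se


def in_observed_range(ratio, observed=OBSERVED_RATIO):
    """True if a predicted ratio falls inside the observed 59-76% window."""
    lo, hi = observed
    return lo <= ratio <= hi


def site_fraction(metal):
    """Fraction of surveyed Cys4 sites that contain the given metal."""
    return CYS4_SITES[metal] / sum(CYS4_SITES.values())


if __name__ == "__main__":
    print("Zn ratio:", round(expected_ratio(F_DOUBLE_PRIME["Zn"]), 3))
    print("other metals:", [round(expected_ratio(f), 3) for f in OTHER_METALS_F2])
    print("Zn share of Cys4 sites:", round(site_fraction("Zn"), 2))
```

Under this assumption the predicted Zn ratio falls inside the observed window, while the largest ratio predicted for the other listed metals falls just below it, which matches the qualitative conclusion in the text.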
The elongated shape of the metal in the anomalous difference Fourier (Figure 4) suggests several positions for the metal coupled with multiple conformations of the ligands and surrounding residues, explaining the unclear electron density in the neighborhood of the metal ion.

Figure 4. Metal site. Metal coordinated by the Sγ of the four surrounding cysteine residues in the CM subdomain. Orange contours represent the anomalous difference Fourier map contoured at +4σ. This dataset was collected at a wavelength of 0.9748 Å, slightly above the Se edge.

Linker

In both liganded and unliganded EpsE structures, a linker of 15 residues, composed of residues 226–240, connects the N2 and C1 domains. The linker makes few contacts with either of these domains, with only five hydrogen bonds and ~10 additional contacts of less than 3.5 Å. The limited contacts between the N and C domains in the EpsE monomer demonstrate that the two domains of EpsE are quite independent entities if one considers a single subunit. This becomes entirely different once subunit–subunit interactions are taken into account, as shown in the next section.

Quaternary structure

Of the 12 subunits per unit cell of EpsEΔ90 (Figure 5(A)), six can be found within a single helical filament. Another six form a second antiparallel filament, related to the initial filament by a crystallographic 2-fold axis perpendicular to the 6₁ axis. Together, the two antiparallel helical filaments form a hollow cylinder with a central solvent-filled space running along the central axis (not shown). This cylinder has an inner diameter of ~37 Å and an outer diameter of ~105 Å. The ~710 Å² interface buried between contacting subunits from antiparallel helical filaments within this cylinder is quite small. An even smaller interface of ~420 Å² is buried between subunits from adjacent cylinders. The residues comprising these interfaces are not conserved within the GspE subfamily. These interfaces between antiparallel filaments or adjoining cylinders therefore are not likely to be relevant for the in vivo conformation of the GspE proteins.

A single helical filament generated by the 6-fold screw axis has a rise of 27 Å per subunit, six subunits per turn and extensive intermolecular contacts (Figure 5(B)). The extensive intersubunit contacts are due to the C domains of one monomer being positioned into the large space between the N2′ and C′ domains of an adjacent monomer, with the primed notation signifying a domain in the adjacent monomer within a helical strand. This arrangement buries a total of ~3240 Å² of surface area between neighboring subunits. The C1:N2′ interaction is composed of ~1800 Å² of buried surface area, which is ~12% of the total C1′ + N2 surface area. We will refer to this as the C:N′ interface (orange ellipse, Figure 5(B)). The C1 + C2:CM′ + C2′ interaction buries ~1440 Å², which is ~8% of the total surface area of these domains (magenta ellipse, Figure 5(B)). This interface will be referred to as the C:C′ contacts.

Examination of the sequence conservation of these two interfaces discloses marked differences between them. The large C:N′ interface is composed of 28 interacting residues including four Arg–Asp salt bridges and a total of 19 hydrogen bonds between highly conserved residues within the GspE subfamily. A total of 23 of these 28 residues are strictly conserved within the selected GspE subfamily sequences shown in Figure 1(A). Of these, 13 are also strictly conserved within the PilB/HofB subfamily, the closest homologous TFP ATPase subfamily.17 In contrast, the C:C′ interface is composed of 24 residues with seven hydrogen bonds.
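The helical parameters quoted above (a 6-fold screw axis with a rise of 27 Å per subunit and six subunits per turn) fix the geometry of one filament. The Python sketch below applies that screw operation to an arbitrary point and sums the two buried interface areas; it is an illustration under stated assumptions, not the authors' procedure. The helix axis is placed along z, the test point (10, 0, 0) is hypothetical, and the pitch of 162 Å is derived here as 6 × 27 Å rather than quoted from the text.

```python
import math

SUBUNITS_PER_TURN = 6      # six subunits per turn of the 6-fold screw axis
RISE_PER_SUBUNIT = 27.0    # Å per subunit, as quoted in the text

# Buried surface areas (Å^2) of the two interfaces between neighboring subunits.
BURIED = {"C:N'": 1800.0, "C:C'": 1440.0}


def screw_transform(point, n):
    """Apply n steps of the screw operation about the z axis to (x, y, z).

    Each step rotates by 360/6 = 60 degrees (counter-clockwise when viewed
    down the axis) and translates by the rise, giving a right-handed helix.
    """
    x, y, z = point
    angle = n * 2.0 * math.pi / SUBUNITS_PER_TURN
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z + n * RISE_PER_SUBUNIT)


def pitch():
    """Axial advance per full turn, derived from the rise and subunit count."""
    return SUBUNITS_PER_TURN * RISE_PER_SUBUNIT


def total_buried():
    """Sum of the two interface areas between neighboring subunits."""
    return sum(BURIED.values())


if __name__ == "__main__":
    start = (10.0, 0.0, 0.0)  # hypothetical point, for illustration only
    for n in range(SUBUNITS_PER_TURN + 1):
        print(n, tuple(round(v, 2) for v in screw_transform(start, n)))
    print("pitch (Å):", pitch(), "total buried (Å^2):", total_buried())
```

After six steps the point returns to its starting x and y coordinates, displaced along the axis by one pitch, and the two interface areas add up to the ~3240 Å² total given in the text.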
Only 11 of the 24 interacting residues are highly conserved even when considering only the closely related members within the GspE subfamily (Figure 1(A)). The sequence conservation and the large relative buried surface area suggest that the C:N′ interface is likely of physiological relevance, as will be discussed later. The in vivo relevance of the C:C′ contacts is less clear.

Figure 3. Active site of EpsE with bound AMPPNP. (A) The electron density of AMPPNP, contoured at +1σ, of a sigma-A weighted 2mFo − DFc map. (B) Stereo view of the model at the active site. Residues in orange are suspected of playing a role in enzyme function. This includes one strictly conserved residue from each of the four distinctive sequence motifs found in the Type II/Type IV secretory NTPase superfamily, i.e. Thr266 (Walker A), Glu296 (Asp box), Glu334 (Walker B), and His359 (His box) (Figure 1(A)). Shown in red dotted lines are hydrogen bonds or salt bridges between the protein and the AMPPNP. Other EpsE residues with bold bonds are strictly conserved in the GspE subfamily of ATPases; residues in lighter bonds are less conserved. (C) Schematic representation of EpsE interacting with bound AMPPNP. (D) Surface of the nucleotide-binding site, with residues of the Walker A, Asp box, Walker B, and His boxes (Figure 1(A)) colored cyan, green, blue and magenta, respectively.

Figure 5. Quaternary structure of the experimental EpsE. (A) Van der Waals representation of 12 subunits of EpsE contained in the unit cell, viewed along a crystallographic dyad perpendicular to the 6₁ axis. (B) Adjacent subunits of EpsE within a helical strand, showing the two major interfaces. The C:N′ interface (magenta ellipse) buries approximately 1800 Å² of surface area. The C:C′ interface (orange ellipse) buries approximately 1440 Å² of surface area.

DALI results: structurally similar proteins

A DALI28 search was conducted using the core portions of the N2 and C1 domains to find structurally homologous proteins. HP0525 was the top scoring homolog in both searches. No other proteins were found which contained high-scoring structural homologs to both of the EpsE domains.

A total of 71 proteins homologous to the N2 domain of EpsE were reported by DALI. Only two of these had a Z-score greater than 4. HP0525 was the top scoring homolog of the EpsE N2 domain with a Z-score of 6.8. HP0525 exhibits an rms deviation of 3.3 Å for 92 aligned residues out of 104 residues in the EpsE N2 domain, with a sequence identity of 8%. Residues that are identical in this alignment of the N domains of EpsE and HP0525 are widely dispersed through the N domains. The overall topologies of the N2 domain of EpsE and the N domain of HP0525 are clearly similar (Figure 6(A)), but the positions of the initial helices of the two proteins are quite different. The initial kinked helix of HP0525 juts out from the central globular portion of the N domain on the left side of this diagram, while the small initial helical segment of EpsEΔ90 is found on the right side of this diagram. Hence, the initial EpsEΔ90 helix has no counterpart in HP0525. This initial helix of EpsE, spanning residues Phe101 through Glu107, is separated from the bulk of the N2 domain by 13 disordered residues, raising the possibility that the initial helix of EpsE could be part of a separate N1 domain truncated by the deletion of the first 90 residues of EpsE.

DALI returned a total of 173 proteins homologous to the core C1 domain. None of the homologs contain structural elements with recognizable similarity to the CM or C2 subdomains. Many of these are relatively distant structural homologs, as only 47 of the proteins have a Z-score greater than 4. With a Z-score of 19.2, H. pylori HP0525 is the top scoring homolog to the EpsE C1 domain.
The residues of HP0525 homologous to the EpsE C1 domain are all contained within the C domain of HP0525, with an rms deviation of 2.0 Å for 155 Cα atoms, out of 161 in the EpsE C1 domain, with a sequence identity of 21%.

Figure 6. Superposition of EpsE with its homolog HP0525. (A) Superposition of the N2 domain (cyan) of EpsE and the N domain (light red) of the structure of HP0525. Dotted lines are shown in positions where the EpsE sequence could not be placed into electron density. (B) Superposition of the C domains of EpsE and the C domain (light red) of HP0525. The C1 domain of EpsE is dark blue, the CM domain yellow, and the C2 domain green. The position of ADP bound to HP0525 is red and the AMPPNP bound to EpsE is cyan. Dotted lines connect residues of the CM domain bridging residues 415–419 that were not modeled due to weak electron density. (C) Superposition of the experimental structure of EpsE (“open” configuration, dark blue), and the structure of HP0525 (“closed” configuration, light red). The C domains of the two structures are superimposed, in order to demonstrate the relative position of the corresponding N domains in the two proteins.

The superposition of the HP0525 C domain and the EpsE C1 domain (Figure 6(B)) shows the similarity of the topology between these chains. The ADP in HP0525 and the AMPPNP in the liganded EpsE structure both adopt the anti conformation, and occupy very similar positions with respect to the nucleotide-binding domains, with an rms deviation of 1.4 Å for 26 comparable atoms in the two ligands. By several measures, such as the sequence identity of 21% (versus 8%) and rms deviation of 2.0 Å (versus 3.3 Å), the C domain of HP0525 and the C1 domain of EpsE are more similar than the N domains of these two proteins. A major difference between the C domains of these proteins is that HP0525 lacks a structural counterpart to both the EpsE CM and C2 subdomains.
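The DALI comparison of the two domains can be tabulated side by side, adding the fraction of each EpsE domain that was aligned. This Python sketch uses only the numbers reported in the text; the coverage percentage is derived here and is not a value quoted by the authors, and the dictionary layout is illustrative.

```python
# DALI results for EpsE domains against HP0525, as reported in the text.
DALI = {
    "N2 vs HP0525 N": {"aligned": 92, "total": 104, "rmsd": 3.3,
                       "identity_pct": 8, "z_score": 6.8},
    "C1 vs HP0525 C": {"aligned": 155, "total": 161, "rmsd": 2.0,
                       "identity_pct": 21, "z_score": 19.2},
}


def coverage_pct(entry):
    """Percentage of the EpsE domain residues included in the alignment."""
    return 100.0 * entry["aligned"] / entry["total"]


def more_similar(a, b):
    """Return the name of the pair that scores better on rmsd, identity and Z-score.

    Returns None if the three measures do not all agree.
    """
    ea, eb = DALI[a], DALI[b]
    a_wins = (ea["rmsd"] < eb["rmsd"], ea["identity_pct"] > eb["identity_pct"],
              ea["z_score"] > eb["z_score"])
    if all(a_wins):
        return a
    if not any(a_wins):
        return b
    return None


if __name__ == "__main__":
    for name, entry in DALI.items():
        print(name, round(coverage_pct(entry), 1), "% aligned")
    print("closer pair:", more_similar("N2 vs HP0525 N", "C1 vs HP0525 C"))
```

All three reported measures point the same way, consistent with the statement that the C domains of the two proteins resemble each other more closely than the N domains do.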
The alignment of HP0525 and EpsE also discloses 12 residues, residues 290–301, which are found in HP0525 but neither in EpsE nor in other members of the GspE subfamily. In the quaternary structure of HP0525, these 12 residues form a constriction at the narrow opening of a hexameric ring.20

The mutual orientation of the N and C domains is markedly different in EpsE and HP0525. In HP0525, the two domains form a closed jaw in which both domains tightly interact with the bound ADP molecule found in the active site. Despite this close interaction with the ligand, direct interactions between the two domains of HP0525 are limited, as evidenced by the very small ~90 Å² buried interdomain surface area.20 In the EpsE structure, the N2 and C domains are in a more open configuration (Figure 6(C)). As a result, the N2 domain does not make any contacts with the AMPPNP bound to the C1 domain. Interestingly, HP0525 N domain residues implicated in binding ADP are either strictly conserved, such as Arg113 and Arg133, equivalent to Arg210 and Arg225 in EpsE, or conservatively substituted, such as Thr45 and Asn61 in HP0525 which are, respectively, equivalent to Ser140 and Asp158 in EpsE. All four of these positions are strictly conserved within the GspE subfamily. Despite the limited sequence identity between the N domains of HP0525 and EpsE, the sequence identity or conservative substitutions of residues participating in N domain–ADP contacts of HP0525 suggest that it is quite possible that EpsE is capable of adopting a similar closed jaw formation, as will be discussed further below.

The intersubunit relationships in the EpsE and HP0525 crystal structures are more diverse.
The large C:C′ interface between adjacent subunits in the “6₁-helix” of the EpsE helical filament (Figure 5(B)), which involves 24 residues and buries ~1440 Å², bears no significant structural similarity to the small ~450 Å² C:C′ interface in the HP0525 cylindrical hexamer.20 This is reflected in the non-overlapping positions of the inverted triangles in Figure 1(B), which represent residues participating in this C:C′ interface. On the other hand, the C:N′ interfaces of adjacent subunits are more comparable in buried surface area: ~1370 Å² in HP0525 and ~1800 Å² in EpsE. Fourteen out of 28 of the EpsE residues involved in the C:N′ interface structurally align with intersubunit interacting residues in HP0525 (Figure 1(B), upright triangles), showing that similar regions of the corresponding domains are involved in the largest intersubunit interface in both crystal structures. Nonetheless, the chemical nature of these intersubunit interactions is not strongly conserved. For example, a salt bridge in the HP0525 N:C′ interface between residues Glu47 and Arg240 is not conserved in EpsE. Conversely, salt bridges in the EpsE C:N′ interface, Arg156–Asp326, Arg156–Asp328, and Asp195–Arg324, are without counterparts in the HP0525 N:C′ interface. Thus, although many of the residues involved in C:N′ intersubunit contacts are structurally equivalent, the chemical nature of the interacting residues is not very well conserved.

Insights into catalysis based on homologous proteins

Analysis of common structural elements among previously studied homologous ATPases20,29–33 may provide insight into the ATPase activity of EpsE. The side-chains of Glu296 and Glu334 are prominent surface-exposed elements of the cavity adjacent to the γ-phosphate of AMPPNP in our structure (Figure 3(B)). Two corresponding acidic residues of RecA, Glu96 and Asp144, are found in similar locations.
In RecA, Glu96 is proposed to activate an attacking water during ATP hydrolysis and Asp144 participates in the coordination of the divalent cation cofactor, magnesium.²⁹ Diverse RecA-like ATPases, such as H. pylori HP0525,²⁰ bacteriophage T7 helicase,³⁰ and the AAA+ ATPase NSF,³¹,³² display a similar arrangement of acidic residues. Thus, in EpsE, this cavity suggests itself as the active-site cleft for the putative ATPase activity of EpsE.

His359, a strictly conserved residue of the His box in the GspE subfamily of secretory NTPases (Figure 1(A)), is also a surface exposed element of the cavity adjacent to the γ-phosphate. The structural alignment reported by the DALI server indicates that this residue is a histidine in seven of the top 20 structural homologs including HP0525. In the second highest scoring homolog to EpsE, the bacteriophage T7 helicase domain, a role for the homologous His465 has been proposed. In the T7 helicase, this histidine may act as a γ-phosphate sensor, with nucleotide hydrolysis promoting a conformational change in the position of this side-chain. As noted by Sawaya et al.,³⁰ similar conformational switching mechanisms have been proposed for RecA²⁹ and PcrA,³³ both of which are structural homologs to the C1 domain of EpsE. Although the distances from the imidazole nitrogen atoms to the nearest terminal oxygen of AMPPNP are ~7–8 Å in the current liganded EpsE structure (Figure 3(B)), a rotation of the solvent-exposed His359 side-chain about the χ1 dihedral angle could reduce this distance to as little as 2.7 Å (Figure 3(B)). Hence, it is plausible that His359 of EpsE may act as a phosphate sensor in GspE proteins.

EpsE structure and mutation studies

The EpsE structure allows the loss of function associated with several reported mutations to be more fully explained.
Mutations of the EpsE Walker A residue Lys270 are known to cause a marked loss of function in several T2SS and TFP systems.¹¹,¹⁶,²⁵,³⁴ As anticipated, the Lys270 side-chain amine interacts with the β-phosphate of the AMPPNP in our liganded structure. The Gly to Ala or Ser mutations at the position corresponding to EpsE Gly269 cause a loss of type II secretion in P. aeruginosa and the pullulanase secretion system of Klebsiella oxytoca,²⁵ respectively.³⁵ The backbone dihedral angles for this glycine are in the region of the Ramachandran plot with positive values for both the φ and ψ angles. This helps explain the preference for glycine over other residues at this position, as non-glycine residues are relatively infrequently observed with these backbone dihedral angles. A Thr to Ile mutation at the position corresponding to EpsE Thr273 also causes a temperature sensitive phenotype in P. aeruginosa.¹⁶ The contacts of the Oγ of Thr273 include the Oε and Nε of Gln390 and the main-chain carbonyl of Gly269. A Thr273Ile mutation would abolish this network of favorable hydrophilic interactions. A different Thr to Ile mutation, at the position corresponding to EpsE Thr266, in the TFP ATPase PilQ also causes a loss of function of the Type IV sex pilus of the R64 plasmid.³⁴ In EpsE, the Thr266 side-chain is ~3.5 Å from a terminal oxygen of the γ-phosphate. Thr266 may be able to act as a phosphate sensor or, alternatively, may form a stabilizing hydrogen bond with ATP in vivo; in either case, the hydrophobic side-chain of isoleucine at this position would likely provide an unfavorable environment for the hydrophilic γ-phosphate.

Within the Asp box, an Asp to Asn mutation at a position corresponding to Asp293 in EpsE has been studied in both the T2SS system of K. oxytoca²⁵ and in the TFP system of the R64 plasmid sex pilus.³⁴ This aspartate residue is strictly conserved throughout the GspE subfamily.
In the EpsE structure, the side-chain carboxylate oxygen atoms of Asp293 are both more than 9.7 Å from the nearest oxygen of the AMPPNP phosphate tail. This suggests that this side-chain is likely too distant to participate in ATP hydrolysis. In the EpsE structures, the side-chain of Asp293 forms a salt bridge with the side-chain of Walker B residue Arg336, an arginine which is strictly conserved in all GspE subfamily members. This salt bridge would be lost by mutation of Asp293 to the neutral asparagine residue. The loss of function caused by such mutations²⁵,³⁴ may be due to the loss of this favorable interaction. The Asp293 side-chain is solvent-exposed in the EpsE structure, and thus may also participate in interactions with other components within the fully assembled Type II secretion machinery, which could be an alternative explanation for the loss of function in this mutation.

Tetracysteine motif in related proteins

In EpsE, the tetracysteine motif consists of two CxxC motifs with 29 intervening residues that form the extended hairpin-like loop. The tetracysteine motif of the CM domain occurs in all known members of the GspE subfamily (Figure 1(A)), with the exception of Xanthomonas campestris XpsE and Xylella fastidiosa XpsE (not shown), which have a CM domain in which the four cysteine residues are replaced in the former by Asp, Asn, Thr, and Ala and in the latter by Glu, His, Ser, and Ala. Additionally, in another GspE subfamily member, Pseudomonas putida XcpR (not shown), the first CxxC is replaced by CxC. Within the GspE subfamily, the separation between the CxxC motifs ranges from 21 to 40 residues. The CM domain is also found throughout both the PilB and HofB branches of the PilB subfamily. In some of the HofB sequences, such as H. influenzae HofB (Figure 1(A)), the second CxxC motif is replaced by CxC.
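The spacing rules just described lend themselves to a simple pattern search. The following is a minimal sketch, assuming a one-letter amino acid sequence as input; the function name, its parameters, and the toy sequences are illustrative inventions and are not taken from the alignments in Figure 1. The default gap range is the 21 to 40 residues quoted for the GspE subfamily, and the CxC variants mentioned in the text are deliberately not covered.

```python
import re

def find_tetracysteine(seq, min_gap=21, max_gap=40):
    """Find CxxC-(gap)-CxxC motifs in a one-letter protein sequence.

    Returns a list of (start, gap) tuples, where start is the 0-based index
    of the first cysteine and gap is the number of residues between the
    second and third cysteines. CxC variants are not matched.
    """
    pattern = re.compile(
        r"C[A-Z]{2}C([A-Z]{%d,%d}?)C[A-Z]{2}C" % (min_gap, max_gap)
    )
    return [(m.start(), len(m.group(1))) for m in pattern.finditer(seq)]

# Hypothetical toy sequences (NOT real EpsE or HofB sequences).
toy_gspe_like = "MK" + "CAAC" + "G" * 29 + "CTTC" + "LE"   # 29-residue loop
toy_hofb_like = "CAAC" + "G" * 9 + "CTTC"                  # 9-residue loop

print(find_tetracysteine(toy_gspe_like))
print(find_tetracysteine(toy_hofb_like))
print(find_tetracysteine(toy_hofb_like, min_gap=9, max_gap=32))
```

With the default range the short-loop toy sequence is not reported; widening the range to 9 to 32 residues, the values quoted for the PilB subfamily, recovers it.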
The separation between the second and third cysteine in the PilB subfamily is more variable than in the GspE subfamily, ranging from as few as nine residues in several HofB sequences up to 32 residues. The CM domain is also found in the TFP ATPase V. cholerae TcpT (Figure 1(A)). This sequence has two CxxC motifs with 20 intervening residues between the second and third cysteine residues. Thus, a CM domain is found in members of the GspE and PilB subfamilies, with some variability noted, especially in the number of residues found in the loop between the second and third cysteine residues. However, the tetracysteine sequence motif occurs neither in any of the members of the ComG1 and PilT/PilU subfamilies (Figure 1(A)), nor in the VirB11 subfamily of sequences, such as HP0525 (Figure 1(B)). A large gap is found in this region of sequences from subfamilies without the tetracysteine motif, suggesting that the CM hairpin loop and the sharp bend found at the site of metal ligation are likely to be replaced by a much simpler and shorter loop, such as that found in HP0525²⁰ (Figure 6(B)).

The importance of the CM domains is underlined by several studies. Mutation of one or two of the cysteine residues to serine in the GspE of K. oxytoca led to diminished secretion of the protein pullulanase.³⁶ Simultaneous mutation of three of the four cysteine residues to serine led to abolition of Type II protein secretion in this organism.³⁶ Clearly, the cysteine residues play a crucial role in the functioning of T2SS. Our structure shows (Figure 4) that these cysteine residues are implicated in metal binding. How metal binding and the consequent specific organization of the CM domains affect other components of the Type II secretion system remains to be determined. GspE proteins are not known to interact with DNA or RNA as eukaryotic zinc-finger domains do, although this has not been excluded.
Some Zn-binding proteins, such as the eukaryotic RING proteins, are implicated in protein–protein interactions.³⁷ As the CM subdomain of the EpsE monomer has a very large exposed surface area (Figure 2(B)), this is an attractive hypothesis for the role of this subdomain.

A hexameric ring model of EpsE

Eight of the ten top scoring DALI homologs to the C1 domain of EpsE are reported to be capable of forming multimeric ring assemblies. There are several ATPases with experimental evidence for an in vivo ring hexameric structure despite a helical filament arrangement of protomers in the crystal structure. For instance, the bacteriophage T7 helicase-primase protein forms a hexameric, topologically closed ring, as observed in electron microscopy studies. Nevertheless, the isolated helicase domain crystallizes as a right-handed helical filament.³⁰ Sawaya et al. describe a model of the T7 helicase in a hexameric ring by collapsing the subunits forming the helix into a circle, followed by an 18° rotation and 10 Å translation of the subunits.³⁰ Cryoelectron microscopy images show that E. coli RecA can form a ring structure,³⁸ while the crystal structure is a helical filament with P6₁ symmetry. Another likely hexameric ring motor, the AAA+ ATPase chaperonin ClpA, forms a left-handed helical filament in the observed crystal with space group P6₅.³⁹ In this case, a hexameric ring model of ClpA was constructed by using known closely related structural homologs as templates.

Studies of the multimerization state of GspE proteins have been conflicting, with a monomer reported for purified histidine-tagged EpsE,¹¹ and in vivo dimers or higher order multimers reported for K. pneumoniae PulE,²⁵ Erwinia carotovora OutE¹⁴ and P. aeruginosa XcpR.⁴⁰ These differences suggest that oligomerization of GspE proteins may require other type II secretion components as well as phospholipids. Interestingly, two different forms of EpsE were detected during subcellular fractionation of V.
cholerae cells: a cytoplasmic soluble form and a cytoplasmic membrane-associated form.¹¹ It is possible that these forms represent different multimerization states of EpsE, the soluble form being a monomer and the membrane-associated form present within the T2SS complex being a larger oligomer. Additionally, a possible homooctameric form has been reported for the homologous TFP ATPase PilQ.³⁴ The aggregation of the purified GspE proteins, particularly at higher concentrations²⁵,³⁶ (M.A.R. & B.E.K., unpublished observations), may have made size exclusion studies difficult to pursue in the absence of possibly stabilizing partner proteins. To our knowledge, stoichiometric and other structural information about the GspE-GspL-GspM subcomplex that could shed light on the presence of a multimeric ring form for GspE proteins has not been reported.

Although the multimerization state of EpsE within the T2SS complex is unknown, we sought to construct a model of EpsE with C6 point group symmetry while maintaining extensive intersubunit interactions observed in our crystal structure, since many structural homologs of EpsE can form hexameric rings. A very interesting result (Figure 7(A)) was obtained by (i) placing six C-domains of EpsE onto the six C-domains of the hexameric HP0525 arrangement reported by Yeo et al.;²⁰ and (ii) adding to each of the EpsE C-domains so positioned the N′-domain of EpsE as observed in the helical 6₁ filament in our crystals (Figure 5), i.e. maintaining the extensive and conserved C:N′ interface described above. After this procedure the AMPPNP bound to the C-domain is approached by the conserved residues (red circles in Figure 1(B)) Ser140, Asp158, Arg210, and Arg224 of the N-domain with e.g.
the guanido side-chain of Arg224 less than 2.3 Å from an oxygen of the γ-phosphate of AMPPNP, Ser140 4.4 Å from the O3 of the ribose, and a side-chain carbonyl oxygen of Asp158 simultaneously 4.2 Å from the O4 of the ribose and 4.1 Å from the N3 of the adenine base of AMPPNP. This construction is more convincing than an EpsE hexamer obtained by simply placing the C and N domains of EpsE onto the C and N domains of the HP0525 hexamer since (i) in our model the C:N′ interface is considerably more extensive, 1800 Å² versus 1470 Å²; and (ii) the residues from the N domain approach AMPPNP more closely (not shown).

Mapping the degree of conservation of residues in the GspE family (Figure 1(A)) onto the hexameric EpsE model reveals a striking difference between the two sides of the hexamer (Figure 7(B)): the “lower” side with the extra C-subdomains is much less well conserved than the “upper” side where the N2-domain is positioned. The 90 N-terminal EpsE residues that are absent in our structure are most likely located near this upper side of the full-length hexamer. Since these residues are essential for interacting with the EpsL components of the Eps apparatus,⁴¹ it is reasonable to assume that the “upper” surface of the EpsE hexamer faces the bacterial inner membrane. The C2 subdomain, with less conserved residues, would then face the cytosolic compartment of V. cholerae and other T2SS-containing bacteria. The CM subdomain is found on the periphery of the ring, with the long axis of this domain essentially parallel with the central ring axis and perpendicular to the putative plane of the membrane. Within this domain, the more conserved residues surrounding the cysteine residues are found on the membrane-facing side of the C domain, with the less conserved residues of the elongated loop directed down toward the cytoplasm. As mentioned before, it remains to be determined what the functions of these subdomains are.
They could either be involved in contacts with as yet undiscovered, possibly transiently interacting, cytosolic proteins or, alternatively, could engage in interactions with one or more components of Type II secretion systems during the protein translocation cycle.

Given the possibility of major conformational changes of the secretion machinery during the protein translocation process, it might be that a helical filament arrangement of EpsE resembling that seen in the crystals has physiological relevance. It has been proposed that the GspG, H, I, J, and K proteins, i.e. the pilin-like components of the Type II secretion systems, may form a dynamic, piston-like arrangement which is involved in pushing substrate proteins like cholera toxin through the GspD pore in the outer membrane.⁴²–⁴⁵ Given the tendency of GspE subfamily members and RecA-like ATPases to engage in helical arrangements, transitions from hexameric toroidal to helical assemblies, and vice versa, may play a key role in the functioning of these marvelous multiprotein machineries.

Figure 7. Hexameric ring model of EpsE. (A) View from the proposed membrane-facing side (left), side view (middle) and the cytoplasmic face (right) of the hexameric ring model of EpsE constructed as described in the text. One monomer is shown with the domains colored as follows: N2, cyan; C1, dark blue; CM, yellow; and C2, green. The other five monomers are colored in a lighter shade of the same colors. This Figure was prepared using GRASP⁵⁶ and RASTER 3D.⁵⁵ (B) Same views as in (A), but with the surface colored by sequence conservation.

Experimental Procedures

Cloning, expression and purification of EpsEΔ90

EpsEΔ90, containing a modified C-terminal sequence with a hexahistidine tag, was cloned into E. coli strain MC1061 as described.¹¹ In our construct, the C-terminal residues, VTKES, were replaced by GSRSHHHHHH (residues 499–508), in order to accommodate the pQE70 vector used. Native protein was expressed in LB at 27 °C, by induction with 1 mM IPTG at A₆₀₀ ~0.6, and cells were harvested when the A₆₀₀ started to plateau, at approximately six hours. SeMet substituted protein was expressed in M9 minimal media supplemented with amino acids as described by Van Duyne et al.⁴⁶ For SeMet substituted expression, induction was again with 1 mM IPTG at A₆₀₀ ~0.6–0.8 and the cells were harvested in four hours.

Cells were harvested by centrifugation at 5000g for 20 minutes and resuspended in ~50 ml buffer (for the pellet from a total of 4 l of media) consisting of 100 mM TEA, 1000 mM NaCl, 10% (v/v) glycerol, 1 mM TCEP/HCl, 2 mM imidazole, to which one COMPLETE EDTA-free protease inhibitor tablet (Roche Diagnostics) was added. The final pH of this buffer was adjusted to pH 8.0 at 4 °C. The cells were disrupted by multiple passes using a French press, and the supernatant was harvested by ultracentrifugation at 90,000g for one hour. The protein was then purified by IMAC (TALON, Clontech), followed by ion exchange chromatography (MonoQ, Pharmacia), hydrophobic interaction chromatography (Phenyl Sepharose, Pharmacia), and a final dialysis step against 100 mM TEA, 0.5 M NaCl, 1 mM EDTA, 1 mM DTT, 1 mM PMSF adjusted to pH 8.0 at 4 °C. The final protein solution was concentrated to 5 mg/ml. The protein solution was filtered using a 0.22 µm filter (Millipore) prior to setting up crystallization experiments.

Crystallization

The initial crystallization conditions for native and SeMet EpsEΔ90 were obtained by mixing equal volumes (1–4 µl) of protein solution (0.1 M TEA, 0.5 M NaCl, 10% glycerol, 1 mM TCEP, 1 mM EDTA) with well solutions from commercially available screens (Crystal Screen 1, 2, Hampton Research; Wizard I, II, Cryo I, II, Emerald BioStructures) using the sitting drop vapor diffusion technique. Two conditions were identified and optimized. The first was optimized to 12–18% (v/v) PEG 5000 MME, 0.15–0.20 M ammonium sulfate, 0.1 M MES pH 6.3; SeMet crystals prepared in this fashion and cryoprotected with 25% glycerol mixed with well solution diffracted to 3.4 Å resolution, although they were subject to considerable radiation decay. Native crystals prepared in a similar manner diffracted to ~3.1 Å resolution. These crystals were difficult to work with due to often very weak diffraction; typically fewer than one in ten crystals cryoprotected with glycerol yielded diffraction better than 5 Å resolution. Neither introducing the crystals to glycerol more gradually nor co-crystallizing them in the presence of cryoprotecting quantities of glycerol was successful, nor was the use of MPD, ethylene glycol, low molecular mass PEGs, or highly soluble inorganic salts such as lithium formate or lithium acetate. Cryoprotection with Paratone-N (Exxon) was marginally more successful than glycerol in more frequently producing sub-5 Å resolution data. Growth at 14 °C or 4 °C seemed to improve the size and appearance of the crystals, but gave only slightly better diffraction than crystals set up at room temperature.
One other condition was identified through repeated screening; this condition (30% (v/v) PEG200, 0.1 M Hepes pH 8.0) initially produced extremely small and delicate crystals which were difficult to reproduce either with or without several seeding strategies. It was found that using a protein solution without glycerol improved the size and reproducibility of these crystals. The best “unliganded” datasets were obtained with crystals grown at room temperature with 10 mM AMPPNP present in the protein solution prior to setting up crystallization experiments, and a precipitant solution of 18–22% PEG200, 0.1 M Mops, pH 7.2, 10 mM ammonium acetate. Crystals of liganded EpsEΔ90 were grown at 14 °C, with 10 mM AMPPNP present in the protein solution prior to setting up drops, with a precipitant consisting of 18–22% PEG200, 0.1 M Mops, pH 7.2 (without other additives). The “PEG200 crystals” were mounted directly into loops from the cryoprecipitant-containing drops and flash-cooled in liquid nitrogen.

EXAFS spectra were recorded to help determine the optimal peak and high-energy remote wavelengths. X-ray diffraction data were collected from each individual crystal at ALS, Berkeley, CA (beamlines 8.2.1, 8.2.2, 5.0.2) and APS, Argonne, IL (beamlines 19ID, 19BM). The diffraction data were processed and integrated using the ELVES (J. Holton, unpublished) package, primarily to optimize the use of MOSFLM.⁴⁷ Scaling and merging of the datasets was done with the SCALA program from the CCP4 suite. Initially, the best resolution datasets available were a number of datasets from separate crystals all diffracting to ~2.90 Å. The crystals have space group P6₁22 or the enantiomorphic P6₅22; this ambiguity was resolved when SOLVE²³ initially found a solution with ten sites (later 12 sites) in P6₁22 but no good solution in P6₅22. With one molecule per asymmetric unit, the V_M is 2.8 Å³/Da and the estimated solvent content is 55.3%.
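The relation between V_M and solvent content quoted above can be checked with a short calculation. The sketch below assumes the commonly used Matthews approximation, in which the protein volume fraction is 1.23/V_M (this corresponds to a protein partial specific volume of about 0.74 cm³/g); that constant and the function names are assumptions introduced here, not values taken from the text. Under this assumption the rounded V_M of 2.8 Å³/Da gives about 56%, and the reported 55.3% corresponds to an unrounded V_M near 2.75 Å³/Da, so the two quoted numbers are consistent within rounding.

```python
# Assumed constant: 1.23 Å^3/Da, the usual Matthews factor for a protein
# with partial specific volume of roughly 0.74 cm^3/g (not from the source).
PROTEIN_FACTOR = 1.23

def solvent_fraction(v_m):
    """Estimated solvent fraction of the crystal for a given V_M (Å^3/Da)."""
    return 1.0 - PROTEIN_FACTOR / v_m

def v_m_from_solvent(fraction):
    """Inverse relation: V_M (Å^3/Da) implied by a given solvent fraction."""
    return PROTEIN_FACTOR / (1.0 - fraction)

# Values quoted in the text: V_M = 2.8 Å^3/Da, solvent content 55.3%.
print(round(100 * solvent_fraction(2.8), 1))   # percent solvent for V_M = 2.8
print(round(v_m_from_solvent(0.553), 2))       # V_M implied by 55.3% solvent
```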
Attempts to use multiple wavelength datasets were not successful, but SeMet SAD datasets were easily solved by the SOLVE package. Solvent flattening and histogram matching using RESOLVE⁴⁸ produced a noisy map. At this point, additional data diffracting to 2.5 Å became available, and the autotracing facility of RESOLVE 2.02⁴⁸ was able to convincingly place 105 residues with side-chains and an additional 112 residues with only main-chain atoms. This model was used as a starting point for multiple rounds of manual tracing and refinement using XtalView⁴⁹ and REFMAC5.⁵⁰ As multiple SeMet SAD datasets were collected, the experimental phases for each of these datasets were originally refined independently and subsequently used for multidomain multicrystal density modification using the program DMMULTI⁵¹,⁵² from the CCP4 suite. The working model was used to generate separate masks for the N2 domain and the grouped C-terminal domains (C1, CM, C2). The use of multiple domains was motivated by the marked paucity of intramolecular contacts between the N-terminal and C-terminal regions of the protein. Attempts to use only a single subunit mask led to lower map correlation coefficients; use of more than two domains did not lead to clear improvements. DMMULTI was also used with datasets where experimental phasing was not available.

Protein Data Bank accession codes

Coordinates and structure factors have been deposited with the Protein Data Bank, ID 1P9R (unliganded) and 1P9W (with AMPPNP).

Acknowledgements

M.R. greatly appreciates support under the NIAID supported T32 Host Defense Training Grant during the initial stages of this work, and the encouragement of Dr Walter Stamm of the University of Washington Division of Infectious Diseases. We thank Stewart Turley for advice on crystal freezing and data collection.
We gratefully acknowledge the use of ALS beamlines 5.0.1, 5.0.2 and 5.0.3 and APS beamlines 19ID and 19BM for earlier, less accommodating crystals and the use of ALS beamline 8.2.1 for the final collected datasets. W.G.J.H. and M.S. acknowledge support from NIH grants No. AI34501-10 and AI49294, respectively.

References

1. Pugsley, A. P. (1993). The complete general secretory pathway in Gram-negative bacteria. Microbiol. Rev. 57, 50–108.
2. Overbye, L. J., Sandkvist, M. & Bagdasarian, M. (1993). Genes required for extracellular secretion of enterotoxin are clustered in Vibrio cholerae. Gene, 132, 101–106.
3. Sandkvist, M., Morales, V. & Bagdasarian, M. (1993). A protein required for secretion of cholera toxin through the outer membrane of Vibrio cholerae. Gene, 123, 81–86.
4. Sandkvist, M., Michel, L. O., Hough, L. P., Morales, V. M., Bagdasarian, M., Koomey, M. & DiRita, V. J. (1997). General secretion pathway (eps) genes required for toxin secretion and outer membrane biogenesis in Vibrio cholerae. J. Bacteriol. 179, 6994–7003.
5. Hirst, T. R. & Holmgren, J. (1987). Transient entry of enterotoxin subunits into the periplasm occurs during their secretion from Vibrio cholerae. J. Bacteriol. 169, 1037–1045.
6. Hirst, T. R. & Holmgren, J. (1987). Conformation of protein secreted across bacterial outer membranes: a study of enterotoxin translocation from Vibrio cholerae. Proc. Natl Acad. Sci. USA, 84, 7418–7422.
7. Spangler, B. D. (1992). Structure and function of cholera toxin and the related Escherichia coli heat-labile enterotoxin. Microbiol. Rev. 56, 622–647.
8. Sandkvist, M. (2001). Type II secretion and pathogenesis. Infect. Immun. 69, 3523–3535.
9. Marsh, J. W. & Taylor, R. K. (1998). Identification of the Vibrio cholerae type 4 prepilin peptidase required for cholera toxin secretion and pilus formation. Mol. Microbiol. 29, 1481–1492.
10. Fullner, K. J. & Mekalanos, J. J. (1999). Genetic characterization of a new type IV-A pilus gene cluster found in both classical and El Tor biotypes of Vibrio cholerae. Infect. Immun. 67, 1393–1404.
11. Sandkvist, M., Bagdasarian, M., Howard, S. P. & DiRita, V. J. (1995). Interaction between the autokinase EpsE and EpsL in the cytoplasmic membrane is required for extracellular secretion in Vibrio cholerae. EMBO J. 14, 1664–1673.
12. Sandkvist, M., Hough, L. P., Bagdasarian, M. M. & Bagdasarian, M. (1999). Direct interaction of the EpsL and EpsM proteins of the general secretion apparatus in Vibrio cholerae. J. Bacteriol. 181, 3129–3135.
13. Possot, O. M., Vignon, G., Bomchil, N., Ebel, F. & Pugsley, A. P. (2000). Multiple interactions between pullulanase secreton components involved in stabilization and cytoplasmic membrane association of PulE. J. Bacteriol. 182, 2142–2152.
14. Py, B., Loiseau, L. & Barras, F. (1999). Assembly of the type II secretion machinery of Erwinia chrysanthemi: direct interaction and associated conformational change between OutE, the putative ATP-binding component and the membrane protein OutL. J. Mol. Biol. 289, 659–670.
15. Py, B., Loiseau, L. & Barras, F. (2001). An inner membrane platform in the type II secretion machinery of Gram-negative bacteria. EMBO Rep. 2, 244–248.
16. Kagami, Y., Ratliff, M., Surber, M., Martinez, A. & Nunn, D. N. (1998). Type II protein secretion by P. aeruginosa: genetic suppression of a conditional mutation in the pilin-like component XcpT by the cytoplasmic component XcpR. Mol. Microbiol. 27, 221–233.
17. Planet, P. J., Kachlany, S. C., DeSalle, R. & Figurski, D. H. (2001). Phylogeny of genes for secretion NTPases: identification of the widespread tadA subfamily and development of a diagnostic key for gene classification. Proc. Natl Acad. Sci. USA, 98, 2503–2508.
18. Nunn, D. (1999). Bacterial type II protein export and pilus biogenesis: more than just homologies? Trends Cell Biol. 9, 402–408.
19. Dubnau, D. (1999). DNA uptake in bacteria. Annu. Rev. Microbiol. 53, 217–244.
20. Yeo, H. J., Savvides, S. N., Herr, A. B., Lanka, E. & Waksman, G. (2000). Crystal structure of the hexameric traffic ATPase of the Helicobacter pylori type IV secretion system. Mol. Cell, 6, 1461–1472.
21. Odenbreit, S., Puls, J., Sedlmaier, B., Gerland, E., Fischer, W. & Haas, R. (2000). Translocation of Helicobacter pylori CagA into gastric epithelial cells by type IV secretion. Science, 287, 1497–1500.
22. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
23. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallog. sect. D, Biol. Crystallog. 55, 849–861.
24. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536–540.
25. Possot, O. & Pugsley, A. P. (1994). Molecular characterization of PulE, a protein required for pullulanase secretion. Mol. Microbiol. 12, 287–299.
26. Hol, W. G., van Duijnen, P. T. & Berendsen, H. J. (1978). The alpha-helix dipole and the properties of proteins. Nature, 273, 443–446.
27. Castagnetto, J. M., Hennessy, S. W., Roberts, V. A., Getzoff, E. D., Tainer, J. A. & Pique, M. E. (2002). MDB: the Metalloprotein Database and Browser at the Scripps Research Institute. Nucl. Acids Res. 30, 379–382.
28. Holm, L. & Sander, C. (1993). Protein structure comparison by alignment of distance matrices. J. Mol. Biol. 233, 123–138.
29. Story, R. M. & Steitz, T. A. (1992). Structure of the recA protein–ADP complex. Nature, 355, 374–376.
30. Sawaya, M. R., Guo, S., Tabor, S., Richardson, C. C. & Ellenberger, T. (1999). Crystal structure of the helicase domain from the replicative helicase-primase of bacteriophage T7. Cell, 99, 167–177.
31. Yu, R. C., Hanson, P. I., Jahn, R. & Brünger, A. T. (1998). Structure of the ATP-dependent oligomerization domain of N-ethylmaleimide sensitive factor complexed with ATP. Nature Struct. Biol. 5, 803–811.
32. Lenzen, C. U., Steinmann, D., Whiteheart, S. W. & Weis, W. I. (1998). Crystal structure of the hexamerization domain of N-ethylmaleimide-sensitive fusion protein. Cell, 94, 525–536.
33. Velankar, S. S., Soultanas, P., Dillingham, M. S., Subramanya, H. S. & Wigley, D. B. (1999). Crystal structures of complexes of PcrA DNA helicase with a DNA substrate indicate an inchworm mechanism. Cell, 97, 75–84.
34. Sakai, D., Horiuchi, T. & Komano, T. (2001). ATPase activity and multimer formation of PilQ protein are required for thin pilus biogenesis in plasmid R64. J. Biol. Chem. 276, 17968–17975.
35. Turner, L. R., Lara, J. C., Nunn, D. N. & Lory, S. (1993). Mutations in the consensus ATP-binding sites of XcpR and PilB eliminate extracellular protein secretion and pilus biogenesis in Pseudomonas aeruginosa. J. Bacteriol. 175, 4962–4969.
36. Possot, O. M. & Pugsley, A. P. (1997). The conserved tetracysteine motif in the general secretory pathway component PulE is required for efficient pullulanase secretion. Gene, 192, 45–50.
37. Borden, K. L. (2000). RING domains: master builders of molecular scaffolds? J. Mol. Biol. 295, 1103–1112.
38. Yu, X. & Egelman, E. H. (1997). The RecA hexamer is a structural homologue of ring helicases. Nature Struct. Biol. 4, 101–104.
39. Guo, F., Maurizi, M. R., Esser, L. & Xia, D. (2002). Crystal structure of ClpA, an Hsp100 chaperone and regulator of ClpAP protease. J. Biol. Chem. 277, 46743–46752.
40. Turner, L. R., Olson, J. W. & Lory, S. (1997). The XcpR protein of Pseudomonas aeruginosa dimerizes via its N terminus. Mol. Microbiol. 26, 877–887.
41. Sandkvist, M., Keith, J. M., Bagdasarian, M. & Howard, S. P. (2000). Two regions of EpsL involved in species-specific protein–protein interactions with EpsE and EpsM of the general secretion pathway in Vibrio cholerae. J. Bacteriol.
182, 742–748.
42. Mattick, J. S., Whitchurch, C. B. & Alm, R. A. (1996). The molecular genetics of type-4 fimbriae in Pseudomonas aeruginosa – a review. Gene, 179, 147–155.
43. Shevchik, V. E., Robert-Baudouy, J. & Condemine, G. (1997). Specific interaction between OutD, an Erwinia chrysanthemi outer membrane protein of the general secretory pathway, and secreted proteins. EMBO J. 16, 3007–3016.
44. Filloux, A., Michel, G. & Bally, M. (1998). GSP-dependent protein secretion in Gram-negative bacteria: the Xcp system of Pseudomonas aeruginosa. FEMS Microbiol. Rev. 22, 177–198.
45. Sandkvist, M. (2001). Biology of type II secretion. Mol. Microbiol. 40, 271–283.
46. Van Duyne, G. D., Standaert, R. F., Karplus, P. A., Schreiber, S. L. & Clardy, J. (1993). Atomic structures of the Human immunophilin FKBP-12 complexes with FK506 and rapamycin. J. Mol. Biol. 229, 105–124.
47. Leslie, A. G. W., Brick, P. & Wonacott, A. (1986). Mosflm. Daresbury Lab. Inform. Quart. Protein Crystallog. 18, 33–39.
48. Terwilliger, T. C. (2000). Maximum-likelihood density modification. Acta Crystallog. sect. D, Biol. Crystallog. 56, 965–972.
49. McRee, D. E. (1999). XtalView/Xfit – a versatile program for manipulating atomic coordinates and electron density. J. Struct. Biol. 125, 156–165.
50. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallog. sect. D, 53, 240–255.
51. Cowtan, K. & Main, P. (1998). Miscellaneous algorithms for density modification. Acta Crystallog. sect. D, Biol. Crystallog. 54, 487–493.
52. Cowtan, K. D. & Zhang, K. Y. J. (1999). Density modification for macromolecular phase improvement. Prog. Biophys. Mol. Biol. 72, 245–270.
53. Gouet, P., Courcelle, E., Stuart, D. I. & Metoz, F. (1999). ESPript: analysis of multiple sequence alignments in PostScript. Bioinformatics, 15, 305–308.
54. Kraulis, P. J. (1991). Molscript – a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.
55. Merritt, E. A. & Bacon, D. J. (1997). Raster3D: photorealistic molecular graphics. Macromol. Crystallog. 277, 505–524.
56. Nicholls, A., Sharp, K. A. & Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins: Struct. Funct. Genet. 11, 281–296.

Edited by R. Huber

(Received 12 May 2003; received in revised form 13 July 2003; accepted 21 July 2003)