Pathways to a Protein Folding Intermediate Observed in a 1-Microsecond Simulation in Aqueous Solution
References
Reviewed in
Shakhnovich E., Fersht A. R., ibid. 8, 65 (1998).
The longest single MD trajectories of proteins with explicit water have been 5.4 ns [
Brooks and co-workers have attempted to reconstruct the folding free-energy landscape [
] from the restrained unfolding simulations using the WHAM method [
Kumar S., Bouzida D., Swendsen R. H., Kollman P. A., Rosenberg J. M., J. Comp. Chem. 13, 1011 (1992);
The force field [
] of Cornell et al. was used with full representation of solvent with the TIP3P water model [
]. Periodic boundary conditions were imposed using the nearest-image convention in a truncated octahedron box. An 8 Å residue-based cutoff was applied to the long-range nonbonded protein-water and water-water interactions (both electrostatic and van der Waals). The intramolecular nonbonded interactions of the protein were calculated without truncation. When applicable, temperature and pressure controls were imposed using Berendsen's algorithms [
Berendsen H. J. C., Postma J. P. M., van Gunsteren W. F., DiNola A., Haak J. R., J. Chem. Phys. 81, 3684 (1984);
]. The solute and solvent were separately coupled to a temperature bath with coupling constants of 0.1 ps. The pressure coupling constant was 20 ps. The trajectories were produced by numerical integration with the Verlet leapfrog algorithm [
; C. L. Brooks III, M. Karplus, B. M. Pettitt, Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics (Wiley, New York, 1989)] by using a 2-fs time step. Bond constraints were imposed on all bonds involving hydrogen atoms with SHAKE [
] and SETTLE [
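The weak-coupling temperature control mentioned above can be illustrated with a small sketch. It uses the standard Berendsen velocity-scaling factor, λ = sqrt(1 + (Δt/τ)(T₀/T − 1)), with the 2-fs time step and 0.1-ps coupling constant quoted in this note. The function names and the 1000 K starting temperature are illustrative assumptions, and the toy loop ignores all physical forces; it is not the MD code used for the simulation.

```python
import math

def berendsen_lambda(t_current, t_target, dt_ps=0.002, tau_ps=0.1):
    """Velocity scaling factor for one weak-coupling step (Berendsen form)."""
    return math.sqrt(1.0 + (dt_ps / tau_ps) * (t_target / t_current - 1.0))

def relax_temperature(t_start, t_target, n_steps, dt_ps=0.002, tau_ps=0.1):
    """Toy model: apply the scaling repeatedly with no forces acting.

    The kinetic temperature is proportional to the squared velocities,
    so scaling velocities by lambda multiplies it by lambda**2.
    """
    t = t_start
    for _ in range(n_steps):
        lam = berendsen_lambda(t, t_target, dt_ps, tau_ps)
        t *= lam * lam
    return t

# Hypothetical illustration: a system at 1000 K coupled to a 300 K bath,
# followed for 50 steps of 2 fs (0.1 ps, one coupling time constant).
t_after = relax_temperature(1000.0, 300.0, n_steps=50)
```

In this idealized setting the deviation from the bath temperature shrinks by a factor of (1 − Δt/τ) per step, which is the exponential relaxation with time constant τ that the coupling constant describes.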
Preparation: The starting coordinates were the NMR structure of the villin headpiece subdomain by McKnight et al. [
] {Protein Data Bank [
] access code 1vii}. It was denatured by carrying out a 1-ns simulation in water at 1000 K at constant volume. The denatured molecule was then immersed in a truncated octahedron water box constructed from a cubic box of 76.5 Å. A total of 6510 water molecules were retained. To reduce the computational cost, the excess water molecules were removed after ∼20 ns, when a semistable compact structure had formed. About 3000 water molecules were retained for the remainder of the simulation.
Production: The simulation was started from an equilibration phase of 1.0 ns at 200 K and 1 atm pressure. The long equilibration phase was intended to mimic an equilibrated, fully denatured state, to solvate the molecule adequately, and to minimize any instability caused by the high-temperature origin of the starting structure. The density of the system was initially 0.90 g/cm³, increased to 1.05 g/cm³ within 10 ps, and remained so for the remainder of the 1-ns solvent equilibration trajectory. The simulation was then conducted for 1.0 μs (0.5 billion integration steps). Temperature and pressure were controlled at physiological conditions (that is, 300 K and 1 atm) by the methods described above. Both temperature and density stabilized within 10 ps. The trajectory was saved at 20-ps intervals for analysis. The entire simulation took ∼2 months of CPU time on a 256-CPU Cray T3D and an equal amount of CPU time on a 256-CPU Cray T3E-600.
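The step and frame counts quoted in this note follow directly from the time step and the saving interval. A minimal bookkeeping sketch (the variable names are illustrative, not taken from any MD package):

```python
FS_PER_PS = 1_000        # femtoseconds per picosecond
PS_PER_US = 1_000_000    # picoseconds per microsecond

timestep_fs = 2          # integration time step from the note
total_us = 1             # production trajectory length
save_interval_ps = 20    # interval between saved coordinate sets

total_ps = total_us * PS_PER_US
n_steps = total_ps * FS_PER_PS // timestep_fs      # integration steps
n_frames = total_ps // save_interval_ps            # saved coordinate sets
steps_per_frame = n_steps // n_frames              # steps between saves
```

This gives 5 × 10⁸ integration steps (the "0.5 billion" above) and 50,000 saved coordinate sets, one every 10,000 steps.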
The use of nonbonded cutoffs, as performed here, is an inherent source of “noise” in the simulation trajectory. Such noise is not as deleterious in small proteins with limited formal charges as it is in, for example, RNA and DNA [
Cheatham T. E., Miller J. L., Fox T., Darden T. A., Kollman P. A., J. Am. Chem. Soc. 117, 4193 (1995);
] as demonstrated by the stability of the 100-ns simulation at 300 K started from the native NMR structure. But accurate inclusion of long-range electrostatic effects [
] does provide an additional challenge for achieving high parallelism in the MD code. Work is in progress to include long-range electrostatic effects while still achieving a level of parallelism and speed comparable to that presented here.
Burton R. E., Huang G. S., Daugherty M. A., Calderone T. L., Oas T. G., Nature Struct. Biol. 4, 305 (1997).
Eaton W. A., Muñoz V., Thompson P. A., Chan C. K., Hofrichter J., Curr. Opin. Struct. Biol. 7, 10 (1997).
K. W. Plaxco, personal communication (1998).
Gilmanshin R., Williams S., Callender R. H., Woodruff W. H., Dyer R. B., Proc. Natl. Acad. Sci. U.S.A. 94, 3709 (1997).
Thompson P. A., Eaton W. A., Hofrichter J., ibid. 36, 9200 (1997).
A total of 50,000 coordinate sets were accumulated, one every 20 ps. They were clustered by comparison with the average coordinates of existing clusters using a 3.0 Å main-chain rmsd cutoff, similar to the method described by Karpen et al. [
]. Coordinate sets within 3.0 Å main-chain rmsd of a cluster's average coordinates were assigned to that cluster. A total of 98 clusters were produced. Thirty clusters were highly populated, with ∼500 or more coordinate sets, and 13 clusters had more than 1000 sets of coordinates.
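The clustering idea described in this note can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' program: the frames are assumed to be already superimposed (no optimal-fit step is performed before the rmsd), a frame joins the first cluster whose running average lies within the cutoff, and the names `rmsd` and `cluster_frames` are hypothetical.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumption: the two structures are already superimposed.
    """
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def cluster_frames(frames, cutoff=3.0):
    """Assign each frame to the first cluster whose average is within cutoff.

    Returns a list of dicts holding the running average coordinates
    ("mean") and the indices of the member frames ("members").
    """
    clusters = []
    for idx, frame in enumerate(frames):
        for c in clusters:
            if rmsd(frame, c["mean"]) <= cutoff:
                n = len(c["members"])
                # Update the running average with the new member.
                c["mean"] = [
                    tuple((m * n + x) / (n + 1) for m, x in zip(mu, xyz))
                    for mu, xyz in zip(c["mean"], frame)
                ]
                c["members"].append(idx)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append({"mean": [tuple(p) for p in frame], "members": [idx]})
    return clusters

# Toy data: three two-atom "frames"; the third is far from the first two.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.5, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(10.0, 0.0, 0.0), (11.0, 0.0, 0.0)],
]
clusters = cluster_frames(frames)
```

With the toy data, the first two frames fall into one cluster and the third starts its own. In a real analysis each frame would first be superimposed on the cluster average using the main-chain atoms, and the result can depend on the order in which frames are processed.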
Y. Duan and P. A. Kollman, unpublished data.
Supercomputing time was provided by Cray Research, a subsidiary of Silicon Graphics Inc. (SGI), and by the Pittsburgh Supercomputing Center (PSC). We are grateful to R. Roskies and M. Levine (PSC), J. Carpenter and H. Pritchard (SGI), and J. Wendoloski (AMGEN) for their support. We thank K. Dill, D. Agard, I. Kuntz, J. Pitera, and T. Cheatham for critical reading of the manuscript; L. Wang, C. Simmerling, M. Crowley, J. Wang, and W. Wang for stimulating discussions; and L. Chiche for the solvation free-energy calculation program. Graphics were provided by the Computer Graphics Lab of the University of California, San Francisco (T. Ferrin, Principal Investigator, grant RR-1081). This work was supported in part by NIH grant GM-29072, by a University of California Biotechnology Star grant, and by AMGEN (P.A.K.).