Pathways to a Protein Folding Intermediate Observed in a 1-Microsecond Simulation in Aqueous Solution

American Association for the Advancement of Science (AAAS) - Volume 282, Issue 5389 - Pages 740-744 - 1998
Yong Duan¹, Peter A. Kollman¹
¹Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94143, USA

Abstract

An implementation of classical molecular dynamics on parallel computers of increased efficiency has enabled a simulation of protein folding with explicit representation of water for 1 microsecond, about two orders of magnitude longer than the longest simulation of a protein in water reported to date. Starting with an unfolded state of villin headpiece subdomain, hydrophobic collapse and helix formation occur in an initial phase, followed by conformational readjustments. A marginally stable state, which has a lifetime of about 150 nanoseconds, a favorable solvation free energy, and shows significant resemblance to the native structure, is observed; two pathways to this state have been found.

References

Sifers R. N., Nature Struct. Biol. 2, 355 (1995);

10.1126/science.278.5336.245

Reviewed in Dill K. A., Chan H. S., Nature Struct. Biol. 4, 10 (1997); Dobson C. M., Ptitsyn O. B., Curr. Opin. Struct. Biol. 7, 1 (1997); Shakhnovich E., Fersht A. R., ibid. 8, 65 (1998).

10.1126/science.250.4984.1121

Sali A., Shakhnovich E., Karplus M., Nature 369, 248 (1994);

Dill K. A., et al., Protein Sci. 4, 561 (1995).

Schiffer C. A., Dötsch V., Wüthrich K., van Gunsteren W. F., Biochemistry 34, 15057 (1995).

10.1126/science.278.5345.1928

Fox T., Kollman P. A., Proteins 25, 315 (1996);

Kollman P. A., Acc. Chem. Res. 29, 461 (1996).

The longest single MD trajectories of proteins with explicit water have been 5.4 ns [Li A., Daggett V., Protein Eng. 8, 1117 (1995)].

Daggett V., Levitt M., Proc. Natl. Acad. Sci. U.S.A. 89, 5142 (1992);

Tirado-Rives J., Jorgensen W. L., Biochemistry 32, 4175 (1993);

Daggett V., Levitt M., Curr. Opin. Struct. Biol. 4, 291 (1994).

Brooks and co-workers have attempted to reconstruct the folding free-energy landscape [Boczko E. M., Brooks C. L., Science 269, 393 (1995); Sheinerman F. B., Brooks C. L., J. Mol. Biol. 278, 439 (1998)] from the restrained unfolding simulations using the WHAM method [Kumar S., Bouzida D., Swendsen R. H., Kollman P. A., Rosenberg J. M., J. Comp. Chem. 13, 1011 (1992); Boczko E. M., Brooks C. L., J. Phys. Chem. 97, 4509 (1993)].

Demchuk E., Bashford D., Case D. A., Fold. Des. 2, 35 (1997);

Daura X., Jaun B., Seebach D., van Gunsteren W. F., Mark A. E., J. Mol. Biol. 280, 925 (1998).

Sheinerman F. B., Brooks C. L., Proc. Natl. Acad. Sci. U.S.A. 95, 1562 (1998).

Shakhnovich E. I., Curr. Opin. Struct. Biol. 7, 29 (1997).

Duan Y., Wang L., Kollman P. A., Proc. Natl. Acad. Sci. U.S.A. 95, 9897 (1998).

McKnight C. J., Doering D. S., Matsudaira P. T., Kim P. S., J. Mol. Biol. 260, 126 (1996).

McKnight C. J., Matsudaira P. T., Kim P. S., Nature Struct. Biol. 4, 180 (1997).

The force field [10.1021/ja00124a002] of Cornell et al. was used with full representation of solvent with the TIP3P water model [10.1063/1.445869]. Periodic boundary conditions were imposed by a nearest image convention in a truncated octahedron box. An 8 Å residue-based cutoff was applied to the long-range nonbonded protein-water and water-water interactions (both electrostatic and van der Waals). The intramolecular nonbonded interactions of the protein were calculated without truncation. When applicable, temperature and pressure controls were imposed through use of Berendsen's algorithms [Berendsen H. J. C., Postma J. P. M., van Gunsteren W. F., DiNola A., Haak J. R., J. Chem. Phys. 81, 3684 (1984)]. The solute and solvent were separately coupled to a temperature bath with coupling constants of 0.1 ps. The pressure coupling constant was 20 ps. The trajectories were produced by numerical integration with the Verlet-leapfrog algorithm [Verlet L., Phys. Rev. 159, 98 (1967); C. L. Brooks III, M. Karplus, B. M. Pettitt, Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics (Wiley, New York, 1989)] by using a 2-fs time step. Bond constraints were imposed on all bonds involving hydrogen atoms with SHAKE [Ryckaert J.-P., Ciccotti G., Berendsen H. J. C., J. Comp. Phys. 23, 327 (1977)] and SETTLE [10.1002/jcc.540130805].
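The weak-coupling temperature control mentioned in this note can be illustrated with the standard Berendsen velocity-scaling factor, λ = sqrt(1 + (Δt/τ)(T₀/T − 1)). The sketch below is only an illustration under that textbook form: the function name `berendsen_scale` and the 310 K example temperature are hypothetical, and the authors' parallel MD code is not reproduced here. Only the 2-fs time step and the 0.1-ps coupling constant come from the note.

```python
import math

def berendsen_scale(t_current, t_target, dt_ps, tau_ps):
    """Velocity scaling factor for Berendsen weak coupling (hypothetical helper).

    Velocities are multiplied by this factor each step so that the
    instantaneous temperature relaxes toward t_target with time constant tau_ps.
    """
    return math.sqrt(1.0 + (dt_ps / tau_ps) * (t_target / t_current - 1.0))

DT_PS = 0.002    # 2-fs time step (from the note)
TAU_T_PS = 0.1   # temperature coupling constant (from the note)

# Assumed example: the system is momentarily at 310 K with a 300 K bath.
lam = berendsen_scale(t_current=310.0, t_target=300.0, dt_ps=DT_PS, tau_ps=TAU_T_PS)
print(f"scale factor: {lam:.6f}")  # slightly below 1, so velocities shrink
```

With Δt/τ = 0.02, each step removes only about 2% of the temperature deviation, which is why the coupling is described as weak.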

Preparation: The starting coordinates were the NMR structure of villin headpiece subdomain by McKnight et al. [McKnight C. J., Matsudaira P. T., Kim P. S., Nature Struct. Biol. 4, 180 (1997)] {Protein Data Bank [Bernstein F. C., et al., J. Mol. Biol. 112, 535 (1977)] access code 1vii}. It was denatured by carrying out a 1-ns simulation in water at 1000 K using constant volume. The denatured molecule was then immersed in a truncated octahedron water box constructed from a cubic box of 76.5 Å. A total of 6510 water molecules were retained. The excess water molecules were removed after ∼20 ns, when a semistable compact structure had formed, to reduce the computational cost. About 3000 water molecules were retained for the remainder of the simulation.

Production: The simulation was started from an equilibration phase of 1.0 ns at 200 K and 1 atm pressure. The long equilibration phase was intended to mimic an equilibrated, fully denatured state, to solvate the molecule adequately, and to minimize any instability caused by the high-temperature origin of the starting structure. The density of the system was initially 0.90 g/cm³, increased to 1.05 g/cm³ within 10 ps, and remained so for the remainder of the 1-ns solvent equilibration trajectory. The simulation was then conducted for 1.0 μs (0.5 billion integration steps). Temperature and pressure were controlled at physiological conditions (that is, 300 K and 1 atm) by the methods described above. Both temperature and density stabilized within 10 ps. The trajectory was saved at 20-ps intervals for the analysis. The entire simulation took ∼2 months' CPU time on a 256-CPU Cray T3D and an equal amount of CPU time on a 256-CPU Cray T3E-600.
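The step and frame counts quoted in these notes follow directly from the 1.0-μs length, the 2-fs time step, and the 20-ps save interval. A quick check in integer femtoseconds (variable names are illustrative only):

```python
FS_PER_PS = 1_000
FS_PER_US = 1_000_000_000

total_fs = 1 * FS_PER_US   # 1.0 μs production run
time_step_fs = 2           # 2-fs integration step
save_interval_ps = 20      # trajectory saved every 20 ps

n_steps = total_fs // time_step_fs
n_frames = total_fs // (save_interval_ps * FS_PER_PS)
steps_per_frame = n_steps // n_frames

print(n_steps)          # 500000000 (0.5 billion integration steps)
print(n_frames)         # 50000 saved coordinate sets
print(steps_per_frame)  # 10000 steps between saved frames
```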

The use of nonbonded cutoffs in the simulation as performed here is an inherent source of “noise” in the simulation trajectory. Such noise is not as deleterious in small proteins with limited formal charges as it is in, for example, RNA and DNA [Cheatham T. E., Miller J. L., Fox T., Darden T. A., Kollman P. A., J. Am. Chem. Soc. 117, 4193 (1995)], as demonstrated by the stability of the 100-ns simulation at 300 K started from the native NMR structure. But accurate inclusion of long-range electrostatic effects [Essmann U., et al., J. Chem. Phys. 103, 8577 (1995)] does provide an additional challenge for achieving high parallelism in the MD code. Work is in progress to include long-range electrostatic effects while still achieving a level of parallelism and speed comparable to that presented here.

Ptitsyn O. B., Curr. Opin. Struct. Biol. 5, 74 (1995).

Ballew R. M., Sabelko J., Gruebele M., Proc. Natl. Acad. Sci. U.S.A. 93, 5759 (1996).

Burton R. E., Huang G. S., Daugherty M. A., Calderone T. L., Oas T. G., Nature Struct. Biol. 4, 305 (1997).

Hagen S. J., Hofrichter J., Szabo A., Eaton W. A., Proc. Natl. Acad. Sci. U.S.A. 93, 11615 (1996);

Eaton W. A., Muñoz V., Thompson P. A., Chan C. K., Hofrichter J., Curr. Opin. Struct. Biol. 7, 10 (1997).

K. W. Plaxco, personal communication (1998).

Gilmanshin R., Williams S., Callender R. H., Woodruff W. H., Dyer R. B., Proc. Natl. Acad. Sci. U.S.A. 94, 3709 (1997).

Williams S., et al., Biochemistry 35, 691 (1996);

Thompson P. A., Eaton W. A., Hofrichter J., ibid. 36, 9200 (1997).

Alonso D. O. V., Daggett V., J. Mol. Biol. 247, 501 (1995); Protein Sci. 7, 860 (1998).

10.1126/science.181.4096.223

A total of 50,000 sets of coordinates were accumulated, one every 20 ps. They were clustered by comparison with the average coordinate of existing clusters using a 3.0 Å main-chain rmsd cutoff, similar to the method described by Karpen et al. [Karpen M. E., Tobias D. J., Brooks C. L., Biochemistry 32, 412 (1993)]. Those that are within 3.0 Å main-chain rmsd from the average coordinates of the cluster are assigned to the cluster. A total of 98 clusters were produced. Thirty clusters were highly populated with ∼500 or more coordinate sets, and 13 clusters had more than 1000 sets of coordinates.
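The clustering rule in this note (assign a coordinate set to a cluster when it lies within 3.0 Å rmsd of that cluster's average coordinates) can be sketched as below. This is a minimal illustration, not the authors' program. The Kabsch superposition, the choice of the nearest qualifying cluster, the running-average update, and the names `superpose` and `cluster_frames` are all assumptions, and the toy coordinates stand in for main-chain atoms.

```python
import numpy as np

def superpose(mobile, ref):
    """Align `mobile` onto `ref` (both (N, 3) arrays) with the Kabsch method.

    Returns the aligned copy of `mobile` and the rmsd after superposition.
    """
    m0 = mobile - mobile.mean(axis=0)
    r0 = ref - ref.mean(axis=0)
    u, _, vt = np.linalg.svd(m0.T @ r0)
    # Flip the last axis if needed so the transform is a proper rotation.
    d = 1.0 if np.linalg.det(u @ vt) >= 0 else -1.0
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    aligned = m0 @ rot + ref.mean(axis=0)
    rmsd = float(np.sqrt(((aligned - ref) ** 2).sum() / len(ref)))
    return aligned, rmsd

def cluster_frames(frames, cutoff=3.0):
    """Assign each frame to the nearest cluster average within `cutoff` (Å)."""
    centers, counts, labels = [], [], []
    for xyz in frames:
        best, best_rmsd, best_aligned = None, cutoff, None
        for k, center in enumerate(centers):
            aligned, rmsd = superpose(xyz, center)
            if rmsd <= best_rmsd:
                best, best_rmsd, best_aligned = k, rmsd, aligned
        if best is None:
            # No existing average is close enough: start a new cluster.
            centers.append(xyz - xyz.mean(axis=0))
            counts.append(1)
            labels.append(len(centers) - 1)
        else:
            # Update the running average with the superposed frame.
            n = counts[best]
            centers[best] = (centers[best] * n + best_aligned) / (n + 1)
            counts[best] = n + 1
            labels.append(best)
    return labels, centers

# Toy data (assumed): an extended zigzag and a compact ring, plus rigid-body copies.
idx = np.arange(10)
shape_a = np.column_stack([3.8 * idx, (-1.0) ** idx, 0.3 * idx])
angles = 2 * np.pi * idx / 10
shape_b = np.column_stack([5 * np.cos(angles), 5 * np.sin(angles), 0.5 * idx])
c, s = np.cos(0.7), np.sin(0.7)
rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
frames = [shape_a, shape_a @ rot_z.T, shape_b, shape_b @ rot_z.T, shape_a + 5.0]

labels, centers = cluster_frames(frames)
print(labels)  # rotated or translated copies fall into the same cluster
```

Because each frame is compared only with cluster averages, the cost grows with the number of clusters rather than with the number of frames squared, which matters for 50,000 coordinate sets.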

Plaxco K. W., Simons K. T., Baker D., J. Mol. Biol. 277, 985 (1998).

Y. Duan and P. A. Kollman, unpublished data.

Eisenberg D., McLachlan A. D., Nature 319, 199 (1986).

Supercomputing time was provided by Cray Research, a subsidiary of Silicon Graphics Inc. (SGI), and by the Pittsburgh Supercomputing Center (PSC). We are grateful to R. Roskies and M. Levine (PSC), J. Carpenter and H. Pritchard (SGI), and J. Wendoloski (AMGEN) for their support. We thank K. Dill, D. Agard, I. Kuntz, J. Pitera, and T. Cheatham for critical reading of the manuscript; L. Wang, C. Simmerling, M. Crowley, J. Wang, and W. Wang for stimulating discussions; and L. Chiche for the solvation free-energy calculation program. Graphics were provided by the Computer Graphics Lab of the University of California, San Francisco (T. Ferrin, Principal Investigator, grant RR-1081). This work was supported in part by NIH grant GM-29072, by a University of California Biotechnology Star grant, and by AMGEN (P.A.K.).