Automatic early stopping using cross validation: quantifying the criteria

Neural Networks - Volume 11 - Pages 761-767 - 1998
Lutz Prechelt1
1Fakultät für Informatik, Universität Karlsruhe, Karlsruhe, Germany

References

Baldi, 1991. Temporal evolution of generalization during learning in linear networks. Neural Computation, 3, 589. doi:10.1162/neco.1991.3.4.589

Cowan, J.D., Tesauro, G. & Alspector, J. (Eds.), 1994. Advances in Neural Information Processing Systems 6. Morgan Kaufmann, San Mateo, CA.

Cun, Y.L., Denker, J.S. & Solla, S.A., 1990. Optimal brain damage. In: Touretzky, D.S. (Ed.), Advances in Neural Information Processing Systems 2. Morgan Kaufmann, San Mateo, CA, pp. 598–605.

Fiesler, E., 1994. Comparative bibliography of ontogenic neural networks. In: International Conference on Artificial Neural Networks. Springer, London.

Fahlman, S.E., 1988. An empirical study of learning speed in back-propagation networks. Technical Report CMU-CS-88-162, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA.

Fahlman, S.E. & Lebiere, C., 1990. The cascade-correlation learning architecture. In: Touretzky, D.S. (Ed.), Advances in Neural Information Processing Systems 2. Morgan Kaufmann, San Mateo, CA, pp. 524–532.

Finnoff, 1993. Improving model selection by nonconvergent methods. Neural Networks, 6, 771. doi:10.1016/S0893-6080(05)80122-4

Geman, 1992. Neural networks and the bias/variance dilemma. Neural Computation, 4, 1. doi:10.1162/neco.1992.4.1.1

Hanson, S.J., Cowan, J.D. & Giles, C.L. (Eds.), 1993. Advances in Neural Information Processing Systems 5. Morgan Kaufmann, San Mateo, CA.

Hassibi, B. & Stork, D.G., 1993. Second order derivatives for network pruning: optimal brain surgeon. In: Advances in Neural Information Processing Systems 5. Morgan Kaufmann, San Mateo, CA, pp. 164–171.

Krogh, A. & Hertz, J.A., 1992. A simple weight decay can improve generalization. In: Advances in Neural Information Processing Systems 4. Morgan Kaufmann, San Mateo, CA, pp. 950–957.

Levin, A.U., Leen, T.K. & Moody, J.E., 1994. Fast pruning using principal components. In: Advances in Neural Information Processing Systems 6. Morgan Kaufmann, San Mateo, CA.

Lippmann, R.P., Moody, J.E. & Touretzky, D.S. (Eds.), 1991. Advances in Neural Information Processing Systems 3. Morgan Kaufmann, San Mateo, CA.

Moody, J.E., Hanson, S.J. & Lippmann, R.P. (Eds.), 1992. Advances in Neural Information Processing Systems 4. Morgan Kaufmann, San Mateo, CA.

Morgan, N. & Bourlard, H., 1990. Generalization and parameter estimation in feedforward nets: some experiments. In: Touretzky, D.S. (Ed.), Advances in Neural Information Processing Systems 2. Morgan Kaufmann, San Mateo, CA, pp. 630–637.

Nowlan, 1992. Simplifying neural networks by soft weight-sharing. Neural Computation, 4, 473. doi:10.1162/neco.1992.4.4.473

Prechelt, L., 1994. PROBEN1—a set of benchmarks and benchmarking rules for neural network training algorithms. Technical Report 21/94, Fakultät für Informatik, Universität Karlsruhe, Germany. Anonymous FTP: /pub/papers/techreports/1994/1994-21.ps.gz on ftp.ira.uka.de

Reed, 1993. Pruning algorithms—a survey. IEEE Transactions on Neural Networks, 4, 740. doi:10.1109/72.248452

Riedmiller, M. & Braun, H., 1993. A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In: Proceedings of the IEEE International Conference on Neural Networks, San Francisco, CA, pp. 586–591.

Touretzky, D.S. (Ed.), 1990. Advances in Neural Information Processing Systems 2. Morgan Kaufmann, San Mateo, CA.

Wang, C., Venkatesh, S.S. & Judd, J.S., 1994. Optimal stopping and effective machine complexity in learning. In: Advances in Neural Information Processing Systems 6. Morgan Kaufmann, San Mateo, CA.

Weigend, A.S., Rumelhart, D.E. & Huberman, B.A., 1991. Generalization by weight-elimination with application to forecasting. In: Advances in Neural Information Processing Systems 3. Morgan Kaufmann, San Mateo, CA, pp. 875–882.