Network Synthesis through Data-Driven Growth and Decay
References
Amari, 1993, Statistical theory of learning curves under entropic loss criterion, Neural Computation, 5, 140, 10.1162/neco.1993.5.1.140
Atiya, 1997, How do initial conditions affect generalization performance of large networks, IEEE Transactions on Neural Networks, 8, 448, 10.1109/72.557701
Barron, A. (1991). Approximation and estimation bounds for artificial neural networks. Proc. of the 4th Workshop on Computational Learning Theory (pp. 243–249).
Baum, 1989, What size net gives valid generalization?, Neural Computation, 1, 151, 10.1162/neco.1989.1.1.151
Craig, J.J. (1986). Introduction to robotics: mechanics and control. New York: Addison-Wesley.
Denker, J.S., LeCun, Y. & Solla, S.A. (1989). Optimal Brain Damage. In D. Touretzky (Ed.), Advances in neural information processing systems, Vol. 2, (pp. 598–605). San Mateo: Morgan Kaufmann.
Duda, R. & Hart, P. (1973). Pattern classification and scene analysis. New York: John Wiley and Sons.
Fahlman, S.E. & Lebiere, C. (1989). The cascade-correlation learning architecture. In D. Touretzky (Ed.), Advances in neural information processing systems, Vol. 2, (pp. 524–532). San Mateo: Morgan Kaufmann.
Frean, 1990, The upstart algorithm: a method for constructing and training feedforward neural networks, Neural Computation, 2, 198, 10.1162/neco.1990.2.2.198
Geman, 1992, Neural networks and the bias/variance dilemma, Neural Computation, 4, 1, 10.1162/neco.1992.4.1.1
Hassibi, B. & Stork, D.G. (1992). Second order derivatives for network pruning: optimal brain surgeon. In D. Touretzky (Ed.), Advances in neural information processing systems, Vol. 5, (pp. 164–171). San Mateo: Morgan Kaufmann.
Ji, 1990, Generalizing smoothness constraints from discrete samples, Neural Computation, 2, 190, 10.1162/neco.1990.2.2.188
Lee, 1991, Handwritten digit recognition using k nearest neighbour, radial-basis function, and backpropagation, Neural Networks, 3, 440
Martin, 1991, Recognizing hand-printed letters and digits using back propagation learning, Neural Computation, 3, 258, 10.1162/neco.1991.3.2.258
Moody, J. (1991). The effective number of parameters: an analysis of generalization and regularization in nonlinear learning systems. In D. Touretzky (Ed.), Advances in neural information processing systems, Vol. 4, (pp. 847–854). San Mateo: Morgan Kaufmann.
Nadel, 1989, Study of a growth algorithm for neural networks, International Journal of Neural Systems, 1, 55, 10.1142/S0129065789000463
Nowlan, 1992, Simplifying neural networks by soft weight sharing, Neural Computation, 4, 473, 10.1162/neco.1992.4.4.473
Sackinger, 1992, Application of the ANNA neural network chip to high-speed character recognition, IEEE Transactions on Neural Networks, 3, 498, 10.1109/72.129422
Weigend, A., Rumelhart, D.E. & Huberman, B.A. (1990). Generalization by weight elimination with application to forecasting. In D. Touretzky (Ed.), Advances in neural information processing systems, Vol. 3, (pp. 875–882). San Mateo: Morgan Kaufmann.