On Sharpness of Error Bounds for Univariate Approximation by Single Hidden Layer Feedforward Neural Networks

Results in Mathematics, vol. 75, pp. 1–35 (2020)
Steffen Goebbels1
1Faculty of Electrical Engineering and Computer Science, Institute for Pattern Recognition, Niederrhein University of Applied Sciences, Krefeld, Germany

Abstract

A new non-linear variant of a quantitative extension of the uniform boundedness principle is used to show the sharpness of error bounds for univariate approximation by sums of sigmoid and ReLU functions. Single hidden layer feedforward neural networks with one input node perform such operations. Errors of best approximation can be expressed in terms of moduli of smoothness of the function to be approximated (i.e., to be learned). In this context, the quantitative extension of the uniform boundedness principle allows the construction of counterexamples that show the approximation rates to be best possible: the approximation errors do not belong to the little-o class of the given bounds. With piecewise linear activation functions, the problem under discussion becomes free-knot spline approximation. The results of the present paper also hold for non-polynomial (and not piecewise defined) activation functions such as the inverse tangent. Based on the Vapnik–Chervonenkis dimension, first results are established for the logistic function.
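For orientation, here is a hedged sketch of the setting in standard notation (the symbols below are ours and are not quoted from the paper). A single hidden layer feedforward network with one input node, $n$ hidden neurons, and activation function $\sigma$ computes

$$g(x) = \sum_{k=1}^{n} c_k\,\sigma(a_k x + b_k), \qquad a_k, b_k, c_k \in \mathbb{R}.$$

The error bounds in question estimate the error of best approximation by such sums in terms of a modulus of smoothness $\omega_r$, typically in the form

$$\inf_{g}\,\| f - g \|_{\infty} \le C\,\omega_r\!\left(f, \tfrac{1}{n}\right),$$

and sharpness means that some counterexample $f_0$ exists whose approximation error is not $o\!\left(\omega_r\!\left(f_0, \tfrac{1}{n}\right)\right)$ as $n \to \infty$. For the ReLU activation $\sigma(x) = \max(0, x)$ (with $a_k \neq 0$), $g$ is piecewise linear with at most $n$ free knots at $x = -b_k/a_k$, which is why the problem turns into free-knot spline approximation.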
