Automatic bandwidth selection for recursive kernel density estimators with length-biased data

Japanese Journal of Statistics and Data Science - Volume 3 - Pages 429-452 - 2019
Yousri Slaoui¹
¹Laboratoire de Mathématiques et Application, Université de Poitiers, Futuroscope, Chasseneuil, France

Abstract

In this paper, we propose an automatic bandwidth selection method for recursive kernel estimators of a probability density function, defined by a stochastic approximation algorithm, in the case of length-biased data. We compare the proposed plug-in method with the cross-validation method and the so-called smooth bootstrap bandwidth selector, through simulations and on a real data set. The results show that, with the selected plug-in bandwidth and some special stepsizes, the proposed recursive estimators are very competitive with the non-recursive one in terms of estimation error and much better in terms of computational cost.
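
To make the setting concrete, the following is a minimal sketch of a recursive kernel density estimator for length-biased data. It is not the paper's exact estimator or its plug-in bandwidth selector: it assumes a Gaussian kernel, a Robbins-Monro style update with stepsize gamma_n = 1/n, an illustrative bandwidth h_n = n^(-1/5), and the usual 1/Y reweighting for length bias. The function names and default parameters are hypothetical.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def recursive_length_biased_kde(samples, grid,
                                gamma=lambda n: 1.0 / n,
                                bandwidth=lambda n: n ** -0.2):
    """Recursively estimate the target density f on `grid`.

    `samples` are positive observations drawn from the length-biased
    density g(y) = y * f(y) / mu. Each observation is used once, in a
    single pass, and is weighted by 1 / y to undo the length bias.
    The stepsize and bandwidth sequences are illustrative assumptions,
    not the selections proposed in the paper.
    """
    numerator = [0.0] * len(grid)  # running estimate of f(x) / mu
    inv_mean = 0.0                 # running estimate of E[1 / Y] = 1 / mu
    for n, y in enumerate(samples, start=1):
        g_n, h_n = gamma(n), bandwidth(n)
        weight = 1.0 / y
        for j, x in enumerate(grid):
            update = weight * gaussian_kernel((x - y) / h_n) / h_n
            numerator[j] = (1.0 - g_n) * numerator[j] + g_n * update
        inv_mean = (1.0 - g_n) * inv_mean + g_n * weight
    # Dividing by the estimate of 1 / mu rescales f / mu to f.
    return [value / inv_mean for value in numerator]


if __name__ == "__main__":
    # Target f is Gamma(2, 1), so the length-biased law g is Gamma(3, 1).
    rng = random.Random(0)
    data = [rng.gammavariate(3.0, 1.0) for _ in range(3000)]
    points = [0.5, 1.0, 2.0, 4.0]
    estimate = recursive_length_biased_kde(data, points)
    for x, value in zip(points, estimate):
        print(f"x={x:.1f}  estimate={value:.4f}  true={x * math.exp(-x):.4f}")
```

Because each new observation only updates the running values on the grid, the cost per observation is proportional to the grid size, which is the kind of computational saving over a non-recursive estimator that the abstract refers to.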
