Assessing the Finite Dimensionality of Functional Data

Peter Hall (1), Céline Vial (1, 2)
(1) Australian National University, Canberra, Australia
(2) Université de Paris X-Nanterre and Université Paris VI et VII, France

Summary

If a problem in functional data analysis is low dimensional, then the methodology for its solution can often be reduced to relatively conventional techniques in multivariate analysis. Hence, there is intrinsic interest in assessing the finite dimensionality of functional data. We show that this problem has several unique features. From some viewpoints the problem is trivial, in the sense that continuously distributed functional data which are exactly finite dimensional are immediately recognizable as such, if the sample size is sufficiently large. In practice, however, functional data are almost always observed with noise, resulting for example from rounding or experimental error, and the problem then becomes almost insolubly difficult. In such cases a part of the average noise variance is confounded with the true signal and is not identifiable. It is nevertheless possible to define the unconfounded part of the noise variance. This represents the best possible lower bound to all potential values of average noise variance and is estimable in low noise settings. Moreover, bootstrap methods can be used to describe the reliability of estimates of unconfounded noise variance, under the assumption that the signal is finite dimensional. Motivated by these ideas, we suggest techniques for assessing the finiteness of dimensionality. In particular, we show how to construct a critical point $\hat{V}_q$ such that, if the distribution of our functional data has fewer than $q-1$ degrees of freedom, then we should be willing to assume that the average variance of the added noise is at least $\hat{V}_q$. If this level seems too high, then we must conclude that the dimension is at least $q-1$. We show that simpler, more conventional techniques, based on hypothesis testing, are generally not effective.
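The contrast between the noise-free and the noisy case can be illustrated with a small simulation. The sketch below is not the authors' procedure and does not compute the critical point described above. It only counts the non-negligible eigenvalues of an empirical covariance matrix. The sample size, grid, sine basis, score variances, noise level, and the helper names `covariance_eigenvalues` and `numerical_dimension` are all assumptions made for illustration.

```python
import numpy as np


def covariance_eigenvalues(curves):
    """Eigenvalues (descending) of the empirical covariance of discretised curves."""
    centred = curves - curves.mean(axis=0)
    cov = centred.T @ centred / curves.shape[0]
    return np.linalg.eigvalsh(cov)[::-1]


def numerical_dimension(eigenvalues, rel_tol=1e-10):
    """Count eigenvalues that are non-negligible relative to the largest one."""
    return int(np.sum(eigenvalues > rel_tol * eigenvalues[0]))


rng = np.random.default_rng(0)
n, m, true_dim = 200, 50, 3  # hypothetical sample size, grid size, dimension
t = np.linspace(0.0, 1.0, m)

# Hypothetical signal: random combinations of three sine functions,
# so every noise-free curve lies in a 3-dimensional space.
basis = np.stack([np.sin((k + 1) * np.pi * t) for k in range(true_dim)])
scores = rng.normal(size=(n, true_dim)) * np.array([3.0, 2.0, 1.0])
signal = scores @ basis

# Add small independent noise at every grid point (assumed noise model).
noise_sd = 0.1
noisy = signal + rng.normal(scale=noise_sd, size=(n, m))

dim_clean = numerical_dimension(covariance_eigenvalues(signal))
dim_noisy = numerical_dimension(covariance_eigenvalues(noisy))
print("noise-free:", dim_clean, "with noise:", dim_noisy)
```

Under these assumptions the noise-free sample has exactly three non-negligible eigenvalues, so its dimension is read off directly. Once noise is added, more than three eigenvalues are clearly positive, and an eigenvalue count alone can no longer separate signal from noise.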
