Self-organizing kernel adaptive filtering
Abstract
This paper presents a model-selection strategy based on the minimum description length (MDL) principle that keeps the kernel least-mean-square (KLMS) model matched to the complexity of the input data. The proposed KLMS-MDL filter adapts its model order as well as its coefficients online, behaving as a self-organizing system and achieving a good compromise between accuracy and computational complexity without a priori knowledge of the data. In particular, in nonstationary scenarios the model order of the proposed algorithm adapts continuously to the structure of the input data. Experiments show that the proposed algorithm builds compact kernel adaptive filters with better accuracy than sparsified or fixed-budget KLMS algorithms.
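To make the mechanism concrete, here is a minimal sketch of an MDL-gated KLMS update in Python. It is not the paper's KLMS-MDL algorithm: the two-part description-length score (a Gaussian residual coding cost plus a (k/2) log n penalty on the k stored coefficients), the kernel width sigma, the step size eta, and the sliding error window are all illustrative assumptions, and the sketch only gates dictionary growth, whereas the full algorithm also lets the model order shrink.

```python
import numpy as np

class KLMSMDLSketch:
    """Illustrative KLMS filter whose dictionary growth is gated by a
    generic two-part MDL score. The score, constants, and update
    schedule are assumptions, not the paper's exact criterion."""

    def __init__(self, eta=0.5, sigma=1.0, window=200):
        self.eta = eta            # LMS step size (assumed value)
        self.sigma = sigma        # Gaussian kernel width (assumed value)
        self.window = window      # sliding window of squared errors
        self.centers, self.alphas, self.errs = [], [], []

    def _kernel(self, x, c):
        d = np.asarray(x, dtype=float) - np.asarray(c, dtype=float)
        return np.exp(-np.dot(d, d) / (2.0 * self.sigma ** 2))

    def predict(self, x):
        return sum(a * self._kernel(x, c)
                   for a, c in zip(self.alphas, self.centers))

    def _mdl(self, sq_errs, k):
        # Two-part code length: data cost under a Gaussian residual
        # model, (n/2)*log(var), plus (k/2)*log(n) for k coefficients.
        n = len(sq_errs)
        var = max(np.mean(sq_errs), 1e-12)
        return 0.5 * n * np.log(var) + 0.5 * k * np.log(n)

    def update(self, x, y):
        e = y - self.predict(x)
        # Description length if the current model order is kept.
        keep = self._mdl(self.errs + [e ** 2], len(self.centers))
        # Tentative KLMS growth step: store x with coefficient eta*e,
        # which shrinks the instantaneous error to (1 - eta) * e.
        self.centers.append(x)
        self.alphas.append(self.eta * e)
        e_grow = y - self.predict(x)
        grow = self._mdl(self.errs + [e_grow ** 2], len(self.centers))
        if grow < keep:                     # new unit pays for itself
            self.errs.append(e_grow ** 2)
        else:                               # discard the new unit
            self.centers.pop()
            self.alphas.pop()
            self.errs.append(e ** 2)
        self.errs = self.errs[-self.window:]
        return e
```

The design point the abstract relies on is visible here: a new kernel unit is kept only when it reduces the residual coding cost by more than its own (1/2) log n description cost, so the dictionary size rises and falls with signal complexity instead of growing linearly with the number of samples, as plain KLMS does.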
References
W Liu, PP Pokharel, JC Príncipe, The kernel least mean square algorithm. IEEE Trans. Sig. Process. 56(2), 543–554 (2008).
Y Engel, S Mannor, R Meir, The kernel recursive least-squares algorithm. IEEE Trans. Sig. Process. 52(8), 2275–2285 (2004).
J Platt, A resource-allocating network for function interpolation. Neural Comput. 3(2), 213–225 (1991).
P Bouboulis, S Theodoridis, Extension of Wirtinger’s calculus to reproducing kernel Hilbert spaces and the complex kernel LMS. IEEE Trans. Sig. Process. 59(3), 964–978 (2011).
K Slavakis, S Theodoridis, Sliding window generalized kernel affine projection algorithm using projection mappings. EURASIP J. Adv. Sig. Process. 2008, 1–16 (2008).
C Richard, JCM Bermudez, P Honeine, Online prediction of time series data with kernels. IEEE Trans. Sig. Process. 57(3), 1058–1066 (2009).
W Liu, Il Park, JC Príncipe, An information theoretic approach of designing sparse kernel adaptive filters. IEEE Trans. Neural Netw. 20(12), 1950–1961 (2009).
B Chen, S Zhao, P Zhu, JC Príncipe, Quantized kernel least mean square algorithm. IEEE Trans. Neural Netw. Learn. Syst. 23(1), 22–32 (2012).
B Chen, S Zhao, P Zhu, JC Príncipe, Quantized kernel recursive least squares algorithm. IEEE Trans. Neural Netw. Learn. Syst. 24(9), 1484–1491 (2013).
S Van Vaerenbergh, J Via, I Santamaría, A sliding-window kernel RLS algorithm and its application to nonlinear channel identification. IEEE Int. Conf. Acoust. Speech Sig. Process, 789–792 (2006).
S Van Vaerenbergh, J Via, I Santamaría, Nonlinear system identification using a new sliding-window kernel RLS algorithm. J. Commun. 2(3), 1–8 (2007).
S Van Vaerenbergh, I Santamaría, W Liu, JC Príncipe, Fixed-budget kernel recursive least-squares. IEEE Int. Conf. Acoust. Speech Sig. Process, 1882–1885 (2010).
M Lázaro-Gredilla, S Van Vaerenbergh, I Santamaría, A Bayesian approach to tracking with kernel recursive least-squares. IEEE Int. Work. Mach. Learn. Sig. Process. (MLSP), 1–6 (2011).
S Zhao, B Chen, P Zhu, JC Príncipe, Fixed budget quantized kernel least-mean-square algorithm. Sig. Process. 93(9), 2759–2770 (2013).
D Rzepka, in 2012 IEEE 17th Conference on Emerging Technologies & Factory Automation (ETFA). Fixed-budget kernel least mean squares (Krakow, 2012), pp. 1–4.
K Nishikawa, Y Ogawa, F Albu, in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). Fixed order implementation of kernel RLS-DCD adaptive filters (Kaohsiung, 2013), pp. 1–6.
K Slavakis, P Bouboulis, S Theodoridis, Online learning in reproducing kernel Hilbert spaces. Sig. Process. Theory Mach. Learn. 1, 883–987 (2013).
W Gao, J Chen, C Richard, J Huang, Online dictionary learning for kernel LMS. IEEE Trans. Sig. Process. 62(11), 2765–2777 (2014).
J Rissanen, Modeling by shortest data description. Automatica. 14(5), 465–471 (1978).
M Li, PMB Vitányi, An introduction to Kolmogorov complexity and its applications (Springer-Verlag, New York, 2008).
H Akaike, A new look at the statistical model identification. IEEE Trans. Autom. Control. 19(6), 716–723 (1974).
G Schwarz, Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978).
A Barron, J Rissanen, B Yu, The minimum description length principle in coding and modeling. IEEE Trans. Inf. Theory. 44(6), 2743–2760 (1998).
J Rissanen, Universal coding, information, prediction, and estimation. IEEE Trans. Inf. Theory. 30(4), 629–636 (1984).
J Rissanen, MDL denoising. IEEE Trans. Inf. Theory. 46(7), 2537–2543 (2000).
L Xu, Bayesian Ying Yang learning (II): a new mechanism for model selection and regularization. Intell. Technol. Inf. Anal., 661–706 (2004).
Y Zhao, M Small, Minimum description length criterion for modeling of chaotic attractors with multilayer perceptron networks. IEEE Trans. Circ. Syst. I: Regular Pap. 53(3), 722–732 (2006).
T Nakamura, K Judd, AI Mees, M Small, A comparative study of information criteria for model selection. Int. J. Bifurcation Chaos Appl. Sci. Eng. 16(8), 2153–2175 (2006).
T Cover, J Thomas, Elements of information theory (Wiley, 1991).
M Hansen, B Yu, Minimum description length model selection criteria for generalized linear models. Stat. Sci.: A Festschrift for Terry Speed. 40, 145–163 (2003).
K Shinoda, T Watanabe, MDL-based context-dependent subword modeling for speech recognition. Acoust. Sci. Technol. 21(2), 79–86 (2000).
AA Ramos, The minimum description length principle and model selection in spectropolarimetry. Astrophys. J. 646(2), 1445–1451 (2006).
RS Zemel, A minimum description length framework for unsupervised learning, Dissertation (University of Toronto, 1993).
AWF Edwards, Likelihood (Cambridge University Press, 1984).
M Small, CK Tse, Minimum description length neural networks for time series prediction. Phys. Rev. E. 66(6), 066701 (2002).
A Ning, H Lau, Y Zhao, TT Wong, Fulfillment of retailer demand by using the MDL-optimal neural network prediction and decision policy. IEEE Trans. Ind. Inform. 5(4), 495–506 (2009).
JS Wang, YL Hsu, An MDL-based Hammerstein recurrent neural network for control applications. Neurocomputing. 74(1), 315–327 (2010).
YI Molkov, DN Mukhin, EM Loskutov, AM Feigin, GA Fidelin, Using the minimum description length principle for global reconstruction of dynamic systems from noisy time series. Phys. Rev. E. 80(4), 046207 (2009).
A Leonardis, H Bischof, An efficient MDL-based construction of RBF networks. Neural Netw. 11(5), 963–973 (1998).
H Bischof, A Leonardis, A Selb, MDL principle for robust vector quantisation. Pattern Anal. Appl. 2(1), 59–72 (1999).
T Rakthanmanon, EJ Keogh, S Lonardi, S Evans, MDL-based time series clustering. Knowl. Inf. Syst., 1–29 (2012).
H Bischof, A Leonardis, in 15th International Conference on Pattern Recognition. Fuzzy c-means in an MDL-framework (Barcelona, 2000), pp. 740–743.
I Jonyer, LB Holder, DJ Cook, MDL-based context-free graph grammar induction and applications. Int. J. Artif. Intell. Tools. 13(1), 65–80 (2004).
S Papadimitriou, J Sun, C Faloutsos, P Yu, Hierarchical, parameter-free community discovery. Mach. Learn. Knowl. Discov. Databases, 170–187 (2008).
E Parzen, On estimation of a probability density function and mode. Ann. Math. Stat. 33(3), 1065–1076 (1962).
AM Mood, FA Graybill, DC Boes, Introduction to the Theory of Statistics (McGraw-Hill, USA, 1974).
SA van de Geer, Applications of empirical process theory (Cambridge University Press, Cambridge, 2000).
The Santa Fe time series competition data. http://www-psych.stanford.edu/~andreas/Time-Series/SantaFe.html. Accessed June 2016.
AS Weigend, NA Gershenfeld, Time series prediction: forecasting the future and understanding the past (Westview Press, 1994).
Sound files obtained from system simulations. http://www.cnel.ufl.edu/~pravin/Page_7.htm. Accessed June 2016.
B Bigi, in The Eighth International Conference on Language Resources and Evaluation (LREC). SPPAS: a tool for the phonetic segmentation of speech (Istanbul, 2012), pp. 1748–1755.
SPPAS: automatic annotation of speech. http://www.sppas.org. Accessed June 2016.