A novel pattern extraction method for time series classification

Springer Science and Business Media LLC - Volume 10 - Pages 253-271 - 2008
Xiaohang Zhang¹, Jun Wu¹, Xuecheng Yang¹, Haiying Ou¹, Tingjie Lv¹
¹School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing, China

Abstract

Multivariate time series classification is an important problem in machine learning. In this paper, we present a novel time series classification algorithm that adopts a triangle distance function as its similarity measure, extracts meaningful patterns from the original data, and uses a traditional machine learning algorithm to build a classifier from the extracted patterns. During pattern extraction, the Gini function is used to determine each pattern's starting position in the original data and its length. To improve computational efficiency, we also apply a sampling method to reduce the pattern search space. We evaluate the algorithm on common datasets and compare it with naive algorithms. The experimental results show that substantial improvement can be gained in interpretability, simplicity, and accuracy.
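
The pattern-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the triangle distance function, so plain Euclidean distance stands in for it, the series are assumed univariate and of equal length, and all function names (`gini`, `min_distance`, `split_gini`, `extract_pattern`) and the `n_samples` parameter are hypothetical. The sketch samples candidate patterns (a starting position and a length), scores each by the Gini impurity of the best distance-threshold split it induces, and keeps the lowest-scoring candidate.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def min_distance(pattern, series):
    """Smallest distance between the pattern and any equal-length window.

    Assumption: Euclidean distance is a placeholder for the paper's
    triangle distance, which the abstract does not define.
    """
    m = len(pattern)
    best = float("inf")
    for s in range(len(series) - m + 1):
        window = series[s:s + m]
        d = sum((p - x) ** 2 for p, x in zip(pattern, window)) ** 0.5
        best = min(best, d)
    return best


def split_gini(distances, labels):
    """Lowest weighted Gini impurity over all thresholds on the distances."""
    pairs = sorted(zip(distances, labels))
    n = len(pairs)
    best = gini(labels)  # score when no split is made
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal distances
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = min(best, score)
    return best


def extract_pattern(dataset, labels, n_samples=50, seed=0):
    """Sample candidate patterns and return the best one.

    Returns (score, series_index, start, length). Random sampling of
    candidates plays the role of the search-space reduction mentioned
    in the abstract; the exact sampling scheme is an assumption.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_samples):
        idx = rng.randrange(len(dataset))
        series = dataset[idx]
        length = rng.randint(2, len(series))
        start = rng.randint(0, len(series) - length)
        pattern = series[start:start + length]
        dists = [min_distance(pattern, s) for s in dataset]
        score = split_gini(dists, labels)
        if best is None or score < best[0]:
            best = (score, idx, start, length)
    return best
```

In a full pipeline along the lines the abstract describes, the distances from each series to the selected patterns would serve as features for a conventional classifier; that stage is omitted here.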
