Experiencing SAX: a novel symbolic representation of time series

Jessica Lin1, Eamonn Keogh2, Wei Li2, Stefano Lonardi2
1Information and Software Engineering Department, George Mason University, Fairfax, USA
2Computer Science & Engineering Department, University of California-Riverside, Riverside, USA

Tóm tắt

Từ khóa


Tài liệu tham khảo

Agrawal R, Psaila G, Wimmers EL, Zait M (1995) Querying shapes of histories. In: Proceedings of the 21st International conference on very large databases, Zurich, Switzerland, Sept 11–15, pp 502–514

Andre-Jonsson H, Badal D (1997) Using signature files for querying time-series data (1997) In: Proceedings of principles of data mining and knowledge discovery, 1st European symposium, Trondheim, Norway. June 24–27, pp 211–220

Androulakis IP (2005) New approaches for representing, analyzing and visualizing complex kinetic mechanisms. In: Proceedings of the 15th European symposium on computer aided process engineering, Barcelona, Spain. May 29–June 1

Apostolico A, Bock ME, Lonardi S (2002) Monotony of surprise in large-scale quest for unusual words. In: Proceedings of the 6th International conference on research in computational molecular biology, Washington, DC, April 18–21, pp 22–31

Bagnall AJ, Janakec G (2004) Clustering time series from arma models with clipped data. In: Proceedings of the 10th ACM SIGKDD International conference on knowledge discovery and data mining, Seattle, WA. August 22–25, pp 49–58

Bakalov P, Hadjieleftherious M, Tsotras VJ (2005) Time relaxed spatiotemporal trajectory. In: Proceedings of the ACM international symposium on advances in geographic information systems, Bremen, Germany. November 4–5

Bastogne T, Noura H, Richard A, Hittinger JM (2002) Application of subspace methods to the identification of a winding process. In: Proceedings of the 4th European control conference, Brussels, Belgium

Berndt D, Clifford J (1994) Using dynamic time warping to find patterns in time series, The workshop on knowledge discovery in databases, the 12th International conference on artificial intelligence, Seattle, WA, pp 229–248

Celly B, Zordan VB (2004) Animated people textures. In: Proceedings of the 17th international conference on computer animation and social agents, Geneva, Switzerland, July 7–9

Chan K, Fu AW (1999) Efficient time series matching by wavelets. In: Proceedings of the 15th IEEE International conference on data engineering, Sydney, Australia, March 23–26, pp 126–133

Chen JS, Moon YS, Yeung HW (2005) Palmprint authentication using time series. In: Proceedings of the 5th international conference on audio- and video-based biometric person authentication, Hilton Rye Town, NY, July 20–22

Chiu B, Keogh E, Lonardi S (2003) Probabilistic discovery of time series motifs. In: Proceedings of the 9th ACM SIGKDD International conference on knowledge discovery and data mining, Washington DC, USA. August 24–27, pp 493–498

Dasgupta D, Forrest S (1999) Novelty detection in time series data using ideas from immunology. In: Proceedings of the 8th International conference on intelligent systems, Denver, CO, June 24–26

Daw CS, Finney CEA, Tracy ER (2001) Symbolic analysis of experimental data. Review of Scientific Instruments 74:915–930

Ding C, He X, Zha H, Simon H (2002) Adaptive dimension reduction for clustering high dimensional data. In: Proceedings of the 2nd IEEE International conference on data mining. Maebashi, Japan, December 9–12, pp 147–154

Duchene F, Garbay C (2005) Apprentissage de Motifs Temporels, Multidimensionnels et Heterogenes: Application a la Telesurveillance Medicale. In: Proceedings of conference francophone sur l’apprentissage Automatique. Nice, France. May 31–June 3

Duchene F, Garbay C, Rialle V (2004) Mining heterogeneous multivariate time-series for learning meaningful patterns: application to home health telecare. Research Report 1070-I, Institue de Informatique et Mathematiques Appliquees de Grenoble, Grenoble, France

Durbin R, Eddy S, Krogh A, Mitchison G (1998) Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press

Faloutsos C, Ranganathan M, Manolopulos Y (1994) Fast subsequence matching in time-series databases. SIGMOD Record, vol 23, pp 419–429

Fayyad U, Reina C, Bradley P (1998) Initialization of iterative refinement clustering algorithms. In: Proceedings of the 4th International conference on knowledge discovery and data mining, New York, NY, August 27–31, pp 194–198

Ferreira PG, Azevedo P, Silva C, Brito R (2006) Mining approximate motifs in time series. In: Proceedings of the 9th international conference on discovery science, Barcelona, Spain, October 7–10

Gavrilov M, Anguelov D, Indyk P, Motwani R (2000) Mining the stock market: which measure is best? In: Proceedings of the 6th ACM International conference on knowledge discovery and data mining, Boston, MA, August 20–23, pp 487–496

Geurts P (2001) Pattern extraction for time series classification. In: Proceedings of the 5th European conference on principles of data mining and knowledge discovery, Freiburg, Germany, pp 115–127

Gionis A, Mannila H (2003) Finding recurrent sources in sequences. In: Proceedings of the 7th International conference on research in computational molecular biology. Berlin, Germany, pp 123–130

Hellerstein JM, Papadimitriou CH, Koutsoupias E (1997) Towards an analysis of indexing schemes. In: Proceedings of the 16th ACM symposium on principles of database systems, Tucson, AZ, May 12–14, pp 249–256

Huang YW, Yu PS (1999) Adaptive query processing for time-series data. In: Proceedings of the 5th ACM SIGKDD International conference on knowledge discovery and data mining, San Diego, CA, Aug 15–18, pp 282–286

Hugueney B (2006) Adaptive segmentation-based symbolic representation of time series for better modeling and lower bounding distance measures. In: Proceedings of the 10th European conference on principles and practice of knowledge discovery in databases, Berlin, Germany, September 18–22, pp 545–552

Kalpakis K, Gada D, Puttagunta V (2001) Distance measures for effective clustering of arima time-series. In: Proceedings of the 2001 IEEE International conference on data mining, San Jose, CA, November 29-December 2, pp 273–280

Keogh E, Chakrabarti K, Pazzani M (2001a) Locally adaptive dimensionality reduction for indexing large time series databases. In: Proceedings of ACM SIGMOD conference on management of data, Santa Barbara, May 21–24, pp 151–162

Keogh E, Chakrabarti K, Pazzani M, Mehrotra S (2001b) Dimensionality reduction for fast similarity search in large time series databases. J Knowledge Inform Syst. 3:263–286

Keogh E, Kasetty S (2002) On the need for time series data mining benchmarks: a survey and empirical demonstration. In: Proceedings of the 8th ACM SIGKDD International conference on knowledge discovery and data mining, Edmonton, Alberta, Canada, July 23–26, pp 102–111

Keogh E, Lin J, Fu AW (2005) HOT SAX: efficiently finding the most unusual time series subsequence. In: Proceedings of the 5th IEEE international conference on data mining, Houston, TX, November 27–30, pp 226–233

Keogh E, Lonardi S, Chiu B (2002) Finding surprising patterns in a time series database in linear time and space. In: Proceedings of the 8th ACM SIGKDD International conference on knowledge discovery and data mining, Edmonton, Alberta, Canada, July 23–26, pp 550–556

Keogh E, Lonardi S, Ratanamahatana CA (2004) Towards parameter-free data mining. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, Seattle. August 22–25, pp 206–215

Keogh E, Pazzani M (1998) An enhanced representation of time series which allows fast and accurate classification, clustering and relevance feedback. In: Proceedings of the 4th International conference on knowledge discovery and data mining, New York, NY, August 27–31, pp 239–241

Kumar N, Lolla N, Keogh E, Lonardi S, Ratanamahatana CA, Wei L (2005) Time series bitmaps: a practical visualization tool for working with large time series databases. In: Proceedings of the 2005 SIAM international conference on data mining, Newport Beach, CA, April 21–23, pp 531–535

Larsen RJ, Marx ML (1986) An introduction to mathematical statistics and its applications, 2nd edn. Prentice Hall, Englewood, Cliffs, NJ

Lin J, Keogh E (2006) Group SAX: extending the notion of contrast sets to time series and multimedia data. In: Proceedings of the 10th european conference on principles and practice of knowledge discovery in databases. Berlin, Germany, September 18–22, pp 284–296

Lin J, Keogh E, Lonardi S (2005) Visualizing and discovering non-trivial patterns in large time series databases. Inform Visual 4:61–82

Lin J, Keogh E, Lonardi S, Chiu B (2003) A symbolic representation of time series, with implications for streaming algorithms, Workshop on Research Issues in Data Mining and Knowledge Discovery, the 8th ACM SIGMOD, San Diego, CA

Lin J, Keogh E, Lonardi S, Lankford JP, Nystrom DM (2004) Visually mining and monitoring massive time series. In: Proceedings of the 10th ACM SIGKDD international conference on knowledge discovery and data mining, Seattle, WA, August 22–25, pp 460–469

Lin J, Keogh E, Patel P, Lonardi S (2002) Finding motifs in time series, the 2nd Workshop on Temporal Data Mining, the 8th ACM International conference on knowledge discovery and data mining, Edmonton, Alberta, Canada, pp 53–68

Lkhagva B, Suzuki Y, Kawagoe K (2006) New time series data representation ESAX for financial applications. In: Proceedings of the 22nd international conference on data engineering workshops, Atlanta, GA, April 3–8, pp 115

Lonardi S (2001) Global detectors of unusual words: design implementation and applications to pattern discovery in biosequences. Department of Computer Sciences, Purdue University

McGovern A, Kruger A, Rosendahl D, Droegemeier K (2006) Open problem: dynamic relational models for improved hazardous weather prediction. In: Proceedings of ICML workshop on open problems in statistical relational learning, Pittsburgh, PA, June 29

Mörchen F, Ultsch A (2005) Optimizing time series discretization for knowledge discovery. In: Proceedings of the 11th ACM SIGKDD international conference on knowledge discovery and data mining, Chicago, IL, August 21–24, pp 660–665

Murakami K, Yano Y, Doki S, Okuma S (2004) Behavior extraction from a series of observed robot motion. In: Proceedings of JSME conference on robotics and mechatronics. Nagoya, Japan, June

NIST/SEMATECH e-Handbook of Statistical Methods. http://www.itl.nist.gov/div898/handbook/

Ohsaki M, Sato Y, Yokoi H, Yamaguchi T (2003) A rule discovery support system for sequential medical data, in the case study of a chronic hepatitis dataset. Discovery challenge workshop, the 14th European conference on machine learning/the 7th european conference on principles and practice of knowledge discovery in databases, Cavtat-Dubrovnik, Croatia

Pouget F, Urvoy-Keller G, Dacier M (2006) Time signature to detect multi-headed stealthy attack tools. In: Proceedings of the 18th annual first conference, Baltimore, MD, June 25–30

Ratanamahatana CA, Keogh E, Bagnall AJ, Lonardi S (2005) A novel bit level time series representation with implications for similarity search and clustering. In: Proceedings of advances in knowledge discovery and data mining, 9th Pacific-Asia conference, Hanoi Vietnam, May 18–20, pp 771–777

Reinert G, Schbath S, Waterman MS (2000) Probabilistic and statistical properties of words: an overview. J Comput Biol 7:1–46

Roddick, JF, Hornsby K, Spiliopoulou M (2001) An updated bibliography of temporal, spatial and spatio-temporal data mining research. Post-workshop proceedings of the international workshop on temporal, spatial and spatio-temporal data mining, Springer, Berlin, pp 147–163

Shahabi C, Tian X, Zhao W (2000) TSA-tree: a wavelet-based approach to improve the efficiency of multi-level surprise and trend queries. In: Proceedings of the 12th international conference on scientific and statistical database management, Berlin, Germany, July 26–28, pp 55–68

Silvent A, Carbay C, Carry, PY, Dojat M (2003) Data information and knowledge for medical scenario construction. In: Proceedings of the intelligent data analysis in medicine and pharmacology workshop, Protaras, Cyprus, October 19–22

Silvent A, Dojat M, Garbay C (2004) Multi-level temporal abstraction for medical scenario construction. Int J Adapt Control Signal Process

Staden R (1989) Methods for discovering novel motifs in nucleic acid sequences. Comput Appl Biosci 5:293–298

Tanaka Y, Uehara K (2003) Discover motifs in multi dimensional time-series using the principal component analysis and the MDL principle. In: Proceedings of the 3rd international conference on machine learning and data mining in pattern recognition, Leipzig, Germany, July 5–7, pp 252–265

Tanaka Y, Uehara K (2004) Motif discovery algorithm from motion data. In: Proceedings of the 18th annual conference of the japanese society for artificial intelligence, Kanazawa, Japan, June 2–4

Tompa M, Buhler J (2001) Finding motifs using random projections. In: Proceedings of the 5th International conference on computational molecular biology, Montreal, Canada, April 22–25, pp 67–74

Vlachos M, Kollios G, Gunopulos G (2002) Discovering similar multidimensional trajectories. In: Proceedings of the 18th international conference on data engineering, San Jose, CA, February 26-Mar 1, pp 673–684

Wei L, Keogh E, Xi X (2006) SAXually explicit images: finding unusual shapes. In: Proceedings of the 2006 IEEE international conference on data mining, Hong Kong, December 18–22

Yi BK, Faloutsos C (2000) Fast time sequence indexing for arbitrary lp norms. In: Proceedings of the 26th international conference on very large databases, Cairo, Egypt, September 10–14, pp 385–394