Toward Automated Anomaly Identification in Large-Scale Systems

IEEE Transactions on Parallel and Distributed Systems - Volume 21, Issue 2 - Pages 174-187 - 2010
Zhiling Lan1, Ziming Zheng1, Yawei Li2
1Department of Computer Science, Illinois Institute of Technology, Chicago, IL, USA
2Google, Inc., Mountain View, CA, USA
