Statistical Fraud Detection: A Review

Statistical Science - Volume 17, Issue 3 - 2002
Richard J. Bolton, David J. Hand

Abstract

Keywords


References

RIPLEY, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge Univ. Press.

BREIMAN, L., FRIEDMAN, J. H., OLSHEN, R. A. and STONE, C. J. (1984). Classification and Regression Trees. Wadsworth, Belmont, CA.

CORTES, C., FISHER, K., PREGIBON, D. and ROGERS, A. (2000). Hancock: A language for extracting signatures from data streams. In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 9-17. ACM Press, New York.

LACHENBRUCH, P. A. (1966). Discriminant analysis when the initial samples are misclassified. Technometrics 8 657-662.

SHIEH, S.-P. W. and GLIGOR, V. D. (1997). On a pattern-oriented model for intrusion detection. IEEE Transactions on Knowledge and Data Engineering 9 661-667.

WHEELER, R. and AITKEN, S. (2000). Multiple algorithms for fraud detection. Knowledge-Based Systems 13(2/3) 93-99.

PERLICH, C., PROVOST, F. and SIMONOFF, J. S. (2001). Tree induction vs. logistic regression: A learning-curve analysis. Journal of Machine Learning Research. To appear.

ALESKEROV, E., FREISLEBEN, B. and RAO, B. (1997). CARDWATCH: A neural network based database mining system for credit card fraud detection. In Computational Intelligence for Financial Engineering. Proceedings of the IEEE/IAFE 220-226. IEEE, Piscataway, NJ.

ALLEN, T. (2000). A day in the life of a Medicaid fraud statistician. Stats 29 20-22.

ANDERSON, D., FRIVOLD, T. and VALDES, A. (1995). Next-generation intrusion detection expert system (NIDES): A summary. Technical Report SRI-CSL-95-07, Computer Science Laboratory, SRI International, Menlo Park, CA.

ANDREWS, P. P. and PETERSON, M. B., eds. (1990). Criminal Intelligence Analysis. Palmer Enterprises, Loomis, CA.

ARTÍS, M., AYUSO, M. and GUILLÉN, M. (1999). Modelling different types of automobile insurance fraud behaviour in the Spanish market. Insurance Mathematics and Economics 24 67-81.

BARAO, M. I. and TAWN, J. A. (1999). Extremal analysis of short series with outliers: Sea-levels and athletics records. Appl. Statist. 48 469-487.

BLUNT, G. and HAND, D. J. (2000). The UK credit card market. Technical report, Dept. Mathematics, Imperial College, London.

BOLTON, R. J. and HAND, D. J. (2001). Unsupervised profiling methods for fraud detection. In Conference on Credit Scoring and Credit Control 7, Edinburgh, UK, 5-7 Sept.

BRAUSE, R., LANGSDORF, T. and HEPP, M. (1999). Neural data mining for credit card fraud detection. In Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence 103-106. IEEE Computer Society Press, Silver Spring, MD.

BROCKETT, P. L., XIA, X. and DERRIG, R. A. (1998). Using Kohonen's self-organising feature map to uncover automobile bodily injury claims fraud. The Journal of Risk and Insurance 65 245-274.

BURGE, P. and SHAWE-TAYLOR, J. (1997). Detecting cellular fraud using adaptive prototypes. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 9-13. AAAI Press, Menlo Park, CA.

BUYSE, M., GEORGE, S. L., EVANS, S., GELLER, N. L., RANSTAM, J., SCHERRER, B., LESAFFRE, E., MURRAY, G., EDLER, L., HUTTON, J., COLTON, T., LACHENBRUCH, P. and VERMA, B. L. (1999). The role of biostatistics in the prevention, detection and treatment of fraud in clinical trials. Statistics in Medicine 18 3435-3451.

CAHILL, M. H., LAMBERT, D., PINHEIRO, J. C. and SUN, D. X. (2002). Detecting fraud in the real world. In Handbook of Massive Datasets (J. Abello, P. M. Pardalos and M. G. C. Resende, eds.). Kluwer, Dordrecht.

CHAN, P. K., FAN, W., PRODROMIDIS, A. L. and STOLFO, S. J. (1999). Distributed data mining in credit card fraud detection. IEEE Intelligent Systems 14(6) 67-74.

CHAN, P. and STOLFO, S. (1998). Toward scalable learning with non-uniform class and cost distributions: A case study in credit card fraud detection. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining 164-168. AAAI Press, Menlo Park, CA.

CHARTIER, B. and SPILLANE, T. (2000). Money laundering detection with a neural network. In Business Applications of Neural Networks (P. J. G. Lisboa, A. Vellido and B. Edisbury, eds.) 159-172. World Scientific, Singapore.

CHHIKARA, R. S. and MCKEON, J. (1984). Linear discriminant analysis with misallocation in training samples. J. Amer. Statist. Assoc. 79 899-906.

CLARK, P. and NIBLETT, T. (1989). The CN2 induction algorithm. Machine Learning 3 261-285.

COHEN, W. (1995). Fast effective rule induction. In Proceedings of the 12th International Conference on Machine Learning 115-123. Morgan Kaufmann, Palo Alto, CA.

CORTES, C. and PREGIBON, D. (1998). Giga-mining. In Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining 174-178. AAAI Press, Menlo Park, CA.

CORTES, C., PREGIBON, D. and VOLINSKY, C. (2001). Communities of interest. Lecture Notes in Comput. Sci. 2189 105-114.

COX, K. C., EICK, S. G. and WILLS, G. J. (1997). Visual data mining: Recognizing telephone calling fraud. Data Mining and Knowledge Discovery 1 225-231.

CSIDS (1999). Cisco secure intrusion detection system technical overview. Available at http://www.wheelgroup.com/warp/public/cc/cisco/mkt/security/nranger/tech/ntran_tc.htm.

DENNING, D. E. (1997). Cyberspace attacks and countermeasures. In Internet Besieged (D. E. Denning and P. J. Denning, eds.) 29-55. ACM Press, New York.

DORRONSORO, J. R., GINEL, F., SANCHEZ, C. and CRUZ, C. S. (1997). Neural fraud detection in credit card operations. IEEE Transactions on Neural Networks 8 827-834.

FANNING, K., COGGER, K. O. and SRIVASTAVA, R. (1995). Detection of management fraud: A neural network approach. International Journal of Intelligent Systems in Accounting, Finance and Management 4 113-126.

FAWCETT, T. and PROVOST, F. (1997a). Adaptive fraud detection. Data Mining and Knowledge Discovery 1 291-316.

FAWCETT, T. and PROVOST, F. (1997b). Combining data mining and machine learning for effective fraud detection. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 14-19. AAAI Press, Menlo Park, CA.

FAWCETT, T. and PROVOST, F. (1999). Activity monitoring: Noticing interesting changes in behavior. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 53-62. ACM Press, New York.

FORREST, S., HOFMEYR, S., SOMAYAJI, A. and LONGSTAFF, T. (1996). A sense of self for UNIX processes. In Proceedings of the 1996 IEEE Symposium on Security and Privacy 120-128. IEEE Computer Society Press, Silver Spring, MD.

GHOSH, S. and REILLY, D. L. (1994). Credit card fraud detection with a neural network. In Proceedings of the 27th Hawaii International Conference on System Sciences (J. F. Nunamaker and R. H. Sprague, eds.) 3 621-630. IEEE Computer Society Press, Los Alamitos, CA.

GLASGOW, B. (1997). Risk and fraud in the insurance industry. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 20-21. AAAI Press, Menlo Park, CA.

GOLDBERG, H. and SENATOR, T. E. (1995). Restructuring databases for knowledge discovery by consolidation and link formation. In Proceedings of the First International Conference on Knowledge Discovery and Data Mining 136-141. AAAI Press, Menlo Park, CA.

GOLDBERG, H. and SENATOR, T. E. (1997). Break detection systems. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 22-28. AAAI Press, Menlo Park, CA.

GOLDBERG, H. and SENATOR, T. E. (1998). The FinCEN AI system: Finding financial crimes in a large database of cash transactions. In Agent Technology: Foundations, Applications, and Markets (N. Jennings and M. Wooldridge, eds.) 283-302. Springer, Berlin.

GREEN, B. P. and CHOI, J. H. (1997). Assessing the risk of management fraud through neural network technology. Auditing 16 14-28.

HAND, D. J. (1981). Discrimination and Classification. Wiley, Chichester.

HAND, D. J. (1997). Construction and Assessment of Classification Rules. Wiley, Chichester.

HAND, D. J. and BLUNT, G. (2001). Prospecting for gems in credit card data. IMA Journal of Management Mathematics 12 173-200.

HAND, D. J., BLUNT, G., KELLY, M. G. and ADAMS, N. M. (2000). Data mining for fun and profit (with discussion). Statist. Sci. 15 111-131.

HAND, D. J. and HENLEY, W. E. (1997). Statistical classification methods in consumer credit scoring: A review. J. Roy. Statist. Soc. Ser. A 160 523-541.

HASSIBI, K. (2000). Detecting payment card fraud with neural networks. In Business Applications of Neural Networks (P. J. G. Lisboa, A. Vellido and B. Edisbury, eds.). World Scientific, Singapore.

HE, H., GRACO, W. and YAO, X. (1999). Application of genetic algorithm and k-nearest neighbour method in medical fraud detection. Lecture Notes in Comput. Sci. 1585 74-81. Springer, Berlin.

HE, H. X., WANG, J. C., GRACO, W. and HAWKINS, S. (1997). Application of neural networks to detection of medical fraud. Expert Systems with Applications 13 329-336.

HILL, T. P. (1995). A statistical derivation of the significant-digit law. Statist. Sci. 10 354-363.

HYNNINEN, J. (2000). Experiences in mobile phone fraud. Seminar on Network Security. Report Tik-110.501, Helsinki Univ. Technology.

JENKINS, P. (2000). Getting smart with fraudsters. Financial Times, September 23.

JENSEN, D. (1997). Prospective assessment of AI technologies for fraud detection: a case study. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 34-38. AAAI Press, Menlo Park, CA.

JU, W.-H. and VARDI, Y. (2001). A hybrid high-order Markov chain model for computer intrusion detection. J. Comput. Graph. Statist. 10 277-295.

KIRKLAND, J. D., SENATOR, T. E., HAYDEN, J. J., DYBALA, T., GOLDBERG, H. G. and SHYR, P. (1998). The NASD regulation advanced detection system (ADS). In Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98) and of the 10th Conference on Innovative Applications of Artificial Intelligence (IAAI-98) 1055-1062. AAAI Press, Menlo Park, CA.

KOSORESOW, A. P. and HOFMEYR, S. A. (1997). Intrusion detection via system call traces. IEEE Software 14 35-42.

KUMAR, S. and SPAFFORD, E. (1994). A pattern matching model for misuse intrusion detection. In Proceedings of the 17th National Computer Security Conference 11-21.

LACHENBRUCH, P. A. (1974). Discriminant analysis when the initial samples are misclassified. II: Non-random misclassification models. Technometrics 16 419-424.

LANE, T. and BRODLEY, C. E. (1998). Temporal sequence learning and data reduction for anomaly detection. In Proceedings of the 5th ACM Conference on Computer and Communications Security (CCS-98) 150-158. ACM Press, New York.

LEE, W. and STOLFO, S. (1998). Data mining approaches for intrusion detection. In Proceedings of the 7th USENIX Security Symposium, San Antonio, TX 79-93. USENIX Association, Berkeley, CA.

LEONARD, K. J. (1993). Detecting credit card fraud using expert systems. Computers and Industrial Engineering 25 103-106.

LIPPMANN, R., FRIED, D., GRAF, I., HAINES, J., KENDALL, K., MCCLUNG, D., WEBER, D., WEBSTER, S., WYSCHOGROD, D., CUNNINGHAM, R. and ZISSMAN, M. (2000). Evaluating intrusion detection systems: The 1998 DARPA off-line intrusion-detection evaluation. Unpublished manuscript, MIT Lincoln Laboratory.

MAJOR, J. A. and RIEDINGER, D. R. (1992). EFD: A hybrid knowledge/statistical-based system for the detection of fraud. International Journal of Intelligent Systems 7 687-703.

MARCHETTE, D. J. (2001). Computer Intrusion Detection and Network Monitoring: A Statistical Viewpoint. Springer, New York.

MCCARTHY, J. (2000). Phenomenal data mining. Comm. ACM 43 75-79.

MCLACHLAN, G. J. (1992). Discriminant Analysis and Statistical Pattern Recognition. Wiley, New York.

MOBILE EUROPE (2000). New IP world, new dangers. Mobile Europe, March.

MOREAU, Y., PRENEEL, B., BURGE, P., SHAWE-TAYLOR, J., STOERMANN, C. and COOKE, C. (1996). Novel techniques for fraud detection in mobile communications. In ACTS Mobile Summit, Grenada.

MOREAU, Y., VERRELST, H. and VANDEWALLE, J. (1997). Detection of mobile phone fraud using supervised neural networks: A first prototype. In Proceedings of 7th International Conference on Artificial Neural Networks (ICANN'97) 1065-1070. Springer, Berlin.

MURAD, U. and PINKAS, G. (1999). Unsupervised profiling for identifying superimposed fraud. Principles of Data Mining and Knowledge Discovery. Lecture Notes in Artificial Intelligence 1704 251-261. Springer, Berlin.

NEURAL TECHNOLOGIES (2000). Reducing telecoms fraud and churn. Report, Neural Technologies, Ltd., Petersfield, U.K.

NIGRINI, M. J. (1999). I've got your number. Journal of Accountancy May 79-83.

NIGRINI, M. J. and MITTERMAIER, L. J. (1997). The use of Benford's law as an aid in analytical procedures. Auditing: A Journal of Practice and Theory 16 52-67.

NORTEL (2000). Nortel networks fraud solutions. Fraud Primer, Issue 2.0. Nortel Networks Corporation.

PAK, S. J. and ZDANOWICZ, J. S. (1994). A statistical analysis of the U.S. Merchandise Trade Database and its uses in transfer pricing compliance and enforcement. Tax Management, May 11.

PATIENT, S. (2000). Reducing online credit card fraud. Web Developer's Journal. Available at http://www.webdevelopersjournal.com/articles/card_fraud.html

PRESS, S. J. and TANUR, J. M. (2001). The Subjectivity of Scientists and the Bayesian Approach. Wiley, New York.

PROVOST, F. and FAWCETT, T. (2001). Robust classification for imprecise environments. Machine Learning 42 203-210.

QU, D., VETTER, B. M., WANG, F., NARAYAN, R., WU, S. F., HOU, Y. F., GONG, F. and SARGOR, C. (1998). Statistical anomaly detection for link-state routing protocols. In Proceedings of the Sixth International Conference on Network Protocols 62-70. IEEE Computer Society Press, Los Alamitos, CA.

QUINLAN, J. R. (1990). Learning logical definitions from relations. Machine Learning 5 239-266.

QUINLAN, J. R. (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA.

ROBINSON, M. E. and TAWN, J. A. (1995). Statistics for exceptional athletics records. Appl. Statist. 44 499-511.

ROSSET, S., MURAD, U., NEUMANN, E., IDAN, Y. and PINKAS, G. (1999). Discovery of fraud rules for telecommunications-challenges and solutions. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 409-413. ACM Press, New York.

RYAN, J., LIN, M. and MIIKKULAINEN, R. (1997). Intrusion detection with neural networks. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 72-79. AAAI Press, Menlo Park, CA.

SCHONLAU, M., DUMOUCHEL, W., JU, W.-H., KARR, A. F., THEUS, M. and VARDI, Y. (2001). Computer intrusion: Detecting masquerades. Statist. Sci. 16 58-74.

SENATOR, T. E. (2000). Ongoing management and application of discovered knowledge in a large regulatory organization: A case study of the use and impact of NASD regulation's advanced detection system (ADS). In Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 44-53. ACM Press, New York.

SENATOR, T. E., GOLDBERG, H. G., WOOTON, J., COTTINI, M. A., UMAR KHAN, A. F., KLINGER, C. D., LLAMAS, W. M., MARRONE, M. P. and WONG, R. W. H. (1995). The financial crimes enforcement network AI system (FAIS): Identifying potential money laundering from reports of large cash transactions. AI Magazine 16 21-39.

SHAWE-TAYLOR, J., HOWKER, K., GOSSET, P., HYLAND, M., VERRELST, H., MOREAU, Y., STOERMANN, C. and BURGE, P. (2000). Novel techniques for profiling and fraud detection in mobile telecommunications. In Business Applications of Neural Networks (P. J. G. Lisboa, A. Vellido and B. Edisbury, eds.) 113-139. World Scientific, Singapore.

SHIEH, S.-P. W. and GLIGOR, V. D. (1991). A pattern-oriented intrusion-detection model and its applications. In Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy 327-342. IEEE Computer Society Press, Silver Spring, MD.

SMITH, R. L. (1997). Comment on "Statistics for exceptional athletics records," by M. E. Robinson and J. A. Tawn. Appl. Statist. 46 123-128.

STOLFO, S. J., FAN, D. W., LEE, W., PRODROMIDIS, A. L. and CHAN, P. K. (1997a). Credit card fraud detection using meta-learning: Issues and initial results. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 83-90. AAAI Press, Menlo Park, CA.

STOLFO, S., FAN, W., LEE, W., PRODROMIDIS, A. L. and CHAN, P. (1999). Cost-based modeling for fraud and intrusion detection: Results from the JAM Project. In Proceedings of the DARPA Information Survivability Conference and Exposition 2 130-144. IEEE Computer Press, New York.

STOLFO, S. J., PRODROMIDIS, A. L., TSELEPIS, S., LEE, W., FAN, D. W. and CHAN, P. K. (1997b). JAM: Java agents for meta-learning over distributed databases. In AAAI Workshop on AI Approaches to Fraud Detection and Risk Management 91-98. AAAI Press, Menlo Park, CA.

TANIGUCHI, M., HAFT, M., HOLLMÉN, J. and TRESP, V. (1998). Fraud detection in communication networks using neural and probabilistic methods. In Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'98) 2 1241-1244. IEEE Computer Society Press, Silver Spring, MD.

U.S. CONGRESS (1995). Information technologies for the control of money laundering. Office of Technology Assessment, Report OTA-ITC-630, U.S. Government Printing Office, Washington, DC.

WASSERMAN, S. and FAUST, K. (1994). Social Network Analysis: Methods and Applications. Cambridge Univ. Press.

WEBB, A. R. (1999). Statistical Pattern Recognition. Arnold, London.

ARONIS, J. and PROVOST, F. (1997). Increasing the efficiency of data mining algorithms with breadth-first marker propagation. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining 119-122. AAAI Press, Menlo Park, CA.

BREIMAN, L. (2001). Statistical modeling: The two cultures (with discussion). Statist. Sci. 16 199-231.

CHAMBERS, J. M. (1993). Greater or lesser statistics: A choice for future research. Statist. Comput. 3 182-184.

FAWCETT, T. and PROVOST, F. (2002). Fraud detection. In Handbook of Knowledge Discovery and Data Mining (W. Kloesgen and J. Zytkow, eds.). Oxford Univ. Press.

FELLEMAN, H., ed. (1936). The Best Loved Poems of the American People. Doubleday, New York.

GOPINATHAN, K. M., BIAFORE, L. S., FERGUSON, W. M., LAZARUS, M. A., PATHRIA, A. K. and JOST, A. (1998). Fraud detection using predictive modeling. U.S. Patent 5819226, October 6.

HAND, D. J. (1996). Classification and computers: Shifting the focus. In COMPSTAT-96: Proceedings in Computational Statistics (A. Prat, ed.) 77-88. Physica, Heidelberg.

HAND, D. J. (1998). Breaking misconceptions-statistics and its relationship to mathematics (with discussion). The Statistician 47 245-250, 284-286.

KELLY, M. G., HAND, D. J. and ADAMS, N. M. (1999). The impact of changing populations on classifier performance. In Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (S. Chaudhuri and D. Madigan, eds.) 367-371. ACM Press, New York.