Potential identification of pediatric asthma patients within pediatric research database using low rank matrix decomposition
Abstract
Asthma is a prevalent disease in pediatric patients, and most cases begin in the first years of life. Early identification of patients at high risk of developing the disease allows clinicians to provide the most appropriate treatment to manage asthma symptoms. Identifying high-risk patients from very large data sets (e.g., electronic medical records) is often challenging and time consuming, and a lack of sophisticated data analysis or of properly defined clinical logic may produce invalid results and irrelevant treatments. In this article, we used data from the Pediatric Research Database (PRD) to develop an asthma prediction model from past All Patient Refined Diagnosis Related Groups (APR-DRGs) coding assignments. The knowledge gleaned from this asthma prediction model, drawn from both routine use by physicians and experimental findings, will be fused into a knowledge base for dissemination to those involved in the care of asthma patients. Success with this model may lead to its extension to other diseases.
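As a rough illustration of the idea named in the title, the sketch below applies a low rank approximation to a small patient-by-code matrix and ranks patients who are not yet coded with asthma by their reconstructed asthma score. It is a minimal sketch under stated assumptions, not the model developed in the article: the toy matrix, the column labels, the choice of rank 2, and the helper names `top_singular_triplet` and `low_rank_approximation` are all hypothetical, and the least-squares (power iteration) decomposition used here is a generic stand-in for whatever decomposition the authors applied to the PRD data.

```python
import math


def top_singular_triplet(matrix, iterations=200):
    """Estimate the largest singular value and its vectors by power iteration."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    sigma, u = 0.0, [0.0] * n_rows
    for _ in range(iterations):
        # u <- A v, normalised
        u = [sum(matrix[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        u_norm = math.sqrt(sum(x * x for x in u))
        if u_norm < 1e-12:
            return 0.0, [0.0] * n_rows, v
        u = [x / u_norm for x in u]
        # v <- A^T u, normalised; its length is the singular value estimate
        w = [sum(matrix[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in w))
        if sigma < 1e-12:
            return 0.0, u, v
        v = [x / sigma for x in w]
    return sigma, u, v


def low_rank_approximation(matrix, rank, iterations=200):
    """Sum the leading `rank` singular components, found one at a time by deflation."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    residual = [[float(x) for x in row] for row in matrix]
    approx = [[0.0] * n_cols for _ in range(n_rows)]
    for _ in range(rank):
        sigma, u, v = top_singular_triplet(residual, iterations)
        if sigma == 0.0:
            break
        for i in range(n_rows):
            for j in range(n_cols):
                component = sigma * u[i] * v[j]
                approx[i][j] += component
                residual[i][j] -= component
    return approx


# Hypothetical toy data: rows are patients, columns are past coding assignments
# (1 = the patient has an encounter with that code). Labels are illustrative only.
codes = ["asthma", "bronchiolitis", "pneumonia", "appendectomy", "fracture"]
patients = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],  # respiratory history, but no asthma code yet
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
]

scores = low_rank_approximation(patients, rank=2)
asthma = codes.index("asthma")

# Rank patients without an asthma code by their reconstructed asthma score.
candidates = sorted(
    (i for i, row in enumerate(patients) if row[asthma] == 0),
    key=lambda i: scores[i][asthma],
    reverse=True,
)
for i in candidates:
    print(f"patient {i}: reconstructed asthma score {scores[i][asthma]:.3f}")
```

In this toy setting the patient whose coding history resembles that of the asthma patients receives the highest reconstructed score, which is the sense in which a low rank model can flag potential cases. On real data a library routine such as `numpy.linalg.svd` would normally replace the hand-written power iteration, and the rank would need to be chosen and validated against clinical review.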
