DC programming and DCA for sparse Fisher discriminant analysis

Neural Computing and Applications - Volume 28 - Pages 2809-2822 - 2016
Hoai An Le Thi1,2, Duy Nhat Phan2
1Department for Management of Science and Technology Development & Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam
2Laboratory of Theoretical and Applied Computer Science, University of Lorraine, Metz, France

Abstract

We consider supervised pattern classification in the high-dimensional setting, where the number of features is much larger than the number of observations. We present a novel approach to the sparse Fisher discriminant problem based on the $$\ell _0$$-norm. The resulting optimization problem is nonconvex, discontinuous, and very hard to solve. We overcome the discontinuity by using suitable approximations of the $$\ell _0$$-norm such that the resulting problems can be reformulated as DC (difference of convex functions) programs, for which DC programming and DC algorithms (DCA) are investigated. Experimental results on both simulated and real datasets show the efficiency of the proposed algorithms compared with several state-of-the-art methods.
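To make the general scheme concrete, the sketch below illustrates a DCA loop on an $$\ell _0$$-approximated sparse problem. It uses the capped-$$\ell _1$$ function, one of several DC approximations of the $$\ell _0$$-norm, together with a penalized least-squares data term as a stand-in objective; the function name, the surrogate objective, and the ISTA subproblem solver are illustrative assumptions, not the paper's exact sparse Fisher discriminant formulation.

```python
import numpy as np

def dca_capped_l1_least_squares(A, b, lam=1.0, theta=0.5, n_outer=50, n_inner=20):
    """DCA sketch for  min_x 0.5*||Ax - b||^2 + lam * sum_i min(1, |x_i| / theta).

    The capped-ell_1 penalty is written as a DC function
        (lam/theta) * ||x||_1  -  (lam/theta) * sum_i max(|x_i| - theta, 0),
    so each DCA step linearizes the second (convex) term at the current point and
    solves a Lasso-type convex subproblem (here by a few ISTA iterations).
    """
    m, n = A.shape
    x = np.zeros(n)
    L = np.linalg.norm(A, 2) ** 2 + 1e-12        # Lipschitz constant of the smooth part
    for _ in range(n_outer):
        # Subgradient of h(x) = (lam/theta) * sum_i max(|x_i| - theta, 0) at the current x.
        y = (lam / theta) * np.sign(x) * (np.abs(x) > theta)
        # Convex subproblem: min_x 0.5*||Ax - b||^2 - <y, x> + (lam/theta) * ||x||_1.
        for _ in range(n_inner):
            grad = A.T @ (A @ x - b) - y                                      # smooth-part gradient
            z = x - grad / L                                                  # gradient step
            x = np.sign(z) * np.maximum(np.abs(z) - lam / (theta * L), 0.0)   # soft-thresholding
    return x

# Toy usage: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 200))
x_true = np.zeros(200); x_true[:5] = 3.0
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = dca_capped_l1_least_squares(A, b, lam=0.5, theta=0.5)
print("nonzeros found:", int(np.count_nonzero(np.abs(x_hat) > 1e-6)))
```

The outer loop follows the standard DCA pattern: take a subgradient of the second convex component and solve the resulting convex subproblem; only the data term and subproblem solver would change when the same pattern is applied to a specific Fisher discriminant criterion.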

Keywords

#sparse Fisher discriminant analysis #DC programming #DC algorithm #nonconvex optimization #ℓ0-norm
