Permutation-based statistical tests for multiple hypotheses

Springer Science and Business Media LLC - Volume 3 - Pages 1-8 - 2008

Anyela Camargo¹, Francisco Azuaje², Haiying Wang³, Huiru Zheng³

¹University of East Anglia, School of Computing, Norwich, UK

²Laboratory of Cardiovascular Research, CRP-Santé, Luxembourg

³University of Ulster at Jordanstown, School of Computing and Mathematics, Co. Antrim, UK

Abstract

Genomics and proteomics analyses regularly involve the simultaneous testing of hundreds of hypotheses on either numerical or categorical data. To correct for the occurrence of false positives, validation tests based on multiple testing correction, such as Bonferroni and Benjamini and Hochberg, and on re-sampling, such as permutation tests, are frequently used. Despite the known power of permutation-based tests, most available tools offer them for the t-test or ANOVA only. Less attention has been given to tests for categorical data, such as the Chi-square test. This project takes a first step by developing an open-source software tool, Ptest, that addresses the need for public software incorporating these and other statistical tests with options for multiple-hypothesis correction. This study developed public-domain, user-friendly software with a twofold purpose: first, to estimate test statistics for categorical and numerical data; and second, to validate the significance of those test statistics via Bonferroni, Benjamini and Hochberg, and permutation tests. The tool calculates the Chi-square test for categorical data, and the ANOVA test, Bartlett's test, and the t-test for paired and unpaired data. Once a test statistic is calculated, the Bonferroni correction, the Benjamini and Hochberg correction, and a permutation test are applied independently to control for Type I errors. An evaluation of the software on different public data sets is reported, illustrating the power of permutation tests for multiple-hypothesis assessment and for controlling the rate of Type I errors. The analytical options offered by the software can support a broad spectrum of hypothesis testing tasks in functional genomics, using both numerical and categorical data.
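The three validation options named in the abstract can be illustrated with a minimal Python sketch. It is not Ptest's implementation: the function names are hypothetical, and the permutation test uses the absolute difference in group means as its statistic (an assumption made for brevity; the abstract describes permuting around t-test, ANOVA, and Chi-square statistics). It only shows the general shape of a Bonferroni adjustment, a Benjamini and Hochberg adjustment, and a two-sample permutation p-value.

```python
import random
from statistics import mean


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini and Hochberg adjusted p-values (step-up FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for two unpaired groups.

    Statistic (assumed for this sketch): absolute difference in means.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the samples
        stat = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)
```

In a multiple-hypothesis setting, one would typically compute a raw p-value per gene or marker (parametric or by permutation) and then pass the whole list to one of the two adjustment functions.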

