A strategy for multimodal data integration: application to biomarkers identification in spinocerebellar ataxia

Briefings in Bioinformatics - Volume 19, Issue 6 - Pages 1356-1369 - 2018
Imène Garali1, Isaac M. Adanyeguh2, Farid Ichou3, Vincent Perlbarg1, Alexandre Seyer4, Benoît Colsch5,6,7, Ivan Moszer1, Vincent Guillemot8, Alexandra Dürr9, Fanny Mochel10, Arthur Tenenhaus1,11
1Bioinformatics and Biostatistics Core Facility of the Brain and Spine Institute, La Pitié-Salpêtrière Hospital, Paris, France
2Pierre and Marie Curie University
3ICANalytics Department, Institute of Cardiometabolism and Nutrition, Paris, France
4SpectMet platform of the MedDay Pharmaceuticals company, Paris, France
5LEMM Laboratory at CEA-Saclay, France
6MetaboHUB-IDF
7SPI - Service de Pharmacologie et Immunoanalyse
8Institut Pasteur, Statistical Genetics group, Bioinformatics/Biostatistics Core Facility
9Pitié-Salpêtrière University Hospital in Paris
10University Pierre and Marie Curie (UPMC) and the Pitié-Salpêtrière University Hospital
11L2S Laboratory at CentraleSupélec, France

References

Tenenhaus, 2011, Regularized generalized canonical correlation analysis, Psychometrika, 76, 257, 10.1007/s11336-011-9206-8

Tenenhaus, 2017, Regularized generalized canonical correlation analysis: a framework for sequential multiblock component methods, Psychometrika (accepted), 10.1007/s11336-017-9573-x

Tenenhaus, 2014, Variable selection for generalized canonical correlation analysis, Biostatistics, 15, 569, 10.1093/biostatistics/kxu001

Günther, 2014, Novel multivariate methods for integration of genomics and proteomics data: applications in a kidney transplant rejection study, OMICS, 18, 682, 10.1089/omi.2014.0062

Meng, 2016, Dimension reduction techniques for the integrative analysis of multi-omics data, Brief Bioinform, 17, 628, 10.1093/bib/bbv108

Meng, 2014, A multivariate approach to the integration of multi-omics datasets, BMC Bioinformatics, 15, 1, 10.1186/1471-2105-15-162

Hotelling, 1933, Analysis of a complex of statistical variables into principal components, J Educ Psychol, 24, 417, 10.1037/h0071325

Hotelling, 1936, Relation between two sets of variates, Biometrika, 28, 321, 10.1093/biomet/28.3-4.321

Tucker, 1958, An inter-battery method of factor analysis, Psychometrika, 23, 111, 10.1007/BF02289009

Wold, 1983, The multivariate calibration problem in chemistry solved by the PLS method, Proc Conf Matrix Pencils, 973, 286, 10.1007/BFb0062108

Van den Wollenberg, 1977, Redundancy analysis – an alternative to canonical correlation analysis, Psychometrika, 42, 207, 10.1007/BF02294050

Carroll, 1968, A generalization of canonical correlation analysis to three or more sets of variables, Proc 76th Conv Am Psych. Assoc, 3, 227

Carroll

Wold, 1996, Hierarchical multiblock PLS and PC models for easier model interpretation and as an alternative to variable selection, J Chemom, 10, 463, 10.1002/(SICI)1099-128X(199609)10:5/6<463::AID-CEM445>3.0.CO;2-L

Chessel, 1996, Analyses de la co-inertie de K nuages de points, Rev Stat Appl, 44, 35

Westerhuis, 1998, Analysis of multiblock and hierarchical PCA and PLS models, J Chemom, 12, 301, 10.1002/(SICI)1099-128X(199809/10)12:5<301::AID-CEM515>3.0.CO;2-S

Smilde, 2003, A framework for sequential multiblock component methods, J Chemom, 17, 323, 10.1002/cem.811

Escofier, 1994, Multiple factor analysis, (AFMULT package), Comput Stat Data Anal, 18, 121, 10.1016/0167-9473(94)90135-X

Horst, 1961, Relations among m sets of variables, Psychometrika, 26, 126, 10.1007/BF02289710

Kettenring, 1971, Canonical analysis of several sets of variables, Biometrika, 58, 433, 10.1093/biomet/58.3.433

Hanafi, 2007, PLS Path modelling: computation of latent variables with the estimation mode B, Comput Stat, 22, 275, 10.1007/s00180-007-0042-3

Van de Geer, 1984, Linear relations among k sets of variables, Psychometrika, 49, 70, 10.1007/BF02294207

Hanafi, 2006, Analysis of K sets of data, with differential emphasis on agreement between and within sets, Comput Stat Data Anal, 51, 1491, 10.1016/j.csda.2006.04.020

Kramer, 2007

Wold, 1982, Systems under Indirect Observation: Part 2, 1

Tenenhaus, 2015, Kernel generalized canonical correlation analysis, Comput Stat Data Anal, 90, 114, 10.1016/j.csda.2015.04.004

Tenenhaus, 2005, PLS path modeling, Comput Stat Data Anal, 48, 159, 10.1016/j.csda.2004.03.005

Bro, 2003, Centering and scaling in component analysis, J Chemom, 17, 16, 10.1002/cem.773

Van Deun, 2009, A structured overview of simultaneous component based data integration, BMC Bioinformatics, 10, 246, 10.1186/1471-2105-10-246

Ledoit, 2004, A well conditioned estimator for large-dimensional covariance matrices, J Multivar Anal, 88, 365, 10.1016/S0047-259X(03)00096-4

Schäfer, 2005, Shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics, Stat Appl Genet Mol Biol, 4, 32

Barker, 2003, Partial least squares for discrimination, J Chemom, 17, 166, 10.1002/cem.785

Bickel, 2004, Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations, Bernoulli, 10, 989, 10.3150/bj/1106314847

Chun, 2010, Sparse partial least squares regression for simultaneous dimension reduction and variable selection, J R Stat Soc Ser B Stat Methodol, 72, 3, 10.1111/j.1467-9868.2009.00723.x

Efron, 1979, Bootstrap methods: another look at the jackknife, Ann Stat, 7, 1, 10.1214/aos/1176344552

Efron, 1987, Better bootstrap confidence intervals, J Am Stat Assoc, 82, 171, 10.1080/01621459.1987.10478410

Fleiss, 1971, Measuring nominal scale agreement among many raters, Psychol Bull, 76, 378, 10.1037/h0031619

Meinshausen, 2010, Stability selection, J R Stat Soc Ser B Stat Methodol, 72, 417, 10.1111/j.1467-9868.2010.00740.x

Gu, 2016, A variable selection method for simultaneous component based data integration, Chemom Intell Lab Syst, 158, 187, 10.1016/j.chemolab.2016.07.013

Keiser, 2015, Broad distribution of ataxin 1 silencing in rhesus cerebella for spinocerebellar ataxia type 1 therapy, Brain, 138, 3555, 10.1093/brain/awv292

Rüb, 2013, Clinical features, neurogenetics and neuropathology of the polyglutamine spinocerebellar ataxias type 1, 2, 3, 6 and 7, Prog Neurobiol, 104, 38, 10.1016/j.pneurobio.2013.01.001

Durr, 2010, Autosomal dominant cerebellar ataxias: polyglutamine expansions and beyond, Lancet Neurol, 9, 885, 10.1016/S1474-4422(10)70183-6

Klaes, 2016, MR Imaging in Spinocerebellar Ataxias: a systematic review, AJNR Am J Neuroradiol, 37, 1405, 10.3174/ajnr.A4760

Jacobi, 2013, Biological and clinical characteristics of individuals at risk for spinocerebellar ataxia types 1, 2, 3, and 6 in the longitudinal RISCA study: analysis of baseline data, Lancet Neurol, 12, 650, 10.1016/S1474-4422(13)70104-2

Mochel, 2007, Early energy deficit in Huntington disease: identification of a plasma biomarker traceable during disease progression, PLoS One, 2, e647, 10.1371/journal.pone.0000647

Mochel, 2011, Energy deficit in Huntington disease: why it matters, J Clin Invest, 121, 493, 10.1172/JCI45691

Adanyeguh, 2015, Triheptanoin improves brain energy metabolism in patients with Huntington disease, Neurology, 84, 490, 10.1212/WNL.0000000000001214

Schmitz-Hubsch, 2006, Scale for the assessment and rating of ataxia: development of a new clinical scale, Neurology, 66, 1717, 10.1212/01.wnl.0000219042.60538.92

Wishart, 2007, HMDB: the Human Metabolome Database, Nucleic Acids Res, 35, 521, 10.1093/nar/gkl923

Wishart, 2009, HMDB: a knowledgebase for the human metabolome, Nucleic Acids Res, 37, 603, 10.1093/nar/gkn810

Wishart, 2013, HMDB 3.0 - The Human Metabolome Database in 2013, Nucleic Acids Res, 41, 801, 10.1093/nar/gks1065

Kanehisa, 2000, KEGG: Kyoto Encyclopedia of Genes and Genomes, Nucleic Acids Res, 28, 27, 10.1093/nar/28.1.27

Kanehisa, 2016, KEGG as a reference resource for gene and protein annotation, Nucleic Acids Res, 44, 457, 10.1093/nar/gkv1070

Lamari, 2013, Disorders of phospholipids, sphingolipids and fatty acids biosynthesis: toward a new category of inherited metabolic diseases, J Inherit Metab Dis, 36, 411, 10.1007/s10545-012-9509-7

Caspi, 2008, The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases, Nucleic Acids Res, 36(Suppl 1), D623

McKeon, 1966, Canonical analysis: some relation between canonical correlation, factor analysis, discriminant analysis, and scaling theory, Psychom Monogr, 13

Lastres-Becker, 2008, Insulin receptor and lipid metabolism pathology in ataxin-2 knock-out mice, Hum Mol Genet, 17, 1465, 10.1093/hmg/ddn035

Martin, 2005, Detailed characterization of the lipid composition of detergent-resistant membranes from photoreceptor rod outer segment membranes, Invest Ophthalmol Vis Sci, 46, 1147, 10.1167/iovs.04-1207

McMahon, 2011, Epidermal expression of an Elovl4 transgene rescues neonatal lethality of homozygous Stargardt disease-3 mice, J Lipid Res, 52, 1128, 10.1194/jlr.M014415

Lamari, 2015, An overview of inborn errors of complex lipid biosynthesis and remodelling, J Inherit Metab Dis, 38, 3, 10.1007/s10545-014-9764-x

Tenenhaus, 2015

Tenenhaus, 2014, Regularized generalized canonical correlation analysis for multiblock or multigroup data analysis, Eur J Oper Res, 238, 391, 10.1016/j.ejor.2014.01.008