Brain Connectivity-Informed Regularization Methods for Regression

Statistics in Biosciences - Volume 11 - Pages 47-90 - 2017
Marta Karas1, Damian Brzyski2, Mario Dzemidzic3, Joaquín Goñi4, David A. Kareken3, Timothy W. Randolph5, Jaroslaw Harezlak2
1Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, USA
2Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, USA
3Department of Neurology, Indiana University School of Medicine, Indianapolis, USA
4School of Industrial Engineering and Weldon School of Biomedical Engineering, Purdue University, West Lafayette, USA
5Biostatistics and Biomathematics, Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, USA

Abstract

One of the challenging problems in brain imaging research is the principled incorporation of information from different imaging modalities. Frequently, each modality is analyzed separately using, for instance, dimensionality reduction techniques, which results in a loss of mutual information. We propose a novel regularization method to estimate the association between brain structure features and a scalar outcome within the linear regression framework. Our regularization technique provides a principled way to use external information from structural brain connectivity to inform the estimation of the regression coefficients. Our proposal extends the classical Tikhonov regularization framework by defining a penalty term based on the structural connectivity-derived Laplacian matrix. Here, we address both theoretical and computational issues. The approach is first illustrated using simulated data and compared with other penalized regression methods. We then apply our regularization method to study the associations between alcoholism phenotypes and brain cortical thickness, using a diffusion imaging-derived measure of structural connectivity. Using the proposed methodology in 148 young male subjects at risk for alcoholism, we found negative associations between cortical thickness and drinks per drinking day in the bilateral caudal anterior cingulate cortex, left lateral orbitofrontal cortex (OFC), and left precentral gyrus.
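
To make the idea of a Laplacian-based Tikhonov penalty concrete, the following is a minimal sketch and not the authors' implementation. It assumes the penalty takes the quadratic form b' Q b, where Q is the graph Laplacian built from a symmetric, non-negative connectivity matrix, and that the regularization parameter is supplied by the user. The function names (`graph_laplacian`, `laplacian_ridge`), the optional normalization, and the small `ridge` term added for numerical stability are all illustrative assumptions.

```python
import numpy as np


def graph_laplacian(A, normalized=True):
    """Laplacian of a symmetric, non-negative adjacency matrix A.

    Assumption: A plays the role of a structural connectivity matrix.
    """
    A = np.asarray(A, dtype=float)
    d = A.sum(axis=1)            # node degrees
    L = np.diag(d) - A           # unnormalized Laplacian D - A
    if normalized:
        # Scale both sides by D^{-1/2}; isolated nodes (degree 0) stay unscaled.
        safe_d = np.where(d > 0, d, 1.0)
        s = np.where(d > 0, 1.0 / np.sqrt(safe_d), 1.0)
        L = s[:, None] * L * s[None, :]
    return L


def laplacian_ridge(X, y, L, lam, ridge=1e-8):
    """Closed-form minimizer of ||y - X b||^2 + lam * b' (L + ridge * I) b.

    Assumption: lam is given; choosing it from the data is not shown here.
    """
    p = X.shape[1]
    Q = L + ridge * np.eye(p)    # small ridge keeps the linear system well-posed
    return np.linalg.solve(X.T @ X + lam * Q, X.T @ y)


# Toy usage: four regions connected in a chain, simulated data.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, 1.0, 0.0, 0.0]) + 0.5 * rng.normal(size=50)
beta_hat = laplacian_ridge(X, y, graph_laplacian(A), lam=5.0)
```

With the unnormalized Laplacian, the penalty equals the weighted sum of squared differences between coefficients of connected regions, so larger values of `lam` pull the estimates for connected regions toward each other. Setting `lam=0` recovers ordinary least squares.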
