Row–column interaction models, with an R implementation
Abstract
We propose a family of models called row–column interaction models (RCIMs) for two-way table responses. An RCIM applies a link function to a parameter (such as the cell mean) and sets the result equal to a row effect plus a column effect plus an optional interaction modelled as a reduced-rank regression. What sets this work apart is that the framework incorporates a very wide range of statistical models, e.g., (1) a log link with Poisson counts is Goodman's RC model, (2) an identity link with a double exponential distribution is median polish, (3) a logit link with Bernoulli responses is a Rasch model, (4) an identity link with normal errors is two-way ANOVA with one observation per cell, but allowing semi-complex modelling of interactions of the form $$\mathbf{A}\mathbf{C}^T$$, and (5) an exponential link with normal responses gives quasi-variances. Also proposed here is a least significant difference plot augmentation of quasi-variances. Being a special case of RCIMs, quasi-variances are naturally extended from the $$M=1$$ linear/additive predictor $$\eta$$ case (within the exponential family) to the $$M>1$$ case (vector generalized linear model families). A rank-1 Goodman's RC model is also shown to estimate the site scores and optimums of an equal-tolerances Poisson unconstrained quadratic ordination. New functions within the VGAM R package are described with examples. Altogether, RCIMs facilitate the analysis of matrix responses of many data types and are therefore potentially useful in many areas of applied statistics.
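As a sketch of the structure described above, a rank-$$R$$ RCIM for cell $$(i,j)$$ of an $$I \times J$$ table can be written as shown below. The symbols $$g$$, $$\theta_{ij}$$, $$\mu$$, $$\alpha_i$$, $$\gamma_j$$, $$a_{ir}$$ and $$c_{jr}$$, and the choice of which index belongs to which factor, are notation assumed here for illustration; the abstract itself gives only the interaction form $$\mathbf{A}\mathbf{C}^T$$. Identifiability constraints are omitted.

$$g(\theta_{ij}) = \eta_{ij} = \mu + \alpha_i + \gamma_j + \sum_{r=1}^{R} a_{ir}\, c_{jr}, \qquad i = 1,\ldots,I, \quad j = 1,\ldots,J$$

Under this assumed notation, $$\mu$$ is an intercept, $$\alpha_i$$ a row effect, $$\gamma_j$$ a column effect, and the sum is the $$(i,j)$$ element of $$\mathbf{A}\mathbf{C}^T$$ with $$\mathbf{A} = (a_{ir})$$ of size $$I \times R$$ and $$\mathbf{C} = (c_{jr})$$ of size $$J \times R$$. Setting $$R = 0$$ drops the interaction and leaves a main-effects model. Taking $$g = \log$$, Poisson counts and $$R = 1$$ gives $$\log E(Y_{ij}) = \mu + \alpha_i + \gamma_j + a_{i1} c_{j1}$$, which has the form of Goodman's RC model in case (1).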