A hierarchical model for compositional data analysis
Abstract
This article introduces a hierarchical model for compositional analysis. Our approach models source and mixture data simultaneously and accounts for several types of variation: measurement error on both the mixture and source data; sampling variability in the source distributions; and variability in the mixing proportions themselves, which are generally of main interest. The method improves on some existing methods in that estimates of mixing proportions, including their interval estimates, are guaranteed to lie in the range [0, 1]. In addition, we show that the model can help where identifying appropriate source data is difficult, especially when it is extended to include a covariate. We first study the likelihood surface of a base model for a simple example, and then add prior distributions to create a Bayesian model that allows more complex situations to be analyzed via Markov chain Monte Carlo sampling from the posterior distribution. Application of the model is illustrated with two examples using real data: one concerning chemical markers in plants, and another on water chemistry.
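The abstract's two central ideas, mixing proportions confined to [0, 1] and separate layers of source and measurement variation, can be illustrated with a small simulation. This is a minimal sketch and not the authors' model: it assumes an additive log-ratio parametrisation for the proportions and normal errors at both levels, and the names `alr_inverse` and `simulate_mixture`, as well as all numerical values, are hypothetical.

```python
import numpy as np

def alr_inverse(z):
    """Map an unconstrained vector of length K-1 to K proportions on the simplex.

    Assumed parametrisation (inverse additive log-ratio); the last
    component acts as the reference category.
    """
    z = np.append(np.asarray(z, dtype=float), 0.0)
    z -= z.max()                  # subtract the maximum for numerical stability
    w = np.exp(z)
    return w / w.sum()

def simulate_mixture(source_means, p, source_sd, meas_sd, rng):
    """Draw one mixture observation under assumed normal errors."""
    # Sampling variability in the sources: each source profile is perturbed.
    sources = rng.normal(source_means, source_sd)
    # Mix the perturbed sources, then add measurement error on the mixture.
    n_markers = source_means.shape[1]
    return p @ sources + rng.normal(0.0, meas_sd, size=n_markers)

rng = np.random.default_rng(0)
# Made-up source profiles: 2 sources x 3 markers.
source_means = np.array([[10.0, 2.0, 5.0],
                         [3.0, 8.0, 1.0]])
p = alr_inverse([0.4])            # proportions are positive and sum to 1
y = simulate_mixture(source_means, p, source_sd=0.5, meas_sd=0.2, rng=rng)
```

Because the proportions are obtained by transforming an unconstrained quantity, any point or interval estimate formed on the unconstrained scale maps back into [0, 1], which is the property the abstract contrasts with some existing methods.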