A comparison of clustering methods for river benthic community analysis

Hydrobiologia - Volume 347 - Pages 24-40 - 1997
Yong Cao1, Anthony W. Bark1, W. Peter Williams1
1Division of Life Sciences, King's College London, London, UK

Abstract

Four commonly used clustering methods (UPGMA, Ward Linkage, Complete Linkage and TWINSPAN) were compared in their ability to recognise the structure of three river macroinvertebrate datasets which were pre-determined based on habitat and biological characteristics or chemical water quality of sampling sites. DCA, NMDS and ANOSIM were applied to the same datasets to provide further information about data structure, and nonparametric tests were also undertaken on major chemical variables to justify the pre-determinations. The modified Rand Index was used to measure the agreement between a particular solution and the pre-determined classification. The results showed that Ward Linkage performed best when its use was broadened and used with the CY Dissimilarity Measure, followed by TWINSPAN and Complete Linkage, with UPGMA being least successful. There was evidence to suggest that the effectiveness of some clustering methods (e.g. UPGMA) may vary at different clustering levels, and simulation techniques which have been used to assess clustering methods could leave some properties of clustering methods unexamined.
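
The agreement measure named in the abstract can be illustrated with a minimal sketch. It assumes that the "modified Rand Index" corresponds to the chance-corrected (adjusted) Rand index in the Hubert-Arabie form; the abstract does not give the formula, so this is an assumption. The function name, the site labels and the two example cluster solutions are hypothetical and are not taken from the paper's data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected Rand index between two partitions of the same items.

    Assumed here to match the 'modified Rand Index' of the abstract.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two items are needed to form a pair")
    # Pairs of items grouped together in both partitions (contingency cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs grouped together within each partition taken separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_a * together_b / comb(n, 2)
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both partitions place every item in one group.
        return 1.0
    return (together_both - expected) / (max_index - expected)


# Hypothetical pre-determined classification of six sampling sites.
predetermined = ["clean", "clean", "moderate", "moderate", "polluted", "polluted"]
# Two hypothetical cluster solutions for the same six sites.
perfect = [1, 1, 2, 2, 3, 3]
shifted = [1, 1, 1, 2, 2, 3]

print(adjusted_rand_index(predetermined, perfect))
print(adjusted_rand_index(predetermined, shifted))
```

A value of 1 indicates that a cluster solution reproduces the pre-determined groups exactly, while values near 0 indicate agreement no better than chance. Cluster labels only need to be consistent within each partition, so a solution is not penalised for numbering its groups differently.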
