Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample

Environmental and Ecological Statistics - Volume 10, Issue 4 - Pages 429-443 - 2003
Anne Chao (1), Tsung-Jen Shen (1)
(1) Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan

Abstract

A biological community usually has a large number of species with relatively small abundances. When a random sample of individuals is selected and each individual is classified according to species identity, some rare species may not be discovered. This paper is concerned with the estimation of Shannon’s index of diversity when the number of species and the species abundances are unknown. The traditional estimator that ignores the missing species underestimates when there is a non-negligible number of unseen species. We provide a different approach based on unequal probability sampling theory because species have different probabilities of being discovered in the sample. No parametric forms are assumed for the species abundances. The proposed estimation procedure combines the Horvitz–Thompson (1952) adjustment for missing species and the concept of sample coverage, which is used to properly estimate the relative abundances of species discovered in the sample. Simulation results show that the proposed estimator works well under various abundance models even when a relatively large fraction of the species is missing. Three real data sets, two from biology and the other one from numismatics, are given for illustration.
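The abstract describes the procedure only in words. The sketch below illustrates how a coverage adjustment and a Horvitz-Thompson correction can be combined, assuming the form in which this estimator is commonly cited: sample coverage is estimated as C = 1 - f1/n, where f1 is the number of species seen exactly once and n is the sample size; each observed relative abundance X_i/n is shrunk to C * X_i/n; and each species' term is divided by its estimated probability of appearing in the sample, 1 - (1 - C * X_i/n)^n. The function names, the symbols, and the handling of the all-singletons case are assumptions made for illustration and are not taken from the abstract.

```python
import math


def shannon_plugin(counts):
    """Traditional plug-in estimate of Shannon's index; ignores unseen species."""
    counts = [x for x in counts if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one individual is required")
    return -sum((x / n) * math.log(x / n) for x in counts)


def shannon_coverage_ht(counts):
    """Coverage-adjusted estimate with a Horvitz-Thompson correction.

    Illustrative sketch only; the formula is assumed, not quoted from the abstract.
    """
    counts = [x for x in counts if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one individual is required")

    # f1: number of species observed exactly once (singletons).
    f1 = sum(1 for x in counts if x == 1)
    if f1 == n:
        # Assumption: when every individual is a singleton the coverage
        # estimate would be zero, so f1 is reduced by one to keep it positive.
        f1 = n - 1
    coverage = 1.0 - f1 / n

    h = 0.0
    for x in counts:
        p = coverage * x / n                # coverage-adjusted relative abundance
        inclusion = 1.0 - (1.0 - p) ** n    # estimated chance the species is seen
        h -= p * math.log(p) / inclusion    # Horvitz-Thompson weighted term
    return h


if __name__ == "__main__":
    sample = [10, 5, 3, 1, 1]  # hypothetical species counts
    print("plug-in:          ", shannon_plugin(sample))
    print("coverage-adjusted:", shannon_coverage_ht(sample))
```

In this hypothetical sample, two of the five observed species are singletons, so the estimated coverage falls below one and the adjusted estimate exceeds the plug-in value. With no singletons and large counts the two estimates nearly coincide.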

References

J. Ashbridge, I.B.J. Goudie (2000). Coverage-adjusted estimators for mark-recapture in heterogeneous populations. Communications in Statistics-Simulation, 29, 1215-37.

G.P. Basharin (1959). On a statistical estimate for the entropy of a sequence of independent random variables. Theory of Probability and Its Applications, 4, 333-6.

L.A. Batten (1976). Bird communities of some Killarney woodlands. Proceedings of the Royal Irish Academy, 76, 285-313.

J. Bunge, M. Fitzpatrick (1993). Estimating the number of species: a review. Journal of the American Statistical Association, 88, 364-73.

J. Bunge, M. Fitzpatrick, J. Handley (1995). Comparison of three estimators of the number of species. Journal of Applied Statistics, 22, 45-59.

A. Chao, S.-M. Lee (1992). Estimating the number of classes via sample coverage. Journal of the American Statistical Association, 87, 210-17.

A. Chao, W.-H. Hwang, Y.-C. Chen, C.-Y. Kuo (2000). Estimating the number of shared species in two communities. Statistica Sinica, 10, 227-46.

A. Chao, M.-C. Ma, M.C.K. Yang (1993). Stopping rules and estimation for recapture debugging with unequal failure rates. Biometrika, 80, 193-201.

R.K. Colwell, J.A. Coddington (1994). Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society, London B, 345, 101-18.

B. Efron, R.J. Tibshirani (1993). An Introduction to the Bootstrap. Chapman and Hall.

S. Engen (1978). Stochastic Abundance Models. Halsted Press.

W. Esty (1986). The efficiency of Good's nonparametric coverage estimator. The Annals of Statistics, 14, 1257-60.

I.J. Good (1953). The population frequencies of species and the estimation of population parameters. Biometrika, 40, 237-64.

P. Haas, L. Stokes (1998). Estimating the number of classes in a finite population. Journal of the American Statistical Association, 93, 1475-87.

L. Holst (1981). Some asymptotic results for incomplete multinomial or Poisson samples. Scandinavian Journal of Statistics, 8, 243-6.

D.G. Horvitz, D.J. Thompson (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 47, 663-85.

K. Hutcheson, L.R. Shenton (1974). Some moments of an estimate of Shannon's measure of information. Communications in Statistics, 3, 89-94.

D.H. Janzen (1973). Sweep samples of tropical foliage insects: description of study sites, with data on species abundances and size distributions. Ecology, 54, 659-86.

D.H. Janzen (1973). Sweep samples of tropical foliage insects: effects of seasons, vegetation types, elevation, time of day, and insularity. Ecology, 54, 687-708.

R.H. MacArthur (1957). On the relative abundances of bird species. Proceedings of National Academy of Science, 43, 193-295.

A.E. Magurran (1988). Ecological Diversity and Its Measurement. Princeton University Press.

B. Mandelbrot (1977). Fractals, Form, Chance and Dimension. Freeman.

J.L. Norris, K.H. Pollock (1998). Non-parametric MLE for Poisson species abundance models allowing for heterogeneity between species. Environmental and Ecological Statistics, 5, 391-402.

R.K. Peet (1974). The measurement of species diversity. Annual Review of Ecology and Systematics, 5, 285-307.

E.C. Pielou (1975). Ecological Diversity. Wiley.

W. Smith, J.F. Grassle (1977). Sampling properties of a family of diversity measures. Biometrics, 33, 283-92.

A.R. Solow (1993). A simple test for change in community structure. Journal of Animal Ecology, 62, 191-3.

S.K. Thompson (1992). Sampling. Wiley.

S. Zahl (1977). Jackknifing an index of diversity. Ecology, 58, 907-13.

G.K. Zipf (1965). Human Behavior and Principle of Least Effort. Addison-Wesley.