An improved nonparametric lower bound of species richness via a modified Good–Turing frequency formula

Biometrics - Volume 70, Issue 3 - Pages 671-682 - 2014
Chun‐Huo Chiu (1), Yi‐Ting Wang (1), Bruno Walther (2), Anne Chao (1)
(1) Institute of Statistics, National Tsing Hua University, Hsin-Chu 30043, Taiwan
(2) Master Program in Global Health and Development, College of Public Health and Nutrition, Taipei Medical University, 250 Wu-Hsing St., Taipei 110, Taiwan

Summary

It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper‐diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265–270) for individual‐based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good–Turing frequency formula, we derive an approximate formula for the first‐order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators.
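As an illustration of the traditional lower bound mentioned above, the following is a minimal Python sketch that counts singletons and doubletons in individual-based abundance data and computes a Chao (1984)-type bound. The function names (`frequency_counts`, `chao1_lower_bound`), the sample data, and the finite-sample factor (n - 1)/n are assumptions made for this illustration; the 1984 form is often written without that factor. The improved bound proposed in the paper additionally uses the tripleton and quadrupleton counts, but its formula is not stated in this summary and is therefore not reproduced here.

```python
from collections import Counter


def frequency_counts(abundances):
    """Return a Counter mapping k -> f_k, the number of species seen exactly k times.

    Species with zero observed individuals are ignored.
    """
    return Counter(a for a in abundances if a > 0)


def chao1_lower_bound(abundances):
    """Chao (1984)-type lower bound on species richness (illustrative sketch).

    Uses only the singleton count f1 and the doubleton count f2.
    The (n - 1) / n factor is an assumed finite-sample form.
    """
    counts = frequency_counts(abundances)
    s_obs = sum(counts.values())                 # observed species
    n = sum(k * fk for k, fk in counts.items())  # sample size (individuals)
    if n == 0:
        return 0.0
    f1, f2 = counts.get(1, 0), counts.get(2, 0)
    if f2 > 0:
        undetected = (n - 1) / n * f1 ** 2 / (2 * f2)
    else:
        # Common fallback when no doubletons are observed.
        undetected = (n - 1) / n * f1 * (f1 - 1) / 2
    return s_obs + undetected


# Hypothetical sample: one entry per observed species (individuals counted).
sample = [1, 1, 1, 1, 2, 2, 3, 4, 10, 25]
counts = frequency_counts(sample)
f1, f2, f3, f4 = (counts.get(k, 0) for k in (1, 2, 3, 4))
estimate = chao1_lower_bound(sample)
```

For this hypothetical sample, `f1` to `f4` are the singleton, doubleton, tripleton, and quadrupleton counts; the sketch uses only the first two, whereas the improved bound described in the summary would also draw on `f3` and `f4` to correct the first-order bias.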

References

10.1080/10618600.2011.647174

10.2307/2290733

10.1093/biomet/65.3.625

10.2307/1936861

Chao A., 1984, Nonparametric estimation of the number of classes in a population, Scandinavian Journal of Statistics, 11, 265

10.2307/2531532

Chao A., 2014, Estimation of species richness and shared species richness, Handbook of Methods and Applications of Statistics in the Atmospheric and Earth Sciences

10.1890/11-1952.1

10.1080/01621459.1992.10475194

10.1111/j.1541-0420.2011.01739.x

Chao A., 2010, Program SPADE (Species Prediction and Diversity Estimation)

10.1046/j.1472-4642.2003.00027.x

10.1098/rstb.1994.0091

10.2307/2531485

10.2307/1411

Foissner W., 2002, Soil ciliates (Protozoa, Ciliophora) from Namibia (Southwest Africa), with emphasis on two contrasting environments, the Etosha region and the Namib Desert, Denisia, 5, 1

10.1111/j.1574-6941.2006.00105.x

10.1080/00949650008812016

10.1093/biomet/43.1-2.45

Gotelli N. J., 2011, Estimating species richness, Biological Diversity: Frontiers in Measurement and Assessment

10.2307/1935358

10.2307/1935359

10.1016/j.csda.2011.01.017

10.1073/pnas.43.3.293

Magurran A. E., 2004, Measuring Biological Diversity

10.1034/j.1600-0706.2000.890316.x

10.2307/1941127

10.1214/aoms/1177698526

Royle J. A., 2008, Hierarchical Modelling and Inference in Ecology

10.1111/j.1474-919X.2001.tb04942.x

10.1111/j.2005.0906-7590.04112.x

10.1017/S0031182097002230

10.1093/biomet/asq026

10.1111/j.1654-1103.2012.01423.x