The Impact of Levene’s Test of Equality of Variances on Statistical Theory and Practice

Statistical Science - Volume 24, Issue 3 - 2009
Joseph L. Gastwirth1, Yulia R. Gel2, Weiwen Miao3
1George Washington University
2University of Waterloo
3Haverford College

References

Abelson, R. P. and Tukey, J. W. (1963). Efficient utilization of non-numerical information in quantitative analysis: General theory and the case of simple order. <i>Ann. Math. Statist.</i> <b>34</b> 1347–1369.

Box, G. E. P. (1953). Non-normality and tests on variances. <i>Biometrika</i> <b>40</b> 318–335.

Rosenbaum, P. R. (2002). <i>Observational Studies</i>. Springer, New York.

Agresti, A. (2002). <i>Categorical Data Analysis</i>. Wiley, New York.

Andrews, D. F., Bickel, P. J., Hampel, F. R., Huber, P. J., Rodgers, W. H. and Tukey, J. W. (1972). <i>Robust Estimates of Location: Survey and Advances</i>. Princeton Univ. Press, Princeton, NJ.

Evett, I. W. and Weir, B. S. (1998). <i>Interpreting DNA Evidence</i>. Sinauer, Sunderland, MA.

Gastwirth, J. L. (2001). Screening and selection. In <i>International Encyclopedia of Social Sciences</i> (N. J. Smelser and P. B. Baltes, eds.). Elsevier, Oxford, U.K. 13755–13767.

Gillespie, J. H. (1998). <i>Population Genetics: A Concise Guide</i>. Johns Hopkins Univ. Press, Baltimore, MD.

Hedrick, P. W. (2000). <i>Genetics of Populations</i>, 2nd ed. Jones and Bartlett, Sudbury, MA.

Johnson, N. L. and Leone, F. C. (1964). <i>Statistics and Experimental Design in Engineering and Physical Sciences</i>, 2nd ed. Wiley, New York.

Korn, E. L. and Graubard, B. I. (1999). <i>The Analysis of Health Surveys</i>. Wiley, New York.

Kutner, M. H., Nachtsheim, C. J. and Neter, J. (2004). <i>Applied Regression Analysis</i>. McGraw-Hill/Irwin, Boston.

Levene, H. (1960). Robust tests for equality of variances. In <i>Contributions to Probability and Statistics</i> (I. Olkin, ed.) 278–292. Stanford Univ. Press, Palo Alto, CA.

Little, R. J. A. and Rubin, D. B. (2002). <i>Statistical Analysis with Missing Data</i>. Wiley, New York.

Miller, R. G., Jr. (1986). <i>Beyond ANOVA: Basics of Applied Statistics</i>. Wiley, New York.

Milliken, G. A. and Johnson, D. E. (1984). <i>Analysis of Messy Data</i>, Vol. 1. Van Nostrand Reinhold, New York.

Molenberghs, G. and Kenward, M. G. (2007). <i>Missing Data in Clinical Studies</i>. Wiley, Chichester, UK.

Piegorsch, W. W. and Bailer, A. J. (2005). <i>Analyzing Environmental Data</i>. Wiley, Chichester, UK.

Pepe, M. (2003). <i>The Statistical Evaluation of Medical Tests for Classification and Prediction</i>. Wiley, Chichester, UK.

Pollak, E. (2006). The influence of Levene’s paper on polymorphism in subdivided populations. In <i>Proceedings of the Joint Statistical Meetings, August, 2006</i>. Amer. Statist. Assoc., Alexandria, VA.

Scheffe, H. (1959). <i>The Analysis of Variance</i>. Wiley, New York.

van Belle, G. (2002). <i>Statistical Rules of Thumb</i>. Wiley, New York.

Weir, B. (1996). <i>Genetic Data Analysis II</i>. Sinauer, Sunderland, MA.

Algina, J., Olejnik, S. and Ocanto, R. (1989). Type I error rates and power estimates for selected two-sample tests of scale. <i>Journal of Educational Statistics</i> <b>14</b> 373–383.

Arnold, S. F. (1980). Asymptotic validity of <i>F</i> tests for the ordinary linear model and the multiple correlation model. <i>J. Amer. Statist. Assoc.</i> <b>75</b> 890–894.

Auger, J. and Jouannet, P. (1997). Evidence for regional differences of semen quality among fertile French men. <i>Human Reproduction</i> <b>12</b> 740–745.

Balakrishnan, N. and Ma, C. W. (1990). A comparative study of various tests for the equality of two population variances. <i>J. Stat. Comput. Simul.</i> <b>35</b> 41–89.

Bancroft, T. A. (1964). Analysis and inference for incompletely specified models involving the use of preliminary test(s) of significance. <i>Biometrics</i> <b>20</b> 427–442.

Bartlett, M. S. (1937). Properties of sufficiency and statistical tests. <i>Proc. Roy. Soc. Ser. A</i> <b>160</b> 268–282.

Bathke, A. (2002). ANOVA for a large number of treatments. <i>Math. Methods Statist.</i> <b>11</b> 118–132.

Bathke, A. (2004). The ANOVA <i>F</i>-test can still be used in some balanced designs with unequal variances and non-normal data. <i>J. Statist. Plann. Inference</i> <b>2</b> 413–422.

Berger, A. K., Gottdiener, J. S., Yohe, M. A. and Guerro, J. L. (1999). Epidemiologic approach to quality assessment in echocardiographic diagnosis. <i>Journal of the American College of Cardiology</i> <b>34</b> 1831–1836.

Bickel, P. J. (1975). One-step Huber estimates in the linear model. <i>J. Amer. Statist. Assoc.</i> <b>70</b> 428–434.

Boos, D. D. and Brownie, C. (1989). Bootstrap methods for testing homogeneity of variances. <i>Technometrics</i> <b>31</b> 69–82.

Boos, D. D. and Brownie, C. (1995). ANOVA and ranks test when the number of treatments is large. <i>Statist. Probab. Lett.</i> <b>23</b> 183–191.

Boos, D. D. and Brownie, C. (2004). Comparing variances and other measures of dispersion. <i>Statist. Sci.</i> <b>19</b> 571–578.

Box, G. E. P. and Andersen, S. L. (1955). Permutation theory in the derivation of robust criteria and the study of departures from assumption. <i>J. Roy. Statist. Soc. Ser. B</i> <b>17</b> 1–26.

Brown, M. B. and Forsythe, A. B. (1974). Robust tests for equality of variances. <i>J. Amer. Statist. Assoc.</i> <b>69</b> 364–367.

Carlton, G. C. and Bazzaz, F. A. (1998). Resource congruence and forest regeneration following an experimental hurricane blowdown. <i>Ecology</i> <b>79</b> 1305–1319.

Carroll, R. J. and Ruppert, D. (1982). Robust estimation in heteroscedastic linear models. <i>Ann. Statist.</i> <b>10</b> 429–441.

Carroll, R. J. and Schneider, H. (1985). A note on Levene’s tests for equality of variances. <i>Statist. Probab. Lett.</i> <b>3</b> 191–194.

Cattaneo, Z., Postma, A. and Vecchi, T. (2006). Gender differences in memory for object and word. <i>Quarterly Journal of Experimental Psychology</i> <b>59</b> 904–919.

Chacko, V. J. (1963). Testing homogeneity against ordered alternatives. <i>Ann. Math. Statist.</i> <b>34</b> 945–956.

Chang, E. C., Jain, P. C. and Locke, P. R. (1995). Standard and Poors 500 index futures volatility and price changes around the New York Stock Exchange close. <i>Journal of Business</i> <b>68</b> 61–84.

Chang, E. C., Pinegar, J. M. and Schacter, B. (1997). Interday variations in volume, variance and participation of large speculators. <i>Journal of Banking and Finance</i> <b>21</b> 797–810.

Conover, W. J., Johnson, M. E. and Johnson, M. M. (1981). A comparative study of tests for homogeneity of variances, with applications to the outer continental shelf bidding data. <i>Technometrics</i> <b>23</b> 351–361.

Crow, E. L. and Siddiqui, M. M. (1967). Robust estimation of location. <i>J. Amer. Statist. Assoc.</i> <b>62</b> 353–389.

Coulson, D. and Joyce, L. (2006). Indexing variability: A case study with climate change impacts on ecosystems. <i>Ecological Indicators</i> <b>6</b> 749–769.

Christie, D. R. and Koch, T. W. (1997). The impact of market-specific public information on return variance in an illiquid market. <i>Journal of Futures Markets</i> <b>17</b> 887–908.

Cumming, J. and Hall, C. (2002). Athlete’s use of imagery in the off-season. <i>Sport Psychologist</i> <b>16</b> 160–172.

Davis, J. T. (1996). Experience and auditors’ selection of relevant information for preliminary control risk assessments. <i>Auditing</i> <b>15</b> 16–37.

Dempster, A. P. (1988). Employment discrimination and statistical science. <i>Statist. Sci.</i> <b>3</b> 149–161.

Dhillon, U. S., Lasser, D. J. and Watanabe, T. (1997). Volatility, information and double versus Walrasian auction pricing in US and Japanese futures markets. <i>Journal of Banking and Finance</i> <b>21</b> 1045–1061.

Dorfman, D. D. and Berbaum, K. S. (2000). A contaminated binormal model for ROC data-part III: Initial evaluation with detection ROC data. <i>Academic Radiology</i> <b>7</b> 438–447.

English, D. R., Armstrong, B. K. and Kricker, A. (1998). Reproducibility of reported measurements of sun exposure in a case-control study. <i>Cancer, Epidemiology, Biomarkers and Prevention</i> <b>7</b> 857–863.

Esserman, L., Cowley, H., Eberle, C., Kirkpatrick, A., Chang, S., Berbaum, K. and Gale, A. (2002). Improving the accuracy of mammography: Volume and outcome relationships. <i>Journal of the National Cancer Institute</i> <b>94</b> 369–375.

Fisher, N. I. (1986). Robust tests for comparing the dispersions of several Fisher or Watson distributions on the sphere. <i>Geophysical Journal of the Royal Astronomical Society</i> <b>85</b> 563–572.

Fligner, M. A. and Killeen, T. J. (1976). Distribution-free two-sample tests for scale. <i>J. Amer. Statist. Assoc.</i> <b>71</b> 210–213.

Flynn, F. J. and Brockner, J. (2003). It is different to give than to receive: Predictors of givers’ and receivers’ reactions to favor exchange. <i>Journal of Applied Psychology</i> <b>88</b> 1034–1045.

Francois, N., Guydot-Declerck, C., Hug, B., Callemien, D., Govaerts, B. and Collin, S. (2006). Beer astringency assessed by time-intensity and quantitative descriptive analysis: Influence of pH and accelerated aging. <i>Food Quality and Preference</i> <b>17</b> 445–452.

Freidlin, B. and Gastwirth, J. L. (2004). A note on the use of tests of mutation rates on ordered groups. <i>Genetic Testing</i> <b>8</b> 437–440.

Freidlin, B., Miao, W. and Gastwirth, J. L. (2003). On the use of the Shapiro–Wilk test in two-stage adaptive inference for paired data from moderate to very heavy tailed distributions. <i>Biometrical Journal</i> <b>45</b> 887–900.

Fujino, Y. (1979). Tests for the homogeneity of variances for ordered alternatives. <i>Biometrika</i> <b>66</b> 133–139.

Gastwirth, J. L. and Rubin, H. (1969). On robust linear estimators. <i>Ann. Math. Statist.</i> <b>40</b> 24–39.

Gastwirth, J. L. (1972). Robust estimation of the Lorenz curve and Gini index. <i>Rev. Econom. Statist.</i> <b>54</b> 306–316.

Gastwirth, J. L. (1987). The statistical precision of medical screening procedures: Application to polygraph and AIDS antibodies test data. <i>Statist. Sci.</i> <b>2</b> 213–238.

Giraud, T. and Capy, P. (1996). Somatic activity of the mariner transposable element in natural populations of Drosophila simulans. <i>Proceedings: Biological Sciences</i> <b>263</b> 1481–1486.

Grambsch, P. M. (1994). Simple robust tests for scale differences in paired data. <i>Biometrika</i> <b>81</b> 359–372.

Goodman, J., Green, E. and Loftus, E. F. (1989). Runaway verdicts or reasonable determination: Mock juror strategies in awarding damages. <i>Jurimetrics Journal</i> <b>29</b> 285–309.

Grissom, R. J. (2000). Heterogeneity of variance in clinical data. <i>Journal of Consulting and Clinical Psychology</i> <b>68</b> 155–165.

Graubard, B. I. and Korn, E. L. (1987). Choice of column scores for testing independence in ordered 2×<i>k</i> contingency tables. <i>Biometrics</i> <b>43</b> 471–476.

Greene, E., Coon, D. and Bornstein, B. (2001). The effects of limiting punitive damage awards. <i>Law and Human Behavior</i> <b>25</b> 217–234.

Hall, P. and Padmanabhan, A. R. (1997). Adaptive inference for the two-sample scale problem. <i>Technometrics</i> <b>39</b> 412–422.

Hardy, G. H. (1908). Mendelian proportions in a mixed population. <i>Science</i> <b>28</b> 40–50.

Hedrick, P. W. (2006). Genetic polymorphism in heterogeneous environments: The age of genomics. <i>Ann. Rev. Ecol. Systems</i> <b>37</b> 67–93.

Henriksen, H. (2003). The role of some regional factors in the assessment of well yields from hard-rock aquifers of Fennoscandia. <i>Hydrogeology Journal</i> <b>11</b> 628–645.

Hays, M. A., Irsula, B., McMullen, S. L. and Feldblum, P. J. (2001). A comparison of three daily coital diary designs and a phone-in regimen. <i>Contraception</i> <b>63</b> 159–166.

Hicks, T. V. and Leitenberg, H. (2001). Sexual fantasies about one’s partner versus someone else: Gender differences in incidence and frequency. <i>The Journal of Sex Research</i> <b>38</b> 43–50.

Hines, W. G. S. and Hines, R. J. O. (2000). Increased power with modified forms of the Levene (med) test for heterogeneity of variance. <i>Biometrics</i> <b>56</b> 451–454.

Hogg, R. V. (1974). Adaptive robust procedures: A partial review and some suggestions for future applications and theory. <i>J. Amer. Statist. Assoc.</i> <b>69</b> 909–923.

Hogg, R. V., Fisher, V. M. and Randles, R. H. (1975). A two-sample adaptive distribution-free test. <i>J. Amer. Statist. Assoc.</i> <b>70</b> 656–661.

Huber, P. J. (1972). Robust statistics: A review. <i>Ann. Math. Statist.</i> <b>43</b> 1041–1067.

Huber, P. J. (1973). Robust regression: Asymptotic, conjectures and Monte Carlo. <i>Ann. Statist.</i> <b>1</b> 799–821.

Huber, M., Chen, Y. G., Dinwoodie, I., Dobra, A. and Nicholas, M. (2006). Monte Carlo algorithms for Hardy–Weinberg proportions. <i>Biometrics</i> <b>62</b> 49–53.

Johnson, S. W., Rice, S. D. and Moles, D. A. (1998). Effects of submarine mine tailings disposal on juvenile yellowfin sole (Pleuronectes asper): A laboratory study. <i>Marine Pollution Bulletin</i> <b>36</b> 278–287.

Kahn, M. S., Coulibaly, P. and Dibike, Y. (2006). Uncertainty analysis of statistical downscaling methods using Canadian global climate predictors. <i>Hydrological Processes</i> <b>20</b> 3085–3104.

Keyes, T. K. and Levy, M. S. (1997). Analysis of Levene’s test under design imbalance. <i>Journal of Educational and Behavioral Statistics</i> <b>22</b> 845–858.

Koissi, M. C., Shapiro, A. R. and Hognas, G. (2006). Evaluating and extending the Lee–Carter model for mortality forecasting: Bootstrap confidence interval. <i>Insurance Math. Econom.</i> <b>38</b> 1–20.

Krutchkoff, R. G. (1988). One-way fixed effects analysis of variance when the variances may be unequal. <i>J. Stat. Comput. Simul.</i> <b>30</b> 259–183.

Kvamme, K. L., Stark, M. T. and Longacre, M. A. (1996). Alternative procedures for assessing standardization in ceramic assemblages. <i>American Antiquity</i> <b>61</b> 116–126.

Levene, H. (1949). On a matching problem arising in genetics. <i>Ann. Math. Statist.</i> <b>20</b> 91–94.

Levene, H. (1953). Genetic equilibrium when more than one ecological niche is available. <i>American Naturalist</i> <b>87</b> 331–333.

Lim, T. S. and Loh, W. Y. (1996). A comparison of tests of equality of variances. <i>Comput. Statist. Data Anal.</i> <b>22</b> 287–301.

Manly, B. F. J. and Francis, R. I. C. C. (2002). Testing for mean and variance differences with samples from distributions that may be non-normal with unequal variances. <i>J. Stat. Comput. Simul.</i> <b>72</b> 633–646.

Marti, M. W. and Wissler, R. L. (2000). Be careful what you ask for: The effect of anchors on personal injury damages awards. <i>Journal of Experimental Psychology-Applied</i> <b>6</b> 91–103.

Martin, C. G. and Games, P. A. (1977). Tests for homogeneity of variance: Non-normality and unequal samples. <i>Journal of Educational Statistics</i> <b>2</b> 187–206.

Maurer, H. P., Melchinger, A. E. and Frisch, M. (2007). An incomplete enumeration algorithm for an exact test of Hardy–Weinberg proportions with multiple alleles. <i>Theoretical and Applied Genetics</i> <b>115</b> 393–398.

Mayhew, D. A., Comer, C. P. and Stargel, W. W. (2003). Food consumption and body weight changes with neotame, a new sweetener with intense taste: Differentiating effects of palatability from toxicity in dietary safety studies. <i>Regulatory Toxicology and Pharmacology</i> <b>38</b> 124–143.

Miao, W. and Gastwirth, J. L. (2009). A new two stage adaptive nonparametric test for paired difference. <i>Statistics and Its Interface</i> <b>2</b> 213–221.

Miller, R. G., Jr. (1968). Jackknifing variances. <i>Ann. Math. Statist.</i> <b>39</b> 567–582.

Mitchell-Olds, T. and Rutledge, J. J. (1986). Quantitative genetics in natural populations: A review of the theory. <i>The American Naturalist</i> <b>127</b> 379–402.

Moser, B. K., Stevens, G. R. and Watts, C. L. (1989). The two-sample <i>T</i>-test versus Satterthwaite’s approximation <i>F</i>-test. <i>Communications in Statistics—Theory and Methods</i> <b>18</b> 3963–3975.

Moser, B. K., Stevens, G. R. and Watts, C. L. (1992). Homogeneity of variances in the two-sample means test. <i>Amer. Statist.</i> <b>46</b> 19–21.

Neave, F. B., Mandrak, N. E., Docker, M. F. and Noakes, D. L. (2006). Effects of preservation on pigmentation and length measurements in larval lampreys. <i>Journal of Fish Biology</i> <b>68</b> 991–1001.

Neuhauser, M. and Hothorn, L. A. (2000). Parametric location-scale and scale trend tests based on Levene’s transformation. <i>Comput. Statist. Data Anal.</i> <b>33</b> 189–200.

Nygard, F. and Sandstrom, A. (1989). Income inequality measures based on sample surveys. <i>J. Econometrics</i> <b>42</b> 81–95.

O’Brien, R. G. (1979). A general ANOVA method for robust tests of additive models for variances. <i>J. Amer. Statist. Assoc.</i> <b>74</b> 877–880.

O’Gorman, T. (1997). A comparison of an adaptive two-sample test to the <i>t</i>-test and the rank sum test. <i>Commun. Statist. Simulation and Comput.</i> <b>26</b> 1393–1411.

O’Neil, K. M., Penrod, S. D. and Bornstein, B. H. (2003). Web-based research: Methodological variables’ effects on dropout and sample characteristics. <i>Behavior Research Methods Instruments and Computers</i> <b>35</b> 217–226.

O’Neill, M. E. and Mathews, K. L. (2000). A weighted least squares approach to Levene’s test of homogeneity of variance. <i>Aust. N. Z. J. Stat.</i> <b>42</b> 81–100.

O’Neill, M. E. and Mathews, K. L. (2002). Levene tests of homogeneity of variance for general block and treatment designs. <i>Biometrics</i> <b>58</b> 216–224.

Pan, G. (2002). Confidence intervals for comparing two scale parameters based on Levene statistics. <i>J. Nonparametr. Stat.</i> <b>14</b> 459–476.

Plourdes, A. and Watkins, G. C. (1998). Crude oil prices between 1985 and 1994: How volatile in relation to other commodities? <i>Resource and Energy Economics</i> <b>20</b> 245–262.

Robbennolt, J. K. and Studebaker, C. A. (1999). Anchoring in the courtroom: The effect of caps on punitive damages. <i>Law and Human Behavior</i> <b>23</b> 353–373.

Rosser, D. A., Murdoch, I. E. and Cousens, S. N. (2004). The effect of optical defocus on the test-retest variability of visual acuity measurements. <i>Investigative Ophthalmology and Visual Science</i> <b>45</b> 1076–1079.

Roth, A. J. (1983). Robust trend tests derived and simulated: Analogs of the Welch and Brown–Forsythe tests. <i>J. Amer. Statist. Assoc.</i> <b>78</b> 1972–1980.

Saks, M. J., Hollinger, L. A., Wissler, R. L., Evans, D. L. and Hart, A. (1997). Reducing variability in civil jury awards. <i>Law and Human Behavior</i> <b>21</b> 243–256.

Sant, R. and Cowan, A. R. (1994). Do dividends signal earnings—the case of omitted dividends. <i>Journal of Banking and Finance</i> <b>18</b> 1113–1133.

Schaale, G. B. and Despain, D. J. (1996). Robustness of variance tests for randomized complete block data. <i>Commun. Statist. Simulation</i> <b>25</b> 961–977.

Schom, C. B. and Kit, J. M. (1980). Genetic and environmental-control of avian embryos response to a teratogen. <i>Poultry Science</i> <b>59</b> 473–478.

Schucany, W. R. and Ng, H. K. T. (2006). Preliminary goodness of fit tests for normality do not validate the one-sample Student <i>t</i>. <i>Comm. Statist.</i> <b>5</b> 2275–2286.

Shorack, G. R. (1969). Testing and estimating ratios of scale parameters. <i>J. Amer. Statist. Assoc.</i> <b>64</b> 999–1013.

Sprott, D. A. and Farewell, V. T. (1993). The difference between two normal means. <i>Amer. Statist.</i> <b>47</b> 126–128.

Star, B., Stoffels, R. J. and Spencer, H. G. (2007). Evolution of fitness and allele frequencies in a population with spatially heterogeneous selection pressures. <i>Genetics</i> <b>177</b> 1743–1751.

Tabain, M. (2001). Variability in fricative production and spectra: Implications for the hyper- and hypo- and quantal theories of speech production. <i>Language and Speech</i> <b>44</b> 57–94.

Vangel, M. G. (2005). A numerical approach to the Behrens–Fisher problem. <i>J. Statist. Plann. Inference</i> <b>130</b> 341–350.

Vincent, S. E. (1961). A test of homogeneity for ordered variances. <i>J. Roy. Statist. Soc. Ser. B</i> <b>23</b> 195–206.

Waldo, D. R. and Goering, H. K. (1979). Insolubility of proteins in ruminant feeds by 4 methods. <i>Journal of Animal Science</i> <b>49</b> 1560–1568.

Weerahandi, S. (1995). ANOVA under unequal error variances. <i>Biometrics</i> <b>51</b> 589–599.

Weinberg, W. (1908). Über den Nachweis der Vererbung beim Menschen. <i>Jahresh. Verein f. Vaterl. Naturk. in Württemberg</i> <b>64</b> 364–382.

Welch, B. L. (1938). The significance of the difference between two means when the population variances are unequal. <i>Biometrika</i> <b>29</b> 350–362.

Welch, B. L. (1951). On the comparison of several mean values: An alternative approach. <i>Biometrika</i> <b>38</b> 330–336.

Wilcox, R. R. (1989). Comparing the variances of dependent groups. <i>Psychometrika</i> <b>54</b> 305–315.

Yitnosumarto, S. and O’Neill, M. E. (1986). On Levene’s tests of variance homogeneity. <i>Aust. J. Statist.</i> <b>28</b> 230–241.

Zheng, G., Freidlin, B., Li, Z. and Gastwirth, J. L. (2003). Choice of scores in trend tests for case-control studies of candidate-gene associations. <i>Biometrical Journal</i> <b>45</b> 335–348.

Zimmerman, D. W. (2004). A note on preliminary tests of variances. <i>British J. Math. Statist. Psych.</i> <b>57</b> 173–181.