A new GM-estimate with high breakdown point

Springer Science and Business Media LLC - Volume 48 - Pages 419-437 - 2013
Serif Hekimoğlu (1), R. Cuneyt Erenoglu (2)
(1) Department of Geomatics Engineering, Yildiz Technical University, Istanbul, Turkey
(2) Department of Geomatics Engineering, Canakkale Onsekiz Mart University, Canakkale, Turkey

Abstract

The breakdown point of GM-estimators does not exceed 1/(p+1), where p is the dimension of the explanatory variables in linear regression. This paper proposes a new version of GM-estimation with a high breakdown point (HBGM) that provides high resistance against leverage points and vertical outliers. The technique routinely computes normalized and robustified Euclidean distances among the data points to find leverage points in the x-direction and gross errors in the y-direction. In addition, a graphical display shows which points are leverage points and which are gross errors. Finally, the weights of the flagged data points are decreased to a certain value before the M-estimator is applied. Because the robustification and normalization procedures are based entirely on the median, the estimator with the highest breakdown point, the proposed method has a theoretical conditional breakdown point of 50 %. The technique was tested with simulated data and with a real data set containing a landslide deformation series. Tests were performed for linear regression models under different scenarios. The experimental results showed that the proposed method reaches a breakdown point of up to 50 %. Moreover, the HBGM method is less time-consuming in parameter estimation.
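
The general idea in the abstract (median-based distances to flag leverage points, reduced weights for flagged points, then an M-estimator) can be illustrated with a small sketch. This is not the authors' HBGM algorithm: the function names, the cutoff of 3.0, the reduced weight of 1e-3, the MAD scaling, and the Huber weight function with tuning constant 1.345 are all assumptions chosen for illustration, and only leverage points in the x-direction are flagged here.

```python
import numpy as np

def mad_scale(v):
    """Median absolute deviation, scaled to match sigma under normality."""
    med = np.median(v)
    return 1.4826 * np.median(np.abs(v - med))

def robust_distances(X):
    """Euclidean distance of each row from the coordinate-wise median,
    with each column divided by its MAD (assumes no column has MAD = 0)."""
    med = np.median(X, axis=0)
    scale = np.array([mad_scale(X[:, j]) for j in range(X.shape[1])])
    Z = (X - med) / scale
    return np.sqrt(np.sum(Z ** 2, axis=1))

def downweighted_huber_fit(X, y, cutoff=3.0, low_weight=1e-3, c=1.345, n_iter=50):
    """Illustrative fit: flag leverage points, reduce their weights,
    then run Huber-type iteratively reweighted least squares.
    cutoff, low_weight and c are hypothetical choices, not from the paper."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])            # design matrix with intercept
    flagged = robust_distances(X) > cutoff          # leverage points in x
    w_x = np.where(flagged, low_weight, 1.0)        # fixed design weights
    sw = np.sqrt(w_x)
    beta = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
    for _ in range(n_iter):
        r = y - A @ beta
        s = mad_scale(r)
        if s == 0:                                  # perfect fit, nothing to reweight
            break
        u = np.abs(r) / s
        w_r = np.minimum(1.0, c / np.maximum(u, 1e-12))   # Huber weights
        sw = np.sqrt(w_x * w_r)
        beta_new = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)[0]
        converged = np.allclose(beta_new, beta, atol=1e-10)
        beta = beta_new
        if converged:
            break
    return beta, flagged

# Simulated line y = 1 + 2x with five bad leverage points at x = 100.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(0, 0.1, 50)
x[:5] = 100.0
y[:5] = 0.0
beta, flagged = downweighted_huber_fit(x[:, None], y)
print(beta, np.flatnonzero(flagged))
```

Because the centering and scaling use only medians, the distances of the clean points stay small even when a sizeable share of the x-values is contaminated, which is the property the abstract relies on for its conditional breakdown point.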
