In search of good probability assessors: an experimental comparison of elicitation rules for confidence judgments

Springer Science and Business Media LLC - Volume 80 - Pages 363-387 - 2015
Guillaume Hollard1,2, Sébastien Massoni3, Jean-Christophe Vergnaud2,4
1Département d’Economie, Ecole Polytechnique, Palaiseau, France
2CNRS, Paris, France
3QuBE - School of Economics and Finance, Queensland University of Technology, Brisbane, Australia
4Centre d’Economie de la Sorbonne, University of Paris 1, Paris, France

Abstract

In this paper, we use an experimental design to compare the performance of elicitation rules for subjective beliefs. Unlike previous work, in which elicited beliefs are compared to an objective benchmark, we consider a purely subjective belief framework (confidence in one’s own performance in a cognitive task and a perceptual task). The performance of the different elicitation rules is assessed according to the accuracy of stated beliefs in predicting success. We measure this accuracy along two main dimensions: calibration and discrimination. For each of them, we propose two statistical indexes and compare the rules’ performance on each measure. The matching probability method provides more accurate beliefs in terms of discrimination, while the quadratic scoring rule reduces overconfidence, and the free rule, a simple rule with no incentives, succeeds in eliciting accurate beliefs. Nevertheless, the matching probability appears to be the best mechanism for eliciting beliefs, owing to its performance in terms of calibration and discrimination, its ability to elicit consistent beliefs across measures and across tasks, and its empirical and theoretical properties.
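To make the two accuracy dimensions concrete, the following is a minimal Python sketch, not the paper's own procedure. The abstract does not name its four statistical indexes, so the two measures here are assumptions chosen for illustration: calibration is taken as mean stated confidence minus the observed success rate (positive values indicate overconfidence), and discrimination as the area under the ROC curve separating successes from failures by stated confidence. The function names and the sample data are hypothetical. The quadratic payoff shown is one common normalization of the quadratic scoring rule and may differ from the payment scheme used in the experiment.

```python
from typing import Sequence


def overconfidence(confidence: Sequence[float], success: Sequence[int]) -> float:
    """Assumed calibration index: mean confidence minus success rate.

    Zero means well calibrated on average; positive means overconfident.
    """
    n = len(confidence)
    return sum(confidence) / n - sum(success) / n


def discrimination_auc(confidence: Sequence[float], success: Sequence[int]) -> float:
    """Assumed discrimination index: area under the ROC curve.

    Computed as the share of (success, failure) trial pairs in which the
    successful trial received the higher confidence; ties count one half.
    """
    hits = [c for c, s in zip(confidence, success) if s == 1]
    misses = [c for c, s in zip(confidence, success) if s == 0]
    if not hits or not misses:
        raise ValueError("need at least one success and one failure")
    wins = sum(
        1.0 if h > m else 0.5 if h == m else 0.0
        for h in hits
        for m in misses
    )
    return wins / (len(hits) * len(misses))


def quadratic_score(confidence: float, success: int) -> float:
    """One common form of the quadratic scoring rule: 1 - (outcome - p)^2."""
    return 1.0 - (success - confidence) ** 2


# Hypothetical data: stated confidence and outcome (1 = success) on four trials.
stated = [0.9, 0.8, 0.6, 0.7]
outcome = [1, 1, 0, 0]

print("overconfidence:", overconfidence(stated, outcome))
print("discrimination (AUC):", discrimination_auc(stated, outcome))
print("quadratic score, trial 2:", quadratic_score(stated[1], outcome[1]))
```

In this toy data the subject is overconfident on average yet discriminates perfectly, which illustrates why the two dimensions are assessed separately: a rule can improve one without improving the other.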
