Structural Topic Models for Open‐Ended Survey Responses

American Journal of Political Science - Volume 58, Issue 4, Pages 1064-1082 - 2014
Margaret E. Roberts1, Brandon Stewart2, Dustin Tingley2, Christopher Lucas2, Jetson Leder‐Luis3, Shana Kushner Gadarian4, Bethany Albertson5, David G. Rand6
1University of California, San Diego
2Harvard University
3California Institute of Technology
4Syracuse University
5University of Texas at Austin
6Yale University

Abstract

Collection, and especially analysis, of open‐ended survey responses are relatively rare in the discipline, and when conducted they are almost exclusively done through human coding. We present an alternative, semiautomated approach, the structural topic model (STM) (Roberts, Stewart, and Airoldi 2013; Roberts et al. 2013), that draws on recent developments in machine-learning-based analysis of textual data. A crucial contribution of the method is that it incorporates information about the document, such as the author's gender, political affiliation, and treatment assignment (if an experimental study). This article focuses on how the STM is helpful for survey researchers and experimentalists. The STM makes analyzing open‐ended responses easier, more revealing, and capable of being used to estimate treatment effects. We illustrate these innovations with analysis of text from surveys and experiments.
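As a concrete, deliberately simplified illustration of the workflow the abstract describes, the sketch below fits a plain topic model to a few hypothetical open‐ended responses and then compares topic prevalence between treated and control respondents. It is not the STM itself: scikit-learn's LatentDirichletAllocation ignores document covariates, whereas the STM estimates topics and covariate effects jointly (the authors' R package, stm, implements the actual model). The example texts and the treated vector are invented for illustration.

# Minimal sketch, NOT the authors' STM: plain LDA as a stand-in, followed by a
# crude comparison of topic prevalence across a hypothetical treatment assignment.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

responses = [
    "worried about jobs and the economy",
    "concerned about border security and crime",
    "immigrants contribute to the economy and jobs",
    "crime and security are my main concern",
]
treated = np.array([1, 1, 0, 0])  # hypothetical treatment indicator, one per response

# Build a document-term matrix and fit a two-topic LDA model.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(responses)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(dtm)  # rows: documents, columns: topic proportions

# Crude treatment-effect estimate: difference in mean topic prevalence
# between treated and control respondents, one value per topic. The STM
# instead models this covariate relationship inside the topic model itself.
effect = theta[treated == 1].mean(axis=0) - theta[treated == 0].mean(axis=0)
print("Difference in mean topic prevalence (treated - control):", effect)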

Keywords


References

Anandkumar Anima, 2012, A Spectral Algorithm for Latent Dirichlet Allocation, Advances in Neural Information Processing Systems, 25, 926

10.1162/coli.07-034-R2

Bischof Jonathan, 2012, Summarizing topical content with word frequency and exclusivity, 201

10.1145/2133806.2133826

Blei David M., 2003, Latent Dirichlet Allocation, Journal of Machine Learning Research, 3, 993

10.1016/j.jsc.2005.04.011

Chang, Jonathan, Jordan Boyd‐Graber, Chong Wang, Sean Gerrish, and David M. Blei. 2009. "Reading Tea Leaves: How Humans Interpret Topic Models." Advances in Neural Information Processing Systems, 288–296.

10.1017/S0003055406062514

Eisenstein Jacob, 2011, Sparse Additive Generative Models of Text, Proceedings of the 28th International Conference on Machine Learning, 1041

Gadarian Shana, Anxiety, Immigration, and the Search for Information, Political Psychology

10.1086/269113

10.1086/269268

10.1017/CBO9780511815492

10.1093/pan/mpp034

10.1093/pan/mpq027

10.1073/pnas.1018067108

10.1093/pan/mps028

10.1111/j.1540-5907.2009.00428.x

Hopkins, Daniel. 2013. "The Exaggerated Life of Death Panels: The Limits of Framing Effects in the 2009–2012 Health Care Debate." Working Paper, Georgetown University.

10.1177/0002716296546001006

10.1016/j.patrec.2009.09.011

10.1515/9781400855650

10.4324/9780203882313

10.1146/annurev.psych.50.1.537

10.1017/S0003055403000698

10.1086/265666

10.1126/science.1167742

10.1111/j.1468-2958.2002.tb00826.x

Lucas, Christopher, Richard Nielsen, Margaret Roberts, Brandon Stewart, Alex Storer, and Dustin Tingley. 2013. "Computer Assisted Text Analysis for Comparative Politics." Working paper.

10.1017/CBO9780511809071

Mimno David, 2011, Proceedings of the Conference on Empirical Methods in Natural Language Processing, 262

Newman David, 2010, Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 100

10.1214/12-BA734

10.1111/j.1540-5907.2009.00427.x

10.1038/nature11467

10.2307/1954456

Roberts, Margaret E., Brandon M. Stewart, and Edoardo M. Airoldi. 2013. "Structural Topic Models." Working paper.

Roberts, Margaret E., Brandon M. Stewart, Dustin Tingley, and Edoardo M. Airoldi. 2013. "The Structural Topic Model and Applied Social Science." Advances in Neural Information Processing Systems Workshop on Topic Models: Computation, Application, and Evaluation.

10.2307/2090907

Schuman Howard, 1996, Questions and Answers in Attitude Surveys: Experiments on Question Form, Wording, and Context

10.1093/pan/mph004

Sontag D., 2009, Advances in Neural Information Processing: Workshop on Applications for Topic Models: Text and Beyond

10.1145/2020408.2020480

Zou James, 2012, Priors for Diversity in Generative Latent Variable Models, Advances in Neural Information Processing Systems, 25, 3005