Knowledge Discovery: Methods from data mining and machine learning
References
Aizawa, 2021, Decomposition of improvements in infant mortality in asian developing countries over three decades, Demography, 58, 137, 10.1215/00703370-8931544
Akaike, 1977
Anand, 1998
Anyadike-Danes, 2010, My brilliant career: characterizing the early labor market trajectories of British women from generation X, Socio. Methods Res., 38, 482, 10.1177/0049124110362968
Arpino, 2022, What tears couples apart: a machine learning analysis of union dissolution in Germany, Demography, 59, 161, 10.1215/00703370-9648346
Athey, 2015, A measure of robustness to misspecification, Am. Econ. Rev., 105, 476, 10.1257/aer.p20151020
Athey, 2016, Recursive partitioning for heterogeneous causal effects, Proc. Natl. Acad. Sci., 113, 7353, 10.1073/pnas.1510489113
Bacher, 2000, A probabilistic clustering model for variables of mixed type, Qual. Quantity, 34, 223, 10.1023/A:1004759101388
Bail, 2008, The configuration of symbolic boundaries against immigrants in Europe, Am. Socio. Rev., 73, 37, 10.1177/000312240807300103
Bankes, 2002, Agent-based modeling: a revolution, Proc. Natl. Acad. Sci. USA, 99, 7199, 10.1073/pnas.072081299
Billari, 2006, Timing, sequencing, and quantum of life course events: a machine learning approach, Eur. J. Popul., 22, 37, 10.1007/s10680-005-5549-0
2014
Bond, 2012, A 61-million-person experiment in social influence and political mobilization, Nature, 489, 295, 10.1038/nature11421
Bonikowski, 2016, Varieties of American popular nationalism, Am. Socio. Rev., 81, 949, 10.1177/0003122416663683
Brand, 2021, Uncovering sociological effect heterogeneity using tree-based machine learning, Socio. Methodol., 51, 189, 10.1177/0081175021993503
Brand, 2023, Recent developments in causal inference and machine learning, Annu. Rev. Sociol., 10.1146/annurev-soc-030420-015345
Breiman, 2001, Statistical modeling: two cultures (with discussion), Stat. Sci., 16, 199, 10.1214/ss/1009213726
Breiman, 2001, Random forests, Mach. Learn., 45, 5, 10.1023/A:1010933404324
Breiman, 1984
Clogg, 1995, Latent class models
Conte, 2016, Computational social and behavioral science
Deza, 2006
Diamond, 2013, Genetic matching for estimating causal effects: a general multivariate matching method for achieving balance in observational studies, Rev. Econ. Stat., 95, 932, 10.1162/REST_a_00318
Donoho, 2017, 50 years of data science, J. Comput. Graph Stat., 26, 745, 10.1080/10618600.2017.1384734
Dumbill, 2013, A revolution that will transform how we live, work, and think: an interview with the author of big data, Big Data, 1, 73, 10.1089/big.2013.0016
Epstein, 2006, Remarks on the foundations of agent-based generative social science, Handb. Comput. Econ., 2, 1585, 10.1016/S1574-0021(05)02034-4
Fayyad, 1996, Knowledge discovery and data mining: towards a unifying framework, KDD-96 Proceedings, 82
Frye, 2015, Ideals as anchors for relationship experiences, Am. Socio. Rev., 80, 496, 10.1177/0003122415581333
Garip, 2012
Garip, 2017
Garson, 1998
Gilbert, 2006, Emerging artificial societies through learning, J. Artif. Soc. Soc. Simulat., 9, 9
Glymour, 1997, Statistical themes and lessons for data mining, Data Min. Knowl. Discov., 1, 11, 10.1023/A:1009773905005
Goldberger, 2017
Gondal, 2022, Multiplexity as a lens to investigate the cultural meanings of interpersonal ties, Soc. Network., 68, 209, 10.1016/j.socnet.2021.07.002
Gorunescu, 2011
Hagenaars, 2002
Han, 2018
Hand, 2001
Hedt, 2011, Health indicators: eliminating bias from convenience sampling estimators, Stat. Med., 30, 560, 10.1002/sim.3920
Heiberger, 2021, Facets of specialization and its relation to career success: an analysis of U.S. sociology, 1980 to 2015, Am. Socio. Rev., 86, 1164
Hofman, 2017, Prediction and explanation in social systems, Science, 355, 486, 10.1126/science.aal3856
Holton, 2017
Hu, 2021, Analysis of heterogeneity effects: opportunities and challenges of machine learning, Sociol. Stud.
ImageNet
Kim, 2018, Evaluating sampling methods for content analysis of twitter data, Social Media + Soc., 4, 10.1177/2056305118772836
Kramer, 2014, Experimental evidence of massive-scale emotional contagion through social networks, Proc. Natl. Acad. Sci. USA, 111, 8788, 10.1073/pnas.1320040111
Lazer, 2009, Computational social science, Science, 323, 721, 10.1126/science.1167742
Lee, 2017, Social disadvantage, severe child abuse, and biological profiles in adulthood, J. Health Soc. Behav., 58, 371, 10.1177/0022146516685370
Levenshtein, 1966, Binary codes capable of correcting deletions, insertions, and reversals, Dokl. Phys., 10, 707
Lundberg, 2022
Luma-Osmani, 2020, 48
MacKay, 2003
Manyika, 2011
Mason, 2014, Computational social science and social computing, Mach. Learn., 95, 257, 10.1007/s10994-013-5426-8
Mauro, 2016, A formal definition of big data based on its essential features, Libr. Rev., 65, 122, 10.1108/LR-06-2015-0061
Michel, 2011, Quantitative analysis of culture using millions of digitized books, Science, 331, 176
Molina, 2019, Machine learning for sociology, Annu. Rev. Sociol., 45, 27, 10.1146/annurev-soc-073117-041106
Moody, 2004, The structure of a social science collaboration network: disciplinary cohesion from 1963 to 1999, Am. Socio. Rev., 69, 213, 10.1177/000312240406900204
Morgan, 2015
Muthén, 2004, Latent variable analysis: growth mixture modeling and related techniques for longitudinal data
Neal, 1992, Connectionist learning of belief networks, Artif. Intell., 56, 71, 10.1016/0004-3702(92)90065-6
Nelson, 2021, Cycles of conflict, a century of continuity: the impact of persistent place-based political logics on women’s movement form, Am. J. Sociol., 127, 10.1086/714915
Nelson, 2020, Computational grounded theory: a methodological framework, Socio. Methods Res., 49, 3, 10.1177/0049124117729703
Pavlova, 2020, Mental health discourse and social media: which mechanisms of cultural power drive discourse on twitter, Soc. Sci. Med., 263, 10.1016/j.socscimed.2020.113250
Peterson, 2014, Convenience samples of college students and research reproducibility, J. Bus. Res., 67, 1035, 10.1016/j.jbusres.2013.08.010
Provost, 2013
Reitermanova, 2010, Data splitting, WDS’10 Proceedings of Contributed Papers, 1, 31
Rigobon, 2019, Winning models for GPA, grit, and layoff in the fragile families challenge, Socius, 5, 1, 10.1177/2378023118820418
Quinlan, 1986, Induction of decision trees, Mach. Learn., 1, 81, 10.1007/BF00116251
Salganik, 2020, Measuring the predictability of life outcomes with a scientific mass collaboration, Proc. Natl. Acad. Sci. USA, 117, 8398, 10.1073/pnas.1915006117
Samuel, 1959, Some studies in machine learning using the game of checkers, IBM J. Res. Dev., 3, 210, 10.1147/rd.33.0210
Scarborough, 2020, Gendered places: the dimensions of local gender norms across the United States, Gend. Soc., 34, 705, 10.1177/0891243220948220
Seife, 2015, Big data: the revolution is digitized, Nature, 518, 480, 10.1038/518480a
Scarborough, 2021, The intersection of racial and gender attitudes, 1977 through 2018, Am. Socio. Rev., 86, 823, 10.1177/00031224211033582
Scarborough, 2019, Attitudes and the stalled gender revolution: egalitarianism, traditionalism, and ambivalence from 1977 through 2016, Gend. Soc., 33, 173, 10.1177/0891243218809604
Shu, 2003
Shu, 2020
Sianes, 2014, Rating the rich: an ordinal classification to determine which rich countries are helping poorer ones the most, Soc. Indicat. Res., 116, 47, 10.1007/s11205-013-0270-6
Soehl, 2021, How legacies of geopolitical trauma shape popular nationalism today, Am. Socio. Rev., 86, 406, 10.1177/00031224211011981
Van de Rijt, 2013, Only 15 minutes? The social stratification of fame in printed media, Am. Socio. Rev., 78, 266, 10.1177/0003122413480362
Watts, 2013, Computational social science: exciting progress and future directions, The Bridge on Frontiers of Engineering, 43, 5
Wager, 2018, Estimation and inference of heterogeneous treatment effects using random forests, J. Am. Stat. Assoc., 113, 1228, 10.1080/01621459.2017.1319839
Westreich, 2010, Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression, J. Clin. Epidemiol., 63, 826, 10.1016/j.jclinepi.2009.11.020
Winton, 2021, A multi-group analysis of convenience samples: free, cheap, friendly, and fancy sources, Int. J. Soc. Res. Methodol., 1
Witten, 2011
Wyss, 2014, The role of prediction modeling in propensity score estimation: an evaluation of logistic regression, bCART, and the covariate-balancing propensity score, Am. J. Epidemiol., 180, 645, 10.1093/aje/kwu181
Xu, 2021, Detecting suicide risk using knowledge-aware natural language processing and counseling service data, Soc. Sci. Med., 283, 10.1016/j.socscimed.2021.114176
Zhang, 2019, CASM: a deep learning approach for identifying collective action events with text and image data from social media, Socio. Methodol., 49, 1, 10.1177/0081175019860244
Zhang, 2022, Image clustering: an unsupervised approach to categorize visual data in social science research, Socio. Methods Res., 10.1177/00491241221082603
Zhang, 2016, Tweet sarcasm detection using deep neural network, Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2449