Can’t see the forest for the trees
References
Banerjee M, Ding Y, Noone AM (2012) Identifying representative trees from ensembles. Stat Med 31(15):1601–16. https://doi.org/10.1002/sim.4492
Biecek P (2018) DALEX: explainers for complex predictive models in R. J Mach Learn Res 19(84):1–5. https://jmlr.org/papers/v19/18-416.html
Bischl B, Binder M, Lang M, Pielok T, Richter J, Coors S, Thomas J, Ullmann T, Becker M, Boulesteix AL, Deng D, Lindauer M (2021) Hyperparameter optimization: Foundations, algorithms, best practices and open challenges. CoRR arXiv:2107.05847
Bücker M, Szepannek G, Gosiewska A, Biecek P (2021) Transparency, auditability and explainability of machine learning models in credit scoring. J Oper Res Soc. https://doi.org/10.1080/01605682.2021.1922098
Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16(1):321–357
Cowan N (2010) The magical mystery four: How is working memory capacity limited, and why? Curr Dir Psychol Sci 19(1):51–57. https://doi.org/10.1177/0963721409359277
DeLong ER, DeLong DM, Clarke-Pearson DL (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44(3):837–845. https://doi.org/10.2307/2531595
Dua D, Graff C (2017) UCI machine learning repository. http://archive.ics.uci.edu/ml. Accessed 20 Nov 2022
European Commission (2020) On artificial intelligence—a European approach to excellence and trust. https://ec.europa.eu/info/sites/info/files/commission-white-paper-artificial-intelligence-feb2020_en.pdf. Accessed 20 Nov 2022
Fernández-Delgado M, Cernadas E, Barro S, Amorim D (2014) Do we need hundreds of classifiers to solve real world classification problems? J Mach Learn Res 15(1):3133–3181
Gower JC (1971) A general coefficient of similarity and some of its properties. Biometrics 27(4):857–871. https://doi.org/10.2307/2528823
Groemping U (2019) South German credit data: correcting a widely used data set. Technical report 4/2019, Department II, Beuth University of Applied Sciences Berlin. http://www1.beuth-hochschule.de/FB_II/reports/Report-2019-004.pdf. Accessed 20 Nov 2022
Laabs von Holt BH (2020) timbR: Tree interpretation methods based on range, R package version 0.1.0. https://github.com/imbs-hl/timbR. Accessed 20 Nov 2022
Laabs von Holt BH, Westenberger A, König IR (2022) Identification of representative trees in random forests based on a new tree-based distance measure. bioRxiv. https://doi.org/10.1101/2022.05.15.492004
Maechler M, Rousseeuw P, Struyf A, Hubert M, Hornik K (2022) Cluster: cluster analysis basics and extensions. R package version 2.1.4. https://CRAN.R-project.org/package=cluster. Accessed 20 Nov 2022
Miller G (1956) The magical number seven, plus or minus two: some limits on our capacity for processing information. Psychol Rev 63(2):81–97. https://doi.org/10.1037/h0043158
Molnar C (2022) Interpretable machine learning, 2nd edn. https://christophm.github.io/interpretable-ml-book. Accessed 20 Nov 2022
Moro S, Cortez P, Rita P (2014) A data-driven approach to predict the success of bank telemarketing. Decis Support Syst 62:22–31. https://doi.org/10.1016/j.dss.2014.03.001
Murtagh F, Contreras P (2017) Algorithms for hierarchical clustering: an overview, II. WIREs Data Min Knowl Discov. https://doi.org/10.1002/widm.1219
Probst P, Boulesteix AL, Bischl B (2019) Tunability: importance of hyperparameters of machine learning algorithms. J Mach Learn Res 20(1):1934–1965
Ridgeway G (2020) Generalized boosted models: a guide to the gbm package. https://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf. Accessed 20 Nov 2022
Robnik-Šikonja M, Bohanec M (2018) Perturbation-based explanations of prediction models. Springer International Publishing, Cham, pp 159–175
Szepannek G (2017) On the practical relevance of modern machine learning algorithms for credit scoring applications. WIAS Rep Ser 29:88–96. https://doi.org/10.20347/wias.report.29
Szepannek G (2022) An overview on the landscape of R packages for open source scorecard modelling. Risks. https://doi.org/10.3390/risks10030067
Szepannek G, Lübke K (2021) Facing the challenges of developing fair risk scoring models. Front Artif Intell 4:117. https://doi.org/10.3389/frai.2021.681915
Szepannek G, Lübke K (2022) Explaining artificial intelligence with care. KI Künstliche Intelligenz. https://doi.org/10.1007/s13218-022-00764-8
Szepannek G, Lübke K (2023) How much do we see? On the explainability of partial dependence plots for credit risk scoring. Argum Oecon. https://doi.org/10.15611/aoe.2023.1.07
Therneau TM, Atkinson EJ (2015) An introduction to recursive partitioning using the rpart routines. https://www.biostat.wisc.edu/~kbroman/teaching/statgen/2004/refs/therneau.pdf. Accessed 20 Nov 2022
Vanschoren J, van Rijn JN, Bischl B, Torgo L (2013) OpenML: networked science in machine learning. SIGKDD Explor 15(2):49–60. https://doi.org/10.1145/2641190.2641198