Interpreting tree ensembles with inTrees

Houtao Deng
San Francisco, USA

Abstract

Keywords


References

Adnan, M.N., Islam, M.Z.: Forex++: a new framework for knowledge discovery from decision forests. Austral. J. Inf. Syst. https://doi.org/10.3127/ajis.v21i0.1539 (2017)

Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on Very Large Data Bases, VLDB, Vol. 1215, pp. 487–499 (1994)

Bastani, O., Kim, C., Bastani, H.: Interpretability via model extraction. arXiv preprint arXiv:1706.09773 (2017)

Bastani, O., Kim, C., Bastani, H.: Interpreting blackbox models via model extraction. arXiv preprint arXiv:1705.08504 (2017)

Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)

Breiman, L., Friedman, J., Olshen, R., Stone, C.: Classification and Regression Trees. Wadsworth, Belmont (1984)

Breiman, L., Shang, N.: Born again trees. University of California, Berkeley, Berkeley, CA, Technical Report (1996)

Deng, H.: Guided random forest in the RRF package. arXiv preprint arXiv:1306.0237 (2013)

Deng, H.: Interpreting tree ensembles with inTrees. arXiv preprint arXiv:1408.5456 (2014)

Deng, H., Runger, G.: Gene selection with guided regularized random forest. Pattern Recogn. 46(12), 3483–3489 (2013)

Deng, H., Runger, G., Tuv, E., Bannister, W.: CBC: An associative classifier with a small number of rules. Decis. Support Syst. 59, 163–170 (2014)

Domingos, P.: Knowledge acquisition from examples via multiple models. In: Proceedings of the Fourteenth International Conference on Machine Learning, pp. 98–106. Morgan Kaufmann (1997)

Eskandarian, S., Bahrami, P., Kazemi, P.: A comprehensive data mining approach to estimate the rate of penetration: application of neural network, rule based models and feature ranking. J. Pet. Sci. Eng. 156, 605–615 (2017)

Fokkema, M.: PRE: an R package for fitting prediction rule ensembles. arXiv preprint arXiv:1707.07149 (2017)

Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001)

Friedman, J.H., Popescu, B.E.: Predictive learning via rule ensembles. Ann. Appl. Stat. 2, 916–954 (2008)

Gallego-Ortiz, C., Martel, A.L.: Using quantitative features extracted from T2-weighted MRI to improve breast MRI computer-aided diagnosis (CAD). PLoS ONE 12(11), e0187501 (2017)

Gargett, A., Barnden, J.: Modeling the interaction between sensory and affective meanings for detecting metaphor. In: Proceedings of the Third Workshop on Metaphor in NLP, pp. 21–30 (2015)

Guidotti, R., Monreale, A., Turini, F., Pedreschi, D., Giannotti, F.: A survey of methods for explaining black box models. arXiv preprint arXiv:1802.01933 (2018)

Gurrutxaga, I., Pérez, J.M., Arbelaitz, O., Muguerza, J., Martín, J.I., Ansuategi, A.: CTC: an alternative to extract explanation from bagging. In: Conference of the Spanish Association for Artificial Intelligence, pp. 90–99. Springer (2007)

Hahsler, M., Grün, B., Hornik, K.: Introduction to arules: mining association rules and frequent item sets. SIGKDD Explorations (2007)

Hara, S., Hayashi, K.: Making tree ensembles interpretable. arXiv preprint arXiv:1606.05390 (2016)

Hara, S., Hayashi, K.: Making tree ensembles interpretable: a Bayesian model selection approach. arXiv preprint arXiv:1606.09066 (2016)

Khalid, M.H., Tuszynski, P.K., Szlek, J., Jachowicz, R., Mendyk, A.: From black-box to transparent computational intelligence models: a pharmaceutical case study. In: 2015 13th International Conference on Frontiers of Information Technology (FIT), pp. 114–118. IEEE (2015)

Liaw, A., Wiener, M.: Classification and regression by random forest. R News 2(3), 18–22 (2002)

Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml

Liu, B., Hsu, W., Ma, Y.: Integrating classification and association rule mining. In: Proceedings of the 1998 International Conference on Knowledge Discovery and Data Mining, pp. 80–86. ACM (1998)

Meinshausen, N.: Node harvest. Ann. Appl. Stat. 4, 2049–2072 (2010)

Miraboutalebi, S.M., Kazemi, P., Bahrami, P.: Fatty acid methyl ester (FAME) composition used for estimation of biodiesel cetane number employing random forest and artificial neural networks: a new approach. Fuel 166, 143–151 (2016)

Narayanan, I., Wang, D., Jeon, M., Sharma, B., Caulfield, L., Sivasubramaniam, A., Cutler, B., Liu, J., Khessib, B., Vaid, K.: SSD failures in datacenters: What? when? and why? In: Proceedings of the 9th ACM International on Systems and Storage Conference, p. 7. ACM (2016)

Ridgeway, G., et al.: GBM: Generalized boosted regression models. R Package Version 1(3), 55 (2006)

Szlęk, J., Pacławski, A., Lau, R., Jachowicz, R., Kazemi, P., Mendyk, A.: Empirical search for factors affecting mean particle size of PLGA microspheres containing macromolecular drugs. Comput. Methods Programs Biomed. 134, 137–147 (2016)

Therneau, T.M., Atkinson, B., Ripley, B.: RPART: Recursive partitioning. R Package Version 3(3.8) (2010)

Vandewiele, G., Lannoye, K., Janssens, O., Ongenae, F., De Turck, F., Van Hoecke, S.: A genetic algorithm for interpretable model extraction from decision tree ensembles. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 104–115. Springer (2017)

Wang, X., Lin, P., Ho, J.W.: Discovery of cell-type specific DNA motif grammar in cis-regulatory elements using random forest. BMC Genom. 19(1), 929 (2018)

Zhou, Y., Hooker, G.: Interpreting models via single tree approximation. arXiv preprint arXiv:1610.09036 (2016)