Sherpa: Robust hyperparameter optimization for machine learning

SoftwareX - Volume 12 - Page 100591 - 2020
Lars Hertel (1), Julian Collado (2), Peter Sadowski (3), Jordan Ott (2), Pierre Baldi (2)
(1) Department of Statistics, Donald Bren School of Information and Computer Sciences, University of California, Irvine, Bren Hall 2019, Irvine, CA 92697-1250, USA
(2) Department of Computer Science, Donald Bren School of Information and Computer Sciences, University of California, Irvine, 3019 Donald Bren Hall, Irvine, CA 92697-3435, USA
(3) Information and Computer Sciences, University of Hawai’i at Mānoa, 1680 East-West Rd, Honolulu, HI 96822, USA
