Efficient image dataset classification difficulty estimation for predicting deep-learning accuracy

The Visual Computer - Volume 37, Issue 6 - Pages 1593-1610 - 2021
Florian Scheidegger (1,2), Roxana Istrate (2,3), Giovanni Mariani (2), Luca Benini (1,4), Costas Bekas (2), Cristiano Malossi (2)
(1) ETH Zürich, Zürich, Switzerland
(2) IBM Research - Zürich, Rüschlikon, Switzerland
(3) Queen’s University of Belfast, Northern Ireland, UK
(4) Università di Bologna, Bologna, Italy

Abstract

In the deep-learning community, new algorithms are published at a very fast pace. Solving an image classification problem for a new dataset is therefore challenging, as it requires re-evaluating published algorithms and their different configurations to find a close-to-optimal classifier. To facilitate this process, before biasing our decision toward a class of neural networks or running an expensive search over the network space, we propose to estimate the classification difficulty of the dataset. Our method computes a single number that characterizes the dataset difficulty $97\times$ faster than training state-of-the-art networks. The proposed method can be used in combination with network topology and hyper-parameter search optimizers to efficiently drive the search toward promising neural network configurations.
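The abstract does not say how the difficulty number is computed, so the following is only an illustrative sketch of the general idea of a cheap, single-number difficulty proxy. It is not the authors' method. It assumes a stand-in technique: fit a nearest-centroid probe on half of each class and report one minus its held-out accuracy. The function name `probe_difficulty` and the toy feature vectors are hypothetical.

```python
def probe_difficulty(samples, labels):
    """Return 1 - held-out accuracy of a nearest-centroid probe.

    Assumption: this is a stand-in proxy, not the method from the paper.
    0.0 means the probe separates the classes perfectly; values near 1.0
    mean the probe fails on most held-out samples.
    """
    # Deterministic stratified split: within each class, alternate
    # samples between the probe-training set and the held-out set.
    seen = {}
    train, held_out = [], []
    for x, y in zip(samples, labels):
        k = seen.get(y, 0)
        seen[y] = k + 1
        (train if k % 2 == 0 else held_out).append((x, y))
    if not held_out:
        raise ValueError("need at least two samples in some class")

    # Compute one centroid (mean feature vector) per class.
    sums, counts = {}, {}
    for x, y in train:
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}

    def predict(x):
        # Assign the class whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    correct = sum(predict(x) == y for x, y in held_out)
    return 1.0 - correct / len(held_out)


# Toy data (hypothetical): two well-separated clusters versus
# two classes interleaved along a line.
easy_x = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
easy_y = ["a", "a", "a", "a", "b", "b", "b", "b"]
hard_x = [[float(i)] for i in range(8)]
hard_y = ["a", "b", "a", "b", "a", "b", "a", "b"]

easy_score = probe_difficulty(easy_x, easy_y)
hard_score = probe_difficulty(hard_x, hard_y)
print(easy_score, hard_score)
```

A score like this could, in principle, be computed before any architecture search to decide how much search budget a dataset deserves. For real images the probe would have to run on learned or hand-crafted features rather than raw toy vectors.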
