Jittor: a novel deep learning framework with meta-operators and unified graph execution

Springer Science and Business Media LLC - Volume 63, Issue 12 - 2020
Shi‐Min Hu1, Dun Liang1, Guo-Ye Yang1, Guowei Yang1, Wenyang Zhou1
1Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China

Abstract

Keywords


References

Collobert R, Bengio S, Mariethoz J. Torch: a modular machine learning software library. IDIAP Research Report, 2002

Al-Rfou R, Alain G, Almahairi A, et al. Theano: a python framework for fast computation of mathematical expressions. 2016. ArXiv:1605.02688

Jia Y, Shelhamer E, Donahue J, et al. Caffe: convolutional architecture for fast feature embedding. In: Proceedings of ACM International Conference on Multimedia, 2014. 675–678

Abadi M, Barham P, Chen J, et al. TensorFlow: a system for large-scale machine learning. In: Proceedings of the 12th Symposium on Operating Systems Design and Implementation, 2016. 265–283

Paszke A, Gross S, Massa F, et al. PyTorch: an imperative style, high-performance deep learning library. In: Proceedings of Advances in Neural Information Processing Systems, 2019. 8024–8035

Cyphers D S, Bansal A K, Bhiwandiwalla A, et al. Intel nGraph: an intermediate representation, compiler, and executor for deep learning. 2018. ArXiv:1801.08058

Schoenholz S S, Cubuk E D. JAX, M.D.: end-to-end differentiable, hardware accelerated, molecular dynamics in pure Python. 2019. ArXiv:1912.04232

Chen T, Moreau T, Jiang Z, et al. TVM: an automated end-to-end optimizing compiler for deep learning. In: Proceedings of the 13th Symposium on Operating Systems Design and Implementation, 2018. 578–594

Nickolls J, Buck I, Garland M, et al. Scalable parallel programming with CUDA. In: Proceedings of IEEE Hot Chips 20 Symposium (HCS), 2008

Thompson J A, Schlachter K. An introduction to the OpenCL programming model. 2012. https://cims.nyu.edu/~schlacht/OpenCLModel.pdf

Oliphant T E. Guide to NumPy. North Charleston: CreateSpace Publishing, 2015

Chen T Q, Li M, Li Y T, et al. MXNet: a flexible and efficient machine learning library for heterogeneous distributed systems. 2015. ArXiv:1512.01274

Chetlur S, Woolley C, Vandermersch P, et al. cuDNN: efficient primitives for deep learning. 2014. ArXiv:1410.0759

Lattner C, Adve V S. LLVM: a compilation framework for lifelong program analysis & transformation. In: Proceedings of International Symposium on Code Generation and Optimization, 2004. 97–104

Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors. Nature, 1986, 323: 533–536

Gabriel E, Fagg G E, Bosilca G, et al. Open MPI: goals, concept, and design of a next generation MPI implementation. In: Proceedings of European Parallel Virtual Machine/Message Passing Interface Users’ Group Meeting, 2004. 97–104

Tokui S, Oono K. Chainer: a next-generation open source framework for deep learning. In: Proceedings of Workshop on Machine Learning Systems (LearningSys), 2015

Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets. In: Proceedings of Advances in Neural Information Processing Systems, 2014

Sutton R S, Barto A G. Reinforcement Learning: An Introduction. Cambridge: MIT Press, 2018

He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016. 770–778

Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of Advances in Neural Information Processing Systems, 2012. 1106–1114

Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2015. ArXiv:1409.1556

Zagoruyko S, Komodakis N. Wide residual networks. 2016. ArXiv:1605.07146

Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and < 0.5 MB model size. 2017. ArXiv:1602.07360

Xie S N, Girshick R, Dollar P, et al. Aggregated residual transformations for deep neural networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. 5987–5995

Gao S H, Cheng M M, Zhao K, et al. Res2Net: a new multi-scale backbone architecture. IEEE Trans Pattern Anal Mach Intell, 2019. doi: 10.1109/TPAMI.2019.2938758

Gulrajani I, Ahmed F, Arjovsky M, et al. Improved training of Wasserstein GANs. In: Proceedings of Advances in Neural Information Processing Systems, 2017. 5767–5777

Radford A, Metz L, Chintala S. Unsupervised representation learning with deep convolutional generative adversarial networks. In: Proceedings of the 4th International Conference on Learning Representations, 2016

Mao X D, Li Q, Xie H R, et al. Least squares generative adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 2813–2821

Zhu J Y, Park T, Isola P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. In: Proceedings of IEEE International Conference on Computer Vision, 2017. 2223–2232

LeCun Y, Cortes C, Burges C J C. The MNIST database of handwritten digits. 2005. http://yann.lecun.com/exdb/mnist/

Cordts M, Omran M, Ramos S, et al. The cityscapes dataset for semantic urban scene understanding. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2016

Li T M. Differentiable visual computing. 2019. ArXiv:1904.12228

Kato H, Ushiku Y, Harada T. Neural 3D mesh renderer. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2018. 3907–3916

Hu Y M, Anderson L, Li T M, et al. DiffTaichi: differentiable programming for physical simulation. In: Proceedings of the 8th International Conference on Learning Representations, 2020