Large-Scale Distributed Training of Transformers for Chemical Fingerprinting

Journal of Chemical Information and Modeling - Volume 62, Issue 20 - Pages 4852-4862 - 2022
Hisham Abdel-Aty1, Ian R. Gould1
1Department of Chemistry and Institute of Chemical Biology, Imperial College London, Molecular Sciences Research Hub, Shepherd’s Bush, London W12 0BZ, UK

Abstract

Keywords


References

10.1039/C8SC04175J

10.1007/s10822-016-9938-8

10.1021/acs.jcim.6b00601

10.1021/acscentsci.9b00576

Jin W., 2017, Adv. Neur. Inf. Proc. Sys., 30

10.1021/ci300415d

10.1186/s13321-019-0341-z

10.1021/ja902302h

Lowe, D. Chemical Reactions from US Patents (1976-Sep2016), 2017.

10.1039/C9SC04944D

10.1039/C7SC02664A

Chithrananda, S.; Grand, G.; Ramsundar, B. ChemBERTa: Large-Scale Self-Supervised Pretraining for Molecular Property Prediction. arXiv:2010.09885 [physics, q-bio] 2020.

Fabian, B.; Edlich, T.; Gaspar, H.; Segler, M.; Meyers, J.; Fiscato, M.; Ahmed, M. Molecular Representation Learning with Language Models and Domain-Relevant Auxiliary Tasks. arXiv:2011.13230 [cs] 2020.

10.1021/acs.jpclett.1c03058

Duvenaud D. K., 2015, Adv. Neur. Inf. Proc. Sys., 28

Gilmer J., 2017, International Conference on Machine Learning, 1263

10.1021/ci100050t

Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A. N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the 31st International Conference on Neural Information Processing Systems; NIPS’17; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 6000–6010.

Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers); Association for Computational Linguistics: Minneapolis, Minnesota, 2019; pp. 4171–4186.

10.1021/ci00057a005

10.1039/C8SC02339E

Liu, Y.; Ott, M.; Goyal, N.; Du, J.; Joshi, M.; Chen, D.; Levy, O.; Lewis, M.; Zettlemoyer, L.; Stoyanov, V. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv:1907.11692 [cs] 2019.

10.1186/1758-2946-5-26

10.1021/ci400466r

10.1021/acs.jcim.5b00559

10.1093/nar/gkaa971

10.1093/nar/gkr777

Landrum, G.; Tosco, P.; Kelley, B.; sriniker; gedeck; Schneider, N.; Vianello, R.; Ric; Dalke, A.; Cole, B.; Savelyev, A.; Swain, M.; Turk, S.; Dan, N.; Vaucher, A.; Kawashima, E.; Wójcikowski, M.; Probst, D.; godin, g.; Cosgrove, D.; Pahl, A.; JP; Berenger, F.; strets123; Varjo, J. L.; O’Boyle, N.; Fuller, P.; Jensen, J. H.; Sforna, G.; Gavid, D. Rdkit/Rdkit: 2020_03_1 (Q1 2020) Release; Zenodo, 2020.

Salle, A. terashuf. 2017. https://github.com/alexandres/terashuf.

Sennrich, R.; Haddow, B.; Birch, A. Neural Machine Translation of Rare Words with Subword Units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers); Association for Computational Linguistics: Berlin, Germany, 2016; pp. 1715–1725.

10.18653/v1/D18-2012

Manning C. D., 2012, Introduction to Information Retrieval

Paszke A., 2019, Adv. Neur. Inf. Proc. Sys., 32, 8024

Distributed Training and Fast Inter-GPU Communication with NCCL. GTC Silicon Valley, 2019. https://on-demand-gtc.gputechconf.com/gtcnew/sessionview.php?sessionName=s9656-distributed+training+and+fast+inter-gpu+communication+with+nccl (accessed 2021-05-18).

Wolf T., 2020, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 38, 10.18653/v1/2020.emnlp-demos.6

Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks. arXiv:1908.10084 [cs] 2019.

10.1021/ci300124c

10.1016/j.chembiol.2016.07.023

Ramsundar, B.; Eastman, P.; Feinberg, E.; Gomes, J.; Leswing, K.; Pappu, A.; Wu, M.; Pande, V. DeepChem: Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology, 2016. https://github.com/deepchem/deepchem

Pedregosa F., 2011, J. Mach. Learn. Res., 12, 2825