Current progress, challenges, and future perspectives of language models for protein representation and protein design

The Innovation - Volume 4 - Page 100446 - 2023
Tao Huang1, Yixue Li1,2,3,4,5
1Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
2Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
3Guangzhou Laboratory, Guangzhou, 510005, China
4School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, 200240, China
5Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai 200433, China

References

Vu, 2023, Linguistically inspired roadmap for building biologically reliable protein language models, Nat. Mach. Intell., 10, 1038
Tunyasuvunakool, 2021, Highly accurate protein structure prediction for the human proteome, Nature, 596, 590, 10.1038/s41586-021-03828-1
Dill, 2012, The protein-folding problem, 50 years on, Science, 338, 1042, 10.1126/science.1219021
Huang, 2023, Analysis and prediction of protein stability based on interaction network, gene ontology, and KEGG pathway enrichment scores, Biochim. Biophys. Acta, Proteins Proteomics, 1871, 10.1016/j.bbapap.2023.140889
Unsal, 2022, Learning functional properties of proteins with language models, Nat. Mach. Intell., 4, 227, 10.1038/s42256-022-00457-9
Huang, 2022, A backbone-centred energy function of neural networks for protein design, Nature, 602, 523, 10.1038/s41586-021-04383-5
Lutz, 2023, Top-down design of protein architectures with reinforcement learning, Science (New York, N.Y.), 380, 266, 10.1126/science.adf6591
Howarth, 2015, Say it with proteins: an alphabet of crystal structures, Nat. Struct. Mol. Biol., 22, 349, 10.1038/nsmb.3011
Madani, 2023, Large language models generate functional protein sequences across diverse families, Nat. Biotechnol., 10, 1038
Russ, 2020, An evolution-based model for designing chorismate mutase enzymes, Science (New York, N.Y.), 369, 440, 10.1126/science.aba3304