Direct Preference Optimization: Your Language Model is Secretly a Reward Model. Volume 36 - Pages 53728-53741 - 2023
Rafailov, Rafael; Sharma, Archit; Mitchell, Eric; Manning, Christopher D; Ermon, Stefano; Finn, Chelsea
Deep Bidirectional Language-Knowledge Graph Pretraining. Volume 35 - Pages 37309-37323 - 2022
Yasunaga, Michihiro; Bosselut, Antoine; Ren, Hongyu; Zhang, Xikun; Manning, Christopher D; Liang, Percy S.; Leskovec, Jure
Pile of Law: Learning Responsible Data Filtering from the Law and a 256GB Open-Source Legal Dataset. Volume 35 - Pages 29217-29234 - 2022
Henderson, Peter; Krass, Mark; Zheng, Lucia; Guha, Neel; Manning, Christopher D; Jurafsky, Dan; Ho, Daniel