https://github.com/guyulongcs/Deep-Learning-for-Search-Recommendation-Advertisements/blob/master/07_LLM/01_LLM_Classical/2017%20%28OpenAI%29%20%28NIPS%29%20%5BRLHF%5D%20Deep%20Reinforcement%20Learning%20from%20Human%20Preferences.pdf