
https://github.com/carperai/instructgpt

For experiments involving InstructGPT. Currently used to document open research questions.


# BigModelName

This repository collects open questions about RLHF and InstructGPT as they pertain to BigModelName.

## Open Questions

* What is the preference rate of PPO vs. PPO-ptx? Why was 27.8 chosen as the mixing factor between the pre-training gradients and the PPO gradients?
* What do the gradient norms and gradient noise scales look like for PPO gradients vs. pre-training gradients?
* How important is SFT pretraining on human-written completions?
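
To make the first two questions concrete, here is a minimal Python sketch of the quantities involved. It is an illustration under stated assumptions, not the method used in this repository or in InstructGPT. The helper names (`l2_norm`, `mix_gradients`, `simple_noise_scale`) are hypothetical, and the gradient vectors are made-up toy values. The only figure taken from the text is the mixing factor 27.8. The noise scale is assumed to be the "simple" form, the trace of the per-example gradient covariance divided by the squared norm of the mean gradient, which may not be the definition the questions have in mind.

```python
import math

# Mixing factor quoted in the open questions above.
PTX_COEF = 27.8


def l2_norm(vec):
    """Euclidean norm of a flat gradient vector."""
    return math.sqrt(sum(v * v for v in vec))


def mix_gradients(ppo_grad, pretrain_grad, coef=PTX_COEF):
    """Combine gradients as ppo_grad + coef * pretrain_grad (assumed form)."""
    return [p + coef * q for p, q in zip(ppo_grad, pretrain_grad)]


def simple_noise_scale(per_example_grads):
    """Estimate tr(Sigma) / |G|^2 from a list of per-example gradients.

    Sigma is the per-example gradient covariance and G the mean gradient.
    Using the sample mean for G makes this a rough, biased estimate.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    mean = [sum(g[i] for g in per_example_grads) / n for i in range(dim)]
    # Sum of unbiased per-coordinate variances equals the covariance trace.
    trace_cov = sum(
        sum((g[i] - mean[i]) ** 2 for g in per_example_grads) / (n - 1)
        for i in range(dim)
    )
    return trace_cov / sum(m * m for m in mean)


if __name__ == "__main__":
    # Toy values only; real gradients would come from the training run.
    ppo_grad = [0.3, -0.4]
    pretrain_grad = [0.006, 0.008]
    mixed = mix_gradients(ppo_grad, pretrain_grad)
    print("PPO grad norm:", l2_norm(ppo_grad))
    print("scaled pre-training grad norm:", PTX_COEF * l2_norm(pretrain_grad))
    print("mixed grad norm:", l2_norm(mixed))
    print("noise scale:", simple_noise_scale([[1.0, 0.0], [3.0, 0.0]]))
```

Comparing the PPO gradient norm with the scaled pre-training gradient norm shows which term dominates the combined update for a given mixing factor. That is one way to probe why a particular value such as 27.8 might have been chosen.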