https://github.com/carperai/instructgpt

For experiments involving instruct gpt. Currently used for documenting open research questions.

Host: GitHub
URL: https://github.com/carperai/instructgpt
Owner: CarperAI
License: mit
Created: 2022-10-10T22:50:30.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2022-11-08T17:48:38.000Z (over 3 years ago)
Last Synced: 2025-01-08T02:15:42.194Z (over 1 year ago)
Size: 5.86 KB
Stars: 71
Watchers: 9
Forks: 4
Open Issues: 25
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# BigModelName

This repository collects open questions about RLHF and InstructGPT as they pertain to BigModelName.

## Open Questions

* What is the preference rate of PPO vs. PPO-ptx? Why was 27.8 chosen as the mixing factor between the pre-training gradients and the PPO gradients?
* What do the gradient norms and gradient noise scales look like for PPO grads vs pre-training grads?
* How important is SFT pretraining on human-written completions?
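
To make the first two questions concrete, here is a minimal sketch of what "mixing" the two gradients with a coefficient of 27.8 could look like, and how the gradient norms might be compared. It assumes the pre-training term is simply added to the PPO term with a fixed weight, so the combined gradient is `ppo_grad + 27.8 * ptx_grad`. The function names and the toy gradient vectors are hypothetical, chosen only for illustration; real gradients would come from a training framework.

```python
import math

# Mixing factor quoted in the open question above.
PTX_COEF = 27.8


def l2_norm(grad):
    """Return the L2 norm of a flat gradient vector (a list of floats)."""
    return math.sqrt(sum(g * g for g in grad))


def mix_gradients(ppo_grad, ptx_grad, ptx_coef=PTX_COEF):
    """Combine PPO and pre-training gradients as ppo + ptx_coef * ptx.

    Assumes both gradients are flattened over the same parameters.
    """
    if len(ppo_grad) != len(ptx_grad):
        raise ValueError("gradients must have the same length")
    return [p + ptx_coef * q for p, q in zip(ppo_grad, ptx_grad)]


# Toy, made-up gradients purely for illustration.
ppo_grad = [3.0, 4.0]
ptx_grad = [0.0, 1.0]

mixed = mix_gradients(ppo_grad, ptx_grad)

# How much larger the weighted pre-training gradient is than the PPO gradient.
scaled_ratio = PTX_COEF * l2_norm(ptx_grad) / l2_norm(ppo_grad)

print("PPO grad norm:", l2_norm(ppo_grad))
print("pre-training grad norm:", l2_norm(ptx_grad))
print("weighted pre-training / PPO norm ratio:", scaled_ratio)
print("mixed gradient:", mixed)
```

Logging a ratio like `scaled_ratio` during training would be one way to see whether the weighted pre-training term dominates the update, which bears on why a particular coefficient was chosen.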