RL from Human Feedback AND ChatGPT: training
Common descendants
4 Documents
2023-01-03