ChatGPT: training; RL from Human Feedback and Slides
Common descendants