ChatGPT: training; Tweet AND RL from Human Feedback
Common descendants