RL from Human Feedback AND NLP@Stanford
Common descendants