GPT-2 ; Knowledge distillation AND ML/NLP blog
Common descendants