Pretrained models; Attention mechanism; GPT: alternatives AND Dataset quality
Common descendants