Guillaume Lample on Twitter: "Last year, we showed that you can outperform a 24-layer transformer in language modeling with just...