"Bidirectional Encoder Representations from Transformers": a pretraining technique for NLP. [Google AI blog post]()

> BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer.

Builds on [#The Transformer](/tag/attention_is_all_you_need). Code and pre-trained models were open-sourced on Nov 3rd, 2018.
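The "one additional output layer" claim can be illustrated with a minimal sketch: the pooled representation from the pre-trained encoder is fed through a single task-specific linear layer. This is plain NumPy with hypothetical dimensions (a random vector stands in for the real encoder output), not the actual open-sourced code:

```python
import numpy as np

rng = np.random.default_rng(0)

hidden_size = 768  # BERT-base hidden size
num_labels = 2     # hypothetical binary classification task

# Stand-in for the pooled [CLS] representation a pre-trained
# BERT encoder would produce for one input sequence.
pooled_output = rng.standard_normal(hidden_size)

# The single additional output layer learned during fine-tuning:
# a linear projection from hidden states to task labels.
W = rng.standard_normal((num_labels, hidden_size)) * 0.02
b = np.zeros(num_labels)

logits = W @ pooled_output + b

# Softmax over the task labels.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

print(probs.shape)
```

Everything upstream of `W` and `b` comes pre-trained; only this head (plus small updates to the encoder) is learned per task.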