"Bidirectional Encoder Representations from Transformers": pretraining technique for NLP.
[Google AI blog post](https://ai.googleblog.com/2018/11/open-sourcing-bert-state-of-art-pre.html)
> BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Builds on [#The Transformer](/tag/attention_is_all_you_need)
Code and pre-trained models open-sourced on Nov 3rd, 2018.
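A minimal sketch of the "one additional output layer" fine-tuning pattern described in the quote above, assuming the Hugging Face `transformers` library and the `bert-base-uncased` checkpoint (not part of the original TensorFlow release):

```python
# Pretrained BERT encoder + one task-specific output layer.
# Assumes: Hugging Face `transformers`, PyTorch, and a 2-class task (hypothetical).
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoder = BertModel.from_pretrained("bert-base-uncased")

# The "one additional output layer": a linear classifier over the [CLS] vector.
classifier = torch.nn.Linear(encoder.config.hidden_size, 2)

inputs = tokenizer("BERT conditions on left and right context.", return_tensors="pt")
cls_vector = encoder(**inputs).last_hidden_state[:, 0]  # representation of the [CLS] token
logits = classifier(cls_vector)  # fine-tune encoder + classifier jointly on labeled data
```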