About This Document
- sl:arxiv_author : Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang
- sl:arxiv_firstAuthor : Kelvin Guu
- sl:arxiv_num : 2002.08909
- sl:arxiv_published : 2020-02-10T18:40:59Z
- sl:arxiv_summary : Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training (REALM) by fine-tuning on the challenging task of Open-domain Question Answering (Open-QA). We compare against state-of-the-art models for both explicit and implicit knowledge storage on three popular Open-QA benchmarks, and find that we outperform all previous methods by a significant margin (4-16% absolute accuracy), while also providing qualitative benefits such as interpretability and modularity.@en
- sl:arxiv_title : REALM: Retrieval-Augmented Language Model Pre-Training@en
- sl:arxiv_updated : 2020-02-10T18:40:59Z
- sl:bookmarkOf : https://arxiv.org/abs/2002.08909
- sl:creationDate : 2020-12-12
- sl:creationTime : 2020-12-12T02:30:25Z
- sl:relatedDoc :
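The summary above sketches REALM's central mechanism: a masked-language-modeling prediction is marginalized over a latent retrieved document, and the loss is backpropagated through the retrieval distribution so the retriever is trained without retrieval supervision. Below is a minimal PyTorch sketch of that marginalization only, not the authors' implementation: all class and parameter names here (ToyRetrievalAugmentedMLM, doc_index, top_k, the toy sizes) are assumptions for illustration, whereas the real system uses BERT-style encoders with MIPS retrieval over millions of Wikipedia documents.

```python
# A minimal sketch (not the REALM authors' code) of the idea described in the
# abstract: marginalize a masked-token prediction over a latent retrieved
# document, p(y | x) = sum_z p(z | x) * p(y | z, x), and backpropagate the
# masked-language-modeling loss through the retrieval distribution p(z | x).
# Every name and size below is an illustrative assumption, not the paper's
# actual architecture.

import torch
import torch.nn.functional as F


class ToyRetrievalAugmentedMLM(torch.nn.Module):
    def __init__(self, vocab_size=100, dim=32, num_docs=1000, top_k=8):
        super().__init__()
        self.top_k = top_k
        # Stand-in for the paper's BERT-style query encoder.
        self.query_encoder = torch.nn.EmbeddingBag(vocab_size, dim)
        # Toy dense document index; REALM uses MIPS over millions of documents.
        self.doc_index = torch.nn.Parameter(0.02 * torch.randn(num_docs, dim))
        # Toy "document content" read by the predictor.
        self.doc_content = torch.nn.Embedding(num_docs, dim)
        # Stand-in for the knowledge-augmented encoder predicting masked tokens.
        self.reader = torch.nn.Linear(2 * dim, vocab_size)

    def forward(self, input_ids, target_ids):
        # Retriever p(z | x): inner-product relevance scores, softmaxed over
        # the top-k retrieved documents.
        query = self.query_encoder(input_ids)                    # [B, dim]
        scores = query @ self.doc_index.T                        # [B, num_docs]
        top_scores, top_idx = scores.topk(self.top_k, dim=-1)    # [B, k]
        log_p_z = F.log_softmax(top_scores, dim=-1)              # log p(z | x)

        # Predictor p(y | z, x): score the masked token from the query jointly
        # with each retrieved document.
        docs = self.doc_content(top_idx)                         # [B, k, dim]
        joint = torch.cat([query.unsqueeze(1).expand_as(docs), docs], dim=-1)
        log_p_y = F.log_softmax(self.reader(joint), dim=-1)      # [B, k, vocab]
        idx = target_ids.view(-1, 1, 1).expand(-1, self.top_k, 1)
        log_p_y_given_z = log_p_y.gather(-1, idx).squeeze(-1)    # [B, k]

        # Marginalize over the latent document; gradients reach both the reader
        # and the retriever ("backpropagating through a retrieval step").
        log_marginal = torch.logsumexp(log_p_z + log_p_y_given_z, dim=-1)
        return -log_marginal.mean()


if __name__ == "__main__":
    model = ToyRetrievalAugmentedMLM()
    input_ids = torch.randint(0, 100, (4, 12))   # 4 masked inputs, 12 tokens each
    target_ids = torch.randint(0, 100, (4,))     # gold token at each masked slot
    loss = model(input_ids, target_ids)
    loss.backward()                              # trains retriever and reader jointly
```

The point the sketch tries to make visible is that the retrieval log-probabilities enter the marginal likelihood directly, so a document that helps the masked-token prediction receives gradient that raises its relevance score; this is the unsupervised learning signal for the retriever that the abstract describes.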