About This Document
- sl:arxiv_author : Canwen Xu, Daya Guo, Nan Duan, Julian McAuley
- sl:arxiv_firstAuthor : Canwen Xu
- sl:arxiv_num : 2203.06169
- sl:arxiv_published : 2022-03-11T18:53:12Z
- sl:arxiv_summary : In this paper, we propose LaPraDoR, a pretrained dual-tower dense retriever
that does not require any supervised data for training. Specifically, we first
present Iterative Contrastive Learning (ICoL) that iteratively trains the query
and document encoders with a cache mechanism. ICoL not only enlarges the number
of negative instances but also keeps representations of cached examples in the
same hidden space. We then propose Lexicon-Enhanced Dense Retrieval (LEDR) as a
simple yet effective way to enhance dense retrieval with lexical matching. We
evaluate LaPraDoR on the recently proposed BEIR benchmark, which comprises 18
datasets spanning 9 zero-shot text retrieval tasks. Experimental results show that
LaPraDoR achieves state-of-the-art performance compared with supervised dense
retrieval models, and further analysis reveals the effectiveness of our
training strategy and objectives. Compared to re-ranking, our lexicon-enhanced
approach can be run in milliseconds (22.5x faster) while achieving superior
performance.@en
- sl:arxiv_title : LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval@en
- sl:arxiv_updated : 2022-03-11T18:53:12Z
- sl:bookmarkOf : https://arxiv.org/abs/2203.06169
- sl:creationDate : 2022-03-29
- sl:creationTime : 2022-03-29T08:03:18Z
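
The summary above names two components but leaves their mechanics implicit. First, Iterative Contrastive Learning (ICoL): below is a minimal PyTorch sketch of contrastive training with a FIFO cache of extra negatives. The names (`NegativeCache`, `icol_loss`), the cache capacity, and the temperature are hypothetical illustrations, not the paper's implementation; the cache-of-negatives idea itself follows the summary's description.

```python
import torch
import torch.nn.functional as F

class NegativeCache:
    """Hypothetical FIFO cache of document embeddings used as extra negatives.

    If cached vectors are produced while the document encoder is frozen, they
    stay in the same hidden space as fresh ones -- the property the summary
    attributes to ICoL.
    """
    def __init__(self, capacity: int, dim: int):
        self.capacity = capacity
        self.buffer = torch.empty(0, dim)

    def put(self, embeddings: torch.Tensor) -> None:
        # Newest entries first; evict the oldest entries beyond capacity.
        self.buffer = torch.cat([embeddings.detach(), self.buffer])[: self.capacity]

def icol_loss(q_emb: torch.Tensor, d_emb: torch.Tensor,
              cache: NegativeCache, temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE over in-batch negatives plus cached negatives."""
    q = F.normalize(q_emb, dim=-1)
    d = F.normalize(d_emb, dim=-1)
    negatives = torch.cat([d, cache.buffer])   # (B + cache_size, dim)
    logits = q @ negatives.T / temperature     # (B, B + cache_size)
    labels = torch.arange(q.size(0))           # document i is the positive for query i
    loss = F.cross_entropy(logits, labels)
    cache.put(d)                               # enqueue this batch as future negatives
    return loss
```

One reading of "iteratively trains" in the summary is an alternating schedule: optimize the query encoder while the document encoder is frozen (its outputs feeding the cache), then swap roles. That is an assumption of this sketch, not a confirmed detail.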
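Second, Lexicon-Enhanced Dense Retrieval (LEDR). The summary only says that dense retrieval is enhanced with lexical matching; one simple fusion, assumed here rather than taken from the paper, is to multiply a BM25 score by the dense cosine similarity. The sketch uses the third-party rank_bm25 package; `ledr_scores` and the whitespace tokenizer are illustrative.

```python
import numpy as np
from rank_bm25 import BM25Okapi  # pip install rank-bm25

def ledr_scores(query: str, docs: list[str],
                q_emb: np.ndarray, d_embs: np.ndarray) -> np.ndarray:
    """Score docs by combining BM25 (lexical) with dense cosine similarity.

    q_emb: (dim,) query embedding; d_embs: (N, dim) document embeddings.
    Multiplying the two scores is an assumed fusion rule for this sketch.
    """
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    lexical = np.asarray(BM25Okapi(tokenized).get_scores(query.lower().split()))
    dense = (d_embs @ q_emb) / (
        np.linalg.norm(d_embs, axis=1) * np.linalg.norm(q_emb) + 1e-9
    )                                                  # cosine similarity, shape (N,)
    return lexical * dense                             # elementwise score fusion
```

Because this only combines two score vectors that can be computed independently, with no per query-document cross-encoder forward pass, it is consistent with the summary's claim that the lexicon-enhanced approach runs in milliseconds (22.5x faster) compared to re-ranking.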