About This Document
- sl:arxiv_firstAuthor : Canjia Li
- sl:arxiv_num : 2008.09093
- sl:arxiv_published : 2020-08-20T17:32:30Z
- sl:arxiv_summary : Pretrained transformer models, such as BERT and T5, have been shown to be highly effective at ad-hoc passage and document ranking. Due to the inherent sequence length limits of these models, they must be run over a document's passages rather than processing the entire document sequence at once. Although several
approaches for aggregating passage-level signals have been proposed, there has
yet to be an extensive comparison of these techniques. In this work, we explore
strategies for aggregating relevance signals from a document's passages into a
final ranking score. We find that passage representation aggregation techniques
can significantly improve over techniques proposed in prior work, such as
taking the maximum passage score. We call this new approach PARADE. In
particular, PARADE can significantly improve results on collections with broad
information needs where relevance signals can be spread throughout the document
(such as TREC Robust04 and GOV2). Meanwhile, less complex aggregation
techniques may work better on collections with an information need that can
often be pinpointed to a single passage (such as TREC DL and TREC Genomics). We
also conduct efficiency analyses and highlight several strategies for improving transformer-based aggregation.@en
- sl:arxiv_title : PARADE: Passage Representation Aggregation for Document Reranking@en
- sl:arxiv_updated : 2021-06-10T17:46:31Z
- sl:bookmarkOf : https://arxiv.org/abs/2008.09093
- sl:creationDate : 2022-09-21
- sl:creationTime : 2022-09-21T23:10:09Z
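
To make the abstract's contrast concrete, below is a minimal, hypothetical PyTorch sketch of the two aggregation styles it compares: the prior-work baseline that takes the maximum passage score, and a PARADE-style aggregator that runs a small transformer encoder over per-passage [CLS] representations before scoring. All module names, dimensions, and the mean-pooling choice are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


def max_passage_score(passage_scores: torch.Tensor) -> torch.Tensor:
    """Prior-work baseline: the document score is the maximum
    passage-level relevance score (input shape: batch x num_passages)."""
    return passage_scores.max(dim=1).values


class ParadeStyleAggregator(nn.Module):
    """Hypothetical sketch of passage *representation* aggregation:
    passage [CLS] vectors are contextualized by a small transformer
    encoder, pooled, and projected to a single document score."""

    def __init__(self, hidden_dim: int = 768, num_layers: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, passage_reps: torch.Tensor) -> torch.Tensor:
        # passage_reps: (batch, num_passages, hidden_dim), e.g. the
        # [CLS] output of a BERT ranker applied to each passage.
        contextualized = self.encoder(passage_reps)
        doc_rep = contextualized.mean(dim=1)  # pool over passages
        return self.score(doc_rep).squeeze(-1)


if __name__ == "__main__":
    reps = torch.randn(4, 8, 768)   # 4 documents, 8 passages each
    scores = torch.randn(4, 8)      # per-passage relevance scores
    print(max_passage_score(scores).shape)      # torch.Size([4])
    print(ParadeStyleAggregator()(reps).shape)  # torch.Size([4])
```

Per the abstract, the representation-based aggregator tends to help on collections such as Robust04 and GOV2, where relevance evidence is spread across a document, while max-style aggregation can suffice when the information need is pinpointed by a single passage.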