About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Shauli Ravfogel
- sl:arxiv_num : 2305.12517
- sl:arxiv_published : 2023-05-21T17:14:31Z
- sl:arxiv_summary : In this work, we aim to connect two research areas: instruction models and
retrieval-based models. While instruction-tuned Large Language Models (LLMs)
excel at extracting information from text, they are not suitable for semantic
retrieval. Similarity search over embedding vectors allows indexing and querying
vectors, but the similarity reflected in the embedding is sub-optimal for many
use cases. We identify the task of retrieving sentences based on abstract
descriptions of their content. We demonstrate the inadequacy of current text
embeddings and propose an alternative model that significantly improves when
used in standard nearest neighbor search. The model is trained using positive
and negative pairs sourced by prompting a large language model (LLM).
While it is easy to source the training material from an LLM, the retrieval
task cannot be performed by the LLM directly. This demonstrates that data from
LLMs can be used not only for distilling more efficient specialized models than
the original LLM, but also for creating new capabilities not immediately
possible using the original model.@en
- sl:arxiv_title : Retrieving Texts based on Abstract Descriptions@en
- sl:arxiv_updated : 2023-05-21T17:14:31Z
- sl:bookmarkOf : https://arxiv.org/abs/2305.12517
- sl:creationDate : 2023-06-15
- sl:creationTime : 2023-06-15T19:09:12Z
- sl:relatedDoc : http://www.semanlink.net/doc/2023/05/ل_ل_yoav_👾_sur_twit