About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Siddhant Garg
- sl:arxiv_num : 2004.05119
- sl:arxiv_published : 2020-04-10T16:57:06Z
- sl:arxiv_summary : Fine-tuning (FT) pre-trained sentence embedding models on small datasets has
been shown to have limitations. In this paper we show that concatenating the
embeddings from the pre-trained model with those from a simple sentence
embedding model trained only on the target data can improve over the
performance of FT for few-sample tasks. To this end, a linear classifier is
trained on the combined embeddings, either by freezing the embedding model
weights or by training the classifier and embedding models end-to-end. We
evaluate on seven small datasets from NLP tasks and show that our approach
with end-to-end training outperforms FT with negligible computational overhead.
Further, we show that sophisticated combination techniques like CCA and
KCCA do not work as well in practice as concatenation. We provide theoretical
analysis to explain this empirical observation.@en
- sl:arxiv_title : Beyond Fine-tuning: Few-Sample Sentence Embedding Transfer@en
- sl:arxiv_updated : 2020-10-05T16:57:39Z
- sl:bookmarkOf : https://arxiv.org/abs/2004.05119
- sl:creationDate : 2022-03-31
- sl:creationTime : 2022-03-31T21:04:02Z
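The abstract above describes concatenating frozen pre-trained sentence embeddings with embeddings from a simple model trained only on the target data, then fitting a linear classifier on the combined vectors. Below is a minimal sketch of that frozen-embedding variant. The specifics are assumptions for illustration, not the paper's exact setup: the pre-trained embeddings are stand-in random vectors (in practice they would come from e.g. BERT or USE), and the target-only embedding is approximated here with TF-IDF followed by truncated SVD.

```python
# Hedged sketch: concatenation of a frozen "pre-trained" embedding with a
# target-only embedding, plus a linear classifier on top. Toy data and the
# TF-IDF+SVD target model are illustrative assumptions, not the paper's method.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great movie", "terrible plot", "loved the acting", "boring and slow"]
labels = np.array([1, 0, 1, 0])

# Stand-in for frozen pre-trained sentence embeddings (e.g. 768-d BERT vectors).
rng = np.random.default_rng(0)
pretrained_emb = rng.normal(size=(len(texts), 768))

# Simple sentence embedding model trained only on the small target dataset.
tfidf = TfidfVectorizer().fit_transform(texts)
target_emb = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Concatenate the two embedding spaces and train a linear classifier on top.
combined = np.concatenate([pretrained_emb, target_emb], axis=1)
clf = LogisticRegression(max_iter=1000).fit(combined, labels)
print(clf.predict(combined))
```

The end-to-end variant mentioned in the abstract would instead backpropagate through both embedding models while training the classifier; the concatenation step itself is unchanged.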