About This Document
- sl:arxiv_author : Anna Kruspe
- sl:arxiv_firstAuthor : Anna Kruspe
- sl:arxiv_num : 2008.11228
- sl:arxiv_published : 2020-08-25T18:31:08Z
- sl:arxiv_summary : Pre-trained sentence embeddings have been shown to be very useful for a
variety of NLP tasks. Because training such embeddings requires a large amount
of data, they are commonly trained on a broad mix of text data. Adapting them
to specific domains could improve results in many cases, but such finetuning is
usually problem-dependent and poses the risk of over-adapting to the data used
for adaptation. In this paper, we present a simple universal method for
finetuning Google's Universal Sentence Encoder (USE) using a Siamese
architecture. We demonstrate how to apply this approach to a variety of data
sets and present results on several data sets representing similar problems.
The approach is also compared to traditional finetuning on these data sets. As
a further advantage, the approach can be used to combine data sets with
different annotations. We also present an embedding finetuned on all data sets
in parallel. (A minimal code sketch of the Siamese setup follows this record.)
- sl:arxiv_title : A simple method for domain adaptation of sentence embeddings
- sl:arxiv_updated : 2020-08-25T18:31:08Z
- sl:bookmarkOf : https://arxiv.org/abs/2008.11228
- sl:creationDate : 2022-04-01
- sl:creationTime : 2022-04-01T14:07:28Z
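The core idea described in the abstract is to finetune USE inside a Siamese network, where one shared encoder embeds both sentences of a pair. Below is a minimal sketch of such a setup in TensorFlow/Keras. The pairing scheme, loss, learning rate, and toy data are illustrative assumptions, not the authors' exact configuration; only the use of USE within a Siamese architecture comes from the paper.

```python
# Sketch: Siamese finetuning of the Universal Sentence Encoder (USE).
# Assumptions (not from the paper): cosine-similarity targets, MSE loss,
# learning rate, and the toy sentence pairs below.
import tensorflow as tf
import tensorflow_hub as hub

# Load USE as a trainable Keras layer so its weights can be finetuned.
use_layer = hub.KerasLayer(
    "https://tfhub.dev/google/universal-sentence-encoder/4",
    trainable=True,
)

# Two text inputs pass through the *same* USE layer (the Siamese part),
# so both sides of a pair are embedded by identical, shared weights.
left = tf.keras.Input(shape=(), dtype=tf.string)
right = tf.keras.Input(shape=(), dtype=tf.string)
emb_left = use_layer(left)
emb_right = use_layer(right)

# Cosine similarity between the two embeddings is the model output.
similarity = tf.keras.layers.Dot(axes=1, normalize=True)([emb_left, emb_right])

model = tf.keras.Model(inputs=[left, right], outputs=similarity)
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss="mse")

# Hypothetical toy pairs: same-label pairs get target 1.0, different-label
# pairs get 0.0 -- one plausible way to derive pairs from classification-style
# annotations, which also lets data sets with different label sets be combined.
pairs_a = tf.constant(["great movie", "loved this film"])
pairs_b = tf.constant(["loved this film", "terrible plot"])
targets = tf.constant([1.0, 0.0])
model.fit([pairs_a, pairs_b], targets, epochs=1)

# After training, use_layer alone yields the domain-adapted embeddings.
adapted = use_layer(tf.constant(["a new sentence"]))
```

Because only similarity targets are needed, pairs can be generated from any labeled data set, which matches the abstract's point that data sets with different annotations can be combined in one finetuning run.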