About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Ting Jiang
- sl:arxiv_num : 2201.04337
- sl:arxiv_published : 2022-01-12T06:54:21Z
- sl:arxiv_summary : The poor performance of the original BERT on sentence semantic
similarity has been widely discussed in previous works. We find that the
unsatisfactory performance is mainly due to static token embedding biases and
ineffective BERT layers, rather than the high cosine similarity of the sentence
embeddings. To this end, we propose a prompt-based sentence embedding method
that can reduce token embedding biases and make the original BERT layers more
effective. By reformulating the sentence embedding task as a
fill-in-the-blanks problem, our method significantly improves the performance of
the original BERT. We discuss two prompt representation methods and three prompt
searching methods for prompt-based sentence embeddings. Moreover, we propose a
novel unsupervised training objective based on template denoising,
which substantially narrows the performance gap between the supervised and
unsupervised settings. In experiments, we evaluate our method in both
non-fine-tuned and fine-tuned settings. Even the non-fine-tuned method can outperform
fine-tuned methods such as unsupervised ConSERT on STS tasks. Our fine-tuned
method outperforms the state-of-the-art method SimCSE in both unsupervised and
supervised settings. Compared to SimCSE, we achieve improvements of 2.29 and 2.58
points on BERT and RoBERTa respectively under the unsupervised setting.
- sl:arxiv_title : PromptBERT: Improving BERT Sentence Embeddings with Prompts
- sl:arxiv_updated : 2022-01-12T06:54:21Z
- sl:bookmarkOf : https://arxiv.org/abs/2201.04337
- sl:creationDate : 2022-09-16
- sl:creationTime : 2022-09-16T10:06:59Z
- sl:relatedDoc : http://www.semanlink.net/doc/2022/09/promptbert_improving_bert_sente
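
The abstract above centers on one concrete mechanism: fill the sentence into a
cloze-style template and take the hidden state at the [MASK] position as the
sentence embedding. Below is a minimal sketch of that idea using Hugging Face
transformers; the template string and the `bert-base-uncased` checkpoint are
assumptions modeled on the paper's examples, not the authors' released code.

```python
# Minimal sketch of prompt-based sentence embeddings (assumed template,
# not the PromptBERT reference implementation).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def prompt_embedding(sentence: str) -> torch.Tensor:
    # Fill the sentence into a fill-in-the-blanks style template.
    text = f'This sentence : "{sentence}" means {tokenizer.mask_token} .'
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, hidden_size)
    # Locate the [MASK] position and use its hidden vector as the embedding.
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero()[0]
    return hidden[0, mask_pos].squeeze(0)

# Compare two sentences by cosine similarity of their [MASK] embeddings.
a = prompt_embedding("A man is playing a guitar.")
b = prompt_embedding("Someone plays the guitar.")
print(torch.nn.functional.cosine_similarity(a, b, dim=0).item())
```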