About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Yu Fei
- sl:arxiv_num : 2210.16637
- sl:arxiv_published : 2022-10-29T16:01:51Z
- sl:arxiv_summary : Recent work has demonstrated that pre-trained language models (PLMs) are
zero-shot learners. However, most existing zero-shot methods involve heavy
human engineering or complicated self-training pipelines, hindering their
application to new situations. In this work, we show that zero-shot text
classification can be improved simply by clustering texts in the embedding
spaces of PLMs. Specifically, we fit the unlabeled texts with a Bayesian
Gaussian Mixture Model after initializing cluster positions and shapes using
class names. Despite its simplicity, this approach achieves superior or
comparable performance on both topic and sentiment classification datasets and
outperforms prior works significantly on unbalanced datasets. We further
explore the applicability of our clustering approach by evaluating it on 14
datasets with more diverse topics, text lengths, and numbers of classes. Our
approach achieves an average of 20% absolute improvement over prompt-based
zero-shot learning. Finally, we compare different PLM embedding spaces and find
that texts are well-clustered by topics even if the PLM is not explicitly
pre-trained to generate meaningful sentence embeddings. This work indicates
that PLM embeddings can categorize texts without task-specific fine-tuning,
thus providing a new way to analyze and utilize their knowledge and zero-shot
learning ability.
- sl:arxiv_title : Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations
- sl:arxiv_updated : 2022-11-23T09:47:51Z
- sl:bookmarkOf : https://arxiv.org/abs/2210.16637
- sl:creationDate : 2022-11-25
- sl:creationTime : 2022-11-25T11:44:39Z
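
The abstract above describes the core idea: embed unlabeled texts and class names with a PLM, initialize cluster positions from the class-name embeddings, fit a mixture model, and read off labels from the clusters. Below is a minimal sketch of that idea, not the authors' implementation: it assumes sentence-transformers for embeddings (the model name "all-MiniLM-L6-v2" and the toy texts are arbitrary choices), and it swaps the paper's Bayesian Gaussian Mixture Model for scikit-learn's plain GaussianMixture, since that is the scikit-learn variant that accepts explicit per-component mean initialization.

```python
# Hedged sketch of zero-shot classification by clustering PLM embeddings.
# Not the paper's code; encoder, data, and hyperparameters are illustrative.
from sentence_transformers import SentenceTransformer
from sklearn.mixture import GaussianMixture

# Hypothetical unlabeled texts and the task's class names.
texts = [
    "The team won the championship after a dramatic overtime.",
    "The striker scored twice in the final minutes of the match.",
    "Stock markets fell sharply amid inflation fears.",
    "The central bank raised interest rates again this quarter.",
    "The new phone ships with a faster chip and a better camera.",
    "Researchers released an open-source library for training models.",
]
class_names = ["sports", "business", "technology"]

# Embed both the texts and the class names in the same PLM embedding space.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
text_emb = encoder.encode(texts)        # shape: (n_texts, dim)
class_emb = encoder.encode(class_names) # shape: (n_classes, dim)

# Initialize cluster means with the class-name embeddings, then fit the
# mixture on the unlabeled text embeddings only.
gmm = GaussianMixture(
    n_components=len(class_names),
    covariance_type="diag",  # keeps the fit stable in high dimensions
    means_init=class_emb,
    reg_covar=1e-4,
    random_state=0,
)
gmm.fit(text_emb)

# Because component k was initialized at class k's name embedding, the
# predicted component index doubles as the predicted class label.
for text, k in zip(texts, gmm.predict(text_emb)):
    print(f"{class_names[k]:>10s} | {text}")
```

In this sketch the component-to-class mapping is fixed by the initialization; with a Bayesian GMM, as in the paper, one would instead need to align fitted components to classes (e.g. by matching component means to class-name embeddings) after fitting.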