About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Dejiao Zhang
- sl:arxiv_num : 2103.12953
- sl:arxiv_published : 2021-03-24T03:05:17Z
- sl:arxiv_summary : Unsupervised clustering aims at discovering the semantic categories of data
according to some distance measured in the representation space. However,
different categories often overlap with each other in the representation space
at the beginning of the learning process, which poses a significant challenge
for distance-based clustering in achieving good separation between different
categories. To this end, we propose Supporting Clustering with Contrastive
Learning (SCCL) -- a novel framework to leverage contrastive learning to
promote better separation. We assess the performance of SCCL on short text
clustering and show that SCCL significantly advances the state-of-the-art
results on most benchmark datasets with 3%-11% improvement on Accuracy and
4%-15% improvement on Normalized Mutual Information. Furthermore, our
quantitative analysis demonstrates the effectiveness of SCCL in leveraging the
strengths of both bottom-up instance discrimination and top-down clustering to
achieve better intra-cluster and inter-cluster distances when evaluated with
the ground truth cluster labels.@en
- sl:arxiv_title : Supporting Clustering with Contrastive Learning@en
- sl:arxiv_updated : 2021-03-24T03:05:17Z
- sl:bookmarkOf : https://arxiv.org/abs/2103.12953
- sl:creationDate : 2021-05-20
- sl:creationTime : 2021-05-20T16:55:29Z
- sl:relatedDoc : http://www.semanlink.net/doc/2021/05/a_self_training_approach_for_sh
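The summary above describes SCCL as jointly optimizing a bottom-up instance-discrimination (contrastive) objective with a top-down clustering objective. This record does not spell out the exact losses, augmentations, or encoder, so the following PyTorch sketch only illustrates one plausible combination under stated assumptions: an NT-Xent contrastive loss over two augmented views plus a DEC-style KL clustering loss with Student's-t soft assignments. The function names, temperature, kernel parameter alpha, and weighting term eta are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a joint "contrastive + clustering" objective (not the paper's code).
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, temperature=0.5):
    """NT-Xent instance-discrimination loss over two augmented views.
    z1, z2: (B, d) embeddings of the same texts under two augmentations."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2B, d), unit-norm
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    n = z.size(0)
    sim.fill_diagonal_(float("-inf"))                    # mask self-similarity
    targets = (torch.arange(n, device=z.device) + n // 2) % n  # positive = other view
    return F.cross_entropy(sim, targets)

def soft_assignments(z, centroids, alpha=1.0):
    """Student's t-kernel soft cluster assignments (DEC-style assumption)."""
    d2 = torch.cdist(z, centroids).pow(2)                # squared distances to centroids
    q = (1.0 + d2 / alpha).pow(-(alpha + 1) / 2)
    return q / q.sum(dim=1, keepdim=True)                # normalize over clusters

def clustering_loss(q):
    """KL(P || Q) against a sharpened auxiliary target distribution P."""
    p = (q ** 2) / q.sum(dim=0)                          # sharpen and reweight clusters
    p = (p.t() / p.sum(dim=1)).t()                       # renormalize per instance
    return F.kl_div(q.log(), p.detach(), reduction="batchmean")

def total_loss(z, z1, z2, centroids, eta=1.0):
    """Joint objective: contrastive term on augmented views + clustering term
    on the original embeddings; eta is an assumed balancing weight."""
    return contrastive_loss(z1, z2) + eta * clustering_loss(soft_assignments(z, centroids))
```

A usage pattern consistent with the summary would be to encode each mini-batch of short texts three times (original plus two augmentations), backpropagate `total_loss`, and read final cluster labels from the argmax of `soft_assignments`; these training details are assumptions for illustration only.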