Semanlink - Text Classification

> **One-sentence Summary**: We suggest adding an unsupervised intermediate classification step, after pretraining BERT and before fine-tuning, and show that it improves performance in data-constrained cases.

> for text classification cold start (when labeled
data is scarce), **add an intermediate unsupervised
classification task**, between the pretraining
and fine-tuning phases:
> perform clustering and
train the pre-trained model on predicting the
cluster labels.

> this additional
classification phase can significantly improve
performance, mainly for **topical classification**
tasks

> we use an efficient clustering technique,
that relies on simple Bag Of Words (BOW)
representations, to partition the unlabeled training
data into relatively homogeneous clusters of text
instances.
>
> Next, we treat these clusters as labeled
data for an intermediate text classification task, and
train the pre-trained model – with or without additional
MLM pretraining – with respect to this
multi-class problem, prior to the final fine-tuning
over the actual target-task labels
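
A minimal sketch of the pseudo-labeling step described above, assuming plain k-means over normalised bag-of-words vectors as a stand-in for the paper's clustering technique (the excerpt does not name it). The function names and toy texts are hypothetical, and the inter-training itself (training the pre-trained model on the cluster ids) is only indicated in a comment.

```python
import math
import random
from collections import Counter


def bow_vectors(texts):
    """Turn each text into an L2-normalised bag-of-words vector over a shared vocabulary."""
    tokenized = [t.lower().split() for t in texts]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in tokenized:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = float(count)
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0
        vectors.append([v / norm for v in vec])
    return vectors


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans_labels(vectors, k, iterations=20, seed=0):
    """Plain k-means (an assumption, not the paper's method); one cluster id per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every vector.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical unlabeled training texts.
unlabeled_texts = [
    "the striker scored a late goal in the match",
    "the match ended after a penalty goal",
    "the central bank raised interest rates",
    "inflation and interest rates worry the bank",
]
cluster_ids = kmeans_labels(bow_vectors(unlabeled_texts), k=2)

# These (text, cluster id) pairs form the intermediate classification task:
# train the pre-trained model on them, then fine-tune on the real target labels.
pseudo_labeled = list(zip(unlabeled_texts, cluster_ids))
```

The cluster ids carry no meaning of their own; they only need to group texts consistently enough for the intermediate task to be learnable.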

> The underlying
intuition is that inter-training the model
over a related text classification task would be more
beneficial compared to MLM inter-training, which
focuses on different textual entities, namely predicting
the identity of a single token.

2022-04-06 About

[1712.05972] Train Once, Test Anywhere: Zero-Shot Learning for Text Classification

Tags:

2021-10-16 About

Term Based Semantic Clusters for Very Short Text Classification (2019)

Tags:

2021-05-26 About

Adventures in Zero-Shot Text Classification

Tags:

2021-05-25 About

New pipeline for zero-shot text classification - 🤗Transformers - Hugging Face Forums

Tags:

2021-03-15 About

Zero-Shot Learning in Modern NLP | Joe Davison Blog (2020-05)

Tags:

2021-02-23 About

Sylvain Gugger on Twitter: "Training a transformer model for text classification..."

Tags:

2020-10-19 About

[1911.11506] Word-Class Embeddings for Multiclass Text Classification

Tags:

> In supervised tasks such as multiclass
text classification (the focus of this article) it seems appealing to enhance word representations
with ad-hoc embeddings that encode task-specific information. We propose (supervised) word-class
embeddings (WCEs), and show that, when concatenated to (unsupervised) pre-trained word embeddings,
they substantially facilitate the training of deep-learning models in multiclass classification by
topic.
>
> A differentiating aspect of our method is that it keeps the modelling of word-class interactions separate from the
original word embedding. Word-class correlations are confined in a dedicated vector space, whose vectors enhance
(by concatenation) the unsupervised representations. The net effect is an embedding matrix that is better suited to
classification, and imposes no restriction to the network architecture using it.

[GitHub](https://github.com/AlexMoreo/word-class-embeddings). Refers to [LEAM](doc:2020/02/joint_embedding_of_words_and_la):

> [in LEAM] Once words and labels are embedded in a common vector space, word-label
compatibility is measured via cosine similarity. Our method instead models these compatibilities directly, without
generating intermediate embeddings for words or labels.
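
A minimal sketch of the concatenation idea, assuming a deliberately simple word-class statistic (the share of a word's documents that belong to each class) in place of the paper's actual correlation measure, which these notes do not spell out. The function names, toy documents, and pretrained vectors are all hypothetical.

```python
from collections import defaultdict


def word_class_embeddings(docs, labels, classes):
    """One dimension per class: the share of a word's documents that carry that class.

    This statistic is an illustrative assumption, not the paper's formula.
    """
    class_index = {c: i for i, c in enumerate(classes)}
    doc_counts = defaultdict(lambda: [0] * len(classes))
    for tokens, label in zip(docs, labels):
        for tok in set(tokens):  # count each word once per document
            doc_counts[tok][class_index[label]] += 1
    return {
        tok: [c / sum(counts) for c in counts]
        for tok, counts in doc_counts.items()
    }


def concat_embeddings(pretrained, wce, n_classes):
    """Append the supervised dimensions to each unsupervised vector.

    Words never seen in the labeled data get zeros in the word-class part.
    """
    return {
        tok: list(vec) + wce.get(tok, [0.0] * n_classes)
        for tok, vec in pretrained.items()
    }


# Hypothetical labeled documents and toy pretrained vectors.
classes = ["sports", "finance"]
docs = [["the", "goal", "was", "late"], ["the", "bank", "raised", "rates"]]
labels = ["sports", "finance"]
pretrained = {"goal": [0.1, 0.3], "bank": [0.7, 0.2], "unseen": [0.0, 0.5]}

wce = word_class_embeddings(docs, labels, classes)
combined = concat_embeddings(pretrained, wce, len(classes))
```

Because the supervised dimensions are simply appended, the pretrained part of each vector is left untouched, which matches the separation the authors emphasise.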

2020-10-11 About

[2004.03705] Deep Learning Based Text Classification: A Comprehensive Review

Tags:

2020-10-11 About

Top 6 Open Source Pretrained Models for Text Classification you should use

Tags:

Text Classification

2020-10-11 About

[1909.01259] Neural Attentive Bag-of-Entities Model for Text Classification

Tags:

A model that performs **text classification using entities in a knowledge base**.

> Entities provide unambiguous and relevant semantic signals that are beneficial for capturing semantics in texts. We combine **simple high-recall entity detection based on a dictionary** (word->list of entities), to detect entities in a document, with a novel neural **attention mechanism that enables the model to focus on a small number of unambiguous and relevant entities**.

The model works in two steps:

1. Entity detection
2. Classification using the detected entities (+text) as inputs

Entity linking relies on a local model that uses the cosine
similarity between the embedding of the target
entity and the word-based representation of
the document to capture the relevance of an entity
to that document.

The KB embeddings are computed with [#Wikipedia2Vec](tag:wikipedia2vec), which places similar words and entities
close to one another in a unified vector space.

The attention mechanism uses two features:

- cosine similarity between the
embedding of the entity and the word based
representation of the document
- the probability that the entity
name refers to the entity in KB.
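
A minimal sketch of the two steps and the two attention features listed above. It assumes a softmax over a linear combination of the features, with hand-set weights standing in for learned parameters, and a mean of word vectors as the word-based document representation. The dictionary, vectors, and function names are hypothetical.

```python
import math


def detect_entities(tokens, dictionary):
    """High-recall lookup: every known name yields all its candidate entities,
    each with the prior probability that the name refers to it."""
    return [pair for tok in tokens for pair in dictionary.get(tok, [])]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def attend(tokens, candidates, word_vecs, entity_vecs,
           w_cos=1.0, w_prior=1.0, bias=0.0):
    """Return attention weights and the attention-weighted sum of entity vectors.

    w_cos, w_prior and bias would be learned in a real model; here they are
    fixed by hand (assumption). Assumes at least one token has a word vector.
    """
    if not candidates:
        return [], []
    known = [word_vecs[t] for t in tokens if t in word_vecs]
    doc_vec = [sum(col) / len(known) for col in zip(*known)]
    # Feature 1: entity/document cosine similarity. Feature 2: name-to-entity prior.
    scores = [
        w_cos * cosine(entity_vecs[e], doc_vec) + w_prior * prior + bias
        for e, prior in candidates
    ]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # numerically stable softmax
    weights = [x / sum(exps) for x in exps]
    entity_feature = [
        sum(w * entity_vecs[e][d] for w, (e, _) in zip(weights, candidates))
        for d in range(len(doc_vec))
    ]
    return weights, entity_feature


# Hypothetical toy data.
dictionary = {"apple": [("Apple_Inc.", 0.7), ("Apple_(fruit)", 0.3)]}
word_vecs = {"apple": [0.5, 0.5], "released": [1.0, 0.0], "phone": [0.9, 0.1]}
entity_vecs = {"Apple_Inc.": [1.0, 0.0], "Apple_(fruit)": [0.0, 1.0]}
tokens = ["apple", "released", "a", "new", "phone"]

candidates = detect_entities(tokens, dictionary)
weights, entity_feature = attend(tokens, candidates, word_vecs, entity_vecs)
```

In this toy document the company sense of "apple" gets the larger weight, since both its cosine similarity to the document and its prior are higher. A classifier would then consume `entity_feature`, alongside the text representation.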

Somewhat [related](doc:2020/01/investigating_entity_knowledge_)

### Conclusion:

> a neural
network model that performs text classification using
entities in Wikipedia. We combined simple
dictionary-based entity detection with a neural attention
mechanism to enable the model to focus
on a small number of unambiguous and relevant
entities in a document.

2020-09-02 About

Hugging Face on Twitter: "No labeled data? No problem. The 🤗 Transformers master branch now includes a built-in pipeline for zero-shot text classification..."

Tags:

2020-08-12 About

FastHugs | ntentional

Tags: