Classification: label correlation
http://www.semanlink.net/tag/classification_relations_between_classes
Documents tagged with Classification: label correlation

X-BERT: eXtreme Multi-label Text Classification using Bidirectional Encoder Representations from Transformers
http://www.semanlink.net/doc/2021/01/x_bert_extreme_multi_label_tex
> Challenges in extending BERT to the XMC problem:
- difficulty of capturing [dependencies or correlations among labels](tag:classification_relations_between_classes.html)
- difficulty of scaling to the extreme label setting, because the softmax bottleneck scales linearly with the size of the output space.
> X-BERT leverages both the label and input text to build label representations, which induces semantic label clusters to better model label dependencies. At the heart of X-BERT is a procedure to finetune BERT models to capture the contextual relations between input text and the induced label clusters. Finally, an ensemble of the different BERT models trained on heterogeneous label clusters leads to our best final model.
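The cluster-then-rank idea can be sketched in a few lines: score label clusters first, then rank only the labels inside the top clusters, so no softmax over the full extreme label space is needed. Everything below (the dot-product scorer, the centroid-based cluster scoring, the cluster assignment) is a toy stand-in, not the paper's BERT-based matcher and ranker.

```python
# Toy sketch of two-stage extreme-label inference (assumed, not X-BERT's code):
# stage 1 shortlists label clusters, stage 2 ranks labels within the shortlist.
def shortlist_and_rank(text_vec, clusters, label_vecs, top_c=1):
    """clusters: cluster_id -> list of label ids; label_vecs: id -> vector."""
    def score(v):  # stand-in scorer: plain dot product with the text vector
        return sum(a * b for a, b in zip(text_vec, v))

    def centroid(ids):  # represent a cluster by the mean of its label vectors
        d = len(text_vec)
        return [sum(label_vecs[i][k] for i in ids) / len(ids) for k in range(d)]

    # stage 1: score each cluster, keep the top_c best
    ranked_clusters = sorted(clusters, key=lambda c: -score(centroid(clusters[c])))
    shortlist = [i for c in ranked_clusters[:top_c] for i in clusters[c]]
    # stage 2: rank only the shortlisted labels, never the full label space
    return sorted(shortlist, key=lambda i: -score(label_vecs[i]))

label_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = {0: [0, 1], 1: [2, 3]}
print(shortlist_and_rank([1.0, 0.0], clusters, label_vecs))  # labels from cluster 0 only
```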
2021-01-10T19:23:20Z

[1306.6802] Evaluation Measures for Hierarchical Classification: a unified view and novel approaches
http://www.semanlink.net/doc/2020/09/1306_6802_evaluation_measures
How to properly evaluate hierarchical classification algorithms?
> Classification errors in the upper levels of the hierarchy (e.g. when wrongly
classifying a document of the class music into the class food) are more severe
than those in deeper levels (e.g. when classifying a document from progressive
rock as alternative rock).
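One family of measures the paper unifies captures exactly this: augment predicted and true classes with all their ancestors before computing precision and recall, so a mistake near the root shares few ancestors with the truth and is penalised more than a mistake between sibling leaves. A minimal sketch, with a toy taxonomy of my own (not from the paper):

```python
# Set-based hierarchical precision/recall: each class is expanded to the
# set containing itself and all its ancestors (root excluded).
parent = {
    "music": "root", "rock": "music",
    "progressive rock": "rock", "alternative rock": "rock",
    "food": "root",
}

def ancestors(label):
    """Return the label together with all its ancestors, excluding the root."""
    out = set()
    while label != "root":
        out.add(label)
        label = parent[label]
    return out

def h_precision_recall(predicted, true):
    p = set().union(*(ancestors(l) for l in predicted))
    t = set().union(*(ancestors(l) for l in true))
    overlap = len(p & t)
    return overlap / len(p), overlap / len(t)

# Deep mistake: the two leaves share 'rock' and 'music' -> hP = hR = 2/3.
print(h_precision_recall({"alternative rock"}, {"progressive rock"}))
# Shallow mistake: 'food' and 'music' share nothing below the root -> 0.0, 0.0.
print(h_precision_recall({"food"}, {"music"}))
```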
2020-09-01T23:46:48Z

[2003.11644] MAGNET: Multi-Label Text Classification using Attention-based Graph Neural Network
http://www.semanlink.net/doc/2020/08/2003_11644_multi_label_text_c
> **Existing methods tend to ignore the relationship among labels**.
This model employs [Graph Attention Networks](tag:graph_attention_networks) (GAT) to find the correlation between
labels. The generated classifiers are applied to sentence feature vectors obtained from the text feature extraction network (BiLSTM) to enable end-to-end training.
> The GAT network takes the node features and the adjacency
matrix that represents the graph data as inputs.
The adjacency matrix is constructed based on
the samples. **In our case, we do not have a graph
dataset. Instead, we learn the adjacency matrix**, hoping
that the model will determine the graph, thereby
learning the correlation of the labels.
> Our intuition is that by modeling the correlation
among labels as a weighted graph, we force the GAT
network to learn such that the adjacency matrix and
the attention weights together represent the correlation.
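A minimal pure-Python sketch of that idea, assuming everything below (the sigmoid squashing of raw adjacency parameters, the dot-product attention scoring, the single propagation step): the adjacency entries are free parameters rather than a fixed graph, and they modulate the attention that mixes label features.

```python
# Sketch (not the authors' implementation): one attention step over a
# *learned* adjacency. In training, a_raw would receive gradients like
# any other weight; here we only show the forward pass.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def gat_layer(features, a_raw, att_w):
    """features: n x d label vectors; a_raw: n x n learned adjacency logits;
    att_w: toy scalar scaling the dot-product attention scores."""
    n, d = len(features), len(features[0])
    # learned adjacency: unconstrained parameters squashed into (0, 1)
    adj = [[sigmoid(a_raw[i][j]) for j in range(n)] for i in range(n)]
    out = []
    for i in range(n):
        # attention logits between label i and every label j, gated by adj
        logits = [att_w * adj[i][j] *
                  sum(a * b for a, b in zip(features[i], features[j]))
                  for j in range(n)]
        alpha = softmax(logits)
        # new feature for label i: attention-weighted mix of all label features
        out.append([sum(alpha[j] * features[j][k] for j in range(n))
                    for k in range(d)])
    return out
```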
// TODO compare with [this](doc:2019/06/_1905_10070_label_aware_docume)
2020-08-14T16:11:43Z

[1905.10070] Label-aware Document Representation via Hybrid Attention for Extreme Multi-Label Text Classification
http://www.semanlink.net/doc/2019/06/_1905_10070_label_aware_docume
> This paper is motivated to better explore the semantic **relationship between each document and extreme labels by taking advantage of both document content and label correlation**. Our objective is to establish an explicit **label-aware representation for each document**.
> LAHA consists of three parts.
> 1. The first part
adopts a multi-label self-attention mechanism **to detect the contribution
of each word to labels**.
> 2. The second part exploits the label structure and
document content **to determine the semantic connection between words
and labels in a same latent space**.
> 3. An adaptive fusion strategy is designed
in the third part to obtain the final label-aware document representation.
[Github](https://github.com/HX-idiot/Hybrid_Attention_XML)
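The first part can be sketched as a label-wise attention over words: each label embedding scores every word, and the softmax-weighted sum of word vectors gives that label's view of the document. The embeddings below are toy vectors of my own; LAHA additionally mixes in the label-structure attention (part 2) and the adaptive fusion (part 3), which this sketch omits.

```python
# Toy sketch of label-word attention (part 1 of LAHA, heavily simplified).
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def label_aware_repr(word_vecs, label_vecs):
    """Return one document vector per label: len(label_vecs) x dim."""
    reps = []
    for lv in label_vecs:
        # how much each word contributes to this label (dot-product scores)
        scores = softmax([sum(w * l for w, l in zip(wv, lv)) for wv in word_vecs])
        # label-aware document vector: attention-weighted sum of word vectors
        reps.append([sum(a * wv[k] for a, wv in zip(scores, word_vecs))
                     for k in range(len(word_vecs[0]))])
    return reps
```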
// TODO compare with [this](doc:2020/08/2003_11644_multi_label_text_c)
2019-06-22T17:15:57Z