About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Karel D'Oosterlinck
- sl:arxiv_num : 2401.12178
- sl:arxiv_published : 2024-01-22T18:09:52Z
- sl:arxiv_summary : Multi-label classification problems with thousands of classes are hard to solve with in-context learning alone, as language models (LMs) might lack prior knowledge about the precise classes or how to assign them, and it is generally infeasible to demonstrate every class in a prompt. We propose a general program, $\texttt{Infer--Retrieve--Rank}$, that defines multi-step interactions between LMs and retrievers to efficiently tackle such problems. We implement this program using the $\texttt{DSPy}$ programming model, which specifies in-context systems in a declarative manner, and use $\texttt{DSPy}$ optimizers to tune it towards specific datasets by bootstrapping only tens of few-shot examples. Our primary extreme classification program, optimized separately for each task, attains state-of-the-art results across three benchmarks (HOUSE, TECH, TECHWOLF). We apply the same program to a benchmark with vastly different characteristics and attain competitive performance as well (BioDEX). Unlike prior work, our proposed solution requires no finetuning, is easily applicable to new tasks, alleviates prompt engineering, and requires only tens of labeled examples. Our code is public at https://github.com/KarelDO/xmc.dspy.
- sl:arxiv_title : In-Context Learning for Extreme Multi-Label Classification
- sl:arxiv_updated : 2024-01-22T18:09:52Z
- sl:bookmarkOf : https://arxiv.org/abs/2401.12178
- sl:creationDate : 2024-03-17
- sl:creationTime : 2024-03-17T07:58:15Z
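The abstract describes a three-step control flow: an LM first *infers* candidate label terms from the input, a retriever maps those guesses onto the real label space, and an LM then *ranks* the retrieved candidates. The sketch below illustrates only that control flow; it is not the authors' implementation. The real system uses DSPy-specified LM calls and a dense retriever, whereas here both steps are replaced by toy stand-ins (keyword extraction and word-overlap scoring), and the `LABELS` list is invented for illustration.

```python
# Toy illustration of the Infer-Retrieve-Rank control flow.
# Hypothetical label space (a real task would have thousands of labels).
LABELS = [
    "machine learning engineer",
    "data analyst",
    "software developer",
]

def infer(text):
    # Stand-in for the LM "infer" step: guess candidate label terms
    # from the input (a real system would prompt an LM here).
    return [w for w in text.lower().split() if len(w) > 3]

def retrieve(queries, labels, k=2):
    # Stand-in for the retriever: score each real label by word overlap
    # with the inferred terms and keep the top-k (a real system would
    # use a dense or sparse retriever over the full label space).
    scored = sorted(
        labels,
        key=lambda label: len(set(label.split()) & set(queries)),
        reverse=True,
    )
    return scored[:k]

def rank(text, candidates):
    # Stand-in for the LM "rank" step: order retrieved candidates by
    # overlap with the input (a real system would rerank with an LM).
    words = set(text.lower().split())
    return sorted(
        candidates,
        key=lambda c: len(set(c.split()) & words),
        reverse=True,
    )

def infer_retrieve_rank(text, labels):
    # Compose the three steps into one program.
    return rank(text, retrieve(infer(text), labels))

print(infer_retrieve_rank("seeking a machine learning engineer", LABELS))
```

In the paper, each step is a declarative DSPy module whose prompts and few-shot demonstrations are tuned by DSPy optimizers rather than hand-written heuristics like the ones above.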