About this document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Michael Tänzer
- sl:arxiv_num : 2105.00828
- sl:arxiv_published : 2021-04-16T18:53:19Z
- sl:arxiv_summary : State-of-the-art pre-trained language models have been shown to memorise facts and perform well with limited amounts of training data. To gain a better understanding of how these models learn, we study their generalisation and memorisation capabilities in noisy and low-resource scenarios. We find that the training of these models is almost unaffected by label noise and that it is possible to reach near-optimal results even on extremely noisy datasets. However, our experiments also show that they mainly learn from high-frequency patterns and largely fail when tested on low-resource tasks such as few-shot learning and rare entity recognition. To mitigate such limitations, we propose an extension based on prototypical networks that improves performance in low-resource named entity recognition tasks.@en
- sl:arxiv_title : Memorisation versus Generalisation in Pre-trained Language Models@en
- sl:arxiv_updated : 2022-03-15T01:14:16Z
- sl:bookmarkOf : https://arxiv.org/abs/2105.00828
- sl:creationDate : 2022-03-30
- sl:creationTime : 2022-03-30T16:11:53Z
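
The summary above attributes the low-resource gains to an extension based on prototypical networks (Snell et al., 2017). The record carries no implementation details, so the following is a minimal generic sketch of the prototypical-network classification step only, not the authors' method: it assumes mean-pooled class prototypes over support embeddings and nearest-prototype assignment by squared Euclidean distance. The function names and toy data are illustrative assumptions.

```python
import numpy as np

def prototypes(support_emb, support_labels, num_classes):
    # Class prototype = mean embedding of that class's support examples.
    return np.stack([
        support_emb[support_labels == c].mean(axis=0)
        for c in range(num_classes)
    ])

def classify(query_emb, protos):
    # dists[i, c] = squared Euclidean distance from query i to prototype c;
    # each query is assigned to its nearest prototype.
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Toy usage: 2 classes, 3-dimensional embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0.0, 0.1, (5, 3)), rng.normal(1.0, 0.1, (5, 3))])
labels = np.array([0] * 5 + [1] * 5)
protos = prototypes(support, labels, num_classes=2)
queries = np.vstack([rng.normal(0.0, 0.1, (2, 3)), rng.normal(1.0, 0.1, (2, 3))])
print(classify(queries, protos))  # expected: [0 0 1 1]
```

In a named entity recognition setting the embeddings would come from a pre-trained encoder, with one prototype per entity class; this sketch only illustrates the prototype-and-nearest-neighbour mechanic the summary names.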