About this document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Michael Tänzer
- sl:arxiv_num : 2105.00828
- sl:arxiv_published : 2021-04-16T18:53:19Z
- sl:arxiv_summary : State-of-the-art pre-trained language models have been shown to memorise facts and perform well with limited amounts of training data. To gain a better understanding of how these models learn, we study their generalisation and memorisation capabilities in noisy and low-resource scenarios. We find that the training of these models is almost unaffected by label noise and that it is possible to reach near-optimal results even on extremely noisy datasets. However, our experiments also show that they mainly learn from high-frequency patterns and largely fail when tested on low-resource tasks such as few-shot learning and rare entity recognition. To mitigate such limitations, we propose an extension based on prototypical networks that improves performance in low-resource named entity recognition tasks.@en
- sl:arxiv_title : Memorisation versus Generalisation in Pre-trained Language Models@en
- sl:arxiv_updated : 2022-03-15T01:14:16Z
- sl:bookmarkOf : https://arxiv.org/abs/2105.00828
- sl:creationDate : 2022-03-30
- sl:creationTime : 2022-03-30T16:11:53Z
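
The summary above attributes the low-resource gains to an extension based on prototypical networks (Snell et al., 2017). The record carries no implementation details, so the following is a minimal generic sketch of the prototypical-network classification step only, not the authors' method: it assumes mean-pooled class prototypes over support embeddings and nearest-prototype assignment by squared Euclidean distance. The function names and toy data are illustrative assumptions.

```python
import numpy as np

def prototypes(support_emb, support_labels, num_classes):
    # Class prototype = mean embedding of that class's support examples.
    return np.stack([
        support_emb[support_labels == c].mean(axis=0)
        for c in range(num_classes)
    ])

def classify(query_emb, protos):
    # dists[i, c] = squared Euclidean distance from query i to prototype c;
    # each query is assigned to its nearest prototype.
    dists = ((query_emb[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Toy usage: 2 classes, 3-dimensional embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0.0, 0.1, (5, 3)), rng.normal(1.0, 0.1, (5, 3))])
labels = np.array([0] * 5 + [1] * 5)
protos = prototypes(support, labels, num_classes=2)
queries = np.vstack([rng.normal(0.0, 0.1, (2, 3)), rng.normal(1.0, 0.1, (2, 3))])
print(classify(queries, protos))  # expected: [0 0 1 1]
```

In a named entity recognition setting the embeddings would come from a pre-trained encoder, with one prototype per entity class; this sketch only illustrates the prototype-and-nearest-neighbour mechanic the summary names.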