About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Michael A. Hedderich
- sl:arxiv_num : 1807.00745
- sl:arxiv_published : 2018-07-02T15:35:02Z
- sl:arxiv_summary : Manually labeled corpora are expensive to create and often not available for
low-resource languages or domains. Automatic labeling approaches are an
alternative way to obtain labeled data in a quicker and cheaper way. However,
these labels often contain more errors which can deteriorate a classifier's
performance when trained on this data. We propose a noise layer that is added
to a neural network architecture. This allows modeling the noise and training on a
combination of clean and noisy data. We show that in a low-resource NER task we
can improve performance by up to 35% by using additional, noisy data and
handling the noise.@en
- sl:arxiv_title : Training a Neural Network in a Low-Resource Setting on Automatically Annotated Noisy Data@en
- sl:arxiv_updated : 2018-07-22T06:01:14Z
- sl:bookmarkOf : https://arxiv.org/abs/1807.00745
- sl:creationDate : 2022-07-18
- sl:creationTime : 2022-07-18T11:39:48Z