About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Suchin Gururangan
- sl:arxiv_num : 2004.10964
- sl:arxiv_published : 2020-04-23T04:21:19Z
- sl:arxiv_summary : Language models pretrained on text from a wide variety of sources form the
foundation of today's NLP. In light of the success of these broad-coverage
models, we investigate whether it is still helpful to tailor a pretrained model
to the domain of a target task. We present a study across four domains
(biomedical and computer science publications, news, and reviews) and eight
classification tasks, showing that a second phase of pretraining in-domain
(domain-adaptive pretraining) leads to performance gains, under both high- and
low-resource settings. Moreover, adapting to the task's unlabeled data
(task-adaptive pretraining) improves performance even after domain-adaptive
pretraining. Finally, we show that adapting to a task corpus augmented using
simple data selection strategies is an effective alternative, especially when
resources for domain-adaptive pretraining might be unavailable. Overall, we
consistently find that multi-phase adaptive pretraining offers large gains in
task performance.@en
- sl:arxiv_title : Don't Stop Pretraining: Adapt Language Models to Domains and Tasks@en
- sl:arxiv_updated : 2020-05-05T22:00:44Z
- sl:bookmarkOf : https://arxiv.org/abs/2004.10964
- sl:creationDate : 2020-12-01
- sl:creationTime : 2020-12-01T15:43:33Z
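
Both adaptation phases the abstract describes (domain-adaptive and task-adaptive pretraining) amount to continuing masked-language-model pretraining on a new unlabeled corpus before fine-tuning. Below is a minimal sketch of that continued-pretraining step using Hugging Face Transformers on RoBERTa (the model the paper adapts); the corpus path `domain_corpus.txt` and all hyperparameters are illustrative assumptions, not the paper's settings.

```python
# Sketch of domain-adaptive pretraining (DAPT): continue masked-LM
# pretraining of RoBERTa on an in-domain corpus. For task-adaptive
# pretraining (TAPT), the same loop runs on the task's unlabeled text.
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Hypothetical plain-text corpus, one document per line.
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Dynamic token masking, as in RoBERTa pretraining.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="dapt-roberta",
    per_device_train_batch_size=8,   # illustrative; the paper trains far longer
    num_train_epochs=1,
    learning_rate=5e-5,
    save_strategy="epoch",
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```

After this second pretraining phase, the adapted checkpoint in `dapt-roberta` would be fine-tuned on the labeled target task as usual.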