About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Mikel Artetxe
- sl:arxiv_num : 2004.14958
- sl:arxiv_published : 2020-04-30T17:06:23Z
- sl:arxiv_summary : We review motivations, definition, approaches, and methodology for unsupervised cross-lingual learning and call for a more rigorous position in each of them. An existing rationale for such research is based on the lack of parallel data for many of the world's languages. However, we argue that a scenario without any parallel data and abundant monolingual data is unrealistic in practice. We also discuss different training signals that have been used in previous work, which depart from the pure unsupervised setting. We then describe common methodological issues in tuning and evaluation of unsupervised cross-lingual models and present best practices. Finally, we provide a unified outlook for different types of research in this area (i.e., cross-lingual word embeddings, deep multilingual pretraining, and unsupervised machine translation) and argue for comparable evaluation of these models.
- sl:arxiv_title : A Call for More Rigor in Unsupervised Cross-lingual Learning
- sl:arxiv_updated : 2020-04-30T17:06:23Z
- sl:bookmarkOf : https://arxiv.org/abs/2004.14958
- sl:creationDate : 2020-05-02
- sl:creationTime : 2020-05-02T12:35:54Z