About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Wilhelmina Nekoto
- sl:arxiv_num : 2010.02353
- sl:arxiv_published : 2020-10-05T21:50:38Z
- sl:arxiv_summary : Research in NLP lacks geographic diversity, and the question of how NLP can
be scaled to low-resourced languages has not yet been adequately solved.
"Low-resourced"-ness is a complex problem going beyond data availability and
reflects systemic problems in society. In this paper, we focus on the task of
Machine Translation (MT), that plays a crucial role for information
accessibility and communication worldwide. Despite immense improvements in MT
over the past decade, MT is centered around a few high-resourced languages. As
MT researchers cannot solve the problem of low-resourcedness alone, we propose
participatory research as a means to involve all necessary agents required in
the MT development process. We demonstrate the feasibility and scalability of
participatory research with a case study on MT for African languages. Its
implementation leads to a collection of novel translation datasets, MT
benchmarks for over 30 languages, with human evaluations for a third of them,
and enables participants without formal training to make a unique scientific
contribution. Benchmarks, models, data, code, and evaluation results are
released under https://github.com/masakhane-io/masakhane-mt.@en
- sl:arxiv_title : Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages@en
- sl:arxiv_updated : 2020-11-06T23:30:45Z
- sl:bookmarkOf : https://arxiv.org/abs/2010.02353
- sl:creationDate : 2021-08-25
- sl:creationTime : 2021-08-25T17:01:12Z