About This Document
- sl:arxiv_author :
- sl:arxiv_firstAuthor : Ying Zhang
- sl:arxiv_num : 1706.00384
- sl:arxiv_published : 2017-06-01T16:57:15Z
- sl:arxiv_summary : Model distillation is an effective and widely used technique to transfer
knowledge from a teacher to a student network. The typical application is to
transfer from a powerful large network or ensemble to a small network that is
better suited to low-memory or fast execution requirements. In this paper, we
present a deep mutual learning (DML) strategy where, rather than one-way
transfer between a static pre-defined teacher and a student, an ensemble of
students learn collaboratively and teach each other throughout the training
process. Our experiments show that a variety of network architectures benefit
from mutual learning and achieve compelling results on CIFAR-100 recognition
and Market-1501 person re-identification benchmarks. Surprisingly, it is
revealed that no prior powerful teacher network is necessary -- mutual learning
of a collection of simple student networks works, and moreover outperforms
distillation from a more powerful yet static teacher.@en
- sl:arxiv_title : Deep Mutual Learning@en
- sl:arxiv_updated : 2017-06-01T16:57:15Z
- sl:bookmarkOf : https://arxiv.org/abs/1706.00384
- sl:creationDate : 2020-05-11
- sl:creationTime : 2020-05-11T21:21:42Z
- sl:references : http://www.semanlink.net/doc/2020/05/1511_03643_unifying_distillat
- sl:relatedDoc : http://www.semanlink.net/doc/2020/06/1804_03235_large_scale_distri
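
The summary above only sketches the DML mechanism at a high level, so the following is a minimal illustration of the mutual-learning update, assuming a PyTorch-style cohort of two students; the networks, optimizers, and the single joint update per batch are illustrative assumptions rather than the paper's exact training procedure.

```python
# Hedged sketch of deep mutual learning (DML) with two students:
# each network fits the ground-truth labels and additionally mimics
# its peer's predicted class distribution via a KL term.
import torch
import torch.nn.functional as F

def dml_step(net1, net2, opt1, opt2, images, labels):
    """One cohort update: both students learn from labels and from each other."""
    logits1, logits2 = net1(images), net2(images)

    # Supervised cross-entropy losses for each student.
    ce1 = F.cross_entropy(logits1, labels)
    ce2 = F.cross_entropy(logits2, labels)

    # KL mimicry losses: each student treats the peer's (detached) softmax
    # output as a soft target, so gradients only flow into the mimicking student.
    kl1 = F.kl_div(F.log_softmax(logits1, dim=1),
                   F.softmax(logits2.detach(), dim=1), reduction="batchmean")
    kl2 = F.kl_div(F.log_softmax(logits2, dim=1),
                   F.softmax(logits1.detach(), dim=1), reduction="batchmean")

    opt1.zero_grad(); (ce1 + kl1).backward(); opt1.step()
    opt2.zero_grad(); (ce2 + kl2).backward(); opt2.step()
```

In the paper's formulation each student's objective is its own supervised loss plus a KL term toward its peer, e.g. L_1 = L_C1 + D_KL(p2 || p1) and symmetrically for the second student; the two mimicry terms above approximate that, though the paper updates the students alternately rather than from a single shared forward pass.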