@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sl:   <http://www.semanlink.net/2001/00/semanlink-schema#> . # assumed; namespace URI missing in source
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tag:  <http://www.semanlink.net/tag/> . # assumed; namespace URI missing in source
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .

tag:mutual_learning a sl:Tag ;
    rdfs:isDefinedBy ; # object URI missing in source
    sl:comment "// TODO see collaborative learning / co-training" ;
    skos:broader tag:machines_teaching_machines ;
    skos:prefLabel "Mutual Learning" ;
    skos:related tag:knowledge_distillation , tag:co_training ;
    foaf:page tag:mutual_learning.html .

<https://arxiv.org/abs/1706.00384> # assumed from the title; document URI missing in source
    dc:title "[1706.00384] Deep Mutual Learning" ;
    sl:comment "> In this paper we explore a different but related idea to model distillation – that of mutual learning. Distillation starts with a powerful large and pre-trained teacher network and performs one-way knowledge transfer to a small untrained student. In contrast, in mutual learning we start with a pool of untrained students who learn simultaneously to solve the task together.\r\n\r\n[critique here](doc:2020/06/1804_03235_large_scale_distri):\r\n\r\n> Zhang et al. (2017) reported a benefit in quality over basic distillation, but they compare distilling model M1 into model M2 with training model M1 and model M2 using codistillation; they do not compare to distilling an ensemble of models M1 and M2 into model M3.\r\n>\r\n> ...\r\n>\r\n> we can achieve the 70.7% they report for online distillation using traditional offline distillation." ;
    sl:creationDate "2020-05-11" ;
    sl:tag tag:mutual_learning , tag:knowledge_distillation , tag:kd_mkb_biblio , tag:arxiv_doc .

tag:knowledge_distillation a sl:Tag ;
    skos:prefLabel "Knowledge distillation" .

tag:co_training a sl:Tag ;
    skos:prefLabel "Co-training" .

tag:machines_teaching_machines a sl:Tag ;
    skos:prefLabel "Machines teaching machines" .

tag:arxiv_doc a sl:Tag ;
    skos:prefLabel "Arxiv Doc" .

tag:kd_mkb_biblio a sl:Tag ;
    skos:prefLabel "KD-MKB biblio" .
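The annotations above contrast one-way distillation (pre-trained teacher, untrained student) with mutual learning, where a cohort of untrained peers fits the labels while matching each other's predictions. Below is a minimal PyTorch-style sketch of that objective for two peers; it illustrates the idea described in the quoted abstract rather than reproducing the paper's code, and the tiny linear "networks", dimensions, and hyperparameters are placeholders.

```python
# Sketch of a two-peer mutual-learning objective (after arXiv:1706.00384).
import torch
import torch.nn as nn
import torch.nn.functional as F

def dml_losses(logits_a, logits_b, targets):
    """Each peer minimizes cross-entropy on the labels plus a KL term
    pulling it towards the other peer's current predictive distribution."""
    ce_a = F.cross_entropy(logits_a, targets)
    ce_b = F.cross_entropy(logits_b, targets)
    # KL(p_b || p_a) nudges peer A towards peer B's posterior, and vice versa;
    # the peer providing the target distribution is detached so each network
    # is only updated through its own loss.
    kl_a = F.kl_div(F.log_softmax(logits_a, dim=1),
                    F.softmax(logits_b, dim=1).detach(),
                    reduction="batchmean")
    kl_b = F.kl_div(F.log_softmax(logits_b, dim=1),
                    F.softmax(logits_a, dim=1).detach(),
                    reduction="batchmean")
    return ce_a + kl_a, ce_b + kl_b

# Usage: both untrained peers are optimized simultaneously on the same batches.
peer_a, peer_b = nn.Linear(32, 10), nn.Linear(32, 10)   # stand-ins for real networks
opt = torch.optim.SGD(list(peer_a.parameters()) + list(peer_b.parameters()), lr=0.1)
x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
loss_a, loss_b = dml_losses(peer_a(x), peer_b(x), y)
opt.zero_grad(); (loss_a + loss_b).backward(); opt.step()
```

In the paper the peers are updated alternately, each with its own optimizer; the single joint update here is only to keep the sketch short.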