GPL (Generative Pseudo Labeling)
http://www.semanlink.net/tag/gpl_generative_pseudo_labeling
Documents tagged with GPL (Generative Pseudo Labeling)

Domain Adaptation with Generative Pseudo-Labeling (GPL) | Pinecone
http://www.semanlink.net/doc/2023/04/domain_adaptation_with_generati
2023-04-09T10:30:34Z

[2205.11498] Domain Adaptation for Memory-Efficient Dense Retrieval
http://www.semanlink.net/doc/2022/09/2205_11498_domain_adaptation_
Refers to [Binary Passage Retriever (BPR)](doc:2021/06/2106_00882_efficient_passage_)
2022-09-26T17:46:39Z

Domain transfer with GGPL: German Generative Pseudo Labeling 🥨 | by Matthias Richter | Jun, 2022 | ML6team
http://www.semanlink.net/doc/2022/06/domain_transfer_with_ggpl_germ
2022-06-02T13:55:12Z

Nils Reimers on Twitter: "GPL goes multi-lingual..."
http://www.semanlink.net/doc/2022/06/nils_reimers_sur_twitter_gpl
[Domain transfer with GGPL: German Generative Pseudo Labeling](doc:2022/06/domain_transfer_with_ggpl_germ)
2022-06-01T17:45:24Z

Ramsri Goutham Golla on Twitter: "Hi @Nils_Reimers For GPL you used "msmarco-distilbert-base-tas-b" model and ..."
http://www.semanlink.net/doc/2022/04/ramsri_goutham_golla_sur_twitte
2022-04-27T22:17:10Z

Domain Adaptation — Sentence-Transformers documentation
http://www.semanlink.net/doc/2022/03/domain_adaptation_sentence_tr
2022-03-31T08:59:25Z

NAVER LABS Europe: "@Nils_Reimers of @huggingface on 'Unsupervised domain adaptation for neural search'"
http://www.semanlink.net/doc/2022/03/naver_labs_europe_nils_reim
2022-03-09T10:53:24Z

[2112.07577] GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
http://www.semanlink.net/doc/2021/12/2112_07577_gpl_generative_ps
An unsupervised domain adaptation technique for dense retrieval models:
1. synthetic queries are generated for each passage of the target corpus (using an existing pre-trained [T5](tag:text_to_text_transfer_transformer) encoder-decoder)
2. the generated queries are used to mine negative passages (the most similar passages are retrieved with an existing dense retrieval model == hard negatives!)
3. the query-passage pairs are labeled by a cross-encoder, and used to train the domain-adapted dense retriever (using the MarginMSE method described in [Hofstätter et al., 2020](doc:2021/12/2010_02666_improving_efficien))
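The three steps above can be sketched as a pipeline. This is a toy structural sketch, not the actual GPL implementation (see the UKPLab/gpl repo for that): `query_gen`, `dense_sim` and `ce_score` are hypothetical stand-ins for the T5 query generator, the bi-encoder retriever and the cross-encoder labeler.

```python
# Toy stand-ins (assumptions): in GPL, query_gen is a T5 model,
# dense_sim a bi-encoder, ce_score a cross-encoder.
def query_gen(passage):
    # step 1: one synthetic query per passage
    return "what is " + passage.split()[0].lower()

def dense_sim(query, passage):
    # stand-in bi-encoder similarity (Jaccard over tokens)
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / len(q | p)

def ce_score(query, passage):
    # stand-in cross-encoder relevance score
    return 10.0 * dense_sim(query, passage)

def gpl_training_examples(corpus, num_negatives=1):
    examples = []
    for pos in corpus:
        q = query_gen(pos)                                # 1. generate query
        candidates = sorted((p for p in corpus if p != pos),
                            key=lambda p: dense_sim(q, p), reverse=True)
        for neg in candidates[:num_negatives]:            # 2. mine hard negatives
            margin = ce_score(q, pos) - ce_score(q, neg)  # 3. cross-encoder pseudo-label
            examples.append((q, pos, neg, margin))
    return examples
```

Each `(query, positive, negative, margin)` tuple is one MarginMSE training example: the student bi-encoder is trained so that its own score margin between positive and negative regresses toward the cross-encoder's margin.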
[Nils Reimers sur Twitter](doc:2021/12/nils_reimers_sur_twitter_do_), [GitHub](https://github.com/UKPLab/gpl), by the author of [TSDAE](doc:2021/09/2104_06979_tsdae_using_trans)
Claims to improve on "Doc2Query" [Document Expansion by Query Prediction](doc:2022/01/1904_08375_document_expansion): ([src](https://twitter.com/KexinWang2049/status/1471435779415150598))
> - GPL: Uses doc2query to construct synthetic data and does knowledge distillation (i.e. training) on that data.
> - Doc2query: Generates queries to extend the documents and use BM25 on top of them w/o training.
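The "knowledge distillation" in GPL is the MarginMSE objective from Hofstätter et al.: the squared error between the student bi-encoder's score margin (positive minus negative) and the cross-encoder teacher's margin. A minimal sketch with illustrative numbers (the scores below are made up, not from any real model):

```python
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """MarginMSE: squared error between student and teacher score margins."""
    student_margin = student_pos - student_neg  # e.g. bi-encoder dot products
    teacher_margin = teacher_pos - teacher_neg  # cross-encoder pseudo-labels
    return (student_margin - teacher_margin) ** 2

# Student barely separates the pair (margin 1.5) while the teacher
# separates it strongly (margin 6.7) -> large loss widens the student margin.
loss = margin_mse(student_pos=32.0, student_neg=30.5,
                  teacher_pos=9.1, teacher_neg=2.4)
```

Because the target is a *margin* rather than an absolute score, the student need not match the cross-encoder's score scale, only its relative ordering strength.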
2021-12-15T18:23:28Z