<?xml version='1.0' encoding='UTF-8'  ?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">	<channel rdf:about="http://www.semanlink.net/tag/unsupervised_domain_adaptation_nlp">		<title>Unsupervised Domain Adaptation (NLP)</title>		<link>http://www.semanlink.net/tag/unsupervised_domain_adaptation_nlp</link>		<description>Documents tagged with Unsupervised Domain Adaptation (NLP)</description>		<items>			<rdf:Seq>							<rdf:li resource="http://www.semanlink.net/doc/2023/04/domain_adaptation_with_generati"/>				<rdf:li resource="http://www.semanlink.net/doc/2023/01/1904_02817_unsupervised_domai"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/09/2205_11498_domain_adaptation_"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/08/unsupervised_learning_sentenc"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/06/domain_transfer_with_ggpl_germ"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/06/nils_reimers_sur_twitter_gpl"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/04/ramsri_goutham_golla_sur_twitte"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/03/domain_adaptation_sentence_tr"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/03/2006_00632_neural_unsupervise"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/03/unsupervised_training_of_retrie"/>				<rdf:li resource="http://www.semanlink.net/doc/2022/03/naver_labs_europe_nils_reim"/>				<rdf:li resource="http://www.semanlink.net/doc/2021/12/2112_09118_towards_unsupervis"/>				<rdf:li resource="http://www.semanlink.net/doc/2021/12/2112_07577_gpl_generative_ps"/>				<rdf:li resource="http://www.semanlink.net/doc/2021/12/nils_reimers_sur_twitter_do_"/>				<rdf:li resource="http://www.semanlink.net/doc/2021/11/unsupervised_training_for_sente"/>				<rdf:li 
resource="http://www.semanlink.net/doc/2021/09/2104_06979_tsdae_using_trans"/>			</rdf:Seq>		</items>	</channel>		<item rdf:about="http://www.semanlink.net/doc/2023/04/domain_adaptation_with_generati">		<title>Domain Adaptation with Generative Pseudo-Labeling (GPL) | Pinecone</title>		<link>http://www.semanlink.net/doc/2023/04/domain_adaptation_with_generati</link>		<dc:date>2023-04-09T10:30:34Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2023/01/1904_02817_unsupervised_domai">		<title>[1904.02817&#93; Unsupervised Domain Adaptation of Contextualized Embeddings for Sequence Labeling</title>		<link>http://www.semanlink.net/doc/2023/01/1904_02817_unsupervised_domai</link>		<dc:date>2023-01-12T16:29:04Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/09/2205_11498_domain_adaptation_">		<title>[2205.11498&#93; Domain Adaptation for Memory-Efficient Dense Retrieval</title>		<link>http://www.semanlink.net/doc/2022/09/2205_11498_domain_adaptation_</link>		<description>Refers to [Binary Passage Retriever (BPR)&#93;(doc:2021/06/2106_00882_efficient_passage_)		</description>		<dc:date>2022-09-26T17:46:39Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/08/unsupervised_learning_sentenc">		<title>Unsupervised Learning — Sentence-Transformers documentation</title>		<link>http://www.semanlink.net/doc/2022/08/unsupervised_learning_sentenc</link>		<description>&gt; In our paper TSDAE we compare approaches for sentence embedding tasks, and in GPL we compare them for semantic search tasks (given a query, find relevant passages). While the unsupervised approach achieve acceptable performances for sentence embedding tasks, they perform poorly for semantic search tasks.		
</description>		<dc:date>2022-08-20T01:16:16Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/06/domain_transfer_with_ggpl_germ">		<title>Domain transfer with GGPL: German Generative Pseudo Labeling 🥨 | by Matthias Richter | Jun, 2022 | ML6team</title>		<link>http://www.semanlink.net/doc/2022/06/domain_transfer_with_ggpl_germ</link>		<dc:date>2022-06-02T13:55:12Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/06/nils_reimers_sur_twitter_gpl">		<title>Nils Reimers sur Twitter : &quot;GPL goes multi-lingual...&quot;</title>		<link>http://www.semanlink.net/doc/2022/06/nils_reimers_sur_twitter_gpl</link>		<description>[Domain transfer with GGPL: German Generative Pseudo Labeling&#93;(doc:2022/06/domain_transfer_with_ggpl_germ)		</description>		<dc:date>2022-06-01T17:45:24Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/04/ramsri_goutham_golla_sur_twitte">		<title>Ramsri Goutham Golla sur Twitter : &quot;Hi @Nils_Reimers For GPL you used &quot;msmarco-distilbert-base-tas-b&quot; model and ...&quot;</title>		<link>http://www.semanlink.net/doc/2022/04/ramsri_goutham_golla_sur_twitte</link>		<dc:date>2022-04-27T22:17:10Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/03/domain_adaptation_sentence_tr">		<title>Domain Adaptation — Sentence-Transformers documentation</title>		<link>http://www.semanlink.net/doc/2022/03/domain_adaptation_sentence_tr</link>		<dc:date>2022-03-31T08:59:25Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/03/2006_00632_neural_unsupervise">		<title>[2006.00632&#93; Neural Unsupervised Domain Adaptation in NLP---A Survey</title>		<link>http://www.semanlink.net/doc/2022/03/2006_00632_neural_unsupervise</link>		<dc:date>2022-03-30T01:13:03Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/03/unsupervised_training_of_retrie">		<title>Unsupervised Training of Retrievers Using GenQ (The Art of 
Asking Questions with GenQ) | Pinecone</title>		<link>http://www.semanlink.net/doc/2022/03/unsupervised_training_of_retrie</link>		<dc:date>2022-03-09T10:56:30Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2022/03/naver_labs_europe_nils_reim">		<title>NAVER LABS Europe : &quot;@Nils_Reimers of @huggingface on &apos;Unsupervised domain adaptation for neural search&apos;&quot;</title>		<link>http://www.semanlink.net/doc/2022/03/naver_labs_europe_nils_reim</link>		<dc:date>2022-03-09T10:53:24Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2021/12/2112_09118_towards_unsupervis">		<title>[2112.09118&#93; Towards Unsupervised Dense Information Retrieval with Contrastive Learning</title>		<link>http://www.semanlink.net/doc/2021/12/2112_09118_towards_unsupervis</link>		<description>&gt; we explore the limits of contrastive learning as a way to train unsupervised dense retrievers, and show that it leads to strong retrieval performance.

[openreview&#93;(https://openreview.net/forum?id=jKN1pXi7b0)		</description>		<dc:date>2021-12-21T11:26:40Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2021/12/2112_07577_gpl_generative_ps">		<title>[2112.07577&#93; GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval</title>		<link>http://www.semanlink.net/doc/2021/12/2112_07577_gpl_generative_ps</link>		<description>An unsupervised domain adaptation technique for dense retrieval models

1. synthetic queries are generated for each passage from the target corpus (using an existing pre-trained [T5&#93;(tag:text_to_text_transfer_transformer) encoder-decoder)
2. the generated queries are used for mining negative passages (retrieving the most similar paragraphs using an existing dense retrieval model == hard negatives!)
3. the query-passage pairs are labeled by a cross-encoder and used to train the domain-adapted dense retriever (using the method described in [Hofstätter et al., 2020&#93;(doc:2021/12/2010_02666_improving_efficien))
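
A minimal sketch of the three steps above, in Python. Every function is a hypothetical stand-in based on word overlap: an actual GPL setup would use a T5 query generator, a dense retriever and a cross-encoder. The pseudo-label is assumed here to be the score margin between the positive passage and the mined negative.

```python
# Toy sketch of the three GPL steps. All names are hypothetical stand-ins,
# not the API of the GPL repository or of any library.

def tokens(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())

def overlap_score(query, passage):
    """Stand-in relevance score: Jaccard overlap of the two word sets."""
    q, p = tokens(query), tokens(passage)
    union = q.union(p)
    return len(q.intersection(p)) / len(union) if union else 0.0

def generate_query(passage, n_words=3):
    """Step 1 stand-in: a 'synthetic query' made of the first words of the passage."""
    return " ".join(passage.split()[:n_words])

def mine_negatives(query, positive, corpus, k=1):
    """Step 2 stand-in: the k passages most similar to the query, positive excluded."""
    candidates = [p for p in corpus if p != positive]
    ranked = sorted(candidates, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]

def pseudo_label(query, positive, negative):
    """Step 3 stand-in: score margin between the positive and the negative."""
    return overlap_score(query, positive) - overlap_score(query, negative)

def build_training_data(corpus):
    """Run the three steps and return (query, positive, negative, margin) tuples."""
    examples = []
    for passage in corpus:
        query = generate_query(passage)
        for negative in mine_negatives(query, passage, corpus):
            margin = pseudo_label(query, passage, negative)
            examples.append((query, passage, negative, margin))
    return examples

corpus = [
    "dense retrieval maps queries and passages to vectors",
    "dense retrieval models often fail on new domains",
    "sourdough bread needs a long fermentation",
]
training_data = build_training_data(corpus)
for example in training_data:
    print(example)
```

The resulting tuples are the kind of data a student retriever would be trained on; the training loop itself is left out of this sketch.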

[Nils Reimers sur Twitter&#93;(doc:2021/12/nils_reimers_sur_twitter_do_), [GitHub&#93;(https://github.com/UKPLab/gpl), by the author of [TSDAE&#93;(doc:2021/09/2104_06979_tsdae_using_trans)

Claims to improve on &quot;Doc2Query&quot; ([Document Expansion by Query Prediction&#93;(doc:2022/01/1904_08375_document_expansion)) ([src&#93;(https://twitter.com/KexinWang2049/status/1471435779415150598)):

&gt; - GPL: Uses doc2query to construct synthetic data and does knowledge distillation (i.e. training) on that data.
&gt; - Doc2query: Generates queries to extend the documents and use BM25 on top of them w/o training.		</description>		<dc:date>2021-12-15T18:23:28Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2021/12/nils_reimers_sur_twitter_do_">		<title>Nils Reimers sur Twitter : &quot;Do dense retrieval models work out-of-the-box for your specific domain? Often the answer was No😢...&quot;</title>		<link>http://www.semanlink.net/doc/2021/12/nils_reimers_sur_twitter_do_</link>		<dc:date>2021-12-15T18:06:51Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2021/11/unsupervised_training_for_sente">		<title>Unsupervised Training for Sentence Transformers | Pinecone</title>		<link>http://www.semanlink.net/doc/2021/11/unsupervised_training_for_sente</link>		<description>Blog post about [[2104.06979&#93; TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning&#93;(doc:2021/09/2104_06979_tsdae_using_trans)

&gt; Fine-tuning with TSDAE simply cannot compete in terms of performance against supervised methods.
However, **the point and value of TSDAE is that it allows us to fine-tune models for use-cases where we have no data**. Specific domains with unique terminology or low resource languages.		</description>		<dc:date>2021-11-24T21:03:44Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2021/09/2104_06979_tsdae_using_trans">		<title>[2104.06979&#93; TSDAE: Using Transformer-based Sequential Denoising Auto-Encoder for Unsupervised Sentence Embedding Learning</title>		<link>http://www.semanlink.net/doc/2021/09/2104_06979_tsdae_using_trans</link>		<description>&gt; The most
successful previous approaches like InferSent (Conneau
et al., 2017), Universal Sentence Encoder
(USE) (Cer et al., 2018) and SBERT (Reimers and
Gurevych, 2019) heavily relied on labeled data to
train sentence embedding models.
&gt;
&gt; TSDAE can
achieve up to 93.1% of the performance of in-domain
supervised approaches. Further, we
show that TSDAE is **a strong domain adaptation
and pre-training method for sentence
embeddings**, significantly outperforming other
approaches like Masked Language Model.

&gt; During training, TSDAE
encodes corrupted sentences into fixed-sized
vectors and requires the decoder to reconstruct the
original sentences from this sentence embedding.

- &lt;https://www.sbert.net/examples/unsupervised_learning/TSDAE/README.html&gt;
- [github&#93;(https://github.com/UKPLab/sentence-transformers/tree/master/examples/unsupervised_learning/TSDAE)
- [UKPLab/sentence-transformers: Sentence Embeddings with BERT &amp; XLNet&#93;(doc:2020/07/ukplab_sentence_transformers_s)
- [twitter&#93;(https://twitter.com/KexinWang2049/status/1433361957579538432):

&gt; **TSDAE can learn domain-specific sentence embeddings with unlabeled sentences**
&gt;
&gt; Most importantly, instead of STS (Semantic Textual Similarity), **we suggest evaluating unsupervised sentence embeddings on the domain-specific tasks &amp; datasets, which is the real use case for them**. Actually, STS scores do not correlate with performance on specific tasks.



		</description>		<dc:date>2021-09-01T16:43:01Z</dc:date>	</item></rdf:RDF>