<?xml version='1.0' encoding='UTF-8'  ?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">	<channel rdf:about="http://www.semanlink.net/tag/benjamin_clavie">		<title>Benjamin Clavié</title>		<link>http://www.semanlink.net/tag/benjamin_clavie</link>		<description>Documents tagged with Benjamin Clavié</description>		<items>			<rdf:Seq>							<rdf:li resource="http://www.semanlink.net/doc/2025/07/stop_saying_rag_is_dead_hamel"/>				<rdf:li resource="http://www.semanlink.net/doc/2025/02/benjamin_clavie_sur_x_what_i"/>				<rdf:li resource="http://www.semanlink.net/doc/2025/01/a_little_pooling_goes_a_long_wa"/>				<rdf:li resource="http://www.semanlink.net/doc/2025/01/benjamin_clavie_sur_x_%F0%9F%A7%B5_ste"/>				<rdf:li resource="http://www.semanlink.net/doc/2024/12/2412_13663_smarter_better_f"/>				<rdf:li resource="http://www.semanlink.net/doc/2024/12/jeremy_howard_sur_x_i_ll_get"/>				<rdf:li resource="http://www.semanlink.net/doc/2024/03/benjamin_clavie_sur_x_docume"/>				<rdf:li resource="http://www.semanlink.net/doc/2024/01/bclavie_ragatouille"/>			</rdf:Seq>		</items>	</channel>		<item rdf:about="http://www.semanlink.net/doc/2025/07/stop_saying_rag_is_dead_hamel">		<title>Stop Saying RAG Is Dead – Hamel’s Blog</title>		<link>http://www.semanlink.net/doc/2025/07/stop_saying_rag_is_dead_hamel</link>		<description>&gt; Why the future of RAG lies in better retrieval, not bigger context windows.		
</description>		<dc:date>2025-07-14T08:36:40Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2025/02/benjamin_clavie_sur_x_what_i">		<title>Benjamin Clavié on X: &quot;What if a [MASK&#93; was all you needed?...&quot;</title>		<link>http://www.semanlink.net/doc/2025/02/benjamin_clavie_sur_x_what_i</link>		<dc:date>2025-02-11T00:25:23Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2025/01/a_little_pooling_goes_a_long_wa">		<title>A little pooling goes a long way for multi-vector representations – Answer.AI</title>		<link>http://www.semanlink.net/doc/2025/01/a_little_pooling_goes_a_long_wa</link>		<description>&gt; Intuition: for documents focusing on a low number of topics, a lot of the tokens are likely to carry somewhat redundant semantic information, meaning keeping all of them is likely not useful.		</description>		<dc:date>2025-01-24T17:01:53Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2025/01/benjamin_clavie_sur_x_%F0%9F%A7%B5_ste">		<title>Benjamin Clavié on X: &quot;Stella Embeddings: What&apos;s the big deal?...&quot;</title>		<link>http://www.semanlink.net/doc/2025/01/benjamin_clavie_sur_x_%F0%9F%A7%B5_ste</link>		<description>&gt; Training based on unsupervised distillation

&gt; The current dominant way of training retrieval models is via the use of a contrastive loss, with little-to-no knowledge distillation
&gt; (Stella&apos;s) training works within the embedding space, seeking to minimize the geometric distances... between the teachers&apos; vectors and the student model (Stella)&apos;s outputs.
&gt; 
&gt; Stella models (and Jasper models) generalize amazingly well because of this.
		</description>		<dc:date>2025-01-13T18:42:18Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2024/12/2412_13663_smarter_better_f">		<title>[2412.13663&#93; Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference</title>		<link>http://www.semanlink.net/doc/2024/12/2412_13663_smarter_better_f</link>		<dc:date>2024-12-21T22:45:32Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2024/12/jeremy_howard_sur_x_i_ll_get">		<title>Jeremy Howard on X: &quot;We trained 2 new models. Like BERT, but modern. ModernBERT. Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc...&quot;</title>		<link>http://www.semanlink.net/doc/2024/12/jeremy_howard_sur_x_i_ll_get</link>		<description>&lt;https://x.com/LightOnIO/status/1869785737832366306&gt;		</description>		<dc:date>2024-12-21T17:13:36Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2024/03/benjamin_clavie_sur_x_docume">		<title>Benjamin Clavié on X: &quot;Introducing rerankers: a lightweight library to provide a unified way to use various reranking methods&quot;</title>		<link>http://www.semanlink.net/doc/2024/03/benjamin_clavie_sur_x_docume</link>		<dc:date>2024-03-16T10:28:38Z</dc:date>	</item>	<item rdf:about="http://www.semanlink.net/doc/2024/01/bclavie_ragatouille">		<title>bclavie/RAGatouille</title>		<link>http://www.semanlink.net/doc/2024/01/bclavie_ragatouille</link>		<description>&gt; RAGatouille&apos;s purpose is to make it easy to use state-of-the-art methods in your RAG pipeline, without having to worry about the details or the years of literature! At the moment, RAGatouille focuses on making ColBERT simple to use.

[Using ColBERT in-memory: Index-Free Encodings &amp; Search&#93;(https://github.com/bclavie/RAGatouille/blob/0.0.5b1/examples/06-index_free_use.ipynb)
```
from ragatouille import RAGPretrainedModel
RAG = RAGPretrainedModel.from_pretrained(&quot;colbert-ir/colbertv2.0&quot;)
# Your documents, a plain old list of chunked strings.
documents = [...&#93;
# In-memory indexing supports metadata too!
meta = [{&apos;attribute&apos;: &apos;really cool value&apos;}, ...&#93;
# All the magic happens here
RAG.encode(documents, document_metadatas=meta)
# Query your in-memory index
RAG.search_encoded_docs(query=&quot;A great question&quot;, k=3)
# All further encode() calls add to the existing documents...
RAG.encode(extra_documents, document_metadatas=extra_meta)
# ... until you clear them
RAG.clear_encoded_docs()
```		</description>		<dc:date>2024-01-26T23:44:59Z</dc:date>	</item></rdf:RDF>