Normale Sup
http://www.semanlink.net/tag/normale_sup
Documents tagged with Normale Sup
[2012.15156] A Memory Efficient Baseline for Open Domain Question Answering
http://www.semanlink.net/doc/2022/08/2012_15156_a_memory_efficient
2022-08-08T13:48:04Z
[2208.03299] Few-shot Learning with Retrieval Augmented Language Model
http://www.semanlink.net/doc/2022/08/2208_03299_few_shot_learning_
> Atlas, a retrieval-augmented language model capable of strong few-shot learning, despite having lower parameter counts than other powerful recent few-shot learners.
[tweet](https://twitter.com/davisblalock/status/1564148889996836864?s=20&t=BnLM_O1HkTp7qJILF0DW8g)
2022-08-08T11:32:33Z
[2112.09118] Towards Unsupervised Dense Information Retrieval with Contrastive Learning
http://www.semanlink.net/doc/2021/12/2112_09118_towards_unsupervis
> we explore the limits of contrastive learning as a way to train unsupervised dense retrievers, and show that it leads to strong retrieval performance.
[openreview](https://openreview.net/forum?id=jKN1pXi7b0)
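A minimal sketch of the kind of contrastive objective this points at: an InfoNCE loss with in-batch negatives, where two "views" of the same document (e.g. independent random crops, as in this line of work) serve as anchor and positive. The function name, the batch construction, and the use of PyTorch are my assumptions, not the paper's code.

```python
# Minimal sketch (not the paper's code): InfoNCE contrastive loss with
# in-batch negatives for training an unsupervised dense retriever.
import torch
import torch.nn.functional as F

def info_nce_loss(view_a: torch.Tensor,
                  view_b: torch.Tensor,
                  temperature: float = 0.05) -> torch.Tensor:
    """view_a, view_b: (batch, dim) embeddings of two views (e.g. random
    crops) of the same documents; row i of view_b is the positive for
    row i of view_a, and every other row is an in-batch negative."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature          # (batch, batch) cosine similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)   # diagonal entries = positive pairs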
2021-12-21T11:26:40Z
[2012.04584] Distilling Knowledge from Reader to Retriever for Question Answering
http://www.semanlink.net/doc/2020/12/2012_04584_distilling_knowled
> a method to train an information retrieval module for downstream tasks, **without using pairs of queries and documents as annotations**.
Uses two models (standard pipeline for open-domain QA):
- the first one retrieves documents from a large source of knowledge (the retriever)
- the second one processes the support documents to solve the task (the reader).
> First the retriever selects support passages in a large knowledge source. Then these passages are processed by the reader, along with the question, to generate an answer.
Inspired by knowledge distillation: the reader model is the teacher and the retriever is the student.
> More precisely, we use a sequence-to-sequence model as the reader, and use the attention activations over the input documents as synthetic labels to train the retriever.
> (**train the retriever by learning to approximate the attention score of the reader**)
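A hedged sketch of what this distillation step can look like: the retriever's passage scores are pushed toward the distribution of the reader's aggregated cross-attention. The KL formulation, the aggregation, and all names below are illustrative assumptions; the paper itself compares several aggregation schemes and training objectives.

```python
# Hedged sketch (names and loss choice are mine): train the retriever
# (student) to approximate the reader's (teacher's) attention over passages.
import torch
import torch.nn.functional as F

def attention_distillation_loss(retriever_scores: torch.Tensor,
                                reader_attention: torch.Tensor,
                                temperature: float = 1.0) -> torch.Tensor:
    """retriever_scores: (n_passages,) query-passage similarity scores
    produced by the retriever for one question.
    reader_attention: (n_passages,) the reader's cross-attention mass on
    each retrieved passage, aggregated over layers, heads and tokens —
    the synthetic labels mentioned in the note above."""
    teacher = F.softmax(reader_attention / temperature, dim=-1)
    student_log_probs = F.log_softmax(retriever_scores / temperature, dim=-1)
    # KL(teacher || student): no query-document annotations are needed,
    # only the reader's attention over the retrieved passages.
    return F.kl_div(student_log_probs, teacher, reduction="sum")
```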
Refers to:
- [REALM: Retrieval-Augmented Language Model Pre-Training](doc:2020/12/2002_08909_realm_retrieval_a)
- [Dehghani: Neural Ranking Models with Weak Supervision](doc:?uri=https%3A%2F%2Farxiv.org%2Fabs%2F1704.08803)
2020-12-11T16:48:13Z
From Random Grammars to Learning Language - Département de Physique de l'Ecole Normale supérieure
http://www.semanlink.net/doc/2020/09/from_random_grammars_to_learnin
2020-09-17T23:46:39Z
[1802.07044] The Description Length of Deep Learning Models
http://www.semanlink.net/doc/2019/10/_1802_07044_the_description_le
> Solomonoff’s general theory of inference (Solomonoff, 1964) and the [Minimum Description Length Principle](tag:minimum_description_length_principle) (Grünwald, 2007; Rissanen, 2007) formalize [Occam's razor](tag:occam_s_razor), and hold that **a good model of data is a model that is good at losslessly compressing the data, including the cost of describing the model itself**. Deep neural networks might seem to go against this principle given the large number of parameters to be encoded.
> We demonstrate experimentally the ability of deep neural networks to compress the training data even when accounting for parameter encoding.
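The principle quoted above is usually written as a two-part code; a minimal formalization (notation mine, not the paper's):

```latex
% Two-part MDL codelength of data D under model M: a good model minimizes
% the cost of describing the model plus the cost of the data encoded with it.
L_{\mathrm{MDL}}(D) \;=\; \underbrace{L(M)}_{\text{bits to describe } M}
\;+\; \underbrace{L(D \mid M)}_{\text{bits to encode } D \text{ given } M}
```

The paper's point is that, measured this way, deep networks compress the training data well despite the many parameters counted in the first term.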
2019-10-11T01:59:35Z
An optimistic vision of Africa | Les Ernest
http://www.les-ernest.fr/lionel_zinsou
2011-08-28T01:41:17Z
André Orléan: The instability of financial markets | Les Ernest
http://www.les-ernest.fr/orlean
2010-08-16T23:06:30Z