2023-03-26 ChatGPT + Code Interpreter = Magic – @AndrewMayne

2023-03-21 Jimmy Lin on Twitter: "GPT-4 and its ilk are awesome for rapid prototyping and one-offs, but at the end of the day, enterprises will deploy far smaller distilled models in production. Here's my contrarian take -"

2023-03-25 Getting Started with Hybrid Search | Pinecone
> Taking both vector and traditional search and merging them via Pinecone’s new hybrid search
> Vector search or dense retrieval has been shown to significantly outperform traditional methods **when the embedding models have been fine-tuned on the target domain**.
> In the past, engineering teams needed to run different solutions for dense and sparse search engines and another system to combine results in a meaningful way. Typically a dense vector index, a sparse inverted index, and a reranking step.
>
> The Pinecone approach to hybrid search uses **a single sparse-dense index**.

2023-03-15 raphaelsty.github.io/knowledge demo

2023-03-08 Riddle solved: Why was Roman concrete so durable? | MIT News

2023-03-30 Fine-tuning - OpenAI API

2023-03-21 Tutorial: VS Code with Google Cloud AI Platform as a backend | by Kyle Ziegler | Medium

2023-03-29 Sergey Karayev on Twitter: "I want to chat with AI about long-form content I'm reading (a paper on Arxiv, but the solution would ideally support any website or PDF)..."
> @bing in @MicrosoftEdge does work, just had to give it access to page context in Settings

2023-03-25 Markprompt | Open Source GPT-4 platform for Markdown
> Build a delightful GPT-4 prompt for your Markdown docs

2023-03-08 Enabling Python VirtualEnv in JupyterLab | My Shitty Code

2023-03-24 Harrison Chase on Twitter: "LangChain AIPlugins: A first open source attempt at using AIPlugins (the same ones ChatGPT is using)"

2023-03-15 GPT-4 (OpenAI blog post)

2023-03-24 DataChazGPT on Twitter: "Just. Wow. @OpenAI's just showcased a #ChatGPT plugin for summarizing anything from the web!"

2023-03-27 Fly.io

2023-03-15 David Chalmers on Twitter: "what are some new and interesting results about the relative capacities of multimodal models and pure language models... (thinking about 'do language models need sensory grounding for meaning and understanding?')"
> the new GPT-4 data seem quite relevant here: the version with vision only slightly outperforms the language-only version on some standard tests.

2023-03-13 Inria Paris NLP (ALMAnaCH team) on Twitter: "Writing in two languages: Neural machine translation as an assistive bilingual writing tool"

2023-03-15 Brigadoon (film) - Wikipedia
> Americans Tommy Albright (Gene Kelly) and Jeff Douglas (Van Johnson) are on a hunting trip in Scotland and become lost in the woodlands. They happen upon Brigadoon, a miraculously blessed village that rises out of the mists every hundred years for only a day.

2023-03-24 Jim Fan on Twitter: "...NVIDIA AI Foundations"

2023-03-12 Support of very large dataset? - 🤗 Datasets - Hugging Face Forums

[Big data? 🤗 Datasets to the rescue! - Hugging Face Course](doc:2023/03/big_data_🤗_datasets_to_the_re)
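The two 🤗 Datasets entries above deal with corpora too large to fit in memory. A minimal sketch of the streaming approach covered in the course chapter, assuming the `datasets` library; the dataset and config names are only an illustrative, streamable Hub corpus:

```python
# Sketch: iterate over a very large corpus without downloading it or loading it
# into RAM, using 🤗 Datasets streaming mode. The dataset/config are illustrative.
from datasets import load_dataset

streamed = load_dataset(
    "oscar", "unshuffled_deduplicated_en",  # any streamable Hub dataset works
    split="train",
    streaming=True,  # returns an IterableDataset: no full download, no Arrow cache
)

# Lazy, buffer-based shuffling; nothing is materialized until iteration.
shuffled = streamed.shuffle(seed=42, buffer_size=10_000)

for i, example in enumerate(shuffled.take(3)):
    print(i, example["text"][:80])
```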
2023-03-27 Mégabassines : « La débauche de moyens dépêchés par l’Etat contre les opposants contraste avec la tranquillité dont jouissent les tenants de l’agro-industrie »

2023-03-27 Using ChatGPT Plugins with LLaMA

2023-03-22 anton on Twitter: "Since ChatGPT has recently lost the ability to maintain conversations I moved over to self-hosted chatbot-ui... Everything is saved locally."

2023-03-05 Bibliothèque de MariaGambina

2023-03-15 Jim Fan on Twitter: "GPT-4 is HERE. Most important bits you need to know..." <https://twitter.com/DrJimFan/status/1635694095460102145?s=20>

2023-03-05 La Supplication : Tchernobyl, chronique du monde après l'apocalypse

2023-03-13 Improve OCR quality for receipt processing with Tesseract and Label Studio

2023-03-02 Le Seigneur des porcheries — Wikipédia

2023-03-27 [2303.14177] Scaling Expert Language Models with Unsupervised Domain Discovery
Suchin Gururangan, Margaret Li, Mike Lewis, Weijia Shi, Tim Althoff, Noah A. Smith, Luke Zettlemoyer (arXiv:2303.14177)
> a simple but effective method to asynchronously train large, sparse language models on arbitrary text corpora. Our method
>
> - clusters a corpus into sets of related documents,
> - trains a separate expert language model on each cluster,
> - and combines them in a sparse ensemble for inference.
>
> Our technique outperforms dense baselines on multiple corpora and few-shot tasks, and our analysis shows that specializing experts to meaningful clusters is key to these gains.

Abstract: Large language models are typically trained densely: all parameters are updated with respect to all inputs. This requires synchronization of billions of parameters across thousands of GPUs. We introduce a simple but effective method to asynchronously train large, sparse language models on arbitrary text corpora. Our method clusters a corpus into sets of related documents, trains a separate expert language model on each cluster, and combines them in a sparse ensemble for inference. This approach generalizes embarrassingly parallel training by automatically discovering the domains for each expert, and eliminates nearly all the communication overhead of existing sparse language models. Our technique outperforms dense baselines on multiple corpora and few-shot tasks, and our analysis shows that specializing experts to meaningful clusters is key to these gains. Performance also improves with the number of experts and size of training data, suggesting this is a highly efficient and accessible approach to training large language models.
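A toy, hedged illustration of the cluster-then-route recipe described in the entry above (not the authors' code): documents are clustered, one expert LM is imagined per cluster, and at inference experts are weighted by the query's proximity to the cluster centroids. The tf-idf + k-means choice, the placeholder experts, and the routing softmax are illustrative stand-ins for the paper's unsupervised domain discovery and sparse ensembling.

```python
# Toy sketch of cluster-then-route training/inference for sparse expert LMs.
# Everything below is a simplified stand-in, not the paper's implementation.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "documents about molecular biology and genetics",
    "documents about contract law and litigation",
    "documents about python code and software bugs",
]

# 1. Unsupervised domain discovery: cluster the corpus into sets of related documents.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# 2. Train one expert language model per cluster (embarrassingly parallel; stubbed here).
experts = {c: f"expert_lm_{c}" for c in range(kmeans.n_clusters)}  # placeholders

# 3. Sparse ensemble at inference: weight each expert by the context's closeness
#    to its cluster centroid and keep only the top-k experts.
def route(context: str, top_k: int = 1):
    q = vectorizer.transform([context]).toarray()[0]
    dists = np.linalg.norm(kmeans.cluster_centers_ - q, axis=1)
    weights = np.exp(-dists) / np.exp(-dists).sum()  # softmax over negative distance
    top = np.argsort(-weights)[:top_k]
    return [(experts[c], float(weights[c])) for c in top]

print(route("a question about contract law"))
```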
"GPT-4 for curiosity-led exploration of a concept:" ([Greg Brockman](tag:greg_brockman)) 2023-03-31T17:26:34Z > speech models with 2B parameters trained on 12 million hours of speech and 28 billion sentences of text, spanning 300+ languages. can perform automatic speech recognition (ASR) on widely-spoken languages like English and Mandarin, but also languages like Punjabi, > > We demonstrate that utilizing a large unlabeled multilingual dataset to pre-train the encoder of our model and fine-tuning on a smaller set of labeled data enables us to recognize these under-represented languages. Moreover, our model training process is effective for adapting to new languages and data. 2023-03-30T23:48:31Z Universal Speech Model 2023-03-30 Thérapies géniques : le ciseau moléculaire Crispr livre ses premiers traitements 2023-03-28T08:24:24Z 2023-03-28 2023-03-30T23:13:34Z > Très utilisé sur les cultures de maïs, le S-métolachlore se décompose en sous-produits responsables d’une vaste pollution des nappes phréatiques françaises. 2023-03-30 Le gouvernement veut revenir sur la procédure d’interdiction d’un herbicide majeur > Sans l’aiguillon américain de l’Inflation Reduction Act (IRA), un gigantesque plan de subventions en lien avec la transition énergétique, les Vingt-Sept seraient encore à disserter sur les bienfaits de la concurrence libre et non faussée L’Europe enfin mûre pour une politique industrielle 2023-03-19T18:49:27Z 2023-03-19 2023-03-21 John H. Meyer 🚀 sur Twitter : "@emerywells That's actually what I built it for👀 Context: I unfortunately lost my dad unexpectedly at the young age of 50, back in 2017. There was a lot left un-said, and a lot I wish I could've spoken to him about in my adult life.…" 2023-03-21T23:21:31Z Six Nations 2023 : Le résumé d'Angleterre vs France - YouTube 2023-03-12T23:43:03Z 2023-03-12 2023-03-16 2023-03-16T23:02:01Z Jordan Jacobs sur Twitter : "AI is eating software. Why? Traditional software never improves. AI enables ‘smart’ software to learn & improve constantly. The world runs on software and AI is changing everything. Yet few people see or understand the massive AI wave on the horizon." / Twitter « Pour la première fois peut-être, des avocats et des adversaires du nucléaire tentent de préserver ce qui les rassemble » 2023-03-12 2023-03-12T11:07:28Z > "just upload a document or add a link to your website and get a ChatGPT-like chatbot that can answer any question on it. Then add a chat widget to your website." but that's not training!!! [anton sur Twitter : "Kind of interesting seeing all of these products pop up saying “train ChatGPT on your docs or website” Technically no one can train ChatGPT on your data."](doc:2023/04/anton_sur_twitter_kind_of_in) 2023-03-28 2023-03-28T00:46:11Z Chatbase | Train ChatGPT on your data and add it to your website 2023-03-27 2023-03-27T23:15:25Z > A Transformer (vision encoder, language decoder). No OCR involved!. Pre-trained in a self-supervised fashion by predicting HTML based on masked portions of web page images. > Pix2Struct has been fine tuned on a variety of tasks and datasets, ranging from image captioning, visual question answering (VQA) over different inputs (books, charts, science diagrams), captioning UI components etc. ... We therefore advise you to use these models for the tasks they have been fine tuned on. > very similar to GPT-4's visual abilities, but open-source ;) Niels Rogge sur Twitter : "@GoogleAI's Pix2Struct now available in 🤗 Transformers!" 
2023-03-29 whitead/paper-qa: LLM Chain for answering questions from documents with citations

2023-03-20 LLM Zoo at Home: LLaMA & Alpaca | bergis universe of software, hardware and ideas

2023-03-19 Andrej Karpathy on Twitter: "Base LLMs (non-finetuned) make very strong few-shot classifiers. Describe task in English, give few examples, read off the label probabilities on test example. No gradient-based optimization necessary. It brings a cannon to a knife fight but is fast, convenient, strong baseline."

2023-03-30 Arnaques, Crimes et Botanique

2023-03-30 The Gentlemen (2019 film)
a 2019 action comedy film written, directed and produced by Guy Ritchie

2023-03-28 Physics of AI - YouTube
- Intelligence has emerged: why? how?
- Let's study this with *controlled experiments* and *toy models*
- Clean and clear insights that peer slightly behind the magic curtain

Emergence; the rise of scale. Phase transition as a function of training compute (number of parameters, dataset size) (12:25).

Transformers, the jump compared to traditional NNs: instead of operating on a single input X, a transformer operates on a set of inputs (cf. training a model to say whether an image contains two elements of the same kind). **The ability to compare elements in the input sequence enables analogies** - the essence of reasoning. (A classical NN processes one high-dimensional vector; transformers handle sets of vectors. The attention layer replaces the learned filters W of a classical NN with the other elements in the sequence; the attention module compares parts of the input against other parts of the input. "Relative machines rather than absolute". Sets are a very powerful level of abstraction.)

Emergence:
> The truth is that nobody has a clue what is going on.
>
> "**Something unknown is doing we don't know what**" (A. Eddington)

Being inspired by physics' methodology: controlled experiments and toy mathematical models.

2023-03-12 Contre le tout-plastique, le combat de la chercheuse Nathalie Gontard
> All plastics, even recycled ones, will end up as waste.

2023-03-08 [2104.07186] COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List
Luyu Gao, Zhuyun Dai, Jamie Callan (arXiv:2104.07186, 2021-04-15)

Abstract: Classical information retrieval systems such as BM25 rely on exact lexical match and carry out search efficiently with an inverted list index. Recent neural IR models shift towards soft semantic matching of all query-document terms, but they lose the computational efficiency of exact-match systems. This paper presents COIL, a contextualized exact match retrieval architecture that brings semantic lexical matching. COIL scoring is based on the contextualized representations of overlapping query-document tokens. The new architecture stores contextualized token representations in inverted lists, bringing together the efficiency of exact match and the representation power of deep language models. Our experimental results show COIL outperforms classical lexical retrievers and state-of-the-art deep LM retrievers with similar or smaller latency.
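A toy sketch (not the authors' code) of the COIL scoring rule described in the abstract above: for each query token, look only at document tokens with the same surface form, take the maximum dot product of their contextualized vectors, and sum over query tokens. The random vectors below are stand-ins for the outputs of a contextualized encoder such as BERT.

```python
# Toy COIL-style scoring: exact lexical match on token ids, semantic matching on
# contextualized vectors. Vectors are random stand-ins for encoder outputs.
import numpy as np

rng = np.random.default_rng(0)
dim = 8

def encode(tokens):
    # Stand-in for a contextualized encoder: one vector per token occurrence.
    return [(tok, rng.normal(size=dim)) for tok in tokens]

query = encode(["cheap", "flights", "paris"])
doc = encode(["flights", "to", "paris", "are", "rarely", "cheap", "in", "spring"])

score = 0.0
for q_tok, q_vec in query:
    # Inverted-list lookup: only document tokens with the same surface form count.
    candidates = [d_vec for d_tok, d_vec in doc if d_tok == q_tok]
    if candidates:  # exact lexical match required, as in COIL
        score += max(float(q_vec @ d_vec) for d_vec in candidates)

print(f"COIL-style score: {score:.3f}")
```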
2023-03-27 Alpaca Finetuning of Llama on a 24G Consumer GPU
[GitHub](https://github.com/aspctu/alpaca-lora), a fork of [tloen/alpaca-lora: Instruct-tune LLaMA on consumer hardware](doc:2023/03/tloen_alpaca_lora_instruct_tun)

2023-03-22 tloen/alpaca-lora: Instruct-tune LLaMA on consumer hardware
Uses [LoRA: Low-Rank Adaptation of Large Language Models](doc:2023/03/2106_09685_lora_low_rank_ada). See [Alpaca Finetuning of Llama on a 24G Consumer GPU](doc:2023/03/alpaca_finetuning_of_llama_on_a).

2023-03-21 [2106.09685] LoRA: Low-Rank Adaptation of Large Language Models
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen (arXiv:2106.09685, submitted 2021-06-17, revised 2021-10-16)
> freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks.
> Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times.
> unlike [adapters](tag:adapter_modules_finetuning), no additional inference latency.
> a package that facilitates the integration of LoRA with PyTorch models; implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 on [GitHub](https://github.com/microsoft/LoRA).

Abstract: An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at https://github.com/microsoft/LoRA.
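A minimal PyTorch sketch of the idea in the abstract (not Microsoft's loralib, nor the peft code alpaca-lora actually uses): a frozen linear layer with a trainable rank-r update BA, where B starts at zero so training begins from the pre-trained behavior, and a merge step that folds BA into W so inference adds no latency. The hyperparameters r and alpha are illustrative.

```python
# Minimal LoRA-style linear layer (illustrative sketch, not loralib):
# y = W x + (alpha / r) * B A x, with W frozen, A small random init, B = 0.
# Merging W <- W + (alpha / r) * B A yields a plain linear layer again,
# hence no extra inference latency, matching the claim in the abstract.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # freeze pre-trained weights
        self.base.bias.requires_grad_(False)
        self.scaling = alpha / r
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)  # trainable
        self.B = nn.Parameter(torch.zeros(out_features, r))        # zero init

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.A.T) @ self.B.T

    @torch.no_grad()
    def merge(self):
        # Fold the low-rank update into the frozen weight for latency-free inference.
        self.base.weight += self.scaling * (self.B @ self.A)

layer = LoRALinear(768, 768, r=8, alpha=16)
x = torch.randn(2, 768)
before = layer(x)
layer.merge()
after = layer.base(x)  # plain nn.Linear after merging
print(torch.allclose(before, after, atol=1e-5))  # True: same outputs, no extra path
```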