Semanlink - Zeyuan Allen-Zhu on X: "surprisingly, when pre-training good data (e.g., Wiki) together with "junks" (e.g., Common Crawl), LLM's capacity on good data may decrease by 20x times!"

Zeyuan Allen-Zhu on X: "surprisingly, when pre-training good data (e.g., Wiki) together with "junks" (e.g., Common Crawl), LLM's capacity on good data may decrease by 20x times!"

Tags:

About this document

sl:bookmarkOf : https://twitter.com/ZeyuanAllenZhu/status/1777513028466188404
sl:creationDate : 2024-04-10
sl:creationTime : 2024-04-10T18:33:05Z

File info

Bookmark of: https://twitter.com/ZeyuanAllenZhu/status/1777513028466188404