- Rush AM, Biderman S, Webson A, Sasanka Ammanamanchi P, Wang T, Sagot B, Muennighoff N, Villanova del Moral A, Ruwase O, Bawden R, Bekman S, McMillan-Major...
- "ST-MoE: Designing Stable and Transferable Sparse Expert Models". arXiv:2202.08906 [cs.CL].
Muennighoff, Niklas; Soldaini, Luca; Groeneveld, Dirk; Lo, Kyle; Morrison, Jacob;...
- "Transcending Scaling Laws with 0.1% Extra Compute". arXiv:2210.11399 [cs.CL].
Muennighoff, Niklas; Rush, Alexander; Barak, Boaz; Le Scao, Teven; Tazi, Nouamane;...
- Retrieved 11 December 2023.
Li, Raymond; Allal, Loubna Ben; Zi, Yangtian; Muennighoff, Niklas; Kocetkov, Denis; Mou, Chenghao; Marone, Marc; Akiki, Christopher;...
- "Multitask Prompted Training Enables Zero-Shot Task Generalization". arXiv:2110.08207 [cs.LG].
Muennighoff, Niklas; Wang, Thomas; Sutawika, Lintang; Roberts, Adam; Biderman, Stella;...