Laurençon, H, Saulnier, L, Wang, T, Akiki, C, del Moral, AV, Le Scao, T, von Werra, L, Mou, C, Ponferrada, EG, Nguyen, H, Frohberg, J, Šaško, M, Lhoest, Q, McMillan-Major, A, Dupont, G, Biderman, S, Rogers, A, Ben Allal, L, De Toni, F, Pistilli, G, Nguyen, O, Nikpoor, S, Masoud, M, Colombo, P, de la Rosa, J, Villegas, P, Thrush, T, Longpre, S, Nagel, S, Weber, L, Muñoz, MR, Zhu, J, van Strien, D, Alyafeai, Z, Almubarak, K, Chien, VM, Gonzalez-Dios, I, Soroa, A, Lo, K, Dey, M, Suarez, PO, Gokaslan, A, Bose, S, Adelani, DI, Phan, L, Tran, H, Yu, I, Pai, S, Chim, J, Lepercq, V, Ilić, S, Mitchell, M, Luccioni, S & Jernite, Y 2022,
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset. in S Koyejo, S Mohamed, A Agarwal, D Belgrave, K Cho & A Oh (eds),
Advances in Neural Information Processing Systems 35 - 36th Conference on Neural Information Processing Systems, NeurIPS 2022. Advances in Neural Information Processing Systems, vol. 35, Neural Information Processing Systems Foundation, 36th Conference on Neural Information Processing Systems, NeurIPS 2022, New Orleans, United States,
28/11/22.