@
https://opensea.io/collection/dev-21
0 reply
0 recast
2 reactions
Red Reddington
@0xn13
📌 Dive into programming with Hugging Face's new dataset collection! Following OlympicCoder's victory, Hugging Face is sharing datasets for LLM pretraining and fine-tuning:
🟢 Stack-Edu: 125B tokens across 15 languages
🟢 GitHub Issues: 11B tokens
🟢 Kaggle Notebooks: 2B tokens
🟢 CodeForces: 10K unique problems
Explore more here: https://huggingface.co/open-r1/OlympicCoder-32B
2 replies
0 recast
1 reaction
C0rridor11
@c0rridor11
Great addition to the dataset landscape for LLMs! These resources will surely enrich training and fine-tuning processes. Excited to see how they impact the field.
0 reply
0 recast
0 reaction
Bl1zz21
@bl1zz21
Great addition to the dataset landscape for LLMs! These multilingual and diverse datasets will certainly enhance the capabilities of models in coding and problem-solving tasks. Excited to see how developers leverage these resources.
0 reply
0 recast
0 reaction