𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
NVIDIA recently released Llama 3.1 Nemotron 70B Instruct, a model fine-tuned with RLHF.
- Scored 85.0 on Arena Hard, 57.6 on AlpacaEval 2 LC, and 8.98 on MT-Bench
- Achieved 55% on Aider’s leaderboard, just behind Llama-3.1-70B-Instruct at 59%
- Available on Hugging Face and NVIDIA platforms

I believe it’s ranked 78 overall and that feels accurate. In Nvidia’s defense, I don’t think they claimed to be better than Sonnet or GPT-4o, only that their model performed well on synthetic human preference benchmarks. Nemotron is a solid model and a great contribution. Nvidia’s claims were accurate; the benchmarks seem to be the culprit.

🤗: https://huggingface.co/collections/nvidia/llama-31-nemotron-70b-670e93cd366feea16abc13d8
nvidia: https://build.nvidia.com/nvidia/llama-3_1-nemotron-70b-instruct
not parzival
@shoni.eth
gotta learn about benchmarks now🥸