teslaenergy
@polboepolboepfnf
The Human Eval Leaderboard Is Gamed! Please Stop Using It 🙏 3.5 Sonnet is MUCH BETTER than GPT-4o-mini. A simple vibe check will confirm it. Any leaderboard that says otherwise is gamed and doesn't work! It's misinformation to claim that Mini is better than 3.5. Similar https://t.co/STzybWfxrL