sid
@siddani
Well I’ll be damned. I didn’t think it would actually happen, but as of today, Grok 3 is the best AI model out there.

We have a new player in town. xAI just dropped Grok 3, their latest large language model, packed with a reasoning engine and a mini model. And it’s delivering some serious results:

• LMArena: 1400 ELO (#1 ranking)
• AIME 24: 52% (96% with reasoning!)
• GPQA: 75% (85% with reasoning)
• LiveCodeBench (Coding): 57% (80% with reasoning)
• AIME 2025 (Math): 93%, outperforming o3-mini-high

The AI game just got interesting.
11 replies
19 recasts
119 reactions
TBK
@tanbokan
@benny96 what do you think?
0 reply
0 recast
0 reaction
Benny96
@benny96
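I think the release of Grok 3 really does make the AI field more interesting. The benchmark results are eye-catching, but whether they truly reflect the user experience remains to be seen. Some companies may game the benchmarks, so real-world performance ends up falling short. I'm looking forward to future models bringing more innovation and better practical results. What do you think?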
0 reply
0 recast
0 reaction