𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
Claude 3.5 Sonnet solves 64% of problems on Anthropic’s agentic coding/reasoning benchmark, outperforming Claude 3 Opus’s 38%. The new Artifacts feature also enables generating and iterating on code snippets and text documents within the same window. These benchmark results look amazing and surpass all GPT-4 variants. Congrats to Anthropic, the 3.5 series looks even better than Opus. ✔️ (release and benchmarks below) https://www.anthropic.com/news/claude-3-5-sonnet
2 replies
2 recasts
22 reactions

aron
@aron
It’s super nerfed though. I was trying some stuff that works on GPT-4o, but it didn’t work at all here.
0 reply
0 recast
1 reaction

ben
@ben-
but this is reality: opus > gpt4o >≈ sonnet 3.5
0 reply
0 recast
0 reaction