hellno the optimist
@hellno.eth
claude 3.7 got 60% without reasoning on the aider benchmark. same as o3-mini with high reasoning ¯\_(ツ)_/¯ https://aider.chat/docs/leaderboards/
Dan Romero
@dwr.eth
so base model is better and improves with reasoning?
hellno the optimist
@hellno.eth
yes. it jumped to the level of reasoning models (R1, o1) without reasoning activated. claude 3.5: 52% → claude 3.7 no thinking: 60% → claude 3.7 with thinking: ??? might be the first model to get 70%. should be live in aider tomorrow