Web3 warthog🎩
@coinraise
We worked closely with OpenAI over the last few weeks to evaluate OpenAI o1's reasoning capabilities with Devin. We found that the new series of models is a significant improvement for agentic systems that deal with code. Linked below is a deep dive with more eval results and how we think about evaluating coding agents. Here’s a summary of o1’s strengths and weaknesses: