Web3 warthog🎩 on Warpcast

We worked closely with OpenAI over the last few weeks to evaluate OpenAI o1's reasoning capabilities with Devin. We found that the new series of models is a significant improvement for agentic systems that deal with code. Linked below is a deep dive with more eval results and how we think about evaluating coding agents. Here’s a summary of o1’s strengths and weaknesses:
