𝚐𝔪𝟾𝚡𝚡𝟾 pfp
𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
OpenAI’s o1 update enhances reasoning through reinforcement learning, enabling step-by-step problem-solving similar to human thought. The longer it “thinks,” the better it performs, which introduces a new scaling paradigm beyond pretraining: rather than relying solely on prompting, o1’s chain-of-thought reasoning improves with adaptive compute, which can be scaled at inference time.
- o1 outperforms GPT-4o in reasoning, ranking in the 89th percentile on Codeforces.
- It uses chain-of-thought to break down problems, correct errors, and adapt, though some specifics remain undisclosed.
- It excels in areas like data analysis, coding, and math.
- The o1-preview and o1-mini models are available now, with evals showing this is not a one-off improvement. Trusted API users will have access soon.
- Results on AIME and GPQA are strong, and o1 shows significant improvement on complex prompts where GPT-4o struggles.
- The system card (https://openai.com/index/openai-o1-system-card/) showcases o1’s best capabilities.
5 replies
6 recasts
20 reactions
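The “adaptive compute scaled at inference time” idea above can be illustrated with a toy self-consistency sketch. This is an assumption-laden illustration, not OpenAI’s actual (unpublished) method: we simulate a model whose individual reasoning chains are only 60% accurate, then spend more inference-time compute by drawing more chains per question and majority-voting the final answer.

```python
import random
from collections import Counter

random.seed(0)

def sample_answer(p_correct: float = 0.6) -> str:
    """One simulated reasoning chain: returns the correct answer with
    probability p_correct, otherwise one of several wrong answers.
    (Hypothetical stand-in for a model's sampled chain-of-thought.)"""
    if random.random() < p_correct:
        return "42"
    return random.choice(["41", "43", "44"])

def majority_vote(n_samples: int) -> str:
    """Spend more inference-time compute: draw n_samples independent
    chains and return the most common final answer (self-consistency)."""
    votes = Counter(sample_answer() for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def accuracy(n_samples: int, trials: int = 2000) -> float:
    """Estimate accuracy of n-sample majority voting over many trials."""
    return sum(majority_vote(n_samples) == "42" for _ in range(trials)) / trials

for n in (1, 5, 25):
    print(f"{n:>2} samples -> accuracy {accuracy(n):.2f}")
```

Accuracy climbs as the sample budget grows, which is the basic shape of the test-time scaling curve the cast refers to: the same fixed model performs better when allowed to spend more compute per question.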

Stephan pfp
Stephan
@stephancill
can you explain what they mean by test-time and inference-time compute?
2 replies
0 recast
0 reaction

Ξric Juta  pfp
Ξric Juta
@ericjuta
@askgina.eth 2c
1 reply
0 recast
1 reaction

Jorge Pablo Franetovic 🎩 pfp
Jorge Pablo Franetovic 🎩
@jpfraneto.eth
@askgina.eth what does 2c mean in this context? think before you reply plz
0 reply
0 recast
0 reaction