๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ pfp
๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ
@gm8xx8
OpenAI's o1 update enhances reasoning through reinforcement learning, enabling step-by-step problem-solving similar to human thought. The longer it "thinks," the better it performs, introducing a new scaling paradigm beyond pretraining. Rather than relying solely on prompting, o1's chain-of-thought reasoning improves with adaptive compute, which can be scaled at inference time.
- o1 outperforms GPT-4o in reasoning, ranking in the 89th percentile on Codeforces.
- It uses chain-of-thought to break down problems, correct errors, and adapt, though some specifics remain unclear.
- It excels in areas like data analysis, coding, and math.
- o1-preview and o1-mini are available now, with evals showing it's not just a one-off improvement. Trusted API users will have access soon.
- Results on AIME and GPQA are strong, with o1 showing significant improvement on complex prompts where GPT-4o struggles.
- The system card (https://openai.com/index/openai-o1-system-card/) showcases o1's best capabilities.
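The "adaptive compute at inference time" idea can be illustrated with a toy self-consistency sketch. This is my own illustration, not OpenAI's actual method: a stub "model" that is right 60% of the time stands in for one sampled reasoning chain, and spending more compute means sampling more chains and majority-voting their answers.

```python
# Toy inference-time scaling: more sampled chains -> higher accuracy.
# The stub sampler, the 60% accuracy rate, and the answer "42" are all
# illustrative assumptions, not anything from the o1 release.
import random
from collections import Counter

def sample_chain(rng: random.Random) -> str:
    """One sampled 'chain of thought': correct 60% of the time, else a random digit."""
    return "42" if rng.random() < 0.6 else str(rng.randint(0, 9))

def answer_with_budget(n_chains: int, seed: int = 0) -> str:
    """Spend an inference budget of n_chains samples, then majority-vote."""
    rng = random.Random(seed)
    votes = Counter(sample_chain(rng) for _ in range(n_chains))
    return votes.most_common(1)[0][0]
```

With this sampler, a single chain is right only about 60% of the time, while voting over a few dozen chains is almost always right, because the wrong answers scatter while the right one accumulates — the same accuracy-for-compute tradeoff the cast describes, in miniature.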
5 replies
6 recasts
20 reactions

๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ pfp
๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ
@gm8xx8
1 reply
0 recast
3 reactions

Jack pfp
Jack
@jackten
Have you tried it yet?
1 reply
0 recast
0 reaction

Stephan pfp
Stephan
@stephancill
can you explain what they mean by test-time and inference-time compute?
2 replies
0 recast
0 reaction

Stephan pfp
Stephan
@stephancill
I'm supporting you through /microsub! 855 $DEGEN (Please mute the keyword "ms!t" if you prefer not to see these casts.)
0 reply
0 recast
0 reaction

Danny pfp
Danny
@mad-scientist
This is both amazing and expected. Thinking "slow" and reasoning about how to solve a problem have been discussed for a while. But this is also a fundamental challenge for OpenAI. So far, the heavy lifting was the training; inference, while not completely light, was the reward part. Now inference becomes heavy as well, and open-source models are probably not that far behind.
0 reply
0 recast
0 reaction