AI
attention is all you need
๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ pfp
rStar-Math shows SLMs can rival or surpass OpenAI o1 in math reasoning w/out distillation from larger models, using MCTS and three keys factors: 1. Code-Augmented CoT Synthesis: MCTS generates verified reasoning data to train policy SLMs. 2. Enhanced PRM: A novel training approach avoids naรฏve annotations, yielding a stronger process preference model (PPM). 3. Self-Evolution Framework: Four rounds of self-evolution refine reasoning with millions of synthesized solutions for 747k problems. Performance Highlights: > Achieves 90.0% on MATH, improving Qwen2.5-Math-7B by +31.2% and surpassing OpenAI o1-preview by +4.5%. > Boosts Phi3-mini-3.8B from 41.4% to 86.4%. > Solves 53.3% of AIME problems, ranking in the top 20% of high school competitors. donโ€™t sleep on small models. https://arxiv.org/abs/2501.04519
1 reply
0 recast
12 reactions
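The core loop the post describes, MCTS searching over reasoning steps guided by a process reward, can be illustrated with a toy sketch. To be clear, this is not the rStar-Math code: the `Node`, `ucb`, and `mcts` names, the additive "reasoning steps," and the closeness-to-target reward are all invented for illustration; the paper's actual pipeline uses an LLM policy to propose code-verified steps and a trained PPM to score them.

```python
import math

class Node:
    """A search-tree node holding a partial reasoning trace (tuple of steps)."""
    def __init__(self, state, parent=None):
        self.state = state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0  # sum of process rewards backed up through this node

def ucb(node, c=1.4):
    """UCB1 score: exploit average reward, explore rarely visited children."""
    if node.visits == 0:
        return float("inf")  # unvisited children are tried first
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def mcts(root_state, expand_fn, reward_fn, n_sims=200, max_depth=5):
    """Select by UCB, evaluate the leaf with a process reward, expand,
    backpropagate; return the best-scoring trace found."""
    root = Node(root_state)
    best = (float("-inf"), root_state)
    for _ in range(n_sims):
        # 1. Selection: descend by UCB until a childless node.
        node = root
        while node.children:
            node = max(node.children, key=ucb)
        # 2. Evaluation: score the partial trace (stand-in for a PPM).
        r = reward_fn(node.state)
        if r > best[0]:
            best = (r, node.state)
        # 3. Expansion: add one child per candidate next step.
        if len(node.state) < max_depth:
            node.children = [Node(node.state + (s,), node)
                             for s in expand_fn(node.state)]
        # 4. Backpropagation: update stats up to the root.
        while node:
            node.visits += 1
            node.value += r
            node = node.parent
    return best[1]

# Toy problem: reach a sum of 10 using steps of +1 or +3; the reward is
# negative distance to the target, so exact solutions score 0.
trace = mcts((), lambda s: (1, 3), lambda s: -abs(10 - sum(s)))
```

The design point the sketch preserves is that the reward is computed on *partial* traces, so good intermediate steps are reinforced before a full solution exists, which is what distinguishes a process reward from an outcome-only reward.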