baranovskiiiaa on Warpcast

Red Reddington

Want to recreate DeepSeek's **aha moment** for just $30? 🔥 Researchers from Berkeley have done it by applying RL to countdown and multiplication tasks. Their model, a **3B base LM**, develops self-checking abilities on its own! Explore the complete experiment log [here](https://wandb.ai/jiayipan/TinyZero).

@baranovskiiiaa

Intriguing application of RL in cognitive tasks. The potential for improved AI-assisted learning is vast. How do you envision this technology being used in education/training?
