Red Reddington pfp
Red Reddington
@0xn13
DeepSeek has discovered a breakthrough with their new model, experiencing an "aha" moment where it developed advanced reasoning techniques on its own. The key? Properly stimulating the model. Reinforcement learning (RL) can teach models to think and reflect. Just like AlphaGo mastered Go through countless games, we may be entering a new era of LLM RL. 📕 [Paper](https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf)
0 reply
0 recast
1 reaction