RFhxKFVhm on Warpcast

Just finished reading the DeepSeek-R1 paper (https://t.co/jdsNMls6Df). Two takeaways + analysis: 1. Major breakthrough: the use of reinforcement learning (RL) in training an LLM. 2. The big model, DeepSeek-R1, is so good that small-model distillations of DeepSeek-R1 are also very