Artificial Intelligence (AI)

This is pretty interesting. DeepSeek, a model that is outperforming other LLMs at a small size, seems to have been trained on the output of frontier models like GPT-4, which is against OpenAI's terms of service (TOS).

That would explain how they trained such a small, performant model with fewer resources than other labs.

@mathemagic1an LOL I'm coming around to your theory https://t.co/gjKramDeBy

This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while claiming to be DeepSeekV3 only 3 times.

Gives you a rough idea of some of their training data distribution. https://t.co/ptIByn0lcv
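
For anyone who wants to repeat that count, here is a minimal sketch of the tally, not the method used in the original post. It assumes you have already collected the model's answers to a "who are you?" style prompt as plain strings. The keyword patterns, the function names `classify` and `tally`, and the placeholder responses are all hypothetical; only the 5-versus-3 split comes from the post above.

```python
from collections import Counter

# Assumed keyword patterns for spotting a claimed identity (hypothetical,
# not taken from the original post). Checked in order, so a response that
# mentions both names is counted under the first matching label.
PATTERNS = {
    "ChatGPT": ("chatgpt", "gpt-4", "openai"),
    "DeepSeek": ("deepseek",),
}


def classify(response: str) -> str:
    """Return the first label whose keywords appear in the response."""
    text = response.lower()
    for label, needles in PATTERNS.items():
        if any(needle in text for needle in needles):
            return label
    return "other"


def tally(responses):
    """Count how often each identity is claimed across all responses."""
    return Counter(classify(r) for r in responses)


# Placeholder strings standing in for real generations; they only mirror
# the 5-of-8 versus 3-of-8 split reported in the post.
responses = (
    ["I am ChatGPT, a language model developed by OpenAI."] * 5
    + ["I am DeepSeek-V3."] * 3
)

counts = tally(responses)
total = sum(counts.values())
for label, n in counts.most_common():
    print(f"{label}: {n}/{total} ({n / total:.1%})")
```

With only 8 samples the proportions are noisy, so a larger batch of generations would be needed before reading much into the exact ratio.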

Lucas Beyer (bl16)

https://x.com/giffmana/status/1872586401436627211

synthetic data training is pretty common now but i’m curious if the TOS violation yields