Antidote
@0xantidote.eth
I wrote an article about decentralized training on my blog. Read the full article here (treat the summary thread as a teaser) 👇 https://www.antidoteblog.com/post/decentralized-training-the-1000x-opportunity-and-networks-competing-with-centralized-players
1 reply
2 recasts
9 reactions
Antidote
@0xantidote.eth
(1/6) ⭐ What is decentralized training: using permissionless, global clusters to (pre-)train models instead of single compute hubs owned by central entities like OpenAI or Anthropic. A worldwide network of people creating frontier models, similar to Bitcoin mining. However, this remains a huge engineering problem. I believe whoever cracks it can create an OpenAI-like outcome. (And also create substantial financial wealth for the network participants.)
1 reply
0 recast
1 reaction
Antidote
@0xantidote.eth
(2/6) 💸 Why does this opportunity exist? Centralized training (Grok 3, 100k-200k H100s) costs 9 figures and runs into physical limits: space and energy. At the same time, there is a potentially unlimited amount of decentralized compute a network could tap into. Decentralized networks can compete with centralized players when the distributed resources, discounted by the cost of coordination, match the centralized resources. Is this the case for decentralized training? 🤔
1 reply
0 recast
1 reaction
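A rough way to read the competitiveness condition in (2/6): the decentralized side is viable only if its aggregate compute, discounted by coordination overhead, still matches the centralized cluster. The sketch below is purely illustrative; the 30% overhead and the GPU-equivalent figures are made-up assumptions, not numbers from the article.

```python
# A minimal sketch of the condition in (2/6); all numbers are hypothetical.

def is_competitive(distributed_compute: float,
                   coordination_overhead: float,
                   centralized_compute: float) -> bool:
    """True when distributed compute, discounted by the cost of coordination,
    matches or exceeds the centralized cluster's compute."""
    effective = distributed_compute * (1.0 - coordination_overhead)
    return effective >= centralized_compute

# Hypothetical example: 250k H100-equivalents spread worldwide, 30% lost to
# coordination, vs. a 150k-H100 centralized hub (figures are illustrative).
print(is_competitive(250_000, 0.3, 150_000))  # 175k effective >= 150k -> True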
Antidote
@0xantidote.eth
(3/6) ⛓︎ Blockchain networks can contribute to the coordination of decentralized resources, as seen with DePIN networks or Bitcoin. The other hurdle is the engineering problems introduced by training on decentralized infrastructure: current centralized training relies on high-speed intra-cluster communication, which a decentralized network cannot replicate.
1 reply
0 recast
1 reaction
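To make the communication hurdle in (3/6) concrete, here is a back-of-envelope estimate of how long one naive full-gradient sync would take over a home broadband link versus a datacenter interconnect. The model size and link speeds are illustrative assumptions, not figures from the article.

```python
# Back-of-envelope: time to move one full gradient copy per training step.
# Model size and link speeds below are illustrative assumptions.

PARAMS = 70e9          # hypothetical 70B-parameter model
BYTES_PER_PARAM = 2    # fp16/bf16 gradients
payload_bits = PARAMS * BYTES_PER_PARAM * 8

def sync_seconds(bandwidth_gbit_per_s: float) -> float:
    """Seconds to transfer one full gradient copy over the given link."""
    return payload_bits / (bandwidth_gbit_per_s * 1e9)

print(f"Datacenter interconnect (400 Gb/s): {sync_seconds(400):8.1f} s")
print(f"Home broadband          (0.1 Gb/s): {sync_seconds(0.1):8.0f} s")
```

In this toy setup the same payload that moves in seconds inside a datacenter takes hours over consumer links, which is why the naive approach does not transfer to decentralized infrastructure.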
Antidote
@0xantidote.eth
(4/6) ⚡ Some say it is not possible to train frontier models on decentralized infrastructure, but there are approaches, which mainly fall into two categories (both still in their infancy). Data parallelism: split the data across clusters, have each cluster train on its own split, then merge the resulting models. Model parallelism: slice the model into chunks, have each cluster compute a part of the model.
1 reply
0 recast
0 reaction
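A minimal sketch of the data-parallel "merge the models" idea from (4/6): each cluster trains its own copy on its data shard, then the parameters are averaged, in the spirit of federated averaging. Real approaches (e.g. Nous's DisTrO, mentioned below) are far more communication-efficient; the numpy code is just the naive version, and the tiny "models" are hypothetical.

```python
import numpy as np

# Naive data-parallel merge: each cluster trains a copy of the model on its
# own data shard, then the parameter sets are averaged. Real decentralized
# training schemes are far more communication-efficient; this only
# illustrates the basic idea from (4/6).

def merge_models(models: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Average each named parameter tensor across all cluster replicas."""
    return {
        name: np.mean([m[name] for m in models], axis=0)
        for name in models[0]
    }

# Two hypothetical cluster replicas of a tiny "model" with one weight matrix.
cluster_a = {"w": np.array([[1.0, 2.0], [3.0, 4.0]])}
cluster_b = {"w": np.array([[3.0, 2.0], [1.0, 0.0]])}
print(merge_models([cluster_a, cluster_b])["w"])  # element-wise mean
```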
Antidote
@0xantidote.eth
(5/6) 🤼 Who are the key players to watch? (Check the article for a detailed table) Nous Research (DisTrO), Gensyn, Prime Intellect, Pluralis Research, Bittensor subnets
1 reply
0 recast
1 reaction