Venkatesh Rao ☀️ on Warpcast

Venkatesh Rao ☀️

Yet another generation learns about the Jevons paradox this week. Ask DeepSeek to explain it to you if you don’t get it. The CUDA moat is down for the count, but AI hardware prospects have never been more bullish. Nvidia will peak later than people expect, but they’re no longer synonymous with the sector. https://en.wikipedia.org/wiki/Jevons_paradox

will

wait what's the connection between the deepseek stuff and cuda specifically?

Venkatesh Rao ☀️

They apparently did a lot of low-level kernel optimizations and hacks bypassing CUDA to get their high-efficiency results. Basically bare metal. And in the process showed that the overhead of using CUDA may be a huge performance drag on what the hardware can actually do. CUDA is a code swamp everyone I know hates and tries to work around. OpenAI's Triton is basically an imperfect wrapper/partial workaround. It is a moat because Nvidia throws a lot of labor at it, not because it is sound. 70% confidence opinion that I may revise based on what more knowledgeable people conclude.

will

oh interesting. Know they employed a bunch of tricks, wasn't aware / haven't seen info re bypassing cuda. Will keep an eye out to learn more (asked chatgpt but it didn't have anything to say about deepseek & cuda fwiw)

Venkatesh Rao ☀️

Ben Thompson has the highlight factoid but there’s more detailed stuff. PTX is basically bare metal. This is kinda well-known. Within the CUDA paradigm big matmuls are efficient but small ones are not. It’s like starting and stopping a car factory to make 1 part. Lots of attempts to make general and flexible high-utilization use of the “matmul factory” (look up dataflow architecture). The general problem remains unsolved, but for specific problems you can get very far with kernel-level optimizations to suit the problem. https://stratechery.com/2025/deepseek-faq/
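
The "car factory" point about big versus small matmuls can be sketched with a toy cost model: assume every kernel launch pays a fixed overhead before any math runs, then see what fraction of time goes to useful work. The overhead and peak-throughput constants below are illustrative assumptions, not measurements of any real GPU, and the function names are hypothetical.

```python
# Toy cost model for the "car factory" analogy.
# ASSUMPTION: both constants are made-up illustrative values, not measurements.
LAUNCH_OVERHEAD_S = 5e-6   # assumed fixed cost per kernel launch, in seconds
PEAK_FLOPS = 100e12        # assumed peak throughput, in FLOP/s


def matmul_flops(n: int) -> float:
    """FLOPs for an (n x n) @ (n x n) multiply: one multiply and one add per term."""
    return 2.0 * n ** 3


def utilization(n: int, count: int = 1, launches: int = 1) -> float:
    """Fraction of wall time spent on math when `count` matmuls share `launches` launches."""
    compute_s = count * matmul_flops(n) / PEAK_FLOPS
    overhead_s = launches * LAUNCH_OVERHEAD_S
    return compute_s / (compute_s + overhead_s)


if __name__ == "__main__":
    # One launch per matmul: small sizes are dominated by the fixed overhead.
    for n in (64, 256, 1024, 4096):
        print(f"n={n:5d}  utilization={utilization(n):.4f}")

    # 1000 small matmuls: one launch each vs. all packed into a single launch.
    print("separate:", utilization(64, count=1000, launches=1000))
    print("packed:  ", utilization(64, count=1000, launches=1))
```

Under these assumptions the small case is almost all overhead, and packing many small multiplies into one launch recovers a lot of it. That is one simple flavor of problem-specific kernel-level optimization, not a claim about what DeepSeek actually did.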
