https://warpcast.com/~/channel/pat
Pat Yiu
@patyiutazza
DeepSeek is better because it's trained on Chinese, where a single character can have 6+ meanings. A "better" language is more efficient. There's been a massive drop in demand for H100/H200s compared to 4090/5090s over the past 30 days.
gami.eth ⎈
@gami.eth
but that’s not how transformers work. don’t they find context in the interdependence of sentences? how would they do that within a single character?