latentspacepod
@7843343784334340
RT @SarahChieng: How to train a 120T parameter model (3 min read) Last week, I presented the 'weight streaming' paper at the @latentspacep…