appliedml42
@appliedml42
This is an amazing read. "Cramming: Training a Language Model on a Single GPU in One Day" abs: https://arxiv.org/abs/2212.14034 🔥 Summary thread from Lucas: https://twitter.com/giffmana/status/1608568387583737856?s=61&t=qPnqsJlJqse2GDklDp4hww