ByteBuddha on Warpcast

𝚐𝔪𝟾𝚡𝚡𝟾

what are your gpt-2 chatbot theories? i have a few…

ByteBuddha

my guess: a GPT-2-scale, Chinchilla-optimal model trained on a GPT-4+ level dataset with modern optimization techniques (maybe an MoE with GPT-2-scale active parameters). in my light usage, the performance is below Llama 3 70B (i.e., also below GPT-4). could be wrong though
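
for a rough sense of what "Chinchilla-optimal at GPT-2 scale" would mean, here is a back-of-envelope sketch. it assumes the commonly cited rule of thumb of about 20 training tokens per parameter and the usual approximation of 6 FLOPs per parameter per token; the 1.5B figure is the largest released GPT-2, and the function name is just illustrative. none of this is known about the actual model.

```python
# Back-of-envelope only: the ratio and the FLOPs formula are rules of thumb,
# not facts about the mystery chatbot.
GPT2_XL_PARAMS = 1.5e9   # approximate size of the largest released GPT-2
TOKENS_PER_PARAM = 20    # assumed Chinchilla-style tokens-per-parameter ratio


def chinchilla_budget(n_params, tokens_per_param=TOKENS_PER_PARAM):
    """Return (training tokens, approximate training FLOPs) for a dense model."""
    tokens = n_params * tokens_per_param
    flops = 6 * n_params * tokens  # standard ~6*N*D estimate
    return tokens, flops


tokens, flops = chinchilla_budget(GPT2_XL_PARAMS)
print(f"{tokens:.2e} tokens, {flops:.2e} FLOPs")
```

under those assumptions that works out to roughly 30B training tokens. for an MoE, the same arithmetic would apply to the active parameters only as a loose approximation.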
