blobs
@blobs
i'm figuring out how to train an LLM and documenting each step in this blog series. for anyone who is also curious, here is part 1: https://michaelhly.com/posts/train-llm-one
4 replies
5 recasts
25 reactions

vincent
@pixel
blobs, are you doing AI for blobs (the product) or is it exploration? also: what limit should i reach before i consider tuning Llama? what does it look like?
1 reply
0 recast
0 reaction

blobs
@blobs
a) just exploration b) i'm not sure what you mean by limit, but hugging face has pretrained llamas that you can grab off the shelf: https://huggingface.co/meta-llama ... my belief is that you should tune a model to find a fit for your dataset ... otherwise you probably don't need to ...
1 reply
0 recast
0 reaction
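
To make "grab a pretrained llama off the shelf" concrete, here is a minimal python sketch using the Hugging Face transformers library. The checkpoint name and prompt are illustrative picks; the meta-llama repos are gated, so this assumes you have accepted Meta's license, logged in with a Hugging Face token, and installed transformers, torch, and accelerate.

# minimal sketch: load an off-the-shelf pretrained Llama and sample from it
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # illustrative pick from huggingface.co/meta-llama

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# run a quick generation to sanity-check that the weights loaded
inputs = tokenizer("the first step in training an LLM is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))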

vincent
@pixel
limit means: gpt4 is pretty great, so when should i consider having my own models? i want to play with it, but i don't have a hair-on-fire problem to solve with a custom llm, so no strong motivation
1 reply
0 recast
0 reaction

blobs
@blobs
yes. so this is the hardest part: defining the output/results you want the model to tune towards. for generative use cases, it's hard to beat openai. i believe people usually tune for niche/task-specific use cases.
2 replies
0 recast
0 reaction
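
blobs' point about defining the output/results first can be made concrete with a tiny python sketch: before any tuning, write down the exact prompt/completion pairs you want the model to produce. The topic-tagging task, the example casts, and the file name below are made up purely for illustration.

# minimal sketch of "defining the output you want the model to tune towards":
# a handful of prompt -> completion pairs for a niche, task-specific use case.
# the topic-tagging task and examples are hypothetical.
import json

examples = [
    {"prompt": "cast: shipped a new onchain game this weekend\ntopic:", "completion": " gaming"},
    {"prompt": "cast: part 1 of my series on training an LLM\ntopic:", "completion": " machine learning"},
    {"prompt": "cast: gm, coffee then code\ntopic:", "completion": " daily life"},
]

# jsonl is the format most fine-tuning pipelines accept; hold a few pairs out
# as an eval set so you can check whether a tuned model actually produces the
# outputs you defined.
with open("task_dataset.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")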

blobs
@blobs
for example: the pretrained model was trained on X corpus, but my dataset has some variance, and i want to tune the model to adjust for that variance
0 reply
0 recast
0 reaction
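
A rough python sketch of that "adjust for the variance" case, using parameter-efficient fine-tuning so it fits on a single GPU: the base model stays frozen and only small LoRA adapters are trained on your own data. This is one common approach, not necessarily what the blog series uses. It assumes the transformers, peft, and datasets libraries, and my_corpus.jsonl is a placeholder for whatever dataset differs from the base model's training corpus (one {"text": ...} object per line).

# minimal sketch: LoRA fine-tune of a pretrained Llama on your own corpus
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # llama tokenizers ship without a pad token

model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
# freeze the base weights; train only low-rank adapters on the attention projections
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16,
                                         target_modules=["q_proj", "v_proj"],
                                         task_type="CAUSAL_LM"))

# your data: the slice that differs from what the base model was trained on
dataset = load_dataset("json", data_files="my_corpus.jsonl", split="train")
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-tuned", num_train_epochs=1,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model("llama-tuned")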