Dan Romero
@dwr.eth
Let's say you have a corpus of text — 10 million words — about a specific topic. 1. What's the best way to "train a model" on that text? 2. Is that even the right term? Or is it using an existing foundational model and then augmenting it? Fine-tuning it? Something else?

beeboop
@beeboop.eth
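Training is correct, it's an umbrella term. Fine-tuning means further training an existing foundation model on your own examples; in practice it mostly shapes the "voice" of the LLM, i.e. its style, tone and output format, and is less reliable for teaching it new facts. RAG or "Retrieval-Augmented Generation" leaves the model itself unchanged: at query time it retrieves the relevant passages from your corpus and adds them to the prompt, so the model answers from your data. You likely want RAG + Fine-Tuning to achieve your goal.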
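
To make the RAG half concrete, here is a rough sketch of the retrieval step, not a production recipe. It assumes the corpus is already split into short text chunks, and it swaps in plain word-count cosine similarity where a real setup would more likely use embeddings and a vector store. The names `tokenize`, `cosine`, `retrieve` and `build_prompt` are made up for illustration; the resulting prompt would be sent to whatever LLM you use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query.

    Assumption: word overlap stands in for the embedding search
    a real RAG pipeline would normally use.
    """
    q = Counter(tokenize(query))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Put the retrieved chunks in front of the question.

    The model's weights are untouched; only the prompt changes.
    """
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

Calling `build_prompt(question, chunks)` on your chunked 10 million words gives a prompt that carries only the few passages relevant to that question. Fine-tuning would be a separate step on top of this.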