Dan Romero
@dwr.eth
Let's say you have a corpus of text (10 million words) about a specific topic.
1. What's the best way to "train a model" on that text?
2. Is that even the right term? Or is it using an existing foundational model and then augmenting it? Fine-tuning it? Something else?

K 🎩🔆
@kijijij
Those 10M words should be in natural language. Then:
* Use https://www.trychroma.com/ to store the data and connect it to an LLM.
* Create a set of QA pairs to verify that the answers are satisfactory.
* Generate more data from the existing corpus to reduce hallucination.
Interested in a PoC for this, LMK.
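A rough sketch of the retrieve-then-prompt flow described in the reply, in plain Python. It deliberately does not call Chroma's API: `ToyIndex` is a hypothetical in-memory stand-in that scores chunks by word-count cosine similarity, where a real setup would use embeddings in a vector store. The names `chunk_words`, `build_prompt`, and `retrieval_hit_rate`, the chunk sizes, and the coffee corpus are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(text, size=200, overlap=50):
    """Split text into overlapping word windows small enough to fit in a prompt."""
    words = text.split()
    step = size - overlap  # assumes size > overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break
    return chunks


class ToyIndex:
    """Hypothetical stand-in for a vector store (not Chroma's API)."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.vectors = [Counter(tokenize(c)) for c in chunks]

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[t] * b[t] for t in a if t in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def query(self, question, k=2):
        """Return the k chunks most similar to the question."""
        q = Counter(tokenize(question))
        ranked = sorted(
            range(len(self.chunks)),
            key=lambda i: self._cosine(q, self.vectors[i]),
            reverse=True,
        )
        return [self.chunks[i] for i in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to an existing LLM."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def retrieval_hit_rate(index, qa_pairs, k=2):
    """Fraction of QA pairs whose expected answer text appears in the top-k chunks."""
    hits = 0
    for question, expected in qa_pairs:
        retrieved = " ".join(index.query(question, k=k)).lower()
        if expected.lower() in retrieved:
            hits += 1
    return hits / len(qa_pairs) if qa_pairs else 0.0


# Tiny made-up corpus standing in for the 10M words.
corpus = (
    "Espresso is brewed by forcing hot water through finely ground coffee. "
    "Cold brew is made by steeping coarse grounds in cold water for many hours. "
    "A French press steeps coarse grounds in hot water before a plunger filters them."
)
index = ToyIndex(chunk_words(corpus, size=12, overlap=0))
qa_pairs = [("How is espresso brewed?", "finely ground coffee")]

print(retrieval_hit_rate(index, qa_pairs, k=1))
print(build_prompt("How is espresso brewed?", index.query("How is espresso brewed?", k=1)))
```

The QA set here only checks that retrieval surfaces the right passage. Checking the model's final answers would need a second pass over the LLM output, which this sketch leaves out.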