Dan Romero
@dwr.eth
Let's say you have a corpus of text (10 million words) about a specific topic.
1. What's the best way to "train a model" on that text?
2. Is that even the right term? Or is it using an existing foundation model and then augmenting it? Fine-tuning it? Something else?
18 replies
2 recasts
117 reactions

Daniel - Bountycaster
@pirosb3
What should the model do? Would this be an instruction-based model (one that answers questions, similar to ChatGPT)?
1 reply
0 recast
0 reaction

Dan Romero
@dwr.eth
Yeah, the ability to give you answers based on what is in the corpus, and nothing else.
4 replies
0 recast
0 reaction

ash
@aes
I would:
> create an endpoint to an S3 bucket with the text / resources you want to interact with
> create a GPT action that uses the endpoint to access the text
> use the ChatGPT (GPT-4o) interface to "talk" with the documents
OR use Brev.dev to fine-tune an open-source model like Mistral 7B on your text
0 reply
0 recast
0 reaction
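
For anyone wondering what the "existing model plus your corpus" option looks like in practice, here is a rough sketch of the retrieval side: split the corpus into chunks, pick the chunks that best match a question, and build a prompt that tells the model to answer only from them. This is not ash's exact S3 + GPT action setup. The function names (`chunk_words`, `build_index`, `retrieve`, `build_prompt`), the chunk sizes, and the simple TF-IDF-style scoring are all assumptions for illustration; a real setup would more likely use embeddings and a vector store, and would send the prompt to whatever model you choose.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(text, size=200, overlap=50):
    """Split text into overlapping word windows (sizes are arbitrary assumptions)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def build_index(chunks):
    """Count terms per chunk and compute a smoothed inverse document frequency."""
    term_counts = [Counter(tokenize(c)) for c in chunks]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    n = len(chunks)
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    return term_counts, idf


def retrieve(query, chunks, index, k=3):
    """Return up to k chunks that share at least one term with the query."""
    term_counts, idf = index
    query_terms = set(tokenize(query))
    scored = []
    for i, counts in enumerate(term_counts):
        score = sum(counts[t] * idf.get(t, 0.0) for t in query_terms)
        if score > 0:
            scored.append((score, i))
    # Highest score first; ties broken by original chunk order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [chunks[i] for _, i in scored[:k]]


def build_prompt(query, passages):
    """Build a prompt that restricts the model to the retrieved passages."""
    context = "\n\n".join(f"[{n}] {p}" for n, p in enumerate(passages, 1))
    return (
        "Answer using only the passages below. "
        "If the answer is not in them, say you don't know.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    # Toy stand-in for the 10-million-word corpus.
    toy_chunks = [
        "The red fox lives in the forest.",
        "Tides are caused by the moon's gravity.",
        "Bread rises because yeast produces gas.",
    ]
    index = build_index(toy_chunks)
    question = "Why does bread rise?"
    print(build_prompt(question, retrieve(question, toy_chunks, index)))
```

The "nothing else" requirement is only enforced by the prompt instruction here, so the model can still drift outside the corpus; returning an empty passage list when nothing matches at least lets you refuse before calling the model at all.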