Dan Romero
@dwr.eth
Let's say you have a corpus of text — 10 million words — about a specific topic. 1. What's the best way to "train a model" on that text? 2. Is that even the right term? Or is it using an existing foundational model and then augmenting it? Fine-tuning it? Something else?
18 replies
2 recasts
118 reactions

Daniel - Bountycaster
@pirosb3
What should the model do? Would this be an instruction-based model (one that answers questions, similar to ChatGPT)?
1 reply
0 recast
0 reaction

Dan Romero
@dwr.eth
Yeah, the ability to give you answers based on what is in the corpus, but nothing else.
4 replies
0 recast
0 reaction

Marwan ♋️
@marwan1337
There's a tool called PrivateGPT that does exactly this. It won't answer anything not based on the text provided. https://docs.privategpt.dev/overview/welcome/introduction
0 reply
0 recast
1 reaction
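
For anyone wondering what "answer only from the provided text" can look like in practice, here is a minimal sketch of the retrieval half of that idea. It is not how PrivateGPT is implemented; it is a toy TF-IDF lookup in plain Python, and every name in it (`tokenize`, `build_index`, `retrieve`, `min_score`) is made up for illustration. The assumption is that a real setup would split the corpus into passages, use learned embeddings instead of TF-IDF, and hand the retrieved passage to an existing model with an instruction to answer only from it. The part that matters here is the refusal: if nothing in the corpus matches well enough, return nothing.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(passages):
    """Return (TF-IDF vector per passage, IDF table) for a list of passages."""
    counts_per_passage = [Counter(tokenize(p)) for p in passages]
    n = len(passages)
    doc_freq = Counter()
    for counts in counts_per_passage:
        doc_freq.update(counts.keys())
    # Smoothed IDF so that every corpus term keeps a positive weight.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in counts.items()}
               for counts in counts_per_passage]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, vectors, idf, min_score=0.2):
    """Return the best-matching passage, or None if nothing matches well.

    min_score is an arbitrary illustrative threshold, not a tuned value.
    """
    if not passages:
        return None
    q_counts = Counter(tokenize(question))
    # Terms that never appear in the corpus carry no weight.
    q_vec = {t: c * idf[t] for t, c in q_counts.items() if t in idf}
    scored = [(cosine(q_vec, v), p) for v, p in zip(vectors, passages)]
    best_score, best_passage = max(scored, key=lambda pair: pair[0])
    return best_passage if best_score >= min_score else None


passages = [
    "Farcaster is a decentralized social protocol.",
    "Fine-tuning updates model weights on new data.",
    "Retrieval augmented generation fetches relevant passages at query time.",
]
vectors, idf = build_index(passages)

in_corpus = retrieve("What does fine-tuning update?", passages, vectors, idf)
off_topic = retrieve("Who won the 1998 World Cup?", passages, vectors, idf)
print(in_corpus)
print(off_topic)
```

The first question shares terms with the second passage, so that passage comes back; the second question shares no terms with the corpus, so the function returns `None` and the caller can decline to answer. Note that this approach leaves the underlying model untouched, which is the "augmenting an existing model" option from the original question rather than training or fine-tuning.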