Let's say you have a corpus of text — 10 million words — about a specific topic.

1. What's the best way to "train a model" on that text?

2. Is that even the right term? Or is it using an existing foundation model and then augmenting it? Fine-tuning it? Something else?

Depends on the use case. You don't need to train or fine-tune for an MVP; RAG (retrieval-augmented generation) and prompt engineering are enough. If hallucinations are a problem, e.g. in health care, try deterministic quoting. Happy to answer questions, as I have deployed these for clients in real estate and have one underway with smart contracts (most probably on Base).
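
For anyone who wants to see the shape of "RAG plus deterministic quoting", here is a minimal sketch in Python using only the standard library. Everything in it is an illustrative assumption, not the pipeline deployed for clients: the function names (`chunk`, `retrieve`, `build_prompt`, `render_quotes`) are hypothetical, the bag-of-words cosine scoring stands in for what would more likely be an embedding model and a vector store, and "deterministic quoting" is read here as "the model only cites passage ids, and the quoted text is copied from the corpus by code". The LLM call itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, size=50):
    """Split a long text into passages of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return ids of the k passages most similar to the query (toy scoring)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c))), i) for i, c in enumerate(chunks)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by id
    return [i for score, i in scored[:k] if score > 0]


def build_prompt(query, chunks, ids):
    """Prompt engineering step: give the model only the retrieved passages."""
    context = "\n".join(f"[{i}] {chunks[i]}" for i in ids)
    return (
        "Answer using only the passages below. "
        "Cite passages by their bracketed id.\n\n"
        f"{context}\n\nQuestion: {query}"
    )


def render_quotes(cited_ids, chunks, allowed_ids):
    """Assumed form of deterministic quoting: quote text is looked up in the
    corpus by id, never taken from model output. Ids that were not retrieved
    for this question are dropped."""
    return [chunks[i] for i in cited_ids if i in allowed_ids]


# Tiny made-up corpus, for illustration only.
corpus = [
    "Escrow holds the buyer's deposit until closing.",
    "A title search checks for liens on the property.",
    "Smart contracts on an L2 can release funds automatically.",
]
question = "What does a title search check?"
ids = retrieve(question, corpus, k=1)
prompt = build_prompt(question, corpus, ids)
# Pretend the model cited passages 1 and 2; only retrieved ids are quoted.
quotes = render_quotes([1, 2], corpus, ids)
```

The point of the last step is that a hallucinated or out-of-context citation can be rejected by a plain lookup, so whatever appears inside quotation marks is guaranteed to exist in the source text.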

I've done both QA and channel summarization on all FC data recently. It's just stale because I didn't want to run the pipeline and didn't want to develop a net-new FC client. Micro-demo on some Twitter data: https://www.youtube.com/watch?v=X8lNlf2cJjU