adrienne
@adrienne
Building an AI bot in public (cont.)

For anyone following my journey of creating gmfc101, I’m starting to work on improving the quality of the bot’s replies. Right now, the bot isn’t getting enough good context from our transcripts to answer questions well.

Here’s the current flow (chunk-based RAG; rough sketch below):
- Use semantic search to find the 5 most relevant conversations across all episodes.
- Provide the 5 chunks (conversation snippets) to the LLM, along with metadata (episode details, YouTube URL, etc.).
- Return the LLM’s response.

Current issues:
1. The 5 chunks are retrieved independently, so they lack coherence with each other.
2. The chunks aren’t detailed or complete enough to help the LLM generate accurate/useful answers—especially with voice transcripts, which are conversational and full of “ums,” “ahs,” and tangents.
3. The bot recommends YouTube links, but no one wants to watch an hour-long video to get an answer to their question.

Going to try a new approach (cont...)
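That rough sketch of the current flow, for reference (a minimal Python sketch, not my actual code; the Chunk shape and the search/complete callables are placeholders for whatever vector store and LLM client you use):

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Chunk:
    # One conversation snippet from the vector db, plus its episode metadata
    text: str
    episode_title: str
    youtube_url: str

def answer_question(
    question: str,
    search: Callable[[str, int], List[Chunk]],  # semantic search over the chunk index
    complete: Callable[[str], str],             # call to the LLM
) -> str:
    # 1. Semantic search: pull the 5 most relevant conversation chunks
    chunks = search(question, 5)

    # 2. Build the prompt from the chunks plus their episode metadata
    context = "\n\n---\n\n".join(
        f"Episode: {c.episode_title}\nYouTube: {c.youtube_url}\nSnippet: {c.text}"
        for c in chunks
    )
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

    # 3. Return the LLM's response as-is
    return complete(prompt)
```

Everything the model sees is those five disconnected snippets plus a bare YouTube URL, hence the issues above.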
3 replies
2 recasts
17 reactions
adrienne
@adrienne
The new approach I'm going to try:
- Simple RAG isn't working, so I'm going to try a multi-step process to improve the quality of the context I'm sending into the prompt.
- I'll continue to use semantic search to find the most relevant episodes where we talked about the topic the user is asking about, but instead of supplying those chunks as-is to the LLM, I'll use the episode metadata to retrieve the full transcripts of those episodes to get better/richer/deeper content.
- I should also be able to have the bot provide timestamps when recommending videos to watch (sketch below).

Wish me luck!
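For the timestamp piece, the idea would be something like this (assuming transcript segments carry a start time in seconds and the episode metadata has the YouTube URL, which is an assumption about my data, not a given):

```python
def timestamped_link(youtube_url: str, start_seconds: float) -> str:
    # YouTube accepts a t= query parameter with an offset in seconds,
    # so the bot can link straight to the relevant moment in the episode
    separator = "&" if "?" in youtube_url else "?"
    return f"{youtube_url}{separator}t={int(start_seconds)}s"

# e.g. timestamped_link("https://www.youtube.com/watch?v=abc123", 754.2)
# -> "https://www.youtube.com/watch?v=abc123&t=754s"
```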
5 replies
0 recasts
7 reactions
adrienne
@adrienne
@gmfc101 I'm going to be tagging you while I'm working on giving you better access to transcripts for context.
1 reply
0 recasts
0 reactions
adrienne
@adrienne
I'm going to A/B test, so I just created a copy of my existing API to use as a safe starting point for the B route. I'm taking a break, but when I come back I'll start making changes to the B route:
- use existing semantic search to identify relevant episodes (top 1, 2 or 3 instead of top 5)
- get access to the full transcripts for each of those episodes (currently transcripts are only stored as chunks in the vector db, but I'll make all the raw transcripts available to my API as json files)
- implement "smart expansion" logic to take the matched chunks from the vector db and expand them using the full json transcript so they are longer and have more context, including timestamps (sketch below)
- update the prompt to inject these longer, richer transcripts
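A rough sketch of what I mean by "smart expansion" (Python; it assumes a transcripts/<episode_id>.json layout with segments like {"start": 12.4, "end": 18.9, "text": "..."}, which is a guess at the layout, not what exists today):

```python
import json
from pathlib import Path

def load_segments(episode_id: str) -> list[dict]:
    # Full raw transcript for one episode, exposed to the API as a json file
    path = Path("transcripts") / f"{episode_id}.json"
    return json.loads(path.read_text())["segments"]

def expand_chunk(episode_id: str, chunk_text: str, window: int = 10) -> dict:
    """Locate the matched chunk inside the full transcript and return it with
    `window` segments of surrounding context plus start/end timestamps."""
    segments = load_segments(episode_id)

    # Naive substring match to find where the chunk came from; storing the
    # segment index in the vector db metadata would be more robust
    hit = next(
        (i for i, seg in enumerate(segments)
         if seg["text"] in chunk_text or chunk_text in seg["text"]),
        None,
    )
    if hit is None:
        return {"text": chunk_text, "start": None, "end": None}

    lo, hi = max(0, hit - window), min(len(segments), hit + window + 1)
    expanded = segments[lo:hi]
    return {
        "text": " ".join(seg["text"] for seg in expanded),
        "start": expanded[0]["start"],  # feeds the timestamped YouTube link
        "end": expanded[-1]["end"],
    }
```

The expanded text (with its timestamps) is what would get injected into the prompt instead of the raw chunks.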
0 replies
0 recasts
0 reactions
Royal
@royalaid.eth
You might also want to try changing up how you're doing your embeddings next if the above doesn't work. I know there is a lot of variance depending on the model used to generate the vectors.
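One quick way to sanity-check that variance (a sketch; embed_a / embed_b are hypothetical stand-ins for two embedding clients) is to compare top-k retrieval overlap on the same corpus and queries:

```python
import math
from typing import Callable, Sequence

def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, corpus_vecs, k=5):
    # Indices of the k nearest corpus vectors by cosine similarity
    return sorted(range(len(corpus_vecs)),
                  key=lambda i: cosine(query_vec, corpus_vecs[i]),
                  reverse=True)[:k]

def retrieval_overlap(queries, corpus,
                      embed_a: Callable[[str], list],
                      embed_b: Callable[[str], list],
                      k: int = 5) -> float:
    # 1.0 = both models retrieve the same chunks, 0.0 = completely different
    corpus_a = [embed_a(t) for t in corpus]
    corpus_b = [embed_b(t) for t in corpus]
    per_query = []
    for q in queries:
        hits_a = set(top_k(embed_a(q), corpus_a, k))
        hits_b = set(top_k(embed_b(q), corpus_b, k))
        per_query.append(len(hits_a & hits_b) / k)
    return sum(per_query) / len(per_query)
```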
1 reply
0 recasts
1 reaction
KMac🍌 ⏩
@kmacb.eth
💜💜💜👏 Please keep sharing. Heading down a similar path & your experience-sharing is helpful. I was thinking of @alexpaden’s juicr as I read your cast. Are you using it for cast search to augment
2 replies
0 recasts
1 reaction
Kieran Daniels 🎩
@kdaniels.eth
This is what I was saying! Don’t use RAG, prompt layer. Eric and Sid are experts at this.
1 reply
0 recasts
0 reactions