adrienne
@adrienne
Building an AI bot in public (cont.)

For anyone following my journey of creating gmfc101, I’m starting to work on improving the quality of the bot’s replies. Right now, the bot isn’t getting enough good context from our transcripts to answer questions well.

Here’s the current flow (chunk-based RAG):
- Use semantic search to find the 5 most relevant conversations across all episodes.
- Provide those 5 chunks (conversation snippets) to the LLM, along with metadata (episode details, YouTube URL, etc.).
- Return the LLM’s response.

Current issues:
1. The 5 chunks are retrieved independently, so they lack coherence with each other.
2. The chunks aren’t detailed or complete enough to help the LLM generate accurate, useful answers, especially with voice transcripts, which are conversational and full of “ums,” “ahs,” and tangents.
3. The bot recommends YouTube links, but no one wants to watch an hour-long video to get an answer to their question.

Going to try a new approach (cont...)
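In code, the current flow looks roughly like this (just a sketch: `search_chunks`, `ask_llm`, and the metadata field names are stand-ins, not the bot's actual API):

```python
from typing import Callable

def answer_question(
    question: str,
    search_chunks: Callable[[str, int], list[dict]],  # placeholder for semantic search over the vector DB
    ask_llm: Callable[[str], str],                     # placeholder for whatever LLM client the bot uses
) -> str:
    """The current chunk-based RAG flow, roughly."""
    # 1. Top 5 most relevant conversation chunks across all episodes
    chunks = search_chunks(question, 5)

    # 2. Assemble the prompt: each chunk's text plus its episode metadata
    context = "\n\n".join(
        f"Episode: {c['episode_title']} ({c['youtube_url']})\n{c['text']}"
        for c in chunks
    )
    prompt = (
        "Answer the question using only the podcast excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}"
    )

    # 3. Return the LLM's response as the bot's reply
    return ask_llm(prompt)
```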
3 replies
2 recasts
17 reactions

adrienne
@adrienne
The new approach I'm going to try:
- Simple RAG is not working, so I'm going to use a multi-step process to improve the quality of the context I'm sending into the prompt.
- I'll continue to use semantic search to find the most relevant episodes where we talked about the topic the user is asking about, but instead of supplying those chunks as-is to the LLM, I'll use the episode metadata to retrieve the full transcripts of those episodes for better, richer, deeper content.
- I should also be able to have the bot provide timestamps when recommending videos to watch.

Wish me luck!
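Rough sketch of the retrieval half of that plan (placeholder names and an assumed transcript layout, not real code from the bot):

```python
import json
from pathlib import Path
from typing import Callable

def gather_full_transcripts(
    question: str,
    search_chunks: Callable[[str, int], list[dict]],  # same semantic search as the current flow
    transcript_dir: Path,                              # assumed: one raw transcript JSON per episode
    top_k: int = 3,
) -> list[dict]:
    """Use the chunk matches only to identify episodes, then load each
    episode's full transcript so the prompt gets richer, more complete context."""
    matches = search_chunks(question, top_k)

    episodes: dict[str, dict] = {}
    for m in matches:
        if m["episode_id"] in episodes:
            continue  # several top chunks often come from the same episode; load it once
        transcript = json.loads((transcript_dir / f"{m['episode_id']}.json").read_text())
        episodes[m["episode_id"]] = {
            "title": m["episode_title"],
            "youtube_url": m["youtube_url"],
            # Assumed transcript shape: {"segments": [{"start": seconds, "text": ...}, ...]}
            # Keeping per-segment start times is what lets the bot cite timestamps later.
            "segments": transcript["segments"],
        }
    return list(episodes.values())
```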
5 replies
0 recasts
7 reactions

adrienne
@adrienne
I'm going to A/B test, so I just created a copy of my existing API to use as a safe starting point for the B route. I'm taking a break, but when I come back I'll start making changes to the B route:
- Use the existing semantic search to identify relevant episodes (top 1, 2, or 3 instead of top 5).
- Get access to the full transcripts for each of those episodes (currently transcripts are only stored as chunks in the vector DB, but I'll make all the raw transcripts available to my API as JSON files).
- Implement a "smart expansion" logic that takes the matched chunks from the vector DB and expands them using the full JSON transcript so they're longer and have more context, including timestamps.
- Update the prompt to inject these longer, richer transcripts.
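Here's roughly how I'm picturing the "smart expansion" step (a sketch with placeholder field names and an assumed segment layout, nothing final):

```python
import json
from pathlib import Path

def smart_expand(chunk: dict, transcript_dir: Path, window: int = 20) -> dict:
    """Expand a matched chunk from the vector DB into a longer passage by pulling
    the surrounding segments out of the full transcript JSON.

    Assumed shapes (placeholders, not the real schema):
      chunk      -> {"episode_id": ..., "start": seconds, "youtube_url": ...}
      transcript -> {"segments": [{"start": seconds, "text": ...}, ...]}
    """
    transcript = json.loads((transcript_dir / f"{chunk['episode_id']}.json").read_text())
    segments = transcript["segments"]

    # Locate the segment nearest to where the chunk starts, then widen the window around it
    i = min(range(len(segments)), key=lambda j: abs(segments[j]["start"] - chunk["start"]))
    expanded = segments[max(0, i - window) : i + window + 1]

    start = int(expanded[0]["start"])
    return {
        "episode_id": chunk["episode_id"],
        "timestamp": start,
        "text": " ".join(seg["text"] for seg in expanded),
        # Timestamped link (assuming a watch?v=... URL) so nobody has to scrub an hour-long video
        "link": f"{chunk['youtube_url']}&t={start}s",
    }
```

The expanded passages, with their timestamps, are what the updated B-route prompt would get instead of the raw chunks.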
0 replies
0 recasts
0 reactions