adrienne
@adrienne
Building an AI bot in public (cont.)

For anyone following my journey of creating gmfc101, I’m starting to work on improving the quality of the bot’s replies. Right now, the bot isn’t getting enough good context from our transcripts to answer questions well.

Here’s the current flow (chunk-based RAG):
- Use semantic search to find the 5 most relevant conversations across all episodes.
- Provide the 5 chunks (conversation snippets) to the LLM, along with metadata (episode details, YouTube URL, etc.).
- Return the LLM’s response.

Current issues:
1. The 5 random chunks lack coherence with each other.
2. The chunks aren’t detailed or complete enough to help the LLM generate accurate/useful answers, especially with voice transcripts, which are conversational and full of “ums,” “ahs,” and tangents.
3. The bot recommends YouTube links, but no one wants to watch an hour-long video to get an answer to their question.

Going to try a new approach (cont...)
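For readers who want to picture the chunk-based flow described above, here is a minimal sketch. It is not the bot's actual code: the `CHUNKS` store, the example URLs, and the toy bag-of-words `embed` function are all assumptions standing in for a real vector store and embedding model, which the post does not name.

```python
import math
from collections import Counter

# Hypothetical in-memory chunk store (the real storage is not described in the post).
CHUNKS = [
    {"episode": "ep1", "url": "https://example.com/ep1",
     "text": "we talked about frames and how to build them"},
    {"episode": "ep2", "url": "https://example.com/ep2",
     "text": "um so the degen tipping thing is interesting"},
    {"episode": "ep3", "url": "https://example.com/ep3",
     "text": "building frames with a new framework"},
]

def embed(text):
    # Toy bag-of-words vector; an assumption standing in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_chunks(question, chunks, k=5):
    # Rank every chunk against the question and keep the k best matches.
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)[:k]

def build_prompt(question, chunks):
    # Each chunk goes into the prompt as-is, prefixed with its episode metadata.
    context = "\n".join(f"[{c['episode']} | {c['url']}] {c['text']}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"

hits = top_chunks("how do I build frames", CHUNKS, k=2)
prompt = build_prompt("how do I build frames", hits)
```

Each chunk is scored on its own, so the snippets that reach the prompt can come from unrelated parts of different episodes, which is the coherence problem listed in issue 1.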
adrienne
@adrienne
The new approach I'm going to try:
- Simple RAG is not working, so I'm going to try a multi-step process to improve the quality of the context I'm sending into the prompt.
- I'll continue to use semantic search to find the most relevant episodes where we talked about the topic the user is asking about, but instead of supplying those chunks as-is to the LLM, I'll use the episode metadata to retrieve the full transcripts of those episodes to get better/richer/deeper content.
- I should also be able to have the bot provide timestamps when recommending videos to watch.

Wish me luck!
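A rough sketch of what that multi-step process could look like, under stated assumptions: `TRANSCRIPTS`, `EPISODES`, the placeholder video IDs, and the `(seconds, text)` transcript shape are hypothetical, since the post does not say how transcripts or metadata are stored. The timestamp links use YouTube's `t=<seconds>` URL parameter.

```python
# Hypothetical transcript store: episode id -> list of (start_seconds, text) segments.
TRANSCRIPTS = {
    "ep1": [(0, "welcome back"), (754, "so frames are basically mini apps")],
    "ep2": [(0, "hi everyone"), (95, "let's talk tipping")],
}

# Hypothetical episode metadata; the video IDs are placeholders.
EPISODES = {
    "ep1": {"title": "Episode 1", "youtube_url": "https://www.youtube.com/watch?v=VIDEO_ID_1"},
    "ep2": {"title": "Episode 2", "youtube_url": "https://www.youtube.com/watch?v=VIDEO_ID_2"},
}

def episodes_from_hits(hits):
    # Step 1: collapse chunk-level search hits to unique episode ids, keeping rank order.
    seen = []
    for hit in hits:
        if hit["episode"] not in seen:
            seen.append(hit["episode"])
    return seen

def timestamp_link(url, seconds):
    # Append YouTube's t=<seconds> parameter so the link opens at that moment.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}t={seconds}"

def build_context(hits):
    # Step 2: pull the full transcript of each matched episode instead of isolated chunks.
    # Step 3: tag every segment with a timestamped link the LLM can cite in its answer.
    parts = []
    for episode_id in episodes_from_hits(hits):
        meta = EPISODES[episode_id]
        lines = [
            f"[{timestamp_link(meta['youtube_url'], start)}] {text}"
            for start, text in TRANSCRIPTS[episode_id]
        ]
        parts.append(meta["title"] + "\n" + "\n".join(lines))
    return "\n\n".join(parts)

# Search hits would come from the existing semantic search step.
context = build_context([{"episode": "ep1"}, {"episode": "ep2"}, {"episode": "ep1"}])
```

One thing to watch with this design: full transcripts of hour-long episodes are much larger than 5 chunks, so the number of episodes pulled in per question may need a cap to fit the model's context window.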
downshift
@downshift.eth
/microsub tip: 3190 $DEGEN
Jason
@jachian
this video might be helpful. fwiw, the guy in the video is the current president of the ARC Prize and a friend of mine. he breaks down his build for parsing business wisdom from the My First Million podcast (600+ episodes): https://youtu.be/NQtWHOUmqNw?si=abs0-P3Y5if9zoCK