Alex BIPLAB 🎩🚬
@alexbiplab
Evolving my voice playground to have good voice activity detection (VAD). I was using a basic heuristic, "was quiet for 500ms", but it's too aggressive and cuts you off while you're thinking. That really bothers me about OpenAI's implementation. The proper way is a model that understands what you're saying and whether you've finished a complete thought, i.e. semantics. So I'm experimenting with that, but the catch is it has to be low latency (< 500ms).
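For reference, the basic "was quiet for 500ms" heuristic could look roughly like the sketch below. This is only an illustration, not the playground's actual code: the class name `SilenceEndpointer`, the 20ms frame size, and the RMS loudness threshold are all assumptions; only the 500ms timeout comes from the post.

```python
import math
from dataclasses import dataclass


@dataclass
class SilenceEndpointer:
    """Hypothetical helper: ends the turn after `timeout_ms` of continuous quiet."""

    timeout_ms: int = 500         # the "was quiet for 500ms" rule from the post
    frame_ms: int = 20            # assumed duration of each audio frame
    rms_threshold: float = 0.01   # assumed loudness cutoff, samples in [-1, 1]
    quiet_ms: int = 0             # running length of the current quiet stretch
    heard_speech: bool = False    # don't end a turn before anyone has spoken

    def push(self, frame):
        """Feed one frame of samples; return True once the turn is considered over."""
        rms = math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0
        if rms >= self.rms_threshold:
            # Loud frame: the user is talking, so reset the quiet timer.
            self.heard_speech = True
            self.quiet_ms = 0
            return False
        if not self.heard_speech:
            # Leading silence before any speech never ends a turn.
            return False
        self.quiet_ms += self.frame_ms
        return self.quiet_ms >= self.timeout_ms
```

Any pause of 500ms or more after speech ends the turn here, whether or not the thought was finished. That is the cut-off problem; a semantic check would have to replace or gate that final comparison.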