๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ pfp
๐š๐”ช๐Ÿพ๐šก๐šก๐Ÿพ
@gm8xx8
Kyutai Labs has open-sourced Moshi, a 7.6B speech-to-speech foundation model, and Mimi, a SoTA streaming speech codec. The release includes Moshi models fine-tuned on synthetic data, along with Mimi, which processes 24 kHz audio with a bandwidth of 1.1 kbps. The models are optimized for on-device performance, with low latency and support for inference via Candle, PyTorch, and MLX. https://huggingface.co/collections/kyutai/moshi-v01-release-66eaeaf3302bef6bd9ad7acd
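For a rough sense of where the 1.1 kbps figure comes from, here is a minimal sketch of the bitrate arithmetic. The frame rate (12.5 Hz), number of codebooks (8), and codebook size (2048) are assumptions not stated in the post above; they are used only for illustration, as is the 16-bit mono PCM baseline.

```python
import math

def codec_bitrate_bps(frame_rate_hz, num_codebooks, codebook_size):
    """Bits per second for a codec that emits `num_codebooks` discrete
    tokens per frame, each drawn from a codebook of `codebook_size` entries."""
    bits_per_token = math.log2(codebook_size)
    return frame_rate_hz * num_codebooks * bits_per_token

# Assumed Mimi-like configuration (not given in the post): 12.5 frames/s,
# 8 codebooks, 2048 entries per codebook (11 bits per token).
mimi_bps = codec_bitrate_bps(12.5, 8, 2048)

# Assumed baseline for comparison: uncompressed 24 kHz, 16-bit mono PCM.
pcm_bps = 24_000 * 16

ratio = pcm_bps / mimi_bps
print(f"codec: {mimi_bps / 1000:.1f} kbps, PCM: {pcm_bps / 1000:.0f} kbps, ratio: {ratio:.0f}x")
```

Under these assumptions the codec stream is about 350 times smaller than raw PCM, which is what makes low-latency streaming practical on-device.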