𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
Kyutai Labs has open-sourced Moshi, a 7.6B speech-to-speech foundation model, and Mimi, a SoTA streaming speech codec. The release includes Moshi models fine-tuned on synthetic data, along with Mimi, which compresses 24 kHz audio to a bitrate of 1.1 kbps. The models are optimized for low-latency, on-device use, with inference support for Candle, PyTorch, and MLX. https://huggingface.co/collections/kyutai/moshi-v01-release-66eaeaf3302bef6bd9ad7acd
0 reply
1 recast
7 reactions
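
As a rough sanity check on the 1.1 kbps figure quoted for Mimi, here is a minimal sketch of how the bitrate of a codebook-based codec is usually computed. The post does not give the frame rate, number of codebooks, or codebook size; the values below (12.5 frames per second, 8 codebooks, 2048 entries each) are assumptions, and the function name is hypothetical.

```python
import math


def codec_bitrate_bps(frame_rate_hz: float, num_codebooks: int, codebook_size: int) -> float:
    """Bits per second for a codec that emits one index per codebook per frame."""
    # Each index needs log2(codebook_size) bits.
    bits_per_index = math.log2(codebook_size)
    return frame_rate_hz * num_codebooks * bits_per_index


# Assumed Mimi-like settings (not stated in the post).
bitrate = codec_bitrate_bps(frame_rate_hz=12.5, num_codebooks=8, codebook_size=2048)
print(f"{bitrate / 1000:.1f} kbps")
```

Under those assumptions the result lines up with the bandwidth stated in the post; different codec settings would give a different number.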

HonaziniC
@honazinicepton
Kyutai Labs has advanced speech technology by open-sourcing Moshi and Mimi, providing cutting-edge tools for speech-to-speech modeling and streaming speech coding. The models are optimized for on-device performance and support several inference frameworks, which makes them versatile and efficient for a wide range of applications.
0 reply
0 recast
0 reaction