𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
Moshi 🔥
- 1. 7b multimodal LM
- will be released as open source!!
- achieves 160ms latency 🤌✨
- trained on Scaleway cluster of 1000 H100 GPUs
- expresses emotions and understands accents, like a “french accent.”
- handles audio generation and listening simultaneously.
- processes thoughts textually during speech.
- uses dual audio streams for simultaneous listening and speaking.
- jointly pre-trained on text and audio.
- utilizes synthetic text from the 7b LLM Helium and fine-tuned on 100k TTS-converted “oral-style” conversations.
- voice learned from TTS-generated data.
- achieves 200ms end-to-end latency.
- includes a smaller version for macbooks or consumer GPUs.
- implements watermarking to identify AI-generated audio (in progress).
1 reply
1 recast
21 reactions

𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
i tried the demo and imo it still has a long way to go, now if they in fact release code, model, and paper i believe the community will improve on it and it will be a much better assistant. …

some notes:
- 1. 7B Multimodal LM
- Moshi already runs on apple laptops! (can run on laptop / on consumer GPUs) big w!
- local: no data leaving your computer, no internet access.
- open source! technical report and open model releases 🤞
- latency +
- trained by 8+ people in 4 months
- used a heavy amount of synthetic data
- didn’t get the emotion
- missed simple scheduling prompts
- no multilingual
- needs fine-tuning

try demo ↓
https://moshi.chat/?queue_id=talktomoshi
1 reply
2 recasts
4 reactions

Frank
@deboboy
nice summary… did you have to design a prompt to elicit emotion?
1 reply
2 recasts
1 reaction

𝚐𝔪𝟾𝚡𝚡𝟾
@gm8xx8
i couldn’t capture different emotions; even simple prompts resulted in monotone responses for anger, sadness, and happiness. as the prompts grew more complex, the results strayed further from the intended task, and even repeating previous tasks became increasingly tedious. i was able to get it to whisper quite easily, but when asked to revert it gave this response. (will spend more time with it again later)
1 reply
0 recast
1 reaction

Frank
@deboboy
openchat-3.5-0106 approached emotion in my tests; accidentally got hooked on it for daily MH
0 reply
0 recast
0 reaction