Katsuya
@kn
Does anyone know of open-source models or papers for splitting audio that contains human speech and other sounds into two separate audio files, one containing only the human speech and one containing the other sounds?

Giuliano Giacaglia
@giu
I’ve tried to find a good tool for this, to extract the human speech track for speech synthesis, but didn’t find anything that worked well.

Katsuya
@kn
Interesting. It seems doable, and the dataset is relatively easy to synthesize.
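
A minimal sketch of how such a dataset might be synthesized, assuming you already have clean speech and separate non-speech recordings as NumPy arrays at the same sample rate: mix them at a chosen signal-to-noise ratio and keep the two sources as training targets. The function name `mix_at_snr` and the stand-in signals are illustrative, not taken from any particular library.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Mix speech and noise at a target SNR in dB.

    Returns (mixture, speech, scaled_noise). Assumes both inputs are
    1-D float arrays at the same sample rate and the noise is not silent.
    """
    # Loop or trim the noise so it matches the speech length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)

    # Scale the noise so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    scaled_noise = noise * scale
    return speech + scaled_noise, speech, scaled_noise

# Stand-in signals (assumption): real use would load clean speech
# and non-speech recordings from audio files instead.
rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)
noise = rng.normal(size=8000)

mixture, target_speech, target_noise = mix_at_snr(speech, noise, snr_db=5.0)
```

Each call yields one training example: the mixture as input, with the speech and the scaled noise as the two target tracks. Varying `snr_db` and the noise clips gives a range of difficulty.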

Giuliano Giacaglia
@giu
The problem is the dataset, but I agree that once you have the dataset, it shouldn’t be hard to do.

Giuliano Giacaglia
@giu
Also, that’s why speech synthesis doesn’t work well for every public figure. For now, you need enough clean recordings of someone’s voice. Joe Rogan, for example, is a great subject for this.