dirtnoise
@dirtnoise
My ComfyUI Stable Audio 1.0 workflow. Takes 5-30 sec per output on a 4070 Super.
https://github.com/comfyanonymous/ComfyUI
https://huggingface.co/stabilityai/stable-audio-open-1.0 : model.ckpt (+ finetunes if wanted)
https://huggingface.co/google-t5/t5-base : t5-basemodel.safetensors
Audio goes through the VAEEncodeAudio & VAEDecodeAudio nodes.
To load the safetensors and CLIP: RMB > Advanced > Loaders > Load CLIP and Load Checkpoint.
https://bafybeibhjd6rnwxz7bjgn7b6jgwy4qo2f52tddjk5jome2w4tqdaq7hdze.ipfs.dweb.link
2 replies
0 recast
2 reactions
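[Editor's note: a minimal sketch of the graph the post describes, expressed as a ComfyUI API prompt built in Python. The /prompt HTTP endpoint is real ComfyUI, but the exact node class names and input fields here are assumptions reconstructed from the post, not a verified export of this workflow; check them against your own node list before use.]

```python
# Sketch: build a ComfyUI-style prompt graph for Stable Audio Open 1.0
# and optionally submit it to a local ComfyUI server via its /prompt endpoint.
# ASSUMPTIONS: node class names / input fields are guessed from the post;
# file names (model.ckpt, t5-basemodel.safetensors) come from the post.
import json
import urllib.request

def build_stable_audio_graph(prompt_text, seconds=30.0, seed=0):
    """Return a ComfyUI prompt graph as a plain dict.

    Loaders correspond to the post's "RMB > Advanced > Loaders >
    Load CLIP and Load Checkpoint" step; output is decoded with
    VAEDecodeAudio as described.
    """
    return {
        "1": {"class_type": "CheckpointLoaderSimple",       # Stable Audio Open 1.0
              "inputs": {"ckpt_name": "model.ckpt"}},
        "2": {"class_type": "CLIPLoader",                   # T5 text encoder
              "inputs": {"clip_name": "t5-basemodel.safetensors",
                         "type": "stable_audio"}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": prompt_text, "clip": ["2", 0]}},
        "4": {"class_type": "EmptyLatentAudio",             # text2audio start point
              "inputs": {"seconds": seconds, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["3", 0],
                         "negative": ["3", 0],              # reused for brevity only
                         "latent_image": ["4", 0],
                         "seed": seed, "steps": 50, "cfg": 5.0,
                         "sampler_name": "dpmpp_2m", "scheduler": "normal",
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecodeAudio",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    }

def submit(graph, host="127.0.0.1", port=8188):
    """POST the graph to a running ComfyUI instance."""
    data = json.dumps({"prompt": graph}).encode()
    req = urllib.request.Request(f"http://{host}:{port}/prompt", data=data,
                                 headers={"Content-Type": "application/json"})
    return urllib.request.urlopen(req).read()

graph = build_stable_audio_graph("amen break, fast drum and bass loop")
```

In practice you would export the real graph from ComfyUI (Save (API Format)) rather than hand-writing it; the dict above only shows the shape of what gets submitted.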
C0rridor242
@c0rridor242
Interesting workflow! Can you elaborate on how you're using Stable Audio and CLIP for this project?
1 reply
0 recast
1 reaction
dirtnoise
@dirtnoise
https://youtu.be/cU9HHtNMNp0?si=RyFDotfdz12cXNO5 It's kind of like img2img: I input an mp3 file, which goes through VAEEncodeAudio > Latent Image, but I use 95-100 denoising, so it practically overwrites it completely. I've had the best results inputting piano samples or simple synth rhythms. It's very random, but some of it is quite impressive. I'm surprised how well it did with amen breaks and DnB, as Suno wasn't that good at those genres; it's clearly a different model. I might finetune my own model if I can get the A1000 for it.
0 reply
0 recast
0 reaction
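[Editor's note: a toy numpy sketch of why 95-100 denoise "practically overwrites" the input, as described above. This is not ComfyUI or Stable Audio internals, just the standard re-noising idea: at high strength the sampler restarts from a latent that is almost pure noise, so little of the encoded mp3 survives. The schedule here is a made-up linear one for illustration.]

```python
# Toy illustration of diffusion "denoise strength" applied to an input latent.
# ASSUMPTION: a simple variance-preserving mix stands in for the real
# noise schedule; the qualitative effect (high strength -> input mostly
# replaced by noise) is what matters.
import numpy as np

rng = np.random.default_rng(0)
latent = rng.standard_normal(4096)   # stand-in for a VAEEncodeAudio latent

def renoise(latent, strength, rng):
    """Mix the latent with Gaussian noise; strength=1.0 keeps none of it."""
    alpha = 1.0 - strength                       # fraction of signal kept (toy)
    noise = rng.standard_normal(latent.shape)
    return np.sqrt(alpha) * latent + np.sqrt(1.0 - alpha) * noise

for s in (0.5, 0.95, 1.0):
    start = renoise(latent, s, np.random.default_rng(1))
    # correlation between the sampler's starting point and the input latent
    corr = np.corrcoef(latent, start)[0, 1]
    print(f"denoise={s:4.2f}  correlation with input latent: {corr:+.2f}")
```

At 0.5 the starting point still correlates strongly with the input; at 1.0 it is essentially independent noise, which matches the observation that the source mp3 mostly just seeds the output.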