Apurv
@apurvkaushal
Today I learnt #6 Myth: Text to image/ video tools like Midjourney, Sora are also built on large language models similar to GPT (next text prediction) Fact: Text to image/ video is possible because of an algorithm class "diffusion models".
1 reply
0 recast
2 reactions
Apurv
@apurvkaushal
What is diffusion algorithm? Step 1 (Encoding) For one image & caption: 1. You take that image and add a series of noise to it - from too little to total chaos. 2. Then you take each instance of a (noisy image , caption, noise) & pass it through neural network (a set of mathematical calculations modelled on your brain neurons)
1 reply
0 recast
0 reaction