clun.eth
@clun.eth
Are any teams working on having multimodal models “drive” a UI? For instance, I show Bing screenshots of a calendar UI and ask it which buttons to press and what to put in each text field. It clearly knows what to do, so it should be possible to have it output structured data that can be turned into button clicks and other UI actions.
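
To make the "structured data" idea concrete, here is a rough sketch of what that output could look like and how it might be turned into UI actions. Everything in it is an assumption for illustration: the JSON shape (`action`, `target`, `text`), the `Click` and `TypeText` classes, and the `parse_actions` and `run` helpers are hypothetical, not an existing product or API. The `click` and `type_text` callbacks stand in for whatever automation layer would actually drive the UI.

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Click:
    target: str  # visible label of the button to press (hypothetical field)


@dataclass
class TypeText:
    target: str  # visible label of the text field (hypothetical field)
    text: str    # what to type into it


Action = Union[Click, TypeText]


def parse_actions(model_output: str) -> List[Action]:
    """Validate the model's JSON reply and convert it to action objects."""
    actions: List[Action] = []
    for item in json.loads(model_output):
        kind = item.get("action")
        if kind == "click":
            actions.append(Click(target=item["target"]))
        elif kind == "type":
            actions.append(TypeText(target=item["target"], text=item["text"]))
        else:
            # Reject anything outside the small allowed vocabulary.
            raise ValueError(f"unsupported action: {kind!r}")
    return actions


def run(actions: List[Action],
        click: Callable[[str], None],
        type_text: Callable[[str, str], None]) -> None:
    """Dispatch each action to the supplied UI callbacks, in order."""
    for a in actions:
        if isinstance(a, Click):
            click(a.target)
        else:
            type_text(a.target, a.text)


# A made-up reply of the kind one might ask the model to produce
# after showing it a screenshot of a calendar UI.
reply = """[
  {"action": "click", "target": "New event"},
  {"action": "type", "target": "Title", "text": "Team sync"},
  {"action": "click", "target": "Save"}
]"""

# Record the calls instead of touching a real UI.
log = []
run(
    parse_actions(reply),
    click=lambda t: log.append(("click", t)),
    type_text=lambda t, s: log.append(("type", t, s)),
)
print(log)
```

Restricting the model to a small, validated action vocabulary means a malformed or unexpected reply fails at parse time rather than producing a stray click.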

manansh ❄️ @ Farcon
@manansh
Also curious to know if anyone is building here.