Katsuya
@kn
A chat interface is great if I know exactly or roughly what I want to do and the LLM is generally capable of doing it, especially when the UI would be complex or for the long tail of infrequent tasks where I don’t want to learn a new UI. It’s not great for discovery.

dylan
@dylsteck.eth
Agreed. I’ve been thinking about how new interfaces can combine text and UI, so that text can be a part of the UI’s state (e.g. you ran a command and this happened) without being the whole UI.

dylan
@dylsteck.eth
One thing in particular I’ve been thinking about is: if you imagine a browser with an LLM that can automate actions, should commands be accessible through a universal text box, or should commands be attached to each website (maybe a vertical log underneath)? How do you manage the history of what you’re doing?
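The universal-text-box vs. per-site question above mostly comes down to how the command history is stored and viewed. A minimal sketch of that idea in TypeScript — all names here (`CommandLog`, `CommandEntry`) are hypothetical, not from any real browser API:

```typescript
// Hypothetical per-site command log: one flat store of everything the
// user ran, with two views over it — a universal history and the
// "vertical log underneath" each website.
interface CommandEntry {
  site: string;      // origin the command ran against
  command: string;   // what the user typed
  result: string;    // "you ran a command and this happened"
  timestamp: number;
}

class CommandLog {
  private entries: CommandEntry[] = [];

  record(site: string, command: string, result: string): void {
    this.entries.push({ site, command, result, timestamp: Date.now() });
  }

  // Universal view: everything the user has done, newest first.
  all(): CommandEntry[] {
    return [...this.entries].reverse();
  }

  // Per-site view: only the commands attached to one website.
  forSite(site: string): CommandEntry[] {
    return this.entries.filter((e) => e.site === site);
  }
}
```

One store with two projections means the two UI choices aren’t mutually exclusive; the same history can back both a universal box and per-site logs.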

dylan
@dylsteck.eth
And similarly, how do you mix commands with UI? Can a user take a small action with text and then construct an action or a little view using UI that’s running other commands in the background? And if so, does that mean the browser saves schemas from sites to gather UI options, or does it grab the schema on the fly?
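The save-vs-grab-on-the-fly trade-off above can be sketched as a lazy cache: fetch a site’s action schema the first time it’s needed, then reuse the saved copy. This is purely illustrative — the schema shape and the fetch function are assumptions, not any real standard:

```typescript
// Illustrative lazy schema cache: "grab on the fly" the first time,
// then serve the saved copy. The ActionSchema shape is made up.
interface ActionSchema {
  site: string;
  actions: { name: string; params: string[] }[];
}

class SchemaCache {
  private cache = new Map<string, ActionSchema>();

  // fetchSchema stands in for however the browser would actually
  // retrieve a site's schema (network request, bundled manifest, etc.).
  constructor(private fetchSchema: (site: string) => ActionSchema) {}

  get(site: string): ActionSchema {
    let schema = this.cache.get(site);
    if (!schema) {
      schema = this.fetchSchema(site); // on-the-fly fetch, once
      this.cache.set(site, schema);    // saved for later UI building
    }
    return schema;
  }
}
```

The open design question is then just cache invalidation: how does the browser know a saved schema has gone stale?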

dylan
@dylsteck.eth
I don’t think any of what I offered is close to a solution, and I’m still working through these things, but I wanted to share my thoughts, and hopefully it sparks more dialogue!

Katsuya
@kn
Yeah, it can get wild once there are standard ways for websites to expose an LLM API for browsing. ChatGPT plugins are a great step forward.
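For context on the "standard ways to expose an LLM API" point: ChatGPT plugins had sites publish a manifest at `/.well-known/ai-plugin.json` describing themselves for the model and pointing at an OpenAPI spec. Roughly what that looked like, written as a TypeScript object for illustration — field names are recalled from the plugin docs and may not be exact:

```typescript
// Sketch of a ChatGPT-plugin-style manifest: the site tells the model
// what it is and where its machine-readable API schema lives.
// example.com and the descriptions are placeholders.
const manifest = {
  schema_version: "v1",
  name_for_human: "Example Store",
  name_for_model: "example_store",
  description_for_human: "Search and buy products from Example Store.",
  description_for_model:
    "Use this to search the Example Store catalog and manage the cart.",
  auth: { type: "none" },
  api: {
    type: "openapi",
    url: "https://example.com/openapi.yaml", // the schema the LLM browses
  },
};
```

A manifest plus OpenAPI spec is exactly the kind of "schema the browser could save or grab on the fly" discussed upthread.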