A chat interface is great if I know exactly or roughly what I want to do and the LLM is generally capable of doing it, especially when the UI is complex or for the long tail of infrequent tasks where I don’t want to learn a new UI.

It’s not great for discovery.

Agreed. 

I’ve been thinking about how new interfaces can combine text and UI, so that text can be a part of the UI’s state (e.g., you ran a command and this happened) without being the whole UI.
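
One way to picture "text as part of the state but not the whole UI" is a view whose state holds a command log next to its ordinary fields. This is only a rough sketch: `ViewState`, `CommandEntry`, and the toy `filter <key> <value>` command are made-up names for illustration, not anything from an existing product.

```python
from dataclasses import dataclass, field


@dataclass
class CommandEntry:
    """One text command and what happened when it ran."""
    text: str
    result: str


@dataclass
class ViewState:
    """State for one view; the command log is only one field among others."""
    selected_tab: str = "home"
    filters: dict = field(default_factory=dict)
    command_log: list = field(default_factory=list)

    def run_command(self, text: str) -> None:
        # Toy parser (an assumption): only understands "filter <key> <value>".
        parts = text.split()
        if len(parts) == 3 and parts[0] == "filter":
            self.filters[parts[1]] = parts[2]
            result = f"set filter {parts[1]}={parts[2]}"
        else:
            result = "unrecognized command"
        # The text and its outcome become part of the state.
        self.command_log.append(CommandEntry(text, result))


state = ViewState()
state.run_command("filter status open")  # changed through text
state.selected_tab = "issues"            # changed through regular UI
```

Here the text command and the tab click both end up in the same state object, so the UI can show the log as one panel without the log being the only way to act.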

engineer @base | /farhack /tap /cortex | dylansteck.com

One thing in particular I’ve been thinking about is:

If you imagine a browser with an LLM that can automate actions, should commands be accessible through a universal text box, or should commands be attached to each website (maybe a vertical log underneath)?

How do you manage the history of what you’re doing?
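
The universal box and the per-site log don't have to be separate stores. As a hedged sketch, a single history could tag each command with the site it ran on and expose both views. The `History` class and the example sites are hypothetical, and this ignores hard parts like commands that span several sites.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HistoryEntry:
    site: Optional[str]  # None means the command was not tied to any site
    command: str


class History:
    """A single command history that can be read globally or per site."""

    def __init__(self) -> None:
        self._entries: List[HistoryEntry] = []

    def record(self, command: str, site: Optional[str] = None) -> None:
        self._entries.append(HistoryEntry(site, command))

    def universal(self) -> List[str]:
        # What a universal text box would show: everything, in order.
        return [e.command for e in self._entries]

    def for_site(self, site: str) -> List[str]:
        # What a log attached to one website would show.
        return [e.command for e in self._entries if e.site == site]


history = History()
history.record("open my flight bookings", site="airline.example")
history.record("summarize open tabs")
history.record("change seat to aisle", site="airline.example")
```

With this shape, `history.universal()` feeds the global box and `history.for_site("airline.example")` feeds the log under that site.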

I don’t think any of what I offered is close to a solution and I’m still working through these things, but I wanted to share my thoughts & hopefully it sparks more dialogue!

And similarly, how do you mix commands with UI? 

Can a user take a small action with text and then use the UI to construct an action or a little view that runs other commands in the background?

And if so, does it mean the browser saves schemas from sites to gather UI options, or does it grab the schema on the fly?
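
The two options can also be mixed: keep a saved schema per site and fetch on the fly only when nothing is saved or a fresh copy is requested. The sketch below assumes a "schema" is just a list of action names, and `SchemaStore` and `fake_fetch` are placeholders; how a real browser would actually obtain a schema from a site is left open.

```python
from typing import Callable, Dict, List

# Assumption: a schema is simply the list of actions a site exposes.
Schema = List[str]


class SchemaStore:
    """Saved schemas with an on-the-fly fallback."""

    def __init__(self, fetch: Callable[[str], Schema]) -> None:
        self._fetch = fetch  # on-the-fly lookup, supplied by the caller
        self._saved: Dict[str, Schema] = {}
        self.fetch_count = 0

    def get(self, site: str, fresh: bool = False) -> Schema:
        # Fetch only if nothing is saved yet or the caller wants a fresh copy.
        if fresh or site not in self._saved:
            self._saved[site] = self._fetch(site)
            self.fetch_count += 1
        return self._saved[site]


def fake_fetch(site: str) -> Schema:
    # Stand-in for whatever would really read a site's available actions.
    return [f"{site}:search", f"{site}:checkout"]


store = SchemaStore(fake_fetch)
first = store.get("shop.example")               # fetched on the fly, then saved
second = store.get("shop.example")              # served from the saved copy
third = store.get("shop.example", fresh=True)   # forced refetch
```

The tradeoff is the usual one: saved schemas make UI options available instantly but can go stale when a site changes, while fetching every time stays current at the cost of latency.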