jtgi pfp
jtgi
@jtgi
on agent dev: sometimes a feature or bug fix is just adding another clause to the prompt, or fixing its grammar. It’s cool, on one hand, that the prompt is a living document that’s both specification and implementation, but it’s also clunky because English lacks the precision of a programming language. Because of this it’s easy to introduce regressions: you don’t know how an llm will interpret changes to a prompt. Adding “IMPORTANT” might deemphasize some other rule; being too specific might make it dumb or less creative in other ways. In code it’s deterministic; with llms it’s probabilistic. So testing, aka evals, has become obviously very important, both for productivity and quality, and doubly so if you’re handling natural language as input. The actual agent code itself is quite trivial, prompts and functions, but making it work consistently and optimally for your input set is the bulk of the work, I think.
11 replies
12 recasts
65 reactions
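[A minimal sketch of the kind of prompt-regression eval the cast above describes: the agent is "prompts and functions", so each prompt revision gets scored against a fixed case set before shipping. Everything here is hypothetical; `call_llm`, `EVAL_CASES`, and the predicates are stand-ins, not any particular framework.]

```python
def call_llm(system_prompt: str, user_input: str) -> str:
    """Stand-in for a real model call; replace with your provider's client."""
    raise NotImplementedError

# Fixed regression cases: (user input, predicate the output must satisfy).
# These examples are invented for illustration.
EVAL_CASES = [
    ("refund order 123", lambda out: "refund" in out.lower()),
    ("what's your name?", lambda out: "agent" in out.lower()),
]

def run_evals(system_prompt: str, llm=call_llm) -> float:
    """Return the pass rate of a prompt revision against the fixed case set."""
    passed = 0
    for user_input, check in EVAL_CASES:
        try:
            if check(llm(system_prompt, user_input)):
                passed += 1
        except Exception:
            pass  # a crashed call counts as a failure
    return passed / len(EVAL_CASES)
```

[Comparing `run_evals(old_prompt)` against `run_evals(new_prompt)` is what turns "I added IMPORTANT to one rule" from a guess into a measurable change.]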

marlo pfp
marlo
@marlo
such an interesting take. are you fairly left-brained? does it feel like this is challenging that a bit?
1 reply
0 recast
0 reaction

jtgi pfp
jtgi
@jtgi
I’d say I’m fairly left-brained, yeah. I’m a decent technical writer, so I find writing the prompts easy. The hard part is knowing whether the llm will obey the instructions and, if so, how often.
1 reply
0 recast
1 reaction
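[The "how often" in the cast above can be made concrete: since llm output is probabilistic, obedience is a rate measured over repeated samples, not a yes/no property. A hypothetical sketch, with `sample_llm` standing in for a real non-deterministic model call:]

```python
from collections import Counter

def obedience_rate(sample_llm, prompt: str, obeys, n: int = 20) -> float:
    """Sample the model n times on the same prompt and return the
    fraction of outputs that satisfy the `obeys` predicate."""
    results = Counter(bool(obeys(sample_llm(prompt))) for _ in range(n))
    return results[True] / n
```

[Tracking this rate per instruction, across prompt revisions, is one way to notice when a new clause quietly deemphasizes an old one.]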

marlo pfp
marlo
@marlo
yeah, i really like when things are like math and there’s a clear outcome and it’s obvious when you’ve made an error. the ai stuff should be like math but in practice it’s too complex and opaque. i wonder if this will change or if we will just get better at ai whispering
0 reply
0 recast
0 reaction