
Voyageur
@xxcchronometer
On agent dev: sometimes a feature or bug fix is just adding another clause to the prompt, or fixing its grammar.
On one hand it’s cool that the prompt is a living document that’s both specification and implementation; on the other it’s clunky, because English lacks the precision of a programming language.
That also makes it easy to introduce regressions, since you don’t know how an LLM will interpret a change to the prompt. Adding “IMPORTANT” might de-emphasize some other rule; being too specific might make it dumber or less creative in other ways.
Code is deterministic; LLMs are probabilistic.
So testing, aka evals, has obviously become very important, both for productivity and for quality, and doubly so if you’re handling natural language as input.
The actual agent code itself is quite trivial, just prompts and functions; getting it to work consistently and optimally for your input set is the bulk of the work, I think.
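A rough sketch of what such an eval could look like, in Python. Everything here is made up for illustration: `EvalCase`, `run_evals`, `find_regressions` and `make_stub_model` are hypothetical names, and the stub stands in for a real LLM call so the example runs offline. The stub's behaviour (an "IMPORTANT" clause making it drop the greeting rule half the time) is a toy assumption, not a claim about any real model. The idea is just: run each case many times because outputs are probabilistic, record a pass rate per case, and flag cases where a prompt change lowers it.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

Model = Callable[[str, str], str]  # (system_prompt, user_input) -> reply


@dataclass
class EvalCase:
    name: str
    user_input: str
    check: Callable[[str], bool]  # True if the reply is acceptable


def run_evals(model: Model, prompt: str, cases: List[EvalCase],
              trials: int = 40) -> Dict[str, float]:
    """Return a pass rate per case; trials are repeated because outputs vary."""
    rates = {}
    for case in cases:
        passed = sum(case.check(model(prompt, case.user_input))
                     for _ in range(trials))
        rates[case.name] = passed / trials
    return rates


def find_regressions(baseline: Dict[str, float], candidate: Dict[str, float],
                     tolerance: float = 0.1) -> List[str]:
    """Names of cases whose pass rate dropped by more than `tolerance`."""
    return [name for name, rate in baseline.items()
            if candidate[name] < rate - tolerance]


def make_stub_model(seed: int = 0) -> Model:
    """Hypothetical stand-in for an LLM call (assumption, not a real API)."""
    rng = random.Random(seed)

    def model(prompt: str, user_input: str) -> str:
        # Toy assumption: "IMPORTANT" makes the stub forget to greet half the time.
        greets = rng.random() < (0.5 if "IMPORTANT" in prompt else 1.0)
        reply = "Hello! " if greets else ""
        if "refund" in user_input:
            reply += "Refunds take 5 days."
        return reply

    return model


cases = [
    EvalCase("greets_user", "hi", lambda r: r.startswith("Hello")),
    EvalCase("answers_refund", "how do refunds work?", lambda r: "Refunds" in r),
]

old_prompt = "Greet the user. Answer refund questions."
new_prompt = old_prompt + " IMPORTANT: keep answers short."

baseline = run_evals(make_stub_model(), old_prompt, cases)
candidate = run_evals(make_stub_model(), new_prompt, cases)
regressions = find_regressions(baseline, candidate)
print(regressions)
```

With a real model you would swap the stub for an actual API call and probably want more cases and trials; the pass-rate comparison stays the same.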