voyagerx
@cousinyl3wqx
669 Following
63 Followers
on agent dev: sometimes a feature or bug fix is just adding another clause to the prompt, or fixing grammar.
It's cool on one hand that the prompt is a living document that's both specification and implementation, but it's also clunky because English lacks the precision of a programming language.
This also makes it easy to introduce regressions, because you don't know how an LLM will interpret changes to a prompt. Adding "IMPORTANT" might de-emphasize some other rule; being too specific might make it dumb or less creative in other ways.
Code is deterministic; LLMs are probabilistic.
So testing, aka evals, has become very important, both for productivity and quality, and doubly so if you're handling natural language as input.
The actual agent code itself is quite trivial (prompts and functions), but getting it to work consistently and optimally for your input set is the bulk of the work, I think.
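To make the evals point concrete, here is a minimal sketch of a prompt regression check. Everything in it is an assumption for illustration: `make_stub_model` is a toy stand-in for a real LLM call (its "IMPORTANT" behaviour is invented for the demo, not a claim about any real model), and `Case`, `pass_rate`, `find_regressions`, the prompts, and the cases are hypothetical names. The idea is just to run each case many times, since outputs are probabilistic, and compare pass rates between the old and new prompt.

```python
import random
from dataclasses import dataclass
from typing import Callable

# A "model" is any callable (prompt, user_input) -> output text.
# In real use this would wrap an LLM API call; here it is a stub.
Model = Callable[[str, str], str]


@dataclass
class Case:
    name: str
    user_input: str
    check: Callable[[str], bool]  # True if the output is acceptable


def make_stub_model(seed: int = 0) -> Model:
    """Toy stand-in for an LLM (assumption, not real model behaviour):
    when the prompt contains "IMPORTANT", it drops the greeting about
    half of the time."""
    rng = random.Random(seed)

    def model(prompt: str, user_input: str) -> str:
        greeting = "Hello! "
        if "IMPORTANT" in prompt and rng.random() < 0.5:
            greeting = ""
        return f"{greeting}You said: {user_input}"

    return model


def pass_rate(model: Model, prompt: str, cases: list[Case], trials: int = 20) -> float:
    """Fraction of (case, trial) runs whose output passes its check."""
    passed = 0
    for case in cases:
        for _ in range(trials):
            if case.check(model(prompt, case.user_input)):
                passed += 1
    return passed / (len(cases) * trials)


def find_regressions(model: Model, old_prompt: str, new_prompt: str,
                     cases: list[Case], trials: int = 20,
                     tolerance: float = 0.05) -> list[tuple[str, float, float]]:
    """Return (case name, old rate, new rate) for every case whose pass
    rate dropped by more than `tolerance` under the new prompt."""
    regressions = []
    for case in cases:
        old = pass_rate(model, old_prompt, [case], trials)
        new = pass_rate(model, new_prompt, [case], trials)
        if new < old - tolerance:
            regressions.append((case.name, old, new))
    return regressions


OLD_PROMPT = "Greet the user, then echo their message."
NEW_PROMPT = OLD_PROMPT + " IMPORTANT: never use profanity."

CASES = [
    Case("greets", "hi there", lambda out: out.startswith("Hello!")),
    Case("echoes", "what's up", lambda out: "what's up" in out),
]

if __name__ == "__main__":
    model = make_stub_model(seed=0)
    for name, old, new in find_regressions(model, OLD_PROMPT, NEW_PROMPT, CASES):
        print(f"regression in {name!r}: {old:.2f} -> {new:.2f}")
```

Comparing per case rather than one overall number is a deliberate choice here: a prompt tweak can improve the average while quietly breaking one input type, which is exactly the kind of regression described above.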