
BinarySurfer
@whimpern0wcm
Parenting tip for siblings who like to fight over things (who gets the window seat, the extra cookie, who picks the movie, the show, or the playlist, who goes first in the game you're playing... if you have multiple kids, you know this list could go on forever).
Try this:
Rip a piece of paper into as many pieces as there are kids and write a number on each (with 3 kids, that's 3 pieces labeled 1, 2 and 3). Crumple the papers. Shake them in your hand. Throw them in the air. The kids scramble and each grabs a paper. #1 gets first choice, #2 second, etc.
You aren't playing favorites. It's not biased towards the oldest, cutest or strongest kid. It's fun.
On agent dev: sometimes a feature or bug fix is just adding another clause to the prompt, or fixing its grammar.
On one hand it’s cool that the prompt is a living document that’s both specification and implementation, but it’s also clunky, because English lacks the precision of a programming language.
That also makes it easy to introduce regressions, since you don’t know how an LLM will interpret a change to the prompt. Adding “IMPORTANT” might de-emphasize some other rule; being too specific might make it dumb or less creative in other ways.
Code is deterministic; LLMs are probabilistic.
So testing, aka evals, has become very important, both for productivity and for quality, and doubly so if you’re handling natural language as input.
The agent code itself is quite trivial (prompts and functions), but getting it to work consistently and optimally for your input set is the bulk of the work, I think.
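To make the evals point concrete, here is a rough sketch of what a tiny eval harness could look like. Everything in it is an assumption for illustration: `EvalCase`, `run_evals`, and `make_fake_agent` are made-up names, and the fake agent is a seeded stand-in for a real LLM call. Because replies are probabilistic, each case runs several times and you look at a pass rate instead of a single pass/fail.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

# An "agent" here is any function taking (prompt, user_input) and returning a reply.
Agent = Callable[[str, str], str]


@dataclass
class EvalCase:
    user_input: str
    must_contain: str  # substring the reply is expected to include


def run_evals(agent: Agent, prompt: str, cases: List[EvalCase], trials: int = 5) -> float:
    """Return the fraction of (case, trial) runs whose reply contains the expected text."""
    passed = 0
    for case in cases:
        for _ in range(trials):
            reply = agent(prompt, case.user_input)
            if case.must_contain.lower() in reply.lower():
                passed += 1
    return passed / (len(cases) * trials)


def make_fake_agent(obey_rate: float, seed: int = 0) -> Agent:
    """Hypothetical stand-in for an LLM: echoes the input with probability obey_rate."""
    rng = random.Random(seed)

    def agent(prompt: str, user_input: str) -> str:
        if rng.random() < obey_rate:
            return "You asked: " + user_input
        return "Sorry, I can't help with that."

    return agent


if __name__ == "__main__":
    cases = [
        EvalCase("refund for order 12", "refund"),
        EvalCase("reset my password", "password"),
    ]
    # Compare two prompt versions by pass rate; obey rates are invented for the demo.
    before = run_evals(make_fake_agent(0.9, seed=1), "prompt v1", cases, trials=20)
    after = run_evals(make_fake_agent(0.6, seed=1), "prompt v2", cases, trials=20)
    print(f"v1 pass rate: {before:.2f}, v2 pass rate: {after:.2f}")
```

In a real setup you would swap the fake agent for your actual model call and rerun the same cases after every prompt edit, so a drop in pass rate flags a regression before it ships.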