A place for developers to talk about building on Farcaster.

/dev

just wrote a util to test agent responses.

llms testing llms, what could go wrong.

memebuilding • prev manifold.xyz, twilio.com

Following. What general test approaches are you using?

And do you use assertions beyond jest matchers?

I’ve seen projects like LangGraph use metrics-driven tests, but these feel like a black box, and I like to understand 100% of my test code.

Designing and building AI systems

https://agents.gladio.ai

Interesting. One thing I've been experimenting with: unit tests with snapshots and deterministic outputs, i.e. reconstruct an exact conversation state, then request the next answer.
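
A rough sketch of that snapshot idea, in case it helps. Everything here is hypothetical: `ChatMessage`, `ModelFn`, `stubModel`, and `nextAnswer` are made-up names, and the stub stands in for a real model call (which would need recorded responses or pinned sampling settings to be deterministic).

```typescript
// Hypothetical types; none of these names come from a real library.
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

type ModelFn = (history: ChatMessage[]) => Promise<string>;

// Deterministic stand-in for an LLM call (assumption: a real test would
// replay a recorded response or pin the model's sampling settings).
const stubModel: ModelFn = async (history) => {
  const last = history[history.length - 1];
  return `echo(${history.length}): ${last.content}`;
};

// Rebuild the exact conversation state, then request only the next answer.
async function nextAnswer(model: ModelFn, history: ChatMessage[]): Promise<string> {
  return model(history);
}

// Fixture: the full conversation up to the turn under test.
const fixture: ChatMessage[] = [
  { role: "system", content: "You are a support agent." },
  { role: "user", content: "Where is my order?" },
  { role: "assistant", content: "Can you share the order id?" },
  { role: "user", content: "A-1042" },
];

// Stored snapshot of the answer we expect for this exact state.
const snapshot = "echo(4): A-1042";

async function runSnapshotTest(): Promise<boolean> {
  const answer = await nextAnswer(stubModel, fixture);
  return answer === snapshot;
}
```

Inside jest, the final comparison could be `expect(answer).toMatchSnapshot()` instead of a hand-stored string.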

for now, e2e-style critical paths. figuring it out as i go, i'm new to evals.

i like simple tests too but the game is different for agents since they’re probabilistic.
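
One simple way to keep a test readable with probabilistic output is to run the same prompt several times and assert on the pass rate instead of a single answer. A minimal sketch, with assumptions labeled: `AgentFn`, `makeFlakyAgent`, `passRate`, and the 0.75 threshold are all hypothetical, and the fake agent fails on every 5th call to stand in for sampling noise.

```typescript
// Hypothetical types; a real agent call would likely be async.
type AgentFn = (prompt: string) => string;
type Check = (answer: string) => boolean;

// Fake agent that gives a weak answer on every 5th call (assumption:
// this stands in for the variance of a real model).
function makeFlakyAgent(): AgentFn {
  let calls = 0;
  return (_prompt) => {
    calls += 1;
    return calls % 5 === 0 ? "I am not sure." : "Your order A-1042 has shipped.";
  };
}

// Run the same prompt `runs` times and return the fraction of passing answers.
function passRate(agent: AgentFn, prompt: string, check: Check, runs: number): number {
  let passed = 0;
  for (let i = 0; i < runs; i++) {
    if (check(agent(prompt))) passed += 1;
  }
  return passed / runs;
}

// Plain, readable check: the answer has to mention the order id.
const mentionsOrder: Check = (answer) => answer.includes("A-1042");

const rate = passRate(makeFlakyAgent(), "Where is order A-1042?", mentionsOrder, 20);

// The test passes if enough runs pass, not only if every run does.
const THRESHOLD = 0.75;
const meetsThreshold = rate >= THRESHOLD;
```

The threshold and run count are judgment calls: more runs give a steadier estimate but cost more model calls.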