Varun Srinivasan
@v
Interesting misconception that Dan flagged today - people think that we are (or could be) using LLMs for spam detection. With the way LLMs work today, that's like using a hammer to cut your fingernails. LLMs are slow, expensive, and don't really have a deep understanding of what spam is in the context of Farcaster. We use a random-forest classifier that @akshaan designed, and feed it a bunch of signals built from embeddings, user actions, and graph data.
15 replies · 11 recasts · 70 reactions
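For readers unfamiliar with the approach, here is a minimal, hypothetical sketch of the kind of pipeline described above: a random forest fed with embedding, behavioral, and graph features. All feature names, shapes, and data below are illustrative assumptions, not Farcaster's actual signals or code.

```python
# Hypothetical sketch of a random-forest spam classifier fed with mixed signals.
# Feature groups and labels are placeholders, not Farcaster's real pipeline.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_users = 1_000

# Assumed signal groups: content embeddings, behavioral counters, graph stats.
text_embeddings = rng.normal(size=(n_users, 64))      # e.g. cast-content embeddings
user_actions = rng.poisson(3, size=(n_users, 4))      # e.g. casts/day, replies, likes, follows
graph_signals = rng.random(size=(n_users, 3))         # e.g. follower ratio, cluster score

X = np.hstack([text_embeddings, user_actions, graph_signals])
y = rng.integers(0, 2, size=n_users)                  # 1 = spam label (e.g. from reports)

clf = RandomForestClassifier(n_estimators=300, max_depth=12, n_jobs=-1, random_state=0)
clf.fit(X, y)

# Scoring an account is a single fast forest pass, no LLM call needed.
spam_prob = clf.predict_proba(X[:5])[:, 1]
print(spam_prob)
```

Because a trained forest is just a few hundred shallow trees, scoring an account takes microseconds of CPU time, which is the speed and cost contrast Varun draws with LLM-based classification.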

Yassine Landa
@yassinelanda.eth
@v is the training data purely from user reports? What F1, recall, and precision does the random forest get you? And do you incorporate the difference in cost between false positives and false negatives during training?
0 replies · 0 recasts · 2 reactions
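To make the question concrete, here is a hypothetical sketch of how asymmetric misclassification costs and the metrics Yassine asks about can be handled with a random forest: class weights encode that missing spam costs more than a false flag, and precision, recall, and F1 are measured on a held-out split. The data, weights, and resulting numbers are placeholders, not Farcaster's reported results.

```python
# Illustrative only: cost-sensitive training via class weights, plus
# precision/recall/F1 evaluation on a held-out test set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_fscore_support
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(2_000, 32))
y = rng.integers(0, 2, size=2_000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

# class_weight makes a false negative (missed spam, class 1) cost more than a
# false positive; the 5x ratio is an arbitrary example, not a known setting.
clf = RandomForestClassifier(n_estimators=200, class_weight={0: 1.0, 1: 5.0}, random_state=1)
clf.fit(X_train, y_train)

precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, clf.predict(X_test), average="binary", pos_label=1
)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

An alternative to class weights is to keep the model unweighted and move the decision threshold on `predict_proba`, which lets the false-positive/false-negative trade-off be tuned after training.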