Varun Srinivasan
@v
We've been working on improving our spam detection. A big source of alpha has been taking algos used to rank content on the web and modifying them to work in Farcaster-space. @akshaan and @notawizard collaborated to add:
- PageRank
- Hyperlink-Induced Topic Search (HITS)
- Louvain clustering
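For readers who haven't seen PageRank outside of web search, here is a minimal sketch of the textbook algorithm applied to a follow graph. The cast does not say how the algorithms were modified for Farcaster, so everything here is an assumption for illustration: the `pagerank` helper, the `follows` edge list, the account names, and the damping factor of 0.85 are all hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if out_links[src]:
                # Each account splits its rank equally among the accounts it follows.
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical follow graph: an edge (a, b) means "a follows b".
follows = [
    ("alice", "bob"), ("bob", "alice"), ("carol", "alice"), ("carol", "bob"),
    ("bot1", "bot2"), ("bot2", "bot1"), ("bot1", "alice"),
]
scores = pagerank(follows)
for account, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{account}: {score:.3f}")
```

The intuition carried over from the web is that a follow from a well-followed account counts for more than a follow from an account nobody follows. In this toy graph, the two bots following each other still end up with less rank than `alice`, who is followed by several accounts.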
5 replies
27 recasts
78 reactions
Varun Srinivasan
@v
A quick primer on spam handling in Warpcast:
1. Accounts are categorized roughly as "definitely not spammy", "probably not spammy", "unknown", "maybe a little spammy" and "definitely spammy".
2. Roughly 5% of the network is manually labelled by the team, and this seed data is used to train an ML model.
3. The model looks at a lot of signals and gives the user a score. For example, if you like things 24 hours a day, you're likely not a human. Multiple "bad" signals like this move accounts closer to the "definitely spammy" label.
4. The model has gotten quite good and rarely misses. In the cases where it does, we manually override it and retrain it on misses periodically so it gets better.
5. The model also tries to re-evaluate users periodically, so as users get more active and there is more data it can update its opinion.
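To make the "likes 24 hours a day" signal and the five labels concrete, here is a rough sketch. It is not the Warpcast model: the real system is a trained ML model whose features and thresholds are not described in the cast. The `active_hours` feature, the 22-hour cutoff, the additive `spam_score` rule, and the evenly spaced label boundaries are all made-up stand-ins.

```python
from datetime import datetime, timedelta, timezone

LABELS = [
    "definitely not spammy",
    "probably not spammy",
    "unknown",
    "maybe a little spammy",
    "definitely spammy",
]


def active_hours(timestamps):
    """Count the distinct hours of the day (0-23, UTC) in which the account liked something."""
    return len({ts.astimezone(timezone.utc).hour for ts in timestamps})


def spam_score(signals):
    """Toy stand-in for the model: start at 0.5 and add 0.2 per bad signal, capped at 1.0."""
    return min(1.0, 0.5 + 0.2 * sum(signals.values()))


def label_from_score(score):
    """Map a score in [0, 1] to one of the five labels using evenly spaced, assumed cutoffs."""
    index = min(int(score * len(LABELS)), len(LABELS) - 1)
    return LABELS[index]


# Hypothetical activity: one account likes something every hour, the other only 09:00-21:00.
start = datetime(2024, 1, 1, tzinfo=timezone.utc)
bot_likes = [start + timedelta(hours=h) for h in range(24)]
human_likes = [start + timedelta(hours=h) for h in range(9, 22)]

for name, likes in [("bot", bot_likes), ("human", human_likes)]:
    # Assumed threshold: activity in 22+ distinct hours counts as one "bad" signal.
    signals = {"likes_around_the_clock": active_hours(likes) >= 22}
    print(name, label_from_score(spam_score(signals)))
```

With a single signal the toy rule can only move an account from "unknown" to "maybe a little spammy"; it takes a second bad signal to reach "definitely spammy", which mirrors the point that multiple signals accumulate.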
3 replies
1 recast
30 reactions
Yassine Landa
@yassinelanda.eth
hey @v great stuff! Do you think labelling a higher % of the network, or labelling consistently beyond user reports, is worth it beyond feature engineering?
1 reply
0 recast
0 reaction
raulonastool.eth 🎩 🏰
@raulonastool
This is gigabrain stuff. Appreciate the transparency on how you're tackling the spam problem!
1 reply
1 recast
1 reaction
🎭 Shaax 🎭
@shaax
I started with the Louvain method, but it has its limitations: it can produce poorly connected or even internally disconnected communities, and suboptimal clusterings. I find the Leiden algorithm handles these issues better and provides higher-quality, more cohesive clusters. Check this out ser 😍 : https://en.wikipedia.org/wiki/Leiden_algorithm
0 reply
0 recast
0 reaction
Breck Yunits
@breck
Is there a business that we can all chip in to fund that will trace back the humans behind these definitely spammy accounts, so we can publicly shame / pursue legal action against these people? The Internet can be much better if instead of punishing the 99.9% of good behavior, we punish the 0.1% of bad behavior.
0 reply
0 recast
0 reaction