Varun Srinivasan
@v
We've been working on improving our spam detection. A big source of alpha has been taking algos used to rank content on the web and modifying them to work in Farcaster-space. @akshaan and @notawizard collaborated to add:
- PageRank
- Hyperlink-Induced Topic Search (HITS)
- Louvain Clustering
5 replies
27 recasts
79 reactions

Varun Srinivasan
@v
A quick primer on spam handling in Warpcast:
1. Accounts are categorized roughly as "definitely not spammy", "probably not spammy", "unknown", "maybe a little spammy" and "definitely spammy".
2. Roughly 5% of the network is manually labelled by the team, and this seed data is used to train an ML model.
3. The model looks at a lot of signals and gives the user a score. For example, if you like things 24 hours a day, you're likely not a human. Multiple "bad" signals like this move accounts closer to the "definitely spammy" label.
4. The model has gotten quite good and rarely misses. In the cases where it does, we manually override it and retrain it on misses periodically so it gets better.
5. The model also tries to re-evaluate users periodically, so as users get more active and there is more data it can update its opinion.
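To make the "likes 24 hours a day" signal concrete, here is a minimal sketch. It is not Warpcast's model: the function names, the use of Unix timestamps, and the label cutoffs are all hypothetical, chosen only to show how one signal and the five labels could fit together.

```python
from bisect import bisect_right
from datetime import datetime, timezone

# The five labels from the primer, ordered from least to most spammy.
LABELS = [
    "definitely not spammy",
    "probably not spammy",
    "unknown",
    "maybe a little spammy",
    "definitely spammy",
]
# Hypothetical cutoffs on a spam score in [0, 1]; the real thresholds are not public.
CUTOFFS = [0.2, 0.4, 0.6, 0.8]


def active_hour_fraction(like_timestamps):
    """Fraction of the 24 hours of the day (UTC) in which the account liked something."""
    hours = {
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in like_timestamps
    }
    return len(hours) / 24


def label_for(spam_score):
    """Map a spam score in [0, 1] to one of the five labels."""
    return LABELS[bisect_right(CUTOFFS, spam_score)]


# An account that likes something in every hour of the day looks automated.
round_the_clock = [hour * 3600 for hour in range(24)]
print(active_hour_fraction(round_the_clock))
print(label_for(0.95))
```

In practice a single signal like this would be one feature among many fed to the trained model, not a verdict on its own.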
3 replies
1 recast
21 reactions

Varun Srinivasan
@v
So back to the new signals. PageRank is the famous Google algorithm used to rank webpages based on how many other pages link to them. We use a modified version that scores you by how many non-spammy users follow you, and that score is then recursively applied to the people you follow. A surprising behavior is how many obviously spammy accounts end up being followed by some good accounts. People make mistakes and rarely fix them, so the algorithm has to be adaptive enough to account for that. https://en.wikipedia.org/wiki/PageRank
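A rough sketch of the idea, not the actual Warpcast implementation: this is PageRank on the follow graph where the random-jump step only lands on a seed set of trusted accounts (similar in spirit to TrustRank). The function name, damping factor, iteration count, and example accounts are all assumptions.

```python
def trust_seeded_pagerank(follows, trusted, damping=0.85, iterations=50):
    """Score users by trust flowing along follows, starting from trusted seeds.

    follows maps each user to the set of users they follow.
    trusted is a non-empty set of accounts assumed to be non-spammy.
    """
    seeds = set(trusted)
    users = set(follows) | {v for vs in follows.values() for v in vs} | seeds
    # Random jumps land only on trusted seeds, never on arbitrary accounts.
    teleport = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in users}
    score = dict(teleport)
    for _ in range(iterations):
        nxt = {u: (1 - damping) * teleport[u] for u in users}
        for u in users:
            followees = follows.get(u, ())
            if followees:
                # A user's score is split evenly among the people they follow.
                share = damping * score[u] / len(followees)
                for v in followees:
                    nxt[v] += share
            else:
                # Users who follow nobody hand their score back to the seeds.
                for s in seeds:
                    nxt[s] += damping * score[u] / len(seeds)
        score = nxt
    return score


# Hypothetical graph: ring1 and ring2 follow each other and a trusted user,
# but no trusted user follows them back.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "carol"},
    "carol": {"alice"},
    "ring1": {"ring2", "alice"},
    "ring2": {"ring1", "alice"},
}
scores = trust_seeded_pagerank(follows, trusted={"alice", "bob"})
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```

In this sketch the ring gains nothing by following trusted accounts, because score only flows from follower to followee. A single mistaken follow from alice or bob into the ring would leak score into it, which is the failure mode described above.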
2 replies
0 recast
3 reactions

Varun Srinivasan
@v
Louvain Clustering is another graph-based approach built around the idea that spammy accounts are much more likely to follow each other in rings. A greedy scoring system is used to identify parts of the network that have tight follow loops. Combined with the average score of users in each group, this helps determine whether an account is more or less likely to be spammy. https://en.wikipedia.org/wiki/Louvain_method
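A simplified sketch of the greedy step, under stated assumptions: it implements only the first (local-moving) phase of the Louvain method, without the aggregation passes, and it treats follows as undirected, unweighted edges. The per-group averaging, the function names, and the example scores are hypothetical, not Warpcast's code.

```python
from collections import defaultdict


def local_moving(edges):
    """Greedily move each node to the neighboring community with the best modularity gain."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    two_m = sum(len(neighbors) for neighbors in adj.values())  # twice the edge count
    degree = {u: len(adj[u]) for u in adj}
    community = {u: u for u in adj}  # every node starts in its own community
    total = dict(degree)  # summed degree of each community
    improved = True
    while improved:
        improved = False
        for u in sorted(adj):
            current = community[u]
            total[current] -= degree[u]  # take u out of its community
            links = defaultdict(int)  # edges from u into each neighboring community
            for v in adj[u]:
                links[community[v]] += 1
            best = current
            best_gain = links.get(current, 0) - total[current] * degree[u] / two_m
            for c in sorted(links):
                gain = links[c] - total[c] * degree[u] / two_m
                if gain > best_gain + 1e-12:
                    best, best_gain = c, gain
            total[best] += degree[u]
            community[u] = best
            if best != current:
                improved = True
    return community


def community_average(community, score):
    """Give each user the mean score of the community they were placed in."""
    members = defaultdict(list)
    for user, c in community.items():
        members[c].append(score.get(user, 0.0))
    return {user: sum(members[c]) / len(members[c]) for user, c in community.items()}


# Two tight follow loops joined by a single follow between c and x.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("x", "y"), ("y", "z"), ("x", "z"), ("c", "x")]
community = local_moving(edges)
user_scores = {"a": 0.9, "b": 0.8, "c": 0.7, "x": 0.1, "y": 0.0, "z": 0.2}
print(community_average(community, user_scores))
```

Here x, y and z land in one group whose average score is low, so each of them looks more suspicious than its individual score alone would suggest.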
1 reply
0 recast
3 reactions

Mkkstacks
@mkkstacks
Would blocking spammy followers remove them from your followers, and therefore from your score?
0 reply
0 recast
0 reaction