Varun Srinivasan
@v
Some thoughts on spam on Farcaster and how we tackle it. First question - what is spam? The naive answer is "automated activity" but this isn't right. Over 75% of spam we find comes from real humans who have phones, wallets and X accounts. The best definition is "inauthentic activity". It's that feeling you get when you realize that someone who is following, liking or replying to you is doing it to benefit themselves and not because they're interested in you.
31 replies
93 recasts
322 reactions

Varun Srinivasan
@v
Spam is driven by people who want to get airdrops. How much can you earn if you set up a fake account on Twitter? Probably not a whole lot and not in directly measurable dollars. If you do the same on Farcaster, you might earn 10 or even 100 dollars in airdrops. Spammers on Farcaster are very, very motivated. We see patterns like LLM spamming before they become commonplace on larger networks like X.
1 reply
2 recasts
103 reactions

Varun Srinivasan
@v
Spam also needs to be classified very, very quickly. If we don't, a spammer will interact with a lot of users right after signing up, making them unhappy. We often have little more than a profile and a few casts when we need to make a decision. If we get this decision wrong, people get really unhappy - a spammer who isn't labelled will make existing users unhappy, and a new user who is incorrectly labelled will get frustrated and never come back.
2 replies
0 recast
43 reactions

Varun Srinivasan
@v
Our spam model puts accounts into one of four categories:
Level 0 - not enough information to make a decision
Level 1 - an authentic user that other users will like
Level 2 - a slightly inauthentic user that some users won't like
Level 3 - a very inauthentic user that almost all people will dislike
If we're certain that someone is spammy, their account goes into Level 3 and their activity is usually hidden under the "Show more" in conversations. In most cases, it's less clear. An account may be good for a while and suddenly turn spammy when a new airdrop launches. In this case Level 2 might be applied, which does something lighter like disqualifying you from boosts while still letting your replies appear. Accounts are also re-evaluated by our model very often so that new information can be used to make a more accurate decision. We rank and re-rank roughly 4-5 accounts every minute.
3 replies
1 recast
41 reactions
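
The four levels in the cast above can be sketched as a small enum plus a mapping to treatments. This is only an illustration: the score thresholds, the minimum-casts cutoff and the names `assign_level` and `treatment` are assumptions, not anything the thread discloses. The thread also doesn't say whether Level 3 accounts lose boosts, so that is assumed here.

```python
from enum import IntEnum


class SpamLevel(IntEnum):
    UNKNOWN = 0               # not enough information to make a decision
    AUTHENTIC = 1             # a user that other users will like
    SLIGHTLY_INAUTHENTIC = 2  # some users won't like this account
    VERY_INAUTHENTIC = 3      # almost everyone will dislike this account


def assign_level(spam_score, num_casts, min_casts=3, low=0.3, high=0.9):
    """Map a hypothetical model score in [0, 1] to a level.

    All thresholds here are invented for illustration.
    """
    if spam_score is None or num_casts < min_casts:
        return SpamLevel.UNKNOWN
    if spam_score >= high:
        return SpamLevel.VERY_INAUTHENTIC
    if spam_score >= low:
        return SpamLevel.SLIGHTLY_INAUTHENTIC
    return SpamLevel.AUTHENTIC


def treatment(level):
    """Treatments named in the thread, keyed by level."""
    return {
        # Level 3: activity is usually hidden under "Show more".
        "hidden_under_show_more": level == SpamLevel.VERY_INAUTHENTIC,
        # Level 2: disqualified from boosts (assumed to hold for Level 3 too).
        "disqualified_from_boosts": level >= SpamLevel.SLIGHTLY_INAUTHENTIC,
    }


print(assign_level(0.5, num_casts=10).name)
print(treatment(SpamLevel.SLIGHTLY_INAUTHENTIC))
```

Because accounts are re-evaluated often, a function like `assign_level` would simply be called again whenever a fresh score is available, so an account can move between levels over time.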

Varun Srinivasan
@v
There are three parts to building a spam detection model:
1. Define signals, which can be calculated for each account. Ideally they have some correlation to spammy behavior (e.g. frequency of posting).
2. Label data, either through manual review, user reports or heuristics. The dataset must be large enough that there is significance to the patterns.
3. Train the model, by letting it process labelled data and figure out which combinations of signals are the best predictors.
@akshaan chose a type of model called a random forest, which is a collection of decision trees. Here's a good lecture on the basics of how a decision tree works: https://www.youtube.com/watch?v=a3ioGSwfVpE&list=PLl8OlHZGYOQ7bkVbuRthEsaLr7bONzbXS&index=29
1 reply
0 recast
36 reactions
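
The three steps in the cast above (define signals, label data, train) can be sketched in pure Python. To stay short, this is a much-simplified stand-in for a random forest: each "tree" is a single-split decision stump trained on a bootstrap sample with one randomly chosen signal. A real random forest grows deeper trees. The signal names (`casts_per_hour`, `duplicate_ratio`) and the toy data are made up for illustration and are not the signals Farcaster uses.

```python
import random


def stump_predict(stump, row):
    """Return 1 (spam) or 0 (not spam) for one row."""
    feature, threshold, spam_if_above = stump
    return int((row[feature] >= threshold) == spam_if_above)


def train_stump(rows, labels, feature):
    """Pick the threshold on one signal that classifies the sample best."""
    best_correct, best_stump = -1, None
    for threshold in sorted({r[feature] for r in rows}):
        for spam_if_above in (True, False):
            stump = (feature, threshold, spam_if_above)
            correct = sum(stump_predict(stump, r) == y for r, y in zip(rows, labels))
            if correct > best_correct:
                best_correct, best_stump = correct, stump
    return best_stump


def train_forest(rows, labels, n_stumps=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random signal."""
    rng = random.Random(seed)
    features = list(rows[0].keys())
    forest = []
    for _ in range(n_stumps):
        idx = [rng.randrange(len(rows)) for _ in rows]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(train_stump(sample_rows, sample_labels, rng.choice(features)))
    return forest


def spam_score(forest, row):
    """Fraction of stumps that vote 'spam'."""
    return sum(stump_predict(s, row) for s in forest) / len(forest)


# Step 1: signals per account (hypothetical). Step 2: labels (1 = spam).
rows = [
    {"casts_per_hour": 0.2, "duplicate_ratio": 0.0},
    {"casts_per_hour": 0.5, "duplicate_ratio": 0.1},
    {"casts_per_hour": 1.0, "duplicate_ratio": 0.05},
    {"casts_per_hour": 0.8, "duplicate_ratio": 0.0},
    {"casts_per_hour": 12.0, "duplicate_ratio": 0.8},
    {"casts_per_hour": 20.0, "duplicate_ratio": 0.9},
    {"casts_per_hour": 15.0, "duplicate_ratio": 0.7},
    {"casts_per_hour": 9.0, "duplicate_ratio": 0.95},
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Step 3: train, then score unseen accounts.
forest = train_forest(rows, labels)
print(spam_score(forest, {"casts_per_hour": 30.0, "duplicate_ratio": 1.0}))
print(spam_score(forest, {"casts_per_hour": 0.1, "duplicate_ratio": 0.0}))
```

In practice a library implementation would be the natural choice; the hand-rolled version is only meant to make the voting idea visible.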

Varun Srinivasan
@v
Random forests can identify very subtle patterns in data. For example, we once had a spam ring in country X that would fire up all their bots at the same time. Because we fed in country and time of posts as signals, the model quickly learned that accounts that posted frequently at 10pm in that country were spammy. But what's very interesting is that it otherwise ignored country as a predictor. If you posted from that same country but had a more human-like pattern of posting around the clock, it didn't rank you as likely to be a spammer. Forests can get very sophisticated and layer dozens of signals to find such patterns. They can be retrained periodically to adapt as spammers change their behavior.
3 replies
0 recast
28 reactions
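
To make the country-and-time example above concrete, here is a hand-written stand-in for what a single learned branch of a tree might look like. The real trees are learned from data and are not public, so the function name `one_tree_path` and every cutoff (`flagged_hour`, `min_share`, `min_posts`) are assumptions. The point is only that the branch fires on the combination of signals, never on country alone.

```python
def one_tree_path(country, post_hours, flagged_country="X",
                  flagged_hour=22, min_share=0.8, min_posts=10):
    """Return True only if the account is in the flagged country AND
    most of its posts land in the same hour (22 = 10pm local time).

    post_hours is a list of local hours (0-23), one entry per post.
    """
    if country != flagged_country:
        return False  # country is not even consulted further
    if len(post_hours) < min_posts:
        return False  # too little activity to show a burst pattern
    share_at_flagged_hour = post_hours.count(flagged_hour) / len(post_hours)
    return share_at_flagged_hour >= min_share


# A bot-like account: every post at 10pm in country X.
print(one_tree_path("X", [22] * 12))
# Same country, but posts spread around the clock.
print(one_tree_path("X", list(range(24))))
# Same 10pm burst, different country.
print(one_tree_path("Y", [22] * 12))
```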

Varun Srinivasan
@v
It's not always intuitive what the best signals are. When I worked on fraud at Coinbase - which is a similar problem - one of our best signals was screen resolution. It turned out that fraudsters used a virtual machine that had a very odd screen resolution that most normal computers would never have. We've found this to be true in Farcaster data as well. I'm going to be more cagey about what the actual signals are, because revealing them will cause spammers to change their behavior making them harder to detect.
5 replies
0 recast
47 reactions

Varun Srinivasan
@v
Commonly suggested signals like onchain data don't work very well. It turns out that there are a lot of users with little or no blockchain activity who are quite interesting on social networks. And the opposite also tends to be true, which is that there are people with ENS names and other onchain activity who are aggressive spammers and airdrop farmers. We recently tested some onchain signals and found a near-zero improvement in predictive power. This may change over time as more activity moves onchain, but as of today it's not very useful.
3 replies
2 recasts
36 reactions

Varun Srinivasan
@v
The signals that tend to do very well fall into one of three categories:
1. Graph based - spammers often share similar patterns of activity which can be used to catch them
2. Behaviors - they also tend to do things a certain way, because they're being repetitive in their actions (e.g. posting at fixed intervals)
3. Textual - the content of their casts is often very predictive of their quality
7 replies
1 recast
39 reactions
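
Taking the second category above as an example, "posting at fixed intervals" can be turned into a number with the coefficient of variation of the gaps between posts. This is a guess at how such a signal could be computed, not the signal Farcaster actually uses; `interval_regularity` is a hypothetical name.

```python
from statistics import mean, pstdev


def interval_regularity(timestamps):
    """Coefficient of variation of the gaps between consecutive posts.

    timestamps are in seconds. A value near 0 means clockwork-like
    posting; human posting tends to be burstier, giving larger values.
    Returns None when there are too few posts to measure.
    """
    ts = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ts, ts[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None
    return pstdev(gaps) / mean(gaps)


# One post exactly every 10 minutes.
print(interval_regularity([0, 600, 1200, 1800, 2400]))
# Irregular, bursty posting.
print(interval_regularity([0, 60, 4000, 4100, 90000]))
```

A value like this would be one input among many. As the thread notes, no single signal decides the outcome on its own.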

Varun Srinivasan
@v
If you have any more questions about how spam works, please ask and I'll try to reply tomorrow (because it's getting late here)
9 replies
0 recast
23 reactions

jesse
@jbird
Appreciate your thought process. It's a very interesting challenge to tackle. The solution will never be perfect, but there has been a noticeable UX improvement from your efforts to suppress and remove spam incentives.
0 reply
0 recast
1 reaction

Disky.eth 🎩
@disky.eth
Thank you for the insights. But wouldn't a spammer learn from these and adapt?
1 reply
0 recast
1 reaction

Jacek.degen.eth 🎩
@jacek
100 $DEGEN
0 reply
0 recast
1 reaction

max ↑🎩
@baseddesigner.eth
broke it down nicely. a lot of people who suggest checking onchain activity don't realize that it's not only about spam bots but about spammy humans as well, which is where it gets real complex. warpcast may become the best client in all of social media in terms of spam amounts soon
0 reply
0 recast
0 reaction

Supertaster.degen.eth 🎩
@supertaster.eth
Very interesting to see the thinking and reasoning behind it. I admit I thought it was much less sophisticated. Great work!
0 reply
0 recast
0 reaction

Jhon
@jhonc.eth
I think too much time is being dedicated to spammers, and real users are being neglected. Speculating whether a user is spam or not leaves much to be desired. The focus should be on identifying genuine users, implementing verification methods, and making Warpcast more intuitive for newcomers who want to stay. When I first arrived, there were users with badges who behaved arrogantly. I stayed because I love cryptocurrencies, but as a social network, it has room for improvement. I love Warpcast and its community, but it needs to become more accessible, avoiding turning into a closed group that favors older members and disadvantages new ones.
0 reply
0 recast
0 reaction

OMGiDRAWEDit 🦎
@omgidrawedit
Super interesting read. Had no clue it was being looked at like this. Cheers V
0 reply
0 recast
0 reaction

Sofi 🎩
@sofi888
big thanks for this explanation and the great work you guys do!! how often do you reevaluate accounts which were marked as spam? Does the "marked with spam" label mean I'm a spammer in your above mentioned terminology or does it just mean that I have hidden replies to a person who doesn't follow me? I was marked as a spammer🙈 I do 2-3 casts daily (I thought the more casts, the spammier it looks). I cast at almost the same time every day at 10-11pm (when I don't need to take care of everybody around😆😆). never had casts like "f4f", gn, gm... I thoroughly think about every post I need to cast, except for 4-5 among 100 maybe😂😂
0 reply
0 recast
0 reaction