We're making the Warpcast spam dataset public. 

Over 400,000 accounts have been processed by our model, which determines the accounts that are most likely to generate inauthentic content or unwanted notifications.

https://github.com/warpcast/labels

Developers can use this to protect their apps from spammy users. 

Spam labels are provided as a JSONL file which follows the FIP: Labels specification (still in review). 

Data will be updated weekly with the latest labels.
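For anyone wiring this into an app, here is a minimal sketch of reading a JSONL labels file and collecting the flagged account IDs. The field names (`type`, `fid`, `label_value`) and the value that marks an account as spam are assumptions for illustration, not taken from the spec; check the repo and the FIP for the actual schema.

```python
import json
from typing import Iterable, Iterator, Set


def parse_labels(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty line of a JSONL stream."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def flagged_fids(records: Iterable[dict], spam_value: int) -> Set[int]:
    """Collect account IDs whose label matches spam_value.

    ASSUMPTION: each record looks like
    {"type": {"fid": <int>}, "label_value": <int>}.
    These keys are hypothetical; adjust them to the published schema.
    """
    flagged = set()
    for rec in records:
        fid = rec.get("type", {}).get("fid")
        if fid is not None and rec.get("label_value") == spam_value:
            flagged.add(fid)
    return flagged


# Inline sample data so the sketch runs without downloading anything.
sample = [
    '{"type": {"target": "fid", "fid": 1}, "label_value": 2}',
    "",
    '{"type": {"target": "fid", "fid": 2}, "label_value": 0}',
]
spam = flagged_fids(parse_labels(sample), spam_value=0)
```

With a real download you would pass an open file handle instead of `sample`, since iterating over a file yields its lines one at a time and keeps memory use flat even for a few hundred thousand records.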

While we've taken a lot of care to correct mistakes, it's possible that a small number of legitimate accounts are misclassified.

If you notice this, please reply to this thread or DM me. We will use these reports to improve the model.

I think parsing a file like this wouldn't be interesting in terms of resource consumption! I'll be converting Essen to an API soon.