Colin

Here's an idea for a Farcaster startup: 

- Sell labels for Farcaster accounts 
- Accounts can have multiple labels (e.g. airdrop farmer, onchain transactor)
- Labels can be either binary or a confidence score (0 to 1) 
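To make the label shape concrete, here's a rough sketch of how it could be stored. Everything in it is an assumption for illustration: the `AccountLabels` class, its `add` and `has` methods, and the 0.5 default threshold are made up, and binary labels are simply represented as a score of 0.0 or 1.0.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AccountLabels:
    """Hypothetical container: one Farcaster account, many labels."""
    fid: int  # Farcaster ID of the account
    # label name -> score in [0, 1]; a binary label is stored as 0.0 or 1.0
    scores: Dict[str, float] = field(default_factory=dict)

    def add(self, name: str, score: float) -> None:
        # Reject anything outside the 0-to-1 range described above.
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1], got {score}")
        self.scores[name] = score

    def has(self, name: str, threshold: float = 0.5) -> bool:
        # Illustrative cutoff: a consumer picks its own threshold per label.
        return self.scores.get(name, 0.0) >= threshold


acct = AccountLabels(fid=1)
acct.add("airdrop farmer", 0.9)        # confidence score
acct.add("onchain transactor", 1.0)    # binary label stored as 1.0
```

A client could then pick its own cutoff per label, e.g. `acct.has("airdrop farmer", threshold=0.95)`.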

Warpcast would pay to use this service if you did a good job.

I think this is a very interesting problem space. It’s similar to some work I was doing at Google: labeling & predicting abusive behavior for millions of SMS messages across 1B phone numbers a month.

We would:
- manually label millions of spam reports 
- create heuristic rules & train ML models using labeled data
- deploy heuristics & models to the device and on the server (bc different access to signals on each)
- rinse & repeat 

Heuristics got us surprisingly far. 
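A minimal sketch of what heuristic labeling could look like for Farcaster accounts. This is not the system we ran at Google. The feature names (`account_age_days`, `casts_per_day`, `duplicate_cast_ratio`, `onchain_tx_count`), the thresholds, and the "fraction of rules that fire" score are all assumptions made up for illustration.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]
Rule = Callable[[Features], bool]

# Hypothetical rules per label; thresholds are placeholders, not tuned values.
RULES: Dict[str, List[Rule]] = {
    "airdrop farmer": [
        lambda f: f["account_age_days"] < 30,
        lambda f: f["casts_per_day"] > 50,
        lambda f: f["duplicate_cast_ratio"] > 0.5,
    ],
    "onchain transactor": [
        lambda f: f["onchain_tx_count"] >= 10,
    ],
}


def score_account(features: Features) -> Dict[str, float]:
    """Score each label as the fraction of its rules that fire (0 to 1)."""
    return {
        label: sum(rule(features) for rule in rules) / len(rules)
        for label, rules in RULES.items()
    }


features = {
    "account_age_days": 10,
    "casts_per_day": 80,
    "duplicate_cast_ratio": 0.1,
    "onchain_tx_count": 25,
}
scores = score_account(features)
```

Scores like these can be shipped as-is, or used as rough labels alongside the manual ones when training a model later.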

The vast amount of accessible onchain and FC data (metadata, behavior, content) should make it much easier to build decent classifiers.

cma.xyz. Founder /paragraph, ceo /mirror, previously at Google & Coinbase 

Using heuristics and top-down thinking on the data makes classification easier because the results are easier to interpret. The ML model would be a bonus, possibly picking up patterns related to those heuristics.

We can't ask ML models to understand what humans think is valuable in this data (yet) without any guidance. It seems devs in crypto sometimes forget the classic data science cycle: ad hoc research/reports > automate that > then decide if you have to dig deeper with heavier tech like ML.