Colin
@colin
I think this is a very interesting problem space. It’s similar to some work I was doing at Google: labeling & predicting abusive behavior for millions of SMS messages across 1b phone numbers a month.

We would:
- manually label millions of spam reports
- create heuristic rules & train ML models using labeled data
- deploy heuristics & models to the device and on the server (bc different access to signals on each)
- rinse & repeat

Heuristics got us surprisingly far. The vast amount of accessible onchain and FC data (metadata, behavior, content) should make it much easier to build decent classifiers.
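
To make the "heuristic rules" step concrete, here is a minimal sketch of what a rule-based spam check over account signals could look like. Everything in it is an assumption for illustration: the `Account` fields, the rule names, and the thresholds are hypothetical and are not taken from the Google system or from any Farcaster API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Account:
    # Hypothetical signals; field names and units are illustrative only.
    age_days: int
    casts_per_day: float
    duplicate_cast_ratio: float  # share of casts that repeat earlier text
    onchain_tx_count: int


# Each rule returns True when the account looks abusive on that signal.
# Thresholds are made up and would need tuning against labeled reports.
RULES: dict[str, Callable[[Account], bool]] = {
    "new_and_hyperactive": lambda a: a.age_days < 7 and a.casts_per_day > 50,
    "mostly_duplicates": lambda a: a.duplicate_cast_ratio > 0.8,
    "no_onchain_history": lambda a: a.onchain_tx_count == 0,
}


def fired_rules(account: Account) -> list[str]:
    """Return the names of all rules that flag this account."""
    return [name for name, rule in RULES.items() if rule(account)]


def is_spam(account: Account, min_rules: int = 2) -> bool:
    """Flag the account when at least `min_rules` heuristics agree."""
    return len(fired_rules(account)) >= min_rules
```

Requiring several rules to agree is one simple way to trade recall for precision. The list of fired rules can also serve as features or weak labels when training a model later.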
5 replies
2 recasts
30 reactions

Ape/rture
@aperture
Using heuristics and top-down thinking on the data makes the classification process easier because the results just make more sense. The ML model would just be a bonus that might pick up some patterns related to those heuristics. We can't ask ML models to understand what humans think is valuable in this data (yet) without any guidance. It seems devs in crypto sometimes forget the classic data science cycle: ad hoc research/reports > automate that > then decide if you have to dig deeper with extensive tech like ML.
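
The last step of that cycle, deciding whether to dig deeper, can be sketched as scoring the automated heuristic against hand-labeled data first. This is only an illustration: the function names and the 0.9 target are hypothetical, not something stated in the thread.

```python
def precision_recall(predictions: list[bool], labels: list[bool]) -> tuple[float, float]:
    """Score heuristic predictions against manually assigned labels."""
    tp = sum(p and y for p, y in zip(predictions, labels))
    fp = sum(p and not y for p, y in zip(predictions, labels))
    fn = sum(not p and y for p, y in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def needs_ml(precision: float, recall: float, target: float = 0.9) -> bool:
    """Assumed decision rule: only reach for ML if the heuristic misses the target."""
    return precision < target or recall < target
```

If the heuristic already clears the target on both metrics, the ML model really is just a bonus. If not, the gap shows where a model would need to help.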
0 reply
0 recast
0 reaction