Michael Huang
@michaelhly
I just shipped a transformer-based SocialNLP toolkit for @farcaster. This library could help construct new ML-based feed algorithms based on text classification, instead of relying on manual curation, recasts/reactions, and chronological ordering. Check it out: https://github.com/michaelhly/FarGlot
5 replies
4 recasts
20 reactions
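The idea in the cast above, ranking a feed with a text classifier instead of recency or recast/reaction counts, can be sketched roughly as follows. The `Cast` type and `toy_score` function are illustrative stand-ins, not FarGlot's actual data structures or API; a real system would replace `toy_score` with a fine-tuned transformer returning something like P(high-quality | text).

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Cast:
    author: str
    text: str
    timestamp: int  # unix seconds

def rank_feed(casts: List[Cast], score: Callable[[str], float]) -> List[Cast]:
    """Order a feed by a per-cast model score instead of chronological order."""
    return sorted(casts, key=lambda c: score(c.text), reverse=True)

# Stand-in scorer for illustration only: penalizes a few spammy keywords.
# A real feed algorithm would call a trained text classifier here.
def toy_score(text: str) -> float:
    spammy = {"airdrop", "giveaway", "click"}
    words = text.lower().split()
    return 1.0 - sum(w in spammy for w in words) / max(len(words), 1)

feed = [
    Cast("a", "free airdrop click here", 3),
    Cast("b", "shipped a new NLP toolkit today", 2),
    Cast("c", "giveaway giveaway giveaway", 1),
]
ranked = rank_feed(feed, toy_score)
```

Note the feed is no longer in timestamp order: the newest cast ("a") no longer comes first.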
David
@promptrotator.eth
Nice. Do you suspect that there's a big difference between the text of a spam cast and a legitimate one?
1 reply
0 recast
0 reaction
Michael Huang
@michaelhly
- we'll first have to come up with heuristics for what counts as spam (i.e. create a test set based on what users report)
- then train a model to minimize loss against that set
- the nature of spam can change, so we'll have to re-tune over time

in short, it depends on how good we are at classifying ("labeling") spam
1 reply
0 recast
0 reaction
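The steps Michael lists (label from user reports, train to minimize loss, re-tune as spam drifts) could be sketched as a minimal training loop. The logistic regression over bag-of-words below is a deliberately simple stand-in for a transformer classifier; the example reports and labels are invented for illustration.

```python
import math
from typing import Dict, List, Tuple

def featurize(text: str) -> Dict[str, int]:
    # Bag-of-words counts; a transformer would replace this with learned
    # representations of the cast text.
    feats: Dict[str, int] = {}
    for w in text.lower().split():
        feats[w] = feats.get(w, 0) + 1
    return feats

def train(reports: List[Tuple[str, int]], epochs: int = 50, lr: float = 0.5) -> Dict[str, float]:
    """Logistic regression minimizing log loss over user-reported labels
    (1 = spam, 0 = legitimate)."""
    w: Dict[str, float] = {}
    for _ in range(epochs):
        for text, label in reports:
            x = featurize(text)
            z = sum(w.get(f, 0.0) * v for f, v in x.items())
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - label  # gradient of log loss w.r.t. the logit z
            for f, v in x.items():
                w[f] = w.get(f, 0.0) - lr * g * v
    return w

def predict(w: Dict[str, float], text: str) -> float:
    """Return P(spam | text) under the trained weights."""
    x = featurize(text)
    z = sum(w.get(f, 0.0) * v for f, v in x.items())
    return 1.0 / (1.0 + math.exp(-z))

# "Test set based on what users report" -- made-up examples for the sketch.
reports = [
    ("claim your free airdrop now", 1),
    ("free tokens click this link", 1),
    ("shipped a new toolkit for farcaster", 0),
    ("thoughts on feed ranking models", 0),
]
weights = train(reports)
```

Re-tuning for drift then amounts to appending newly reported casts to `reports` and calling `train` again on the updated set.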
Vespertilio
@vespertilio
This is really cool - given that spam changes and ML models can “drift” over time, do you have any thoughts on how to implement/architect this system to ingest user feedback to continually improve the spam classifier?
2 replies
0 recast
1 reaction
Michael Huang
@michaelhly
this is quite above my pay grade 😅 but perhaps the LLM can improve its performance on reasoning datasets by training on its own generated labels: https://arxiv.org/pdf/2210.11610.pdf
1 reply
0 recast
0 reaction
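The self-improvement idea in the linked paper (a model training on its own high-confidence outputs, no ground truth needed) resembles classic pseudo-labeling, which a feedback-ingesting spam classifier could borrow. A rough sketch, where `stub_model` is an invented placeholder for whatever classifier is currently deployed:

```python
from typing import Callable, List, Tuple

def pseudo_label(
    model: Callable[[str], float],
    unlabeled: List[str],
    threshold: float = 0.9,
) -> List[Tuple[str, int]]:
    """Turn the model's confident predictions on unlabeled casts into new
    training labels; uncertain casts are skipped rather than mislabeled."""
    new_labels: List[Tuple[str, int]] = []
    for text in unlabeled:
        p = model(text)  # P(spam | text) from the current model
        if p >= threshold:
            new_labels.append((text, 1))
        elif p <= 1.0 - threshold:
            new_labels.append((text, 0))
    return new_labels

# Placeholder model for illustration only.
def stub_model(text: str) -> float:
    return 0.95 if "airdrop" in text else 0.05 if "toolkit" in text else 0.5

labels = pseudo_label(stub_model, ["free airdrop", "nlp toolkit", "gm"])
```

The self-generated `labels` would then be merged with user reports before the next retraining pass, which is one way to address the drift question raised above.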
Vespertilio
@vespertilio
That’s an interesting study, thanks for sharing. It’s quite fascinating that LLMs can improve by training on labels they generated themselves, without any ground truth.
0 reply
0 recast
0 reaction