Royal
@royalaid.eth
> I'll just generate some embeddings, I have all the casts, it can't be that bad
> 🙃
2 replies
0 recasts
6 reactions
Harris
@harris-
let's put down the computers and go for a walk until this is all over bro
2 replies
0 recasts
1 reaction
Harris
@harris-
damn just realised I was missing a * 100 in this % output lmfao @royalaid.eth okay we're so back
1 reply
0 recasts
0 reactions
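A minimal sketch of the bug being described, assuming the percentage came from formatting a raw fraction without scaling it first:

```python
# Sketch of the "* 100" bug: formatting a raw fraction as a percent.
done, total = 1_234_567, 151_000_000
frac = done / total            # ~0.0082
print(f"{frac:.2f}%")          # buggy: prints "0.01%"
print(f"{frac * 100:.2f}%")    # fixed: prints "0.82%"
```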
Royal
@royalaid.eth
Bruh, mine slowed down. I might see how much it costs to generate them and open-source them
1 reply
0 recasts
1 reaction
Harris
@harris-
What happened? And what's the data source?
1 reply
0 recasts
0 reactions
Royal
@royalaid.eth
It's a SQL table generated from Neynar's parquet data. The goal is to use some form of embedding model to generate embeds for all the casts, but with 151 million non-empty casts as of the dump I'm using, it's just a lengthy process with my model, arctic-embed2. It doesn't help that I'm sure Ollama is adding overhead, so I'm going to bite the bullet, add PyTorch, and experiment w/ a few models.
1 reply
0 recasts
1 reaction
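A minimal sketch of the PyTorch route via sentence-transformers, assuming the Hugging Face checkpoint Snowflake/snowflake-arctic-embed-l-v2.0 is the same model Ollama ships as arctic-embed2:

```python
from sentence_transformers import SentenceTransformer

# Assumed checkpoint: Ollama's arctic-embed2 appears to correspond to
# Snowflake's arctic-embed 2.0 large model on Hugging Face.
model = SentenceTransformer("Snowflake/snowflake-arctic-embed-l-v2.0")

casts = [
    "gm",
    "I'll just generate some embeddings, it can't be that bad",
]
# Large batches keep the GPU saturated and avoid the per-request
# overhead a local HTTP server like Ollama adds on top of inference.
embeddings = model.encode(casts, batch_size=256, show_progress_bar=True)
print(embeddings.shape)  # (2, 1024): l-v2.0 produces 1024-dim vectors
```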
Royal
@royalaid.eth
side note, I'm gonna try to rip through just the power users' casts, but that is still ~15 million casts
0 reply
0 recasts
1 reaction
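A sketch of that power-user subset pulled straight from the parquet dump with DuckDB; the fid list, the column names ("fid", "text"), and the file glob are placeholders, not the actual Neynar schema:

```python
import duckdb

# Placeholder fids standing in for the power-user set; column names
# ("fid", "text") and the file glob are guesses at the parquet schema.
power_fids = [3, 194, 5650]

con = duckdb.connect()
subset = con.execute(
    """
    SELECT fid, text
    FROM read_parquet('casts/*.parquet')
    WHERE list_contains(?, fid)
      AND length(trim(text)) > 0  -- same "non-empty casts" filter
    """,
    [power_fids],
).fetch_arrow_table()
print(subset.num_rows)
```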