Kasra Rahjerdi
@jc4p
fam in ~18 hours we'll have 157 million casts with embeddings, pray for me it doesn't crash overnight
11 replies
11 recasts
111 reactions

shoni.eth
@alexpaden
not sure if it will help or not but here's mine + 8bit quantization after on f16. i rerun this every 5 min https://gist.github.com/alexpaden/b99668307e6e16c18e5ce581c8d719b8
1 reply
0 recast
1 reaction
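
For readers unfamiliar with the "8bit quantization" step mentioned above, here is a minimal sketch of one common scheme: symmetric per-vector int8 quantization. It is an assumption for illustration only. The linked gist may use a different scheme, and the names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(vec):
    """Map a list of floats to ints in [-127, 127] plus one scale factor.

    Hypothetical helper: symmetric per-vector scaling, not necessarily
    what the gist in this thread does.
    """
    max_abs = max((abs(x) for x in vec), default=0.0)
    # Avoid dividing by zero for an all-zero (or empty) vector.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(x / scale))) for x in vec]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Approximate the original floats from the int8 values and the scale."""
    return [q * scale for q in quantized]
```

Each value then takes one byte instead of two for f16, at the cost of a rounding error of roughly half the scale per component.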

shoni.eth
@alexpaden
^ it's cpu optimized for mac studio not gpu
1 reply
0 recast
1 reaction

Kasra Rahjerdi
@jc4p
love it, thank you! i'm planning on doing monthly ingestion of casts + monthly embeddings of those so hopefullyyyy this will be easier after i finish this giant backfill
1 reply
0 recast
0 reaction

shoni.eth
@alexpaden
yeah only 1 failure isn't bad at all. it took me like a week because i was trying to figure out optimizations, which i think i'm still missing a bunch of
1 reply
0 recast
1 reaction

Kasra Rahjerdi
@jc4p
silly me i thought the hard part would be writing a super optimized gRPC client 😂🙈
1 reply
0 recast
0 reaction

shoni.eth
@alexpaden
lol i casted about my learnings before (idk where) but the code prob says it better. basically preallocating memory buffers and, in my case, using mps. i think on avg i was at ~10k texts per second but i think if correctly optimized it could prob break 50-100k. gpu i think is faster for inference, idk. this code records the time taken per major section, which was helpful for me
1 reply
0 recast
1 reaction
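
The "time taken per major section" idea above can be sketched with the standard library alone. This is an illustrative assumption, not the code from the gist; `timed`, `section_seconds`, and `texts_per_second` are hypothetical names.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named pipeline section.
section_seconds = defaultdict(float)


@contextmanager
def timed(section):
    """Add the elapsed time of the wrapped block to section_seconds[section]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        section_seconds[section] += time.perf_counter() - start


def texts_per_second(num_texts, seconds):
    """Throughput figure comparable to the '~10k texts per second' above."""
    return num_texts / seconds if seconds > 0 else float("inf")
```

Wrapping each stage, for example `with timed("embed"): ...` and `with timed("write"): ...`, and printing `section_seconds` after a run shows which stage dominates before any optimization is attempted.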

Kasra Rahjerdi
@jc4p
how are you actually pulling the casts? i do a cloud hub + .net multiprocessing + multiplexed connections
1 reply
0 recast
0 reaction

shoni.eth
@alexpaden
i run the neynar parquet service, which is a bit pricey but has some bonus features like spam labels and stuff: https://github.com/alexpaden/neynar_parquet_importer. i take that database, which has all the fundamentals (updated every 5 min), then built another pipeline to run stuff like this to generate new columns and advanced analytics tables
2 replies
0 recast
1 reaction