Kasra Rahjerdi
@jc4p
hey, if you're wondering where the public data dump for April is: I need help figuring out the implications of my decisions with the Snapchain setup -- would love any ideas. The short of it: Snapchain prefers a 24/7 running instance, I can't afford to run a heavy box 24/7, and I'm stuck filtering 90GB of data on 8GB of RAM
10 replies
4 recasts
29 reactions
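
One way to filter a 90GB dump on an 8GB box is to stream it in record batches and push the filter down into the scan, so only one batch is ever materialized at a time. This is only a sketch, assuming the dump is a directory of Parquet files and that the filter is on a `timestamp` column -- the paths, column name, and cutoff value are placeholders, since the thread doesn't show the actual schema.

```python
import pyarrow.compute as pc
import pyarrow.dataset as ds
import pyarrow.parquet as pq

SRC = "april_dump/"              # assumed: directory of parquet files from the export
DST = "april_filtered.parquet"   # filtered output, written incrementally

dataset = ds.dataset(SRC, format="parquet")

# Assumed column name and cutoff -- adjust to the dump's real schema/units.
# The point is the filter is pushed into the scan, so only matching row
# groups get decoded instead of the whole 90GB.
cutoff_filter = pc.field("timestamp") >= 1743465600

scanner = dataset.scanner(filter=cutoff_filter, batch_size=64_000)

with pq.ParquetWriter(DST, dataset.schema) as writer:
    for batch in scanner.to_batches():   # one ~64k-row batch in memory at a time
        writer.write_batch(batch)
```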

artlu 🎩
@artlu
stoopid question (forgive me for asking)--what's keeping you from using Shuttle? https://github.com/farcasterxyz/hub-monorepo/tree/main/packages/shuttle
1 reply
0 recasts
2 reactions

Kasra Rahjerdi
@jc4p
not a dumb question at all -- basically i'm re-running the code i wrote for bulk retrieval (all time) instead of writing new code that only pulls the latest data, so i need something a lot more performant than shuttle. last time i benchmarked, my grpc client was 30-40x faster
1 reply
0 recasts
1 reaction
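
The "shuttle-style incremental update" alternative mostly comes down to keeping a high-water mark and only fetching past it. Here's a minimal sketch of that bookkeeping; `client.fetch_events_since(...)` is a hypothetical stand-in for whatever call the real gRPC client exposes (the thread doesn't show its API), and `handle` is whatever per-event processing the pipeline does.

```python
import json
import pathlib

STATE_FILE = pathlib.Path("checkpoint.json")   # local high-water mark

def load_checkpoint() -> int:
    """Last event id already processed; 0 means start from scratch."""
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())["last_event_id"]
    return 0

def save_checkpoint(event_id: int) -> None:
    STATE_FILE.write_text(json.dumps({"last_event_id": event_id}))

def sync_incremental(client, handle) -> None:
    """Fetch only events newer than the checkpoint, instead of re-running
    the full bulk export. `client.fetch_events_since` is a hypothetical
    method, not part of any published API."""
    last_id = load_checkpoint()
    for event in client.fetch_events_since(last_id):   # hypothetical API
        handle(event)
        last_id = max(last_id, event.id)
    save_checkpoint(last_id)
```

The trade-off in the thread stands either way: the checkpoint logic is new code to write and maintain, versus re-running the already-written bulk pull.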

artlu 🎩
@artlu
not knowing anything more, this is what I would consider at a high level:
- perform steps 1-3 on Right Hand Server (1.5 days the first time, $56/mo required cost)
- rsync the 90GB parquet file over to Left Hand Server (do some calcs first? buffer for error time)
- do steps 4-5 on Left Hand Server, then spin it down. Pennies per month

bonus: investigate Right Hand Server writing directly to Left Hand Server, rather than to a local file. Write speed can't be the bottleneck anymore? Requires Left Hand Server to stay running, but that's still only $5, or 10% more in cost.

still seems like a PITA, all to avoid writing shuttle-style incremental update code, vs a new node that must stay running all the time?
1 reply
0 recasts
2 reactions
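
The "bonus" idea above -- Right Hand Server writing straight to Left Hand Server instead of to a local file that later gets rsynced -- could look roughly like this sketch, assuming the output stays Parquet and using fsspec's SFTP backend over SSH. The host, username, and path are placeholders, not details from the thread.

```python
import fsspec
import pyarrow.parquet as pq

# Placeholder connection details -- "left-hand-server" stands in for the
# cheap always-on box; real host/user/path are not given in the thread.
fs = fsspec.filesystem("sftp", host="left-hand-server", username="kasra")

def write_batches_remotely(batches, schema, remote_path="/data/april.parquet"):
    """Stream record batches straight to the remote box over SFTP,
    skipping the local 90GB file and the separate rsync step."""
    with fs.open(remote_path, "wb") as remote_file:
        with pq.ParquetWriter(remote_file, schema) as writer:
            for batch in batches:
                writer.write_batch(batch)
```

Whether this actually beats writing locally and rsyncing depends on the link between the two boxes, and artlu's caveat still applies: the Left Hand Server has to stay running while the export is in flight.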