Varun Srinivasan pfp
Varun Srinivasan
@v
Anyone here worked on an eng team that processed more than 100M background jobs/day? Starting to run into issues with our setup. Looking for ideas on what people typically turn to at this scale.
24 replies
21 recasts
157 reactions

SchmRdty.eth 🎩⛓️ 🫂 pfp
SchmRdty.eth 🎩⛓️ 🫂
@schmrdty.eth
Curious to know about the setup you’re running now. Is that information classified? I wondered why I was getting a light flicker earlier…
1 reply
0 recast
0 reaction

Shane da Silva pfp
Shane da Silva
@sds
We’re using Faktory + Redis. Runtime is Node.js
3 replies
0 recast
2 reactions
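For readers who haven't run Faktory: it's a language-agnostic job server from the Sidekiq author, and faktory-worker is the usual Node.js client. A rough sketch of the one-job-per-message pattern Shane describes, with an invented job type and payload (not their actual code):

```ts
import faktory from 'faktory-worker';

// Producer side: enqueue one job per incoming Farcaster message.
// 'ProcessMessage' and the payload shape are illustrative only.
async function enqueueMessageJob(messageHash: string) {
  const client = await faktory.connect(); // in practice you'd reuse one connection
  await client.job('ProcessMessage', { hash: messageHash }).push();
  await client.close();
}

// Consumer side: register a handler and start pulling jobs from the Faktory server.
faktory.register('ProcessMessage', async ({ hash }: { hash: string }) => {
  // ...fetch the message by hash and do the actual work here...
});

await faktory.work();
```

At ~100M jobs/day that's roughly 1,100–1,200 pushes per second sustained, which is where per-job overhead starts to matter.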

Samuel pfp
Samuel
@samuellhuber
Shane, chat with @leewardbound. He's been running Redis at scale for a while
1 reply
0 recast
2 reactions

Leeward Bound pfp
Leeward Bound
@leewardbound
not quite at this scale but close 😁 it's been a while since I've had to do 10m+ daily jobs, thanks for the shout out tho
1 reply
0 recast
2 reactions

Samuel pfp
Samuel
@samuellhuber
add what you know here. likely helps lots of folks monitoring this thread. really value your input mate
1 reply
0 recast
2 reactions

Leeward Bound pfp
Leeward Bound
@leewardbound
well my first instinct - likely meaningless as I know nothing about their architecture - is "why 100m jobs daily?" and "what part of this is actually the bottleneck"? seems like there must be some batching that could be done. in lieu of firing 1k jobs per second, doing a batch every 250ms would be a ~250x reduction...
1 reply
0 recast
1 reaction
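To make the 250ms idea concrete, here's one shape a micro-batcher could take in Node.js. This is purely a sketch of the suggestion above (invented types and names, no knowledge of their actual pipeline): incoming messages are buffered in memory and flushed as a single batch job on a fixed interval.

```ts
type Message = { hash: string; payload: unknown };

// Buffers messages and enqueues one batch job per flush window,
// instead of one job per message. Error handling and backpressure omitted.
class MicroBatcher {
  private buffer: Message[] = [];

  constructor(
    private flushMs: number,
    private enqueueBatch: (batch: Message[]) => Promise<void>,
  ) {
    setInterval(() => void this.flush(), this.flushMs);
  }

  add(msg: Message) {
    this.buffer.push(msg);
  }

  private async flush() {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];               // swap buffers so new messages keep accumulating
    await this.enqueueBatch(batch); // one queue job carries the whole window's messages
  }
}

// ~1,000 messages/sec with a 250ms window becomes ~4 batch jobs/sec.
```

The tradeoff, as the replies below get into, is what happens when part of a batch fails.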

Shane da Silva pfp
Shane da Silva
@sds
Fair suggestion! We’re doing a lot of things, but in simple terms we’re enqueuing a job per Farcaster message. The network currently sees north of 5M messages per day, so at 10x growth 100M messages a day isn’t too far out if we’re planning ahead.
1 reply
0 recast
2 reactions

Shane da Silva pfp
Shane da Silva
@sds
Batching would introduce complexity in the event a subset of messages in a batch fails to process. Do you roll back all messages in the batch in that case? Do you hold open a transaction for the entire batch? Not trying to put words in your mouth. Just trying to provide context for keeping it a single job per message.
1 reply
0 recast
0 reaction

Leeward Bound pfp
Leeward Bound
@leewardbound
damn what kind of a failure rate are you seeing? I would add them to a retry queue for processing in the next batch I guess, but yeah again I don't know anything about your app or how it works, I don't mean to pitch "easy" solutions from an armchair 😂
1 reply
0 recast
0 reaction
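A sketch of the retry-queue idea Leeward is describing, under the assumption that each message can be processed (and committed) independently rather than inside one batch-wide transaction. Names are hypothetical:

```ts
type Message = { hash: string; payload: unknown };

// Process a batch item by item; failures don't roll back the rest of the batch,
// they just get re-enqueued for a later batch (or a dead-letter queue after N attempts).
async function processBatch(
  batch: Message[],
  handleOne: (msg: Message) => Promise<void>,
  enqueueRetry: (failed: Message[]) => Promise<void>,
): Promise<void> {
  const failed: Message[] = [];

  for (const msg of batch) {
    try {
      await handleOne(msg); // each message commits on its own
    } catch {
      failed.push(msg);     // record the failure, keep going
    }
  }

  if (failed.length > 0) {
    await enqueueRetry(failed);
  }
}
```

With a sub-0.1% failure rate (per Shane's next reply), retry batches stay tiny, so the extra queue traffic is negligible.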

Shane da Silva pfp
Shane da Silva
@sds
For messages specifically, under 0.1%, and the jobs that do fail get retried and (usually) succeed. Across all our jobs it’s variable since we’re interacting with a variety of third parties we don’t control. All that said, batching may still be an option worth exploring for some specific use cases.
0 reply
0 recast
1 reaction