Ξric Juta on Warpcast

Samuel

What issues do you currently have indexing Farcaster data, e.g. using Replicator, Shuttle or the APIs?

Jason Goldberg Ⓜ️ 💜

Samuel

What do you wish shuttle had? How’s your experience running it?

Ξric Juta

forked shuttle, made the message type processing concurrent. hard to estimate how many concurrent workers to allow/configure + the hardware requirements for a backfill, let alone how long it takes. also the replicator schema hadn't been fully ported to the shuttle schema, but ya, I guess materialised tables are personal decisions
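
As a rough illustration of the "how many concurrent workers" knob mentioned above, here is a minimal sketch of bounded-concurrency message processing. It is not shuttle's actual code: `HubMessage`, `Handler` and `processWithLimit` are hypothetical names, and the right `limit` would still have to be found by measuring on real hardware.

```typescript
// Hypothetical message shape; the real shuttle/hub types are richer.
type HubMessage = { type: string; fid: number; hash: string };

// Hypothetical handler, e.g. "write this message to Postgres".
type Handler = (msg: HubMessage) => Promise<void>;

// Process `messages` with at most `limit` handler calls in flight at once.
async function processWithLimit(
  messages: HubMessage[],
  limit: number,
  handler: Handler,
): Promise<void> {
  let next = 0; // index of the next unclaimed message, shared by all workers

  // Each worker keeps claiming the next message until none are left.
  const worker = async (): Promise<void> => {
    while (next < messages.length) {
      const msg = messages[next++];
      await handler(msg);
    }
  };

  const workerCount = Math.max(1, Math.min(limit, messages.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
}
```

Raising `limit` only helps until the database or the hub connection becomes the bottleneck, which is part of why backfill time is hard to predict up front.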

Samuel

Thought about just writing it in Rust? If you’re already making it concurrent and adding parallelism, why stay with nodejs?

Ξric Juta

meh, just ran it using bun. in the meantime I'm trying to gauge the ROI on backfilling the 1TB of data, will return to it later. shame there's no public process atm for just preloading the shuttle Message schema with a downloadable snapshot, especially since the replicator > shuttle schema changed. cc @sidshekhar

Samuel

So a public PSQL database, kept in sync using shuttle, to then initialize from?

Ξric Juta

ya basically skip the backfill process via shuttle

Samuel

Do you have specific indexing requirements? Currently exploring whether it's worth the effort to build something like shovel but for hubs, or if shuttle is enough for now

Ξric Juta

nothing in particular atm, just something flexible that's ready for decentralised channels and that I can make graph queries on + possibly embeddings in the future

Samuel

@whyshock get in here. Do you think the indexer will enable this? Then we build it. @ericjuta do you think having to index everything (shuttle has no filters) is too much?

Ξric Juta

I mean, shuttle hasn't even implemented all message types afaicr, but yeah, for social graph queries to be useful it would need all of the dataset, I imagine
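
To make the "needs all of the dataset" point concrete, here is a minimal in-memory sketch of a social graph query that joins users, channels and channel follows. Everything in it is an assumption for illustration: the `User`, `Channel` and `ChannelFollow` shapes are hypothetical, not the shuttle or replicator schema, and a channel `category` field in particular is made up.

```typescript
// Hypothetical row shapes, NOT the shuttle or replicator schema.
type User = { fid: number; username: string };
type Channel = { id: string; category: string }; // `category` is an assumed field
type ChannelFollow = { fid: number; channelId: string };

// Return the user rows for every fid that follows at least one channel
// in the given category.
function usersFollowingCategory(
  category: string,
  channels: Channel[],
  follows: ChannelFollow[],
  users: User[],
): User[] {
  // Step 1: channels belonging to the category.
  const channelIds = new Set(
    channels.filter((c) => c.category === category).map((c) => c.id),
  );
  // Step 2: fids that follow any of those channels.
  const fids = new Set(
    follows.filter((f) => channelIds.has(f.channelId)).map((f) => f.fid),
  );
  // Step 3: user data for those fids.
  return users.filter((u) => fids.has(u.fid));
}
```

If any of the three inputs is only partially indexed, the result silently misses users, which is the practical reason such queries tend to want the complete dataset.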

Ξric Juta

"get me all the user data of fids that have followed channels related to a certain category"
"rank viewable audience (those who followed) by their 7D follower growth and their own cast likes related to this cast hash"
can get pretty in the weeds

Samuel

yeah, for that you may need all the data, so you're well served by shuttle once it's there and by replicator for now. a shovel-style indexer would help in that you can say you only want /fitness and /running, for example, and your DB won't get cluttered with anything but these channels and the social graph (all users)
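
The filtering idea above could look roughly like this: keep every social graph message, but only keep casts from an allowlist of channels. This is a sketch under stated assumptions, not an existing shuttle or shovel feature. `IndexableMessage` and `shouldIndex` are hypothetical names, and the channel URLs are placeholders standing in for whatever parent URL identifies /fitness and /running.

```typescript
// Hypothetical, simplified message shapes for illustration only.
type IndexableMessage =
  | { kind: "cast"; fid: number; text: string; parentUrl?: string }
  | { kind: "link"; fid: number; targetFid: number }; // e.g. a follow

// Decide whether a message should be written to the database.
function shouldIndex(
  msg: IndexableMessage,
  allowedChannelUrls: Set<string>,
): boolean {
  // The social graph (all users' follows) is always kept.
  if (msg.kind === "link") return true;
  // Casts are kept only when they were posted in an allowed channel.
  return msg.parentUrl !== undefined && allowedChannelUrls.has(msg.parentUrl);
}

// Placeholder channel identifiers; real values depend on how channels are keyed.
const allowed = new Set([
  "https://example.com/channel/fitness",
  "https://example.com/channel/running",
]);
```

A predicate like this would run before the database write, so filtered-out casts never reach storage.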

Ξric Juta

tbf you can do that just in shuttle with a few adjustments. could measure the perf diff between the rust and bun runtimes for it. I'm not saying it's insubstantial, but it might not be worth the rust rewrite

Ξric Juta

like my point is, it might be worth waiting for the hub package to be fully rewritten in rust (although unsure if they fully want to port it from npm to crates)

Samuel

oh yeah, maybe, but that indexer would build on top of Hubs :D have played with the idea of rewriting hubs too, but I don't think hubs are the bottleneck for indexing as of now

Ξric Juta

ya, there is the teleport initiative too (hub spec in rust) to potentially contribute to, but that uses a different data storage. iirc they wanted to merge replicator and hub into one for easier snapshots etc

Samuel

ooh, that I didn't know, but you don't need to merge. it just adds complexity for anyone trying to only run a hub to verify messages and e.g. validate frame actions. would not like them being merged

Ξric Juta

yeah... last I saw, they're not having fun trying to implement a compatible hub client off the spec and protocol peering, though there might be progress on that. an alt hub is hard to maintain since it's early days

Ape/rture

@ericjuta and @samuellhuber tbh there is no need to run your own hub. We actually have one running. Farcaster indexing is actually fairly light for us. We are providing the historical data + realtime data for free, so builders can focus on higher level queries and their actual product. Happy to help out

Samuel

yep, that's why we ruled that out during discussions at the /fbi fellowship
