Samuel
@samuellhuber
What issue do you currently have indexing Farcaster data? e.g. using Replicator, Shuttle or APIs?
1 reply
0 recast
5 reactions

Jason Goldberg 🏳️‍🌈
@betashop.eth
Shuttle
1 reply
0 recast
2 reactions

Samuel
@samuellhuber
What do you wish Shuttle had? How’s your experience running it?
1 reply
0 recast
0 reaction

Ξric Juta
@ericjuta
forked shuttle, made the message type processing concurrent. hard to estimate how many concurrent workers to allow/configure + hardware requirements for a backfill, let alone how long it takes. also the replicator schema hadn't been fully ported to the shuttle schema, but ya I guess materialised tables are personal decisions
1 reply
0 recast
2 reactions
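
The worker-count question above can be made concrete with a small bounded-concurrency helper. This is a minimal sketch, not Shuttle's actual API: `processWithLimit`, `handler`, and the `limit` parameter are hypothetical names chosen for illustration, and a sensible value for `limit` would still have to be found by measuring against your own hub and database.

```typescript
// Hypothetical sketch: process a batch with at most `limit` handlers in flight.
// None of these names come from Shuttle; they are illustrative only.
async function processWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  handler: (item: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each worker keeps claiming the next index until the batch is exhausted.
  // JavaScript runs this on one thread, so `next++` cannot race.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await handler(items[i]);
    }
  }

  const workerCount = Math.min(limit, items.length);
  const workers: Promise<void>[] = [];
  for (let w = 0; w < workerCount; w++) {
    workers.push(worker());
  }
  await Promise.all(workers);
  return results; // same order as `items`
}
```

Making `limit` a plain parameter means the worker count can be tuned per message type without touching the processing logic.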

Samuel
@samuellhuber
Thought about just writing it in Rust? If you’re already making it concurrent and adding parallelism, why stay with Node.js?
1 reply
0 recast
1 reaction

Ξric Juta
@ericjuta
meh, just ran it using bun. in the interim, trying to gauge ROI on backfilling the 1TB of data, will return to it later. shame there's no public process atm for just preloading the shuttle Message schema from a downloadable snapshot, especially since the replicator > shuttle schema changed. cc @sidshekhar
1 reply
0 recast
1 reaction

Samuel
@samuellhuber
So a public PSQL being synced using shuttle to then initialize from?
1 reply
0 recast
0 reaction

Ξric Juta
@ericjuta
ya, basically skip the backfill process via shuttle
1 reply
0 recast
1 reaction

Samuel
@samuellhuber
Do you have specific indexing requirements? Currently exploring whether it is worth the effort to build Shovel, but for hubs, or if Shuttle is enough for now
1 reply
0 recast
0 reaction

Ξric Juta
@ericjuta
nothing in particular atm, just something flexible to be ready for decentralised channels and to make graph queries on + possibly embeddings in the future
1 reply
0 recast
2 reactions

Samuel
@samuellhuber
@whyshock get in here. Do you think indexer will enable this? Then we build it.
@ericjuta do you think having to index everything (shuttle has no filters) is too much?
1 reply
0 recast
0 reaction

Ξric Juta
@ericjuta
I mean, shuttle hasn't even implemented all message types afaicr, but yeah, for social graph queries to be useful it would need all of the dataset, I imagine
1 reply
0 recast
0 reaction

Ξric Juta
@ericjuta
"get me all the user data of fids that have followed channels related to a certain category"
"rank viewable audience (those who followed) by their 7D follower growth and their own cast likes related to this cast hash"
can get pretty in the weeds
1 reply
0 recast
0 reaction
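
To illustrate why a query like the second one wants the full dataset, here is a rough sketch of just the "7D follower growth" part over in-memory data. Everything here is an assumption for illustration: `FollowEvent`, `rankBySevenDayGrowth`, and the field names are hypothetical and not taken from the Shuttle or Replicator schemas. It also counts only new follows inside the window and ignores unfollows, so it is gross rather than net growth.

```typescript
// Hypothetical shape of a follow event; real link messages look different.
interface FollowEvent {
  followerFid: number;
  targetFid: number;
  timestampMs: number; // when the follow happened, in Unix milliseconds
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Rank the given audience fids by how many new followers each gained
// in the 7 days before `nowMs`. Ties are broken by ascending fid.
function rankBySevenDayGrowth(
  audience: readonly number[],
  follows: readonly FollowEvent[],
  nowMs: number,
): Array<{ fid: number; growth: number }> {
  const windowStart = nowMs - SEVEN_DAYS_MS;
  const growth = new Map<number, number>();
  for (const fid of audience) {
    growth.set(fid, 0);
  }
  for (const f of follows) {
    const current = growth.get(f.targetFid);
    // Skip follows of fids outside the audience or outside the window.
    if (current === undefined) continue;
    if (f.timestampMs < windowStart || f.timestampMs > nowMs) continue;
    growth.set(f.targetFid, current + 1);
  }
  const ranked: Array<{ fid: number; growth: number }> = [];
  growth.forEach((count, fid) => ranked.push({ fid, growth: count }));
  ranked.sort((a, b) => b.growth - a.growth || a.fid - b.fid);
  return ranked;
}
```

Note that the follow events for every audience member are needed, including follows from fids that have nothing to do with the channel in question, which is why a filtered index would struggle with this kind of ranking.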

Samuel
@samuellhuber
yeah, for that you may need all data, so you're well served on shuttle once it's there, and replicator for now. a shovel-style indexer would help in that you can say you only want /fitness and /running, for example, and won't get your DB cluttered with anything but these channels and the social graph (all users)
1 reply
0 recast
0 reaction
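
The filtering idea described above could look roughly like the sketch below: keep casts only from an allowlist of channels, but keep every follow so the social graph stays complete. This is an illustration under assumptions, not an existing Shuttle or Shovel feature. `IndexedMessage`, `makeChannelFilter`, and the example channel URLs are hypothetical, and it assumes each cast carries a parent URL that identifies its channel.

```typescript
// Hypothetical, simplified message shapes; real hub messages carry more fields.
type IndexedMessage =
  | { kind: "cast"; fid: number; hash: string; parentUrl?: string }
  | { kind: "link"; fid: number; targetFid: number };

// Build a predicate that keeps all follows (links) plus casts whose
// parent URL is in the allowlist of channel URLs.
function makeChannelFilter(
  allowedChannelUrls: readonly string[],
): (msg: IndexedMessage) => boolean {
  const allowed = new Set(allowedChannelUrls);
  return (msg) => {
    if (msg.kind === "link") return true; // always keep the social graph
    return msg.parentUrl !== undefined && allowed.has(msg.parentUrl);
  };
}

// Placeholder URLs standing in for the /fitness and /running channels.
const keepMessage = makeChannelFilter([
  "https://example.com/channel/fitness",
  "https://example.com/channel/running",
]);
```

Applying `keepMessage` before the database write is what would keep the DB limited to the chosen channels plus the full follow graph.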

Ξric Juta
@ericjuta
tbf you can do that just in shuttle with a few adjustments. could measure the perf diff between the rust and bun runtimes for it. I'm not saying it's insubstantial, but it might not be worth the rust rewrite
2 replies
0 recast
1 reaction

Ξric Juta
@ericjuta
like my point is, it might be worth waiting for the hub package to be fully rewritten in rust (although unsure if they fully want to port it from npm to crates)
1 reply
0 recast
1 reaction

Samuel
@samuellhuber
oh yeah maybe, but that indexer would build on top of Hubs :D
have played with the idea of rewriting hubs too, but don't think hubs are the bottleneck for indexing as of now
1 reply
0 recast
1 reaction

Ξric Juta
@ericjuta
ya, there is the teleport initiative too (hub spec in rust), potentially to contribute to, but that uses a different data storage. iirc they wanted to merge replicator and hub in one for easier snapshots etc
1 reply
0 recast
1 reaction

Samuel
@samuellhuber
ooh, that I didn't know, but you don't need to merge. it just adds complexity for anyone trying to only run a hub to verify messages and e.g. validate frame actions. would not like them being merged
1 reply
0 recast
1 reaction

Ξric Juta
@ericjuta
yeah... they're not having fun trying to implement a compatible hub client off the spec and protocol peering last I saw, might be progress on that. keeping an alt hub is hard to maintain since it's early days
2 replies
0 recast
0 reaction

Samuel
@samuellhuber
yep that's why during discussions at the /fbi fellowship we ruled that out
0 reply
0 recast
1 reaction

Ape/rture
@aperture
@ericjuta and @samuellhuber tbh there is no need to run your own hub. We actually have one running. Farcaster indexing is actually fairly light for us. We are providing the historical data + realtime data for free, so builders can focus on higher level queries and their actual product. Happy to help out
1 reply
0 recast
1 reaction