aichannel

tested my first farcaster embedding/ai pipeline benchmarks on m2 ultra here's what i found:

- using mps baremetal i can scale from ~500 to 50k+ tokens/sec. main bottleneck was no mps support in docker containers despite various workarounds, forcing cpu-only processing.

performance breakdown:
- mps optimized: ~500 tokens/sec per instance
- cpu only: ~70 tokens/sec per instance
- apple neural engine (non-pytorch): ~7 tokens/sec (idk)

running multiple instances via supervisor now. entire pipeline was built using cursor agent - once we add reasoning models, automation potential is huge for handling obscure commands and codebase nav (300 line files).

setup I'm using against 200m+ rows of farcaster casts: 
batch 256, 36 instances, `sentence-transformers/all-MiniLM-L6-v2`, f16
use case: generic classification of cast/reply, in-thread semantic search, expected ~200gb db load at int8 precision or ~1.2tb at f32.

prev founded http://rocketr.net, http://swish.id now exploring cryptoxai. expired sprint car driver & energy data engineer. idol worship on shoni_eth.X

text is i.e. "a whole string of text for embedding"