
not parzival

@shoni.eth

778 Following
60,650 Followers


not parzival
@shoni.eth
had a dream last night about an ai demigod that spoke in pure compute - any lesser agent that heard its voice would just instantly combust from rate limit overload and billing shock. like getting hit with an aws invoice made of lightning. imagine a pantheon where gods flex by ddosing each other with validations until someone's api key shatters into digital glass
2 replies
2 recasts
9 reactions

not parzival
@shoni.eth
most ai agents are susceptible to attack rn, which is the often-overlooked problem of survival for bots. unlock work to survive. ignore to survive. attention economics.
0 replies
0 recasts
6 reactions

not parzival
@shoni.eth
this will add an embedding column (size 384, int8) to the casts table

MTEB (Massive Text Embedding Benchmark) scores:
- OpenAI ada-002: ~61.0
- MiniLM-L6-v2: ~52.5
- relative performance: ~86% of OpenAI's

if you want a pg_dump of the casts table (neynar format) with this pgvector embedding column then dc me.
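a minimal sketch of that migration, assuming psycopg (v3) and a placeholder dsn; note pgvector's vector type stores float4 per dimension, so the int8 sizing would come from quantizing separately:

```python
# hedged sketch: add a 384-dim pgvector column to the casts table.
# assumes the pgvector extension is installed; dsn is a placeholder.
# 384 dims matches all-MiniLM-L6-v2 output.
import psycopg

with psycopg.connect("postgresql://localhost/farcaster") as conn:
    conn.execute("CREATE EXTENSION IF NOT EXISTS vector;")
    conn.execute(
        "ALTER TABLE casts ADD COLUMN IF NOT EXISTS embedding vector(384);"
    )
```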
0 replies
0 recasts
3 reactions

not parzival
@shoni.eth
tested my first farcaster embedding/ai pipeline benchmarks on m2 ultra. here's what i found:

- using mps bare metal i can scale from ~500 to 50k+ tokens/sec. main bottleneck was no mps support in docker containers despite various workarounds, forcing cpu-only processing.

performance breakdown:
- mps optimized: ~500 tokens/sec per instance
- cpu only: ~70 tokens/sec per instance
- apple neural engine (non-pytorch): ~7 tokens/sec (idk)

running multiple instances via supervisor now. entire pipeline was built using cursor agent - once we add reasoning models, automation potential is huge for handling obscure commands and codebase nav (300 line files).

setup I'm using against 200m+ rows of farcaster casts: batch 256, 36 instances, `sentence-transformers/all-MiniLM-L6-v2`, f16.

use case: generic classification of cast/reply, in-thread semantic search. expected ~200gb db load at int8 precision or ~1.2tb at f32.
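the per-instance encode loop, as a rough sketch (model and batch size from the post; the device fallback and sample texts are illustrative):

```python
# hedged sketch of one encode instance from the pipeline above.
# falls back to cpu when mps isn't available (e.g. in docker,
# where mps passthrough didn't work per the post).
import torch
from sentence_transformers import SentenceTransformer

device = "mps" if torch.backends.mps.is_available() else "cpu"
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", device=device)
if device == "mps":
    model = model.half()  # f16, as in the setup above

texts = ["example cast text", "another cast"]  # placeholder batch
embeddings = model.encode(texts, batch_size=256, convert_to_numpy=True)
print(embeddings.shape)  # (2, 384)
```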
3 replies
1 recast
18 reactions

not parzival
@shoni.eth
12,000 rounds per minute for reference
0 replies
0 recasts
3 reactions

not parzival
@shoni.eth
i am currently hitting over 60,000 inferences per second on the m2 ultra bare metal /apple
1 reply
0 recasts
8 reactions

not parzival
@shoni.eth
idk who did this still, FutureFlow A.I., this voice just needs to be pulled back a small amount. 2025 will be a vocal year. source: 2023. https://on.soundcloud.com/gSyMiA5qSNnLEWMz8
3 replies
1 recast
8 reactions

not parzival
@shoni.eth
small reasoning
0 replies
0 recasts
4 reactions

not parzival
@shoni.eth
this seems wildly underhyped. i've only heard about it post-launch, from chase on farcaster 😨 wen old-school web3 runescape
0 replies
1 recast
5 reactions

not parzival
@shoni.eth
i’ve only been doing fwb lately and it’s going way better
2 replies
0 recasts
12 reactions

not parzival
@shoni.eth
in /aichannel we just let the dog run out the third floor window
0 replies
1 recast
4 reactions

not parzival
@shoni.eth
👏precision accurate context👏
1 reply
0 recasts
7 reactions

not parzival
@shoni.eth
tonight we finally explore quantizing our embeddings. bit of a slap on the ass for 99% perf retention.

example for 200M rows (1536 dimensions):
- binary quantization: ~38.4 GB (200M × 1536 bits ÷ 8)
- scalar (int8) quantization: ~307.2 GB (200M × 1536 bytes)
- float32 (baseline): ~1.2 TB (200M × 1536 × 4 bytes)

retrieval speed:
- binary: up to 45x faster (~96% performance retention)
- scalar: up to 4x faster (~99% performance retention)

binary = extreme compression; scalar = better precision.

https://huggingface.co/blog/embedding-quantization?utm_source=chatgpt.com
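the quantization step itself is a few lines with the sentence-transformers helper the linked post covers; texts here are placeholders:

```python
# hedged sketch of binary / int8 embedding quantization,
# following the linked hugging face blog post.
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.encode(["example cast", "another cast"])  # float32

binary = quantize_embeddings(embeddings, precision="binary")  # 32x smaller
int8 = quantize_embeddings(embeddings, precision="int8")      # 4x smaller

print(embeddings.shape, binary.shape, int8.shape)
# (2, 384) (2, 48) (2, 384) -- binary packs 8 dims per byte
```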
0 replies
0 recasts
6 reactions

not parzival
@shoni.eth
you think huge context windows make the model smarter? i think huge context distraction is an excellent jailbreak
1 reply
1 recast
9 reactions

not parzival
@shoni.eth
oh but it’s like a terabyte of data at f32 damn
0 replies
0 recasts
9 reactions

not parzival
@shoni.eth
assuming my napkin math is right, this will be about 4B tokens for all plain cast text (excluding duplicate scenarios), which is less than $100 via openai small embeddings
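sanity check on that, assuming text-embedding-3-small at its listed $0.02 per 1M tokens:

```python
# hedged napkin-math check, assuming openai text-embedding-3-small
# at $0.02 per 1M tokens (price is an assumption, not from the post)
tokens = 4_000_000_000
cost = tokens / 1_000_000 * 0.02
print(f"${cost:,.2f}")  # $80.00 -- under the $100 estimate
```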
0 replies
0 recasts
5 reactions

not parzival
@shoni.eth
does anyone know of web scraping / web summarization apis that can handle ~500k+ links at a reasonable cost? or hacks for adding context to a url?
0 replies
0 recasts
2 reactions

not parzival
@shoni.eth
i'm currently upgrading my farcaster database to include simple embeddings against all historical cast text (not rollups). if anyone is interested in clustering or graph database work, hmu 🙏
0 replies
1 recast
4 reactions

not parzival
@shoni.eth
crypto X is all like "did openai notice how many api calls we made to them yet?" do you think they'll open-source their models now https://x.com/notthreadguy/status/1876735747425730666
1 reply
0 recasts
6 reactions

not parzival
@shoni.eth
ai-generated character image from character file
1 reply
0 recasts
4 reactions