Some predictions on 2030 AI capabilities. But I think it's too pessimistic in its implications: if AI bug-finding is easy, then *the devs themselves* could use it to strip out bugs first.

Average code has 15-50 bugs per 1000 lines; if consumer bug-finders could catch 99%, then quite a few apps could become bug-free.
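
A rough back-of-envelope check of that claim, as a sketch. The 15-50 bugs per 1000 lines and the 99% catch rate come from the post. The app sizes and the Poisson assumption (leftover bugs are independent and rare) are mine, used only for illustration.

```python
import math


def residual_bugs(lines_of_code, bugs_per_kloc, catch_rate=0.99):
    """Expected bugs left after a finder removes `catch_rate` of them."""
    return lines_of_code / 1000 * bugs_per_kloc * (1 - catch_rate)


def p_bug_free(lines_of_code, bugs_per_kloc, catch_rate=0.99):
    """P(zero bugs left), ASSUMING leftover bugs are Poisson-distributed."""
    return math.exp(-residual_bugs(lines_of_code, bugs_per_kloc, catch_rate))


# Hypothetical app sizes; densities are the post's 15 and 50 bugs per KLOC.
for loc in (1_000, 10_000, 100_000):
    for density in (15, 50):
        left = residual_bugs(loc, density)
        chance = p_bug_free(loc, density)
        print(f"{loc:>7} lines @ {density}/KLOC: "
              f"{left:6.2f} bugs left, P(bug-free) = {chance:.3f}")
```

Under these assumptions a 1,000-line app ends up with about 0.15-0.5 expected bugs, while a 10,000-line app still has about 1.5-5, so the apps that plausibly reach zero are mostly the small ones.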

MATH by itself is not the hallmark of mathematical ability: a bunch of papers show that performance is sensitive even to prompt-level changes. A number on a leaderboard is good, but not entirely indicative of progress.

https://arxiv.org/abs/2402.06664 some recent work on LLM-based hacking

Casting here what I can't tweet elsewhere


Writing @ crypto.mirror.xyz


Research @ ritual.net