عمر.eth on Warpcast

@omarperacha.eth

Just got round to reading “The AI Scientist” paper by Sakana AI, written with teams in Oxford and Canada. What I found most exciting about this work is that, unlike most recent AI breakthroughs in the current “scaling war” era, this one really could have come from a team with modest resources. https://arxiv.org/abs/2408.06292

@omarperacha.eth

Rather than train a massive new model, the team creates an “agent” that uses and instructs existing LLMs and coding tools to do the work. Think of it as a command-line AI “app”, in the same way that Perplexity is a web app which uses existing models like Claude and GPT-4 as a core part of its workflow.

@omarperacha.eth

As background, I spent six years working in generative AI before becoming disillusioned with the resource requirements for the next generation of breakthroughs. I moved to crypto in part because I thought it’s still possible for a small team to do really meaningful work here. But research into agent-based AI applications could be a really fruitful new frontier for science. I didn’t really appreciate this until reading this paper.

@omarperacha.eth

The AI Scientist works from a set of specified topics, each supplied with some starter code. For example, you might give “language modelling” as a general research topic, and provide Karpathy’s NanoGPT as starter code. It then uses an existing LLM to generate research ideas on the topic and pick one it deems novel, edits the starter code to implement that idea, and runs experiments. Afterwards, the same LLM writes up the results in a paper, using LaTeX for the text and Python for plotting charts. If you wanted to explore a custom topic, you’d need to provide starter template code for it too. Naturally, it really only lends itself to software-based research atm.
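
To make that loop concrete, here is a rough Python sketch of the flow as I understand it. This is not the paper's code: every name (`Template`, `generate_ideas`, `pick_novel`, `fake_llm` and so on) is made up for illustration, the LLM is a stub function, and the "experiment" step only counts lines instead of actually running anything.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: any function mapping a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class Template:
    topic: str         # e.g. "language modelling"
    starter_code: str  # e.g. the source of a small training script


def generate_ideas(llm: LLM, template: Template, n: int = 3) -> List[str]:
    # Ask the LLM for n candidate research ideas on the topic.
    return [llm(f"Propose research idea #{i + 1} on {template.topic}") for i in range(n)]


def pick_novel(llm: LLM, ideas: List[str]) -> str:
    # Ask the LLM to score each idea, then keep the highest-scoring one.
    scores = [float(llm(f"Rate the novelty of this idea from 0 to 10: {idea}")) for idea in ideas]
    return ideas[scores.index(max(scores))]


def implement(llm: LLM, template: Template, idea: str) -> str:
    # Ask the LLM to edit the starter code so that it implements the idea.
    return llm(f"Edit this code to implement '{idea}':\n{template.starter_code}")


def run_experiment(code: str) -> Dict[str, int]:
    # Stand-in only: a real system would execute the code and collect metrics.
    return {"lines_of_code": len(code.splitlines())}


def write_paper(llm: LLM, idea: str, results: Dict[str, int]) -> str:
    # Ask the same LLM to write up the results.
    return llm(f"Write a LaTeX paper about '{idea}' with results {results}")


def ai_scientist(llm: LLM, template: Template) -> str:
    ideas = generate_ideas(llm, template)
    idea = pick_novel(llm, ideas)
    code = implement(llm, template, idea)
    results = run_experiment(code)
    return write_paper(llm, idea, results)


def fake_llm(prompt: str) -> str:
    # Deterministic stub so the sketch runs without any API key.
    if prompt.startswith("Propose"):
        return "idea<" + prompt.split("#")[1] + ">"
    if prompt.startswith("Rate"):
        return "7" if "idea<2 " in prompt else "3"
    if prompt.startswith("Edit"):
        return prompt.split("\n", 1)[1] + "\n# modified"
    return "\\documentclass{article} % " + prompt


template = Template(topic="language modelling", starter_code="print('train')")
paper = ai_scientist(fake_llm, template)
print(paper)
```

Swapping `fake_llm` for a function that calls a real model is the only change the rest of the loop would need, which is the "app on top of existing models" point from earlier in the thread.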

@omarperacha.eth

Overall, it’s just a really well-written paper. Easy to read and very open about its shortcomings, which are discussed in insightful detail. Well worth a read if you’re feeling inspired; otherwise there is a blog post version, along with fully open-sourced code 👌🏽 https://sakana.ai/ai-scientist/
