
Ankur Goyal

@ankrgyl

51 Following
66 Followers


Ankur Goyal
@ankrgyl
What do folks here think of Mastodon?
3 replies
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Prompts and Programs: how do compilers designed with LLMs in mind change the future of programming? https://basecase.vc/blog/prompts-programs
Covers 5 use cases:
- No Code 3.0
- LLM libraries
- Mimicking compilers
- Optimizing code
- LLM databases
w/ links to research and repos. Appreciate feedback!
1 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Working hard for databases and ML!
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Is anyone thinking about the intersection of blockchain + OSS monetization? I'm curious about the latest & greatest thinking.
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Attempting to ask in a non-pretentious way: which aspect feels (ontologically) difficult?
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Early thoughts: it will have a SQL parser, binder, optimizer, and interpreter (which either pushes down to a SQL database or a NoSQL database, or runs locally), so you can run against local files, multiple databases, NoSQL, etc. without having to rewrite code. https://i.imgur.com/JO8LgbM.png
0 reply
0 recast
0 reaction
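
To make the "push down or run locally" idea concrete, here is a minimal Python sketch. It is not the design from the cast: it skips the parser, binder, and optimizer entirely and only shows one tiny logical plan being executed by two interchangeable backends. The names `Plan`, `run_local`, `to_sql`, and `run_pushdown` are hypothetical, and SQLite stands in for "a SQL database".

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Tiny logical plan: SELECT column FROM table WHERE filter_col > threshold."""
    table: str
    column: str
    filter_col: str
    threshold: float


def run_local(plan, tables):
    """Interpret the plan directly over in-memory rows (lists of dicts)."""
    return [
        row[plan.column]
        for row in tables[plan.table]
        if row[plan.filter_col] > plan.threshold
    ]


def to_sql(plan):
    """Compile the same plan to SQL text plus bound parameters.

    Identifiers are interpolated as-is, so this sketch assumes they are trusted.
    """
    sql = (
        f'SELECT "{plan.column}" FROM "{plan.table}" '
        f'WHERE "{plan.filter_col}" > ?'
    )
    return sql, (plan.threshold,)


def run_pushdown(plan, conn):
    """Push the plan down to a SQL database instead of interpreting it locally."""
    sql, params = to_sql(plan)
    return [r[0] for r in conn.execute(sql, params)]


def demo():
    """Run one plan against both backends; rows are placeholder data."""
    rows = [
        {"name": "a", "score": 1.0},
        {"name": "b", "score": 3.0},
        {"name": "c", "score": 5.0},
    ]
    plan = Plan(table="items", column="name", filter_col="score", threshold=2.0)

    local = run_local(plan, {"items": rows})

    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE items ("name" TEXT, "score" REAL)')
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(r["name"], r["score"]) for r in rows],
    )
    pushed = run_pushdown(plan, conn)
    conn.close()

    # Sort both sides because SQL makes no ordering promise without ORDER BY.
    return sorted(local), sorted(pushed)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that the caller only builds a `Plan`; whether it runs over local rows or gets compiled to SQL is a backend decision, which is the "without having to rewrite code" property described above.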

Ankur Goyal
@ankrgyl
Essentially like "DBT" but backed by a real SQL compiler/optimizer/interpreter. Appreciate feedback! And if anyone wants to collab/brainstorm let me know.
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
I'm thinking about writing a new programming language: an ANSI SQL-compatible language that natively interoperates with real code (Python, TypeScript, Rust, etc.) and lets you manage your data model, views, etc. as first-class programming constructs.
3 replies
0 recast
0 reaction

Ankur Goyal
@ankrgyl
👋🏽 I actually cross-cast this thread: farcaster://casts/0x273f56a256a94d1937ff8e73a959b2ccf30a25c671eddbdec8dd3684b9c7f0e3/0xf2d25d780f9e236ff3469e949c8e52a7a32d627b6c7579ae5d72be7cca01f0dc
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
That also reflects a well-designed product (one where the documentation can be that intuitive in the first place).
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
I'm personally very optimistic about this intersection. Imagine a database that can answer SQL queries, natural language questions, or a combination of both! Lots of challenges to solve around accuracy & perf, but the basic pieces are all there.
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Memory transformers (https://arxiv.org/abs/2006.11527) are one attempt at solving for this. Generally speaking, I think splitting out the data from the reasoning capabilities will be a requirement for this use case.
1 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
However, data loading is very, very difficult. Database indexes update incrementally w/ new data. With LLMs, you need to fine-tune a model jointly on a representative set of input questions and the underlying data.
1 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Storage size/compression is also exciting. Emad has a great line that Stable Diffusion is Pied Piper, given how efficiently it compresses ~5B images into ~4GB of weights. Columnar compression is usually 4-10x, not 1000x.
1 reply
0 recast
0 reaction
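
A quick back-of-the-envelope check of the figures in the cast above, as a Python sketch. Only the ~5B images and ~4GB of weights come from the cast; the helper names are hypothetical, and any overall compression ratio depends on an average source image size that the cast does not state, so `compression_ratio` takes it as an input rather than assuming one.

```python
GB = 10**9  # decimal gigabyte; the cast's "~4GB" is approximate either way


def bytes_per_item(total_bytes, num_items):
    """Average number of stored bytes per item."""
    return total_bytes / num_items


def compression_ratio(original_bytes_per_item, stored_bytes_per_item):
    """How many times smaller the stored form is than the original."""
    return original_bytes_per_item / stored_bytes_per_item


# Figures from the cast: ~5B images, ~4GB of weights.
per_image = bytes_per_item(4 * GB, 5 * 10**9)
print(f"{per_image:.1f} bytes of weights per image")  # prints "0.8 bytes of weights per image"
```

In other words, the weights amount to less than one byte per training image, which is why the comparison to 4-10x columnar compression is so lopsided.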

Ankur Goyal
@ankrgyl
LLM queries are remarkably fast (constant time) for almost any query (w/ tunable cost via max length). The tradeoff of course is accuracy, since you cannot guarantee correctness. Note in my prompt I asked for citations. Verifying truthfulness is an active area of research.
1 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
One way of thinking about this pattern is "LLMs are an index". This is limiting b/c LLMs support unstructured data, but let's start here. Indexes are measured by query perf, storage size, and update cost.
1 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
First, a quick example. Let's query some knowledge about the S&P 500. Assuming you had growth rates saved in a database, this would be a SQL query like:
> SELECT "year" FROM sp_growth ORDER BY "growth_rate" DESC LIMIT 1
https://i.imgur.com/zOpUdoM.jpg
1 reply
0 recast
0 reaction
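
For reference, the query from the cast above can be run end to end with Python's built-in `sqlite3` module. This is a sketch under assumptions: the `sp_growth` schema (an integer `year` and a real `growth_rate`) is inferred from the query, the helper `best_year` is hypothetical, and the rows are placeholder values, not real S&P 500 figures.

```python
import sqlite3

# Hypothetical placeholder rows (year, growth_rate); NOT real S&P 500 data.
PLACEHOLDER_ROWS = [
    (2001, -0.10),
    (2002, 0.25),
    (2003, 0.05),
]


def best_year(rows):
    """Return the year with the highest growth_rate, or None if there are no rows."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: the cast only shows the column names used in the query.
        conn.execute('CREATE TABLE sp_growth ("year" INTEGER, "growth_rate" REAL)')
        conn.executemany("INSERT INTO sp_growth VALUES (?, ?)", rows)
        row = conn.execute(
            'SELECT "year" FROM sp_growth ORDER BY "growth_rate" DESC LIMIT 1'
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()


if __name__ == "__main__":
    print(best_year(PLACEHOLDER_ROWS))  # prints 2002 for the placeholder rows
```

Unlike the LLM answer, this result is exact for whatever rows are loaded, which is the correctness side of the tradeoff discussed in the rest of the thread.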

Ankur Goyal
@ankrgyl
Thoughts on GPT-3/LLMs as a "better database" from someone who has worked on relational databases for over a decade and AI for half that. tl;dr I think they have the potential to be (1/n)
1 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
Yeah this looks cool. LINQ is similar.
0 reply
0 recast
0 reaction

Ankur Goyal
@ankrgyl
I can't find any libraries that let you run SQL queries on native data structures (e.g. vectors). Am I missing something?
1 reply
0 recast
0 reaction