Artificial Analysis has gathered the top 100 LLMs into a single table, so you can easily pick the right one for your tasks.

You can filter by these parameters:

- Benchmarks: Chatbot Arena, MMLU, HumanEval, Index of evals, MT-Bench.
- Cost: input, output, average.
- Speed in tokens/sec: median, P5, P25, P75, P95 (if you know, you know).
- Latency: median, P5, P25, P75, P95.
- Context window size.
- Compatibility with the OpenAI library.
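
For anyone who does not read percentiles every day, here is a minimal sketch of what the P5/P25/P75/P95 columns mean. The measurements are made up for illustration, and linear interpolation between ranks is just one common convention; how Artificial Analysis computes its own figures is not stated here.

```python
import math


def percentile(samples, p):
    """Return the p-th percentile (0-100) using linear interpolation between ranks."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * p / 100
    lower = math.floor(rank)
    upper = math.ceil(rank)
    if lower == upper:
        return ordered[lower]
    weight = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical tokens/sec measurements for one model on one API (not real data).
speeds = [
    100, 85, 180, 92, 110, 40, 105, 96, 135, 88, 101,
    70, 115, 94, 107, 82, 120, 98, 112, 90, 103,
]

# Median is the same thing as P50.
summary = {f"P{p}": percentile(speeds, p) for p in (5, 25, 50, 75, 95)}
print(summary)
```

Read the result as a spread rather than a single number: P5 and P95 bracket what you get on a bad run and on a good run, while the median shows the typical case. For speed, higher is better, so a low P5 signals slow outliers; for latency it is the reverse, and a high P95 is the warning sign.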

Top pick in each category:
- Benchmarks: Claude 3 Opus, GPT-4 Turbo
- Cost: $0.06 per 1M tokens, Llama 3 (8B) via the Groq API
- Speed: 912.9 tokens/sec, Llama 3 (8B) via the Groq API
- Latency: 0.13 s, Mistral 7B via the Baseten API
- Context window size: 1M tokens, Gemini 1.5 Pro

Beautifully done.