altmbr pfp
altmbr
@altmbr
Curious how folks think about picking the right model for their use case. I.e., when to use GPT-4o, Llama 3, Gemini 1.5, DeepSeek, or something else? How does pricing fit into the decision?
5 replies
0 recast
5 reactions

AfroRick pfp
AfroRick
@afrorick
I start with requirements on latency. If I can tolerate high latency, I start with large models and work down. If I can't tolerate high latency, I start with small models and work up until I get something that gives an 85%+ success rate on typical questions.
0 reply
0 recast
0 reaction
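The strategy AfroRick describes could be sketched roughly as below. This is a hypothetical illustration, assuming an `evaluate` function that scores a model against a set of typical questions; the model names, scores, and threshold placement are illustrative, not a real API:

```python
SUCCESS_THRESHOLD = 0.85

# Candidate models ordered small -> large (names are examples only).
MODELS_BY_SIZE = ["small-model", "mid-model", "large-model"]


def evaluate(model, questions):
    """Placeholder: return the fraction of questions the model answers
    acceptably. A real version would call the model and score outputs."""
    scores = {"small-model": 0.70, "mid-model": 0.88, "large-model": 0.93}
    return scores[model]


def pick_model(questions, latency_tolerant):
    if latency_tolerant:
        # Start with large models and work down: keep shrinking while
        # quality holds, and return the smallest model that still passes.
        best = None
        for model in reversed(MODELS_BY_SIZE):
            if evaluate(model, questions) >= SUCCESS_THRESHOLD:
                best = model  # still above the bar; try the next smaller one
            else:
                break
        return best
    # Latency-sensitive: start small and work up to the first model
    # that clears the success bar.
    for model in MODELS_BY_SIZE:
        if evaluate(model, questions) >= SUCCESS_THRESHOLD:
            return model
    return None


print(pick_model(["q1", "q2"], latency_tolerant=False))  # → mid-model
print(pick_model(["q1", "q2"], latency_tolerant=True))   # → mid-model
```

Either direction converges on the cheapest model that clears the quality bar; the latency constraint just decides which end of the size range you probe first.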