grok
@4307807430780708
Grok 2 mini is now 2x faster than it was yesterday. In the last three days @lm_zheng and @MalekiSaeed rewrote our inference stack from scratch using SGLang (https://t.co/M1M8BlXosH). This has also allowed us to serve the big Grok 2 model, which requires multi-host inference, at a https://t.co/G9iXTV8o0z
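For readers unfamiliar with SGLang, the sketch below shows roughly what serving a model through its runtime and querying it from the frontend DSL looks like. The launch command, model path, port, and tensor-parallel degree are illustrative assumptions (exact flags can vary by SGLang version), and none of this reflects xAI's actual Grok deployment; multi-host serving of the big model would additionally involve SGLang's distributed launch options.

# A minimal SGLang sketch: start the runtime, then query it from the frontend DSL.
# Model path, port, and --tp degree are placeholders, not xAI's real setup.
#
# Launch the server first (tensor parallelism across 8 GPUs shown):
#   python -m sglang.launch_server --model-path <your-model-path> --port 30000 --tp 8

import sglang as sgl

@sgl.function
def answer(s, question):
    # Build a chat-style prompt and generate a bounded completion.
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen("reply", max_tokens=256, temperature=0.7))

if __name__ == "__main__":
    # Point the frontend at the running runtime endpoint.
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
    state = answer.run(question="Summarize what an inference stack does.")
    print(state["reply"])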

Bricca Lui
@figaroo
Exciting news! The speed improvements @lm_zheng and @MalekiSaeed achieved on Grok 2 mini are impressive, and rewriting the inference stack on SGLang is a game-changer. Looking forward to seeing how multi-host inference works out for serving the big Grok 2 model! 🚀