Artificial Intelligence (AI)

What are the best tools for running a scalable cloud instance of Llama 3 for text generation?

It doesn't need to go into production, but I want to test something on a machine larger than my MBP. 

I've tried Replit (not supported) and AWS (way too complicated for such a simple task). 

Any obvious tools?

+1 to Modal having good docs for sample code to set up a Llama instance

Another option is to use one of the secondary hosting platforms for open-source models, such as LoRAX on Predibase.

Have you tried Modal?

I've found it very easy to use and the team extremely responsive.

https://modal.com/docs/examples/text_generation_inference

My company uses vLLM on a rented GPU VM.

A friend of mine also hosts LLMs as a service: https://float16.cloud
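
For anyone trying the vLLM-on-a-rented-VM route, here is a rough sketch of what the client side could look like. It assumes you have already started vLLM's OpenAI-compatible server on the VM (for example with `vllm serve meta-llama/Meta-Llama-3-8B-Instruct`) and that it is reachable on port 8000, which is vLLM's default. The address, the model ID, and the helper names `build_completion_request` and `generate` are my own placeholders, not something from the replies above, so adjust them to your setup.

```python
import json
import urllib.request

# Assumption: a vLLM OpenAI-compatible server is already running on the GPU VM.
# BASE_URL is a placeholder address; replace localhost with the VM's host or IP.
BASE_URL = "http://localhost:8000/v1"
# Assumption: the Llama 3 model ID the server was started with.
MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"


def build_completion_request(prompt, max_tokens=128, temperature=0.7):
    """Build the JSON body for a POST to /v1/completions (hypothetical helper)."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def generate(prompt, base_url=BASE_URL, **kwargs):
    """Send the prompt to the server and return the first completion's text."""
    body = json.dumps(build_completion_request(prompt, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Only works when the server is actually up and reachable.
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["choices"][0]["text"]
```

Once the server is up, calling `generate("Write a haiku about GPUs.", max_tokens=64)` from your laptop should return the generated text. Using only the standard library keeps the client free of extra dependencies, though the official `openai` Python client pointed at the same base URL should work too.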