phil
@phil
What are the best tools for running a scalable cloud instance of Llama 3 for text generation? It doesn't need to go into production, but I want to test something on a machine larger than my MBP. I've tried Replit (not supported) and AWS (way too complicated for such a simple task). Any obvious tools?
4 replies
2 recasts
11 reactions
Zenigame
@zeni.eth
Have you tried Modal? I've found it very easy to use and the team extremely responsive. https://modal.com/docs/examples/text_generation_inference
0 reply
0 recast
1 reaction
Jason
@jachian
+1 to Modal having good docs with sample code for setting up a Llama instance. Another option is to use one of the secondary hosting platforms for open-source models, like LoRAX on Predibase.
0 reply
0 recast
0 reaction
Ren 🎩
@renatov.eth
Also interested in this question.
0 reply
0 recast
0 reaction
chompk
@chompk
My company uses vLLM on a rented GPU VM. A friend of mine also hosts LLMs as a service: https://float16.cloud
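For anyone trying the vLLM route, here is a rough sketch of what calling it could look like from a laptop. It assumes (none of this is from the thread) that the server was started on the GPU VM with `vllm serve meta-llama/Meta-Llama-3-8B-Instruct`, which exposes an OpenAI-compatible HTTP API on port 8000 by default, and that the VM is reachable at the placeholder `BASE_URL`. The helper names `build_payload` and `generate` are made up for illustration.

```python
import json
import urllib.request

# Assumptions, not from the thread: a vLLM OpenAI-compatible server is
# already running and reachable here (swap in your VM's address), and it
# is serving this model.
BASE_URL = "http://localhost:8000/v1"
MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"


def build_payload(prompt, max_tokens=128, temperature=0.7):
    """Build the JSON body for a /v1/completions request."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def generate(prompt, base_url=BASE_URL, **kwargs):
    """POST the prompt to the server and return the generated text."""
    body = json.dumps(build_payload(prompt, **kwargs)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    # OpenAI-style completion responses put the text under choices[0].text.
    return data["choices"][0]["text"]
```

With the server up, `generate("Write a haiku about GPUs.")` should return the completion as a string. Only the standard library is used on the client side, so nothing heavy needs to be installed on the MBP.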
0 reply
0 recast
0 reaction