JB Rubinovitz ⌐◨-◨
@rubinovitz
🚨📄 “PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU” https://ipads.se.sjtu.edu.cn/_media/publications/powerinfer-20231219.pdf

JB Rubinovitz ⌐◨-◨
@rubinovitz
Why it matters: “If this, or a derivative of this, works with MoE models, this + mixtral is basically ChatGPT local, super fast, super private, super uncensored (on a mid-range laptop)” https://huggingface.co/papers/2312.12456#658484b1e07563d421e2484b