Eric Platon pfp
Eric Platon
@ic
Recent Apple silicon with unified memory seems to have a strategic advantage over a traditional CPU/GPU pairing. E.g. the M2 Ultra with 192 GB of RAM, all of which can be allocated to any workload, including ML. And the M3 series is apparently just pushing further on this track. Anyone with concrete feedback on all this?
2 replies
0 recast
1 reaction
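(As a rough illustration of how unified memory shows up in practice, here is a minimal sketch assuming PyTorch with its MPS backend on Apple silicon; the layer sizes are arbitrary, not a benchmark.)

```python
# Minimal sketch: running a model on Apple silicon's unified memory via
# PyTorch's MPS (Metal Performance Shaders) backend.
import torch
import torch.nn as nn

# Fall back to CPU if the MPS backend is unavailable.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

# On unified memory, moving tensors "to the GPU" does not copy them across
# a PCIe bus; CPU and GPU address the same pool of RAM.
model = nn.Sequential(
    nn.Linear(4096, 4096),
    nn.GELU(),
    nn.Linear(4096, 4096),
).to(device)

x = torch.randn(8, 4096, device=device)
with torch.no_grad():
    y = model(x)
print(y.shape, y.device)
```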

Eric Platon pfp
Eric Platon
@ic
Personal context: I have one finger on the order button for a pricey Nvidia-based shared machine, having basically given up on Apple. But as a shared machine, could an M3 be faster and even more cost-effective?
1 reply
0 recast
0 reaction

Eric Platon pfp
Eric Platon
@ic
And asking the smart and trusted community here, as the real question is about, well, software support; difficult to get solid feedback without Warpcast.
1 reply
0 recast
0 reaction

vincent pfp
vincent
@pixel
GGML/llama.cpp is riding this Apple machine wave. Not a hardcore tinkerer, haven't tried it, but have seen lots of demos on the repo: https://github.com/ggerganov/llama.cpp
3 replies
0 recast
1 reaction
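(A minimal sketch of what running llama.cpp on Apple silicon can look like through the llama-cpp-python bindings; the model path below is a placeholder, and the package is assumed to have been built with Metal support so offloaded layers run on the Apple GPU out of the same unified memory pool.)

```python
# Minimal sketch: loading a GGUF model with llama-cpp-python and offloading
# all layers to the GPU (Metal on Apple silicon).
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-7b.Q4_K_M.gguf",  # hypothetical local path
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,       # context window
)

out = llm("Explain unified memory in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```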