frend
@frend
How come llama.cpp's convert_hf_to_gguf.py supports Q8_0 directly, but if you want Q4_0 you first have to convert to fp16 and then build llama-quantize to finally get there? It gets you there eventually, but I feel it could be simpler.
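
For context, the two-step path being described looks roughly like this (a sketch based on llama.cpp's documented CLI; the model path and output filenames are placeholders):

    # Q8_0 is a supported --outtype, so conversion is one step:
    python convert_hf_to_gguf.py ./my-hf-model --outtype q8_0 --outfile model-q8_0.gguf

    # Q4_0 is not an --outtype, so convert to f16 first...
    python convert_hf_to_gguf.py ./my-hf-model --outtype f16 --outfile model-f16.gguf
    # ...then quantize with the llama-quantize binary built from the repo:
    ./llama-quantize model-f16.gguf model-q4_0.gguf Q4_0

The split exists because the converter only writes simple types, while llama-quantize implements the full set of k-quant and legacy quantization schemes.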

horsefacts 🚂
@horsefacts.eth
you should join /ai, ping @gm8xx8