Frend
@frend
How come llama.cpp's convert_hf_to_gguf.py supports Q8_0 directly, but if you want Q4_0 you first have to convert to fp16 and then build llama-quantize to finally get it down to Q4_0? It gets you there eventually, but I feel it could be simpler.
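For reference, the two-step dance described above looks roughly like this, assuming a recent llama.cpp checkout; the model directory and output file names are placeholders:

# Q8_0 comes straight out of the converter:
python convert_hf_to_gguf.py ./my-hf-model --outtype q8_0 --outfile model-q8_0.gguf

# Q4_0 takes the detour: convert to fp16 first...
python convert_hf_to_gguf.py ./my-hf-model --outtype f16 --outfile model-f16.gguf

# ...then build llama-quantize and run it to get Q4_0:
cmake -B build && cmake --build build --target llama-quantize
./build/bin/llama-quantize model-f16.gguf model-q4_0.gguf Q4_0

The split exists because the Python converter only implements the simple cast-style output types (f32/f16/bf16/Q8_0), while the full set of quantization formats lives in the C++ llama-quantize tool.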

horsefacts
@horsefacts.eth
you should join /ai, ping @gm8xx8

Frend
@frend
I would, but I'm far from a qualified expert. I just use the easy tools like GPT4All and ComfyUI. I used to train my own models to generate new CryptoPunks, but that was a long time ago and I don't remember how I did it.