BF16 or Q8_K_XL - which would give more accurate coding results?

#6 · opened by TimothyRoo

If I have the space to run either, which would be better (accuracy for coding tasks), given that the model card says:
"This model is an Instruct model in FP8"

Unsloth AI org

BF16, but honestly the difference is quite minimal - something like a 0.01% difference.

Interesting: after being somewhat unhappy with GLM4.5-Air-Q8, I tried the BF16 version and was quite pleased with it (using it in Roo Code). Does that have to do with how GLM4.5-Air was originally encoded, where BF16 retains more of GLM4.5's original precision, while Devstral is already FP8, so Q8 isn't far off from it?

Quantization destroys coding abilities, especially in big models like DeepSeek and Kimi. The only model that passed my test of creating music in ChucK code was the regular GLM4.5 in Q8; the regular 4.6 failed, and Kimi and DeepSeek all failed. I also ran the music test on 4.5 Air Q8 to compare with the new 4.6V Q8; both failed, maybe because they are about 3x smaller than the big regular 4.5/4.6. I haven't tested your BF16 version yet; maybe it will pass. I will try it after testing other models.
4.6V-Flash in BF16, the smallest one, which fits on a single 3090 without spilling into RAM, also failed the music test.
I will migrate to ik_llama models soon on hardware I have prepared; they say coding ability in those quantizations is preserved like in the original. You need maybe 800 GB of RAM for that (I managed to get it when a 64 GB module was just $100) plus some RTX GPU for Kimi K2; in my tests, RAM alone is not enough for that model format.

About Devstral: ALL Mistral models in quantized Q8 form have always failed my test. I've tested them all, including their 100+ billion parameter Large, Codestral, etc., maybe except their vision one. I want to test this one in BF16; it fits on one of my PCs, but I think it will fail again. We have been waiting two years for any miracle from Mistral.

Quantization destroys information.
Regular quantization first destroys the information that has only been seen a few times in the training data and is not dispersed widely enough.
Imatrix-based quantization also destroys information, but it tries to keep everything similar to the imatrix evaluation data set and instead destroys anything not in that set a bit more!
It is a kind of subtractive fine-tuning: taking away from a larger model whatever you don't need. Think of sculpting to imagine what is going on.

For your use case I'd suggest you generate an imatrix file of your own and make sure to include ChucK examples in that evaluation data set.
The quantization process based on that imatrix file will then keep as much information about your use case as possible.
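As a rough illustration of that workflow with llama.cpp's tooling (a minimal sketch only; the model and file names below are placeholders, and exact binary names and flags may differ between builds):

```
# Build an importance matrix from a calibration file that includes ChucK examples.
# "chuck_calibration.txt" and the model/output names are hypothetical placeholders.
./llama-imatrix -m model-bf16.gguf -f chuck_calibration.txt -o chuck.imatrix

# Quantize the full-precision model, guided by that imatrix.
./llama-quantize --imatrix chuck.imatrix model-bf16.gguf model-q8_0.gguf Q8_0
```

Whatever text goes into the calibration file is what the quantizer will try hardest to preserve, so the more representative your ChucK examples are, the better.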

Yes. But it's not about ChucK coding itself, it's about the logic and creativity of the model. Every model, even quantized, can create ChucK code, but the majority of them make horrific logical or even nonsense errors (like adding functions that don't exist in ChucK). The goal of all the testing and ratings is to find the best "general" model that can create code correctly even when quantized. I haven't checked GLM4.5-Q8 in other languages yet, but for now it's the only model without errors in ChucK (after several attempts) (also DeepSeek 2.5 was good), and I'm sure the devs never included ChucK in an imatrix or even heard about it; the creativity and logic are simply better there.

You know perfectly well that all ratings and mainstream testing were compromised a long time ago for investment reasons; ordinary users must invent very unique tests to really find any improvements. I started in 2020 with OpenAI's free local GPT-2 on the first TensorFlow, which was just shy of 774 million parameters and produced only garbage text. After all these years, all I can say is that model size really matters, and for that reason I invested $1K in a server with a terabyte of RAM. Some improvement in model development is also present, but it is very slow. For example, DeepSeek really does make large models faster: on the same hardware, Meta's large 400+ billion parameter Llama barely runs, but the 600+ billion parameter DeepSeek is twice as fast, and I can really see that.
The next goal is to find models that can repair logical problems in code. I still can't find a quantized model that managed that on the first try, only after numerous attempts. GLM4.5 (359 billion) can create ChucK code correctly but failed at repairing logical errors in ChucK code, the same as ALL other models ever released. Unfortunately, that is what we have to work with and have access to.
