Apr 11: Updated with Google chat template fixes + more
#12
pinned
by danielhanchen - opened
Hey everyone, we’ve updated the quants again to include all of Google’s official chat template fixes (which fixed/improved tool-calling), along with the latest llama.cpp fixes.
We know there has been a lot of re-downloading lately, and we appreciate your patience. We push updates as soon as fixes become available so that you always have the latest and best-performing quants.
NVIDIA is working on the CUDA 13.2 issue. Until it is fixed, do not use CUDA 13.2.
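If you are unsure which CUDA toolkit is installed, here is a rough sketch of one way to check. It assumes `nvcc` is on your PATH and reads the "release X.Y" string that `nvcc --version` prints. The helper names (`parse_cuda_release`, `is_affected`) are made up for this example. Note that this reports the installed toolkit, which may differ from the CUDA version a prebuilt llama.cpp binary was compiled against, so treat it as a first check only.

```python
import re
import shutil
import subprocess

# The version named in this post; adjust if the guidance changes.
AFFECTED = (13, 2)


def parse_cuda_release(nvcc_output: str):
    """Return (major, minor) parsed from `nvcc --version` text, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_affected(nvcc_output: str) -> bool:
    """True if the reported toolkit release matches the affected version."""
    return parse_cuda_release(nvcc_output) == AFFECTED


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        print("nvcc not found on PATH; cannot check the toolkit version this way.")
    else:
        output = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True
        ).stdout
        release = parse_cuda_release(output)
        if release is None:
            print("Could not find a release number in the nvcc output.")
        elif release == AFFECTED:
            print("CUDA 13.2 toolkit detected; avoid it until the issue is fixed.")
        else:
            print(f"CUDA toolkit {release[0]}.{release[1]} detected.")
```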