Step-3.5-Flash-Base-Midtrain Abliterated i1
#1961 opened by wa999
wa999 changed discussion title from Step-3.5-Flash-Base-Midtrain i1 to Step-3.5-Flash-Base-Midtrain Abliterated i1
Sorry, but we don't abliterate on request, especially not models this big (I could maybe fit a 7B or so on my GPU); we only quant, so you'd have to ask someone else for that. But I queued this model for quantization =)
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Step-3.5-Flash-Base-Midtrain-GGUF for quants to appear.
Looks like this got removed?
What happened?
@wa999 The GGUFs failed our internal tests: they refuse to load in llama.cpp with "error loading model: error loading model hyperparameters: key not found in model: step35.context_length":
Step-3.5-Flash-Base-Midtrain + echo '{[[PROGRESS:dryrun...]]}'
Step-3.5-Flash-Base-Midtrain {[[PROGRESS:dryrun...]]}
Step-3.5-Flash-Base-Midtrain + DRYRUN=
Step-3.5-Flash-Base-Midtrain + llama llama-completion -m Step-3.5-Flash-Base-Midtrain.gguf~ --no-warmup -n 0 -t 1 -no-cnv -st
Step-3.5-Flash-Base-Midtrain ggml_cuda_init: failed to initialize CUDA: no CUDA-capable device is detected
Step-3.5-Flash-Base-Midtrain build: 8461 (bc190353) with GNU 14.2.0 for Linux x86_64
Step-3.5-Flash-Base-Midtrain main: llama backend init
Step-3.5-Flash-Base-Midtrain main: load the model and apply lora adapter, if any
Step-3.5-Flash-Base-Midtrain common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on
Step-3.5-Flash-Base-Midtrain llama_model_load: error loading model: error loading model hyperparameters: key not found in model: step35.context_length
Step-3.5-Flash-Base-Midtrain llama_model_load_from_file_impl: failed to load model
Step-3.5-Flash-Base-Midtrain llama_params_fit: encountered an error while trying to fit params to free device memory: failed to load model
Step-3.5-Flash-Base-Midtrain llama_params_fit: fitting params to free memory took 0.05 seconds
Step-3.5-Flash-Base-Midtrain llama_model_loader: loaded meta data with 42 key-value pairs and 753 tensors from Step-3.5-Flash-Base-Midtrain.gguf~ (version GGUF V3 (latest))
Step-3.5-Flash-Base-Midtrain llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 0: general.architecture str = step35
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 1: general.type str = model
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 2: general.name str = Step 3.5 Flash Base Midtrain
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 3: general.size_label str = 288x7.4B
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 4: general.license str = apache-2.0
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 5: step35.block_count u32 = 45
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 6: step35.embedding_length u32 = 4096
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 7: step35.feed_forward_length u32 = 11264
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 8: step35.attention.head_count arr[i32,45] = [64, 96, 96, 96, 64, 96, 96, 96, 64, ...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 9: step35.rope.freq_base f32 = 5000000.000000
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 10: step35.rope.freq_base_swa f32 = 10000.000000
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 11: step35.expert_gating_func u32 = 2
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 12: step35.attention.key_length u32 = 128
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 13: step35.attention.value_length u32 = 128
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 14: general.file_type u32 = 1025
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 15: step35.attention.head_count_kv arr[i32,45] = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 16: step35.attention.sliding_window u32 = 512
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 17: step35.attention.sliding_window_pattern arr[bool,45] = [false, true, true, true, false, true...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 18: step35.expert_count u32 = 288
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 19: step35.expert_used_count u32 = 8
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 20: step35.expert_feed_forward_length u32 = 1280
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 21: step35.expert_shared_feed_forward_length u32 = 1280
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 22: step35.expert_weights_scale f32 = 3.000000
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 23: step35.expert_weights_norm bool = true
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 24: step35.leading_dense_block_count u32 = 3
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 25: step35.moe_every_n_layers u32 = 1
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 26: step35.attention.layer_norm_rms_epsilon f32 = 0.000010
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 27: step35.swiglu_clamp_exp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 28: step35.swiglu_clamp_shexp arr[f32,45] = [0.000000, 0.000000, 0.000000, 0.0000...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 29: general.quantization_version u32 = 2
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 30: tokenizer.ggml.model str = gpt2
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 31: tokenizer.ggml.pre str = deepseek-v3
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 32: tokenizer.ggml.tokens arr[str,128896] = ["<｜begin▁of▁sentence｜>", "<｜...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 33: tokenizer.ggml.token_type arr[i32,128896] = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 34: tokenizer.ggml.merges arr[str,127741] = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 35: tokenizer.ggml.bos_token_id u32 = 0
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 36: tokenizer.ggml.eos_token_id u32 = 128007
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 37: tokenizer.ggml.padding_token_id u32 = 1
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 38: tokenizer.ggml.add_bos_token bool = true
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 39: tokenizer.ggml.add_sep_token bool = false
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 40: tokenizer.ggml.add_eos_token bool = false
Step-3.5-Flash-Base-Midtrain llama_model_loader: - kv 41: tokenizer.chat_template str = {% macro render_content(content) %}{%...
Step-3.5-Flash-Base-Midtrain llama_model_loader: - type f32: 265 tensors
Step-3.5-Flash-Base-Midtrain llama_model_loader: - type bf16: 488 tensors
Step-3.5-Flash-Base-Midtrain llama_model_loader: mmap is not supported for dry-run so it is now disabled
Step-3.5-Flash-Base-Midtrain print_info: file format = GGUF V3 (latest)
Step-3.5-Flash-Base-Midtrain print_info: file type = F16 (guessed)
Step-3.5-Flash-Base-Midtrain print_info: file size = 366.95 GiB (16.00 BPW)
Step-3.5-Flash-Base-Midtrain llama_model_load: error loading model: error loading model hyperparameters: key not found in model: step35.context_length
Step-3.5-Flash-Base-Midtrain llama_model_load_from_file_impl: failed to load model
Step-3.5-Flash-Base-Midtrain common_init_from_params: failed to load model 'Step-3.5-Flash-Base-Midtrain.gguf~'
Step-3.5-Flash-Base-Midtrain /llmjob/share/bin/quantize: line 235: 1340178 Segmentation fault DRYRUN= llama llama-completion -m "$SRC.gguf~" --no-warmup -n 0 -t 1 -no-cnv -st < /dev/null
Step-3.5-Flash-Base-Midtrain + echo 'dryrun failed'
Step-3.5-Flash-Base-Midtrain dryrun failed
Step-3.5-Flash-Base-Midtrain + exit 57
Step-3.5-Flash-Base-Midtrain job finished, status 57
Step-3.5-Flash-Base-Midtrain job-done<0 Step-3.5-Flash-Base-Midtrain noquant 57>
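Incidentally, the dry-run's print_info line (file size = 366.95 GiB at 16.00 BPW, i.e. bf16) is enough to sanity-check the model's total parameter count. A back-of-the-envelope calculation (variable names are mine, not from llama.cpp):

```python
# Parameter count implied by the dry-run's "file size" line:
# 366.95 GiB at 16.00 bits per weight (bf16).
size_gib = 366.95
bpw = 16.0

total_bits = size_gib * 2**30 * 8          # GiB -> bits
params = total_bits / bpw                  # bits / (bits per weight)
print(f"~{params / 1e9:.0f}B parameters")  # ~197B total
```

Roughly 197B total parameters, which is far below 288 × 7.4B since the size label 288x7.4B counts experts, and an MoE only stores each expert's comparatively small FFN alongside the shared layers.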
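For anyone wanting to confirm which metadata keys a GGUF actually contains (e.g. whether step35.context_length is really absent) without loading it in llama.cpp, scanning the key-value section of the header suffices. Below is a minimal sketch of the GGUF v3 layout (magic, u32 version, u64 tensor count, u64 KV count, then length-prefixed keys with typed values); the helper names are mine, not part of llama.cpp or the gguf package:

```python
import struct

# GGUF value type codes -> fixed size in bytes (variable-size types handled below)
_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
GGUF_STRING, GGUF_ARRAY = 8, 9

def _read_string(buf, off):
    # Strings are a u64 byte length followed by UTF-8 bytes.
    (n,) = struct.unpack_from("<Q", buf, off)
    off += 8
    return buf[off:off + n].decode("utf-8"), off + n

def _skip_value(buf, off, vtype):
    # Advance past one value of the given type without materializing it.
    if vtype in _SIZES:
        return off + _SIZES[vtype]
    if vtype == GGUF_STRING:
        _, off = _read_string(buf, off)
        return off
    if vtype == GGUF_ARRAY:
        # Arrays carry a u32 element type and u64 element count.
        etype, n = struct.unpack_from("<IQ", buf, off)
        off += 12
        for _ in range(n):
            off = _skip_value(buf, off, etype)
        return off
    raise ValueError(f"unknown GGUF value type {vtype}")

def gguf_metadata_keys(blob):
    """Return the metadata key names from the start of a GGUF v3 file."""
    assert blob[:4] == b"GGUF", "not a GGUF file"
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", blob, 4)
    off, keys = 24, []          # header is 24 bytes: magic + u32 + u64 + u64
    for _ in range(n_kv):
        key, off = _read_string(blob, off)
        (vtype,) = struct.unpack_from("<I", blob, off)
        off += 4
        off = _skip_value(blob, off, vtype)
        keys.append(key)
    return keys
```

Running this over the first few megabytes of the failing file and checking "step35.context_length" in the returned list would pinpoint whether the key is missing from the conversion output or the loader is looking it up under the wrong name.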