Step-3.5-Flash-Base-Midtrain Abliterated i1

#1961
by wa999 - opened
wa999 changed discussion title from Step-3.5-Flash-Base-Midtrain i1 to Step-3.5-Flash-Base-Midtrain Abliterated i1

Sorry, but we don't abliterate on request, especially models this big (I could maybe fit a 7B or so on my GPU); we only quant, so you'd have to ask someone else. But I did queue this model for quantization =)

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Step-3.5-Flash-Base-Midtrain-GGUF for quants to appear.

Looks like this got removed?
What happened?

@wa999 The GGUFs failed our internal tests: they fail to load in llama.cpp with `error loading model: error loading model hyperparameters: key not found in model: step35.context_length`:

Step-3.5-Flash-Base-Midtrain    + echo '{[[PROGRESS:dryrun...]]}'
Step-3.5-Flash-Base-Midtrain    {[[PROGRESS:dryrun...]]}
Step-3.5-Flash-Base-Midtrain    + DRYRUN=
Step-3.5-Flash-Base-Midtrain    + llama llama-completion -m Step-3.5-Flash-Base-Midtrain.gguf~ --no-warmup -n 0 -t 1 -no-cnv -st
Step-3.5-Flash-Base-Midtrain    ggml_cuda_init: failed to initialize CUDA: no CUDA-capable device is detected
Step-3.5-Flash-Base-Midtrain    build: 8461 (bc190353) with GNU 14.2.0 for Linux x86_64
Step-3.5-Flash-Base-Midtrain    main: llama backend init
Step-3.5-Flash-Base-Midtrain    main: load the model and apply lora adapter, if any
Step-3.5-Flash-Base-Midtrain    common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on
Step-3.5-Flash-Base-Midtrain    llama_model_load: error loading model: error loading model hyperparameters: key not found in model: step35.context_length
Step-3.5-Flash-Base-Midtrain    llama_model_load_from_file_impl: failed to load model
Step-3.5-Flash-Base-Midtrain    llama_params_fit: encountered an error while trying to fit params to free device memory: failed to load model
Step-3.5-Flash-Base-Midtrain    llama_params_fit: fitting params to free memory took 0.05 seconds
Step-3.5-Flash-Base-Midtrain    llama_model_loader: loaded meta data with 42 key-value pairs and 753 tensors from Step-3.5-Flash-Base-Midtrain.gguf~ (version GGUF V3 (latest))
Step-3.5-Flash-Base-Midtrain    llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   0:                       general.architecture str              = step35
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   1:                               general.type str              = model
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   2:                               general.name str              = Step 3.5 Flash Base Midtrain
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   3:                         general.size_label str              = 288x7.4B
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   4:                            general.license str              = apache-2.0
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   5:                         step35.block_count u32              = 45
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   6:                    step35.embedding_length u32              = 4096
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   7:                 step35.feed_forward_length u32              = 11264
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   8:                step35.attention.head_count arr[i32,45]      = [64, 96, 96, 96, 64, 96, 96, 96, 64, ...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv   9:                      step35.rope.freq_base f32              = 5000000.000000
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  10:                  step35.rope.freq_base_swa f32              = 10000.000000
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  11:                  step35.expert_gating_func u32              = 2
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  12:                step35.attention.key_length u32              = 128
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  13:              step35.attention.value_length u32              = 128
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  14:                          general.file_type u32              = 1025
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  15:             step35.attention.head_count_kv arr[i32,45]      = [8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, ...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  16:            step35.attention.sliding_window u32              = 512
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  17:    step35.attention.sliding_window_pattern arr[bool,45]     = [false, true, true, true, false, true...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  18:                        step35.expert_count u32              = 288
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  19:                   step35.expert_used_count u32              = 8
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  20:          step35.expert_feed_forward_length u32              = 1280
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  21:   step35.expert_shared_feed_forward_length u32              = 1280
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  22:                step35.expert_weights_scale f32              = 3.000000
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  23:                 step35.expert_weights_norm bool             = true
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  24:           step35.leading_dense_block_count u32              = 3
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  25:                  step35.moe_every_n_layers u32              = 1
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  26:    step35.attention.layer_norm_rms_epsilon f32              = 0.000010
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  27:                    step35.swiglu_clamp_exp arr[f32,45]      = [0.000000, 0.000000, 0.000000, 0.0000...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  28:                  step35.swiglu_clamp_shexp arr[f32,45]      = [0.000000, 0.000000, 0.000000, 0.0000...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  29:               general.quantization_version u32              = 2
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  30:                       tokenizer.ggml.model str              = gpt2
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  31:                         tokenizer.ggml.pre str              = deepseek-v3
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  32:                      tokenizer.ggml.tokens arr[str,128896]  = ["<｜begin▁of▁sentence｜>", "<｜...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  33:                  tokenizer.ggml.token_type arr[i32,128896]  = [3, 3, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  34:                      tokenizer.ggml.merges arr[str,127741]  = ["Ġ t", "Ġ a", "i n", "Ġ Ġ", "h e...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  35:                tokenizer.ggml.bos_token_id u32              = 0
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  36:                tokenizer.ggml.eos_token_id u32              = 128007
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  37:            tokenizer.ggml.padding_token_id u32              = 1
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  38:               tokenizer.ggml.add_bos_token bool             = true
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  39:               tokenizer.ggml.add_sep_token bool             = false
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  40:               tokenizer.ggml.add_eos_token bool             = false
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - kv  41:                    tokenizer.chat_template str              = {% macro render_content(content) %}{%...
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - type  f32:  265 tensors
Step-3.5-Flash-Base-Midtrain    llama_model_loader: - type bf16:  488 tensors
Step-3.5-Flash-Base-Midtrain    llama_model_loader: mmap is not supported for dry-run so it is now disabled
Step-3.5-Flash-Base-Midtrain    print_info: file format = GGUF V3 (latest)
Step-3.5-Flash-Base-Midtrain    print_info: file type   = F16 (guessed)
Step-3.5-Flash-Base-Midtrain    print_info: file size   = 366.95 GiB (16.00 BPW) 
Step-3.5-Flash-Base-Midtrain    llama_model_load: error loading model: error loading model hyperparameters: key not found in model: step35.context_length
Step-3.5-Flash-Base-Midtrain    llama_model_load_from_file_impl: failed to load model
Step-3.5-Flash-Base-Midtrain    common_init_from_params: failed to load model 'Step-3.5-Flash-Base-Midtrain.gguf~'
Step-3.5-Flash-Base-Midtrain    /llmjob/share/bin/quantize: line 235: 1340178 Segmentation fault      DRYRUN= llama llama-completion -m "$SRC.gguf~" --no-warmup -n 0 -t 1 -no-cnv -st < /dev/null
Step-3.5-Flash-Base-Midtrain    + echo 'dryrun failed'
Step-3.5-Flash-Base-Midtrain    dryrun failed
Step-3.5-Flash-Base-Midtrain    + exit 57
Step-3.5-Flash-Base-Midtrain    job finished, status 57
Step-3.5-Flash-Base-Midtrain    job-done<0 Step-3.5-Flash-Base-Midtrain noquant 57>
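The failure above comes down to a required metadata key being absent from the GGUF header: llama.cpp's `step35` loader looks for `step35.context_length`, and the dump of the 42 key-value pairs shows it was never written. As a minimal sketch (this is illustrative code, not the actual quantization tooling, and the key name is taken from the log above), you can check for a key by parsing the GGUF header directly with nothing but the standard library; for brevity this toy builds and reads headers whose metadata values are all `uint32`:

```python
import struct

GGUF_MAGIC = b"GGUF"
GGUF_TYPE_UINT32 = 4  # metadata value type id for uint32 in the GGUF spec

def build_minimal_gguf(kv_pairs):
    """Serialize a GGUF v3 header with uint32 metadata values and no tensors."""
    out = bytearray()
    out += GGUF_MAGIC
    out += struct.pack("<I", 3)               # format version
    out += struct.pack("<Q", 0)               # tensor count
    out += struct.pack("<Q", len(kv_pairs))   # metadata KV count
    for key, value in kv_pairs.items():
        kb = key.encode("utf-8")
        out += struct.pack("<Q", len(kb)) + kb          # length-prefixed key
        out += struct.pack("<I", GGUF_TYPE_UINT32)      # value type
        out += struct.pack("<I", value)                 # value payload
    return bytes(out)

def read_gguf_keys(data):
    """Parse the header and return the set of metadata keys present."""
    assert data[:4] == GGUF_MAGIC, "not a GGUF file"
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    off = 4 + 4 + 8 + 8
    keys = set()
    for _ in range(n_kv):
        (klen,) = struct.unpack_from("<Q", data, off); off += 8
        keys.add(data[off:off + klen].decode("utf-8")); off += klen
        (vtype,) = struct.unpack_from("<I", data, off); off += 4
        assert vtype == GGUF_TYPE_UINT32, "toy parser handles uint32 only"
        off += 4  # skip the uint32 payload
    return keys

# A header like the one in the log (block_count present, context_length missing):
blob = build_minimal_gguf({"step35.block_count": 45})
print("step35.context_length" in read_gguf_keys(blob))  # False -> load would fail
```

For real files, the `gguf` Python package that ships with llama.cpp does this parsing properly for all value types; the point here is only that the loader's `key not found in model` error is a pure metadata check, so the fix is on the conversion side, not the quantization side.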