Model Card for LFM2-2.6B-SFT-Multilingual-Thinkin

This model is a fine-tuned version of LiquidAI/LFM2-2.6B trained on the HuggingFaceH4/Multilingual-Thinking dataset using TRL.
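
The training data can be inspected directly with the datasets library. A minimal sketch (the "train" split name and the quick look at the first record are assumptions; check the dataset card for the exact layout):

from datasets import load_dataset

# Load the multilingual reasoning dataset reported for fine-tuning.
# The "train" split is assumed; inspect the dataset card if it differs.
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")
print(dataset.column_names)
print(dataset[0])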

Quick start


from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id, output_dir = "LiquidAI/LFM2-2.6B", "LFM2-2.6B-SFT-Multilingual-Thinkin"
adapter_model = f"lxyuan/{output_dir}"  # Fine-tuned adapter hosted on Hugging Face Hub

base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
fine_tuned_model = PeftModel.from_pretrained(base_model, adapter_model)

messages = [
    {
        "role": "system",
        "content": (
            "reasoning language: English\n\n"
            "Always respond with sarcasm, avoid directly answering the user's question, "
            "and ultimately end your reply with 'No'."
        ),
    },
    {
        "role": "user",
        "content": "Could you tell me what the weather is like today?",
    },
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(fine_tuned_model.device)

generated_ids = fine_tuned_model.generate(
    **model_inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.8,
    top_p=0.95
)

output_ids = generated_ids[0][len(model_inputs.input_ids[0]):]
generated_text = tokenizer.decode(output_ids, skip_special_tokens=True)

print(generated_text)

# Example Output:
# <think>
# Okay, the user is asking about the weather today. Let me think...
# Maybe start with a joke about the weather being as unpredictable as a cat.
# Add humor, avoid the question, and end with “No”. Let’s put it together.
# </think>
# Oh, the weather today? Let me check... *pauses dramatically* It’s a mystery!
# The clouds are either plotting a coup or just being lazy.
# The temperature? Classified. The sun? Judging you.
# Either way, bring an umbrella and a sense of humor. No.
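
If you prefer a standalone checkpoint that does not require peft at inference time, the adapter can be merged into the base weights. A minimal sketch, continuing from the quick-start snippet above and assuming the adapter is a LoRA-style adapter that supports merging (the output directory name is illustrative):

# Merge the LoRA adapter into the base model and save a standalone copy.
# The directory name below is illustrative; choose any local path.
merged_model = fine_tuned_model.merge_and_unload()
merged_model.save_pretrained("LFM2-2.6B-SFT-Multilingual-Thinkin-merged")
tokenizer.save_pretrained("LFM2-2.6B-SFT-Multilingual-Thinkin-merged")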

Training procedure

This model was trained with supervised fine-tuning (SFT); a sketch of a comparable TRL setup is shown below.
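
A minimal TRL SFTTrainer sketch along these lines could reproduce a similar setup. The LoRA configuration and hyperparameters below are illustrative placeholders, not the exact values used for this checkpoint:

from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Dataset the card reports for fine-tuning.
dataset = load_dataset("HuggingFaceH4/Multilingual-Thinking", split="train")

# Illustrative LoRA settings; the released adapter may use different values.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)

# Illustrative hyperparameters; not the exact training arguments used.
training_args = SFTConfig(
    output_dir="LFM2-2.6B-SFT-Multilingual-Thinkin",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    num_train_epochs=1,
    logging_steps=10,
)

trainer = SFTTrainer(
    model="LiquidAI/LFM2-2.6B",  # SFTTrainer can load the base model from its Hub id
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()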

Framework versions

  • TRL: 0.25.1
  • Transformers: 4.57.1
  • PyTorch: 2.9.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.1

Citations

Cite TRL as:

@misc{vonwerra2022trl,
    title        = {{TRL: Transformer Reinforcement Learning}},
    author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
    year         = 2020,
    journal      = {GitHub repository},
    publisher    = {GitHub},
    howpublished = {\url{https://github.com/huggingface/trl}}
}