HuggingFaceH4/Bespoke-Stratos-17k
Used Open R1 (by Hugging Face) to run supervised fine-tuning (SFT) on my earlier thinker models. The results are encouraging. Checkpoints are also included.
https://github.com/ewre324/open-r1/tree/main
Training follows the DeepSeek R1 approach: fine-tune on a dedicated reasoning dataset to encourage more thinking. The ... tags are still not generated. TODO.
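One way to track the open TODO is to measure how often sampled outputs contain a complete reasoning block. The snippet below is a minimal sketch, not part of the Open R1 code: `has_reasoning_block` and `tag_rate` are hypothetical helper names, and the tag name is passed in as a parameter because the text above does not spell it out.

```python
import re
from typing import Iterable


def has_reasoning_block(text: str, tag: str) -> bool:
    """Return True if `text` contains a non-empty <tag>...</tag> block."""
    # Escape the tag name so it is matched literally inside the pattern.
    name = re.escape(tag)
    # DOTALL lets the reasoning span multiple lines.
    match = re.search(rf"<{name}>(.+?)</{name}>", text, re.DOTALL)
    return match is not None and match.group(1).strip() != ""


def tag_rate(outputs: Iterable[str], tag: str) -> float:
    """Fraction of outputs that contain a non-empty reasoning block."""
    outputs = list(outputs)
    if not outputs:
        return 0.0
    return sum(has_reasoning_block(o, tag) for o in outputs) / len(outputs)
```

Calling `tag_rate(generations, tag_name)` on a batch of decoded generations from each checkpoint gives a single number to compare across training steps; `generations` and `tag_name` are placeholders to be supplied by the user.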
Base model
HuggingFaceTB/SmolLM2-135M