Qwen3.5 Dense-to-MoE Weight Transfer
Collection
Qwen3.5 MoE models produced by dual-source weight transfer (dense backbone + 35B-A3B experts), with hybrid DeltaNet + GQA attention. 6 items.
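As a rough illustration of what "dual-source" could mean, here is a minimal sketch that assembles one weight dictionary from two checkpoints: non-MLP tensors come from the dense model, while expert and router tensors come from the MoE model. The key names (`.mlp.experts.`, `.mlp.router.`), the helper `merge_dual_source`, and the list-of-floats tensors are all hypothetical and chosen for illustration; the actual transfer procedure used for these models is not documented here, and a real transfer would also have to reconcile tensor shapes between the two sources.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def is_expert_key(name: str) -> bool:
    """Return True for expert or router tensors (hypothetical naming scheme)."""
    return ".mlp.experts." in name or ".mlp.router." in name


def merge_dual_source(dense: Weights, moe: Weights) -> Weights:
    """Combine a dense backbone with MoE experts into one weight dict.

    Assumption: everything outside the MLP block (embeddings, attention,
    norms) is taken from the dense model; the dense MLP is dropped and
    replaced by the MoE model's experts and router.
    """
    merged: Weights = {}
    for name, tensor in dense.items():
        if ".mlp." not in name:  # skip the dense feed-forward weights
            merged[name] = tensor
    for name, tensor in moe.items():
        if is_expert_key(name):  # take only experts and router from the MoE source
            merged[name] = tensor
    return merged


# Tiny example with made-up parameter names and values.
dense = {
    "embed_tokens": [0.5],
    "layers.0.attn.q_proj": [0.1, 0.2],
    "layers.0.mlp.up_proj": [0.3],
}
moe = {
    "layers.0.attn.q_proj": [9.0, 9.0],
    "layers.0.mlp.experts.0.up_proj": [0.7],
    "layers.0.mlp.router.weight": [0.9],
}
merged = merge_dual_source(dense, moe)
for name in sorted(merged):
    print(name, merged[name])
```

In this toy run the attention weights keep the dense model's values, the dense `up_proj` is discarded, and the expert and router entries are copied from the MoE source.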
A tiny Qwen3.5 hybrid MoE model for testing and validation purposes.
This model has random weights and is not trained. It exists to validate the architecture implementation and hub upload pipeline.