Unofficial Mistral Community [deprecated]

community

Activity Feed Request to join this org

AI & ML interests

Unofficial org for community upload of Mistral's Open Source models.

Recent Activity

nielsr submitted a paper 3 days ago

Geometric Context Transformer for Streaming 3D Reconstruction

nielsr submitted a paper 10 days ago

A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

nielsr submitted a paper 16 days ago

MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios

View all activity

danielhanchen

posted an update 3 days ago

Post

2287

Qwen3.6-35B-A3B can now be run locally! 💜

The model is the strongest mid-sized LLM on nearly all benchmarks.

Run on 23GB RAM via Unsloth Dynamic GGUFs.

GGUFs to run: unsloth/Qwen3.6-35B-A3B-GGUF
Guide: https://unsloth.ai/docs/models/qwen3.6

10 replies

nielsr

submitted a paper to Daily Papers 3 days ago

Geometric Context Transformer for Streaming 3D Reconstruction

Paper • 2604.14141 • Published 4 days ago • 4

nielsr

submitted a paper to Daily Papers 10 days ago

A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

Paper • 2604.04913 • Published 13 days ago • 10

danielhanchen

posted an update 12 days ago

Post

5335

You can now fine-tune Gemma 4 for free with our notebooks! 🔥

You just need 8GB VRAM to train Gemma 4 locally!

Unsloth trains Gemma4 1.5x faster with 50% less VRAM.
GitHub: https://github.com/unslothai/unsloth
Guide + Notebooks: https://unsloth.ai/docs/models/gemma-4/train

5 replies

nielsr

submitted a paper to Daily Papers 16 days ago

MDPBench: A Benchmark for Multilingual Document Parsing in Real-World Scenarios

Paper • 2603.28130 • Published 20 days ago • 11

danielhanchen

posted an update 16 days ago

Post

3722

Google releases Gemma 4. ✨
Gemma 4 introduces 4 models: E2B, E4B, 26B-A4B, 31B.
The multimodal reasoning models are under Apache 2.0.

Run E2B and E4B on ~6GB RAM, and on phones. Run 26B-A4B and 31B on ~18GB.

GGUFs: https://huggingface.co/collections/unsloth/gemma-4
Guide: https://unsloth.ai/docs/models/gemma-4

danielhanchen

posted an update 19 days ago

Post

2702

A new way to use Unsloth.

Coming soon...

MaziyarPanahi

posted an update 19 days ago

Post

1379

Training mRNA Language Models Across 25 Species for $165

We built an end-to-end protein AI pipeline covering structure prediction, sequence design, and codon optimization. After comparing multiple transformer architectures for codon-level language modeling, CodonRoBERTa-large-v2 emerged as the clear winner with a perplexity of 4.10 and a Spearman CAI correlation of 0.40, significantly outperforming ModernBERT. We then scaled to 25 species, trained 4 production models in 55 GPU-hours, and built a species-conditioned system that no other open-source project offers. Complete results, architectural decisions, and runnable code below.

https://huggingface.co/blog/OpenMed/training-mrna-models-25-species

danielhanchen

posted an update 25 days ago

Post

902

You don’t need to set LLM parameters anymore! 🚀

llama.cpp uses only the context length + compute your local setup needs. Unsloth also auto-applies the correct model settings

Try in Unsloth Studio - now with precompiled llama.cpp binaries.

GitHub: https://github.com/unslothai/unsloth

2 replies

MaziyarPanahi

posted an update 25 days ago

Post

2226

We annotated 119K medical images with two frontier VLMs (Qwen 3.5, Kimi K2.5), cross-validated at 93% agreement, and produced 110K training records, all for under $500. Fine-tuning 3 small models (2-3B params) improved all benchmarks: best model reaches +15.0% average exact match.

Everything is open-sourced: datasets, adapters, and code.

https://huggingface.co/blog/OpenMed/synthvision

2 replies

nielsr

submitted a paper to Daily Papers 27 days ago

Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders

Paper • 2603.19209 • Published about 1 month ago • 5

nielsr

submitted 2 papers to Daily Papers about 1 month ago

V-JEPA 2.1: Unlocking Dense Features in Video Self-Supervised Learning

Paper • 2603.14482 • Published Mar 15 • 30

Omnilingual MT: Machine Translation for 1,600 Languages

Paper • 2603.16309 • Published Mar 17 • 21

danielhanchen

posted an update about 1 month ago

Post

3380

Introducing Unsloth Studio ✨
A new open-source web UI to train and run LLMs.

• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
• Self-healing tool calling and code execution
• Compare models side by side + export to GGUF

GitHub: https://github.com/unslothai/unsloth
Blog and Guide: https://unsloth.ai/docs/new/studio

Available now on Hugging Face, NVIDIA, Docker and Colab.

danielhanchen

posted an update about 1 month ago

Post

3914

We collaborated with NVIDIA to teach you about Reinforcement Learning and RL environments. 💚 Learn:

• Why RL environments matter + how to build them
• When RL is better than SFT
• GRPO and RL best practices
• How verifiable rewards and RLVR work

Blog: https://unsloth.ai/blog/rl-environments

4 replies

nielsr

authored a paper about 1 month ago

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65

MaziyarPanahi

posted an update about 2 months ago

Post

4833

DNA, mRNA, proteins, AI. I spent the last year going deep into computational biology as an ML engineer. This is Part I of what I found. 🧬

In 2024, AlphaFold won the Nobel Prize in Chemistry.

By 2026, the open-source community had built alternatives that outperform it.

That's the story I find most interesting about protein AI right now. Not just the science (which is incredible), but the speed at which open-source caught up. Multiple teams, independently, reproduced and then exceeded AlphaFold 3's accuracy with permissive licenses. The field went from prediction to generation: we're not just modeling known proteins anymore, we're designing new ones.

I spent months mapping this landscape for ML engineers. What the architectures actually are (spoiler: transformers and diffusion models), which tools to use for what, and which ones you can actually ship commercially.

New post on the Hugging Face blog: https://huggingface.co/blog/MaziyarPanahi/protein-ai-landscape

Hope you all enjoy! 🤗

2 replies

merve

updated a Space about 2 months ago

README

👀

danielhanchen

posted an update about 2 months ago

Post

3450

100,000+ models trained with Unsloth have now been open-sourced on 🤗Hugging Face! 🦥

Here are the most popular ones you can run local:
1. TeichAI - GLM-4.7-Flash distilled from Claude 4.5 Opus (high)
2. Zed - Qwen Coder 7B fine-tuned for stronger coding
3. DavidAU - Llama-3.3-8B distilled from Claude 4.5 Opus (high)
4. huihui - gpt-oss made “abliberated”

Links to models:
1. TeichAI: TeichAI/GLM-4.7-Flash-Claude-Opus-4.5-High-Reasoning-Distill-GGUF
2. Zed: zed-industries/zeta
3. DavidAU: DavidAU/Llama3.3-8B-Instruct-Thinking-Claude-4.5-Opus-High-Reasoning
4. huihui: huihui-ai/Huihui-gpt-oss-20b-BF16-abliterated

See all the 100K latest models fine-tuned with Unsloth here: https://huggingface.co/models?other=u

2 replies

nielsr

submitted a paper to Daily Papers about 2 months ago

VidEoMT: Your ViT is Secretly Also a Video Segmentation Model

Paper • 2602.17807 • Published Feb 19 • 7

AI & ML interests

Recent Activity

Team members 24

mistral-community's activity

README