Building on HF

Nathan Habib PRO

SaylorTwift

huggingface

AI & ML interests

Evals

Recent Activity

liked a dataset 2 days ago

crosbylegal/RedlineBench

Viewer • Updated 3 days ago • 140 • 3.88k • 9

liked a model 2 days ago

MiniMaxAI/MiniMax-M3

Image-Text-to-Text • 427B • Updated 5 days ago • 104k • 1.17k

upvoted an article 2 days ago

Article

GLM-5.2: Built for Long-Horizon Tasks

zai-org • 4 days ago • 81

liked a model 2 days ago

poolside/Laguna-M.1

Text Generation • 226B • Updated about 20 hours ago • 2.58k • 77

upvoted an article 3 days ago

Article

Is it agentic enough? Benchmarking open models on your own tooling

+1

lysandre, SaylorTwift, pcuenq • 3 days ago • 17

published an article 3 days ago

Article

Is it agentic enough? Benchmarking open models on your own tooling

+1

lysandre, SaylorTwift, pcuenq • 3 days ago • 17

liked 2 models 6 days ago

nex-agi/Nex-N2-Pro

Text Generation • 397B • Updated 10 days ago • 7.87k • 340

prefeitura-rio/Rio-3.5-Open-397B

Image-Text-to-Text • 403B • Updated 7 days ago • 191k • 327

New activity in CohereLabs/North-Mini-Code-1.0 10 days ago

Add eval results for SWE-bench Verified, SWE-bench Pro, and Terminal-Bench v2

#7 opened 10 days ago by

Add evaluation results (SWE-bench Verified, SWE-bench Pro, Terminal-Bench v2)

#6 opened 10 days ago by

liked a model 11 days ago

CohereLabs/North-Mini-Code-1.0

Text Generation • 30B • Updated 6 days ago • 19.6k • 470

upvoted a changelog 11 days ago

Hugging Face Changelog

Publish models from CI without HF_TOKEN

13 days ago • 102

upvoted an article 11 days ago

Article

The Open Source Community is backing OpenEnv for Agentic RL

+16

burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego • 13 days ago • 89

upvoted a paper 11 days ago

ResearchClawBench: A Benchmark for End-to-End Autonomous Scientific Research

Paper • 2606.07591 • Published 24 days ago • 93

updated a dataset 13 days ago

SaylorTwift/harbor-assets

Updated 13 days ago • 24

published a dataset 13 days ago

SaylorTwift/harbor-assets

Updated 13 days ago • 24

New activity in MMMU/MMMU_Pro 13 days ago

Update eval.yaml

#7 opened 16 days ago by

upvoted an article 13 days ago

Article

Designing the hf CLI as an agent-optimized way to work with the Hub

celinah, Wauplin • 17 days ago • 57

New activity in nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-NVFP4 16 days ago

Add evaluation results (GPQA, MMLU-Pro, SWE-bench Verified, HLE)

#6 opened 16 days ago by

New activity in nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 16 days ago

Add evaluation results (GPQA, MMLU-Pro, SWE-bench Verified, HLE)

#3 opened 16 days ago by