-
MMLU-Pro Leaderboard
240 · More advanced and challenging multi-task evaluation
-
Stick To Your Role! Leaderboard
59 · Benchmarking LLMs on the stability of simulated populations
-
ZeroEval Leaderboard
53 · Embed ZeroEval for evaluation
-
Decentralized Arena Leaderboard
26 · View and compare LLM evaluations across various domains
Hristo Panev
hppdqdq
AI & ML interests
None yet
Recent Activity
liked
a model
5 days ago
Intel/MiroThinker-v1.5-30B-gguf-q2ks-mixed-AutoRound
liked
a model
7 days ago
Phr00t/LTX2-Rapid-Merges
liked
a model
14 days ago
Phr00t/Qwen3-VL-32B-Instruct-heretic-v2-iQ5KS-GGUF
Organizations
None yet