The tasks and counterfactuals from the Mechanistic Interpretability Benchmark.
AI & ML interests
Principled evaluation of mechanistic interpretability methods.
datasets 7
mib-bench/ravel • 117k • 14
mib-bench/arithmetic_subtraction • 20.9k • 165
mib-bench/arithmetic_addition • 40.4k • 211
mib-bench/ioi • 21k • 611
mib-bench/arc_easy • 4.01k • 387
mib-bench/arc_challenge • 2k • 276
mib-bench/copycolors_mcqa • 1.89k • 242
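The repositories listed above can presumably be fetched with the Hugging Face `datasets` library. The following is a minimal sketch, not an official loader: the names `MIB_DATASETS`, `dataset_url`, and `load_mib_dataset` are hypothetical helpers introduced here for illustration, and the configuration and split names of each repository are not stated on this page, so they are left as optional arguments.

```python
from typing import Optional

# Repository IDs exactly as they appear in the listing above.
MIB_DATASETS = [
    "mib-bench/ravel",
    "mib-bench/arithmetic_subtraction",
    "mib-bench/arithmetic_addition",
    "mib-bench/ioi",
    "mib-bench/arc_easy",
    "mib-bench/arc_challenge",
    "mib-bench/copycolors_mcqa",
]


def dataset_url(repo_id: str) -> str:
    """Return the Hugging Face Hub page for a dataset repository."""
    return f"https://huggingface.co/datasets/{repo_id}"


def load_mib_dataset(repo_id: str, name: Optional[str] = None,
                     split: Optional[str] = None):
    """Download one of the listed datasets (hypothetical helper).

    Assumption: the repository loads with the stock `load_dataset` call.
    Some repositories may need a configuration `name`; this page does
    not say which, so nothing is hard-coded here.
    """
    if repo_id not in MIB_DATASETS:
        raise ValueError(f"unknown MIB dataset: {repo_id}")
    # Imported lazily so the helpers above work without the library.
    from datasets import load_dataset
    return load_dataset(repo_id, name=name, split=split)
```

Calling `load_mib_dataset("mib-bench/ioi")` would download that repository from the Hub (network access and `pip install datasets` required). If a repository defines several configurations, `datasets.get_dataset_config_names(repo_id)` can be used to list them before choosing a `name`.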