Large Multi-modal Models Can Interpret Features in Large Multi-modal Models

๐Ÿ” A Database for Interpreted 5K Features | ๐Ÿ” ArXiv Paper | ๐Ÿ  LMMs-Lab Homepage | ๐Ÿค— Huggingface Collections | GitHub Repo

Instructions to use the demo

You can use this demo to: 1. visualize the model's activations for a given image, and 2. generate text with a specific feature clamped to a chosen value.

Visualization of Activations

  1. Upload an image. (or use an example)
  2. Click on the "Submit" button to visualize the activations. The top-100 features will be displayed. (This list may contain many low-level features that activate on a wide range of patterns, so easily explainable features might not rank very high.)
  3. Use the slider to select a feature number.
  4. Click on the "Visualize" button to see where that feature activates on the image (a rough sketch of the underlying computation follows this list).
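
The demo ranks features by how strongly they fire on the image tokens. Below is a minimal, illustrative sketch of that idea, assuming a standard ReLU sparse autoencoder (SAE) with encoder weight `W_enc` and bias `b_enc` applied to the model's hidden states at the image-token positions. The tensor names, shapes, and toy sizes are assumptions for illustration, not the demo's actual code.

```python
import torch

def sae_encode(hidden_states, W_enc, b_enc):
    """Encode hidden states into SAE feature activations.

    hidden_states: (num_image_tokens, d_model) hidden states at image-token positions
    W_enc:         (d_model, num_features)     SAE encoder weights
    b_enc:         (num_features,)             SAE encoder bias
    Returns non-negative activations of shape (num_image_tokens, num_features).
    """
    return torch.relu(hidden_states @ W_enc + b_enc)

def top_features(hidden_states, W_enc, b_enc, k=100):
    """Rank features by their strongest activation over the image tokens."""
    acts = sae_encode(hidden_states, W_enc, b_enc)   # (tokens, features)
    per_feature_max, _ = acts.max(dim=0)             # peak activation of each feature
    values, indices = per_feature_max.topk(k)
    return indices.tolist(), values.tolist()

# Toy example with random tensors standing in for real hidden states and SAE weights.
d_model, num_features, num_tokens = 64, 512, 36
hidden = torch.randn(num_tokens, d_model)
W_enc = torch.randn(d_model, num_features) * 0.05
b_enc = torch.zeros(num_features)

feat_ids, feat_vals = top_features(hidden, W_enc, b_enc, k=5)
print(list(zip(feat_ids, feat_vals)))

# To visualize a single feature, reshape its per-token activations back to the
# image patch grid (here a hypothetical 6x6 grid) and render it as a heatmap.
feature_id = feat_ids[0]
heatmap = sae_encode(hidden, W_enc, b_enc)[:, feature_id].reshape(6, 6)
```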

Steering Model

  1. Use the slider to select a feature number.
  2. Use the number input to select the feature strength.
  3. Type the text input.
  4. Upload an image. (optional)
  5. Click on the "Submit" button to generate text with the selected feature clamped to the selected strength (see the sketch after this list).
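
Steering can be pictured as clamping one SAE feature to a fixed strength while the model generates. The sketch below shows one common way to do this with a PyTorch forward hook: re-encode the hooked layer's output, measure the chosen feature's current activation, and shift the hidden state along that feature's decoder direction until the activation equals the requested strength. The module path, weight names, and the clamping scheme itself are assumptions for illustration, not necessarily how this demo implements steering.

```python
import torch

def make_clamp_hook(W_enc, b_enc, W_dec, feature_id, strength):
    """Build a forward hook that clamps one SAE feature to `strength`.

    W_enc: (d_model, num_features) SAE encoder weights
    b_enc: (num_features,)         SAE encoder bias
    W_dec: (num_features, d_model) SAE decoder weights
    """
    direction = W_dec[feature_id]                      # (d_model,) decoder direction

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        # Current activation of the chosen feature at every position.
        current = torch.relu(hidden @ W_enc[:, feature_id] + b_enc[feature_id])
        # Shift along the decoder direction so the activation becomes `strength`.
        hidden = hidden + (strength - current).unsqueeze(-1) * direction
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    return hook

# Hypothetical usage (the module path and generate() inputs depend on the actual model):
# handle = model.language_model.model.layers[HOOK_LAYER].register_forward_hook(
#     make_clamp_hook(W_enc, b_enc, W_dec, feature_id=2000, strength=8.0))
# output_ids = model.generate(**inputs, max_new_tokens=128)
# handle.remove()
```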

Auto Interp Explanations (first 5k neurons) for top 500 features
Examples
Each example pairs a sample image with a feature number and its explanation.

@misc{zhang2024largemultimodalmodelsinterpret,
      title={Large Multi-modal Models Can Interpret Features in Large Multi-modal Models},
      author={Kaichen Zhang and Yifei Shen and Bo Li and Ziwei Liu},
      year={2024},
      eprint={2411.14982},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.14982},
}