
FFG Mask Explorer - Deployment Guide

Overview

This guide explains how to deploy the FFG Mask Explorer to HuggingFace Spaces.

Directory Structure

```
huggingface_space/
├── app.py                      # Main Gradio application
├── requirements.txt            # Python dependencies
├── README.md                   # HuggingFace Space metadata & docs
├── prepare_for_deployment.sh   # Script to copy necessary files
├── test_local.py               # Local testing script
└── ffg_experiment_suite/       # (Created by prepare_for_deployment.sh)
    └── src/
        ├── __init__.py
        ├── models.py
        ├── grafting.py
        └── analysis.py
```

Deployment Steps

1. Prepare Files Locally

```bash
cd /Users/pxm5426/research/surgeon/huggingface_space
./prepare_for_deployment.sh
```

This will copy the necessary source files from the surgeon toolkit.

2. Test Locally (Optional)

```bash
# Install dependencies
pip install -r requirements.txt

# Run tests
python test_local.py

# Run the app locally
python app.py
```

3. Create HuggingFace Space

  1. Go to https://huggingface.co/new-space
  2. Choose:
    • Space name: ffg-mask-explorer
    • SDK: Gradio
    • Hardware: A10G small (GPU required!)
    • Public/Private as desired

4. Deploy to HuggingFace

```bash
# Clone your new space
git clone https://huggingface.co/spaces/YOUR_USERNAME/ffg-mask-explorer
cd ffg-mask-explorer

# Copy all files from the prepared directory
cp -r /Users/pxm5426/research/surgeon/huggingface_space/* .

# Add files to git
git add .
git commit -m "Initial deployment of FFG Mask Explorer"
git push
```

5. Monitor Deployment

  • Check the "App" tab on HuggingFace to see build progress
  • View logs if there are any errors
  • The space will automatically build and deploy

Features Implemented

Core Functionality

  • ✅ Single model mask generation
  • ✅ Three grafting methods (FFG, Magnitude, Fish-Mask)
  • ✅ Real-time processing with GPU
  • ✅ Basic visualizations:
    • Overall statistics bar chart
    • Layer-wise sparsity distribution
    • Sample mask heatmaps
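To make the mask-generation step concrete, here is a minimal sketch of the magnitude method, one of the three listed above: keep the largest-|w| weights and zero the rest. This is an illustrative stand-in, not the app's actual implementation; the function name and the plain-list representation are assumptions (the real code operates on model tensors, and the FFG and Fish-Mask methods use different scoring criteria).

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask keeping the largest-magnitude weights.

    weights:  flat list of floats
    sparsity: fraction of weights to zero out (matches the sparsity slider)
    """
    n_keep = round(len(weights) * (1.0 - sparsity))
    # Rank indices by |weight|, largest first
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(len(weights))]

magnitude_mask([0.1, -2.0, 0.05, 1.5], sparsity=0.5)
# -> [0, 1, 0, 1]  (keeps -2.0 and 1.5, the two largest magnitudes)
```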

User Interface

  • ✅ Pre-configured model dropdown
  • ✅ Custom model support (advanced accordion)
  • ✅ Sparsity ratio slider
  • ✅ Method selection
  • ✅ Device selection (CUDA/CPU)
  • ✅ Progress indicators
  • ✅ Status messages

Memory Management

  • ✅ Automatic GPU cleanup after generation
  • ✅ Model offloading after use
  • ✅ Garbage collection
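The cleanup pattern above can be sketched as follows. This is a generic sketch, not the app's actual code: `release_model` and the `model_holder` dict are hypothetical, and the `torch` calls are guarded so the sketch also runs in CPU-only environments.

```python
import gc

def release_model(model_holder):
    """Drop references to a loaded model and reclaim memory."""
    model_holder.pop("model", None)   # drop the last Python reference
    gc.collect()                      # force garbage collection
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached GPU blocks to the driver
    except ImportError:
        pass  # no torch installed: nothing GPU-side to free

holder = {"model": object()}          # stand-in for a loaded model
release_model(holder)
assert "model" not in holder
```

Note that `empty_cache()` only releases memory that no live tensor still references, which is why dropping the Python reference and collecting garbage must come first.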

Potential Enhancements

Phase 2 Features

  • Multi-model comparison
  • Full visualization suite from run_comparison_visualization.py
  • Jaccard similarity metrics
  • Export masks as .pt files
  • Export visualizations as high-res images
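For the planned Jaccard metric, the computation over two binary masks is straightforward: the ratio of shared kept positions to all kept positions. A minimal sketch (the function name and list-based masks are assumptions for illustration):

```python
def jaccard(mask_a, mask_b):
    """Jaccard similarity of two equal-length binary masks:
    |A ∩ B| / |A ∪ B| over the kept (value 1) positions."""
    kept_a = {i for i, v in enumerate(mask_a) if v}
    kept_b = {i for i, v in enumerate(mask_b) if v}
    if not kept_a and not kept_b:
        return 1.0  # two all-zero masks: identical by convention
    return len(kept_a & kept_b) / len(kept_a | kept_b)

jaccard([1, 1, 0, 0], [1, 0, 1, 0])
# -> 0.333...  (intersection of size 1, union of size 3)
```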

Phase 3 Features

  • Batch processing for multiple sparsity ratios
  • Session management (save/load results)
  • Advanced visualization options
  • Integration with model merging

Troubleshooting

Common Issues

  1. Out of Memory (OOM)

    • Solution: Reduce batch size or use CPU for testing
    • Consider upgrading to larger GPU tier
  2. Model Download Fails

    • Check HuggingFace token if models are private
    • Verify model IDs are correct
  3. Import Errors

    • Ensure prepare_for_deployment.sh was run
    • Check all dependencies in requirements.txt
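A quick way to diagnose import errors is a preflight check before launching the app. This is a hypothetical helper, not part of the repository; `ffg_experiment_suite` will only resolve after `prepare_for_deployment.sh` has been run.

```python
import importlib.util

def missing_modules(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = missing_modules(["gradio", "torch", "ffg_experiment_suite"])
if missing:
    print(f"Missing modules: {missing} - run prepare_for_deployment.sh "
          f"and pip install -r requirements.txt")
```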

GPU Memory Usage

  • Llama-3.1-8B models require ~16-20GB GPU memory
  • A10G (24GB) should be sufficient for single model processing
  • For multi-model comparison, consider A100 tier
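The ~16-20GB figure follows from a back-of-envelope weight count: roughly 8 billion parameters at 2 bytes each in fp16/bf16 is about 15GB for the weights alone, with activations and CUDA overhead adding a few more GB. A sketch of that arithmetic:

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Memory for model weights alone; 2 bytes/param assumes fp16 or bf16."""
    return n_params * bytes_per_param / 1024**3

weight_memory_gb(8.03e9)  # Llama-3.1-8B: ~15 GB of weights in bf16
```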

Support

For issues or questions:

  • Check HuggingFace Space logs
  • Review the original surgeon toolkit documentation
  • Refer to the paper for technical details

Citation

Remember to cite the original work when using this tool:

```bibtex
@misc{mahdavinia2025harnessingoptimizationdynamicscurvatureinformed,
    title={Harnessing Optimization Dynamics for Curvature-Informed Model Merging},
    author={Pouria Mahdavinia and Hamed Mahdavi and Niloofar Mireshghallah and Mehrdad Mahdavi},
    year={2025},
    eprint={2509.11167},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
    url={https://arxiv.org/abs/2509.11167},
}
```