# FFG Mask Explorer - Deployment Guide

## Overview

This guide explains how to deploy the FFG Mask Explorer to HuggingFace Spaces.
## Directory Structure

```
huggingface_space/
├── app.py                      # Main Gradio application
├── requirements.txt            # Python dependencies
├── README.md                   # HuggingFace Space metadata & docs
├── prepare_for_deployment.sh   # Script to copy necessary files
├── test_local.py               # Local testing script
└── ffg_experiment_suite/       # (Created by prepare_for_deployment.sh)
    └── src/
        ├── __init__.py
        ├── models.py
        ├── grafting.py
        └── analysis.py
```
## Deployment Steps

### 1. Prepare Files Locally

```bash
cd /Users/pxm5426/research/surgeon/huggingface_space
./prepare_for_deployment.sh
```

This copies the necessary source files from the surgeon toolkit.
### 2. Test Locally (Optional)

```bash
# Install dependencies
pip install -r requirements.txt

# Run tests
python test_local.py

# Run the app locally
python app.py
```
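A minimal local smoke test along these lines (hypothetical — the shipped `test_local.py` may differ) can confirm the dependencies import cleanly before you deploy:

```python
import importlib

# Modules the Space needs at runtime; extend with the copied
# ffg_experiment_suite modules once prepare_for_deployment.sh has run.
REQUIRED = ["gradio", "torch", "transformers"]


def missing_modules(names):
    """Return the subset of `names` that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_modules(REQUIRED)
    print("missing:", gaps or "none")
```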
### 3. Create HuggingFace Space

- Go to https://huggingface.co/new-space
- Choose:
  - Space name: `ffg-mask-explorer`
  - SDK: Gradio
  - Hardware: A10G small (GPU required!)
  - Visibility: Public or Private, as desired
### 4. Deploy to HuggingFace

```bash
# Clone your new space
git clone https://huggingface.co/spaces/YOUR_USERNAME/ffg-mask-explorer
cd ffg-mask-explorer

# Copy all files from the prepared directory
cp -r /Users/pxm5426/research/surgeon/huggingface_space/* .

# Add files to git
git add .
git commit -m "Initial deployment of FFG Mask Explorer"
git push
```
### 5. Monitor Deployment

- Check the "App" tab on HuggingFace to see build progress
- View the logs if there are any errors
- The Space will build and deploy automatically
## Features Implemented

### Core Functionality

- ✅ Single-model mask generation
- ✅ Three grafting methods (FFG, Magnitude, Fish-Mask)
- ✅ Real-time processing with GPU
- ✅ Basic visualizations:
  - Overall statistics bar chart
  - Layer-wise sparsity distribution
  - Sample mask heatmaps
### User Interface

- ✅ Pre-configured model dropdown
- ✅ Custom model support (advanced accordion)
- ✅ Sparsity ratio slider
- ✅ Method selection
- ✅ Device selection (CUDA/CPU)
- ✅ Progress indicators
- ✅ Status messages
### Memory Management

- ✅ Automatic GPU cleanup after generation
- ✅ Model offloading after use
- ✅ Garbage collection
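The cleanup sequence above can be sketched as follows — a hypothetical helper, not the actual implementation in `app.py`:

```python
import gc


def release_model(model=None):
    """Drop the model reference and reclaim memory after mask generation."""
    if model is not None:
        del model             # release the Python reference
    collected = gc.collect()  # force garbage collection
    try:
        import torch
        if torch.cuda.is_available():
            torch.cuda.empty_cache()  # return cached CUDA blocks to the driver
    except ImportError:
        pass  # CPU-only environment: nothing to flush
    return collected
```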
## Potential Enhancements

### Phase 2 Features

- Multi-model comparison
- Full visualization suite from `run_comparison_visualization.py`
- Jaccard similarity metrics
- Export masks as `.pt` files
- Export visualizations as high-res images
### Phase 3 Features

- Batch processing for multiple sparsity ratios
- Session management (save/load results)
- Advanced visualization options
- Integration with model merging
## Troubleshooting

### Common Issues

**Out of Memory (OOM)**

- Solution: Reduce the batch size or use CPU for testing
- Consider upgrading to a larger GPU tier
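For the CPU fallback, a defensive device pick like this sketch (a hypothetical helper) avoids hard failures when CUDA is unavailable:

```python
def pick_device(requested: str = "cuda") -> str:
    """Honor a CUDA request only when a GPU is actually available."""
    if requested == "cuda":
        try:
            import torch
            if torch.cuda.is_available():
                return "cuda"
        except ImportError:
            pass  # torch not installed: fall through to CPU
    return "cpu"
```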
**Model Download Fails**

- Check your HuggingFace token if the models are private
- Verify that the model IDs are correct
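The hub libraries read the access token from the `HF_TOKEN` environment variable; on Spaces, set it as a repository secret. A quick pre-flight check (hypothetical helper):

```python
import os


def hf_token_configured(env=None) -> bool:
    """True if a HuggingFace access token is visible to the hub libraries."""
    env = os.environ if env is None else env
    # HF_TOKEN is the current variable; HUGGING_FACE_HUB_TOKEN is the legacy name.
    return bool(env.get("HF_TOKEN") or env.get("HUGGING_FACE_HUB_TOKEN"))
```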
Import Errors
- Ensure
prepare_for_deployment.shwas run - Check all dependencies in requirements.txt
- Ensure
### GPU Memory Usage

- Llama-3.1-8B models require ~16-20 GB of GPU memory
- An A10G (24 GB) should be sufficient for single-model processing
- For multi-model comparison, consider the A100 tier
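The ~16-20 GB figure follows from the parameter count: fp16 stores two bytes per weight. A quick back-of-envelope check:

```python
def fp16_weight_gib(n_params: float) -> float:
    """GiB needed for fp16 weights alone (2 bytes per parameter)."""
    return n_params * 2 / 2**30


# Llama-3.1-8B has roughly 8e9 parameters, so the weights alone take
# ~15 GiB; activations and CUDA workspace push total usage higher.
```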
## Support

For issues or questions:

- Check the HuggingFace Space logs
- Review the original surgeon toolkit documentation
- Refer to the paper for technical details
## Citation

Remember to cite the original work when using this tool:

```bibtex
@misc{mahdavinia2025harnessingoptimizationdynamicscurvatureinformed,
  title={Harnessing Optimization Dynamics for Curvature-Informed Model Merging},
  author={Pouria Mahdavinia and Hamed Mahdavi and Niloofar Mireshghallah and Mehrdad Mahdavi},
  year={2025},
  eprint={2509.11167},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2509.11167},
}
```