| pipeline_tag: text-to-image | |
| library_name: tensorrt | |
| inference: false | |
| license: other | |
| license_name: stabilityai-ai-community | |
| license_link: LICENSE.md | |
| tags: | |
| - tensorrt | |
| - sd3.5-large | |
| - text-to-image | |
| - depth | |
| - canny | |
| - blur | |
| - controlnet | |
| - onnx | |
| - fp8 | |
| extra_gated_prompt: >- | |
| By clicking "Agree", you agree to the [License | |
| Agreement](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md) | |
| and acknowledge Stability AI's [Privacy | |
| Policy](https://stability.ai/privacy-policy). | |
| extra_gated_fields: | |
| Name: text | |
| Email: text | |
| Country: country | |
| Organization or Affiliation: text | |
| Receive email updates and promotions on Stability AI products, services, and research?: | |
| type: select | |
| options: | |
| - 'Yes' | |
| - 'No' | |
| What do you intend to use the model for?: | |
| type: select | |
| options: | |
| - Research | |
| - Personal use | |
| - Creative Professional | |
| - Startup | |
| - Enterprise | |
| I agree to the License Agreement and acknowledge Stability AI's Privacy Policy: checkbox | |
| language: | |
| - en | |
| # Stable Diffusion 3.5 Large ControlNet TensorRT | |
| ## Introduction | |
| This repository hosts the **TensorRT-optimized version** of **Stable Diffusion 3.5 Large ControlNets**, developed in collaboration between [Stability AI](https://stability.ai) and [NVIDIA](https://huggingface.co/nvidia). This implementation leverages NVIDIA's TensorRT deep learning inference library to deliver significant performance improvements while maintaining the exceptional image quality of the original model. | |
| Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. The TensorRT optimization makes these capabilities accessible for production deployment and real-time applications. | |
| The following control types are available: | |
| - Canny - Use a Canny edge map to guide the structure of the generated image. This is especially useful for illustrations, but works with all styles. | |
| - Depth - Use a depth map, generated by DepthFM, to guide generation. Example use cases include generating architectural renderings and texturing 3D assets. | |
| - Blur - Use a blurred input to perform extremely high-fidelity upscaling. A common use case is to tile an input image, apply the ControlNet to each tile, and merge the tiles to produce a higher-resolution image. | |
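The tiling workflow described for the Blur ControlNet can be sketched as follows. This is only the bookkeeping part: it computes which regions of an (already enlarged) image to process, and leaves the per-tile ControlNet call and the final merge to the caller. The helper name `tile_boxes`, the 1024-pixel default tile size, and the optional `overlap` parameter are assumptions made for illustration, not part of this repository.

```python
def tile_boxes(width, height, tile=1024, overlap=0):
    """Return (left, top, right, bottom) boxes that together cover the image.

    Hypothetical helper: each box would be cropped, passed through the Blur
    ControlNet pipeline, and pasted back at the same coordinates.
    """
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    stride = tile - overlap

    def starts(size):
        # Strided origins along one axis; the last tile is shifted back
        # so that it ends exactly at the image edge.
        if size <= tile:
            return [0]
        origins = list(range(0, size - tile, stride))
        origins.append(size - tile)
        return origins

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


if __name__ == "__main__":
    # A 2048x2048 target splits into four 1024x1024 tiles.
    for box in tile_boxes(2048, 2048):
        print(box)
```

A non-zero `overlap` gives neighbouring tiles a shared border, which the merge step can blend to hide seams; whether that is needed depends on the images involved.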
| ## Model Details | |
| ### Model Description | |
| This repository holds the ONNX exports of the Depth, Canny, and Blur ControlNet models in BF16 precision. FP8-quantized models are also available for the Depth and Canny ControlNets. | |
| ## Performance using TensorRT 10.13 | |
| #### Depth ControlNet: Timings for 40 steps at 1024x1024 | |
| | Accelerator | Precision | VAE Encoder | CLIP-G | CLIP-L | T5 | MMDiT x 40 | VAE Decoder | Total | | |
| |-------------|-----------|-------------|------------|--------------|--------------|-----------------------|---------------------|------------------------| | |
| | H100 | BF16 | 74.97 ms | 11.87 ms | 4.90 ms | 8.82 ms | 18839.01 ms | 117.38 ms | 19097.19 ms | | |
| | H100 | FP8 | 31.24 ms | 11.99 ms | 4.96 ms | 8.39 ms | 9175.53 ms | 36.36 ms | 9308.86 ms | | |
| #### Canny ControlNet: Timings for 60 steps at 1024x1024 | |
| | Accelerator | Precision | VAE Encoder | CLIP-G | CLIP-L | T5 | MMDiT x 60 | VAE Decoder | Total | | |
| |-------------|-----------|-------------|------------|--------------|--------------|-----------------------|---------------------|------------------------| | |
| | H100 | BF16 | 78.50 ms | 12.29 ms | 5.08 ms | 8.65 ms | 28057.08 ms | 106.49 ms | 28306.20 ms | | |
| | H100 | FP8 | 31.21 ms | 12.17 ms | 4.96 ms | 8.35 ms | 13936.82 ms | 36.63 ms | 14068.32 ms | | |
| #### Blur ControlNet: Timings for 60 steps at 1024x1024 | |
| | Accelerator | Precision | VAE Encoder | CLIP-G | CLIP-L | T5 | MMDiT x 60 | VAE Decoder | Total | | |
| |-------------|-----------|-------------|------------|--------------|--------------|-----------------------|---------------------|------------------------| | |
| | H100 | BF16 | 74.48 ms | 11.71 ms | 4.86 ms | 8.80 ms | 28604.26 ms | 113.24 ms | 28859.06 ms | | |
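To make the tables easier to compare, the short script below derives the per-step denoising latency and the end-to-end FP8 speedup from the figures above. The numbers are copied from the tables; the dictionary layout and function names are illustrative only.

```python
# Timings in milliseconds, copied from the H100 tables above.
TIMINGS = {
    ("depth", "bf16"): {"steps": 40, "mmdit_ms": 18839.01, "total_ms": 19097.19},
    ("depth", "fp8"): {"steps": 40, "mmdit_ms": 9175.53, "total_ms": 9308.86},
    ("canny", "bf16"): {"steps": 60, "mmdit_ms": 28057.08, "total_ms": 28306.20},
    ("canny", "fp8"): {"steps": 60, "mmdit_ms": 13936.82, "total_ms": 14068.32},
    ("blur", "bf16"): {"steps": 60, "mmdit_ms": 28604.26, "total_ms": 28859.06},
}


def per_step_ms(control, precision):
    """Average time of one denoising step (MMDiT column divided by step count)."""
    row = TIMINGS[(control, precision)]
    return row["mmdit_ms"] / row["steps"]


def fp8_speedup(control):
    """Ratio of BF16 total time to FP8 total time for one control type."""
    bf16 = TIMINGS[(control, "bf16")]["total_ms"]
    fp8 = TIMINGS[(control, "fp8")]["total_ms"]
    return bf16 / fp8


if __name__ == "__main__":
    for control in ("depth", "canny"):
        print(
            f"{control}: {fp8_speedup(control):.2f}x end to end, "
            f"{per_step_ms(control, 'bf16'):.0f} -> "
            f"{per_step_ms(control, 'fp8'):.0f} ms per step"
        )
```

On these figures FP8 is roughly twice as fast as BF16 end to end (about 2.05x for Depth and 2.01x for Canny), and the denoising loop accounts for nearly all of the total time in every row.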
| ## Usage Example | |
| 1. Follow the [setup instructions](https://github.com/NVIDIA/TensorRT/blob/release/sd35/demo/Diffusion/README.md) on launching a TensorRT NGC container. | |
| ```shell | |
| git clone https://github.com/NVIDIA/TensorRT.git | |
| cd TensorRT | |
| git checkout release/sd35 | |
| docker run --rm -it --gpus all -v $PWD:/workspace nvcr.io/nvidia/pytorch:25.08-py3 /bin/bash | |
| ``` | |
| 2. Install libraries and requirements | |
| ```shell | |
| cd demo/Diffusion | |
| source setup.sh | |
| ``` | |
| 3. Generate a Hugging Face user access token | |
| To download the Stable Diffusion 3.5 checkpoints, request access on the [Stable Diffusion 3.5 Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large), [Stable Diffusion 3.5 Large Depth ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-controlnet-depth), [Stable Diffusion 3.5 Large Canny ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-controlnet-canny), and [Stable Diffusion 3.5 Large Blur ControlNet](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-controlnet-blur) pages. | |
| You then need a `read` access token for the Hugging Face Hub, exported as shown below. See the [instructions](https://huggingface.co/docs/hub/security-tokens) for creating one. | |
| ```bash | |
| export HF_TOKEN=<your access token> | |
| ``` | |
| 4. Perform TensorRT optimized inference: | |
| - **Stable Diffusion 3.5 Large Depth ControlNet in BF16 precision** | |
| ```shell | |
| python3 demo_controlnet_sd35.py \ | |
| "a photo of a man" \ | |
| --version=3.5-large \ | |
| --bf16 \ | |
| --controlnet-type depth \ | |
| --download-onnx-models \ | |
| --denoising-steps=40 \ | |
| --guidance-scale 4.5 \ | |
| --build-static-batch \ | |
| --use-cuda-graph \ | |
| --hf-token=$HF_TOKEN | |
| ``` | |
| - **Stable Diffusion 3.5 Large Depth ControlNet in FP8 precision** | |
| ```shell | |
| python3 demo_controlnet_sd35.py \ | |
| "a photo of a man" \ | |
| --version=3.5-large \ | |
| --fp8 \ | |
| --controlnet-type depth \ | |
| --download-onnx-models \ | |
| --denoising-steps=40 \ | |
| --guidance-scale 4.5 \ | |
| --build-static-batch \ | |
| --use-cuda-graph \ | |
| --hf-token=$HF_TOKEN | |
| ``` | |
| - **Stable Diffusion 3.5 Large Canny ControlNet in BF16 precision** | |
| ```shell | |
| python3 demo_controlnet_sd35.py \ | |
| "A Night time photo taken by Leica M11, portrait of a Japanese woman in a kimono, looking at the camera, Cherry blossoms" \ | |
| --version=3.5-large \ | |
| --bf16 \ | |
| --controlnet-type canny \ | |
| --download-onnx-models \ | |
| --denoising-steps=60 \ | |
| --guidance-scale 3.5 \ | |
| --build-static-batch \ | |
| --use-cuda-graph \ | |
| --hf-token=$HF_TOKEN | |
| ``` | |
| - **Stable Diffusion 3.5 Large Canny ControlNet in FP8 precision** | |
| ```shell | |
| python3 demo_controlnet_sd35.py \ | |
| "A Night time photo taken by Leica M11, portrait of a Japanese woman in a kimono, looking at the camera, Cherry blossoms" \ | |
| --version=3.5-large \ | |
| --fp8 \ | |
| --controlnet-type canny \ | |
| --download-onnx-models \ | |
| --denoising-steps=60 \ | |
| --guidance-scale 3.5 \ | |
| --build-static-batch \ | |
| --use-cuda-graph \ | |
| --hf-token=$HF_TOKEN | |
| ``` | |
| - **Stable Diffusion 3.5 Large Blur ControlNet in BF16 precision** | |
| ```shell | |
| python3 demo_controlnet_sd35.py \ | |
| "generated ai art, a tiny, lost rubber ducky in an action shot close-up, surfing the humongous waves, inside the tube, in the style of Kelly Slater" \ | |
| --version=3.5-large \ | |
| --bf16 \ | |
| --controlnet-type blur \ | |
| --download-onnx-models \ | |
| --denoising-steps=60 \ | |
| --guidance-scale 3.5 \ | |
| --build-static-batch \ | |
| --use-cuda-graph \ | |
| --hf-token=$HF_TOKEN | |
| ``` | |
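The five commands above differ only in the prompt, control type, precision, step count, and guidance scale. A small launcher along these lines can assemble them and catch two easy mistakes early: a missing `HF_TOKEN` and a request for an FP8 Blur model, which this repository does not provide. The helper name `build_command` and the `PRESETS` table are hypothetical; the flags and values are exactly those used in the commands above, and nothing here is part of the demo itself.

```python
import os

# Step counts and guidance scales copied from the example commands above.
# FP8 is listed only for the control types that have FP8 models here.
PRESETS = {
    "depth": {"steps": 40, "guidance": 4.5, "precisions": ("bf16", "fp8")},
    "canny": {"steps": 60, "guidance": 3.5, "precisions": ("bf16", "fp8")},
    "blur": {"steps": 60, "guidance": 3.5, "precisions": ("bf16",)},
}


def build_command(prompt, controlnet_type, precision="bf16", env=None):
    """Return the argument list for demo_controlnet_sd35.py (hypothetical helper)."""
    preset = PRESETS[controlnet_type]
    if precision not in preset["precisions"]:
        raise ValueError(f"{precision} is not available for {controlnet_type}")

    env = os.environ if env is None else env
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; export a read token first")

    return [
        "python3", "demo_controlnet_sd35.py", prompt,
        "--version=3.5-large",
        f"--{precision}",
        "--controlnet-type", controlnet_type,
        "--download-onnx-models",
        f"--denoising-steps={preset['steps']}",
        "--guidance-scale", str(preset["guidance"]),
        "--build-static-batch",
        "--use-cuda-graph",
        f"--hf-token={token}",
    ]
```

From inside `demo/Diffusion`, the result can be passed to `subprocess.run(build_command("a photo of a man", "depth", "fp8"), check=True)`. Because the list contains the token, avoid printing or logging it.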