Guy Ohayon* , Hila Manor* , Tomer Michaeli , Michael Elad
* Equal contribution
Technion - Israel Institute of Technology
[Paper] | [Project Page] | [Code]
Denoising Diffusion Codebook Models (DDCM) is a novel (and simple) generative approach that builds on any Denoising Diffusion Model (DDM) and produces high-quality image samples along with their losslessly compressed bit-stream representations.
DDCM can be used for perceptual image compression and for a variety of compressed conditional generation tasks, such as text-conditional image generation and image restoration, in which each generated sample is accompanied by a compressed bit-stream.
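To make the idea of a bit-stream representation concrete, here is a minimal sketch of how a sequence of codebook indices could be serialized to bytes and recovered exactly. It assumes the bit-stream is simply the list of indices chosen during sampling, each written with a fixed number of bits. The function names `pack_indices` and `unpack_indices` are hypothetical and are not taken from the DDCM code, whose actual bit-stream format may differ.

```python
def pack_indices(indices, K):
    """Pack codebook indices (each in [0, K)) into bytes with fixed-width codes.

    Hypothetical helper for illustration; not the DDCM repository's format.
    """
    bits_per_index = (K - 1).bit_length()  # equals log2(K) when K is a power of two
    value = 0
    for idx in indices:
        if not 0 <= idx < K:
            raise ValueError(f"index {idx} is out of range for K={K}")
        # Append this index to the low end of the running integer.
        value = (value << bits_per_index) | idx
    total_bits = bits_per_index * len(indices)
    return value.to_bytes((total_bits + 7) // 8, "big")


def unpack_indices(data, count, K):
    """Recover `count` indices that were written by pack_indices with the same K."""
    bits_per_index = (K - 1).bit_length()
    value = int.from_bytes(data, "big")
    mask = (1 << bits_per_index) - 1
    recovered = []
    for _ in range(count):
        # The last packed index sits in the lowest bits, so we read backwards.
        recovered.append(value & mask)
        value >>= bits_per_index
    return recovered[::-1]


if __name__ == "__main__":
    chosen = [3, 0, 63, 17]            # made-up indices, one per sampling step
    stream = pack_indices(chosen, K=64)
    print(len(stream), unpack_indices(stream, len(chosen), K=64))
```

In this sketch the decoder needs to know `K` and the number of indices in advance, which is why both are passed explicitly to `unpack_indices`.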
The tabs below correspond to demos of different practical applications. Open each tab to see the application's specific instructions.
Note: The demos below rely on relatively old pre-trained diffusion models, such as Stable Diffusion 2.1 (mirrored by sd2-community), purely to demonstrate the capabilities of DDCM. Feel free to implement our DDCM-based methods with newer diffusion models to further improve performance.
- To change the bit rate, modify the number of diffusion timesteps (T) and/or the codebook sizes (K).
- The input image will be center-cropped and resized to the specified size (512x512 or 768x768).
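The two notes above can be sketched in a few lines. The bit-rate estimate assumes that one codebook index of log2(K) bits is stored for each of T - 1 sampling steps, and the crop helper assumes the demo takes the largest centered square before resizing. Both are assumptions made for illustration, the function names are hypothetical, and the demo's exact accounting and preprocessing may differ.

```python
import math


def approx_bits_per_pixel(T, K, height, width):
    """Rough bit rate, assuming (T - 1) indices of log2(K) bits each."""
    total_bits = (T - 1) * math.log2(K)
    return total_bits / (height * width)


def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


if __name__ == "__main__":
    # Increasing T or K raises the estimated bit rate.
    for T, K in [(500, 64), (1000, 64), (1000, 4096)]:
        print(T, K, approx_bits_per_pixel(T, K, 512, 512))
    print(center_crop_box(1024, 768))
```

The box returned by `center_crop_box` follows the (left, upper, right, lower) convention used by Pillow's `Image.crop`, so it could be passed to that method and followed by a resize to 512x512 or 768x768.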
Inputs: input image, diffusion timesteps (T), size of each codebook (K), image size.
A previously generated bit-stream can also be decompressed.
Please mark whether your input face image is already aligned. If it is not, we will try to automatically detect, crop, and align the faces, and will raise an error if no faces are found. Expect better results if your input image is already aligned.
Inputs: input image, diffusion timesteps (T), size of each codebook (K), perceptual quality measure to optimize, perception-distortion tradeoff coefficient (λ), whether the input face image is aligned.
A previously generated bit-stream can also be decompressed.
This application demonstrates the capabilities of our new compressed classifier-free guidance method, which does not require the input condition for decompression.
Each image is generated along with its compressed bit-stream representation, and the input condition is implicitly encoded in the bit-stream.
Inputs: input text prompt, diffusion timesteps (T), size of each codebook (K), sub-sampled codebooks' sizes (K̃), image size.
A previously generated bit-stream can also be decompressed.
If you find our work useful, please ⭐ our GitHub repository. Thanks!
📝 Citation
@article{ohayon2025compressedimagegenerationdenoising,
title={Compressed Image Generation with Denoising Diffusion Codebook Models},
author={Guy Ohayon and Hila Manor and Tomer Michaeli and Michael Elad},
year={2025},
eprint={2502.01189},
journal={arXiv},
primaryClass={eess.IV},
url={https://arxiv.org/abs/2502.01189},
}
📋 License
This project is released under the MIT license.
📧 Contact
If you have any questions, please feel free to contact us at guyoep@gmail.com (Guy Ohayon) and hila.manor@campus.technion.ac.il (Hila Manor).