Waving Goodbye to Low-Res: A Diffusion-Wavelet Approach for Image Super-Resolution

Abstract

Image Super-Resolution (SR) remains challenging, particularly in achieving high-quality details without extensive computational cost. Existing methods often struggle to balance the trade-off between image quality, especially in high-frequency details, and computational efficiency. In this paper, we present a novel Diffusion-Wavelet (DiWa) approach for bridging this gap. It leverages the strengths of diffusion models and the discrete wavelet transformation. By enabling the diffusion model to operate in the frequency domain, our model effectively hallucinates high-frequency information for SR images on the wavelet spectrum, resulting in high-quality and detailed reconstructions in image space. Quantitatively, our method outperforms other state-of-the-art diffusion-based SR methods, namely SR3 and SRDiff, regarding PSNR, SSIM, and LPIPS on both face (8x scaling) and general (4x scaling) SR benchmarks. Meanwhile, operating in the frequency domain allows us to use fewer parameters than the compared models: 92M instead of SR3's 550M, and 9.3M instead of SRDiff's 12M. Additionally, DiWa outperforms other state-of-the-art generative methods on general SR datasets while saving inference time (ca. 250%).
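The abstract's core idea is to let the diffusion model predict wavelet sub-bands rather than pixels. As an illustration of the frequency-domain representation involved (not the authors' implementation), the following is a minimal single-level 2D Haar discrete wavelet transform in NumPy: it splits an image into a low-frequency sub-band (LL) and three high-frequency detail sub-bands (LH, HL, HH) at half resolution, and reconstructs the image exactly from them.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT: splits an image into a low-frequency
    sub-band (LL) and high-frequency detail sub-bands (LH, HL, HH),
    each at half the spatial resolution."""
    s = np.sqrt(2)
    a = (x[0::2, :] + x[1::2, :]) / s  # row averages (low-pass)
    d = (x[0::2, :] - x[1::2, :]) / s  # row differences (high-pass)
    ll = (a[:, 0::2] + a[:, 1::2]) / s
    lh = (a[:, 0::2] - a[:, 1::2]) / s
    hl = (d[:, 0::2] + d[:, 1::2]) / s
    hh = (d[:, 0::2] - d[:, 1::2]) / s
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse 2D Haar DWT: recombines the four sub-bands into the image."""
    s = np.sqrt(2)
    h, w = ll.shape
    a = np.empty((h, 2 * w)); d = np.empty((h, 2 * w))
    a[:, 0::2] = (ll + lh) / s; a[:, 1::2] = (ll - lh) / s
    d[:, 0::2] = (hl + hh) / s; d[:, 1::2] = (hl - hh) / s
    x = np.empty((2 * h, 2 * w))
    x[0::2, :] = (a + d) / s
    x[1::2, :] = (a - d) / s
    return x
```

Because the transform is invertible, a model that hallucinates plausible LH/HL/HH bands for an upsampled LL band can be mapped back to a detailed image with no extra loss, which is what motivates running the denoising process on the wavelet spectrum.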

Publication
2024 International Joint Conference on Neural Networks (IJCNN)