[2108.05301] Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling

Significance

One-shot image rescaling and super-resolution using an invertible neural network

Keypoints

  • Propose a unified framework for image super-resolution and rescaling using normalizing flow
  • Demonstrate the performance of the proposed method through experiments

Review

Background

Normalizing flows are a class of generative models that aim to learn a bijective mapping between two different distributions using invertible neural networks. Recent studies applying normalizing flows to low-level computer vision tasks, such as image rescaling and image super-resolution, have shown promising results (see my previous post for an example). This work proposes a unified approach to both image rescaling and super-resolution, which brings better flexibility and usability to both tasks.

Keypoints

Propose a unified framework for image super-resolution and rescaling using normalizing flow

The process of image up/downscaling is based on the bijective transformation $\mathbf{x} \leftrightarrow [\mathbf{y}, \mathbf{a}]$, where $\mathbf{y}$ is the low-resolution (LR) image obtained from the original image $\mathbf{x}$ and $\mathbf{a}$ is the corresponding high-frequency component. The main idea of the proposed method is to additionally learn a bijective transformation between the high-frequency component $\mathbf{a}$ and a latent variable $\mathbf{z}$ with an invertible neural network. Image downscaling can be performed explicitly with the forward computation of a wavelet transform, while upscaling without the high-frequency prior, i.e. super-resolution, can be done with a randomly sampled latent variable $\mathbf{z}$ as a prior. The feature extractor $\phi_{l}$ at level $l$ can be a convolutional neural network (CNN).

210812-1 Schematic illustration of the proposed method
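The explicit downscaling step can be illustrated with the simplest wavelet, the Haar transform, which losslessly splits an image into a 2x-downscaled low-frequency band $\mathbf{y}$ and three high-frequency detail bands $\mathbf{a}$. A minimal numpy sketch of this bijection (the function names are mine, and this is a plain single-level Haar transform, not the paper's full multi-level flow):

```python
import numpy as np

def haar_forward(x):
    """Forward Haar transform: split image x into a low-frequency band y
    (a 2x-downscaled image) and high-frequency detail bands a = (h, v, d).
    x: (H, W) array with even H and W."""
    a00 = x[0::2, 0::2]  # top-left pixel of each 2x2 block
    a01 = x[0::2, 1::2]
    a10 = x[1::2, 0::2]
    a11 = x[1::2, 1::2]
    y = (a00 + a01 + a10 + a11) / 2.0  # low-frequency (LR) band
    h = (a00 - a01 + a10 - a11) / 2.0  # horizontal detail
    v = (a00 + a01 - a10 - a11) / 2.0  # vertical detail
    d = (a00 - a01 - a10 + a11) / 2.0  # diagonal detail
    return y, (h, v, d)

def haar_inverse(y, a):
    """Inverse transform: exactly reconstruct x from [y, a].
    For super-resolution, a would instead be generated from a
    sampled latent z by the learned invertible network."""
    h, v, d = a
    x = np.empty((2 * y.shape[0], 2 * y.shape[1]), dtype=y.dtype)
    x[0::2, 0::2] = (y + h + v + d) / 2.0
    x[0::2, 1::2] = (y - h + v - d) / 2.0
    x[1::2, 0::2] = (y + h - v - d) / 2.0
    x[1::2, 1::2] = (y - h - v + d) / 2.0
    return x
```

Because the transform is bijective, rescaling (where $\mathbf{a}$ is stored or modeled) reconstructs $\mathbf{x}$ exactly, while super-resolution replaces $\mathbf{a}$ with a sample mapped from the latent prior.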

For training the model, the negative log-likelihood is minimized as the basic objective. A pixel-wise L1 loss, and additionally a perceptual loss plus a GAN loss, can further be employed; the models trained with these adjunctive loss functions are referred to as HCFlow+ and HCFlow++, respectively.
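The objectives above can be sketched numerically as follows. This is a minimal illustration, assuming a standard Gaussian prior on the latent; the function names and the loss-weight parameters (`lambda_pix`, `lambda_percep`, `lambda_gan`) are hypothetical placeholders, not the paper's settings, and the perceptual/GAN terms are passed in as precomputed scalars:

```python
import numpy as np

def gaussian_nll(z, logdet):
    """Negative log-likelihood of latent z under a standard Gaussian
    prior, corrected by the flow's log-determinant (change of variables)."""
    d = z.size
    return 0.5 * np.sum(z ** 2) + 0.5 * d * np.log(2 * np.pi) - logdet

def total_loss(z, logdet, sr, hr,
               lambda_pix=0.0,                  # > 0 for HCFlow+
               percep_term=0.0, gan_term=0.0,   # precomputed scalars
               lambda_percep=0.0, lambda_gan=0.0):  # > 0 for HCFlow++
    nll = gaussian_nll(z, logdet)
    pix = np.mean(np.abs(sr - hr))  # pixel-wise L1 between SR and HR images
    return (nll
            + lambda_pix * pix
            + lambda_percep * percep_term
            + lambda_gan * gan_term)
```

With all extra weights at zero this reduces to the plain NLL objective; turning on the L1 weight gives the HCFlow+ setting, and turning on all three gives HCFlow++.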

Demonstrate the performance of the proposed method through experiments

The super-resolution performance is compared quantitatively and qualitatively on the DIV2K dataset (4x) and the CelebA dataset (8x).

210812-2 Quantitative performance on the DIV2K dataset (4x) super-resolution

210812-3 Qualitative performance on the DIV2K dataset (4x) super-resolution

210812-4 Quantitative performance on the CelebA dataset (8x) super-resolution

210812-5 Qualitative performance on the CelebA dataset (8x) super-resolution

The results demonstrate that the proposed method outperforms other baseline methods in super-resolution tasks, both in fidelity (PSNR/SSIM, with HCFlow) and in perceptual quality (LPIPS, with HCFlow++).

210812-6 Quantitative performance on the DIV2K dataset (4x) image rescaling

210812-7 Qualitative performance on the DIV2K dataset (4x) image rescaling

The proposed HCFlow also outperforms IRN, the previous flow-based image rescaling method, in the image rescaling task.
