# [2103.10081] Self-Supervised Adaptation for Video Super-Resolution

### Significance

Video super-resolution algorithms are further enhanced by self-supervised learning

### Keypoints

• Propose a self-supervised method for enhancing pre-trained video super-resolution algorithms
• Extensive experiments confirm the performance of the proposed method

### Review

#### Background

Self-supervised learning has a powerful advantage in that it does not require ground truth data during training. Furthermore, accumulating evidences suggest its potential practical powers. This work is in line with this trend, applying self-supervised learning to the video super-resolution problem.

#### Keypoints

##### Propose a self-supervised method for enhancing pre-trained video super-resolution algorithms

An important point to be considered when applying self-supervised learning is how to define the training target from the dataset without ground-truth. Key observation The key observation of this work is that there usually exists similar patches within the full video between the downscaled version of a large patch $\mathbf{a}_{gt}$ and a patch from its nearby frame $\mathbf{b}_{gt}$. However, this property does not hold for the frames resolved with the conventional neural networks, where quality of the downscaled version of a large patch $\mathbf{a}$ is much better than that of its corresponding small patch $\mathbf{b}$. The authors mention that these two patches $\mathbf{a}$ and $\mathbf{b}$ become similar when downscaled to $\mathbf{a}_{LR}$ and $\mathbf{b}_{LR}$. Based on the observation, $\mathbf{a}_{LR}$ and $\mathbf{a}$ serve as the input and target of the self-supervised training. Pseudocode of proposed self-supervised training Knowledge-distillation can also be applied to this self-supervised method to reduce training time. Knowledge distillation from large (teacher) network to small (student) network

##### Extensive experiments confirm the performance of the proposed method

Experiments include quantitative and qualitative improvement of spatial resolution (PSNR/SSIM) and temporal consistency (tOF), using models TOFlow, RBPN, EDVR. Spatial resolution of models with and without proposed method (quantitative) Spatial resolution of models with and without proposed method (qualitative) Temporal consistency of models with and without proposed method (quantitative) Temporal consistency of models with and without proposed method (qualitative) Further experiments on single-image super-resolution also show improved performance when applied to RCAN.