Image super-resolution models are commonly evaluated by average scores (over some benchmark test sets), which fail to reflect the performance of these models on images of varying difficulty and that some models generate artifacts on certain difficult images, which is not reflected by the average scores. We propose difficulty-aware performance evaluation procedures to better differentiate between SISR models that produce visually different results on some images but yield close average performance scores over the entire test set. In particular, we propose two image-difficulty measures, the high-frequency index and rotation-invariant edge index, to predict those test images, where a model would yield significantly better visual results over another model, and an evaluation method where these visual differences are reflected on objective measures. Experimental results demonstrate the effectiveness of the proposed image-difficulty measures and evaluation methodology.
翻译:暂无翻译