
Unsupervised Information-Theoretic Perceptual Quality Metric NeurIPS 2020
Explore a novel unsupervised approach using information-theoretic principles to assess image quality, inspired by efficient coding and slowness in biological systems. Learn about constructing probabilistic representations of images and optimizing objective functions for efficient coding and slowness principles.
Presentation Transcript
An Unsupervised Information-Theoretic Perceptual Quality Metric. NeurIPS 2020. Sangnie Bhardwaj, Ian Fischer, Johannes Ballé, Troy Chinen. Google Research.
Inspiration
Conventional deep-learning-based IQA models follow a supervised pipeline: large-scale classification data trains a classifier network, and an IQA subjective study then turns that network into an IQA model. Collecting these labels and ratings is cumbersome, costly, and slow.
Instead, take an information-theoretic perspective and train in an unsupervised way, guided by two principles that have been hypothesized to shape sensory processing in biological systems.
Efficient coding: the internal representation of images in the human visual system is optimized to efficiently represent the visual information it processes.
Slowness: image features that are not persistent across small time scales are likely to be uninformative for human decision making.
An information-theoretic perspective
We can construct a probabilistic representation by representing an image as a probability distribution over a latent space: $p(Z \mid X)$ is an encoder of the input, where X is the input image and Z is the representation of X.
We then train a variational approximation of $p(Z \mid X)$, namely $q(Z \mid X)$.
The Kullback-Leibler divergence between the representations of two images can then be used to measure the distance between them.
Kingma, D. P. and M. Welling (2013). "Auto-encoding variational Bayes." arXiv preprint arXiv:1312.6114.
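As a rough illustration of using the KL divergence as an image distance, the sketch below assumes each image is encoded as a single diagonal Gaussian (a mean and a variance per latent dimension), for which the KL divergence has a closed form. This is a simplifying assumption: a mixture of Gaussians has no closed-form KL and would need a sampling-based estimate. The function names and the symmetrized sum are illustrative choices, not taken from the slides.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Closed-form KL[p || q] for two diagonal Gaussians given per-dimension means/variances."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(mu_x, var_x, mu_y, var_y):
    """Symmetrized KL (an assumed choice), so that distance(x, y) == distance(y, x)."""
    return (kl_diag_gaussians(mu_x, var_x, mu_y, var_y)
            + kl_diag_gaussians(mu_y, var_y, mu_x, var_x))
```

With unit variances the KL reduces to half the squared Euclidean distance between the means, so two encodings whose means differ by 1 in a single dimension are 0.5 nats apart in each direction.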
Objective Function
Now we consider two input images X and Y, and the full encoder $p(Z \mid X, Y)$.
The objective function is set to be a variational bound on the multivariate mutual information (MMI) $I(X;Y;Z)$, which can be decomposed into three parts:
$I(X;Y;Z) = I(Z;X,Y) - I(X;Z \mid Y) - I(Y;Z \mid X)$
Maximize $I(Z;X,Y)$: efficient coding principle.
Minimize $I(X;Z \mid Y)$ and $I(Y;Z \mid X)$: slowness principle.
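The three-part decomposition of the multivariate mutual information, I(X;Y;Z) = I(Z;X,Y) - I(X;Z|Y) - I(Y;Z|X), can be checked numerically on a small discrete example. The sketch below uses a made-up joint distribution over three binary variables (`TOY_JOINT` is purely illustrative) and computes both sides from entropies; all helper names are assumptions for this example.

```python
import math
from collections import defaultdict

# Hypothetical joint distribution p(x, y, z) over binary variables; keys are (x, y, z).
TOY_JOINT = {
    (0, 0, 0): 0.30, (0, 0, 1): 0.05, (0, 1, 0): 0.10, (0, 1, 1): 0.05,
    (1, 0, 0): 0.05, (1, 0, 1): 0.10, (1, 1, 0): 0.05, (1, 1, 1): 0.30,
}
X, Y, Z = 0, 1, 2  # axis indices into the outcome tuples


def marginal(joint, axes):
    """Marginalize the joint onto the given axes."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[a] for a in axes)] += p
    return dict(out)


def entropy(joint, axes):
    """Shannon entropy (in nats) of the marginal over the given axes."""
    return -sum(p * math.log(p) for p in marginal(joint, axes).values() if p > 0)


def mutual_info(joint, a, b, cond=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(C) - H(A,B,C)."""
    a, b, cond = tuple(a), tuple(b), tuple(cond)
    return (entropy(joint, a + cond) + entropy(joint, b + cond)
            - entropy(joint, cond) - entropy(joint, a + b + cond))


def mmi(joint):
    """I(X;Y;Z) computed as I(X;Y) - I(X;Y|Z)."""
    return mutual_info(joint, (X,), (Y,)) - mutual_info(joint, (X,), (Y,), (Z,))


def mmi_decomposed(joint):
    """I(Z;X,Y) - I(X;Z|Y) - I(Y;Z|X)."""
    return (mutual_info(joint, (Z,), (X, Y))
            - mutual_info(joint, (X,), (Z,), (Y,))
            - mutual_info(joint, (Y,), (Z,), (X,)))
```

Calling `mmi(TOY_JOINT)` and `mmi_decomposed(TOY_JOINT)` should agree up to floating-point error, since the decomposition is an identity that holds for any joint distribution.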
Parameterization
Any probability distribution can be approximated by a mixture of multivariate Gaussian distributions:
$q(Z \mid X) = \sum_i w_i \, \mathcal{N}(\mu_i, \sigma_i^2)$
We use a mixture of 5 Gaussians to approximate the marginal encoder q, and 1 Gaussian to approximate the full encoder p.
The variance of all Gaussian distributions is fixed to 1. This yields 10 parameters for the marginal encoder q (5 means and 5 weights) and 1 parameter for the full encoder p (the mean).
$q(Z \mid X)$ and $q(Z \mid Y)$ share the same parameters.
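To make the parameterization concrete, here is a minimal sketch of evaluating the log-density of a unit-variance Gaussian mixture in one latent dimension, given 5 means and 5 unnormalized weights. Passing the weights through a softmax and the log-sum-exp evaluation are assumptions made for numerical convenience; the slides only state that there are 5 means and 5 weights with variance fixed to 1.

```python
import math


def softmax(logits):
    """Turn unnormalized weights into mixture weights that sum to 1."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_unit_gaussian(z, mean):
    """Log-density of N(mean, 1) at z."""
    return -0.5 * (z - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)


def log_mixture_density(z, means, logits):
    """Log-density of sum_i w_i N(mean_i, 1) at z, using log-sum-exp for stability."""
    weights = softmax(logits)
    terms = [math.log(w) + log_unit_gaussian(z, m) for w, m in zip(weights, means)]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

For example, `log_mixture_density(0.3, [-2.0, -1.0, 0.0, 1.0, 2.0], [0.0] * 5)` evaluates an equally weighted 5-component mixture; when all five means coincide, the mixture collapses to a single unit-variance Gaussian regardless of the weights.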
Training model
We use a pair of consecutive frames of a video to simulate a pair of similar but not identical images (inputs X and Y).
Frontend model (feature extraction): a steerable pyramid with no trainable parameters.
Diagram legend: LF: linear filter.
Training model (continued)
Diagram legend: R: rectified linear layer, 50 units each (left), 10 units each (right). L: linear layer.
Results
Experiment 1: on the BAPPS database.
Metrics:
2AFC: the fraction of human raters agreeing with the metric, as a percentage.
JND: area under the ROC curve (AUC).
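The 2AFC score can be sketched as follows. The sketch assumes the usual BAPPS setup: each trial has a reference and two distorted candidates, the metric gives a distance to each candidate (`d0`, `d1`), and `human_pref_p1` is the fraction of raters who judged the second candidate closer to the reference. The function and argument names are hypothetical, and the tie handling (half credit) is an assumption.

```python
def two_afc_score(d0, d1, human_pref_p1):
    """Average fraction of human raters agreeing with the metric, as a percentage.

    d0[i], d1[i]: metric distance from the reference to candidate 0 / candidate 1.
    human_pref_p1[i]: fraction of raters who preferred candidate 1 in trial i.
    """
    total = 0.0
    for a, b, h in zip(d0, d1, human_pref_p1):
        if a < b:        # metric prefers candidate 0
            total += 1.0 - h
        elif b < a:      # metric prefers candidate 1
            total += h
        else:            # tie: half credit (assumed convention)
            total += 0.5
    return 100.0 * total / len(d0)
```

For instance, with `d0 = [0.1, 0.9]`, `d1 = [0.5, 0.2]`, and `human_pref_p1 = [0.2, 0.8]`, the metric sides with 80% of raters in both trials, so the score is 80.0.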
Results
Experiment 2: on the CLIC 2020 database.
Results
Experiment 3: invariance under pixel shifts.
Image Y is generated by shifting image X by a small number of pixels.
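One simple way to build such a shifted pair is to take two equally sized crops of the same image, offset horizontally by a few pixels. The slide does not say how the shift was implemented, so the cropping approach and the name `shifted_pair` are assumptions for illustration; images are represented as nested lists of pixel values.

```python
def shifted_pair(image, shift):
    """Return two same-size crops (x, y) of `image`, with y offset `shift` pixels to the right.

    image: list of rows, each row a list of pixel values of equal length.
    """
    full_width = len(image[0])
    if shift < 0 or shift >= full_width:
        raise ValueError("shift must satisfy 0 <= shift < image width")
    width = full_width - shift
    x = [row[:width] for row in image]
    y = [row[shift:shift + width] for row in image]
    return x, y
```

A metric that is invariant under small shifts should report a distance close to zero between `x` and `y` for small values of `shift`.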
Results
Experiment 4: qualitative comparisons via ImageNet-C.
For a given metric, we computed the metric value between a reference and a corrupted image, and then found the amount of Gaussian noise that, added to the reference, yields the same metric value.
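The search for an equivalent amount of Gaussian noise can be sketched as a bisection over the noise standard deviation. This assumes the metric value grows monotonically with the noise level and uses a fixed noise pattern so the search is deterministic; neither detail is stated on the slide. Mean squared error stands in for the metric here, and `equivalent_noise_sigma` is a hypothetical helper name.

```python
import random


def mse(a, b):
    """Stand-in metric: mean squared error between two flattened images."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def equivalent_noise_sigma(metric, reference, target, sigma_max=10.0, iters=60, seed=0):
    """Find sigma such that metric(reference, reference + sigma * noise) is close to target.

    Assumes the metric is non-decreasing in sigma on [0, sigma_max].
    """
    rng = random.Random(seed)
    unit_noise = [rng.gauss(0.0, 1.0) for _ in reference]  # fixed noise pattern

    def value_at(sigma):
        noisy = [r + sigma * n for r, n in zip(reference, unit_noise)]
        return metric(reference, noisy)

    lo, hi = 0.0, sigma_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if value_at(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In use, `target` would be the metric value between the reference and a corrupted image, and the returned sigma gives a noise level that the metric treats as equally severe.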
Ablation Study
Summary
Uses an information-theoretic approach to measure the distance between two images.
It is an unsupervised approach and can avoid the bias of human raters.
Does not include pairwise comparisons between corresponding spatial locations in images.
It is robust with respect to architectural details.
The number of parameters is relatively small.
Cannot handle global distortions (blur, compression, frost) well.