Biblio

Filters: Author is Guha, Tanaya
2022-03-01
Roy, Debaleena, Guha, Tanaya, Sanchez, Victor.  2021.  Graph Based Transforms based on Graph Neural Networks for Predictive Transform Coding. 2021 Data Compression Conference (DCC). :367–367.
This paper introduces the GBT-NN, a novel class of Graph-based Transform within the context of block-based predictive transform coding using intra-prediction. The GBT-NN is constructed by learning a mapping function to map a graph Laplacian representing the covariance matrix of the current block. Our objective in learning such a mapping function is to design a GBT that performs as well as the KLT without requiring explicit computation of the covariance matrix for each residual block to be transformed. To avoid signalling any additional information required to compute the inverse GBT-NN, we also introduce a coding framework that uses a template-based prediction to predict residuals at the decoder. Evaluation results on several video frames and medical images, in terms of the percentage of preserved energy and mean square error, show that the GBT-NN can outperform the DST and DCT.
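
As a rough illustration of the transform step described in this abstract (not of the paper's learned GNN mapping): a graph-based transform can be obtained as the eigenbasis of a graph Laplacian, in the same way the KLT is obtained from a residual covariance matrix. The block size, chain-graph connectivity, and Laplacian construction below are placeholder assumptions for the sketch, not details taken from the paper.

    import numpy as np

    def gbt_from_laplacian(L):
        """Derive a graph-based transform (GBT) as the eigenbasis of a graph Laplacian.

        In the paper the Laplacian is predicted by a learned mapping; here it is
        simply an input. Its eigenvectors play the role that the eigenvectors of
        the residual covariance matrix play for the KLT.
        """
        eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
        return eigvecs                         # columns form the transform basis

    def transform_block(residual_block, basis):
        """Apply the (non-separable, vectorised) transform to a square residual block."""
        x = residual_block.reshape(-1)         # flatten the block to a vector
        return basis.T @ x                     # transform coefficients

    # Toy usage: a 4x4 residual block modelled as a chain graph over its 16 pixels
    # (placeholder connectivity, chosen only to make the sketch runnable).
    n = 16
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    L = np.diag(A.sum(axis=1)) - A             # combinatorial graph Laplacian
    basis = gbt_from_laplacian(L)
    coeffs = transform_block(np.random.randn(4, 4), basis)
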
2017-03-07
Kiran, Indra, Guha, Tanaya, Pandey, Gaurav.  2016.  Blind Image Quality Assessment Using Subspace Alignment. Proceedings of the Tenth Indian Conference on Computer Vision, Graphics and Image Processing. :91:1–91:6.

This paper addresses the problem of estimating the quality of an image as it would be perceived by a human. A well-accepted approach to assessing the perceptual quality of an image is to quantify its loss of structural information. We propose a blind image quality assessment method that aims at quantifying structural information loss in a given (possibly distorted) image by comparing its structures with those extracted from a database of clean images. We first construct a subspace from the clean natural images using (i) principal component analysis (PCA), and (ii) overcomplete dictionary learning with a sparsity constraint. While PCA provides mathematical convenience, an overcomplete dictionary is known to capture the perceptually important structures resembling the simple cells in the primary visual cortex. The subspace learned from the clean images is called the source subspace. Similarly, a subspace, called the target subspace, is learned from the distorted image. In order to quantify the structural information loss, we use a subspace alignment technique which transforms the target subspace into the source subspace by optimizing over a transformation matrix. This transformation matrix is subsequently used to measure the global and local (patch-based) quality scores of the distorted image. The quality scores obtained by the proposed method are shown to correlate well with the subjective scores obtained from human annotators. Our method achieves competitive results when evaluated on three benchmark databases.
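
A minimal sketch of the subspace-alignment idea referred to in this abstract, assuming PCA bases for both subspaces and a Frobenius-norm misalignment as a stand-in quality score; the patch extraction, subspace dimension, overcomplete-dictionary variant, and the paper's exact global and patch-wise scoring are not reproduced here.

    import numpy as np
    from sklearn.decomposition import PCA

    def pca_subspace(patches, dim):
        """Return the top-`dim` principal directions of a patch matrix (n_patches x n_pixels)."""
        pca = PCA(n_components=dim)
        pca.fit(patches)
        return pca.components_.T               # columns span the learned subspace

    def subspace_alignment_score(source_patches, target_patches, dim=8):
        """Toy structural-loss proxy via subspace alignment.

        Xs spans the source subspace learned from clean patches, Xt the target
        subspace learned from the distorted image. M = Xs^T Xt aligns one basis
        with the other; the residual misalignment is used here as a rough
        distortion score (a stand-in, not the paper's scoring function).
        """
        Xs = pca_subspace(source_patches, dim)
        Xt = pca_subspace(target_patches, dim)
        M = Xs.T @ Xt                          # alignment (transformation) matrix
        return np.linalg.norm(Xs @ M - Xt, 'fro')

    # Toy usage with random "patches" (real use would extract overlapping image patches).
    clean = np.random.randn(500, 64)           # e.g. 500 flattened 8x8 clean patches
    distorted = clean + 0.3 * np.random.randn(500, 64)
    score = subspace_alignment_score(clean, distorted, dim=8)
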