Biblio
A THz image edge detection approach based on wavelet and neural network is proposed in this paper. First, the source image is decomposed by wavelet, the edges in the low-frequency sub-image are detected using neural network method and the edges in the high-frequency sub-images are detected using wavelet transform method on the coarsest level of the wavelet decomposition, the two edge images are fused according to some fusion rules to obtain the edge image of this level, it then is projected to the next level. Afterwards the final edge image of L-1 level is got according to some fusion rule. This process is repeated until reaching the 0 level thus to get the final integrated and clear edge image. The experimental results show that our approach based on fusion technique is superior to Canny operator method and wavelet transform method alone.
Sparse and low rank matrix decomposition is a method that has recently been developed for estimating different components of hyperspectral data. The rank component is capable of preserving global data structures of data, while a sparse component can select the discriminative information by preserving details. In order to take advantage of both, we present a novel decision fusion based on joint low rank and sparse component (DFJLRS) method for hyperspectral imagery in this paper. First, we analyzed the effects of different components on classification results. Then a novel method adopts a decision fusion strategy which combines a SVM classifier with the information provided by joint sparse and low rank components. With combination of the advantages, the proposed method is both representative and discriminative. The proposed algorithm is evaluated using several hyperspectral images when compared with traditional counterparts.
Bi-dimensional empirical mode decomposition can decompose the source image into several Bi-dimensional Intrinsic Mode Functions. In the process of image decomposition, interpolation is needed and the upper and lower envelopes will be drawn. However, these interpolations and the drawings of upper and lower envelopes require a lot of computation time and manual screening. This paper proposes a simple but effective method that can maintain the characteristics of the original BEMD method, and the Hermite interpolation reconstruction method is used to replace the surface interpolation, and the variable neighborhood window method is used to replace the fixed neighborhood window method. We call it fast bi-dimensional empirical mode decomposition of the variable neighborhood window method based on research characteristics, and we finally complete the image fusion. The empirical analysis shows that this method can overcome the shortcomings that the source image features and details information of BIMF component decomposed from the original BEMD method are not rich enough, and reduce the calculation time, and the fusion quality is better.
Automatic emotion recognition using computer vision is significant for many real-world applications like photojournalism, virtual reality, sign language recognition, and Human Robot Interaction (HRI) etc., Psychological research findings advocate that humans depend on the collective visual conduits of face and body to comprehend human emotional behaviour. Plethora of studies have been done to analyse human emotions using facial expressions, EEG signals and speech etc., Most of the work done was based on single modality. Our objective is to efficiently integrate emotions recognized from facial expressions and upper body pose of humans using images. Our work on bimodal emotion recognition provides the benefits of the accuracy of both the modalities.
Classifying Hyperspectral images with few training samples is a challenging problem. The generative adversarial networks (GAN) are promising techniques to address the problems. GAN constructs an adversarial game between a discriminator and a generator. The generator generates samples that are not distinguishable by the discriminator, and the discriminator determines whether or not a sample is composed of real data. In this paper, by introducing multilayer features fusion in GAN and a dynamic neighborhood voting mechanism, a novel algorithm for HSIs classification based on 1-D GAN was proposed. Extracting and fusing multiple layers features in discriminator, and using a little labeled samples, we fine-tuned a new sample 1-D CNN spectral classifier for HSIs. In order to improve the accuracy of the classification, we proposed a dynamic neighborhood voting mechanism to classify the HSIs with spatial features. The obtained results show that the proposed models provide competitive results compared to the state-of-the-art methods.
There is an inevitable trade-off between spatial and spectral resolutions in optical remote sensing images. A number of data fusion techniques of multimodal images with different spatial and spectral characteristics have been developed to generate optical images with both spatial and spectral high resolution. Although some of the techniques take the spectral and spatial blurring process into account, there is no method that attempts to retrieve an optical image with both spatial and spectral high resolution, a spectral blurring filter and a spectral response simultaneously. In this paper, we propose a new framework of spatial resolution enhancement by a fusion of multiple optical images with different characteristics based on tensor decomposition. An optical image with both spatial and spectral high resolution, together with a spatial blurring filter and a spectral response, is generated via canonical polyadic (CP) decomposition of a set of tensors. Experimental results featured that relatively reasonable results were obtained by regularization based on nonnegativity and coupling.
A high definition(HD) wide dynamic video surveillance system is designed and implemented based on Field Programmable Gate Array(FPGA). This system is composed of three subsystems, which are video capture, video wide dynamic processing and video display subsystem. The images in the video are captured directly through the camera that is configured in a pattern have long exposure in odd frames and short exposure in even frames. The video data stream is buffered in DDR2 SDRAM to obtain two adjacent frames. Later, the image data fusion is completed by fusing the long exposure image with the short exposure image (pixel by pixel). The video image display subsystem can display the image through a HDMI interface. The system is designed on the platform of Lattice ECP3-70EA FPGA, and camera is the Panasonic MN34229 sensor. The experimental result shows that this system can expand dynamic range of the HD video with 30 frames per second and a resolution equal to 1920*1080 pixels by real-time wide dynamic range (WDR) video processing, and has a high practical value.
With the recent developments in the field of visual sensor technology, multiple imaging sensors are used in several applications such as surveillance, medical imaging and machine vision, in order to improve their capabilities. The goal of any efficient image fusion algorithm is to combine the visual information, obtained from a number of disparate imaging sensors, into a single fused image without the introduction of distortion or loss of information. The existing fusion algorithms employ either the mean or choose-max fusion rule for selecting the best features for fusion. The choose-max rule distorts constants background information whereas the mean rule blurs the edges. In this paper, Non-Subsampled Contourlet Transform (NSCT) based two feature-level fusion schemes are proposed and compared. In the first method Fuzzy logic is applied to determine the weights to be assigned to each segmented region using the salient region feature values computed. The second method employs Golden Section Algorithm (GSA) to achieve the optimal fusion weights of each region based on its Petrovic metric. The regions are merged adaptively using the weights determined. Experiments show that the proposed feature-level fusion methods provide better visual quality with clear edge information and objective quality metrics than individual multi-resolution-based methods such as Dual Tree Complex Wavelet Transform and NSCT.