Biblio

Filters: Keyword is Image reconstruction
2023-04-28
Huang, Wenwei, Cao, Chunhong, Hong, Sixia, Gao, Xieping.  2022.  ISTA-based Adaptive Sparse Sampling Network for Compressive Sensing MRI Reconstruction. 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). :999–1004.
The compressed sensing (CS) method can reconstruct images from a small amount of under-sampled data, making it an effective approach for fast magnetic resonance imaging (MRI). Traditional optimization-based models for MRI suffer from non-adaptive sampling and shallow representation ability, so they cannot characterize the rich patterns in MRI data. In this paper, we propose a CS MRI method based on the iterative shrinkage-thresholding algorithm (ISTA) and adaptive sparse sampling, called DSLS-ISTA-Net. Corresponding to the sampling and reconstruction stages of CS, the network framework consists of two parts: a sampling sub-network and an improved ISTA reconstruction sub-network, which are coordinated with each other through end-to-end training in an unsupervised way. The sampling and ISTA reconstruction sub-networks are responsible for adaptive sparse sampling and deep sparse representation, respectively. In the testing phase, we investigate different modules and parameters in the network structure and perform extensive experiments on MR images at different sampling rates to obtain the optimal network. By combining the advantages of model-based and deep-learning-based methods, and by taking both adaptive sampling and deep sparse representation into account, the proposed networks significantly improve reconstruction performance compared to state-of-the-art CS-MRI approaches.
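The classical ISTA iteration that this family of unrolled networks builds on can be sketched in plain Python; the tiny measurement matrix, step size, threshold, and iteration count below are illustrative toy assumptions, not values from the paper:

```python
def soft_threshold(v, t):
    """Element-wise soft-thresholding: the proximal operator of the l1 norm."""
    return [max(abs(x) - t, 0.0) * (1.0 if x >= 0 else -1.0) for x in v]

def ista(A, y, lam=0.1, step=0.5, iters=200):
    """Minimize 0.5*||A x - y||^2 + lam*||x||_1 by iterative shrinkage-thresholding."""
    m, n = len(y), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual A x - y, then gradient of the data-fidelity term: A^T (A x - y)
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by shrinkage
        x = soft_threshold([x[j] - step * g[j] for j in range(n)], step * lam)
    return x

# Recover a sparse vector from a toy under-determined system (2 measurements, 3 unknowns).
A = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5]]
y = [1.0, 0.0]            # generated by x_true = [1, 0, 0]
x_hat = ista(A, y)        # converges near [0.9, 0, 0] (l1 shrinkage biases x1 slightly)
```

Learned variants such as the one in the paper replace the fixed step size and threshold with trainable parameters per unrolled iteration.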
Lotfollahi, Mahsa, Tran, Nguyen, Gajjela, Chalapathi, Berisha, Sebastian, Han, Zhu, Mayerich, David, Reddy, Rohith.  2022.  Adaptive Compressive Sampling for Mid-Infrared Spectroscopic Imaging. 2022 IEEE International Conference on Image Processing (ICIP). :2336–2340.
Mid-infrared spectroscopic imaging (MIRSI) is an emerging class of label-free, biochemically quantitative technologies targeting digital histopathology. Conventional histopathology relies on chemical stains that alter tissue color. This approach is qualitative, often making histopathologic examination subjective and difficult to quantify. MIRSI addresses these challenges through quantitative and repeatable imaging that leverages native molecular contrast. Fourier transform infrared (FTIR) imaging, the best-known MIRSI technology, has two challenges that have hindered its widespread adoption: data collection speed and spatial resolution. Recent technological breakthroughs, such as photothermal MIRSI, provide an order of magnitude improvement in spatial resolution. However, this comes at the cost of acquisition speed, which is impractical for clinical tissue samples. This paper introduces an adaptive compressive sampling technique to reduce hyperspectral data acquisition time by an order of magnitude by leveraging spectral and spatial sparsity. This method identifies the most informative spatial and spectral features, integrates a fast tensor completion algorithm to reconstruct megapixel-scale images, and demonstrates speed advantages over FTIR imaging while providing spatial resolutions comparable to new photothermal approaches.
ISSN: 2381-8549
2023-01-13
Taneja, Vardaan, Chen, Pin-Yu, Yao, Yuguang, Liu, Sijia.  2022.  When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution. ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :4398–4402.
Recent studies have demonstrated the lack of robustness of image reconstruction networks to test-time evasion attacks, posing security risks and potential for misdiagnosis. In this paper, we evaluate for the first time how vulnerable such networks are to training-time poisoning attacks. In contrast to image classification, we find that trigger-embedded basic backdoor attacks on these models, executed using heuristics, lead to poor attack performance; thus, it is non-trivial to generate backdoor attacks for image reconstruction. To tackle the problem, we propose a bi-level optimization (BLO)-based attack generation method and investigate its effectiveness on image reconstruction. We show that BLO-generated backdoor attacks can yield a significant improvement over the heuristics-based attack strategy.
2022-11-08
Javaheripi, Mojan, Samragh, Mohammad, Fields, Gregory, Javidi, Tara, Koushanfar, Farinaz.  2020.  CleaNN: Accelerated Trojan Shield for Embedded Neural Networks. 2020 IEEE/ACM International Conference On Computer Aided Design (ICCAD). :1–9.
We propose Cleann, the first end-to-end framework that enables online mitigation of Trojans for embedded Deep Neural Network (DNN) applications. A Trojan attack works by injecting a backdoor in the DNN while training; during inference, the Trojan can be activated by the specific backdoor trigger. What differentiates Cleann from the prior work is its lightweight methodology which recovers the ground-truth class of Trojan samples without the need for labeled data, model retraining, or prior assumptions on the trigger or the attack. We leverage dictionary learning and sparse approximation to characterize the statistical behavior of benign data and identify Trojan triggers. Cleann is devised based on algorithm/hardware co-design and is equipped with specialized hardware to enable efficient real-time execution on resource-constrained embedded platforms. Proof of concept evaluations on Cleann for the state-of-the-art Neural Trojan attacks on visual benchmarks demonstrate its competitive advantage in terms of attack resiliency and execution overhead.
2022-06-06
Shin, Ho-Chul.  2019.  Abnormal Detection based on User Feedback for Abstracted Pedestrian Video. 2019 International Conference on Information and Communication Technology Convergence (ICTC). :1036–1038.
In this study, we present an abstracted pedestrian behavior representation and an abnormal-detection method based on user feedback for a pedestrian video surveillance system. Video surveillance data is large and difficult to process in real time. To address this problem, we suggest a method of expressing pedestrian behavior with an abbreviated map. In video surveillance systems, false detection of abnormal situations is a significant problem. If the surveillance operator can flag false-detection cases as a human in the loop, the system can learn from them and reduce false detections in the future. We therefore suggest a user-feedback-based abnormal pedestrian detection method. With the suggested user feedback algorithm, false detections can be reduced to less than 0.5%.
2021-04-27
Manchanda, R., Sharma, K..  2020.  A Review of Reconstruction Algorithms in Compressive Sensing. 2020 International Conference on Advances in Computing, Communication Materials (ICACCM). :322–325.
Compressive Sensing (CS) is a promising technology for signal acquisition. CS reduces the number of measurements needed to obtain signals that are compressible or sparse in some basis; the compressible or sparse nature of a signal can be obtained by transforming it into some domain. Depending on the signal's sparsity, CS samples below the Nyquist criterion. An optimization problem must then be solved to recover the original signal. Few studies have systematically surveyed signal reconstruction. Therefore, in this paper, the reconstruction algorithms for sparse signal recovery in CS are elaborated systematically. The discussion of the various reconstruction algorithms in this paper will help readers understand these algorithms efficiently.
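As an illustration of the greedy family of recovery algorithms such reviews typically cover, here is a minimal matching pursuit sketch in plain Python; the orthonormal toy dictionary and signal are assumptions for demonstration, not from the paper:

```python
def matching_pursuit(D, y, iters=10):
    """Greedy sparse recovery: repeatedly pick the dictionary atom (column of D)
    most correlated with the residual and subtract its contribution."""
    m, n = len(y), len(D[0])
    coeffs = [0.0] * n
    residual = list(y)
    for _ in range(iters):
        # Correlation of each (unit-norm) atom with the current residual.
        corr = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(n)]
        k = max(range(n), key=lambda j: abs(corr[j]))
        coeffs[k] += corr[k]
        residual = [residual[i] - corr[k] * D[i][k] for i in range(m)]
    return coeffs

# Toy dictionary with orthonormal atoms; y is 2*atom0 + 1*atom2.
D = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
y = [2.0, 0.0, 1.0]
c = matching_pursuit(D, y, iters=3)   # recovers [2.0, 0.0, 1.0]
```

Orthogonal matching pursuit, basis pursuit, and the iterative thresholding methods surveyed in the paper refine this basic pick-and-subtract loop.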
Sekar, K., Devi, K. Suganya, Srinivasan, P., SenthilKumar, V. M..  2020.  Deep Wavelet Architecture for Compressive sensing Recovery. 2020 Seventh International Conference on Information Technology Trends (ITT). :185–189.
Deep-learning-based compressive sensing (CS) has shown substantially improved performance and reduced run time for signal sampling and reconstruction. In most cases, however, these techniques suffer from disrupting artefacts or missing high-frequency content at low sampling ratios. The same occurs in the multi-resolution sampling method, which collects more components at lower frequencies. A promising innovation combining CS with convolutional neural networks has eliminated the sparsity constraint, yet recovery remains slow. We propose a deep wavelet-based compressive sensing method with a multi-resolution framework that improves both reconstruction quality and run time. The proposed model demonstrates outstanding quality on test functions compared with previous approaches.
2021-02-15
Zhu, L., Zhou, X., Zhang, X..  2020.  A Reversible Meaningful Image Encryption Scheme Based on Block Compressive Sensing. 2020 IEEE 3rd International Conference on Information Communication and Signal Processing (ICICSP). :326–330.
An efficient and reversible meaningful image encryption scheme is proposed in this paper. The plain image is first compressed and encrypted simultaneously by an Adaptive Block Compressive Sensing (ABCS) framework to create a noise-like secret image. Next, Least Significant Bit (LSB) embedding is employed to embed the secret image into a carrier image, generating the final meaningful cipher image. In this scheme, ABCS improves compression and efficiency performance, and the embedding and extraction operations are fully reversible. Simulation results and security analyses demonstrate the effectiveness, compression performance, and secrecy of the proposed scheme.
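The reversible LSB step can be sketched in a few lines of Python; the byte values below are illustrative, and real schemes operate on full pixel arrays rather than short lists:

```python
def lsb_embed(carrier, secret_bits):
    """Hide one bit per carrier byte in the least significant bit."""
    return [(c & ~1) | b for c, b in zip(carrier, secret_bits)]

def lsb_extract(stego, n_bits):
    """Recover the hidden bits; the embedded bit stream is exactly recoverable."""
    return [s & 1 for s in stego[:n_bits]]

carrier = [200, 105, 37, 64]   # e.g. grayscale pixel values of the carrier image
secret = [1, 0, 1, 1]          # bits of the noise-like secret image
stego = lsb_embed(carrier, secret)      # [201, 104, 37, 65]: at most +/-1 per pixel
recovered = lsb_extract(stego, 4)       # [1, 0, 1, 1]
```

Because each carrier byte changes by at most one, the cipher image remains visually meaningful while carrying the secret payload.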
Omori, T., Isono, Y., Kondo, K., Akamine, Y., Kidera, S..  2020.  k-Space Decomposition Based Super-resolution Three-dimensional Imaging Method for Millimeter Wave Radar. 2020 IEEE Radar Conference (RadarConf20). :1–6.
Millimeter wave imaging radar is indispensable for the collision avoidance of self-driving systems, especially in optically blurred conditions. The range points migration (RPM) method is one of the most promising imaging algorithms, offering a number of advantages over synthetic aperture radar (SAR) in terms of accuracy, computational complexity, and potential for multifunctional imaging. The inherent problem with RPM is that it suffers from lower angular resolution in a narrower frequency band, even if a higher-frequency signal, e.g., millimeter wave, is exploited. To address this problem, the k-space decomposition based RPM has been developed. This paper focuses on the experimental validation of this method using an X-band or millimeter wave radar system, and demonstrates that our method significantly enhances reconstruction accuracy in three-dimensional images for two simple spheres and realistic vehicle targets.
2021-02-01
Wang, H., Li, Y., Wang, Y., Hu, H., Yang, M.-H..  2020.  Collaborative Distillation for Ultra-Resolution Universal Style Transfer. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :1857–1866.
Universal style transfer methods typically leverage rich representations from deep Convolutional Neural Network (CNN) models (e.g., VGG-19) pre-trained on large collections of images. Despite the effectiveness, its application is heavily constrained by the large model size to handle ultra-resolution images given limited memory. In this work, we present a new knowledge distillation method (named Collaborative Distillation) for encoder-decoder based neural style transfer to reduce the convolutional filters. The main idea is underpinned by a finding that the encoder-decoder pairs construct an exclusive collaborative relationship, which is regarded as a new kind of knowledge for style transfer models. Moreover, to overcome the feature size mismatch when applying collaborative distillation, a linear embedding loss is introduced to drive the student network to learn a linear embedding of the teacher's features. Extensive experiments show the effectiveness of our method when applied to different universal style transfer approaches (WCT and AdaIN), even if the model size is reduced by 15.5 times. Especially, on WCT with the compressed models, we achieve ultra-resolution (over 40 megapixels) universal style transfer on a 12GB GPU for the first time. Further experiments on optimization-based stylization scheme show the generality of our algorithm on different stylization paradigms. Our code and trained models are available at https://github.com/mingsun-tse/collaborative-distillation.
Jin, H., Wang, T., Zhang, M., Li, M., Wang, Y., Snoussi, H..  2020.  Neural Style Transfer for Picture with Gradient Gram Matrix Description. 2020 39th Chinese Control Conference (CCC). :7026–7030.
Despite the high performance of neural style transfer on stylized pictures, we found that the Gatys et al. [1] algorithm cannot perfectly reconstruct texture style. The output stylized picture can exhibit unsatisfactory, unexpected textures, such as muddiness in local areas and insufficient grain expression. Our method is based on the original algorithm, adding a Gradient Gram description to the style loss, aiming to strengthen texture expression and eliminate muddiness. To some extent our method lengthens the runtime; however, its output stylized pictures achieve higher performance on texture details, especially in the elimination of muddiness.
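The Gram-matrix style description that this line of work builds on can be sketched in plain Python; the tiny 2-channel feature maps are illustrative assumptions (real implementations use deep CNN feature tensors):

```python
def gram_matrix(features):
    """Channel-by-channel inner products of flattened feature maps.
    features[c] is the flattened activation map of channel c."""
    C, N = len(features), len(features[0])
    return [[sum(features[a][k] * features[b][k] for k in range(N)) / N
             for b in range(C)] for a in range(C)]

def style_loss(feat_out, feat_style):
    """Squared Frobenius distance between the two Gram matrices."""
    G, S = gram_matrix(feat_out), gram_matrix(feat_style)
    C = len(G)
    return sum((G[a][b] - S[a][b]) ** 2 for a in range(C) for b in range(C))

# Two channels, four spatial positions each.
out = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
sty = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
sty2 = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]

loss_same = style_loss(out, sty)    # 0.0: Gram statistics ignore spatial layout
loss_diff = style_loss(out, sty2)   # 0.5: channel statistics genuinely differ
```

That `loss_same == 0` case is exactly the blindness the paper targets: the plain Gram matrix discards spatial arrangement, so augmenting it (here, with gradient information) is needed to capture finer texture.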
2021-01-15
Khalid, H., Woo, S. S..  2020.  OC-FakeDect: Classifying Deepfakes Using One-class Variational Autoencoder. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :2794–2803.
An image forgery method called Deepfakes can cause security and privacy issues by changing the identity of a person in a photo through the replacement of his or her face with a computer-generated image or another person's face. Therefore, a new challenge of detecting Deepfakes arises to protect individuals from potential misuse. Many researchers have proposed various binary-classification-based detection approaches. However, binary-classification-based methods generally require a large amount of both real and fake face images for training, and it is challenging to collect sufficient fake image data in advance. Besides, when new Deepfake generation methods are introduced, little Deepfake data will be available, and detection performance may be mediocre. To overcome these data-scarcity limitations, we formulate Deepfake detection as a one-class anomaly detection problem. We propose OC-FakeDect, which uses a one-class Variational Autoencoder (VAE) to train only on real face images and detects non-real images such as Deepfakes by treating them as anomalies. Our preliminary results show that our one-class-based approach can be promising for detecting Deepfakes, achieving 97.5% accuracy on the NeuralTextures data of the well-known FaceForensics++ benchmark dataset without using any fake images in the training process.
2020-12-17
Amrouche, F., Lagraa, S., Frank, R., State, R..  2020.  Intrusion detection on robot cameras using spatio-temporal autoencoders: A self-driving car application. 2020 IEEE 91st Vehicular Technology Conference (VTC2020-Spring). :1–5.

Robot Operating System (ROS) is becoming more and more important and is widely used by developers and researchers in various domains. One of the most important fields where it is being used is the self-driving car industry. However, this framework is far from totally secure, and existing security breaches do not have robust solutions. In this paper we focus on camera vulnerabilities, as the camera is often the most important source for environment discovery and the decision-making process. We propose an unsupervised anomaly detection tool for detecting suspicious frames incoming from camera flows. Our solution is based on spatio-temporal autoencoders used to faithfully reconstruct the camera frames and detect abnormal ones by measuring the difference with the input. We test our approach on a real-world dataset, i.e., flows coming from embedded cameras of self-driving cars. Our solution outperforms existing works on different scenarios.
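The detection criterion, scoring each frame by its reconstruction error and flagging it when the error exceeds a threshold, can be sketched in plain Python; the trained autoencoder is replaced here by an assumed moving-average "reconstructor", and the threshold and frame values are illustrative:

```python
def reconstruction_error(frame, reconstruction):
    """Mean squared difference between an input frame and its reconstruction."""
    n = len(frame)
    return sum((frame[i] - reconstruction[i]) ** 2 for i in range(n)) / n

def is_anomalous(frame, reconstruct, threshold=0.05):
    """Flag a frame whose reconstruction deviates too much from the input."""
    return reconstruction_error(frame, reconstruct(frame)) > threshold

# Stand-in for the autoencoder: a moving-average smoother reconstructs the
# smooth frames it was "trained" on well, but fails on abrupt, tampered ones.
def reconstruct(frame):
    out = []
    for i in range(len(frame)):
        lo, hi = max(0, i - 1), min(len(frame), i + 2)
        out.append(sum(frame[lo:hi]) / (hi - lo))
    return out

normal = [0.50, 0.52, 0.51, 0.53, 0.52]     # smooth pixel row: low error
tampered = [0.50, 0.52, 1.00, 0.00, 0.52]   # injected spike: high error
```

In the paper's setting the reconstructor is the spatio-temporal autoencoder and the frames are full camera images, but the anomaly decision has this same shape.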

2020-12-14
Efendioglu, H. S., Asik, U., Karadeniz, C..  2020.  Identification of Computer Displays Through Their Electromagnetic Emissions Using Support Vector Machines. 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA). :1–5.
As a TEMPEST information security problem, electromagnetic emissions from computer displays can be captured and reconstructed using signal processing techniques. It is necessary to identify the display type in order to intercept the image on the display. Determining the display type is significant not only for attackers but also for protectors aiming to prevent compromising display emanations. This study relates to identifying the display type from the electromagnetic emissions of computer displays using Support Vector Machines (SVM). After measuring the emissions with a receiver measurement system, the signals were processed, training/test data sets were formed, and the classification performance for the displays was examined with the SVM. Moreover, solutions for better classification under real conditions are proposed. Thus, one of the important steps of display image capture can be accomplished by automatically identifying the display type. The performance of the proposed method was evaluated in terms of the confusion matrix and the accuracy, precision, F1-score, and recall performance measures.
2020-12-11
Peng, M., Wu, Q..  2019.  Enhanced Style Transfer in Real-Time with Histogram-Matched Instance Normalization. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). :2001–2006.

Since neural networks are utilized to extract information from an image, Gatys et al. found that they could separate the content and style of images and recombine them into another image, a process called style transfer. Moreover, many feed-forward neural networks have been suggested to speed up the original method and make style transfer a practical application. However, this comes at a price: these feed-forward networks are unchangeable because of their fixed parameters, meaning we cannot transfer arbitrary styles in real time but only a single one. Some coordinated approaches have been offered to relieve this dilemma, such as a style-swap layer and an adaptive instance normalization (AdaIN) layer. It is worth mentioning that the AdaIN layer only aligns the means and variances of the content feature maps with those of the style feature maps. Our method aims to present an operational approach that enables arbitrary style transfer in real time, preserving more statistical information through histogram matching and providing more reliable texture clarity and more humane user control. We achieve better performance than existing approaches without adding computational complexity, at a speed comparable to the fastest style transfer methods. Our method provides more flexible user control and trustworthy quality and stability.
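The AdaIN alignment that the abstract contrasts with histogram matching can be sketched in plain Python for a single flattened feature channel; the feature values are toy assumptions:

```python
def mean_std(xs):
    """Mean and (population) standard deviation of a flattened feature channel."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, var ** 0.5

def adain(content, style, eps=1e-5):
    """Adaptive instance normalization: normalize the content features, then
    rescale and shift them to the style features' mean and standard deviation."""
    cm, cs = mean_std(content)
    sm, ss = mean_std(style)
    return [sm + ss * (x - cm) / (cs + eps) for x in content]

content = [1.0, 2.0, 3.0, 4.0]    # one flattened content feature channel
style = [10.0, 10.0, 20.0, 20.0]  # matching style feature channel
out = adain(content, style)       # out now has mean 15 and std 5, like style
```

AdaIN matches only these two moments per channel; histogram matching, as the paper argues, transfers the full per-channel distribution.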

Cao, Y., Tang, Y..  2019.  Development of Real-Time Style Transfer for Video System. 2019 3rd International Conference on Circuits, System and Simulation (ICCSS). :183–187.

Re-drawing an image in a certain artistic style is considered a complicated task for a computer, whereas humans can easily master the method of composing and describing the style shared between different images. In the past, many researchers studying deep neural networks found an appropriate representation of artistic style using a perceptual loss and a style reconstruction loss. In previous work, Gatys et al. proposed an artificial system based on convolutional neural networks that creates artistic images of high perceptual quality. In terms of running speed, however, it was relatively time-consuming and thus could not be applied to video style transfer. Recently, a feed-forward CNN approach has shown the potential for fast style transformation: an end-to-end system without hundreds of iterations per transfer. We combined the benefits of both approaches, optimized the feed-forward network, and defined a temporal loss function to make it possible to implement style transfer on video in real time. In contrast to past methods, our method runs in real time at higher resolution while creating competitive, visually pleasing, and temporally consistent experimental results.

2020-12-07
Wang, C., He, M..  2018.  Image Style Transfer with Multi-target Loss for IoT Applications. 2018 15th International Symposium on Pervasive Systems, Algorithms and Networks (I-SPAN). :296–299.

Transferring the style of an image is a fundamental problem in computer vision: extracting the features of a context image and a style image, then fusing them to produce a new image with features of both inputs. In this paper, we introduce an artificial system to separate and recombine the content and style of arbitrary images, providing a neural algorithm for the creation of artistic images. We use a pre-trained deep convolutional neural network, VGG19, to extract the feature maps of the input style image and context image. We then define a loss function that captures the difference between the output image and the two input images, and use the gradient descent algorithm to update the output image to minimize this loss. Experimental results show the feasibility of the method.
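The optimize-the-output-image loop described above can be sketched in plain Python on a toy "image" vector; the content term and the single style statistic (a mean, standing in for VGG19 feature and Gram-matrix terms), the weights, and the learning rate are all illustrative assumptions:

```python
def total_loss(x, content, style_target, alpha=1.0, beta=0.5):
    """Multi-target loss: stay close to the content plus match a style statistic."""
    lc = sum((x[i] - content[i]) ** 2 for i in range(len(x)))
    xm = sum(x) / len(x)
    ls = (xm - style_target) ** 2
    return alpha * lc + beta * ls

def gradient_step(x, content, style_target, lr=0.1, alpha=1.0, beta=0.5):
    """One gradient-descent update of the output image itself."""
    n = len(x)
    xm = sum(x) / n
    return [x[i] - lr * (2 * alpha * (x[i] - content[i])
                         + 2 * beta * (xm - style_target) / n)
            for i in range(n)]

content = [0.2, 0.4, 0.6, 0.8]   # stand-in "context image"
style_mean = 0.9                 # stand-in style statistic to match
x = list(content)
for _ in range(500):
    x = gradient_step(x, content, style_mean)
# x ends up a compromise: near the content, with its mean pulled toward 0.9.
```

The real method performs the same descent on pixel values, only with VGG19 feature losses in place of these scalar stand-ins.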

2020-09-18
Yudin, Oleksandr, Ziubina, Ruslana, Buchyk, Serhii, Frolov, Oleg, Suprun, Olha, Barannik, Natalia.  2019.  Efficiency Assessment of the Steganographic Coding Method with Indirect Integration of Critical Information. 2019 IEEE International Conference on Advanced Trends in Information Theory (ATIT). :36–40.
The presented method of encoding and steganographically embedding a series of bits of a hidden message was first developed by modifying the digital platform (bases) of the elements of the container image. Unlike other methods, steganographic coding and embedding are accomplished by changing the elements of an image fragment, followed by forming code structures for the established structure of the digital representation of the image's structural elements. A method for estimating quantitative indicators of embedded critical data is presented, and the number of container bits for the developed method of steganographic coding and embedding of critical information is estimated. The efficiency of the presented method is evaluated through a comparative analysis of the amount of embedded digital data relative to the method based on the weight coefficients of the discrete cosine transform matrix, and the developed steganographic coding method is compared with the Koch and Zhao methods to determine the embedded data's resistance against attacks of various types. It is determined that, for different values of the quantization coefficient, the most critical are the embedded containers of critical information built by changing part of the digital video data platform, depending on the size of the digital platform and the number of bits of the embedded container.
2020-09-14
Anselmi, Nicola, Poli, Lorenzo, Oliveri, Giacomo, Rocca, Paolo, Massa, Andrea.  2019.  Dealing with Correlation and Sparsity for an Effective Exploitation of the Compressive Processing in Electromagnetic Inverse Problems. 2019 13th European Conference on Antennas and Propagation (EuCAP). :1–4.
In this paper, a novel method for tomographic microwave imaging based on the Compressive Processing (CP) paradigm is proposed. The retrieval of the dielectric profiles of the scatterers is carried out by efficiently solving both the sampling and the sensing problems suitably formulated under the first order Born approximation. Selected numerical results are presented in order to show the improvements provided by the CP with respect to conventional compressive sensing (CSE) approaches.
Wang, Lizhi, Xiong, Zhiwei, Huang, Hua, Shi, Guangming, Wu, Feng, Zeng, Wenjun.  2019.  High-Speed Hyperspectral Video Acquisition By Combining Nyquist and Compressive Sampling. IEEE Transactions on Pattern Analysis and Machine Intelligence. 41:857–870.
We propose a novel hybrid imaging system to acquire 4D high-speed hyperspectral (HSHS) videos with high spatial and spectral resolution. The proposed system consists of two branches: one branch performs Nyquist sampling in the temporal dimension while integrating the whole spectrum, resulting in a high-frame-rate panchromatic video; the other branch performs compressive sampling in the spectral dimension with longer exposures, resulting in a low-frame-rate hyperspectral video. Owing to the high light throughput and complementary sampling, these two branches jointly provide reliable measurements for recovering the underlying HSHS video. Moreover, the panchromatic video can be used to learn an over-complete 3D dictionary to represent each band-wise video sparsely, thanks to the inherent structural similarity in the spectral dimension. Based on the joint measurements and the self-adaptive dictionary, we further propose a simultaneous spectral sparse (3S) model to reinforce the structural similarity across different bands and develop an efficient computational reconstruction algorithm to recover the HSHS video. Both simulation and hardware experiments validate the effectiveness of the proposed approach. To the best of our knowledge, this is the first time that hyperspectral videos can be acquired at a frame rate up to 100fps with commodity optical elements and under ordinary indoor illumination.
Wang, Hui, Yan, Qiurong, Li, Bing, Yuan, Chenglong, Wang, Yuhao.  2019.  Sampling Time Adaptive Single-Photon Compressive Imaging. IEEE Photonics Journal. 11:1–10.
We propose a time-adaptive sampling method and demonstrate a sampling-time-adaptive single-photon compressive imaging system. To achieve self-adapting adjustment of the sampling time, a threshold on the accuracy of light-intensity estimation is derived. Based on this threshold, a sampling control module built on a field-programmable gate array is developed. Finally, the advantage of the time-adaptive sampling method is demonstrated experimentally. Imaging performance experiments show that the method can automatically adjust the sampling time in response to changes in the light intensity of the imaged object, obtaining an image of better quality and avoiding speculative selection of the sampling time.
Quang-Huy, Tran, Nguyen, Van Dien, Nguyen, Van Dung, Duc-Tan, Tran.  2019.  Density Imaging Using a Compressive Sampling DBIM approach. 2019 International Conference on Advanced Technologies for Communications (ATC). :160–163.
Density information has been used as a property of sound to restore objects quantitatively in ultrasound tomography based on backscatter theory. In the traditional method, the authors only study the distorted Born iterative method (DBIM) for creating density images using Tikhonov regularization. The downsides are that image quality is still low, resolution is low, and the convergence rate is not high. In this paper, we study the DBIM method for creating density images using a compressive sampling technique, in which the probes are randomly distributed on the measurement system (unlike the traditional method, where the probes are evenly distributed). This approach uses l1 regularization to restore images. The proposed method gives superior results in image recovery quality and spatial resolution. Its limitation is that the imaging time is longer than in the traditional method, although fewer iterations are used.
2020-07-03
Huijuan, Wang, Yong, Jiang, Xingmin, Ma.  2019.  Fast Bi-dimensional Empirical Mode based Multisource Image Fusion Decomposition. 2019 28th Wireless and Optical Communications Conference (WOCC). :1–4.

Bi-dimensional empirical mode decomposition (BEMD) can decompose a source image into several bi-dimensional intrinsic mode functions (BIMFs). In the process of image decomposition, interpolation is needed and upper and lower envelopes are drawn; however, these interpolations and envelope drawings require a lot of computation time and manual screening. This paper proposes a simple but effective method that maintains the characteristics of the original BEMD method: Hermite interpolation reconstruction replaces the surface interpolation, and a variable-neighborhood window replaces the fixed-neighborhood window. We call it fast bi-dimensional empirical mode decomposition with the variable-neighborhood window method, and we finally complete the image fusion. The empirical analysis shows that this method can overcome the shortcoming that the source image features and detail information of the BIMF components decomposed by the original BEMD method are not rich enough, while reducing computation time and achieving better fusion quality.

Singh, Neha, Joshi, Sandeep, Birla, Shilpi.  2019.  Suitability of Singular Value Decomposition for Image Watermarking. 2019 6th International Conference on Signal Processing and Integrated Networks (SPIN). :983–986.

Digital images are extensively used and exchanged over the internet, which gives rise to the need to establish authorship of images. Image watermarking provides a solution to prevent false claims of ownership of the media. Information about the owner, generally in the form of a logo, text, or image, is imperceptibly hidden in the subject. Many transforms have been explored by the research community for image watermarking, and many watermarking techniques have been developed based on the Singular Value Decomposition (SVD) of images. This paper analyses Singular Value Decomposition to understand its use, ability, and limitations for hiding additional information in a cover image for digital image watermarking applications.

2020-06-19
Ly, Son Thai, Do, Nhu-Tai, Lee, Guee-Sang, Kim, Soo-Hyung, Yang, Hyung-Jeong.  2019.  A 3d Face Modeling Approach for in-The-Wild Facial Expression Recognition on Image Datasets. 2019 IEEE International Conference on Image Processing (ICIP). :3492–3496.

This paper explores the benefits of 3D face modeling for in-the-wild facial expression recognition (FER). Since in-the-wild 3D FER datasets are limited, we first construct 3D facial data from an available 2D dataset using recent advances in 3D face reconstruction. The 3D facial geometry representation is then extracted by a deep learning technique. In addition, we take advantage of manipulating the 3D face, such as using 2D projected images of the 3D face as additional input for FER. These features are then fused with those of a typical 2D FER network. By doing so, despite using common approaches, we achieve competitive recognition accuracy on the Real-World Affective Faces (RAF) database and Static Facial Expressions in the Wild (SFEW 2.0) compared with state-of-the-art reports. To the best of our knowledge, this is the first time such a deep learning combination of 3D and 2D facial modalities has been presented in the context of in-the-wild FER.