Biblio

List
Filter

Found 87 results

Filters: Keyword is image classification [Clear All Filters]

2023-09-20

Samia, Bougareche, Soraya, Zehani, Malika, Mimi. 2022. Fashion Images Classification using Machine Learning, Deep Learning and Transfer Learning Models. 2022 7th International Conference on Image and Signal Processing and their Applications (ISPA). :1—5.

Fashion is the way we present ourselves which mainly focuses on vision, has attracted great interest from computer vision researchers. It is generally used to search fashion products in online shopping malls to know the descriptive information of the product. The main objectives of our paper is to use deep learning (DL) and machine learning (ML) methods to correctly identify and categorize clothing images. In this work, we used ML algorithms (support vector machines (SVM), K-Nearest Neirghbors (KNN), Decision tree (DT), Random Forest (RF)), DL algorithms (Convolutionnal Neurals Network (CNN), AlexNet, GoogleNet, LeNet, LeNet5) and the transfer learning using a pretrained models (VGG16, MobileNet and RestNet50). We trained and tested our models online using google colaboratory with Tensorflow/Keras and Scikit-Learn libraries that support deep learning and machine learning in Python. The main metric used in our study to evaluate the performance of ML and DL algorithms is the accuracy and matrix confusion. The best result for the ML models is obtained with the use of ANN (88.71%) and for the DL models is obtained for the GoogleNet architecture (93.75%). The results obtained showed that the number of epochs and the depth of the network have an effect in obtaining the best results.

2023-07-20

Shetty, Pallavi, Joshi, Kapil, Raman, Dr. Ramakrishnan, Rao, K. Naga Venkateshwara, Kumar, Dr. A. Vijaya, Tiwari, Mohit. 2022. A Framework of Artificial Intelligence for the Manufacturing and Image Classification system. 2022 5th International Conference on Contemporary Computing and Informatics (IC3I). :1504—1508.

Artificial intelligence (AI) has been successfully employed in industries for decades, beginning with the invention of expert systems in the 1960s and continuing through the present ubiquity of deep learning. Data-driven AI solutions have grown increasingly common as a means of supporting ever-more complicated industrial processes owing to the accessibility of affordable computer and storage infrastructure. Despite recent optimism, implementing AI to smart industrial applications still offers major difficulties. The present paper gives an executive summary of AI methodologies with an emphasis on deep learning before detailing unresolved issues in AI safety, data privacy, and data quality — all of which are necessary for completely automated commercial AI systems.

2022-11-18

De la Parra, Cecilia, El-Yamany, Ahmed, Soliman, Taha, Kumar, Akash, Wehn, Norbert, Guntoro, Andre. 2021. Exploiting Resiliency for Kernel-Wise CNN Approximation Enabled by Adaptive Hardware Design. 2021 IEEE International Symposium on Circuits and Systems (ISCAS). :1–5.

Efficient low-power accelerators for Convolutional Neural Networks (CNNs) largely benefit from quantization and approximation, which are typically applied layer-wise for efficient hardware implementation. In this work, we present a novel strategy for efficient combination of these concepts at a deeper level, which is at each channel or kernel. We first apply layer-wise, low bit-width, linear quantization and truncation-based approximate multipliers to the CNN computation. Then, based on a state-of-the-art resiliency analysis, we are able to apply a kernel-wise approximation and quantization scheme with negligible accuracy losses, without further retraining. Our proposed strategy is implemented in a specialized framework for fast design space exploration. This optimization leads to a boost in estimated power savings of up to 34% in residual CNN architectures for image classification, compared to the base quantized architecture.

Khoshavi, Navid, Sargolzaei, Saman, Bi, Yu, Roohi, Arman. 2021. Entropy-Based Modeling for Estimating Adversarial Bit-flip Attack Impact on Binarized Neural Network. 2021 26th Asia and South Pacific Design Automation Conference (ASP-DAC). :493–498.

Over past years, the high demand to efficiently process deep learning (DL) models has driven the market of the chip design companies. However, the new Deep Chip architectures, a common term to refer to DL hardware accelerator, have slightly paid attention to the security requirements in quantized neural networks (QNNs), while the black/white -box adversarial attacks can jeopardize the integrity of the inference accelerator. Therefore in this paper, a comprehensive study of the resiliency of QNN topologies to black-box attacks is examined. Herein, different attack scenarios are performed on an FPGA-processor co-design, and the collected results are extensively analyzed to give an estimation of the impact’s degree of different types of attacks on the QNN topology. To be specific, we evaluated the sensitivity of the QNN accelerator to a range number of bit-flip attacks (BFAs) that might occur in the operational lifetime of the device. The BFAs are injected at uniformly distributed times either across the entire QNN or per individual layer during the image classification. The acquired results are utilized to build the entropy-based model that can be leveraged to construct resilient QNN architectures to bit-flip attacks.

2022-10-16

MaungMaung, AprilPyone, Kiya, Hitoshi. 2021. Ensemble of Key-Based Models: Defense Against Black-Box Adversarial Attacks. 2021 IEEE 10th Global Conference on Consumer Electronics (GCCE). :95–98.

We propose a voting ensemble of models trained by using block-wise transformed images with secret keys against black-box attacks. Although key-based adversarial defenses were effective against gradient-based (white-box) attacks, they cannot defend against gradient-free (black-box) attacks without requiring any secret keys. In the proposed ensemble, a number of models are trained by using images transformed with different keys and block sizes, and then a voting ensemble is applied to the models. Experimental results show that the proposed defense achieves a clean accuracy of 95.56 % and an attack success rate of less than 9 % under attacks with a noise distance of 8/255 on the CIFAR-10 dataset.

2022-06-06

Uchida, Hikaru, Matsubara, Masaki, Wakabayashi, Kei, Morishima, Atsuyuki. 2020. Human-in-the-loop Approach towards Dual Process AI Decisions. 2020 IEEE International Conference on Big Data (Big Data). :3096–3098.

How to develop AI systems that can explain how they made decisions is one of the important and hot topics today. Inspired by the dual-process theory in psychology, this paper proposes a human-in-the-loop approach to develop System-2 AI that makes an inference logically and outputs interpretable explanation. Our proposed method first asks crowd workers to raise understandable features of objects of multiple classes and collect training data from the Internet to generate classifiers for the features. Logical decision rules with the set of generated classifiers can explain why each object is of a particular class. In our preliminary experiment, we applied our method to an image classification of Asian national flags and examined the effectiveness and issues of our method. In our future studies, we plan to combine the System-2 AI with System-1 AI (e.g., neural networks) to efficiently output decisions.

2021-08-02

Peng, Ye, Fu, Guobin, Luo, Yingguang, Yu, Qi, Li, Bin, Hu, Jia. 2020. A Two-Layer Moving Target Defense for Image Classification in Adversarial Environment. 2020 IEEE 6th International Conference on Computer and Communications (ICCC). :410—414.

Deep learning plays an increasingly important role in various fields due to its superior performance, and it also achieves advanced recognition performance in the field of image classification. However, the vulnerability of deep learning in the adversarial environment cannot be ignored, and the prediction result of the model is likely to be affected by the small perturbations added to the samples by the adversary. In this paper, we propose a two-layer dynamic defense method based on defensive techniques pool and retrained branch model pool. First, we randomly select defense methods from the defense pool to process the input. The perturbation ability of the adversarial samples preprocessed by different defense methods changed, which would produce different classification results. In addition, we conduct adversarial training based on the original model and dynamically generate multiple branch models. The classification results of these branch models for the same adversarial sample is inconsistent. We can detect the adversarial samples by using the inconsistencies in the output results of the two layers. The experimental results show that the two-layer dynamic defense method we designed achieves a good defense effect.

2021-04-08

Boato, G., Dang-Nguyen, D., Natale, F. G. B. De. 2020. Morphological Filter Detector for Image Forensics Applications. IEEE Access. 8:13549—13560.

Mathematical morphology provides a large set of powerful non-linear image operators, widely used for feature extraction, noise removal or image enhancement. Although morphological filters might be used to remove artifacts produced by image manipulations, both on binary and gray level documents, little effort has been spent towards their forensic identification. In this paper we propose a non-trivial extension of a deterministic approach originally detecting erosion and dilation of binary images. The proposed approach operates on grayscale images and is robust to image compression and other typical attacks. When the image is attacked the method looses its deterministic nature and uses a properly trained SVM classifier, using the original detector as a feature extractor. Extensive tests demonstrate that the proposed method guarantees very high accuracy in filtering detection, providing 100% accuracy in discriminating the presence and the type of morphological filter in raw images of three different datasets. The achieved accuracy is also good after JPEG compression, equal or above 76.8% on all datasets for quality factors above 80. The proposed approach is also able to determine the adopted structuring element for moderate compression factors. Finally, it is robust against noise addition and it can distinguish morphological filter from other filters.

Rhee, K. H.. 2020. Composition of Visual Feature Vector Pattern for Deep Learning in Image Forensics. IEEE Access. 8:188970—188980.

In image forensics, to determine whether the image is impurely transformed, it extracts and examines the features included in the suspicious image. In general, the features extracted for the detection of forgery images are based on numerical values, so it is somewhat unreasonable to use in the CNN structure for image classification. In this paper, the extraction method of a feature vector is using a least-squares solution. Treat a suspicious image like a matrix and its solution to be coefficients as the feature vector. Get two solutions from two images of the original and its median filter residual (MFR). Subsequently, the two features were formed into a visualized pattern and then fed into CNN deep learning to classify the various transformed images. A new structure of the CNN net layer was also designed by hybrid with the inception module and the residual block to classify visualized feature vector patterns. The performance of the proposed image forensics detection (IFD) scheme was measured with the seven transformed types of image: average filtered (window size: 3 × 3), gaussian filtered (window size: 3 × 3), JPEG compressed (quality factor: 90, 70), median filtered (window size: 3 × 3, 5 × 5), and unaltered. The visualized patterns are fed into the image input layer of the designed CNN hybrid model. Throughout the experiment, the accuracy of median filtering detection was 98% over. Also, the area under the curve (AUC) by sensitivity (TP: true positive rate) and 1-specificity (FP: false positive rate) results of the proposed IFD scheme approached to `1' on the designed CNN hybrid model. Experimental results show high efficiency and performance to classify the various transformed images. Therefore, the grade evaluation of the proposed scheme is “Excellent (A)”.

2021-03-29

Singh, S., Nasoz, F.. 2020. Facial Expression Recognition with Convolutional Neural Networks. 2020 10th Annual Computing and Communication Workshop and Conference (CCWC). :0324—0328.

Emotions are a powerful tool in communication and one way that humans show their emotions is through their facial expressions. One of the challenging and powerful tasks in social communications is facial expression recognition, as in non-verbal communication, facial expressions are key. In the field of Artificial Intelligence, Facial Expression Recognition (FER) is an active research area, with several recent studies using Convolutional Neural Networks (CNNs). In this paper, we demonstrate the classification of FER based on static images, using CNNs, without requiring any pre-processing or feature extraction tasks. The paper also illustrates techniques to improve future accuracy in this area by using pre-processing, which includes face detection and illumination correction. Feature extraction is used to extract the most prominent parts of the face, including the jaw, mouth, eyes, nose, and eyebrows. Furthermore, we also discuss the literature review and present our CNN architecture, and the challenges of using max-pooling and dropout, which eventually aided in better performance. We obtained a test accuracy of 61.7% on FER2013 in a seven-classes classification task compared to 75.2% in state-of-the-art classification.

Pranav, E., Kamal, S., Chandran, C. Satheesh, Supriya, M. H.. 2020. Facial Emotion Recognition Using Deep Convolutional Neural Network. 2020 6th International Conference on Advanced Computing and Communication Systems (ICACCS). :317—320.

The rapid growth of artificial intelligence has contributed a lot to the technology world. As the traditional algorithms failed to meet the human needs in real time, Machine learning and deep learning algorithms have gained great success in different applications such as classification systems, recommendation systems, pattern recognition etc. Emotion plays a vital role in determining the thoughts, behaviour and feeling of a human. An emotion recognition system can be built by utilizing the benefits of deep learning and different applications such as feedback analysis, face unlocking etc. can be implemented with good accuracy. The main focus of this work is to create a Deep Convolutional Neural Network (DCNN) model that classifies 5 different human facial emotions. The model is trained, tested and validated using the manually collected image dataset.

Ozdemir, M. A., Elagoz, B., Soy, A. Alaybeyoglu, Akan, A.. 2020. Deep Learning Based Facial Emotion Recognition System. 2020 Medical Technologies Congress (TIPTEKNO). :1—4.

In this study, it was aimed to recognize the emotional state from facial images using the deep learning method. In the study, which was approved by the ethics committee, a custom data set was created using videos taken from 20 male and 20 female participants while simulating 7 different facial expressions (happy, sad, surprised, angry, disgusted, scared, and neutral). Firstly, obtained videos were divided into image frames, and then face images were segmented using the Haar library from image frames. The size of the custom data set obtained after the image preprocessing is more than 25 thousand images. The proposed convolutional neural network (CNN) architecture which is mimics of LeNet architecture has been trained with this custom dataset. According to the proposed CNN architecture experiment results, the training loss was found as 0.0115, the training accuracy was found as 99.62%, the validation loss was 0.0109, and the validation accuracy was 99.71%.

Jia, C., Li, C. L., Ying, Z.. 2020. Facial expression recognition based on the ensemble learning of CNNs. 2020 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC). :1—5.

As a part of body language, facial expression is a psychological state that reflects the current emotional state of the person. Recognition of facial expressions can help to understand others and enhance communication with others. We propose a facial expression recognition method based on convolutional neural network ensemble learning in this paper. Our model is composed of three sub-networks, and uses the SVM classifier to Integrate the output of the three networks to get the final result. The recognition accuracy of the model's expression on the FER2013 dataset reached 71.27%. The results show that the method has high test accuracy and short prediction time, and can realize real-time, high-performance facial recognition.

Xu, X., Ruan, Z., Yang, L.. 2020. Facial Expression Recognition Based on Graph Neural Network. 2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC). :211—214.

Facial expressions are one of the most powerful, natural and immediate means for human being to present their emotions and intensions. In this paper, we present a novel method for fully automatic facial expression recognition. The facial landmarks are detected for characterizing facial expressions. A graph convolutional neural network is proposed for feature extraction and facial expression recognition classification. The experiments were performed on the three facial expression databases. The result shows that the proposed FER method can achieve good recognition accuracy up to 95.85% using the proposed method.

Al-Janabi, S. I. Ali, Al-Janabi, S. T. Faraj, Al-Khateeb, B.. 2020. Image Classification using Convolution Neural Network Based Hash Encoding and Particle Swarm Optimization. 2020 International Conference on Data Analytics for Business and Industry: Way Towards a Sustainable Economy (ICDABI). :1–5.

Image Retrieval (IR) has become one of the main problems facing computer society recently. To increase computing similarities between images, hashing approaches have become the focus of many programmers. Indeed, in the past few years, Deep Learning (DL) has been considered as a backbone for image analysis using Convolutional Neural Networks (CNNs). This paper aims to design and implement a high-performance image classifier that can be used in several applications such as intelligent vehicles, face recognition, marketing, and many others. This work considers experimentation to find the sequential model's best configuration for classifying images. The best performance has been obtained from two layers' architecture; the first layer consists of 128 nodes, and the second layer is composed of 32 nodes, where the accuracy reached up to 0.9012. The proposed classifier has been achieved using CNN and the data extracted from the CIFAR-10 dataset by the inception model, which are called the Transfer Values (TRVs). Indeed, the Particle Swarm Optimization (PSO) algorithm is used to reduce the TRVs. In this respect, the work focus is to reduce the TRVs to obtain high-performance image classifier models. Indeed, the PSO algorithm has been enhanced by using the crossover technique from genetic algorithms. This led to a reduction of the complexity of models in terms of the number of parameters used and the execution time.

2021-03-22

Kumar, S. A., Kumar, A., Bajaj, V., Singh, G. K.. 2020. An Improved Fuzzy Min–Max Neural Network for Data Classification. IEEE Transactions on Fuzzy Systems. 28:1910–1924.

Hyperbox classifier is an efficient tool for modern pattern classification problems due to its transparency and rigorous use of Euclidian geometry. Fuzzy min-max (FMM) network efficiently implements the hyperbox classifier, and has been modified several times to yield better classification accuracy. However, the obtained accuracy is not up to the mark. Therefore, in this paper, a new improved FMM (IFMM) network is proposed to increase the accuracy rate. In the proposed IFMM network, a modified constraint is employed to check the expandability of a hyperbox. It also uses semiperimeter of the hyperbox along with k-nearest mechanism to select the expandable hyperbox. In the proposed IFMM, the contraction rules of conventional FMM and enhanced FMM (EFMM) are also modified using semiperimeter of a hyperbox in order to balance the size of both overlapped hyperboxes. Experimental results show that the proposed IFMM network outperforms the FMM, k-nearest FMM, and EFMM by yielding more accuracy rate with less number of hyperboxes. The proposed methods are also applied to histopathological images to know the best magnification factor for classification.

2021-03-09

Rahmati, A., Moosavi-Dezfooli, S.-M., Frossard, P., Dai, H.. 2020. GeoDA: A Geometric Framework for Black-Box Adversarial Attacks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :8443–8452.

Adversarial examples are known as carefully perturbed images fooling image classifiers. We propose a geometric framework to generate adversarial examples in one of the most challenging black-box settings where the adversary can only generate a small number of queries, each of them returning the top-1 label of the classifier. Our framework is based on the observation that the decision boundary of deep networks usually has a small mean curvature in the vicinity of data samples. We propose an effective iterative algorithm to generate query-efficient black-box perturbations with small p norms which is confirmed via experimental evaluations on state-of-the-art natural image classifiers. Moreover, for p=2, we theoretically show that our algorithm actually converges to the minimal perturbation when the curvature of the decision boundary is bounded. We also obtain the optimal distribution of the queries over the iterations of the algorithm. Finally, experimental results confirm that our principled black-box attack algorithm performs better than state-of-the-art algorithms as it generates smaller perturbations with a reduced number of queries.

Cui, W., Li, X., Huang, J., Wang, W., Wang, S., Chen, J.. 2020. Substitute Model Generation for Black-Box Adversarial Attack Based on Knowledge Distillation. 2020 IEEE International Conference on Image Processing (ICIP). :648–652.

Although deep convolutional neural network (CNN) performs well in many computer vision tasks, its classification mechanism is very vulnerable when it is exposed to the perturbation of adversarial attacks. In this paper, we proposed a new algorithm to generate the substitute model of black-box CNN models by using knowledge distillation. The proposed algorithm distills multiple CNN teacher models to a compact student model as the substitution of other black-box CNN models to be attacked. The black-box adversarial samples can be consequently generated on this substitute model by using various white-box attacking methods. According to our experiments on ResNet18 and DenseNet121, our algorithm boosts the attacking success rate (ASR) by 20% by training the substitute model based on knowledge distillation.

2021-03-04

Carlini, N., Farid, H.. 2020. Evading Deepfake-Image Detectors with White- and Black-Box Attacks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). :2804—2813.

It is now possible to synthesize highly realistic images of people who do not exist. Such content has, for example, been implicated in the creation of fraudulent socialmedia profiles responsible for dis-information campaigns. Significant efforts are, therefore, being deployed to detect synthetically-generated content. One popular forensic approach trains a neural network to distinguish real from synthetic content.We show that such forensic classifiers are vulnerable to a range of attacks that reduce the classifier to near- 0% accuracy. We develop five attack case studies on a state- of-the-art classifier that achieves an area under the ROC curve (AUC) of 0.95 on almost all existing image generators, when only trained on one generator. With full access to the classifier, we can flip the lowest bit of each pixel in an image to reduce the classifier's AUC to 0.0005; perturb 1% of the image area to reduce the classifier's AUC to 0.08; or add a single noise pattern in the synthesizer's latent space to reduce the classifier's AUC to 0.17. We also develop a black-box attack that, with no access to the target classifier, reduces the AUC to 0.22. These attacks reveal significant vulnerabilities of certain image-forensic classifiers.

Kalin, J., Ciolino, M., Noever, D., Dozier, G.. 2020. Black Box to White Box: Discover Model Characteristics Based on Strategic Probing. 2020 Third International Conference on Artificial Intelligence for Industries (AI4I). :60—63.

In Machine Learning, White Box Adversarial Attacks rely on knowing underlying knowledge about the model attributes. This works focuses on discovering to distrinct pieces of model information: the underlying architecture and primary training dataset. With the process in this paper, a structured set of input probes and the output of the model become the training data for a deep classifier. Two subdomains in Machine Learning are explored - image based classifiers and text transformers with GPT-2. With image classification, the focus is on exploring commonly deployed architectures and datasets available in popular public libraries. Using a single transformer architecture with multiple levels of parameters, text generation is explored by fine tuning off different datasets. Each dataset explored in image and text are distinguishable from one another. Diversity in text transformer outputs implies further research is needed to successfully classify architecture attribution in text domain.

2021-03-01

Perisetty, A., Bodempudi, S. T., Shaik, P. Rahaman, Kumar, B. L. N. Phaneendra. 2020. Classification of Hyperspectral Images using Edge Preserving Filter and Nonlinear Support Vector Machine (SVM). 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS). :1050–1054.

Hyperspectral image is acquired with a special sensor in which the information is collected continuously. This sensor will provide abundant data from the scene captured. The high voluminous data in this image give rise to the extraction of materials and other valuable items in it. This paper proposes a methodology to extract rich information from the hyperspectral images. As the information collected in a contiguous manner, there is a need to extract spectral bands that are uncorrelated. A factor analysis based dimensionality reduction technique is employed to extract the spectral bands and a weight least square filter is used to get the spatial information from the data. Due to the preservation of edge property in the spatial filter, much information is extracted during the feature extraction phase. Finally, a nonlinear SVM is applied to assign a class label to the pixels in the image. The research work is tested on the standard dataset Indian Pines. The performance of the proposed method on this dataset is assessed through various accuracy measures. These accuracies are 96%, 92.6%, and 95.4%. over the other methods. This methodology can be applied to forestry applications to extract the various metrics in the real world.

2021-02-08

Prathusha, P., Jyothi, S., Mamatha, D. M.. 2018. Enhanced Image Edge Detection Methods for Crab Species Identification. 2018 International Conference on Soft-computing and Network Security (ICSNS). :1—7.

Automatic Image Analysis, Image Classification, Automatic Object Recognition are some of the aspiring research areas in various fields of Engineering. Many Industrial and biological applications demand Image Analysis and Image Classification. Sample images available for classification may be complex, image data may be inadequate or component regions in the image may have poor visibility. With the available information each Digital Image Processing application has to analyze, classify and recognize the objects appropriately. Pre-processing, Image segmentation, feature extraction and classification are the most common steps to follow for Classification of Images. In this study we applied various existing edge detection methods like Robert, Sobel, Prewitt, Canny, Otsu and Laplacian of Guassian to crab images. From the conducted analysis of all edge detection operators, it is observed that Sobel, Prewitt, Robert operators are ideal for enhancement. The paper proposes Enhanced Sobel operator, Enhanced Prewitt operator and Enhanced Robert operator using morphological operations and masking. The novelty of the proposed approach is that it gives thick edges to the crab images and removes spurious edges with help of m-connectivity. Parameters which measure the accuracy of the results are employed to compare the existing edge detection operators with proposed edge detection operators. This approach shows better results than existing edge detection operators.

2021-01-18

Molek, V., Hurtik, P.. 2020. Training Neural Network Over Encrypted Data. 2020 IEEE Third International Conference on Data Stream Mining Processing (DSMP). :23–27.

We are answering the question whenever systems with convolutional neural network classifier trained over plain and encrypted data keep the ordering according to accuracy. Our motivation is need for designing convolutional neural network classifiers when data in their plain form are not accessible because of private company policy or sensitive data gathered by police. We propose to use a combination of fully connected autoencoder together with a convolutional neural network classifier. The autoencoder transforms the data info form that allows the convolutional classifier to be trained. We present three experiments that show the ordering of systems over plain and encrypted data. The results show that the systems indeed keep the ordering, and thus a NN designer can select appropriate architecture over encrypted data and later let data owner train or fine-tune the system/CNN classifier on the plain data.

2021-01-15

Yang, X., Li, Y., Lyu, S.. 2019. Exposing Deep Fakes Using Inconsistent Head Poses. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :8261—8265.

In this paper, we propose a new method to expose AI-generated fake face images or videos (commonly known as the Deep Fakes). Our method is based on the observations that Deep Fakes are created by splicing synthesized face region into the original image, and in doing so, introducing errors that can be revealed when 3D head poses are estimated from the face images. We perform experiments to demonstrate this phenomenon and further develop a classification method based on this cue. Using features based on this cue, an SVM classifier is evaluated using a set of real face images and Deep Fakes.

Amerini, I., Galteri, L., Caldelli, R., Bimbo, A. Del. 2019. Deepfake Video Detection through Optical Flow Based CNN. 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). :1205—1207.

Recent advances in visual media technology have led to new tools for processing and, above all, generating multimedia contents. In particular, modern AI-based technologies have provided easy-to-use tools to create extremely realistic manipulated videos. Such synthetic videos, named Deep Fakes, may constitute a serious threat to attack the reputation of public subjects or to address the general opinion on a certain event. According to this, being able to individuate this kind of fake information becomes fundamental. In this work, a new forensic technique able to discern between fake and original video sequences is given; unlike other state-of-the-art methods which resorts at single video frames, we propose the adoption of optical flow fields to exploit possible inter-frame dissimilarities. Such a clue is then used as feature to be learned by CNN classifiers. Preliminary results obtained on FaceForensics++ dataset highlight very promising performances.