Biblio
Emotions are a powerful tool in communication, and one way humans show their emotions is through facial expressions. Facial expression recognition is one of the most challenging and powerful tasks in social communication, because facial expressions are key to non-verbal communication. In the field of Artificial Intelligence, Facial Expression Recognition (FER) is an active research area, with several recent studies using Convolutional Neural Networks (CNNs). In this paper, we demonstrate FER classification on static images using CNNs, without requiring any pre-processing or feature extraction. The paper also describes techniques that could improve accuracy in future work: pre-processing, which includes face detection and illumination correction, and feature extraction, which isolates the most prominent parts of the face, including the jaw, mouth, eyes, nose, and eyebrows. Furthermore, we review the literature, present our CNN architecture, and discuss the challenges of using max-pooling and dropout, which eventually contributed to better performance. We obtained a test accuracy of 61.7% on FER2013 in a seven-class classification task, compared to 75.2% for the state of the art.
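The two CNN building blocks named in the abstract above, max-pooling and dropout, can be illustrated in a few lines of plain Python. This is a minimal sketch and not the authors' architecture: the 2x2 window, the inverted-dropout rescaling, and the function names `max_pool_2x2` and `dropout` are assumptions made for illustration.

```python
import random


def max_pool_2x2(feature_map):
    """Downsample a 2-D feature map by keeping the maximum of each 2x2 block.

    A trailing odd row or column is dropped (assumption: no padding).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`.

    Surviving units are divided by (1 - rate) so the expected activation
    is unchanged; at inference time the input is returned as-is.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


feature_map = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [9, 1, 0, 2],
    [3, 4, 5, 1],
]
print(max_pool_2x2(feature_map))  # [[4, 8], [9, 5]]
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=random.Random(0)))
```

Max-pooling shrinks each feature map and adds tolerance to small shifts, while dropout randomly silences units during training to limit overfitting, which matters on a small dataset such as FER2013.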
Automatic face tracking and detection has been one of the fastest-developing research areas because of its wide range of applications, security and surveillance in particular. It has attracted great interest, yet it remains to be fully explored because of several confounding factors: varying ethnic groups, sizes, orientations, poses, occlusions and lighting conditions. This paper proposes an improved algorithm that speeds up face tracking and detection by using a simple and efficient novel edge detector to reject non-face-like regions, thereby reducing the false detection rate when automatically tracking and detecting faces in still images containing multiple faces, as a front end to a facial expression system. Combining Haar face detection with the proposed edge detector achieves a correct detection rate of 95.9%, which is 6.1% higher than the primitive integration of the Haar and Canny edge detectors.
Exploiting the multi-layer nonlinear mapping and semantic feature extraction of deep learning, a deep network is proposed for video face detection to address the challenge of detecting faces rapidly and accurately in video with changing backgrounds. In particular, a pre-training procedure initializes the network parameters to avoid falling into a local optimum, and greedy layer-wise learning is introduced in pre-training to avoid propagating training error between layers. Key to the network is that the activation probability of each neuron models the state of human brain neurons, which follows a continuous distribution from most active to least active, and that the number of neurons per hidden layer decreases layer by layer to reduce redundant information in the input data. Moreover, skin color detection is used to speed up detection by generating candidate regions. Experimental results show that, besides faster detection and robustness to face rotation, the proposed method has a lower false detection rate and a lower miss rate than traditional algorithms.
A machine translation system that can convert South African Sign Language video to English audio or text and vice versa in real time would be immensely beneficial to the Deaf and hard of hearing. Sign language gestures are characterised and expressed by five distinct parameters: hand location, hand orientation, hand shape, hand movement and facial expressions. The aim of this research is to recognise facial expressions and to compare three feature descriptors, local binary patterns (LBP), compound local binary patterns (CLBP) and histogram of oriented gradients (HOG), in two testing environments: a subset of the BU3D-FE dataset and the CK+ dataset. The overall accuracy, accuracy across facial expression classes, robustness to test subjects, and ability to generalise of each feature descriptor are analysed within the context of automatic facial expression recognition. Overall, HOG proved to be a more robust feature descriptor than the LBP and CLBP. Furthermore, the CLBP can generally be considered superior to the LBP, but the LBP has greater potential in terms of its ability to generalise.
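For readers unfamiliar with the simplest of the descriptors compared above, the following is a minimal sketch of the basic 3x3 local binary pattern. The bit ordering (clockwise from the top-left neighbour), the `>=` thresholding convention, and the function names are assumptions made for illustration; the CLBP and HOG descriptors are not shown, and this is not the study's implementation.

```python
def lbp_code(image, r, c):
    """Return the 8-bit LBP code of the pixel at row r, column c.

    Each of the 8 neighbours contributes a 1 bit when its intensity is
    greater than or equal to the centre pixel, otherwise a 0 bit.
    """
    center = image[r][c]
    # Neighbour offsets, clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Build the 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


patch = [
    [1, 2, 3],
    [8, 5, 4],
    [7, 6, 9],
]
print(lbp_code(patch, 1, 1))  # 240
```

In a recognition pipeline the face image is typically split into a grid of cells, and the per-cell histograms are concatenated into the feature vector passed to a classifier.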
Human face detection plays an essential role in the first stage of face processing applications. In this study, an enhanced face detection framework is proposed to improve detection rate based on skin color and provide a validation process. A preliminary segmentation of the input images based on skin color can significantly reduce search space and accelerate the process of human face detection. The primary detection is based on Haar-like features and the Adaboost algorithm. A validation process is introduced to reject non-face objects, which might occur during the face detection process. The validation process is based on two-stage Extended Local Binary Patterns. The experimental results on the CMU-MIT and Caltech 10000 datasets over a wide range of facial variations in different colors, positions, scales, and lighting conditions indicated a successful face detection rate.
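The idea of using skin color to shrink the search space before running a detector can be sketched as follows. The abstract does not state which skin model the authors use, so the explicit RGB thresholds below (a widely used daylight rule) and the helper names `is_skin`, `skin_mask` and `bounding_box` are assumptions chosen only to make the idea concrete.

```python
def is_skin(r, g, b):
    """Classify one RGB pixel with a simple explicit-threshold rule.

    Assumption: the thresholds follow a common uniform-daylight RGB rule,
    not the model used in the paper.
    """
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_mask(image):
    """Map an image (rows of (r, g, b) tuples) to a boolean skin mask."""
    return [[is_skin(*pixel) for pixel in row] for row in image]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the skin pixels, or None.

    A detector would then scan only this region instead of the whole image.
    """
    hits = [(r, c) for r, row in enumerate(mask)
            for c, flag in enumerate(row) if flag]
    if not hits:
        return None
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


SKIN, DARK = (220, 170, 140), (30, 30, 30)
image = [
    [DARK, DARK, DARK],
    [DARK, SKIN, SKIN],
    [DARK, DARK, DARK],
]
print(bounding_box(skin_mask(image)))  # (1, 1, 1, 2)
```

Because the rule is cheap and permissive, it will also accept skin-colored background objects, which is why a later validation stage that rejects non-face candidates is useful.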
Detecting faces and heads appearing in video feeds is a challenging task in real-world video surveillance applications due to variations in appearance, occlusions and complex backgrounds. Recently, several CNN architectures have been proposed to increase the accuracy of detectors, although their computational complexity can be an issue, especially for real-time applications, where faces and heads must be detected live using high-resolution cameras. This paper compares the accuracy and complexity of state-of-the-art CNN architectures that are suitable for face and head detection. Single-pass and region-based architectures are reviewed and compared empirically to baseline techniques according to accuracy and to time and memory complexity on images from several challenging datasets. The viability of these architectures is analyzed with real-time video surveillance applications in mind. Results suggest that, although CNN architectures can achieve a very high level of accuracy compared to traditional detectors, their computational cost can represent a limitation for many practical real-time applications.
Near-sensor data analytics is a promising direction for internet-of-things endpoints, as it minimizes energy spent on communication and reduces network load - but it also poses security concerns, as valuable data are stored or sent over the network at various stages of the analytics pipeline. Using encryption to protect sensitive data at the boundary of the on-chip analytics engine is a way to address data security issues. To cope with the combined workload of analytics and encryption in a tight power envelope, we propose Fulmine, a system-on-chip (SoC) based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, supporting software programmability for regular computing tasks. The Fulmine SoC, fabricated in 65-nm technology, consumes less than 20mW on average at 0.8V achieving an efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to 25MIPS/mW in software. As a strong argument for real-life flexible application of our platform, we show experimental results for three secure analytics use cases: secure autonomous aerial surveillance with a state-of-the-art deep convolutional neural network (CNN) consuming 3.16pJ per equivalent reduced instruction set computer operation, local CNN-based face detection with secured remote recognition in 5.74pJ/op, and seizure detection with encrypted data collection from electroencephalogram within 12.7pJ/op.
To provide a reliable security solution, this paper proposes a smart ATM security system based on an Embedded Linux platform. The study focuses on the design and implementation of a face-detection-based ATM security system. The system is implemented on the credit-card-sized Raspberry Pi board, extended with open-source computer vision (OpenCV) software, which is used for image processing. Security is provided by a sequence of actions. The system first captures an image of the user and checks whether a face is detected properly. If the face is not detected properly, the system warns the user to adjust their position. If the face is still not detected properly, the system locks the door of the ATM cabin for security. As soon as the door is locked, the system automatically generates a 3-digit OTP code. The OTP is sent by SMS to the watchman's registered mobile number using a GSM module connected to the Raspberry Pi. The watchman enters the OTP on a keypad interfaced with the Pi board. The OTP is then verified: if it is correct the door is unlocked, otherwise it remains locked.
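The lock-and-OTP sequence described above amounts to a small state machine, sketched here under stated assumptions. The class `AtmCabin`, the `send_sms` callback standing in for the GSM module, and the limit of two failed detections (one warning, then lock) are hypothetical names and values, not taken from the paper; face detection itself is abstracted to a boolean.

```python
import secrets

MAX_FAILED_DETECTIONS = 2  # assumption: warn once, lock on the second failure


def generate_otp():
    """Return a zero-padded 3-digit one-time password as a string."""
    return f"{secrets.randbelow(1000):03d}"


class AtmCabin:
    """Tracks the door state for the warn -> lock -> OTP unlock sequence."""

    def __init__(self, send_sms):
        self.send_sms = send_sms  # hypothetical stand-in for the GSM module
        self.locked = False
        self.failed_detections = 0
        self._otp = None

    def report_detection(self, face_detected):
        """Process one capture; return 'ok', 'warn' or 'locked'."""
        if self.locked:
            return "locked"
        if face_detected:
            self.failed_detections = 0
            return "ok"
        self.failed_detections += 1
        if self.failed_detections < MAX_FAILED_DETECTIONS:
            return "warn"
        # Face still not detected: lock the door and notify the watchman.
        self.locked = True
        self._otp = generate_otp()
        self.send_sms(self._otp)
        return "locked"

    def enter_otp(self, candidate):
        """Unlock on a matching OTP; return True if the door is now unlocked."""
        if (self.locked and self._otp is not None
                and secrets.compare_digest(candidate, self._otp)):
            self.locked = False
            self.failed_detections = 0
            self._otp = None
        return not self.locked


sent = []                      # collects "SMS" messages for the demo
cabin = AtmCabin(sent.append)
print(cabin.report_detection(False))  # warn
print(cabin.report_detection(False))  # locked
print(cabin.enter_otp(sent[0]))       # True
```

A 3-digit code has only 1000 possible values, so a real deployment would also need to limit the number of OTP attempts; the sketch omits this because the abstract does not describe it.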
The face is crucial to human identity, and face identification has become crucial to information security. It is important to understand and address the problems and challenges in all aspects of facial feature extraction and face identification. In this tutorial, we identify and discuss four research challenges in current Face Detection/Recognition research and related research areas: (1) Unavoidable Facial Feature Alterations, (2) Voluntary Facial Feature Alterations, (3) Uncontrolled Environments, and (4) Accuracy Control on Large-scale Dataset. We also point to several applications (spin-offs) of facial feature studies.
Machine learning is enabling a myriad of innovations, including new algorithms for cancer diagnosis and self-driving cars. The broad use of machine learning makes it important to understand the extent to which machine-learning algorithms are subject to attack, particularly when used in applications where physical security or safety is at risk. In this paper, we focus on facial biometric systems, which are widely used in surveillance and access control. We define and investigate a novel class of attacks: attacks that are physically realizable and inconspicuous, and allow an attacker to evade recognition or impersonate another individual. We develop a systematic method to automatically generate such attacks, which are realized through printing a pair of eyeglass frames. When worn by the attacker whose image is supplied to a state-of-the-art face-recognition algorithm, the eyeglasses allow her to evade being recognized or to impersonate another individual. Our investigation focuses on white-box face-recognition systems, but we also demonstrate how similar techniques can be used in black-box scenarios, as well as to avoid face detection.
Keeping a driver focused on the road is one of the most critical steps in ensuring the safe operation of a vehicle. The Strategic Highway Research Program 2 (SHRP2) has over 3,100 recorded videos of volunteer drivers collected over a period of 2 years. This extensive naturalistic driving study (NDS) contains over one million hours of video and associated data that could aid safety researchers in understanding where the driver's attention is focused. Manual analysis of this data is infeasible; therefore, efforts are underway to develop automated feature extraction algorithms to process and characterize the data. The real-world nature, volume, and acquisition conditions are unmatched in the transportation community, but there are also challenges because the data has relatively low resolution, high compression rates, and differing illumination conditions. A smaller dataset, the head pose validation study, is available; it used the same recording equipment as SHRP2 but is more easily accessible, with fewer privacy constraints. In this work we report initial head pose accuracy using commercial and open-source face pose estimation algorithms on the head pose validation data set.