Biblio
The current AI revolution provides us with many new, but often very complex, algorithmic systems. This complexity limits not only the understanding but also the acceptance of, e.g., deep learning methods. In recent years, explainable AI (XAI) has been proposed as a remedy; however, this research is rarely grounded in the social-science literature on explanations. We suggest a bottom-up approach to explanations for (game) AI, starting from a baseline definition of understandability informed by the concept of limited human working memory. We detail our approach and demonstrate its application to two games from the GVGAI framework. Finally, we discuss our vision of how additional concepts from the social sciences can be integrated into the proposed approach and how the results can be generalised.
Explosive naval mines pose a threat to seagoing vessels, both military and civilian. This work applies deep neural network (DNN) methods to the problem of detecting mine-like objects (MLOs) on the seafloor in side-scan sonar imagery. We explored how DNN depth, memory requirements, computational requirements, and training data distribution affect detection efficacy. A visualization technique (class activation mapping) was incorporated to aid a user in interpreting the model's behavior. We found that modest DNN model sizes yielded better accuracy (98%) than very simple DNN models (93%) and a support vector machine (78%). The largest DNN models achieved a <1% efficacy increase at the cost of a 17x increase in trainable parameter count and computational requirements. In contrast to the DNNs popularized for many-class image recognition tasks, the models for this task require far fewer computational resources (0.3% of the parameters) and are suitable for embedded use within an autonomous unmanned underwater vehicle.
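As an aside on the visualization technique mentioned above: a class activation map can be computed by weighting the final convolutional layer's feature maps with the classifier weights of the target class. The sketch below assumes a Keras-style CNN with a global-average-pooling head; the model and layer names are illustrative, not from the paper.

```python
# Minimal class-activation-map (CAM) sketch for a CNN with a
# global-average-pooling head. Model and layer names are illustrative.
import numpy as np
import tensorflow as tf

def class_activation_map(model, image, conv_layer_name, class_idx):
    """Weight the final conv feature maps by the dense-layer weights
    of the target class (the classic CAM formulation)."""
    conv_out = model.get_layer(conv_layer_name).output
    feature_model = tf.keras.Model(model.input, conv_out)
    feats = feature_model(image[np.newaxis]).numpy()[0]   # (H, W, C)
    class_weights = model.layers[-1].get_weights()[0]     # (C, n_classes)
    cam = feats @ class_weights[:, class_idx]             # (H, W)
    cam = np.maximum(cam, 0)                              # keep positive evidence
    return cam / (cam.max() + 1e-8)                       # normalize to [0, 1]
```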
This paper describes the application of deep learning-based image recognition in the DARPA Memex program and its repository of 1.4 million weapons-related images collected from the deep web. We develop a fast, efficient, and easily deployable framework that integrates Google's TensorFlow with Apache Tika for automatically performing image forensics on the Memex data. Our framework and its integration are evaluated qualitatively and quantitatively, and our work suggests that automated, large-scale, and reliable image classification and forensics can be widely deployed in bulk analysis for answering domain-specific questions.
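For flavor, here is a hedged sketch of the classification step in such a bulk image-analysis pipeline, using a stock pretrained TensorFlow model. Only the TensorFlow side is shown; the actual Memex integration routes images through Apache Tika, which is omitted, and the image directory is an assumption.

```python
# Hedged sketch: bulk image classification with a pretrained CNN.
# Illustrates only the TensorFlow side; the Tika integration is omitted.
from pathlib import Path
import numpy as np
import tensorflow as tf

model = tf.keras.applications.InceptionV3(weights="imagenet")

def classify(path, top=3):
    img = tf.keras.utils.load_img(path, target_size=(299, 299))
    x = tf.keras.applications.inception_v3.preprocess_input(
        tf.keras.utils.img_to_array(img)[np.newaxis])
    preds = model.predict(x, verbose=0)
    return tf.keras.applications.inception_v3.decode_predictions(preds, top=top)[0]

for p in Path("images").glob("*.jpg"):   # hypothetical image directory
    print(p.name, [(label, f"{score:.2f}") for _, label, score in classify(p)])
```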
In earlier work, we attempted to design an unbreakable CAPTCHA and found that a time limit is effective in preventing computers from recognizing characters accurately, since, given unlimited time, computers can eventually break any text-based CAPTCHA. Distortion is one of the usual ways to hinder character recognition by computers, and adding noise is also effective; however, both also make the characters harder for humans to read. As a solution to these problems, our team proposed an effective text-based CAPTCHA algorithm based on amodal completion. Our CAPTCHA imposes a large computational cost on machines, while amodal completion lets humans recognize the characters almost instantly. The scheme has since evolved to use aftereffects and combinations of complementary colors. We evaluated our CAPTCHA against deep learning, which currently attracts the most attention because it is faster and more accurate than other machine recognition methods. In this paper, we add jagged lines to the edges of characters, since edges are among the most important cues for deep-learning-based recognition. We also evaluate how much the jagged lines impair recognition by humans and how much they hinder recognition by computers, and we confirm the effect of our method on deep learning.
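As a rough illustration of the edge-jagging idea (not the authors' implementation), the following Pillow sketch renders a character, locates its edge pixels, and randomly displaces them; the font, jitter range, and canvas size are arbitrary assumptions.

```python
# Illustrative sketch (not the authors' implementation): render a
# character, find its edge pixels, and jitter them to produce jagged
# outlines that disturb edge-sensitive CNN features.
import random
from PIL import Image, ImageDraw, ImageFilter, ImageFont

def jagged_glyph(char, size=96, jitter=2, seed=0):
    rng = random.Random(seed)
    img = Image.new("L", (size, size), 255)
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()          # stand-in; any TTF font works
    draw.text((size // 4, size // 4), char, fill=0, font=font)
    edges = img.filter(ImageFilter.FIND_EDGES)
    out = img.copy()
    px = out.load()
    for x in range(size):
        for y in range(size):
            if edges.getpixel((x, y)) > 0:   # edge pixel of the glyph
                dx = rng.randint(-jitter, jitter)
                dy = rng.randint(-jitter, jitter)
                nx = min(max(x + dx, 0), size - 1)
                ny = min(max(y + dy, 0), size - 1)
                px[nx, ny] = 0               # displaced dark pixel
    return out

jagged_glyph("A").save("jagged_A.png")
```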
Arabic handwritten documents present specific challenges due to the cursive nature of the writing and the presence of diacritical marks. Moreover, one of the largest labeled databases of Arabic handwritten documents, the OpenHart-NIST database, includes a specific kind of noise, namely guidelines, that has to be addressed. We propose several approaches to process these documents. First, a guideline detection approach based on K-means has been developed to detect the documents that include guidelines. We then propose a series of preprocessing steps at the text-line level to reduce noise effects. For text lines that include guidelines, a guideline removal preprocessing is described and existing stroke restoration approaches are assessed. In addition, we propose a preprocessing step that combines noise removal and deskewing by removing line fragments from neighboring text lines while searching for the principal orientation of the text line. We provide recognition results showing the significant improvement brought by the proposed processing steps.
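A guideline-bearing page tends to produce many evenly spaced peaks in its horizontal ink projection. The following sketch clusters documents into lined/unlined groups with K-means over two such projection features; the features themselves are illustrative assumptions, not the paper's.

```python
# Hedged sketch of K-means-based guideline detection (illustrative
# features, not the paper's exact ones).
import numpy as np
from sklearn.cluster import KMeans

def projection_features(binary_img):
    """binary_img: 2-D array with ink pixels = 1."""
    profile = binary_img.sum(axis=1).astype(float)
    profile /= profile.max() + 1e-8
    # Local maxima of the horizontal projection above a fixed level.
    peaks = np.where((profile[1:-1] > profile[:-2]) &
                     (profile[1:-1] > profile[2:]) &
                     (profile[1:-1] > 0.3))[0]
    gaps = np.diff(peaks) if len(peaks) > 1 else np.array([0.0])
    # Feature vector: number of peaks, regularity of peak spacing.
    return [len(peaks), gaps.std() / (gaps.mean() + 1e-8)]

def split_lined_unlined(binary_images):
    X = np.array([projection_features(im) for im in binary_images])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    return labels   # one cluster ~ documents with guidelines
```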
Keystroke dynamics is a form of behavioral biometrics that can be used for the continuous authentication of computer users. Many classifiers have been proposed for analyzing acquired user patterns and verifying users at computer terminals. The underlying machine learning methods that use a Gaussian density estimator for outlier detection typically assume that the digraph patterns in keystroke data are generated from a single Gaussian distribution. In this paper, we relax this assumption by allowing digraphs to fit more than one distribution via the Gaussian Mixture Model (GMM). We conducted an experiment with a public data set collected in a controlled environment. For 30 users typing dynamic text, we obtain a 0.08% equal error rate (EER) with a 2-component GMM, while a pure Gaussian model yields a 1.3% EER on the same data set (a 93.8% improvement in EER). Our results show that GMM can model keystroke dynamics more precisely and authenticate users with a higher confidence level.
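A minimal sketch of the per-user GMM idea follows, assuming digraph timing vectors and a simple log-likelihood threshold; the threshold rule is an assumption, and the paper's verification protocol may differ.

```python
# Illustrative sketch: model a user's digraph timings with a
# 2-component Gaussian mixture; samples whose log-likelihood falls
# below a threshold are treated as impostor activity.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_user_model(train_timings, n_components=2, seed=0):
    """train_timings: (n_samples, n_features) digraph latency vectors."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="full",
                          random_state=seed).fit(train_timings)
    # Assumed rule: threshold at the 5th percentile of genuine scores.
    threshold = np.percentile(gmm.score_samples(train_timings), 5)
    return gmm, threshold

def is_genuine(gmm, threshold, timings):
    return gmm.score_samples(timings).mean() >= threshold
```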
With the increasing popularity of wearable devices, information has become much more easily available. However, sharing personal information still poses great challenges because of privacy issues. We propose the idea of a Visual Human Signature (VHS), which can represent each person uniquely even when captured in different views and poses by wearable cameras. We evaluate the performance of multiple effective modalities for recognizing an identity, including facial appearance, visual patches, facial attributes, and clothing attributes. We propose emphasizing significant dimensions and applying weighted voting fusion to combine the modalities and improve VHS recognition. By jointly considering multiple modalities, the VHS recognition rate reaches 51% on frontal images and 48% in a more challenging environment, and our approach surpasses the average-fusion baseline by 25% and 16%, respectively. We also introduce the Multiview Celebrity Identity Dataset (MCID), a new dataset containing hundreds of identities with different views and clothing for comprehensive evaluation.
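The weighted voting fusion step could look like the following sketch, where per-modality similarity scores over a gallery are combined with reliability weights; the weights and score values are illustrative placeholders.

```python
# Hedged sketch of weighted voting fusion across recognition modalities
# (facial appearance, visual patches, attributes, ...). All numbers
# below are illustrative assumptions.
import numpy as np

def weighted_voting_fusion(modality_scores, weights):
    """modality_scores: list of (n_gallery,) similarity arrays, one per
    modality; weights: per-modality reliability weights."""
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()
    fused = sum(w * s for w, s in zip(weights, modality_scores))
    return int(np.argmax(fused))   # index of the best-matching identity

# Example: three modalities voting over a 4-identity gallery.
face  = np.array([0.9, 0.2, 0.1, 0.3])
patch = np.array([0.6, 0.5, 0.2, 0.1])
cloth = np.array([0.3, 0.7, 0.2, 0.2])
print(weighted_voting_fusion([face, patch, cloth], weights=[0.5, 0.3, 0.2]))
```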
A process of de-identification used for privacy protection in multimedia content should be applied not only to primary biometric traits (face, voice) but to soft biometric traits as well. This paper proposes an automatic hair color de-identification method that works with video recordings. The method involves hair area segmentation, basic hair color recognition, and modification of the hair color to produce natural-looking de-identified images.
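As a rough sketch of the color-modification step (segmentation and color recognition are not shown), the hue of a masked hair region can be shifted in HSV space; the mask and shift amount are assumptions.

```python
# Illustrative sketch: shift the hue of a segmented hair region to
# de-identify hair color while leaving the rest of the frame intact.
import cv2
import numpy as np

def shift_hair_hue(bgr_frame, hair_mask, hue_shift=30):
    """hair_mask: uint8 array, 255 where hair pixels are."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV).astype(np.int16)
    hsv[..., 0] = (hsv[..., 0] + hue_shift) % 180   # OpenCV hue range: 0..179
    recolored = cv2.cvtColor(hsv.astype(np.uint8), cv2.COLOR_HSV2BGR)
    mask3 = cv2.merge([hair_mask] * 3) > 0
    return np.where(mask3, recolored, bgr_frame)
```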
Automated human facial image de-identification is a much-needed technology for privacy-preserving social media and intelligent surveillance applications. Rather than using the usual face-blurring techniques, in this work we propose to achieve facial anonymity by slightly modifying existing facial images into "averaged faces" so that the corresponding identities are difficult to uncover. This approach preserves the aesthetics of the facial images while achieving the goal of privacy protection. In particular, we explore deep learning-based facial identity-preserving (FIP) features. Unlike conventional face descriptors, FIP features can significantly reduce intra-identity variance while maintaining inter-identity distinctions. By suppressing and perturbing the FIP features, we achieve k-anonymity facial image de-identification while preserving the desired utilities. Using a face database, we successfully demonstrate that the resulting "averaged faces" preserve the aesthetics of the original images while defying facial identity recognition.
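A minimal sketch of the averaging idea behind k-anonymity de-identification: replace a face with the mean of its k nearest gallery faces, so any of those k identities could plausibly have produced the published image. The paper operates on learned FIP features; the raw feature/image interface below is an illustrative simplification.

```python
# Hedged sketch of k-anonymity by averaging (illustrative; the paper
# works with learned FIP features rather than this simple interface).
import numpy as np

def k_anonymous_average(target_feat, gallery_feats, gallery_images, k=5):
    """target_feat: (d,) feature of the face to anonymize;
    gallery_feats: (n, d); gallery_images: (n, H, W, C) aligned faces."""
    dists = np.linalg.norm(gallery_feats - target_feat, axis=1)
    nearest = np.argsort(dists)[:k]          # k most similar identities
    return gallery_images[nearest].mean(axis=0)   # the "averaged face"
```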
An enormous number of images are currently shared through social networking services such as Facebook. These images usually contain people and may violate their privacy if they are published without permission from each person. To address this privacy concern, visual privacy protection, such as blurring, is applied to the facial regions of people who have not given permission. However, in addition to degrading image quality, this can spoil the context of the image: if some faces are filtered while others are not, the missing facial expressions make the image harder to comprehend. This paper proposes an image melding-based method that modifies facial regions in a visually unintrusive way while preserving facial expression. Our experimental results demonstrate that the proposed method can retain facial expression while protecting privacy.
Document image binarization is performed to segment foreground text from the background in badly degraded documents. In this paper, a comprehensive survey is conducted of several state-of-the-art document image binarization techniques. After describing these techniques, their performance is compared with the help of various evaluation metrics widely used in document image analysis and recognition. On the basis of this comparison, the adaptive contrast method is found to be the best performing method. Accordingly, the partial results we obtained for the adaptive contrast method are reported, and its mathematical model and block diagram are described in detail.
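For reference, the local image contrast at the heart of adaptive-contrast binarization is commonly formulated as (max − min) / (max + min) over a small window; the sketch below shows that construction plus a naive threshold. The full method also combines edge detection and local thresholding, which are omitted here.

```python
# Minimal sketch of the local image contrast used by adaptive-contrast
# binarization (assuming the (max - min) / (max + min) formulation).
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def local_contrast(gray, window=3, eps=1e-8):
    """gray: 2-D float array in [0, 1]; returns contrast in [0, 1]."""
    local_max = maximum_filter(gray, size=window)
    local_min = minimum_filter(gray, size=window)
    return (local_max - local_min) / (local_max + local_min + eps)

def binarize(gray, window=3, contrast_thresh=0.5):
    # High-contrast pixels are likely text strokes in degraded documents.
    return (local_contrast(gray, window) > contrast_thresh).astype(np.uint8)
```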
To improve the overall performance of range image denoising, an impulsive noise (IN) denoising method with variable windows is proposed in this paper. Based on several discriminant criteria, principles for detecting both dropout IN and outlier IN are provided. A nearest non-IN neighbor search and an index-distance-weighted mean filter are then combined for IN denoising. As key factors in the adaptability of the proposed method, the sizes of the two windows used for outlier IN detection and IN denoising are investigated. Starting from a theoretical model of invader occlusion, a variable window is presented that adapts the window size to the local environment of each point, together with practical criteria for determining the adaptive window size. Experiments on real range images of multi-line surfaces are conducted, with evaluations of computational complexity and denoising quality and comparisons against several other popular methods. The results indicate that the proposed method can detect impulsive noise with high accuracy and, with the help of the variable window, remove it with strong adaptability.
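The denoising step can be pictured as follows: each flagged IN pixel is replaced by an inverse-distance-weighted mean of the valid pixels in a local window. This sketch uses a fixed window and is only illustrative; the paper's variable window instead grows with the local environment of each point.

```python
# Hedged sketch (not the paper's exact filter): replace each flagged
# impulse-noise pixel with an inverse-distance-weighted mean of the
# valid (non-noise) pixels inside a local window.
import numpy as np

def idw_mean_denoise(depth, noise_mask, window=5):
    """depth: 2-D range image; noise_mask: True where a pixel is noise."""
    half = window // 2
    out = depth.copy()
    rows, cols = depth.shape
    for r, c in zip(*np.where(noise_mask)):
        r0, r1 = max(r - half, 0), min(r + half + 1, rows)
        c0, c1 = max(c - half, 0), min(c + half + 1, cols)
        ys, xs = np.mgrid[r0:r1, c0:c1]
        valid = ~noise_mask[r0:r1, c0:c1]
        if not valid.any():
            continue   # no clean neighbor here; a variable window
                       # would grow until one is found
        d = np.hypot(ys[valid] - r, xs[valid] - c)
        w = 1.0 / (d + 1e-8)                      # inverse-distance weights
        out[r, c] = np.sum(w * depth[r0:r1, c0:c1][valid]) / w.sum()
    return out
```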
This paper proposes an enhanced method for personal authentication based on the finger-knuckle print (FKP) using Kekre's wavelet transform (KWT). The FKP consists of the inherent skin patterns of the outer surface around the phalangeal joint of a finger. It is highly discriminative and unique, which makes it a promising emerging biometric identifier. Kekre's wavelet transform is constructed from Kekre's transform. The proposed system is evaluated on a prepared FKP database covering all categories of FKP, with a total of 500 samples. This paper examines different image enhancement techniques for preprocessing the captured images. The proposed algorithm is evaluated on 350 training and 150 test samples and shows that the quality of the database and the preprocessing techniques play an important role in recognizing individuals. The experiments report performance measures such as the false acceptance rate (FAR), false rejection rate (FRR), true acceptance rate (TAR), and true rejection rate (TRR). The results demonstrate an improvement in the equal error rate (EER), which is very important for authentication. The experimental results using Kekre's algorithm together with image enhancement show that the finger-knuckle recognition rate is better than that of the conventional method.
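Since the abstract reports FAR, FRR, and EER, here is a small, method-agnostic sketch of how the EER can be computed from genuine and impostor match scores; the simulated score distributions are placeholders.

```python
# Hedged sketch: FAR/FRR curves and equal error rate (EER) from
# genuine and impostor match scores (independent of the KWT features).
import numpy as np

def eer(genuine_scores, impostor_scores):
    thresholds = np.sort(np.concatenate([genuine_scores, impostor_scores]))
    far = np.array([(impostor_scores >= t).mean() for t in thresholds])
    frr = np.array([(genuine_scores < t).mean() for t in thresholds])
    i = np.argmin(np.abs(far - frr))   # point where FAR and FRR cross
    return (far[i] + frr[i]) / 2, thresholds[i]

rng = np.random.default_rng(0)
gen = rng.normal(0.8, 0.1, 300)   # simulated genuine scores
imp = rng.normal(0.4, 0.1, 300)   # simulated impostor scores
rate, thr = eer(gen, imp)
print(f"EER ~ {rate:.3f} at threshold {thr:.3f}")
```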
A large number of digital images and videos are acquired, stored, processed, and shared nowadays. High-quality imaging hardware and low-cost, user-friendly image editing software make digital media vulnerable to modification. One of the most popular image modification techniques is copy-move forgery, which copies part of an image and pastes it elsewhere in the same image to conceal or replicate some content. Researchers have recently proposed many techniques to detect copy-move forged regions. These methods divide the image into overlapping blocks and extract features to determine similarity among groups of blocks, so the choice of feature extraction algorithm plays an important role in detection accuracy. In this work, the column averages of the 1-D Fourier transform (1D-FT) of block rows are used to extract features from overlapping blocks: each block is transformed into the frequency domain with a 1D-FT of its rows, and the average values of the transformed columns form the feature vector. Similar feature vectors indicate possibly forged regions. Results show that the proposed method detects copy-pasted regions with higher accuracy than similar works reported in the literature, and it is also more resistant to Gaussian blurring and JPEG compression attacks.
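The described block feature is straightforward to sketch: take the 1-D FFT of each row of a block and average the transformed columns. The block size and the subsequent sorting/matching step are assumptions in the sketch below.

```python
# Hedged sketch of the described block feature: 1-D FFT of each block
# row, then column averages of the magnitudes form the feature vector.
import numpy as np

def block_feature(block):
    """block: (B, B) grayscale block -> (B,) feature vector."""
    row_fft = np.abs(np.fft.fft(block, axis=1))   # 1D-FT of each row
    return row_fft.mean(axis=0)                   # column averages

def extract_features(image, block=8):
    """Slide a block x block window over the image; lexicographically
    sorting the features afterwards groups similar (copied) blocks."""
    h, w = image.shape
    feats = []
    for r in range(h - block + 1):
        for c in range(w - block + 1):
            feats.append(((r, c), block_feature(image[r:r+block, c:c+block])))
    return feats
```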