Biblio

Filters: Keyword is image recognition
2018-12-10
Volz, V., Majchrzak, K., Preuss, M..  2018.  A Social Science-based Approach to Explanations for (Game) AI. 2018 IEEE Conference on Computational Intelligence and Games (CIG). :1–2.

The current AI revolution provides us with many new, but often very complex algorithmic systems. This complexity does not only limit understanding, but also acceptance of e.g. deep learning methods. In recent years, explainable AI (XAI) has been proposed as a remedy. However, this research is rarely supported by publications on explanations from social sciences. We suggest a bottom-up approach to explanations for (game) AI, by starting from a baseline definition of understandability informed by the concept of limited human working memory. We detail our approach and demonstrate its application to two games from the GVGAI framework. Finally, we discuss our vision of how additional concepts from social sciences can be integrated into our proposed approach and how the results can be generalised.

2018-06-20
Wang, Qinglong, Guo, Wenbo, Zhang, Kaixuan, Ororbia, II, Alexander G., Xing, Xinyu, Liu, Xue, Giles, C. Lee.  2017.  Adversary Resistant Deep Neural Networks with an Application to Malware Detection. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. :1145–1153.

Outside the highly publicized victories in the game of Go, there have been numerous successful applications of deep learning in the fields of information retrieval, computer vision, and speech recognition. In cybersecurity, an increasing number of companies have begun exploring the use of deep learning (DL) in a variety of security tasks with malware detection among the more popular. These companies claim that deep neural networks (DNNs) could help turn the tide in the war against malware infection. However, DNNs are vulnerable to adversarial samples, a shortcoming that plagues most, if not all, statistical and machine learning models. Recent research has demonstrated that those with malicious intent can easily circumvent deep learning-powered malware detection by exploiting this weakness. To address this problem, previous work developed defense mechanisms that are based on augmenting training data or enhancing model complexity. However, after analyzing DNN susceptibility to adversarial samples, we discover that the current defense mechanisms are limited and, more importantly, cannot provide theoretical guarantees of robustness against adversarial sample-based attacks. As such, we propose a new adversary resistant technique that obstructs attackers from constructing impactful adversarial samples by randomly nullifying features within data vectors. Our proposed technique is evaluated on a real-world dataset with 14,679 malware variants and 17,399 benign programs. We theoretically validate the robustness of our technique, and empirically show that our technique significantly boosts DNN robustness to adversarial samples while maintaining high accuracy in classification. To demonstrate the general applicability of our proposed method, we also conduct experiments using the MNIST and CIFAR-10 datasets, widely used in image recognition research.

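
The idea of randomly nullifying features within data vectors can be illustrated with a minimal sketch. The abstract does not say how the mask is drawn, so the independent per-feature mask, the rate `p`, and the function name `nullify_features` are assumptions made for illustration, not the authors' implementation.

```python
import random


def nullify_features(x, p, rng):
    """Return a copy of x in which each feature is set to 0.0 with probability p.

    Assumption: every feature is masked independently; the paper may draw
    its mask differently.
    """
    return [0.0 if rng.random() < p else value for value in x]


rng = random.Random(0)          # seeded only to make the sketch repeatable
x = [1.0] * 10                  # hypothetical feature vector
masked = nullify_features(x, 0.5, rng)
unchanged = nullify_features(x, 0.0, rng)   # p = 0 keeps every feature
all_zero = nullify_features(x, 1.0, rng)    # p = 1 nullifies every feature
```

Because the mask is drawn at random, an attacker cannot know in advance which features will survive, which is the intuition the abstract gives for why crafted perturbations lose impact.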
2018-04-11
Gebhardt, D., Parikh, K., Dzieciuch, I., Walton, M., Hoang, N. A. V..  2017.  Hunting for Naval Mines with Deep Neural Networks. OCEANS 2017 - Anchorage. :1–5.

Explosive naval mines pose a threat to ocean and sea faring vessels, both military and civilian. This work applies deep neural network (DNN) methods to the problem of detecting minelike objects (MLO) on the seafloor in side-scan sonar imagery. We explored how the DNN depth, memory requirements, calculation requirements, and training data distribution affect detection efficacy. A visualization technique (class activation map) was incorporated that aids a user in interpreting the model's behavior. We found that modest DNN model sizes yielded better accuracy (98%) than very simple DNN models (93%) and a support vector machine (78%). The largest DNN models achieved <1% efficacy increase at a cost of a 17x increase of trainable parameter count and computation requirements. In contrast to DNNs popularized for many-class image recognition tasks, the models for this task require far fewer computational resources (0.3% of parameters), and are suitable for embedded use within an autonomous unmanned underwater vehicle.
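
The abstract names the class activation map but gives no details. The sketch below follows the commonly described formulation, in which the map for one class is the sum of the final convolutional feature maps weighted by that class's output-layer weights. Whether the paper uses exactly this variant is an assumption, and the tiny 2x2 inputs are made up.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) using K per-class weights."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


# Two hypothetical 2x2 feature maps and the weights linking them to one class.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.0]
cam = class_activation_map(feature_maps, class_weights)
```

Large values in `cam` mark the image regions that contributed most to the class score, which is what lets a user inspect where the model is looking.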

2018-03-05
Gowda, Thamme, Hundman, Kyle, Mattmann, Chris A..  2017.  An Approach for Automatic and Large Scale Image Forensics. Proceedings of the 2nd International Workshop on Multimedia Forensics and Security. :16–20.

This paper describes the applications of deep learning-based image recognition in the DARPA Memex program and its repository of 1.4 million weapons-related images collected from the Deep web. We develop a fast, efficient, and easily deployable framework for integrating Google's TensorFlow framework with Apache Tika for automatically performing image forensics on the Memex data. Our framework and its integration are evaluated qualitatively and quantitatively and our work suggests that automated, large-scale, and reliable image classification and forensics can be widely used and deployed in bulk analysis for answering domain-specific questions.

2017-12-20
Azakami, T., Shibata, C., Uda, R..  2017.  Challenge to Impede Deep Learning against CAPTCHA with Ergonomic Design. 2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC). 1:637–642.

We previously tried to propose an unbreakable CAPTCHA and found that a time limit is effective in preventing computers from recognizing characters accurately, although computers can eventually recognize all text-based CAPTCHAs given unlimited time. One common way to prevent computers from recognizing characters is distortion, and adding noise is also effective. However, these measures also make the characters harder for human beings to recognize. To solve this problem, our team proposed an effective text-based CAPTCHA algorithm with amodal completion. Our CAPTCHA imposes a large computational cost on computers, while amodal completion helps human beings recognize the characters instantly. Our CAPTCHA has since evolved to use aftereffects and combinations of complementary colors. We evaluated our CAPTCHA against deep learning, which is attracting the most attention because it is faster and more accurate than existing methods for machine recognition. In this paper, we add jagged lines to the edges of characters, since edges are among the most important parts for recognition in deep learning. We also evaluate how much the jagged lines reduce recognition by human beings and how much they hinder recognition by computers. We confirm the effect of our method against deep learning.

An, G., Yu, W..  2017.  CAPTCHA Recognition Algorithm Based on the Relative Shape Context and Point Pattern Matching. 2017 9th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA). :168–172.

The shape context descriptor uses non-uniform distance binning and gives a fairly comprehensive description of shape features, so it is invariant to deformation of the target's contour point set. However, twisted and touching verification codes contain more outliers and heavier noise, which severely degrades this invariance. To address this limitation, this article proposes a new algorithm based on the relative shape context and point pattern matching to recognize verification codes. Experiments on the CSDN site's verification codes show that the recognition rate is higher than that of the traditional shape context and that the response time is shorter.

2017-03-08
Chammas, E., Mokbel, C., Likforman-Sulem, L..  2015.  Arabic handwritten document preprocessing and recognition. 2015 13th International Conference on Document Analysis and Recognition (ICDAR). :451–455.

Arabic handwritten documents present specific challenges due to the cursive nature of the writing and the presence of diacritical marks. Moreover, one of the largest labeled databases of Arabic handwritten documents, the OpenHart-NIST database, includes specific noise, namely guidelines, that has to be addressed. We propose several approaches to process these documents. First, a guideline detection approach has been developed, based on K-means, that detects the documents that include guidelines. We then propose a series of preprocessing steps at text-line level to reduce the noise effects. For text-lines including guidelines, a guideline removal preprocessing is described and existing keystroke restoration approaches are assessed. In addition, we propose a preprocessing that combines noise removal and deskewing by removing line fragments from neighboring text lines, while searching for the principal orientation of the text-line. We provide recognition results, showing the significant improvement brought by the proposed processing steps.

Çeker, H., Upadhyaya, S..  2015.  Enhanced recognition of keystroke dynamics using Gaussian mixture models. MILCOM 2015 - 2015 IEEE Military Communications Conference. :1305–1310.

Keystroke dynamics is a form of behavioral biometrics that can be used for continuous authentication of computer users. Many classifiers have been proposed for the analysis of acquired user patterns and verification of users at computer terminals. The underlying machine learning methods that use Gaussian density estimator for outlier detection typically assume that the digraph patterns in keystroke data are generated from a single Gaussian distribution. In this paper, we relax this assumption by allowing digraphs to fit more than one distribution via the Gaussian Mixture Model (GMM). We have conducted an experiment with a public data set collected in a controlled environment. Out of 30 users with dynamic text, we obtain 0.08% Equal Error Rate (EER) with 2 components by using GMM, while pure Gaussian yields 1.3% EER for the same data set (an improvement of EER by 93.8%). Our results show that GMM can recognize keystroke dynamics more precisely and authenticate users with higher confidence level.
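
As a rough illustration of the difference between a single Gaussian and a mixture, the sketch below evaluates both densities and reproduces the relative EER improvement quoted in the abstract. The mixture parameters and the digraph latency value are hypothetical; only the two EER figures (1.3% and 0.08%) come from the abstract.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def gmm_pdf(x, weights, means, stds):
    """Density of a Gaussian mixture: a weighted sum of component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def relative_improvement(baseline, improved):
    """Percentage reduction of an error rate relative to the baseline."""
    return (baseline - improved) / baseline * 100.0


# Hypothetical digraph latency (ms) scored under a 2-component mixture.
density = gmm_pdf(120.0, weights=[0.6, 0.4], means=[110.0, 180.0], stds=[15.0, 25.0])

# EER figures from the abstract: 1.3% (single Gaussian) vs 0.08% (GMM).
gain = relative_improvement(1.3, 0.08)
print(round(gain, 1))  # 93.8
```

A mixture with one component and weight 1.0 reduces to the single Gaussian case, so the relaxed model strictly generalizes the baseline assumption.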

Tsao, Chia-Chin, Chen, Yan-Ying, Hou, Yu-Lin, Hsu, Winston H..  2015.  Identify Visual Human Signature in community via wearable camera. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :2229–2233.

With the increasing popularity of wearable devices, information becomes much more easily available. However, personal information sharing still poses great challenges because of privacy issues. We propose an idea of Visual Human Signature (VHS) which can represent each person uniquely even when captured in different views/poses by wearable cameras. We evaluate the performance of multiple effective modalities for recognizing an identity, including facial appearance, visual patches, facial attributes and clothing attributes. We propose to emphasize significant dimensions and do weighted voting fusion for incorporating the modalities to improve the VHS recognition. By jointly considering multiple modalities, the VHS recognition rate can reach 51% in frontal images and 48% in the more challenging environment and our approach can surpass the baseline with average fusion by 25% and 16%. We also introduce Multiview Celebrity Identity Dataset (MCID), a new dataset containing hundreds of identities with different views and clothing for comprehensive evaluation.
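
Weighted voting fusion across modalities can be sketched as follows. The abstract does not describe how the weights are chosen or how scores are normalized, so the modality names, scores, and weights below are hypothetical and the function is only one plausible reading of the technique.

```python
def weighted_voting_fusion(scores_by_modality, weights):
    """Combine per-modality identity scores into one weighted score per identity.

    scores_by_modality: {modality: {identity: score}}
    weights:            {modality: weight}
    Returns the winning identity and the fused scores.
    """
    fused = {}
    for modality, scores in scores_by_modality.items():
        weight = weights[modality]
        for identity, score in scores.items():
            fused[identity] = fused.get(identity, 0.0) + weight * score
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical scores from two modalities for two candidate identities.
scores = {
    "face": {"alice": 0.6, "bob": 0.4},
    "clothing": {"alice": 0.2, "bob": 0.8},
}

weighted_best, _ = weighted_voting_fusion(scores, {"face": 0.8, "clothing": 0.2})
average_best, _ = weighted_voting_fusion(scores, {"face": 0.5, "clothing": 0.5})
```

With these made-up numbers, trusting the face modality more selects "alice", while equal weights (the average-fusion baseline) select "bob", which shows how the weighting alone can change the decision.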

Prinosil, J., Krupka, A., Riha, K., Dutta, M. K., Singh, A..  2015.  Automatic hair color de-identification. 2015 International Conference on Green Computing and Internet of Things (ICGCIoT). :732–736.

A process of de-identification used for privacy protection in multimedia content should be applied not only for primary biometric traits (face, voice) but for soft biometric traits as well. This paper deals with a proposal of the automatic hair color de-identification method working with video records. The method involves image hair area segmentation, basic hair color recognition, and modification of hair color for real-looking de-identified images.

Chi, H., Hu, Y. H..  2015.  Face de-identification using facial identity preserving features. 2015 IEEE Global Conference on Signal and Information Processing (GlobalSIP). :586–590.

Automated human facial image de-identification is a much needed technology for privacy-preserving social media and intelligent surveillance applications. Other than the usual face blurring techniques, in this work, we propose to achieve facial anonymity by slightly modifying existing facial images into "averaged faces" so that the corresponding identities are difficult to uncover. This approach preserves the aesthesis of the facial images while achieving the goal of privacy protection. In particular, we explore deep learning-based facial identity-preserving (FIP) features. Unlike conventional face descriptors, the FIP features can significantly reduce intra-identity variances, while maintaining inter-identity distinctions. By suppressing and tinkering with FIP features, we achieve the goal of k-anonymity facial image de-identification while preserving desired utilities. Using a face database, we successfully demonstrate that the resulting "averaged faces" will still preserve the aesthesis of the original images while defying facial image identity recognition.

Nakashima, Y., Koyama, T., Yokoya, N., Babaguchi, N..  2015.  Facial expression preserving privacy protection using image melding. 2015 IEEE International Conference on Multimedia and Expo (ICME). :1–6.

An enormous number of images are currently shared through social networking services such as Facebook. These images usually show people and may violate their privacy if they are published without permission from each person. To remedy this privacy concern, visual privacy protection, such as blurring, is applied to facial regions of people without permission. However, in addition to image quality degradation, this may spoil the context of the image: If some people are filtered while the others are not, missing facial expression makes comprehension of the image difficult. This paper proposes an image melding-based method that modifies facial regions in a visually unintrusive way while preserving facial expression. Our experimental results demonstrated that the proposed method can retain facial expression while protecting privacy.

Lokhande, S. S., Dawande, N. A..  2015.  A Survey on Document Image Binarization Techniques. 2015 International Conference on Computing Communication Control and Automation. :742–746.

Document image binarization is performed to segment foreground text from background text in badly degraded documents. In this paper, a comprehensive survey has been conducted on some state-of-the-art document image binarization techniques. After describing these document image binarization techniques, their performance has been compared with the help of various evaluation performance metrics which are widely used for document image analysis and recognition. On the basis of this comparison, it has been found that the adaptive contrast method is the best performing method. Accordingly, the partial results that we have obtained for the adaptive contrast method have been stated, and the mathematical model and block diagram of the adaptive contrast method have been described in detail.

2015-05-06
Jian Wang, Lin Mei, Yi Li, Jian-Ye Li, Kun Zhao, Yuan Yao.  2014.  Variable Window for Outlier Detection and Impulsive Noise Recognition in Range Images. Cluster, Cloud and Grid Computing (CCGrid), 2014 14th IEEE/ACM International Symposium on. :857-864.

To improve the overall performance of denoising range images, an impulsive noise (IN) denoising method with variable windows is proposed in this paper. Founded on several discriminant criteria, the principles of dropout IN detection and outlier IN detection are provided. Subsequently, a nearest non-IN neighbors searching process and an Index Distance Weighted Mean filter are combined for IN denoising. As key factors in the adaptability of the proposed denoising method, the sizes of the two windows for outlier IN detection and IN denoising are investigated. Derived from a theoretical model of invader occlusion, a variable window is presented for adapting the window size to the dynamic environment of each point, accompanied by practical criteria for adaptive variable window size determination. Experiments on real range images of multi-line surfaces are carried out, with evaluations in terms of computational complexity and quality assessment and a comparative analysis against a few other popular methods. The results indicate that the proposed method can detect impulsive noise with high accuracy and denoise it with strong adaptability with the help of the variable window.

2015-05-05
Raut, R.D., Kulkarni, S., Gharat, N.N..  2014.  Biometric Authentication Using Kekre's Wavelet Transform. Electronic Systems, Signal Processing and Computing Technologies (ICESC), 2014 International Conference on. :99-104.

This paper proposes an enhanced method for personal authentication based on finger-knuckle-print using Kekre's wavelet transform (KWT). Finger-knuckle-print (FKP) is the inherent skin pattern of the outer surface around the phalangeal joint of one's finger. It is highly discriminable and unique, which makes it a promising emerging biometric identifier. Kekre's wavelet transform is constructed from Kekre's transform. The proposed system is evaluated on a prepared FKP database that involves all categories of FKP. The database contains 500 FKP samples in total. This paper focuses on different image enhancement techniques for the pre-processing of the captured images. The proposed algorithm is examined on 350 training and 150 testing samples of the database and shows that the quality of the database and the pre-processing techniques play an important role in recognizing the individual. The experiments compute performance parameters such as false acceptance rate (FAR), false rejection rate (FRR), true acceptance rate (TAR), and true rejection rate (TRR). The test results demonstrate an improvement in EER (equal error rate), which is very important for authentication. The experimental results using Kekre's algorithm along with image enhancement show that the finger knuckle recognition rate is better than that of the conventional method.
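
The performance parameters named in the abstract follow from standard definitions, sketched below. The match scores, the "accept when score >= threshold" rule, and the threshold scan used to approximate the EER are assumptions for illustration; the abstract does not describe the paper's scoring or evaluation procedure.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when a score >= threshold is accepted."""
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


def approximate_eer(genuine_scores, impostor_scores):
    """Scan observed scores as thresholds and return the rate where FAR and FRR are closest."""
    candidates = sorted(set(genuine_scores) | set(impostor_scores))

    def gap(threshold):
        far, frr = far_frr(genuine_scores, impostor_scores, threshold)
        return abs(far - frr)

    best = min(candidates, key=gap)
    far, frr = far_frr(genuine_scores, impostor_scores, best)
    return (far + frr) / 2.0


# Hypothetical match scores (higher means more similar).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.3, 0.5, 0.6]

far, frr = far_frr(genuine, impostor, 0.55)
tar, trr = 1.0 - frr, 1.0 - far   # TAR and TRR are the complements of FRR and FAR
eer = approximate_eer(genuine, impostor)
```

Raising the threshold lowers FAR and raises FRR, so the EER marks the operating point where the two error types balance.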

2015-05-01
Ketenci, S., Ulutas, G., Ulutas, M..  2014.  Detection of duplicated regions in images using 1D-Fourier transform. Systems, Signals and Image Processing (IWSSIP), 2014 International Conference on. :171-174.

A large number of digital images and videos are acquired, stored, processed and shared nowadays. High quality imaging hardware and low cost, user friendly image editing software make digital media vulnerable to modifications. One of the most popular image modification techniques is copy move forgery. This tampering technique copies part of an image and pastes it into another part of the same image to conceal or to replicate some part of the image. Researchers have recently proposed many techniques to detect copy move forged regions of images. These methods divide the image into overlapping blocks and extract features to determine similarity among groups of blocks. Selection of the feature extraction algorithm plays an important role in the accuracy of detection methods. Column averages of the 1D-FT of rows are used to extract features from overlapping blocks of the image. Blocks are transformed into the frequency domain using the 1D-FT of the rows, and average values of the transformed columns form feature vectors. Similarity of feature vectors indicates possible forged regions. Results show that the proposed method can detect copy pasted regions with higher accuracy compared to similar works reported in the literature. The method is also more resistant against Gaussian blurring or JPEG compression attacks, as shown in the results.
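
The feature extraction described above can be sketched in a few functions. Several details are assumptions because the abstract leaves them open: the sketch averages the magnitudes of the row transforms, uses a 2x2 block size on a made-up image, and compares all block pairs directly, whereas practical detectors usually sort the feature vectors and also filter out neighboring blocks.

```python
import cmath


def dft_magnitudes(row):
    """Magnitude spectrum of the 1-D discrete Fourier transform of one row."""
    n = len(row)
    return [
        abs(sum(row[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def block_features(block):
    """Transform each row, then average each column of the resulting spectra."""
    spectra = [dft_magnitudes(row) for row in block]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def overlapping_blocks(image, size):
    """Yield ((row, col), block) for every size x size block in the image."""
    height, width = len(image), len(image[0])
    for r in range(height - size + 1):
        for c in range(width - size + 1):
            yield (r, c), [line[c:c + size] for line in image[r:r + size]]


def find_similar_blocks(image, size, tol=1e-9):
    """Return pairs of block positions whose feature vectors agree within tol."""
    feats = [(pos, block_features(block)) for pos, block in overlapping_blocks(image, size)]
    matches = []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            (p, f), (q, g) = feats[i], feats[j]
            if all(abs(a - b) <= tol for a, b in zip(f, g)):
                matches.append((p, q))
    return matches


# Made-up image: the 2x2 patch at (0, 0) has been copied to (2, 4).
image = [
    [9, 7, 1, 2, 3, 4],
    [3, 5, 6, 8, 0, 1],
    [2, 4, 7, 1, 9, 7],
    [8, 6, 0, 2, 3, 5],
]
matches = find_similar_blocks(image, size=2)
features = block_features([[9, 7], [3, 5]])
```

Identical blocks always produce identical feature vectors, so the copied patch shows up as a matching pair; unrelated blocks can also collide on such short features, which is one reason real detectors use larger blocks and extra filtering.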