Biblio

Filters: Keyword is convolution
2018-11-19
Chen, Y., Lai, Y., Liu, Y..  2017.  Transforming Photos to Comics Using Convolutional Neural Networks. 2017 IEEE International Conference on Image Processing (ICIP). :2010–2014.

In this paper, inspired by Gatys's recent work, we propose a novel approach that transforms photos to comics using deep convolutional neural networks (CNNs). While Gatys's method that uses a pre-trained VGG network generally works well for transferring artistic styles such as painting from a style image to a content image, for more minimalist styles such as comics, the method often fails to produce satisfactory results. To address this, we further introduce a dedicated comic style CNN, which is trained for classifying comic images and photos. This new network is effective in capturing various comic styles and thus helps to produce better comic stylization results. Even with a grayscale style image, Gatys's method can still produce colored output, which is not desirable for comics. We develop a modified optimization framework such that a grayscale image is guaranteed to be synthesized. To avoid converging to poor local minima, we further initialize the output image using a grayscale version of the content image. Various examples show that our method synthesizes better comic images than the state-of-the-art method.

Li, P., Zhao, L., Xu, D., Lu, D..  2018.  Incorporating Multiscale Contextual Loss for Image Style Transfer. 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC). :241–245.

In this paper, we propose to impose a multiscale contextual loss for image style transfer based on Convolutional Neural Networks (CNN). In the traditional optimization framework, a new stylized image is synthesized by constraining the high-level CNN features to be similar to a content image and the lower-level CNN features to be similar to a style image, which, however, appears to lose many details of the content image, presenting unpleasing and inconsistent distortions or artifacts. The proposed multiscale contextual loss, named Haar loss, is responsible for preserving the lost details by matching the features derived from the content image and the synthesized image via wavelet transform. It enables the synthesized image to better retain the semantic information of the content image. More specifically, the unpleasant distortions can be effectively alleviated while the style can be well preserved. In the experiments, we show the visually more consistent and simultaneously well-stylized images generated by incorporating the multiscale contextual loss.
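
Note: the idea of matching wavelet sub-bands at several scales can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `haar2d` and `haar_loss` are hypothetical, the transform is applied to plain 2D arrays rather than to CNN features, and the choice of two levels and a mean squared difference is assumed for the example.

```python
def haar2d(img):
    """One-level orthonormal 2D Haar transform of a list-of-lists.

    Height and width must be even. Returns four half-size sub-bands:
    (average, column difference, row difference, diagonal difference).
    """
    avg, dcol, drow, diag = [], [], [], []
    for r in range(0, len(img), 2):
        row_avg, row_dcol, row_drow, row_diag = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            row_avg.append((a + b + d + e) / 2.0)   # coarse content of the 2x2 block
            row_dcol.append((a - b + d - e) / 2.0)  # change between the two columns
            row_drow.append((a + b - d - e) / 2.0)  # change between the two rows
            row_diag.append((a - b - d + e) / 2.0)  # diagonal detail
        avg.append(row_avg)
        dcol.append(row_dcol)
        drow.append(row_drow)
        diag.append(row_diag)
    return avg, dcol, drow, diag


def haar_loss(content, synthesized, levels=2):
    """Mean squared sub-band difference, summed over `levels` scales (assumed form)."""
    total = 0.0
    for _ in range(levels):
        bands_c = haar2d(content)
        bands_s = haar2d(synthesized)
        for bc, bs in zip(bands_c, bands_s):
            n = len(bc) * len(bc[0])
            total += sum((x - y) ** 2
                         for rc, rs in zip(bc, bs)
                         for x, y in zip(rc, rs)) / n
        # Move to the next, coarser scale by decomposing the average band again.
        content, synthesized = bands_c[0], bands_s[0]
    return total
```

In this sketch the input size must be divisible by 2 raised to `levels`; identical inputs give a loss of zero, and any lost detail shows up in one of the difference bands.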

2018-06-11
Ocsa, A., Huillca, J. L., Coronado, R., Quispe, O., Arbieto, C., Lopez, C..  2017.  Approximate nearest neighbors by deep hashing on large-scale search: Comparison of representations and retrieval performance. 2017 IEEE Latin American Conference on Computational Intelligence (LA-CCI). :1–6.

The growing volume of data and its increasing complexity require even more efficient and faster information retrieval techniques. Approximate nearest neighbor search algorithms based on hashing were proposed to query high-dimensional datasets due to their high retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Networks (CNN) with hashing techniques to improve the search accuracy. However, there are challenges to solve in order to find a practical and efficient solution to index CNN features, such as the need for a heavy training process to achieve accurate query results and the critical dependency on data-parameters. In this work we execute exhaustive experiments in order to compare recent methods that are able to produce a better representation of the data space at a lower computational cost and with better accuracy, by computing the best data-parameter values for optimal sub-space projection and exploring the correlations among CNN feature attributes using fractal theory. We give an overview of these different techniques and present our comparative experiments for data representation and retrieval performance.

Moskewicz, Matthew W., Jannesari, Ali, Keutzer, Kurt.  2017.  Boda: A Holistic Approach for Implementing Neural Network Computations. Proceedings of the Computing Frontiers Conference. :53–62.

Neural networks (NNs) are currently a very popular topic in machine learning for both research and practice. GPUs are the dominant computing platform for research efforts and are also gaining popularity as a deployment platform for applications such as autonomous vehicles. As a result, GPU vendors such as NVIDIA have spent enormous effort to write special-purpose NN libraries. On other hardware targets, especially mobile GPUs, such vendor libraries are not generally available. Thus, the development of portable, open, high-performance, energy-efficient GPU code for NN operations would enable broader deployment of NN-based algorithms. A root problem is that high efficiency GPU programming suffers from high complexity, low productivity, and low portability. To address this, this work presents a framework to enable productive, high-efficiency GPU programming for NN computations across hardware platforms and programming models. In particular, the framework provides specific support for metaprogramming and autotuning of operations over ND-Arrays. To show the correctness and value of our framework and approach, we implement a selection of NN operations, covering the core operations needed for deploying three common image-processing neural networks. We target three different hardware platforms: NVIDIA, AMD, and Qualcomm GPUs. On NVIDIA GPUs, we show both portability between OpenCL and CUDA as well as competitive performance compared to the vendor library. On Qualcomm GPUs, we show that our framework enables productive development of target-specific optimizations, and achieves reasonable absolute performance. Finally, on AMD GPUs, we show initial results that indicate our framework can yield reasonable performance on a new platform with minimal effort.

2018-06-07
Sim, H., Nguyen, D., Lee, J., Choi, K..  2017.  Scalable stochastic-computing accelerator for convolutional neural networks. 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC). :696–701.

Stochastic Computing (SC) is an alternative design paradigm particularly useful for applications where cost is critical. SC has been applied to neural networks, as neural networks are known for their high computational complexity. However, previous work in this area has critical limitations such as the fully-parallel architecture assumption, which prevents it from being applicable to recent networks such as convolutional neural networks, or ConvNets. This paper presents the first SC architecture for ConvNets and shows its feasibility, with detailed analyses of implementation overheads. Our SC-ConvNet is a hybrid between SC and conventional binary design, which is a marked difference from earlier SC-based neural networks. Though this might seem like a compromise, it is a novel feature driven by the need to support modern ConvNets at scale, which commonly have many, large layers. Our proposed architecture also features hybrid layer composition, which helps achieve very high recognition accuracy. Our detailed evaluation results involving functional simulation and RTL synthesis suggest that SC-ConvNets are indeed competitive with conventional binary designs, even without considering inherent error resilience of SC.
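
Note: for readers unfamiliar with stochastic computing, the textbook unipolar encoding represents a value in [0, 1] as the fraction of 1s in a random bitstream, so a multiplication reduces to a bitwise AND of two independent streams. The sketch below illustrates only that general principle; it is not the SC-ConvNet architecture from the paper, and the function names, stream length, and seed are assumptions chosen for the example.

```python
import random


def to_stream(p, length, rng):
    """Encode p in [0, 1] as a bitstream whose fraction of 1s is roughly p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a bitstream back to a value by counting its 1s."""
    return sum(bits) / len(bits)


def sc_multiply(x, y, length=4096, seed=0):
    """Approximate x * y with one AND gate per bit of two independent streams."""
    rng = random.Random(seed)
    xs = to_stream(x, length, rng)
    ys = to_stream(y, length, rng)  # drawn separately, so independent of xs
    return from_stream([a & b for a, b in zip(xs, ys)])
```

The result is an estimate whose error shrinks as the streams get longer, which is the cost-versus-accuracy trade-off that makes the paradigm attractive when hardware cost is critical.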

Akcay, S., Breckon, T. P..  2017.  An evaluation of region based object detection strategies within X-ray baggage security imagery. 2017 IEEE International Conference on Image Processing (ICIP). :1337–1341.

Here we explore the applicability of a traditional sliding window based convolutional neural network (CNN) detection pipeline and region based object detection techniques such as Faster Region-based CNN (R-CNN) and Region-based Fully Convolutional Networks (R-FCN) to the problem of object detection in X-ray security imagery. Within this context, with limited dataset availability, we employ a transfer learning paradigm for network training tackling both single and multiple object detection problems over a number of R-CNN/R-FCN variants. The use of first-stage region proposal within the Faster RCNN and R-FCN provides results superior to the traditional sliding window driven CNN (SWCNN) approach. With the use of Faster RCNN with VGG16, pretrained on the ImageNet dataset, we achieve 88.3 mAP for a six object class X-ray detection problem. The use of R-FCN with ResNet-101 yields 96.3 mAP for the two class firearm detection problem, requiring 0.1 second of computation per image. Overall, we illustrate the comparative performance of these techniques as object localization strategies within cluttered X-ray security imagery.

2018-04-04
Wu, F., Wang, J., Liu, J., Wang, W..  2017.  Vulnerability detection with deep learning. 2017 3rd IEEE International Conference on Computer and Communications (ICCC). :1298–1302.

Vulnerability detection is an important issue in information system security. In this work, we propose a deep learning method for vulnerability detection. We present three deep learning models, namely, convolution neural network (CNN), long short term memory (LSTM) and convolution neural network with long short term memory (CNN-LSTM). In order to test the performance of our approach, we collected 9872 sequences of function calls as features to represent the patterns of binary programs during their execution. We apply our deep learning models to predict the vulnerabilities of these binary programs based on the collected data. The experimental results show that the prediction accuracy of our proposed method reaches 83.6%, which is superior to that of traditional methods such as the multi-layer perceptron (MLP).

Parchami, M., Bashbaghi, S., Granger, E..  2017.  CNNs with cross-correlation matching for face recognition in video surveillance using a single training sample per person. 2017 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). :1–6.

In video surveillance, face recognition (FR) systems seek to detect individuals of interest appearing over a distributed network of cameras. Still-to-video FR systems match faces captured in videos under challenging conditions against facial models, often designed using one reference still per individual. Although CNNs can achieve among the highest levels of accuracy in many real-world FR applications, state-of-the-art CNNs that are suitable for still-to-video FR, like trunk-branch ensemble (TBE) CNNs, represent complex solutions for real-time applications. In this paper, an efficient CNN architecture is proposed for accurate still-to-video FR from a single reference still. The CCM-CNN is based on new cross-correlation matching (CCM) and triplet-loss optimization methods that provide discriminant face representations. The matching pipeline exploits a matrix Hadamard product followed by a fully connected layer inspired by adaptive weighted cross-correlation. A triplet-based training approach is proposed to optimize the CCM-CNN parameters such that the inter-class variations are increased, while enhancing robustness to intra-class variations. To further improve robustness, the network is fine-tuned using synthetically-generated faces based on stills and videos of non-target individuals. Experiments on videos from the COX Face and Chokepoint datasets indicate that the CCM-CNN can achieve a high level of accuracy that is comparable to TBE-CNN and HaarNet, but with a significantly lower time and memory complexity. It may therefore represent a better trade-off between accuracy and complexity for real-time video surveillance applications.
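
Note: the phrase "Hadamard product followed by a fully connected layer" can be read as a weighted cross-correlation at zero lag: multiply two feature vectors element by element, then take a learned weighted sum. The sketch below shows that reading on flat vectors only. The name `weighted_match`, the single output unit, and the bias term are assumptions for illustration; the paper's actual layer shapes are not given in the abstract.

```python
def weighted_match(features_a, features_b, weights, bias=0.0):
    """Score two feature vectors: Hadamard product, then one weighted sum.

    With all weights equal to 1 and no bias this is the plain inner product,
    i.e. an unweighted cross-correlation at zero lag.
    """
    hadamard = [a * b for a, b in zip(features_a, features_b)]
    return sum(w * h for w, h in zip(weights, hadamard)) + bias
```

Learning the weights lets the matcher emphasize the feature positions that discriminate best between identities, which is one way to interpret "adaptive weighted".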

2018-02-28
Brodeur, S., Rouat, J..  2017.  Optimality of inference in hierarchical coding for distributed object-based representations. 2017 15th Canadian Workshop on Information Theory (CWIT). :1–5.

Hierarchical approaches for representation learning have the ability to encode relevant features at multiple scales or levels of abstraction. However, most hierarchical approaches exploit only the last level in the hierarchy, or provide a multiscale representation that holds a significant amount of redundancy. We argue that removing redundancy across the multiple levels of abstraction is important for an efficient representation of compositionality in object-based representations. With the perspective of feature learning as a data compression operation, we propose a new greedy inference algorithm for hierarchical sparse coding. Convolutional matching pursuit with an L0-norm constraint was used to encode the input signal into compact and non-redundant codes distributed across levels of the hierarchy. Simple and complex synthetic datasets of temporal signals were created to evaluate the encoding efficiency and compare with the theoretical lower bounds on the information rate for those signals. Empirical evidence has shown that the algorithm is able to infer near-optimal codes for simple signals. However, it failed for complex signals with strong overlapping between objects. We explain the inefficiency of convolutional matching pursuit that occurred in such cases. This brings new insights about the NP-hard optimization problem related to using an L0-norm constraint in inferring optimally compact and distributed object-based representations.
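
Note: a single-level version of convolutional matching pursuit on a 1D signal can be sketched as below. This is a generic greedy sparse coder under stated assumptions, not the hierarchical algorithm proposed in the paper: the function name is hypothetical, kernels are assumed to have unit L2 norm, and the L0 constraint is modeled as a fixed budget of atoms.

```python
def conv_matching_pursuit(signal, kernels, max_atoms):
    """Greedily pick at most max_atoms (kernel index, shift, coefficient) triples.

    Kernels are assumed to have unit L2 norm, so the inner product with the
    residual is also the best coefficient for that atom.
    """
    residual = list(signal)
    code = []
    for _ in range(max_atoms):  # L0 budget: number of non-zero coefficients
        best = None
        for k, kernel in enumerate(kernels):
            for shift in range(len(residual) - len(kernel) + 1):
                corr = sum(kernel[i] * residual[shift + i]
                           for i in range(len(kernel)))
                if best is None or abs(corr) > abs(best[2]):
                    best = (k, shift, corr)
        if best is None or abs(best[2]) < 1e-12:
            break  # nothing left to explain
        k, shift, coeff = best
        for i, value in enumerate(kernels[k]):
            residual[shift + i] -= coeff * value  # remove the chosen atom
        code.append(best)
    return code, residual
```

Because each step commits to the single best-correlated atom, overlapping objects can mislead the selection, which is consistent with the failure on strongly overlapping signals that the abstract reports.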

2018-02-27
[Anonymous].  2017.  Sensitivity Analysis in Keystroke Dynamics Using Convolutional Neural Networks. 2017 IEEE Workshop on Information Forensics and Security (WIFS). :1–6.

Biometrics has become ubiquitous and spurred common use in many authentication mechanisms. Keystroke dynamics is a form of behavioral biometrics that can be used for user authentication while actively working at a terminal. The proposed mechanisms involve digraph, trigraph and n-graph analysis as separate solutions or suggest a fusion mechanism with certain limitations. However, deep learning can be used as a unifying machine learning technique that consolidates the power of all different features since it has shown tremendous results in image recognition and natural language processing. In this paper, we investigate the applicability of deep learning on three different datasets by using convolutional neural networks and a Gaussian data augmentation technique. We achieve 10% higher accuracy and 7.3% lower equal error rate (EER) than existing methods. Also, our sensitivity analysis indicates that the convolution operation and the fully-connected layer are the most prominent factors that affect the accuracy and the convergence rate of a network trained with keystroke data.

2017-12-20
Wang, Y., Huang, Y., Zheng, W., Zhou, Z., Liu, D., Lu, M..  2017.  Combining convolutional neural network and self-adaptive algorithm to defeat synthetic multi-digit text-based CAPTCHA. 2017 IEEE International Conference on Industrial Technology (ICIT). :980–985.

CAPTCHA (Completely Automated Public Turing test to Tell Computers and Humans Apart) is widely used to prevent automated bots from entering data. Although there are various kinds of CAPTCHAs, the text-based scheme is still applied most widely, because it is one of the most convenient and user-friendly ways for everyday users [1]. Segmentation differs between types of CAPTCHAs, which means that one of the bottlenecks in solving CAPTCHAs is the segmentation. Once the characters can be split accurately, the problem becomes much easier to solve. Unfortunately, the best way to divide them is still case by case, which is to say there is no universal way to achieve it. In this paper, we present a novel algorithm to achieve state-of-the-art performance; moreover, we construct a new convolutional neural network as an add-on recognition part to stabilize the state-of-the-art performance of the whole CAPTCHA system. The CAPTCHA dataset we use is from the State Administration for Industry & Commerce of the People's Republic of China. This dataset contains a total of 33 CAPTCHA entrances. In these experiments, we assume that each entrance is known. Results are provided showing that our algorithms work well on these CAPTCHAs.

2017-11-20
Deng, C., Qiao, H..  2016.  Network security intrusion detection system based on incremental improved convolutional neural network model. 2016 International Conference on Communication and Electronics Systems (ICCES). :1–5.

With the popularization and development of network knowledge, network intruders are increasing, and the attack mode has been updated. Intrusion detection technology is a kind of active defense technology, which can extract the key information from the network system, and quickly judge and protect against internal or external network intrusion. Intrusion detection is a kind of active security technology, which provides real-time protection against internal attacks, external attacks and misuse, and it plays an important role in ensuring network security. However, with the diversification of intrusion technology, the traditional intrusion detection system cannot meet the requirements of the current network security. Therefore, the implementation of intrusion detection needs diversifying. In this context, we apply neural network technology to the network intrusion detection system to solve the problem. In this paper, on the basis of intrusion detection method, we analyze the development history and the present situation of intrusion detection technology, and summarize the intrusion detection system overview and architecture. The neural network intrusion detection is divided into data acquisition, data analysis, pretreatment, intrusion behavior detection and testing.

2017-09-15
Shim, Yong, Sengupta, Abhronil, Roy, Kaushik.  2016.  Low-power Approximate Convolution Computing Unit with Domain-wall Motion Based "Spin-memristor" for Image Processing Applications. Proceedings of the 53rd Annual Design Automation Conference. :21:1–21:6.

Convolution serves as the basic computational primitive for various associative computing tasks ranging from edge detection to image matching. CMOS implementation of such computations entails significant bottlenecks in area and energy consumption due to the large number of multiplication and addition operations involved. In this paper, we propose an ultra-low power and compact hybrid spintronic-CMOS design for the convolution computing unit. Low-voltage operation of domain-wall motion based magneto-metallic "Spin-Memristor"s interfaced with CMOS circuits is able to perform the convolution operation with reasonable accuracy. Simulation results of Gabor filtering for edge detection reveal ~2.5× lower energy consumption compared to a baseline 45nm-CMOS implementation.
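
Note: the "large number of multiplication and addition operations" is easy to see in a direct software convolution, where every output pixel costs one multiply-accumulate (MAC) per kernel entry. The sketch below counts those operations. It is a plain software reference, unrelated to the spintronic hardware in the paper, and it uses a simple difference kernel as a stand-in for the Gabor filter; the function name is hypothetical.

```python
def conv2d_valid(image, kernel):
    """Direct 2D convolution over the 'valid' region, counting MAC operations."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out, macs = [], 0
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    # True convolution flips the kernel along both axes.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
                    macs += 1
            row.append(acc)
        out.append(row)
    return out, macs


# A vertical step edge responds only where the intensity changes.
edge_image = [[0, 0, 1, 1] for _ in range(3)]
response, mac_count = conv2d_valid(edge_image, [[1, -1]])
```

The MAC count grows as output height times output width times kernel height times kernel width, which is why area and energy per operation dominate hardware designs for this primitive.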

2015-05-06
Shimauchi, S., Ohmuro, H..  2014.  Accurate adaptive filtering in square-root Hann windowed short-time Fourier transform domain. Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. :1305–1309.

A novel short-time Fourier transform (STFT) domain adaptive filtering scheme is proposed that can be easily combined with nonlinear post filters such as residual echo or noise reduction in acoustic echo cancellation. Unlike normal STFT subband adaptive filters, which suffer from aliasing artifacts due to their poor prototype filter, our scheme achieves good accuracy by exploiting the relationship between the linear convolution and the poor prototype filter, i.e., the STFT window function. The effectiveness of our scheme was confirmed through the results of simulations conducted to compare it with conventional methods.
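
Note: one standard reason for choosing a square-root Hann window is that, when the same window is applied at analysis and at synthesis with 50% overlap, the squared windows of neighboring frames sum to exactly one, so overlap-add reconstructs the signal. The sketch below checks that property numerically. It does not reproduce the adaptive filter from the paper; the function names, the periodic form of the Hann window, and the half-window frame shift are assumptions for the example.

```python
import math


def sqrt_hann(n):
    """Square root of the periodic Hann window of length n."""
    return [math.sqrt(0.5 * (1.0 - math.cos(2.0 * math.pi * k / n)))
            for k in range(n)]


def overlap_sum_of_squares(window):
    """Sum of squared window values from two frames offset by half a window."""
    half = len(window) // 2
    return [window[k] ** 2 + window[k + half] ** 2 for k in range(half)]
```

For example, `overlap_sum_of_squares(sqrt_hann(8))` is a list of values equal to one up to floating-point rounding.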