Yeh, M., Tang, S., Bhattad, A., Zou, C., Forsyth, D..
2020.
Improving Style Transfer with Calibrated Metrics. 2020 IEEE Winter Conference on Applications of Computer Vision (WACV). :3149–3157.
Style transfer produces a transferred image which is a rendering of a content image in the manner of a style image. We seek to understand how to improve style transfer. To do so requires quantitative evaluation procedures, but current evaluation is qualitative, mostly involving user studies. We describe a novel quantitative evaluation procedure. Our procedure relies on two statistics: the Effectiveness (E) statistic measures the extent to which a given style has been transferred to the target, and the Coherence (C) statistic measures the extent to which the original image's content is preserved. Our statistics are calibrated to human preference: targets with larger values of E and C will reliably be preferred by human subjects in comparisons of style and content, respectively. We use these statistics to investigate the relative performance of a number of Neural Style Transfer (NST) methods, revealing a number of intriguing properties. Admissible methods lie on a Pareto frontier (i.e., improving E reduces C, or vice versa). Three methods are admissible: universal style transfer produces very good C but weak E; modifying the optimization used for Gatys' loss produces a method with strong E and strong C; and a modified cross-layer method has slightly better E at a strong cost in C. While the histogram loss improves the E statistic of Gatys' method, it does not make the method admissible. Surprisingly, style weights have relatively little effect in improving EC scores, and most of the variability in transfer is explained by the style itself (meaning experimenters can be misled by their choice of styles). Our GitHub link is available.
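As an illustration of the Pareto-admissibility criterion this abstract refers to, the following minimal sketch decides which methods are admissible given per-method (E, C) scores. The method names and values are hypothetical placeholders, and the calibration of E and C itself is not reproduced.

```python
# Hypothetical (E, C) scores for several style-transfer methods; the values
# are illustrative only and are not taken from the paper.
scores = {
    "universal": (0.42, 0.91),
    "gatys_modified": (0.78, 0.80),
    "cross_layer_modified": (0.81, 0.55),
    "gatys_histogram": (0.70, 0.60),
}

def is_admissible(name, scores):
    """A method is admissible (Pareto-optimal) if no other method is at least
    as good on both E and C and strictly better on at least one of them."""
    e, c = scores[name]
    for other, (e2, c2) in scores.items():
        if other == name:
            continue
        if e2 >= e and c2 >= c and (e2 > e or c2 > c):
            return False
    return True

admissible = [m for m in scores if is_admissible(m, scores)]
print("Pareto-admissible methods:", admissible)
```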
Rathi, P., Adarsh, P., Kumar, M..
2020.
Deep Learning Approach for Arbitrary Image Style Fusion and Transformation using SANET model. 2020 4th International Conference on Trends in Electronics and Informatics (ICOEI)(48184). :1049–1057.
For real-time applications of arbitrary style transformation, there is a trade-off between the quality of results and the running time of existing algorithms. Hence, a balance must be maintained between the quality of the generated artwork and the speed of execution. It is difficult for present arbitrary style-transformation procedures to preserve the structure of the content image while blending in the design and pattern of the style image. This paper presents the implementation of a network using the SANET model for generating impressive artworks. It is flexible in fusing new style characteristics while sustaining the semantic structure of the content image. The identity loss function helps to minimize the overall loss and preserves the spatial arrangement of the content. The results demonstrate that this method is practically efficient, and therefore it can be employed for real-time fusion and transformation using arbitrary styles.
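As a sketch of the identity-loss idea mentioned above, the snippet below penalizes the network when it fails to reproduce an image that is used as both content and style, which encourages preservation of spatial structure. The `stylize` callable is a hypothetical stand-in for a transformation network; this is not the authors' exact loss.

```python
import torch
import torch.nn.functional as F

def identity_loss(stylize, content, style):
    """Identity-style regularizer: stylizing an image with itself as the
    style should reproduce the image, encouraging the network to preserve
    spatial arrangement. `stylize` is any callable mapping
    (content, style) -> an image tensor of the same shape (hypothetical)."""
    id_content = stylize(content, content)   # content used as its own style
    id_style = stylize(style, style)         # style used as its own content
    return F.l1_loss(id_content, content) + F.l1_loss(id_style, style)

# Toy usage with a stand-in "network" (the identity map) and random images.
if __name__ == "__main__":
    fake_net = lambda c, s: c
    content = torch.rand(1, 3, 256, 256)
    style = torch.rand(1, 3, 256, 256)
    print(identity_loss(fake_net, content, style).item())  # 0.0 for the identity map
```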
Jin, N..
2020.
CNN-Based Image Style Transfer and Its Applications. 2020 International Conference on Computing and Data Science (CDS). :387–390.
The convolutional neural network is a variant of the deep neural network. It is widely used to extract image features and is applied in image classification, image generation, and style transfer. In the style transfer task, we can extract the content from one picture through the convolutional neural network, extract the style from another picture, and generate a new picture by combining the two. In this paper, we show the general steps of image style transfer based on convolutional neural networks through a specific example, and discuss possible future applications.
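The general steps described in this abstract (content features from one picture, Gram-matrix style features from another, and an optimized combination) can be condensed into the following sketch built on torchvision's pre-trained VGG-19. The layer indices, weights, and step count are illustrative assumptions, not values from the paper, and a batch of one image is assumed.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg19

device = "cuda" if torch.cuda.is_available() else "cpu"
vgg = vgg19(weights="DEFAULT").features.to(device).eval()
CONTENT_LAYER, STYLE_LAYERS = 21, [0, 5, 10, 19, 28]  # illustrative VGG-19 indices

def features(x):
    """Collect the activations of every layer of the VGG-19 feature stack."""
    feats = {}
    for i, layer in enumerate(vgg):
        x = layer(x)
        feats[i] = x
    return feats

def gram(f):
    """Gram matrix of a (1, C, H, W) feature map (assumes a single image)."""
    b, c, h, w = f.shape
    f = f.view(c, h * w)
    return f @ f.t() / (c * h * w)

def transfer(content, style, steps=300, style_weight=1e6):
    """Optimize a new image that matches the content features of `content`
    and the Gram-matrix style statistics of `style`."""
    target = content.clone().requires_grad_(True)
    opt = torch.optim.Adam([target], lr=0.01)
    with torch.no_grad():
        cf, sf = features(content), features(style)
    for _ in range(steps):
        tf = features(target)
        c_loss = F.mse_loss(tf[CONTENT_LAYER], cf[CONTENT_LAYER])
        s_loss = sum(F.mse_loss(gram(tf[i]), gram(sf[i])) for i in STYLE_LAYERS)
        loss = c_loss + style_weight * s_loss
        opt.zero_grad(); loss.backward(); opt.step()
    return target.detach()
```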
Wu, L., Chen, X., Meng, L., Meng, X..
2020.
Multitask Adversarial Learning for Chinese Font Style Transfer. 2020 International Joint Conference on Neural Networks (IJCNN). :1–8.
Style transfer between Chinese fonts is challenging due to both the complexity of Chinese characters and the significant differences between fonts. Existing algorithms for this task typically learn a mapping between the reference and target fonts for each character. Subsequently, this mapping is used to generate the characters that do not exist in the target font. However, the characters available for training are unlikely to cover all fine-grained parts of the missing characters, leading to overfitting. As a result, the generated characters of the target font may suffer from problems such as incomplete radicals and dirty dots. To address this problem, this paper presents a multitask adversarial learning approach, termed MTfontGAN, to generate more vivid Chinese characters. MTfontGAN learns to transfer a reference font to multiple target fonts simultaneously. An alignment is imposed on the encoders of the different tasks to make them focus on the important parts of the characters in general style transfer. Such cross-task interactions at the feature level effectively improve the generalization capability of MTfontGAN. The performance of MTfontGAN is evaluated on three Chinese font datasets. Experimental results show that MTfontGAN outperforms state-of-the-art algorithms in a single-task setting. More importantly, increasing the number of tasks leads to better performance on all of them.
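One plausible form of the cross-task encoder alignment mentioned above is a pairwise penalty on the feature maps that each task's encoder produces for the same reference-font character, sketched below. The form of this term, the encoder architecture, and the weighting are assumptions for illustration, not the paper's exact formulation.

```python
import itertools
import torch
import torch.nn.functional as F

def encoder_alignment_loss(encoders, reference):
    """Pairwise L2 alignment between the feature maps that each task-specific
    encoder produces for the same reference-font glyphs (hypothetical form)."""
    feats = [enc(reference) for enc in encoders]
    loss = reference.new_zeros(())
    for fa, fb in itertools.combinations(feats, 2):
        loss = loss + F.mse_loss(fa, fb)
    return loss

# Toy usage: three task-specific encoders sharing the same architecture.
if __name__ == "__main__":
    make_enc = lambda: torch.nn.Conv2d(1, 8, 3, padding=1)
    encoders = [make_enc() for _ in range(3)]
    reference = torch.rand(4, 1, 64, 64)   # a batch of reference-font glyphs
    print(encoder_alignment_loss(encoders, reference).item())
```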
Wang, H., Li, Y., Wang, Y., Hu, H., Yang, M.-H..
2020.
Collaborative Distillation for Ultra-Resolution Universal Style Transfer. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :1857–1866.
Universal style transfer methods typically leverage rich representations from deep Convolutional Neural Network (CNN) models (e.g., VGG-19) pre-trained on large collections of images. Despite their effectiveness, their application is heavily constrained by the large model size needed to handle ultra-resolution images given limited memory. In this work, we present a new knowledge distillation method (named Collaborative Distillation) for encoder-decoder based neural style transfer that reduces the number of convolutional filters. The main idea is underpinned by the finding that encoder-decoder pairs construct an exclusive collaborative relationship, which is regarded as a new kind of knowledge for style transfer models. Moreover, to overcome the feature-size mismatch when applying collaborative distillation, a linear embedding loss is introduced to drive the student network to learn a linear embedding of the teacher's features. Extensive experiments show the effectiveness of our method when applied to different universal style transfer approaches (WCT and AdaIN), even when the model size is reduced by 15.5 times. In particular, on WCT with the compressed models, we achieve ultra-resolution (over 40 megapixels) universal style transfer on a 12GB GPU for the first time. Further experiments on an optimization-based stylization scheme show the generality of our algorithm across different stylization paradigms. Our code and trained models are available at https://github.com/mingsun-tse/collaborative-distillation.
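The linear embedding loss mentioned above bridges the channel mismatch between a slim student and a wider teacher. A minimal sketch of one way to write such a loss is shown below, using a learned 1x1 convolution as the per-pixel linear map; the exact layers, direction of the embedding, and weighting used in the paper may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearEmbeddingLoss(nn.Module):
    """Drive a student's features toward a linear embedding of the teacher's.
    A 1x1 convolution (a per-pixel linear map) lifts the student's smaller
    channel dimension to the teacher's before comparison (illustrative)."""
    def __init__(self, student_channels, teacher_channels):
        super().__init__()
        self.embed = nn.Conv2d(student_channels, teacher_channels,
                               kernel_size=1, bias=False)

    def forward(self, student_feat, teacher_feat):
        return F.mse_loss(self.embed(student_feat), teacher_feat)

# Toy usage: a 64-channel student imitating a 256-channel teacher.
if __name__ == "__main__":
    loss_fn = LinearEmbeddingLoss(64, 256)
    s = torch.rand(1, 64, 32, 32)
    t = torch.rand(1, 256, 32, 32)
    print(loss_fn(s, t).item())
```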
Jiang, H., Du, M., Whiteside, D., Moursy, O., Yang, Y..
2020.
An Approach to Embedding a Style Transfer Model into a Mobile APP. 2020 International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE). :307–316.
The prevalence of photo-processing apps suggests the demand for picture editing. As an implementation of the convolutional neural network, style transfer has been deeply investigated, and there are supporting materials for realizing it on the PC platform. However, few approaches address deploying a style transfer model on mobile devices while meeting the requirements of mobile users. The traditional style transfer model takes hours to run; therefore, based on the Perceptual Losses algorithm [1], we created a feedforward neural network for each style, reducing the processing time to a few seconds. The training data were generated from a pre-trained convolutional neural network model, VGG-19. The algorithm took a thousandth of the time and generated output similar to the original. Furthermore, we optimized the model and deployed it with the TensorFlow Mobile library. We froze the model, used a bitmap to scale the inputs to 720×720, and reverted the output to the original resolution. The reverting process may introduce some blur, but it can be regarded as an artistic feature. The generated images have reliable quality, and the waiting time is independent of the content and pattern of the input images. The main factor that influences the processing time is the input resolution. The average waiting time of our model on the mobile phone, HUAWEI P20 Pro, is less than 2 seconds for 720p images and around 2.8 seconds for 1080p images, which is ten times slower than on the PC GPU, Tesla T40. The performance difference depends on the architecture of the model.
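The deployment recipe above scales the input to a fixed 720×720 model resolution and reverts the result to the original size afterwards. The sketch below illustrates that pre/post-processing step with Pillow and NumPy; the `run_model` call is a hypothetical placeholder, and the authors' TensorFlow Mobile pipeline and bitmap code are not reproduced here.

```python
from PIL import Image
import numpy as np

MODEL_SIZE = (720, 720)  # fixed input resolution described in the abstract

def stylize_at_fixed_resolution(image_path, run_model, output_path):
    """Resize to the model's fixed 720x720 input, run the (hypothetical)
    stylization model, then revert to the original resolution. The final
    upscale may introduce some blur, as noted in the abstract."""
    original = Image.open(image_path).convert("RGB")
    original_size = original.size                       # (width, height)
    small = original.resize(MODEL_SIZE, Image.BILINEAR)
    x = np.asarray(small, dtype=np.float32) / 255.0     # HWC array in [0, 1]
    y = run_model(x)                                    # hypothetical model call
    y = np.clip(y * 255.0, 0, 255).astype(np.uint8)
    stylized = Image.fromarray(y).resize(original_size, Image.BILINEAR)
    stylized.save(output_path)

# Toy usage with an identity "model":
# stylize_at_fixed_resolution("photo.jpg", lambda x: x, "stylized.jpg")
```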
Ye, H., Liu, W., Huang, S..
2020.
Method of Image Style Transfer Based on Edge Detection. 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC). 1:1635–1639.
To overcome the problem of edge information loss during neural network processing, a method of neural style transfer based on edge detection is presented. The edge information of the content image is extracted, and the resulting edge image is processed in the neural network together with the content image and the style image to constrain the edge information of the content image. Compared with the Gatys algorithm and the Markov random field neural network algorithm, the edge structure of the content image is successfully retained after style transfer.
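One simple way to express such an edge constraint, sketched below, is an auxiliary loss on the difference between Sobel edge maps of the content image and the stylized image. The choice of edge detector and the way the edge image enters the network here are assumptions for illustration and may differ from the paper.

```python
import torch
import torch.nn.functional as F

# Sobel kernels for horizontal and vertical gradients.
SOBEL_X = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)
SOBEL_Y = SOBEL_X.transpose(2, 3)

def edge_map(img):
    """Sobel edge magnitude of a (B, 3, H, W) image, computed on grayscale."""
    gray = img.mean(dim=1, keepdim=True)
    kx = SOBEL_X.to(img.device, img.dtype)
    ky = SOBEL_Y.to(img.device, img.dtype)
    gx = F.conv2d(gray, kx, padding=1)
    gy = F.conv2d(gray, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def edge_loss(stylized, content):
    """Penalize differences between the edge structure of the stylized image
    and that of the original content image (illustrative constraint)."""
    return F.mse_loss(edge_map(stylized), edge_map(content))
```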
Mangaokar, N., Pu, J., Bhattacharya, P., Reddy, C. K., Viswanath, B..
2020.
Jekyll: Attacking Medical Image Diagnostics using Deep Generative Models. 2020 IEEE European Symposium on Security and Privacy (EuroS P). :139–157.
Advances in deep neural networks (DNNs) have shown tremendous promise in the medical domain. However, the deep learning tools that are helping the domain can also be used against it. Given the prevalence of fraud in the healthcare domain, it is important to consider the adversarial use of DNNs in manipulating sensitive data that is crucial to patient healthcare. In this work, we present the design and implementation of a DNN-based image translation attack on biomedical imagery. More specifically, we propose Jekyll, a neural style transfer framework that takes as input a biomedical image of a patient and translates it to a new image that indicates an attacker-chosen disease condition. The potential for fraudulent claims based on such generated 'fake' medical images is significant, and we demonstrate successful attacks on both X-ray and retinal fundus image modalities. We show that these attacks manage to mislead both medical professionals and algorithmic detection schemes. Lastly, we also investigate defensive measures based on machine learning to detect images generated by Jekyll.
Bai, Y., Guo, Y., Wei, J., Lu, L., Wang, R., Wang, Y..
2020.
Fake Generated Painting Detection Via Frequency Analysis. 2020 IEEE International Conference on Image Processing (ICIP). :1256–1260.
With the development of deep neural networks, digital fake paintings can be generated by various style transfer algorithms. To detect fake generated paintings, we analyze fake generated and real paintings in the Fourier frequency domain and observe statistical differences and artifacts. Based on our observations, we propose Fake Generated Painting Detection via Frequency Analysis (FGPD-FA), which extracts three types of features in the frequency domain. In addition, we propose a digital fake painting detection database for assessing the proposed method. Experimental results demonstrate the effectiveness of the proposed method under different testing conditions.
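A compressed sketch of one frequency-domain feature commonly used for spotting generation artifacts, the azimuthally averaged log-magnitude spectrum, is given below for illustration; the three feature types actually used by FGPD-FA are not reproduced here.

```python
import numpy as np

def radial_spectrum(gray_image, n_bins=64):
    """Azimuthally averaged log-magnitude spectrum of a grayscale image,
    a simple frequency-domain feature for spotting generation artifacts
    (illustrative; not the exact features of FGPD-FA)."""
    f = np.fft.fftshift(np.fft.fft2(gray_image))
    mag = np.log1p(np.abs(f))
    h, w = mag.shape
    cy, cx = h // 2, w // 2
    y, x = np.ogrid[:h, :w]
    r = np.sqrt((y - cy) ** 2 + (x - cx) ** 2)
    r_norm = r / r.max()
    bins = np.linspace(0, 1, n_bins + 1)
    feature = [mag[(r_norm >= lo) & (r_norm < hi)].mean()
               for lo, hi in zip(bins[:-1], bins[1:])]
    return np.array(feature)

# Toy usage: the low-frequency end of the spectrum of a random image.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(radial_spectrum(rng.random((256, 256)))[:5])
```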
Jin, H., Wang, T., Zhang, M., Li, M., Wang, Y., Snoussi, H..
2020.
Neural Style Transfer for Picture with Gradient Gram Matrix Description. 2020 39th Chinese Control Conference (CCC). :7026–7030.
Despite the high performance of neural style transfer on stylized pictures, we found that the algorithm of Gatys et al. [1] cannot perfectly reconstruct texture style. The output stylized picture can exhibit unsatisfactory, unexpected textures, such as muddiness in local areas and insufficient grain expression. Our method is based on the original algorithm, adding a Gradient Gram description to the style loss, aiming to strengthen texture expression and eliminate muddiness. To some extent our method lengthens the runtime; however, its output stylized pictures achieve better texture detail, especially in the elimination of muddiness.
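A minimal sketch of a gradient Gram term of the kind described above is shown below: Gram matrices are computed on finite differences of the feature maps and compared between the stylized and style images. The exact layers, normalization, and weighting in the paper may differ.

```python
import torch
import torch.nn.functional as F

def gram(feat):
    """Standard Gram matrix of a (B, C, H, W) feature map."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)

def gradient_gram(feat):
    """Gram matrices of the horizontal and vertical finite differences of a
    feature map, describing how texture changes locally (illustrative)."""
    dx = feat[:, :, :, 1:] - feat[:, :, :, :-1]
    dy = feat[:, :, 1:, :] - feat[:, :, :-1, :]
    return gram(dx), gram(dy)

def gradient_gram_style_loss(target_feat, style_feat):
    """Style-loss term comparing gradient Gram matrices of the stylized
    image's features against those of the style image's features."""
    tgx, tgy = gradient_gram(target_feat)
    sgx, sgy = gradient_gram(style_feat)
    return F.mse_loss(tgx, sgx) + F.mse_loss(tgy, sgy)
```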