Visible to the public Biblio

Found 221 results

Filters: Keyword is visualization  [Clear All Filters]
2020-06-12
Ay, Betül, Aydın, Galip, Koyun, Zeynep, Demir, Mehmet.  2019.  A Visual Similarity Recommendation System using Generative Adversarial Networks. 2019 International Conference on Deep Learning and Machine Learning in Emerging Applications (Deep-ML). :44—48.

The goal of content-based recommendation system is to retrieve and rank the list of items that are closest to the query item. Today, almost every e-commerce platform has a recommendation system strategy for products that customers can decide to buy. In this paper we describe our work on creating a Generative Adversarial Network based image retrieval system for e-commerce platforms to retrieve best similar images for a given product image specifically for shoes. We compare state-of-the-art solutions and provide results for the proposed deep learning network on a standard data set.

2020-06-04
Cao, Lizhou, Peng, Chao, Hansberger, Jeffery T..  2019.  A Large Curved Display System in Virtual Reality for Immersive Data Interaction. 2019 IEEE Games, Entertainment, Media Conference (GEM). :1—4.

This work presents the design and implementation of a large curved display system in a virtual reality (VR) environment that supports visualization of 2D datasets (e.g., images, buttons and text). By using this system, users are allowed to interact with data in front of a wide field of view and gain a high level of perceived immersion. We exhibit two use cases of this system, including (1) a virtual image wall as the display component of a 3D user interface, and (2) an inventory interface for a VR-based educational game. The use cases demonstrate capability and flexibility of curved displays in supporting varied purposes of data interaction within virtual environments.

Asiri, Somayah, Alzahrani, Ahmad A..  2019.  The Effectiveness of Mixed Reality Environment-Based Hand Gestures in Distributed Collaboration. 2019 2nd International Conference on Computer Applications Information Security (ICCAIS). :1—6.

Mixed reality (MR) technologies are widely used in distributed collaborative learning scenarios and have made learning and training more flexible and intuitive. However, there are many challenges in the use of MR due to the difficulty in creating a physical presence, particularly when a physical task is being performed collaboratively. We therefore developed a novel MR system to overcomes these limitations and enhance the distributed collaboration user experience. The primary objective of this paper is to explore the potential of a MR-based hand gestures system to enhance the conceptual architecture of MR in terms of both visualization and interaction in distributed collaboration. We propose a synchronous prototype named MRCollab as an immersive collaborative approach that allows two or more users to communicate with a peer based on the integration of several technologies such as video, audio, and hand gestures.

2020-06-01
Alshinina, Remah, Elleithy, Khaled.  2018.  A highly accurate machine learning approach for developing wireless sensor network middleware. 2018 Wireless Telecommunications Symposium (WTS). :1–7.
Despite the popularity of wireless sensor networks (WSNs) in a wide range of applications, security problems associated with them have not been completely resolved. Middleware is generally introduced as an intermediate layer between WSNs and the end user to resolve some limitations, but most of the existing middleware is unable to protect data from malicious and unknown attacks during transmission. This paper introduces an intelligent middleware based on an unsupervised learning technique called Generative Adversarial Networks (GANs) algorithm. GANs contain two networks: a generator (G) network and a detector (D) network. The G creates fake data similar to the real samples and combines it with real data from the sensors to confuse the attacker. The D contains multi-layers that have the ability to differentiate between real and fake data. The output intended for this algorithm shows an actual interpretation of the data that is securely communicated through the WSN. The framework is implemented in Python with experiments performed using Keras. Results illustrate that the suggested algorithm not only improves the accuracy of the data but also enhances its security by protecting data from adversaries. Data transmission from the WSN to the end user then becomes much more secure and accurate compared to conventional techniques.
2020-04-24
Zhang, Lei, Zhang, Jianqing, Chen, Yong, Liao, Shaowen.  2018.  Research on the Simulation Algorithm of Object-Oriented Language. 2018 3rd International Conference on Smart City and Systems Engineering (ICSCSE). :902—904.

Security model is an important subject in the field of low energy independence complexity theory. It takes security strategy as the core, changes the system from static protection to dynamic protection, and provides the basis for the rapid response of the system. A large number of empirical studies have been conducted to verify the cache consistency. The development of object oriented language is pure object oriented language, and the other is mixed object oriented language, that is, adding class, inheritance and other elements in process language and other languages. This paper studies a new object-oriented language application, namely GUT for a write-back cache, which is based on the study of simulation algorithm to solve all these challenges in the field of low energy independence complexity theory.

2020-03-30
Bharati, Aparna, Moreira, Daniel, Brogan, Joel, Hale, Patricia, Bowyer, Kevin, Flynn, Patrick, Rocha, Anderson, Scheirer, Walter.  2019.  Beyond Pixels: Image Provenance Analysis Leveraging Metadata. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). :1692–1702.
Creative works, whether paintings or memes, follow unique journeys that result in their final form. Understanding these journeys, a process known as "provenance analysis," provides rich insights into the use, motivation, and authenticity underlying any given work. The application of this type of study to the expanse of unregulated content on the Internet is what we consider in this paper. Provenance analysis provides a snapshot of the chronology and validity of content as it is uploaded, re-uploaded, and modified over time. Although still in its infancy, automated provenance analysis for online multimedia is already being applied to different types of content. Most current works seek to build provenance graphs based on the shared content between images or videos. This can be a computationally expensive task, especially when considering the vast influx of content that the Internet sees every day. Utilizing non-content-based information, such as timestamps, geotags, and camera IDs can help provide important insights into the path a particular image or video has traveled during its time on the Internet without large computational overhead. This paper tests the scope and applicability of metadata-based inferences for provenance graph construction in two different scenarios: digital image forensics and cultural analytics.
2020-03-02
Fu, Rao, Grinberg, Ilya, Gogolyuk, Petro.  2019.  Electric Power Distribution System Fault Recovery Based on Visual Computation. 2019 IEEE 20th International Conference on Computational Problems of Electrical Engineering (CPEE). :1–4.

A study case of electric power distribution system fault recovery has been introduced in this article. With proper connections, network reconfiguration should be considered an effective solution to the system fault condition. Considering the radial structure of the distribution system, appropriate observation on visualized outcome of the voltage profile can lead the system operator to obtain the best switching line effectively. Contour plots are applied for visualizing the voltage profiles of a modified IEEE 13-node test feeder model.

2020-02-18
Han, Chihye, Yoon, Wonjun, Kwon, Gihyun, Kim, Daeshik, Nam, Seungkyu.  2019.  Representation of White- and Black-Box Adversarial Examples in Deep Neural Networks and Humans: A Functional Magnetic Resonance Imaging Study. 2019 International Joint Conference on Neural Networks (IJCNN). :1–8.

The recent success of brain-inspired deep neural networks (DNNs) in solving complex, high-level visual tasks has led to rising expectations for their potential to match the human visual system. However, DNNs exhibit idiosyncrasies that suggest their visual representation and processing might be substantially different from human vision. One limitation of DNNs is that they are vulnerable to adversarial examples, input images on which subtle, carefully designed noises are added to fool a machine classifier. The robustness of the human visual system against adversarial examples is potentially of great importance as it could uncover a key mechanistic feature that machine vision is yet to incorporate. In this study, we compare the visual representations of white- and black-box adversarial examples in DNNs and humans by leveraging functional magnetic resonance imaging (fMRI). We find a small but significant difference in representation patterns for different (i.e. white- versus black-box) types of adversarial examples for both humans and DNNs. However, human performance on categorical judgment is not degraded by noise regardless of the type unlike DNN. These results suggest that adversarial examples may be differentially represented in the human visual system, but unable to affect the perceptual experience.

2020-02-17
Eckhart, Matthias, Ekelhart, Andreas, Weippl, Edgar.  2019.  Enhancing Cyber Situational Awareness for Cyber-Physical Systems through Digital Twins. 2019 24th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA). :1222–1225.
Operators of cyber-physical systems (CPSs) need to maintain awareness of the cyber situation in order to be able to adequately address potential issues in a timely manner. For instance, detecting early symptoms of cyber attacks may speed up the incident response process and mitigate consequences of attacks (e.g., business interruption, safety hazards). However, attaining a full understanding of the cyber situation may be challenging, given the complexity of CPSs and the ever-changing threat landscape. In particular, CPSs typically need to be continuously operational, may be sensitive to active scanning, and often provide only limited in-depth analysis capabilities. To address these challenges, we propose to utilize the concept of digital twins for enhancing cyber situational awareness. Digital twins, i.e., virtual replicas of systems, can run in parallel to their physical counterparts and allow deep inspection of their behavior without the risk of disrupting operational technology services. This paper reports our work in progress to develop a cyber situational awareness framework based on digital twins that provides a profound, holistic, and current view on the cyber situation that CPSs are in. More specifically, we present a prototype that provides real-time visualization features (i.e., system topology, program variables of devices) and enables a thorough, repeatable investigation process on a logic and network level. A brief explanation of technological use cases and outlook on future development efforts completes this work.
2020-02-10
Selvi J., Anitha Gnana, kalavathy G., Maria.  2019.  Probing Image and Video Steganography Based On Discrete Wavelet and Discrete Cosine Transform. 2019 Fifth International Conference on Science Technology Engineering and Mathematics (ICONSTEM). 1:21–24.

Now-a-days, video steganography has developed for a secured communication among various users. The two important factor of steganography method are embedding potency and embedding payload. Here, a Multiple Object Tracking (MOT) algorithmic programs used to detect motion object, also shows foreground mask. Discrete wavelet Transform (DWT) and Discrete Cosine Transform (DCT) are used for message embedding and extraction stage. In existing system Least significant bit method was proposed. This technique of hiding data may lose some data after some file transformation. The suggested Multiple object tracking algorithm increases embedding and extraction speed, also protects secret message against various attackers.

2019-10-28
Trunov, Artem S., Voronova, Lilia I., Voronov, Vyacheslav I., Ayrapetov, Dmitriy P..  2018.  Container Cluster Model Development for Legacy Applications Integration in Scientific Software System. 2018 IEEE International Conference "Quality Management, Transport and Information Security, Information Technologies" (IT QM IS). :815–819.
Feature of modern scientific information systems is their integration with computing applications, providing distributed computer simulation and intellectual processing of Big Data using high-efficiency computing. Often these software systems include legacy applications in different programming languages, with non-standardized interfaces. To solve the problem of applications integration, containerization systems are using that allow to configure environment in the shortest time to deploy software system. However, there are no such systems for computer simulation systems with large number of nodes. The article considers the actual task of combining containers into a cluster, integrating legacy applications to manage the distributed software system MD-SLAG-MELT v.14, which supports high-performance computing and visualization of the computer experiments results. Testing results of the container cluster including automatic load sharing module for MD-SLAG-MELT system v.14. are given.
2019-09-04
Liang, J., Jiang, L., Cao, L., Li, L., Hauptmann, A..  2018.  Focal Visual-Text Attention for Visual Question Answering. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. :6135–6143.
Recent insights on language and vision with neural networks have been successfully applied to simple single-image visual question answering. However, to tackle real-life question answering problems on multimedia collections such as personal photos, we have to look at whole collections with sequences of photos or videos. When answering questions from a large collection, a natural problem is to identify snippets to support the answer. In this paper, we describe a novel neural network called Focal Visual-Text Attention network (FVTA) for collective reasoning in visual question answering, where both visual and text sequence information such as images and text metadata are presented. FVTA introduces an end-to-end approach that makes use of a hierarchical process to dynamically determine what media and what time to focus on in the sequential data to answer the question. FVTA can not only answer the questions well but also provides the justifications which the system results are based upon to get the answers. FVTA achieves state-of-the-art performance on the MemexQA dataset and competitive results on the MovieQA dataset.
2019-06-24
Naeem, H., Guo, B., Naeem, M. R..  2018.  A light-weight malware static visual analysis for IoT infrastructure. 2018 International Conference on Artificial Intelligence and Big Data (ICAIBD). :240–244.

Recently a huge trend on the internet of things (IoT) and an exponential increase in automated tools are helping malware producers to target IoT devices. The traditional security solutions against malware are infeasible due to low computing power for large-scale data in IoT environment. The number of malware and their variants are increasing due to continuous malware attacks. Consequently, the performance improvement in malware analysis is critical requirement to stop rapid expansion of malicious attacks in IoT environment. To solve this problem, the paper proposed a novel framework for classifying malware in IoT environment. To achieve flne-grained malware classification in suggested framework, the malware image classification system (MICS) is designed for representing malware image globally and locally. MICS first converts the suspicious program into the gray-scale image and then captures hybrid local and global malware features to perform malware family classification. Preliminary experimental outcomes of MICS are quite promising with 97.4% classification accuracy on 9342 windows suspicious programs of 25 families. The experimental results indicate that proposed framework is quite capable to process large-scale IoT malware.

2019-06-17
Väisänen, Teemu, Noponen, Sami, Latvala, Outi-Marja, Kuusijärvi, Jarkko.  2018.  Combining Real-Time Risk Visualization and Anomaly Detection. Proceedings of the 12th European Conference on Software Architecture: Companion Proceedings. :55:1-55:7.

Traditional risk management produces a rather static listing of weaknesses, probabilities and mitigations. Large share of cyber security risks realize through computer networks. These attacks or attack attempts produce events that are detected by various monitoring techniques such as Intrusion Detection Systems (IDS). Often the link between detecting these potentially dangerous real-time events and risk management process is lacking, or completely missing. This paper presents means for transferring and visualizing the network events in the risk management instantly with a tool called Metrics Visualization System (MVS). The tool is used to dynamically visualize network security events of a Terrestrial Trunked Radio (TETRA) network running in Software Defined Networking (SDN) context as a case study. Visualizations are presented with a treelike graph, that gives a quick easily understandable overview of the cyber security situation. This paper also discusses what network security events are monitored and how they affect the more general risk levels. The major benefit of this approach is that the risk analyst is able to map the designed risk tree/security metrics into actual real-time events and view the system's security posture with the help of a runtime visualization view.

Garae, J., Ko, R. K. L., Apperley, M..  2018.  A Full-Scale Security Visualization Effectiveness Measurement and Presentation Approach. 2018 17th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/ 12th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :639–650.
What makes a security visualization effective? How do we measure visualization effectiveness in the context of investigating, analyzing, understanding and reporting cyber security incidents? Identifying and understanding cyber-attacks are critical for decision making - not just at the technical level, but also the management and policy-making levels. Our research studied both questions and extends our Security Visualization Effectiveness Measurement (SvEm) framework by providing a full-scale effectiveness approach for both theoretical and user-centric visualization techniques. Our framework facilitates effectiveness through interactive three-dimensional visualization to enhance both single and multi-user collaboration. We investigated effectiveness metrics including (1) visual clarity, (2) visibility, (3) distortion rates and (4) user response (viewing) times. The SvEm framework key components are: (1) mobile display dimension and resolution factor, (2) security incident entities, (3) user cognition activators and alerts, (4) threat scoring system, (5) working memory load and (6) color usage management. To evaluate our full-scale security visualization effectiveness framework, we developed VisualProgger - a real-time security visualization application (web and mobile) visualizing data provenance changes in SvEm use cases. Finally, the SvEm visualizations aims to gain the users' attention span by ensuring a consistency in the viewer's cognitive load, while increasing the viewer's working memory load. In return, users have high potential to gain security insights in security visualization. Our evaluation shows that viewers perform better with prior knowledge (working memory load) of security events and that circular visualization designs attract and maintain the viewer's attention span. These discoveries revealed research directions for future work relating to measurement of security visualization effectiveness.
2019-06-10
Kornish, D., Geary, J., Sansing, V., Ezekiel, S., Pearlstein, L., Njilla, L..  2018.  Malware Classification Using Deep Convolutional Neural Networks. 2018 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). :1-6.

In recent years, deep convolution neural networks (DCNNs) have won many contests in machine learning, object detection, and pattern recognition. Furthermore, deep learning techniques achieved exceptional performance in image classification, reaching accuracy levels beyond human capability. Malware variants from similar categories often contain similarities due to code reuse. Converting malware samples into images can cause these patterns to manifest as image features, which can be exploited for DCNN classification. Techniques for converting malware binaries into images for visualization and classification have been reported in the literature, and while these methods do reach a high level of classification accuracy on training datasets, they tend to be vulnerable to overfitting and perform poorly on previously unseen samples. In this paper, we explore and document a variety of techniques for representing malware binaries as images with the goal of discovering a format best suited for deep learning. We implement a database for malware binaries from several families, stored in hexadecimal format. These malware samples are converted into images using various approaches and are used to train a neural network to recognize visual patterns in the input and classify malware based on the feature vectors. Each image type is assessed using a variety of learning models, such as transfer learning with existing DCNN architectures and feature extraction for support vector machine classifier training. Each technique is evaluated in terms of classification accuracy, result consistency, and time per trial. Our preliminary results indicate that improved image representation has the potential to enable more effective classification of new malware.

2019-03-22
Quweider, M., Lei, H., Zhang, L., Khan, F..  2018.  Managing Big Data in Visual Retrieval Systems for DHS Applications: Combining Fourier Descriptors and Metric Space Indexing. 2018 1st International Conference on Data Intelligence and Security (ICDIS). :188-193.

Image retrieval systems have been an active area of research for more than thirty years progressively producing improved algorithms that improve performance metrics, operate in different domains, take advantage of different features extracted from the images to be retrieved, and have different desirable invariance properties. With the ever-growing visual databases of images and videos produced by a myriad of devices comes the challenge of selecting effective features and performing fast retrieval on such databases. In this paper, we incorporate Fourier descriptors (FD) along with a metric-based balanced indexing tree as a viable solution to DHS (Department of Homeland Security) needs to for quick identification and retrieval of weapon images. The FDs allow a simple but effective outline feature representation of an object, while the M-tree provide a dynamic, fast, and balanced search over such features. Motivated by looking for applications of interest to DHS, we have created a basic guns and rifles databases that can be used to identify weapons in images and videos extracted from media sources. Our simulations show excellent performance in both representation and fast retrieval speed.

2019-02-14
Sun, A., Gao, G., Ji, T., Tu, X..  2018.  One Quantifiable Security Evaluation Model for Cloud Computing Platform. 2018 Sixth International Conference on Advanced Cloud and Big Data (CBD). :197-201.

Whatever one public cloud, private cloud or a mixed cloud, the users lack of effective security quantifiable evaluation methods to grasp the security situation of its own information infrastructure on the whole. This paper provides a quantifiable security evaluation system for different clouds that can be accessed by consistent API. The evaluation system includes security scanning engine, security recovery engine, security quantifiable evaluation model, visual display module and etc. The security evaluation model composes of a set of evaluation elements corresponding different fields, such as computing, storage, network, maintenance, application security and etc. Each element is assigned a three tuple on vulnerabilities, score and repair method. The system adopts ``One vote vetoed'' mechanism for one field to count its score and adds up the summary as the total score, and to create one security view. We implement the quantifiable evaluation for different cloud users based on our G-Cloud platform. It shows the dynamic security scanning score for one or multiple clouds with visual graphs and guided users to modify configuration, improve operation and repair vulnerabilities, so as to improve the security of their cloud resources.

2019-01-21
Cho, S., Han, I., Jeong, H., Kim, J., Koo, S., Oh, H., Park, M..  2018.  Cyber Kill Chain based Threat Taxonomy and its Application on Cyber Common Operational Picture. 2018 International Conference On Cyber Situational Awareness, Data Analytics And Assessment (Cyber SA). :1–8.

Over a decade, intelligent and persistent forms of cyber threats have been damaging to the organizations' cyber assets and missions. In this paper, we analyze current cyber kill chain models that explain the adversarial behavior to perform advanced persistent threat (APT) attacks, and propose a cyber kill chain model that can be used in view of cyber situation awareness. Based on the proposed cyber kill chain model, we propose a threat taxonomy that classifies attack tactics and techniques for each attack phase using CAPEC, ATT&CK that classify the attack tactics, techniques, and procedures (TTPs) proposed by MITRE. We also implement a cyber common operational picture (CyCOP) to recognize the situation of cyberspace. The threat situation can be represented on the CyCOP by applying cyber kill chain based threat taxonomy.

2019-01-16
Baykara, M., Güçlü, S..  2018.  Applications for detecting XSS attacks on different web platforms. 2018 6th International Symposium on Digital Forensic and Security (ISDFS). :1–6.

Today, maintaining the security of the web application is of great importance. Sites Intermediate Script (XSS) is a security flaw that can affect web applications. This error allows an attacker to add their own malicious code to HTML pages that are displayed to the user. Upon execution of the malicious code, the behavior of the system or website can be completely changed. The XSS security vulnerability is used by attackers to steal the resources of a web browser such as cookies, identity information, etc. by adding malicious Java Script code to the victim's web applications. Attackers can use this feature to force a malicious code worker into a Web browser of a user, since Web browsers support the execution of embedded commands on web pages to enable dynamic web pages. This work has been proposed as a technique to detect and prevent manipulation that may occur in web sites, and thus to prevent the attack of Site Intermediate Script (XSS) attacks. Ayrica has developed four different languages that detect XSS explanations with Asp.NET, PHP, PHP and Ruby languages, and the differences in the detection of XSS attacks in environments provided by different programming languages.

2018-12-10
Zhu, J., Liapis, A., Risi, S., Bidarra, R., Youngblood, G. M..  2018.  Explainable AI for Designers: A Human-Centered Perspective on Mixed-Initiative Co-Creation. 2018 IEEE Conference on Computational Intelligence and Games (CIG). :1–8.

Growing interest in eXplainable Artificial Intelligence (XAI) aims to make AI and machine learning more understandable to human users. However, most existing work focuses on new algorithms, and not on usability, practical interpretability and efficacy on real users. In this vision paper, we propose a new research area of eXplainable AI for Designers (XAID), specifically for game designers. By focusing on a specific user group, their needs and tasks, we propose a human-centered approach for facilitating game designers to co-create with AI/ML techniques through XAID. We illustrate our initial XAID framework through three use cases, which require an understanding both of the innate properties of the AI techniques and users' needs, and we identify key open challenges.

Schonherr, L., Zeiler, S., Kolossa, D..  2017.  Spoofing detection via simultaneous verification of audio-visual synchronicity and transcription. 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). :591–598.

Acoustic speaker recognition systems are very vulnerable to spoofing attacks via replayed or synthesized utterances. One possible countermeasure is audio-visual speaker recognition. Nevertheless, the addition of the visual stream alone does not prevent spoofing attacks completely and only provides further information to assess the authenticity of the utterance. Many systems consider audio and video modalities independently and can easily be spoofed by imitating only a single modality or by a bimodal replay attack with a victim's photograph or video. Therefore, we propose the simultaneous verification of the data synchronicity and the transcription in a challenge-response setup. We use coupled hidden Markov models (CHMMs) for a text-dependent spoofing detection and introduce new features that provide information about the transcriptions of the utterance and the synchronicity of both streams. We evaluate the features for various spoofing scenarios and show that the combination of the features leads to a more robust recognition, also in comparison to the baseline method. Additionally, by evaluating the data on unseen speakers, we show the spoofing detection to be applicable in speaker-independent use-cases.

2018-11-19
Chelaramani, S., Jha, A., Namboodiri, A. M..  2018.  Cross-Modal Style Transfer. 2018 25th IEEE International Conference on Image Processing (ICIP). :2157–2161.

We, humans, have the ability to easily imagine scenes that depict sentences such as ``Today is a beautiful sunny day'' or ``There is a Christmas feel, in the air''. While it is hard to precisely describe what one person may imagine, the essential high-level themes associated with such sentences largely remains the same. The ability to synthesize novel images that depict the feel of a sentence is very useful in a variety of applications such as education, advertisement, and entertainment. While existing papers tackle this problem given a style image, we aim to provide a far more intuitive and easy to use solution that synthesizes novel renditions of an existing image, conditioned on a given sentence. We present a method for cross-modal style transfer between an English sentence and an image, to produce a new image that imbibes the essential theme of the sentence. We do this by modifying the style transfer mechanism used in image style transfer to incorporate a style component derived from the given sentence. We demonstrate promising results using the YFCC100m dataset.

Grinstein, E., Duong, N. Q. K., Ozerov, A., Pérez, P..  2018.  Audio Style Transfer. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :586–590.

``Style transfer'' among images has recently emerged as a very active research topic, fuelled by the power of convolution neural networks (CNNs), and has become fast a very popular technology in social media. This paper investigates the analogous problem in the audio domain: How to transfer the style of a reference audio signal to a target audio content? We propose a flexible framework for the task, which uses a sound texture model to extract statistics characterizing the reference audio style, followed by an optimization-based audio texture synthesis to modify the target content. In contrast to mainstream optimization-based visual transfer method, the proposed process is initialized by the target content instead of random noise and the optimized loss is only about texture, not structure. These differences proved key for audio style transfer in our experiments. In order to extract features of interest, we investigate different architectures, whether pre-trained on other tasks, as done in image style transfer, or engineered based on the human auditory system. Experimental results on different types of audio signal confirm the potential of the proposed approach.

Shinya, A., Tung, N. D., Harada, T., Thawonmas, R..  2017.  Object-Specific Style Transfer Based on Feature Map Selection Using CNNs. 2017 Nicograph International (NicoInt). :88–88.

We propose a method for transferring an arbitrary style to only a specific object in an image. Style transfer is the process of combining the content of an image and the style of another image into a new image. Our results show that the proposed method can realize style transfer to specific object.