Biblio

Found 221 results

Filters: Keyword is visualization

2022-01-25

Taspinar, Samet, Mohanty, Manoranjan, Memon, Nasir. 2021. Effect of Video Pixel-Binning on Source Attribution of Mixed Media. ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). :2545–2549.

Photo Response Non-Uniformity (PRNU) noise obtained from images or videos is used as a camera fingerprint to attribute visual objects captured by a camera. The PRNU-based source attribution method, however, fails when there is misalignment between the fingerprint and the query object. One example of such a misalignment, which has been overlooked in the field, is caused by the in-camera resizing technique that a video may have been subjected to. This paper investigates the attribution of visual media in the context of matching a video query object to an image fingerprint or vice versa. Specifically this paper focuses on improving camera attribution performance by taking into account the effects of binning, a commonly used in-camera resizing technique applied to video. We experimentally show that the True Positive Rate (TPR) obtained when binning is considered is approximately 3% higher.
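As a rough illustration of what binning does to a sensor readout, the sketch below averages non-overlapping 2x2 blocks of a grayscale frame. The function name `bin_2x2` and the choice of plain averaging are assumptions made for illustration; the abstract does not say which binning scheme a given camera applies.

```python
def bin_2x2(frame):
    """Average each non-overlapping 2x2 block of a 2D list of pixel values.

    Assumes an even number of rows and columns (a simplification for this sketch).
    """
    rows, cols = len(frame), len(frame[0])
    if rows % 2 or cols % 2:
        raise ValueError("frame dimensions must be even")
    binned = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            block_sum = (frame[r][c] + frame[r][c + 1]
                         + frame[r + 1][c] + frame[r + 1][c + 1])
            out_row.append(block_sum / 4)
        binned.append(out_row)
    return binned


# Hypothetical 4x4 frame; real sensor data would be far larger.
frame = [
    [10, 20, 30, 40],
    [30, 40, 50, 60],
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
binned_frame = bin_2x2(frame)
```

The binned frame has half the resolution of the input in each dimension, so a fingerprint estimated at full resolution no longer lines up with it pixel for pixel. That is the kind of misalignment the abstract refers to.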

Kozlova, Liudmila P., Kozlova, Olga A. 2021. Expanding Space with Augmented Reality. 2021 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (ElConRus). :965–967.

Replacing real life with the virtual space has long ceased to be a theory. Among the whole variety of visualization, systems that allow projecting non-existent objects into real-world space are especially distinguished. Thus, augmented reality technology has found its application in many different fields. The article discusses the general concepts and principles of building augmented reality systems.

Gonsher, Ian, Lei, Zhenhong. 2021. Prototype of Force Feedback Tool for Mixed Reality Applications. 2021 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct). :508–509.

This prototype demonstrates the viability of manipulating both physical and virtual objects with the same tool in order to maintain object permanence across both modes of interaction. Using oppositional force feedback, provided by a servo, and an augmented visual interface, provided by the user’s smartphone, this tool simulates the look and feel of a physical object within an augmented environment. Additionally, the tool is also able to manipulate physical objects that are not part of the augmented reality, such as a physical nut. By integrating both modes of interaction into the same tool, users can fluidly move between these different modes of interaction, manipulating both physical and virtual objects as the need arises. By overlaying this kind of visual and haptic augmentation onto a common tool such as a pair of pliers, we hope to further explore scenarios for collaborative telepresence in future work.

2022-01-11

Li, Xiaolong, Zhao, Tengteng, Zhang, Wei, Gan, Zhiqiang, Liu, Fugang. 2021. A Visual Analysis Framework of Attack Paths Based on Network Traffic. 2021 IEEE International Conference on Power Electronics, Computer Applications (ICPECA). :232–237.

With the rapid development of the Internet, cyberspace security has become a potentially huge problem. At the same time, cyberspace vulnerabilities are being disclosed faster and faster. Traditional protection methods based on known features cannot effectively defend against new network attacks. A network attack is no longer a single vulnerability exploit, but an APT attack based on multiple complicated methods. Cyberspace attacks have become "rationalized" on the surface. Currently, there is a lot of research on the visualization of attack paths, but there is no overall plan to reproduce the attack path. Most research focuses on detecting and characterizing individual cyberspace attacks based on a single behavior, which loses the ability to help security personnel understand the complete attack behavior of attackers. The key contribution of this paper is to collect the attackers' aggressive behavior by a reverse retrospective method based on an actual shooting range environment. By finding attack nodes and dividing offensive behavior into time series, we can characterize the attacker's behavior path vividly and comprehensively.

2022-01-10

Ibrahim, Mariam, Nabulsi, Intisar. 2021. Security Analysis of Smart Home Systems Applying Attack Graph. 2021 Fifth World Conference on Smart Trends in Systems Security and Sustainability (WorldS4). :230–234.

In this work, security analysis of a Smart Home System (SHS) is inspected. The paper focuses on describing common and likely cyber security threats against SHS. This includes both their influence on human privacy and safety. The SHS is properly presented and formed applying Architecture Analysis and Design Language (AADL), exhibiting the system layout, weaknesses, attack practices, besides their requirements and post settings. The obtained model is later inspected along with a security requirement with JKind model tester software for security endangerment. The overall attack graph causing system compromise is graphically given using Graphviz.
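To give a sense of how an attack graph can be handed to Graphviz, here is a minimal sketch that serializes a list of edges into DOT text. The node names, edge labels, and the helper `attack_graph_to_dot` are hypothetical and are not taken from the paper's AADL model or its JKind output.

```python
def attack_graph_to_dot(edges):
    """Serialize (source, target, label) triples as a Graphviz DOT digraph."""
    lines = ["digraph attack_graph {"]
    for src, dst, label in edges:
        lines.append(f'    "{src}" -> "{dst}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical attack steps against a smart home, for illustration only.
edges = [
    ("attacker", "wifi_router", "weak password"),
    ("wifi_router", "smart_lock", "unencrypted command"),
]
dot_text = attack_graph_to_dot(edges)
```

Saving `dot_text` to a file such as `graph.dot` and running `dot -Tpng graph.dot -o graph.png` would render the graph, assuming Graphviz is installed.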

2021-12-21

Mishra, Srinivas, Pradhan, Sateesh Kumar, Rath, Subhendu Kumar. 2021. Detection of Zero-Day Attacks in Network IDS through High Performance Soft Computing. 2021 International Conference on Artificial Intelligence and Smart Systems (ICAIS). :1199–1204.

The ever-evolving nature of computers has implications for data and information and for the threats that they are exposed to. With the exponential growth of the internet, data breaches are highly likely, as unauthorized and ill-minded users find new ways to get access to data that they can use for their plans. Most systems today have well-designed measures that examine information for any abnormal behavior (Zero Day Attacks) compared to what has been seen and experienced over the years. These checks are done based on a predefined identity (signature) of information. Such systems are termed Intrusion Detection Systems (IDS). The concept of IDS revolves around validation of data and/or information and detecting unauthorized access attempts made with the intention of manipulating data. High Performance Soft Computing (HPSC) aims to internalize cumulative adoption of traditional and modern attempts to breach data security and expose it to high scale damage and altercations. Our effort in this paper is to emphasize the multifaceted tactic and rationalize important functionalities of IDS available at the disposal of HPSC.

2021-12-20

Ma, Chiyuan, Zuo, Yi, Chen, C. L. Philip, Li, Tieshan. 2021. A Weight-Adaptive Algorithm of Multi Feature Fusion Based on Kernel Correlation Filtering for Target Tracking. 2021 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC). :274–279.

Most correlation filter target tracking algorithms suffer from poor tracking accuracy on complex scene images of the target and from scale change problems. To address these issues, this paper proposes an adaptive multi-feature fusion algorithm with scale change correlation filtering tracking. Our algorithm is based on the rapid and simple Kernel-Correlated Filtering (KCF) tracker, and achieves complementarity among image features by fusing multiple features, Color Name (CN), Histogram of Oriented Gradient (HOG) and Local Binary Pattern (LBP), with weights adjusted by visual evaluation functions. The proposed algorithm introduces scale pooling and bilinear interpolation to adjust the target template size. Experiments on the OTB-2015 dataset of 100 video frames are compared with several trackers, and the precision and success ratio of our algorithm on complex scene tracking problems are 17.7% and 32.1% respectively compared to the KCF baseline.

Vadlamani, Aparna, Kalicheti, Rishitha, Chimalakonda, Sridhar. 2021. APIScanner - Towards Automated Detection of Deprecated APIs in Python Libraries. 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion). :5–8.

Python libraries are widely used for machine learning and scientific computing tasks today. APIs in Python libraries are deprecated due to feature enhancements and bug fixes in the same way as in other languages. These deprecated APIs are discouraged from being used in further software development. Manually detecting and replacing deprecated APIs is a tedious and time-consuming task due to the large number of API calls used in the projects. Moreover, the lack of proper documentation for these deprecated APIs makes the task challenging. To address this challenge, we propose an algorithm and a tool, APIScanner, that automatically detects deprecated APIs in Python libraries. This algorithm parses the source code of the libraries using abstract syntax trees (ASTs) and identifies the deprecated APIs via decorators, hard-coded warnings or comments. APIScanner is a Visual Studio Code Extension that highlights and warns the developer on the use of deprecated API elements while writing the source code. The tool can help developers to avoid using deprecated API elements without the execution of code. We tested our algorithm and tool on six popular Python libraries, which detected 838 of 871 deprecated API elements. Demo of APIScanner: https://youtu.be/1hy_ugf-iek. Documentation, tool, and source code can be found here: https://rishitha957.github.io/APIScanner.
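The AST-based idea can be sketched with Python's standard `ast` module. This is not APIScanner's actual implementation: the helper names, the sample source, and the two heuristics (a decorator whose name contains "deprecat", or a reference to `DeprecationWarning` in the function body) are assumptions for illustration. Comments are not covered here because `ast` discards them.

```python
import ast

# Hypothetical library source to scan; it is only parsed, never executed.
SOURCE = '''
import warnings

@deprecated
def old_decorated():
    pass

def old_warned():
    warnings.warn("use new_api instead", DeprecationWarning)

def current_api():
    return 42
'''


def _decorator_name(node):
    """Return the trailing identifier of a decorator expression, or None."""
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Attribute):
        return node.attr
    if isinstance(node, ast.Name):
        return node.id
    return None


def find_deprecated(source):
    """Return sorted names of functions flagged by either heuristic."""
    flagged = set()
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        # Heuristic 1: a decorator whose name contains "deprecat".
        for dec in func.decorator_list:
            name = _decorator_name(dec)
            if name and "deprecat" in name.lower():
                flagged.add(func.name)
        # Heuristic 2: DeprecationWarning is referenced inside the function.
        for inner in ast.walk(func):
            if isinstance(inner, ast.Name) and inner.id == "DeprecationWarning":
                flagged.add(func.name)
    return sorted(flagged)


deprecated_names = find_deprecated(SOURCE)
```

Because the source is parsed rather than imported, this kind of check works without executing the library, which matches the static detection described in the abstract.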

2021-11-29

Carroll, Fiona, Legg, Phil, Bønkel, Bastian. 2020. The Visual Design of Network Data to Enhance Cyber Security Awareness of the Everyday Internet User. 2020 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA). :1–7.

Technology and the use of online services are very prevalent across much of our everyday lives. As our digital interactions continue to grow, there is a need to improve public awareness of the risks to our personal online privacy and security. Designing for cyber security awareness has never been so important. In this work, we consider people's current impressions towards their privacy and security online. We also explore how abnormal network activity data can be visually conveyed to afford a heightened cyber security awareness. In detail, the paper documents the different effects of visual variables in an edge and node DoS visualisation to depict abnormally high volumes of traffic. The results from two studies show that people are generally becoming more concerned about their privacy and security online. Moreover, we have found that the more focus-based visual techniques (i.e. blur) and geometry-based techniques (i.e. jaggedness and sketchiness) afford stronger impressions of uncertainty from abnormally high volumes of network traffic. In terms of security, these impressions and feelings alert the end-user that something is not quite as it should be and hence develop a heightened cyber security awareness.

2021-11-08

Damasevicius, Robertas, Toldinas, Jevgenijus, Venckauskas, Algimantas, Grigaliunas, Sarunas, Morkevicius, Nerijus. 2020. Technical Threat Intelligence Analytics: What and How to Visualize for Analytic Process. 2020 24th International Conference Electronics. :1–4.

Visual Analytics uses data visualization techniques to enable compelling data analysis through graphical and visual portrayal. In the domain of cybersecurity, convincing visual representation of data makes it possible to ascertain valuable observations that allow domain experts to construct efficient cyberattack mitigation strategies and provide useful decision support. We present a survey of visual analytics tools and methods in the domain of cybersecurity. We explore and discuss Technical Threat Intelligence visualization tools using the Five Question Method. We conclude the analysis of the works using Moody's Physics of Notations and the VIS4ML ontology as a methodological background of the visual analytics process. We summarize our analysis as a high-level model of visual analytics for cybersecurity threat analysis.

2021-10-12

Deng, Perry, Linsky, Cooper, Wright, Matthew. 2020. Weaponizing Unicodes with Deep Learning - Identifying Homoglyphs with Weakly Labeled Data. 2020 IEEE International Conference on Intelligence and Security Informatics (ISI). :1–6.

Visually similar characters, or homoglyphs, can be used to perform social engineering attacks or to evade spam and plagiarism detectors. It is thus important to understand the capabilities of an attacker to identify homoglyphs - particularly ones that have not been previously spotted - and leverage them in attacks. We investigate a deep-learning model using embedding learning, transfer learning, and augmentation to determine the visual similarity of characters and thereby identify potential homoglyphs. Our approach uniquely takes advantage of weak labels that arise from the fact that most characters are not homoglyphs. Our model drastically outperforms the Normalized Compression Distance approach on pairwise homoglyph identification, for which we achieve an average precision of 0.97. We also present the first attempt at clustering homoglyphs into sets of equivalence classes, which is more efficient than pairwise information for security practitioners to quickly look up homoglyphs or to normalize confusable string encodings. To measure clustering performance, we propose a metric (mBIOU) building on the classic Intersection-Over-Union (IOU) metric. Our clustering method achieves 0.592 mBIOU, compared to 0.430 for the naive baseline. We also use our model to predict over 8,000 previously unknown homoglyphs, and find good early indications that many of these may be true positives. Source code and list of predicted homoglyphs are uploaded to Github: https://github.com/PerryXDeng/weaponizing_unicode.
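The abstract does not define mBIOU, so the sketch below shows only the classic Intersection-over-Union it builds on, applied to two sets of characters. The function name `iou` and the example clusters are hypothetical.

```python
def iou(cluster_a, cluster_b):
    """Classic Intersection-over-Union of two collections treated as sets."""
    a, b = set(cluster_a), set(cluster_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen for this sketch: two empty sets score 0
    return len(a & b) / len(union)


# Hypothetical clusters of look-alike characters, written as code points:
# U+0061 Latin a, U+0430 Cyrillic a, U+0251 Latin alpha, U+03B1 Greek alpha.
predicted = {"\u0061", "\u0430", "\u0251"}
reference = {"\u0061", "\u0430", "\u03b1"}
score = iou(predicted, reference)
```

Here the two clusters share two characters out of four distinct ones, so the score is 0.5.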

2021-10-04

Dong, Xianzhe, He, Xinyi, Liang, Tianlin, Shi, Dai, Tao, Dan. 2020. Entropy based Security Rating Evaluation Scheme for Pattern Lock. 2020 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-Taiwan). :1–2.

To better protect users' privacy, various authentication mechanisms have been applied on smartphones. Android pattern lock has been widely used because it is easy to memorize; however, simple patterns are more vulnerable to attacks such as shoulder surfing. In this paper, we propose a security rating evaluation scheme based on pattern lock. In particular, an entropy function of a pattern lock can be calculated, which is decided by five kinds of attributes: size, length, angle, overlap and intersection for quantitative evaluation of pattern lock. Thus, the security rating thresholds will be determined by the distribution of entropy values. Finally, we design and develop an app based on Android Studio, which is used to verify the effectiveness of our proposed security rating evaluation scheme.

2021-09-21

Sartoli, Sara, Wei, Yong, Hampton, Shane. 2020. Malware Classification Using Recurrence Plots and Deep Neural Network. 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA). :901–906.

In this paper, we introduce a method for visualizing and classifying malware binaries. A malware binary consists of a series of data points of compiled machine codes that represent programming components. The occurrence and recurrence behavior of these components is determined by the common tasks malware samples in a particular family carry out. Thus, we view a malware binary as a series of emissions generated by an underlying stochastic process and use recurrence plots to transform malware binaries into two-dimensional texture images. We observe that recurrence plot-based malware images have significant visual similarities within the same family and are different from samples in other families. We apply deep CNN classifiers to classify malware samples. The proposed approach does not require creating malware signature or manual feature engineering. Our preliminary experimental results show that the proposed malware representation leads to a higher and more stable accuracy in comparison to directly transforming malware binaries to gray-scale images.
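A recurrence plot in its simplest form marks which pairs of positions in a sequence hold similar values. The sketch below builds that binary matrix from raw byte values; treating each byte as a scalar state and using an absolute-difference threshold `eps` are assumptions made here, and the paper's construction may differ.

```python
def recurrence_plot(series, eps=0):
    """Binary recurrence matrix: R[i][j] = 1 when |x_i - x_j| <= eps."""
    return [[1 if abs(xi - xj) <= eps else 0 for xj in series]
            for xi in series]


# Hypothetical four-byte snippet standing in for a malware binary.
binary = bytes([0x90, 0x90, 0xC3, 0x90])
plot = recurrence_plot(list(binary))
```

The result is a square, symmetric 0/1 matrix with ones on the diagonal, which can be saved as a texture image and passed to an image classifier.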

Kartel, Anastasia, Novikova, Evgenia, Volosiuk, Aleksandr. 2020. Analysis of Visualization Techniques for Malware Detection. 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EIConRus). :337–340.

Due to the steady growth of various sophisticated types of malware, malware analysis systems are increasingly in demand. While there are various automatic approaches available to identify and detect malware, malware analysis is still a time-consuming process. Visualization-driven techniques may significantly increase the efficiency of the malware analysis process by involving the human visual system, which is a powerful pattern seeker. In this paper, the authors review different visualization methods and examine their features and the tasks solved with their help. The paper presents the most commonly used approaches and discusses open challenges in malware visual analytics.

2021-09-16

Sarker, Partha S., Singh Saini, Amandeep, Sajan, K S, Srivastava, Anurag K. 2020. CP-SAM: Cyber-Power Security Assessment and Resiliency Analysis Tool for Distribution System. 2020 Resilience Week (RWS). :188–193.

Cyber-power resiliency analysis of the distribution system is becoming critical with increase in adverse cyberevents. Distribution network operators need to assess and analyze the resiliency of the system utilizing the analytical tool with a carefully designed visualization and be driven by data and model-based analytics. This work introduces the Cyber-Physical Security Assessment Metric (CP-SAM) visualization tool to assist operators in ensuring the energy supply to critical loads during or after a cyber-attack. CP-SAM also provides decision support to operators utilizing measurement data and distribution power grid model and through well-designed visualization. The paper discusses the concepts of cyber-physical resiliency, software design considerations, open-source software components, and use cases for the tool to demonstrate the implementation and importance of the developed tool.

2021-08-31

Freitas, Lucas F., Nogueira, Adalberto R., Melgar, Max E. Vizcarra. 2020. Visual Authentication Scheme Based on Reversible Degradation and QR Code. 2020 Fourth World Conference on Smart Trends in Systems, Security and Sustainability (WorldS4). :58–63.

Two-Dimensional barcodes are used as a data authentication storage tool in several cryptographic architectures. This article describes a novel meaningful image authentication method for data validation using the Meaningless Reversible Degradation concept and QR Codes. The system architecture uses the Meaningless Reversible Degradation algorithm, systematic Reed-Solomon error correction codes, meaningful images, and QR Codes. The encoded images are the secret key for visual validation. The proposed work encodes any secret image file up to 3.892 Bytes and is decoded using data stored in a QR Code and a digital file retrieved through a wireless connection on a mobile device. The QR Code carries partially distorted and stream ciphered bits. The QR Code version is defined in conformity with the secret image file size. Once the QR Code data is decoded, the authenticating party retrieves a previously created Reed-Solomon redundancy file to correct the QR Code stored data. Finally, the secret image is decoded for user visual identification. A regular QR Code reader cannot decode any meaningful information when the QR Code is scanned. The presented cryptosystem improves the redundancy download file size up to 50% compared to a plaintext image transmission.

Di Noia, Tommaso, Malitesta, Daniele, Merra, Felice Antonio. 2020. TAaMR: Targeted Adversarial Attack against Multimedia Recommender Systems. 2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W). :1–8.

Deep learning classifiers are hugely vulnerable to adversarial examples, and their existence raised cybersecurity concerns in many tasks with an emphasis on malware detection, computer vision, and speech recognition. While there is a considerable effort to investigate attacks and defense strategies in these tasks, only limited work explores the influence of targeted attacks on input data (e.g., images, textual descriptions, audio) used in multimedia recommender systems (MR). In this work, we examine the consequences of applying targeted adversarial attacks against the product images of a visual-based MR. We propose a novel adversarial attack approach, called Target Adversarial Attack against Multimedia Recommender Systems (TAaMR), to investigate the modification of MR behavior when the images of a category of low recommended products (e.g., socks) are perturbed to misclassify the deep neural classifier towards the class of more recommended products (e.g., running shoes) with human-level slight images alterations. We explore the TAaMR approach studying the effect of two targeted adversarial attacks (i.e., FGSM and PGD) against input pictures of two state-of-the-art MR (i.e., VBPR and AMR). Extensive experiments on two real-world recommender fashion datasets confirmed the effectiveness of TAaMR in terms of recommendation lists changing while keeping the original human judgment on the perturbed images.

2021-08-17

Belman, Amith K., Paul, Tirthankar, Wang, Li, Iyengar, S. S., Śniatała, Paweł, Jin, Zhanpeng, Phoha, Vir V., Vainio, Seppo, Röning, Juha. 2020. Authentication by Mapping Keystrokes to Music: The Melody of Typing. 2020 International Conference on Artificial Intelligence and Signal Processing (AISP). :1–6.

Expressing Keystroke Dynamics (KD) in the form of sound opens new avenues to apply sound analysis techniques to KD. However, this mapping is not straightforward, as the varied feature space, differences in magnitudes of features and human interpretability of the music bring in complexities. We present a musical interface to KD by mapping keystroke features to music features. Music elements like melody, harmony, rhythm, pitch and tempo are varied with respect to the magnitude of their corresponding keystroke features. A pitch embedding technique makes the music discernible among users. Data from 30 users, who typed fixed strings multiple times on a desktop, show that these auditory signals are distinguishable between users by both standard classifiers (SVM, Random Forests and Naive Bayes) and humans alike.

2021-08-02

Liu, Weilun, Ge, Mengmeng, Kim, Dong Seong. 2020. Integrated Proactive Defense for Software Defined Internet of Things under Multi-Target Attacks. 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID). :767–774.

Due to the constrained resources and computational limitations of many Internet of Things (IoT) devices, conventional security protections, which require high computational overhead, are not suitable to be deployed. Thus, vulnerable IoT devices could be easily exploited by attackers to break into networks. In this paper, we employ cyber deception and moving target defense (MTD) techniques to proactively change the network topology with both real and decoy nodes with the support of software-defined networking (SDN) technology and investigate the impact of single-target and multi-target attacks on the effectiveness of the integrated mechanism via a hierarchical graphical security model with security metrics. We also implement a web-based visualization interface to show topology changes with highlighted attack paths. Finally, the qualitative security analysis is performed for a small-scale and SDN-supported IoT network with different combinations of decoy types and levels of attack intelligence. Simulation results show the integrated defense mechanism can introduce longer mean-time-to-security-failure and larger attack impact under the multi-target attack, compared with the single-target attack model. In addition, adaptive shuffling has better performance than fixed interval shuffling in terms of a higher proportion of decoy paths, longer mean-time-to-security-failure and largely reduced defense cost.

2021-07-27

Jiao, Rui, Zhang, Lan, Li, Anran. 2020. IEye: Personalized Image Privacy Detection. 2020 6th International Conference on Big Data Computing and Communications (BIGCOM). :91–95.

Massive numbers of images are being shared in a variety of ways, such as social networking. The rich content of images raises a serious concern for privacy. A great number of efforts have been devoted to designing mechanisms for privacy protection based on the assumption that the privacy is well defined. However, in practice, given a collection of images it is usually nontrivial to decide which parts of images should be protected, since the sensitivity of objects is context-dependent and user-dependent. To meet personalized privacy requirements of different users, we propose a system, IEye, to automatically detect private parts of images based on both common knowledge and personal knowledge. Specifically, for each user's images, multi-layered semantic graphs are constructed as feature representations of his/her images and a rule set is learned from those graphs, which describes his/her personalized privacy. In addition, an optimization algorithm is proposed to protect the user's privacy as well as minimize the loss of utility. We conduct experiments on two datasets; the results verify the effectiveness of our design to detect and protect personalized image privacy.

2021-07-07

Diamanti, Alessio, Vilchez, José Manuel Sanchez, Secci, Stefano. 2020. LSTM-based radiography for anomaly detection in softwarized infrastructures. 2020 32nd International Teletraffic Congress (ITC 32). :28–36.

Legacy and novel network services are expected to be migrated and designed to be deployed in fully virtualized environments. Starting with 5G, NFV becomes a formally required brick in the specifications, for services integrated within the infrastructure provider networks. This evolution leads to deployment of virtual resources on Virtual-Machine (VM)-based, container-based and/or server-less platforms, all calling for a deep virtualization of infrastructure components. Such a network softwarization also unleashes further logical network virtualization, easing multi-layered, multi-actor and multi-access services, so as to be able to fulfill high availability, security, privacy and resilience requirements. However, the resulting increase in component heterogeneity makes the detection and the characterization of anomalies difficult, hence the relationship between anomaly detection and corresponding reconfiguration of the NFV stack to mitigate anomalies. In this article we propose an unsupervised machine-learning data-driven approach based on Long Short-Term Memory (LSTM) autoencoders to detect and characterize anomalies in virtualized networking services. With a radiography visualization, this approach can spot and describe deviations from nominal parameter values of any virtualized network service by means of a lightweight and iterative mean-squared reconstruction error analysis of LSTM-based autoencoders. We implement and validate the proposed methodology through experimental tests on a vIMS proof-of-concept deployed using Kubernetes.
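The mean-squared reconstruction error check can be sketched independently of the autoencoder itself. In the snippet below the reconstructions are hard-coded stand-ins for what a trained LSTM autoencoder would output, and the threshold value and function names are hypothetical.

```python
def mse(original, reconstructed):
    """Mean squared error between a window and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("window and reconstruction must have equal length")
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def flag_anomalies(windows, reconstructions, threshold):
    """Mark a window as anomalous when its reconstruction error exceeds the threshold."""
    return [mse(w, r) > threshold for w, r in zip(windows, reconstructions)]


# Hypothetical metric windows and stand-in autoencoder outputs.
windows = [[1.0, 2.0], [1.0, 2.0]]
reconstructions = [[1.0, 2.0], [3.0, 4.0]]
flags = flag_anomalies(windows, reconstructions, threshold=1.0)
```

The first window is reconstructed exactly and passes, while the second has an error of 4.0 and is flagged. In practice the threshold would be derived from errors observed on nominal data.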

Beghdadi, Azeddine, Bezzine, Ismail, Qureshi, Muhammad Ali. 2020. A Perceptual Quality-driven Video Surveillance System. 2020 IEEE 23rd International Multitopic Conference (INMIC). :1–6.

Video-based surveillance systems often suffer from poor-quality video in an uncontrolled environment. This may strongly affect the performance of high-level tasks such as visual tracking, abnormal event detection or more generally scene understanding and interpretation. This work aims to demonstrate the impact and the importance of video quality in video surveillance systems. Here, we focus on the most important challenges and difficulties related to the perceptual quality of the acquired or transmitted images/videos in uncontrolled environments. In this paper, we propose an architecture of a smart surveillance system that incorporates the perceptual quality of acquired scenes. We study the behaviour of some state-of-the-art video quality metrics on some original and distorted sequences from a dedicated surveillance dataset. Through this study, it has been shown that some of the state-of-the-art image/video quality metrics do not work in the context of video-surveillance. This study opens a new research direction to develop the video quality metrics in the context of video surveillance and also to propose a new quality-driven framework of video surveillance system.

2021-06-28

Sarabia-Lopez, Jaime, Nuñez-Ramirez, Diana, Mata-Mendoza, David, Fragoso-Navarro, Eduardo, Cedillo-Hernandez, Manuel, Nakano-Miyatake, Mariko. 2020. Visible-Imperceptible Image Watermarking based on Reversible Data Hiding with Contrast Enhancement. 2020 International Conference on Mechatronics, Electronics and Automotive Engineering (ICMEAE). :29–34.

Currently, the use and production of multimedia data such as digital images have increased due to their wide use within smart devices and open networks. Although this has some advantages, it has generated several issues related to the infringement of intellectual property. Digital image watermarking is a promising solution to these issues. Considering the need to develop mechanisms to improve information security as well as protect the intellectual property of digital images, in this paper we propose a novel visible-imperceptible watermarking scheme based on reversible data hiding with contrast enhancement. In this way, a watermark logo is embedded imperceptibly in the spatial domain of the original image, so that the logo is revealed by applying reversible data hiding, which increases the contrast of the watermarked image and at the same time conceals a large number of data bits. These bits are later extracted and the watermarked image is restored to its original condition using the reversible functionality. Experimental results show the effectiveness of the proposed algorithm. A performance comparison with the current state-of-the-art is provided.

2021-06-01

Cideron, Geoffrey, Seurin, Mathieu, Strub, Florian, Pietquin, Olivier. 2020. HIGhER: Improving instruction following with Hindsight Generation for Experience Replay. 2020 IEEE Symposium Series on Computational Intelligence (SSCI). :225–232.

Language creates a compact representation of the world and allows the description of unlimited situations and objectives through compositionality. While these characterizations may foster instructing, conditioning or structuring interactive agent behavior, it remains an open problem to correctly relate language understanding and reinforcement learning in even simple instruction following scenarios. This joint learning problem is alleviated through expert demonstrations, auxiliary losses, or neural inductive biases. In this paper, we propose an orthogonal approach called Hindsight Generation for Experience Replay (HIGhER) that extends the Hindsight Experience Replay approach to the language-conditioned policy setting. Whenever the agent does not fulfill its instruction, HIGhER learns to output a new directive that matches the agent trajectory, and it relabels the episode with a positive reward. To do so, HIGhER learns to map a state into an instruction by using past successful trajectories, which removes the need to have external expert interventions to relabel episodes as in vanilla HER. We show the efficiency of our approach in the BabyAI environment, and demonstrate how it complements other instruction following methods.

Zheng, Wenbo, Yan, Lan, Gou, Chao, Wang, Fei-Yue. 2020. Webly Supervised Knowledge Embedding Model for Visual Reasoning. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :12442–12451.

Visual reasoning between visual images and natural language descriptions is a long-standing challenge in computer vision. While recent approaches offer great promise through compositionality or relational computing, most of them are hampered by the challenge of training with datasets containing only a limited number of images with ground-truth texts. Besides, it is extremely time-consuming and difficult to build a larger dataset by annotating millions of images with text descriptions, which may very likely lead to a biased model. Inspired by the success of webly supervised learning, we utilize readily-available web images with their noisy annotations for learning a robust representation. Our key idea is to draw on web images and corresponding tags along with fully annotated datasets in learning with knowledge embedding. We present a two-stage approach for the task that can augment knowledge through an effective embedding model with weakly supervised web data. This approach not only learns knowledge-based embeddings derived from key-value memory networks to make joint and full use of textual and visual information but also exploits the knowledge to improve performance with knowledge-based representation learning when applied to other general reasoning tasks. Experimental results on two benchmarks show that the proposed approach significantly improves performance compared with the state-of-the-art methods and guarantees the robustness of our model against visual reasoning tasks and other reasoning tasks.