Visible to the public Biblio

Filters: Keyword is Cognition  [Clear All Filters]
2023-07-19
Zuo, Langyi.  2022.  Comparison between the Traditional and Computerized Cognitive Training Programs in Treating Mild Cognitive Impairment. 2022 2nd International Conference on Electronic Information Engineering and Computer Technology (EIECT). :119—124.
MCI patients can be benefited from cognitive training programs to improve their cognitive capabilities or delay the decline of cognition. This paper evaluated three types of commonly seen categories of cognitive training programs (non-computerized / traditional cognitive training (TCT), computerized cognitive training (CCT), and virtual/augmented reality cognitive training (VR/AR CT)) based on six aspects: stimulation strength, user-friendliness, expandability, customizability/personalization, convenience, and motivation/atmosphere. In addition, recent applications of each type of CT were offered. Finally, a conclusion in which no single CT outperformed the others was derived, and the most applicable scenario of each type of CT was also provided.
2023-06-09
Liu, Chengwei, Chen, Sen, Fan, Lingling, Chen, Bihuan, Liu, Yang, Peng, Xin.  2022.  Demystifying the Vulnerability Propagation and Its Evolution via Dependency Trees in the NPM Ecosystem. 2022 IEEE/ACM 44th International Conference on Software Engineering (ICSE). :672—684.
Third-party libraries with rich functionalities facilitate the fast development of JavaScript software, leading to the explosive growth of the NPM ecosystem. However, it also brings new security threats that vulnerabilities could be introduced through dependencies from third-party libraries. In particular, the threats could be excessively amplified by transitive dependencies. Existing research only considers direct dependencies or reasoning transitive dependencies based on reachability analysis, which neglects the NPM-specific dependency resolution rules as adapted during real installation, resulting in wrongly resolved dependencies. Consequently, further fine-grained analysis, such as precise vulnerability propagation and their evolution over time in dependencies, cannot be carried out precisely at a large scale, as well as deriving ecosystem-wide solutions for vulnerabilities in dependencies. To fill this gap, we propose a knowledge graph-based dependency resolution, which resolves the inner dependency relations of dependencies as trees (i.e., dependency trees), and investigates the security threats from vulnerabilities in dependency trees at a large scale. Specifically, we first construct a complete dependency-vulnerability knowledge graph (DVGraph) that captures the whole NPM ecosystem (over 10 million library versions and 60 million well-resolved dependency relations). Based on it, we propose a novel algorithm (DTResolver) to statically and precisely resolve dependency trees, as well as transitive vulnerability propagation paths, for each package by taking the official dependency resolution rules into account. Based on that, we carry out an ecosystem-wide empirical study on vulnerability propagation and its evolution in dependency trees. Our study unveils lots of useful findings, and we further discuss the lessons learned and solutions for different stakeholders to mitigate the vulnerability impact in NPM based on our findings. For example, we implement a dependency tree based vulnerability remediation method (DTReme) for NPM packages, and receive much better performance than the official tool (npm audit fix).
2023-05-19
Iv, James K. Howes, Georgiou, Marios, Malozemoff, Alex J., Shrimpton, Thomas.  2022.  Security Foundations for Application-Based Covert Communication Channels. 2022 IEEE Symposium on Security and Privacy (SP). :1971—1986.
We introduce the notion of an application-based covert channel—or ABCC—which provides a formal syntax for describing covert channels that tunnel messages through existing protocols. Our syntax captures many recent systems, including DeltaShaper (PETS 2017) and Protozoa (CCS 2020). We also define what it means for an ABCC to be secure against a passive eavesdropper, and prove that suitable abstractions of existing censorship circumvention systems satisfy our security notion. In doing so, we define a number of important non-cryptographic security assumptions that are often made implicitly in prior work. We believe our formalisms may be useful to censorship circumvention developers for reasoning about the security of their systems and the associated security assumptions required.
2023-03-06
Gori, Monica, Volpe, Gualtiero, Cappagli, Giulia, Volta, Erica, Cuturi, Luigi F..  2021.  Embodied multisensory training for learning in primary school children. 2021 {IEEE} {International} {Conference} on {Development} and {Learning} ({ICDL}). :1–7.
Recent scientific results show that audio feedback associated with body movements can be fundamental during the development to learn new spatial concepts [1], [2]. Within the weDraw project [3], [4], we have investigated how this link can be useful to learn mathematical concepts. Here we present a study investigating how mathematical skills changes after multisensory training based on human-computer interaction (RobotAngle and BodyFraction activities). We show that embodied angle and fractions exploration associated with audio and visual feedback can be used in typical children to improve cognition of spatial mathematical concepts. We finally present the exploitation of our results: an online, optimized version of one of the tested activity to be used at school. The training result suggests that audio and visual feedback associated with body movements is informative for spatial learning and reinforces the idea that spatial representation development is based on sensory-motor interactions.
2023-03-03
Du, Mingshu, Ma, Yuan, Lv, Na, Chen, Tianyu, Jia, Shijie, Zheng, Fangyu.  2022.  An Empirical Study on the Quality of Entropy Sources in Linux Random Number Generator. ICC 2022 - IEEE International Conference on Communications. :559–564.
Random numbers are essential for communications security, as they are widely employed as secret keys and other critical parameters of cryptographic algorithms. The Linux random number generator (LRNG) is the most popular open-source software-based random number generator (RNG). The security of LRNG is influenced by the overall design, especially the quality of entropy sources. Therefore, it is necessary to assess and quantify the quality of the entropy sources which contribute the main randomness to RNGs. In this paper, we perform an empirical study on the quality of entropy sources in LRNG with Linux kernel 5.6, and provide the following two findings. We first analyze two important entropy sources: jiffies and cycles, and propose a method to predict jiffies by cycles with high accuracy. The results indicate that, the jiffies can be correctly predicted thus contain almost no entropy in the condition of knowing cycles. The other important finding is the failure of interrupt cycles during system boot. The lower bits of cycles caused by interrupts contain little entropy, which is contrary to our traditional cognition that lower bits have more entropy. We believe these findings are of great significance to improve the efficiency and security of the RNG design on software platforms.
ISSN: 1938-1883
2023-02-17
Cobos, Luis-Pedro, Miao, Tianlei, Sowka, Kacper, Madzudzo, Garikayi, Ruddle, Alastair R., El Amam, Ehab.  2022.  Application of an Automotive Assurance Case Approach to Autonomous Marine Vessel Security. 2022 International Conference on Electrical, Computer, Communications and Mechatronics Engineering (ICECCME). :1–9.
The increase of autonomy in autonomous surface vehicles development brings along modified and new risks and potential hazards, this in turn, introduces the need for processes and methods for ensuring that systems are acceptable for their intended use with respect to dependability and safety concerns. One approach for evaluating software requirements for claims of safety is to employ an assurance case. Much like a legal case, the assurance case lays out an argument and supporting evidence to provide assurance on the software requirements. This paper analyses safety and security requirements relating to autonomous vessels, and regulations in the automotive industry and the marine industry before proposing a generic cybersecurity and safety assurance case that takes a general graphical approach of Goal Structuring Notation (GSN).
Rossi, Alessandra, Andriella, Antonio, Rossi, Silvia, Torras, Carme, Alenyà, Guillem.  2022.  Evaluating the Effect of Theory of Mind on People’s Trust in a Faulty Robot. 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN). :477–482.
The success of human-robot interaction is strongly affected by the people’s ability to infer others’ intentions and behaviours, and the level of people’s trust that others will abide by their same principles and social conventions to achieve a common goal. The ability of understanding and reasoning about other agents’ mental states is known as Theory of Mind (ToM). ToM and trust, therefore, are key factors in the positive outcome of human-robot interaction. We believe that a robot endowed with a ToM is able to gain people’s trust, even when this may occasionally make errors.In this work, we present a user study in the field in which participants (N=123) interacted with a robot that may or may not have a ToM, and may or may not exhibit erroneous behaviour. Our findings indicate that a robot with ToM is perceived as more reliable, and they trusted it more than a robot without a ToM even when the robot made errors. Finally, ToM results to be a key driver for tuning people’s trust in the robot even when the initial condition of the interaction changed (i.e., loss and regain of trust in a longer relationship).
ISSN: 1944-9437
2022-12-09
Casimiro, Maria, Romano, Paolo, Garlan, David, Rodrigues, Luís.  2022.  Towards a Framework for Adapting Machine Learning Components. 2022 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). :131—140.
Machine Learning (ML) models are now commonly used as components in systems. As any other component, ML components can produce erroneous outputs that may penalize system utility. In this context, self-adaptive systems emerge as a natural approach to cope with ML mispredictions, through the execution of adaptation tactics such as model retraining. To synthesize an adaptation strategy, the self-adaptation manager needs to reason about the cost-benefit tradeoffs of the applicable tactics, which is a non-trivial task for tactics such as model retraining, whose benefits are both context- and data-dependent.To address this challenge, this paper proposes a probabilistic modeling framework that supports automated reasoning about the cost/benefit tradeoffs associated with improving ML components of ML-based systems. The key idea of the proposed approach is to decouple the problems of (i) estimating the expected performance improvement after retrain and (ii) estimating the impact of ML improved predictions on overall system utility.We demonstrate the application of the proposed framework by using it to self-adapt a state-of-the-art ML-based fraud-detection system, which we evaluate using a publicly-available, real fraud detection dataset. We show that by predicting system utility stemming from retraining a ML component, the probabilistic model checker can generate adaptation strategies that are significantly closer to the optimal, as compared against baselines such as periodic retraining, or reactive retraining.
2022-12-01
Bemus, Peter, Noran, Ovidiu.  2021.  Static vs Dynamic Architecture of Aware Cyber Physical Systems of Systems. 2021 IEEE 25th International Enterprise Distributed Object Computing Workshop (EDOCW). :186–193.
The Enterprise Architecture and Systems Engineering communities are often faced with complexity barriers that develop due to the fact that modern systems must be agile and resilient. This requires dynamic changes to the system so as to adapt to changing missions as well as changes in the internal and external environments. The requirement is not entirely new, but practitioners need guidance on how to manage the life cycle of such systems. This is a problem because we must be able to architect systems by alleviating the difficulties in systems life cycle management (e.g., by helping the enterprise- or systems engineer organise and maintain models and architecture descriptions of the system of interest). Building on Pask’s conversation theoretic model of aware (human or machine) individuals, the paper proposes a reference model for systems that maintain their own models real time, act efficiently, and create system-level awareness on all levers of aggregation.
2022-08-26
Winter, Kirsten, Coughlin, Nicholas, Smith, Graeme.  2021.  Backwards-directed information flow analysis for concurrent programs. 2021 IEEE 34th Computer Security Foundations Symposium (CSF). :1—16.
A number of approaches have been developed for analysing information flow in concurrent programs in a compositional manner, i.e., in terms of one thread at a time. Early approaches modelled the behaviour of a given thread's environment using simple read and write permissions on variables, or by associating specific behaviour with whether or not locks are held. Recent approaches allow more general representations of environmental behaviour, increasing applicability. This, however, comes at a cost. These approaches analyse the code in a forwards direction, from the start of the program to the end, constructing the program's entire state after each instruction. This process needs to take into account the environmental influence on all shared variables of the program. When environmental influence is modelled in a general way, this leads to increased complexity, hindering automation of the analysis. In this paper, we present a compositional information flow analysis for concurrent systems which is the first to support a general representation of environmental behaviour and be automated within a theorem prover. Our approach analyses the code in a backwards direction, from the end of the program to the start. Rather than constructing the entire state at each instruction, it generates only the security-related proof obligations. These are, in general, much simpler, referring to only a fraction of the program's shared variables and thus reducing the complexity introduced by environmental behaviour. For increased applicability, our approach analyses value-dependent information flow, where the security classification of a variable may depend on the current state. The resulting logic has been proved sound within the theorem prover Isabelle/HOL.
Telny, A. V., Monakhov, M. Yu., Aleksandrov, A. V., Matveeva, A. P..  2021.  On the Possibility of Using Cognitive Approaches in Information Security Tasks. 2021 Dynamics of Systems, Mechanisms and Machines (Dynamics). :1—6.

This article analyzes the possibilities of using cognitive approaches in forming expert assessments for solving information security problems. The experts use the contextual approach by A.Yu. Khrennikov’s as a basic model for the mathematical description of the quantum decision-making method. In the cognitive view, expert assessments are proposed to be considered as conditional probabilities with regard to the fulfillment of a set of certain conditions. However, the conditions in this approach are contextual, but not events like in Boolean algebra.

2022-08-12
Bendre, Nihar, Desai, Kevin, Najafirad, Peyman.  2021.  Show Why the Answer is Correct! Towards Explainable AI using Compositional Temporal Attention. 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC). :3006–3012.
Visual Question Answering (VQA) models have achieved significant success in recent times. Despite the success of VQA models, they are mostly black-box models providing no reasoning about the predicted answer, thus raising questions for their applicability in safety-critical such as autonomous systems and cyber-security. Current state of the art fail to better complex questions and thus are unable to exploit compositionality. To minimize the black-box effect of these models and also to make them better exploit compositionality, we propose a Dynamic Neural Network (DMN), which can understand a particular question and then dynamically assemble various relatively shallow deep learning modules from a pool of modules to form a network. We incorporate compositional temporal attention to these deep learning based modules to increase compositionality exploitation. This results in achieving better understanding of complex questions and also provides reasoning as to why the module predicts a particular answer. Experimental analysis on the two benchmark datasets, VQA2.0 and CLEVR, depicts that our model outperforms the previous approaches for Visual Question Answering task as well as provides better reasoning, thus making it reliable for mission critical applications like safety and security.
Maruyama, Yoshihiro.  2021.  Learning, Development, and Emergence of Compositionality in Natural Language Processing. 2021 IEEE International Conference on Development and Learning (ICDL). :1–7.
There are two paradigms in language processing, as characterised by symbolic compositional and statistical distributional modelling, which may be regarded as based upon the principles of compositionality (or symbolic recursion) and of contextuality (or the distributional hypothesis), respectively. Starting with philosophy of language as in Frege and Wittgenstein, we elucidate the nature of language and language processing from interdisciplinary perspectives across different fields of science. At the same time, we shed new light on conceptual issues in language processing on the basis of recent advances in Transformer-based models such as BERT and GPT-3. We link linguistic cognition with mathematical cognition through these discussions, explicating symbol grounding/emergence problems shared by both of them. We also discuss whether animal cognition can develop recursive compositional information processing.
2022-08-03
Deng, Yuxin, Chen, Zezhong, Du, Wenjie, Mao, Bifei, Liang, Zhizhang, Lin, Qiushi, Li, Jinghui.  2021.  Trustworthiness Derivation Tree: A Model of Evidence-Based Software Trustworthiness. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security Companion (QRS-C). :487—493.
In order to analyze the trustworthiness of complex software systems, we propose a model of evidence-based software trustworthiness called trustworthiness derivation tree (TDT). The basic idea of constructing a TDT is to refine main properties into key ingredients and continue the refinement until basic facts such as evidences are reached. The skeleton of a TDT can be specified by a set of rules, which is convenient for automated reasoning in Prolog. We develop a visualization tool that can construct the skeleton of a TDT by taking the rules as input, and allow a user to edit the TDT in a graphical user interface. In a software development life cycle, TDTs can serve as a communication means for different stakeholders to agree on the properties about a system in the requirement analysis phase, and they can be used for deductive reasoning so as to verify whether the system achieves trustworthiness in the product validation phase. We have piloted the approach of using TDTs in more than a dozen real scenarios of software development. Indeed, using TDTs helped us to discover and then resolve some subtle problems.
2022-05-23
Chang, Xinyu, Wu, Bian.  2021.  Effects of Immersive Spherical Video-based Virtual Reality on Cognition and Affect Outcomes of Learning: A Meta-analysis. 2021 International Conference on Advanced Learning Technologies (ICALT). :389–391.
With the advancement of portable head-mounted displays, interest in educational application of immersive spherical video-based virtual reality (SVVR) has been emerging. However, it remains unclear regarding the effects of immersive SVVR on cognitive and affective outcomes. In this study, we retrieved 58 learning outcomes from 16 studies. A meta-analysis was performed using the random effects model to calculate the effect size. Several important moderators were also examined such as control group treatment, learning outcome type, interaction functionality, content instruction, learning domain, and learner's stage. The results show that immersive SVVR is more effective than other instructional conditions with a medium effect size. The key findings of the moderator analysis are that immersive SVVR has a greater impact on affective outcomes, as well as under the conditions that learning system provides interaction functionality or integrates with content instruction before virtual exploratory learning.
2022-04-21
Rathod, Paresh, Hämäläinen, Timo.  2017.  A Novel Model for Cybersecurity Economics and Analysis. 2017 IEEE International Conference on Computer and Information Technology (CIT). :274–279.
In recent times, major cybersecurity breaches and cyber fraud had huge negative impact on victim organisations. The biggest impact made on major areas of business activities. Majority of organisations facing cybersecurity adversity and advanced threats suffers from huge financial and reputation loss. The current security technologies, policies and processes are providing necessary capabilities and cybersecurity mechanism to solve cyber threats and risks. However, current solutions are not providing required mechanism for decision making on impact of cybersecurity breaches and fraud. In this paper, we are reporting initial findings and proposing conceptual solution. The paper is aiming to provide a novel model for Cybersecurity Economics and Analysis (CEA). We will contribute to increasing harmonization of European cybersecurity initiatives and reducing fragmented practices of cybersecurity solutions and also helping to reach EU Digital Single Market goal. By introducing Cybersecurity Readiness Level Metrics the project will measure and increase effectiveness of cybersecurity programs, while the cost-benefit framework will help to increase the economic and financial viability, effectiveness and value generation of cybersecurity solutions for organisation's strategic, tactical and operational imperative. The ambition of the research development and innovation (RDI) is to increase and re-establish the trust of the European citizens in European digital environments through practical solutions.
2022-02-24
Baelde, David, Delaune, Stéphanie, Jacomme, Charlie, Koutsos, Adrien, Moreau, Solène.  2021.  An Interactive Prover for Protocol Verification in the Computational Model. 2021 IEEE Symposium on Security and Privacy (SP). :537–554.
Given the central importance of designing secure protocols, providing solid mathematical foundations and computer-assisted methods to attest for their correctness is becoming crucial. Here, we elaborate on the formal approach introduced by Bana and Comon in [10], [11], which was originally designed to analyze protocols for a fixed number of sessions, and lacks support for proof mechanization.In this paper, we present a framework and an interactive prover allowing to mechanize proofs of security protocols for an arbitrary number of sessions in the computational model. More specifically, we develop a meta-logic as well as a proof system for deriving security properties. Proofs in our system only deal with high-level, symbolic representations of protocol executions, similar to proofs in the symbolic model, but providing security guarantees at the computational level. We have implemented our approach within a new interactive prover, the Squirrel prover, taking as input protocols specified in the applied pi-calculus, and we have performed a number of case studies covering a variety of primitives (hashes, encryption, signatures, Diffie-Hellman exponentiation) and security properties (authentication, strong secrecy, unlinkability).
2022-02-09
Ranade, Priyanka, Piplai, Aritran, Mittal, Sudip, Joshi, Anupam, Finin, Tim.  2021.  Generating Fake Cyber Threat Intelligence Using Transformer-Based Models. 2021 International Joint Conference on Neural Networks (IJCNN). :1–9.
Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these systems. Adversaries can use fake CTI examples as training input to subvert cyber defense systems, forcing their models to learn incorrect inputs to serve the attackers' malicious needs. In this paper, we show how to automatically generate fake CTI text descriptions using transformers. Given an initial prompt sentence, a public language model like GPT-2 with fine-tuning can generate plausible CTI text that can mislead cyber-defense systems. We use the generated fake CTI text to perform a data poisoning attack on a Cybersecurity Knowledge Graph (CKG) and a cybersecurity corpus. The attack introduced adverse impacts such as returning incorrect reasoning outputs, representation poisoning, and corruption of other dependent AI-based cyber defense systems. We evaluate with traditional approaches and conduct a human evaluation study with cyber-security professionals and threat hunters. Based on the study, professional threat hunters were equally likely to consider our fake generated CTI and authentic CTI as true.
2022-01-31
Peitek, Norman, Apel, Sven, Parnin, Chris, Brechmann, André, Siegmund, Janet.  2021.  Program Comprehension and Code Complexity Metrics: An fMRI Study. 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). :524–536.
Background: Researchers and practitioners have been using code complexity metrics for decades to predict how developers comprehend a program. While it is plausible and tempting to use code metrics for this purpose, their validity is debated, since they rely on simple code properties and rarely consider particularities of human cognition. Aims: We investigate whether and how code complexity metrics reflect difficulty of program comprehension. Method: We have conducted a functional magnetic resonance imaging (fMRI) study with 19 participants observing program comprehension of short code snippets at varying complexity levels. We dissected four classes of code complexity metrics and their relationship to neuronal, behavioral, and subjective correlates of program comprehension, overall analyzing more than 41 metrics. Results: While our data corroborate that complexity metrics can-to a limited degree-explain programmers' cognition in program comprehension, fMRI allowed us to gain insights into why some code properties are difficult to process. In particular, a code's textual size drives programmers' attention, and vocabulary size burdens programmers' working memory. Conclusion: Our results provide neuro-scientific evidence supporting warnings of prior research questioning the validity of code complexity metrics and pin down factors relevant to program comprehension. Future Work: We outline several follow-up experiments investigating fine-grained effects of code complexity and describe possible refinements to code complexity metrics.
2021-12-02
Piatkowska, Ewa, Gavriluta, Catalin, Smith, Paul, Andrén, Filip Pröstl.  2020.  Online Reasoning about the Root Causes of Software Rollout Failures in the Smart Grid. 2020 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm). :1–7.
An essential ingredient of the smart grid is software-based services. Increasingly, software is used to support control strategies and services that are critical to the grid's operation. Therefore, its correct operation is essential. For various reasons, software and its configuration needs to be updated. This update process represents a significant overhead for smart grid operators and failures can result in financial losses and grid instabilities. In this paper, we present a framework for determining the root causes of software rollout failures in the smart grid. It uses distributed sensors that indicate potential issues, such as anomalous grid states and cyber-attacks, and a causal inference engine based on a formalism called evidential networks. The aim of the framework is to support an adaptive approach to software rollouts, ensuring that a campaign completes in a timely and secure manner. The framework is evaluated for a software rollout use-case in a low voltage distribution grid. Experimental results indicate it can successfully discriminate between different root causes of failure, supporting an adaptive rollout strategy.
2021-10-12
Muller, Tim, Wang, Dongxia, Sun, Jun.  2020.  Provably Robust Decisions based on Potentially Malicious Sources of Information. 2020 IEEE 33rd Computer Security Foundations Symposium (CSF). :411–424.
Sometimes a security-critical decision must be made using information provided by peers. Think of routing messages, user reports, sensor data, navigational information, blockchain updates. Attackers manifest as peers that strategically report fake information. Trust models use the provided information, and attempt to suggest the correct decision. A model that appears accurate by empirical evaluation of attacks may still be susceptible to manipulation. For a security-critical decision, it is important to take the entire attack space into account. Therefore, we define the property of robustness: the probability of deciding correctly, regardless of what information attackers provide. We introduce the notion of realisations of honesty, which allow us to bypass reasoning about specific feedback. We present two schemes that are optimally robust under the right assumptions. The “majority-rule” principle is a special case of the other scheme which is more general, named “most plausible realisations”.
2021-06-01
Materzynska, Joanna, Xiao, Tete, Herzig, Roei, Xu, Huijuan, Wang, Xiaolong, Darrell, Trevor.  2020.  Something-Else: Compositional Action Recognition With Spatial-Temporal Interaction Networks. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :1046–1056.
Human action is naturally compositional: humans can easily recognize and perform actions with objects that are different from those used in training demonstrations. In this paper, we study the compositionality of action by looking into the dynamics of subject-object interactions. We propose a novel model which can explicitly reason about the geometric relations between constituent objects and an agent performing an action. To train our model, we collect dense object box annotations on the Something-Something dataset. We propose a novel compositional action recognition task where the training combinations of verbs and nouns do not overlap with the test set. The novel aspects of our model are applicable to activities with prominent object interaction dynamics and to objects which can be tracked using state-of-the-art approaches; for activities without clearly defined spatial object-agent interactions, we rely on baseline scene-level spatio-temporal representations. We show the effectiveness of our approach not only on the proposed compositional action recognition task but also in a few-shot compositional setting which requires the model to generalize across both object appearance and action category.
Zheng, Wenbo, Yan, Lan, Gou, Chao, Wang, Fei-Yue.  2020.  Webly Supervised Knowledge Embedding Model for Visual Reasoning. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :12442–12451.
Visual reasoning between visual image and natural language description is a long-standing challenge in computer vision. While recent approaches offer a great promise by compositionality or relational computing, most of them are oppressed by the challenge of training with datasets containing only a limited number of images with ground-truth texts. Besides, it is extremely time-consuming and difficult to build a larger dataset by annotating millions of images with text descriptions that may very likely lead to a biased model. Inspired by the majority success of webly supervised learning, we utilize readily-available web images with its noisy annotations for learning a robust representation. Our key idea is to presume on web images and corresponding tags along with fully annotated datasets in learning with knowledge embedding. We present a two-stage approach for the task that can augment knowledge through an effective embedding model with weakly supervised web data. This approach learns not only knowledge-based embeddings derived from key-value memory networks to make joint and full use of textual and visual information but also exploits the knowledge to improve the performance with knowledge-based representation learning for applying other general reasoning tasks. Experimental results on two benchmarks show that the proposed approach significantly improves performance compared with the state-of-the-art methods and guarantees the robustness of our model against visual reasoning tasks and other reasoning tasks.
Chen, Zhenfang, Wang, Peng, Ma, Lin, Wong, Kwan-Yee K., Wu, Qi.  2020.  Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :10083–10092.
Referring expression comprehension (REF) aims at identifying a particular object in a scene by a natural language expression. It requires joint reasoning over the textual and visual domains to solve the problem. Some popular referring expression datasets, however, fail to provide an ideal test bed for evaluating the reasoning ability of the models, mainly because 1) their expressions typically describe only some simple distinctive properties of the object and 2) their images contain limited distracting information. To bridge the gap, we propose a new dataset for visual reasoning in context of referring expression comprehension with two main features. First, we design a novel expression engine rendering various reasoning logics that can be flexibly combined with rich visual properties to generate expressions with varying compositionality. Second, to better exploit the full reasoning chain embodied in an expression, we propose a new test setting by adding additional distracting images containing objects sharing similar properties with the referent, thus minimising the success rate of reasoning-free cross-domain alignment. We evaluate several state-of-the-art REF models, but find none of them can achieve promising performance. A proposed modular hard mining strategy performs the best but still leaves substantial room for improvement.
Patnaikuni, Shrinivasan, Gengaje, Sachin.  2020.  Properness and Consistency of Syntactico-Semantic Reasoning using PCFG and MEBN. 2020 International Conference on Communication and Signal Processing (ICCSP). :0554–0557.
The paper proposes a formal approach for parsing grammatical derivations in the context of the principle of semantic compositionality by defining a mapping between Probabilistic Context Free Grammar (PCFG) and Multi Entity Bayesian Network (MEBN) theory, which is a first-order logic for modelling probabilistic knowledge bases. The principle of semantic compositionality states that meaning of compound expressions is dependent on meanings of constituent expressions forming the compound expression. Typical pattern analysis applications focus on syntactic patterns ignoring semantic patterns governing the domain in which pattern analysis is attempted. The paper introduces the concepts and terminologies of the mapping between PCFG and MEBN theory. Further the paper outlines a modified version of CYK parser algorithm for parsing PCFG derivations driven by MEBN. Using Kullback- Leibler divergence an outline for proving properness and consistency of the PCFG mapped with MEBN is discussed.