Bibliography
Social and emotional intelligence of computer systems is increasingly important in human-AI (Artificial Intelligence) interactions. This paper presents a tangible AI interface, T.A.I, that enhances physical engagement in digital communication between users and a conversational AI agent. We describe a compact, pneumatically shape-changing hardware design with a rich set of physical gestures that actuate on mobile devices during real-time conversations. Our user study suggests that the physical presence provided by T.A.I increased users' empathy for, and social connection with, the virtual intelligent system, leading to an improved human-AI communication experience.
This paper contributes a systematic research approach as well as findings of an empirical study conducted to investigate the effect of virtual agents on task performance and player experience in digital games. As virtual agents are supposed to evoke social effects similar to real humans under certain conditions, the basic social phenomenon of social facilitation is examined in a testbed game that was specifically developed to enable systematic variation of individual impact factors of social facilitation. Independent variables were the presence of a virtual agent (present vs. not present) and the output device (ordinary monitor vs. head-mounted display). Results indicate social inhibition effects, but only for players using a head-mounted display. Additional potential impact factors and future research directions are discussed.
The goal of this work is to model a virtual character able to converse with different interpersonal attitudes. To build our model, we rely on the analysis of multimodal corpora of non-verbal behaviors. The interpretation of these behaviors depends on how they are sequenced (order) and distributed over time. To encompass the dynamics of non-verbal signals across both modalities and time, we make use of temporal sequence mining. Specifically, we propose a new algorithm for temporal sequence extraction. We apply our algorithm to extract temporal patterns of non-verbal behaviors expressing interpersonal attitudes from a corpus of job interviews. We demonstrate the efficiency of our algorithm in terms of significant accuracy improvement over the state-of-the-art algorithms.
The design of systems with independent agency to act on the environment or which can act as persuasive agents requires consideration of not only the technical aspects of design, but of the psychological, sociological, and philosophical aspects as well. Creating usable, safe, and ethical systems will require research into human-computer communication, in order to design systems that can create and maintain a relationship with users, explain their workings, and act in the best interests of both users and of the larger society.
Personal agent software is now in daily use in personal devices and in some organizational settings. While many advocate an agent sociality design paradigm that incorporates human-like features and social dialogues, it is unclear whether this is a good match for professionals who seek productivity instead of leisurely use. We conducted a 17-day field study of a prototype of a personal AI agent that helps employees find work-related information. Using log data, surveys, and interviews, we found individual differences in the preference for humanized social interactions (social-agent orientation), which led to different user needs and requirements for agent design. We also explored the effect of agent proactive interactions and found that they carried the risk of interruption, especially for users who were generally averse to interruptions at work. Further, we found that user differences in social-agent orientation and aversion to agent proactive interactions can be inferred from behavioral signals. Our results inform research into social agent design, proactive agent interaction, and personalization of AI agents.
In March 2016, several online news media outlets reported on the inadequate emotional capabilities of interactive virtual assistants. While significant progress has been made in the general intelligence and functionality of virtual agents (VAs), the emotionally intelligent (EI) VA has yet to be thoroughly explored. We examine users' perception of the EI of virtual agents through Zara The Supergirl, a virtual agent that conducts question-and-answer style conversational testing and counseling online. The results show that overall users perceive an emotion-expressing VA (EEVA) to be more EI than a non-emotion-expressing VA (NEEVA). However, simple affective expression may not be sufficient for EEVA to be perceived as fully EI.
We will demonstrate a conversational product recommendation agent. This system shows how we combine research in personalized recommender systems with research in dialogue systems to build a virtual sales agent. Based on new deep learning technologies we developed, the virtual agent is capable of learning how to interact with users, how to answer user questions, what question to ask next, and what to recommend when chatting with a human user. Normally, a decent conversational agent for a particular domain requires tens of thousands of hand-labeled conversational examples or hand-written rules. This is a major barrier when launching a conversational agent for a new domain. We will explore and demonstrate the effectiveness of the learning solution even when there are no hand-written rules or hand-labeled training data.
Given the proliferation of digital assistants in everyday mobile technology, it appears inevitable that next generation vehicles will be embodied by similar agents, offering engaging, natural language interactions. However, speech can be cognitively captivating. It is therefore important to understand the demand that such interfaces may place on drivers. Twenty-five participants undertook four drives (counterbalanced), in a medium-fidelity driving simulator: 1. Interacting with a state-of-the-art digital driving assistant ('DDA') (presented using Wizard-of-Oz); 2. Engaged in a hands-free mobile phone conversation; 3. Undertaking the delayed-digit recall ('2-back') task and 4. With no secondary task (baseline). Physiological arousal, subjective workload assessment, tactile detection task (TDT) and driving performance measures consistently revealed the '2-back' drive as the most cognitively demanding (highest workload, poorest TDT performance). Mobile phone and DDA conditions were largely equivalent, attracting low/medium cognitive workload. Findings are discussed in the context of designing in-vehicle natural language interfaces to mitigate cognitive demand.
Voice-controlled intelligent personal assistants, such as Cortana, Google Now, Siri and Alexa, are increasingly becoming a part of users' daily lives, especially on mobile devices. They introduce a significant change in information access, not only by introducing voice control and touch gestures but also by enabling dialogues where the context is preserved. This raises the need for evaluation of their effectiveness in assisting users with their tasks. However, in order to understand which type of user interactions reflect different degrees of user satisfaction we need explicit judgements. In this paper, we describe a user study that was designed to measure user satisfaction over a range of typical scenarios of use: controlling a device, web search, and structured search dialogue. Using this data, we study how user satisfaction varied with different usage scenarios and what signals can be used for modeling satisfaction in the different scenarios. We find that the notion of satisfaction varies across different scenarios, and show that, in some scenarios (e.g. making a phone call), task completion is very important while for others (e.g. planning a night out), the amount of effort spent is key. We also study how the nature and complexity of the task at hand affects user satisfaction, and find that preserving the conversation context is essential and that overall task-level satisfaction cannot be reduced to query-level satisfaction alone. Finally, we shed light on the relative effectiveness and usefulness of voice-controlled intelligent agents, explaining their increasing popularity and uptake relative to the traditional query-response interaction.
Embodied conversational agents are changing the way humans interact with technology. In order to develop humanlike ECAs, they need to be able to perform the natural gestures used in day-to-day conversation. Gestures can give insight into an ECA's personality trait of extraversion, but what factors into this is still being explored. Our study focuses on two aspects of gesture: amplitude and frequency. Our goal is to find out whether agents should use specific gestures more frequently than others depending on the personality type they have been designed with. We also look to quantify gesture amplitude and compare it to a previous study on the perception of the naturalness of an agent's gestures. Our results showed some indication that introverts and extraverts judge the agent's naturalness similarly. The larger the amplitude our agent used, the more natural its gestures were perceived to be. The frequency of gestures between extraverts and introverts seems to show hardly any difference, even in terms of the types of gesture used.
We propose a multi-party conversational social interface, NAMIDA, through a pilot study. The system consists of three robots that converse with each other about the environment along the road. Through this model, utterances directed towards the driver diminish by utilizing a turn-taking process between the agents, and the driver's mental workload can be reduced compared to the conventional one-to-one communication approach that directly addresses the driver. We set up an experiment comparing both approaches to explore their effects on drivers' workload and attention behaviors. The results indicated that the multi-party conversational approach is better at reducing certain workload factors. Also, the analysis of drivers' attention behaviors revealed that our method better encourages drivers to focus on the road.
In this study, we use a multi-party recording as a template for building a parametric speech synthesiser that is able to express different levels of attentiveness in backchannel tokens. This allowed us to investigate i) whether it is possible to express the same perceived level of attentiveness in synthesised backchannels as in natural ones; ii) whether it is possible to increase and decrease the perceived level of attentiveness of backchannels beyond the range observed in the original corpus.
The past four years have seen the rise of conversational agents (CAs) in everyday life. Apple, Microsoft, Amazon, Google and Facebook have all embedded proprietary CAs within their software and, increasingly, conversation is becoming a key mode of human-computer interaction. Whilst we have long been familiar with the notion of computers that speak, the investigative concern within HCI has been upon multimodality rather than dialogue alone, and there is no sense of how such interfaces are used in everyday life. This paper reports the findings of interviews with 14 users of CAs in an effort to understand the current interactional factors affecting everyday use. We find user expectations dramatically out of step with the operation of the systems, particularly in terms of known machine intelligence, system capability and goals. Using Norman's 'gulfs of execution and evaluation' [30] we consider the implications of these findings for the design of future systems.
This paper describes a system for embodied conversational agents developed by Inmerssion and one of the applications, Young Merlin: Trial by Fire, built with this system. In the Merlin application, the ECA and a human interact through speech in virtual reality. The goal of this application is to provide engaging VR experiences that build rapport through storytelling and verbal interactions. The agent is fully automated, and his attitude towards the user changes over time depending on the interaction. The conversational system was built through a declarative approach that supports animations, markup language, and gesture recognition. Future versions of Merlin will implement multi-character dialogs, additional actions, and extended interaction time.
We present a demonstration of the ARIA framework, a modular approach for rapid development of virtual humans for information retrieval that have linguistic, emotional, and social skills and a strong personality. We demonstrate the framework's capabilities in a scenario where `Alice in Wonderland', a popular English literature book, is embodied by a virtual human representing Alice. The user can engage in an information exchange dialogue, where Alice acts as the expert on the book, and the user as an interested novice. Besides speech recognition, sophisticated audio-visual behaviour analysis is used to inform the core agent dialogue module about the user's state and intentions, so that it can go beyond simple chat-bot dialogue. The behaviour generation module features a unique new capability of being able to deal gracefully with interruptions of the agent.
Information-centric networking (ICN) has been actively studied as a promising alternative to the IP-based Internet architecture with potential benefits in terms of network efficiency, privacy, security, and novel applications. However, it is difficult to adopt such a wholesale replacement of the IP-based Internet with a new routing and service infrastructure due to conflicts among existing stakeholders, market players, and solution providers. To overcome these difficulties, we provide an evolutionary approach by which we enable the expected benefits of ICN for existing services. The demonstration shows that these benefits can be efficiently introduced and work with existing IP end-systems.
The Information-Centric Networking (ICN) paradigm is drastically different from traditional host-centric IP networking. As a consequence of the disparity between the two, the security models are also very different. The security model for IP is based on securing the end-to-end communication link between the communicating nodes, whereas the ICN security model is based on securing data objects, often termed Object Security. Just like the traditional security model, object security also poses the challenge of key management. This is especially concerning for ICN, as data cached in its encrypted form should be usable by several different users. Attribute-Based Encryption (ABE) alleviates this problem by enabling data to be encrypted under a policy that suits several different types of users. Users with different sets of attributes can potentially decrypt the data, hence eliminating the need to encrypt the data separately for each type of user. ABE is more processing-intensive than traditional public-key encryption methods, hence posing a challenge for resource-constrained environments with devices that have low memory and battery power. In this demo, we show ABE encryption carried out on a resource-constrained sensor platform. Encrypted data is transported over an ICN network and is decrypted only by clients that have the correct set of attributes.
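The ABE access model the abstract describes (one ciphertext, many authorized attribute sets) can be illustrated with a toy policy check. This sketch only models the policy-satisfaction decision; real ABE schemes such as CP-ABE enforce it cryptographically, and the attribute names here are invented:

```python
# Toy model of the ABE access decision: a ciphertext is bound to a policy
# over attributes, and any user whose attribute set satisfies the policy
# can decrypt. This is NOT real cryptography -- it only illustrates why one
# ciphertext can serve several different types of users.

def satisfies(policy, attributes):
    """Evaluate a policy tree of ('and'|'or', children) nodes; leaves are attribute names."""
    if isinstance(policy, str):              # leaf: a single required attribute
        return policy in attributes
    op, children = policy
    results = (satisfies(child, attributes) for child in children)
    return all(results) if op == "and" else any(results)

# Hypothetical policy: (doctor AND cardiology) OR admin
policy = ("or", [("and", ["doctor", "cardiology"]), "admin"])

print(satisfies(policy, {"doctor", "cardiology"}))  # True: matches first clause
print(satisfies(policy, {"doctor", "oncology"}))    # False: wrong department
print(satisfies(policy, {"admin"}))                 # True: matches admin clause
```

The same cached ciphertext is usable by both the cardiologist and the administrator, without being re-encrypted per user, which is exactly the property that makes ABE attractive for in-network caching.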
Content-centric networking (CCN) is a networking paradigm that emphasizes request-response-based data transfer. A consumer issues a request explicitly referencing desired data by name. A producer assigns a name to each data item it publishes. Names are used both to identify data and to route traffic between consumers and producers. The type, format, and representation of names are fundamental to CCN. Currently, names are represented as human-readable application-layer URIs. This has several important security and performance implications for the network. In this paper, we propose to transparently decouple application-layer names from their network-layer counterparts. We demonstrate a mapping between the two namespaces that can be deterministically computed by consumers and producers, using application names formatted according to the standard CCN URI scheme. Meanwhile, consumers and producers can continue to use application-layer names. We detail the computation and mapping function requirements and discuss their impact on consumers, producers, and routers. Finally, we comprehensively analyze several mapping functions to show their functional equivalence to standard application names and argue that they address several issues that stem from propagating application names into the network.
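One plausible instance of such a deterministic application-to-network name mapping is component-wise hashing: each URI component is hashed so routers see fixed-length opaque components, while both endpoints can compute the same network name independently. This is an illustrative sketch, not necessarily the mapping function the paper analyzes, and the example URI is invented:

```python
# Sketch of a deterministic name mapping for CCN: hash each component of an
# application-layer URI with SHA-256 (truncated for readability), so the
# network-layer name no longer leaks the human-readable application name.
import hashlib

def to_network_name(app_uri):
    """Map e.g. 'ccnx:/parc/videos/cat.mpg' to a hashed network-layer name."""
    scheme, _, path = app_uri.partition(":/")
    hashed = [hashlib.sha256(c.encode()).hexdigest()[:16] for c in path.split("/")]
    return scheme + ":/" + "/".join(hashed)

name = to_network_name("ccnx:/parc/videos/cat.mpg")
# Deterministic: consumer and producer derive the same name with no coordination.
assert name == to_network_name("ccnx:/parc/videos/cat.mpg")
print(name)
```

Because the mapping preserves the component structure, longest-prefix matching in routers still works on the hashed components, while the application name itself never propagates into the network.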
In content-based security, encrypted content as well as wrapped access keys are made freely available by an Information Centric Network: only those clients which are able to unwrap the encryption key can access the protected content. In this paper we extend this model to computation chains where derived data (e.g. produced by a Named Function Network) also has to comply with the content-based security approach. A central problem to solve is the synchronized on-demand publishing of encrypted results and wrapped keys, as well as defining the set of consumers which are authorized to access the derived data. In this paper we introduce "content-attendant policies" and report on a running prototype that demonstrates how to enforce data owner-defined access control policies despite fully decentralized and arbitrarily long computation chains.
The emerging Information-Centric Networking (ICN) paradigm is expected to facilitate content sharing among users. ICN will make it easy for users to appoint storage nodes, in various network locations, perhaps owned or controlled by them, where shared content can be stored and disseminated from. These storage nodes should be (somewhat) trusted, since not only do they have (some level of) access to user-shared content, but they must also properly enforce access control. Traditional forms of encryption introduce significant overhead when it comes to sharing content with large and dynamic groups of users. To this end, proxy re-encryption provides a convenient solution. In this paper, we use Identity-Based Proxy Re-Encryption (IB-PRE) to provide confidentiality and access control for content items shared over ICN, realizing secure content distribution among dynamic sets of users. In contrast to similar IB-PRE based solutions, our design allows each user to generate the system parameters and the secret keys required by the underlying encryption scheme using their own Private Key Generator; therefore, our approach does not suffer from the key escrow problem. Moreover, our design further relaxes the trust requirements on the storage nodes by preventing them from sharing usable content with unauthorized users. Finally, our scheme does not require out-of-band secret key distribution.
The shift from the host-centric to the information-centric paradigm results in many benefits including native security, enhanced mobility, and scalability. The corresponding information-centric networking (ICN) paradigm also presents several important challenges, such as closest-replica routing, client privacy, and client preference collection. The majority of these challenges have received the research community's attention. However, no mechanisms have been proposed for the challenge of effective client preference collection. In the era of big data analytics and recommender systems, customer preferences are essential for providers such as Amazon and Netflix. However, with content served from in-network caches, the ICN paradigm indirectly undermines the gathering of these essential individualized preferences. In this paper, we discuss the requirements for client preference collection and present potential mechanisms that may be used to achieve it successfully.
Code diversification is an effective mitigation against return-oriented programming (ROP) attacks, as it breaks attackers' assumptions about the location and structure of useful instruction sequences, known as "gadgets". Although a wide range of code diversification techniques of varying levels of granularity exist, most of them rely on the availability of source code, debug symbols, or the assumption of fully precise code disassembly, limiting their practical applicability for the protection of closed-source third-party applications. In-place code randomization has been proposed as an alternative binary-compatible diversification technique that is tolerant of partial disassembly coverage, at the expense, though, of leaving some gadgets intact and at the disposal of attackers. Consequently, the possibility of constructing robust ROP payloads using only the remaining non-randomized gadgets is still open. In this paper we present instruction displacement, a code diversification technique based on static binary instrumentation that does not rely on complete code disassembly coverage. Instruction displacement aims to improve the randomization coverage and entropy of existing binary-level code diversification techniques by displacing any remaining non-randomized gadgets to random locations. The results of our experimental evaluation demonstrate that instruction displacement reduces the number of non-randomized gadgets in the extracted code regions from 15.04% for standalone in-place code randomization, to 2.77% for the combination of both techniques. At the same time, the additional indirection introduced due to displacement incurs a negligible runtime overhead of 0.36% on average for the SPEC CPU2006 benchmarks.
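The displacement idea can be sketched as a byte-buffer toy: a gadget that in-place randomization could not break is copied to a new location, and its original bytes are overwritten with a jump stub, so the address an attacker hard-coded no longer holds a usable instruction sequence. Offsets and byte values below are invented for illustration; real instruction displacement operates on disassembled binaries and reachable code regions:

```python
# Toy model of instruction displacement over a flat code buffer.
JMP = 0xE9  # x86 'jmp rel32' opcode (5-byte instruction)

def displace(code, start, length):
    """Copy code[start:start+length] to the end of the buffer and patch a jump stub."""
    assert length >= 5, "need room for a 5-byte jmp at the old site"
    new_off = len(code)
    code += code[start:start + length]            # relocate the gadget bytes
    rel = new_off - (start + 5)                   # rel32, relative to end of jmp
    code[start:start + 5] = bytes([JMP]) + rel.to_bytes(4, "little", signed=True)
    return new_off

# A 5-byte toy gadget ('pop rax; pop rdi; pop rsi; pop rdx; ret') amid nops.
code = bytearray(b"\x90" * 16 + b"\x58\x5f\x5e\x5a\xc3" + b"\x90" * 11)
gadget = bytes(code[16:21])
new_off = displace(code, 16, 5)
assert bytes(code[new_off:new_off + 5]) == gadget  # gadget now lives elsewhere
assert code[16] == JMP                             # old address holds a jump stub
```

Legitimate control flow still reaches the relocated instructions through the jump, while a ROP payload that pointed at the old address lands on the stub instead of the gadget.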
Code reuse attacks based on return oriented programming (ROP) are becoming more and more prevalent every year. They started as a way to circumvent operating systems protections against injected code, but they are now also used as a technique to keep malicious code hidden from detection and analysis systems. This means that while in the past ROP chains were short and simple (and therefore did not require any dedicated tool for their analysis), we recently started to observe very complex algorithms – such as a complete rootkit – implemented entirely as a sequence of ROP gadgets. In this paper, we present a set of techniques to analyze complex code reuse attacks. First, we identify and discuss the main challenges that complicate the reverse engineering of code implemented using ROP. Second, we propose an emulation-based framework to dissect, reconstruct, and simplify ROP chains. Finally, we test our tool on the most complex example available to date: a ROP rootkit containing four separate chains, two of them dynamically generated at runtime.
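The emulation-based dissection the abstract mentions can be illustrated with a minimal toy: treat the chain as a sequence of gadget addresses interleaved with immediates, step through it while consuming each gadget's stack arguments, and emit a simplified operation trace. The gadget addresses and semantics below are invented for illustration and are not the paper's framework:

```python
# Minimal sketch of ROP-chain dissection by emulation: a hypothetical gadget
# table maps addresses to (name, number of immediates popped from the stack).
GADGETS = {
    0x1000: ("pop_rax", 1),   # pop rax; ret  -> consumes one immediate
    0x2000: ("pop_rdi", 1),   # pop rdi; ret
    0x3000: ("syscall", 0),   # syscall; ret
}

def emulate(chain):
    """Walk the chain, consuming each gadget's immediates, and return a trace."""
    trace, i = [], 0
    while i < len(chain):
        name, n_args = GADGETS[chain[i]]
        args = chain[i + 1 : i + 1 + n_args]
        trace.append((name, *args))
        i += 1 + n_args               # skip past the gadget and its immediates
    return trace

chain = [0x1000, 60, 0x2000, 0, 0x3000]   # toy chain equivalent to exit(0)
print(emulate(chain))  # [('pop_rax', 60), ('pop_rdi', 0), ('syscall',)]
```

Even this toy shows why emulation helps: the flat list of addresses and constants on the stack is recovered as a readable sequence of operations, which is the starting point for reconstructing and simplifying a real chain.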
Despite a long history and numerous proposed defenses, memory corruption attacks are still viable. A secure and low-overhead defense against return-oriented programming (ROP) continues to elude the security community. Currently proposed solutions still must choose between either not fully protecting critical data and relying instead on information hiding, or using incomplete, coarse-grain checking that can be circumvented by a suitably skilled attacker. In this paper, we present a lightweight memory protection approach (LMP) that uses Intel's MPX hardware extensions to provide complete, fast ROP protection without having to rely on information hiding. We demonstrate a prototype that defeats ROP attacks while incurring an average runtime overhead of 3.9%.
Remote attestation is a crucial security service particularly relevant to increasingly popular IoT (and other embedded) devices. It allows a trusted party (verifier) to learn the state of a remote, and potentially malware-infected, device (prover). Most existing approaches are static in nature and only check whether benign software is initially loaded on the prover. However, they are vulnerable to runtime attacks that hijack the application's control or data flow, e.g., via return-oriented programming or data-oriented exploits. As a concrete step towards more comprehensive runtime remote attestation, we present the design and implementation of Control-FLow ATtestation (C-FLAT) that enables remote attestation of an application's control-flow path, without requiring the source code. We describe a full prototype implementation of C-FLAT on Raspberry Pi using its ARM TrustZone hardware security extensions. We evaluate C-FLAT's performance using a real-world embedded (cyber-physical) application, and demonstrate its efficacy against control-flow hijacking attacks.