Biblio
Video surveillance, closed-circuit TV and IP-camera systems have become virtually omnipresent and indispensable for many organizations, businesses, and users. Their main purpose is to provide physical security, increase safety, and prevent crime. They have also become increasingly complex, comprising many communication means, embedded hardware and non-trivial firmware. However, most research to date has focused mainly on the privacy aspects of such systems, and has not fully addressed their issues related to cyber-security in general, and visual layer (i.e., imagery semantics) attacks in particular. In this paper, we conduct a systematic review of existing and novel threats in video surveillance, closed-circuit TV and IP-camera systems based on publicly available data. The insights can then be used to better understand and identify the security and the privacy risks associated with the development, deployment and use of these systems. We study existing and novel threats, along with their existing or possible countermeasures, and summarize this knowledge into a comprehensive table that can be used in a practical way as a security checklist when assessing the cyber-security level of existing or new CCTV designs and deployments. We also provide a set of recommendations and mitigations that can help improve the security and privacy levels provided by the hardware, the firmware, the network communications and the operation of video surveillance systems. We hope the findings in this paper will provide valuable knowledge of the threat landscape that such systems are exposed to, as well as promote further research and widen the scope of this field beyond its current boundaries.
With recent advances in consumer electronics and the increasingly urgent need for public security, camera networks have evolved from their early role of providing simple and static monitoring to current complex systems capable of obtaining extensive video information for intelligent processing, such as target localization, identification, and tracking. In all cases, it is of vital importance that the optimal camera configuration (i.e., optimal location, orientation, etc.) is determined before cameras are deployed, as a suboptimal placement solution will adversely affect intelligent video surveillance and video analytic algorithms. The optimal configuration may also provide substantial savings on the total number of cameras required to achieve the same level of utility. In this article, we examine most, if not all, of the recent approaches (post 2000) addressing camera placement in a structured manner. We believe that our work can serve as a first point of entry for readers wishing to start research in this area or engineers who need to design a camera system in practice. To this end, we attempt to provide a complete study of relevant formulation strategies and brief introductions to the optimization techniques most commonly used by researchers in this field. We hope our work will inspire new ideas in the field.
This paper evaluates a new video surveillance platform presented in a previous study, through an abandoned object detection task. The proposed platform provides automated detection and alerting, which is still a big challenge for a machine algorithm due to the recall-precision tradeoff. To achieve both high recall and high precision simultaneously, a hybrid approach using crowdsourcing after image analysis is proposed. However, it remains unclear to what extent this approach can improve detection accuracy and raise quicker alerts. In this paper, the experiment is conducted for abandoned object detection, as one of the most common surveillance tasks. The results show that detection accuracy was improved from 50% (without crowdsourcing) to a stable 95-100% (with crowdsourcing) by majority vote of 7 crowdworkers for each task. In contrast, the alert-time issue remains open to further discussion, since at least 7 minutes are required to get the best performance.
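The majority-vote step mentioned in the abstract above can be illustrated with a minimal sketch. The function names, the label encoding, and the assumption that crowdworkers answer independently with the same accuracy `p` are hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter
from math import comb


def majority_vote(labels):
    """Return the label given by the largest number of workers.

    With an odd number of binary votes (e.g. 7) there can be no tie.
    """
    return Counter(labels).most_common(1)[0][0]


def majority_accuracy(p, n=7):
    """Probability that a majority of n workers is correct.

    Assumes (hypothetically) that the n workers are independent and that
    each one is correct with the same probability p; n should be odd.
    """
    need = n // 2 + 1  # smallest number of correct votes that wins
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(need, n + 1))


if __name__ == "__main__":
    votes = [True, True, False, True, False, True, False]
    print(majority_vote(votes))
    print(majority_accuracy(0.8))
```

Under these assumptions, aggregating seven votes helps only when individual workers are better than chance; real crowdworkers are rarely independent, so the figure is an upper-bound style estimate rather than a prediction of the reported results.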
The widespread adoption of Location-Based Services (LBSs) has come with controversy about privacy. While leveraging location information leads to improving services through geo-contextualization, it raises privacy concerns as new knowledge can be inferred from location records, such as home/work places, habits or religious beliefs. To overcome this problem, several Location Privacy Protection Mechanisms (LPPMs) have been proposed in the literature in recent years. However, every mechanism comes with its own configuration parameters that directly impact the privacy guarantees and the resulting utility of protected data. In this context, it can be difficult for a non-expert system designer to choose appropriate configuration parameters to use according to the expected privacy and utility. In this paper, we present a framework enabling the easy configuration of LPPMs. To achieve that, our framework performs an offline, in-depth automated analysis of LPPMs to provide the formal relationship between their configuration parameters and both privacy and utility metrics. This framework is modular: by using different metrics, a system designer is able to fine-tune her LPPM according to her expected privacy and utility guarantees (i.e., the guarantee itself and the level of this guarantee). To illustrate the capability of our framework, we analyse Geo-Indistinguishability (a well-known differentially private LPPM) and we provide the formal relationship between its ε configuration parameter and two privacy and utility metrics.
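To make the role of the ε parameter concrete, the following is a minimal sketch of the planar Laplace mechanism commonly used to achieve geo-indistinguishability. It is not the framework described in the abstract. The function names are hypothetical, and the sketch assumes locations are already projected to planar coordinates in metres, so that ε is expressed per metre.

```python
import math
import random


def planar_laplace_noise(epsilon, rng=random):
    """Draw a 2D noise vector from the continuous planar Laplace distribution.

    The direction is uniform; the radius has density eps^2 * r * exp(-eps * r),
    which is a Gamma distribution with shape 2 and scale 1/eps.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return radius * math.cos(theta), radius * math.sin(theta)


def obfuscate(x, y, epsilon, rng=random):
    """Return a perturbed copy of the planar point (x, y), in metres."""
    dx, dy = planar_laplace_noise(epsilon, rng)
    return x + dx, y + dy


if __name__ == "__main__":
    # Assumed value for illustration: eps = 0.01 per metre, i.e. an
    # expected displacement of 2 / eps = 200 m.
    print(obfuscate(1000.0, 2000.0, epsilon=0.01))
```

In this sketch a smaller ε gives stronger privacy and a larger expected displacement (2/ε), which is the kind of privacy-utility relationship a designer has to weigh when choosing the parameter.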
This paper describes a small experimental study into the use of avatars to remediate the lecturer's absence in voice-over-slide material. Four different avatar behaviours are tested. Avatar A performs all the upper-body gestures of the lecturer, which were recorded using a 3D depth sensor. Avatar B is animated using a few random gestures in order to create a natural presence that is unrelated to the speech. Avatar C only performs the lecturer's pointing gestures, as these are known to indicate important parts of a lecture. Finally, Avatar D performs "lecturer-like" gestures, but these are desynchronised with the speech. Preliminary results indicate students' preference for Avatars A and C. Although the effect of avatar behaviour on learning did not prove statistically significant, students' comments indicate that an avatar that behaves quietly and only performs some of the lecturer's gestures (pointing) is effective. The paper also presents a simple empirical method for automatically detecting pointing gestures in Kinect-recorded lecture data.
Never Alone (2016) is a generative large-scale urban screen video-sound installation, which presents the idea of generative choreographies amongst multiple video agents, or "digital performers". This generative installation questions how we navigate in urban spaces and the ubiquity and disruptive nature of encounters within the cities' landscapes. The video agents explore precarious movement paths along the façade inhabiting landscapes that are both architectural and emotional.
In this study, we used a humanoid robot as a telepresence robot and compared it with a basic telepresence robot, which can only rotate its display, in order to reveal the effect of embodiment. We also investigated the effect of changing the body size of the humanoid robot by using robots of two different sizes. Our experimental results revealed that embodiment increases the remote person's social telepresence, familiarity, and directivity. The comparison between the small and big humanoid robots showed no difference, and both of them were effective.
Clickjacking attacks are emerging threats to websites of different sizes and shapes. They are particularly used by threat agents to get more likes and/or followers in Online Social Networks (OSNs). This paper reviews clickjacking attacks and the classic solutions to tackle various forms of those attacks. Different Cross-Site Scripting approaches are implemented in this study to examine the attack tools and methods. Various iFrame attacks have been developed to tamper with the integrity of the website interactions at the application layer. By visually demonstrating attacks such as Cross-Site Scripting (XSS) and Cross-Site Request Forgery (CSRF), users will be able to have a better understanding of such attacks in their formulation and the risks associated with them.
In this paper, we present E-VOX, an emotionally enhanced semantic ECA designed to work as a virtual assistant for searching information on Wikipedia. It includes a cognitive-affective architecture that integrates an emotion model based on ALMA and the Soar cognitive architecture. This allows the ECA to take into account features needed for social interaction such as learning and emotion management. The architecture makes it possible to influence and modify the behavior of the agent depending on the feedback received from the user and other information from the environment, allowing the ECA to achieve a more realistic and believable interaction with the user. A completely functional prototype has been developed showing the feasibility of our approach.
Social and emotional intelligence of computer systems is increasingly important in human-AI (Artificial Intelligence) interactions. This paper presents a tangible AI interface, T.A.I, that enhances physical engagement in digital communication between users and a conversational AI agent. We describe a compact, pneumatically shape-changing hardware design with a rich set of physical gestures that actuate on mobile devices during real-time conversations. Our user study suggests that the physical presence provided by T.A.I increased users' empathy for, and social connection with, the virtual intelligent system, leading to an improved human-AI communication experience.
This paper contributes a systematic research approach as well as findings of an empirical study conducted to investigate the effect of virtual agents on task performance and player experience in digital games. As virtual agents are supposed to evoke social effects similar to real humans under certain conditions, the basic social phenomenon of social facilitation is examined in a testbed game that was specifically developed to enable systematic variation of single impact factors of social facilitation. Independent variables were the presence of a virtual agent (present vs. not present) and the output device (ordinary monitor vs. head-mounted display). Results indicate social inhibition effects, but only for players using a head-mounted display. Additional potential impact factors and future research directions are discussed.
The goal of this work is to model a virtual character able to converse with different interpersonal attitudes. To build our model, we rely on the analysis of multimodal corpora of non-verbal behaviors. The interpretation of these behaviors depends on how they are sequenced (order) and distributed over time. To encompass the dynamics of non-verbal signals across both modalities and time, we make use of temporal sequence mining. Specifically, we propose a new algorithm for temporal sequence extraction. We apply our algorithm to extract temporal patterns of non-verbal behaviors expressing interpersonal attitudes from a corpus of job interviews. We demonstrate the efficiency of our algorithm in terms of significant accuracy improvement over the state-of-the-art algorithms.
The design of systems with independent agency to act on the environment or which can act as persuasive agents requires consideration of not only the technical aspects of design, but of the psychological, sociological, and philosophical aspects as well. Creating usable, safe, and ethical systems will require research into human-computer communication, in order to design systems that can create and maintain a relationship with users, explain their workings, and act in the best interests of both users and of the larger society.
Personal agent software is now in daily use in personal devices and in some organizational settings. While many advocate an agent sociality design paradigm that incorporates human-like features and social dialogues, it is unclear whether this is a good match for professionals who seek productivity instead of leisurely use. We conducted a 17-day field study of a prototype of a personal AI agent that helps employees find work-related information. Using log data, surveys, and interviews, we found individual differences in the preference for humanized social interactions (social-agent orientation), which led to different user needs and requirements for agent design. We also explored the effect of agent proactive interactions and found that they carried the risk of interruption, especially for users who were generally averse to interruptions at work. Further, we found that user differences in social-agent orientation and aversion to agent proactive interactions can be inferred from behavioral signals. Our results inform research into social agent design, proactive agent interaction, and personalization of AI agents.

