Biblio
Human behaviors are often prohibited or permitted by social norms. Therefore, if autonomous agents interact with humans, they also need to reason about legal rules and about social and ethical norms in order to be trusted and accepted by humans. Inverse Reinforcement Learning (IRL) can be used by autonomous agents to learn norm-compliant behavior from expert demonstrations. However, norms are context-sensitive, i.e., different norms are activated in different contexts. For example, the privacy norm is activated for a domestic robot entering a bathroom where a person may be present, whereas it is not activated for the robot entering the kitchen. Representing the various contexts in the state space of the robot, as well as obtaining expert demonstrations under all possible tasks and contexts, is extremely challenging. Inspired by recent work on Modularized Normative MDP (MNMDP) and early work on context-sensitive RL, we propose a new IRL framework, Context-Sensitive Norm IRL (CNIRL). CNIRL treats states and contexts separately and assumes that the expert determines the priority of every possible norm in the environment, where each norm is associated with a distinct reward function. The agent chooses actions to maximize its cumulative reward. We present the CNIRL model and show that its computational complexity is scalable in the number of norms. We also show via two experimental scenarios that CNIRL can handle problems with changing context spaces.
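The idea of keeping contexts apart from states, with one reward function per norm and a context-dependent priority for each norm, can be illustrated with a one-step toy example. This is only a sketch and not the CNIRL algorithm itself: the names `NORM_REWARDS`, `PRIORITIES`, and `choose_action`, the numeric rewards, and the weighted-sum combination are all assumptions made for illustration, and the priorities are written by hand rather than learned from demonstrations.

```python
from typing import Callable, Dict, List

# (state, action) -> reward; one function per norm or goal (hypothetical values).
RewardFn = Callable[[str, str], float]


def task_reward(state: str, action: str) -> float:
    # The robot's task is rewarded when it enters the room.
    return 1.0 if action == "enter" else 0.0


def privacy_reward(state: str, action: str) -> float:
    # The privacy norm penalizes entering.
    return -5.0 if action == "enter" else 0.0


NORM_REWARDS: Dict[str, RewardFn] = {
    "task": task_reward,
    "privacy": privacy_reward,
}

# Context -> priority weight of each norm; a weight of 0 means "not activated".
# In an IRL setting these would be inferred from the expert, not hard-coded.
PRIORITIES: Dict[str, Dict[str, float]] = {
    "bathroom_occupied": {"task": 1.0, "privacy": 1.0},
    "kitchen": {"task": 1.0, "privacy": 0.0},
}


def choose_action(state: str, context: str, actions: List[str]) -> str:
    """Pick the action with the highest priority-weighted sum of norm rewards."""
    weights = PRIORITIES[context]

    def total(action: str) -> float:
        return sum(w * NORM_REWARDS[name](state, action)
                   for name, w in weights.items())

    return max(actions, key=total)
```

With the same state and action set, the chosen action changes only because the context changes, which is the separation the abstract describes.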
With the wide application of modern robots, concerns about the security and privacy of robotic systems and applications have grown. Although the Robot Operating System (ROS) is commonly used on different robots, few works have considered the security aspects of ROS. As ROS does not employ even a basic permission control mechanism, applications can access any resource without limitation, which could result in equipment damage, harm to humans, and privacy leakage. In this paper we propose an access control mechanism for ROS based on an extended policy-based access control (PBAC) model. Specifically, we extend ROS with an additional node dedicated to access control, which provides user identity and permission management services. The proposed mechanism also allows the administrator to revoke a permission dynamically. We implemented the proposed method in ROS and demonstrated its applicability and performance through several case studies.
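The permission management and dynamic revocation described above can be sketched as a default-deny permission table. This is a minimal illustration in plain Python and does not use any ROS API: the class `AccessController`, its method names, and the `(user, resource, operation)` tuple layout are hypothetical and are not taken from the paper's extended PBAC model.

```python
from typing import Set, Tuple

Permission = Tuple[str, str, str]  # (user, resource, operation), hypothetical layout


class AccessController:
    """Toy stand-in for a dedicated access control service (default deny)."""

    def __init__(self) -> None:
        self._grants: Set[Permission] = set()

    def grant(self, user: str, resource: str, operation: str) -> None:
        # Administrator adds a permission.
        self._grants.add((user, resource, operation))

    def revoke(self, user: str, resource: str, operation: str) -> None:
        # Administrator removes a permission at run time; no error if absent.
        self._grants.discard((user, resource, operation))

    def is_allowed(self, user: str, resource: str, operation: str) -> bool:
        # Anything that was never granted, or was revoked, is denied.
        return (user, resource, operation) in self._grants
```

A request is checked on every access, so a revocation takes effect on the next check without restarting the requesting application.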
Control-hijacking attacks include code injection attacks and code reuse attacks. In recent years, with the emergence of the data execution prevention (DEP) defense mechanism, code reuse attacks such as return-oriented programming (ROP), jump-oriented programming (JOP), and counterfeit object-oriented programming (COOP) have become mainstream. A series of defensive measures has been proposed, such as DEP, address space layout randomization (ASLR), coarse-grained control-flow integrity (CFI), and fine-grained CFI. In this paper, we propose a new attack called function-oriented programming (FOP) to construct malicious program behavior. FOP takes advantage of the existing functions of a C program to mount the attack. We propose concrete algorithms for FOP gadgets and build a tool to identify FOP gadgets. FOP can successfully bypass coarse-grained CFI, and it can also bypass some existing fine-grained CFI technologies, such as shadow stacks. We demonstrate a real-world attack on the proftpd 1.3.0 server in a Linux x64 environment. We believe that the FOP attack will encourage the development of more effective defense measures.
This paper establishes a probability model of a multiple-path scheme for quantum key distribution (QKD) in which the paths used to transmit the key between the source node and the destination node may share public nodes. To make the scheme usable in general network topologies, we combine it with key routing in the QKD network. The proposed multiple-path key distribution algorithm has two major aspects. The first is an approach for determining the number and the distance of the selected paths. The second is a strategy of stochastic paths with labels, which decreases the number of public nodes and avoids two problems of the earlier scheme: producing loops, and frequently moving to nodes that are farther from the destination node than the current node. Finally, the paper demonstrates the rationality of the probability model and of the strategies used in the algorithm.
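Why public nodes shared between paths matter can be illustrated with a small calculation. The sketch below is not the paper's probability model. It assumes, purely for illustration, that the key is split so that an attacker needs every share, that one share travels on each path, and that each intermediate node is compromised independently with probability `p`. The function name `compromise_probability` is hypothetical.

```python
from itertools import product
from typing import List


def compromise_probability(paths: List[List[str]], p: float) -> float:
    """Probability that every path contains at least one compromised node.

    paths: intermediate nodes of each path (source and destination excluded).
    p: assumed independent compromise probability of each node.
    Exact enumeration over all node subsets, so only suitable for small graphs.
    """
    nodes = sorted({n for path in paths for n in path})
    total = 0.0
    for bits in product([False, True], repeat=len(nodes)):
        bad = {n for n, b in zip(nodes, bits) if b}
        # The key leaks only if the attacker sits on every path.
        if all(bad & set(path) for path in paths):
            weight = 1.0
            for b in bits:
                weight *= p if b else 1.0 - p
            total += weight
    return total
```

Under these assumptions, two disjoint one-hop paths with `p = 0.1` give `0.1 * 0.1 = 0.01`, while two paths that share a public node `s` give `0.1 + 0.9 * 0.1 * 0.1 = 0.109`, which is why a strategy that reduces the number of public nodes is attractive.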