Biblio
This paper studies the problem of designing optimal privacy mechanism with less energy cost. The eavesdropper and the defender with limited resources should choose which channel to eavesdrop and defend, respectively. A zero-sum stochastic game framework is used to model the interaction between the two players and the game is solved through the Nash Q-learning approach. A numerical example is given to verify the proposed method.
ISSN: 2688-0938
Software Defined Networking (SDN) provides opportunities for flexible and dynamic traffic engineering. However, in current SDN systems, routing strategies are based on traditional mechanisms which lack in real-time modification and less efficient resource utilization. To overcome these limitations, deep learning is used in this paper to improve the routing computation in SDN. This paper proposes Convolutional Deep Reinforcement Learning (CoDRL) model which is based on deep reinforcement learning agent for routing optimization in SDN to minimize the mean network delay and packet loss rate. The CoDRL model consists of Deep Deterministic Policy Gradients (DDPG) deep agent coupled with Convolution layer. The proposed model tends to automatically adapts the dynamic packet routing using network data obtained through the SDN controller, and provides the routing configuration that attempts to reduce network congestion and minimize the mean network delay. Hence, the proposed deep agent exhibits good convergence towards providing routing configurations that improves the network performance.
In this paper, we consider the problem of decentralized verification for large-scale cascade interconnections of linear subsystems such that dissipativity properties of the overall system are guaranteed with minimum knowledge of the dynamics. In order to achieve compositionality, we distribute the verification process among the individual subsystems, which utilize limited information received locally from their immediate neighbors. Furthermore, to obviate the need for full knowledge of the subsystem parameters, each decentralized verification rule employs a model-free learning structure; a reinforcement learning algorithm that allows for online evaluation of the appropriate storage function that can be used to verify dissipativity of the system up to that point. Finally, we show how the interconnection can be extended by adding learning-enabled subsystems while ensuring dissipativity.
Cascading failure, which can be triggered by both physical and cyber attacks, is among the most critical threats to the security and resilience of power grids. In current literature, researchers investigate the issue of cascading failure on smart grids mainly from the attacker's perspective. From the perspective of a grid defender or operator, however, it is also an important issue to restore the smart grid suffering from cascading failure back to normal operation as soon as possible. In this paper, we consider cascading failure in conjunction with the restoration process involving repairing of the failed nodes/links in a sequential fashion. Based on a realistic power flow cascading failure model, we exploit a Q-learning approach to develop a practical and effective policy to identify the optimal way of sequential restorations for large-scale smart grids. Simulation results on three power grid test benchmarks demonstrate the learning ability and the effectiveness of the proposed strategy.
Attack graph approach is a common tool for the analysis of network security. However, analysis of attack graphs could be complicated and difficult depending on the attack graph size. This paper presents an approximate analysis approach for attack graphs based on Q-learning. First, we employ multi-host multi-stage vulnerability analysis (MulVAL) to generate an attack graph for a given network topology. Then we refine the attack graph and generate a simplified graph called a transition graph. Next, we use a Q-learning model to find possible attack routes that an attacker could use to compromise the security of the network. Finally, we evaluate the approach by applying it to a typical IT network scenario with specific services, network configurations, and vulnerabilities.
Recent attacks show that threats to cyber infrastructure are not only increasing in volume, but are getting more sophisticated. The attacks may comprise multiple actions that are hard to differentiate from benign activity, and therefore common detection techniques have to deal with high false positive rates. Because of the imperfect performance of automated detection techniques, responses to such attacks are highly dependent on human-driven decision-making processes. While game theory has been applied to many problems that require rational decisionmaking, we find limitation on applying such method on security games. In this work, we propose Q-Learning to react automatically to the adversarial behavior of a suspicious user to secure the system. This work compares variations of Q-Learning with a traditional stochastic game. Simulation results show the possibility of Naive Q-Learning, despite restricted information on opponents.