Visible to the public Biblio

Filters: Keyword is state spaces  [Clear All Filters]
2020-12-02
Tsiligkaridis, T., Romero, D..  2018.  Reinforcement Learning with Budget-Constrained Nonparametric Function Approximation for Opportunistic Spectrum Access. 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP). :579—583.

Opportunistic spectrum access is one of the emerging techniques for maximizing throughput in congested bands and is enabled by predicting idle slots in spectrum. We propose a kernel-based reinforcement learning approach coupled with a novel budget-constrained sparsification technique that efficiently captures the environment to find the best channel access actions. This approach allows learning and planning over the intrinsic state-action space and extends well to large state spaces. We apply our methods to evaluate coexistence of a reinforcement learning-based radio with a multi-channel adversarial radio and a single-channel carrier-sense multiple-access with collision avoidance (CSMA-CA) radio. Numerical experiments show the performance gains over carrier-sense systems.

2018-02-27
Huang, L., Chen, J., Zhu, Q..  2017.  A Factored MDP Approach to Optimal Mechanism Design for Resilient Large-Scale Interdependent Critical Infrastructures. 2017 Workshop on Modeling and Simulation of Cyber-Physical Energy Systems (MSCPES). :1–6.

Enhancing the security and resilience of interdependent infrastructures is crucial. In this paper, we establish a theoretical framework based on Markov decision processes (MDPs) to design optimal resiliency mechanisms for interdependent infrastructures. We use MDPs to capture the dynamics of the failure of constituent components of an infrastructure and their cyber-physical dependencies. Factored MDPs and approximate linear programming are adopted for an exponentially growing dimension of both state and action spaces. Under our approximation scheme, the optimally distributed policy is equivalent to the centralized one. Finally, case studies in a large-scale interdependent system demonstrate the effectiveness of the control strategy to enhance the network resilience to cascading failures.