Biblio
This paper presents a secure reinforcement learning (RL) based control method for unknown linear time-invariant cyber-physical systems (CPSs) that are subjected to compositional attacks such as eavesdropping and covert attacks. We consider the attack scenario in which the attacker learns the dynamic model during the exploration phase that the designer conducts to learn a linear quadratic regulator (LQR), and thereafter uses that information to mount a covert attack on the dynamic system. We refer to this setting as the doubly learning-based control and attack (DLCA) framework. We propose a dynamic camouflaging based attack-resilient reinforcement learning (ARRL) algorithm that learns the desired optimal controller for the dynamic system while injecting sufficient misinformation into the attacker's estimate of the system dynamics. The algorithm is accompanied by theoretical guarantees and extensive numerical experiments on a consensus multi-agent system and on a benchmark power grid model.
In a centralized Networked Control System (NCS), all agents share local data with a central processing unit that generates control commands for the agents. The use of a communication network between the agents gives NCSs a distinct advantage in efficiency, design cost, and simplicity. However, this benefit comes at the expense of vulnerability to a range of cyber-physical attacks. Recently, novel defense mechanisms to counteract false data injection (FDI) attacks on NCSs have been developed for agents with linear dynamics, but they have not been thoroughly investigated for NCSs with nonlinear dynamics. This paper proposes an FDI attack mitigation strategy for NCSs composed of agents with nonlinear dynamics under disturbances and measurement noise. The proposed algorithm uses both learning and model-based approaches to estimate the agents' states for FDI attack mitigation. A neural network is used to model uncertain dynamics and estimate the effect of FDI attacks. The controller and estimator are designed based on Lyapunov stability analysis. A simulation of robots with Euler-Lagrange dynamics demonstrates the developed controller's ability to respond to FDI attacks in real time.
Parameter estimation for canonical dynamic systems has recently attracted considerable research attention. The extended Kalman filter (EKF) is a popular parameter estimation method by virtue of its ease of application. This paper focuses on parameter estimation for a class of canonical dynamic systems using the EKF. By constructing an associated differential equation, the convergence of EKF parameter estimation for the canonical dynamic systems is analyzed. A simulation demonstrates the good performance of the method.
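As a rough illustration of EKF-based parameter estimation in general, the sketch below estimates one unknown coefficient by appending it to the state vector. It is not the canonical-form system or the convergence analysis of the paper: the scalar model x[k+1] = a*x[k] + u[k] + w[k], the noise levels, the input signal, and the function name `ekf_estimate_a` are all assumptions chosen for illustration.

```python
import numpy as np

def ekf_estimate_a(y, u, a0=0.2, q_x=0.05**2, q_a=1e-6, r=0.1**2):
    """Joint EKF on the augmented state z = [x, a] (hypothetical toy model)."""
    z = np.array([y[0], a0])           # initial guess: measured state, rough a
    P = np.eye(2)                      # initial covariance of [x, a]
    H = np.array([[1.0, 0.0]])         # only x is measured
    Q = np.diag([q_x, q_a])            # small q_a keeps the filter adaptive
    history = []
    for k in range(1, len(y)):
        x, a = z
        # Predict: f(z) = [a*x + u, a], with Jacobian F = df/dz.
        F = np.array([[a, x], [0.0, 1.0]])
        z = np.array([a * x + u[k - 1], a])
        P = F @ P @ F.T + Q
        # Update with the scalar measurement y[k] = x[k] + v[k].
        S = H @ P @ H.T + r            # innovation variance, shape (1, 1)
        K = P @ H.T / S                # Kalman gain, shape (2, 1)
        z = z + (K * (y[k] - z[0])).ravel()
        P = (np.eye(2) - K @ H) @ P
        history.append(z[1])
    return np.array(history)

# Simulate the assumed system with a persistently exciting input.
rng = np.random.default_rng(0)
a_true, n = 0.8, 500
u = np.sin(0.1 * np.arange(n))
x = np.zeros(n)
for k in range(1, n):
    x[k] = a_true * x[k - 1] + u[k - 1] + 0.05 * rng.standard_normal()
y = x + 0.1 * rng.standard_normal(n)

a_hist = ekf_estimate_a(y, u)
print("final estimate of a:", a_hist[-1])
```

The parameter is modeled as a nearly constant state (a[k+1] = a[k] plus a small amount of process noise), which is the usual way to pose parameter estimation as a filtering problem.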
In this paper, we investigate the Bayesian filtering problem for discrete nonlinear dynamical systems that contain random parameters. An augmented cubature Kalman filter (CKF) is developed to deal with the random parameters, where the state vector is enlarged by incorporating the random parameters. The number of cubature points increases accordingly, so the augmented CKF incurs a higher computational cost. However, the estimation accuracy is improved in comparison with that of the classical CKF, which uses the nominal values of the random parameters. An application to mobile source localization with time difference of arrival (TDOA) measurements and random sensor positions is provided, where the simulation results illustrate that the augmented CKF outperforms the classical CKF.
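The growth in the number of cubature points under augmentation can be sketched as follows, assuming the standard third-degree spherical-radial rule (2n equally weighted points for an n-dimensional Gaussian). The numerical values and the helper name `cubature_points` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cubature_points(mean, cov):
    """Third-degree spherical-radial cubature points for N(mean, cov)."""
    n = mean.shape[0]
    S = np.linalg.cholesky(cov)                  # S @ S.T == cov
    offsets = np.sqrt(n) * np.hstack([S, -S])    # shape (n, 2n)
    points = mean[:, None] + offsets             # one point per column
    weights = np.full(2 * n, 1.0 / (2 * n))      # equal weights
    return points, weights

# Nominal state (dimension 2) and one random parameter (assumed values).
x_mean = np.array([1.0, 2.0])
x_cov = np.diag([0.5, 0.2])
theta_mean = np.array([0.3])
theta_cov = np.array([[0.01]])

# Augmented state [x, theta]; state and parameter assumed uncorrelated.
aug_mean = np.concatenate([x_mean, theta_mean])
aug_cov = np.block([
    [x_cov, np.zeros((2, 1))],
    [np.zeros((1, 2)), theta_cov],
])

pts, w = cubature_points(x_mean, x_cov)
aug_pts, aug_w = cubature_points(aug_mean, aug_cov)
print(pts.shape[1], aug_pts.shape[1])  # 4 6
```

With n states and p random parameters the point count goes from 2n to 2(n + p), and each point must be propagated through the nonlinear model, which is where the extra cost comes from.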
The problem of analytical synthesis of a reduced-order state observer for a bilinear dynamic system with scalar input and vector output is considered. Formulas are obtained for calculating the matrix coefficients of a nonlinear observer whose estimation error asymptotically approaches zero. Two modifications of the observer dynamic equation are proposed: the first requires differentiation of the output signal and the second does not. Based on the matrix canonization technique, the solvability conditions for the synthesis problem and analytical expressions for the set of admissible solutions are obtained. A precise step-by-step algorithm for calculating the observer coefficients is provided. An example of the practical use of the developed algorithm is given.
In this paper, an alternative interpretation of the iterative adaptive dynamic programming (ADP) approach, based on the Hamiltonian and taken from the perspective of optimization, is developed for discrete-time nonlinear dynamic systems. The role of the Hamiltonian in iterative ADP is explained. The resulting Hamiltonian-driven ADP is able to evaluate the performance of arbitrary admissible policies, compare two different admissible policies, and further improve a given admissible policy. The convergence of the Hamiltonian-driven ADP to the optimal policy is proven. Implementation of the Hamiltonian-driven ADP by neural networks is discussed under the assumption that each iterative policy and value function can be updated exactly. Finally, a simulation is conducted to verify the effectiveness of the presented Hamiltonian-driven ADP.
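The evaluate-and-improve cycle described above can be illustrated on a linear-quadratic special case, where each step has a closed form. This is only a sketch under assumptions: the paper treats nonlinear systems, whereas here the dynamics x+ = A x + B u, the quadratic cost, the matrices, and all function names are chosen for illustration. The discrete-time Hamiltonian is taken as H(x, u, V) = x'Qx + u'Ru + V(Ax + Bu) - V(x), with V(x) = x'Px.

```python
import numpy as np

def evaluate_policy(A, B, Q, R, K):
    """Policy evaluation: find P with H(x, -Kx, V) = 0 for all x, i.e.
    P = Q + K'RK + Acl' P Acl (a discrete Lyapunov equation)."""
    n = A.shape[0]
    Acl = A - B @ K
    M = np.kron(Acl.T, Acl.T)                    # acts on row-major vec(P)
    rhs = (Q + K.T @ R @ K).reshape(-1)
    return np.linalg.solve(np.eye(n * n) - M, rhs).reshape(n, n)

def improve_policy(A, B, R, P):
    """Policy improvement: minimize the Hamiltonian over u, giving u = -Kx."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def hamiltonian(A, B, Q, R, P, x, u):
    x_next = A @ x + B @ u
    return x @ Q @ x + u @ R @ u + x_next @ P @ x_next - x @ P @ x

# Assumed example: A is stable, so K = 0 is an admissible initial policy.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.zeros((1, 2))
costs = []
for _ in range(20):
    P = evaluate_policy(A, B, Q, R, K)           # evaluate current policy
    costs.append(np.trace(P))                    # scalar summary for comparison
    K = improve_policy(A, B, R, P)               # greedy policy w.r.t. P
P = evaluate_policy(A, B, Q, R, K)               # value of the final policy

x = np.array([1.0, -0.5])
print("Hamiltonian along final policy:", hamiltonian(A, B, Q, R, P, x, -K @ x))
```

Comparing two admissible policies reduces here to comparing their value matrices; the recorded traces should be nonincreasing across iterations.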
A new criterion for selecting the frequencies of polyharmonic test signals is developed. It allows the values of the multidimensional transfer functions (the Fourier images of the Volterra kernels) to be filtered uniquely from the partial component of the response of a nonlinear system. It is shown that this criterion significantly weakens the known limitations on the choice of frequencies and, as a result, reduces the number of interpolations needed to restore the transfer function. The reduction becomes more significant as the order of the estimated transfer function increases.
Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots' dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.
The MgO-based magnetic tunnel junction (MTJ) is the basis of the magnetic read sensors in modern hard disk drives. Within its operating bandwidth, the sensor's performance is significantly affected by nonlinear and oscillating behavior arising from the MTJ's magnetization dynamics at microwave frequencies. Static I-V curve measurements are commonly used to characterize the sensor's nonlinear effects. Unfortunately, these do not sufficiently capture the MTJ's magnetization dynamics. In this paper, we demonstrate the use of the two-tone measurement technique for a full treatment of the sensor's nonlinear effects in conjunction with its dynamic ones. This approach is new in the field of magnetism and magnetic materials, and it poses challenges due to the nature of the device. Nevertheless, the experimental results demonstrate how the two-tone measurement technique can be used to characterize the nonlinear properties of magnetic sensors.
Multipath TCP (MP-TCP) has the potential to greatly improve application performance by using multiple paths transparently. We propose a fluid model for a large class of MP-TCP algorithms and identify design criteria that guarantee the existence, uniqueness, and stability of system equilibrium. We clarify how algorithm parameters impact TCP-friendliness, responsiveness, and window oscillation and demonstrate an inevitable tradeoff among these properties. We discuss the implications of these properties on the behavior of existing algorithms and motivate our algorithm Balia (balanced linked adaptation), which generalizes existing algorithms and strikes a good balance among TCP-friendliness, responsiveness, and window oscillation. We have implemented Balia in the Linux kernel. We use our prototype to compare the new algorithm to existing MP-TCP algorithms.
We consider a class of robust optimization problems that we call “robust-to-dynamics optimization” (RDO). The input to an RDO problem is twofold: (i) a mathematical program (e.g., an LP, SDP, IP, etc.), and (ii) a dynamical system (e.g., linear, nonlinear, discrete, or continuous dynamics). The objective is to maximize over the set of initial conditions that forever remain feasible under the dynamics. The focus of this paper is on the case where the optimization problem is a linear program and the dynamics are linear. We establish some structural properties of the feasible set and prove that if the linear system is asymptotically stable, then the RDO problem can be solved in polynomial time. We also outline a semidefinite programming based algorithm for providing upper bounds on robust-to-dynamics linear programs.
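To make the notion of "forever remaining feasible" concrete, the sketch below checks whether a single initial condition x0 keeps satisfying A x <= b along the trajectory of x+ = G x. It is not the algorithm of the paper and solves no optimization problem. It relies on two simplifying assumptions that are stronger than asymptotic stability: the spectral norm of G is below 1, and b > 0 so that the origin lies in the interior of the polytope. The function name `remains_feasible` and the example data are hypothetical.

```python
import numpy as np

def remains_feasible(x0, G, A, b, max_steps=1000):
    """Return True if G^k x0 satisfies A x <= b for all k >= 0.

    Assumes ||G||_2 < 1 and b > 0. Under these assumptions, once the state
    enters the largest origin-centered ball inside the polytope it can
    never leave it, so the check terminates after finitely many steps.
    """
    if np.linalg.norm(G, 2) >= 1 or np.any(b <= 0):
        raise ValueError("sketch requires ||G||_2 < 1 and b > 0")
    r_in = np.min(b / np.linalg.norm(A, axis=1))   # inscribed ball radius
    x = np.asarray(x0, dtype=float)
    for _ in range(max_steps):
        if np.any(A @ x > b + 1e-12):
            return False                           # constraint violated
        if np.linalg.norm(x) <= r_in:
            return True                            # trapped inside the ball
        x = G @ x
    raise RuntimeError("no decision within max_steps")

# Assumed example: unit box |x_i| <= 1 and a contracting rotation by 45 degrees.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.ones(4)
c, s = np.cos(np.pi / 4), np.sin(np.pi / 4)
G = 0.9 * np.array([[c, -s], [s, c]])

print(remains_feasible([0.5, 0.5], G, A, b))   # True
print(remains_feasible([1.0, 0.5], G, A, b))   # True
print(remains_feasible([1.0, 1.0], G, A, b))   # False
```

The corner point (1, 1) is feasible at time zero but is rotated outside the box after one step, which shows why the robust feasible set can be strictly smaller than the original polytope.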